github lukilabs/craft-agents-oss v0.8.12

6 hours ago

v0.8.12 — GPT-5.5 + DeepSeek providers, Pi tool registration restored, composer & diff hardening

Features

  • GPT-5.5 is now the default for openai and openai-codex — Pi SDK 0.70.0 added gpt-5.5 to the OpenAI catalog, so PI_PREFERRED_DEFAULTS now picks it as the default model for both the openai and openai-codex auth providers instead of whatever the SDK returned first. New API-key connections and Craft Agents Backend (OpenAI) connections land on gpt-5.5 out of the box; existing connections keep their explicit model choice. Fixes #597.
  • DeepSeek is now a supported Pi-backed provider — Adds DeepSeek to PROVIDER_METADATA (dashboard URL), PI_PROVIDER_DISPLAY (label + placeholder), and PI_PREFERRED_DEFAULTS (deepseek-v4-pro / deepseek-v4-flash) so connections default to a modern model instead of whatever the Pi SDK returns first. The renderer picks up the new provider automatically via PI_AUTH_PROVIDER_DOMAINS (deepseek.com) for favicon resolution, the API-setup preset, and the settings page label. CLI gains a DEEPSEEK_API_KEY env key and extracts resolveApiKey, shouldSetupLlmConnection, and getProviderDisplayName as testable exports; --base-url auto-setup now works for non-anthropic providers and the validate step shares the same resolver path. Fixes #600.
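The preferred-defaults mechanism both entries above rely on can be sketched as follows. This is a minimal hypothetical reconstruction, not the actual `PI_PREFERRED_DEFAULTS` code: an ordered preference list per auth provider, with the first catalog hit winning over "whatever the SDK returned first".

```typescript
// Hypothetical shape: each auth provider maps to an ordered list of
// preferred model ids; the first one present in the SDK catalog wins.
const PREFERRED_DEFAULTS: Record<string, string[]> = {
  openai: ["gpt-5.5"],
  "openai-codex": ["gpt-5.5"],
  deepseek: ["deepseek-v4-pro", "deepseek-v4-flash"],
};

function pickDefaultModel(provider: string, catalog: Set<string>): string | undefined {
  // Walk the preference list in order; only fall back to the catalog's
  // first entry when no preferred model is available.
  for (const id of PREFERRED_DEFAULTS[provider] ?? []) {
    if (catalog.has(id)) return id;
  }
  return catalog.values().next().value; // whatever the SDK returned first
}
```

Existing connections keep their explicit model choice; this resolver only runs for new connections with no stored model.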

Improvements

  • source_test now auto-enables and auto-restarts the turn so tools become callable immediately — Previously source_test only validated a source; users with a valid config but enabled: false had to flip the flag manually and restart the session, even though every check passed. The tool now flips enabled: true when needed and triggers the session's existing onSourceActivationRequest callback so the MCP/API servers are built and applied to the running agent. The follow-up fix (this release) routes the successful activation through the same source_activated + auto_retry machinery that already handled "tool not found on inactive source" errors: after activation, the current turn aborts cleanly and the renderer resends the user's original message with a [{slug} activated] suffix — giving the next query()/handlePrompt a fresh tool list with the new source live. This fixes a Claude-specific bug where source_test reported "tools available now" but the SDK had already frozen mcpServers at query-start, so mcp__{slug}__* tools were invisible to the model until the user typed another message. Pi behaves the same way for consistency (and also required a turn boundary — its subprocess only picks up new proxy tool defs on the next handlePrompt). Opt out with autoEnable: false to keep pure-validation behavior. No change to Codex or other backends without an activation callback — they still get the enabled flip and a clear "restart session to load tools" hint.
  • spawn_session accepts thinkingLevel — Agents can now set the reasoning level when delegating to a spawned session (off | low | medium | high | xhigh | max), instead of always inheriting the parent session's level or workspace default. Silently ignored on non-reasoning models (e.g. gpt-4o, gemini-2.5-flash): the Pi provider drivers and Claude SDK both gate the reasoning param on the model's capabilities, so passing thinkingLevel to a non-reasoning model is a safe no-op rather than an error. Also fixes a latent bug where createSession({ thinkingLevel }) in the session-manager API was silently ignored — the option is now honored with caller → workspace → global precedence, matching how permissionMode already worked. Fixes #462.
  • Real typecheck gate for pi-agent-server — The package's typecheck script was aliased to bun run build (bundler, not tsc), so API-shape drifts from the Pi SDK uplift slipped through CI (see the Pi-subprocess tool-registration fix below for the concrete regression that escaped). Added a dedicated tsc --noEmit -p tsconfig.typecheck.json step, wired into typecheck:all, plus ambient shims for turndown/pdfjs-dist/bash-parser so the new typecheck doesn't need @types packages. Fixed the cascade of pre-existing type drifts it surfaced (PiCredential vs AuthCredential, agent_end event shape, sdkTurnAnchor enrichment, CustomModelEntry at the dynamic-register call site, initConfig nullability in queryLlm closures, and an incorrect generic in web-fetch's result helper).
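The caller → workspace → global precedence described for `createSession({ thinkingLevel })` can be sketched as below. Names and the final fallback value are hypothetical; the point is that the first defined value wins, mirroring how `permissionMode` already resolved.

```typescript
type ThinkingLevel = "off" | "low" | "medium" | "high" | "xhigh" | "max";

interface LevelSources {
  caller?: ThinkingLevel;    // explicit createSession / spawn_session option
  workspace?: ThinkingLevel; // workspace default
  global?: ThinkingLevel;    // app-wide default
}

// First defined value wins: caller -> workspace -> global.
// The "medium" terminal fallback is an assumption for this sketch.
function resolveThinkingLevel(s: LevelSources): ThinkingLevel {
  return s.caller ?? s.workspace ?? s.global ?? "medium";
}
```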

Bug Fixes

  • WebUI "Add New Label" no longer launches the desktop app — Typing #<new-label> in the WebUI chat input and clicking "Add New Label" previously opened a popover that, on submit, fired a craftagents://action/new-session deep link. The browser resolved that scheme by launching the Electron desktop app instead of creating the label in the browser. Root cause: the chat-input call site cherry-picked fields from the EditPopover config and dropped inlineExecution: true, falling back to the legacy same-window deep-link path (which happened to work inside Electron but broke across the WebUI ↔ OS boundary). Switched to a full config spread, matching how AppShell already invokes the same popover.
  • Attachments no longer leak between sessions (for real this time) — Attaching a file, switching sessions without sending, and switching back now restores the attachment in the original session across all four attach paths (file picker, OS drag-drop, clipboard paste, web drag). The first-pass fix assumed every attachment had a real OS path — true for Finder drag/OS picker, false for paste/web-drag where Chromium synthesises a File from a Blob with no disk origin — so draft refs fell back to filename-only values and failed to re-read on hydrate. The new persistence layer is hybrid: file-picker and OS-drag capture the absolute path via webUtils.getPathForFile (Electron 32+) and re-read on hydrate through a dedicated file:readUserAttachment RPC; paste and web-drag persist bytes inline in the draft (20 MB per-attachment cap — huge pastes log a warn and drop from the draft). Old 0.8.11-format drafts are rejected on load, so attachments saved by the previous broken release disappear once after upgrade instead of silently haunting the composer. Fixes #572.
  • Custom URL scheme links now open the right app — Clicking obsidian://, vscode://, zed://, notion://, slack://, and similar links in chat messages now dispatches to the OS protocol handler instead of being blocked (desktop) or rewritten to https://<host>/obsidian://... (WebUI). URL handling switched from a tight allowlist (http/https/mailto/craftdocs) to a blocklist of known-dangerous schemes (javascript:, data:, vbscript:, blob:, file:). The WebUI and Viewer now use an anchor-click fallback for non-http schemes so Chrome routes through the external-protocol dispatcher reliably. Fixes #590.
  • /compact no longer times out prematurely on GPT sessions — Manual compaction (including "Accept & compact" on a submitted plan) against Pi-backed OpenAI models failed after 60s because the subprocess RPC didn't leave room for GPT-5.4's long summary responses on large conversations. Bumped the timeout to 5 min — truly hung subprocesses are still caught by the stdio death watchdog. Claude sessions were unaffected (they use the SDK's native compact channel).
  • Pi subprocess tool registration restored — Pi SDK 0.70.0 quietly reshaped CreateAgentSessionOptions.tools from an array of tool objects into a string[] name allowlist. The subprocess kept passing AgentTool[], so at runtime allowedToolNames = new Set(objects) and .has(name) returned false for every lookup — every custom tool got filtered out by _refreshToolRegistry's allowlist guard, leaving the LLM with only the built-in [read, bash, edit, write]. Fix now routes tool objects through customTools: ToolDefinition[] plus a matching tools: string[] allowlist that includes every registered name, drops the private _baseToolsOverride + _buildRuntime defense-in-depth hack, and restores grep/find/ls that were last bundled pre-0.68 in the monolithic codingTools array. A regression test now locks the shape contract (every customTools[].name must appear in the tools allowlist) so the next SDK uplift cannot silently drop tools again.
  • Pi call_llm honors the requested model — queryLlm routed call_llm through the mini_completion RPC, which only carried prompt. Every call_llm silently ran on the connection's mini model (often the stale pi/gpt-5.1-codex-mini), ignoring both request.model and request.systemPrompt. Introduced a new llm_query RPC that carries the full LLMQueryRequest; the subprocess delegates verbatim to the model-aware queryLlm. PiAgent.queryLlm tracks pending queries in a map with cleanup on result / generic error / subprocess exit, and the event-adapter call_llm override now only fills in args.model when absent (never overwrites explicit values). A round-trip invariant test guards the full request envelope byte-for-byte. Fixes #596.
  • Pi mini completions pick a provider-appropriate model — handleMiniCompletion was failing with "No API key found for openai." for users on ChatGPT Plus / openai-codex / google / github-copilot whenever the connection had no explicit miniModel. The provider-check fallback in queryLlm always assigned getDefaultSummarizationModel() (Haiku), which only resolves under anthropic auth — the Pi SDK 0.70.0 default then silently surfaced as an OpenAI model and the misleading auth error bubbled up. New pickProviderAppropriateMiniModel helper walks PI_PREFERRED_DEFAULTS[authProvider] for a resolvable, non-denied candidate (anthropic explicitly preserved to keep Haiku as its mini default), and runQueryWithModel now fails fast with an actionable message if no model resolves.
  • Composer no longer crashes on malformed drafts — Harden the chat input against untrusted draft content: the renderer now coerces draft text at the boundary, RichTextInput is defensive against non-string values, and the input area is wrapped in a local error boundary with recovery actions so a bad draft cannot take the whole window down. Fallback UI is localized and passes the staged i18n checks.
  • Bare @@ diff blocks now render as rich PatchDiff — Diff normalization moved into a pure helper and made robust for three shapes that previously fell through to the syntax-highlighted CodeBlock instead of pierre's diff viewer: bare @@ marker lines, valid numbered hunks without file headers, and already-valid unified/git patches (now preserved byte-identically). The synthesis path preserves user file headers when present, strips hunk markers cleanly without leaving blank context lines, and counts body lines without mistaking legitimate --- / +++ deletion/addition content for headers.
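The hybrid draft-persistence decision from the attachments fix can be sketched as a pure helper. The 20 MB cap comes from the entry above; names and the tagged-union shape are hypothetical.

```typescript
const INLINE_CAP_BYTES = 20 * 1024 * 1024; // 20 MB per-attachment cap

type DraftRef =
  | { kind: "path"; path: string }        // file picker / OS drag: re-read on hydrate via RPC
  | { kind: "inline"; bytes: Uint8Array } // paste / web-drag: no disk origin, persist bytes
  | { kind: "dropped" };                  // over cap: log a warn and drop from the draft

function toDraftRef(osPath: string | undefined, bytes: Uint8Array): DraftRef {
  if (osPath) return { kind: "path", path: osPath };
  if (bytes.byteLength > INLINE_CAP_BYTES) return { kind: "dropped" };
  return { kind: "inline", bytes };
}
```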
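The allowlist-to-blocklist switch in the URL-scheme fix can be sketched like this (scheme list taken from the entry above; the helper name is hypothetical):

```typescript
// Known-dangerous schemes stay blocked; everything else is handed to the
// OS protocol dispatcher (obsidian://, vscode://, zed://, slack://, ...).
const BLOCKED_SCHEMES = new Set(["javascript:", "data:", "vbscript:", "blob:", "file:"]);

function isSafeExternalUrl(raw: string): boolean {
  try {
    return !BLOCKED_SCHEMES.has(new URL(raw).protocol);
  } catch {
    return false; // unparseable input never reaches the dispatcher
  }
}
```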
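The shape contract locked by the tool-registration regression test — tool objects in `customTools`, a `string[]` name allowlist in `tools` covering every registered name — can be sketched as follows, with types simplified from the SDK's:

```typescript
// Simplified stand-ins for the Pi SDK types.
interface ToolDefinition { name: string; description: string }

interface SessionToolOptions {
  customTools: ToolDefinition[]; // the actual tool objects
  tools: string[];               // name allowlist checked by the registry
}

// Every registered tool name must appear in the allowlist, otherwise the
// registry's allowlist guard silently filters the tool out at runtime.
function buildToolOptions(defs: ToolDefinition[], builtins: string[]): SessionToolOptions {
  return { customTools: defs, tools: [...builtins, ...defs.map((d) => d.name)] };
}
```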
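The preferred-defaults walk performed by the new pickProviderAppropriateMiniModel helper can be sketched as below; the preference data and resolver callback are hypothetical simplifications.

```typescript
// Hypothetical per-provider mini-model preferences (illustrative ids only).
const MINI_PREFERENCES: Record<string, string[]> = {
  "openai-codex": ["gpt-5.5-mini", "gpt-5.5"],
  google: ["gemini-2.5-flash"],
};

function pickMiniModel(
  provider: string,
  resolvable: (id: string) => boolean, // does this id resolve under the current auth?
  denied: Set<string>,
): string | undefined {
  // First preferred model that resolves under this auth and is not denied.
  for (const id of MINI_PREFERENCES[provider] ?? []) {
    if (resolvable(id) && !denied.has(id)) return id;
  }
  return undefined; // caller fails fast with an actionable message
}
```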

Dependency Changes

  • Pi SDK uplifted to 0.70.2 — @mariozechner/pi-{coding-agent,agent-core,ai} bumped across root, pi-agent-server, server-core, and shared. The 0.66.1 → 0.70.0 step replaced the removed codingTools export with a createCodingTools(cwd) factory and reshaped CreateAgentSessionOptions.tools (see the tool-registration fix above). The 0.70.0 → 0.70.2 step is a patch-level bump with no API surface changes exercised by our code. (See the GPT-5.5 Features entry above for the default-model change enabled by the 0.70.0 catalog.)
  • Claude Agent SDK pinned to exact 0.2.111 — The earlier 0.2.111 → 0.2.119 uplift broke Claude-backed sessions with "Claude Code SDK not found". Starting at 0.2.113 the SDK replaced its JS cli.js entry point with platform-scoped native binaries (@anthropic-ai/claude-agent-sdk-{platform}-{arch}), which our runtime resolver, build scripts, CI verify-bundles steps, and --preload-based unified network interceptor all assume does not exist. Pinned back to the last JS-based release and tightened peer-dep ranges from ^0.2.19 / >=0.2.19 to an exact 0.2.111 so the next reshape cannot land as a silent patch bump. Full native-binary migration (interceptor rehoming) is tracked separately.

Breaking Changes

  • None. source_test is backward compatible; autoEnable defaults to true but the previous validation output is a strict subset of the new output. Pass autoEnable: false to reproduce the old behavior exactly. spawn_session gains an optional field — existing callers are unaffected.
