github lukilabs/craft-agents-oss v0.8.12

6 hours ago

v0.8.12 — GPT-5.5 + DeepSeek providers, Pi tool registration restored, composer & diff hardening

Features

  • GPT-5.5 is now the default for openai and openai-codex — Pi SDK 0.70.0 added gpt-5.5 to the OpenAI catalog, so PI_PREFERRED_DEFAULTS now picks it as the default model for both the openai and openai-codex auth providers instead of whatever the SDK returned first. New API-key connections and Craft Agents Backend (OpenAI) connections land on gpt-5.5 out of the box; existing connections keep their explicit model choice. Fixes #597.
  • DeepSeek is now a supported Pi-backed provider — Adds DeepSeek to PROVIDER_METADATA (dashboard URL), PI_PROVIDER_DISPLAY (label + placeholder), and PI_PREFERRED_DEFAULTS (deepseek-v4-pro / deepseek-v4-flash) so connections default to a modern model instead of whatever the Pi SDK returns first. The renderer picks up the new provider automatically via PI_AUTH_PROVIDER_DOMAINS (deepseek.com) for favicon resolution, the API-setup preset, and the settings page label. CLI gains a DEEPSEEK_API_KEY env key and extracts resolveApiKey, shouldSetupLlmConnection, and getProviderDisplayName as testable exports; --base-url auto-setup now works for non-anthropic providers and the validate step shares the same resolver path. Fixes #600.
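The preferred-defaults mechanism both entries above rely on can be sketched as follows. This is a minimal hypothetical reconstruction, not the actual `PI_PREFERRED_DEFAULTS` code: an ordered preference list per auth provider, with the first catalog hit winning over "whatever the SDK returned first".

```typescript
// Hypothetical shape: each auth provider maps to an ordered list of
// preferred model ids; the first one present in the SDK catalog wins.
const PREFERRED_DEFAULTS: Record<string, string[]> = {
  openai: ["gpt-5.5"],
  "openai-codex": ["gpt-5.5"],
  deepseek: ["deepseek-v4-pro", "deepseek-v4-flash"],
};

function pickDefaultModel(provider: string, catalog: Set<string>): string | undefined {
  // Walk the preference list in order; only fall back to the catalog's
  // first entry when no preferred model is available.
  for (const id of PREFERRED_DEFAULTS[provider] ?? []) {
    if (catalog.has(id)) return id;
  }
  return catalog.values().next().value; // whatever the SDK returned first
}
```

Existing connections keep their explicit model choice; this resolver only runs for new connections with no stored model.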

Improvements

  • source_test now auto-enables and auto-restarts the turn so tools become callable immediately — Previously source_test only validated a source; users with a valid config but enabled: false had to flip the flag manually and restart the session, even though every check passed. The tool now flips enabled: true when needed and triggers the session's existing onSourceActivationRequest callback so the MCP/API servers are built and applied to the running agent. The follow-up fix (this release) routes the successful activation through the same source_activated + auto_retry machinery that already handled "tool not found on inactive source" errors: after activation, the current turn aborts cleanly and the renderer resends the user's original message with a [{slug} activated] suffix — giving the next query()/handlePrompt a fresh tool list with the new source live. This fixes a Claude-specific bug where source_test reported "tools available now" but the SDK had already frozen mcpServers at query-start, so mcp__{slug}__* tools were invisible to the model until the user typed another message. Pi behaves the same way for consistency (and also required a turn boundary — its subprocess only picks up new proxy tool defs on the next handlePrompt). Opt out with autoEnable: false to keep pure-validation behavior. No change to Codex or other backends without an activation callback — they still get the enabled flip and a clear "restart session to load tools" hint.
  • spawn_session accepts thinkingLevel — Agents can now set the reasoning level when delegating to a spawned session (off | low | medium | high | xhigh | max), instead of always inheriting the parent session's level or workspace default. Silently ignored on non-reasoning models (e.g. gpt-4o, gemini-2.5-flash): the Pi provider drivers and Claude SDK both gate the reasoning param on the model's capabilities, so passing thinkingLevel to a non-reasoning model is a safe no-op rather than an error. Also fixes a latent bug where createSession({ thinkingLevel }) in the session-manager API was silently ignored — the option is now honored with caller → workspace → global precedence, matching how permissionMode already worked. Fixes #462.
  • Real typecheck gate for pi-agent-server — The package's typecheck script was aliased to bun run build (bundler, not tsc), so API-shape drifts from the Pi SDK uplift slipped through CI (see the Pi-subprocess tool-registration fix below for the concrete regression that escaped). Added a dedicated tsc --noEmit -p tsconfig.typecheck.json step, wired into typecheck:all, plus ambient shims for turndown/pdfjs-dist/bash-parser so the new typecheck doesn't need @types packages. Fixed the cascade of pre-existing type drifts it surfaced (PiCredential vs AuthCredential, agent_end event shape, sdkTurnAnchor enrichment, CustomModelEntry at the dynamic-register call site, initConfig nullability in queryLlm closures, and an incorrect generic in web-fetch's result helper).
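The caller → workspace → global precedence described for `createSession({ thinkingLevel })` can be sketched as below. Names and the final fallback value are hypothetical; the point is that the first defined value wins, mirroring how `permissionMode` already resolved.

```typescript
type ThinkingLevel = "off" | "low" | "medium" | "high" | "xhigh" | "max";

interface LevelSources {
  caller?: ThinkingLevel;    // explicit createSession / spawn_session option
  workspace?: ThinkingLevel; // workspace default
  global?: ThinkingLevel;    // app-wide default
}

// First defined value wins: caller -> workspace -> global.
// The "medium" terminal fallback is an assumption for this sketch.
function resolveThinkingLevel(s: LevelSources): ThinkingLevel {
  return s.caller ?? s.workspace ?? s.global ?? "medium";
}
```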

Bug Fixes

  • WebUI "Add New Label" no longer launches the desktop app — Typing #<new-label> in the WebUI chat input and clicking "Add New Label" previously opened a popover that, on submit, fired a craftagents://action/new-session deep link. The browser resolved that scheme by launching the Electron desktop app instead of creating the label in the browser. Root cause: the chat-input call site cherry-picked fields from the EditPopover config and dropped inlineExecution: true, falling back to the legacy same-window deep-link path (which happened to work inside Electron but broke across the WebUI ↔ OS boundary). Switched to a full config spread, matching how AppShell already invokes the same popover.
  • Attachments no longer leak between sessions (for real this time) — Attaching a file, switching sessions without sending, and switching back now restores the attachment in the original session across all four attach paths (file picker, OS drag-drop, clipboard paste, web drag). The first-pass fix assumed every attachment had a real OS path — true for Finder drag/OS picker, false for paste/web-drag where Chromium synthesises a File from a Blob with no disk origin — so draft refs fell back to filename-only values and failed to re-read on hydrate. The new persistence layer is hybrid: file-picker and OS-drag capture the absolute path via webUtils.getPathForFile (Electron 32+) and re-read on hydrate through a dedicated file:readUserAttachment RPC; paste and web-drag persist bytes inline in the draft (20 MB per-attachment cap — huge pastes log a warn and drop from the draft). Old 0.8.11-format drafts are rejected on load, so attachments saved by the previous broken release disappear once after upgrade instead of silently haunting the composer. Fixes #572.
  • Custom URL scheme links now open the right app — Clicking obsidian://, vscode://, zed://, notion://, slack://, and similar links in chat messages now dispatches to the OS protocol handler instead of being blocked (desktop) or rewritten to https://<host>/obsidian://... (WebUI). URL handling switched from a tight allowlist (http/https/mailto/craftdocs) to a blocklist of known-dangerous schemes (javascript:, data:, vbscript:, blob:, file:). The WebUI and Viewer now use an anchor-click fallback for non-http schemes so Chrome routes through the external-protocol dispatcher reliably. Fixes #590.
  • /compact no longer times out prematurely on GPT sessions — Manual compaction (including "Accept & compact" on a submitted plan) against Pi-backed OpenAI models failed after 60s because the subprocess RPC didn't leave room for GPT-5.4's long summary responses on large conversations. Bumped the timeout to 5 min — truly hung subprocesses are still caught by the stdio death watchdog. Claude sessions were unaffected (they use the SDK's native compact channel).
  • Pi subprocess tool registration restored — Pi SDK 0.70.0 quietly reshaped CreateAgentSessionOptions.tools from an array of tool objects into a string[] name allowlist. The subprocess kept passing AgentTool[], so at runtime allowedToolNames = new Set(objects) and .has(name) returned false for every lookup — every custom tool got filtered out by _refreshToolRegistry's allowlist guard, leaving the LLM with only the built-in [read, bash, edit, write]. Fix now routes tool objects through customTools: ToolDefinition[] plus a matching tools: string[] allowlist that includes every registered name, drops the private _baseToolsOverride + _buildRuntime defense-in-depth hack, and restores grep/find/ls that were last bundled pre-0.68 in the monolithic codingTools array. A regression test now locks the shape contract (every customTools[].name must appear in the tools allowlist) so the next SDK uplift cannot silently drop tools again.
  • Pi call_llm honors the requested model — queryLlm routed call_llm through the mini_completion RPC, which only carried prompt. Every call_llm silently ran on the connection's mini model (often the stale pi/gpt-5.1-codex-mini), ignoring both request.model and request.systemPrompt. Introduced a new llm_query RPC that carries the full LLMQueryRequest; the subprocess delegates verbatim to the model-aware queryLlm. PiAgent.queryLlm tracks pending queries in a map with cleanup on result / generic error / subprocess exit, and the event-adapter call_llm override now only fills in args.model when absent (never overwrites explicit values). A round-trip invariant test guards the full request envelope byte-for-byte. Fixes #596.
  • Pi mini completions pick a provider-appropriate model — handleMiniCompletion was failing with "No API key found for openai." for users on ChatGPT Plus / openai-codex / google / github-copilot whenever the connection had no explicit miniModel. The provider-check fallback in queryLlm always assigned getDefaultSummarizationModel() (Haiku), which only resolves under anthropic auth — the Pi SDK 0.70.0 default then silently surfaced as an OpenAI model and the misleading auth error bubbled up. New pickProviderAppropriateMiniModel helper walks PI_PREFERRED_DEFAULTS[authProvider] for a resolvable, non-denied candidate (anthropic explicitly preserved to keep Haiku as its mini default), and runQueryWithModel now fails fast with an actionable message if no model resolves.
  • Composer no longer crashes on malformed drafts — Harden the chat input against untrusted draft content: the renderer now coerces draft text at the boundary, RichTextInput is defensive against non-string values, and the input area is wrapped in a local error boundary with recovery actions so a bad draft cannot take the whole window down. Fallback UI is localized and passes the staged i18n checks.
  • Bare @@ diff blocks now render as rich PatchDiff — Diff normalization moved into a pure helper and made robust for three shapes that previously fell through to the syntax-highlighted CodeBlock instead of pierre's diff viewer: bare @@ marker lines, valid numbered hunks without file headers, and already-valid unified/git patches (now preserved byte-identically). The synthesis path preserves user file headers when present, strips hunk markers cleanly without leaving blank context lines, and counts body lines without mistaking legitimate --- / +++ deletion/addition content for headers.
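The hybrid draft-persistence decision from the attachments fix can be sketched as a pure helper. The 20 MB cap comes from the entry above; names and the tagged-union shape are hypothetical.

```typescript
const INLINE_CAP_BYTES = 20 * 1024 * 1024; // 20 MB per-attachment cap

type DraftRef =
  | { kind: "path"; path: string }        // file picker / OS drag: re-read on hydrate via RPC
  | { kind: "inline"; bytes: Uint8Array } // paste / web-drag: no disk origin, persist bytes
  | { kind: "dropped" };                  // over cap: log a warn and drop from the draft

function toDraftRef(osPath: string | undefined, bytes: Uint8Array): DraftRef {
  if (osPath) return { kind: "path", path: osPath };
  if (bytes.byteLength > INLINE_CAP_BYTES) return { kind: "dropped" };
  return { kind: "inline", bytes };
}
```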
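The allowlist-to-blocklist switch in the URL-scheme fix can be sketched like this (scheme list taken from the entry above; the helper name is hypothetical):

```typescript
// Known-dangerous schemes stay blocked; everything else is handed to the
// OS protocol dispatcher (obsidian://, vscode://, zed://, slack://, ...).
const BLOCKED_SCHEMES = new Set(["javascript:", "data:", "vbscript:", "blob:", "file:"]);

function isSafeExternalUrl(raw: string): boolean {
  try {
    return !BLOCKED_SCHEMES.has(new URL(raw).protocol);
  } catch {
    return false; // unparseable input never reaches the dispatcher
  }
}
```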
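The shape contract locked by the tool-registration regression test — tool objects in `customTools`, a `string[]` name allowlist in `tools` covering every registered name — can be sketched as follows, with types simplified from the SDK's:

```typescript
// Simplified stand-ins for the Pi SDK types.
interface ToolDefinition { name: string; description: string }

interface SessionToolOptions {
  customTools: ToolDefinition[]; // the actual tool objects
  tools: string[];               // name allowlist checked by the registry
}

// Every registered tool name must appear in the allowlist, otherwise the
// registry's allowlist guard silently filters the tool out at runtime.
function buildToolOptions(defs: ToolDefinition[], builtins: string[]): SessionToolOptions {
  return { customTools: defs, tools: [...builtins, ...defs.map((d) => d.name)] };
}
```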
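The preferred-defaults walk performed by the new pickProviderAppropriateMiniModel helper can be sketched as below; the preference data and resolver callback are hypothetical simplifications.

```typescript
// Hypothetical per-provider mini-model preferences (illustrative ids only).
const MINI_PREFERENCES: Record<string, string[]> = {
  "openai-codex": ["gpt-5.5-mini", "gpt-5.5"],
  google: ["gemini-2.5-flash"],
};

function pickMiniModel(
  provider: string,
  resolvable: (id: string) => boolean, // does this id resolve under the current auth?
  denied: Set<string>,
): string | undefined {
  // First preferred model that resolves under this auth and is not denied.
  for (const id of MINI_PREFERENCES[provider] ?? []) {
    if (resolvable(id) && !denied.has(id)) return id;
  }
  return undefined; // caller fails fast with an actionable message
}
```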

Dependency Changes

  • Pi SDK uplifted to 0.70.2 — @mariozechner/pi-{coding-agent,agent-core,ai} bumped across root, pi-agent-server, server-core, and shared. The 0.66.1 → 0.70.0 step replaced the removed codingTools export with a createCodingTools(cwd) factory and reshaped CreateAgentSessionOptions.tools (see the tool-registration fix above). The 0.70.0 → 0.70.2 step is a patch-level bump with no API surface changes exercised by our code. (See the GPT-5.5 Features entry above for the default-model change enabled by the 0.70.0 catalog.)
  • Claude Agent SDK pinned to exact 0.2.111 — The earlier 0.2.111 → 0.2.119 uplift broke Claude-backed sessions with "Claude Code SDK not found". Starting at 0.2.113 the SDK replaced its JS cli.js entry point with platform-scoped native binaries (@anthropic-ai/claude-agent-sdk-{platform}-{arch}), which our runtime resolver, build scripts, CI verify-bundles steps, and --preload-based unified network interceptor all assume does not exist. Pinned back to the last JS-based release and tightened peer-dep ranges from ^0.2.19 / >=0.2.19 to an exact 0.2.111 so the next reshape cannot land as a silent patch bump. Full native-binary migration (interceptor rehoming) is tracked separately.

Breaking Changes

  • None. source_test is backward compatible; autoEnable defaults to true but the previous validation output is a strict subset of the new output. Pass autoEnable: false to reproduce the old behavior exactly. spawn_session gains an optional field — existing callers are unaffected.
