@oh-my-pi/pi-ai
Fixed
-
Fixed OpenAI Responses/Realtime SSE stream handler crashing with "Error Code undefined: undefined" when parsing error events with nested error details by falling back to the nested error object fields.
-
Fixed OpenAI-compatible providers that reject forced
tool_choiceon thinking-required models by downgrading unsupported forced choices toautowhile keeping tools available (#2546). -
Fixed GitHub Copilot Anthropic transport (
api.githubcopilot.com/v1/messages) returning400 tools.0.custom.eager_input_streaming: Extra inputs are not permittedon every tool-bearing turn by stopping the emission of the per-tooleager_input_streamingflag and thefine-grained-tool-streaming-2025-05-14beta header on the Copilot transport — the proxy whitelists neither (#2558). -
Disabled Bun's native ~300s pre-response
fetchtimeout in every streaming provider (OpenAI completions/responses, Azure responses, Anthropic, Codex SSE, Bedrock, Gemini CLI, Ollama). The configurable first-event/idle/SDK watchdogs (PI_STREAM_FIRST_EVENT_TIMEOUT_MS,PI_OPENAI_STREAM_IDLE_TIMEOUT_MS,compat.streamIdleTimeoutMs) were silently capped by Bun's hidden ceiling, so cold large-context streams (e.g. self-hosted vLLM at multi-hundred-K prompts) died at exactly 300s withTimeoutError: The operation timed out.Direct callers of./providers/{amazon-bedrock,google-gemini-cli,ollama,openai-codex-responses}(which bypassregister-builtins' iterator-level watchdog) now install a pre-responseAbortSignal.timeout(firstEventTimeoutMs)alongside the disable, so a stalled upstream still fails within the configured budget instead of hanging forever (#2422) -
Fixed Gemini / Antigravity streams (Google Cloud Code Assist API) creating a trailing empty text block and emitting redundant
text_start/text_delta/text_endevents at the end of the turn when the final SSE chunk contains an empty text part (text: ""). The parser now ignores empty text parts, preserving the active transcript block state and ensuring proper nesting and rendering of subsequent background jobs or new turns. -
Preserved terminal Google
thoughtSignatures by still extracting and applying the signature on the active block even when the text part is empty or undefined. -
Stopped Gemini Antigravity sessions (
gemini-3*/ Claude under Cloud Code Assist) from leaking system rule reminders and personality preambles into the final response, by appending an explicit 'do not output rule checks' instruction to the injected system parts. -
Fixed Gemini / Antigravity streams (Google Cloud Code Assist API) letting a
functionCallpart's ownthoughtSignatureclobber the preceding text or thinking block's signature onthink → toolandtext → toolturns. A signed function-call part hastext: undefined, so it fell into the terminal-signature branch while the prior block was still active; that branch now skips function-call parts, leaving the tool call's signature on the tool call where it belongs and preventing corrupted signatures on same-model replay. -
Fixed MiniMax-M3 OpenAI-compatible streams rendering reasoning twice when the same chunk carried both
<think>…</think>content and structuredreasoning_content; structured reasoning now wins and cumulative MiniMax reasoning snapshots are collapsed to deltas using a per-signature snapshot tracker that survives the</think>-to-text block transition (so post-answer cumulative snapshots don't reinstate a duplicate thinking block). (#2433)
@oh-my-pi/pi-coding-agent
Breaking Changes
- Replaced the
omp setup sttcommand withomp setup speech. The oldsttsetup component is gone (no alias);omp setup speechnow provisions the full speech stack — audio recorder, speech-to-text model, and text-to-speech model. - Renamed the
tts.enabledsetting tospeechgen.enabled(same boolean, default off; no alias). It still gates the on-demandttsspeech-generation tool, now labelled "Speech Generation" in the settings panel.
Added
-
Added
paste.largeMenuThresholdsetting (0/100/250/500/1000, default 100) to control when large pasted content triggers the large-paste menu or stays as a normal[Paste]marker -
Added a large-paste editor menu for pasted text over the threshold that lets users choose to wrap the paste in a fenced code block, wrap it in
<pasted_text>XML tags, or save it aslocal://attachment-Nfor on-demand reading -
Added
snapcompact-savings.jsonljournaling for snapcompact tool-result compaction, recording session, provider, model, tool call, and estimated token savings whenever tool output is rendered as image frames -
Added
subagent:<id>loop-phase breadcrumbs around in-process subagent event dispatch and finalization so the TUI event-loop watchdog can attribute a main-thread stall to subagent execution (#2485) -
highlightMagicKeywords(text, resetTo?, phase?)now accepts an optionalphase∈ [0, 1) that rotates the gradient cyclically; sent bubbles omit it (static palette unchanged).hasMagicKeyword(text)exported frommodes/magic-keywordsis the cheap shimmer-gate the editor uses on every render. -
Added a
fastModeScopesetting (both|openai|claude, defaultboth) controlling which providers/fast on(and the fast-mode toggle) target.bothkeeps the prior unscoped priority behavior;openai/claudescope fast mode to one family./fast statusnow reports the active scope. -
Added the
mnemopi.embeddingVariantsetting (en|multilingual) selecting a stronger SOTA local embedding model —en→BAAI/bge-base-en-v1.5(768d),multilingual→intfloat/multilingual-e5-large(1024d). Resolution precedence ismnemopi.embeddingModelsetting >MNEMOPI_EMBEDDING_MODELenv > variant default, so the documented env override is still honored. Changing the active model wipes and rebuilds stored embeddings on the next writable start (#2476) -
Added a
/guided-goalslash command that interviews you to refine an objective before enabling goal mode, then seeds goal mode with the agreed objective. The bounded interview (up to six turns) runs on the plan or slow model and falls back with a hint when the goal is still too vague (#2502). -
Added a large-paste menu: when a paste reaches
paste.largeMenuThresholdlines (default 100;0disables), the editor offers to wrap it in a code block, wrap it in<pasted_text>XML tags (both collapse to a[Paste]marker that expands on submit), or save it to the session'slocal://store and insert a cleanlocal://attachment-Nreference the agent canreadon demand. Esc keeps the previous inline-paste behavior, so the content is never lost. -
Added
8on22-bw(leading) and11on16-bw(tracking) options to thesnapcompact.shapesetting, the spacing-tuned cells that are now the per-provider defaults (Anthropic → tracking, OpenAI/Google → leading) -
Added a local on-device neural TTS backend for the
ttstool and aproviders.ttsswitch (auto|local|xai, defaultauto).localsynthesizes speech with Kokoro-82M — SoTA on-device TTS quality — viakokoro-json the shared ONNX runtime (@huggingface/transformers+onnxruntime-node) in a subprocess worker (mirroring the tiny-model worker), keeping the model warm across calls and emitting 24 kHz WAV/PCM16 with no network call;xaikeeps the existing Grok Voice cloud path;autoprefers local but routes.mp3requests to xAI when credentials exist (no local MP3 encoder is bundled, so a local.mp3request is written as a sibling.wav).kokoro-jsis never a hard dependency: it is lazilybun installed into a version-keyed runtime dir on first use (withonnxruntime-nodeforce-pinned to a Bun-safe version), so its transformers@3.x graph never pollutes the main tree. Newtts.localModel(defaultkokoro) andtts.localVoice(defaultaf_heart; American/British, female/male voices) settings select the on-device voice. -
Added a unified, interactive
omp setup speechcommand that walks one reusable flow across all three speech dependencies: it lets you pick and persist the speech-to-text (stt.modelName) and text-to-speech (tts.localModel) models from a TUI list, then downloads both models plus an audio recorder with live progress. The recorder is now auto-provisioned cross-platform (a staticffmpegbinary is fetched via the shared tools-manager when no SoX/FFmpeg/arecord is present, with the PowerShell fallback on Windows) instead of dead-ending with "install sox manually".--checkand--jsonreport recorder + STT-model + TTS-model readiness without installing. -
Added
omp say <text>, which synthesizes text with the local on-device TTS engine and plays it through the speakers (cross-platform:afplayon macOS,paplay/aplay/bundledffmpegon Linux, PowerShellMedia.SoundPlayeron Windows).--out <file>writes a WAV instead of playing,--voice/--modeloverride thetts.localVoice/tts.localModelsettings, and an uninstalled model prints an actionableomp setup speechhint. -
Added streaming speech vocalization: with
speech.enabledon, the assistant speaks its reply through the speakers as it streams. Assistant text deltas are fed directly into the engine's incremental text input (Kokoro'sTextSplitterStreamvia the worker) as they arrive, rather than pre-chunked in JS and synthesized one batch call per sentence — the engine owns sentence segmentation and emits one audio chunk per sentence. A single persistent player (StreamingAudioPlayer) drains those chunks gaplessly (raw 32-bit-float PCM piped to oneffmpeg→PulseAudio/ALSA process on Linux; interruptible per-fileafplay/PowerShellSoundPlayeron macOS/Windows), replacing the spawn-a-player-per-sentence path that added latency and audible gaps. Overspeech is handled end to end: a new turn, a sent message, or an Esc/Ctrl+C interrupt stops playback instantly (the player process is killed rather than letting the current sentence finish); holding the push-to-talk key ducks the volume while you speak and restores it when you stop; and sequential utterances queue and drain in order instead of overlapping.speech.mode(all|assistant|yield, defaultassistant) picks what is spoken —alladds thinking,yieldspeaks only the final message at turn end — andspeech.voiceselects the Kokoro voice.ask-tool questions are spoken in every mode. Synthesis reuses the local Kokoro engine (tts.localModel) through a new streaming synthesis path (TtsClient.synthesizeStream) that pushes text in and streams audio chunks back over the worker protocol. -
Added live (streaming) speech-to-text: with
stt.enabledon, transcription now appears in the composer as you speak instead of all at once after you stop. The recorder streams raw 16 kHz mono PCM from sox/ffmpeg/arecord stdout to the warm STT worker, where an energy-based endpointer (no extra model) splits speech into segments at natural pauses; each finalized segment is committed into the editor while the in-progress segment shows a live volatile preview that refreshes in place and is kept out of the undo history. Works with both the default Parakeet (sherpa-onnx) and the Whisper (transformers.js) tiers. Recorders that cannot stream to a pipe (the Windows PowerShell mci fallback) transparently fall back to single-shot transcription. -
Added
skills.enableAgentsUserandskills.enableAgentsProjectsettings (default on) so the canonical OMP-native~/.agent[s]/skillsand project-walkup.agent[s]/skillsare configurable independently from the third-party Claude/Codex/Pi toggles. -
Added a read-only
ctx.modelsfacade for extensions:list()(authenticated models),current()(live session model),resolve(spec)(a model string or role alias →Model, using the same settings-backed aliases and match preferences as core selection), andfamily(model)(opaque canonical-identity lineage token for cross-family comparisons). Lets extension tools select models the way core does without reaching into the mutable registry (#2406) -
Added RPC prompt lifecycle hints so hosts can distinguish scheduled agent turns from local-only slash commands via
data.agentInvokedandprompt_result. -
Added extension lifecycle events for tool approval prompts:
tool_approval_requestedbefore the approval wait andtool_approval_resolvedafter approve, deny, or approval prompt failure.
Changed
-
Changed
handoffcustom messages (customType: "handoff") to render in the transcript as a compaction-style expandable divider in both the main session and Agent Hub views, and expanded handoff details now show the handoff context body without<handoff-context>tags -
Changed the double-tap-← gesture (empty editor, main session) to stay inert when there are no subagents to show, instead of opening an empty Agent Hub roster. The explicit Agent Hub / observe keybindings still open the empty roster. The gating reuses the hub's own row count (after its persisted-subagent scan), so it matches exactly what the hub would display.
-
Changed the
jobtool'sasync.pollWaitDurationsetting (relabeled Max Poll Time) to add asmartvalue, now the default. A fixed value (5s–5m) still blocks for exactly that long;smartadapts: a blocking poll starts at a 5s floor and climbs a ladder (5s → 10s → 30s → 1m → 5m) with each back-to-back poll, so a tight poll loop backs off and stops spending turns on "still running" frames, then resets to the 5s floor after ~1 minute without polling (i.e. when the agent steps away to do real work). Escalation is tracked per agent (owner-scoped onAsyncJobManager). -
Added the
compat.supportsForcedToolChoicecustom-model flag for OpenAI-compatible models whose endpoints accept tools but reject forcedtool_choicevalues (#2546). -
Changed speech-to-text to run fully local on-device with a tiered, multi-engine model picker. Transcription runs in a subprocess worker (mirroring the tiny-model worker; the native ONNX addons are hard-killed on shutdown to dodge the Bun NAPI-finalizer segfault) instead of shelling out to Python
openai-whisper, keeps the model warm across recordings, and decodes WAV to 16 kHz mono float32 in-process.stt.modelNamenow selects on-device tiers across two engines:parakeet(default) — NVIDIA Parakeet TDT 0.6B v3 (25 languages) via the nativesherpa-onnx-node, the Open ASR Leaderboard accuracy + throughput leader (lower WER than, and ~20× faster decoding than, Whisper large-v3) — plusfast/balanced/turbomapping to Whisper base/small/large-v3-turbo (multilingual, up to 99 languages) via@huggingface/transformers.omp setup speechno longer mentions pip/python-whisper and reports recorder + model-cache readiness. -
Changed the speech-to-text trigger from the
Alt+Hkeybinding to a hold-Spacepush-to-talk gesture. Holding the space bar emits an OS auto-repeat burst; once more than 5 spaces land in the editor it recognizes the hold, deletes (tracks back) those inserted spaces, and starts recording, then stops and transcribes when the repeats stop (the space bar is released).app.stt.toggleis now unbound by default but can be rebound to a chord for press-to-toggle; the gesture is gated onstt.enabled, andShift+Spacestill inserts a literal space. -
task.eager("Prefer Task Delegation") andtodo.eager("Create Todos Automatically") are now three-level enums (default/preferred/always) instead of booleans. Fortodo.eager,preferredrenders a soft first-message reminder whilealwaysforces thetodotool (the previous "on" behavior); fortask.eager,preferredadds a soft (SHOULD) delegation nudge to the system prompt whilealwaysuses hard (MUST/ONLY) wording plus a first-turn delegation reminder. Existing boolean configs migrate automatically (true → always,false → default). On models that cannot be forced to calltodo,todo.eager: "always"now emits the first-turn reminder without forcing the call (previously such models received nothing) (#2539, #2540 by @metaphorics). -
Fixed
/model-switching to a non-default OpenRouter model returning404 No route: POST /chat/completionswhen the provider is routed through the auth-gateway broker. The background catalog refresh re-ranmergeDiscoveredModelon every openrouter entry; for models whose bundled record already existed, the merge re-appliedbaseUrl,headers, andcompatbut droppedtransport: pi-nativebecause the raw/v1/modelspayload carries no transport hint. The next/modelswitch then picked the now-transport-less entry and routed through the default openai-completions client to${baseUrl}/chat/completions— a path the auth-gateway never serves.mergeDiscoveredModelnow propagates the override/existing transport on the rediscovery branch (#2555).
Fixed
-
Fixed npm plugin installs to reject packages whose declared extension entry points cannot load because imports or nested dependencies are unresolved (#2312).
-
Fixed the deferred MCP discovery banner (
Connecting to MCP servers: …) overdrawing the chat input bar.onMCPConnectingwrote the banner straight toprocess.stderrwhile the TUI owned the terminal; it now emits on anmcp:connectingevent channel thatInteractiveModerenders throughshowStatus(the status container), mirroring the existing LSP-startup pattern so the banner can never paint over the input box border (#2483) -
Fixed Win+Shift+S screenshot paste on Windows dead-ending on
Image not found: Windows Terminal forwards a bracketed paste of the Snipping Tool's transient…\MicrosoftWindows.Client.Core_*\TempState\…file path, which is already gone (or never materialized) by the time omp reads it, sohandleImagePathPastefailed with ENOENT even though the screenshot bitmap was still on the clipboard. The handler now falls back to the clipboard image across every local read failure (missing file, undecodable file, or generic error) before degrading, and the fallback is skipped over SSH where the clipboard lives on the remote host rather than the terminal holding the screenshot. -
Fixed npm prebuilt extension compatibility shims deriving their own package root through bare
@oh-my-pi/pi-coding-agentresolution, which could select an older Bun cache copy in global installs and reintroduce mixed-runtime plugin loading stack overflows. -
Honor the
context_lengthreported by OpenAI-compatible/v1/modelsdiscovery (discovery: { type: "proxy" }anddiscovery: { type: "openai-models-list" }) when present, so aggregator-reported windows override the stale bundled reference; the value is validated through the positive-number guard so a0/negative/stringly-typed upstream value cleanly falls back to the bundled reference (then128000) instead of pinning a broken window (#2466 by @androw). -
Fixed
web_searchSearXNG fallback when HTTP 200 responses contain no usable results plusunresponsive_engines; SearXNG now raises a transient provider error, and the fallback loop rejects any provider response with no renderable content before formatting an invisible success (#2571). -
Fixed
readon a GitHub commit URL (github.com/<owner>/<repo>/commit/<sha>) returning the raw commit HTML page instead of structured content.parseGitHubUrlhad nocommitcase, so commit URLs fell through to generic HTML rendering; they now resolve via the commits API and render as markdown (subject, author, stats, parents, full commit message, and a per-file unified diff), matching the existing blob/tree/issue/PR handling. -
Fixed release runs being silently cancelled by a later
mainpush, which left tagged versions (v15.12.6in the wild) without a GitHub Release or npm publish. The CI workflow'sconcurrencygroup was${{ github.workflow }}-${{ github.ref }}, and since the release-script commit +v*tag are pushed atomically torefs/heads/main, the release run shared theCI-refs/heads/maingroup with every subsequent push;cancel-in-progress: truethen killed it beforerelease_binary/release_github/release_npmcould run, and no future run carried the release tag at HEAD. The group now resolves to a per-sharelease-<sha>slot withcancel-in-progress: falsewhenever the push subject matcheschore: bump version to(the release-script convention) orgithub.refis av*tag (workflow_dispatchrecovery), so release runs are isolated from PR/main churn (#2564). -
Allowed
compat.streamIdleTimeoutMs: 0inmodels.yml. The schema was.positive(), so the documented "set to 0 to disable" escape hatch was only reachable via the global env var (#2422) -
Fixed LaTeX math delimiters (
$/$$) and commands (such as\text,\times) rendering raw in the terminal by instructing the model to write equations as plain text / Unicode in its replies. The instruction is scoped to conversational output, so it does not constrain LaTeX or Markdown/KaTeX content the agent is asked to write to files (#2550 by @usr-bin-roygbiv). -
Fixed the MCP stdio transport
close()potentially hanging while awaiting a read loop that can block indefinitely; teardown now detaches the read loop instead of awaiting it (#2550 by @usr-bin-roygbiv). -
Fixed per-project memory isolation pulling other projects' memories into recall. Legacy (pre-#2412) mnemopi banks were rescued into recall when any working-memory row tagged the active cwd, but recall reads a bank wholesale and cannot filter rows by cwd, so a mixed-cwd bank leaked unrelated projects' memories. A legacy bank is now rescued only when every row tags the active cwd (#2412).
-
Fixed a prompt template whose name collides with a builtin slash-command alias (e.g. a
modelstemplate beside the builtin/model, which owns themodelsalias) appearing as a duplicate entry in the slash-command autocomplete picker. The picker's reserved-name filter now also excludes command aliases, matching runtime resolution (slash commands expand before prompt templates, so the template was already unreachable). -
Fixed the tool-result renderer re-shaping on every
invalidate()(spinner tick, stream chunk, resize, keystroke), which made large grep/find/read results block the main thread for seconds and made typing sluggish.ToolExecutionComponent.#updateDisplay()now memoizes on a dirty key (result version, expand state, partial flag, spinner frame, image visibility, theme epoch, background-task freeze state, the resolved terminal image protocol, and a display-input version that covers streamed call args, the async edit-diff preview, and Kitty image conversions) and a#displayBuiltguard that also fast-paths the#contentTextfallback, so the O(result-size) shaping runs once per change instead of every frame without freezing streamed args, previews, converted images, a backgrounded task settling to its static form, or images that arrive before the async image-protocol probe resolves. Image-bearing results also re-shape on terminal resize (keyed on the resolved image dimensions only when images are present) so inline images rescale, while image-free results never re-shape on resize (#2484) -
Fixed
setTheme()not bumping the theme epoch on its invalid-theme fallback path: a failed theme load swaps the active theme to the dark fallback, so memoized renderers (the tool-result renderer above) must re-shape — previously they kept the failed theme's stale colors until some other state changed (#2484) -
Fixed tool-call spinners animating out of phase across parallel tool calls — each live tool block advanced its glyph from its own per-instance start time, so concurrent spinners showed different frames. Glyphs now derive from a single shared monotonic clock (
sharedSpinnerFrame), keeping every live block in lockstep. -
Fixed the per-turn token-usage row (
display.showTokenUsage) churning and duplicating in scrollback, most visibly with parallel tool calls. The row was rendered inside the assistant block above the turn's tool blocks, so finalizing the block was deferred and late appends recommitted the already-committed tool rows. The assistant block now always finalizes as soon as a tool-call appears, and the usage row is emitted as a standalone finalized block below the turn's tool blocks across all three render paths (live, transcript rebuild, agent-hub). -
Fixed the editor input box claiming a disproportionate share of small terminals (<=18 rows): the editor max-height floor (6 rows) ignored the available space. The height now yields to terminal size (
EDITOR_MIN_CHROME_ROWS), reserving rows for the transcript and status line whenever the terminal can host both; on terminals too small for both, the editor collapses to its real bordered minimum instead of overshooting a fictitious cap. -
Fixed
ultrathink/orchestrate/workflowzkeywords not glowing while typing — the editor appended a CURSOR_MARKER (ESC-prefixed) after each render, so the magic-keyword regex's right-boundary(?!\S)tripped on ESC and dropped the gradient until a trailing character was typed. The marker-aware editor decorate (packages/tui) plus a phase-awarehighlightMagicKeywordsoverload restore the live glow and add a Claude-Code-style shimmer while the prompt is focused, gated onmagicKeywords.enabled(#2475). -
Fixed the compaction flow (
/compactand plan-mode "Approve and compact context") leaving UI artifacts.executeCompactionadded aSpacer(1)to the transcript that the sibling handoff path never adds and that leaked as an orphan blank line whenever compaction was cancelled or failed; that spacer is removed. On success the compaction loader is now stopped and the status container cleared before the transcript is rebuilt, so the live loader row no longer flickers over the reconciled transcript near the status seam (the idempotentfinallystill covers the cancel/fail paths) (#2486) -
Fixed plan approval's "Approve and compact context" running the compaction summarizer on the pre-plan model instead of the plan model, cold-missing the plan model's prompt cache. Compaction now runs on the plan model (warm cache); the switch to the execution/pre-plan model happens only after a successful compaction and before any input queued during compaction is dispatched, so the queued turn runs on the post-compaction model. A cancelled compaction now also restores the pre-plan model (it previously stranded the session on the plan model), while a failed compaction stays on the plan model with its context intact.
-
Fixed
Alt+Up(dequeue) reporting "No queued messages to restore" for messages — including skills — typed while the session was compacting.restoreQueuedMessagesToEditornow drainscompactionQueuedMessagesalongside the agent queue, so theAlt+Up to edithint restores every pending message it advertises. -
Fixed
restoreQueuedMessagesToEditor(Alt+Up dequeue and Esc-abort) producing colliding[Image #N]markers when the editor draft already held pending image(s): queued text was prepended but queued images were appended, so positional marker → image lookup at submit time resolved to the wrong image. Each queued message's image markers are now renumbered by the running pending-image count before merge so the combined text stays aligned with the mergedpendingImagesorder (#2531). -
Fixed
AgentBusyError("Agent is already processing. Use steer() or followUp()...") surfacing on mode transitions — asFailed to finalize approved plan: ...when a plan was approved while the agent was still streaming the post-resolvecontinuation (or a turn started by the approve-time compaction/clear), and as an error toast when a loop auto-submit or goal continuation fired during a streaming/compaction race. Plan approval now aborts any in-flight turn before dispatching the executor's first prompt, andsubmitInteractiveInputroutes streaming-time loop, goal-continuation, and manual submissions through the follow-up queue (streamingBehavior: "followUp") instead of throwing (synthetic continue-shortcuts stay developer-attributed and keep their prior behavior). Extends the manual-/goalfix in #2454 to the continuation and plan-approval paths. -
Fixed
/plancycling betweenplanandplan_pausedwith no path back to modenone.handlePlanModeCommandhad branches for entering and pausing but fell through to#enterPlanMode()when invoked from the paused state, so once a session entered plan mode the only operator-visible toggle re-entered it. The handler now matchesplanModePausedand fully exits — clearingplanModeHasEnteredand appending amode_changeto"none"— so/goal(and any other mode gated onplanModeEnabled || planModePaused) can run again after a third/plan(#2510). -
Fixed HTML session export rendering empty text tokens (
text,userMessageText,customMessageText,toolTitle) as the dark-theme grey#e5e5e7on every theme not literally namedlight, making transcripts illegible on custom light themes likesandstone,limestone, andporcelain.getResolvedThemeColorsand the standaloneisLightThemehelper now classify against the resolvedstatusLineBgluminance (the same surfaceTheme.isLightuses), so the HTMLdefaultTextfalls back to#000000on light themes and the standalone helper stays in lockstep withTheme.isLight(#2516). -
Fixed the Agent Hub opening on its own from a stray mouse click. The double-tap-← gesture (empty editor) fired on any two
leftkeys within 500ms, but terminals with "click to move cursor" / pointer features (iTerm2 option-click, WezTerm, kitty, tmux) synthesize a burst of arrow keys on click — delivered sub-millisecond apart in one stdin read — so a single click could pop the hub with no key ever pressed. The gesture now requires the second tap to land a human-plausible interval after the first (≥40ms, <500ms) and ignores any third-or-later rapid tap, so synthesized bursts are rejected while a deliberate double-tap still works. The same hardening applies to the focused-subagent ←← "return to main" gesture. -
Fixed the
ctrl+pmodel-role cycle indicator (thedefault / gpt / fable / …chip track) stacking duplicate copies in the scrollback when other chat activity landed between two cycles. The track was emitted throughshowStatus, whose back-to-back coalescing only merges when the previous status is still the last transcript child; any interleaved append broke that identity check and appended a second track. It now renders into a dedicated anchored container above the editor (cleared and rebuilt in place each cycle, like the Todos HUD) and auto-clears after a short linger, so rapid presses or concurrent activity can never duplicate it. -
Fixed the
Working…loader vanishing for the rest of a turn after an auto-compaction (context-overflow recovery) or auto-retry. Those overlays took over the shared status container with a barestatusContainer.clear(), which detached the working loader but leftloadingAnimationset; the resumed turn'sagent_start→ensureLoadingAnimation()is guarded byif (!this.loadingAnimation), so it skipped re-attaching the loader and the spinner stayed gone while the agent kept streaming. The overlay handlers now fully tear the working loader down (stop + dereference) via#stopWorkingLoader(), so the nextagent_startrecreates and re-attaches it. -
Fixed JS eval helper optional arguments rejecting Python-style positional calls.
read(path, offset, limit)now works alongsideread(path, { offset, limit }),null/undefinedskip optional positional slots, and non-local URI reads such asartifact://...delegate through the read tool so line slicing works on spilled artifacts. -
Fixed legacy extension compatibility remapping for
@*-pi-ai/utils/oauthimports so background workers load the relocated@oh-my-pi/pi-ai/oauthexports instead of resolving missingsrc/utils/oauth/*files (#2566). -
Fixed eager todo initialization prompting GPT-5.5 to emit unsupported task metadata fields, which could leave fresh sessions stuck on the forced first
todocall (#2561). -
Fixed Windows plan-mode task fan-out crashing the TUI when nested async task progress formed a cycle; task rendering now cuts recursive snapshots and long Windows
local://roots are shortened under temp storage (#2551). -
Fixed a band of streaming assistant output being lost from native scrollback — committed nowhere, repainted nowhere — once a reply grew taller than the viewport. Markdown whose layout keeps changing above the streaming tail (most visibly a table whose columns re-align as rows arrive) never earns a byte-stable commit-safe end, so as its head scrolled above the window the rows fell into the gap between the commit boundary and the window top and vanished.
TranscriptContainernow reports agetNativeScrollbackSnapshotSafeEnd()for commit-stable live blocks (their whole body is durable content), and the renderer commits those scrolled-off rows audit-exempt — a later layout change of an already-committed row freezes a slightly-stale row in scrollback (duplication never loss) instead of dropping it. Provisional blocks (collapsing tool/edit previews) are unaffected. -
Fixed unknown
---prefixed flags being silently consumed as prompt text, which let a stale or typoed flag start a real agent session (connecting to MCP servers, waiting on the model) instead of failing fast.parseArgsnow tracks unrecognized flag-shaped tokens andrunRootCommandcallsreportUnrecognizedFlagsimmediately after the post-extension reparse, exiting2withError: unknown flag: --…before any session, MCP, or initial-message work runs. Extension-registered flags still pass cleanly since the validation runs after the extension-aware reparse, and--is honored as a POSIX positional separator so flag-shaped prompts (omp -p -- --explain-this) survive the new guard (#2459). -
Fixed
/goal <objective>and/goal set <objective>during streaming so goal context is steered immediately but objective submission waits for the active turn to finish instead of spammingAgentBusyError. The interactive goal-continuation timer is now streaming-aware too: if a turn starts inside the 800 ms idle window the timer was scheduled in, it drops the tick instead of submitting a stalegoal-continuationthat would resurface the sameAgentBusyError; the nextagent_endreschedules (#2454). -
Fixed
~/.agent[s]/skillsnot appearing as/skill:<name>commands when every named source toggle (skills.enableCodexUser,skills.enableClaudeUser,skills.enableClaudeProject,skills.enablePiUser,skills.enablePiProject) was off:loadSkillsgated theagentsprovider onanyBuiltInSkillSourceEnabled, so a user who turned off the Claude/Codex/Pi sources to clean noise also lost their own canonical OMP-native skills. Theagentsprovider now reads the dedicatedenableAgentsUser/enableAgentsProjecttoggles, and the unknown-third-party fall-through gate is restricted to the named third-party toggles so keeping the default agents toggles on no longer silently re-enablesopencode/github/claude-plugins/geminiskill sources (#2401). -
Fixed Claude Code marketplace plugin skills installed under
skills/<name>/SKILL.mdto also appear as bare slash commands such as/understand, matching Claude-native plugin docs. The slash command name is taken from the skill directory basename so display-style frontmatter names likename: Understand Anythingstill resolve to/understand(#2415). -
Fixed ACP
/movebuiltin test expectations to compare the resolved destination path so the test is portable on Windows and Unix (#2381 by @oldschoola). -
Fixed vLLM discovery so
providers.vllm.baseUrldrives the built-in endpoint, additional OpenAI-compatible vLLM provider IDs work throughopenai-models-list, and discoveredmax_model_lenor fallbackcontext_lengthvalues set context windows instead of falling back to 128k. -
Fixed extension discovery ignoring package directories symlinked into an
extensions/directory.
Removed
- Removed the Python
openai-whisperdependency andpipinstall path from speech-to-text — the bundledtranscribe.pyand all Python/whisper probes inomp setup speechare gone; the recorder (SoX/FFmpeg/arecord) remains the only external tool. - Changed the speech-to-text trigger from the
Alt+Hkeybinding to a hold-Spacepush-to-talk gesture. Holding the space bar emits an OS auto-repeat burst; once more than 10 spaces land in the editor it recognizes the hold, deletes (tracks back) those inserted spaces, and starts recording, then stops and transcribes when the repeats stop (the space bar is released).app.stt.toggleis now unbound by default but can be rebound to a chord for press-to-toggle; the gesture is gated onstt.enabled, andShift+Spacestill inserts a literal space. - Added an experimental, opt-in auto-learn loop (default off,
autolearn.enabled). When enabled, after the agent stops it is nudged to capture reusable lessons: durable facts go to long-term memory and repeatable procedures become managed skills —SKILL.mdfiles written to an isolated~/.omp/agent/managed-skillsdirectory that is discovered and surfaced like authored skills but never overwrites user-authored skills (authored names always win). Two tools back this:manage_skill(create/update/delete managed skills) andlearn(record a lesson, optionally minting a managed skill in the same call).learnworks with thehindsight,mnemopi, or file-basedlocalmemory backend; underlocal, lessons append to alearned.mdin the project's memory root (kept separate from the consolidation artifacts so they survive a consolidation pass) and are injected into future sessions. The nudge is passive by default (a hidden reminder rides the next turn);autolearn.autoContinueinstead auto-runs one capture turn at stop, andautolearn.minToolCalls(default 5) gates trivial turns. Plan/goal-mode turns and subagents are never nudged. - Fixed
/plancycling betweenplanandplan_pausedwith no path back to modenone, while preserving prompted paused-mode requests. The no-arg third toggle now fully exits — clearingplanModeHasEnteredand appending amode_changeto"none"— and/plan <prompt>fromplan_pausedre-enters plan mode and submits the prompt as the first turn (#2510). - Fixed HTML session export rendering empty text tokens (
text,userMessageText,customMessageText,toolTitle) as the dark-theme grey#e5e5e7on every theme not literally namedlight, making transcripts illegible on custom light themes likesandstone,limestone, andporcelain.isLightThemenow classifies against the resolvedstatusLineBgluminance (the same surfaceTheme.isLightuses), while HTMLdefaultTextcontrasts the actual export surface (export.cardBg/export.pageBg/ deriveduserMessageBg) so light-status themes with dark export cards keep light transcript text (#2516).
What's Changed
- fix(coding-agent): restore compaction-queued messages on Alt+Up dequeue by @metaphorics in #2530
- fix(coding-agent): renumber image markers when restoring queued messages by @roboomp in #2533
- feat(task): inline
rolespecialization for spawned subagents by @metaphorics in #2478 - feat(task): advertise
roleand tailored delegation in task prompts by @metaphorics in #2479 - feat(task): nudge generic spawns toward specialization by @metaphorics in #2480
- feat(irc): work-aware peer roster (role + current activity) by @metaphorics in #2481
- feat(tui): add event-loop watchdog and phase breadcrumbs for stall diagnosis by @metaphorics in #2488
- fix(coding-agent): memoize tool-result rendering to stop per-frame re-shape stalls by @metaphorics in #2489
- fix(coding-agent): clean up compaction UI lifecycle by @metaphorics in #2490
- feat(coding-agent): add guided-goal setup interview by @metaphorics in #2502
- feat(tui): support ctrl+j as an additional newline key by @metaphorics in #2504
- fix(magic-keywords): glow keywords through the cursor seam and animate a focused shimmer by @roboomp in #2506
- feat(mnemopi): add embedding variant setting (English / multilingual) by @metaphorics in #2508
- fix(native): handle Windows worker pressure and OpenCode MiMo tool choice by @roboomp in #2511
- fix(plan): /plan exits paused state back to mode 'none' by @roboomp in #2513
- fix(theme): classify HTML export defaults by status-line luminance by @roboomp in #2519
- fix(coding-agent): run plan-approval compaction on the plan model by @metaphorics in #2520
- fix(tui): clamp editor height and overlays on small terminals by @metaphorics in #2521
- feat(coding-agent): make /fast default scope configurable by @metaphorics in #2522
- fix: stop AgentBusyError on plan approval and loop/goal continuations by @metaphorics in #2524
- fix: phase-lock tool-call spinners across parallel calls by @metaphorics in #2526
- fix: emit per-turn token-usage row as a standalone block by @metaphorics in #2528
- fix(tui): detect tmux/screen via TERM prefix when TMUX env is stripped by @roboomp in #2545
- Fix resolve gate for OpenCode Go Kimi K2.7 Code by @roboomp in #2547
- fix(ai): disable Bun native stream fetch timeout by @roboomp in #2428
- fix(coding-agent): guard Windows task fan-out render cycles by @roboomp in #2552
- fix(anthropic): drop eager_input_streaming on github-copilot transport by @roboomp in #2559
- fix(natives): clean stale native cache versions by @roboomp in #2562
- fix(agent): correct eager todo init prompt by @roboomp in #2563
- fix(ci): scope release runs to per-sha concurrency group by @roboomp in #2565
- fix(coding-agent): remap legacy OAuth imports by @roboomp in #2568
- fix(cli): expose Claude plugin skills as slash commands by @roboomp in #2417
- fix(plan-mode): allow hashline plan edits by @roboomp in #2477
- fix(providers/google): avoid creating empty text blocks on stream ending by @usr-bin-roygbiv in #2538
- feat(coding-agent): three-level eagerness enum for task.eager and todo.eager by @metaphorics in #2540
- feat(coding-agent): opt-in experimental auto-learn (memory + isolated managed skills) by @metaphorics in #2542
- fix(providers/google): avoid outputting system rules and preambles in Antigravity by @usr-bin-roygbiv in #2543
- fix(agent-loop): detect repetition loops during assistant stream and abort gracefully by @usr-bin-roygbiv in #2549
- fix(prompts): instruct models to avoid LaTeX math delimiters and commands by @usr-bin-roygbiv in #2550
- fix(coding-agent): keep
transport: pi-nativeon openrouter after catalog refresh (#2555) by @roboomp in #2556 - fix: handle nested error object in openai responses stream by @usr-bin-roygbiv in #2570
- fix(web-search): fall back on empty SearXNG results by @roboomp in #2572
- fix(natives): avoid linux addon load hang by @roboomp in #2554
- fix(mnemopi): rank exact memories above weak facts by @DarkPhilosophy in #2452
- fix(cli): reject unloadable npm plugin installs by @roboomp in #2313
- feat(coding-agent): emit tool approval lifecycle events by @381181295 in #2444
- fix(catalog): pin MiniMax-M3 context window by @roboomp in #2578
- fix(coding-agent): avoid placeholder crash before theme init by @camopy in #2581
New Contributors
- @381181295 made their first contribution in #2444
- @camopy made their first contribution in #2581
Full Changelog: v15.12.6...v15.13.0