can1357/oh-my-pi v16.3.2 on GitHub

@oh-my-pi/pi-ai

Changed

Removed automated injection of reasoning suppression prompts in OpenAI responses

@oh-my-pi/pi-catalog

Fixed

Fixed ZenMux model discovery to run without a ZENMUX_API_KEY, so newly published ZenMux models (for example anthropic/claude-fable-5-free) auto-update into the runtime models.db cache instead of waiting on a regenerated models.json.
Fixed ZenMux runtime discovery to query the /api/v1/models endpoint even when the resolved provider base URL points at the Anthropic-compatible route, so discovery no longer requests a non-existent /api/anthropic/models path.

Removed

Removed reasoning suppression prompt logic for GPT-5 models

@oh-my-pi/pi-coding-agent

Breaking Changes

Changed search tool paths parameter to a single semicolon-delimited path string parameter
Changed the grep, glob, and ast_grep tools to take a single optional path argument instead of a paths array. path accepts one path or a semicolon-delimited list (src; tests); omitting it searches the workspace root (.). Multi-path search, delimited expansion, and internal-URL scopes are unchanged. (ast_edit continues to take paths.)

Added

Added speech.enhanced setting to rewrite assistant output into natural spoken prose
Added speech.enhanced setting: assistant output is rewritten into natural spoken prose by the tiny/smol model before synthesis — code blocks become one-clause descriptions, links speak their label or site name, numbers and symbols read naturally, lists become flowing sentences. Blocks are rewritten fence-aware and coalesced (bounded to two concurrent completions); any failed or timed-out rewrite falls back to the mechanical cleanup so speech never blocks on the model.

Changed

Reduced extension startup cost, especially on Windows, by reading each extension source-graph module from disk once per load instead of twice (the graph scan now feeds the load-time rewrite hook) (#4196).
Redesigned speech vocalization for low latency and clean spoken content. Assistant markdown now runs through a speakable-text pipeline before synthesis: code blocks and tables are silent, links speak their label, bare URLs speak their host, inline-code ticks/emphasis/heading/bullet markers are stripped, and long file paths collapse to their basename. Segmentation is now parent-side and emits at sentence boundaries immediately (the previous engine-side splitter held each sentence until the next one arrived), with clause-level cuts for long sentences and an idle flush when generation stalls mid-sentence. macOS gains a gapless streaming playback backend (ffmpeg AudioToolbox, sox fallback) instead of spawning afplay per sentence.

Fixed

Fixed ALL-CAPS acronyms (e.g. CNPG, ETL, JWT) being lowered to title case in auto-generated session titles. reconcileTitleCasing (packages/coding-agent/src/tiny/text.ts) now maps ALL-CAPS source tokens into an acronyms table and restores them when the model produces a title-cased artifact (Cnpg), while still declining restoration on shouty input (FIX the BUG NOW, ALL ERROR HANDLING) via a consecutive-ALL-CAPS heuristic. Title prompts also instruct the model to preserve ALL-CAPS acronyms verbatim. (#4220)
Fixed cold-start --model resolution for extension providers whose catalogs come only from fetchDynamicModels, so fresh cached runtime models are available before session startup falls back or hard-fails. (#4216)
Fixed plugin and legacy extension discovery repeatedly re-reading plugin manifests and walking extension node_modules by caching results until plugin cache invalidation. (#4197)
Fixed discoverExtensionPaths invoking every registered extension-module provider (claude, codex, gemini, opencode) on startup and discarding all non-native results. The extension-module capability is now loaded with providers: ["native"], skipping four foreign directory walks per session — noticeable on Windows where the walks are slowest (#4198).
Fixed /move overlay running an fs.statSync per directory entry per keystroke; the directory listing cache now stores Dirent[] and classifies entries without a syscall, falling back to statSync only for symlink entries (#4199).
Fixed default model switches being persisted without changing the active goal-mode session when the current context exceeded the target model window. (#4219)
Fixed live tool preview spinners staying pinned to their first frame for eval and shell-style renderers. (#4170)
Fixed isolated task merges failing when the parent working tree carried WIP for a file the isolated subagent also touched. commitPatchToBranchWorktree now tries plain apply and git apply --3way first (agent-only outcome when the WIP-side blob is tracked in HEAD), then falls back to seeding the temp worktree with the baseline WIP so the delta patch's HEAD+WIP context matches, and rewinds WIP-only files afterward so they don't leak into the branch commit. Covers untracked WIP files, staged-new WIP files, and overlaps --3way cannot resolve. (#4136)
Fixed discoverAgents() skipping agents/ subdirectories inside OMP extension packages, so agents shipped by omp plugin install-ed npm plugins (e.g. loom) and --extension/extensions: settings roots now load the same way their sibling skills/, hooks/, tools/ directories already do. The new scan goes through listOmpExtensionRoots, so Claude marketplace installs continue to flow through the claude-plugins provider without being double-counted. (#3920)
Fixed plan mode hanging without converging on ask/resolve after advisor cards, idle IRC messages, or follow-on turns. Plan-mode decision enforcement ran on only the non-synthetic prompt() return; continuation/wake paths settled via agent_end and bypassed it. Advisor cards and idle IRC are now recorded into context without waking an autonomous turn, and the ask/resolve decision is enforced at the universal agent_end terminal settle via a bounded-retry counter (provider-neutral required, both tools kept available) that reminds-then-forces a fixed number of times and then yields to the user — never looping, never silently ending plan mode un-converged. An irc send await:true to an idle plan-mode session now answers the sender through the existing ephemeral side-channel auto-reply instead of stranding it until its wait timeout, and a queued forced plan decision is dropped when its continuation is skipped or plan mode exits. (#3910)
Reduced subagent streaming CPU cost: the recent-output window no longer re-splits the full (up to 8 KB) tail on every streamed text token. Fragments without a newline extend the current last line in place, and a full recompute runs only when line boundaries actually change.
Reduced task-render CPU cost: the task result frame (repainted ~30×/sec via the spinner) previously did 7+ full passes over the result set (some/filter/reduce); a single pass now derives the status booleans, footer counts, and request total, and incremental review extraction reuses the yield data the caller already normalized instead of re-normalizing it.
Reduced model-resolution cost: resolveModelRoleValue now builds the preference context (an O(n) model-order map over all available models) once and reuses it across every fallback pattern instead of rebuilding it per pattern, and matchModel hoists the case-folded pattern once instead of .toLowerCase()-ing it for every candidate across each filter pass.
Reduced read-tool allocation: line counting counts newlines directly instead of allocating via split("\n"), and the hashline formatter no longer counts the same content twice.
Fixed the assistant-message streaming fast path dropping the transient flag, which disabled the transient render path (code-highlight skip and streaming prefix caches) on every same-shape streaming tick. In-flight renders now correctly skip per-tick syntax highlighting; highlighting applies once at message finalization.
Fixed hidden goal-mode todo context: phase names and task text are now sanitized before prompt injection (no raw newlines or control characters forging extra context lines), and the block is only rendered with tool-accurate guidance when the todo tool is active or discoverable instead of unconditionally instructing the agent to call an unavailable tool.
Fixed custom tool loading treating process.exit() from a tool module's import or factory as a host process exit instead of a recoverable load failure. Custom tools now load under the shared extension exit guard, so an exiting tool is skipped with a load error while remaining tools still load (#1704).
Fixed stuttering/latency in speech by running synthesis chunks through the player gaplessly
Fixed race condition causing EPIPE errors and broken pipes during speech playback
Fixed interrupted speech audio by ensuring segments queue and drain in order
Fixed speech vocalization starting only after the entire reply was synthesized: ONNX inference blocks the TTS worker's event loop, so per-segment IPC audio chunks queued unflushed and arrived in one burst. Streaming sends now drain the IPC channel before the next segment's inference, cutting time-to-first-audio to ~1.5s regardless of reply length.
Fixed an unhandled EPIPE: broken pipe, write rejection at the end of speech playback: the streaming player's stop() raced an un-awaited FileSink.end() against the backend SIGKILL, and mid-session writes never awaited the flush. Writes now await the flush (so a dead backend is detected and the chunk replays on the next candidate or the per-file path) and stop() swallows the expected teardown rejection.

@oh-my-pi/collab-web

Changed

Updated the glob, grep, and ast_grep tool cards to read the new single path argument, falling back to the legacy paths array so historical transcripts still render their search scope.

@oh-my-pi/omp-stats

Added

Added a Tools tab to the omp stats dashboard (/#/tools): per-tool call counts, error rates, result/argument payload sizes, per-model breakdown, and a stacked calls-over-time chart. Token and cost columns attribute each invoking turn's real provider usage evenly across that turn's tool calls. Existing databases re-parse sessions once on the next sync to backfill historical tool calls.

@oh-my-pi/pi-utils

Fixed

Fixed parseJsonWithRepair failing tool calls whose streamed arguments contain an unquoted string value (e.g. {"paths": packages/foo/*, "i": "…"}). Final parsing now recovers such barewords in object/array value position as strings, terminating at , / } / ] / newline. Recovery deliberately refuses anything that could mask real structure or bad data — truncated values, tokens containing " / { / [ or a key-like : (URL :// and Windows :\ colons stay literal), and non-finite atoms (NaN, Infinity, undefined) — and streaming partial parses still roll back unfinished barewords instead of committing them.

What's Changed

Fix todo HUD and goal context follow-ups by @jeffscottward in #3777
perf: streaming-reveal/render throughput + core hot-path optimizations by @oldschoola in #3843
fix(session): converge plan mode on ask/resolve across continuation paths by @metaphorics in #3911
fix(task): scan OMP extension agents/ dirs in discoverAgents by @roboomp in #3922
fix(coding-agent): stopped isolated task merges failing when working tree carries WIP for files the agent also modifies by @roboomp in #4140
fix(tui): animate live tool spinners by @roboomp in #4172
fix(robomp): run sandbox setup/teardown off the event loop safely by @metaphorics in #4184
fix(coding-agent): scope discoverExtensionPaths to native extension-module provider by @roboomp in #4202
fix(model-discovery): auto-update ZenMux models into models.db without a key by @metaphorics in #4204
fix(coding-agent): cache plugin extension resolution by @roboomp in #4209
fix(providers): hydrate runtime model cache before selection by @roboomp in #4217
fix(session): keep model switches active after rate limits by @roboomp in #4221
fix(coding-agent): guard custom tool process exits during load by @roboomp in #1706
fix(tui): stop /move overlay from statting every entry per keystroke by @roboomp in #4200

Full Changelog: v16.3.1...v16.3.2