@oh-my-pi/pi-agent-core
Added
- Added the
interruptibletool field: when set, the agent loop may abort the tool mid-execution to deliver a queued steering message (honored only inimmediateinterrupt mode). - Added support for
geminiandgemmaas valid owned tool syntax values in environment configuration
Fixed
- Fixed
pruneToolOutputsblanking tiny tool results during overflow pruning: results below50tokens (MIN_PRUNE_TOKENS) are no longer replaced with the[Output truncated - N tokens]placeholder, which cost more tokens than the result itself and churned the prompt cache for zero savings.
@oh-my-pi/pi-ai
Added
- Added the
geminiin-band tool-call syntax with Python-styletool_codeblocks anddefault_apiinvocations - Added the
gemmatoken-delimited in-band tool-call syntax using<|tool_call>and<|tool_response>blocks - Added
geminiandgemmato owned stream tool-result token detection so their tool responses are recognized - Fixed truncated Gemini and Gemma tool blocks from being emitted as plain text during streaming
- Added the Azure OpenAI provider definition (
azure) to the registry;AZURE_OPENAI_API_KEYresolves as its env-var API key via the catalog provider table.
Changed
- Gemini tool-call examples now render without the
default_api.namespace prefix, keeping<example>blocks concise. The live wire format still usesdefault_api.per the Gemini grammar.
Fixed
- Fixed duplicate tool call projections by deduplicating provider-native
toolCallevents against in-bandtool_codecalls and keeping only the first real channel - Dropped nameless native
toolCallevents so they no longer appear as surfaced tool calls in owned-mode streams - Fixed Gemini/Gemma in-band tool-call parsing around Python comments, raw/unicode string literals, and Gemma close-token text inside string values.
@oh-my-pi/pi-catalog
Added
- Added Azure OpenAI as a catalog provider (
azure, default modelgpt-5.5, env varAZURE_OPENAI_API_KEY), bundling the OpenAI-family models Azure serves over the Responses API (GPT-4/4.1/4o, GPT-5 family, o-series, Codex). Like Amazon Bedrock it is catalog-only — models ship in the bundle and become selectable once the env key is set, with the deployment base URL resolved at runtime fromAZURE_OPENAI_BASE_URL/AZURE_OPENAI_RESOURCE_NAME. - Added models.dev-backed bundled catalogs for providers that previously shipped no offline models: Hugging Face, Kilo, Moonshot, NanoGPT, Synthetic, Venice, Ollama Cloud, and the Xiaomi Token Plan regions (ams/cn/sgp). They still discover live when credentialed; the bundle is now a non-empty baseline.
Changed
- Updated stale provider default models to their latest bundled versions: OpenAI-family providers (
azure,github-copilot,aimlapi) → GPT-5.5; Gemini providers (google,google-gemini-cli,google-vertex) →gemini-3.1-pro-preview; GLM providers (zai,zhipu-coding-plan) →glm-5.2,cerebras→zai-glm-4.7; Kimi providers (fireworks,opencode-go,moonshot) →kimi-k2.7-code,kimi-code→kimi-for-coding,together→moonshotai/Kimi-K2.7-Code;alibaba-coding-plan→qwen3.7-plus; and Claude-Sonnet defaults (cloudflare-ai-gateway,cursor,gitlab-duo,kilo,opencode-zen,vercel-ai-gateway) → Claude Opus 4.x. - Restricted models.dev Azure discovery to OpenAI-family IDs (
gpt-,o1,o3,o4,codex,chatgpt), excluding Foundry-hosted third parties (Claude/DeepSeek/Llama/Mistral/Phi) that Azure serves through non-Responses APIs. - Detected the Azure OpenAI Responses compat surface (developer role, strict tool mode, strict tool-result pairing) by provider id as well as base URL, so bundled
azuremodels whose deployment host is only known at runtime still get the right wire behavior. - Renamed the
Qwen3-ASR-Flashmodel label toQwen3 ASR Flash
Fixed
- Fixed tool syntax selection for Gemini-family and Gemma model IDs by routing them to dedicated
geminiandgemmaformats instead of generic XML - Fixed
zhipu-coding-planandtogethershipping no bundled models: their descriptors referenced non-existent models.dev keys (zhipu-coding-plan,together); pointed them at the real keys (zhipuai-coding-plan,togetherai) so they bundle their GLM and full catalogs respectively. - Folded the
azure-openai-responsesAPI into the OpenAI Responses thinking-inference branches so Azure reasoning models (o-series, GPT-5, Codex) resolve the discrete effort vocabulary (includingxhigh) and effort-control mode instead of falling through to generic defaults. - Fixed
ollama-clouddiscovery inheriting an unsafe cross-providercontextWindow/maxTokenswhen/api/showreturns no size metadata; it now falls back to the safe 128K context / 8K output caps. - Dropped internal Fireworks control-plane resource ids (
accounts/fireworks/{models,routers}/…) from the bundle; only the public request ids ship.
@oh-my-pi/pi-coding-agent
Added
- Unexpected stop detection: optional tiny/smol classifier that continues the turn when the assistant says it will act but emits no tool calls.
- Settings
features.unexpectedStopDetectionandproviders.unexpectedStopModel.
Changed
- Changed the
jobpoll to return early when a steering message is queued, draining the steer immediately instead of waiting out the poll window. - Capped unexpected-stop auto-continuation to three retry attempts before giving up on repeated stops
- Updated the
edittool's hashline prompt, grammar, and docs to recommend the.=inclusive range separator (SWAP 1.=3:); the legacy..form still parses. - Normalized all internal worker argv selectors under the
__omp_worker_prefix, skipping the async worker dispatch check during normal CLI startup.
Fixed
- Filtered out whitespace-only and dot-only (
.or…) assistant blocks so they are treated as empty and no longer appear as visible content in message rendering, streaming reveal counts, or session export output - Filtered placeholder-only thinking content from ACP notifications and message visibility checks so dot-only
reasoning_contentno longer triggers turn completion or read/run updates - Fixed ModelRegistry tests making outbound network calls by automatically stubbing fetch during test execution.
- Fixed
evalJS cells (and browser-tab worker startup) always stalling for the full init timeout — typically the cell's whole 30s budget — before silently falling back to the slower inline worker. The self-dispatching CLI host imports the worker module dynamically from its argv dispatch, so the worker's ownparentPort.on("message")attached only after Bun flushed the messages the parent posted before spawn; the synchronously-postedinithandshake was dropped and never answered withready. The host now installs a bufferingparentPortinbox synchronously in the entry's sync prefix (before importing the worker module) and the worker binds it on load, replaying the buffered handshake.omp --smoke-testnow also spawns the JS eval worker through the host entry and asserts it handshakes on a real worker thread. - Fixed pre-prompt context-full compaction on OpenAI Responses sessions to use provider-anchored context usage when available, so large encrypted reasoning signatures no longer trigger automatic maintenance while the visible context percentage remains below threshold (#2628).
@oh-my-pi/collab-web
Fixed
- Wrapped composer button labels to display icon-only on mobile devices for a more compact and readable layout
- Made the connect screen, ended session card, and notification toasts fully responsive for smaller device viewports
- Fixed mobile layout issues where the entire chat flow would overflow horizontally and text was rendered too large on iOS Safari (by setting
text-size-adjust: 100%) - Made transcript rows stack vertically on small screens to optimize reading space, and prevented grid track expansion
- Hid non-essential metadata (such as the model name, thinking level, and working directory path) and context gauge tracks on mobile headers to prevent overflow
@oh-my-pi/hashline
Changed
- Changed the recommended hashline range separator from
..to.=(e.g.SWAP 1.=3:,DEL 4.=5) so the inclusive<=-style end is self-evident.HL_RANGE_SEPis now.=; the prompt, grammar, error messages, and emitted headers all use it. The lenient parser still accepts the legacy..(and-/…/space) forms.
@oh-my-pi/omp-stats
Changed
- Renamed
__omp_stats_sync_workerto__omp_worker_stats_sync.
@oh-my-pi/pi-utils
Added
- Added
installWorkerInbox(port)/consumeWorkerInbox()to@oh-my-pi/pi-utils/worker-host. A self-dispatching CLI host that imports a Bun worker module dynamically attaches the worker's realmessagelistener after Bun flushes the messages the parent posted before spawn, dropping a synchronously-postedinit. The host installs this buffering inbox synchronously in the entry's sync prefix so a listener exists at flush time; the worker module consumes it and binds the real handler, replaying anything buffered.
Full Changelog: v15.13.2...v15.13.3