github can1357/oh-my-pi v15.10.11

latest release: v15.10.12
5 hours ago

@oh-my-pi/pi-agent-core

Changed

  • Editorial pass over the compaction prompts: fixed garbled grammar and missing articles, RFC-keyed prohibitions, deduped restated instructions; parsed markers (<read-files>/<modified-files>/<previous-summary>) and all output-format headings left byte-identical
  • Catalog imports moved to the new @oh-my-pi/pi-catalog package: subpath imports (calculateCost, Codex wire constants) plus catalog values previously taken from the @oh-my-pi/pi-ai root (getBundledModel, clampThinkingLevelForModel), which pi-ai no longer re-exports; type-only Model/Api/Effort imports from pi-ai are unchanged

@oh-my-pi/pi-ai

Breaking Changes

  • The model catalog moved to the new @oh-my-pi/pi-catalog package. Deep subpath exports @oh-my-pi/pi-ai/models.json, /models, /model-cache, /model-manager, /model-thinking, /effort, /provider-models*, /utils/discovery*, /providers/openai-codex/constants, /providers/google-gemini-headers, and /providers/openai-completions-compat are gone — import the @oh-my-pi/pi-catalog equivalents (/models.json, /models, /model-cache, /model-manager, /model-thinking, /effort, /provider-models*, /discovery*, /wire/codex, /wire/gemini-headers, /compat/openai). The pi-ai root barrel re-exports only the model/effort types its own signatures use (Model, Api, ThinkingConfig, Effort, Usage, compat interfaces) — catalog values (getBundledModel(s), calculateCost, modelsAreEqual, clampThinkingLevelForModel, DEFAULT_MODEL_PER_PROVIDER, …) must be imported from @oh-my-pi/pi-catalog.
  • ProviderDefinition is now auth-only: defaultModel, createModelManagerOptions, catalogDiscovery, dynamicModelsAuthoritative, allowUnauthenticated, and specialModelManager moved to pi-catalog's CATALOG_PROVIDERS table, and KnownProviderId was replaced by pi-catalog's KnownProvider (registry completeness is enforced by a compile-time check against that union). The pure GitHub Copilot key/endpoint helpers moved from registry/oauth/github-copilot to @oh-my-pi/pi-catalog/wire/github-copilot.
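
For codebases with many deep imports, the subpath table in the first breaking change can be applied mechanically. The following is a minimal sketch, not part of either package: migrateImportPath is a hypothetical helper, and treating the trailing * in /provider-models* and /utils/discovery* as "any suffix" is an assumption. Catalog values imported from the pi-ai root (getBundledModel, calculateCost, …) are not covered and still need a manual move.

```typescript
// Subpath mapping transcribed from the breaking-change note above.
const OLD_PKG = "@oh-my-pi/pi-ai";
const NEW_PKG = "@oh-my-pi/pi-catalog";

// Subpaths that must match exactly.
const EXACT = new Map<string, string>([
  ["/models.json", "/models.json"],
  ["/models", "/models"],
  ["/model-cache", "/model-cache"],
  ["/model-manager", "/model-manager"],
  ["/model-thinking", "/model-thinking"],
  ["/effort", "/effort"],
  ["/providers/openai-codex/constants", "/wire/codex"],
  ["/providers/google-gemini-headers", "/wire/gemini-headers"],
  ["/providers/openai-completions-compat", "/compat/openai"],
]);

// Subpaths listed with a trailing "*"; assumed here to mean "any suffix".
const PREFIX: Array<[string, string]> = [
  ["/provider-models", "/provider-models"],
  ["/utils/discovery", "/discovery"],
];

// Hypothetical helper: returns the pi-catalog specifier for a removed pi-ai
// deep import, or null when the specifier is not one of the moved subpaths.
function migrateImportPath(specifier: string): string | null {
  if (!specifier.startsWith(OLD_PKG + "/")) return null;
  const subpath = specifier.slice(OLD_PKG.length);
  const exact = EXACT.get(subpath);
  if (exact !== undefined) return NEW_PKG + exact;
  for (const [from, to] of PREFIX) {
    if (subpath.startsWith(from)) return NEW_PKG + to + subpath.slice(from.length);
  }
  return null;
}
```

A codemod could call this on every import specifier and leave anything that returns null untouched.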

Added

  • Exported wrapFetchForCch so non-streaming OAuth callers (e.g. the web-search provider) can patch the Claude Code billing-header cch attestation into their request bodies instead of shipping the cch=00000 placeholder.

Changed

  • Reduced idle-watchdog churn on the token hot path: the abort promise/listener is created once per stream instead of per yielded item, the deadline uses a persistent re-armed timer instead of a setTimeout create/destroy pair per delta, and the persistent race promises are re-minted every 1024 items so per-race reaction records cannot accumulate for the stream's whole life.
  • Memoized Anthropic many-image downscaling by content-block identity, so long sessions with stable message objects no longer re-decode and re-encode every oversized image on each request and retry.
  • Tool-argument validation errors now truncate embedded argument strings at 256 chars per field — a failed write-class call no longer echoes hundreds of KB of payload back to the model as the error message.
  • Auth storage no longer issues per-boot no-op writes: the schema-version row is only rewritten when the recorded version actually changes, and the credential identity-key backfill skips rows whose derived identity is null — reopening a current-schema database now performs zero write transactions
  • Plain provider env-var names moved to the catalog table: registry defs dropped their 48 envKeys literals (including the pure $pickenv pickers for huggingface/qwen-portal/xai-oauth), getEnvApiKey now derives those fallbacks from CATALOG_PROVIDERS[].envVars, and envKeys remains only for computed resolvers (Anthropic Foundry, Vertex ADC, Bedrock credential chains) and non-catalog providers (kagi, tavily, parallel, perplexity)
  • Protocol handlers are now pure model.compat readers — the per-request resolve*Compat/detect*Compat calls (anthropic ×11, responses ×3, completions wrappers), inline strictResponsesPairing host detection, the OpenCode reasoning_content mutation block, and all resolvedBaseUrl threading are gone. Compat is materialized once at model build time (@oh-my-pi/pi-catalog buildModel); the OpenCode thinking-mode quirk is a precomputed compat.whenThinking pointer swap, and request-time base-URL overrides only feed the HTTP client. Behavior is unchanged (the Anthropic supportsLongCacheRetention official-endpoint gate is folded into detection).
  • Providers now read baked thinking/wire metadata instead of re-parsing model ids per request: the Anthropic handler gates sampling params on model.compat.supportsSamplingParams and adaptive display on model.thinking.supportsDisplay (Bedrock too), adaptive effort tiers come from the baked thinking.effortMap, the Google thinkingLevel map is static, and effort-dial-less reasoners (thinking: undefined, e.g. xai-oauth/grok-build) short-circuit resolveOpenAiReasoningEffort without the removed modelOmitsReasoningEffort predicate.
  • Anthropic streaming retries now use a 10-retry budget with the Anthropic-compatible 0.5s exponential backoff capped at 8s with jitter; server retry-after hints still win, and retryable pre-content failures such as 502s no longer stop after three tries.
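
The retry policy in the last item (10-retry budget, 0.5s exponential backoff capped at 8s, jitter, server hints taking precedence) can be sketched as a pure delay function. This is an illustration under stated assumptions rather than the provider's code: retryDelayMs is a hypothetical name, the jitter shape (scaling the delay down by up to 25%) is assumed, and the max(headerDelay, backoff) rule is taken from the retry-after fix listed under Fixed.

```typescript
const BASE_DELAY_MS = 500; // 0.5s initial backoff (from the note above)
const MAX_DELAY_MS = 8000; // 8s cap (from the note above)
const MAX_RETRIES = 10; // retry budget (from the note above)

// Hypothetical helper. `attempt` is 0 for the first retry.
// `jitter` in [0, 1) is injectable so the function is deterministic in tests.
// Returns null once the retry budget is exhausted.
function retryDelayMs(
  attempt: number,
  retryAfterMs: number | undefined,
  jitter: number = Math.random(),
): number | null {
  if (attempt >= MAX_RETRIES) return null;
  const exponential = Math.min(BASE_DELAY_MS * 2 ** attempt, MAX_DELAY_MS);
  // Assumed jitter shape: shorten the delay by up to 25%.
  const backoff = exponential * (1 - 0.25 * jitter);
  // A server retry-after hint wins whenever it is longer than the backoff.
  return Math.max(retryAfterMs ?? 0, backoff);
}
```

With jitter fixed at 0, the delays run 500, 1000, 2000, 4000 ms and then stay at 8000 ms until the budget runs out.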

Fixed

  • Fixed Ollama chat requests so they honor omitMaxOutputTokens, send think: false when reasoning is explicitly disabled, and preserve HTTP 400 response bodies in surfaced errors.
  • Fixed AuthStorage.markUsageLimitReached collapsing "every sibling is momentarily blocked" into "no sibling exists": it now returns UsageLimitMarkResult with the earliest sibling block expiry (retryAtMs), so retry layers can wait out a short-lived block (60s post-401, 5-min usage-probe) instead of adopting the provider's multi-hour retry-after. rotateSessionCredential and the auth-gateway adapt to the new shape.
  • Fixed Gemini streaming silently presenting truncated or blocked output as a successful stop: in-band {"error":{...}} events and promptFeedback.blockReason chunks were never inspected, and a stream ending without any finishReason kept the initialized stop — all three now surface as errors (both the API-key and gemini-cli/Antigravity consumers), and the toolUse stop-reason override no longer masks SAFETY/MALFORMED_FUNCTION_CALL finishes that arrive after a valid tool call.
  • Fixed Gemini/Bedrock error finishes reporting "An unknown error occurred": the raw finish/stop reason (MALFORMED_FUNCTION_CALL, RECITATION, guardrail_intervened, …) is now recorded into the surfaced error message.
  • Fixed the Anthropic provider retry loop ignoring server retry-after on 429/529 — it now waits max(headerDelay, backoff) instead of hammering a rate-limited endpoint three times within ~14s of guaranteed failures.
  • Fixed in-stream Anthropic SSE error events being thrown as raw JSON envelopes; the structured error.type/message is parsed out, keeping retry classification on the typed token instead of accidental regex hits.
  • Fixed transparent-reconnect tolerance duplicating content behind replaying proxies: after a duplicate message_start, replayed content_block_start events for already-closed indexes are now consumed silently instead of appending duplicate text/tool calls.
  • Fixed the Anthropic gateway accepting malformed known-type content blocks (e.g. {type:"text", text:123}) through the unknown-block catch-all, corrupting history and surfacing later as an opaque TypeError — they now fail validation with a clean 400. The gateway's encode stream also emits ping keepalives every 15s and a complete message_start/message_delta/message_stop envelope when the inner stream ends without a terminal event, so strict clients no longer classify slow or empty streams as protocol errors.
  • Fixed dotted-version Claude ids (claude-opus-4.7/4.8 on GitHub Copilot, Vercel AI Gateway, Zenmux) missing adaptive thinking display support — streamed reasoning stayed hidden on those entries because the display predicate only matched dash-form ids (same failure class as #1373).
  • Fixed the Mistral requiresThinkingAsText replay path calling .unshift() on string assistant content — an unconditional TypeError that failed any same-model history turn carrying both thinking and text.
  • Fixed the Responses gateway stripping encrypted_content from inbound reasoning items (strip-mode schema), which broke codex-style stateless replay; the schema is now loose, restoring the symmetry the outbound encoder already preserved. Composite internal callId|itemId ids are also split before hitting the wire so third-party clients that validate call_id charsets no longer reject them.
  • Ported the shared unfinished-tool-call sweep to the codex response.completed handler, so a lost output_item.done can no longer persist a tool call with stale {} arguments and transient parser fields into session history.
  • Fixed live text freezing until item completion when a lossy proxy drops content_part.added: the missing part is now synthesized on the first output_text/refusal delta (shared and codex decoders).
  • Fixed interleaved content/tool_calls deltas fragmenting a tool call into a truncated call plus a nameless phantom: text/thinking transitions no longer finish open tool-call blocks, so index-only continuation deltas re-find them.
  • Fixed the Azure chat-completions path ignoring AZURE_OPENAI_DEPLOYMENT_NAME_MAP (only the Responses provider honored it), producing opaque 404s when deployment names differ from catalog model ids.
  • Fixed the chat gateway discarding inbound assistant reasoning_content, which fed DeepSeek/Kimi exact-replay upstreams a placeholder instead of the model's actual reasoning; it now round-trips as a thinking block, and toolcall_end emits a corrective id/name chunk when the streamed start carried empty values.
  • Fixed the auth retry loop minting OAuth tokens and firing a doomed request after the caller aborted, and stopped masking resolver failures (broker/network/refresh errors) as "No API key" — the actual cause is preserved.
  • Fixed EventStream.end() without a terminal result leaving .result() pending forever (reachable via extension streams and the lazy wrapper); it now rejects with a synthesized error.
  • Fixed the Copilot retry wrapper blind-retrying every retryable error with fixed 400ms delays: 429/5xx now honor Retry-After (capped at 30s) and other statuses are not retried, while status-less transport blips keep the linear retry.
  • Fixed the OpenAI completions error path ending the stream without closing open text/thinking/tool-call blocks, leaving consumers with orphaned block lifecycles on every stream error or idle-timeout abort.
  • Fixed DSML hold-back freezing display on any bare < in model output for up to 256 chars: idle-state holding now only triggers on a strict DSML section-open prefix, and blowing the 1MB parameter cap no longer leaks the closing envelope tags as visible text; a capped parameter value also carries an explicit …[parameter truncated] marker instead of executing the tool with silently corrupted input.
  • Fixed schema normalization blanking DAG-shared subtrees to {}: the visited-set cycle guard treated a subschema object reused across two properties as a cycle; path-tracking enter/exit now allows sharing while still short-circuiting true cycles, frozen input schemas no longer throw, and the path counter no longer leaks depth on the cycle branch (which made every later normalization of the same object misreport a cycle).
  • Fixed shared in-flight Google token refreshes being bound to the first caller's AbortSignal, failing every concurrent waiter when one parallel Vertex call was cancelled; callers now race their own signal against a detached refresh, which is bounded by its own 30s timeout so a hung fetch cannot pin the in-flight slot until process restart.
  • Fixed Gemini <3 multimodal tool results breaking the single-function-response-turn invariant for parallel tool calls (image turns are buffered and flushed after the merged functionResponse turn), and the gemini-cli consumer now defaults missing functionCall.args to {} like the shared consumer.
  • Fixed Bedrock dropping toolConfig entirely when toolChoice is "none" while history still contains tool blocks — the Converse API rejects such requests, so tool specs are kept and only the choice is omitted.
  • Fixed AWS credential handling serving expired credentials until process restart: cache entries are invalidated on 401/403, file-sourced session-token credentials get a 5-minute TTL, and concurrent first requests single-flight instead of spawning duplicate credential_process/SSO fetches — the shared resolution is detached from the first caller's abort signal (one cancelled request no longer fails every waiter) and bounded by its own 30s timeout. The eventstream reader also cancels the response body on abnormal exit instead of leaving the HTTP connection draining.
  • Fixed an unbounded, zero-backoff Codex WebSocket reconnect loop on websocket_connection_limit_reached: the no-content reconnect path never consulted the retry budget and never waited, hammering the endpoint forever when the limit is account-scoped. Reconnects are now budgeted and delayed like every other WS retry path, falling back to a single SSE replay when exhausted.
  • Fixed the Codex whitespace-loop breaker not observing degenerate frames that arrive after their item closed (or before it opened) — those frames count as stream progress, so the idle watchdogs never fired and the turn hung forever, which is exactly the failure mode the breaker exists for. Whitespace-loop recovery now also refuses to replay the turn once a toolcall_end was delivered, surfacing the error instead of re-emitting the same tool calls.
  • Fixed the two remaining Codex retry paths (WS mid-stream reconnect and the empty-content SSE fallback) leaking blockless native output items (e.g. web_search_call) from the failed attempt into the replayed turn's providerPayload and append baseline.
  • Fixed Codex WebSocket failure handling closing whatever connection currently occupies the session slot, including a concurrent caller's in-flight CONNECTING handshake, whose rejection (websocket closed before open) was classified as fatal and disabled WebSockets for the whole session. Failure cleanup now skips CONNECTING sockets and the pool re-joins replacement handshakes (bounded).
  • Fixed the Codex request transformer not repairing orphan custom_tool_call_output items (only function_call_output was folded into an assistant note) — a compaction splice that dropped an apply_patch call while keeping its result produced a hard 400 on the default GPT-5 Codex toolset.
  • Fixed processResponsesStream finalizing reasoning items via a bare itemId content scan instead of the routed entry: with id-less reasoning items (local hosts), every output_item.done matched the FIRST thinking block — the second item's text clobbered it and the second block was never finalized or signed.
  • Fixed processResponsesStream dropping tool calls and message text whose output_item.added event was lost (lossy proxies): toolcall_end was emitted with a dangling contentIndex while the call never entered message.content, so the agent loop silently never executed it. The done handler now synthesizes the missing block; still-open tool-call blocks are also final-parsed at response.completed so the toolUse override cannot hand the agent stale {} arguments.
  • Fixed response.incomplete with incomplete_details.reason: "content_filter" being reported as a token-cap truncation (stopReason: "length") — the agent loop's length recovery then asked the model to "shorten" a filtered prompt. Content-filtered turns now surface as errors; usage is also populated from response.failed events, and an unknown terminal status degrades to "stop" with a logged anomaly instead of throwing away a fully-streamed response.
  • Fixed Copilot premiumRequests accounting being dropped from failed/cancelled responses: populateResponsesUsageFromResponse replaced usage wholesale and the error path threw before the success-path re-apply. The populate now preserves the field.
  • Fixed deduplicateToolCallIds suffixing the whole composite Responses id (callId|itemId) — normalizeResponsesToolCallId extracts the first segment as the wire call_id at encode time, so both copies collapsed back onto one call_id and the request carried duplicate call/output pairs. The suffix and length budget now apply per segment.
  • Gated native history payload replay on api + model id in both Responses providers: after a mid-session model switch, reasoning items carrying encrypted content minted by the previous model were replayed verbatim under the new model. Replay now falls back to block re-encode (which already strips foreign signatures), matching transformMessages' same-model trust rule.
  • Fixed Azure OpenAI Responses requests omitting store: false while requesting reasoning.encrypted_content (stateless-only per OpenAI), replaying custom tool calls paired with mismatched function_call_output items (customCallIds was never threaded through), letting the SDK's internal retries (maxRetries 5) silently re-POST inside the explicit first-event deadline, and sending a prompt_cache_key when the caller opted out via cacheRetention: "none".
  • Fixed strict-pairing Responses backends (Azure, Copilot) silently discarding tool results whose call is absent from history — the result is now folded into an assistant note (same shape as orphan-output repair) so the model keeps the information.
  • Fixed the OpenAI Responses first-event watchdog staying armed across the onResponse notification callback (a slow callback aborted an already-connected stream).
  • Fixed Copilot transient-model retries re-attempting on an already-aborted signal (an instant dead retry surfacing the scheduler's AbortError).
  • Fixed Codex reasoningSummary: null being coerced to "auto" (the documented omit-summary contract was unreachable).
  • Fixed nested Codex error codes (response.error.code) being invisible to the connection-limit/previous-response recovery matchers.
  • Fixed the session id leaking unredacted into PI_CODEX_DEBUG logs via the x-client-request-id header.
  • Fixed processResponsesStream (shared by openai-responses and azure-openai-responses) ignoring the terminal response.incomplete event: a max-output-tokens-truncated response ended with stopReason: "stop", zero usage, and no cost instead of "length" with the reported token counts. response.incomplete is now handled alongside response.completed and counts as stream progress for the idle watchdogs.
  • Fixed custom tool-call content blocks keeping the transient partialJson accumulation buffer (and a potentially stale arguments.input) after response.output_item.done in the shared Responses stream processor — the function_call branch already cleaned these up.
  • Fixed two OpenAI Codex stream-retry paths (whitespace-loop recovery and retryable provider errors) leaking native output items from the abandoned attempt into the replayed turn's providerPayload — stale reasoning items completed before the failure were re-sent as history input on subsequent requests alongside the retry's own items.
  • Fixed the Codex WebSocket queue wiping already-received frames when a transport error arrived: a response.completed queued just before an eager server close was discarded, turning a finished response into a spurious websocket closed failure and a full request replay. Errors now append behind pending data frames.
  • Fixed concurrent getOrCreateCodexWebSocketConnection callers (prewarm racing the first request) tearing down each other's in-flight handshake — closing a CONNECTING socket rejected the other caller with a fatal websocket closed before open, disabling WebSockets for the entire session. Callers now join the pending handshake.
  • Stopped the Codex connection-limit recovery from replaying a turn over SSE after a toolcall_end had already been delivered to the consumer (canSafelyReplayWebsocketOverSse guard was bypassed, re-emitting the same tool calls); the error now surfaces instead.
  • Extended the Codex whitespace-only argument-delta circuit breaker to custom_tool_call_input.delta frames, which counted as stream progress and could keep a degenerate response alive forever with no cap on buffer growth.
  • Fixed Codex stream failures during transport open reporting a synthetic request dump (empty URL/body) instead of the real request, and a response.created event resetting the recorded time-to-first-token.
  • Fixed the Codex WebSocket connect watchdog timer leaking (pinning the event loop for up to 10s) when the request signal aborted before or during the handshake.
  • Fixed OpenRouter-hosted Anthropic adaptive reasoning models (Claude Fable/Mythos 5 and Opus 4.6+) so the catalog exposes xhigh; Fable/Mythos and Opus 4.7+ requests now map user high/xhigh onto OpenRouter's Anthropic xhigh/max effort scale.
  • Fixed an unknown Anthropic stop_reason failing the whole turn after the response had fully streamed. mapStopReason threw on unrecognized values, and since the reason arrives on the trailing message_delta the error was unretryable — the live model_context_window_exceeded stop reason (default on Sonnet 4.5+) hit this path. It now maps to length, and any future unknown reason degrades to a logged anomaly plus a normal stop instead of an error.
  • Stopped clamping API-key Anthropic requests to Claude Code's 64k output cap. The CLAUDE_CODE_MAX_OUTPUT_TOKENS clamp exists to match the OAuth wire fingerprint, but buildParams applied it unconditionally, silently halving the output budget of 128k-output models (e.g. Opus 4.8) for API-key callers. OAuth requests keep the clamp.
  • Stopped a successful strict-tools fallback from shipping errorMessage on a stopReason: "stop" assistant message. After a grammar-too-large 400 triggered the non-strict retry, the original 400 text was kept on the final message even when the retry succeeded — consumers that treat errorMessage presence as failure (e.g. balance probes) misclassified the turn, and the stale text suppressed later refusal explanations. The fallback is now logged instead.
  • Fixed model-supplied User-Agent headers being silently dropped on non-OAuth Anthropic requests. enforcedHeaderKeys filtered the header out of modelHeaders in every branch but only the OAuth branch set one back; the Cloudflare-gateway, bearer-gateway, and X-Api-Key branches now forward the caller's value verbatim.
  • Stopped sending the fast-mode-2026-02-01 beta header once a session has learned the endpoint+model rejects fast mode (fastModeDisabled provider state), matching the already-dropped speed param.
  • Stopped buildAnthropicHeaders defaulting API-key requests onto the full Claude Code OAuth beta list (oauth-2025-04-20, claude-code-20250219, …). The claudeCodeBetas default is now OAuth-gated, matching the streaming path — the web-search header builder was the only caller hitting the default, so API-key search requests now carry just their own betas (e.g. web-search-2025-03-05). An empty anthropic-beta header is omitted entirely instead of being sent as an empty string.
  • Fixed image-bearing developer messages being upgraded to mid-conversation system turns on Opus 4.8+/Fable/Mythos 5. System content is text-only on the wire, so a developer turn carrying image blocks in an upgrade-eligible position produced a 400; it now stays a user message.
  • Fixed a spliced reconnect's second envelope overwriting the completed Anthropic message: message_delta was not gated by the terminal-stop flag (content events and duplicate message_start were), so the splice's stop_reason/usage replaced the finished turn's — a tool_use turn could be relabeled stop, and the harness then never executed the streamed tool calls. Post-terminal deltas are now logged as envelope anomalies and skipped.
  • Fixed a ping arriving before message_start consuming the Anthropic first-event watchdog: the stall was then classified as a terminal mid-stream idle timeout instead of a retryable first-event timeout. Pings no longer count as the first item but still refresh the idle deadline once content is flowing.
  • Fixed Anthropic-compatible proxies that omit usage/delta objects from message_start/message_delta/content_block_* envelopes crashing the turn with an unretryable TypeError; the missing payloads now degrade to logged envelope anomalies like every other malformed-frame case.
  • Fixed applyPromptCaching placing cache_control on thinking/redacted_thinking blocks — Anthropic rejects that with a 400. A thinking-only assistant turn inside the trailing cache window (e.g. followed by the synthetic Continue. pad) no longer receives a breakpoint.
  • Fixed consecutive assistant params reaching the wire when an empty user/developer turn between two assistant turns was dropped by the converter (e.g. an empty "nudge" submission after a length-truncated reply); Anthropic 400s on non-alternating assistant turns, and the broken triple replayed on every subsequent request. A user: "Continue." separator is now inserted, mirroring the trailing-prefill fallback.
  • Fixed supportsAdaptiveThinkingDisplay misparsing bare dated Opus ids: claude-opus-4-20250514 (Opus 4.0) parsed as minor 20250514 ≥ 4.7, which silently dropped the interleaved-thinking-2025-05-14 beta for API-key Opus 4.0 requests.
  • Fixed output_config.effort shipping without the effort-2025-11-24 beta on thinking-off requests against adaptive-only Claude models (the effort:"low" pin), and the mid-conversation system role shipping without mid-conversation-system-2026-04-07 on API-key and OAuth-utility requests; both betas are now added whenever the request can carry the corresponding field.
  • Fixed GitHub Copilot anthropic-messages requests going out with no Content-Type and no anthropic-version header — the copilot branch builds its headers from scratch and Bun's fetch does not default Content-Type for string bodies. Both headers are now pinned to match every other branch.
  • Fixed Anthropic client/provider retry multiplication: with the first-event watchdog disabled (PI_STREAM_FIRST_EVENT_TIMEOUT_MS=0), the client's internal maxRetries: 5 reactivated and stacked with the provider loop's 3 retries — up to 24 wire attempts with double backoff. The provider now pins per-request maxRetries: 0 unconditionally.
  • Fixed AnthropicMessagesClient spreading fetchOptions after the core request fields, letting a caller-supplied signal/method/body silently disconnect the timeout controller or corrupt the request. Transport extras (TLS) still pass through; core fields now always win.
  • Fixed Foundry mTLS/CA material being cached for the process lifetime when the env vars point at files: the cache key now folds in the file mtime so on-disk certificate rotation takes effect.
  • Fixed the Claude Code fingerprint version drifting across surfaces: the usage endpoint (claude-cli/2.1.160) and OAuth bootstrap (claude-code/2.1.160) pinned a stale version while /v1/messages reported 2.1.165; both now derive from claudeCodeVersion.
  • Fixed a system prompt that merely mentions x-anthropic-billing-header: mid-text suppressing the entire Claude Code system-block injection (billing header, instruction, and cch attestation); the resumed-session guard now anchors with startsWith.
  • Fixed lone surrogates in cross-API tool-call arguments reaching Anthropic's strict UTF-8 validation: replayed OpenAI/Google-origin tool_use.input string leaves are now deep-sanitized with toWellFormed(), while same-API Anthropic arguments stay byte-identical to keep prompt-cache prefixes stable.
  • Bounded the many-image resize fan-out to 4 concurrent decodes (it previously decoded every oversized image at once, two encode pipelines each — multi-GB transient memory at the 20+-image threshold that activates the feature).
  • Fixed mergeHeaders merging case-sensitively on the Copilot/client-options path, where a miscased user-configured header (e.g. authorization next to the synthesized Authorization) survived as two keys that the Headers constructor joins comma-separated on the wire.
  • Hardened the Anthropic stream lifecycle: prologue failures (e.g. a malformed Copilot credential in buildCopilotDynamicHeaders) and error-finalization failures now surface as an error event instead of an unhandled rejection that left stream.result() hanging forever; the spurious "cch billing placeholder not patched" warning no longer fires when the placeholder only appears in user content.
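
The schema-normalization fix in the list above (DAG-shared subtrees blanked to {}) comes down to the difference between a global visited set and an on-path set. The sketch below illustrates that technique under stated assumptions: normalize is a hypothetical stand-in that only copies the tree, not the package's normalizer. It tracks which objects are on the current recursion path (enter/exit), so an object reused under two properties is copied twice, while an object that contains itself is short-circuited to {}. The finally block keeps the path state from leaking when a branch exits early, and the input is never mutated, so frozen schemas are safe.

```typescript
// Hypothetical normalizer: copies a schema tree and replaces only true
// cycles with {}. Shared (DAG) subtrees are copied at every use site.
function normalize(node: unknown, onPath: Set<object> = new Set()): unknown {
  if (typeof node !== "object" || node === null) return node;
  // The node is one of its own ancestors: a real cycle, so cut it here.
  if (onPath.has(node)) return {};
  onPath.add(node); // enter
  try {
    if (Array.isArray(node)) return node.map((item) => normalize(item, onPath));
    const out: Record<string, unknown> = {};
    for (const [key, value] of Object.entries(node)) {
      out[key] = normalize(value, onPath);
    }
    return out;
  } finally {
    onPath.delete(node); // exit, even on early return or throw
  }
}
```

A plain visited set would never remove entries, which is why the second reference to a shared subschema looked like a cycle.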

Removed

  • Removed the dead iterateUntilAbort helper (superseded by iterateWithIdleTimeout); it leaked the upstream iterator when the consumer abandoned mid-yield and had no production call sites.

@oh-my-pi/pi-catalog

Added

  • Added hostMatchesUrl, modelMatchesHost, and endpoint-shape helpers in the new hosts module for consistent provider/baseUrl matching
  • buildModel(spec) (build.ts) is now the single Model constructor: it materializes the fully-resolved compat record and canonical thinking metadata exactly once (compat first, thinking derived from identity + resolved compat), so Model.compat is a required, complete CompatOf<TApi> (ResolvedOpenAICompat/ResolvedOpenAIResponsesCompat/ResolvedAnthropicCompat) and request-path code reads fields with zero URL parsing and zero per-request allocation. Sparse user/config overrides live on the new ModelSpec<TApi> input shape and survive on Model.compatConfig for introspection.
  • Added ResolvedAnthropicCompat.supportsSamplingParams (Opus 4.7+/Fable/Mythos reject temperature/top_p/top_k with a 400), baked at build time from model identity so the request path stops re-parsing model ids.
  • Compat detection gained model-time flags so handlers stop sniffing baseUrl: completions supportsReasoningParams, alwaysSendMaxTokens, isOpenRouterHost, isVercelGatewayHost, streamIdleTimeoutMs, and a precomputed whenThinking alternate view (OpenCode reasoning_content gating, #1071/#1484); responses strictResponsesPairing, supportsLongPromptCacheRetention, supportsReasoningEffort; anthropic officialEndpoint, requiresToolResultId, replayUnsignedThinking.
  • New @oh-my-pi/pi-catalog package: the model catalog extracted from @oh-my-pi/pi-ai. Owns the bundled models.json and its generation pipeline (scripts/generate-models.ts), the core model data types (Model, Api, ThinkingConfig, Effort, Usage, compat interfaces), thinking metadata enrichment and generated policies (model-thinking.ts), the SQLite model cache and model manager, per-provider discovery factories (provider-models/), the discovery protocol clients (discovery/), and the new CATALOG_PROVIDERS table — the single source of truth for provider ids, default models, and discovery wiring (KnownProvider, PROVIDER_DESCRIPTORS, and DEFAULT_MODEL_PER_PROVIDER are derived from it).
  • New identity/ module centralizing model-identity concerns that were previously duplicated across packages: family classification and version parsing (identity/classify.ts, extracted from pi-ai's model-thinking internals), canonical model equivalence with injected reference data (identity/equivalence.ts, from coding-agent's model-equivalence), proxy/reseller reference lookup (identity/reference.ts, from coding-agent's model-registry), bracket-affix and id-segment helpers (identity/id.ts), a single trailing-marker vocabulary with canonical vs reference flavors (identity/markers.ts; search stays reference-only so Perplexity's sonar-pro-search remains canonical-distinct), and provider priority ordering (identity/priority.ts).
  • Memoized bundled-reference accessors (getBundledCanonicalReferenceData / getBundledModelReferenceIndex in identity/bundled.ts): one lazy walk of the bundled catalog feeds both canonical equivalence and proxy-reference lookup, so consumers no longer hand-roll the glue.
  • identity/selection.ts: pure canonical-variant selection (resolveCanonicalVariant, buildCanonicalModelOrder, CanonicalVariantPreferences) extracted from the coding-agent registry — provider rank, then exact-id match, variant source, id length, and candidate order.
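
To show why the hosts helpers use normalized matching rather than raw URL substring checks, here is a minimal sketch. It is not the package's hostMatchesUrl: the function name urlHasHost, its signature, and the rule that subdomains also match are all assumptions made for illustration.

```typescript
// Hypothetical sketch of normalized host matching. Parses the URL, lowercases
// the hostname, and accepts an exact match or a subdomain on a dot boundary.
function urlHasHost(baseUrl: string, host: string): boolean {
  let hostname: string;
  try {
    hostname = new URL(baseUrl).hostname.toLowerCase();
  } catch {
    return false; // not a parseable absolute URL
  }
  const wanted = host.toLowerCase();
  return hostname === wanted || hostname.endsWith("." + wanted);
}

// A raw substring check, shown only for contrast.
function naiveSubstringMatch(baseUrl: string, host: string): boolean {
  return baseUrl.includes(host);
}
```

The substring version accepts a URL whose path merely contains the host name and rejects a differently cased host; the parsed version handles both cases.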

Changed

  • Changed OpenAI compatibility detection to use shared host classifiers (modelMatchesHost/hostMatchesUrl) with normalized matching instead of raw URL substring checks
  • Changed hostMatchesUrl/modelMatchesHost usage in compatibility detection to reduce mismatches across case variants and provider alias hosts
  • Provider catalog entries now carry the runtime API-key env fallback as an ordered envVars list; catalogDiscovery.envVars became an optional generation-time override (only cursor and vercel-ai-gateway differ) and PROVIDER_DESCRIPTORS materializes the resolved list for generate-models.ts.
  • Model's api parameter now defaults to Api instead of any (Model<TApi extends Api = Api>), so bare Model no longer behaves as Model<any> at call sites.
  • ThinkingConfig is now explicit and total: an ordered efforts array replaces the minLevel/maxLevel/levels range encoding, and the wire facts are baked alongside it — effortMap (anthropic-adaptive 4-tier vs 5-tier scale, shared with the OpenRouter completions remap) and supportsDisplay (adaptive display field support). Explicit spec thinking owns the capability surface (mode/efforts/defaultLevel) and wins over inference; missing wire facts are backfilled from identity so configs never need to know Anthropic's tier tables. Reasoning models that reject the wire effort param (compat.supportsReasoningEffort: false on openai-responses*) are encoded as thinking: undefined ("thinks, no control surface") instead of the removed modelOmitsReasoningEffort special case. models.json was re-baked in the new vocabulary behind a 3196-model behavioral parity gate, and the model cache schema bumped to v4 to invalidate old-shape rows.
  • mapEffortToGoogleThinkingLevel(effort) is now a static map (model parameter dropped — validation stays at the requireSupportedEffort call sites), and mapEffortToAnthropicAdaptiveEffort reads the baked thinking.effortMap instead of re-classifying the model id per request.
  • Generator-only policy code moved out of the runtime bundle into scripts/generated-policies.ts: applyGeneratedModelPolicies (now policy fixups + thinking re-bake via the shared deriver), linkOpenAIPromotionTargets, the Copilot context-window table, minimax/opencode-go compat fixups, and CLOUDFLARE_FALLBACK_MODEL. The anthropic id predicates (hasOpus47ApiRestrictions, supportsMidConversationSystemMessages, isAnthropicFableOrMythosModel) moved to identity/family for build-time use by the compat/thinking derivers only.
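
To illustrate why the compatibility checks moved from raw URL substring tests to normalized host matching, here is a small sketch. The helper names normalizedHost and urlHasHost are hypothetical (the real hostMatchesUrl/modelMatchesHost signatures are not given in these notes), and treating subdomains as matches is an assumption.

```typescript
// Hypothetical helpers; not the pi-catalog host classifiers.
function normalizedHost(rawUrl: string): string | undefined {
  try {
    // URL parsing lowercases the hostname and separates it from path and query.
    return new URL(rawUrl).hostname.replace(/\.$/, "");
  } catch (e) {
    return undefined; // not a parseable absolute URL
  }
}

// True when the URL's host is `host` itself or one of its subdomains (assumed policy).
function urlHasHost(rawUrl: string, host: string): boolean {
  const actual = normalizedHost(rawUrl);
  if (actual === undefined) return false;
  const wanted = host.toLowerCase();
  const suffix = "." + wanted;
  return actual === wanted || actual.slice(-suffix.length) === suffix;
}
```

A substring test such as `url.indexOf("openrouter.ai") !== -1` accepts both `https://example.com/?next=openrouter.ai` and `https://openrouter.ai.evil.example/v1`; the host comparison above rejects both while still accepting `https://OpenRouter.ai/api/v1`.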

Fixed

  • Fixed Anthropic official-endpoint detection to require strict HTTPS hostname matching so non-official or lookalike URLs are no longer treated as official Anthropic hosts
  • Fixed Ollama Cloud dynamic discovery so same-id matches from other providers no longer supply context-window or max-output-token limits for discovered models.
  • Wired @oh-my-pi/pi-catalog into the release publish package list, tarball install smoke test, and root bun generate-models script.
  • Fixed supportsAdaptiveThinkingDisplay only matching dash-form version ids: dotted ids (claude-opus-4.7) now classify through identity/classify like every other anthropic predicate, so six bundled dotted Opus 4.7/4.8 entries (github-copilot, vercel-ai-gateway, zenmux) regain adaptive display support; bare dated ids (claude-opus-4-20250514 = Opus 4.0) stay excluded.
  • Fixed the OpenRouter anthropic adaptive-effort map misclassifying bare dated Opus ids (claude-opus-4-20250514 parsed as version 4.20 → wrongly adaptive); the map now derives from the shared classifier and the shared 4-/5-tier tables.
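
The two Opus classification fixes above come down to one parsing rule: accept both dotted and dash-form minors, but never read a date stamp as a minor version. A minimal sketch of that rule follows; parseOpusVersion and its regular expression are hypothetical and are not the identity/classify implementation.

```typescript
// Hypothetical parser for illustration only.
interface OpusVersion {
  major: number;
  minor: number;
}

function parseOpusVersion(modelId: string): OpusVersion | undefined {
  // The minor is one or two digits after "." or "-" and must not be followed
  // by further digits, so an 8-digit date stamp is never taken as a minor.
  const m = /claude-opus-(\d+)(?:[.-](\d{1,2})(?!\d))?/.exec(modelId);
  if (m === null) return undefined;
  return { major: Number(m[1]), minor: m[2] === undefined ? 0 : Number(m[2]) };
}
```

With this rule `claude-opus-4.7` and `claude-opus-4-7` both parse as 4.7, while `claude-opus-4-20250514` parses as 4.0 instead of 4.20.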

Removed

  • Removed the runtime enrichment layer: enrichModelThinking (and its non-enumerable memo-slot cache), refreshModelThinking, modelOmitsReasoningEffort, and the model-thinking re-exports of generator-only policies. Thinking metadata is resolved exactly once inside buildModel; runtime helpers (getSupportedEfforts, clampThinkingLevelForModel, requireSupportedEffort, the effort mappers) are pure field reads.
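
Once thinking metadata is baked into the model at build time, the runtime helpers need no id inspection at all. The sketch below shows what "pure field reads" can look like; SketchModel, SketchThinking, supportedEfforts, and clampEffort are hypothetical simplifications, and the nearest-level clamping policy is an assumption rather than the behavior of clampThinkingLevelForModel.

```typescript
type EffortLevel = "minimal" | "low" | "medium" | "high" | "xhigh";

// Hypothetical, simplified shapes; the real ThinkingConfig carries more fields.
interface SketchThinking {
  efforts: EffortLevel[];
  defaultLevel?: EffortLevel;
}
interface SketchModel {
  id: string;
  thinking?: SketchThinking;
}

const EFFORT_ORDER: EffortLevel[] = ["minimal", "low", "medium", "high", "xhigh"];

// No id sniffing: the supported list is whatever was baked into the model.
function supportedEfforts(model: SketchModel): EffortLevel[] {
  return model.thinking ? model.thinking.efforts : [];
}

// Assumed policy: pick the supported level closest to the request in
// EFFORT_ORDER; on a tie, the level listed first in `efforts` wins.
function clampEffort(model: SketchModel, requested: EffortLevel): EffortLevel | undefined {
  const want = EFFORT_ORDER.indexOf(requested);
  let best: EffortLevel | undefined;
  let bestDistance = Infinity;
  for (const level of supportedEfforts(model)) {
    const distance = Math.abs(EFFORT_ORDER.indexOf(level) - want);
    if (distance < bestDistance) {
      best = level;
      bestDistance = distance;
    }
  }
  return best;
}
```

A model with `thinking: undefined` ("thinks, no control surface") yields an empty effort list and an undefined clamp result in this sketch.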

@oh-my-pi/pi-coding-agent

Added

  • Added supportsReasoningParams, alwaysSendMaxTokens, strictResponsesPairing, and a recursive whenThinking overlay (alongside streamIdleTimeoutMs/supportsLongPromptCacheRetention/requiresToolResultId/replayUnsignedThinking) to the OpenAI/Anthropic compat schema so custom model entries can configure those provider-specific capabilities
  • Custom model thinking config now uses the catalog's explicit vocabulary: efforts (ordered list) plus optional defaultLevel, effortMap, and supportsDisplay overrides; the legacy minLevel/maxLevel/levels range shape is still accepted and normalized at parse time. Wire facts (effortMap/supportsDisplay) are backfilled from model identity when not set, so existing claude-proxy configs keep the 5-tier adaptive scale and summarized display without changes.
  • New omp usage command: a detailed per-account breakdown of provider usage limits (bars, windows, reset times, plan metadata) covering every stored credential — accounts with no usage endpoint are listed as "no usage data" rows. Each provider section ends with per-window capacity stats ("capacity: 5h → 2.40/5 accounts used (2.60× quota left)"). Flags: --provider to filter, --json for the broker-shaped report payload, and --redact to mask account emails/ids down to a two-char anchor plus a minimal middle-out differentiator (ca*9*) for screenshot-safe sharing.
  • Startup hangs are now self-diagnosing (speculative fix for the "zero output, hangs even on omp -h" report class): a watchdog prints a stderr line every 10s naming the deepest in-flight startup phase (via logger.openSpanPath()) until a mode runner takes over, pausing around legitimate interactive waits (fork/move prompts, the --resume session picker); PI_DEBUG_STARTUP is restored as streaming synchronous [startup] phase markers covering command-module imports and the native addon load, which the post-startup PI_TIMING tree structurally cannot show for a hang; and waiting on piped-stdin EOF announces itself after 1s instead of blocking silently.
  • npm installs now execute a prebundled single-file entry: the published bin.omp points at dist/cli.js (built by scripts/bundle-dist.ts during prepack, ~18MB minified, natives/transformers/mupdf external), cutting npm-install cold start by roughly 3x versus transpiling the raw TypeScript graph per launch; src/** stays published for SDK consumers and worker fallbacks. The on-repo manifest keeps bin.omp at src/cli.ts — release rewrites it via the publishBin override in scripts/ci-release-publish.ts — so source installs (bun link, install.sh --source) keep working without a build step
  • Plain interactive TTY launches render the full welcome box (logo held on the intro's first frame, model, tips, LSP servers, recent-sessions loading placeholder) before session construction, clearing the screen so the TUI's first paint replaces it in place; the welcome box now reserves fixed slot counts (4 recent sessions, 4 LSP servers) so its height no longer shifts between the splash, loading, and loaded states. First-run launches keep the dim two-line splash (omp <version> / Initializing session…); resume/fork/continue flows, quiet mode, PI_TIMING, and non-TTY stdio still skip it
  • Added /stats to launch the local stats dashboard from an active session, syncing session files first and opening the same browser dashboard as omp stats.
  • /settings now supports type-to-search filtering on setting labels, paths, descriptions, and values; Escape clears an active search before closing the panel.
  • Added a read-only view op to the todo tool that echoes the current list without mutating state, so the agent can recover exact task text instead of guessing it from memory.
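
For the custom-model thinking config, the parse-time normalization of the legacy range shape into an ordered efforts list could look roughly like this. Only the field names (efforts, defaultLevel, minLevel, maxLevel, levels) and the effort vocabulary come from these notes; the interfaces, the normalizeThinking name, and the precedence of levels over minLevel/maxLevel are assumptions.

```typescript
type Level = "minimal" | "low" | "medium" | "high" | "xhigh";
const LEVELS: Level[] = ["minimal", "low", "medium", "high", "xhigh"];

// Hypothetical config shapes for illustration.
interface LegacyThinking {
  minLevel?: Level;
  maxLevel?: Level;
  levels?: Level[];
}
interface ExplicitThinking {
  efforts: Level[];
  defaultLevel?: Level;
}

function normalizeThinking(input: LegacyThinking | ExplicitThinking): ExplicitThinking {
  if ("efforts" in input) return input; // already in the new vocabulary
  if (input.levels !== undefined) {
    // Assumed: an explicit level list wins over the range, re-sorted canonically.
    const listed = input.levels;
    return { efforts: LEVELS.filter((level) => listed.indexOf(level) !== -1) };
  }
  // Expand the inclusive min..max range; missing bounds default to the extremes.
  const from = input.minLevel === undefined ? 0 : LEVELS.indexOf(input.minLevel);
  const to = input.maxLevel === undefined ? LEVELS.length - 1 : LEVELS.indexOf(input.maxLevel);
  return { efforts: LEVELS.slice(from, to + 1) };
}
```

Under these assumptions `{ minLevel: "low", maxLevel: "high" }` becomes `{ efforts: ["low", "medium", "high"] }`, and a config already written with efforts passes through unchanged.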

Changed

  • Centralized model-identity logic in the new @oh-my-pi/pi-catalog package: config/model-equivalence.ts, config/model-id-affixes.ts, and config/model-provider-priority.ts were removed in favor of @oh-my-pi/pi-catalog/identity, and the registry's proxy-reference lookup now shares the catalog's single lazily-built bundled-model walk (@oh-my-pi/pi-catalog/identity bundled accessors) with the canonical-equivalence index instead of walking the ~12K bundled models twice into duplicate maps
  • Split the configured/implicit provider discovery protocols (Ollama, llama.cpp, LM Studio, openai-models-list, proxy) out of config/model-registry.ts into config/model-discovery.ts; the registry keeps orchestration (caching, status tracking, merging) while the protocol clients take an injected fetch/auth context
  • Catalog values (bundled models, modelsAreEqual, clampThinkingLevelForModel, getSupportedEfforts, DEFAULT_MODEL_PER_PROVIDER, Gemini/Antigravity wire headers) are now imported from @oh-my-pi/pi-catalog/<module> instead of the @oh-my-pi/pi-ai barrel, which no longer re-exports them; the resolver's defaultModelPerProvider alias was removed and its duplicated default-model fallback / scoped-model dedupe blocks were factored into pickDefaultAvailableModel and a shared addScopedModel helper
  • Cached custom model alias maps and built them lazily on first custom model reference lookup, avoiding unnecessary startup model-registry initialization
  • Cached resolved auth broker configuration and snapshot reads for the process lifetime so repeated startup paths reuse the same OMP_AUTH_BROKER_* resolution instead of re-running config/token discovery
  • Reused task-agent discovery results for repeated TaskTool.create calls in the same working directory to avoid repeated plugin scans during subagent startup
  • SSH tool creation now formats host descriptions from synchronous host-info cache reads (memory hit or cached JSON) instead of per-host async reads — hosts without cached info render the existing placeholder; warm-cache descriptions are byte-identical
  • Deferred heavy dependencies off the startup import graph to first feature use: linkedom (web fetch feed parsing and scrapers), puppeteer-core/@puppeteer/browsers (browser launch), @mozilla/readability (page extraction), @xterm/headless (interactive bash PTY), @babel/parser (JS eval import rewriting), and the mnemopi memory engine (backend/state construction)
  • Renamed the PI_TIMING startup phase discoverModels to discoverAuthStorage — the timer only ever wrapped auth storage discovery
  • Worker threads (stats sync, browser tab, JS eval) and the tiny-model subprocess now re-enter the CLI entrypoint with hidden argv selectors (__omp_*, --tiny-worker) via the declared worker-host entry (workerHostEntry()), collapsing the per-distribution spawn branches; outside a CLI host (bun test, SDK embedding) spawn sites fall back to loading the worker module directly, and both binary build scripts dropped their per-worker --compile entrypoint lists
  • The CLI entry no longer top-level-awaits runCli — the floating call reports rejections to stderr and exits 1, keeping the entry module CJS-lowerable and the bundle parse-friendly
  • Tightened the system prompt and tool prompts: deduped restated warnings (bash "catch yourself" list, search/find shell-fallback recaps, read instruction/critical overlap, the AST metavariable primer duplicated across both ast tool descriptions), factored the repeated repo-default clause in the gh search ops, dropped a dead rsed reference and an internal tool-timeouts.ts pointer, and pruned internal mechanism the agent can't act on (screenshot temp-file/downscaling pipeline, browser spawn lifecycle, gh "replaces former op" history and run-watch grace period, output-minimizer heuristics, BM25 ranking name, task.maxConcurrency pointer)
  • Extended the prompt-efficiency pass to the full prompt surface (subagent/plan-mode/notice/title/commit system prompts, agent definitions, goals, memories, review and autoresearch prompts): RFC-keyed prescriptive prose, fixed garbled grammar and a stale <PLAN_TITLE> placeholder in the plan-approval reminder, deduped intra-file restatements, and corrected the todo op table's claim that rm requires a task/phase (bare rm clears the whole list)
  • Replace tool prompt no longer recommends sed -i/cat-heredoc commands that the bash interceptor blocks; its bash-alternatives table now only lists non-intercepted commands
  • Capped concurrent IRC cards in the transcript's live region at 4: cards landing below a still-running tool cannot commit to native scrollback, so an unbounded burst pushed the live block's uncommitted rows above the window top (content read as cut off until the cards expired). The oldest live-region card now retires as soon as a new one would exceed the cap.
  • Interactive PTY mode (pty: true) no longer injects the non-interactive environment (TERM=dumb, GIT_EDITOR=true, PAGER=cat, NO_COLOR=1) that defeated its purpose — the PTY child now gets a real TERM=xterm-256color; and when a PTY is requested but unavailable (headless/RPC), the result now carries an explicit downgrade notice instead of silently running through a dumb pipe.
  • Raw sqlite ?q= queries are now capped at 1000 rows with an "add a LIMIT clause" notice — statement.all() on a multi-million-row table previously materialized every row, blocking the process for minutes.
  • Plain-file range reads no longer scan to EOF on files over 4MB just to count total lines (the count is reported as approximate), and multi-range reads slice from a single pass instead of re-streaming the file once per range.
  • gh run_watch now polls adaptively (3s for the first minute, then 15s), survives rate-limit errors with backoff instead of dying and discarding accumulated context, reuses job data for completed runs, and gives up with a clear message after ~90s when a commit has no workflow runs at all (previously an infinite 3-second poll loop).
  • The legacy patch-mode fuzzy matcher pre-normalizes file and pattern lines once per seek with a Levenshtein lower-bound bail, replacing the per-position re-normalization that made a single mismatched hunk against a 10k-line file cost multi-second synchronous stalls; the streaming hashline preview also caches file text and tree-sitter block resolution across ticks instead of re-reading and re-parsing every target file per streamed chunk.
  • The DAP client reader now uses chunk-list buffering and the output buffer is a chunk deque with a running byte count — debugging a chatty program previously cost O(n²) Buffer.concat per chunk plus whole-buffer byte-scans per 1KB trim, freezing the session.
  • GitHub caching: the per-lookup auth key is memoized against hosts.yml mtime (was a blocking readFileSync on every issue:// or pr:// read including cache hits), background refreshes are deduped by row identity, and PR diffs are stored once per row instead of twice (unified + rendered copies).
  • Task progress snapshots shallow-copy per-agent progress instead of structuredClone-ing nested tool payloads (up to 500KB) on every progress event; streaming assistant-message reveal caches per-block grapheme counts and skips the markdown render LRU for in-flight partials, eliminating 2-3 full Intl.Segmenter walks per 33ms tick and tens of MB of retained stale partial snapshots on long replies.
  • Python eval cells: the availability probe is cached per cwd (was two interpreter spawns per cell even with a hot kernel), and stdout frames coalesce per write instead of one locked+flushed JSON frame each.
  • Multi-entry edits now stop at the first failing entry and report exactly which entries were applied and which were not — continuing after a failure applied later entries authored against line numbers that assumed the failed entry succeeded, and a retry of the whole batch then double-applied the survivors.
  • Decomposed config/model-registry.ts further: model roles (MODEL_ROLES, getRoleInfo, getKnownRoleIds) moved to config/model-roles.ts, the models.json config handle and provider validation moved to config/models-config.ts, the two provider+id merge scaffolds collapsed into one mergeByModelKey helper, the four ~15-field override/overlay enumerations now share a ModelPatch type applied by a single applyModelPatch(base, patch, transport) core (the merge vs replace transport policies preserve the same-id custom-definition replacement semantics), and canonical-variant selection delegates to @oh-my-pi/pi-catalog/identity's new resolveCanonicalVariant
  • Resolver cleanup: five duplicated trailing-:level suffix parses collapsed into splitThinkingSuffix, the matching engine is now the documented matchModel core with the selector grammar and entry points layered on top, and resolveCliModel's hand-rolled decomposed provider/id lookup reuses findExactModelReferenceMatch; runtime discovery tests split out of test/model-registry.test.ts into test/model-discovery.test.ts
  • TranscriptContainer assembles the transcript incrementally: each block's render is reference-compared and its stripped contribution, separator, and row placement are reused when unchanged, with the persistent row array truncated and re-pushed only from the first divergent block; the leading byte-identical row count is reported to the renderer through pi-tui's new RenderStablePrefix seam so off-screen transcript rows are no longer re-rendered, re-prepared, or re-audited every frame. Block components became reference-stable to make this effective: UserMessageComponent memoizes its OSC 133 zone wrapping, WelcomeComponent and DynamicBorder cache their renders, and dashboards copy before padding (render results are readonly under the new pi-tui contract)
  • A live block whose trailing row grows in place as a visible prefix (token streaming into the cursor line) is now commit-safe through its full body instead of being held back by the volatile-tail margin — the growing row itself is the block's last and can never commit while it remains last, so a streaming reply's scrolled-off head reaches native scrollback (tmux pane history) mid-stream
  • Rewrote the bash tool's coreutils guidance (tool prompt and system prompt) around an explicit litmus: pipelines that compute a new fact (wc -l, sort | uniq -c, comm, diff) are legitimate bash, while commands that merely move, page, or trim bytes a dedicated tool can fetch remain banned — output trimming destroys data the artifact:// capture would have saved.
  • Default API auto-retries now use 10 attempts with a 500ms Anthropic-style exponential backoff capped at 8s with jitter, so transient 502/gateway failures get a longer retry budget without multi-minute local sleeps.
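
The retry budget in the last item can be checked with a little arithmetic. The sketch below assumes that "10 attempts" means one initial call plus nine retries, that the delay doubles from 500ms, and that jitter scales each delay to between 75% and 100% of the capped value; the jitter shape and the function names are assumptions, not the shipped implementation.

```typescript
// Constants taken from the note: 10 attempts, 500ms base, 8s cap.
const MAX_ATTEMPTS = 10;
const BASE_DELAY_MS = 500;
const MAX_DELAY_MS = 8000;

// Delay before retry number `retry` (1-based). `random` is injectable so the
// jitter can be pinned in tests. Assumed jitter: 75%..100% of the capped delay.
function retryDelayMs(retry: number, random: () => number = Math.random): number {
  const exponential = BASE_DELAY_MS * Math.pow(2, retry - 1);
  const capped = Math.min(MAX_DELAY_MS, exponential);
  return Math.round(capped * (0.75 + 0.25 * random()));
}

// Upper bound on total sleep across all retries (jitter at its maximum).
function worstCaseTotalMs(): number {
  let total = 0;
  for (let retry = 1; retry < MAX_ATTEMPTS; retry++) {
    total += Math.min(MAX_DELAY_MS, BASE_DELAY_MS * Math.pow(2, retry - 1));
  }
  return total;
}
```

Under these assumptions the unjittered delays are 500, 1000, 2000, and 4000ms followed by five waits of 8000ms, so the worst-case total sleep is 47.5s, which is consistent with avoiding multi-minute local sleeps.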

Fixed

  • Fixed ask question/result renders so option and answer rows are no longer duplicated when the component is re-rendered
  • Fixed streaming write/diff previews to keep line-number gutter widths stable while content grows, preventing already-rendered preview rows from being reflowed mid-stream
  • Fixed the welcome screen showing "No LSP servers" when lsp.lazy is enabled: recognized servers are now still discovered at startup and listed with a dim "available" dot (no warmup), and /status reports them as available instead of omitting the section
  • Fixed edit-tool diffs stacking adjacent ... markers around inserted block-context rows (each row added its own gap markers from a snapshot of the diff, so neighboring insertions doubled them, and a marker could be left stranded between contiguous lines): non-contiguous regions are now separated by a single blank row, normalized after insertion, and rendered as one dim ... marker in the TUI and HTML export
  • Fixed an uncaught questions.map is not a function TUI crash in the ask tool's call renderer when a model double-encoded the questions array as a JSON string (a bare string passes a truthy .length check but has no .map): the renderer now normalizes untrusted call args — parsing double-encoded questions, dropping malformed entries/options, and falling back to the "No question provided" frame instead of throwing
  • Fixed direct modelRoles consumers so comma-separated fallback chains are split before model parsing, preserving explicit thinking selectors instead of treating the comma tail as an invalid suffix.
  • Fixed model-provider detection for append-only mode, authoritative Vertex endpoint checks, and upstream-routing selection by switching from URL substring checks to catalog host-matching helpers
  • Fixed pasting into the ask tool's "Other (type your own)" text box (and hook input/editor dialogs) on terminals with OSC 5522 enhanced paste (kitty protocol): the enhanced-paste focus routing only targets components exposing a pasteText hook, and the dialog wrappers had none, so the payload was stuffed into the main prompt editor hidden behind the dialog. HookEditorComponent and HookInputComponent now forward pasteText to their inner editor/input (pasting also resets the input dialog's timeout countdown like any keystroke).
  • Fixed auto-retry giving up after one attempt ("Provider requested Xms wait, exceeds retry.maxDelayMs") on a usage-limit 429 when every sibling account was only momentarily blocked: the retry delay now waits for the earliest sibling unblock when that comes sooner than the provider's multi-hour retry-after, so the next attempt picks up the recovered account instead of failing fast.
  • Fixed Hindsight per-project-tagged mental-model seeding so each project gets its own conventions/decisions models and session context only injects active-project or untagged models (#2218).
  • Fixed Windows stdio MCP .cmd commands by wrapping batch shims with cmd.exe /d /s /c using the outer command quotes required by cmd /s, while preserving literal % and quoted JSON arguments for Codegraph MCP (#2220).
  • Fixed the bundled explore agent's thinking-level: med frontmatter — not a valid effort (minimal/low/medium/high/xhigh), so it silently parsed to undefined and the agent ran without its intended thinking level
  • Discovery context-file reads (~/.claude, ~/.cursor, project trees, @-imports) now stat-gate to regular files before reading: a FIFO/socket/char device dropped where a context file is expected previously blocked startup forever on a read that can never see EOF.
  • Fixed the read tool's provider-visible path schema and docs so web URLs and internal URI targets (omp://, issue://, pr://, etc.) are advertised alongside local files (#2215).
  • Kept IRC cards from being removed after their TTL once everything above them finalized: their rows may already be committed to native scrollback, and removing them was an interior deletion of the committed prefix that the engine could only repair by recommitting everything below the gap (duplicated blocks). Such cards now stay in the transcript as durable history.
  • Fixed the recommit storm that sprayed stale snapshots of a running task's progress tree into native scrollback. The stable-prefix ratchet promoted any row quiet for one 30-frame window, so slowly ticking rows (per-agent tool/cost counters updating every few seconds) were repeatedly promoted, committed, rewritten, and recommitted by the engine audit for the whole run. The ratchet now floors itself permanently at the first row that mutates after being promoted — settled heads (a task's prompt/context) still reach scrollback, genuine tickers never re-promote.
  • Fixed the artifact spill dropping the first ~20KB of output: head-retained bytes were never written to the artifact file, so for every bash/eval/ssh command exceeding the 50KB spill threshold, the artifact:// advertised as the "full capture" was permanently missing its head — the agent re-reading it got truncated data presented as lossless.
  • Fixed streaming-output chunk throttling dropping chunks instead of coalescing them: streaming previews and the auto-background "output so far" text the model reasons over contained output with arbitrary middles silently spliced out.
  • Fixed vault:// writes bypassing both the approval ladder and plan mode: internal-URL writes were uniformly rated tier read (auto-allowed even in always-ask) and the internal-router branch returned before the plan-mode guard, so the model could silently overwrite real Obsidian notes; writes through schemes with a mutating handler are now tier write and plan-mode-enforced.
  • Fixed writes into .tar.gz/.tgz archives silently stripping gzip compression (the rewritten archive was a bare tar under the .gz name — masked on re-read because Bun auto-detects, broken for tar xzf/CI consumers), and made archive rewrites atomic via temp-file + rename so a crash mid-write can no longer destroy every other member; symlinked archive paths resolve to their target before the swap so the rename writes through instead of replacing the link with a regular file.
  • Fixed merge-conflict detection being completely inert on CRLF files: the scanner split on \n and compared === "=======", so =======\r never matched and the agent edited around live conflict markers without warning; CRLF files now detect, splice, and round-trip their line endings correctly.
  • Fixed cross-line search (\n in the pattern) silently returning zero matches: the native searcher was never switched to multi-line mode (only the regex flag was set), so the advertised feature matched nothing on real files while reporting a confident "No matches found".
  • Fixed search results lying about completeness: one hot file could consume the entire 2000-match global budget in path order making later files unreachable by any skip (now capped per file with the footer hedging of N+ when truncated), paginating past the last page returned "No matches found" instead of "No more results", directory scans now report how many >4MB files were skipped instead of silently excluding them, adjacent matches in virtual resources no longer emit duplicated backwards-numbered context lines, and patterns are no longer trim()ed (only all-whitespace is rejected — leading/trailing whitespace is meaningful regex).
  • Fixed the search tool's native grep being uncancellable: neither the abort signal nor any timeout was threaded through, so Esc on a huge-tree search left the native walk burning CPU to completion; both now propagate (30s default timeout).
  • Fixed archive and sqlite reads that could OOM or hang the process: tar/tgz archives are stat-gated at 256MB before being loaded, zip members reject attacker-declared uncompressed sizes over 64MB before allocation, and binary plain files now return a NUL-sniff notice instead of filling the line budget with mojibake.
  • Fixed malformed internal-URL selectors (artifact://3:-100) silently dumping the whole resource instead of erroring, selectors directly on an archive root (a.zip:500, a.zip:raw) being misparsed as member names, archive members minting editable hashline tags keyed to the archive path (they are immutable resources), URL selector tokens being case-sensitive (:RAW 404ed), artifact://N resolving into another session's artifacts in multi-session hosts, and not-found paths with archive/sqlite extensions stacking multiple 5s workspace-wide suffix globs (now shared per read, with glob metachars escaped so foo[1].ts can match itself).
  • Fixed leading cd X && extraction breaking shell-expanded paths — cd "$(git rev-parse --show-toplevel)" && make failed with "Working directory does not exist" because the captured path was resolved literally; extraction now defers to the shell when the path contains $, backticks, or (.
  • Fixed the echo/printf write-redirect interceptor rule blocking legitimate commands containing > inside quotes (echo "a -> b", printf 'use 2>&1'); the rule is now quote-aware, and also catches >| clobber redirects and $VAR targets it previously missed.
  • Fixed every completed auto-backgrounded bash invocation leaking its persistent native Shell in the process-global session map, and the running-job cap failing all bash commands outright — at capacity, commands now degrade to direct foreground execution (explicit async: true still errors).
  • Fixed a duplicate-delivery race where a bash job completing just inside the auto-background threshold could be returned as the tool result and re-injected as a completion notification, and fixed auto-background silently preempting the ACP client-terminal route when an editor advertises terminal capability.
  • Fixed timed-out/cancelled PTY and client-bridge commands surfacing raw output with no annotation (the model couldn't distinguish timeout from failure and retried identically); the timeout/abort notice is now always appended.
  • Fixed ask reporting timeout auto-selection as "User selected: X" — fabricated consent for consequential questions; the result now says "(auto-selected after timeout)" with a timedOut detail flag, the transcript card marks the auto-selection distinctly, and a deliberate Esc seconds past the deadline is treated as a cancel instead of being reclassified as a timeout.
  • Fixed todo accepting duplicate task content/phase names in init (duplicates were permanently unaddressable — every targeting op hit the first match while auto-promotion kept resurrecting the twin) and persisting half-applied batches on error; failed batches no longer mutate state.
  • Fixed the auto-generated-file guard caching markers by path alone with no invalidation — a file regenerated after first check stayed editable (and vice versa); entries are now validated against mtime+size.
  • Fixed editor-bridged (ACP) writes skipping the post-write bookkeeping the direct path performs (bumpFileMutationVersion, shebang chmod), so mutation-version consumers saw stale state depending on whether an editor was attached.
  • Fixed conflict://* resolution failing spuriously when an out-of-band edit shifted a conflict block (stale duplicate registrations are now tolerated as already-resolved — but a DISTINCT conflict block that is merely byte-identical and still present in the file stays addressable), and partial conflict-resolution failures now set isError instead of burying failed files mid-text in a success result.
  • Fixed patch-mode prefix/substring matches silently truncating line content the model never saw: every non-exact match strategy now emits a warning with strategy + similarity, and prefix/substring matches are rejected unless the discarded fragment survives in the replacement lines.
  • Fixed ast-edit and file-mention snapshots being recorded under non-canonical paths (invisible to stale-tag recovery under symlinked cwds), and the ast-edit apply step leaving every just-issued preview tag stale — post-apply snapshots are re-recorded and fresh tags surfaced in the result.
  • Fixed notebook cells containing literal # %% [markdown] marker text being silently split into extra cells on any edit; marker-shaped source lines are now escaped on render and restored on parse.
  • Fixed the LSP client being published before initialize completed (concurrent callers hit "server not initialized" flakes on first use), reader-loop death leaving a permanent zombie client where every request times out at 30s forever (bad messages are now isolated per-message and a dead reader tears the client down for respawn), framing stalls on header blocks without Content-Length (now resynced past the junk in both LSP and DAP), lsp status hardcoding ready for every client including wedged ones, and shutdown skipping clients still mid-initialize (their server processes outlived exit).
  • Fixed numeric LSP code-action selectors being shadowed by substring title matches — query: "2" could apply a different quickfix whose title contained "2"; numeric queries now select strictly by index.
  • Fixed file:// URIs built without percent-encoding: a % in a path threw URIError on round-trip and a # truncated the server-side path, desynchronizing diagnostics and workspace edits; URIs from lax servers carrying a raw #/? now route to the lenient parser instead of parsing "successfully" as fragment/query and misrouting edits.
  • Fixed multiple LSP inserts at the same position applying in reverse of spec order (transposed import/reference insertions), and applyWorkspaceEdit now overlap-validates every file before writing any, so a conflicting rename no longer leaves the workspace half-renamed.
  • Fixed lsp reload hanging for the whole tool timeout (didChangeConfiguration was sent as a request; it is a notification), biome failures being silently reported as "no diagnostics", a hung language server adding up to 30s to every edit (writethrough init is now deadline-bounded at 5s with deterministic spawn failures negative-cached for 3 minutes), and DAP pause() burning its full timeout when the stopped event raced the subscription; concurrent DAP breakpoint mutations are also serialized per session (last-writer no longer silently drops the other's breakpoints), queued mutations honor the caller's abort at dequeue, and the DAP output buffer retains a full 128KB tail instead of dropping whole chunks below the cap.
  • Fixed concurrent isolated background tasks interleaving git stash push/pop + cherry-pick on the shared repository — the merge sequence now runs under the repo lock, eliminating a lost-uncommitted-changes race; a stash-pop failure after successful cherry-picks also no longer reports merged branches as "unmerged" (the duplicate-commit trap) and instead tells the user to pop the stash manually.
  • Fixed async task batches getting stuck "running" forever (unscheduled/failed-to-register tasks never counted toward completion), error-result jobs being marked completed, semaphore-queued tasks counting against the 15-job global cap (batches >15 dropped the remainder and starved other async work), duplicate task ids skipping validation on the async path, and an abort racing subagent session startup leaking the late-created session's LSP/MCP processes.
  • Fixed task fail-fast abandoning in-flight siblings uncancelled (the worker signal now propagates), patch-mode merges blocking ALL successful siblings' patches when one task failed, and $@ command expansion interpreting $-replacement patterns in user input.
  • Fixed eval cells double-writing artifacts (the tool and the per-cell executor each opened a sink on the same artifact path, corrupting >50KB outputs), JS parallel() early-rejecting in violation of its documented barrier (orphaning in-flight agent() thunks with worker-side promises hung forever), Python child subprocesses inheriting the NDJSON frame pipe (their stdout was dropped and could corrupt protocol frames — it is now captured and forwarded), JS cell timeouts silently wiping persistent VM state without annotation, and the JS console bridge throwing on console.dir/time/group/assert/trace.
  • Fixed pr_push never invalidating the PR/diff cache (the canonical push-then-verify flow read a pre-push diff for up to 5 minutes), current-branch gh pr merge/close with no positional never invalidating at all (exactly the staleness the cache layer claims to eliminate; numeric flag values like --milestone 3 also no longer steal the positional), multi-PR checkouts discarding successful checkouts and racing in-flight git mutations on first failure (allSettled with per-PR reporting), run-watch ending with a failure result and zero logs when an auto-retry raced the grace-period refetch, the per-watch completed-run job cache serving a rerun's FIRST-attempt jobs after the rerun completed (entries are evicted whenever a run is observed non-completed), pagination terminating on post-filter page length, millisecond precision leaking into GitHub search date qualifiers, leading-dash PR identifiers reaching gh as flags, and issue://?state= typos silently coercing to the open list.
  • Fixed the Exa API key being written to the log file on every failed MCP request (the key rode the query string of logged URLs; key/token/secret/auth params are now redacted), and removed the web-search query rewrite that replaced every 202x substring with the current year — it corrupted CVE identifiers and made historical-year searches silently impossible.
  • Fixed reopening the sole browser tab with a different dialogs policy disposing Chromium and then using the dead handle, a stale tab release evicting a live replacement browser from the registry (spawning duplicate Chromium processes), and concurrent same-name open calls leaking a worker + refcount via a check-then-set race (acquisitions are now single-flight per name); queued opens honor an abort at dequeue, and an init-payload failure releases the temporary browser hold instead of pinning the refcount forever.
  • Fixed fetch decoding every response as UTF-8 regardless of declared charset (Shift_JIS/EUC-KR/GBK pages rendered as mojibake through the whole reader pipeline; Content-Type and <meta charset> are now honored via TextDecoder), binary URLs being downloaded twice (body skipped on the first pass for convertible types), >50MB truncation being silent (now flagged in notes), all transport error detail being swallowed into a bare "Failed to fetch URL" (the cause is surfaced and 429s get one Retry-After-honoring, abort-aware retry), MCP SSE keep-alive lines escaping as raw SyntaxErrors, MCP calls having no default timeout (now 60s), and a YouTube fetch budget expiry being misreported as a user abort that also skipped temp-file cleanup.
  • Fixed archive directory listings silently ignoring the selector offset — a.zip:dir:50 now starts the listing at the 50th entry instead of relisting from the top.
  • Fixed the model selector dropping an immediate Enter when cached models were available but the selector's offline refresh was still pending.
  • Fixed dynamic import(...) inside functions passed to the browser tool's tab.evaluate/page.evaluate failing with __omp_import__ is not defined. The eval/browser JS runtime rewrites dynamic-import callees to the worker-injected __omp_import__ helper, but puppeteer serializes evaluate callbacks with Function.prototype.toString() and re-runs them inside the page, where the helper does not exist. The rewriter now substitutes a guarded shim that falls back to native dynamic import when the helper is absent, so serialized code works in the page realm while in-worker imports keep resolving against the session cwd.
  • Transcript block freezing is now unconditional instead of gated on ED3-risk terminal detection: every finalized block replays its frozen snapshot once it crosses out of the live region, on all terminals including Windows, because the rewritten renderer's committed scrollback is immutable everywhere. Still-mutating blocks (pending tools, streaming messages, async thinking renderers) anchor the live region and keep repainting until they finalize, which structurally fixes stale/duplicated output from late async expansions (#1823).
  • Fixed the edit tool's post-edit diff preview occasionally echoing a context line twice with out-of-order numbering. Block-boundary context injection classified space-prefixed diff rows as old-file-only, so an unchanged line sitting in a net-offset region (old N / new N+k) was missing from the new file's visibility window; findBlockContextLines then re-surfaced it under its post-edit number and the row was spliced in after the adjacent change run. New-file boundary lines are now translated back to pre-edit numbers (the compact-preview renumbering contract) and merged into a single old-numbered insertion pass — also fixing closers below a net-offset edit being dropped or renumbered incorrectly.
  • Fixed the Anthropic web-search provider claiming the Claude Code identity on API-key requests: the CC billing header + system instruction were injected whenever the model wasn't Haiku 3.5, regardless of auth mode. Injection is now OAuth-gated like the streaming path, and OAuth search requests patch the billing header's cch attestation (via wrapFetchForCch) instead of shipping the cch=00000 placeholder.
  • Fixed long streamed content appearing cut off mid-run: scrolled-off rows were erased from the viewport without ever being appended to terminal history. The transcript's commit boundary (deriveLiveCommitState) was all-or-nothing per block — one perpetually rewriting row (a task tool's ticking progress tree, per-agent cost/tool counters, spinner stats) suspended scrollback commits for the entire block, so once the block outgrew the viewport its static head (e.g. a task's prompt/context markdown) was neither committed nor on screen until the tool sealed, and was lost outright if the session ended mid-run. A stable-prefix ratchet now promotes leading rows that stayed visibly identical for a full 30-frame window as commit-safe, so the settled head reaches native scrollback while only the genuinely volatile tail stays deferred; a rewrite above the promoted run retreats the boundary and the engine audit recommits (duplication, never loss).
  • Fixed local tiny-title worker stdout/stderr leaking raw native model output such as </title> and cache/status lines into the interactive TUI scrollback (#2206).
  • Fixed task-agent discovery advertising Claude Code custom agents from .claude/agents/*.md as OMP subagents; direct task-agent discovery now only loads OMP-native .omp agent roots, while Claude marketplace plugin agents keep their existing provider path (#2209).
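
The file:// URI fix above comes down to percent-encoding each path segment before the URI is built. A minimal sketch, assuming POSIX-style absolute paths; pathToFileUri and fileUriToPath are hypothetical names used for illustration and are not the project's helpers:

```typescript
// Hypothetical helpers for illustration; not the project's actual functions.
// Assumes POSIX-style absolute paths such as "/tmp/a#b.ts".

/** Build a file:// URI, percent-encoding every path segment. */
function pathToFileUri(posixPath: string): string {
  const encoded = posixPath.split("/").map(encodeURIComponent).join("/");
  return `file://${encoded}`;
}

/** Recover the path; encoded "%" and "#" survive the round-trip. */
function fileUriToPath(uri: string): string {
  return decodeURIComponent(new URL(uri).pathname);
}

const original = "/tmp/100%/a#b.ts";
const uri = pathToFileUri(original);
const roundTripped = fileUriToPath(uri);

// Without encoding, "#" starts a fragment and the path is cut short.
const truncated = new URL("file:///tmp/a#b.ts").pathname;
```

Encoding per segment keeps the "/" separators intact while making "%" and "#" safe, which is the property the round-trip depends on.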

Removed

  • Removed the clearOnShrink setting and its PI_CLEAR_ON_SHRINK environment variable: the rewritten renderer always clears shrunken rows exactly, so the flicker/perf tradeoff the setting controlled no longer exists. Existing config entries are ignored.
  • Removed the prompt-submit native-scrollback reconciliation checkpoint and the eager streaming render mode from the interactive controllers — the renderer's append-only contract made both obsolete.

@oh-my-pi/hashline

Breaking Changes

  • Changed BlockResolution.isDelete to BlockResolution.op ("replace" | "delete" | "insert_after") so resolutions can describe every block-anchored op
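
Callers that previously branched on isDelete would now branch on op. A rough migration sketch: only the op union is taken from the note above, while the startLine field and the describeResolution helper are hypothetical and exist only to make the example self-contained:

```typescript
// Only the `op` union comes from the release note; everything else here is
// a hypothetical stand-in for illustration.
type BlockOp = "replace" | "delete" | "insert_after";

interface BlockResolutionSketch {
  op: BlockOp;
  startLine: number; // hypothetical field
}

/** Describe a resolution; stands in for an old `if (res.isDelete)` check. */
function describeResolution(res: BlockResolutionSketch): string {
  switch (res.op) {
    case "delete":
      return `delete block at line ${res.startLine}`;
    case "replace":
      return `replace block at line ${res.startLine}`;
    case "insert_after":
      return `insert after block at line ${res.startLine}`;
    default: {
      // Exhaustiveness check: adding a new op to the union fails to compile here.
      const unreachable: never = res.op;
      throw new Error(`unknown op: ${String(unreachable)}`);
    }
  }
}
```

Switching on the union rather than testing a boolean lets the compiler flag every call site when another op is added later.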

Added

  • Added insert after block N: patch syntax to insert body rows after the last line of the tree-sitter-resolved block beginning on line N, so a statement can be placed after a construct without counting to its closing line
  • Added depth-guided landing correction for insert after N: hunks: a body indented shallower than its anchor line slides past the structural closer lines below the anchor until depth returns to the body's level, with a warning naming the final landing line. The shift never crosses content lines, skips incomparable indentation styles and pure-closer bodies, and is abandoned when another hunk targets a crossed line
  • Added a global byte ceiling to InMemorySnapshotStore (maxTotalBytes, default 64 MiB): the cap was previously per-file only, so a session reading many large files retained up to 30 paths × 4 full-text versions indefinitely

Changed

  • Trimmed the replace block N: ops entry in the patch prompt to grammar and pointing rules; the usage doctrine it duplicated stays in the rules section
  • Changed buildCompactDiffPreview to treat blank rows as gap separators alongside markers: separators never stack (removed lines omitted from the preview no longer leave two adjacent), and leading/trailing separators are trimmed
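
The separator rule described above (no stacking, no leading or trailing separators) can be sketched as a single pass over the preview rows. This is an illustration under assumptions, not the buildCompactDiffPreview source: the "..." gap marker and both function names are hypothetical:

```typescript
// Hypothetical sketch; the "..." gap marker is an assumed placeholder.
const GAP_MARKER = "...";

function isSeparator(row: string): boolean {
  const trimmed = row.trim();
  return trimmed === "" || trimmed === GAP_MARKER;
}

/** Drop leading, trailing, and stacked separators from preview rows. */
function normalizeSeparators(rows: string[]): string[] {
  const out: string[] = [];
  for (const row of rows) {
    if (isSeparator(row)) {
      const last = out[out.length - 1];
      // Skip a leading separator and any separator that would stack.
      if (last === undefined || isSeparator(last)) continue;
    }
    out.push(row);
  }
  // Trim a trailing separator, if one is left.
  let tail = out[out.length - 1];
  while (tail !== undefined && isSeparator(tail)) {
    out.pop();
    tail = out[out.length - 1];
  }
  return out;
}
```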

Fixed

  • Fixed the boundary-echo repair stripping payload edges without the balance-neutrality guard its own documentation promised: in brace-heavy code where bare } lines repeat, a payload intentionally beginning/ending with lines identical to the range's neighbors had both edges silently dropped, writing content that differed from what was authored
  • Fixed lenient bare-body handling silently mutating payloads: interior blank rows in an un-prefixed body were dropped outright, and a body of numeric-keyed literals (1: "one" dict/YAML shapes) satisfied the uniform line-prefix check and had its keys stripped from every line — blank rows are now preserved when proven interior, and the uniform strip refuses lone-literal remainders
  • Fixed the multi-section "all-or-nothing" claim being false for write failures: commits run serially, so a mid-batch write error left earlier sections on disk while the thrown error said nothing — the error now lists exactly which sections were written and which were not
  • Fixed delete/replace ranges ending on the phantom trailing line of a newline-terminated file silently stripping the file's final newline; such anchors are now rejected with guidance toward N-1 / insert tail: (inserts there remain valid, and genuine empty last lines of unterminated files stay deletable)
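
The phantom-trailing-line case in the last fix can be illustrated with a small check. The sketch assumes 1-based line numbers and that a newline-terminated file exposes one extra empty line after its final newline; both helper names are hypothetical:

```typescript
// Hypothetical sketch. Assumes 1-based line numbers and that a
// newline-terminated file exposes one extra empty "phantom" line at the end.

/** Line number of the phantom trailing line, or null if the file is unterminated. */
function phantomLineNumber(text: string): number | null {
  return text.endsWith("\n") ? text.split("\n").length : null;
}

/** Return a rejection message when a delete/replace range ends on the phantom line. */
function validateRangeEnd(text: string, endLine: number): string | null {
  if (endLine === phantomLineNumber(text)) {
    return `line ${endLine} is the phantom trailing line; end the range at ${endLine - 1} instead`;
  }
  return null;
}
```

For "a\nb\n" the phantom line is line 3, so a range ending there is rejected, while the same range on the unterminated "a\nb" has no phantom line to hit.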

@oh-my-pi/pi-mnemopi

Fixed

  • Fixed embedding provider detection to match OpenRouter by URL host, so custom embedding endpoints are recognized correctly instead of being misclassified by substring matching
  • Fixed the check for OpenRouter base URLs so only true OpenRouter hosts are treated as non-custom
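
The difference between host matching and substring matching can be sketched as follows. The helper name is hypothetical, and treating openrouter.ai (plus its subdomains) as the canonical host is an assumption made for the example:

```typescript
// Hypothetical helper; "openrouter.ai" as the canonical host is an assumption.
function isOpenRouterBaseUrl(baseUrl: string): boolean {
  let host: string;
  try {
    host = new URL(baseUrl).hostname.toLowerCase();
  } catch {
    return false; // not a parseable URL, so not an OpenRouter host
  }
  return host === "openrouter.ai" || host.endsWith(".openrouter.ai");
}

// A substring check would misclassify the last two as OpenRouter.
const official = isOpenRouterBaseUrl("https://openrouter.ai/api/v1");
const proxyPath = isOpenRouterBaseUrl("https://proxy.example.com/openrouter/v1");
const lookalike = isOpenRouterBaseUrl("https://openrouter.ai.example.com/v1");
```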

@oh-my-pi/pi-natives

Added

  • Added a maxCountPerFile option to grep that caps how many matches a single file may contribute, so one hot file can no longer exhaust the global maxCount budget in path order and crowd every file sorted after it out of the result set.
  • Added PI_DEBUG_STARTUP streaming markers to the addon loader (native:loadNative:start, native:extractEmbeddedAddon:start, native:require:<file>, native:loadNative:done), written with synchronous stderr writes so a hang inside first-run extraction or dlopen() — which blocks the event loop and defeats any timer-based diagnostics — still leaves the failing step as the last marker on stderr.
  • Added a skippedOversized count to GrepResult: directory walks now report how many files were silently skipped for exceeding the 4MB per-file grep limit (previously they vanished without a trace, letting callers conclude a symbol does not exist).
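
How a per-file cap interacts with the global budget can be shown with a small pure function. This is a sketch of the idea only, not the native grep implementation; the Match shape and capMatches are hypothetical:

```typescript
// Hypothetical sketch of the budgeting idea; not the native implementation.
interface Match {
  path: string;
  line: number;
}

/** Keep at most maxCount matches overall and maxCountPerFile per file. */
function capMatches(matches: Match[], maxCount: number, maxCountPerFile: number): Match[] {
  const perFile = new Map<string, number>();
  const kept: Match[] = [];
  for (const match of matches) {
    if (kept.length >= maxCount) break; // global budget spent
    const seen = perFile.get(match.path) ?? 0;
    if (seen >= maxCountPerFile) continue; // this file has used its share
    perFile.set(match.path, seen + 1);
    kept.push(match);
  }
  return kept;
}

// Matches arrive in path order: five from a.ts, then one from b.ts.
const sample: Match[] = [
  ...[1, 2, 3, 4, 5].map((line) => ({ path: "a.ts", line })),
  { path: "b.ts", line: 9 },
];
const uncapped = capMatches(sample, 3, Number.POSITIVE_INFINITY).map((m) => m.path);
const capped = capMatches(sample, 3, 2).map((m) => m.path);
```

With no per-file cap, a.ts consumes all three slots; with a cap of two, b.ts still reaches the result set.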

Changed

  • Parallelized the mtime-ranked glob() walk (the path OMP find always takes): per-thread bounded top-N heaps replace the single-threaded full-stat traversal, so large trees rank in a fraction of the wall clock while keeping the deterministic mtime-desc/path ordering and bounded memory.
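
The reason bounded per-thread results still give an exact ranking is that the global top N is always contained in the union of each shard's top N. A single-threaded sketch of that merge, with a sorted array standing in for a real heap; the Entry shape, the function names, and the ascending-path tie-break are assumptions:

```typescript
// Hypothetical sketch; a sorted array stands in for a bounded heap.
interface Entry {
  path: string;
  mtimeMs: number;
}

/** Newest first; ties broken by path so the order is deterministic (assumed). */
function compareEntries(a: Entry, b: Entry): number {
  if (a.mtimeMs !== b.mtimeMs) return b.mtimeMs - a.mtimeMs;
  return a.path < b.path ? -1 : a.path > b.path ? 1 : 0;
}

/** Keep only the best `limit` entries, never holding more than limit + 1. */
function topN(entries: Entry[], limit: number): Entry[] {
  const best: Entry[] = [];
  for (const entry of entries) {
    best.push(entry);
    best.sort(compareEntries);
    if (best.length > limit) best.pop();
  }
  return best;
}

/** Each shard is reduced independently, then the survivors are merged. */
function mergeShards(shards: Entry[][], limit: number): Entry[] {
  const survivors: Entry[] = [];
  for (const shard of shards) survivors.push(...topN(shard, limit));
  return topN(survivors, limit);
}
```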

Fixed

  • Fixed cross-line grep being a silent no-op on real files: multiline set the (?m) flag on the regex matcher but never enabled multi_line on the Searcher, which stayed line-oriented, so any pattern spanning a \n returned zero matches with no error.

@oh-my-pi/omp-stats

Changed

  • Bundled-model lookups (getBundledModel, GeneratedProvider) now import from the new @oh-my-pi/pi-catalog package instead of the @oh-my-pi/pi-ai barrel, which no longer re-exports catalog values
  • The session-sync worker re-enters the host CLI entry (workerHostEntry() + __omp_stats_sync_worker argv selector) when running inside omp — source, npm bundle, or compiled binary — and keeps loading its own sync-worker.ts module directly for standalone omp-stats, bun test, and SDK hosts

@oh-my-pi/pi-tui

Added

  • SettingsList now supports type-to-search filtering with Escape clearing an active query before canceling.

Changed

  • Preserved list selection by item ID when replacing settings so focus stays on the same setting
  • Displayed a "no matching settings" message and a search-editing hint when filtering returns no matches
  • Expanded settings search matching to include IDs, current values, descriptions, and option values as well as labels
  • Raised the stdin split-escape flush window from 10ms to 50ms: over laggy links (ssh, slow multiplexers) a CSI sequence split across reads was flushed as literal data, leaking [ + A style fragments into the editor as typed text
  • Lengthened the OSC 11 appearance poll on terminals without Mode 2031 from 2s to 30s — each poll's query write cleared the user's active text selection, breaking copy every two seconds on Alacritty/Warp/older WezTerm
  • Rewrote StdinBuffer.extractCompleteSequences to index-based scanning: the previous per-iteration slice + Array.from(remaining)[0] made plain-text bursts O(n²), turning a 100KB non-bracketed paste into a multi-second freeze
  • Capped the editor undo stack at 100 entries with word-level coalescing of consecutive single-character inserts (matching Input), capped the kill ring at 60 entries, cached word-wrap layout per (line, width) so each render and key handler shares one wrap pass, and batched ≤1000-char single-line pastes into one insert + one trigger-detection pass instead of per-character replay
  • Virtualized the frame pipeline around a stable-prefix contract — the renderer no longer does O(total transcript) work per frame. Component.render now returns readonly string[]: results are component-owned, callers must not mutate them, and an unchanged component returns the same array reference (reference equality proves byte-identical rows). Container.render memoizes its concatenation on child references (children are still rendered every frame for their side effects); Box replaced its content-hashing cache with the same child-reference memo (no more per-frame leftPad + line rebuilds and full-content hashing); Markdown, Spacer, and TruncatedText return their cached arrays by reference instead of defensive copies. The TUI composes a persistent frame from per-child segments and an opt-in RenderStablePrefix report (consumable floor semantics for in-place mutators like the transcript), so marker extraction, line preparation (persistent prepared-frame replacing the per-frame rebuilt cache arrays), and the committed-prefix audit now run only over rows at/after the first changed row instead of every line of the transcript every frame
  • Rewrote the render core around an append-only native-scrollback contract. Committed rows are immutable: rows enter terminal history exactly once, in order, when the component-reported commit boundary (NativeScrollbackLiveRegion) marks them final, and the visible window repaints in place with relative moves. The engine no longer probes the terminal's scroll position or guesses whether a destructive rebuild is safe — the entire ED3-risk/defer/checkpoint machinery (viewport probes, eager streaming mode, dirty-scrollback reconciliation, deferred shrink/mutation intents, streaming high-water rebuilds, ConPTY-specific defer paths) is deleted. ED3 (CSI 3 J) now fires only on explicit user gestures: session replace, resize outside multiplexers, and resetDisplay(). This structurally removes the yank / flash / duplicated-rows / invisible-until-resize failure families tracked across #1610, #1635, #1651, #1682, #1719, #1746, #1799, #1823, #1962, #1974, #2000, #2011, #2154.
  • A frame that shrinks into its committed prefix re-anchors the visible window at the new tail and restarts commit bookkeeping; previously committed rows stay in history (history is never rewritten without a gesture).
  • Overlays now composite into the visible window slice only and freeze commits while visible, so overlay pixels can never enter native scrollback and closing an overlay no longer triggers a destructive history rebuild.
  • Inline-image budget demotion now deletes the demoted image's graphics by id and lets the window diff repaint the text fallback — no more mid-session destructive full replay when the image cap is exceeded.
  • The render-stress harness now validates the contract with a shadow commit ledger (an independent reimplementation of the ledger math fed only by observed frames and bytes), asserting scrollback equals the committed prefix row-for-row and that tape growth matches physical scroll exactly, across randomized op sequences, resizes, overlays, and multiplexer scenarios. The ghostty-web virtual terminal additionally survives libghostty-vt 0.4's WASM allocator traps via an event-log replay/compaction recovery, and strips non-spacing combining marks on input (a margin-aligned combining cluster deterministically corrupts that engine; mark placement through it was already unverifiable).
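
The stable-reference contract described above (an unchanged component returns the same array, and a container memoizes its concatenation on child references) can be sketched in a few lines. The class names and shapes here are hypothetical; only the reference-equality idea is taken from the notes:

```typescript
// Hypothetical minimal shapes; only the "same reference means unchanged rows"
// contract is taken from the release note.
interface ComponentSketch {
  render(width: number): readonly string[];
}

class TextSketch implements ComponentSketch {
  private rows: readonly string[];
  constructor(text: string) {
    this.rows = [text];
  }
  setText(text: string): void {
    // Allocate a new array only when the content actually changes.
    if (this.rows[0] !== text) this.rows = [text];
  }
  render(_width: number): readonly string[] {
    return this.rows;
  }
}

class ContainerSketch implements ComponentSketch {
  private children: ComponentSketch[];
  private lastInputs: (readonly string[])[] = [];
  private lastOutput: readonly string[] = [];
  constructor(children: ComponentSketch[]) {
    this.children = children;
  }
  render(width: number): readonly string[] {
    // Children still render every frame; only the concatenation is memoized.
    const inputs = this.children.map((child) => child.render(width));
    const unchanged =
      inputs.length === this.lastInputs.length &&
      inputs.every((rows, i) => rows === this.lastInputs[i]);
    if (!unchanged) {
      this.lastInputs = inputs;
      this.lastOutput = ([] as string[]).concat(...inputs);
    }
    return this.lastOutput;
  }
}
```

Because callers must not mutate the returned arrays, a reference comparison is enough to prove that the rows are byte-identical, so no hashing or deep comparison is needed.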

Fixed

  • Fixed Windows rendering degrading into CP437 mojibake (Γöé/ΓöÇ instead of box-drawing borders and Nerd Font glyphs) after a console-sharing child process changed the console codepage (e.g. PHP CLI's implicit chcp, php.net request #73716): the breakage stayed latent until the next full repaint such as ctrl+o expand. The terminal now re-asserts the UTF-8 codepage (output and input) before each stdout write
  • Fixed crash recovery leaving the shell unusable: emergencyTerminalRestore (and terminal.stop()) neither left the alt screen nor disabled mouse tracking, so a crash during a fullscreen overlay stranded the user on the alternate buffer with any-motion mouse reporting spewing escape garbage until a manual reset
  • Fixed bracketed paste with a lost ESC[201~ end marker (ssh/tmux truncation) silently eating all subsequent input forever while growing memory unboundedly — paste mode now has an inactivity watchdog (1s) and a byte cap (64 MiB) that exit paste mode and deliver the accumulated bytes through the paste event
  • Fixed vertical cursor movement using UTF-16 code units as visual columns: Up/Down over emoji/CJK lines could land the cursor mid-surrogate-pair, rendering a lone surrogate and permanently corrupting the buffer on the next insert; movement now walks graphemes and snaps the target offset to a cluster boundary, also fixing column drift across wide glyphs
  • Fixed cursor positions inside whitespace trimmed at a word-wrap boundary mapping to no layout line — the cursor vanished and the viewport jumped to the buffer's last line; the preceding chunk now owns the skipped whitespace run
  • Fixed word-delete and kill-to-line operations (Ctrl+W/Alt+D/Ctrl+U/Ctrl+K) cutting through atomic paste markers, leaving [Paste #1, +30 junk that no longer expanded to the pasted content on submit — delete ranges now extend over any atomic token they intersect
  • Fixed the kitty CSI-u printable dedup swallowing a real keystroke arbitrarily long after the duplicated event; the pending codepoint now expires after 25ms
  • Fixed resetDisplay() being a no-op on the alt screen: the redraw gesture could not repair a corrupted fullscreen modal because #emitAltFrame skipped identical-string repaints without consulting the force-repaint flag
  • Fixed the ghostty initial-image paint deferral consuming resize/cursor state before abandoning the frame, which could misclassify the deferred render's reflow and corrupt the paint — the deferral check now runs before any frame state is touched
  • Fixed the terminal-cursor inline-hint branch adding the full hint width to the line accounting even though the rendered hint was truncated, misaligning right padding whenever the hint overflowed
  • Fixed nested markdown list detection sniffing for hardcoded \x1b[36m (chalk cyan): every shipped theme emits truecolor/256-color SGR for bullets, so nested items doubled their indentation per level on all real themes; nesting is now tagged structurally by the list renderer. Ordered-list continuation lines also hang by the actual bullet width, so wrapped text under items numbered 10. and above aligns
  • Fixed committed transcript rows silently vanishing when a component re-laid-out content the engine had already scrolled into native history — a TTSR stream rewind truncating a streamed block, or the image budget demoting a committed inline image to its one-line fallback, shifted every row below by the height delta and the engine kept committing from the stale index, skipping that many rows of everything after (missing interruption banners, half-cut images in scrollback). The engine now audits its committed prefix every ordinary frame: an in-place edit or restyle keeps its alignment (stale styling in history remains the accepted artifact), while any shift re-anchors the commit index at the first moved row and recommits from there — history keeps the stale copy and gains a fresh one. Duplication, never loss. The detector (findCommittedPrefixResync, exported for the stress harness's shadow ledger) samples the prefix tail SGR-stripped so theme restyles and single-row edits never trigger spurious recommits.
  • Fixed budget-demoted inline images shrinking their transcript block: the text fallback is now height-preserving once a graphic has rendered (reserved rows plus the fallback line), so demotion never shifts content below a committed image.
  • Fixed stale trailing cells bleeding into committed history on combining-heavy rows: the native width model can over-count Arabic/combining clusters, classifying a short-rendering row as full-width and skipping the trailing erase — the previous occupant's cells then scrolled into scrollback baked into the committed row. Non-ASCII row rewrites now erase the line before writing.
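
The vertical-cursor fix above relies on snapping a UTF-16 offset to a grapheme-cluster boundary. A minimal sketch using the standard Intl.Segmenter API; the function name is hypothetical and the editor's real movement logic, which also accounts for visual column width, is not reproduced here:

```typescript
// Hypothetical helper: snap a UTF-16 offset back to the start of the
// grapheme cluster that contains it.
function snapToGraphemeBoundary(text: string, offset: number): number {
  if (offset <= 0) return 0;
  if (offset >= text.length) return text.length;
  const segmenter = new Intl.Segmenter(undefined, { granularity: "grapheme" });
  for (const { index, segment } of segmenter.segment(text)) {
    // The first cluster whose end lies past the offset contains it.
    if (offset < index + segment.length) return index;
  }
  return text.length;
}

// "a", a thumbs-up emoji (one cluster, two UTF-16 code units), then "b".
const line = "a\uD83D\uDC4Db";
const midPair = snapToGraphemeBoundary(line, 2); // offset between the surrogates
const afterEmoji = snapToGraphemeBoundary(line, 3);
```

An offset that lands between the two surrogates is moved back to the start of the emoji, so a later insert can no longer split the pair.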

Removed

  • Removed the probe/defer API surface: TUI.setEagerNativeScrollbackRebuild(), TUI.refreshNativeScrollbackIfDirty(), TUI.setClearOnShrink()/getClearOnShrink(), RenderRequestOptions.allowUnknownViewportMutation, NativeScrollbackRefreshOptions, Terminal.isNativeViewportAtBottom(), Terminal.hasEagerEraseScrollbackRisk(), and the eagerEraseScrollbackRisk/submitPinsViewportToTail capability fields with their detectors.
  • Removed the PI_TUI_ED3_SAFE, PI_CLEAR_ON_SHRINK, and PI_TUI_DEBUG environment variables (the levers they tuned no longer exist; PI_DEBUG_REDRAW now logs the commit-ledger state per frame).

@oh-my-pi/pi-utils

Added

  • Restored PI_DEBUG_STARTUP streaming startup markers: logger.time now writes a synchronous [startup] <op>:start / :done / :fail stderr line per phase (independent of PI_TIMING), so a startup that hangs hard still names the phase it is stuck in — the PI_TIMING tree only prints after startup completes and is structurally unable to diagnose a hang. The CLI runner emits cli:load:<name> markers around each lazily-imported command module for the same reason.
  • Added logger.openSpanPath(), which returns the ops of the currently open timing-span chain (root → deepest); the coding agent's startup watchdog uses it to name the in-flight phase of a stalled startup.
  • Added declareWorkerHostEntry() / workerHostEntry() (env): self-dispatching CLI entrypoints declare Bun.main as the worker host so worker spawn sites can re-enter the single entry module with WorkerOptions.argv selectors across source, npm-bundle, and compiled distributions

Changed

  • Changed prompt.compile() to cache compiled templates by the raw template string so repeated calls reuse the same compiled function without re-disambiguating
  • Snowflake.formatParts packs the id as a single 64-bit BigInt hex format instead of stitching four 16-bit segments (simpler and ~1.7x faster), and getTimestamp extracts via exact double arithmetic instead of a BigInt round-trip. Output is bit-identical.
  • Logger initialization is lazy: the winston logger, file transport, and log-directory creation now happen on first log emission instead of at module import (the import previously cost ~8ms of fs work on the CLI startup path); the in-memory timing infrastructure never touches winston
  • prompt.format() post-processing got cheap per-line guards and a single-pass ASCII-symbol replacement (was 7 chained regex passes per line), roughly halving render post-processing cost; output is byte-identical
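
Caching compiled templates by their raw string, as described for prompt.compile() above, is plain memoization keyed on the template text. A sketch under assumptions: the {{name}} placeholder syntax and the stand-in compiler are hypothetical and are not the library's template engine:

```typescript
// Hypothetical stand-in compiler; the {{name}} syntax is assumed for illustration.
type Render = (vars: Record<string, string>) => string;

let compileCount = 0;

function compileUncached(template: string): Render {
  compileCount++;
  return (vars) =>
    template.replace(/\{\{(\w+)\}\}/g, (_match, key: string) => vars[key] ?? "");
}

const compiledCache = new Map<string, Render>();

/** Return the cached compiled function for a template, compiling at most once. */
function compile(template: string): Render {
  let render = compiledCache.get(template);
  if (render === undefined) {
    render = compileUncached(template);
    compiledCache.set(template, render);
  }
  return render;
}
```

Keying on the raw string means two call sites that pass the same template text share one compiled function, regardless of where the string came from.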

Fixed

  • Fixed prompt.format() so ASCII symbol replacements such as --> and != still run on lines containing a closing HTML comment token when not inside a comment
  • isCompiledBinary() now also honors a define-folded process.env.PI_COMPILED (only Bun.env was checked), so builds that constant-fold process.env keep compiled-binary detection without relying on import.meta.url bunfs markers
  • omp <cmd> --help now loads only the requested command module instead of the entire command table, so an unrelated command whose import graph hangs or crashes can no longer take down every per-command help invocation.
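
The per-command help fix can be pictured as a table of lazy loaders where only the requested entry is ever invoked. A sketch with hypothetical names; in a real CLI each loader would typically wrap a dynamic import() of the command module rather than return an inline object:

```typescript
// Hypothetical sketch; real loaders would wrap dynamic import() calls.
type CommandModule = { help: string };
type Loader = () => Promise<CommandModule>;

const loaded: string[] = [];

const commands: Record<string, Loader> = {
  build: async () => {
    loaded.push("build");
    return { help: "usage: build" };
  },
  broken: async () => {
    loaded.push("broken");
    throw new Error("import graph crashed");
  },
};

/** Load only the requested command; unrelated loaders never run. */
async function helpFor(name: string): Promise<string> {
  const loader = commands[name];
  if (!loader) throw new Error(`unknown command: ${name}`);
  return (await loader()).help;
}
```

Asking for help on build never touches the broken loader, so its failure cannot affect the unrelated command.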

What's Changed

  • fix(tui): suppress tiny-title worker output by @roboomp in #2207
  • fix(agent): skip Claude Code custom agents in task discovery by @roboomp in #2210
  • fix(providers): cap ollama cloud discovery output by @roboomp in #2225
  • fix(models): split direct model role fallback chains by @roboomp in #2229

Full Changelog: v15.10.9...v15.10.11
