github can1357/oh-my-pi v16.0.6

8 hours ago

@oh-my-pi/pi-agent-core

Added

  • Added transformAssistantMessage hook to AgentOptions and Agent to allow mutating the finalized assistant message before UI emission, context appending, or tool dispatch
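
As a rough illustration of where such a hook could sit, the sketch below applies an optional transform once, before any consumer sees the message. The message shape, the `SketchAgentOptions` type, and the `finalizeAssistantMessage` helper are hypothetical; the real `AgentOptions` signature is not shown in these notes and may differ (for example, it may be async).

```typescript
// Hypothetical shapes for illustration only; not the actual pi-agent-core types.
interface SketchAssistantMessage {
  role: "assistant";
  text: string;
}

interface SketchAgentOptions {
  // Assumed signature: receives the finalized message and returns the message to use.
  transformAssistantMessage?: (message: SketchAssistantMessage) => SketchAssistantMessage;
}

// Applies the hook once, before UI emission, context appending, or tool dispatch would happen.
function finalizeAssistantMessage(
  message: SketchAssistantMessage,
  options: SketchAgentOptions,
): SketchAssistantMessage {
  return options.transformAssistantMessage ? options.transformAssistantMessage(message) : message;
}

// Example hook: strip trailing whitespace from the final text.
const trimmed = finalizeAssistantMessage(
  { role: "assistant", text: "Done.  \n" },
  { transformAssistantMessage: (m) => ({ ...m, text: m.text.trimEnd() }) },
);
```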

@oh-my-pi/pi-ai

Added

  • Added support for ArkType schemas as tool parameters alongside existing Zod schemas
  • Added getOpenRouterHeaders utility to export standard OpenRouter integration headers

Changed

  • Expanded thinking loop detection guard to also cover DeepSeek models (family, provider, or id matches).
  • Extended loop guard to monitor assistant response prose (via text_delta events) in addition to thinking logs, customizable via request options.
  • Modified loop guard error reporting to emit a non-retryable partial content block containing the accumulated streamed text if a loop is detected after response prose has started streaming, preventing unsafe agent session rollbacks.
  • Migrated internal wire-schema validation (auth-broker, Anthropic Messages request, OpenAI Chat/Responses requests, and /v1/usage shapes) from Zod to ArkType
  • Replaced the dedicated xai-responses provider with a unified openai-responses path that handles xAI-specific reasoning effort stripping dynamically
  • Updated OpenAI Responses stream handling to throw a clearer error message when a stream closes without a terminal response event
  • Consolidated shared OpenAI-compatible routing and strict-tool fallback helpers across Chat Completions and Responses providers.
  • Consolidated the OpenAI-family provider stack: merged openai-responses-shared into openai-shared and removed the now-dead openai-responses-shared re-export shim; folded the three duplicated service_tier request blocks and the per-provider wire model-id transform into shared applyOpenAIServiceTier/applyWireModelIdTransform helpers; moved residual provider-name wire-quirk checks (DeepSeek special-token strip, cumulative reasoning deltas, Ollama empty-length context error, OpenAI tool-call-id cap, Fireworks thinking drop, OpenRouter/OpenAI Responses request fields) into resolved compat fields; shared the Responses stream per-block accumulation helpers plus the terminal pending-tool-call finalization (finalizePendingResponsesToolCalls) and toolUse/pause stop-reason promotion (promoteResponsesToolUseStopReason) between processResponsesStream and the Codex stream handler; and removed the redundant getOpenAIResponsesCacheSessionId alias in favor of getOpenAIResponsesPromptCacheKey.
  • Centralized OpenAI-family request-param policy into shared resolveOpenAIOutputTokenParam (output-token field selection, OpenRouter default-cap omission, alwaysSendMaxTokens defaulting, model/provider clamp), applyOpenAIGatewayRouting (OpenRouter provider + Vercel AI Gateway providerOptions), and applyOpenAIExtraBody (extra-body merge + Fireworks thinking drop) helpers used by both Chat Completions and Responses buildParams, and moved the Chat Completions reasoning/thinking dialect dispatch (applyChatCompletionsReasoningParams + disableChatCompletionsReasoningForDialect) plus the OpenAICompletionsParams request type into openai-shared alongside applyResponsesReasoningParams. As a consistency consequence, direct streamOpenAIResponses calls (bypassing streamSimple) now emit max_output_tokens for alwaysSendMaxTokens (Kimi-family) models even without a caller cap — matching Chat Completions and the value streamSimple already supplied.
  • Centralized OpenAI-family reasoning compat resolution behind a shared resolveOpenAICompatPolicy consumed by both Chat Completions and Responses request builders. Shared policy now drives tool-choice reasoning suppression, dialect-specific disable encoding, reasoning-history replay filters, encrypted-reasoning inclusion, Mistral/OpenAI tool-call-id modes, stream healing/DeepSeek token stripping, and xAI/OpenRouter cache-affinity wiring instead of endpoint-local provider/model checks.
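
The loop guard entries above do not describe the detection heuristic, so the following is only a sketch of one plausible approach: accumulate streamed deltas and flag the stream once its tail consists of the same unit repeated several times in a row. The class name, period bounds, and repeat threshold are all assumptions, not the actual pi-ai implementation.

```typescript
// Hypothetical detector; the real guard's heuristics and thresholds are not given in these notes.
class RepetitionGuardSketch {
  private buffer = "";
  private readonly minPeriod: number;
  private readonly maxPeriod: number;
  private readonly repeats: number;

  constructor(minPeriod = 8, maxPeriod = 64, repeats = 4) {
    this.minPeriod = minPeriod; // shortest repeated unit considered, in characters
    this.maxPeriod = maxPeriod; // longest repeated unit considered
    this.repeats = repeats; // consecutive copies that count as a loop
  }

  // Feed one streamed delta; returns true once the tail is `repeats` copies of a single unit.
  push(delta: string): boolean {
    // Keep only as much history as the largest check needs.
    this.buffer = (this.buffer + delta).slice(-this.maxPeriod * this.repeats);
    for (let period = this.minPeriod; period <= this.maxPeriod; period++) {
      const span = period * this.repeats;
      if (this.buffer.length < span) break;
      const tail = this.buffer.slice(-span);
      if (tail === tail.slice(0, period).repeat(this.repeats)) return true;
    }
    return false;
  }
}
```

The same detector instance could be fed thinking text or text_delta payloads; a caller that sees `true` after prose has started streaming would then report the accumulated text rather than roll back.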

Fixed

  • Fixed OpenAI Responses cost accounting to apply standard service-tier pricing multipliers (flex 0.5×, priority 2×) to the calculated cost based on the served (or requested) service tier for provider "openai" models.
  • Fixed OpenAI Chat Completions to consume the dedicated requiresReasoningContentForAllAssistantTurns compatibility flag, preventing unnecessary reasoning replay on non-tool-call turns for OpenRouter DeepSeek and OpenCode models.
  • Fixed the Kimi Code and Synthetic dual-surface shim (streamOpenAIAnthropicShim) to correctly forward caller-supplied toolChoice, serviceTier, and disableReasoning options.
  • Fixed the OpenAI Responses tool-choice compatibility helper to drop tool_choice when supportsToolChoice is false, and downgrade forced choices to "auto" when supportsForcedToolChoice is false.
  • Fixed Azure Responses to avoid emitting tool_choice: "none" when context.tools is empty.
  • Fixed Kimi via OpenRouter forced-tool requests to omit the OpenRouter reasoning object instead of sending reasoning: { enabled: false }, preserving the generic OpenRouter explicit-disable behavior while avoiding Kimi's forced-tool reasoning conflict.
  • Fixed Google Gemini CLI credential parsing schema to gracefully handle empty or unexpected non-string shapes without throwing unhandled exceptions
  • Fixed Google Gemini CLI credential parsing to correctly prioritize projectId over project_id even when empty, and drop non-string values gracefully
  • Fixed OpenRouter Responses requests to omit default max token fields unless an explicit caller cap is provided, preventing upstream filtering issues
  • Fixed Chat Completions reasoning suppression (disableReasoningOnToolChoice / disableReasoningOnForcedToolChoice) to turn thinking off symmetrically across every dialect via a shared disableChatCompletionsReasoningForDialect helper. Previously the conflict path only deleted reasoning_effort/reasoning (and set Z.AI thinking: { type: "disabled" } on the forced branch alone), leaving Qwen enable_thinking, Qwen chat-template chat_template_kwargs.enable_thinking, and OpenRouter nested reasoning enabled — so those hosts could keep thinking on under forced/required tool choice and re-trip the incompatibility the policy guards against. OpenRouter is now set to { reasoning: { enabled: false } } (not deleted, which OpenRouter treats as default-on).
  • Fixed OpenRouter Responses requests to send session_id from sessionId in the request body for sticky provider routing and observability grouping.
  • Fixed OpenRouter Responses request shaping to preserve provider routing, variant suffixes, caller header overrides, and strict-tool fallback behavior while omitting only unsafe default max-token caps.
  • Fixed OpenAI Responses stateful chaining so a non-ZDR stale previous_response_id retry keeps store: true: the full-context retry stays chainable on the next turn and the consecutive stale-failure circuit breaker trips after the configured limit instead of alternating cold turns. Zero Data Retention rejections still disable chaining on the first strike.
  • Fixed Anthropic Messages tool schema normalization to demote root anyOf/allOf and all oneOf constraints into descriptions instead of forwarding provider-rejected keywords in MCP tool input_schema.
  • Fixed Ollama Cloud GLM-5.2 reasoning efforts to map xhigh to native think "max" (#2911 by @serverinspector)
  • Fixed OpenRouter Responses requests tagging the streamed assistant message with a hardcoded openai-responses API instead of the runtime model.api, which silently disabled native-history replay (buildResponsesInput) and cross-model tool-call item-id stripping on subsequent OpenRouter turns. The message now carries model.api (matching the Chat Completions path).
  • Fixed OpenAI-family streaming leaking a pre-retry errorMessage onto a successful turn: the OpenRouter Anthropic compiled-grammar strict-tool fallback set errorMessage before retrying with strict tools disabled and never cleared it on success, and the Chat Completions success path could carry an errorMessage from an internally-retried attempt — both made a successful turn read as errored in agent state and telemetry. The Responses fallback no longer assigns errorMessage, and the Completions success path clears it before emitting the terminal done event.
  • Fixed Codex stream-error .code resolution to use the same nested-first precedence (error.code → error.type → top-level code) as isRetryableCodexFailureEvent and the formatted message. Previously the error factory resolved top-level-first, so a failure event carrying both a top-level and a differing nested error code surfaced a .code that could disagree with its own retryable flag and message text.
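
The service-tier pricing fix in the list above comes down to simple arithmetic. The sketch below uses the multipliers stated there (flex 0.5×, priority 2×) and prefers the served tier over the requested one; the type and function names are hypothetical, and the real cost code may handle additional tiers.

```typescript
type SketchServiceTier = "default" | "flex" | "priority";

// Multipliers as stated in the notes: flex 0.5x, priority 2x; the default tier is billed at 1x.
const TIER_MULTIPLIER: Record<SketchServiceTier, number> = {
  default: 1,
  flex: 0.5,
  priority: 2,
};

// Prefers the tier the provider reports as served, falling back to the requested tier.
function applyServiceTierPricing(
  baseCost: number,
  requested?: SketchServiceTier,
  served?: SketchServiceTier,
): number {
  const tier = served ?? requested ?? "default";
  return baseCost * TIER_MULTIPLIER[tier];
}
```

For example, a turn that costs 10 units at standard pricing would be recorded as 20 when served at the priority tier and as 5 when served at flex, even if priority was requested.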

@oh-my-pi/pi-catalog

Added

  • Added a dedicated openrouter API type and ResolvedOpenRouterCompat configuration to support unified chat-completions and Responses-API compatibility for OpenRouter models

Changed

  • Migrated bundled OpenRouter models in the catalog from openai-completions to the new openrouter API type
  • Consolidated the resolved OpenAI compat shape: extracted a shared ResolvedOpenAISharedCompat core that both ResolvedOpenAICompat and ResolvedOpenAIResponsesCompat extend (each builder still computes its own per-surface value, preserving chat↔Responses divergence), added internal resolved wire-quirk fields (wireModelIdMode, stripDeepseekSpecialTokens, reasoningDeltasMayBeCumulative, emptyLengthFinishIsContextError, usesOpenAIToolCallIdLimit, dropThinkingWhenReasoningEffort, supportsObfuscationOptOut), and replaced buildOpenRouterCompat's cast-and-copy with an exhaustive pickResponsesOnly composition that fails to compile if a new Responses-only field is added without handling. The public OpenAICompat config vocabulary is unchanged.
  • Expanded OpenAICompat/ResolvedOpenAISharedCompat with shared reasoning/history/stream/request flags (reasoningDisableMode, omitReasoningEffort, includeEncryptedReasoning, filterReasoningHistory, requiresReasoningContentForAllAssistantTurns, streamMarkupHealingPattern, promptCacheSessionHeader, etc.) so model/provider/gateway constraints are declared once in catalog compat and then consumed uniformly by Chat Completions and Responses endpoints.
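
One way to get the "fails to compile if a new Responses-only field is added" property described above is to let the return type enumerate the required keys. This is a sketch of that general TypeScript technique with made-up compat shapes; it is not the catalog's actual pickResponsesOnly code, and the field selection is illustrative.

```typescript
// Hypothetical compat shapes; field names are borrowed from the notes for flavor only.
interface SketchSharedCompat {
  supportsToolChoice: boolean;
}

interface SketchResponsesCompat extends SketchSharedCompat {
  includeEncryptedReasoning: boolean;
  supportsObfuscationOptOut: boolean;
}

// Keys that exist only on the Responses surface.
type ResponsesOnlyKey = Exclude<keyof SketchResponsesCompat, keyof SketchSharedCompat>;

// The return type demands every Responses-only key, so adding a field to
// SketchResponsesCompat without listing it here is a compile-time error.
function pickResponsesOnlySketch(
  source: SketchResponsesCompat,
): Pick<SketchResponsesCompat, ResponsesOnlyKey> {
  return {
    includeEncryptedReasoning: source.includeEncryptedReasoning,
    supportsObfuscationOptOut: source.supportsObfuscationOptOut,
  };
}
```

Because the object literal is checked against the `Pick` type, no cast is needed, and excess-property checking also rejects accidentally copying a shared key.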

Fixed

  • Changed the default compatibility builder for openai-completions to set requiresAssistantAfterToolResult to isMistral, enabling the synthetic assistant bridge for built-in Mistral and Devstral models.
  • Fixed local Ollama (provider: "ollama") reasoning turns still failing with HTTP 400 invalid reasoning value: "minimal" when the model was selected from a stale ~/.omp/models.db cache row or a hand-written config: the minimal → low / xhigh → max remap was only stamped during fresh discovery, so cached and custom specs reached the wire unmapped. The remap now lives in the OpenAI chat-completions and Responses compat builders, so every buildModel (including cache loads, custom specs, and the whenThinking variant) backfills it — no omp models refresh required. Custom OpenAI-compatible providers registered under a non-ollama provider id still need their own compat.reasoningEffortMap.
  • Advertised Ollama Cloud GLM-5.2 reasoning efforts as high/xhigh-only and mapped xhigh to native max effort (#2911 by @serverinspector)
  • Fixed OpenRouter pseudo-API model construction so bundled OpenRouter models resolve shared OpenAI compatibility metadata instead of an undefined compat record.
  • Fixed custom/direct xai-oauth Responses model specs (e.g. grok-build) emitting reasoning.effort and hitting xAI's HTTP 400: buildOpenAIResponsesCompat now defaults supportsReasoningEffort to false for xai-oauth Grok models that are off the effort-capable allowlist (grok-3-mini/grok-4.20-multi-agent/grok-4.3), matching the curated discovery path; explicit compat.supportsReasoningEffort still overrides. The allowlist moved to a shared isGrokReasoningEffortCapable identity helper consumed by both the compat builder and provider-model curation so the two cannot drift.
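
The Ollama remap mentioned above (minimal → low, xhigh → max) can be pictured as a small lookup applied when the request is built. The helper below is a hypothetical sketch; only the two mappings come from these notes, and the actual shape of compat.reasoningEffortMap may differ.

```typescript
type SketchEffort = "minimal" | "low" | "medium" | "high" | "xhigh";

// Remap stated in the notes for provider "ollama": minimal -> low, xhigh -> max.
const OLLAMA_EFFORT_MAP: Partial<Record<SketchEffort, string>> = {
  minimal: "low",
  xhigh: "max",
};

// Hypothetical helper: translate a requested effort to its wire value,
// passing unmapped efforts through unchanged.
function toWireEffort(
  effort: SketchEffort,
  effortMap: Partial<Record<SketchEffort, string>> = {},
): string {
  return effortMap[effort] ?? effort;
}
```

Applying the map inside the compat builder, rather than only at discovery time, is what lets cached and hand-written model specs pick it up as well.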

@oh-my-pi/pi-coding-agent

Added

  • Added model.loopGuard.enabled (default true) and model.loopGuard.checkAssistantContent (default true) settings to configure thinking and assistant prose loop detection.
  • Added explicit ArkType schema descriptions to parameters across all agent tools to improve model tool-calling instructions and parameter guidance
  • Added support for OpenRouter fallback in Perplexity web search when direct Perplexity API keys fail or are unavailable
  • Added support for streaming the Perplexity Responses API (/v1/responses) via the PI_PERPLEXITY_RESPONSES=1 environment variable
  • Added omp ttsr top-level CLI command to inspect and test Time-Traveling Stream Rules
  • Added omp ttsr list to enumerate all project/user-loaded TTSR rules with their conditions, scope, and source metadata
  • Added omp ttsr test to run snippets through the real TTSR matching pipeline with inline text, --file <path>, or stdin via --file -
  • Added --json output to omp ttsr test and omp ttsr list for machine-readable reporting
  • Added --rule, --source, --tool, --path, and --verbose options to omp ttsr test to control matching context and inspection details
  • Added omp ttsr subcommand for inspecting and testing Time-Traveling Stream Rules: omp ttsr list shows every TTSR-registered rule the current project/user config would load, and omp ttsr test feeds a snippet (inline, --file, or stdin) through the real TTSR matching pipeline (TtsrManager.checkSnapshot/checkAstSnapshot) and reports which rules would trigger. A positional that resolves to a file defaults to tool/edit context; --source, --tool, and --path override the inferred match context so glob/AST/scope-scoped rules evaluate the same way they do in a live session. --rule tests a single rule markdown file in isolation.
  • Added support for reading embedded PDF images via read <pdf>:<image>.png and listing available image members with read <pdf>:
  • Added a built-in ts-no-inline-cast-access TTSR rule that interrupts inline object-type assertions read immediately as a property ((x as { y: T }).y, including ?. and bracket access), steering toward schema validation, in/typeof narrowing, or a validated named type
  • Added startup.showSplash (default false) to show the full setup splash animation on normal interactive startup while startup.quiet still suppresses startup chrome. (#2880)
  • Added app.retry as an Alt+R keybinding for retrying the last failed or aborted assistant turn (#2790).
  • Added b branch promotion for completed /btw answers, creating a branch that preserves the side-question input and full assistant response including thinking blocks.
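
To make the ts-no-inline-cast-access entry above concrete, here is a sketch of the kind of code it is described as interrupting, next to one of the alternatives the notes mention (in/typeof narrowing). The function names and the `port` field are invented for illustration.

```typescript
// The pattern the rule targets: an inline object-type assertion read immediately as a property.
function readPortUnsafe(config: unknown): number {
  return (config as { port: number }).port;
}

// Alternative using `in`/`typeof` narrowing, so the value is checked at runtime.
function readPort(config: unknown): number | undefined {
  if (typeof config === "object" && config !== null && "port" in config) {
    const port = config.port;
    return typeof port === "number" ? port : undefined;
  }
  return undefined;
}
```

The cast version compiles but trusts the input blindly: given `{ port: "8080" }` it returns a string typed as a number, whereas the narrowed version returns `undefined`.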

Changed

  • Enabled caching by default in the codebase search tool to improve search performance
  • Migrated internal schema validation and @sinclair/typebox polyfills across all agent tools and configurations from Zod to ArkType
  • Changed advisor model calls and overflow-compaction tasks to inherit and propagate primary telemetry spans, usage, and cost tracking
  • Changed PDF read output to replace <!-- image: ... --> placeholders with clickable read <pdf>:<image>.png handles, including line-range and multi-range reads
  • Changed the built-in ts-no-any rule to recommend a schema parse at trust boundaries and in-narrowing (instead of an inline as-cast) when reading fields off unknown

Fixed

  • Fixed legacy plugin validation for extensions that import defineTool, StringEnum, frontmatter helpers, SettingsManager, createCodingTools, or the bare typebox package through the hosted Pi compatibility shims (#2858).
  • Fixed edit seen-line guard mismatch assertion message formatting to report the actual state instead of generic failure notices
  • Fixed hashline edit mode rendering in the TUI when setIgnoreTight triggers synchronous display rebuilds in the constructor before #editMode is assigned, which previously caused the tool envelope target path to display as instead of the parsed hashline filename.
  • Fixed omp ttsr scan to discover files with gitignore-aware native globbing, skip binary/oversized files before text decoding, use a scan-specific matcher, keep default output summary-only, and avoid retaining per-file AST snapshots during scans
  • Fixed Perplexity web search to use shared OpenAI streaming transports while preserving streamed sources, citations, and related questions
  • Fixed StatusLineComponent fire-and-forget async callbacks (#isDefaultBranch, #lookupPr, fs.watch) firing #onBranchChange after dispose(), which reached the global settings proxy after tests called resetSettingsForTest() and threw "Settings not initialized" between test files; dispose() now sets a disposed flag, clears #onBranchChange, and every post-await continuation checks the flag before touching settings or the callback
  • Fixed read <pdf>:<member> errors for unknown PDF images to surface available extracted image names
  • Fixed puppeteer stealth scripts to use cached Reflect methods (Reflect_get, Reflect_apply) and Reflect.apply instead of live Reflect/Function.prototype.apply calls, preventing page tampering from leaking through proxy traps.
  • Fixed Perplexity API-key web search to use shared OpenAI streaming transports while preserving streamed sources and OpenRouter fallback.
  • Fixed subagents reporting success after a provider-error turn by preserving real run failures over earlier successful yield payloads; bare OpenAI-compatible finish_reason: "error" provider failures after partial text are now retried instead of stopping immediately
  • Fixed MCP servers that do not implement resources/templates/list (JSON-RPC -32601) discarding their concrete resources; templates now fall back to an empty list (#2838 by @jms830)
  • Fixed provider setup sign-in URLs to attempt clipboard/OSC 52 copy and expose an Alt+C retry shortcut, so authentication is not blocked when TUI selection is unavailable (#2908).
  • Fixed ACP approval-mode documentation to describe config inheritance, omp acp --yolo/--auto-approve runtime overrides, client permission precedence, and headless prompt behavior (#2900).
  • Fixed /guided-goal to fall back to the current session model when neither the plan nor slow role resolves, instead of aborting during setup (#2855).
  • Fixed auto context-full maintenance to stop retrying the same summarization timeout before falling back to the next compaction model (#2913).
  • Fixed /plan <prompt> and /goal <objective> to preserve the typed slash-command line in TUI input history when entering those modes from off (#2887).
  • Fixed /model in the TUI to open the active-session model switcher instead of the role-assignment picker (#2846).
  • Fixed Perplexity web search collapsing every upstream failure to a generic 401 No authentication method available once all auth methods failed: the fallback loop now rethrows the last classified provider error (402/credits-exhausted, 429, 5xx), so quota and rate-limit failures are no longer mis-reported as authorization errors. The generic 401 is now only a defensive fallback for the no-method-ran case.
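
The fallback behavior in the last entry can be sketched as a loop that remembers the most recent failure and rethrows it, keeping the generic 401 only for the case where no method ran. Everything here is a hypothetical simplification: the error class and function names are invented, and the methods are synchronous for brevity where the real ones would presumably be async.

```typescript
// Hypothetical classified error carrying an HTTP status.
class SketchProviderError extends Error {
  readonly httpStatus: number;

  constructor(httpStatus: number, message: string) {
    super(message);
    this.httpStatus = httpStatus;
  }
}

// Tries each auth method in order and returns the first successful result.
function searchWithFallback<T>(methods: Array<() => T>): T {
  let lastError: unknown;
  let attempted = false;
  for (const method of methods) {
    attempted = true;
    try {
      return method();
    } catch (error) {
      lastError = error; // remember the most recent classified failure
    }
  }
  // Rethrow the last real failure (e.g. 402, 429, 5xx) instead of masking it.
  if (attempted) throw lastError;
  // Defensive fallback: only reached when no method ran at all.
  throw new SketchProviderError(401, "No authentication method available");
}
```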

Security

  • Secured PDF image reads by validating requested image members against the extracted member list before opening files and refusing traversal-style names
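
A minimal sketch of the validation described above, assuming the extracted member names are available as a plain list: the requested name is accepted only if it is not traversal-shaped and appears in that list. The function name and error wording are hypothetical, and the real check may be stricter or differ in detail.

```typescript
// Hypothetical validator: accepts a member only if it is a plain name present in the extracted list.
function resolveImageMember(requested: string, extracted: readonly string[]): string {
  const looksLikeTraversal =
    requested.includes("/") || requested.includes("\\") || requested.includes("..");
  if (looksLikeTraversal || !extracted.includes(requested)) {
    // Listing the known members mirrors the "surface available image names" fix in these notes.
    throw new Error(`Unknown PDF image "${requested}". Available: ${extracted.join(", ")}`);
  }
  return requested;
}
```

Validating against the extracted list before touching the filesystem means a name such as `../secret.png` is rejected without ever being joined into a path.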

@oh-my-pi/pi-mnemopi

Changed

  • Updated OpenRouter request headers to use standard shared headers from the pi-ai package

Fixed

  • Forced the on-demand fastembed runtime install to override fastembed's archived onnxruntime-node@1.21.0 transitive pin with Mnemopi's onnxruntime-node@1.26.0 pin, fixing local embedding startup on macOS ARM64. (#2920)

@oh-my-pi/pi-natives

Removed

  • Removed the cache option from GrepOptions

Full Changelog: v16.0.5...v16.0.6
