github can1357/oh-my-pi v16.0.5

4 hours ago

@oh-my-pi/pi-agent-core

Breaking Changes

  • Changed AgentOptions.getApiKey and AgentLoopConfig.getApiKey to receive the active Model and return an API key or ApiKeyResolver, so credential routing stays model-scoped and retry context is no longer exposed through the agent-core API

Added

  • Added agent-loop deadline support for graceful wall-clock session stops.

Changed

  • Changed Gemini repetition-loop detection to live in the pi-ai stream layer instead of the agent loop. The agent no longer runs its own Gemini-gated verbatim repetition check (detectRepetition/truncateRepetition); loops now surface as a retryable transient stream error that the standard auto-retry path discards and re-samples, rather than a committed contentful error message.

Fixed

  • Fixed PI_DIALECT=minimax being ignored by the owned tool-calling env selector. (#2759)

@oh-my-pi/pi-ai

Added

  • Added antigravityEndpointMode stream option with auto, production, and sandbox values to control Antigravity endpoint routing
  • Added seedApiKeyResolver for reusing a pre-resolved request key while preserving resolver-driven auth retry and credential rotation
  • Added optional contextSnapshot property to AssistantMessage with token usage metadata via new ContextSnapshot interface (promptTokens, nonMessageTokens, and optional lastMessageTimestamp)
  • Added LITELLM_BASE_URL guidance to the LiteLLM login prompt so non-default proxy endpoints are discoverable. (#2726)
  • Added a Gemini thinking-loop guard that watches streamed thinking deltas for degenerate reasoning loops — verbatim tail repetition and near-duplicate paragraph cycling — and terminates the stream with a retryable, empty-content error message (worded as a transient stream stall) so the turn is discarded and re-sampled instead of committing a runaway transcript. Gated to Gemini models across every transport (OpenRouter, direct Google, Vertex) and disarmed once visible answer text or a tool call starts; disable with PI_NO_THINKING_LOOP_GUARD=1.
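
As a rough illustration of the "verbatim tail repetition" half of that guard, the sketch below checks whether accumulated thinking text ends in the same chunk repeated several times in a row. The function name, both thresholds, and the algorithm itself are assumptions made for illustration; they are not taken from the pi-ai source, and near-duplicate paragraph cycling is not covered.

```typescript
// Hypothetical helper: the name, thresholds, and algorithm are illustrative
// assumptions, not the pi-ai implementation.
function hasVerbatimTailRepetition(
  text: string,
  minUnitLength = 20,
  minRepeats = 3,
): boolean {
  // A unit longer than this cannot fit minRepeats times into the text.
  const maxUnitLength = Math.floor(text.length / minRepeats);
  for (let unitLength = minUnitLength; unitLength <= maxUnitLength; unitLength++) {
    const unit = text.slice(text.length - unitLength);
    let repeats = 1;
    let end = text.length - unitLength;
    // Walk backwards while the chunk just before `end` equals the tail unit.
    while (end >= unitLength && text.slice(end - unitLength, end) === unit) {
      repeats++;
      end -= unitLength;
    }
    if (repeats >= minRepeats) return true;
  }
  return false;
}
```

A stream consumer could append each thinking delta to a buffer and run a check like this on the buffer, ending the stream with a retryable error once it returns true.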

Changed

  • Changed the Antigravity (google-antigravity) request builder to mirror the captured antigravity/hub client: gemini-3.x requests now send thinkingConfig.thinkingBudget per tier, a fixed per-model maxOutputTokens, a default functionCallingConfig.mode: "VALIDATED" tool mode (auto/unset tool choice only), a role: "user" system instruction, a structured requestId (agent/<id>/<ts>/<trajectoryId>/<step>), and labels (model_enum, trajectory_id, last_step_index, last_execution_id, used_claude*) tracked across the conversation via provider session state.

Fixed

  • Fixed Gemini usage-tier mapping so gemini-3.5-flash is treated as Flash and gemini-3.1-pro plus gemini-pro-agent are treated as Pro in usage accounting
  • Fixed Antigravity stream state handling so a request’s last_execution_id is committed only after a successful completion and cleared between retry attempts
  • Fixed streamSimple() Gemini streams to run through the thinking-loop guard for custom API and pi-native transports, so degenerate thinking loops now abort with the same retryable empty-content error path as other Gemini stream paths
  • Fixed Antigravity model streaming and usage fetch paths to retry on transient 429/5xx errors by failing over to the alternate endpoint before surfacing an error
  • Fixed Antigravity endpoint tracking to prefer a previously successful endpoint in auto mode for subsequent requests
  • Fixed Antigravity and Gemini CLI model requests failing with an opaque error when Google requires account verification. Cloud Code Assist 403 VALIDATION_REQUIRED responses now surface the validation_url and the signed-in account email when available, so users see an actionable account-verification message instead of the raw API error body.
  • Fixed MiniMax M3 in-band tool calls by adding a MiniMax dialect that parses <minimax:tool_call> wrappers instead of falling back to generic XML. (#2759)
  • Fixed GitHub Copilot OAuth for Business seats by storing the login-discovered API endpoint and routing model enablement plus chat requests to that endpoint. (#2876)
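
The Antigravity failover and endpoint-preference fixes above can be pictured with a small sketch. Everything here is hypothetical: the helper names, the choice of "production" as the first endpoint when there is no history, and the rule that a forced mode disables failover are assumptions made for illustration, not the pi-ai implementation.

```typescript
type EndpointMode = "auto" | "production" | "sandbox";
type Endpoint = "production" | "sandbox";

// Hypothetical: decide which endpoints to try, and in what order.
function endpointOrder(
  mode: EndpointMode,
  lastSuccessful?: Endpoint,
  initialPreference: Endpoint = "production", // assumption: arbitrary default
): Endpoint[] {
  // Assumption: a forced mode pins the request to one endpoint, no failover.
  if (mode !== "auto") return [mode];
  // In auto mode, prefer the endpoint that last succeeded.
  const preferred = lastSuccessful ?? initialPreference;
  const alternate: Endpoint = preferred === "production" ? "sandbox" : "production";
  return [preferred, alternate];
}

// 429 and 5xx responses are treated as transient and worth a failover.
function isTransientStatus(status: number): boolean {
  return status === 429 || (status >= 500 && status <= 599);
}

// Return the first endpoint whose response is not a transient failure,
// or undefined when every endpoint failed transiently.
function firstUsableEndpoint(
  order: Endpoint[],
  statusFor: (endpoint: Endpoint) => number,
): Endpoint | undefined {
  return order.find((endpoint) => !isTransientStatus(statusFor(endpoint)));
}
```

In this sketch a non-transient error such as 403 stops the search at that endpoint, so the caller surfaces the error rather than retrying elsewhere.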

@oh-my-pi/pi-catalog

Added

  • Added enableGeminiThinkingLoopGuard to OpenAI compatibility options to allow explicit opt-in or opt-out of the Gemini thinking-loop guard for OpenAI-compatible model aliases
  • Added LITELLM_BASE_URL as the LiteLLM provider discovery base URL fallback, with discovery caches scoped by the resolved proxy URL and explicit provider baseUrl config kept at higher precedence. (#2726)
  • Added ThinkingConfig.effortBudgets (per-effort thinking-budget contract baked into collapsed variants) and ANTIGRAVITY_MODEL_WIRE_PROFILES (maxOutputTokens + model_enum per Antigravity wire id) to mirror the captured Antigravity Cloud Code Assist client request shape.

Changed

  • Defaulted enableGeminiThinkingLoopGuard from Gemini family detection for both OpenAI completions and responses compatibility specs so Gemini models now enable the thinking-loop guard automatically
  • Updated the default Gemini CLI user-agent version fallback to 0.46.0.
  • Changed the Antigravity (google-antigravity, daily-cloudcode-pa) gemini-3.x collapse families to the budget thinking transport with the client's per-tier thinkingBudget (3.5 Flash low/medium/high = 1000/4000/10000, 3.1 Pro low/high = 1001/10001) and corrected 3.5 Flash effort→wire routing (medium → gemini-3.5-flash-low, high → gemini-3-flash-agent). Split the shared CCA collapse table so google-gemini-cli (cloudcode-pa) keeps the google-level thinkingLevel transport for official Gemini CLI parity. Stale collapsed snapshots (bundled catalog, recycled gemini-3-flash alias) self-heal from the hand table at collapse time, and the model cache schema is bumped to v7 to invalidate pre-budget Antigravity rows.
  • Changed the Antigravity user-agent to the antigravity/hub/<version> format (default 2.1.4) to match the captured client.

Fixed

  • Fixed off effort routing for claude-opus-4-5 and claude-opus-4-6 to use their base model IDs when thinking is disabled
  • Fixed gemini-2.5-flash effort routing so all non-off effort levels resolve to gemini-2.5-flash-thinking
  • Fixed shared variant alias provider resolution so resolveBareVariantAlias reports all matching providers when model aliases are present in both CCA collapse tables
  • Routed google-antigravity default baseUrl to the stable primary daily endpoint in the catalog generator and all fallback snapshots, resolving connection drops on heavy queries.
  • Fixed MiniMax M3 dialect selection so MiniMax-family OpenAI-compatible models use the MiniMax tool-call dialect instead of generic XML. (#2759)
  • Fixed GitHub Copilot dynamic discovery to honor plan-specific API endpoints stored in structured OAuth credentials. (#2876)

@oh-my-pi/pi-coding-agent

Added

  • Added tui.tight setting (default false) to enable tight layout by removing the 1-character horizontal padding from terminal output.
  • Added a providers.antigravityEndpoint setting (auto, production, sandbox) to control google-antigravity routing for chat, search, image, and discovery calls
  • Added automatic endpoint-mode support for google-antigravity provider calls so users can force production-only or sandbox-only usage
  • Added images.describeForTextModels option (default true) to control automatic image description for attachments sent to models without vision input
  • Added automatic vision fallback prompts to describe images for text-only models
  • Added advisor.immuneTurns setting (default 1) to limit how often advisor concern/blocker notes can interrupt the primary agent.
  • Added a main-session session_stop extension event with continuation feedback and an 8-continuation loop cap (#2834).
  • Added --max-time <seconds> so CLI sessions can stop after a wall-clock deadline.

Changed

  • Changed google-antigravity usage report lookups to honor the selected antigravity endpoint mode when resolving the reporting base URL
  • Changed context usage reporting to always return numeric token counts and percentages, so status-line and footer now show estimated values instead of ? immediately after compaction
  • Changed context usage reporting to use anchored snapshots and pending-prompts estimates, which now keeps /context, status line, and model selector token counts in sync

Fixed

  • Fixed Matplotlib figure display to emit PNG output immediately when display(fig) is called, even if the figure is closed before the end-of-cell flush
  • Fixed persisted tool-result image payloads in details.images to externalize and resolve through the session blob store, so generated-image details survive resume without stale blob refs or truncation
  • Fixed duplicate Matplotlib image output by skipping the automatic end-of-request figure flush for figures that were already displayed through display(fig)
  • Fixed google-antigravity image generation and web search requests to fail over to the alternate antigravity endpoint on 429/server/network failures instead of stopping at the first endpoint
  • Fixed context usage breakdown to use a completed assistant usage anchor from the current turn instead of a pending prompt snapshot so totals no longer overcount when a large in-turn tool step returns usage
  • Fixed side-channel turns and advisor requests to keep using credential resolvers during retries, so Google Resource exhausted 429s can rotate to the next account instead of surfacing a terminal error banner
  • Fixed context token accounting to keep branch-local anchors during branching so sibling-branch messages no longer pollute context estimates
  • Fixed context usage consistency so /context, status line, and idle compaction logic now report the same used-token totals
  • Fixed status-line context cache invalidation when assistant reasoning signature data grows so displayed context usage updates accurately
  • Fixed the status-line context% reading being inflated during long tool turns and then dropping sharply on the next message even though no compaction ran. While a request was in flight, getContextBreakdown summed a cl100k estimate of the entire tail on top of the stale turn-start prompt and never re-anchored to completed in-turn steps; it now prefers the real provider prompt-token count of any step that resolves at or after the pending cutoff. The status-line memo also keys on a contextUsageRevision that bumps when the in-flight snapshot is set or cleared, so a mid-turn estimate is invalidated on turn end or abort instead of surviving into idle until the next message.
  • Fixed image attachment handling for text-only models by saving attachments to local:// and injecting generated descriptions so they are no longer lost when the target model cannot process images
  • Fixed the ssh tool rejecting valid Windows identity files before invoking OpenSSH by skipping Unix mode-bit key validation on native Windows (#2850).
  • Fixed web_search/omp q aborting before any provider ran when the global Settings singleton was not initialized; executeSearch now reads providers.antigravityEndpoint once and tolerates an uninitialized settings store instead of throwing
  • Fixed the new git.enabled and images.describeForTextModels settings declaring section groups (Git, Vision) that were not registered in TAB_GROUPS, so they now render in their intended settings-panel sections
  • Fixed the tools.format setting schema so minimax can be selected as an owned tool-calling dialect, and taught auto mode to route tool-less MiniMax-family models to the MiniMax owned dialect. (#2759)
  • Fixed WSL2 TUI stutter by adding a git.enabled setting and skipping footer/status-line git probes when disabled or when no git-backed status segment is visible (#2847).
  • Fixed JSON-mode startup notices (export/resume/session-picker messages) writing to stdout before the JSON event stream; they now route to stderr so stdout remains newline-delimited JSON.

@oh-my-pi/collab-web

Fixed

  • Preserved assistant soft line breaks and Markdown paragraph/list indentation in the collab web transcript renderer so tree-shaped prose no longer collapses into one paragraph.
  • Changed collab web transcript wrapping to keep Korean/CJK words intact before falling back to emergency breaks for long URLs or identifiers.

@oh-my-pi/pi-mnemopi

Fixed

  • Capped sleep_consolidation episodic rows at maxEpisodeChars (default 100KB, MNEMOPI_MAX_EPISODE_CHARS) so raw session transcripts cannot be stored and extracted as multi-megabyte episodes. (#2869)
  • Skipped regex-only entity and pattern fact extraction for oversized raw transcripts so progress/log noise cannot flood MEMORIA with junk facts. (#2868)
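
A minimal sketch of the episode cap described above. It assumes that "capped" means truncating the text and that "100KB" means 102,400 characters; both are guesses, and the helper names are hypothetical. Only maxEpisodeChars and MNEMOPI_MAX_EPISODE_CHARS come from the release notes.

```typescript
// Assumption: "100KB" is read here as 100 * 1024 characters.
const DEFAULT_MAX_EPISODE_CHARS = 100 * 1024;

// Hypothetical helper: resolve the cap from an environment-like map,
// falling back to the default for missing or invalid values.
function resolveMaxEpisodeChars(env: Record<string, string | undefined>): number {
  const parsed = Number.parseInt(env.MNEMOPI_MAX_EPISODE_CHARS ?? "", 10);
  return Number.isFinite(parsed) && parsed > 0 ? parsed : DEFAULT_MAX_EPISODE_CHARS;
}

// Hypothetical helper: truncate an episode to the cap (assumed behavior).
function capEpisode(text: string, maxChars: number): string {
  return text.length > maxChars ? text.slice(0, maxChars) : text;
}
```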

@oh-my-pi/omp-stats

Added

  • New Projects view summarizing usage, cost, and reliability per project folder (backed by the existing /api/stats/folders endpoint).
  • System-aware light/dark theme toggle — follows the OS by default, and an explicit choice persists across reloads.

Changed

  • Redesigned the local stats dashboard with an OMP-themed product shell, dedicated per-section views, accessible loading/empty/error states, and flicker-free navigation between screens and time ranges.

Fixed

  • The 1h time-range chart rendered an empty/single-point line; it now buckets at 5-minute granularity for a real trend.
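
To show why 5-minute buckets give the 1h range a real trend line, here is a small bucketing sketch: a one-hour window split into 5-minute buckets yields 12 points instead of one. The function name and its inputs are hypothetical and not taken from omp-stats.

```typescript
const FIVE_MINUTES_MS = 5 * 60 * 1000;

// Hypothetical helper: count events per bucket over [rangeStartMs, rangeEndMs).
function bucketCounts(
  timestampsMs: number[],
  rangeStartMs: number,
  rangeEndMs: number,
  bucketMs: number = FIVE_MINUTES_MS,
): number[] {
  const bucketCount = Math.ceil((rangeEndMs - rangeStartMs) / bucketMs);
  const counts = new Array<number>(bucketCount).fill(0);
  for (const t of timestampsMs) {
    // Ignore events outside the half-open range.
    if (t < rangeStartMs || t >= rangeEndMs) continue;
    const index = Math.floor((t - rangeStartMs) / bucketMs);
    counts[index] = (counts[index] ?? 0) + 1;
  }
  return counts;
}
```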

@oh-my-pi/pi-tui

Added

  • Added tight layout support (setTuiTight/getPaddingX) to dynamically remove 1-character horizontal padding from Text, Markdown, Box, and TruncatedText components.

Changed

  • Coalesced byte-adjacent SGR sequences in emitted lines into a single CSI … m. The component tree styles each span as <set>text<reset>, so adjacent spans emit runs of back-to-back SGR sequences (e.g. a CSI 39 m fg-reset immediately followed by the next span's CSI 38;2;r;g;b m); merging the run is behavior-preserving because SGR parameters apply left-to-right regardless of framing. On a real transcript this drops ~30-40% of all SGR sequences, cutting the per-frame byte volume and SGR-dispatch count a slow terminal engine (e.g. xterm.js/WebGL under a large viewport) must process. Each emitted sequence is capped at 16 parameter tokens so a long adjacent run is split across several valid CSIs instead of overflowing a terminal's parameter buffer (xterm.js caps at 32 and silently truncates, corrupting colors). A run is never extended past a parameter list that ends in an incomplete semicolon-form extended color (38/48/58;2 missing a channel or ;5 missing the index), so a following code can't be absorbed as the missing component. Disable with PI_NO_SGR_COALESCE=1.
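
The coalescing rules above can be sketched as follows. This is a simplified illustration and not the pi-tui code. The function names are hypothetical. It only handles semicolon-form parameters. It never splits an original sequence, so the 16-token cap is enforced by refusing to append rather than by re-chunking. Treating a bare trailing 38/48/58 as incomplete is an extra conservative assumption.

```typescript
const MAX_PARAMS = 16;
// Two or more byte-adjacent SGR sequences (ESC [ params m), semicolon form only.
const SGR_RUN = /(?:\x1b\[[0-9;]*m){2,}/g;

// True when the list ends inside an extended color such as 38;2;r;g (missing b)
// or 38;5 (missing the index), so a following code must not be appended.
function endsWithIncompleteExtendedColor(tokens: string[]): boolean {
  let i = 0;
  while (i < tokens.length) {
    const token = tokens[i];
    if (token === "38" || token === "48" || token === "58") {
      const form = tokens[i + 1];
      // Tokens the color needs in total: 38;2;r;g;b = 5, 38;5;n = 3.
      const needed = form === "2" ? 5 : form === "5" ? 3 : form === undefined ? 2 : 1;
      if (i + needed > tokens.length) return true;
      i += needed;
    } else {
      i += 1;
    }
  }
  return false;
}

function coalesceSgr(line: string): string {
  return line.replace(SGR_RUN, (run) => {
    // Strip the leading "ESC[" and trailing "m", then split between sequences.
    const paramLists = run.slice(2, -1).split("m\x1b[");
    const groups: string[][] = [];
    for (const params of paramLists) {
      // An empty parameter list ("ESC[m") means reset, so make the 0 explicit.
      const tokens = params === "" ? ["0"] : params.split(";");
      const current = groups[groups.length - 1];
      if (
        current !== undefined &&
        current.length + tokens.length <= MAX_PARAMS &&
        !endsWithIncompleteExtendedColor(current)
      ) {
        current.push(...tokens);
      } else {
        groups.push(tokens);
      }
    }
    return groups.map((tokens) => `\x1b[${tokens.join(";")}m`).join("");
  });
}
```

For example, a foreground reset followed directly by a truecolor set collapses into one sequence, while a lone sequence between text spans is left untouched.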

Fixed

  • Fixed image cache invalidation when terminal image protocol, Kitty placeholder mode, or cell dimensions change, preventing stale rendered output
  • Fixed direct inline-image placements leaving the cursor inside the reserved image block, which let following chat rows overwrite the middle of rendered screenshots (#2863).
  • Fixed inline-image replay after startup or resume fallback paints by invalidating cached image rows when the terminal image protocol, Kitty placeholder mode, or cell dimensions change.

Full Changelog: v16.0.4...v16.0.5
