@oh-my-pi/pi-agent-core
Breaking Changes
- Changed `AgentOptions.getApiKey` and `AgentLoopConfig.getApiKey` to receive the active `Model` and return an API key or `ApiKeyResolver`, so credential routing stays model-scoped and retry context is no longer exposed through the agent-core API
Added
- Added agent-loop deadline support for graceful wall-clock session stops.
Changed
- Changed Gemini repetition-loop detection to live in the pi-ai stream layer instead of the agent loop. The agent no longer runs its own Gemini-gated verbatim repetition check (`detectRepetition`/`truncateRepetition`); loops now surface as a retryable transient stream error that the standard auto-retry path discards and re-samples, rather than a committed contentful error message.
Fixed
- Fixed `PI_DIALECT=minimax` being ignored by the owned tool-calling env selector. (#2759)
@oh-my-pi/pi-ai
Added
- Added `antigravityEndpointMode` stream option with `auto`, `production`, and `sandbox` values to control Antigravity endpoint routing
- Added `seedApiKeyResolver` for reusing a pre-resolved request key while preserving resolver-driven auth retry and credential rotation
- Added optional `contextSnapshot` property to `AssistantMessage` with token usage metadata via new `ContextSnapshot` interface (`promptTokens`, `nonMessageTokens`, and optional `lastMessageTimestamp`)
- Added `LITELLM_BASE_URL` guidance to the LiteLLM login prompt so non-default proxy endpoints are discoverable. (#2726)
- Added a Gemini thinking-loop guard that watches streamed `thinking` deltas for degenerate reasoning loops (verbatim tail repetition and near-duplicate paragraph cycling) and terminates the stream with a retryable, empty-content `error` message (worded as a transient stream stall) so the turn is discarded and re-sampled instead of committing a runaway transcript. Gated to Gemini models across every transport (OpenRouter, direct Google, Vertex) and disarmed once visible answer text or a tool call starts; disable with `PI_NO_THINKING_LOOP_GUARD=1`.
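The verbatim tail repetition check mentioned for the thinking-loop guard can be pictured with a small sketch. This is an illustration only, not the pi-ai implementation: the function names `hasRepeatedTail` and `feedThinkingDeltas`, the unit-length bounds, and the repeat threshold are all assumptions, and near-duplicate paragraph cycling is not covered.

```typescript
// Hypothetical sketch: report whether the end of a streamed thinking buffer is
// the same chunk repeated back-to-back. minUnit, maxUnit, and minRepeats are
// illustrative assumptions, not values taken from the release notes.
function hasRepeatedTail(
  text: string,
  minUnit = 20,
  maxUnit = 400,
  minRepeats = 3,
): boolean {
  const limit = Math.min(maxUnit, Math.floor(text.length / minRepeats));
  for (let unit = minUnit; unit <= limit; unit++) {
    const tail = text.slice(text.length - unit);
    let repeats = 1;
    let end = text.length - unit;
    // Walk backwards while the preceding chunk is identical to the tail.
    while (end - unit >= 0 && text.slice(end - unit, end) === tail) {
      repeats++;
      end -= unit;
      if (repeats >= minRepeats) return true;
    }
  }
  return false;
}

// Accumulate deltas and report whether the stream should be aborted.
function feedThinkingDeltas(deltas: string[]): boolean {
  let buffer = "";
  for (const delta of deltas) {
    buffer += delta;
    if (hasRepeatedTail(buffer)) return true;
  }
  return false;
}
```

Checking after every delta keeps the sketch simple; a production guard would more likely bound the buffer and check less often.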
Changed
- Changed the Antigravity (`google-antigravity`) request builder to mirror the captured `antigravity/hub` client: gemini-3.x models send `thinkingConfig.thinkingBudget` per tier, a fixed per-model `maxOutputTokens`, a default `functionCallingConfig.mode: "VALIDATED"` tool mode (auto/unset tool choice only), a `role: "user"` system instruction, a structured `requestId` (`agent/<id>/<ts>/<trajectoryId>/<step>`), and `labels` (`model_enum`, `trajectory_id`, `last_step_index`, `last_execution_id`, `used_claude*`) tracked across the conversation via provider session state.
Fixed
- Fixed Gemini usage-tier mapping so `gemini-3.5-flash` is treated as `Flash` and `gemini-3.1-pro` plus `gemini-pro-agent` are treated as `Pro` in usage accounting
- Fixed Antigravity stream state handling so a request's `last_execution_id` is committed only after a successful completion and cleared between retry attempts
- Fixed `streamSimple()` Gemini streams to run through the thinking-loop guard for custom API and pi-native transports, so degenerate `thinking` loops now abort with the same retryable empty-content error path as other Gemini stream paths
- Fixed Antigravity model streaming and usage fetch paths to retry on transient `429`/`5xx` errors by failing over to the alternate endpoint before surfacing an error
- Fixed Antigravity endpoint tracking to prefer a previously successful endpoint in `auto` mode for subsequent requests
- Fixed Antigravity and Gemini CLI model requests failing with an opaque error when Google requires account verification. Cloud Code Assist `403 VALIDATION_REQUIRED` responses now surface the `validation_url` and the signed-in account email when available, so users see an actionable account-verification message instead of the raw API error body.
- Fixed MiniMax M3 in-band tool calls by adding a MiniMax dialect that parses `<minimax:tool_call>` wrappers instead of falling back to generic XML. (#2759)
- Fixed GitHub Copilot OAuth for Business seats by storing the login-discovered API endpoint and routing model enablement plus chat requests to that endpoint. (#2876)
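The Antigravity failover and endpoint-preference fixes above can be summarized in a short sketch. It is a simplified model under stated assumptions, not the pi-ai code: the `EndpointRouter` class, the `AttemptResult` shape, the default `production`-then-`sandbox` order, and the synchronous `send` callback (standing in for a real network call) are all hypothetical.

```typescript
type EndpointName = "production" | "sandbox";
type EndpointMode = "auto" | EndpointName;

interface AttemptResult {
  status: number;
  body: string;
}

// Statuses treated as transient in this sketch: 429 and any 5xx.
function isTransient(status: number): boolean {
  return status === 429 || (status >= 500 && status <= 599);
}

class EndpointRouter {
  private mode: EndpointMode;
  private lastGood: EndpointName | undefined = undefined;

  constructor(mode: EndpointMode) {
    this.mode = mode;
  }

  // In auto mode, try the previously successful endpoint first.
  // The default order below is an assumption.
  candidates(): EndpointName[] {
    if (this.mode !== "auto") return [this.mode];
    const all: EndpointName[] = ["production", "sandbox"];
    const preferred = this.lastGood;
    if (preferred === undefined) return all;
    return [preferred, ...all.filter((name) => name !== preferred)];
  }

  // Try each candidate in order, failing over only on transient statuses.
  request(send: (endpoint: EndpointName) => AttemptResult): AttemptResult {
    let result: AttemptResult = { status: 0, body: "" };
    for (const endpoint of this.candidates()) {
      result = send(endpoint);
      if (!isTransient(result.status)) {
        if (result.status >= 200 && result.status < 300) this.lastGood = endpoint;
        return result;
      }
    }
    // Every candidate failed transiently; surface the final failure.
    return result;
  }
}
```

A router created with a fixed mode has a single candidate, so it never fails over; that matches the idea of forcing production-only or sandbox-only usage.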
@oh-my-pi/pi-catalog
Added
- Added `enableGeminiThinkingLoopGuard` to OpenAI compatibility options to allow explicit opt-in or opt-out of the Gemini thinking-loop guard for OpenAI-compatible model aliases
- Added `LITELLM_BASE_URL` as the LiteLLM provider discovery base URL fallback, with discovery caches scoped by the resolved proxy URL and explicit provider `baseUrl` config kept at higher precedence. (#2726)
- Added `ThinkingConfig.effortBudgets` (per-effort thinking-budget contract baked into collapsed variants) and `ANTIGRAVITY_MODEL_WIRE_PROFILES` (`maxOutputTokens` + `model_enum` per Antigravity wire id) to mirror the captured Antigravity Cloud Code Assist client request shape.
Changed
- Defaulted `enableGeminiThinkingLoopGuard` from Gemini family detection for both OpenAI completions and responses compatibility specs so Gemini models now enable the thinking-loop guard automatically
- Updated the default Gemini CLI user-agent version fallback to 0.46.0.
- Changed the Antigravity (`google-antigravity`, daily-cloudcode-pa) gemini-3.x collapse families to the `budget` thinking transport with the client's per-tier `thinkingBudget` (3.5 Flash low/medium/high = 1000/4000/10000, 3.1 Pro low/high = 1001/10001) and corrected 3.5 Flash effort→wire routing (medium → `gemini-3.5-flash-low`, high → `gemini-3-flash-agent`). Split the shared CCA collapse table so `google-gemini-cli` (cloudcode-pa) keeps the `google-level` `thinkingLevel` transport for official Gemini CLI parity. Stale collapsed snapshots (bundled catalog, recycled `gemini-3-flash` alias) self-heal from the hand table at collapse time, and the model cache schema is bumped to v7 to invalidate pre-budget Antigravity rows.
- Changed the Antigravity user-agent to the `antigravity/hub/<version>` format (default `2.1.4`) to match the captured client.
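The per-tier thinking budgets listed above can be read as a lookup table. The sketch below copies the numbers from this changelog, but the table shape, the family keys, and the `thinkingBudgetFor` helper are assumptions made for illustration and may not match the actual `ThinkingConfig.effortBudgets` structure.

```typescript
type Effort = "low" | "medium" | "high";

// Budgets copied from the changelog entry above.
// The family keys and the table shape are hypothetical.
const EFFORT_BUDGETS: Record<string, Partial<Record<Effort, number>>> = {
  "gemini-3.5-flash": { low: 1000, medium: 4000, high: 10000 },
  "gemini-3.1-pro": { low: 1001, high: 10001 },
};

// Returns the budget for a family/effort pair, or undefined when the
// family is unknown or the tier is not listed for that family.
function thinkingBudgetFor(family: string, effort: Effort): number | undefined {
  return EFFORT_BUDGETS[family]?.[effort];
}
```

Returning `undefined` for an unlisted tier (3.1 Pro has no medium budget in the entry above) leaves the fallback decision to the caller instead of guessing one here.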
Fixed
- Fixed `off` effort routing for `claude-opus-4-5` and `claude-opus-4-6` to use their base model IDs when thinking is disabled
- Fixed `gemini-2.5-flash` effort routing so all non-off effort levels resolve to `gemini-2.5-flash-thinking`
- Fixed shared variant alias provider resolution so `resolveBareVariantAlias` reports all matching providers when model aliases are present in both CCA collapse tables
- Routed google-antigravity default baseUrl to the stable primary daily endpoint in the catalog generator and all fallback snapshots, resolving connection drops on heavy queries.
- Fixed MiniMax M3 dialect selection so MiniMax-family OpenAI-compatible models use the MiniMax tool-call dialect instead of generic XML. (#2759)
- Fixed GitHub Copilot dynamic discovery to honor plan-specific API endpoints stored in structured OAuth credentials. (#2876)
@oh-my-pi/pi-coding-agent
Added
- Added `tui.tight` setting (default `false`) to enable tight layout by removing the 1-character horizontal padding from terminal output.
- Added a `providers.antigravityEndpoint` setting (`auto`, `production`, `sandbox`) to control google-antigravity routing for chat, search, image, and discovery calls
- Added automatic endpoint-mode support for google-antigravity provider calls so users can force production-only or sandbox-only usage
- Added `images.describeForTextModels` option (default `true`) to control automatic image description for attachments sent to models without vision input
- Added automatic vision fallback prompts to describe images for text-only models
- Added `advisor.immuneTurns` setting (default `1`) to limit how often advisor `concern`/`blocker` notes can interrupt the primary agent.
- Added a main-session `session_stop` extension event with continuation feedback and an 8-continuation loop cap (#2834).
- Added `--max-time <seconds>` so CLI sessions can stop after a wall-clock deadline.
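A wall-clock deadline such as the one behind `--max-time <seconds>` can be sketched as a check between units of work. This is only an illustration under assumptions: the `runUntilDeadline` function, its return shape, and the injected `now` clock are hypothetical, and the real agent loop may check the deadline at different points.

```typescript
interface DeadlineRunResult {
  completed: number;
  stoppedByDeadline: boolean;
}

// Runs steps in order and stops gracefully once the wall-clock deadline has
// passed. The deadline is checked only between steps, so a step that is
// already running is allowed to finish.
function runUntilDeadline(
  steps: Array<() => void>,
  maxTimeSeconds: number,
  now: () => number = () => Date.now(),
): DeadlineRunResult {
  const deadline = now() + maxTimeSeconds * 1000;
  let completed = 0;
  for (const step of steps) {
    if (now() >= deadline) return { completed, stoppedByDeadline: true };
    step();
    completed++;
  }
  return { completed, stoppedByDeadline: false };
}
```

Injecting the clock keeps the sketch testable without real waiting.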
Changed
- Changed google-antigravity usage report lookups to honor the selected antigravity endpoint mode when resolving the reporting base URL
- Changed context usage reporting to always return numeric token counts and percentages, so status-line and footer now show estimated values instead of `?` immediately after compaction
- Changed context usage reporting to use anchored snapshots and pending-prompts estimates, which now keeps `/context`, status line, and model selector token counts in sync
Fixed
- Fixed Matplotlib figure display to emit PNG output immediately when `display(fig)` is called, even if the figure is closed before the end-of-cell flush
- Fixed persisted tool-result image payloads in `details.images` to externalize and resolve through the session blob store, so generated-image details survive resume without stale blob refs or truncation
- Fixed duplicate Matplotlib image output by skipping the automatic end-of-request figure flush for figures that were already displayed through `display(fig)`
- Fixed google-antigravity image generation and web search requests to fail over to the alternate antigravity endpoint on 429/server/network failures instead of stopping at the first endpoint
- Fixed context usage breakdown to use a completed assistant usage anchor from the current turn instead of a pending prompt snapshot so totals no longer overcount when a large in-turn tool step returns usage
- Fixed side-channel turns and advisor requests to keep using credential resolvers during retries, so Google `Resource exhausted` 429s can rotate to the next account instead of surfacing a terminal error banner
- Fixed context token accounting to keep branch-local anchors during branching so sibling-branch messages no longer pollute context estimates
- Fixed context usage consistency so `/context`, status line, and idle compaction logic now report the same used-token totals
- Fixed status-line context cache invalidation when assistant reasoning signature data grows so displayed context usage updates accurately
- Fixed the status-line context% reading inflating during long tool turns and then dropping sharply on the next message even though no compaction ran. While a request was in flight, `getContextBreakdown` summed a cl100k estimate of the entire tail on top of the stale turn-start prompt and never re-anchored to completed in-turn steps; it now prefers the real provider prompt-token count of any step that resolves at or after the pending cutoff. The status-line memo also keys on a `contextUsageRevision` that bumps when the in-flight snapshot is set/cleared, so a mid-turn estimate is invalidated on turn end/abort instead of surviving into idle until the next message
- Fixed image attachment handling for text-only models by saving attachments to `local://` and injecting generated descriptions so they are no longer lost when the target model cannot process images
- Fixed the ssh tool rejecting valid Windows identity files before invoking OpenSSH by skipping Unix mode-bit key validation on native Windows (#2850).
- Fixed `web_search`/`omp q` aborting before any provider ran when the global Settings singleton was not initialized; `executeSearch` now reads `providers.antigravityEndpoint` once and tolerates an uninitialized settings store instead of throwing
- Fixed the new `git.enabled` and `images.describeForTextModels` settings declaring section groups (`Git`, `Vision`) that were not registered in `TAB_GROUPS`, so they now render in their intended settings-panel sections
- Fixed Python `display(fig)` for Matplotlib figures to emit PNG output immediately, even when user code closes the figure before the end-of-cell flush.
- Fixed persisted tool-result image payloads stored in `details.images` to externalize and resolve through the session blob store, so generated-image details survive resume without stale blob refs or truncation.
- Fixed the `tools.format` setting schema so `minimax` can be selected as an owned tool-calling dialect, and taught auto mode to route tool-less MiniMax-family models to the MiniMax owned dialect. (#2759)
- Fixed WSL2 TUI stutter by adding a `git.enabled` setting and skipping footer/status-line git probes when disabled or when no git-backed status segment is visible (#2847).
- Fixed JSON-mode startup notices (export/resume/session-picker messages) writing to stdout before the JSON event stream; they now route to stderr so stdout remains newline-delimited JSON.
@oh-my-pi/collab-web
Fixed
- Preserved assistant soft line breaks and Markdown paragraph/list indentation in the collab web transcript renderer so tree-shaped prose no longer collapses into one paragraph.
- Changed collab web transcript wrapping to keep Korean/CJK words intact before falling back to emergency breaks for long URLs or identifiers.
@oh-my-pi/pi-mnemopi
Fixed
- Capped `sleep_consolidation` episodic rows at `maxEpisodeChars` (default 100KB, `MNEMOPI_MAX_EPISODE_CHARS`) so raw session transcripts cannot be stored and extracted as multi-megabyte episodes. (#2869)
- Skipped regex-only entity and pattern fact extraction for oversized raw transcripts so progress/log noise cannot flood MEMORIA with junk facts. (#2868)
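The episode cap can be illustrated with a minimal sketch. Several details here are assumptions rather than facts from the release: that "100KB" means 102,400 characters, that oversized text is truncated rather than rejected, and the helper names `resolveMaxEpisodeChars` and `capEpisode`. Only the `MNEMOPI_MAX_EPISODE_CHARS` variable name comes from the entry above.

```typescript
// Assumption: "100KB" is read as 100 * 1024 characters.
const DEFAULT_MAX_EPISODE_CHARS = 100 * 1024;

// Reads the override from an environment-like map, falling back to the
// default when the value is missing or not a positive integer.
function resolveMaxEpisodeChars(env: Record<string, string | undefined>): number {
  const raw = env["MNEMOPI_MAX_EPISODE_CHARS"];
  const parsed = raw === undefined ? Number.NaN : Number.parseInt(raw, 10);
  return Number.isFinite(parsed) && parsed > 0 ? parsed : DEFAULT_MAX_EPISODE_CHARS;
}

// Assumption: oversized episodes are truncated to the cap.
function capEpisode(text: string, maxChars: number): string {
  return text.length <= maxChars ? text : text.slice(0, maxChars);
}
```

Passing the environment as a plain map keeps the helper independent of any particular runtime.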
@oh-my-pi/omp-stats
Added
- New Projects view summarizing usage, cost, and reliability per project folder (backed by the existing `/api/stats/folders` endpoint).
- System-aware light/dark theme toggle: follows the OS by default, and an explicit choice persists across reloads.
Changed
- Redesigned the local stats dashboard with an OMP-themed product shell, dedicated per-section views, accessible loading/empty/error states, and flicker-free navigation between screens and time ranges.
Fixed
- The 1h time-range chart rendered an empty/single-point line; it now buckets at 5-minute granularity for a real trend.
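Bucketing at 5-minute granularity, as described for the 1h chart, can be sketched as follows. The `Sample` and `Bucket` shapes and the `bucketByFiveMinutes` name are hypothetical; the dashboard's actual data model is not shown in these notes.

```typescript
const FIVE_MINUTES_MS = 5 * 60 * 1000;

interface Sample {
  timestampMs: number;
  value: number;
}

interface Bucket {
  startMs: number;
  total: number;
}

// Groups samples into 5-minute buckets keyed by the bucket start time and
// returns them in chronological order. A one-hour window yields at most
// twelve buckets, enough points for a trend line.
function bucketByFiveMinutes(samples: Sample[]): Bucket[] {
  const totals = new Map<number, number>();
  for (const sample of samples) {
    const start = Math.floor(sample.timestampMs / FIVE_MINUTES_MS) * FIVE_MINUTES_MS;
    totals.set(start, (totals.get(start) ?? 0) + sample.value);
  }
  return [...totals.entries()]
    .sort((a, b) => a[0] - b[0])
    .map(([startMs, total]) => ({ startMs, total }));
}
```

Empty buckets are omitted here; a chart that needs a continuous axis would fill the gaps with zeros.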
@oh-my-pi/pi-tui
Added
- Added tight layout support (`setTuiTight`/`getPaddingX`) to dynamically remove 1-character horizontal padding from Text, Markdown, Box, and TruncatedText components.
Changed
- Coalesced byte-adjacent SGR sequences in emitted lines into a single `CSI … m`. The component tree styles each span as `<set>text<reset>`, so adjacent spans emit runs of back-to-back SGR sequences (e.g. a `CSI 39 m` fg-reset immediately followed by the next span's `CSI 38;2;r;g;b m`); merging the run is behavior-preserving because SGR parameters apply left-to-right regardless of framing. On a real transcript this drops ~30-40% of all SGR sequences, cutting the per-frame byte volume and SGR-dispatch count a slow terminal engine (e.g. xterm.js/WebGL under a large viewport) must process. Each emitted sequence is capped at 16 parameter tokens so a long adjacent run is split across several valid CSIs instead of overflowing a terminal's parameter buffer (xterm.js caps at 32 and silently truncates, corrupting colors). A run is never extended past a parameter list that ends in an incomplete semicolon-form extended color (`38`/`48`/`58;2` missing a channel or `;5` missing the index), so a following code can't be absorbed as the missing component. Disable with `PI_NO_SGR_COALESCE=1`.
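The coalescing rules described above (merge byte-adjacent SGR sequences, cap each emitted sequence at 16 parameter tokens, never extend past an incomplete extended color) can be sketched in a few lines. This is a simplified illustration, not the pi-tui implementation: the function names are hypothetical, only semicolon-form parameters are handled, and an empty parameter list is normalized to `0` when merged.

```typescript
const MAX_PARAMS = 16;
const SGR_RUN = /(?:\x1b\[[0-9;]*m){2,}/g; // two or more back-to-back SGR sequences
const SGR_ONE = /\x1b\[([0-9;]*)m/g;

// True when the list contains a 38/48/58 extended color that runs off the end
// (";2" missing a channel, ";5" missing the index, or no mode token at all).
function endsWithIncompleteColor(tokens: string[]): boolean {
  let i = 0;
  while (i < tokens.length) {
    const token = tokens[i];
    if (token === "38" || token === "48" || token === "58") {
      const kind = tokens[i + 1];
      if (kind === undefined) return true;
      const needed = kind === "2" ? 5 : kind === "5" ? 3 : 1;
      if (i + needed > tokens.length) return true;
      i += needed;
    } else {
      i += 1;
    }
  }
  return false;
}

function coalesceSgr(line: string): string {
  return line.replace(SGR_RUN, (run) => {
    const out: string[] = [];
    let current: string[] = [];
    for (const match of run.matchAll(SGR_ONE)) {
      const params = match[1] ?? "";
      const tokens = params === "" ? ["0"] : params.split(";");
      const fits = current.length + tokens.length <= MAX_PARAMS;
      // Flush before merging if the cap would be exceeded or the pending list
      // ends in an incomplete extended color.
      if (current.length > 0 && (!fits || endsWithIncompleteColor(current))) {
        out.push(`\x1b[${current.join(";")}m`);
        current = [];
      }
      current.push(...tokens);
    }
    if (current.length > 0) out.push(`\x1b[${current.join(";")}m`);
    return out.join("");
  });
}
```

A single sequence that already exceeds the cap is passed through unchanged in this sketch, since splitting it could cut an extended color in half.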
Fixed
- Fixed image cache invalidation when terminal image protocol, Kitty placeholder mode, or cell dimensions change, preventing stale rendered output
- Fixed direct inline-image placements leaving the cursor inside the reserved image block, which let following chat rows overwrite the middle of rendered screenshots (#2863).
- Fixed inline-image replay after startup or resume fallback paints by invalidating cached image rows when the terminal image protocol, Kitty placeholder mode, or cell dimensions change.
What's Changed
- fix(catalog): route google-antigravity default baseUrl to primary daily endpoint by @cagedbird043 in #2860
- fix(scripts): check current Gemini CLI version source by @lyc-aon in #2843
- feat(stats): redesign the omp stats dashboard by @lyc-aon in #2841
- perf(tui): coalesce byte-adjacent SGR sequences in emitted lines by @DarkPhilosophy in #2848
- Keep JSON mode stdout clean during startup by @usr-bin-roygbiv in #2813
- feat(alibaba-coding-plan): add endpoint selection for China/International/Custom by @21307369 in #2802
- fix(ai): surface Google OAuth account-verification URL on VALIDATION_… by @igasmi in #2806
- fix(litellm): support LITELLM_BASE_URL by @alexanderkirilin in #2816
- fix(collab-web): align transcript wrapping with TUI by @chan1103 in #2638
- fix(minimax): add MiniMax tool-call dialect by @alexanderkirilin in #2817
- Add session_stop extension stop hook by @cexll in #2845
- Add max-time deadline support by @usr-bin-roygbiv in #2815
- fix(mnemopi): cap sleep consolidation episodes by @roboomp in #2873
- fix(mnemopi): guard regex extraction for large transcripts by @roboomp in #2874
- fix(auth): route GitHub Copilot Business endpoint by @roboomp in #2881
New Contributors
- @cagedbird043 made their first contribution in #2860
- @lyc-aon made their first contribution in #2843
- @21307369 made their first contribution in #2802
- @igasmi made their first contribution in #2806
- @alexanderkirilin made their first contribution in #2816
Full Changelog: v16.0.4...v16.0.5