github can1357/oh-my-pi v15.11.7

latest release: v15.11.8
3 hours ago

@oh-my-pi/pi-ai

Added

  • Added requestModelId and thinking.suppress options to google-gemini-cli so collapsed effort-tier variants serialize their per-effort upstream wire id, and thinking-off requests on models with thinking.suppressWhenOff send an explicit thinkingConfig (includeThoughts: false with thinkingLevel: "MINIMAL" or thinkingBudget: 0) — Cloud Code Assist re-applies the per-id baked server default when the config is omitted, silently thinking and billing the tokens
  • Added mandatory-reasoning clamping: models baked with thinking.requiresEffort floor omitted or disabled reasoning to the lowest supported effort in every api mapping, and disableReasoning no longer emits OpenRouter reasoning: { enabled: false } for them — fixes omp bench and utility requests 400ing with "Reasoning is mandatory for this endpoint and cannot be disabled" on OpenRouter Gemini 3.x

Changed

  • Changed google-gemini-cli request mapping to route per-request wire ids via resolveWireModelId: the session effort picks the backing variant id (collapsed gemini-3.5-flash at high → gemini-3.5-flash-low; claude pairs route off → bare id, efforts → -thinking) while AssistantMessage.model and usage attribution stay on the logical id. A thinking budget clamped to zero now falls through to the thinking-off path (off routing plus suppression) instead of only disabling thinking
  • Changed openai-completions and anthropic-messages to serialize per-request wire ids via resolveWireModelId, so collapsed X/X-thinking pairs on aggregators and custom providers switch to the thinking SKU when reasoning is enabled (previously only google-gemini-cli routed effort-tier variants)

Fixed

  • Fixed google-gemini-cli ignoring Model.requestModelId when serializing the request model id

@oh-my-pi/pi-catalog

Added

  • Added effort-tier variant collapsing (variant-collapse): providers that expose one logical model as several effort/thinking-suffixed upstream ids (Antigravity CCA gemini-3.5-flash-extra-low/-low/gemini-3-flash-agent, gemini-3[.1]-pro-low|high, claude-*[-thinking] pairs, gpt-oss-120b-medium) collapse into one logical entry carrying per-effort upstream routing in thinking.effortRouting (plus thinking.suppressWhenOff for Cloud Code Assist ids whose baked server default re-applies when thinkingConfig is omitted). Request-time code resolves the outbound id via resolveWireModelId(model, effort); selection, caching, and usage attribution key on the logical id.
  • Added the automatic X/X-thinking pair rule (deriveThinkingPairFamilies): any provider's live bare/thinking twin collapses into the bare id, routing thinking-enabled requests to the -thinking backing id (trailing or infix token, so kimi-k2-thinking-turbo pairs with kimi-k2-turbo). Gated on same api and compatible pricing — all-zero cost rows count as unknown, while twins that both carry real, differing prices remain separate SKUs.
  • Added collapseBuiltModelVariants and wired collapsing at every materialization point — Antigravity discovery, the catalog generator, and the model-manager merge — so stale sources (old static beside collapsed dynamic results, mixed cache rows) converge on logical entries instead of unioning raw tier ids back into the catalog.
  • Added thinking.requiresEffort, baked for reasoning-only upstreams — Gemini 3.x (levels only, no off), Gemini 2.5 Pro (thinkingBudget floors at 128, rejects 0), OpenAI o-series, MiniMax M2, and thinking-variant SKUs (*-thinking/*-reasoner/*-reasoning, with a negation-aware token grammar so non-thinking ids never match). Identity derivation bakes it for new entries and fillThinkingWireDefaults backfills explicit/cached metadata; minimumSupportedEffort exposes the canonical floor. Pair-collapsed twins drop member flags (their off routes to the bare SKU), while identity re-flags pairs whose logical id is itself mandatory

Changed

  • Changed model display names to drop model-extrinsic decorations: gateway author prefixes (OpenAI: …, Google: …), (latest) alias markers, (Antigravity) provider attribution, price tiers (($$$$)), and promo/lifecycle tags ((20% off), (retires …)). cleanModelName is applied in buildModel (covers live discovery and stale caches) and as a catalog-generator pass; Antigravity discovery no longer appends (Antigravity) to display names. Variant tags that map to distinct wire ids ((Thinking), (free), (Fast), dates, regions) are preserved.
  • Changed the google-antigravity default model from gemini-3-pro-high to gemini-3.1-pro
  • Changed gemini-2.5-flash-thinking handling from discovery-denylist to collapsing into gemini-2.5-flash (thinking-enabled requests route to the -thinking backing id)
  • Bumped the model cache schema to v5 so rows predating effort-tier variant collapsing (raw -low/-high/-thinking member ids) are invalidated

Fixed

  • Fixed catalog generation to apply effort-tier variant collapsing before provider grouping to ensure collapsed model families are consistently materialized without being impacted by in-loop mutation
  • Fixed Kimi K2.6 OpenAI-compatible compat metadata to use a 300s stream watchdog floor, covering Fire Pass router ids as well as public kimi-k2.6 ids so long reasoning starts do not hit the generic first-event timeout (#2366).

@oh-my-pi/pi-coding-agent

Added

  • Added mouse support to the setup wizard, including hover, wheel scrolling, and click-to-select for provider sign-in, web-search provider, glyph, and theme option lists
  • Added pointer support for the providers tab bar so tabs can be selected by click and wheel scrolling works inside the active panel
  • Added left-click navigation in setup wizard splash/outro screens to advance or exit without pressing Enter
  • Added bench CLI command to benchmark model selectors with a shared prompt and report time-to-first-token and throughput
  • Added omp bench options --runs, --max-tokens, --prompt, and --json to control run count, response length, prompt text, and machine-readable output
  • Added benchmark failure reporting that shows per-run errors, flags failed runs in the summary, and exits with a non-zero status when any model benchmark fails
  • Added snapcompact.shape setting to pick the frame variant snapcompact prints text with — auto (each provider's eval winner) or any research-eval variant: the square grids (8x8r/8x8u/6x6u/5x8 × sentence-hue/black ink) and the per-model winners (6x12-dim, 8x13-bw, 8on16-bw, doc-8on16-bw, doc-8on16-sent, doc-8on16-sent-dim); applies to both the compaction archive and inline system-prompt/tool-result imaging

Changed

  • Updated snapcompact.shape option help text so the auto mode and 8on16-bw descriptions now reflect OpenAI’s current default and GPT winner variants
  • Model selectors keyed by retired effort-tier variant ids keep resolving after catalog collapsing: reference resolution and bare-id matching consult the hand-table aliases (gemini-3.5-flash-low, gemini-pro-agent, recycled gemini-3-flash) plus the X-thinkingX grammar for auto-collapsed pairs, with exact matches always winning while a raw id is live; explicit :effort suffixes transfer unchanged. models.yml modelOverrides and rate-limit selector suppressions keyed by raw member ids re-key onto the collapsed model
  • Custom/config provider model lists now collapse X/X-thinking twins at registry rebuild (collapseBuiltModelVariants), folding config-defined twins into one entry whose thinking toggle routes to the -thinking backing id

Fixed

  • Fixed Ctrl+T thinking-block toggles clearing the pending user message and loader before the assistant stream starts (#2370).
  • Fixed Windows CodeGraph MCP launches from npm .cmd shims by resolving the shim's Node entry point and spawning it directly, keeping stdio attached to CodeGraph instead of the transient cmd.exe wrapper (#2367).
  • Fixed snapcompact archives going partially unreadable on OpenRouter, which hard-caps requests at 8 images and silently drops the excess: compaction now passes a provider-aware frame budget (providerFrameBudget) so the archive never exceeds the cap (unknown providers get a safe floor of 5), with overflow kept as a text tail on the summary instead of rendering frames the gateway would drop; inline system-prompt/tool-result imaging shares the same per-provider budget (providerImageBudget), so once existing images exhaust it (e.g. 8 archive frames on OpenRouter) tool results ship verbatim as text
  • Fixed Hindsight retains to send offset-aware local timestamps instead of UTC Z strings so extraction prompts keep the user's local time-of-day context (#2363).
  • Fixed tool calls taller than the viewport reading as cut off while streaming (the head reappeared only once the result landed): the 15.11.6 stranded-preview fix marked every collapsed pending tool preview commit-unstable, which also blocked durable top-anchored streams — e.g. a task call's context/assignment markdown — from reaching native scrollback mid-run. Commit stability is now classified per renderer (ToolRenderer.provisionalPendingPreview): only the tail-window previews the result render re-anchors (edit/apply_patch streamed-diff tails, bash/ssh command caps, eval cells with interleaved outputs) stay provisional; every other pending preview commits its settled head mid-stream again
  • Fixed omp bench reporting "tokens 0, TPS 0.0" successes on repeated OpenRouter runs: pi-ai opts every OpenRouter request into response caching, so bench's byte-identical request replayed a cached generation with zeroed usage at edge speed. Bench now sends X-OpenRouter-Cache: false so every run measures a fresh generation
  • Fixed omp bench failing with HTTP 400 {"detail":"Instructions are required"} against openai-codex models: bench requests now carry a minimal default system prompt (same guard as eval's completion bridge)
  • Fixed pressing Ctrl+T (toggle thinking blocks) — or hitting any other rebuild path such as the theme/preset selector — during the pre-streaming window after a submission erasing the just-submitted user message until the first assistant token arrived: the user message is rendered optimistically before session.prompt(...) lands it in session entries, so rebuildChatFromMessages() had no record of it; it now replays an in-flight optimistic submission after rebuilding the transcript, gated on optimisticUserMessageSignature (cleared by EventController once the real message_start lands) so it cannot duplicate post-streaming (#2372).

@oh-my-pi/pi-natives

Added

  • Added the X.org misc 6x12 and 8x13 BDF fonts (public domain, vendored in crates/pi-natives/src/fonts/) to renderSnapcompactPng, alongside two new options for the snapcompact eval-winner shapes: stretch: false renders glyphs at natural size on the requested cell box while keeping the 4-bit indexed encoder (e.g. 8x13 glyphs on an 8x16 pitch, the "8on16" shapes), and columns: 2 flows pre-wrapped newline-separated lines down two newspaper columns with a 3-cell gutter (the "doc" shapes); in doc mode sentence hues also advance across a terminator followed by a newline
  • Added a line-break marker to renderSnapcompactPng: U+2588 (FULL BLOCK) fills its entire cell box with pitch-black ink in both grid and doc layouts, ignoring the sentence hue and dim state, and counts as a sentence boundary after a ./!/? terminator

Changed

  • renderSnapcompactPng now clips the frame height to the text: the PNG stays size pixels wide but is only usedRows * lineRepeat * cellHeight tall (dim toggles are zero-width; doc layout counts \n-separated lines), so a partially filled frame no longer pads to a full square of blank rows
  • renderSnapcompactPng indexed frames now narrow the palette to the colors actually printed and pick the matching bit depth (plain bw 1-bit, dim/banded 2-bit, sentence hues up to 4-bit), and both encode paths moved from Balanced to High deflate: 8on16-bw frames shrink ~35%, 6x12-dim ~10%, sentence-hue doc frames ~9% — pure PNG, no decoder-side changes (lossless WebP was measured at only ~8% beyond this and rejected for provider-compatibility risk)

@oh-my-pi/snapcompact

Added

  • Added SHAPE_VARIANTS, the catalog of research-eval frame variants the native renderer reproduces faithfully (8x8r/8x8u/6x6u/5x8 × sent/bw), with ShapeVariantName, SHAPE_VARIANT_NAMES, and the isShapeVariantName guard
  • resolveShape(api, variant?) now accepts an explicit variant name (or "auto"); forced variants keep their geometry but are re-priced for the target provider's image billing (token estimate and OpenAI original detail hint)
  • Added the six research-eval winning frame variants to SHAPE_VARIANTS: 6x12-dim (Claude fable), 8x13-bw (Opus), 8on16-bw (GPT grid runner-up), doc-8on16-bw (GPT), doc-8on16-sent (GLM), and doc-8on16-sent-dim (Gemini/Kimi), backed by new Shape fields stretch (disable Lanczos stretch: natural glyphs on a larger cell pitch), columns (two word-wrapped newspaper columns), stopwordDim, and the X.org 6x12/8x13 fonts
  • Added dimStopwords(), which prints high-frequency function words in dim ink via zero-width markers (skipping spans that are already dim), and wrap(), the greedy word-wrap used to typeset doc-layout pages; geometry/render/renderMany/frames/compact understand doc shapes (wrap once, paginate into 2 * rows-line pages), and compaction frames persist columns/stopwordDim for mixed-shape detection
  • resolveShape now takes a ShapeTarget ({ api, id } — a pi-ai Model works as-is) and detects the ideal shape from the model id, not just the wire API: a Claude routed through Vertex or an OpenAI-compatible gateway keeps its Claude shape, with billing still priced by the API family actually carrying the request. idealShapeVariant(modelId) exposes the model-line table; unmeasured models fall back to the API family's winner
  • resolveShape now also resolves an ideal frame size per model line, and billing estimates come from verified per-family formulas instead of flat 1568px constants: Anthropic bills 28px patches capped at 4,784 visual tokens (+5% margin), Gemini 3.x bills a fixed 1,120-token media_resolution budget per image at any pixel size, and OpenAI bills 32px patches × 1.2 under the 10,000-patch detail: "original" budget. High-res Claude lines (Opus 4.7+, Fable, Mythos — native 2576px-edge ingestion) get 1932px frames (same recall and cost, a third fewer frames); Gemini gets 2048px frames (+70% chars per frame at the same bill); GPT and Kimi stay at 1568px (area-proportional billing and a model-side 1792px processor cap, respectively). idealShapeVariant now returns an IdealShape ({ variant, frameSize? })
  • Added per-provider image-count budgets: PROVIDER_IMAGE_BUDGETS, DEFAULT_PROVIDER_IMAGE_BUDGET, providerImageBudget(), and providerFrameBudget() (the image budget clamped to MAX_FRAMES). OpenRouter is capped at its measured hard limit of 8 images per request (excess images are silently dropped with no error); unknown providers get a safe floor of 5
  • Added Archive.textTail: archive content past the frame budget is no longer dropped — compact() stops rendering at the budget and keeps the newest unframed slice as verbatim text on the summary (capped at two frame capacities with middle elision, counted into truncatedChars when elided). The tail persists in preserveData and is folded back into frames by the next compaction

Changed

  • Frames are no longer padded to a square: the native renderer clips each PNG's height to the text rows actually printed, so a partially filled frame (typically the newest) bills only the pixel rows it uses
  • Changed the OpenAI default shape from 6x6u-sent to 8on16-bw. A production-regime mono eval (gpt-5.5, the full 800k-char SQuAD flow in one request, n=50) scored the old dense default f1 .602 vs .851 for 8on16-bw rendered by the production pipeline, at near-equal total cost (the dense cells burned the frame savings on reasoning tokens); chunked exp14 had already scored 8on16-bw .906. SHAPES.openaiDense is renamed to SHAPES.openai
  • Changed the Google default shape from 8x8r-sent to doc-8on16-sent-dim. Production-rendered mono eval on gemini-3.5-flash (400k chars, one request, n=25): f1 .900 vs .853 for the repeated grid at lower cost, agreeing with the chunked round-2 winner
  • Changed the Anthropic default shape from 8x8r-bw to 6x12-dim. Production mono eval on claude-fable (400k chars, one request, n=25): f1 .840 vs .877 for the repeated grid — within noise — at 37% lower cost (12 frames instead of 21 per 400k chars), with clean completions in every probe; opus reads the same trade (.800 vs .833 at 42% lower cost)
  • normalize() now keeps line structure: whitespace runs containing a line break collapse to NEWLINE_GLYPH (U+2588 FULL BLOCK, drawn by the native renderer as a pitch-black cell one character wide) instead of a plain space; leading/trailing breaks are trimmed, and the frame-reading prompt explains the marker
  • normalize() now skips characters the fonts cannot render instead of printing ? blanks: whole ANSI escape sequences are stripped, and bare control characters, zero-width format characters (ZWSP, BOM, directional marks), combining marks, and lone surrogates are dropped without occupying a cell; ? remains the fallback for unsupported graphic characters only

What's Changed

  • fix(hindsight): preserve local timezone on retain timestamps by @roboomp in #2364
  • fix(catalog): widen Kimi K2.6 stream watchdog by @roboomp in #2368
  • fix(mcp): launch Windows npm cmd shims directly by @roboomp in #2369
  • fix(tui): preserve pending message on thinking toggle by @roboomp in #2371
  • fix(tui): replay optimistic user message on chat rebuild by @roboomp in #2373

Full Changelog: v15.11.6...v15.11.7

Don't miss a new oh-my-pi release

NewReleases is sending notifications on new releases.