github can1357/oh-my-pi v15.9.1

4 hours ago

@oh-my-pi/pi-ai

Added

  • Added regional Xiaomi Token Plan login/provider entries (xiaomi-token-plan-sgp, xiaomi-token-plan-ams, xiaomi-token-plan-cn) so omp login can store token-plan keys against the selected region. (#1846)

Fixed

  • Removed the context-1m-2025-08-07 (1M long-context) beta from the Anthropic agent request headers, the OAuth model-discovery header, and the Claude usage-API header. Sending it caused subscription/OAuth requests without long-context credits to fail with 429 Usage credits are required for long context requests, breaking Sonnet. The remaining betas are unchanged.
  • Fixed Kimi K2.x maxTokens on Fireworks and Fire Pass (fireworks/kimi-k2.5, fireworks/kimi-k2.6, firepass/kimi-k2.6-turbo) being inherited from Fireworks /v1/models discovery (max_completion_tokens: 65536) rather than the published Kimi-on-Fireworks output budget, which let callers (and the openai-completions default-injection safety net) ship a budget the router cannot honor and made runaway reasoning traces more likely. The Fireworks resolver now clamps every Kimi K2.x id (public catalog ids and the canonical accounts/fireworks/{models,routers}/kimi-k2… wire form) to 32,768 output tokens, and the generator applies the same cap as a post-processing safety net so the firepass static fallback and the bundled fireworks entries stay in sync across regens. (#1849)
  • Fixed Xiaomi Token Plan MiMo OpenAI-compatible tool-call continuations omitting required reasoning_content replay. (#1846)
  • Fixed Anthropic prompt caching for OpenAI-compatible Claude proxies by honoring compat.cacheControlFormat: "anthropic" outside OpenRouter. (#1845)
  • Fixed Moonshot Kimi K2.6 silently pausing for many seconds between tool calls because the server discarded the reasoning_content that omp was already sending with every assistant tool-call replay. The K2.6 thinking parameter takes an extra keep field whose default (null) ignores historical reasoning, so K2.6 had to re-derive its full chain-of-thought from the user prompt on every iteration of the agent loop. The Moonshot direct (api.moonshot.ai) and Kimi Code (api.kimi.com) wire bodies now send thinking: { type: "enabled", keep: "all" } for kimi-k2.6 requests with reasoning enabled, matching Moonshot's documented best practice for multi-step tool-calling agents. The flag is gated on the K2.6 id and the two native hosts because earlier Moonshot models (K2.5 and below) 400 on the unknown field and every Kimi gateway (OpenRouter, OpenCode, Kilo, Fireworks, …) speaks its own thinking shape. (#1838)
  • Fixed Alibaba DashScope (Bailian) compatible-mode endpoint 400 InternalError.Algo.InvalidParameter: The provided messages input is invalid. The error info is [Unexpected item type in content.] when a screenshot or other image-producing tool result was folded into a known text-only Qwen turn (e.g. qwen3.7-max, qwen-max, qwen3-coder-*) hosted at dashscope.aliyuncs.com/compatible-mode/v1. convertMessages in openai-completions no longer forwards image_url content parts for those text-only id families even when a misconfigured custom provider claims input: ["text", "image"]; multimodal compatible-mode ids such as qwen3.7-plus and qwen-vl-max still rely on the catalog input field. The tool-result branch and the user-content branch both fall back to the standard [image omitted: model does not support vision] placeholder for text-only ids so the model still sees the attachment intent. (#1859)
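
The Kimi K2.6 gating described in the Moonshot entry above can be sketched roughly as follows. This is a minimal illustration, not omp's actual code: `withKeptReasoning`, `WireBody`, and the exact-match test on the model id are assumptions, while the two hosts and the `thinking: { type: "enabled", keep: "all" }` payload come from the note itself.

```typescript
// Hypothetical sketch: helper name, body type, and exact-id match are assumptions.
type WireBody = Record<string, unknown>;

// The two native hosts named in the release note.
const NATIVE_KIMI_HOSTS = new Set(["api.moonshot.ai", "api.kimi.com"]);

function withKeptReasoning(
  body: WireBody,
  modelId: string,
  baseUrl: string,
  reasoningEnabled: boolean,
): WireBody {
  const host = new URL(baseUrl).hostname;
  // Only K2.6 on a native host gets the extra field; other models and
  // gateways are left untouched because they may reject or ignore it.
  const eligible =
    reasoningEnabled && modelId === "kimi-k2.6" && NATIVE_KIMI_HOSTS.has(host);
  if (!eligible) return body;
  return { ...body, thinking: { type: "enabled", keep: "all" } };
}

const native = withKeptReasoning({ model: "kimi-k2.6" }, "kimi-k2.6", "https://api.moonshot.ai/v1", true);
const gateway = withKeptReasoning({ model: "kimi-k2.6" }, "kimi-k2.6", "https://openrouter.ai/api/v1", true);
console.log(JSON.stringify(native), JSON.stringify(gateway));
```

Returning the body unchanged for ineligible requests keeps the gate additive, so any gateway-specific thinking shape set elsewhere is not overwritten.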

@oh-my-pi/pi-coding-agent

Added

  • Added deferred session-title generation so greetings no longer become the session title. A first user message that is only a greeting / acknowledgement / filler ("hi", "thanks", "ok", a bare number, emoji-only, etc.) is now detected deterministically and skips titling entirely — no title model is invoked. Title generation then retries on each subsequent user message while the session stays unnamed, so the title is deduced from the first message that actually describes work. A capable online title model may additionally answer none to decline a non-greeting taskless message (normalized to "no title").
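
A minimal sketch of the deferred-titling flow described above, under stated assumptions: the filler word list, `isTasklessGreeting`, and `titleAfterMessage` are hypothetical names and heuristics chosen for illustration, and the real detector may use different rules.

```typescript
// Hypothetical filler list; the actual set used by omp is not given in the note.
const FILLER = new Set(["hi", "hello", "hey", "thanks", "thank you", "ok", "okay"]);

function isTasklessGreeting(message: string): boolean {
  const text = message.trim().toLowerCase();
  if (text === "") return true;
  // A bare number carries no task.
  if (/^\d+([.,]\d+)?$/.test(text)) return true;
  // Drop everything that is not a letter, digit, or whitespace (emoji, punctuation).
  const stripped = text.replace(/[^\p{L}\p{N}\s]/gu, "").replace(/\s+/g, " ").trim();
  if (stripped === "") return true; // emoji-only or punctuation-only
  return FILLER.has(stripped);
}

// Called on each user message while the session is still unnamed.
function titleAfterMessage(
  current: string | undefined,
  message: string,
  generate: (message: string) => string,
): string | undefined {
  if (current !== undefined) return current; // already titled
  if (isTasklessGreeting(message)) return undefined; // no title model is invoked
  const answer = generate(message).trim();
  // A title model may answer "none" to decline; treat that as no title.
  return answer.toLowerCase() === "none" ? undefined : answer;
}

let title = titleAfterMessage(undefined, "hi", () => "unused");
title = titleAfterMessage(title, "Fix the login redirect bug", () => "Fix login redirect");
console.log(title);
```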

Changed

  • Changed mid-turn user steers to reach the model inside a wire-only interjection envelope, while transcripts and persisted session history keep the user's original text.
  • Changed the system prompt to treat user requests for parallel work as task subagent fan-out rather than parallel tool calls.
  • Changed the Agent Control Center's new-agent description field to use the multiline TUI editor, with Enter inserting lines and Ctrl+Enter generating the spec.
  • Changed the Agent Control Center and Extension Control Center to accept Left/Right arrow keys for switching tabs (source / provider), in addition to Tab / Shift+Tab — matching the model and settings selectors, whose TabBar already supported arrow navigation.
  • Refreshed the Ctrl+R history search overlay: the selected row now renders as a full-width selectedBg highlight bar, matched query tokens are highlighted in the accent color, each result shows a right-aligned relative timestamp, and the panel gained an icon'd accent title plus a two-tone keyhint footer. The selector also gained PageUp/PageDown (via the configurable tui.select.pageUp/pageDown keybindings) and Home/End navigation.
  • Changed Perplexity API-key web search to return more comprehensive results: web_search_options.search_context_size is now high (was medium) for maximum retrieval grounding, the default num_search_results is 20 (was 10) so twice as many sources are surfaced, and return_related_questions is enabled with the response's related_questions now parsed into relatedQuestions (previously dropped). On an identical query this lifted the result from 10 sources / ~410 output tokens to 20 sources / ~1900 output tokens with a structured, multi-section answer; latency tracks model output length, not context size, so the 60s hard timeout headroom is unchanged.
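
The new Perplexity defaults from the last entry above might be assembled along these lines. The field names (`web_search_options.search_context_size`, `num_search_results`, `return_related_questions`, `related_questions`) are taken from the note; the helper names `buildSearchOptions` and `parseRelatedQuestions` and the defensive filtering are assumptions for illustration.

```typescript
// Hypothetical helpers; only the wire field names come from the release note.
interface PerplexityResponse {
  related_questions?: unknown;
}

function buildSearchOptions(numSearchResults = 20) {
  return {
    web_search_options: { search_context_size: "high" as const }, // was "medium"
    num_search_results: numSearchResults, // default was 10
    return_related_questions: true,
  };
}

// Map the response's related_questions into a clean relatedQuestions array.
function parseRelatedQuestions(response: PerplexityResponse): string[] {
  const raw = response.related_questions;
  if (!Array.isArray(raw)) return [];
  return raw.filter((q): q is string => typeof q === "string" && q.trim() !== "");
}

console.log(buildSearchOptions());
console.log(parseRelatedQuestions({ related_questions: ["How does X compare to Y?"] }));
```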

Fixed

  • Fixed a streamed assistant message freezing at a partial prefix (e.g. only "Nat" of "Natives built, now…") on ED3-risk terminals (Ghostty/kitty/iTerm2/Alacritty), with the final text appearing only after a resize. TranscriptContainer freezes each non-live block by replaying its last live render, but render coalescing can finalize a block's content and append the next block within the same throttled frame — so the block was sealed at its stale mid-stream snapshot and never repainted until the next thaw. The block that was live on the previous render is now recomputed once on the live→frozen transition, sealing it at its final content.

  • Fixed ACP/RPC stdio startup so protocol frames are no longer consumed as one-shot piped prompt input before the JSON-RPC transport starts.

  • Fixed omp completions to await the completion script write before exiting.

  • Fixed AssistantMessageComponent exposing its stable-prefix completion API again so streamed assistant messages remain unstable until explicitly completed.

  • Fixed session restoration to ignore transient fallback model switches (such as automatic context-promotion or retry fallback) so resumed or resumed-switch sessions revert to the configured default model unless the last change was a user-selected temporary model.

  • Fixed in-session /resume to restore both the last user-selected temporary model and persisted plan/goal mode state instead of falling back to the default model with plan mode off.

  • Fixed the /resume session picker overflowing short viewports: the visible window was hardcoded to 5 entries (and assumed 3 lines each), but titled sessions render 4 lines, so on a typical-height terminal the picker's header and search box scrolled off the top and the first entry was hidden until you scrolled the terminal up. The visible-entry count is now derived from the live terminal height (budgeting the worst-case 4-line titled entry plus the picker's chrome), so the whole picker fits the viewport and grows on taller terminals.

  • Fixed the Agent Control Center and Extension Control Center dashboards overflowing the terminal: they were mounted inline below the chat transcript, so the combined height exceeded the viewport — the tab bar and controls scrolled off the top into native scrollback, and every state change yanked the view back to the bottom. Both dashboards now render as full-screen overlays sized to the live terminal height (process.stdout.rows), re-fit on resize, fill the viewport, and reserve space for the footer keyhints so the controls stay visible.

  • Fixed Ctrl+R history search results to remain globally sorted by prompt recency after merging FTS prefix matches with substring fallback matches.

  • Fixed Exa web search with no stored or environment credential to use the public Exa MCP fallback again, preserving the auth storage → EXA_API_KEY → mcp.exa.ai resolution order (#1860).

  • Fixed ACP plan-mode writes to local://PLAN.md so session-local plan artifacts are written to OMP's local artifact root instead of being routed through the editor writeTextFile bridge, avoiding Zen's Internal error and making the plan readable after creation (#1863).

  • Fixed ACP plan mode stranding the agent at plan approval: entering mode: "plan" now registers a standing resolve handler so the agent's resolve { action: "apply" } no longer fails with No pending action to resolve. Nothing to apply or discard. The handler validates the plan file, asks the ACP client to confirm via unstable_createElicitation when the client supports forms, renames the approved plan to local://<title>.md, and exits plan mode so the agent regains write tools for execution (#1869).

  • Fixed provider.appendOnlyContext: "auto" staying inactive for Xiaomi Token Plan/SGLang endpoints, preserving prefix-cache hits without forcing append-only mode globally (#1851).

  • Fixed models.yml compatibility parsing to preserve compat.cacheControlFormat: "anthropic" for custom OpenAI-compatible Claude proxies. (#1845)

  • Fixed the TUI's Settings → Plugins panel reporting "No plugins installed" when only marketplace plugins were installed. The panel now merges PluginManager.list() with MarketplaceManager.listInstalledPlugins() — the same data source the /plugins list slash command and omp plugin list CLI already used — and tags each row with an [npm] / [marketplace] kind badge, a scope tag, and a shadow indicator for project-shadowed user installs. Selecting a marketplace row opens a new MarketplacePluginDetailComponent whose single Enabled toggle calls MarketplaceManager.setPluginEnabled(pluginId, enabled, scope), with read-only metadata (version, install path, installed-at, last-updated, git commit SHA) listed below the toggle. The empty-state now lists both install commands (omp plugin install <package> and omp plugin install <name>@<marketplace>) (#1842).

  • Fixed scoped mnemopi recall in MnemopiSessionState.collectScopedRecallResults/recallResultsScoped to await the async Mnemopi.recallEnhanced so the new auto-derived queryEmbedding flows through. Without this, the embedding-enabled mnemopi backend silently kept running FTS-only on every recall. (#1832)

  • Fixed the SSH tool renderer inlining multiline remote commands into its single-line status header, which produced a malformed cell where the bordered output block opened mid-command. The renderer now drops the command from the header (which keeps only [host]) and renders the full command in a framed section above Output, mirroring the bash renderer. renderStatusLine also flattens any embedded CR/LF in description, meta, and title so no tool can accidentally expand the header into multiple rows (#1828).

  • Fixed tsc --noEmit against packages/coding-agent/tsconfig.json reporting 56 errors under TypeScript 5.x (builtin-registry.ts × 46, agent-session-openai-responses-replay.test.ts × 10). The repo's own gate (tsgo / TypeScript 6.x) already accepted the () => void slash-command handlers, but 5.x rejects them because it does not coerce a void-returning function value into a () => T | undefined slot. The SlashCommandSpec.handle / handleTui signatures and the test's createPersistedSession populate callback are now expressed as a union of two function types (one returning a SlashCommandResult / target, one returning void), so the existing handler bodies typecheck on both compilers (#1821).

  • Fixed omp update leaving @oh-my-pi/pi-natives and the platform-specific @oh-my-pi/pi-natives-<tag> leaf at the previous version on bun install -g updates, so the next launch loaded a stale .node file and aborted at validateLoadedBindings with The .node file on disk is from a different release than this loader. omp update now pins the native addon core and the platform leaf to the same version it installs for @oh-my-pi/pi-coding-agent (#1824).
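
As an illustration of the /resume picker fix above, the visible-entry count can be derived from the terminal height instead of being hardcoded to 5. This is a rough sketch: `visibleEntryCount` and the `chromeLines` parameter are hypothetical, and only the worst-case 4-line entry height comes from the note.

```typescript
// Worst-case height of a titled session entry, per the release note.
const LINES_PER_ENTRY = 4;

// chromeLines is an assumed budget for the picker's header, search box, and footer.
function visibleEntryCount(terminalRows: number, chromeLines: number): number {
  const available = terminalRows - chromeLines;
  // Always show at least one entry, even on very short terminals.
  return Math.max(1, Math.floor(available / LINES_PER_ENTRY));
}

console.log(visibleEntryCount(24, 6)); // 4
console.log(visibleEntryCount(50, 6)); // 11
```

Budgeting for the tallest possible entry means untitled 3-line entries leave some slack, but the picker's chrome can no longer be pushed off the top of the viewport.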

@oh-my-pi/pi-mnemopi

Breaking Changes

  • Changed Mnemopi.recall(), Mnemopi.recallEnhanced(), Mnemopi.search(), Mnemopi.query(), the module-level recall/recallEnhanced/search/query exports, the BeamMemory.recall/recallEnhanced methods, the free recall/recallEnhanced functions in core/beam/recall, and orchestrateRecall to return Promise<RecallResult[]> so the recall pipeline can auto-derive queryEmbedding from the query text via embedQuery. Callers must await recall calls; pass queryEmbedding: null to opt out of auto-embedding and stay on FTS-only.
  • Changed the MCP entrypoints handleToolCall, callToolJson, and handleJsonRpc in mcp-server/mcp-tools to async so the recall/shared-recall handlers can await the new Promise<ToolResult[]> shape; external MCP transports must await these.
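
For callers migrating to the async recall API, the change looks roughly like this. The sketch uses a stand-in object rather than the real package: `RecallApi`, `stubMemory`, and the `RecallResult` fields are hypothetical, and only the `Promise<RecallResult[]>` return shape and the `queryEmbedding: null` opt-out come from the notes above.

```typescript
// Hypothetical shapes standing in for the real Mnemopi types.
interface RecallResult {
  id: string;
  text: string;
}
interface RecallOptions {
  queryEmbedding?: number[] | null;
}
interface RecallApi {
  recall(query: string, options?: RecallOptions): Promise<RecallResult[]>;
}

// Stand-in implementation that only reports which mode was requested.
const stubMemory: RecallApi = {
  async recall(query, options) {
    const mode = options?.queryEmbedding === null ? "fts-only" : "auto-embedded";
    return [{ id: "1", text: `${mode}: ${query}` }];
  },
};

async function demo(): Promise<string[]> {
  // Recall now returns a promise, so it must be awaited.
  const hybrid = await stubMemory.recall("deploy checklist");
  // Passing null opts out of auto-embedding and stays on FTS-only.
  const ftsOnly = await stubMemory.recall("deploy checklist", { queryEmbedding: null });
  return [...hybrid, ...ftsOnly].map((r) => r.text);
}

demo().then((lines) => console.log(lines.join("\n")));
```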

Fixed

  • Fixed memory_embeddings never being populated by the production remember/rememberBatch/updateWorking/consolidateToEpisodic paths; embedding generation is now scheduled as a background task on beam.pendingExtractions (mirroring scheduleFactExtraction), so configured providers (fastembed, OpenAI-compatible API, custom) actually run and rows land in memory_embeddings(memory_id, embedding_json, model). (#1832)
  • Fixed recall()/recallEnhanced() never deriving a query embedding from the query text, which silently degraded every deployment to FTS-only regardless of provider configuration. The recall pipeline now auto-calls embedQuery(query) when options.queryEmbedding is undefined; pass null to keep the old FTS-only behaviour. (#1832)
  • Fixed toRecallOptions dropping queryEmbedding between the Mnemopi facade and the beam layer, so callers can now explicitly pin or disable the query vector through the public API.
  • Fixed withMemory (CLI) and withBeam/withSharedBeam (MCP) closing the SQLite handle before background fact-extraction and embedding tasks finished, so short-lived mnemopi store/mnemopi sleep and MCP remember/update paths now drain flushExtractions before close instead of silently dropping memory_embeddings rows. CLI handlers and MCP handleRemember/handleUpdate/handleSleep/etc. are async as a result. (#1832, follow-up to #1833 review)
  • Fixed the process-wide embedQuery() cache in core/embeddings.ts keying by query text alone, which let two Mnemopi instances in the same process with different providers/models cross-contaminate their dense_score rankings. The cache key now includes a WeakMap-assigned provider identity, the resolved model name, and the configured apiUrl, so disjoint runtimes never read each other's cached vectors. (#1832, follow-up to #1833 review)
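
The cache-key fix in the last entry above can be pictured with a small sketch. It assumes a hypothetical `cacheKey` helper and a counter-based identity; the note only states that the key now combines a WeakMap-assigned provider identity, the resolved model name, and the configured apiUrl with the query text.

```typescript
// Hypothetical sketch of a provider-aware cache key.
const providerIds = new WeakMap<object, number>();
let nextProviderId = 1;

// Assign each provider object a stable numeric identity without keeping it alive.
function providerIdentity(provider: object): number {
  let id = providerIds.get(provider);
  if (id === undefined) {
    id = nextProviderId++;
    providerIds.set(provider, id);
  }
  return id;
}

function cacheKey(provider: object, model: string, apiUrl: string | undefined, query: string): string {
  // JSON encoding avoids collisions that a plain delimiter join could produce.
  return JSON.stringify([providerIdentity(provider), model, apiUrl ?? "", query]);
}

const providerA = { name: "a" };
const providerB = { name: "b" };
console.log(cacheKey(providerA, "model-x", undefined, "hello"));
console.log(cacheKey(providerB, "model-x", undefined, "hello"));
```

Using a WeakMap means a discarded provider can still be garbage-collected, since the identity table holds no strong reference to it.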

@oh-my-pi/pi-tui

Fixed

  • Fixed the OSC 11 appearance poll re-querying every 2s forever on terminals that support Mode 2031 but never change theme, whose repeated OSC 11/DA1 writes cleared the user's active text selection (breaking copy every 2 seconds). The poll now stops as soon as DECRQM confirms Mode 2031 support, since push notifications make polling redundant.

@oh-my-pi/pi-utils

Fixed

  • Hardened getIndentation against malformed paths: any filesystem error from the .editorconfig probe (e.g. ENAMETOOLONG on oversized garbage path segments) is now swallowed and cached as a miss instead of escaping and crashing the TUI mid-render (#1871).
  • Fixed getIndentation (and the edit renderer's replaceTabs callers) crashing with ENAMETOOLONG/ENOTDIR/etc. when handed a path with an overlong component or a non-directory in its parent chain. Editorconfig discovery now short-circuits to the default tab width on any path component above NAME_MAX (255 bytes) and absorbs any FsError while walking the editorconfig chain — best-effort discovery must never escape as an uncaught exception (#1872).
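
A rough sketch of the two guards described above: short-circuit on overlong path components, and absorb any error from the probe. `hasOverlongComponent`, `indentationFor`, and the injected `probe` callback are hypothetical; the 255-byte NAME_MAX limit and the fall-back-to-default behaviour come from the notes.

```typescript
// NAME_MAX in bytes, as cited in the release note.
const NAME_MAX = 255;
const encoder = new TextEncoder();

// True when any path component exceeds NAME_MAX bytes in UTF-8.
function hasOverlongComponent(filePath: string): boolean {
  return filePath.split(/[\\/]/).some((part) => encoder.encode(part).length > NAME_MAX);
}

// probe stands in for the real .editorconfig lookup and may throw filesystem errors.
function indentationFor(
  filePath: string,
  defaultTabWidth: number,
  probe: (filePath: string) => number,
): number {
  if (hasOverlongComponent(filePath)) return defaultTabWidth; // skip the filesystem entirely
  try {
    return probe(filePath);
  } catch {
    // Best-effort discovery: any error falls back to the default.
    return defaultTabWidth;
  }
}

const garbage = "/tmp/" + "x".repeat(300) + "/file.ts";
console.log(indentationFor(garbage, 4, () => 2)); // 4
```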

What's Changed

  • fix(robomp): backfill partial-clone blobs before worktree add by @roboomp in #1820
  • fix(coding-agent): made slash-command handlers compatible with TypeScript 5.x by @roboomp in #1822
  • fix(coding-agent): sync pi-natives on omp update by @roboomp in #1825
  • fix(tools/ssh): render multiline remote commands in a framed body block by @roboomp in #1830
  • fix(mnemopi): populated memory_embeddings on remember and auto-derived queryEmbedding on recall by @roboomp in #1833
  • fix(ai): preserve kimi-k2.6 reasoning across tool calls by @roboomp in #1839
  • fix(robomp): serialize same-issue event claims by @roboomp in #1841
  • fix(tui): list marketplace plugins in settings panel by @roboomp in #1844
  • fix(ai): honor Anthropic cache-control compat for OpenAI-compatible providers by @roboomp in #1847
  • fix(providers): add Xiaomi Token Plan support by @roboomp in #1848
  • fix(providers): cap Kimi K2.x maxTokens on Fireworks/Fire Pass at 32,768 by @roboomp in #1852
  • fix(providers): enable append-only auto for xiaomi sgLang endpoints by @roboomp in #1854
  • fix(mcp): update completed tool status icons by @roboomp in #1856
  • fix(ai): drop image content for DashScope compatible-mode text-only Qwen by @roboomp in #1861
  • fix(coding-agent): restore Exa MCP fallback by @roboomp in #1862
  • fix(coding-agent): keep local plan writes off ACP bridge by @roboomp in #1864
  • fix(utils): swallow editorconfig probe errors to keep TUI rendering safe by @roboomp in #1873
  • fix(utils): tolerate malformed paths in editorconfig indentation lookup by @roboomp in #1874

Full Changelog: v15.9.0...v15.9.1
