diegosouzapw/OmniRoute v3.8.9 on GitHub

[3.8.9] — 2026-06-03

✨ New Features

Obsidian context source — 24 MCP tools (read:obsidian / write:obsidian) — search, read, write, and bidirectional sync against a local Obsidian vault via the Local REST API community plugin. Dashboard "Context Sources" tab, settings API, DB config. (#3077 — thanks @branben)
cursor: vision (image_url) input for the Cursor provider — OpenAI image parts are encoded as SelectedContext.selected_images[] in the agent.v1 protobuf, plus a tool-commit directive (lifts composer-2.5's tool-call rate), tool_choice none/required/specific handling, and response_format/max_tokens/stop output constraints surfaced to the agent. Hardened with SSRF + DNS-rebinding guards, a 1 MiB pre-decode cap, and a protobuf length-overrun check. (#3104 — thanks @payne0420)
deepseek-web: opt-in persistent session + rolling-window conversation memory (persistSession, historyWindow per-connection settings) and bidirectional tool-call translation — tool schemas are injected as a system prompt and <tool>{…}</tool> blocks in the reply are parsed back into OpenAI tool_calls (replacing the old hard 400). (#2942, #2820)
i18n: Turkish locale-aware search & sorting — a turkishText helper (normalizeForSearch, matchesSearch, compareTr) folds the dotted/dotless İ/ı correctly and uses Intl.Collator("tr"), wired across dashboard search/sort call-sites with an ESLint guard (warn) against raw toLowerCase().includes(). (#3115 — thanks @osrt91)
kiro: add Claude Opus 4.8 to the Kiro (AWS CodeWhisperer) model catalog — Kiro previously topped out at Opus 4.7 even though Opus 4.8 was already defined and served by the claude provider. (#3131 — thanks @artickc)
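
For context on the Turkish i18n entry above, here is a minimal sketch of what locale-aware search and sorting can look like. Only the names normalizeForSearch, matchesSearch, and compareTr come from these notes; the function bodies are assumptions for illustration and are not the code merged in #3115.

```typescript
// Hypothetical sketch: names are from the release notes, bodies are assumed.
const trCollator = new Intl.Collator("tr");

function normalizeForSearch(text: string): string {
  // Turkish-aware lowercasing maps "I" to dotless "ı" and "İ" to plain "i".
  return text.normalize("NFC").toLocaleLowerCase("tr");
}

function matchesSearch(haystack: string, needle: string): boolean {
  // Normalize both sides the same way before the substring test.
  return normalizeForSearch(haystack).includes(normalizeForSearch(needle));
}

function compareTr(a: string, b: string): number {
  // Turkish collation treats ç, ğ, ı, ö, ş, ü as letters in their own right.
  return trCollator.compare(a, b);
}

console.log(matchesSearch("İstanbul", "istanbul"));
console.log(["dal", "çay", "cam"].sort(compareTr));
```

The raw pattern the ESLint guard warns about fails here because "İstanbul".toLowerCase() produces an "i" followed by a combining dot (U+0307), so includes("istanbul") does not match.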

🔧 Bug Fixes

sse: stop 502'ing streaming requests when a "reasoning" openai-compatible upstream ignores stream:true and returns a complete application/json body — the streaming readiness check only recognized SSE data: frames, so such a JSON body (even with valid content/reasoning_content) produced a spurious STREAM_EARLY_EOF. OmniRoute now detects a non-SSE JSON upstream body on the streaming path and synthesizes an equivalent OpenAI SSE stream (synthesizeOpenAiSseFromJson), preserving content + reasoning_content. (#3089)
cache: serve semantic-cache hits as SSE for streaming clients — a cache hit returned application/json regardless of the stream flag, so OpenAI-compatible streaming clients lost reasoning_content (and got a non-stream body) on cached responses. Stream requests now SSE-wrap the cached completion. (#2952)
i18n: fill the missing Chinese (zh-CN) and Russian (ru) UI translations — both locales were missing 9 entire sections (quotaPlans, activity, agentBridge, trafficInspector, cliCommon, cliCode, cliAgents, acpAgents, agentSkills, ~823 keys each) added after the last translation sweep, so those buttons/labels rendered in English. Both catalogs are now at full key parity with en.json (8025 keys). (#3026, #3067)
dashboard: fix "Ambiguous model" error in the provider Playground for vendor-namespaced models — the Playground only prefixed models without a /, so ids like moonshotai/kimi-k2.6 or nvidia/zyphra/zamba2-7b-instruct (NVIDIA NIM) were sent bare and rejected when the same id exists under multiple providers. The Playground now always qualifies the selected model with its providerId/ prefix (without double-prefixing). (#3050)
db: stop accepting duplicate API keys for the same provider — createProviderConnection now dedups by the decrypted key value (not just by name), so re-adding the same key under a different/blank name updates the existing connection instead of inserting a second row. Whitespace-only differences also dedup. (#3023)
dashboard: "Import from /models" now works for no-auth providers (e.g. OpenCode Free) — the button used to silently no-op because no-auth providers have no connection row, so handleImportModels returned early and the models route 404'd. The route now serves the provider's model catalog when called with a no-auth provider id, and the dashboard falls back to the provider id when there is no connection. (#3047)
providers: forward Grok's paired sso-rw cookie for grok-web — both the executor and the connection validator now send sso=…; sso-rw=… (via the new buildGrokCookieHeader helper) when the pasted blob carries sso-rw, fixing the 403 "Request rejected by anti-bot rules" that Grok returns for sso alone. The add-account hint now asks for the full cookie line. (#3063)
providers: fix claude-web persistent 403 — execute() was calling the synchronous normalizeClaudeSessionCookie() which never injects cf_clearance; changed to async normalizeClaudeSessionCookieWithAutoRefresh() with allowAutoSolve:true. Also removes dead executor claude-web-auto-refresh.ts and correctly reclassifies duckduckgo-web and veoaifree-web as NOAUTH_PROVIDERS. (#3090 — thanks @oyi77)
autoCombo: rotate across all provider connections, never waste capacity — buildAutoCandidates now expands each provider into one candidate per active connection (e.g. 43 Cerebras keys → 43 candidates). Adds ScoreTierRotator with per-combo round-robin state, combo-name-aware tier preferences (smart/fast/cheap/coding), connectionDensity factor (weight 0.05), and budget-cap degradation using the rotator. (#3078 — thanks @oyi77)
providers: fix SiliconFlow model sync from configured endpoint — routes model discovery through providerSpecificData.baseUrl so CN (api.siliconflow.cn) vs Global endpoint selection is respected, and prevents /sync-models from treating source: "local_catalog" fallback responses as successful remote syncs. (#3094 — thanks @xz-dev)
resilience: a per-model subscription/permission 403 from a passthrough provider (e.g. Ollama Cloud deepseek-v4-pro → "this model requires a subscription") now locks out only that model instead of cooling down the whole connection — the free models on the same key keep serving, and repeated paid-model 403s no longer escalate a connection-wide backoff. Generalizes the grok-web 403 precedent to all hasPerModelQuota providers; terminal/credential 403s (banned/deactivated key) still deactivate the connection. (#3027)
cache: preserve client-side cache_control breakpoints for Xiaomi MiMo — added xiaomi-mimo to the prompt-caching provider allowlist so Claude Code (via cc-switch) cache hints are no longer stripped by the OpenAI-format translator, restoring cache hits. (#3088)
tools: keep opaque object schemas open — empty object schemas (and the web_search passthrough shim) now get additionalProperties: true so GPT-5.5/Codex stop pruning untyped nested payloads (e.g. SPLOX_EXECUTE_TOOL.args). (#3097 — thanks @nmime)
codex: preserve native Responses passthrough tools and history — tool_search and custom tools (e.g. apply_patch) survive normalizeCodexTools, and phase:"commentary" history items are kept, only on the native passthrough path (_nativeCodexPassthrough). (#3107 — thanks @yinaoxiong)
responses: resolve bare ChatGPT model ids (e.g. gpt-5.5) to codex/… on the /v1/responses HTTP fallback path, fixing the Codex CLI WS→HTTP fallback that was routing to a credential-less provider (#3113).
sse: bound the Antigravity 429 short-retry loop (per-URL MAX_AUTO_RETRIES guard — no more infinite loop on a persistent 429) and lock quota-exhausted accounts for the full "Resets in XhYmZs" window via model lockout. (#3122 — thanks @ahmet-cetinkaya)
image-gen: add an AbortController timeout to fetchImageEndpoint so a stuck image provider surfaces a 504 instead of hanging until the server timeout. (#3105 — thanks @mgarmash)
logs (perf): fix browser freeze and network saturation on /dashboard/logs — smaller page size, 15s polling, pause polling on a hidden tab / past the first page, and memoized derived lists. (#3109 — thanks @0xtbug)
cli: handle Windows .exe healthchecks with spaces in the path — direct executables skip the shell (so cmd.exe doesn't split C:\…\Name With Spaces\…\claude.exe) while .cmd/.bat wrappers still run through it. (#3111 — thanks @EmpRider)
cli: don't write STORAGE_ENCRYPTION_KEY to .env on informational commands — omniroute --version/--help no longer generate a key or create ~/.omniroute/.env; provisioning is scoped to commands that actually touch encrypted storage (#3129).
tests: remove a stale lowercase db-apikeys-crud.test.ts duplicate that collided with the canonical db-apiKeys-crud.test.ts on case-insensitive filesystems (no coverage lost). (#3125 — thanks @juandisay)
kimi: add a dedicated KimiExecutor so Kimi thinking-mode responses no longer drop reasoning_content — the reasoning stream is now surfaced instead of being lost. (#3132 — thanks @bypanghu)
handler: provide a connectionId fallback when it is undefined, fixing kilo (kilocode) calls that were silently not being written to call_logs. (#3130 — thanks @androw)
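
To illustrate the first sse fix in this section (#3089), the sketch below turns a complete chat-completion JSON body into OpenAI-style SSE frames, emitting reasoning_content and content as separate deltas. The function name synthesizeOpenAiSseFromJson is taken from these notes; the types, the frame layout, and the string return value are assumptions, and the real implementation may differ.

```typescript
// Hypothetical shapes; only the function name comes from the release notes.
interface ChatCompletion {
  id: string;
  model: string;
  created: number;
  choices: Array<{
    index: number;
    message: { role: string; content?: string | null; reasoning_content?: string | null };
    finish_reason: string | null;
  }>;
}

function synthesizeOpenAiSseFromJson(body: ChatCompletion): string {
  const frames: string[] = [];
  const base = { id: body.id, object: "chat.completion.chunk", created: body.created, model: body.model };

  // Serialize one chunk as a single SSE "data:" frame.
  const push = (index: number, delta: Record<string, unknown>, finishReason: string | null): void => {
    const chunk = { ...base, choices: [{ index, delta, finish_reason: finishReason }] };
    frames.push(`data: ${JSON.stringify(chunk)}\n\n`);
  };

  for (const choice of body.choices) {
    const { index, message } = choice;
    push(index, { role: message.role }, null);
    // Reasoning and answer text go out as separate deltas, each exactly once.
    if (message.reasoning_content) push(index, { reasoning_content: message.reasoning_content }, null);
    if (message.content) push(index, { content: message.content }, null);
    push(index, {}, choice.finish_reason ?? "stop");
  }

  frames.push("data: [DONE]\n\n");
  return frames.join("");
}
```

A streaming client then sees a role frame, an optional reasoning frame, an optional content frame, a finish frame, and the [DONE] terminator, which is the sequence it would expect from an upstream that honored stream:true.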

🔧 Build

build-output-isolation: unified standalone assembly into one shared assembleStandalone module; isolated build output into .build/ (intermediates, gitignored) and dist/ (shippable bundle, gitignored), replacing the old repo-root app/ and .next/ directories; dropped the duplicate next build that prepublish previously ran; added build:release script for a clean rebuild with a dist/BUILD_SHA HEAD sentinel that guards against deploying stale bundles. Operators using custom app/ paths: the published bundle directory on the VPS image (/usr/lib/node_modules/omniroute/app/) is unchanged — only the in-repo build output path moved. Update any local scripts that reference the repo-local app/ build output to dist/ instead.
build: re-apply the build-reorg follow-ups that landed after the main refactor merged: the serve CLI now falls back from dist/ to the legacy app/ location for upgrade safety, and the deploy skills now run pm2 stop before rsync --delete to avoid a transient Cannot find module ./chunks/… race (#3127).
build: fix the standalone static-asset path so the dashboard renders after the build-output reorg — assembleStandalone was copying static/ into <bundle>/.next/static, but the standalone server (built with distDir=.build/next) serves /_next/static from <bundle>/.build/next/static, so every JS/CSS chunk 404'd and the login UI rendered as a blank page. The static (and required-server-files.json / Turbopack chunk) destinations are now derived from the configured distDir instead of a hard-coded .next.
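
The static-asset fix above comes down to deriving copy destinations from the configured distDir rather than a literal .next. A minimal sketch, assuming a hypothetical helper named staticDestinations (the actual assembleStandalone code is not shown in these notes):

```typescript
import * as path from "node:path";

// Hypothetical helper: resolves where standalone assets should be copied,
// based on the distDir the server was built with (e.g. ".build/next").
function staticDestinations(bundleDir: string, distDir: string) {
  const distRoot = path.posix.join(bundleDir, distDir);
  return {
    // The standalone server serves /_next/static from <bundle>/<distDir>/static.
    staticDir: path.posix.join(distRoot, "static"),
    requiredServerFiles: path.posix.join(distRoot, "required-server-files.json"),
  };
}

console.log(staticDestinations("dist", ".build/next").staticDir);
```

With distDir set to .build/next this resolves to dist/.build/next/static, while passing .next reproduces the old layout, so one code path covers both configurations.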

📦 Dependencies

electron: bump to 42.3.2 (desktopCapturer crash fix, Chromium 148.0.7778.218, ThinLTO perf) (#3083)
electron-updater: bump to 6.8.8 (security: harden auto-update flow against path traversal and env var intercepts) (#3084)
electron-builder: bump to 26.14.0 (security hardening, pure-JS blockmap/icon migration) (#3082)
dev deps: bump eslint-config-next 16.2.7, lint-staged 17.0.7, typescript-eslint 8.60.1, vitest 4.1.8 (#3086)
prod deps: bump next 16.2.7, react/react-dom 19.2.7, tsx 4.22.4, ws 8.21.0, parse5 8.0.1, commander 15.0.0, and 15 other packages (#3085)

🙌 Contributors

Huge thanks to everyone whose work shipped in v3.8.9:

@branben (Obsidian context source), @oyi77 (claude-web 403 fix, autoCombo connection rotation), @xz-dev (SiliconFlow model sync), @nmime (open opaque tool schemas), @payne0420 (Cursor vision input), @mgarmash (image-gen fetch timeout), @yinaoxiong (Codex native passthrough tools/history), @0xtbug (logs page perf), @EmpRider (Windows CLI healthcheck paths), @ahmet-cetinkaya (Antigravity 429 retry bound + quota lockout), @juandisay (duplicate test cleanup), @osrt91 (Turkish locale-aware search & sorting), @artickc (Kiro Opus 4.8 catalog), @bypanghu (Kimi thinking-mode reasoning_content fix), and @androw (connectionId fallback + kilo call logging).

And thank you to the OmniRoute community for the bug reports, reproductions, and testing that drove these fixes. 🎉

What's Changed

fix(providers): claude-web 403 fix, no-auth providers misplaced in web-cookie by @oyi77 in #3090
fix(autoCombo): rotate across all connections, never waste provider capacity by @oyi77 in #3078
deps: bump electron-updater from 6.8.6 to 6.8.8 in /electron by @dependabot[bot] in #3084
deps: bump electron from 42.2.0 to 42.3.2 in /electron by @dependabot[bot] in #3083
feat(observability): add Obsidian context source with 24 MCP tools by @branben in #3077
deps: bump the development group with 5 updates by @dependabot[bot] in #3086
deps: bump the production group with 21 updates by @dependabot[bot] in #3085
fix(cache): preserve client cache_control for Xiaomi MiMo (#3088) by @diegosouzapw in #3093
Fix SiliconFlow model sync from configured endpoint by @xz-dev in #3094
fix(sse): scope ollama-cloud per-model 403 to model lockout, not connection cooldown (#3027) by @diegosouzapw in #3096
fix(providers): forward Grok sso-rw cookie to fix anti-bot 403 (#3063) by @diegosouzapw in #3098
fix(dashboard): make 'Import from /models' work for no-auth providers (#3047) by @diegosouzapw in #3099
fix(db): dedup duplicate API keys per provider on connection create (#3023) by @diegosouzapw in #3100
fix(dashboard): qualify vendor-namespaced Playground models with provider prefix (#3050) by @diegosouzapw in #3102
fix(i18n): fill missing zh-CN and ru UI translations (#3026, #3067) by @diegosouzapw in #3103
fix(sse): non-SSE JSON upstream on streaming path + SSE-wrap cache hits (#3089, #2952) by @diegosouzapw in #3108
fix(sse): emit reasoning/content as separate SSE deltas, no duplication (#3089 follow-up) by @diegosouzapw in #3112
refactor(build): isolate output to .build/+dist/, unify standalone assembly, drop 2nd build by @diegosouzapw in #3124
chore(build): re-apply build-reorg follow-ups (compat fallback + deploy docs) by @diegosouzapw in #3127
fix(responses): resolve bare ChatGPT model ids to codex on HTTP fallback path by @diegosouzapw in #3113
fix(codex): preserve native Responses passthrough tools and history by @yinaoxiong in #3107
fix: add AbortController timeout to fetchImageEndpoint by @mgarmash in #3105
feat(cursor): vision (image_url) input + tool-commit/output-constraint enhancements by @payne0420 in #3104
fix+feat(sse): per-model 403 lockout (#3027) + deepseek-web memory (#2942) & tool-calls (#2820) by @diegosouzapw in #3101
fix(tools): keep opaque object schemas open by @nmime in #3097
Remove duplicate lowercase db-apikeys-crud.test.ts from tracking by @juandisay in #3125
fix(sse): bound Antigravity 429 retry loop and lock quota-exhausted accounts for full reset window by @ahmet-cetinkaya in #3122
fix(cli): handle Windows exe healthchecks with spaces by @EmpRider in #3111
perf(logs): fix browser freeze and network saturation on /dashboard/logs by @0xtbug in #3109
fix(cli): don't provision STORAGE_ENCRYPTION_KEY on informational commands by @diegosouzapw in #3129
fix(i18n): Turkish locale-aware search and sorting by @osrt91 in #3115
fix(handler): provide fallback for connectionId when undefined by @androw in #3130
Fix: add KimiExecutor to fix missing reasoning_content in Kimi thinking mode by @bypanghu in #3132
feat(kiro): add Claude Opus 4.8 to the Kiro model catalog by @artickc in #3131
Release v3.8.9 by @diegosouzapw in #3092
Release v3.8.9 — final sync (static-asset fix + contributor credits) by @diegosouzapw in #3135

New Contributors

@yinaoxiong made their first contribution in #3107
@juandisay made their first contribution in #3125
@EmpRider made their first contribution in #3111
@osrt91 made their first contribution in #3115
@androw made their first contribution in #3130
@artickc made their first contribution in #3131

Full Changelog: v3.8.8...v3.8.9