✨ New Features
- feat(@omniroute/opencode-plugin): upstream-provider suffix in model display name — appends provider label to enriched names (e.g. `Claude Opus 4.7 · Claude` vs `Claude Opus 4.7 · Kiro`) so the OC TUI model picker can differentiate same-id models routed through different upstream connections. Default-on, opt-out via `features.providerTag: false`. (#2602 — thanks @mrmm)
- feat(@omniroute/opencode-plugin): provider-tag becomes a prefix + traffic-light compression emoji — provider label now prepends (`Claude - Claude Opus 4.7`) for better TUI column grouping, with smart abbreviation for long labels (`GitHub Models` → `GHM`). Compression pipelines render intensity as emoji (🟢🟡🟠🔴). (#2604 — thanks @mrmm)
- feat(providers): add 7 free-tier providers (Wave 1) — Arcee AI, InclusionAI, Krutrim, Liquid AI, MonsterAPI, Nomic, and Poolside now available as new API-key providers with provider icons, model specs, and full routing support. (#2479 — thanks @oyi77)
- feat(providers): add Astraflow provider support with global + China endpoints — new provider with dual-region base URLs for global and mainland China access. (#2486 — thanks @ucloudnb666)
- feat(providers): add `claude-web` provider — cookie-based Claude Web chat access without OAuth. (#2476 — thanks @oyi77)
- feat(providers): add 14 free-tier providers (Wave 1b) — 360AI, Baichuan, Baidu, ByteDance/Doubao, IDEO, Kuaishou/Kling, Kunlun/Skywork, SenseTime/SenseNova, Stepfun, Tencent HunYuan, Zhipu GLM, Replicate, RunPod, and Modal with provider icons, model specs, and routing support. (#2488 — thanks @oyi77)
- feat(hermes): add rich multi-role Hermes Agent CLI support — 7 configurable roles (default, delegation, vision, compression, web_extract, skills_hub, approval), per-role model selection with YAML config generation, dashboard card with preview, and home widget integration. (#2526 — thanks @apoapostolov)
- feat(cloud-agents): cloud agents UX overhaul — tabs (tasks/agents/settings), status filters, Material icons, duration formatting, cloud agent credentials and health API endpoints, memory stats endpoint. (#2516 — thanks @oyi77)
- feat(authz): manage-scope API keys may reach `/api/mcp/*` from non-loopback — Route Guard Tiers system (LOCAL_ONLY / ALWAYS_PROTECTED / MANAGEMENT), narrow carve-out for remote MCP access gated by `manage` scope; `/api/cli-tools/runtime/*` stays strict-loopback. Includes dashboard AuthzSection, inventory API, and comprehensive docs. (#2473 — thanks @mrmm)
- feat(home): home page customization for experienced users — pin Provider Quota to home, toggle Quick Start and Provider Topology visibility via Appearance settings. (#2531 — thanks @apoapostolov)
- feat(home): automatic refresh of Provider Quota — configurable interval (60s–600s) with toggle in Appearance settings; auto-refreshes pinned quota on the home page. (#2532 — thanks @apoapostolov)
- feat(@omniroute/opencode-plugin): OmniRoute OpenCode plugin — live models fetched from OmniRoute API, combo-aware model listing, Gemini request sanitization, multi-instance support, auth flow integration, and 10 test files. (#2529 — thanks @mrmm)
- feat(executors): forward OpenCode client headers to upstream providers — OpenCode-specific headers are now forwarded through the executor pipeline for improved compatibility. (#2538 — thanks @kang-heewon)
- feat(fireworks): add new models with `modelIdPrefix` support — generic registry field that stores short model IDs and prepends the full path prefix before upstream API calls. Adds 6 new Fireworks models, `modelsUrl` for dynamic sync, and Qwen3 reranker. (#2560 — thanks @HALDRO)
- feat(@omniroute/opencode-plugin): readable + filterable + offline-resilient model picker — `usableOnly` filter (only show providers with healthy connections), `diskCache` for offline hydration, `Combo:` prefix labeling, and compression metadata tags in combo display names. (#2572 — thanks @mrmm)
- feat(smart-pipeline): multi-stage pipeline for auto combo routing — rule-based + intent-classifier + domain-specific stages with configurable pipeline router, accuracy benchmarks, and comprehensive tests. (#2551 — thanks @oyi77)
- feat(ops): skip DB health check on startup via `OMNIROUTE_SKIP_DB_HEALTHCHECK=1` — replaces slow `integrity_check` (7+ min on large WAL) with `quick_check`, and adds env var to skip entirely. (#2554 — thanks @soyelmismo)
- refactor(dashboard): Provider Quota grouped layout with vertical rail — restructures the page to a 2-column per-provider layout (left rail with icon/name/status, right content with dynamic per-provider columns), new `providerColumns.ts` / `ProviderGroup.tsx` / `AccountRow.tsx` components, env chip-filter row, bulk-refresh per group, and inline expanded panels. (#2528 — thanks @Gi99lin)
- feat(providers): add 26 free-tier providers missing from registry — Novita, Avian, Chutes, Kluster, Targon, Nineteen, Celery, Ditto, Atoma, and more. (#2590 — thanks @oyi77)
- feat(providers): add api-airforce free provider with 55 models. (#2587 — thanks @oyi77)
- feat(dashboard): configurable sidebar — presets, drag-and-drop ordering, smart-grouping, and new Settings → Sidebar page. (#2581 — thanks @Gi99lin)
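The `modelIdPrefix` entry above (#2560) describes a registry field that stores short model IDs and prepends a path prefix before upstream calls. A minimal sketch of that idea follows. Only the field name `modelIdPrefix` comes from these notes; the `ProviderRegistryEntry` shape, the `toUpstreamModelId` helper, the prefix value, and the model name are hypothetical and may not match the actual implementation.

```typescript
// Hypothetical registry shape; only the `modelIdPrefix` field name comes from the notes.
interface ProviderRegistryEntry {
  id: string;
  modelIdPrefix?: string;
  models: string[]; // short model IDs as stored in the registry
}

// Build the model ID sent upstream by prepending the provider's prefix, if any.
function toUpstreamModelId(entry: ProviderRegistryEntry, shortId: string): string {
  const prefix = entry.modelIdPrefix ?? "";
  // Leave IDs that already carry the prefix untouched to avoid double-prefixing.
  if (prefix === "" || shortId.startsWith(prefix)) {
    return shortId;
  }
  return prefix + shortId;
}

// Illustrative entry: the prefix value and model name are assumptions.
const fireworksEntry: ProviderRegistryEntry = {
  id: "fireworks",
  modelIdPrefix: "accounts/fireworks/models/",
  models: ["example-model"],
};

const upstreamId = toUpstreamModelId(fireworksEntry, "example-model");
console.log(upstreamId);
```

Keeping the short ID in the registry and expanding it only at call time means display names and user-facing model strings stay compact, while the upstream request still carries the full path.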
🔧 Bug Fixes
- fix(validation): stop appending a second `/models` when the Gemini base URL already ends in `/models` — Google AI Studio connections using the default base URL were validating against `.../v1beta/models/models` and failing with `404` for every connection. (#2545)
- fix(cloudflare-ai): flatten OpenAI content-part arrays to plain strings for the Workers AI (`cf/`) executor — Workers AI's `/ai/v1/chat/completions` rejects `content: [{type:"text",...}]` with HTTP 400, so requests with array content now have their text parts joined into a string. (#2539)
- fix(i18n): replace leftover Portuguese strings in the English source with English on the Quota dashboards — the quota-share Beta notice (`betaConfigSaved*`) and the Provider Quota row's `Edit cutoffs` / `Refresh now` fallbacks were showing Portuguese. (#2540)
- fix(proxy): honor the legacy per-provider/global proxy config in `resolveProxyForProvider` — the Claude OAuth token exchange and token refresh only consulted the new proxy registry, so a proxy configured the legacy way (`/api/settings/proxy?level=provider`) was ignored and the exchange went out directly from the host, tripping Anthropic's IP `rate_limit_error` on VPS deployments. It now falls back to the legacy config, mirroring `resolveProxyForConnection`. (#2456)
- fix(antigravity): auto-discover a missing Cloud Code `projectId` via `loadCodeAssist` before failing — a freshly re-added Antigravity account whose stored `projectId` was empty (OAuth-time discovery returned nothing) now recovers the project on the first request instead of returning `422 Missing Google projectId`, mirroring the `gemini-cli` bootstrap. (#2334, #2541)
- fix(stream): keep the `/v1/responses` SSE connection warm for strict clients — emit an early keepalive while the upstream produces its first token and lower the heartbeat cadence to 4s, so Codex CLI's `reqwest` client (≈5s idle-read timeout) no longer drops the stream "before completion" on slow/reasoning models. `curl` was unaffected because it has no idle timeout. (#2544)
- fix(electron): wait longer for the server on first launch and reload once it responds — long post-upgrade DB migrations could exceed the 30s readiness probe, leaving the desktop app stuck on the "Server starting" screen even though the backend was healthy. The probe now targets the auth-exempt health endpoint with a generous timeout and reloads the window once the server comes up. (#2460)
- fix(cli): mark `bin/omniroute.mjs` as executable (mode 755) so the globally-installed CLI runs directly without a manual `chmod +x`. (#2469 — thanks @disonjer)
- fix(settings): restore the Global System Prompt into the in-memory config on server startup and after JSON/SQLite import — it was only loaded by the PUT endpoint, so the toggle/prompt silently reverted to defaults after any restart or import. (#2470 — thanks @disonjer)
- fix(settings): append the Global System Prompt after existing system content instead of prepending it, so provider/agent instructions (Kiro, OpenCode, Hermes, …) injected into the system message no longer override the user's global prompt via recency bias. (#2468 — thanks @disonjer)
- fix(kiro): refresh imported social tokens (`authMethod === "imported"`) via the Kiro social-auth endpoint instead of AWS SSO OIDC — imported tokens carry a registered `clientId`/`clientSecret` but a social-issued refresh token the OIDC client cannot refresh, so auto-refresh was failing with "provider returned no new token". (#2467 — thanks @disonjer)
- fix(antigravity): resolve the Cloud Code `projectId` from `providerSpecificData` as a fallback (and preserve it across token refresh) so the Gemini `/v1beta` streaming path stops returning a spurious `422 Missing Google projectId` for connections that store the project there. (#2480)
- fix(api): `GET /v1beta/models` now lists only models whose provider has an active/validated connection, matching the OpenAI-format `/v1/models` behavior, instead of returning the entire catalog. (#2483)
- fix(cli): persist `STORAGE_ENCRYPTION_KEY` into `DATA_DIR` (not only `~/.omniroute`) and refuse to auto-generate a fresh key when a `storage.sqlite` already exists — a new key cannot decrypt previously-encrypted credentials, so silently regenerating it locked users out of their database. The CLI now mirrors the server `bootstrapEnv` guard. (reported by Daniel Nach; original key persistence by @Chewji9875 — follow-up to #1622)
- fix(gemini): preserve and re-attach the `thoughtSignature` on Gemini thinking-model tool calls — thread the signature namespace through the `FORMATS.GEMINI` and `FORMATS.GEMINI_CLI` request translators so the cached signature (keyed by connection + tool-call id) is found on the follow-up turn. Fixes `[400]: Function call is missing a thought_signature in functionCall parts` on agentic Gemini tool use. (#2504)
- fix(translator): accept PDFs sent in the Responses-API `input_file` shape on the Gemini path, and the Gemini-style `document` shape on the Responses/Codex path — content parts are now normalized across `input_file` / `file` / `document` so a PDF reaches the model regardless of which field name the client used. (#2515)
- fix(stream): count `thinking` arrays and `reasoning_details` as useful stream output — a reasoning-only response (e.g. Mistral/StepFun with a low `max_tokens`) was misclassified as "Stream ended before producing useful content" and turned into a spurious 502; it is now recognized as valid output. (#2520)
- fix(claude): extract system/developer role messages in Claude Code semantic passthrough paths — moves `role:"system"` / `role:"developer"` messages from the `messages[]` array to the top-level `system` parameter before sending to Anthropic, which rejects them inside messages. Fixes memory injection context being silently dropped. (#2497 — thanks @unitythemaker)
- fix(vision-bridge): auto-route non-standard provider models through OmniRoute self-loop — vision-bridge now detects when a model doesn't natively support vision and automatically re-routes the image through OmniRoute's own endpoint for format translation. (#2487 — thanks @herjarsa)
- fix(mitm): add IPv6 DNS redirect, modular antigravity target, improved logging — MITM DNS handler now correctly redirects IPv6 (AAAA) queries alongside IPv4, adds a dedicated `antigravity.ts` target module, and enhances DNS/TLS logging for debugging. (#2514 — thanks @herjarsa)
- fix(usage): improve Claude and MiniMax plan label detection — better tier name resolution for Claude OAuth usage (tier/plan/subscription_type/org fields) and new MiniMax plan label inference from quota totals. (#2498 — thanks @Gi99lin)
- fix(codex): fan out image `n` requests in parallel — when Codex requests `n > 1` images, the image-generation handler now dispatches them concurrently instead of sequentially, significantly reducing total latency. (#2499 — thanks @nmime)
- fix(embeddings): strip stale `Content-Encoding` headers from upstream response — prevents clients from receiving gzip-encoded responses with `identity` encoding declared, which caused silent data corruption. (#2477 — thanks @lordavadon2)
- fix(model): return clear error instead of silent OpenAI default for unrecognized models — previously, an unrecognized model silently fell back to OpenAI; now returns a 404 with a descriptive message listing known providers. (#2492 — thanks @herjarsa)
- fix(dark-mode): correct background token on Compression Override select — the combo compression override `<select>` was using a hard-coded white background that was invisible in dark mode. (#2513 — thanks @apoapostolov)
- fix(antigravity): align subscription tier detection with Antigravity Manager — `extractCodeAssistSubscriptionTier` now parses the correct nested field from the `loadCodeAssist` response, and a new `extractCodeAssistOnboardTierId` fallback handles the onboarding flow. Subscription info is cached per access-token with 5-min TTL. (#2496 — thanks @Gi99lin)
- fix(opencode-zen): add `opencode` provider alias and sync model list with live API — `opencode-zen` and `opencode-go` are now also reachable via the shorter `opencode` alias, and the default model list is kept in sync with the live `/v1/models` catalog. (#2508 — thanks @herjarsa)
- fix(combo): clarify log message when combo target is skipped due to unavailable credentials — previously logged a misleading "provider not found" message; now says "skipped: credentials unavailable". (#2494 — thanks @herjarsa)
- fix(security): replace `Math.random` with `crypto.randomUUID` in `generateTaskId`/`ActivityId` and fix URL hostname check in test — eliminates weak PRNG usage flagged by CodeQL. (#2489)
- fix(electron): downgrade to Electron 41.x for better-sqlite3 V8 compatibility — Electron 42.x shipped a V8 version that broke `better-sqlite3` native bindings at runtime; pinning to 41.x restores stability.
- fix(@omniroute/opencode-provider): include `limit.context` in model entries for OpenCode context window detection — OpenCode reads `limit.context` to determine usable context length for compaction and overflow detection.
- fix(providers): make `gitlawb/gitlawb-gmi` model entry optional — prevents provider initialization failure when the model is not available in the catalog. (#2476 — thanks @oyi77)
- fix(translator): inject `omniroute_web_search` in the Responses-API flat tool shape (`{ type, name }`) when the target provider speaks the Responses API — previously it was always emitted in the Chat Completions nested shape, so Codex/relay upstreams rejected the request. (#2390)
- fix(kiro): serialize non-string `role:"tool"` message content before sending to CodeWhisperer — structured/array tool output was collapsing to `content:[{ text: "" }]`, which Kiro rejects with `400 Improperly formed request`. (#2446)
- fix(claude): gate the heavy-agent beta headers (`context-1m`, `effort`, `advanced-tool-use`) on Opus/Sonnet only — Haiku with OAuth was receiving `context-1m` and rejecting it with 400. Also sanitizes historical `thinking` block signatures in passthrough. (#2454 — thanks @havockdev)
- fix(perplexity-web): route requests through a Firefox-148 TLS-impersonating client so Perplexity's Cloudflare edge stops rejecting VPS/datacenter IPs with a 403 challenge. (#2459 — thanks @havockdev)
- fix(validation): guard `apiKey`/`modelsUrl` against non-string values before calling `.startsWith()`/`.trim()` in the provider connection-test path. (#2463)
- fix(cost): prevent double-billing of `cache_creation_input_tokens` — `prompt_tokens` from token extractors already includes both `cache_read` and `cache_creation`, so `nonCachedInput` now subtracts both cache types to avoid pricing cache at the full input rate. (#2522 — thanks @herjarsa)
- fix(handler): always normalize system role messages in Claude passthrough paths — `normalizeClaudeUpstreamMessages()` is now called unconditionally in both `compatibleBridge` and pure passthrough, ensuring `role:"system"` messages are always extracted to the top-level `system` parameter. (#2519 — thanks @herjarsa)
- fix(handler): capture Gemini `thought_signature` in non-streaming response path — the non-streaming translator now captures `thoughtSignature` from Gemini thinking model parts and persists them so follow-up turns can resolve them correctly. (#2518 — thanks @herjarsa)
- fix(kiro): replace broken social OAuth with device flow — rewrites Kiro's Google/GitHub social login from the broken PKCE `kiro://` custom protocol to AWS Cognito device flow, which works correctly in web/proxy environments. (#2524 — thanks @disonjer)
- fix(providers): resolve `opencode/` → `opencode-zen` slug mismatch + add 40+ new models — `opencode` is now a proper alias for `opencode-zen` in executor, model resolver, and provider registry; adds GPT 5.x, Claude 4.x, Gemini 3.x, Grok, Kimi, and other models with tests. (#2517 — thanks @herjarsa)
- fix(antigravity): fail over stalled Antigravity sessions — new `ANTIGRAVITY_PRE_RESPONSE_TIMEOUT_CODE` shared constant for pre-response timeout detection, automatic failover to next account when session stalls before headers arrive. Node.js engine range relaxed to `>=20.20.2`. (#2464 — thanks @dhaern)
- fix(deepseek-web): fix SSE parser, prompt format, and error handling — handles all 3 DeepSeek SSE stream formats (initial fragments, APPEND operations, bare string tokens), simplifies prompt to single-turn to prevent chat marker leakage, and checks `json.code` before token extraction. (#2502 — thanks @ovehbe)
- fix(codex): accept `auth.json` without `auth_mode` field on import — Codex CLI no longer writes `auth_mode`; import now accepts both formats as long as required tokens are present. Semantic cache read now requires explicit `temperature: 0`. (#2536 — thanks @janeza2)
- fix(freetheai): add `/chat/completions` to baseUrl to resolve 404 errors. (#2557 — thanks @lordavadon2)
- fix(qoder): route PAT tokens to Qoder native API instead of DashScope — detects `pt-` prefixed tokens and routes to `api.qoder.com` with proper User-Agent header. (#2559 — thanks @herjarsa)
- fix(perf): cache compiled RegExp in RTK compression hot path — eliminates thousands of redundant `new RegExp()` instantiations per second. (#2553 — thanks @soyelmismo)
- fix(reasoning-cache): auto-start periodic cleanup on module load — the `server-init.ts` job was never imported (dead code), causing the `reasoning_cache` table to grow indefinitely. Now runs 30-min cleanup cycles automatically. (#2552 — thanks @soyelmismo)
- fix(claude): omit `context-1m` beta for Sonnet — restrict to Opus-only to avoid long-context credit gate errors. Add `afk-mode-2026-01-31`, replace `redact-thinking` with `thinking-token-count-2026-05-13`. (#2568 — thanks @unitythemaker)
- fix(codex): relax `auth_mode` check in frontend import preview — accept `undefined`/`null`/`"chatgpt"` instead of requiring `"chatgpt"` strictly, matching the backend fix in #2536. (#2567 — thanks @janeza2)
- fix(kimi): declare vision capability for Kimi K2.6 in all 4 layers — `providerRegistry`, `modelSpecs`, `catalog.ts` keyword list, and Playground `VISION_MODELS`; previously the model silently rejected image uploads. (#2573 — thanks @herjarsa)
- fix(dashboard): paginate request-log viewer beyond 300 rows — `getCallLogs` now accepts `offset` with parameterized SQL (eliminates string-interpolated `LIMIT`); `RequestLoggerV2` grows its window via "Load more" + IntersectionObserver infinite scroll, resetting on filter change. (#2576)
- fix(cli): use `/api/monitoring/health` for server readiness check — `waitForServer()` was polling the auth-protected `/api/health` (401), causing `omniroute serve` to hang indefinitely. (#2578 — thanks @amogus22877769)
- fix(combo): detect invalid model errors via structured error codes + regex fallback — when a combo target rejects a model (e.g. free account vs Pro), the router now recognizes `model_not_found`/`deployment_not_found` codes and 6 regex patterns, and falls through to the next target instead of stopping the loop. (#2534 — thanks @HALDRO)
- fix(security): post-review hardening batch — `spawnSync` arg-array replaces `execSync` string-template (command injection), CSP `unsafe-eval` gated on `!app.isPackaged`, `requireManagementAuth` guard on budget/bulk and resilience/reset endpoints, error messages sanitized in gemini-web/claude-web/copilot-web/oauth/agents catch blocks, circuit breaker persists `lastFailureKind`, and combo resets `exhaustedProviders` per set-retry iteration. (#2435)
- fix(@omniroute/opencode-plugin): honor `geminiSanitization` and `fetchInterceptor` feature flags — both were applied unconditionally; now each fetch layer is gated by its flag (default ON), and disabling both falls back to plain SDK fetch. (#2546)
- fix(#2575): check DB feature flag override in `arePrivateProviderUrlsAllowed()` — supports runtime toggle without restart. (#2595 — thanks @herjarsa)
- fix(mimo): add `supportsVision` flag to MiMo-V2.5, V2.5-Pro, and V2-Omni — previously image uploads were silently rejected. (#2592 — thanks @herjarsa)
- fix(ops): propagate `OMNIROUTE_SKIP_DB_HEALTHCHECK` env var to periodic DB health check scheduler — companion fix to #2554. (#2591 — thanks @soyelmismo)
- fix(github): remove incorrect `openai-responses` targetFormat from GitHub Copilot's Haiku/Sonnet models. (#2583 — thanks @oyi77)
- fix(copilot): stabilize responses configuration — removes 865 lines of unstable config, simplifies handler. (#2579 — thanks @ivan-mezentsev)
- fix(#2544): add SSE heartbeat keepalive to Responses API transform stream — prevents Codex CLI 0.130.0 from disconnecting during long thinking/reasoning phases. (#2599 — thanks @herjarsa)
- fix(memory): extract system role messages in semantic passthrough path to prevent 400 on memory injection — system messages were being passed as-is to providers that reject mixed roles. (#2474 — thanks @Tentoxa)
- fix(@omniroute/opencode-provider): include `limit.context` in model entries for OpenCode context window detection — previously OpenCode couldn't determine model context size. (#2482 — thanks @herjarsa)
- fix(mimo): add `supportsVision` flag to Kimi K2.6 in providerRegistry + comprehensive vision tests for MiMo V2.5/V2.5-Pro/V2-Omni. (#2600 — thanks @herjarsa)
- fix(proxy): prefer scoped proxies over registry global fallback — legacy provider-specific proxy was being shadowed by a registry-global fallback across both storage backends. Resolution now follows strict specificity: account → provider → combo → global. (#2606 — thanks @terence71-glitch)
- fix(@omniroute/opencode-plugin): canonical-twin dedup + alias-fallback enrichment — `/v1/models` returned the same model under both alias (`cc/claude-opus-4-7`) and canonical (`claude/claude-opus-4-7`) names; now drops ~75 canonical duplicates and rescues ~88 raw-id rows with proper provider prefix via alias-index fallback. Also emits `cost`, `release_date`, `modalities` fields in static catalog and raises provider label threshold to 12 chars (preserves `AssemblyAI`, `Antigravity` verbatim). (#2607 — thanks @mrmm)
- fix(registry): populate empty models arrays for HuggingFace (6 models) and HackClub (3 models) + fix Snowflake placeholder baseUrl to `{account}` template pattern. (#2611 — thanks @oyi77)
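The cost fix in this list (#2522) comes down to one subtraction: since `prompt_tokens` already counts both cache reads and cache writes, both have to be removed before pricing the remainder at the full input rate. A minimal sketch of that arithmetic follows. The `TokenUsage` shape, the `cache_read_input_tokens` field name, and the `nonCachedInputTokens` helper are assumptions for illustration; only `prompt_tokens`, `cache_creation_input_tokens`, and the `nonCachedInput` idea appear in the notes.

```typescript
// Hypothetical usage shape; field names other than `prompt_tokens` and
// `cache_creation_input_tokens` are assumed for this sketch.
interface TokenUsage {
  prompt_tokens: number; // assumed to already include cached tokens
  cache_read_input_tokens?: number;
  cache_creation_input_tokens?: number;
}

// Tokens to bill at the full input rate: the prompt minus both cache categories.
function nonCachedInputTokens(usage: TokenUsage): number {
  const cacheRead = usage.cache_read_input_tokens ?? 0;
  const cacheCreation = usage.cache_creation_input_tokens ?? 0;
  // Clamp at zero in case an upstream reports inconsistent counts.
  return Math.max(0, usage.prompt_tokens - cacheRead - cacheCreation);
}

const sampleUsage: TokenUsage = {
  prompt_tokens: 1000,
  cache_read_input_tokens: 600,
  cache_creation_input_tokens: 300,
};

// 1000 - 600 - 300 = 100 tokens billed at the full input rate.
const billedAtInputRate = nonCachedInputTokens(sampleUsage);
console.log(billedAtInputRate);
```

Subtracting only the cache-read count would leave the 300 cache-creation tokens in the full-rate bucket while they are also priced at the cache-write rate, which is the double-billing the entry describes.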
🌐 Internationalization
- i18n(zh-CN): translate 830 missing UI strings — replaces all `__MISSING__:` placeholders with proper Chinese translations. (#2523 — thanks @InkshadeWoods)
- i18n(dashboard): add missing dashboard keys and fix EN fallbacks — hundreds of hardcoded English strings across cache, caveman, costs, skills, memory, and evals pages replaced with `t()` calls. (#2500 — thanks @Gi99lin)
- i18n(pt-BR): complete and fix Brazilian Portuguese translation — comprehensive overhaul of pt-BR locale with ~3000 lines of quality translations, filling all missing keys and correcting existing entries. (#2543 — thanks @alltomatos)
- i18n(ru): comprehensive Russian translation update — ~2000 lines of corrected and filled translations. (#2550 — thanks @AgentAlexAI)
- i18n(all): comprehensive localization and UI refactoring — 42 locale files synchronized with missing keys, cloud-agents page i18n rewrite, and consistent `t()` usage across 21 dashboard components. (#2580 — thanks @alltomatos)
- i18n(all): translate freeTier provider strings across 41 locales — replaces `__MISSING__:Free Tier Providers` placeholders with proper translations in both `common` and `providers` namespaces. (#2609 — thanks @leninejunior)
- i18n(pt-BR): eliminate all 1270 remaining `__MISSING__` markers — completes pt-BR translation across 41 namespaces to true 100% coverage. (#2610 — thanks @leninejunior)
📝 Maintenance
- chore: remove Akamai VPS deploy from release workflow and skills.
- chore(deps): bump `actions/setup-node` from v4 to v6 + `randomBytes` security fix for cloud agent task IDs. (#2589)
- chore(deps): bump `actions/upload-artifact` from v4 to v7. (#2588)
- chore: ignore `.claude/worktrees` from git tracking.
- chore(ci): auto-lock release branch on version publish — new CI workflow applies `lock_branch` protection when a GitHub Release is published. (#2542)
- docs: redesign README — marketing-first layout with accurate provider counts. (#2490)
What's Changed
- fix(security): post-review hardening batch — command injection, CSP, … by @diegosouzapw in #2435
- Release v3.8.1 by @diegosouzapw in #2441
- fix(electron): downgrade to Electron 41.x + remove Akamai deploy by @diegosouzapw in #2462
- Release v3.8.2 by @diegosouzapw in #2503
Full Changelog: v3.8.0...v3.8.2