github diegosouzapw/OmniRoute v3.8.2

5 hours ago

✨ New Features

  • feat(@omniroute/opencode-plugin): upstream-provider suffix in model display name — appends provider label to enriched names (e.g. Claude Opus 4.7 · Claude vs Claude Opus 4.7 · Kiro) so the OC TUI model picker can differentiate same-id models routed through different upstream connections. Default-on, opt-out via features.providerTag: false. (#2602 — thanks @mrmm)
  • feat(@omniroute/opencode-plugin): provider-tag becomes a prefix + traffic-light compression emoji — provider label now prepends (Claude - Claude Opus 4.7) for better TUI column grouping, with smart abbreviation for long labels (GitHub Models → GHM). Compression pipelines render intensity as emoji (🟢🟡🟠🔴). (#2604 — thanks @mrmm)
  • feat(providers): add 7 free-tier providers (Wave 1) — Arcee AI, InclusionAI, Krutrim, Liquid AI, MonsterAPI, Nomic, and Poolside now available as new API-key providers with provider icons, model specs, and full routing support. (#2479 — thanks @oyi77)
  • feat(providers): add Astraflow provider support with global + China endpoints — new provider with dual-region base URLs for global and mainland China access. (#2486 — thanks @ucloudnb666)
  • feat(providers): add claude-web provider — cookie-based Claude Web chat access without OAuth. (#2476 — thanks @oyi77)
  • feat(providers): add 14 free-tier providers (Wave 1b) — 360AI, Baichuan, Baidu, ByteDance/Doubao, IDEO, Kuaishou/Kling, Kunlun/Skywork, SenseTime/SenseNova, Stepfun, Tencent HunYuan, Zhipu GLM, Replicate, RunPod, and Modal with provider icons, model specs, and routing support. (#2488 — thanks @oyi77)
  • feat(hermes): add rich multi-role Hermes Agent CLI support — 7 configurable roles (default, delegation, vision, compression, web_extract, skills_hub, approval), per-role model selection with YAML config generation, dashboard card with preview, and home widget integration. (#2526 — thanks @apoapostolov)
  • feat(cloud-agents): cloud agents UX overhaul — tabs (tasks/agents/settings), status filters, Material icons, duration formatting, cloud agent credentials and health API endpoints, memory stats endpoint. (#2516 — thanks @oyi77)
  • feat(authz): manage-scope API keys may reach /api/mcp/* from non-loopback — Route Guard Tiers system (LOCAL_ONLY / ALWAYS_PROTECTED / MANAGEMENT), narrow carve-out for remote MCP access gated by manage scope; /api/cli-tools/runtime/* stays strict-loopback. Includes dashboard AuthzSection, inventory API, and comprehensive docs. (#2473 — thanks @mrmm)
  • feat(home): home page customization for experienced users — pin Provider Quota to home, toggle Quick Start and Provider Topology visibility via Appearance settings. (#2531 — thanks @apoapostolov)
  • feat(home): automatic refresh of Provider Quota — configurable interval (60s–600s) with toggle in Appearance settings; auto-refreshes pinned quota on the home page. (#2532 — thanks @apoapostolov)
  • feat(@omniroute/opencode-plugin): OmniRoute OpenCode plugin — live models fetched from OmniRoute API, combo-aware model listing, Gemini request sanitization, multi-instance support, auth flow integration, and 10 test files. (#2529 — thanks @mrmm)
  • feat(executors): forward OpenCode client headers to upstream providers — OpenCode-specific headers are now forwarded through the executor pipeline for improved compatibility. (#2538 — thanks @kang-heewon)
  • feat(fireworks): add new models with modelIdPrefix support — generic registry field that stores short model IDs and prepends the full path prefix before upstream API calls. Adds 6 new Fireworks models, modelsUrl for dynamic sync, and Qwen3 reranker. (#2560 — thanks @HALDRO)
  • feat(@omniroute/opencode-plugin): readable + filterable + offline-resilient model picker — usableOnly filter (only show providers with healthy connections), diskCache for offline hydration, Combo: prefix labeling, and compression metadata tags in combo display names. (#2572 — thanks @mrmm)
  • feat(smart-pipeline): multi-stage pipeline for auto combo routing — rule-based + intent-classifier + domain-specific stages with configurable pipeline router, accuracy benchmarks, and comprehensive tests. (#2551 — thanks @oyi77)
  • feat(ops): skip DB health check on startup via OMNIROUTE_SKIP_DB_HEALTHCHECK=1 — replaces slow integrity_check (7+ min on large WAL) with quick_check, and adds env var to skip entirely. (#2554 — thanks @soyelmismo)
  • refactor(dashboard): Provider Quota grouped layout with vertical rail — restructures the page to a 2-column per-provider layout (left rail with icon/name/status, right content with dynamic per-provider columns), new providerColumns.ts / ProviderGroup.tsx / AccountRow.tsx components, env chip-filter row, bulk-refresh per group, and inline expanded panels. (#2528 — thanks @Gi99lin)
  • feat(providers): add 26 free-tier providers missing from registry — Novita, Avian, Chutes, Kluster, Targon, Nineteen, Celery, Ditto, Atoma, and more. (#2590 — thanks @oyi77)
  • feat(providers): add api-airforce free provider with 55 models. (#2587 — thanks @oyi77)
  • feat(dashboard): configurable sidebar — presets, drag-and-drop ordering, smart-grouping, and new Settings → Sidebar page. (#2581 — thanks @Gi99lin)
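
The modelIdPrefix field described in the Fireworks entry above can be pictured roughly as follows. This is a minimal sketch, not OmniRoute's actual registry code: the RegistryEntry type, the toUpstreamModelId helper, and the example prefix and model ID are all assumptions made for illustration; only the modelIdPrefix field name comes from the release notes.

```typescript
// Hypothetical shape of a provider registry entry. Only `modelIdPrefix`
// is named in the release notes; the other fields are illustrative.
interface RegistryEntry {
  id: string;
  modelIdPrefix?: string;
  models: string[]; // short model IDs as stored in the registry
}

// Prepend the provider's path prefix to a short model ID before the
// upstream call. IDs that already carry the prefix are left unchanged.
function toUpstreamModelId(entry: RegistryEntry, shortId: string): string {
  const prefix = entry.modelIdPrefix ?? "";
  if (prefix === "" || shortId.startsWith(prefix)) {
    return shortId;
  }
  return prefix + shortId;
}

// Example values are assumptions, not taken from the release.
const fireworks: RegistryEntry = {
  id: "fireworks",
  modelIdPrefix: "accounts/fireworks/models/",
  models: ["example-model"],
};

const upstreamId = toUpstreamModelId(fireworks, "example-model");
console.log(upstreamId);
```

Storing short IDs keeps the registry readable, while the prefix is applied in one place just before the request leaves for the upstream API.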

🔧 Bug Fixes

  • fix(validation): stop appending a second /models when the Gemini base URL already ends in /models — Google AI Studio connections using the default base URL were validating against .../v1beta/models/models and failing with 404 for every connection. (#2545)

  • fix(cloudflare-ai): flatten OpenAI content-part arrays to plain strings for the Workers AI (cf/) executor — Workers AI's /ai/v1/chat/completions rejects content: [{type:"text",...}] with HTTP 400, so requests with array content now have their text parts joined into a string. (#2539)

  • fix(i18n): replace leftover Portuguese strings in the English source with English on the Quota dashboards — the quota-share Beta notice (betaConfigSaved*) and the Provider Quota row's Edit cutoffs / Refresh now fallbacks were showing Portuguese. (#2540)

  • fix(proxy): honor the legacy per-provider/global proxy config in resolveProxyForProvider — the Claude OAuth token exchange and token refresh only consulted the new proxy registry, so a proxy configured the legacy way (/api/settings/proxy?level=provider) was ignored and the exchange went out directly from the host, tripping Anthropic's IP rate_limit_error on VPS deployments. It now falls back to the legacy config, mirroring resolveProxyForConnection. (#2456)

  • fix(antigravity): auto-discover a missing Cloud Code projectId via loadCodeAssist before failing — a freshly re-added Antigravity account whose stored projectId was empty (OAuth-time discovery returned nothing) now recovers the project on the first request instead of returning 422 Missing Google projectId, mirroring the gemini-cli bootstrap. (#2334, #2541)

  • fix(stream): keep the /v1/responses SSE connection warm for strict clients — emit an early keepalive while the upstream produces its first token and lower the heartbeat cadence to 4s, so Codex CLI's reqwest client (≈5s idle-read timeout) no longer drops the stream "before completion" on slow/reasoning models. curl was unaffected because it has no idle timeout. (#2544)

  • fix(electron): wait longer for the server on first launch and reload once it responds — long post-upgrade DB migrations could exceed the 30s readiness probe, leaving the desktop app stuck on the "Server starting" screen even though the backend was healthy. The probe now targets the auth-exempt health endpoint with a generous timeout and reloads the window once the server comes up. (#2460)

  • fix(cli): mark bin/omniroute.mjs as executable (mode 755) so the globally-installed CLI runs directly without a manual chmod +x. (#2469 — thanks @disonjer)

  • fix(settings): restore the Global System Prompt into the in-memory config on server startup and after JSON/SQLite import — it was only loaded by the PUT endpoint, so the toggle/prompt silently reverted to defaults after any restart or import. (#2470 — thanks @disonjer)

  • fix(settings): append the Global System Prompt after existing system content instead of prepending it, so provider/agent instructions (Kiro, OpenCode, Hermes, …) injected into the system message no longer override the user's global prompt via recency bias. (#2468 — thanks @disonjer)

  • fix(kiro): refresh imported social tokens (authMethod === "imported") via the Kiro social-auth endpoint instead of AWS SSO OIDC — imported tokens carry a registered clientId/clientSecret but a social-issued refresh token the OIDC client cannot refresh, so auto-refresh was failing with "provider returned no new token". (#2467 — thanks @disonjer)

  • fix(antigravity): resolve the Cloud Code projectId from providerSpecificData as a fallback (and preserve it across token refresh) so the Gemini /v1beta streaming path stops returning a spurious 422 Missing Google projectId for connections that store the project there. (#2480)

  • fix(api): GET /v1beta/models now lists only models whose provider has an active/validated connection, matching the OpenAI-format /v1/models behavior, instead of returning the entire catalog. (#2483)

  • fix(cli): persist STORAGE_ENCRYPTION_KEY into DATA_DIR (not only ~/.omniroute) and refuse to auto-generate a fresh key when a storage.sqlite already exists — a new key cannot decrypt previously-encrypted credentials, so silently regenerating it locked users out of their database. The CLI now mirrors the server bootstrapEnv guard. (reported by Daniel Nach; original key persistence by @Chewji9875 — follow-up to #1622)

  • fix(gemini): preserve and re-attach the thoughtSignature on Gemini thinking-model tool calls — thread the signature namespace through the FORMATS.GEMINI and FORMATS.GEMINI_CLI request translators so the cached signature (keyed by connection + tool-call id) is found on the follow-up turn. Fixes [400]: Function call is missing a thought_signature in functionCall parts on agentic Gemini tool use. (#2504)

  • fix(translator): accept PDFs sent in the Responses-API input_file shape on the Gemini path, and the Gemini-style document shape on the Responses/Codex path — content parts are now normalized across input_file / file / document so a PDF reaches the model regardless of which field name the client used. (#2515)

  • fix(stream): count thinking arrays and reasoning_details as useful stream output — a reasoning-only response (e.g. Mistral/StepFun with a low max_tokens) was misclassified as "Stream ended before producing useful content" and turned into a spurious 502; it is now recognized as valid output. (#2520)

  • fix(claude): extract system/developer role messages in Claude Code semantic passthrough paths — moves role:"system" / role:"developer" messages from the messages[] array to the top-level system parameter before sending to Anthropic, which rejects them inside messages. Fixes memory injection context being silently dropped. (#2497 — thanks @unitythemaker)

  • fix(vision-bridge): auto-route non-standard provider models through OmniRoute self-loop — vision-bridge now detects when a model doesn't natively support vision and automatically re-routes the image through OmniRoute's own endpoint for format translation. (#2487 — thanks @herjarsa)

  • fix(mitm): add IPv6 DNS redirect, modular antigravity target, improved logging — MITM DNS handler now correctly redirects IPv6 (AAAA) queries alongside IPv4, adds a dedicated antigravity.ts target module, and enhances DNS/TLS logging for debugging. (#2514 — thanks @herjarsa)

  • fix(usage): improve Claude and MiniMax plan label detection — better tier name resolution for Claude OAuth usage (tier/plan/subscription_type/org fields) and new MiniMax plan label inference from quota totals. (#2498 — thanks @Gi99lin)

  • fix(codex): fan out image n requests in parallel — when Codex requests n > 1 images, the image-generation handler now dispatches them concurrently instead of sequentially, significantly reducing total latency. (#2499 — thanks @nmime)

  • fix(embeddings): strip stale Content-Encoding headers from upstream response — prevents clients from receiving gzip-encoded responses with identity encoding declared, which caused silent data corruption. (#2477 — thanks @lordavadon2)

  • fix(model): return clear error instead of silent OpenAI default for unrecognized models — previously, an unrecognized model silently fell back to OpenAI; now returns a 404 with a descriptive message listing known providers. (#2492 — thanks @herjarsa)

  • fix(dark-mode): correct background token on Compression Override select — the combo compression override <select> was using a hard-coded white background that was invisible in dark mode. (#2513 — thanks @apoapostolov)

  • fix(antigravity): align subscription tier detection with Antigravity Manager — extractCodeAssistSubscriptionTier now parses the correct nested field from the loadCodeAssist response, and a new extractCodeAssistOnboardTierId fallback handles the onboarding flow. Subscription info is cached per access-token with 5-min TTL. (#2496 — thanks @Gi99lin)

  • fix(opencode-zen): add opencode provider alias and sync model list with live API — opencode-zen and opencode-go are now also reachable via the shorter opencode alias, and the default model list is kept in sync with the live /v1/models catalog. (#2508 — thanks @herjarsa)

  • fix(combo): clarify log message when combo target is skipped due to unavailable credentials — previously logged a misleading "provider not found" message; now says "skipped: credentials unavailable". (#2494 — thanks @herjarsa)

  • fix(security): replace Math.random with crypto.randomUUID in generateTaskId/ActivityId and fix URL hostname check in test — eliminates weak PRNG usage flagged by CodeQL. (#2489)

  • fix(electron): downgrade to Electron 41.x for better-sqlite3 V8 compatibility — Electron 42.x shipped a V8 version that broke better-sqlite3 native bindings at runtime; pinning to 41.x restores stability.

  • fix(@omniroute/opencode-provider): include limit.context in model entries for OpenCode context window detection — OpenCode reads limit.context to determine usable context length for compaction and overflow detection.

  • fix(providers): make gitlawb/gitlawb-gmi model entry optional — prevents provider initialization failure when the model is not available in the catalog. (#2476 — thanks @oyi77)

  • fix(translator): inject omniroute_web_search in the Responses-API flat tool shape ({ type, name }) when the target provider speaks the Responses API — previously it was always emitted in the Chat Completions nested shape, so Codex/relay upstreams rejected the request. (#2390)

  • fix(kiro): serialize non-string role:"tool" message content before sending to CodeWhisperer — structured/array tool output was collapsing to content:[{ text: "" }], which Kiro rejects with 400 Improperly formed request. (#2446)

  • fix(claude): gate the heavy-agent beta headers (context-1m, effort, advanced-tool-use) on Opus/Sonnet only — Haiku with OAuth was receiving context-1m and rejecting it with 400. Also sanitizes historical thinking block signatures in passthrough. (#2454 — thanks @havockdev)

  • fix(perplexity-web): route requests through a Firefox-148 TLS-impersonating client so Perplexity's Cloudflare edge stops rejecting VPS/datacenter IPs with a 403 challenge. (#2459 — thanks @havockdev)

  • fix(validation): guard apiKey/modelsUrl against non-string values before calling .startsWith() / .trim() in the provider connection-test path. (#2463)

  • fix(cost): prevent double-billing of cache_creation_input_tokens: prompt_tokens from token extractors already includes both cache_read and cache_creation, so nonCachedInput now subtracts both cache types to avoid pricing cache at the full input rate. (#2522 — thanks @herjarsa)

  • fix(handler): always normalize system role messages in Claude passthrough paths — normalizeClaudeUpstreamMessages() is now called unconditionally in both compatibleBridge and pure passthrough, ensuring role:"system" messages are always extracted to the top-level system parameter. (#2519 — thanks @herjarsa)

  • fix(handler): capture Gemini thought_signature in non-streaming response path — the non-streaming translator now captures thoughtSignature from Gemini thinking model parts and persists them so follow-up turns can resolve them correctly. (#2518 — thanks @herjarsa)

  • fix(kiro): replace broken social OAuth with device flow — rewrites Kiro's Google/GitHub social login from the broken PKCE kiro:// custom protocol to AWS Cognito device flow, which works correctly in web/proxy environments. (#2524 — thanks @disonjer)

  • fix(providers): resolve opencode/opencode-zen slug mismatch + add 40+ new models — opencode is now a proper alias for opencode-zen in executor, model resolver, and provider registry; adds GPT 5.x, Claude 4.x, Gemini 3.x, Grok, Kimi, and other models with tests. (#2517 — thanks @herjarsa)

  • fix(antigravity): fail over stalled Antigravity sessions — new ANTIGRAVITY_PRE_RESPONSE_TIMEOUT_CODE shared constant for pre-response timeout detection, automatic failover to next account when session stalls before headers arrive. Node.js engine range relaxed to >=20.20.2. (#2464 — thanks @dhaern)

  • fix(deepseek-web): fix SSE parser, prompt format, and error handling — handles all 3 DeepSeek SSE stream formats (initial fragments, APPEND operations, bare string tokens), simplifies prompt to single-turn to prevent chat marker leakage, and checks json.code before token extraction. (#2502 — thanks @ovehbe)

  • fix(codex): accept auth.json without auth_mode field on import — Codex CLI no longer writes auth_mode; import now accepts both formats as long as required tokens are present. Semantic cache read now requires explicit temperature: 0. (#2536 — thanks @janeza2)

  • fix(freetheai): add /chat/completions to baseUrl to resolve 404 errors. (#2557 — thanks @lordavadon2)

  • fix(qoder): route PAT tokens to Qoder native API instead of DashScope — detects pt- prefixed tokens and routes to api.qoder.com with proper User-Agent header. (#2559 — thanks @herjarsa)

  • fix(perf): cache compiled RegExp in RTK compression hot path — eliminates thousands of redundant new RegExp() instantiations per second. (#2553 — thanks @soyelmismo)

  • fix(reasoning-cache): auto-start periodic cleanup on module load — the server-init.ts job was never imported (dead code), causing the reasoning_cache table to grow indefinitely. Now runs 30-min cleanup cycles automatically. (#2552 — thanks @soyelmismo)

  • fix(claude): omit context-1m beta for Sonnet — restrict to Opus-only to avoid long-context credit gate errors. Add afk-mode-2026-01-31, replace redact-thinking with thinking-token-count-2026-05-13. (#2568 — thanks @unitythemaker)

  • fix(codex): relax auth_mode check in frontend import preview — accept undefined/null/"chatgpt" instead of requiring "chatgpt" strictly, matching the backend fix in #2536. (#2567 — thanks @janeza2)

  • fix(kimi): declare vision capability for Kimi K2.6 in all 4 layers — providerRegistry, modelSpecs, catalog.ts keyword list, and Playground VISION_MODELS; previously the model silently rejected image uploads. (#2573 — thanks @herjarsa)

  • fix(dashboard): paginate request-log viewer beyond 300 rows — getCallLogs now accepts offset with parameterized SQL (eliminates string-interpolated LIMIT); RequestLoggerV2 grows its window via "Load more" + IntersectionObserver infinite scroll, resetting on filter change. (#2576)

  • fix(cli): use /api/monitoring/health for server readiness check — waitForServer() was polling the auth-protected /api/health (401), causing omniroute serve to hang indefinitely. (#2578 — thanks @amogus22877769)

  • fix(combo): detect invalid model errors via structured error codes + regex fallback — when a combo target rejects a model (e.g. free account vs Pro), the router now recognizes model_not_found / deployment_not_found codes and 6 regex patterns, and falls through to the next target instead of stopping the loop. (#2534 — thanks @HALDRO)

  • fix(security): post-review hardening batch — spawnSync arg-array replaces execSync string-template (command injection), CSP unsafe-eval gated on !app.isPackaged, requireManagementAuth guard on budget/bulk and resilience/reset endpoints, error messages sanitized in gemini-web/claude-web/copilot-web/oauth/agents catch blocks, circuit breaker persists lastFailureKind, and combo resets exhaustedProviders per set-retry iteration. (#2435)

  • fix(@omniroute/opencode-plugin): honor geminiSanitization and fetchInterceptor feature flags — both were applied unconditionally; now each fetch layer is gated by its flag (default ON), and disabling both falls back to plain SDK fetch. (#2546)

  • fix(#2575): check DB feature flag override in arePrivateProviderUrlsAllowed() — supports runtime toggle without restart. (#2595 — thanks @herjarsa)

  • fix(mimo): add supportsVision flag to MiMo-V2.5, V2.5-Pro, and V2-Omni — previously image uploads were silently rejected. (#2592 — thanks @herjarsa)

  • fix(ops): propagate OMNIROUTE_SKIP_DB_HEALTHCHECK env var to periodic DB health check scheduler — companion fix to #2554. (#2591 — thanks @soyelmismo)

  • fix(github): remove incorrect openai-responses targetFormat from GitHub Copilot's Haiku/Sonnet models. (#2583 — thanks @oyi77)

  • fix(copilot): stabilize responses configuration — removes 865 lines of unstable config, simplifies handler. (#2579 — thanks @ivan-mezentsev)

  • fix(#2544): add SSE heartbeat keepalive to Responses API transform stream — prevents Codex CLI 0.130.0 from disconnecting during long thinking/reasoning phases. (#2599 — thanks @herjarsa)

  • fix(memory): extract system role messages in semantic passthrough path to prevent 400 on memory injection — system messages were being passed as-is to providers that reject mixed roles. (#2474 — thanks @Tentoxa)

  • fix(@omniroute/opencode-provider): include limit.context in model entries for OpenCode context window detection — previously OpenCode couldn't determine model context size. (#2482 — thanks @herjarsa)

  • fix(mimo): add supportsVision flag to Kimi K2.6 in providerRegistry + comprehensive vision tests for MiMo V2.5/V2.5-Pro/V2-Omni. (#2600 — thanks @herjarsa)

  • fix(proxy): prefer scoped proxies over registry global fallback — legacy provider-specific proxy was being shadowed by a registry-global fallback across both storage backends. Resolution now follows strict specificity: account → provider → combo → global. (#2606 — thanks @terence71-glitch)

  • fix(@omniroute/opencode-plugin): canonical-twin dedup + alias-fallback enrichment — /v1/models returned the same model under both alias (cc/claude-opus-4-7) and canonical (claude/claude-opus-4-7) names; now drops ~75 canonical duplicates and rescues ~88 raw-id rows with proper provider prefix via alias-index fallback. Also emits cost, release_date, modalities fields in static catalog and raises provider label threshold to 12 chars (preserves AssemblyAI, Antigravity verbatim). (#2607 — thanks @mrmm)

  • fix(registry): populate empty models arrays for HuggingFace (6 models) and HackClub (3 models) + fix Snowflake placeholder baseUrl to {account} template pattern. (#2611 — thanks @oyi77)
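
The arithmetic behind the fix(cost) entry above (#2522) can be sketched as below. This is an illustration under stated assumptions rather than the project's real cost module: the Usage and Pricing types, the inputCost helper, the cache_read_input_tokens field name, and the integer prices are all hypothetical. Only prompt_tokens, cache_creation_input_tokens, and nonCachedInput appear in the release notes.

```typescript
// Token usage as reported by an extractor. Per the fix described above,
// prompt_tokens already includes both cache reads and cache writes.
interface Usage {
  prompt_tokens: number;
  cache_read_input_tokens: number; // assumed field name
  cache_creation_input_tokens: number;
}

// Hypothetical per-token prices; real rates vary by provider and model.
interface Pricing {
  input: number;
  cacheRead: number;
  cacheCreation: number;
}

function inputCost(usage: Usage, pricing: Pricing): number {
  // Subtract BOTH cache types so cached tokens are not also billed
  // at the full input rate.
  const nonCachedInput = Math.max(
    0,
    usage.prompt_tokens -
      usage.cache_read_input_tokens -
      usage.cache_creation_input_tokens,
  );
  return (
    nonCachedInput * pricing.input +
    usage.cache_read_input_tokens * pricing.cacheRead +
    usage.cache_creation_input_tokens * pricing.cacheCreation
  );
}

// Made-up numbers chosen so the arithmetic is easy to follow.
const usage: Usage = {
  prompt_tokens: 1000,
  cache_read_input_tokens: 600,
  cache_creation_input_tokens: 300,
};
const pricing: Pricing = { input: 4, cacheRead: 1, cacheCreation: 5 };

console.log(inputCost(usage, pricing));
```

With these values, nonCachedInput is 100 and the total is 2500 price units. Subtracting only the cache reads would leave 400 tokens at the full input rate and give 3700, which is the kind of double-billing the fix removes.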

🌐 Internationalization

  • i18n(zh-CN): translate 830 missing UI strings — replaces all __MISSING__: placeholders with proper Chinese translations. (#2523 — thanks @InkshadeWoods)
  • i18n(dashboard): add missing dashboard keys and fix EN fallbacks — hundreds of hardcoded English strings across cache, caveman, costs, skills, memory, and evals pages replaced with t() calls. (#2500 — thanks @Gi99lin)
  • i18n(pt-BR): complete and fix Brazilian Portuguese translation — comprehensive overhaul of pt-BR locale with ~3000 lines of quality translations, filling all missing keys and correcting existing entries. (#2543 — thanks @alltomatos)
  • i18n(ru): comprehensive Russian translation update — ~2000 lines of corrected and filled translations. (#2550 — thanks @AgentAlexAI)
  • i18n(all): comprehensive localization and UI refactoring — 42 locale files synchronized with missing keys, cloud-agents page i18n rewrite, and consistent t() usage across 21 dashboard components. (#2580 — thanks @alltomatos)
  • i18n(all): translate freeTier provider strings across 41 locales — replaces __MISSING__:Free Tier Providers placeholders with proper translations in both common and providers namespaces. (#2609 — thanks @leninejunior)
  • i18n(pt-BR): eliminate all 1270 remaining __MISSING__ markers — completes pt-BR translation across 41 namespaces to true 100% coverage. (#2610 — thanks @leninejunior)

📝 Maintenance

  • chore: remove Akamai VPS deploy from release workflow and skills.
  • chore(deps): bump actions/setup-node from v4 to v6 + randomBytes security fix for cloud agent task IDs. (#2589)
  • chore(deps): bump actions/upload-artifact from v4 to v7. (#2588)
  • chore: ignore .claude/worktrees from git tracking.
  • chore(ci): auto-lock release branch on version publish — new CI workflow applies lock_branch protection when a GitHub Release is published. (#2542)
  • docs: redesign README — marketing-first layout with accurate provider counts. (#2490)

What's Changed

Full Changelog: v3.8.0...v3.8.2
