github diegosouzapw/OmniRoute v3.8.27

[3.8.27] — 2026-06-17

✨ New Features

  • feat(combos): advertise combo capabilities (multimodal / reasoning / caching) on the import surfaces — importing a combo package into a client (LobeHub / OpenCode / VS Code, via /v1/combos and the VS Code combo catalog) no longer requires manually enabling multimodal/image-input, reasoning, and caching afterwards. projectCombo now attaches a registry-derived capabilities block, gated conservatively: multimodal/reasoning are advertised only when every concrete model step proves the capability (an unprovable nested combo-ref drops them, since the strategy may route to any member), and caching reflects the combo's explicit Context-Cache-Protection setting (no surprise prompt-cache cost). The public /v1/combos default projection (#2300) is unchanged unless the caller opts in. (#3979 — thanks @xenstar)
  • feat(sse): delegated Anthropic Context Editing for Claude (clear_tool_uses) — Claude requests can now offload context trimming to Anthropic's server-side context-management API (beta context-management-2025-06-27, clear_tool_uses_20250919), pruning stale tool-use turns upstream instead of locally. Claude-only by nature (the edit runs server-side); multi-provider context trimming remains the job of the local compression engines. (#4021 — thanks @diegosouzapw)
  • feat(sse): real LLMLingua-2 ONNX compression engine (stable) — the LLMLingua-2 prompt-compression engine is now a real local ONNX model (TinyBERT default, transformers.js + tfjs), promoted to stable after VPS validation, replacing the previous placeholder. (#4014 — thanks @diegosouzapw)
  • feat(compression): capture per-engine analytics + Lite schema fix — the compression pipeline now persists a per-engine breakdown for historical analytics so the dashboard can attribute savings to each engine in a stacked pipeline, and a Lite-schema mismatch was corrected. (#4018 — thanks @diegosouzapw)
  • feat(dashboard): real circuit-breaker state in the Combo Live cascade (U1b) — the Combo Live cascade view now surfaces each provider's real circuit-breaker state (CLOSED / OPEN / HALF_OPEN) as a badge, read live from /api/monitoring/health, instead of inferring health from request outcomes. (#4029 — thanks @diegosouzapw)
  • feat(openai): honor a custom base URL in model discovery + complete openai/codex pricing — OpenAI-format providers configured with a custom base URL now have that URL honored during model discovery (not just inference), and the openai/codex pricing table was completed. Discovery is routed through the SSRF-guarded outbound fetch. (#4005 — thanks @artickc)
  • feat(observability): capture actual upstream provider requests — the request inspector now records the exact payload sent to the upstream provider (post-translation), so you can see what OmniRoute actually dispatched rather than only the client's original request. (#3941 — thanks @rdself)
  • feat(providers): provider auth visibility controls — adds controls to show/hide provider auth details in the dashboard so credentials can be revealed only when needed. (#3953 — thanks @rdself)
  • feat(providers): model search filter on the provider dashboard — the provider dashboard gains a search filter to quickly narrow a provider's model list. (#3950 — thanks @felipesartori)
  • feat(compression): Indonesian caveman rules + language pack — adds an Indonesian "caveman" rule set and language pack to the rule-based compression engine. (#3975 — thanks @Veier04)
  • feat(dashboard): sidebar group separator toggles — the dashboard sidebar can now toggle group separators for a cleaner navigation layout. (#3971 — thanks @rdself)
  • feat(api): local @@om-usage command for cached per-key usage — API clients can send a message that is exactly @@om-usage to retrieve cached Claude-style usage data locally, without forwarding the prompt to an upstream provider. Gated by a new per-key allowance flag. (#4034 — thanks @Witroch4)
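
The conservative gating described for combo capabilities can be sketched roughly as follows. This is an illustration under stated assumptions, not the projectCombo implementation: the ComboStep shape, the supportsReasoning field, and the deriveCapabilities name are hypothetical, and only the rules themselves (every concrete step must prove the capability, a nested combo-ref drops it, caching mirrors the explicit setting) come from the notes above.

```typescript
// Hypothetical shapes: the real OmniRoute registry and combo types are not shown in these notes.
type ModelStep = {
  kind: "model";
  id: string;
  supportsVision?: boolean | null;
  supportsReasoning?: boolean | null;
};
type ComboRefStep = { kind: "combo-ref"; name: string };
type ComboStep = ModelStep | ComboRefStep;

interface ComboCapabilities {
  multimodal: boolean;
  reasoning: boolean;
  caching: boolean;
}

function deriveCapabilities(
  steps: ComboStep[],
  contextCacheProtection: boolean
): ComboCapabilities {
  // A capability is advertised only when EVERY step is a concrete model that
  // proves it (=== true). A nested combo-ref, or a null/undefined flag, is
  // treated as unprovable and drops the capability.
  const allProven = (pick: (s: ModelStep) => boolean | null | undefined): boolean =>
    steps.length > 0 && steps.every((s) => s.kind === "model" && pick(s) === true);

  return {
    multimodal: allProven((s) => s.supportsVision),
    reasoning: allProven((s) => s.supportsReasoning),
    // Caching is not inferred from models; it mirrors the combo's explicit setting.
    caching: contextCacheProtection,
  };
}
```

With this rule, a combo whose steps are all vision-proven models advertises multimodal, while adding a single combo-ref step (or one uncatalogued model) removes it.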

🐛 Fixed

  • fix(opencode): forward the OpenCode session id to the upstream regardless of how the user named the provider — the OpencodeExecutor forwarded the x-opencode-session/request/project/client headers, but the OpenCode CLI only emits those when the configured providerID starts with "opencode". A user who adds OmniRoute as a custom provider (e.g. "omniroute") makes the CLI send x-session-affinity / X-Session-Id instead (both carry the same session id), which the executor never read — so the session-metadata forwarding was effectively dead code for the realistic provider-naming case. The opencode-family executor now falls back to x-session-affinity / X-Session-Id and maps it onto x-opencode-session when the client didn't send the header directly, so session continuity to the opencode.ai upstream works for any provider name (a direct x-opencode-session still wins). Scoped to this executor only — the generic DefaultExecutor intentionally does not do this, to avoid leaking the client session id to arbitrary third-party upstreams. (#4022 — thanks @pizzav-xyz)
  • fix(guardrails): Vision Bridge no longer drops the image when the describe call fails (Nvidia NIM "Image unavailable") — the Vision Bridge is enabled by default and engages for any model whose vision capability OmniRoute can't prove from the registry (supportsVision !== true, which includes uncatalogued models that resolve to null). When the per-image describe call failed (e.g. no vision model configured), it replaced the image with the literal text [Image N]: (unavailable) and dropped the original image_url — so a genuinely vision-capable upstream (Nvidia NIM) received text only and answered "Image unavailable. Cannot provide description without visual data." A describe failure is no longer destructive: replaceImageParts now receives null for failed images and preserves the original image part so the upstream can still see it (successful describes still replace the image with the text description; meta.descriptions observability is unchanged). (#4012 — thanks @daniij)
  • fix(kiro): preserve finish_reason: "tool_calls" on the Kiro streaming path — streaming tool-call requests through the Kiro (Responses API) provider had their terminal finish_reason reported as "stop" instead of "tool_calls", so agent clients (Hermes) treated the tool-call turn as a finished turn, never ran the tool, and the next request failed with HTTP 400 on the incomplete tool state. convertKiroToOpenAI's terminal messageStopEvent/done branch hardcoded finish_reason: "stop" regardless of whether the stream had emitted toolUseEvents. The translator now records state.sawToolUse when a tool-use chunk is emitted and reports finish_reason: "tool_calls" on the terminal chunk (and in state.finishReason) whenever the stream produced tool calls. The non-streaming path was already correct. (#3980 — thanks @lordavadon2)
  • fix(resilience): respect connection cooldown stored as a numeric epoch — the router kept dispatching to connections still inside their rate-limit cooldown. rate_limited_until (a TEXT column) was persisted as a raw epoch number, which SQLite coerced to a string like "1781696905131.0"; new Date(...) parsed that string as NaN, so the cooling connection was never skipped. The cooldown read predicates now normalize numeric-epoch strings via a shared cooldownUntilMs() helper; ISO behavior is unchanged. (#3995 — thanks @diegosouzapw)
  • fix(providers): fetch the live /models catalog for LLM7 and BytePlus — importing an LLM7 or BytePlus key surfaced only a small, outdated hardcoded list because neither provider was classified by any live-fetch branch of the model-import route. Both are now in NAMED_OPENAI_STYLE_PROVIDERS, so the route probes <baseUrl>/models with the key and serves the live catalog, falling back to the local catalog only when the upstream fetch fails. (#3996 — thanks @FerLuisxd / @diegosouzapw)
  • fix(dashboard): logs auto-refresh reads live visibility, not a stale mount ref — the Logs page never auto-refreshed when the tab loaded in the background because the auto-refresh interval gated each tick on a visibility ref seeded once at mount; the tick now reads the live document.visibilityState, so polling self-heals as soon as the tab is visible while still pausing when genuinely hidden. (#3997 — thanks @tjengbudi / @diegosouzapw)
  • fix(combo): shuffle the strict-random fallback remainder to spread load — with the strict-random strategy, a persistently failing model was retried on essentially every request: only the deck-selected slot 0 was shuffled, while the fallback remainder stayed in fixed priority order. The remainder is now shuffled too, so fallback load (and recovery from a failing target) spreads evenly across healthy peers. (#3998 — thanks @KeNJiKunG / @diegosouzapw)
  • fix(claude): forward the client tool-search-tool-2025-10-19 anthropic-beta on the Claude OAuth path — with deferred tools active, Claude Code negotiates the tool-search-tool-2025-10-19 beta, but OmniRoute dropped it on both Claude code paths, so the claude.ai backend rejected every deferred-tool request with 400 Tool reference not found. A new allowlist-merge (mergeClientAnthropicBeta) now unions the client's negotiated beta into the outbound set on both paths, appending only allowlisted client betas (preserving the #3415 fix). (#3999 — thanks @huohua-dev / @diegosouzapw)
  • fix(executor): strip stream_options on non-streaming requests (NVIDIA NIM 400) — clients that send stream_options: { include_usage: true } regardless of stream (e.g. the OpenAI Python SDK) had it passed through untouched on non-streaming calls, and NVIDIA NIM rejected it with 400 "Stream options can only be defined when stream=True". DefaultExecutor.transformRequest now strips stream_options whenever stream is false; the streaming injection path is unchanged. (#4000 — thanks @andrea-kingautomation / @daniij / @diegosouzapw)
  • fix(sse): guard model-less registry entries in getUnsupportedParams (mimocode) — a registry entry without a model map (mimocode) threw when computing unsupported params; the lookup now guards the model-less case so request validation no longer crashes. (#4015 — thanks @diegosouzapw)
  • fix(perplexity-web): parse the schematized diff_block stream so answers aren't empty — Perplexity web streamed its answer as RFC-6902 diff_block patches that OmniRoute didn't apply during the PENDING phase, so responses came back empty; the parser now applies the patches and materializes the text only on COMPLETED. (#4001 — thanks @artickc)
  • fix(default-executor): honor a custom providerSpecificData.baseUrl for OpenAI-format providers — OpenAI-format providers configured with a custom base URL had it ignored on the inference path; the default executor now honors providerSpecificData.baseUrl so requests reach the configured endpoint. (#4002 — thanks @artickc)
  • fix(live-ws): bridge LiveWS sidecar events to the dashboard — events emitted by the LiveWS sidecar were not reaching the dashboard; they are now bridged so live websocket activity is visible. (A cookie-auth regression in the sidecar's auth-token parsing was also corrected.) (#4004 — thanks @megamen32)
  • fix(qwen-web): cookie validation false-positive — check the response body for a user object — Qwen web cookie validation reported a valid cookie as invalid; it now inspects the response body for the user object instead of relying on the status code alone. (#3958 — thanks @thezukiru)
  • fix(vision-bridge): force the bridge for tokenrouter deepseek models — tokenrouter DeepSeek models are now forced through the Vision Bridge so image inputs are handled correctly. (#3946 — thanks @WormAlien)
  • fix(api): return 400 (not 500) for malformed JSON on /api/auth/login — a malformed JSON body on the login endpoint returned an opaque 500; it now returns a proper 400. (#4031 — thanks @rdself)
  • fix(dashboard): Playground Compare tab loading + HTTP method guard — the Playground Compare tab failed to load; the loading path was fixed and an HTTP method guard added. (#4024 — thanks @rdself)
  • fix(proxy): gate the control-plane proxy direct fallback behind a feature flag (fail-closed) — the direct-connection fallback for control-plane ops when a pinned proxy is unreachable is now gated behind a feature flag and fails closed, so a pinned proxy is never silently bypassed unless explicitly allowed. (#3963 — thanks @rdself)
  • fix(db): persist backup retention days — the backup retention-days setting was not persisted across restarts; it is now stored durably. (#3970 — thanks @rdself)
  • fix(dashboard): refine the provider quota card display — the provider quota card layout was refined for clearer quota/usage presentation. (#3969 — thanks @rdself)
  • fix(dashboard): refine compression settings, storage labels, and sidebar grouping — polishes the compression-settings UI, clarifies storage labels, and tidies the sidebar grouping. (#4033 — thanks @rdself)
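
To make the numeric-epoch cooldown fix (#3995) concrete, here is a minimal sketch of what a normalizing helper could look like. The name cooldownUntilMs() comes from the notes above, but its signature and the isCoolingDown wrapper are assumptions for illustration; the shipped helper may handle more cases or behave differently at the edges.

```typescript
// Illustrative only: the real cooldownUntilMs() in OmniRoute may differ.
function cooldownUntilMs(raw: string | number | null | undefined): number | null {
  if (raw === null || raw === undefined || raw === "") return null;
  if (typeof raw === "number") return Number.isFinite(raw) ? Math.trunc(raw) : null;

  const text = raw.trim();
  // Numeric-epoch strings such as "1781696905131.0" (a number coerced to TEXT)
  // are read as milliseconds instead of being handed to the date parser.
  if (/^\d+(\.\d+)?$/.test(text)) return Math.trunc(Number(text));

  // Anything else is treated as a date string (the ISO path).
  const parsed = Date.parse(text);
  return Number.isNaN(parsed) ? null : parsed;
}

// Hypothetical read predicate: skip a connection while its cooldown is in the future.
function isCoolingDown(
  raw: string | number | null | undefined,
  now: number = Date.now()
): boolean {
  const until = cooldownUntilMs(raw);
  return until !== null && until > now;
}
```

The point of the sketch is that both storage forms resolve to the same millisecond timestamp before the comparison, so a cooling connection is skipped regardless of how the value was persisted.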

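The stream_options fix (#4000) amounts to a small request transform. The sketch below is an assumption-laden illustration rather than the actual DefaultExecutor.transformRequest code: ChatRequest and stripStreamOptions are hypothetical names, and treating a missing stream field as non-streaming is a choice made here, not something the notes state.

```typescript
// Hypothetical request shape; only stream and stream_options matter for this sketch.
type ChatRequest = {
  stream?: boolean;
  stream_options?: { include_usage?: boolean };
  [key: string]: unknown;
};

function stripStreamOptions(body: ChatRequest): ChatRequest {
  // Streaming requests are passed through untouched.
  if (body.stream === true) return body;

  // Non-streaming (assumed here to include a missing `stream` field):
  // return a copy without stream_options so strict upstreams do not reject it.
  const copy: ChatRequest = { ...body };
  delete copy.stream_options;
  return copy;
}
```

Returning a copy instead of mutating the input keeps the client's original request intact for logging or inspection.
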
🔒 Security & Hardening

  • fix(security): eliminate a polynomial ReDoS in the combo <omniModel> tag regex — comboAgentMiddleware's cache-tag pattern wrapped the tag in an unbounded newline run ((?:\n|\r)*), making .test() / .replace() run in O(n²) on inputs with many newlines (CodeQL js/polynomial-redos). The detection pattern now matches only the core <omniModel>…</omniModel> and the global strip pattern bounds the surrounding newline runs, keeping it linear; detection / extraction / multi-tag stripping behavior is unchanged. (#3982 — thanks @diegosouzapw)
  • ci(security): harden workflows — artipacked persist-credentials, cache-poisoning, SC2086 — GitHub Actions workflows were hardened against the artipacked persist-credentials leak and cache-poisoning, and shell-quoting (SC2086) issues were fixed. (#3965 — thanks @diegosouzapw)
  • ci(quality): flip require-tighten + osv + Trivy to blocking (cycle-end) — the per-module require-tighten check and the OSV / Trivy scanners moved from advisory to blocking for the v3.8.27 cycle close, so new dependency or coverage regressions fail CI. (#3984 — thanks @diegosouzapw)
  • chore(deps): dependabot security bumps + drop unused gray-matter — applies a batch of Dependabot security bumps and removes the unused gray-matter dependency from the tree. (#4036 — thanks @diegosouzapw)
  • chore(deps): automated dependency bumps — Dependabot upgraded the production dependency group (13 updates), vite, form-data, and the npm_and_yarn group. (#3915, #3942, #3943, #3944 — thanks @dependabot)
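
The shape of the ReDoS fix (#3982) can be illustrated with a pair of patterns like the ones below. These are not the patterns shipped in comboAgentMiddleware: the constant names, the [^<]* tag body, and the {0,2} newline bound are assumptions chosen to show the idea of detecting on the core tag only and bounding the newline runs when stripping.

```typescript
// Detection looks only at the core tag, so leading newlines are never scanned repeatedly.
const OMNI_TAG_DETECT = /<omniModel>[^<]*<\/omniModel>/;

// Stripping consumes at most two newline characters on each side (assumed bound).
// An unbounded run like (?:\n|\r)* before the tag is what made the old pattern O(n²):
// from every start position the engine would walk the whole newline run before failing.
const OMNI_TAG_STRIP = /[\r\n]{0,2}<omniModel>[^<]*<\/omniModel>[\r\n]{0,2}/g;

function hasOmniTag(text: string): boolean {
  return OMNI_TAG_DETECT.test(text);
}

function stripOmniTags(text: string): string {
  return text.replace(OMNI_TAG_STRIP, "");
}
```

Because each quantifier before the literal `<omniModel>` is bounded, the work done per start position is constant, and an input made of many newlines with no tag is scanned in linear time.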

🧹 Internal / Quality / Docs

  • feat(ci): Quality Gate v2 — Onda 0 + Onda 1 — first two waves of the Quality Gate v2 program: gate flips, test-impact analysis (TIA), SAST, DAST-smoke, and mutation-testing infrastructure. (#4016 — thanks @diegosouzapw)
  • refactor: modularize the provider registry into individual provider plugins — providerRegistry.ts was split into individual per-provider plugin modules (non-stacked). A forward-fix restored the byteplus + mimocode modules dropped by the move. (#3993 — thanks @oyi77 / @diegosouzapw)
  • refactor: modularize schemas (non-stacked) — the request/response schema definitions were split into individual modules to reduce file size and improve maintainability. (#3988 — thanks @oyi77)
  • fix: restore unit regressions dropped by the lossy schema/registry modularizations — the schema/registry modularizations (#3988, #3993) silently dropped internal logic covered by unit tests; this PR restores the regressed units. (#4030 — thanks @diegosouzapw)
  • refactor(dashboard): settings UI layout + API Keys naming — the settings UI layout was reorganized and the "API Keys" naming clarified. (#4020 — thanks @rdself)
  • Extensive UI display and i18n improvements (dashboard) — a batch of dashboard UI-display refinements and i18n string improvements. (#3973 — thanks @rdself)
  • fix(ci): scope TIA to node:test unit files only — test-impact analysis was matching files the node:test runner doesn't execute, producing 99 false failures; the TIA glob now mirrors the test:unit glob exactly. (#4035 — thanks @diegosouzapw)
  • fix(ci): electron-release publish-npm needs contents: write — the reusable npm-publish job invoked by the electron release lacked contents: write, causing a v3.8.26 startup_failure; the permission was granted. (#3966 — thanks @diegosouzapw)
  • test(opencode-plugin): ESM default-export test (drop the stale CJS bundle test) — replaces the stale CJS bundle test with an ESM default-export test, following up the #3883 ESM-only migration. (#3967 — thanks @diegosouzapw)
  • fix(ci): fix promptfoo security-assertion parsing — the promptfoo (DAST/security eval) assertion parser was corrected so security assertions are read reliably. (#4032 — thanks @rdself)
  • docs(troubleshooting): note that the MITM proxy cannot intercept Windows-host apps under WSL — documents that the MITM proxy running inside WSL cannot intercept traffic from apps on the Windows host. (#4003 — thanks @diegosouzapw)
  • chore(quality): maintenance roll-up — assorted quality-gate hygiene that does not change runtime behavior: re-baseline validation.ts for the #3958 qwen body-check, allowlist the socks dependency declared by #4004, ignore jscpd major bumps (the v5 Rust rewrite breaks the pinned duplication gate), untrack an accidentally-committed root node_modules symlink (and gitignore it), rehome the #3972 logs auto-refresh test so a runner collects it, and open the v3.8.27 development cycle. (thanks @diegosouzapw)

Full Changelog: v3.8.26...v3.8.27
