github diegosouzapw/OmniRoute v3.8.37

[3.8.37] — 2026-06-26

✨ New Features

  • feat(providers): add DGrid AI gateway provider — OpenAI-compatible gateway at api.dgrid.ai/v1 (alias dgrid, API-key auth, passthrough models). Free router tier (10 RPM / 100 RPD); a $5 lifetime top-up raises limits to 20 RPM / 1,000 RPD. (#4931 — thanks @dgridOP)

  • feat(providers): add Pioneer AI (Fastino Labs) provider — OpenAI-compatible chat completions at api.pioneer.ai/v1. Registered with alias pn, X-API-Key auth, and a catalog of 10 open-tier serverless models (Qwen3, Llama 3.1/3.2, Gemma 3, SmolLM3). Free $75 credits, no credit card required. Gated enterprise models (Claude/GPT/Gemini) require prior fine-tuning on the Pioneer platform and are intentionally excluded from the catalog. (#4909 — thanks @HikiNarou)

  • feat(providers): add xAI Grok inbound translators and a thinking patcher — Grok requests are now translated on the inbound path and reasoning is normalized so Grok modes behave consistently across clients. (#4910 — thanks @mugnimaestra)

  • feat(oauth): Codex bulk-import endpoint — POST /api/oauth/codex/import accepts multiple Codex OAuth credentials in one call for fast multi-account onboarding. (#4914 — thanks @beaaan)

  • feat(embeddings): add a dimensions override field to embedding combos so an embedding combo can pin the output vector size per target. (#4913 — thanks @wenzetan)

  • feat(sse): auto-promote successful combo model — a new opt-in comboAutoPromoteEnabled setting reorders a combo's persisted model list so that, when a combo model responds successfully, it is moved to position #1 for future requests. (#4852 — thanks @arssnndr)

  • feat(sse): add toggleable tool-source diagnostics — an opt-in switch surfaces where each tool definition originated when debugging tool-routing issues. (#4856 — thanks @DuyPrX)

  • feat(headroom): proxy lifecycle management + dashboard UI — start/stop/monitor a Headroom compression proxy from the dashboard, with Docker sidecar support. (#4649 — thanks @diegosouzapw / @carmelogunsroses)

  • feat(sse): x-omniroute-strip-reasoning request header to drop reasoning_content from upstream responses (opt-in, preserving reasoning-aware clients). (#4678 — thanks @anuragg-saxenaa / @diegosouzapw)

  • feat(cli): multi-model support for the Factory Droid CLI integration. (#4682 — thanks @anuragg-saxenaa / @diegosouzapw)

  • feat(sse): parse Gemini CLI 429 retryDelay from the structured RetryInfo payload so cooldowns honor the upstream-provided backoff. (#4738 — thanks @NoxzRCW)

  • feat(sse): add GPT-4 and GPT-4o mini to the GitHub Copilot provider catalog. (#4798, #4797 — thanks @decolua)

  • feat(api): add the MiniMax-M3 pricing row (canonical + lowercase alias) so the new MiniMax default model gets accurate per-request cost accounting instead of falling back to a zero/default rate. (#4814 — thanks @octo-patch)
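
The auto-promote behaviour behind comboAutoPromoteEnabled can be pictured as a move-to-front on the combo's model list. The sketch below is illustrative only: promoteModel and the model ids are hypothetical names, and it assumes the persisted list is a plain array of model ids, which the notes above do not confirm.

```typescript
// Hypothetical helper, not OmniRoute's implementation: move the model that
// just succeeded to the front, keeping the relative order of the others.
function promoteModel(models: readonly string[], succeeded: string): string[] {
  // An id that is not part of the combo leaves the list unchanged (as a copy).
  if (models.indexOf(succeeded) === -1) return models.slice();
  return [succeeded, ...models.filter((m) => m !== succeeded)];
}

// Placeholder model ids, chosen only for the example.
const original = ["provider-a/model-x", "provider-b/model-y", "provider-c/model-z"];
const promoted = promoteModel(original, "provider-c/model-z");
console.log(promoted);
// ["provider-c/model-z", "provider-a/model-x", "provider-b/model-y"]
```

Returning a new array rather than mutating in place keeps the sketch easy to test; whether the real setting persists the list the same way is not stated in these notes.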

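For the Gemini CLI 429 handling listed above, a cooldown can be derived from a structured RetryInfo entry roughly as follows. This is a sketch under assumptions: it presumes the error body follows the standard google.rpc.RetryInfo JSON mapping (a details entry whose retryDelay is a duration string such as "34s"), and retryDelayMs is a hypothetical name rather than OmniRoute's own function.

```typescript
// Assumed shape of a 429 error body carrying a google.rpc.RetryInfo detail.
interface ErrorDetail {
  "@type"?: string;
  retryDelay?: string; // protobuf Duration in JSON form, e.g. "34s" or "1.5s"
}
interface UpstreamError {
  error?: { code?: number; details?: ErrorDetail[] };
}

// Hypothetical helper: return the upstream-provided backoff in milliseconds,
// or null when no usable RetryInfo entry is present.
function retryDelayMs(body: UpstreamError): number | null {
  const details = body.error?.details ?? [];
  for (const detail of details) {
    const type = detail["@type"] ?? "";
    if (!type.endsWith("google.rpc.RetryInfo")) continue;
    if (typeof detail.retryDelay !== "string") continue;
    const match = /^(\d+(?:\.\d+)?)s$/.exec(detail.retryDelay);
    if (match) return Math.round(parseFloat(match[1] ?? "") * 1000);
  }
  return null;
}

const sample: UpstreamError = {
  error: {
    code: 429,
    details: [{ "@type": "type.googleapis.com/google.rpc.RetryInfo", retryDelay: "34s" }],
  },
};
console.log(retryDelayMs(sample)); // 34000
```

Returning null instead of a default lets the caller fall back to its own cooldown policy when the upstream gives no hint.
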
🔧 Bug Fixes

  • fix(sse): dense, deterministic response.output ordering in response.completed — items are now sorted by their actual output_index (via a recorded-as-emitted accumulator + stable sort) instead of being rebuilt from unordered state dicts; normalizeOutputIndex replaces fragile parseInt calls for robust index coercion; superseded tool calls (replaced at the same index mid-stream) are excluded from the final output array. (#4906 — thanks @Marco9113)

  • fix(sse): normalize Codex custom/freeform tools (apply_patch, type:"custom" with no parameters) to a { input: string } function schema instead of an empty schema — the empty schema made models invoke apply_patch with {}, breaking the Codex runtime which expects { input: string }. Also maps custom_tool_call / custom_tool_call_output input items and streams apply_patch tool calls via custom_tool_call_input.delta/.done events. (#4862 — thanks @nstung463)

  • fix(sse): preserve the required array when translating Draft 2020-12 antigravity tool schemas (e.g. from OpenCode), stripping unsupported JSON Schema meta keywords while keeping mandatory arguments required so the model no longer calls tools without them. (#4843 — thanks @anuragg-saxenaa)

  • fix(sse): Kiro tool-schema sanitizer — strip unsupported JSON-Schema keywords (anyOf/$ref/if-then, etc.) and hash-truncate tool names >64 chars before dispatch, mapping the streamed tool-call name back for the client, so Kiro no longer rejects tool calls with 400 "Improperly formed request". (#4847 — thanks @smarthomeblack)

  • fix(sse): make the anthropic-version default-guard case-insensitive for anthropic-compatible-* providers, so a caller/operator-supplied Anthropic-Version (any casing) is no longer clobbered by a second lowercase anthropic-version: 2023-06-01 header. (#4823 — thanks @zakirkun)

  • fix(db): validate HuggingFace API tokens via the whoami-v2 endpoint as a pure auth probe so fine-grained Inference-Provider tokens (valid even when model/task endpoints reject them) are no longer falsely marked invalid; only 401/403 means an invalid key, other non-OK statuses surface as transient upstream errors. (#4819 — thanks @Delcado19)

  • fix(sse): reject the Anthropic-only [1m] context-1m suffix in buildKiroPayload before it reaches AWS Bedrock — Kiro is Bedrock-backed and cannot honor the beta, so a forwarded kr/*[1m] model id was malformed upstream; callers now get a clear error pointing them at a direct-Anthropic provider for 1M-context routing. (#4816 — thanks @Delcado19)

  • fix(dashboard): align the Engine Combos editor engines with the API schema — the named-combos pipeline dropdown offered four engines (headroom, session-dedup, ccr, llmlingua) that PUT /api/context/combos/[id] rejects, so selecting one made the save return 400 while the UI swallowed the error. The dropdown is now sourced from a single canonical engine map shared with stackedPipelineStepSchema (parity guarded by a unit test), and the editor surfaces save errors plus empty-name/empty-pipeline validation instead of failing quietly. (#5062 — closes #4955)

  • fix(sse): surface malformed HTTP-200 upstream responses instead of treating them as success, so combo fallback can trigger. (#4942 — thanks @haipham22)

  • fix(antigravity): retry transient upstream failures rather than failing the request outright. (#4941 — thanks @Jordannst)

  • fix(sse): exclude WS-bridge controller-closed errors from the provider circuit breaker so a client disconnect no longer trips the whole provider. (#4870 — closes #4602, thanks @huohua-dev)

  • fix(sse): resolve custom combos by id and case-insensitive name. (#4869 — closes #4446, thanks @herjarsa)

  • fix(sse): forward AI SDK image parts in the Responses translator. (#4859 — thanks @mugnimaestra)

  • fix(sse): emit valid concatenable Kiro tool_calls.arguments deltas. (#4855 — thanks @wahyuzero)

  • fix(sse): strip temperature for Claude models with extended thinking enabled (the upstream rejects it). (#4853 — thanks @noestelar)

  • fix(sse): unwrap the Qoder HTTP-200 SSE error envelope so combo fallback can trigger. (#4850 — thanks @vianlearns)

  • fix(sse): strip reasoning blobs from agentic context to prevent O(n²) token growth across multi-turn agent loops. (#4849 — thanks @GodrezJr2)

  • fix(sse): close the reasoning block before message content in the Responses stream so clients render reasoning and answer in the right order. (#4848 — thanks @kwanLeeFrmVi)

  • fix(config): sync the full SiliconFlow model list into the registry. (#4844 — thanks @letanphuc)

  • fix(sse): strip Composer <|final|> sentinel markers that leaked after Composer reasoning. (#4842 — thanks @noestelar)

  • fix(build): trace-include sql.js's sql-wasm.wasm in the standalone bundle so SQLite-WASM works in the packaged build. (#4839 — thanks @Delcado19)

  • fix(cli): persist lazily-installed native runtime deps (better-sqlite3, systray2) to the shared runtime package.json with --save-exact instead of --no-save, so installing one no longer prunes the other as "extraneous" — fixing a "No SQLite driver available" failure after a --tray install. (#4841 — thanks @omartuhintvs)

  • fix(sse): resolve bare model names to a connection's defaultModel before upstream calls. (#4825 — thanks @anuragg-saxenaa)

  • fix(api): surface a Docker-localhost hint on provider-node validation connection errors. (#4822 — thanks @anuragg-saxenaa)

  • fix(sse): strip Gemini built-in tools when functionDeclarations are present in the Antigravity envelope (the two are mutually exclusive upstream). (#4821 — thanks @Vanszs)

  • fix(sse): strip X-Stainless-* headers and normalize the SDK User-Agent for OpenAI-compatible endpoints. (#4820 — thanks @anuragg-saxenaa)

  • fix(oauth): allow a per-connection refresh lead-time override via providerSpecificData.refreshLeadMs. (#4818 — thanks @anuragg-saxenaa)

  • fix(dashboard): resolve passthrough model aliases by providerId in ModelSelectModal. (#4815 — thanks @anuragg-saxenaa)

  • fix(sse): strip enumDescriptions from Antigravity tool schemas. (#4813, #4740 — thanks @anuragg-saxenaa)

  • fix(dashboard): keep the desktop sidebar visible via an explicit CSS class. (#4812 — thanks @Delcado19)

  • fix(sse): filter nameless hosted tools when converting Responses API to Chat format. (#4789 — upstream, thanks Владимир Акимов)

  • fix(sse): stream-writer mock abort() now returns a Promise (test-stability fix). (#4788 — thanks @decolua)

  • fix(sse): use the WorkOS auth-token shape for Cline. (#4787 — thanks @apeltekci)

  • fix(api): fall back to the existing access token for any OAuth provider when a refresh fails. (#4786 — thanks @decolua)

  • fix(sse): read Antigravity usage from the response.usageMetadata envelope. (#4785 — thanks @decolua)

  • fix(oauth): verify Cursor installation on Linux before auto-import. (#4770 — upstream, thanks Ibrahim Ryan)

  • fix(cli): fall back to the default data dir when DATA_DIR is not writable. (#4767 — upstream, thanks Thiên Toán)

  • fix(sse): json_schema fallback for OpenAI-compatible providers that don't support structured outputs. (#4766 — thanks @mustafabozkaya)

  • fix(cli): verify launchd registration and skip self-SIGTERM in macOS autostart. (#4765 — thanks @ntdung6868)

  • fix(sse): finalize the tool_calls finish_reason on early stream end in the OpenAI Responses translator. (#4764 — thanks @decolua)

  • fix(sse): gate Kiro image attachments behind a Claude-capability check. (#4763 — thanks @decolua)

  • fix(sse): track Ollama streaming usage from raw NDJSON chunks. (#4754 — thanks @fresent)

  • fix(sse): include low-level cause details in formatProviderError. (#4741 — thanks @decolua)

  • fix(executors): anthropic-compatible-* gateways now get a Bearer token alongside x-api-key. (#4729 — thanks @hodtien)

  • fix(translator): strip the x-anthropic-billing-header in the claude-to-openai path. (#4728 — thanks @weimaozhen)

  • fix(translator): preserve reasoning_effort for non-Copilot Responses clients. (#4688 — thanks @ryanngit / @diegosouzapw)

  • fix(codex): treat an OAuth 401 as an unrecoverable refresh failure (stop retrying a dead token). (#4686 — thanks @sacwooky / @diegosouzapw)

  • fix(translator): coerce tool descriptions to strings in OpenAI normalization. (#4675 — thanks @East-rayyy / @diegosouzapw)

  • fix(dashboard): stop double-masking an already-masked API key in the list view (E2E 3/9 regression). (#4671 — thanks @diegosouzapw)

  • fix(combo): flatten Anthropic tool messages + tool history to prevent an upstream 503. (#4648 — thanks @warelik / @diegosouzapw)

  • fix(providers): require a Default Model in the compatible-provider API-key setup flow. (#4641 — thanks @arden1601)
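
The case-insensitive anthropic-version guard from the list above comes down to checking header names without regard to casing before adding a default. A minimal sketch, assuming headers are held in a plain object; withDefaultHeader is a hypothetical name and the caller-supplied version value is only an example:

```typescript
// Hypothetical helper: add a default header only when no existing key
// matches the name case-insensitively, so a caller-supplied value wins.
function withDefaultHeader(
  headers: Record<string, string>,
  headerName: string,
  value: string,
): Record<string, string> {
  const wanted = headerName.toLowerCase();
  const present = Object.keys(headers).some((key) => key.toLowerCase() === wanted);
  return present ? { ...headers } : { ...headers, [headerName]: value };
}

// The caller already sent the header with different casing: nothing is added.
const callerHeaders = { "Anthropic-Version": "2023-01-01" };
console.log(withDefaultHeader(callerHeaders, "anthropic-version", "2023-06-01"));
// { "Anthropic-Version": "2023-01-01" }
```

A guard that compares keys exactly would add a second, lowercase header here, which is the duplicate the fix describes.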

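The response.output ordering fix at the top of this list can be illustrated in the same spirit: record items as they are emitted, let a later item at the same index supersede the earlier one, then sort by index. Everything named here (EmittedItem, toIndex, buildOutput) is hypothetical; the notes mention normalizeOutputIndex but not how it is written, so the coercion rules below are an assumption.

```typescript
interface EmittedItem {
  output_index: number | string; // upstreams may send the index as a string
  id: string;
}

// Assumed coercion rules: accept non-negative integers, reject everything else.
function toIndex(raw: number | string): number | null {
  if (typeof raw === "string" && raw.trim() === "") return null;
  const n = typeof raw === "number" ? raw : Number(raw.trim());
  return Number.isInteger(n) && n >= 0 ? n : null;
}

// Build the final output array from items in the order they were emitted.
function buildOutput(emitted: readonly EmittedItem[]): EmittedItem[] {
  const latest = new Map<number, EmittedItem>();
  for (const item of emitted) {
    const index = toIndex(item.output_index);
    if (index === null) continue; // skip items whose index cannot be coerced
    latest.set(index, item); // a later item at the same index replaces the earlier one
  }
  return Array.from(latest.entries())
    .sort((a, b) => a[0] - b[0])
    .map((entry) => entry[1]);
}

const ordered = buildOutput([
  { output_index: "1", id: "tool_call_old" },
  { output_index: 0, id: "message" },
  { output_index: 1, id: "tool_call_new" },
]);
console.log(ordered.map((item) => item.id)); // ["message", "tool_call_new"]
```
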
🔒 Security

  • fix(auth): only trust forwarding headers (X-Forwarded-For / X-Real-IP) from loopback TCP peers, so a non-loopback client can't spoof its origin to bypass local-only route guards. (#4689 — thanks @Jordannst / @diegosouzapw)

  • fix(sse): redact the API key from the AUTH debug log in the chat handler. (#4858 — thanks @sacwooky)

  • fix(oauth): classify /api/oauth/cursor/auto-import as a local-only route in the route guard, so the loopback-enforced process-spawning endpoint can't be reached through a tunneled/leaked JWT (Hard Rule #17). (#5070 — thanks @diegosouzapw)
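
The forwarding-header rule in the first Security item can be sketched as: resolve the client address from X-Forwarded-For / X-Real-IP only when the TCP peer itself is loopback. This is an illustration under assumptions, not the project's route guard; isLoopback and effectiveClientIp are hypothetical names, and header keys are assumed to be lowercased already (as Node's http module does).

```typescript
// True for IPv4 127.0.0.0/8, IPv6 ::1, and IPv4-mapped loopback (::ffff:127.x.x.x).
function isLoopback(address: string): boolean {
  const plain = address.startsWith("::ffff:") ? address.slice(7) : address;
  return plain === "::1" || /^127\.\d{1,3}\.\d{1,3}\.\d{1,3}$/.test(plain);
}

// Hypothetical resolver: forwarding headers are honoured only for loopback peers.
function effectiveClientIp(
  peerAddress: string,
  headers: Record<string, string | undefined>,
): string {
  if (!isLoopback(peerAddress)) return peerAddress; // ignore spoofable headers
  const forwarded = headers["x-forwarded-for"]?.split(",")[0]?.trim();
  return forwarded || headers["x-real-ip"]?.trim() || peerAddress;
}

// A remote peer claiming to be local is still treated as remote.
console.log(effectiveClientIp("203.0.113.9", { "x-forwarded-for": "127.0.0.1" }));
// "203.0.113.9"
```

With this shape, a local reverse proxy can still pass the real client address through, while a direct remote connection cannot present itself as 127.0.0.1.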

📝 Maintenance

  • chore(ci): harden the release flow — decouple the Quality Ratchet from coverage-shard flakes (if: !cancelled() + --allow-missing), add fast-path drift gates (check:complexity, check:cognitive-complexity, check:pack-policy, check:build-scope), and raise the default build heap to 8 GB. (#5054 — thanks @diegosouzapw)

  • docs(routing): sync the combo strategy docs for Fusion (17 strategies). (#5067 — thanks @diegosouzapw)

  • test(sse): golden-lock the provider.ts translate-path across all providers. (#4734 — thanks @diegosouzapw / @decolua)

  • docs(env): document HEADROOM_URL in .env.example + ENVIRONMENT.md. (thanks @diegosouzapw)

  • chore(quality): rebaseline the file-size ratchet across the rc17 PR-batch waves (leva2/leva3/leva4) to absorb cycle drift. (thanks @diegosouzapw)


What's Changed

Full Changelog: v3.8.36...v3.8.37
