✨ New Features
- feat(combo): nested combo-ref execution (`nestedComboMode: execute`) — selection strategies can now treat a combo-reference step as a black box, executing the referenced combo as a single unit instead of flattening its targets. (#4537 — thanks @adivekar-utexas)
- feat(combo): sticky weighted selection limit with exhaustion-aware renormalization — weighted strategies gain a configurable sticky-selection limit; once a target is exhausted, remaining weights renormalize so traffic is redistributed correctly. (#4489 — thanks @adivekar-utexas)
- feat(combos): provider-wildcard expansion in combo steps — a combo step may now reference a whole provider via wildcard and have it expand to that provider's models at resolution time. (#4545 — thanks @Rahulsharma0810)
- feat(compression): Phase 2 — named profiles + active selector — the compression settings panel becomes the single source of truth via a single active-profile selector (Default panel vs a named combo) wired into the runtime. (#4521 — thanks @diegosouzapw)
- feat(sse): route `web_search` requests to a configured model — CCR-style webSearch scenario: requests carrying a `web_search*` tool can be routed to a dedicated `webSearchRouteModel`, configurable from the Routing tab. (#4509 — thanks @shafqatevo / @diegosouzapw)
- feat(mcp): `omniroute_web_fetch` tool for URL content extraction — new MCP tool that fetches and extracts the content of a URL. (#4510 — thanks @ponkcore)
- feat(models): qualify duplicate model names with their provider prefix — when two providers expose a same-named model, the catalog now disambiguates each with its provider prefix. (#4516 — thanks @Rahulsharma0810)
- feat(translator): accept OpenAI audio input parts in Gemini translation — `input_audio` message parts are now translated through to Gemini. (#4434 — thanks @diegosouzapw)
- feat(webhooks): enrich Telegram request notifications — Telegram webhook payloads carry richer request context. (#4524 — thanks @mppata-glitch)
- feat(bazaarlink): add `authHint` to the existing APIKEY_PROVIDERS entry — surfaces the auth hint for the bazaarlink provider. (#4522 — thanks @adivekar-utexas)
- feat(usage): API-key USD quota percent + reset hints, weekly-window cutoff — usage dashboard surfaces API-key USD quota percentage and reset hints, honoring the weekly window cutoff. (#4398 — thanks @Witroch4)
- feat(usage): surface Codex code-review weekly window + `additional_rate_limits` fallback — exposes the Codex code-review weekly window and falls back to `additional_rate_limits` when present. (#4494 — thanks @diegosouzapw)
- feat(dashboard): per-provider dropdown filter on the quota dashboard — filter the quota dashboard by provider. (#4495 — thanks @diegosouzapw)
- feat(dashboard): inline show/hide toggle for API keys on the API Manager page (#4505 — thanks @diegosouzapw)
- feat(dashboard): toggle-style model deselection in the combo builder modal (#4498 — thanks @diegosouzapw)
- feat(dashboard): Done button in the model picker for combo creation (#4496 — thanks @diegosouzapw)
- feat(providers): expose `gpt-4o` on the built-in GitHub Copilot (gh) provider (#4487 — thanks @diegosouzapw)
- feat(pricing): default pricing for the Qwen coder-model on the `qw` provider (#4488 — thanks @diegosouzapw)
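
The exhaustion-aware renormalization mentioned for weighted strategies (#4489) can be pictured with a minimal sketch. The `WeightedTarget` shape and the `renormalize` function are hypothetical names chosen for illustration, not the project's actual types, and the sticky-selection limit is left out because these notes do not describe its semantics.

```typescript
// Hypothetical shape; the real combo target type is not shown in these notes.
interface WeightedTarget {
  id: string;
  weight: number;
  exhausted: boolean;
}

// Drop exhausted (or zero-weight) targets and rescale the remaining weights
// so the surviving shares sum to 1. Returns an empty map if nothing is left.
function renormalize(targets: WeightedTarget[]): Map<string, number> {
  const live = targets.filter((t) => !t.exhausted && t.weight > 0);
  const total = live.reduce((sum, t) => sum + t.weight, 0);
  const shares = new Map<string, number>();
  for (const t of live) {
    shares.set(t.id, t.weight / total);
  }
  return shares;
}
```

With weights 50/30/20 and the 30-weight target exhausted, the remaining two targets would receive 50/70 and 20/70 of the traffic under this sketch.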
🔧 Bug Fixes
- fix(api): resolve a compatible provider node by base type, not only exact id — connection→node resolution now matches on the bare derived node type when the exact id isn't found and the match is unambiguous (ambiguous → 404), via a pure `providerNodeSelect` helper. (#4576 — thanks @aleksesipenko / @diegosouzapw)
- fix(cli): supervisor restarts on spontaneous exit-0 (OOM cgroup) + waits for port before respawn — a child that exits 0 because the cgroup OOM-killer reaped it is now restarted (not treated as a clean shutdown), the restart reset window widened 30s→60s, and the supervisor waits for the port to be free before respawning. (#4578 — thanks @oyi77 / @diegosouzapw)
- fix(combo): attribute lockout decay & success telemetry to the dynamically-selected connection — on the combo success path the actual connection chosen by dynamic account-selection is read from the `X-OmniRoute-Selected-Connection-Id` response header (instead of the often-empty static `target.connectionId`), so model-lockout decay, `recordProviderSuccess`, LKGP and success/failure telemetry attribute to the right connection on both the priority and round-robin paths. The pre-screen "unavailable" snapshot is also no longer a permanent skip — availability is re-checked on each retry since connection cooldowns can expire mid-request. (#4550 — thanks @Chewji9875)
- fix(auto): enforce the quota cutoff before scoring (opt-in) — auto-routing now evaluates a hard quota cutoff in `buildAutoCandidates` to drop low-quota candidates before scoring, with a 429 guard when all candidates fall below cutoff. The cutoff is opt-in behind `QuotaPreflightSettings.enabled` (default OFF via `QUOTA_PREFLIGHT_CUTOFF_ENABLED`), so default behavior is unchanged. (#4483 — thanks @megamen32)
- fix(antigravity): reasoning/thinking models no longer 400 with `oneOf at '/' not met` — the Cloud Code envelope passthrough also leaked the Claude/OpenAI-native thinking fields (`thinking`, `reasoning_effort`, `reasoning`, `enable_thinking`, `thinking_budget`) the unified thinking adapter sets at the body root; Google rejected them with `400 Bad input: oneOf at '/' not met`. The whole thinking family is now stripped before the envelope is built; Gemini's own `generationConfig.thinkingConfig` is unaffected. (#4485 — port from 9router#1926, thanks @theseven99 / @diegosouzapw)
- fix(integration): restore the codex and memory pipeline contracts — realigns the CLI fingerprint + memory-tools contracts so the codex and memory pipelines pass their integration checks again. (#4474 — thanks @KooshaPari)
- fix(sse): RTK must preserve `cache_control`-marked `tool_result` blocks — reasoning-token-keeping no longer drops tool_result blocks that carry a `cache_control` marker. (#4560 — thanks @diegosouzapw)
- fix(auto-combo): respect model visibility (`isHidden`) in the auto-combo candidate pool — hidden models are excluded from auto-combo candidates. (#4558 — thanks @herjarsa)
- fix(dashboard): avoid overlapping provider health polls — guards against concurrent provider-health poll cycles overlapping. (#4557 — thanks @KooshaPari)
- fix(dashboard): make the API Manager key table usable on mobile (#4556 — thanks @janeza2)
- fix(executors): decode Composer/Cursor `</think>`-marked visible output — visible text wrapped in Cursor Composer's `</think>` markers is now decoded correctly. (#4554 — thanks @diegosouzapw)
- fix(oauth): improve Cursor auto-import reliability on macOS (#4552 — thanks @diegosouzapw)
- fix(providers/test): probe the real Codex `/responses` endpoint — connection test hits the actual Codex `/responses` endpoint. (#4551 — thanks @diegosouzapw)
- fix(mcp): `webFetchInput` emits `URL is required` for a missing url — clearer validation error for the web-fetch tool. (#4541 — thanks @ponkcore / @diegosouzapw)
- fix(compression): allow `enginesExplicit` through the PUT validation schema — the compression settings PUT no longer rejects the `enginesExplicit` flag. (#4532 — thanks @DevEstacion)
- fix(no-think): normalize provider prefix to canonical in no-think variants (#4531 — thanks @Rahulsharma0810)
- fix(combo): pass `maxCooldownMs` from settings to the `recordModelLockoutFailure` call sites (#4530 — thanks @Chewji9875)
- fix(combo): allow fallback on context-overflow & param-validation 400s; preserve upstream codes — combo fallback now triggers on recoverable 400s while keeping the original upstream status. (#4519 — thanks @adivekar-utexas)
- fix(command-code): cap `max_tokens` per model using the registry `maxOutputTokens` (#4518 — thanks @adivekar-utexas)
- fix(mitm): gate sudo prompts on server platform, not browser UA (#4514 — thanks @diegosouzapw)
- fix(mitm): graceful sudo degradation in slim Docker / non-root containers (#4513 — thanks @diegosouzapw)
- fix(usage): clear auth-expired message for Kiro social-auth accounts (#4512 — thanks @diegosouzapw)
- fix(pricing): default cost rows for Antigravity Gemini 3.5 Flash tiers + `gemini-pro-agent` (#4508 — thanks @diegosouzapw)
- fix(api): dedupe exact-duplicate ids in `/v1/models` — low-noise model output without alias/canonical duplicates. (#4506 — thanks @Rahulsharma0810 / @diegosouzapw)
- fix(dashboard): enable Codex Apply/Reset buttons when the CLI is installed (#4504 — thanks @diegosouzapw)
- fix(dashboard): show API-Key-compatible providers in the Antigravity CLI Tools model picker (#4503 — thanks @diegosouzapw)
- fix(dashboard): migrate ManualConfigModal copy to the shared `useCopyToClipboard` hook (#4502 — thanks @diegosouzapw)
- fix(sse): skip disabled providers in combo fallback (#4500 — thanks @diegosouzapw)
- fix(usage): parse numeric-string quota reset timestamps as Unix sec/ms (#4493 — thanks @diegosouzapw)
- fix(db): scheduled VACUUM + persist `lastVacuumAt` — a new `vacuumScheduler.ts` persists the last run timestamp and last error to the `key_value` table (migration 102) and feeds the database settings panel; wired into the Next.js lifecycle (default 24h, window 02:00–04:00 local). New env flags: `OMNIROUTE_VACUUM_ENABLED`, `OMNIROUTE_VACUUM_INTERVAL_HOURS`, `OMNIROUTE_VACUUM_WINDOW`. (#4480 — thanks @KooshaPari / @oyi77)
- perf(quota): stop writing redundant `quota_snapshots` rows from idle connections — the 60s background refresh persisted a snapshot for every window of every connection regardless of change, generating 400K+ rows/day from idle accounts. `setQuotaCache` now skips the write when a window's `remaining_percentage`/`is_exhausted` is unchanged from the last cached observation; the first observation and every real change still persist. (#4565, #4438 — thanks @oyi77)
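
The write-skipping rule described for `quota_snapshots` (#4565) amounts to a change-detection check against the last cached observation. The following is a rough sketch of that idea only; `QuotaWindow`, `lastSeen`, and `shouldPersist` are hypothetical names, and the real `setQuotaCache` may be structured quite differently.

```typescript
// Hypothetical window shape using the two field names from the notes above.
interface QuotaWindow {
  remaining_percentage: number;
  is_exhausted: boolean;
}

// Last cached observation per (connection, window) key; key format is assumed.
const lastSeen = new Map<string, QuotaWindow>();

// Returns true when a snapshot row should be written: on the first
// observation for a key, or when either tracked field has changed.
function shouldPersist(key: string, next: QuotaWindow): boolean {
  const prev = lastSeen.get(key);
  lastSeen.set(key, { ...next });
  if (prev === undefined) {
    return true;
  }
  return (
    prev.remaining_percentage !== next.remaining_percentage ||
    prev.is_exhausted !== next.is_exhausted
  );
}
```

Under this sketch an idle connection reporting the same values every 60s would produce one row on first sight and none afterwards, until a value actually changes.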
🔒 Security
- fix(sse): crypto-secure RNG for combo/deck load-balancing selection — replaces `Math.random()` with a crypto-secure source in the combo/deck weighted-selection path. (#4455 — thanks @diegosouzapw)
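
As a minimal sketch of what weighted selection without `Math.random()` can look like, the example below uses Node's `crypto.randomInt`. The notes do not say which crypto-secure source the project adopted, so this choice is an assumption, as are the `SelectableTarget` and `pickWeighted` names and the restriction to non-negative integer weights.

```typescript
import { randomInt } from "node:crypto";

// Hypothetical target shape; weights are assumed to be non-negative integers.
interface SelectableTarget {
  id: string;
  weight: number;
}

// Pick one target with probability proportional to its weight, drawing the
// roll from a cryptographically secure source instead of Math.random().
function pickWeighted(targets: SelectableTarget[]): string {
  const total = targets.reduce((sum, t) => sum + t.weight, 0);
  if (total <= 0) {
    throw new Error("no selectable targets");
  }
  let roll = randomInt(total); // uniform integer in [0, total)
  for (const t of targets) {
    if (roll < t.weight) {
      return t.id;
    }
    roll -= t.weight;
  }
  throw new Error("unreachable: roll exceeded total weight");
}
```

Zero-weight targets are never chosen in this sketch, because the roll can never be strictly less than zero.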
📝 Maintenance
- perf(dashboard): shrink provider assets + fix the usage rollup cutoff — recompresses oversized provider images (nanobot/picoclaw/zeroclaw) and adds a `check:provider-assets` gate, plus a usage-analytics rollup cutoff fix. (#4464 — thanks @KooshaPari)
- refactor(chatCore): extract pure leaves from `chatCore.ts` — incremental decomposition of the chat-core handler into pure, individually-testable leaves (system-role extraction, upstream-header build, failure usage-record builder, key-health, request-format, claude-effort, target-format, Background-Task-Redirect decision, Codex quota-state persistence). (#4548, #4547, #4544, #4538, #4526, #4492 — #3501, thanks @diegosouzapw)
- chore(i18n): remove unused config helpers (#4482 — thanks @KooshaPari)
- chore(quality): reconcile quality baselines (complexity, cognitive-complexity, file-size) across the cycle (#4579, #4570, #4543, #4542, #4535, #4534, #4529, #4528, #4523 — thanks @diegosouzapw)
What's Changed
- Release v3.8.33 by @diegosouzapw in #4515
Full Changelog: v3.8.32...v3.8.33