[3.8.16] — 2026-06-08
✨ New Features
- feat(vision-bridge): auto-routing to the fastest available vision model — when a request carries image content and the selected model does not support vision, OmniRoute now transparently delegates to the best-match vision-capable model instead of returning an error. (#3377 — thanks @herjarsa)
- feat(web-session): web-session pool observability — new MCP tool `get_web_session_pool_health` and a health-matrix REST response (`GET /api/web-session-pool/health`) expose per-provider slot counts, lease ages, and error budgets so operators can diagnose pool exhaustion without digging through logs. (#3395 — thanks @oyi77)
- feat(web-session): adaptive keepalive threshold — the keepalive heartbeat interval now self-adjusts based on observed provider idle-disconnect behaviour instead of using a fixed constant, reducing both unnecessary pings and unexpected session drops. (#3397 — thanks @oyi77)
- feat(web-session): bulk credential import endpoint (`POST /api/web-session/import`) — import a JSON array of session credentials in one call; each entry is validated and inserted atomically, with per-entry success/failure reported in the response. (#3403 — thanks @oyi77)
- feat(api): REST API for session pool health (`GET /api/session-pool/health`) — a dashboard-facing endpoint that aggregates live slot usage, wait-queue depth, and error rates across all active session pools; wired to a new dashboard widget. (#3404 — thanks @oyi77)
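As a rough illustration of the bulk import call described above, the sketch below prepares a `POST /api/web-session/import` request and summarizes a per-entry result list. The entry fields (`provider`, `cookie`), the result shape (`index`, `ok`, `error`), and the helper names are assumptions made for this example; the release notes do not document the actual schema.

```typescript
// Hypothetical entry shape: the real credential schema is not given in these notes.
interface SessionCredential {
  provider: string;
  cookie: string;
}

// Hypothetical per-entry result, modelled on "per-entry success/failure".
interface ImportResult {
  index: number;
  ok: boolean;
  error?: string;
}

interface PreparedRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Build (but do not send) the bulk import request as a JSON array body.
function buildImportRequest(baseUrl: string, entries: SessionCredential[]): PreparedRequest {
  return {
    // Trim trailing slashes so the path is joined exactly once.
    url: baseUrl.replace(/\/+$/, "") + "/api/web-session/import",
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(entries),
  };
}

// Count successful entries and collect the failed ones for reporting.
function summarizeImport(results: ImportResult[]): { imported: number; failed: ImportResult[] } {
  const failed = results.filter((r) => !r.ok);
  return { imported: results.length - failed.length, failed };
}
```

The prepared request can be handed to any HTTP client; keeping the failed entries separate makes it easy to retry only those.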
🔧 Bug Fixes
- fix(sse): eliminate race window in `usageTokenBuffer` settings update — a concurrent save + stream-start could race to apply stale settings, causing token counts to roll back by up to 2 000 tokens after a restart; the update now uses an atomic read-modify-write on the shared settings ref. (#3405 — thanks @diegosouzapw)
- fix(context-cache): server-side context-cache pinning now correctly persists across restarts; proxy message content no longer leaks into the upstream prompt; and the `context_cache_protection` toggle is properly saved to the DB on change. (#3399 — thanks @k0valik)
- fix(providers): the provider settings page now refreshes its model list after a successful `sync-models` call — previously the stale list remained until a full page reload. (#3402 — thanks @0xtbug)
- fix(stream): empty-choices chunks (choices array present but empty, no `finish_reason`) are now silently dropped rather than emitted as a `retry:` SSE event — removes spurious retry lines from streaming responses for providers that emit heartbeat keep-alive chunks. (#3400 — thanks @0xtbug)
- fix(account-fallback): the connection cooldown deduplication state is now preserved across the fallback retry chain — previously a second concurrent failure on the same account could clear the dedupe flag set by the first, allowing the cooldown window to be extended twice. (#3381 — thanks @oyi77)
- fix(stream): false-positive textual tool-call marker truncation — `containsTextualToolCallMarker` now tracks how much of the accumulated streamed content has already been emitted, so it only withholds the unemitted tail rather than re-scanning from the start on every new chunk. (#3382 — thanks @Ardem2025)
- fix(sanitizer): `containsTextualToolCallContent()` now requires the complete `[Tool call: name]\nArguments:` header pattern instead of a bare `.includes("[Tool call:")` check — prevents the non-streaming response sanitizer from nulling out model responses that merely quote `[Tool call:]` in prose or code examples. (#3355 — thanks @diegosouzapw)
- fix(stream): the streaming textual tool-call guard now flushes any remaining buffered content as plain text when the stream ends, regardless of whether the buffer contains `"Arguments:"` — previously, a partial/incomplete tool-call header that arrived at end-of-stream was silently dropped. (#3355 — thanks @diegosouzapw)
- fix(executor): Mistral (and any provider in `PROVIDERS_REQUIRING_USER_LAST_MESSAGE`) no longer receives a trailing `assistant` message with plain text content — `stripTrailingAssistantForProvider` drops it on the upstream-send path, fixing the `400: Expected last role User or Tool … but got assistant` rejection. (#3396 — thanks @diegosouzapw)
- fix(mitm): `getMitmStatus()` in the build-time stub (Docker image) now returns a graceful `{ running: false }` status instead of throwing, so the Agent Bridge UI shows a clean "stopped" state rather than an error banner in containerised deployments. (#3390 — thanks @diegosouzapw)
- fix(env): corrected casing of `OMNIROUTE_TRACE` in `.env.example` and all related documentation files — was previously mixed-case in some places, causing the variable to be silently ignored on case-sensitive file systems. (#3393 — thanks @androw)
- fix(featureFlags): `PRICING_SYNC_ENABLED` description now clearly states that the feature requires the corresponding environment variable to be set — removes the ambiguity that led operators to enable it via the UI only and wonder why sync never ran. (#3394 — thanks @androw)
- fix(docker): `runner-web` image now copies `playwright` and `playwright-core` from the builder stage instead of using `npx` to fetch them at build time — eliminates the exit 127 failure on GitHub-hosted runners where the registry download is unreliable.
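To make the sanitizer change above concrete, here is a minimal sketch of the difference between a bare marker check and a check that requires the full header. The regular expression and both function names are assumptions written for this example, not the code from #3355; the only detail taken from the notes is that the complete `[Tool call: name]` line followed by `Arguments:` must be present.

```typescript
// Assumed header pattern: "[Tool call: <name>]", a line break, then "Arguments:".
// The name must contain at least one character other than "]" or a newline.
const TOOL_CALL_HEADER = /\[Tool call:\s*[^\]\n]+\]\r?\n\s*Arguments:/;

// Stricter check (hypothetical name): true only when the full header is present.
function hasCompleteToolCallHeader(text: string): boolean {
  return TOOL_CALL_HEADER.test(text);
}

// Bare marker check, shown for comparison: true for any mention of the marker.
function hasBareToolCallMarker(text: string): boolean {
  return text.includes("[Tool call:");
}

// Prose that merely quotes the marker.
const quoted = "Some models print [Tool call:] markers in plain text.";
// A textual tool call with the complete header.
const realCall = '[Tool call: get_weather]\nArguments: {"city": "Oslo"}';

console.log(hasBareToolCallMarker(quoted), hasCompleteToolCallHeader(quoted));
console.log(hasBareToolCallMarker(realCall), hasCompleteToolCallHeader(realCall));
```

Under these assumptions the quoted prose trips only the bare check, while the real call satisfies both, which is the false positive the fix is meant to remove.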
📝 Maintenance
- ci(docker): the CI pipeline now builds and publishes the `-web` image variant in the same Docker publish workflow, so both the standard and browser-backed images stay in sync on every release. (#3389 — thanks @zhiru)
- ci(e2e): E2E shard suite hardened — timeout raised to 45 min for the heaviest shard; build artifact now uses an explicit `tar` bundle to avoid `upload-artifact@v4` LCA path ambiguity; `node_modules` copied into standalone after download; browser cache added to cut cold-shard time; `sync-models` endpoint mocked in `providers-management.spec.ts` so the import modal reaches "done" immediately. (#3387 / #3392 — thanks @diegosouzapw)
- docs: Codex CLI configuration guide added to the dashboard (`/dashboard/codex-config`) — covers profile naming, model selection, and the `CODEX_*` environment variables accepted by OmniRoute. (thanks @diegosouzapw)
- chore(agentSkills): catalog expanded to 43 entries — `config-codex-cli` added as a new `CONFIG_SKILL_IDS` category; all skill-count assertions updated across unit and integration test suites; `next-fetch` opts cast to satisfy the TypeScript overload signature in the skill runner. (thanks @diegosouzapw)
🙌 Contributors
Thanks to everyone whose work landed in v3.8.16:
| Contributor | Contribution |
|---|---|
| @herjarsa | Vision-bridge auto-routing to fastest vision model (#3377) |
| @oyi77 | Web-session pool observability (#3395), adaptive keepalive (#3397), bulk credential import (#3403), session pool REST API (#3404), cooldown dedupe fix (#3381) |
| @Ardem2025 | Stream false-positive tool-call marker truncation fix (#3382) |
| @zhiru | Docker -web image variant CI (#3389) |
| @androw | OMNIROUTE_TRACE casing fix (#3393), PRICING_SYNC_ENABLED clarification (#3394) |
| @k0valik | Context-cache pinning + proxy message leak fix (#3399) |
| @0xtbug | Empty-choices chunk drop (#3400), model list refresh after sync (#3402) |
| @diegosouzapw | Release engineering + usageTokenBuffer race fix (#3405), sanitizer+stream hardening (#3355/#3410), Mistral trailing-assistant fix (#3396/#3409), mitm Docker stub (#3390/#3408), E2E shard stabilization (#3387/#3392), Docker -web build fix, and direct release-branch commits |
What's Changed
- fix(ci): stop E2E shard 5/6 being cancelled mid-run (timeout headroom) by @diegosouzapw in #3387
- fix(ci): E2E shard headroom (50m) + live line reporter for diagnosis by @diegosouzapw in #3392
- fix(env): correct casing of OMNIROUTE_TRACE in .env.example and related files by @androw in #3393
- fix(featureFlags): update description for PRICING_SYNC_ENABLED to clarify environment variable requirement by @androw in #3394
- fix(account-fallback): preserve provider cooldown dedupe state by @oyi77 in #3381
- ci(docker): also build & publish the -web image variant by @zhiru in #3389
- fix(stream): solve false-positive textual tool-call marker truncation by @Ardem2025 in #3382
- fix(stream): drop empty choices chunks instead of emitting retry text by @0xtbug in #3400
- feat: adaptive keepalive threshold for web-session providers by @oyi77 in #3397
- feat: add web-session pool observability (MCP tool + health-matrix) by @oyi77 in #3395
- fix(providers): refresh model list after provider sync by @0xtbug in #3402
- feat(vision-bridge): auto-route to fastest vision model by @herjarsa in #3377
- fix: server-side context cache pinning, stop proxy message leaks, persist context_cache_protection toggle by @k0valik in #3399
- fix(sse): eliminate usageTokenBuffer race — +2000 tokens after settings save/restart by @diegosouzapw in #3405
- feat: add bulk web-session credential import endpoint by @oyi77 in #3403
- feat: add REST API for session pool health (dashboard interface) by @oyi77 in #3404
- fix(mitm): getMitmStatus stub returns graceful status in Docker (#3390) by @diegosouzapw in #3408
- fix(executor): strip trailing assistant text for Mistral (user-last required) (#3396) by @diegosouzapw in #3409
- fix(sanitizer+stream): tighten textual tool-call detection, flush partial buffer (#3355) by @diegosouzapw in #3410
- Release v3.8.16 by @diegosouzapw in #3385
Full Changelog: v3.8.15...v3.8.16