github diegosouzapw/OmniRoute v3.8.10

4 hours ago

OAuth resilience & observability release. It adds spaced, sequential quota sync for OAuth accounts; a per-provider proactive-refresh skip list that keeps short-TTL providers (Kimi) alive without re-exposing the Codex Auth0 cascade; token-expiry visibility on the provider cards; and a new provider-stats dashboard. It also ships a wide batch of provider fixes (DeepSeek-web tool calls, Antigravity, Qoder, MiniMax, GitHub Copilot, Fireworks, llama.cpp, t3.chat-web, Kiro, Kilocode) and Podman deployment support.

✨ New Features

  • dashboard: new Provider Stats page + /api/provider-stats endpoint — per-provider and per-model aggregates from call_logs plus live combo/telemetry/tool-latency overlays. (#3175 — thanks @pizzav-xyz / @diegosouzapw)
  • metrics: cross-request TTFT and gap-after-tool-call latency tracking, aggregated per provider. (#3173 — thanks @pizzav-xyz / @diegosouzapw)
  • quota: show the OAuth token expiry on provider cards (small, blue, informative — "Token expires in …" / "Token expired"). (#3178 — thanks @diegosouzapw)
  • responses: strip previous_response_id for stateless Responses upstreams, with an auto/strip/preserve setting + UI so stateless clients (e.g. VS Code Custom Endpoint) keep context. (#3143 — thanks @JxnLexn)
  • deploy: Podman/rootless deployment support (contrib units + CONTAINER_HOST hint) and larger upload body-size limits for /v1/files. (#3128 — thanks @hartmark)
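
To make the previous_response_id setting from #3143 concrete, here is a minimal sketch of how an auto/strip/preserve mode could be applied to a Responses request body. It is not OmniRoute's actual code: the type names, the function name, and the `upstreamIsStateless` flag are hypothetical, and it assumes that "auto" means "strip only when the upstream is stateless".

```typescript
// Hypothetical names throughout; a sketch, not OmniRoute's implementation.
type PreviousResponseIdMode = "auto" | "strip" | "preserve";

interface ResponsesRequest {
  model: string;
  input: unknown;
  previous_response_id?: string;
  [key: string]: unknown;
}

// `upstreamIsStateless` is an assumed flag: true when the upstream does not
// store responses and therefore cannot resolve a previous_response_id.
function applyPreviousResponseIdMode(
  body: ResponsesRequest,
  mode: PreviousResponseIdMode,
  upstreamIsStateless: boolean,
): ResponsesRequest {
  // Assumption: "auto" strips only for stateless upstreams.
  const shouldStrip =
    mode === "strip" || (mode === "auto" && upstreamIsStateless);
  if (!shouldStrip || body.previous_response_id === undefined) {
    return body;
  }
  // Work on a shallow copy so the caller's object is left untouched.
  const copy: ResponsesRequest = { ...body };
  delete copy.previous_response_id;
  return copy;
}
```

With "preserve" the body is always forwarded unchanged, while "strip" removes the field even for upstreams that could resolve it.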

🔧 Bug Fixes

  • usage: sequential + spaced OAuth quota sync (PROVIDER_LIMITS_SYNC_SPACING_MS) so a host no longer bursts simultaneous usage/refresh requests; reactive forced re-mint after a 401 on the per-card refresh (recovers imported accounts); a genuine 401 now surfaces a re-authenticate hint. (#3156 — thanks @diegosouzapw)
  • healthcheck: per-provider proactive-refresh skip list (OMNIROUTE_HEALTHCHECK_SKIP_PROVIDERS) — keep rotating-cascade providers (Codex/OpenAI) reactive-only while short-TTL providers (Kimi-coding) keep refreshing proactively. (#3159 — thanks @diegosouzapw)
  • providers: on ?refresh=true with no remote models, don't resurface the just-cleared synced cache into the local-catalog fallback. (#3181 — thanks @diegosouzapw)
  • providers: use synced models as the authoritative local catalog across all providers (even on connections that didn't run the sync). (#3148 — thanks @herjarsa)
  • web-tools: parse bare-JSON tool calls for DeepSeek-web with fuzzy tool-name matching scoped to the requested tools. (#3157 — thanks @wilsonicdev)
  • responses: normalize image_url parts across every Responses input path (message content, replayed output items, function_call_output) to avoid upstream 400s. (#3150 — thanks @wilsonicdev)
  • antigravity: dynamic upstream model resolution via the MITM alias table (server-only executor), with a guard against corrupted alias values. (#3144 — thanks @herjarsa)
  • qoder: bifurcate validation by token type — PAT (pt-) → Cosy, regular API key → dashscope — matching the executor's routing. (#3149 — thanks @herjarsa)
  • api-manager: preserve API key expiration in local time (the datetime-local input no longer silently shifts to UTC) + a clear button. (#3146 — thanks @xz-dev)
  • opencode-plugin: map caps.thinking → ModelV2.capabilities.interleaved for single models and combos. (#3138 — thanks @mrmm)
  • kiro: optional targetProvider on the social-OAuth exchange so Kiro-based providers can reuse the social login flow. (#3176 — thanks @pizzav-xyz)
  • misc: broaden the DeepSeek reasoning-replay regex (-free / zen/deepseek-v4), export ProviderProfile, and guard a non-string directory entry in the binary manager. (#3177 — thanks @pizzav-xyz)
  • providerRegistry: point kilocode at the OpenAI format + default executor (matching its sibling kilo-gateway). (#3166 — thanks @androw)
  • fireworks: preserve fully-qualified router/model IDs so Fire Pass router IDs (accounts/fireworks/routers/...) are no longer double-prefixed into an upstream 404. (#3133 — thanks @KooshaPari)
  • llama-cpp: route requests to the configured local baseUrl instead of OpenAI's API (which returned an OpenAI-worded 401). (#3136 — thanks @tjengbudi)
  • t3-chat-web: parse cookies + convexSessionId from the single stored credential so t3.chat web connections work (the executor previously read fields the credential pipeline never produced). (#3007 — thanks @minhtran162)
  • minimax: stop capping MiniMax-M3 / MiniMax-M2.7 max_tokens at the 8192 default — add the M3 model spec (512K output) and make model-spec lookups case-insensitive. (#3141 — thanks @totaltube)
  • github-copilot: discover the model catalog live from api.githubcopilot.com/models so Import Models refreshes and only entitled models are listed (with fallback to the static catalog). (#3120, #3121 — thanks @gabrielmoreira)
  • combo: invalidate the nested-combo cache on combo edits so removed targets/models stop being served within the 10s window; log the resolved DATA_DIR at startup to diagnose multi-replica volume mismatches. (#3147 — thanks @ViFigueiredo)
  • providers: resolve web-provider alias collisions. (thanks @diegosouzapw)
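
The two settings named in the usage and healthcheck fixes above (PROVIDER_LIMITS_SYNC_SPACING_MS and OMNIROUTE_HEALTHCHECK_SKIP_PROVIDERS) can be illustrated with a rough sketch. Everything here is an assumption for illustration: the helper names are hypothetical, the skip list is assumed to be a comma-separated list of provider ids, and the fallback spacing is supplied by the caller because the release notes state no default.

```typescript
// Hypothetical helpers; a sketch of the idea, not OmniRoute's code.

// Assumed format: comma-separated provider ids, e.g. "codex,openai".
function parseSkipProviders(raw: string | undefined): Set<string> {
  return new Set(
    (raw ?? "")
      .split(",")
      .map((id) => id.trim().toLowerCase())
      .filter((id) => id.length > 0),
  );
}

// Providers on the skip list stay reactive-only; all others keep
// refreshing proactively.
function shouldRefreshProactively(provider: string, skip: Set<string>): boolean {
  return !skip.has(provider.toLowerCase());
}

// Falls back to `fallbackMs` when the value is unset, empty, or not a
// non-negative number.
function parseSpacingMs(raw: string | undefined, fallbackMs: number): number {
  if (raw === undefined || raw.trim() === "") return fallbackMs;
  const parsed = Number(raw);
  return Number.isFinite(parsed) && parsed >= 0 ? parsed : fallbackMs;
}

// Syncs one account at a time, pausing `spacingMs` between accounts so the
// host never fires simultaneous usage/refresh requests.
async function syncSequentially(
  accountIds: string[],
  syncOne: (accountId: string) => Promise<void>,
  spacingMs: number,
): Promise<void> {
  let first = true;
  for (const accountId of accountIds) {
    if (!first) {
      await new Promise<void>((resolve) => setTimeout(resolve, spacingMs));
    }
    first = false;
    await syncOne(accountId);
  }
}
```

In a Node process the raw strings would typically come from the environment variables of the same names. They are passed in as plain parameters here so the helpers stay easy to test.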

📝 Maintenance

  • deps: bump hono from 4.12.18 to 4.12.23. (#3179 — thanks @dependabot)
  • ci(electron): make the macOS-arm64 smoke step best-effort (headless GPU crash). (#3137 — thanks @diegosouzapw)
  • chore(release): open the v3.8.10 development cycle. (thanks @diegosouzapw)

What's Changed

Full Changelog: v3.8.9...v3.8.10
