What's Changed
- (Bug) Fix reasoning response ID by @Sameerlite in #15265
- Fix Gemini CLI by actually streaming the response by @Sameerlite in #15264
- Add gpt-realtime-mini support by @Sameerlite in #15283
- [Feat] Proxy CLI - don't store existing key in the URL, store it in the state param by @ishaan-jaff in #15290
- Fix: Make PATCH /model/{model_id}/update handle team_id consistently with POST /model/new by @ishaan-jaff in #15297
- [Fix] Networking: remove limitations by @AlexsanderHamir in #15302
- [MCP Gateway] Litellm mcp fixes team control by @ishaan-jaff in #15304
- [MCP Gateway] QA/Fixes - Ensure Team/Key level enforcement works for MCPs by @ishaan-jaff in #15305
- fix: model + endpoints page crash when config file contains router_settings.model_group_alias by @ARajan1084 in #15308
- Upgrade tenacity version to 8.5.0 by @ARajan1084 in #15303
- [QA/Fixes] - Dynamic Rate Limiter v3 - final QA by @ishaan-jaff in #15311
- Add Cohere Embed v4 support for AWS Bedrock by @timelfrink in #15298
- fix(bedrock): include cacheWriteInputTokens in prompt_tokens calculation by @timelfrink in #15292
- Fix issue with parsing assistant messages by @Sameerlite in #15320
- feat(files): add @client decorator to file operations by @FelipeRodriguesGare in #15339
- [Fix] Watsonx - Apply correct prompt templates for openai/gpt-oss model family by @ishaan-jaff in #15341
- Potentially fixes a UI spasm issue with an expired cookie by @ARajan1084 in #15309
- Add gpt-5-pro-2025-10-06 to model costs by @sandeshghanta in #15344
- Fix - (openrouter): move cache_control to content blocks for claude/gemini by @ishaan-jaff in #15345
- [Fix] x-litellm-cache-key header not being returned on cache hit by @ishaan-jaff in #15348
- Add native Responses API support for litellm_proxy provider by @Copilot in #15347
- AzureAD Default credentials - select credential type based on environment by @krrishdholakia in #14470
- SSO - support EntraID app roles by @krrishdholakia in #15351
- MCP - support converting OpenAPI specs to MCP servers by @krrishdholakia in #15343
- LiteLLM UI Refactor Infrastructure by @ARajan1084 in #15236
- MCP - specify allowed params per tool by @krrishdholakia in #15346
Full Changelog: v1.77.7.dev.3...v1.77.7.dev9
Docker Run LiteLLM Proxy
```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.77.7.dev9
```
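Once the container is up, the proxy exposes an OpenAI-compatible `/chat/completions` endpoint on the mapped port. A minimal sketch using only the Python standard library, assuming the proxy listens on `localhost:4000`; the API key `sk-1234` and the model name are placeholders to be replaced with values from your own deployment:

```python
import json
import urllib.request

# Placeholder values -- substitute your own proxy URL and virtual key.
PROXY_URL = "http://localhost:4000"
API_KEY = "sk-1234"

def build_chat_request(model, messages):
    """Build an OpenAI-compatible /chat/completions request for the proxy."""
    url = f"{PROXY_URL}/chat/completions"
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {API_KEY}",
    }
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=headers, method="POST")

req = build_chat_request("gpt-4o", [{"role": "user", "content": "hi"}])
# urllib.request.urlopen(req)  # uncomment once the container is running
print(req.full_url)
```

Because the proxy speaks the OpenAI wire format, the official `openai` SDK also works by pointing its `base_url` at the proxy instead of hand-rolling requests like this.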
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|---|---|---|---|---|---|---|---|---|---|
| /chat/completions | Failed ❌ | 60 | 77.85 | 6.52 | 6.52 | 1950 | 1950 | 40.86 | 2919.70 |
| Aggregated | Failed ❌ | 60 | 77.85 | 6.52 | 6.52 | 1950 | 1950 | 40.86 | 2919.70 |