What's Changed
- vLLM: transcription endpoint support; ollama_chat/: images, thinking, and content-as-list handling (sketch after this list) by @krrishdholakia in #14523
- [Fix] Org budget was not updating by @ishaan-jaff in #14541
- Docs update on user header mapping by @boopesh07 in #14527
- LiteLLM 1.77.2 stable release notes by @ishaan-jaff in #14544
- LiteLLM UI QA 09/13/2025 (part 1): fix end-user filtering, fix MCP tool call load error, and prevent setting max user budget on scroll in edit user settings by @krrishdholakia in #14545
- The 'Last 24 hours' button now shows above the end-user dropdown on the Logs page by @NANDINI-star in #14546
- fix: DD (Datadog) tool calls passed in metadata by @mubashir1osmani in #14531
- Added user_email labels to Prometheus monitoring by @boopesh07 in #14520
- feat: add tool-permission guardrail by @uc4w6c in #14519
- Add SambaNova DeepSeek V3.1 and gpt-oss-120b models by @luisfucros in #14500
- fix: completion chat id by @hanakannzashi in #14548
- feat: Add OVHCloud AI Endpoints as a provider (sketch after this list) by @eliasto in #14494
- Resolve cache key collision issue where all soft budget alerts use identical cache keys by @Rasmusafj in #14491
- Fix unsupported stop param for grok-code models by @Sameerlite in #14565
- [Feat] Add cancel endpoint support for OpenAI and Azure by @Sameerlite in #14561
- Fix: Bedrock cross-region inference profile cost calculation by @timelfrink in #14566
- [Bug Fix] SCIM v2 - ensure group PUSH and PUT ops allow creating non-existent members by @ishaan-jaff in #14581
- fix: s3_endpoint_url returned 404 by @mubashir1osmani in #14559
- Fix: provider-aware filtering for the Vertex AI Gemini labels field by @timelfrink in #14563
- Add AWS external ID parameter support for Bedrock authentication (sketch after this list) by @timelfrink in #14582
- [Fix] /responses API: add cancel endpoint (sketch after this list) and allow non-admins to use it as an LLM API endpoint by @ishaan-jaff in #14594
- Fix: handle empty arguments in Bedrock tool call invocation by @pazevedo-hyland in #14583
- fix(proxy): Correctly parse multi-part MCP server aliases from URL paths by @iabhi4 in #14558
- fix: recompute filters after deleting an MCP Server by @uc4w6c in #14542
- Add CompactifAI provider support by @timelfrink in #14532
- feat(proxy): Assign default budget to auto-generated JWT teams by @iabhi4 in #14514
- Fix Volcengine thinking parameters missing when thinking is set to disabled by @LingXuanYin in #14569
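For the new ollama_chat content-as-list handling, here is a minimal sketch of a message whose content is a list of parts (text plus image), in the OpenAI message format LiteLLM accepts; the model name and base64 payload are placeholders:

```python
# Hedged sketch: ollama_chat with message content passed as a list of parts.
# "ollama_chat/llava" and the base64 payload are placeholders.
import litellm

response = litellm.completion(
    model="ollama_chat/llava",  # hypothetical local vision model
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What is in this image?"},
            {"type": "image_url", "image_url": {"url": "data:image/png;base64,..."}},
        ],
    }],
)
print(response.choices[0].message.content)
```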
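For the new OVHCloud AI Endpoints provider, a hedged sketch of a completion call; the "ovhcloud/" prefix, model slug, and environment variable name are assumptions drawn from the PR title, not confirmed API:

```python
# Hedged sketch: calling the new OVHCloud AI Endpoints provider.
# Provider prefix, model slug, and env var name are assumptions.
import os
import litellm

os.environ["OVHCLOUD_API_KEY"] = "..."  # placeholder credential

response = litellm.completion(
    model="ovhcloud/Meta-Llama-3_1-70B-Instruct",  # hypothetical model slug
    messages=[{"role": "user", "content": "Hello from OVHCloud!"}],
)
print(response.choices[0].message.content)
```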
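For the /responses cancel endpoint, a minimal sketch using the OpenAI SDK pointed at a local proxy; the proxy URL, virtual key, and model are placeholders, and this assumes the proxy forwards the standard OpenAI-compatible cancel route:

```python
# Minimal sketch: cancel an in-flight background response through the proxy.
# Proxy URL, virtual key, and model name are placeholders.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:4000", api_key="sk-1234")

resp = client.responses.create(
    model="gpt-4o",       # any model routed by the proxy
    input="Write a long report.",
    background=True,      # background responses can be cancelled
)
cancelled = client.responses.cancel(resp.id)
print(cancelled.status)   # expected: "cancelled"
```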
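For the new AWS external ID support in Bedrock authentication, a hedged sketch of role assumption with an external ID; the `aws_external_id` parameter name is inferred from the PR title, and the ARN and ID values are placeholders:

```python
# Hedged sketch: Bedrock role assumption with an AWS external ID.
# `aws_external_id` is an assumed parameter name; values are placeholders.
import litellm

response = litellm.completion(
    model="bedrock/anthropic.claude-3-5-sonnet-20240620-v1:0",
    messages=[{"role": "user", "content": "ping"}],
    aws_role_name="arn:aws:iam::123456789012:role/litellm-bedrock",
    aws_session_name="litellm-session",
    aws_external_id="my-external-id",  # new in this release (assumed name)
)
print(response.choices[0].message.content)
```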
New Contributors
- @luisfucros made their first contribution in #14500
- @hanakannzashi made their first contribution in #14548
- @eliasto made their first contribution in #14494
- @Rasmusafj made their first contribution in #14491
- @LingXuanYin made their first contribution in #14569
Full Changelog: v1.77.1-nightly...v1.77.1.dev6
Docker Run LiteLLM Proxy
```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.77.1.dev6
```
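Once the container is up, a quick smoke test with the OpenAI SDK pointed at the proxy (the model name and virtual key are placeholders):

```python
# Smoke test against the proxy started above; key and model are placeholders.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:4000", api_key="sk-1234")
resp = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(resp.choices[0].message.content)
```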
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Failed ❌ | 170.0 | 186.49 | 6.34 | 6.34 | 1897 | 1897 | 144.19 | 537.22 |
| Aggregated | Failed ❌ | 170.0 | 186.49 | 6.34 | 6.34 | 1897 | 1897 | 144.19 | 537.22 |