What's Changed
- VLLM - transcription endpoint support + Ollama_chat/ - images, thinking, and content as list handling + by @krrishdholakia in #14523
- [Fix] Bug Fix - Org Budget was not updating by @ishaan-jaff in #14541
- Docs update on user header mapping by @boopesh07 in #14527
- Litellm 1.77.2 stable notes by @ishaan-jaff in #14544
- Litellm UI qa 09 13 2025 p1 - fix end user filtering + fix load mcp tool call error + prevent setting max user budget on scroll in edit user settings by @krrishdholakia in #14545
- The 'last 24 hours' button shows up above the end user dropdown on Logs page by @NANDINI-star in #14546
- fix: DD tool calls passed in metadata by @mubashir1osmani in #14531
- Added user_email labels to the prometheus monitoring. by @boopesh07 in #14520
- feat: add tool-permission guardrail by @uc4w6c in #14519
- Add sambanova deepseek v3.1 and gpt-oss-120b models by @luisfucros in #14500
- fix: completion chat id by @hanakannzashi in #14548
- feat: Add OVHCloud AI Endpoints as a provider by @eliasto in #14494
- Resolve cache key collision issue where all soft budget alerts use identical cache keys by @Rasmusafj in #14491
- Fix unsupported stop param for grok-code models by @Sameerlite in #14565
- [Feat]Add cancel endpoint support for openai and azure by @Sameerlite in #14561
- Fix: Bedrock cross-region inference profile cost calculation by @timelfrink in #14566
New Contributors
- @luisfucros made their first contribution in #14500
- @hanakannzashi made their first contribution in #14548
- @eliasto made their first contribution in #14494
- @Rasmusafj made their first contribution in #14491
Full Changelog: v1.77.1-nightly...v1.77.1.dev5
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.77.1.dev5
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Failed ❌ | 180.0 | 188.5220071670228 | 6.268629876464755 | 6.268629876464755 | 1874 | 1874 | 144.19824700001982 | 566.4026640000088 |
Aggregated | Failed ❌ | 180.0 | 188.5220071670228 | 6.268629876464755 | 6.268629876464755 | 1874 | 1874 | 144.19824700001982 | 566.4026640000088 |