What's Changed
- vLLM: transcription endpoint support; ollama_chat/: images, thinking, and content-as-list handling (sketch after this list) by @krrishdholakia in #14523
- [Fix] Org budget was not updating by @ishaan-jaff in #14541
- Docs update on user header mapping by @boopesh07 in #14527
- LiteLLM 1.77.2 stable release notes by @ishaan-jaff in #14544
- LiteLLM UI QA 09/13/2025 (part 1): fix end-user filtering, fix MCP tool call load error, and prevent setting max user budget on scroll in edit user settings by @krrishdholakia in #14545
- The 'Last 24 hours' button now shows above the end-user dropdown on the Logs page by @NANDINI-star in #14546
- fix: DD (Datadog) tool calls passed in metadata by @mubashir1osmani in #14531
- Added user_email labels to Prometheus monitoring by @boopesh07 in #14520
- feat: add tool-permission guardrail by @uc4w6c in #14519
- Add SambaNova DeepSeek V3.1 and gpt-oss-120b models by @luisfucros in #14500
- fix: completion chat id by @hanakannzashi in #14548
- feat: Add OVHCloud AI Endpoints as a provider (sketch after this list) by @eliasto in #14494
- Resolve cache key collision issue where all soft budget alerts use identical cache keys by @Rasmusafj in #14491
- Fix unsupported stop param for grok-code models by @Sameerlite in #14565
- [Feat] Add cancel endpoint support for OpenAI and Azure by @Sameerlite in #14561
- Fix: Bedrock cross-region inference profile cost calculation by @timelfrink in #14566
- [Bug Fix] SCIM v2 - ensure group PUSH and PUT ops allow creating non-existent members by @ishaan-jaff in #14581
- fix: s3_endpoint_url returned 404 by @mubashir1osmani in #14559
- Fix: provider-aware filtering for the Vertex AI Gemini labels field by @timelfrink in #14563
- Add AWS external ID parameter support for Bedrock authentication (sketch after this list) by @timelfrink in #14582
- [Fix] /responses API: add cancel endpoint (sketch after this list) and allow non-admins to use it as an LLM API endpoint by @ishaan-jaff in #14594
- Fix: handle empty arguments in Bedrock tool call invocation by @pazevedo-hyland in #14583
- fix(proxy): Correctly parse multi-part MCP server aliases from URL paths by @iabhi4 in #14558
- fix: recompute filters after deleting an MCP Server by @uc4w6c in #14542
- Add CompactifAI provider support by @timelfrink in #14532
- feat(proxy): Assign default budget to auto-generated JWT teams by @iabhi4 in #14514
- Fix Volcengine thinking parameters missing when thinking is set to disabled by @LingXuanYin in #14569
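For the new ollama_chat content-as-list handling, here is a minimal sketch of a message whose content is a list of parts (text plus image), in the OpenAI message format LiteLLM accepts; the model name and base64 payload are placeholders:

```python
# Hedged sketch: ollama_chat with message content passed as a list of parts.
# "ollama_chat/llava" and the base64 payload are placeholders.
import litellm

response = litellm.completion(
    model="ollama_chat/llava",  # hypothetical local vision model
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What is in this image?"},
            {"type": "image_url", "image_url": {"url": "data:image/png;base64,..."}},
        ],
    }],
)
print(response.choices[0].message.content)
```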
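For the new OVHCloud AI Endpoints provider, a hedged sketch of a completion call; the "ovhcloud/" prefix, model slug, and environment variable name are assumptions drawn from the PR title, not confirmed API:

```python
# Hedged sketch: calling the new OVHCloud AI Endpoints provider.
# Provider prefix, model slug, and env var name are assumptions.
import os
import litellm

os.environ["OVHCLOUD_API_KEY"] = "..."  # placeholder credential

response = litellm.completion(
    model="ovhcloud/Meta-Llama-3_1-70B-Instruct",  # hypothetical model slug
    messages=[{"role": "user", "content": "Hello from OVHCloud!"}],
)
print(response.choices[0].message.content)
```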
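For the /responses cancel endpoint, a minimal sketch using the OpenAI SDK pointed at a local proxy; the proxy URL, virtual key, and model are placeholders, and this assumes the proxy forwards the standard OpenAI-compatible cancel route:

```python
# Minimal sketch: cancel an in-flight background response through the proxy.
# Proxy URL, virtual key, and model name are placeholders.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:4000", api_key="sk-1234")

resp = client.responses.create(
    model="gpt-4o",       # any model routed by the proxy
    input="Write a long report.",
    background=True,      # background responses can be cancelled
)
cancelled = client.responses.cancel(resp.id)
print(cancelled.status)   # expected: "cancelled"
```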
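For the new AWS external ID support in Bedrock authentication, a hedged sketch of role assumption with an external ID; the `aws_external_id` parameter name is inferred from the PR title, and the ARN and ID values are placeholders:

```python
# Hedged sketch: Bedrock role assumption with an AWS external ID.
# `aws_external_id` is an assumed parameter name; values are placeholders.
import litellm

response = litellm.completion(
    model="bedrock/anthropic.claude-3-5-sonnet-20240620-v1:0",
    messages=[{"role": "user", "content": "ping"}],
    aws_role_name="arn:aws:iam::123456789012:role/litellm-bedrock",
    aws_session_name="litellm-session",
    aws_external_id="my-external-id",  # new in this release (assumed name)
)
print(response.choices[0].message.content)
```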
New Contributors
- @luisfucros made their first contribution in #14500
- @hanakannzashi made their first contribution in #14548
- @eliasto made their first contribution in #14494
- @Rasmusafj made their first contribution in #14491
- @LingXuanYin made their first contribution in #14569
Full Changelog: v1.77.1-nightly...v1.77.1.dev6
Docker Run LiteLLM Proxy
```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.77.1.dev6
```
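Once the container is up, a quick smoke test with the OpenAI SDK pointed at the proxy (the model name and virtual key are placeholders):

```python
# Smoke test against the proxy started above; key and model are placeholders.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:4000", api_key="sk-1234")
resp = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(resp.choices[0].message.content)
```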
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Failed ❌ | 170.0 | 186.49 | 6.34 | 6.34 | 1897 | 1897 | 144.19 | 537.22 |
| Aggregated | Failed ❌ | 170.0 | 186.49 | 6.34 | 6.34 | 1897 | 1897 | 144.19 | 537.22 |