What's Changed
- (Bug) Fix reasoning response ID by @Sameerlite in #15265
- Fix Gemini CLI by actually streaming the response by @Sameerlite in #15264
- Add gpt-realtime-mini support by @Sameerlite in #15283
- [Feat] Proxy CLI - don't store existing key in the URL, store it in the state param by @ishaan-jaff in #15290
- Fix: Make PATCH /model/{model_id}/update handle team_id consistently with POST /model/new by @ishaan-jaff in #15297
- [Fix] Networking: remove limitations by @AlexsanderHamir in #15302
- [MCP Gateway] Litellm mcp fixes team control by @ishaan-jaff in #15304
- [MCP Gateway] QA/Fixes - Ensure Team/Key level enforcement works for MCPs by @ishaan-jaff in #15305
- fix: model + endpoints page crash when config file contains router_settings.model_group_alias by @ARajan1084 in #15308
- Upgrade tenacity version to 8.5.0 by @ARajan1084 in #15303
- [QA/Fixes] - Dynamic Rate Limiter v3 - final QA by @ishaan-jaff in #15311
- Add Cohere Embed v4 support for AWS Bedrock by @timelfrink in #15298
- fix(bedrock): include cacheWriteInputTokens in prompt_tokens calculation by @timelfrink in #15292
- Fix issue with parsing assistant messages by @Sameerlite in #15320
- feat(files): add @client decorator to file operations by @FelipeRodriguesGare in #15339
- [Fix] Watsonx - Apply correct prompt templates for openai/gpt-oss model family by @ishaan-jaff in #15341
- Potentially fixes a UI spasm issue with an expired cookie by @ARajan1084 in #15309
- Add gpt-5-pro-2025-10-06 to model costs by @sandeshghanta in #15344
- Fix - (openrouter): move cache_control to content blocks for claude/gemini by @ishaan-jaff in #15345
- [Fix] x-litellm-cache-key header not being returned on cache hit by @ishaan-jaff in #15348
- Add native Responses API support for litellm_proxy provider by @Copilot in #15347
- AzureAD Default credentials - select credential type based on environment by @krrishdholakia in #14470
- SSO - support EntraID app roles by @krrishdholakia in #15351
- MCP - support converting OpenAPI specs to MCP servers by @krrishdholakia in #15343
- LiteLLM UI Refactor Infrastructure by @ARajan1084 in #15236
- MCP - specify allowed params per tool by @krrishdholakia in #15346
Full Changelog: v1.77.7.dev.3...v1.77.7.dev9
Docker Run LiteLLM Proxy
```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.77.7.dev9
```
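Once the container is up, the proxy exposes an OpenAI-compatible `/chat/completions` endpoint on the mapped port. A minimal sketch using only the Python standard library, assuming the proxy listens on `localhost:4000`; the API key `sk-1234` and the model name are placeholders to be replaced with values from your own deployment:

```python
import json
import urllib.request

# Placeholder values -- substitute your own proxy URL and virtual key.
PROXY_URL = "http://localhost:4000"
API_KEY = "sk-1234"

def build_chat_request(model, messages):
    """Build an OpenAI-compatible /chat/completions request for the proxy."""
    url = f"{PROXY_URL}/chat/completions"
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {API_KEY}",
    }
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=headers, method="POST")

req = build_chat_request("gpt-4o", [{"role": "user", "content": "hi"}])
# urllib.request.urlopen(req)  # uncomment once the container is running
print(req.full_url)
```

Because the proxy speaks the OpenAI wire format, the official `openai` SDK also works by pointing its `base_url` at the proxy instead of hand-rolling requests like this.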
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|---|---|---|---|---|---|---|---|---|---|
| /chat/completions | Failed ❌ | 60 | 77.85 | 6.52 | 6.52 | 1950 | 1950 | 40.86 | 2919.70 |
| Aggregated | Failed ❌ | 60 | 77.85 | 6.52 | 6.52 | 1950 | 1950 | 40.86 | 2919.70 |