Full Changelog: v1.72.0-stable...v1.72.2-stable
Docker Run LiteLLM Proxy
```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:litellm_stable_release_branch-v1.72.2-stable
```
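Once the container is up, the proxy exposes an OpenAI-compatible API on port 4000. A minimal sketch of a `/chat/completions` request, assuming a hypothetical key `sk-1234` and a model name you have actually configured on the proxy:

```python
import json
import urllib.request

# Placeholders: adjust the host, key, and model to your deployment
PROXY_URL = "http://localhost:4000/chat/completions"
API_KEY = "sk-1234"  # hypothetical master key

# OpenAI-compatible request body
payload = {
    "model": "gpt-4o",  # any model configured on the proxy
    "messages": [{"role": "user", "content": "Hello from LiteLLM"}],
}

req = urllib.request.Request(
    PROXY_URL,
    data=json.dumps(payload).encode(),
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
)
# Uncomment to send once the proxy is running:
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read())["choices"][0]["message"]["content"])
```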
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 200.0 | 220.71 | 6.25 | 0.0 | 1871 | 0 | 179.77 | 1345.88 |
Aggregated | Passed ✅ | 200.0 | 220.71 | 6.25 | 0.0 | 1871 | 0 | 179.77 | 1345.88 |
What's Changed
- Litellm doc fixes 05 31 2025 by @krrishdholakia in #11305
- Converted action buttons to sticky footer action buttons by @NANDINI-star in #11293
- Add support for DataRobot as a provider in LiteLLM by @mjnitz02 in #10385
- fix: remove dupe server_id MCP config servers by @wagnerjt in #11327
- Add unit tests for Cohere Embed v4.0 model by @colesmcintosh in #11329
- Add presidio_language yaml configuration support for guardrails by @colesmcintosh in #11331
- [Fix] Fix SCIM running patch operation case sensitivity by @ishaan-jaff in #11335
- Fix transcription model name mapping by @colesmcintosh in #11333
- [Feat] DD Trace - Add instrumentation for streaming chunks by @ishaan-jaff in #11338
- UI - Custom Server Root Path (Multiple Fixes) by @krrishdholakia in #11337
- [Perf] - Add Async + Batched S3 Logging by @ishaan-jaff in #11340
- fixes: expose flag to disable token counter by @ishaan-jaff in #11344
- Merge in - Gemini streaming - thinking content parsing - return in `reasoning_content` by @krrishdholakia in #11298
- Support returning virtual key in custom auth + Handle provider-specific optional params for embedding calls by @krrishdholakia in #11346
- Doc : Nvidia embedding models by @AnilAren in #11352
- feat: add cerebras/qwen-3-32b model pricing and context window by @colesmcintosh in #11373
- Fix Google/Vertex AI Gemini module linting errors - Remove unused imports by @colesmcintosh in #11374
- [Feat]: Performance add DD profiler to monitor python profile of LiteLLM CPU% by @ishaan-jaff in #11375
- [Fix]: Performance - Don't run auth on /health/liveliness by @ishaan-jaff in #11378
- [Bug Fix] Create/Update team member api 500 error by @hagan in #10479
- add gemini-embeddings-001 model prices and context window by @marty-sullivan in #11332
- [Performance]: Add debugging endpoint to track active /asyncio-tasks by @ishaan-jaff in #11382
- Add Claude 4 Sonnet & Opus, DeepSeek R1, and fix Llama Vision model pricing configurations by @colesmcintosh in #11339
- [Feat] Performance - Don't create 1 task for every hanging request alert by @ishaan-jaff in #11385
- UI / SSO - Update proxy admin id role in DB + Handle SSO redirects with custom root path by @krrishdholakia in #11384
- Anthropic - pass file url's as Document content type + Gemini - cache token tracking on streaming calls by @krrishdholakia in #11387
- Anthropic - Token tracking for Passthrough Batch API calls by @krrishdholakia in #11388
- update GCSBucketBase to handle GSM project ID if passed by @wwells in #11409
- fix: add enterprise feature gating to RegenerateKeyModal in KeyInfoView by @likweitan in #11400
- Litellm audit log staging by @krrishdholakia in #11418
- Add User ID validation to ensure it is not an email or phone number by @raz-alon in #10102
- [Performance] Performance improvements for /v1/messages route by @ishaan-jaff in #11421
- Add SSO configuration endpoints and UI integration with persistent settings by @colesmcintosh in #11417
- [Build] Bump dd trace version by @ishaan-jaff in #11426
- Add together_ai provided deepseek-r1 family model configuration by @jtsai-quid in #11394
- fix: Use proper attribute for Sagemaker request for embeddings by @tmbo in #11362
- added gemini url context support by @wangsha in #11351
- fix(redis_cache.py): support pipeline redis lpop for older redis vers… by @krrishdholakia in #11425
- Support no reasoning option for gemini models by @lowjiansheng in #11393
- fix(prometheus.py): pass custom metadata labels in litellm_total_toke… by @krrishdholakia in #11414
- Fix None values in usage field for gpt-image-1 model responses by @colesmcintosh in #11448
- Fix HuggingFace embeddings using non-default `input_type` by @seankwalker in #11452
- Add AGENTS.md by @colesmcintosh in #11461
- Custom Root Path Improvements: don't require reserving `/litellm` route by @krrishdholakia in #11460
- [Feat] Make batch size for maximum retention in spend logs a controllable parameter by @ishaan-jaff in #11459
- Add pangea to guardrails sidebar by @ryanmeans in #11464
- [Fix] [Bug]: Knowledge Base Call returning error by @ishaan-jaff in #11467
- [Feat] Return response_id == upstream response ID for VertexAI + Google AI studio (Stream+Non stream) by @ishaan-jaff in #11456
- [Fix]: /v1/messages - return streaming usage statistics when using litellm with bedrock models by @ishaan-jaff in #11469
- fix: supports_function_calling works with llm_proxy models by @pazevedo-hyland in #11381
- feat: add HuggingFace rerank provider support by @cainiaoit in #11438
- Litellm dev 06 05 2025 p2 by @krrishdholakia in #11470
- Fix variable redefinition linting error in vertex_and_google_ai_studio_gemini.py by @colesmcintosh in #11486
- Add Google Gemini 2.5 Pro Preview 06-05 by @PeterDaveHello in #11447
- Feat: add azure endpoint for image endpoints by @ishaan-jaff in #11482
- [Feat] New model - add `codex-mini-latest` by @ishaan-jaff in #11492
- Nebius model pricing info updated by @Aktsvigun in #11445
- [Docs] Add audio / tts section for gemini and vertex by @AyrennC in #11306
- Document batch polling logic to avoid ValueError: Output file id is None error by @fadil4u in #11286
- Revert "Nebius model pricing info updated" by @ishaan-jaff in #11493
- [Bug Fix] Fix: `_transform_responses_api_content_to_chat_completion_content` doesn't support file content type by @ishaan-jaff in #11494
- Fix Fireworks AI rate limit exception mapping - detect "rate limit" text in error messages by @colesmcintosh in #11455
- Update Makefile to match CI workflows and improve contributor experience by @colesmcintosh in #11485
- Fix: Respect user_header_name property for budget selection and user identification by @colesmcintosh in #11419
- Update production doc by @ishaan-jaff in #11499
- Enhance proxy CLI with Rich formatting and improved user experience by @colesmcintosh in #11420
- Remove retired version gpt-3.5 from configs.md by @vuanhtu52 in #11508
- Update model version in deploy.md by @vuanhtu52 in #11506
- [Feat] Allow using litellm.completion with /v1/messages API Spec (use gpt-4, gemini etc with claude code) by @ishaan-jaff in #11502
- Update the correct test directory in contributing_code.md by @vuanhtu52 in #11511
- Fix UI navbar + UI server root path issue + Mask key in audit logs by @krrishdholakia in #11496
- Update web search documentation for new provider support (xAI, VertexAI, Google AI Studio) by @colesmcintosh in #11515
- Simplify experimental multi-instance rate limiter - more accurate by @krrishdholakia in #11424
- Litellm anthropic mcp support by @krrishdholakia in #11474
- UI - fix invitation link + ensure team models returned when team has 'all-proxy-models' + team only models by @krrishdholakia in #11524
- [Docs] - Add section on using all models with /v1/messages by @ishaan-jaff in #11523
- UI - fix(add_credentials_tab.tsx): filter for null values when adding credentials by @krrishdholakia in #11525
New Contributors
- @mjnitz02 made their first contribution in #10385
- @hagan made their first contribution in #10479
- @wwells made their first contribution in #11409
- @likweitan made their first contribution in #11400
- @raz-alon made their first contribution in #10102
- @jtsai-quid made their first contribution in #11394
- @tmbo made their first contribution in #11362
- @wangsha made their first contribution in #11351
- @seankwalker made their first contribution in #11452
- @pazevedo-hyland made their first contribution in #11381
- @cainiaoit made their first contribution in #11438
- @vuanhtu52 made their first contribution in #11508