What's Changed
- fix: supports_function_calling works with llm_proxy models by @pazevedo-hyland in #11381
- feat: add HuggingFace rerank provider support by @cainiaoit in #11438
- Litellm dev 06 05 2025 p2 by @krrishdholakia in #11470
- Fix variable redefinition linting error in vertex_and_google_ai_studio_gemini.py by @colesmcintosh in #11486
- Add Google Gemini 2.5 Pro Preview 06-05 by @PeterDaveHello in #11447
- Feat: add azure endpoint for image endpoints by @ishaan-jaff in #11482
- [Feat] New model - add `codex-mini-latest` by @ishaan-jaff in #11492
- Nebius model pricing info updated by @Aktsvigun in #11445
- [Docs] Add audio / tts section for gemini and vertex by @AyrennC in #11306
- Document batch polling logic to avoid ValueError: Output file id is None error by @fadil4u in #11286
- Revert "Nebius model pricing info updated" by @ishaan-jaff in #11493
- [Bug Fix] Fix: `_transform_responses_api_content_to_chat_completion_content` doesn't support file content type by @ishaan-jaff in #11494
- Fix Fireworks AI rate limit exception mapping - detect "rate limit" text in error messages by @colesmcintosh in #11455
- Update Makefile to match CI workflows and improve contributor experience by @colesmcintosh in #11485
- Fix: Respect user_header_name property for budget selection and user identification by @colesmcintosh in #11419
- Update production doc by @ishaan-jaff in #11499
- Enhance proxy CLI with Rich formatting and improved user experience by @colesmcintosh in #11420
- Remove retired version gpt-3.5 from configs.md by @vuanhtu52 in #11508
- Update model version in deploy.md by @vuanhtu52 in #11506
- [Feat] Allow using litellm.completion with the /v1/messages API spec (use gpt-4, gemini, etc. with Claude Code) by @ishaan-jaff in #11502 (see the sketch after this list)
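The /v1/messages entry above means the proxy can serve Anthropic Messages API-style clients (such as Claude Code) while routing to any configured model. A minimal sketch, assuming the proxy is running locally on port 4000; the model name `gpt-4` and key `sk-1234` are placeholders for whatever your proxy is configured with:

```bash
# Anthropic Messages API-style request routed through the LiteLLM proxy.
# Model name and API key below are placeholders, not defaults.
curl http://localhost:4000/v1/messages \
  -H "Content-Type: application/json" \
  -H "x-api-key: sk-1234" \
  -d '{
    "model": "gpt-4",
    "max_tokens": 256,
    "messages": [{"role": "user", "content": "Hello"}]
  }'
```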
New Contributors
- @pazevedo-hyland made their first contribution in #11381
- @cainiaoit made their first contribution in #11438
- @vuanhtu52 made their first contribution in #11508
Full Changelog: v1.72.1.dev8...v1.72.2-nightly
Docker Run LiteLLM Proxy
```bash
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.72.2-nightly
```
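Once the container is up, a quick smoke test is an OpenAI-style request against the proxy. A minimal sketch, assuming a model has been added to the proxy (the model name below is a placeholder):

```bash
# OpenAI-compatible chat completion via the local proxy;
# "gpt-4" stands in for whichever model the proxy is configured with.
curl http://localhost:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4",
    "messages": [{"role": "user", "content": "ping"}]
  }'
```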
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|---|---|---|---|---|---|---|---|---|---|
| /chat/completions | Passed ✅ | 180.0 | 201.26 | 6.25 | 0.0 | 1869 | 0 | 165.16 | 1316.0 |
| Aggregated | Passed ✅ | 180.0 | 201.26 | 6.25 | 0.0 | 1869 | 0 | 165.16 | 1316.0 |