## What's Changed
- feat: Make Gemini accept the OpenAI parameter `parallel_tool_calls` by @aholmberg in #11125 (see the sketch below this list)
- Fix #9295: docker-compose healthcheck test uses curl, but curl is not in the image by @agajdosi in #9737
- [Feat] Add `/image/edits` support for Azure by @ishaan-jaff in #11160 (see the sketch below this list)
- Fix `deprecation_date` value for Llama Groq models by @kiriloman in #11151
- [Fix] Rollback to `httpx==0.27.0` by @ishaan-jaff in #11146
- Doc update for Azure OpenAI by @ketangangal in #11161
- Litellm fix GitHub action testing by @krrishdholakia in #11163
- [Feat - Contributor PR] Add Video support for Bedrock Converse by @ishaan-jaff in #11166
- [Fixes] Aiohttp transport fixes - add handling for `aiohttp.ClientPayloadError` and `ssl_verification` settings by @ishaan-jaff in #11162
- Prevent leaking sensitive keys to Langfuse + support forwarding `/sso/key/generate` to the server root path URL by @krrishdholakia in #11165
- [Fix] Duplicate `maxTokens` parameter being sent to Bedrock/Claude model with thinking by @ishaan-jaff in #11181
- Integration with Nebius AI Studio added by @Aktsvigun in #11143 (see the sketch below this list)
- Codestral - return LiteLLM latency overhead on `/v1/completions` + add 'contains' support for ChatCompletionDeltaToolCall by @krrishdholakia in #10879
- Ollama Chat - parse tool calls on streaming by @krrishdholakia in #11171 (see the sketch below this list)
- [Fix] Prometheus Metrics - Do not track `end_user` by default + expose flag to enable tracking `end_user` on Prometheus by @ishaan-jaff in #11192
- [Fix]: Add cost tracking for image edits endpoint [OpenAI, Azure] by @ishaan-jaff in #11186
- VertexAI - `codeExecution` tool support + `anyOf` handling by @krrishdholakia in #11195 (see the sketch below this list)
- Add Pangea provider to Guardrails hook by @ryanmeans in #10775
- Return Anthropic thinking blocks on streaming + VertexAI Minor Fixes & Improvements (Thinking, Global regions, Parallel tool calling) by @krrishdholakia in #11194
- Azure OIDC provider improvements + OIDC audience bug fix by @nikoizs in #10054
- [Feat] Add well known MCP servers to LiteLLM by @ishaan-jaff in #11209
- Add missing `request_kwargs` to `get_available_deployment` call by @Nitro963 in #11202
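Below are short usage sketches for a few of the changes above. First, the Gemini `parallel_tool_calls` passthrough from #11125; the model alias and the weather tool are illustrative assumptions, not taken from the PR:

```python
# Sketch: pass the OpenAI-style parallel_tool_calls flag to a Gemini model.
import litellm

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool for illustration
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = litellm.completion(
    model="gemini/gemini-1.5-pro",  # assumed model alias
    messages=[{"role": "user", "content": "Weather in Paris and Berlin?"}],
    tools=tools,
    parallel_tool_calls=False,  # OpenAI parameter, now accepted for Gemini
)
print(response.choices[0].message.tool_calls)
```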
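The Azure `/image/edits` support from #11160, exercised through the OpenAI SDK pointed at a LiteLLM proxy. The base URL, API key, and Azure deployment alias are assumptions:

```python
# Sketch: call the image-edits route on a LiteLLM proxy backed by Azure.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:4000", api_key="sk-1234")

result = client.images.edit(
    model="azure/gpt-image-1",        # hypothetical Azure deployment alias
    image=open("original.png", "rb"),
    prompt="Add a red scarf to the cat",
)
print(result.data[0].b64_json[:40])  # base64-encoded edited image
```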
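The new Nebius AI Studio provider from #11143. The `nebius/` prefix follows LiteLLM's usual provider-prefix convention; the model name and environment variable are assumptions:

```python
# Sketch: route a chat completion through the Nebius AI Studio provider.
import os
import litellm

os.environ["NEBIUS_API_KEY"] = "your-api-key"  # assumed env var name

response = litellm.completion(
    model="nebius/Qwen/Qwen2.5-72B-Instruct",  # assumed model identifier
    messages=[{"role": "user", "content": "Hello from LiteLLM"}],
)
print(response.choices[0].message.content)
```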
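Streaming tool-call parsing for Ollama Chat from #11171. Assuming an `ollama_chat/` model and an illustrative tool, tool calls should now surface as OpenAI-style `delta.tool_calls` chunks while streaming:

```python
# Sketch: consume tool calls from an Ollama chat stream.
import litellm

stream = litellm.completion(
    model="ollama_chat/llama3.1",  # assumed local model name
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=[{
        "type": "function",
        "function": {
            "name": "get_weather",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
            },
        },
    }],
    stream=True,
)

for chunk in stream:
    delta = chunk.choices[0].delta
    if delta.tool_calls:  # populated incrementally while streaming
        print(delta.tool_calls)
```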
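Finally, the VertexAI `anyOf` handling from #11195: a sketch of a tool parameter that accepts either a string or an integer ID. The model alias and tool are illustrative:

```python
# Sketch: a JSON-schema tool using anyOf, sent to a Vertex AI model.
import litellm

tools = [{
    "type": "function",
    "function": {
        "name": "lookup_record",  # hypothetical tool
        "parameters": {
            "type": "object",
            "properties": {
                # anyOf schemas like this are the case #11195 adds handling for
                "id": {"anyOf": [{"type": "string"}, {"type": "integer"}]},
            },
            "required": ["id"],
        },
    },
}]

response = litellm.completion(
    model="vertex_ai/gemini-1.5-pro",  # assumed model alias
    messages=[{"role": "user", "content": "Look up record 42"}],
    tools=tools,
)
print(response.choices[0].message.tool_calls)
```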
## New Contributors
- @agajdosi made their first contribution in #9737
- @ketangangal made their first contribution in #11161
- @Aktsvigun made their first contribution in #11143
- @ryanmeans made their first contribution in #10775
- @nikoizs made their first contribution in #10054
- @Nitro963 made their first contribution in #11202
**Full Changelog**: v1.71.1-nightly...v1.71.2-nightly
## Docker Run LiteLLM Proxy
```shell
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.71.2-nightly
```
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
## Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|---|---|---|---|---|---|---|---|---|---|
| /chat/completions | Passed ✅ | 230.0 | 296.35 | 6.07 | 0.0 | 1817 | 0 | 196.89 | 5947.88 |
| Aggregated | Passed ✅ | 230.0 | 296.35 | 6.07 | 0.0 | 1817 | 0 | 196.89 | 5947.88 |