What's Changed
- Update management_cli.md by @wildcard in #12157
- fix: support Cursor IDE tool_choice format {"type": "auto"} by @colesmcintosh in #12168 (example after this list)
- [Feat] Add new AWS SQS Logging Integration by @ishaan-jaff in #12176
- [Bug Fix] Allow passing litellm_params when using generateContent API endpoint by @ishaan-jaff in #12177
- Use the -d flag in docs instead of -D by @szafranek in #12179
- [Bug Fix] Using /messages with lowest latency routing by @ishaan-jaff in #12180
- [Bug Fix] Fix Error code: 307 for LlamaAPI Streaming Chat by @seyeong-han in #11946
- Fix client function to support anthropic_messages call type and implement max tokens check by @dinggh in #12162
- Fix - handle empty config.yaml + Fix gemini /models - replace models/ as expected, instead of using 'strip' by @krrishdholakia in #12189
- VertexAI Anthropic - streaming cost tracking w/ prompt caching fixes by @krrishdholakia in #12188
- Fix: correct user_id validation logic in Anthropic… by @raz-alon in #11432
- Fix allow strings in calculate cost by @tofarr in #12200
- Fix: Flaky test_keys_delete_error_handling test by @colesmcintosh in #12209
- added changes to mcp url wrapping by @jugaldb in #12207
- update pydantic version by @jugaldb in #12213
- Non-Anthropic (gemini/openai/etc.) models token usage returned when calling /v1/messages by @krrishdholakia in #12184 (example after this list)
- Added error handling for MCP tools not found or invalid server by @jugaldb in #12223
- Customizable Email template - Subject and Signature by @jugaldb in #12218
- Update Vertex Model Garden doc to use SDK for deploy + chat completion by @lizzij in #12219
- Fix: Preserve full path structure for Gemini custom api_base by @colesmcintosh in #12215
- Fix default parameters for ollama-chat by @cipri-tom in #12201
- Passes through extra_ properties on "custom" llm provider by @zsimjee in #12185
- Fix/move panw prisma airs test file location per feedback on PR #12116 by @jroberts2600 in #12175
- [Feat] Add litellm-proxy cli login for starting to use litellm proxy by @ishaan-jaff in #12216
- Bug Fix - responses api fix got multiple values for keyword argument 'litellm_trace_id' by @ishaan-jaff in #12225
- OpenMeter integration error handling fix by @SamBoyd in #12147
- [Feat] Polish - add better error validation when users configure prometheus metrics and labels to control cardinality by @ishaan-jaff in #12182
- Revert "Fix: Preserve full path structure for Gemini custom api_base" by @ishaan-jaff in #12227
- Litellm add sentry scrubbing by @jugaldb in #12210
- Fix rendering ui on non-root images by @krrishdholakia in #12226
- Batches - support batch retrieve with target model Query Param + Anthropic - completion bridge, yield content_block_stop chunk by @krrishdholakia in #12228
- fix: mistral transform_response handling for empty string content by @njbrake in #12202
- Add logos to callback list by @NANDINI-star in #12244
- fix(streaming_handler.py): store finish reason, even if is_finished i… by @krrishdholakia in #12250
- Fix: Initialize JSON logging for all loggers when JSON_LOGS=True by @colesmcintosh in #12206
- Azure - responses api bridge - respect responses/ + Gemini - generate content bridge - handle kwargs + litellm params containing stream by @krrishdholakia in #12224
- feat: Turn Mistral to use llm_http_handler by @njbrake in #12245
- [Bug Fix] Fixes for bedrock guardrails post_call - applying to streaming responses by @ishaan-jaff in #12252
- [Bump] Litellm responses format by @jugaldb in #12253
- Add MCP url masking on frontend by @jugaldb in #12247
- fix(docs): fix config file description in k8s deployment by @utsumi-fj in #12230
- [Feat] UI - Allow adding team specific logging callbacks by @ishaan-jaff in #12261
- Add fix to tests by @jugaldb in #12263
- [Feat] Add Arize Team Based Logging by @ishaan-jaff in #12264
- Add MCP servers header to the scope of header by @jugaldb in #12266
- [UI] QA - QA Arize Team based logging callbacks by @ishaan-jaff in #12265
- Add 'audio_url' message type support for VLLM by @krrishdholakia in #12270
- Add Azure Content Safety Guardrails to LiteLLM proxy by @krrishdholakia in #12268
- Correctly display 'Internal Viewer' user role by @NANDINI-star in #12284
- Add azure_ai cohere rerank v3.5 by @dcieslak19973 in #12283
- Fix Hugging Face tests by @hanouticelina in #12286
- Litellm mcp tool prefix by @jugaldb in #12289
- [Bug Fix] Using gemini-cli with Vertex Anthropic Models by @ishaan-jaff in #12246
- Fix: handle proxy internal callbacks in callback management test by @colesmcintosh in #12294
- Fix DeepEval logging format for failure events by @ishaan-jaff in #12303
- Fix credentials CLI test by @ishaan-jaff in #12304
- Fix credentials CLI test by @ishaan-jaff in #12305
- fix(factory.py): support optional args for bedrock by @krrishdholakia in #12287
- fix(guardrails): add azure content safety guardrails to the UI by @krrishdholakia in #12309
- fix: Add size parameter support for Vertex AI image generation by @colesmcintosh in #12292
- improve readme: replace claude-3-sonnet because it will be retired soon by @takashiishida in #12239
- (Prompt Management) Langfuse prompt_version support by @krrishdholakia in #12301
- Fix gemini tool call sequence by @lowjiansheng in #11999
- fix mapped tests by @ishaan-jaff in #12320
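
For reference, the Cursor IDE fix in #12168 means the proxy accepts tool_choice as an object of the form {"type": "auto"} rather than only the plain string "auto". A minimal sketch of such a request, assuming a locally running proxy with a configured model and virtual key (gpt-4o and sk-1234 below are placeholders):

```shell
curl http://localhost:4000/v1/chat/completions \
  -H "Authorization: Bearer sk-1234" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "What is the weather in Boston?"}],
    "tools": [{
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
          "type": "object",
          "properties": {"city": {"type": "string"}},
          "required": ["city"]
        }
      }
    }],
    "tool_choice": {"type": "auto"}
  }'
```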
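Similarly, #12184 means an Anthropic-format /v1/messages call routed to a non-Anthropic model now reports token usage in the response. A minimal sketch, again with a placeholder key and a placeholder model name that you would have configured on the proxy:

```shell
curl http://localhost:4000/v1/messages \
  -H "Authorization: Bearer sk-1234" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini/gemini-1.5-flash",
    "max_tokens": 256,
    "messages": [{"role": "user", "content": "Hello"}]
  }'
```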
New Contributors
- @wildcard made their first contribution in #12157
- @szafranek made their first contribution in #12179
- @dinggh made their first contribution in #12162
- @tofarr made their first contribution in #12200
- @lizzij made their first contribution in #12219
- @cipri-tom made their first contribution in #12201
- @zsimjee made their first contribution in #12185
- @SamBoyd made their first contribution in #12147
- @utsumi-fj made their first contribution in #12230
- @dcieslak19973 made their first contribution in #12283
- @takashiishida made their first contribution in #12239
Full Changelog: v1.73.6.rc.1...v1.73.7-nightly
Docker Run LiteLLM Proxy
```shell
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.73.7-nightly
```
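
Once the container is up, you can sanity-check the proxy with a quick request (a sketch; the virtual key and model name below are placeholders you would configure first):

```shell
# Placeholder key/model: create a key and add a model via the proxy UI or config first.
curl http://localhost:4000/chat/completions \
  -H "Authorization: Bearer sk-1234" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hello from the proxy!"}]
  }'
```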
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|---|---|---|---|---|---|---|---|---|---|
| /chat/completions | Passed ✅ | 240.0 | 266.65 | 6.18 | 0.0 | 1848 | 0 | 213.36 | 1776.54 |
| Aggregated | Passed ✅ | 240.0 | 266.65 | 6.18 | 0.0 | 1848 | 0 | 213.36 | 1776.54 |