What's Changed
- Update management_cli.md by @wildcard in #12157
- fix: support Cursor IDE tool_choice format {"type": "auto"} by @colesmcintosh in #12168 (example after this list)
- [Feat] Add new AWS SQS Logging Integration by @ishaan-jaff in #12176
- [Bug Fix] Allow passing litellm_params when using generateContent API endpoint by @ishaan-jaff in #12177
- Use the -d flag in docs instead of -D by @szafranek in #12179
- [Bug Fix] Using /messages with lowest latency routing by @ishaan-jaff in #12180
- [Bug Fix] Fix Error code: 307 for LlamaAPI Streaming Chat by @seyeong-han in #11946
- Fix client function to support anthropic_messages call type and implement max tokens check by @dinggh in #12162
- Fix - handle empty config.yaml + Fix gemini /models - replace models/ as expected, instead of using 'strip' by @krrishdholakia in #12189
- VertexAI Anthropic - streaming cost tracking w/ prompt caching fixes by @krrishdholakia in #12188
- Fix: correct user_id validation logic in Anthropic… by @raz-alon in #11432
- Fix allow strings in calculate cost by @tofarr in #12200
- Fix: Flaky test_keys_delete_error_handling test by @colesmcintosh in #12209
- added changes to mcp url wrapping by @jugaldb in #12207
- update pydantic version by @jugaldb in #12213
- Non-Anthropic (gemini/openai/etc.) models token usage returned when calling /v1/messages by @krrishdholakia in #12184 (example after this list)
- Added error handling for MCP tools not found or invalid server by @jugaldb in #12223
- Customizable Email template - Subject and Signature by @jugaldb in #12218
- Update Vertex Model Garden doc to use SDK for deploy + chat completion by @lizzij in #12219
- Fix: Preserve full path structure for Gemini custom api_base by @colesmcintosh in #12215
- Fix default parameters for ollama-chat by @cipri-tom in #12201
- Passes through extra_ properties on "custom" llm provider by @zsimjee in #12185
- Fix/move panw prisma airs test file location per feedback on PR #12116 by @jroberts2600 in #12175
- [Feat] Add litellm-proxy cli login for starting to use litellm proxy by @ishaan-jaff in #12216
- Bug Fix - responses api fix got multiple values for keyword argument 'litellm_trace_id' by @ishaan-jaff in #12225
- OpenMeter integration error handling fix by @SamBoyd in #12147
- [Feat] Polish - add better error validation when users configure prometheus metrics and labels to control cardinality by @ishaan-jaff in #12182
- Revert "Fix: Preserve full path structure for Gemini custom api_base" by @ishaan-jaff in #12227
- Litellm add sentry scrubbing by @jugaldb in #12210
- Fix rendering ui on non-root images by @krrishdholakia in #12226
- Batches - support batch retrieve with target model Query Param + Anthropic - completion bridge, yield content_block_stop chunk by @krrishdholakia in #12228
- fix: mistral transform_response handling for empty string content by @njbrake in #12202
- Add logos to callback list by @NANDINI-star in #12244
- fix(streaming_handler.py): store finish reason, even if is_finished i… by @krrishdholakia in #12250
- Fix: Initialize JSON logging for all loggers when JSON_LOGS=True by @colesmcintosh in #12206
- Azure - responses api bridge - respect responses/ + Gemini - generate content bridge - handle kwargs + litellm params containing stream by @krrishdholakia in #12224
- feat: Turn Mistral to use llm_http_handler by @njbrake in #12245
- [Bug Fix] Fixes for bedrock guardrails post_call - applying to streaming responses by @ishaan-jaff in #12252
- [Bump] Litellm responses format by @jugaldb in #12253
- Add MCP url masking on frontend by @jugaldb in #12247
- fix(docs): fix config file description in k8s deployment by @utsumi-fj in #12230
- [Feat] UI - Allow adding team specific logging callbacks by @ishaan-jaff in #12261
- Add fix to tests by @jugaldb in #12263
- [Feat] Add Arize Team Based Logging by @ishaan-jaff in #12264
- Add MCP servers header to the scope of header by @jugaldb in #12266
- [UI] QA - QA Arize Team based logging callbacks by @ishaan-jaff in #12265
- Add 'audio_url' message type support for VLLM by @krrishdholakia in #12270
- Add Azure Content Safety Guardrails to LiteLLM proxy by @krrishdholakia in #12268
- Correctly display 'Internal Viewer' user role by @NANDINI-star in #12284
- Add azure_ai cohere rerank v3.5 by @dcieslak19973 in #12283
- Fix Hugging Face tests by @hanouticelina in #12286
- Litellm mcp tool prefix by @jugaldb in #12289
- [Bug Fix] Using gemini-cli with Vertex Anthropic Models by @ishaan-jaff in #12246
- Fix: handle proxy internal callbacks in callback management test by @colesmcintosh in #12294
- Fix DeepEval logging format for failure events by @ishaan-jaff in #12303
- Fix credentials CLI test by @ishaan-jaff in #12304
- Fix credentials CLI test by @ishaan-jaff in #12305
- fix(factory.py): support optional args for bedrock by @krrishdholakia in #12287
- fix(guardrails): add azure content safety guardrails to the UI by @krrishdholakia in #12309
- fix: Add size parameter support for Vertex AI image generation by @colesmcintosh in #12292
- improve readme: replace claude-3-sonnet because it will be retired soon by @takashiishida in #12239
- (Prompt Management) Langfuse prompt_version support by @krrishdholakia in #12301
- Fix gemini tool call sequence by @lowjiansheng in #11999
- fix mapped tests by @ishaan-jaff in #12320
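
For reference, the Cursor IDE fix in #12168 means the proxy accepts tool_choice as an object of the form {"type": "auto"} rather than only the plain string "auto". A minimal sketch of such a request, assuming a locally running proxy with a configured model and virtual key (gpt-4o and sk-1234 below are placeholders):

```shell
curl http://localhost:4000/v1/chat/completions \
  -H "Authorization: Bearer sk-1234" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "What is the weather in Boston?"}],
    "tools": [{
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
          "type": "object",
          "properties": {"city": {"type": "string"}},
          "required": ["city"]
        }
      }
    }],
    "tool_choice": {"type": "auto"}
  }'
```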
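Similarly, #12184 means an Anthropic-format /v1/messages call routed to a non-Anthropic model now reports token usage in the response. A minimal sketch, again with a placeholder key and a placeholder model name that you would have configured on the proxy:

```shell
curl http://localhost:4000/v1/messages \
  -H "Authorization: Bearer sk-1234" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini/gemini-1.5-flash",
    "max_tokens": 256,
    "messages": [{"role": "user", "content": "Hello"}]
  }'
```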
New Contributors
- @wildcard made their first contribution in #12157
- @szafranek made their first contribution in #12179
- @dinggh made their first contribution in #12162
- @tofarr made their first contribution in #12200
- @lizzij made their first contribution in #12219
- @cipri-tom made their first contribution in #12201
- @zsimjee made their first contribution in #12185
- @SamBoyd made their first contribution in #12147
- @utsumi-fj made their first contribution in #12230
- @dcieslak19973 made their first contribution in #12283
- @takashiishida made their first contribution in #12239
Full Changelog: v1.73.6.rc.1...v1.73.7-nightly
Docker Run LiteLLM Proxy
```shell
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.73.7-nightly
```
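
Once the container is up, you can sanity-check the proxy with a quick request (a sketch; the virtual key and model name below are placeholders you would configure first):

```shell
# Placeholder key/model: create a key and add a model via the proxy UI or config first.
curl http://localhost:4000/chat/completions \
  -H "Authorization: Bearer sk-1234" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hello from the proxy!"}]
  }'
```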
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|---|---|---|---|---|---|---|---|---|---|
| /chat/completions | Passed ✅ | 240.0 | 266.65 | 6.18 | 0.0 | 1848 | 0 | 213.36 | 1776.54 |
| Aggregated | Passed ✅ | 240.0 | 266.65 | 6.18 | 0.0 | 1848 | 0 | 213.36 | 1776.54 |