What's Changed
- fix inference endpoints (#11630) by @ishaan-jaff in #11631
- [UI] Add Deepgram provider to supported providers list and mappings by @ishaan-jaff in #11634
- [Bug Fix] Add audio/ogg mapping for Audio MIME types by @ishaan-jaff in #11635
- [Feat] Add Background mode for Responses API - OpenAI, AzureOpenAI by @ishaan-jaff in #11640 (see the sketch after this list)
- [Feat] Add provider specific params for `deepgram/` by @ishaan-jaff in #11638
- [Feat] MCP - Add support for `streamablehttp_client` MCP Servers by @ishaan-jaff in #11628
- [Feat] Perf fix - ensure deepgram provider uses async httpx calls by @ishaan-jaff in #11641
- Trim the long user IDs on the keys page by @NANDINI-star in #11488
- Enable System Proxy Support for aiohttp Transport by @idootop in #11616
- GA Multi-instance rate limiting v2 requirements + new: specify token rate limit type (output / input / total) by @krrishdholakia in #11646
- Add bridge for /chat/completion -> /responses API by @krrishdholakia in #11632 (see the sketch after this list)
- Convert scientific notation str to int + Bubble up azure content filter results by @krrishdholakia in #11655
- feat(helm): [#11648] support extraContainers in migrations-job.yaml by @stevenaldinger in #11649
- Correct success message when user creates new budget by @vuanhtu52 in #11608
- fix: Do not add default model on tag-based routing when a valid tag is present by @thiagosalvatore in #11454
- Fix default user settings by @NANDINI-star in #11674
- [Pricing] add azure/gpt-4o-mini-transcribe models by @ishaan-jaff in #11676
- Enhance Mistral model support with reasoning capabilities by @colesmcintosh in #11642
- [Feat] MCP - expose streamable HTTP endpoint for LiteLLM Proxy by @ishaan-jaff in #11645 (client sketch after this list)
- change space_key header to space_id for Arize by @vanities in #11595
- Add performance indexes to LiteLLM_SpendLogs for analytics queries by @colesmcintosh in #11675
- Revert "Add performance indexes to LiteLLM_SpendLogs for analytics queries" by @krrishdholakia in #11683
- [Feat] Use dedicated Rest endpoints for list, calling MCP tools by @ishaan-jaff in #11684
- Chat Completions <-> Responses API Bridge Improvements by @krrishdholakia in #11685
- [UI] Fix MCP Server Table to Match Existing Table Pattern by @ishaan-jaff in #11691
- Logging: prevent double logging logs when bridge is used (anthropic <-> chat completion OR chat completion <-> responses api) by @krrishdholakia in #11687
- fix(vertex_ai): support global location in vertex ai passthrough by @alvarosevilla95 in #11661
- [Feat] UI Allow editing mcp servers by @ishaan-jaff in #11693
- [Feat] UI - Allow setting MCP servers when creating keys, teams by @ishaan-jaff in #11711
- [Feat] Add Authentication + Permission Management for MCP List, Call Tool Ops by @ishaan-jaff in #11682
- Add Live Tail Feature to Logs View by @NANDINI-star in #11712
- [Feat] Add Connect to MCP Page by @ishaan-jaff in #11716
- Enterprise feature preview improvement on Audit Logs by @NANDINI-star in #11715
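A few of the items above benefit from a sketch. For the Responses API background mode (#11640), here is a minimal Python sketch; it assumes `litellm.responses` forwards the OpenAI `background` parameter as-is and that `OPENAI_API_KEY` is set:

```python
# Sketch: start a long-running response in background mode (#11640).
# Assumes `background` is passed through to the OpenAI Responses API,
# where a background response returns immediately and is polled by id.
import litellm

response = litellm.responses(
    model="openai/gpt-4o",
    input="Write a detailed report on Q2 infrastructure costs.",
    background=True,  # do not block; the response runs server-side
)
# Status starts as "queued" and moves to "in_progress"/"completed".
print(response.id, response.status)
```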
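For the /chat/completion -> /responses bridge (#11632, #11685), a chat-completions-shaped call should now reach models that only speak the Responses API. A sketch, where the model name is an assumption for illustration:

```python
# Sketch of the chat-completions -> responses bridge (#11632, #11685):
# a plain chat.completions-style call against a Responses-API-only model.
# The model name is illustrative; substitute one your account can access.
import litellm

resp = litellm.completion(
    model="openai/codex-mini-latest",
    messages=[{"role": "user", "content": "Write a haiku about proxies."}],
)
print(resp.choices[0].message.content)
```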
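For the streamable HTTP MCP endpoints (#11628, #11645), a hypothetical client sketch using the `mcp` Python SDK's `streamablehttp_client`; the `/mcp` path and the `x-litellm-api-key` auth header are assumptions, so check your proxy config:

```python
# Hypothetical client for the proxy's streamable HTTP MCP endpoint.
# The /mcp path and x-litellm-api-key header are assumptions.
import asyncio

from mcp import ClientSession
from mcp.client.streamable_http import streamablehttp_client

async def main():
    async with streamablehttp_client(
        "http://localhost:4000/mcp",
        headers={"x-litellm-api-key": "sk-1234"},  # assumed auth header
    ) as (read_stream, write_stream, _):
        async with ClientSession(read_stream, write_stream) as session:
            await session.initialize()
            tools = await session.list_tools()
            print([t.name for t in tools.tools])

asyncio.run(main())
```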
New Contributors
- @idootop made their first contribution in #11616
- @stevenaldinger made their first contribution in #11649
- @thiagosalvatore made their first contribution in #11454
- @vanities made their first contribution in #11595
- @alvarosevilla95 made their first contribution in #11661
Full Changelog: v1.72.5.dev1...v1.72.2.devMCP
Docker Run LiteLLM Proxy
```bash
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.72.2.devMCP
```
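Once the container is up, you can smoke-test it with the OpenAI SDK pointed at the proxy. This sketch assumes a model named `gpt-4o` has been configured (models live in the DB here, given `STORE_MODEL_IN_DB=True`) and that `sk-1234` is your master key; both are placeholders, not shipped defaults:

```python
# Smoke test: route one chat completion through the local proxy.
# "gpt-4o" and "sk-1234" are placeholders for your configured model and key.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:4000", api_key="sk-1234")
resp = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "ping"}],
)
print(resp.choices[0].message.content)
```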
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|---|---|---|---|---|---|---|---|---|---|
| /chat/completions | Passed ✅ | 220.0 | 241.97 | 6.29 | 0.0 | 1883 | 0 | 199.49 | 1258.82 |
| Aggregated | Passed ✅ | 220.0 | 241.97 | 6.29 | 0.0 | 1883 | 0 | 199.49 | 1258.82 |