## What's Changed
- UI Improvements + Fixes: remove 'default key' on user signup + fix display of available user models for personal key creation by @krrishdholakia in #9741
- Fix prompt caching for Anthropic tool calls by @aorwall in #9706 (usage sketch after this list)
- Pass through kwargs during acompletion, and unwrap extra_body for OpenRouter by @adrianlyjak in #9747 (usage sketch after this list)
- [Feat] UI - Test Key v2 page - allow testing image endpoints + polish the page by @ishaan-jaff in #9748
- [Feat] Allow assigning SSO users to teams on MSFT SSO by @ishaan-jaff in #9745
- Fix VertexAI Credential Caching issue by @krrishdholakia in #9756
- [Reliability] v2 DB Deadlock Reduction Architecture – Add Max Size for In-Memory Queue + Backpressure Mechanism by @ishaan-jaff in #9759
- fix(router.py): support reusable credentials via passthrough router by @krrishdholakia in #9758
- Allow team members to see team models by @krrishdholakia in #9742
- fix(xai/chat/transformation.py): filter out 'name' param for xai non-… by @krrishdholakia in #9761
- Gemini image generation output support by @krrishdholakia in #9646
- [Fix] issue where a metadata key exists but its value is None by @chaosddp in #9764
- fix(asr-groq): add groq whisper models to model cost map by @liuhu in #9648
- Update model_prices_and_context_window.json by @caramulrooney in #9620
- [Reliability] Emit operational metrics for new DB Transaction architecture by @ishaan-jaff in #9719
- [Security feature] Allow adding authentication on /metrics endpoints by @ishaan-jaff in #9766
- [Reliability] Prometheus: emit LLM provider on failure metric, making it easy to differentiate LiteLLM errors from LLM API errors by @ishaan-jaff in #9760
- Fix prisma migrate deploy to use correct directory by @krrishdholakia in #9767
- Add DBRX Anthropic w/ thinking + response_format support by @krrishdholakia in #9744 (usage sketch after this list)
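
For #9706, a minimal sketch of prompt caching with Anthropic tool calls. The model name and tool schema below are illustrative assumptions; the `cache_control` marker on the tool definition is what opts it into Anthropic's prompt cache:

```python
import litellm

# Illustrative sketch for #9706 (model name and tool schema are assumptions).
# Marking a tool definition with `cache_control` asks Anthropic to cache it,
# so repeated calls with the same tools reuse the cached tool prompt.
response = litellm.completion(
    model="anthropic/claude-3-5-sonnet-20240620",
    messages=[{"role": "user", "content": "What's the weather in SF?"}],
    tools=[
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Get the current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
            "cache_control": {"type": "ephemeral"},
        }
    ],
)
print(response.choices[0].message)
```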
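For #9747, a sketch of passing provider-specific fields to OpenRouter via `extra_body` (the `transforms` field and model id are assumptions; the point of the fix is that extra kwargs are forwarded during `acompletion` instead of being dropped):

```python
import asyncio
import litellm

async def main():
    # Sketch for #9747: extra_body is unwrapped and its fields forwarded
    # to OpenRouter; `transforms` is an OpenRouter-specific parameter.
    response = await litellm.acompletion(
        model="openrouter/anthropic/claude-3.5-sonnet",  # assumed model id
        messages=[{"role": "user", "content": "Hello"}],
        extra_body={"transforms": ["middle-out"]},
    )
    print(response.choices[0].message.content)

asyncio.run(main())
```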
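For #9744, a sketch of enabling extended thinking on a Databricks-hosted Anthropic model (the model id is an assumption; `thinking` mirrors Anthropic's parameter of the same name):

```python
import litellm

# Sketch for #9744 (model id is an assumption). `thinking` enables
# Anthropic extended thinking with a token budget for the reasoning trace.
response = litellm.completion(
    model="databricks/databricks-claude-3-7-sonnet",
    messages=[{"role": "user", "content": "Plan a 3-step rollout for a new API."}],
    thinking={"type": "enabled", "budget_tokens": 1024},
    max_tokens=2048,
)
print(response.choices[0].message.content)
```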
## New Contributors
- @aorwall made their first contribution in #9706
- @adrianlyjak made their first contribution in #9747
- @chaosddp made their first contribution in #9764
- @liuhu made their first contribution in #9648
- @caramulrooney made their first contribution in #9620
**Full Changelog**: https://github.com/BerriAI/litellm/compare/v1.65.3.dev5...v1.65.4-nightly
## Docker Run LiteLLM Proxy
```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.65.4-nightly
```
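
Once the container is up, you can sanity-check it with an OpenAI-style request (a sketch; the model name below is an assumption, use one configured on your proxy):

```python
import requests

# Sketch: hit the proxy's OpenAI-compatible endpoint on port 4000.
resp = requests.post(
    "http://localhost:4000/v1/chat/completions",
    json={
        "model": "gpt-3.5-turbo",  # assumed model name
        "messages": [{"role": "user", "content": "Hello"}],
    },
)
print(resp.json())
```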
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
## Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Passed ✅ | 200.0 | 223.77 | 6.16 | 0.0 | 1842 | 0 | 181.88 | 4326.02 |
| Aggregated | Passed ✅ | 200.0 | 223.77 | 6.16 | 0.0 | 1842 | 0 | 181.88 | 4326.02 |