✨ Today we're launching support for Gemini Context Caching on LiteLLM Proxy- Start here: https://docs.litellm.ai/docs/providers/vertex#context-caching
🔥 Fix UI - Easily add Groq models
⚡️ Admin UI - Azure OpenAI don't require api version when adding model
📈 UI - sort providers in alphabetical order on Models Page
🛠️ [Fix-Bug]: Whisper not working
📈 fix handle case when service logger has no attribute prometheusService
What's Changed
- fix handle case when service logger has no attribute prometheusService by @ishaan-jaff in #5115
- [Feat-Proxy] Add Support for VertexAI context caching by @ishaan-jaff in #5119
- [Fix-Bug]: Whisper is broken by @ishaan-jaff in #5114
- fix(user_api_key_auth.py): Fix issue with key auth w/ user not in db by @krrishdholakia in #5117
- UI add groq models by @ishaan-jaff in #5125
- ui show litellm model name by @ishaan-jaff in #5123
- Admin UI - add mistral ai by @ishaan-jaff in #5126
- Admin UI - Azure OpenAI dont require api version azure by @ishaan-jaff in #5127
- UI - sort providers in alphabetical order by @ishaan-jaff in #5128
Full Changelog: v1.43.3...v1.43.4
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.43.4
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 140.0 | 161.51749219841636 | 6.333549395955978 | 0.23729921219676753 | 1895 | 71 | 102.82188400003633 | 956.2377719999517 |
Aggregated | Passed ✅ | 140.0 | 161.51749219841636 | 6.333549395955978 | 0.23729921219676753 | 1895 | 71 | 102.82188400003633 | 956.2377719999517 |