What's Changed
- Litellm user daily activity allow non admin usage by @krrishdholakia in #9695
- fix(model_management_endpoints.py): fix allowing team admins to update team models by @krrishdholakia in #9697
- Add support for max_completion_tokens to the Cohere chat transformati… by @simha104 in #9701
- fix(gemini/): add gemini/ route embedding optional param mapping support by @krrishdholakia in #9677
- Add Google AI Studio
/v1/files
upload API support by @krrishdholakia in #9645 - [Docs] High Availability Setup (Resolve DB Deadlocks) by @ishaan-jaff in #9714
- Bump image-size from 1.1.1 to 1.2.1 in /docs/my-website by @dependabot in #9708
- [Bug fix] Azure o-series tool calling by @ishaan-jaff in #9694
- [Reliability Fix] - Use Redis for PodLock Manager instead of PG (ensures no deadlocks occur) by @ishaan-jaff in #9715
- Ban hardcoded numbers - merge of #9513 by @krrishdholakia in #9709
- [Feat] Add VertexAI gemini-2.0-flash by @Dobiasd in #9723
- Fix: Use request body in curl log for Gemini streaming mode by @fengjiajie in #9736
- LiteLLM Minor Fixes & Improvements (04/02/2025) by @krrishdholakia in #9725
- fix:Gemini Flash 2.0 implementation is not returning the logprobs by @sajdakabir in #9713
New Contributors
- @simha104 made their first contribution in #9701
- @Dobiasd made their first contribution in #9723
- @sajdakabir made their first contribution in #9713
Full Changelog: v1.65.2.dev1...v1.65.3-nightly
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.65.3-nightly
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.65.3-nightly
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 250.0 | 280.0203592110988 | 6.144626182419757 | 0.0 | 1838 | 0 | 216.55763899997282 | 5015.033350000011 |
Aggregated | Passed ✅ | 250.0 | 280.0203592110988 | 6.144626182419757 | 0.0 | 1838 | 0 | 216.55763899997282 | 5015.033350000011 |