🚨 This release changes the DB schema (PR #3371). We recommend waiting 1-2 releases to let this bake in.
What's Changed
- usage based routing RPM count fix by @sumanth13131 in #3358
- Fix Cohere tool calling by @elisalimli in #3351
- [Feat] Write LLM Exception to LiteLLM Proxy DB by @ishaan-jaff in #3371
- fix(lowest_latency.py): allow setting a buffer for getting values within a certain latency threshold by @krrishdholakia in #3370
- Disambiguate invalid model name errors by @msabramo in #3374
- Revert "Disambiguate invalid model name errors" by @krrishdholakia in #3377
- fix(router.py): unify retry timeout logic across sync + async function_with_retries by @krrishdholakia in #3376
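To make the latency-buffer item (#3370) more concrete, here is a minimal sketch of buffer-based selection for lowest-latency routing. It is not LiteLLM's code: the function name `pick_deployment`, the interpretation of `buffer` as a fraction of the lowest latency, and the deployment names are all assumptions made for illustration.

```python
import random


def pick_deployment(latencies, buffer=0.0, rng=random):
    """Pick a deployment whose latency is within `buffer` of the fastest one.

    latencies: dict mapping deployment name -> average latency in seconds.
    buffer: hypothetical fractional tolerance; 0.5 means a deployment up to
            50% slower than the fastest one is still eligible.
    rng: object with a `choice` method (the `random` module by default).
    """
    if not latencies:
        raise ValueError("no deployments to choose from")
    lowest = min(latencies.values())
    threshold = lowest * (1 + buffer)
    # Sort the names so the candidate order does not depend on dict order.
    candidates = sorted(
        name for name, latency in latencies.items() if latency <= threshold
    )
    return rng.choice(candidates)


# Hypothetical deployments and latencies, for illustration only.
observed = {"deployment-a": 1.0, "deployment-b": 1.4, "deployment-c": 3.0}
print(pick_deployment(observed))              # buffer=0: only the fastest qualifies
print(pick_deployment(observed, buffer=0.5))  # deployment-a or deployment-b
```

With a buffer of zero the sketch always returns the single fastest deployment. A non-zero buffer spreads traffic across every deployment that is close enough to the fastest, which avoids sending all requests to one backend.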
New Contributors
Full Changelog: v1.35.32.dev1...v1.35.33
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|---|---|---|---|---|---|---|---|---|---|
| /chat/completions | Passed ✅ | 82 | 87.51 | 1.51 | 0.0 | 453 | 0 | 75.89 | 500.82 |
| /health/liveliness | Passed ✅ | 66 | 68.25 | 15.50 | 0.0 | 4641 | 0 | 63.38 | 1334.41 |
| /health/readiness | Passed ✅ | 66 | 69.03 | 15.35 | 0.0033 | 4595 | 1 | 63.32 | 1252.54 |
| Aggregated | Passed ✅ | 66 | 69.52 | 32.36 | 0.0033 | 9689 | 1 | 63.32 | 1334.41 |
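
The Aggregated row can be cross-checked against the per-endpoint rows: request and failure counts add up, and the aggregate throughput is the sum of the per-endpoint rates. Below is a small sketch of that arithmetic using figures from the table, with rates rounded to two decimals. The names `rows` and `aggregate` are illustrative and not part of the load-test tooling.

```python
# Per-endpoint figures from the table above, rates rounded to two decimals:
# (requests/s, request count, failure count)
rows = {
    "/chat/completions": (1.51, 453, 0),
    "/health/liveliness": (15.50, 4641, 0),
    "/health/readiness": (15.35, 4595, 1),
}


def aggregate(rows):
    """Sum rates and counts across endpoints and compute the failure ratio."""
    total_rps = sum(r[0] for r in rows.values())
    total_requests = sum(r[1] for r in rows.values())
    total_failures = sum(r[2] for r in rows.values())
    failure_ratio = total_failures / total_requests
    return total_rps, total_requests, total_failures, failure_ratio


total_rps, total_requests, total_failures, failure_ratio = aggregate(rows)
print(
    f"{total_requests} requests, {total_failures} failure(s), "
    f"{total_rps:.2f} req/s, {failure_ratio:.4%} failed"
)
```

The totals match the Aggregated row (9689 requests, 1 failure, about 32.36 requests/s). The single failure comes from /health/readiness.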