What's Changed
- fix(utils.py): fix the response object returned when n>1 for stream=true by @krrishdholakia in #3308
- [Fix] sending deployment latencies to slack alerting - lowest_latency by @ishaan-jaff in #3301
- fix(proxy/utils.py): log rejected proxy requests to langfuse by @krrishdholakia in #3310
Full Changelog: v1.35.28...v1.35.28.dev1
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|---|---|---|---|---|---|---|---|---|---|
| /chat/completions | Passed ✅ | 78 | 85.02681108551172 | 1.4062802929202902 | 0.0 | 421 | 0 | 72.50512399997433 | 1238.4483960000239 |
| /health/liveliness | Passed ✅ | 62 | 64.63901720184093 | 15.241940561984046 | 0.0033403332373403566 | 4563 | 1 | 59.40863600000057 | 1528.008194999984 |
| /health/readiness | Passed ✅ | 62 | 64.47801373834994 | 15.626078884278188 | 0.0 | 4678 | 0 | 59.54985199997509 | 1021.8157470000051 |
| Aggregated | Passed ✅ | 62 | 65.4494174318984 | 32.274299739182524 | 0.0033403332373403566 | 9662 | 1 | 59.40863600000057 | 1528.008194999984 |
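
For readers checking the numbers, here is a minimal sketch of how the Aggregated row appears to relate to the three endpoint rows. It assumes, based only on the figures in the table, that counts and rates are summed and that the average response time is weighted by request count. The `rows` list and the `aggregate` helper are hypothetical names for illustration and are not part of LiteLLM or the load-test tooling.

```python
import math

# Per-endpoint values copied from the table above:
# (name, avg_response_ms, requests_per_s, request_count, failure_count)
rows = [
    ("/chat/completions", 85.02681108551172, 1.4062802929202902, 421, 0),
    ("/health/liveliness", 64.63901720184093, 15.241940561984046, 4563, 1),
    ("/health/readiness", 64.47801373834994, 15.626078884278188, 4678, 0),
]


def aggregate(rows):
    """Combine per-endpoint rows into one summary row (assumed method)."""
    total_requests = sum(r[3] for r in rows)
    total_failures = sum(r[4] for r in rows)
    # Throughput is additive across endpoints measured over the same run.
    total_rps = sum(r[2] for r in rows)
    # Each endpoint's average is weighted by how many requests it served.
    weighted_avg_ms = sum(r[1] * r[3] for r in rows) / total_requests
    return {
        "avg_response_ms": weighted_avg_ms,
        "requests_per_s": total_rps,
        "request_count": total_requests,
        "failure_count": total_failures,
    }


agg = aggregate(rows)
print(agg)
```

Under these assumptions the sketch reproduces the Aggregated row's request count (9662), failure count (1), Requests/s, and Average Response Time. The median cannot be recovered this way, since it depends on the individual response times rather than on per-endpoint summaries.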