What's Changed
- Add return_exceptions to batch_completion (retry) by @ffreemt in #3462
- Fix issue with delta being None when Deferred / Async Content Filter is enabled on Azure OpenAI by @afbarbaro in #3812
- docs - using vllm with litellm proxy server by @ishaan-jaff in #3822
- Log errors in Traceloop Integration by @nirga in #3780
- [Feat] Enterprise - Send Email Alerts when user, key crosses budget by @ishaan-jaff in #3826
- fix(slack_alerting.py): support region based outage alerting by @krrishdholakia in #3828
- [Feat] - send Email alerts when making new key by @ishaan-jaff in #3829
- Revert "Log errors in Traceloop Integration" by @ishaan-jaff in #3831
Full Changelog: v1.38.2...v1.38.3
Docker Run LiteLLM Proxy

```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.38.3
```
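Once the container is up, the proxy serves an OpenAI-compatible API on port 4000 (matching the `-p 4000:4000` mapping above). A minimal sketch of building such a request with the standard library; the model name and Bearer token here are placeholders for your own deployment, not values from this release:

```python
import json
import urllib.request

# OpenAI-compatible chat payload; "gpt-3.5-turbo" is a placeholder model name.
payload = {
    "model": "gpt-3.5-turbo",
    "messages": [{"role": "user", "content": "Hello"}],
}

# Build (but don't send) the request. "sk-1234" is a placeholder key.
req = urllib.request.Request(
    "http://localhost:4000/chat/completions",
    data=json.dumps(payload).encode(),
    headers={"Authorization": "Bearer sk-1234", "Content-Type": "application/json"},
    method="POST",
)

print(req.full_url)  # → http://localhost:4000/chat/completions
# urllib.request.urlopen(req) would dispatch it once the container is running.
```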
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|---|---|---|---|---|---|---|---|---|---|
| /chat/completions | Failed ❌ | 22 | 24.90 | 1.52 | 1.52 | 454 | 454 | 17.79 | 181.74 |
| /health/liveliness | Failed ❌ | 21 | 24.08 | 15.67 | 15.67 | 4691 | 4691 | 17.24 | 1213.63 |
| /health/readiness | Failed ❌ | 21 | 24.07 | 15.49 | 15.49 | 4639 | 4639 | 17.31 | 1084.64 |
| Aggregated | Failed ❌ | 21 | 24.11 | 32.68 | 32.68 | 9784 | 9784 | 17.24 | 1213.63 |