What's Changed
- Add support for async streaming to watsonx provider by @simonsanvil in #3479
- feat(proxy_server.py): add CRUD endpoints for 'end_user' management by @krrishdholakia in #3536
- Revert "Add support for async streaming to watsonx provider " by @krrishdholakia in #3546
- [Feat] support `stream_options` param for OpenAI by @ishaan-jaff in #3537
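For context on the `stream_options` item above: in the OpenAI chat-completions API, `stream_options` accompanies `stream: true` and can request a final usage chunk. A minimal sketch of such a request body (the model name is an illustrative assumption, not taken from this release):

```python
import json

# Sketch of an OpenAI-style streaming chat completion request that sets
# the `stream_options` parameter. `include_usage` asks the server to append
# a final chunk with token-usage statistics to the stream.
payload = {
    "model": "gpt-3.5-turbo",  # illustrative assumption
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": True,
    "stream_options": {"include_usage": True},
}

# Serialize the body as it would be sent to the proxy's
# /chat/completions endpoint.
body = json.dumps(payload)
print(body)
```

This only builds the request body; actually sending it requires a running proxy (see the Docker command below).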
Full Changelog: v1.36.4-stable...v1.37.0
Docker Run LiteLLM Proxy

```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.37.0
```
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|---|---|---|---|---|---|---|---|---|---|
| /chat/completions | Failed ❌ | 24 | 30.92 | 1.60 | 1.60 | 480 | 480 | 22.73 | 1106.27 |
| /health/liveliness | Failed ❌ | 23 | 27.54 | 15.53 | 15.53 | 4650 | 4650 | 21.78 | 1163.80 |
| /health/readiness | Failed ❌ | 23 | 27.39 | 15.80 | 15.80 | 4730 | 4730 | 21.73 | 370.48 |
| Aggregated | Failed ❌ | 23 | 27.63 | 32.93 | 32.93 | 9860 | 9860 | 21.73 | 1163.80 |