What's Changed
- fix: fix streaming with httpx client by @krrishdholakia in #3944
- feat(scheduler.py): add request prioritization scheduler by @krrishdholakia in #3954 (hedged example sketch below)
- [FEAT] Perf improvements - litellm.completion / litellm.acompletion - Cache OpenAI client by @ishaan-jaff in #3956
- fix(http_handler.py): support verify_ssl=False when using httpx client by @krrishdholakia in #3959
- Litellm docker compose start by @krrishdholakia in #3961
Full Changelog: v1.39.6...v1.40.0
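The request prioritization scheduler (#3954) is the headline feature of this release. Below is a minimal, hypothetical sketch of sending a prioritized request to the proxy; the `priority` field name and its semantics (lower value = scheduled sooner) are assumptions inferred from the PR title, so check #3954 and the LiteLLM docs for the actual interface:

```python
import requests

# Hypothetical sketch: pass a priority hint alongside a normal
# /chat/completions request to the LiteLLM proxy.
# The "priority" field and its semantics are ASSUMPTIONS based on the
# title of #3954 -- consult the LiteLLM docs for the real interface.
response = requests.post(
    "http://localhost:4000/chat/completions",
    headers={"Authorization": "Bearer sk-1234"},  # placeholder proxy key
    json={
        "model": "gpt-3.5-turbo",  # placeholder: any model on your proxy
        "messages": [{"role": "user", "content": "Hello!"}],
        "priority": 0,  # assumed: lower values are served first
    },
)
print(response.json())
```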
Docker Run LiteLLM Proxy
```shell
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.40.0
```
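Once the container is up, the proxy exposes an OpenAI-compatible API on port 4000. A minimal sketch using the official `openai` Python SDK; the API key and model name are placeholders that depend on how your proxy is configured:

```python
from openai import OpenAI

# Point the OpenAI SDK at the local LiteLLM proxy instead of api.openai.com.
client = OpenAI(
    base_url="http://localhost:4000",
    api_key="sk-1234",  # placeholder: use your proxy's master/virtual key
)

response = client.chat.completions.create(
    model="gpt-3.5-turbo",  # placeholder: any model configured on the proxy
    messages=[{"role": "user", "content": "Hello from the proxy!"}],
)
print(response.choices[0].message.content)
```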
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|---|---|---|---|---|---|---|---|---|---|
| /chat/completions | Passed ✅ | 120.0 | 133.63 | 6.47 | 0.0 | 1936 | 0 | 94.77 | 801.18 |
| Aggregated | Passed ✅ | 120.0 | 133.63 | 6.47 | 0.0 | 1936 | 0 | 94.77 | 801.18 |