What's Changed
- [Fix] Router/proxy: show better client-side errors when no healthy deployments are available by @ishaan-jaff in #3679
- [Fix] Flush Langfuse logs on proxy shutdown by @ishaan-jaff in #3681
- Allow non-admins to use `/engines/{model}/chat/completions` by @msabramo in #3663
- Fix `datetime.datetime.utcnow` DeprecationWarning by @msabramo in #3686
- [Fix] Include model name in cooldown alerts by @ishaan-jaff in #3690
- feat(lago.py): Enable usage-based billing with Lago by @krrishdholakia in #3685
- [UI] End User Spend: fix timezone diff bug by @ishaan-jaff in #3692
- [Feat] `token_counter` endpoint by @ishaan-jaff in #3682
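The `datetime.datetime.utcnow` fix above addresses a warning introduced in Python 3.12, where `utcnow()` is deprecated because it returns a naive datetime. The usual replacement is a timezone-aware `datetime.now(timezone.utc)`. A minimal sketch of that substitution (illustrative only, not the actual patch from #3686):

```python
from datetime import datetime, timezone

# Deprecated since Python 3.12: datetime.utcnow() returns a *naive*
# datetime (tzinfo is None) and emits a DeprecationWarning.
# naive_now = datetime.utcnow()

# Recommended replacement: an *aware* datetime pinned to UTC.
aware_now = datetime.now(timezone.utc)

print(aware_now.tzinfo)  # timezone.utc
```

An aware datetime avoids silent UTC/local-time mix-ups when timestamps are compared or serialized, which is why the warning is worth fixing rather than suppressing.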
Full Changelog: v1.37.12...v1.37.12.dev1
Docker Run LiteLLM Proxy

```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.37.12.dev1
```
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|---|---|---|---|---|---|---|---|---|---|
| /chat/completions | Failed ❌ | 20 | 22.20 | 1.56 | 1.56 | 468 | 468 | 18.17 | 121.02 |
| /health/liveliness | Failed ❌ | 18 | 21.38 | 15.76 | 15.76 | 4718 | 4718 | 16.76 | 224.46 |
| /health/readiness | Failed ❌ | 18 | 21.86 | 15.53 | 15.53 | 4648 | 4648 | 16.63 | 755.70 |
| Aggregated | Failed ❌ | 18 | 21.64 | 32.85 | 32.85 | 9834 | 9834 | 16.63 | 755.70 |