## What's Changed
- LiteLLM Minor Fixes & Improvements (01/08/2025) - p2 by @krrishdholakia in #7643
- Litellm dev 01 08 2025 p1 by @krrishdholakia in #7640
- (proxy - RPS) - Get 2K RPS at 4 instances, minor fix for caching_handler by @ishaan-jaff in #7655
- (proxy - RPS) - Get 2K RPS at 4 instances, minor fix `aiohttp_openai/` by @ishaan-jaff in #7659
- (proxy perf improvement) - use `uvloop` for higher RPS (10%-20% higher RPS) by @ishaan-jaff in #7662 (see the sketch after this list)
- (Feat - Batches API) add support for retrieving vertex api batch jobs by @ishaan-jaff in #7661 (see the retrieval sketch after this list)
- (proxy-latency fixes) use asyncio tasks for logging db metrics by @ishaan-jaff in #7663 (see the sketch after this list)
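The `uvloop` and async-logging items above are generic asyncio techniques rather than anything LiteLLM-specific. A minimal sketch of the pattern (all names here are hypothetical, not LiteLLM internals): swap in uvloop as the event loop, and push DB metric writes into background tasks so the request path never blocks on them.

```python
import asyncio

import uvloop  # pip install uvloop; Linux/macOS only

# Hypothetical stand-in for a DB metrics write.
async def log_db_metrics(payload: dict) -> None:
    await asyncio.sleep(0.01)  # simulate a DB round trip
    print("logged:", payload)

background_tasks: set[asyncio.Task] = set()

async def handle_request(payload: dict) -> str:
    # Fire-and-forget: respond without awaiting the DB write.
    task = asyncio.create_task(log_db_metrics(payload))
    # Keep a reference so the task isn't garbage-collected mid-flight.
    background_tasks.add(task)
    task.add_done_callback(background_tasks.discard)
    return "response"

async def main() -> None:
    print(await handle_request({"model": "gpt-4o", "latency_ms": 230}))
    await asyncio.gather(*background_tasks)  # let pending log tasks finish

if __name__ == "__main__":
    uvloop.install()  # replace the default asyncio event loop policy with uvloop
    asyncio.run(main())
```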
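For the Vertex batch-retrieval item, LiteLLM's Batches API mirrors the OpenAI shape; a sketch, assuming a batch job was created earlier and with the batch id as a placeholder:

```python
import litellm

# Sketch only: retrieve a previously created batch job from Vertex AI.
# custom_llm_provider selects the backend; the batch id is a placeholder.
batch = litellm.retrieve_batch(
    batch_id="batch-abc123",
    custom_llm_provider="vertex_ai",
)
print(batch.status)
```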
Full Changelog: v1.57.4...v1.57.5
## Docker Run LiteLLM Proxy
```shell
docker run \
    -e STORE_MODEL_IN_DB=True \
    -p 4000:4000 \
    ghcr.io/berriai/litellm:main-v1.57.5
```
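Once the container is up, the proxy speaks the OpenAI API on port 4000. A minimal smoke test with the `openai` Python client (the model name and API key below are placeholders for whatever you have configured on the proxy):

```python
from openai import OpenAI

# Point the standard OpenAI client at the local LiteLLM proxy.
client = OpenAI(base_url="http://localhost:4000", api_key="sk-1234")

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder: any model configured on the proxy
    messages=[{"role": "user", "content": "Hello from LiteLLM!"}],
)
print(response.choices[0].message.content)
```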
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
## Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Passed ✅ | 230.0 | 282.70 | 6.12 | 0.0 | 1830 | 0 | 206.44 | 3375.45 |
| Aggregated | Passed ✅ | 230.0 | 282.70 | 6.12 | 0.0 | 1830 | 0 | 206.44 | 3375.45 |