What's Changed
- docs: Update references to Ollama repository url by @DaxServer in #2778
- fix(docs): Correct Docker pull command in deploy.md by @DaxServer in #2779
- (fix) improve async perf by 100ms by @ishaan-jaff in #2774
- support cohere_chat in get_api_key by @phact in #2782
- fix(router.py): fix check for context window fallbacks by @krrishdholakia in #2783
- [Feat] Proxy - high traffic redis caching - when using url by @ishaan-jaff in #2785
- fix(proxy_server.py): don't require scope for team-based jwt access by @krrishdholakia in #2787
- [Feat] Allow using model = * on proxy config.yaml by @ishaan-jaff in #2788
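The wildcard model entry and redis caching features above are configured in the proxy's `config.yaml`. A minimal sketch, assuming the `model_list` / `litellm_settings` layout from the LiteLLM proxy docs; the redis URL is a placeholder:

```yaml
model_list:
  # model_name "*" routes any requested model name through this entry
  - model_name: "*"
    litellm_params:
      model: "*"          # pass the requested model name through unchanged

litellm_settings:
  cache: true             # enable response caching for high-traffic deployments
  cache_params:
    type: redis
    url: "redis://localhost:6379"  # placeholder redis connection url
```

This is a sketch under the assumptions stated above, not a definitive configuration; consult the LiteLLM proxy documentation for the exact field names supported in your version.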
New Contributors
- @DaxServer made their first contribution in #2778
- @phact made their first contribution in #2782
Full Changelog: v1.34.18...v1.34.19
Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|---|---|---|---|---|---|---|---|---|---|
| /chat/completions | Passed ✅ | 82 | 89.07 | 1.61 | 0.0 | 482 | 0 | 75.70 | 1423.44 |
| /health/liveliness | Passed ✅ | 66 | 68.38 | 15.25 | 0.0033 | 4567 | 1 | 63.28 | 1515.33 |
| /health/readiness | Passed ✅ | 66 | 69.01 | 15.54 | 0.0 | 4653 | 0 | 63.62 | 1312.59 |
| Aggregated | Passed ✅ | 66 | 69.71 | 32.41 | 0.0033 | 9702 | 1 | 63.28 | 1515.33 |