What's Changed
- Litellm dev 01 20 2025 p3 by @krrishdholakia in #7890
- (e2e testing + minor refactor) - Virtual Key Max budget check by @ishaan-jaff in #7888
- fix(proxy_server.py): fix get model info when litellm_model_id is set + move model analytics to free by @krrishdholakia in #7886
- fix: add default credential for azure (#7095) by @krrishdholakia in #7891
- (Bug fix) - Allow setting `null` for `max_budget`, `rpm_limit`, `tpm_limit` when updating values on a team by @ishaan-jaff in #7912 (see the sketch after this list)
- (fix langfuse tags) - read tags from `StandardLoggingPayload` by @ishaan-jaff in #7903
- (Feat) Add `x-litellm-overhead-duration-ms` and `x-litellm-response-duration-ms` headers in responses from LiteLLM by @ishaan-jaff in #7899
- (Code quality) - Ban recursive functions in codebase by @ishaan-jaff in #7910
- Litellm dev 01 21 2025 p1 by @krrishdholakia in #7898
- (Feat - prometheus) - emit `litellm_overhead_latency_metric` by @ishaan-jaff in #7913
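
Clearing a team's limits with the new `null` support might look like the following call to the proxy's `/team/update` endpoint. This is a minimal sketch: the proxy address, the `my-team` team id, and the `sk-1234` key are placeholders for your own deployment.

```shell
# Minimal sketch: clear a team's budget and rate limits by setting them to null.
# Assumes a proxy at localhost:4000; "sk-1234" and "my-team" are placeholders.
curl -sS http://localhost:4000/team/update \
  -H "Authorization: Bearer sk-1234" \
  -H "Content-Type: application/json" \
  -d '{
    "team_id": "my-team",
    "max_budget": null,
    "rpm_limit": null,
    "tpm_limit": null
  }'
```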
Full Changelog: v1.59.1...v1.59.2
Docker Run LiteLLM Proxy
```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.59.2
```
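
Once the container is up, a quick smoke test can double as a look at the new timing headers from #7899. A minimal sketch, assuming a model named `gpt-4o` has been configured on the proxy and `sk-1234` stands in for your key:

```shell
# Send an OpenAI-compatible request and dump response headers (-D -),
# discarding the body (-o /dev/null) to surface the new
# x-litellm-overhead-duration-ms / x-litellm-response-duration-ms headers.
# "gpt-4o" and "sk-1234" are placeholders for your own model and key.
curl -sS -D - -o /dev/null http://localhost:4000/chat/completions \
  -H "Authorization: Bearer sk-1234" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "hi"}]}'
```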
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|---|---|---|---|---|---|---|---|---|---|
| /chat/completions | Passed ✅ | 250.0 | 277.38 | 6.12 | 0.0 | 1832 | 0 | 225.22 | 1457.68 |
| Aggregated | Passed ✅ | 250.0 | 277.38 | 6.12 | 0.0 | 1832 | 0 | 225.22 | 1457.68 |