BerriAI/litellm v1.48.5.dev1 on GitHub

What's Changed

(perf improvement proxy) use one async async_batch_set_cache in parallel request limiter by @ishaan-jaff in #5956
(fix proxy) model_group/info support rerank models by @ishaan-jaff in #5955
(perf improvement proxy) use one redis set cache to update spend in db (30-40% perf improvement) by @ishaan-jaff in #5960
(perf proxy) don't run redis async_set_cache_pipeline when empty list passed to it by @ishaan-jaff in #5962
[Feat Proxy] Allow using hypercorn for http v2 by @ishaan-jaff in #5950

Full Changelog: v1.48.5...v1.48.5.dev1

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.48.5.dev1

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.48.5.dev1

Name	Status	Median Response Time (ms)	Average Response Time (ms)	Requests/s	Failures/s	Request Count	Failure Count	Min Response Time (ms)	Max Response Time (ms)
/chat/completions	Passed ✅	110.0	129.13028895631774	6.429070638139938	0.0	1923	0	90.43558999997003	762.7744659999962
Aggregated	Passed ✅	110.0	129.13028895631774	6.429070638139938	0.0	1923	0	90.43558999997003	762.7744659999962