BerriAI/litellm v1.48.6 on GitHub

What's Changed

(perf improvement proxy) use one async_batch_set_cache call in parallel request limiter by @ishaan-jaff in #5956
(fix proxy) model_group/info support rerank models by @ishaan-jaff in #5955
(perf improvement proxy) use one redis set cache to update spend in db (30-40% perf improvement) by @ishaan-jaff in #5960
(perf proxy) don't run redis async_set_cache_pipeline when empty list passed to it by @ishaan-jaff in #5962
[Feat Proxy] Allow using hypercorn for http v2 by @ishaan-jaff in #5950
(feat proxy prometheus) track virtual key, key alias, error code, error code class on prometheus by @ishaan-jaff in #5968
(proxy prometheus) track api key and team in latency metrics by @ishaan-jaff in #5966
(feat prometheus proxy) track remaining team and key alias in deployment failure metrics by @ishaan-jaff in #5967
(proxy docker) add sentry sdk to litellm docker by @ishaan-jaff in #5965
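
Two of the perf items above (#5956 and #5962) describe a general pattern: group several cache writes into one batched call, and skip the call when there is nothing to write. The sketch below illustrates that pattern only. It is not LiteLLM's implementation; `InMemoryPipelineCache`, `set_many`, and the key names are hypothetical stand-ins, and an in-memory dict replaces Redis so the example runs without a server.

```python
import asyncio
from typing import Any


class InMemoryPipelineCache:
    """Hypothetical stand-in for a Redis-backed cache (not a LiteLLM class)."""

    def __init__(self) -> None:
        self.store: dict[str, Any] = {}
        self.round_trips = 0  # counts simulated network round trips

    async def set_many(self, items: list[tuple[str, Any]]) -> None:
        # Guard: an empty batch has nothing to write, so skip the round trip.
        if not items:
            return
        self.round_trips += 1
        await asyncio.sleep(0)  # yield to the event loop, as real I/O would
        self.store.update(items)


async def main() -> InMemoryPipelineCache:
    cache = InMemoryPipelineCache()
    # One batched call instead of three separate writes.
    await cache.set_many(
        [
            ("key:requests", 1),   # hypothetical key names
            ("user:requests", 1),
            ("team:requests", 1),
        ]
    )
    # Empty batch: returns early and no round trip is counted.
    await cache.set_many([])
    return cache


cache = asyncio.run(main())
print(cache.round_trips, len(cache.store))
```

With a real Redis client the saving would come from sending all writes in a single pipeline rather than one request per key; the counter here only models that effect.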

Full Changelog: v1.48.5...v1.48.6

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.48.6

Name	Status	Median Response Time (ms)	Average Response Time (ms)	Requests/s	Failures/s	Request Count	Failure Count	Min Response Time (ms)	Max Response Time (ms)
/chat/completions	Passed ✅	110.0	123.41082122468366	6.334644427987359	0.0	1896	0	88.9820840000084	3007.4007179999853
Aggregated	Passed ✅	110.0	123.41082122468366	6.334644427987359	0.0	1896	0	88.9820840000084	3007.4007179999853