What's Changed
- fix: fix streaming with httpx client by @krrishdholakia in #3944
- feat(scheduler.py): add request prioritization scheduler by @krrishdholakia in #3954 (hedged example sketch below)
- [FEAT] Perf improvements - litellm.completion / litellm.acompletion - Cache OpenAI client by @ishaan-jaff in #3956
- fix(http_handler.py): support verify_ssl=False when using httpx client by @krrishdholakia in #3959
- Litellm docker compose start by @krrishdholakia in #3961
Full Changelog: v1.39.6...v1.40.0
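The request prioritization scheduler (#3954) is the headline feature of this release. Below is a minimal, hypothetical sketch of sending a prioritized request to the proxy; the `priority` field name and its semantics (lower value = scheduled sooner) are assumptions inferred from the PR title, so check #3954 and the LiteLLM docs for the actual interface:

```python
import requests

# Hypothetical sketch: pass a priority hint alongside a normal
# /chat/completions request to the LiteLLM proxy.
# The "priority" field and its semantics are ASSUMPTIONS based on the
# title of #3954 -- consult the LiteLLM docs for the real interface.
response = requests.post(
    "http://localhost:4000/chat/completions",
    headers={"Authorization": "Bearer sk-1234"},  # placeholder proxy key
    json={
        "model": "gpt-3.5-turbo",  # placeholder: any model on your proxy
        "messages": [{"role": "user", "content": "Hello!"}],
        "priority": 0,  # assumed: lower values are served first
    },
)
print(response.json())
```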
Docker Run LiteLLM Proxy
```shell
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.40.0
```
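Once the container is up, the proxy exposes an OpenAI-compatible API on port 4000. A minimal sketch using the official `openai` Python SDK; the API key and model name are placeholders that depend on how your proxy is configured:

```python
from openai import OpenAI

# Point the OpenAI SDK at the local LiteLLM proxy instead of api.openai.com.
client = OpenAI(
    base_url="http://localhost:4000",
    api_key="sk-1234",  # placeholder: use your proxy's master/virtual key
)

response = client.chat.completions.create(
    model="gpt-3.5-turbo",  # placeholder: any model configured on the proxy
    messages=[{"role": "user", "content": "Hello from the proxy!"}],
)
print(response.choices[0].message.content)
```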
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|---|---|---|---|---|---|---|---|---|---|
| /chat/completions | Passed ✅ | 120.0 | 133.63 | 6.47 | 0.0 | 1936 | 0 | 94.77 | 801.18 |
| Aggregated | Passed ✅ | 120.0 | 133.63 | 6.47 | 0.0 | 1936 | 0 | 94.77 | 801.18 |