What's Changed
- fix(slack_alerting.py): use in-memory cache for checking request status by @krrishdholakia in #4520
- feat(vertex_httpx.py): Support cachedContent by @Manouchehri in #4492
- [Fix+Test] /audio/transcriptions - use initialized OpenAI / Azure OpenAI clients by @ishaan-jaff in #4519
- [Fix-Proxy] Background health checks use deep copy of model list for _run_background_health_check by @ishaan-jaff in #4518
- refactor(azure.py): move azure dall-e calls to httpx client by @krrishdholakia in #4523
Full Changelog: v1.41.3.dev2...v1.41.3.dev3
Docker Run LiteLLM Proxy
```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.41.3.dev3
```
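Once the container is up, the proxy serves an OpenAI-compatible API on port 4000. A minimal smoke test is sketched below; the model name `gpt-3.5-turbo` and the key `sk-1234` are placeholders that depend on the models and master key configured on your proxy:

```shell
# Send a test chat completion through the proxy
# (model name and API key below are placeholders for your own config)
curl http://localhost:4000/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-1234" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
```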
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|---|---|---|---|---|---|---|---|---|---|
| /chat/completions | Passed ✅ | 100.0 | 121.89 | 6.61 | 0.0 | 1975 | 0 | 85.08 | 1232.18 |
| Aggregated | Passed ✅ | 100.0 | 121.89 | 6.61 | 0.0 | 1975 | 0 | 85.08 | 1232.18 |