What's Changed
- [Fix] Azure Post-API Call occurs before Pre-API Call in CustomLogger by @ishaan-jaff in #4451
- [Fix-Proxy] Fix in memory caching memory leak by @ishaan-jaff in #4366
- fix: do not resolve vertex project id from creds by @ushuz in #4445
- fix(utils.py): return 'response_cost' in completion call by @krrishdholakia in #4436
- feat(azure.py): azure tts support by @krrishdholakia in #4449
- fix(token_counter.py): New `get_modified_max_tokens` helper func by @krrishdholakia in #4446
- docs: minor link repairs by @nibalizer in #4460
- Fix typo by @lnguyen in #4457
- Docs: create pass-through routes on LiteLLM proxy (tutorial: set up Cohere Re-Rank endpoint) by @ishaan-jaff in #4463
- [Feat] Allow users to set pass through endpoint + add Cohere Re-Rank by @ishaan-jaff in #4462
- [Enterprise] Return Raw response from Lakera in failed responses by @ishaan-jaff in #4464
- [Feat] - Proxy support Passing through Langfuse requests by @ishaan-jaff in #4465
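The pass-through endpoint feature (#4462) lets the proxy forward requests for routes it doesn't natively serve. As a rough sketch only — the key names below are assumptions for illustration; check the LiteLLM proxy docs linked from #4463 for the authoritative schema — a config forwarding a Re-Rank route to Cohere might look like:

```yaml
# Hypothetical sketch: expose /v1/rerank on the proxy and forward it to Cohere.
# Field names are assumptions; consult the pass-through endpoint tutorial.
general_settings:
  pass_through_endpoints:
    - path: "/v1/rerank"                         # route exposed on the proxy
      target: "https://api.cohere.com/v1/rerank" # upstream endpoint
      headers:
        Authorization: "bearer os.environ/COHERE_API_KEY"
```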
Full Changelog: v1.40.29...v1.40.31
Docker Run LiteLLM Proxy
```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.40.31
```
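Once the container is up, the proxy serves an OpenAI-compatible `/chat/completions` route on port 4000. A minimal stdlib-only sketch of building such a request — `sk-1234` is a placeholder key and the model name is only an example; use whatever key and models your deployment is configured with:

```python
import json
import urllib.request

def build_chat_request(base_url: str, api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build a POST request for the proxy's OpenAI-compatible /chat/completions route."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )

req = build_chat_request("http://localhost:4000", "sk-1234", "gpt-3.5-turbo", "Hello!")
# Actually sending it requires the container above to be running:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```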
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Passed ✅ | 77 | 91.1696113255562 | 6.455451046411027 | 0.0 | 1929 | 0 | 66.79889399998729 | 1628.1963670000437 |
| Aggregated | Passed ✅ | 77 | 91.1696113255562 | 6.455451046411027 | 0.0 | 1929 | 0 | 66.79889399998729 | 1628.1963670000437 |