We're launching LLM API outage tracking in LiteLLM 1.43.5 📈 Start here: https://docs.litellm.ai/docs/proxy/prometheus
🪨 [Fix] Support translating tool call names for AWS Bedrock
✨ [UI] Add support for adding Cohere embedding models on the UI
💵 Added cost tracking for Cohere embedding models
🛠️ [Feat] v2 Prometheus alerting for deployment outage / healthy / partial-outage states (see the config sketch after this list)
🪢 [Feat-Langfuse] log VertexAI Grounding Metadata as Spans
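
If you want to try the new Prometheus metrics yourself, here's a minimal sketch. It assumes you start the proxy with a config file; the success_callback / failure_callback values follow the Prometheus docs linked above, and config.yaml plus its mount path are placeholders you can change:

# Write a minimal proxy config that turns on Prometheus logging
cat > config.yaml <<'EOF'
litellm_settings:
  success_callback: ["prometheus"]
  failure_callback: ["prometheus"]
EOF

# Run the proxy with the config mounted in
docker run \
  -v $(pwd)/config.yaml:/app/config.yaml \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.43.5 \
  --config /app/config.yaml

# Metrics are then exposed in Prometheus format on the /metrics endpoint
curl http://localhost:4000/metrics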
What's Changed
- feat: hash prompt when caching by @prd-tuong-nguyen in #5105
- feat: set max_internal_budget for user w/ sso by @krrishdholakia in #5120
- Litellm sso team member add by @krrishdholakia in #5129
- [Feat] Add pricing for cohere embedding models by @ishaan-jaff in #5137
- [Feat] v2 prometheus deployment outage, healthy, partial outage alerting by @ishaan-jaff in #5134
- ui allow adding cohere models by @ishaan-jaff in #5136
- Feat - Translate openai function names to bedrock converse schema by @ishaan-jaff in #5138
- [Feat-Langfuse] log VertexAI Grounding Metadata as Spans by @ishaan-jaff in #5139
- [Fix] Place bedrock modified tool call name in output by @ishaan-jaff in #5144
New Contributors
- @prd-tuong-nguyen made their first contribution in #5105
Full Changelog: v1.43.4...v1.43.5
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.43.5
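
Once the container is up, you can sanity-check it with a test request. A minimal sketch — the model name and the sk-1234 key below are placeholders; substitute a model and virtual key you've actually configured:

# Hypothetical test request; replace the model and key with your own
curl -X POST http://localhost:4000/chat/completions \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer sk-1234' \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [{"role": "user", "content": "hello"}]
  }'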
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|---|---|---|---|---|---|---|---|---|---|
| /chat/completions | Passed ✅ | 140.0 | 165.78 | 6.28 | 0.0 | 1878 | 0 | 111.63 | 1672.44 |
| Aggregated | Passed ✅ | 140.0 | 165.78 | 6.28 | 0.0 | 1878 | 0 | 111.63 | 1672.44 |