We're launching LLM API outage tracking in LiteLLM 1.43.5 📈 Start here: https://docs.litellm.ai/docs/proxy/prometheus
🪨 [Fix] Support translating tool call names for AWS Bedrock
✨ [UI] Add support for adding Cohere embedding models on the UI
💵 Added cost tracking for Cohere embedding models
🛠️ [Feat] v2 Prometheus alerting for deployment outage / healthy / partial-outage states (see the config sketch after this list)
🪢 [Feat-Langfuse] log VertexAI Grounding Metadata as Spans
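
If you want to try the new Prometheus metrics yourself, here's a minimal sketch. It assumes you start the proxy with a config file; the success_callback / failure_callback values follow the Prometheus docs linked above, and config.yaml plus its mount path are placeholders you can change:

# Write a minimal proxy config that turns on Prometheus logging
cat > config.yaml <<'EOF'
litellm_settings:
  success_callback: ["prometheus"]
  failure_callback: ["prometheus"]
EOF

# Run the proxy with the config mounted in
docker run \
  -v $(pwd)/config.yaml:/app/config.yaml \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.43.5 \
  --config /app/config.yaml

# Metrics are then exposed in Prometheus format on the /metrics endpoint
curl http://localhost:4000/metrics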
What's Changed
- feat: hash prompt when caching by @prd-tuong-nguyen in #5105
- feat: set max_internal_budget for user w/ sso by @krrishdholakia in #5120
- Litellm sso team member add by @krrishdholakia in #5129
- [Feat] Add pricing for cohere embedding models by @ishaan-jaff in #5137
- [Feat] v2 prometheus deployment outage, healthy, partial outage alerting by @ishaan-jaff in #5134
- ui allow adding cohere models by @ishaan-jaff in #5136
- Feat - Translate openai function names to bedrock converse schema by @ishaan-jaff in #5138
- [Feat-Langfuse] log VertexAI Grounding Metadata as Spans by @ishaan-jaff in #5139
- [Fix] Place bedrock modified tool call name in output by @ishaan-jaff in #5144
New Contributors
- @prd-tuong-nguyen made their first contribution in #5105
Full Changelog: v1.43.4...v1.43.5
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.43.5
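
Once the container is up, you can sanity-check it with a test request. A minimal sketch — the model name and the sk-1234 key below are placeholders; substitute a model and virtual key you've actually configured:

# Hypothetical test request; replace the model and key with your own
curl -X POST http://localhost:4000/chat/completions \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer sk-1234' \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [{"role": "user", "content": "hello"}]
  }'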
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|---|---|---|---|---|---|---|---|---|---|
| /chat/completions | Passed ✅ | 140.0 | 165.78 | 6.28 | 0.0 | 1878 | 0 | 111.63 | 1672.44 |
| Aggregated | Passed ✅ | 140.0 | 165.78 | 6.28 | 0.0 | 1878 | 0 | 111.63 | 1672.44 |