## What's Changed
- LLM Guardrails - Support lakera config thresholds + custom api base by @krrishdholakia in #5076
- Revert "Fix: Add prisma binary_cache_dir specification to pyproject.toml" by @ishaan-jaff in #5085
- add ft:gpt-4o-mini-2024-07-18 to model prices by @ishaan-jaff in #5084
- [Fix-Bug]: Using extra_headers removes the OpenRouter HTTP-Referer/X-Title headers by @ishaan-jaff in #5086
- [Feat] - Prometheus Metrics to monitor a model / deployment health by @ishaan-jaff in #5092
- [Fix] Init Prometheus Service Logger when it's None by @ishaan-jaff in #5088
- fix(anthropic.py): handle anthropic returning empty argument string (invalid json str) for tool call while streaming by @krrishdholakia in #5091
- Clarifai: Removed model name casing issue by @Mogith-P-N in #5095
- feat(utils.py): support passing response_format as pydantic model by @krrishdholakia in #5079 (see the sketch after this list)
- [Feat-Router + Proxy] Add provider wildcard routing by @ishaan-jaff in #5098
- Add deepseek-coder-v2(-lite), mistral-large, codegeex4 to ollama by @sammcj in #5100
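The response_format change in #5079 means a Pydantic class can be passed directly instead of a hand-written JSON schema. A minimal sketch, assuming a model that supports structured outputs; the model name and the `CalendarEvent` schema are illustrative placeholders, not part of the release:

```python
# Sketch of passing response_format as a pydantic model (PR #5079).
# "gpt-4o-mini" and the CalendarEvent schema are illustrative placeholders.
from pydantic import BaseModel
import litellm


class CalendarEvent(BaseModel):
    name: str
    date: str
    participants: list[str]


response = litellm.completion(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Alice and Bob meet on Friday for standup."}],
    response_format=CalendarEvent,  # pydantic model instead of a raw JSON-schema dict
)
print(response.choices[0].message.content)  # JSON string matching the CalendarEvent schema
```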
## New Contributors
- @Mogith-P-N made their first contribution in #5095
- @sammcj made their first contribution in #5100
**Full Changelog**: v1.43.1...v1.43.2
## Docker Run LiteLLM Proxy
```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.43.2
```
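Once the container is up, the proxy serves OpenAI-compatible endpoints on port 4000. Below is a minimal sketch of calling it with the OpenAI Python SDK; the model name and the placeholder API key are assumptions and depend on what you configure on the proxy:

```python
# Sketch: call the proxy started by the docker command above.
# "gpt-4o-mini" and the "sk-1234" key are placeholders for whatever you configure.
import openai

client = openai.OpenAI(
    base_url="http://localhost:4000",  # LiteLLM proxy from the docker run above
    api_key="sk-1234",                 # virtual key / master key configured on the proxy
)

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello from the LiteLLM proxy"}],
)
print(response.choices[0].message.content)
```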
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
## Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|---|---|---|---|---|---|---|---|---|---|
| /chat/completions | Passed ✅ | 100.0 | 144.31 | 6.42 | 0.0 | 1922 | 0 | 83.24 | 25498.95 |
| Aggregated | Passed ✅ | 100.0 | 144.31 | 6.42 | 0.0 | 1922 | 0 | 83.24 | 25498.95 |