What's Changed
- Add medlm models to cost map by @skucherlapati in #4766
- feat(aporio_ai.py): support aporio ai prompt injection for chat completion requests by @krrishdholakia in #4762
- Add enabled_roles to Guardrails configuration, Update Lakera guardrail moderation hook by @vingiarrusso in #4729
- feat(proxy): support hiding health check details by @fgreinacher in #4772
- [Feat] Add OpenAI GPT-4o mini by @ishaan-jaff in #4776
- [Feat] run guardrail moderation check on embedding by @ishaan-jaff in #4764
- [FEAT] - add Google AI Studio: gemini-gemma-2-27b-it, gemini-gemma-2-9b-it by @ishaan-jaff in #4782
- Docs - add `LITELLM_SALT_KEY` to docker compose by @ishaan-jaff in #4779
- [Feat-Enterprise] Use free/paid tiers for Virtual Keys by @ishaan-jaff in #4786
- [Feat] Router - Route based on free/paid tier by @ishaan-jaff in #4785
- feat(vertex_ai_anthropic.py): support response_schema for vertex ai anthropic calls by @krrishdholakia in #4784
- [Fix] Admin UI - make ui session last 12 hours by @ishaan-jaff in #4787
- [Feat-Router] - Tag based routing by @ishaan-jaff in #4789
- Alias `/health/liveliness` as `/health/liveness` by @msabramo in #4781
- Removed weird replicate model from model prices list by @areibman in #4783
- Add missing `num_gpu` ollama configuration parameter by @titusz in #4773
- docs(docusaurus.config.js): fix docusaurus base url by @krrishdholakia in #4287
- fix ui - make default session 24 hours by @ishaan-jaff in #4791
- UI redirect logout after session, only show 1 error message by @ishaan-jaff in #4792
- docs - show curl examples of how to control cache on / off per request by @ishaan-jaff in #4793
- fix - add fix to update spend logs discrepancy for team spend by @ishaan-jaff in #4794
- fix health check - make sure one failing deployment does not stop the health check by @ishaan-jaff in #4798
- ui - rename api_key -> virtual key by @ishaan-jaff in #4797
- feat(bedrock_httpx.py): add ai21 jamba instruct as bedrock model by @krrishdholakia in #4788
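The tag-based routing change (#4789) is driven by the proxy config. Below is a minimal sketch of what such a config could look like, assuming the `tags` parameter and `enable_tag_filtering` setting described in the LiteLLM proxy docs; treat the exact key names and model choices here as assumptions and check the docs for your version:

```yaml
model_list:
  - model_name: gpt-4
    litellm_params:
      model: openai/gpt-4o-mini
      api_key: os.environ/OPENAI_API_KEY
      tags: ["free"]  # requests tagged "free" may route to this deployment
  - model_name: gpt-4
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY
      tags: ["paid"]  # requests tagged "paid" may route to this deployment

router_settings:
  enable_tag_filtering: True  # enable tag-based routing
```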
New Contributors
- @vingiarrusso made their first contribution in #4729
- @fgreinacher made their first contribution in #4772
- @areibman made their first contribution in #4783
- @titusz made their first contribution in #4773
Full Changelog: v1.41.24...v1.41.25
Docker Run LiteLLM Proxy
```shell
docker run \
    -e STORE_MODEL_IN_DB=True \
    -p 4000:4000 \
    ghcr.io/berriai/litellm:main-v1.41.25
```
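The same container can also be run under docker compose, which is where the `LITELLM_SALT_KEY` variable from #4779 comes in. A minimal sketch combining the flags above with that variable (the key value is a placeholder; set your own and keep it stable, since the proxy uses it when encrypting credentials stored in the DB):

```yaml
services:
  litellm:
    image: ghcr.io/berriai/litellm:main-v1.41.25
    ports:
      - "4000:4000"
    environment:
      STORE_MODEL_IN_DB: "True"
      LITELLM_SALT_KEY: "sk-XXXXXXXX"  # placeholder value
```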
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|---|---|---|---|---|---|---|---|---|---|
| /chat/completions | Passed ✅ | 110.0 | 133.07 | 6.41 | 0.0 | 1918 | 0 | 96.79 | 3914.96 |
| Aggregated | Passed ✅ | 110.0 | 133.07 | 6.41 | 0.0 | 1918 | 0 | 96.79 | 3914.96 |