What's Changed
- build: bump litellm-proxy-extras version by @krrishdholakia in #9771
- Update model_prices by @aoaim in #9768
- Move daily user transaction logging outside of 'disable_spend_logs' flag - different tables by @krrishdholakia in #9772
- Add inference providers support for Hugging Face (#8258) (#9738) by @krrishdholakia in #9773
- [UI Bug fix] Don't show duplicate models on Team Admin models page by @ishaan-jaff in #9775
- [UI QA/Bug Fix] - Don't change team, key, org, model values on scroll by @ishaan-jaff in #9776
- [UI Polish] - Polish login screen by @ishaan-jaff in #9778
- Litellm 04 05 2025 release notes by @krrishdholakia in #9785
- feat: add offline swagger docs by @devdev999 in #7653
- fix(gemini/transformation.py): handle file_data being passed in by @krrishdholakia in #9786
- Realtime API Cost tracking by @krrishdholakia in #9795
- fix(vertex_ai.py): move to only passing in accepted keys by vertex ai response schema by @krrishdholakia in #8992
- fix(databricks/chat/transformation.py): remove reasoning_effort from … by @krrishdholakia in #9811
- Handle pydantic base model in message tool calls + Handle tools = [] + handle fireworks ai w/ 'strict' param in function call + support fake streaming on tool calls for meta.llama3-3-70b-instruct-v1:0 by @krrishdholakia in #9774
- Allow passing `thinking` param to litellm proxy via client sdk + Code QA Refactor on get_optional_params (get correct values) by @krrishdholakia in #9386
- [Feat] LiteLLM Tag/Policy Management by @ishaan-jaff in #9813
- Remove redundant `apk update` in Dockerfiles (cc #5016) by @PeterDaveHello in #9055
- [Security fix - CVE-2025-0330] - Leakage of Langfuse API keys in team exception handling by @ishaan-jaff in #9830
- [Security Fix CVE-2024-6825] Fix remote code execution in post call rules by @ishaan-jaff in #9826
- Bump next from 14.2.25 to 14.2.26 in /ui/litellm-dashboard by @dependabot in #9716
- fix: claude haiku cache read pricing per token by @hewliyang in #9834
- Add service annotations to litellm-helm chart by @mlhynfield in #9840
- Reflect key and team update in UI by @crisshaker in #9825
- Add user alias to API endpoint by @Jacobh2 in #9859
- Update Azure Phi-4 pricing by @emerzon in #9862
- feat: add enterpriseWebSearch tool for vertex-ai by @qvalentin in #9856
- VertexAI non-jsonl file storage support by @krrishdholakia in #9781
- [Bug Fix] Add support for UploadFile on LLM Pass through endpoints (OpenAI, Azure etc) by @ishaan-jaff in #9853
- [Feat SSO] Debug route - allow admins to debug SSO JWT fields by @ishaan-jaff in #9835
- [Feat] - SSO - Use MSFT Graph API to assign users to teams by @ishaan-jaff in #9865
- Cost tracking for `gemini-2.5-pro` by @krrishdholakia in #9837
- [SSO] Connect LiteLLM to Azure Entra ID Enterprise Application by @ishaan-jaff in #9872
New Contributors
- @aoaim made their first contribution in #9768
- @hewliyang made their first contribution in #9834
- @mlhynfield made their first contribution in #9840
- @crisshaker made their first contribution in #9825
- @qvalentin made their first contribution in #9856
Full Changelog: v1.65.4-nightly...v1.65.5-nightly
Docker Run LiteLLM Proxy
```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.65.5-nightly
```
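Once the container is up, the proxy exposes an OpenAI-compatible `/chat/completions` endpoint on the published port. A minimal sketch of building such a request is below; the model alias (`gpt-4o`) is an assumption — substitute whatever models you have configured on your proxy.

```python
import json

# Base URL of the proxy started by the docker run command above
# (port 4000 is the port published in that command).
PROXY_BASE_URL = "http://localhost:4000"

def build_chat_request(model, messages):
    """Build the URL and JSON body for a /chat/completions call.

    `model` is whatever alias you configured on the proxy; the value
    used in the example call below is hypothetical.
    """
    url = f"{PROXY_BASE_URL}/chat/completions"
    body = json.dumps({"model": model, "messages": messages})
    return url, body

url, body = build_chat_request(
    "gpt-4o",  # hypothetical model alias
    [{"role": "user", "content": "Hello!"}],
)
```

The resulting URL and body can be sent with `curl` or any OpenAI-compatible SDK pointed at the proxy's base URL.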
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|---|---|---|---|---|---|---|---|---|---|
| /chat/completions | Passed ✅ | 240.0 | 281.015845046053 | 6.098441913282575 | 0.0 | 1824 | 0 | 213.96507000002885 | 5930.206827000006 |
| Aggregated | Passed ✅ | 240.0 | 281.015845046053 | 6.098441913282575 | 0.0 | 1824 | 0 | 213.96507000002885 | 5930.206827000006 |