## What's Changed
- Logs page screen size fixed by @NANDINI-star in #14135
- Create Organization Tooltip added on Success by @NANDINI-star in #14132
- [BUG] XAI Cost Calculation Fix by @kankute-sameer in #14127
- Back to Keys should say Back to Logs by @NANDINI-star in #14134
- helm(chart): add optional PodDisruptionBudget for litellm proxy (#14062) by @iabhi4 in #14093
- update together models by @zainhas in #14087
- [Perf] LiteLLM Proxy: +400 RPS when using correct amount of CPU cores by @ishaan-jaff in #14153
- [Bug fix] Misclassified 500 error on invalid image_url in /chat/completions request by @ishaan-jaff in #14149
- openrouter: added gpt 4.1 model family by @mubashir1osmani in #14101
- [Feat] Allow using `x-litellm-stream-timeout` header for stream timeout in requests by @ishaan-jaff in #14147
- [Bug]: Gemini 2.5 Pro – schema validation fails with OpenAI-style type arrays in tools by @ishaan-jaff in #14154
- [Bug Fix] Gemini Tool Calling - fix gemini empty enum property by @ishaan-jaff in #14155
- GPT-5: Drop unsupported params by @LifeDJIK in #14146
- Prometheus missing metrics by @mubashir1osmani in #14139
- Add supported text field to anthropic citation response by @TomeHirata in #14126
- Fix token count error for gemini cli by @retanoj in #14133
- Braintrust - fix logging when OTEL is enabled + Gemini - add 'thoughtSignature' support via 'thinking_blocks' by @krrishdholakia in #14122
- VLLM - handle output parsing responses api output + Ollama - add unified 'thinking' param support (via `reasoning_content`) by @krrishdholakia in #14121
- [Feat] Add support for safety_identifier parameter in chat.completions.create by @kankute-sameer in #14174
- [Feature]: Support GPT-OSS models on vertex ai by @ishaan-jaff in #14184
- [Feature]: Add header support for spend_logs_metadata by @ishaan-jaff in #14186
- OTEL: Optional Metrics and Logs following semantic conventions by @keith-decker in #14179
- fix: Log page parameter passing error by @zhxlp in #14193
- Remove "/" or ":" from model name when being used as h11 header name by @kayoch1n in #14191
- Remove table filter on user info page by @NANDINI-star in #14169
- fix(oci): Handle assistant messages with both content and tool_calls in OCI provider (#14158) by @kutsushitaneko in #14171
- feat: added alert type to alert message to slack for easier handling by @mjmendo in #14176
- Fix/remove deprecated cerebras gpt oss 20b by @HarshavardhanK in #14213
- [Feat] Support reasoning_effort in Groq by @eycjur in #14207
- Feat - add better SCIM debugging by @ishaan-jaff in #14221
- [Fix] SCIM - Bug fixes for handling SCIM Group Memberships by @ishaan-jaff in #14226
## New Contributors
- @iabhi4 made their first contribution in #14093
- @zainhas made their first contribution in #14087
- @LifeDJIK made their first contribution in #14146
- @retanoj made their first contribution in #14133
- @zhxlp made their first contribution in #14193
- @kayoch1n made their first contribution in #14191
- @kutsushitaneko made their first contribution in #14171
- @mjmendo made their first contribution in #14176
- @HarshavardhanK made their first contribution in #14213
- @eycjur made their first contribution in #14207
Full Changelog: v1.76.1.rc.1...v1.76.2-nightly
## Docker Run LiteLLM Proxy

```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.76.2-nightly
```
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
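One change above (#14147) lets callers set a per-request stream timeout via the `x-litellm-stream-timeout` header. A minimal sketch of attaching that header to a `/chat/completions` request against a locally running proxy — the proxy URL, API key, model name, and the header value being in seconds are all assumptions here, not confirmed by these notes:

```python
# Hedged sketch: pass the new x-litellm-stream-timeout header (#14147)
# on a /chat/completions request. The URL, key, model, and the assumption
# that the value is in seconds are illustrative -- check the LiteLLM docs.
import json
import urllib.request


def build_chat_request(model: str, prompt: str, stream_timeout: int) -> urllib.request.Request:
    """Build a streaming /chat/completions request carrying the timeout header."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": True,
    }).encode()
    return urllib.request.Request(
        "http://localhost:4000/chat/completions",  # assumed proxy address
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer sk-1234",  # placeholder key
            "x-litellm-stream-timeout": str(stream_timeout),
        },
        method="POST",
    )


req = build_chat_request("gpt-4o", "hello", 30)
# urllib normalizes header names via str.capitalize()
print(req.get_header("X-litellm-stream-timeout"))
```

Sending the request (e.g. with `urllib.request.urlopen(req)`) is left out so the sketch stays self-contained; any OpenAI-compatible client that lets you set extra headers would work the same way.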
## Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|---|---|---|---|---|---|---|---|---|---|
| /chat/completions | Failed ❌ | 180.0 | 187.40 | 6.32 | 6.32 | 1892 | 1892 | 147.84 | 738.21 |
| Aggregated | Failed ❌ | 180.0 | 187.40 | 6.32 | 6.32 | 1892 | 1892 | 147.84 | 738.21 |