What's Changed
- [Bug Fix] - Allow using `reasoning_effort` for gpt-5 model family and `reasoning` for Responses API by @ishaan-jaff in #13475 (see the sketch after this list)
- [Bug Fix]: Azure OpenAI GPT-5 max_tokens + `reasoning` param support by @ishaan-jaff in #13510
- [Draft] [LLM Translation] Add model id check by @jugaldb in #13507
- [Docs] - Document how to send tags with LiteLLM Python SDK calls to LiteLLM Proxy by @ishaan-jaff in #13517 (example after this list)
- Fix OCI streaming by @breno-aumo in #13437
- feat: add CometAPI provider support with chat completions and streaming by @TensorNull in #13458
- Allow unsetting TPM and RPM - Teams Settings by @NANDINI-star in #13430
- [Feat] - Add key/team logging for Langfuse OTEL Logger by @ishaan-jaff in #13512
- [Feat] Add Streaming support + Docs for bedrock gpt-oss model family by @ishaan-jaff in #13346
- [Feat] GEMINI CLI Integration - Add /countTokens endpoint support by @ishaan-jaff in #13545
- Feat/sambanova embeddings by @jhpiedrahitao in #13308
- Display Error from Backend on the UI - Keys Page by @NANDINI-star in #13435
- Team Member Permissions Page - Access Column Changes by @NANDINI-star in #13145
- Fix internal users table overflow by @NANDINI-star in #12736
- Enhance chart readability with short-form notation for large numbers by @NANDINI-star in #12370
- [Bug fix] SCIM Team Memberships - handle metadata by @ishaan-jaff in #13553
- [Feat] GEMINI CLI - Add Token Counter for VertexAI Models by @ishaan-jaff in #13558
- [Feat] Add CredentialDeleteModal component and integrate with CredentialsPanel by @jugaldb in #13550
- Implement GitHub Action to auto-label issues with provider keywords by @kankute-sameer in #13537
- LiteLLM SDK <-> Proxy: support `user` param + Prisma - remove `use_prisma_migrate` flag, redundant as this is now default by @krrishdholakia in #13555
- [Fix] Streaming - consistent 'finish_reason' chunk index by @krrishdholakia in #13560
- [Fix] Hide sensitive data in /model/info - azure entra client_secret by @MajorD00m in #13577
- Fix Ollama GPT-OSS streaming with 'thinking' field by @colesmcintosh in #13375
- fix(azure): remove trailing semicolon in Content-Type header for image generation by @VerunicaM in #13584
- Remove ambiguous network response error by @NANDINI-star in #13582
- [fix] Enhance MCPServerManager with access groups and description support by @jugaldb in #13549
- [Feat] New model `vertex_ai/deepseek-ai/deepseek-r1-0528-maas` by @ishaan-jaff in #13594
- [Docs] Update build from pip docs - new prisma migrate by @ishaan-jaff in #13603
- [Feat] New provider - Azure AI Flux Image Generation by @ishaan-jaff in #13592
- [Feat] Team Member Rate Limits + Support for using with JWT Auth by @ishaan-jaff in #13601
- Fix e2e_ui_testing by @NANDINI-star in #13610
- fix(volcengine): handle thinking disabled parameter properly by @colesmcintosh in #13598
- [Feat] Add `reasoning_effort` param for hosted_vllm provider by @ishaan-jaff in #13620
- perf(main.py): new 'EXPERIMENTAL_OPENAI_BASE_LLM_HTTP_HANDLER' flag (+100 rps improvement on openai calls) by @krrishdholakia in #13625
- Add deepseek-chat-v3-0324 to OpenRouter cost map by @huangyafei in #13607
- [LLM Translation/Proxy] Fix - add safe divide by 0 for most places to prevent crash by @jugaldb in #13624
- [LLM translation] Refactor Anthropic Configurations and Add Support for `anthropic_beta` Headers by @jugaldb in #13590
- [Management/UI] Allow routes for admin viewer by @jugaldb in #13588
- [Proxy] Litellm fix mapped tests by @jugaldb in #13634
- Update mlflow logger usage span attributes by @TomeHirata in #13561
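For the `reasoning_effort` changes above (#13475, #13620), a minimal sketch of passing the param through the LiteLLM Python SDK; the model name, prompt, and credentials are assumptions for illustration, not values from the PRs:

```python
# Minimal sketch: forwarding reasoning_effort via litellm.completion.
# Assumes provider credentials (e.g. OPENAI_API_KEY) are set in the env;
# the model name and prompt are illustrative only.
import litellm

response = litellm.completion(
    model="gpt-5",  # gpt-5 model family, per #13475
    messages=[{"role": "user", "content": "Summarize this release in one line."}],
    reasoning_effort="low",  # passed through to the underlying provider
)
print(response.choices[0].message.content)
```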
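And for the tags documentation change (#13517), a hedged sketch of sending request tags from the LiteLLM Python SDK to a LiteLLM Proxy; the proxy URL, API key, and tag values are placeholders, and #13517 is the authoritative reference for the exact pattern:

```python
# Hedged sketch: attaching tags to a call routed through LiteLLM Proxy.
# The litellm_proxy/ prefix routes the call to the proxy; api_base, api_key,
# and the tag strings below are placeholders, not real deployment values.
import litellm

response = litellm.completion(
    model="litellm_proxy/gpt-4o",
    api_base="http://localhost:4000",
    api_key="sk-1234",
    messages=[{"role": "user", "content": "hello"}],
    metadata={"tags": ["env:dev", "team:platform"]},  # tags attached to the request
)
```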
New Contributors
- @TensorNull made their first contribution in #13458
- @MajorD00m made their first contribution in #13577
- @VerunicaM made their first contribution in #13584
- @huangyafei made their first contribution in #13607
- @TomeHirata made their first contribution in #13561
Full Changelog: v1.75.5.rc.1...v1.75.6-nightly
Docker Run LiteLLM Proxy
```
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.75.6-nightly
```
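Once the container is up, a quick smoke test with the OpenAI Python client pointed at the proxy; the model name and API key are placeholders and assume a matching model is configured on the proxy:

```python
# Smoke test against the proxy started above (OpenAI-compatible API on :4000).
# "gpt-4o" and the api_key are placeholders; use a model configured on your proxy.
import openai

client = openai.OpenAI(base_url="http://localhost:4000", api_key="sk-1234")
resp = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "ping"}],
)
print(resp.choices[0].message.content)
```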
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|---|---|---|---|---|---|---|---|---|---|
| /chat/completions | Passed ✅ | 110.0 | 152.45 | 6.51 | 0.0 | 1948 | 0 | 86.14 | 2202.58 |
| Aggregated | Passed ✅ | 110.0 | 152.45 | 6.51 | 0.0 | 1948 | 0 | 86.14 | 2202.58 |