What's Changed
- [Bug Fix] - Allow using `reasoning_effort` for gpt-5 model family and `reasoning` for Responses API by @ishaan-jaff in #13475 (see the sketch after this list)
- [Bug Fix]: Azure OpenAI GPT-5 max_tokens + `reasoning` param support by @ishaan-jaff in #13510
- [Draft] [LLM Translation] Add model id check by @jugaldb in #13507
- [Docs] - Document how to send tags with LiteLLM Python SDK calls to LiteLLM Proxy by @ishaan-jaff in #13517 (example after this list)
- Fix OCI streaming by @breno-aumo in #13437
- feat: add CometAPI provider support with chat completions and streaming by @TensorNull in #13458
- Allow unsetting TPM and RPM - Teams Settings by @NANDINI-star in #13430
- [Feat] - Add key/team logging for Langfuse OTEL Logger by @ishaan-jaff in #13512
- [Feat] Add Streaming support + Docs for bedrock gpt-oss model family by @ishaan-jaff in #13346
- [Feat] GEMINI CLI Integration - Add /countTokens endpoint support by @ishaan-jaff in #13545
- Feat/sambanova embeddings by @jhpiedrahitao in #13308
- Display Error from Backend on the UI - Keys Page by @NANDINI-star in #13435
- Team Member Permissions Page - Access Column Changes by @NANDINI-star in #13145
- Fix internal users table overflow by @NANDINI-star in #12736
- Enhance chart readability with short-form notation for large numbers by @NANDINI-star in #12370
- [Bug fix] SCIM Team Memberships - handle metadata by @ishaan-jaff in #13553
- [Feat] GEMINI CLI - Add Token Counter for VertexAI Models by @ishaan-jaff in #13558
- [Feat] Add CredentialDeleteModal component and integrate with CredentialsPanel by @jugaldb in #13550
- Implement GitHub Action to auto-label issues with provider keywords by @kankute-sameer in #13537
- LiteLLM SDK <-> Proxy: support `user` param + Prisma - remove `use_prisma_migrate` flag, redundant as this is now default by @krrishdholakia in #13555
- [Fix] Streaming - consistent 'finish_reason' chunk index by @krrishdholakia in #13560
- [Fix] Hide sensitive data in /model/info - azure entra client_secret by @MajorD00m in #13577
- Fix Ollama GPT-OSS streaming with 'thinking' field by @colesmcintosh in #13375
- fix(azure): remove trailing semicolon in Content-Type header for image generation by @VerunicaM in #13584
- Remove ambiguous network response error by @NANDINI-star in #13582
- [fix] Enhance MCPServerManager with access groups and description support by @jugaldb in #13549
- [Feat] New model `vertex_ai/deepseek-ai/deepseek-r1-0528-maas` by @ishaan-jaff in #13594
- [Docs] Update build from pip docs - new prisma migrate by @ishaan-jaff in #13603
- [Feat] New provider - Azure AI Flux Image Generation by @ishaan-jaff in #13592
- [Feat] Team Member Rate Limits + Support for using with JWT Auth by @ishaan-jaff in #13601
- Fix e2e_ui_testing by @NANDINI-star in #13610
- fix(volcengine): handle thinking disabled parameter properly by @colesmcintosh in #13598
- [Feat] Add `reasoning_effort` param for hosted_vllm provider by @ishaan-jaff in #13620
- perf(main.py): new 'EXPERIMENTAL_OPENAI_BASE_LLM_HTTP_HANDLER' flag (+100 rps improvement on openai calls) by @krrishdholakia in #13625
- Add deepseek-chat-v3-0324 to OpenRouter cost map by @huangyafei in #13607
- [LLM Translation/Proxy] Fix - add safe divide by 0 for most places to prevent crash by @jugaldb in #13624
- [LLM translation] Refactor Anthropic Configurations and Add Support for `anthropic_beta` Headers by @jugaldb in #13590
- [Management/UI] Allow routes for admin viewer by @jugaldb in #13588
- [Proxy] Litellm fix mapped tests by @jugaldb in #13634
- Update mlflow logger usage span attributes by @TomeHirata in #13561
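For the `reasoning_effort` changes above (#13475, #13620), a minimal sketch of passing the param through the LiteLLM Python SDK; the model name, prompt, and credentials are assumptions for illustration, not values from the PRs:

```python
# Minimal sketch: forwarding reasoning_effort via litellm.completion.
# Assumes provider credentials (e.g. OPENAI_API_KEY) are set in the env;
# the model name and prompt are illustrative only.
import litellm

response = litellm.completion(
    model="gpt-5",  # gpt-5 model family, per #13475
    messages=[{"role": "user", "content": "Summarize this release in one line."}],
    reasoning_effort="low",  # passed through to the underlying provider
)
print(response.choices[0].message.content)
```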
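And for the tags documentation change (#13517), a hedged sketch of sending request tags from the LiteLLM Python SDK to a LiteLLM Proxy; the proxy URL, API key, and tag values are placeholders, and #13517 is the authoritative reference for the exact pattern:

```python
# Hedged sketch: attaching tags to a call routed through LiteLLM Proxy.
# The litellm_proxy/ prefix routes the call to the proxy; api_base, api_key,
# and the tag strings below are placeholders, not real deployment values.
import litellm

response = litellm.completion(
    model="litellm_proxy/gpt-4o",
    api_base="http://localhost:4000",
    api_key="sk-1234",
    messages=[{"role": "user", "content": "hello"}],
    metadata={"tags": ["env:dev", "team:platform"]},  # tags attached to the request
)
```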
New Contributors
- @TensorNull made their first contribution in #13458
- @MajorD00m made their first contribution in #13577
- @VerunicaM made their first contribution in #13584
- @huangyafei made their first contribution in #13607
- @TomeHirata made their first contribution in #13561
Full Changelog: v1.75.5.rc.1...v1.75.6-nightly
Docker Run LiteLLM Proxy
```
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.75.6-nightly
```
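Once the container is up, a quick smoke test with the OpenAI Python client pointed at the proxy; the model name and API key are placeholders and assume a matching model is configured on the proxy:

```python
# Smoke test against the proxy started above (OpenAI-compatible API on :4000).
# "gpt-4o" and the api_key are placeholders; use a model configured on your proxy.
import openai

client = openai.OpenAI(base_url="http://localhost:4000", api_key="sk-1234")
resp = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "ping"}],
)
print(resp.choices[0].message.content)
```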
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|---|---|---|---|---|---|---|---|---|---|
| /chat/completions | Passed ✅ | 110.0 | 152.45 | 6.51 | 0.0 | 1948 | 0 | 86.14 | 2202.58 |
| Aggregated | Passed ✅ | 110.0 | 152.45 | 6.51 | 0.0 | 1948 | 0 | 86.14 | 2202.58 |