What's Changed
- Litellm release notes 07 12 2025 by @krrishdholakia in #12563
- Add Bytez to the list of providers in the docs by @inf3rnus in #12588
- [Feat] New LLM API Integration - Add Moonshot API (Kimi) (#12551) by @ishaan-jaff in #12592
- [Feat] Add ai21/jamba-1.7 model family pricing by @ishaan-jaff in #12593
- fix: add implicit caching cost calculation for Gemini 2.x models by @colesmcintosh in #12585
- Updated release notes by @krrishdholakia in #12594
- [Feat] Vector Stores - Add Vertex RAG Engine API as a provider by @ishaan-jaff in #12595
- Wildcard model filter by @NANDINI-star in #12597
- [Bug fix] [Bug]: Verbose log is enabled by default by @ishaan-jaff in #12596
- Control Plane + Data Plane support by @krrishdholakia in #12601
- Claude 4 Bedrock /invoke route support + Bedrock application inference profile tool choice support by @krrishdholakia in #12599
- refactor(prisma_migration.py): refactor to support use_prisma_migrate - for helm hook by @krrishdholakia in #12600
- feat: Add envVars and extraEnvVars support to Helm migrations job by @AntonioKL in #12591
- feat(gemini): Add custom TTL support for context caching (#9810) by @marcelodiaz558 in #12541
- fix(anthropic): fix streaming + response_format + tools bug by @dmcaulay in #12463
- [Bug Fix] Include /mcp in list of available routes on proxy by @ishaan-jaff in #12612
- Add Copy-on-Click for IDs by @NANDINI-star in #12615
- add azure blob cache support by @demoray in #12587
- refactor(mcp): Make MCP_TOOL_PREFIX_SEPARATOR configurable from env by @juancarlosm in #12603
- [Bug Fix] Add swagger docs for LiteLLM /chat/completions, /embeddings, /responses by @ishaan-jaff in #12618
- [Docs] troubleshooting SSO configs by @ishaan-jaff in #12621
- [Feat] MCP Gateway - allow using MCPs with all LLM APIs when using /responses with LiteLLM by @ishaan-jaff in #12546
- rm claude instant 1 and 1.2 from model_prices_and_context_window.json by @staeiou in #12631
- Add "keys import" command to CLI by @msabramo in #12620
- Add token pricing for Together.ai Llama-4 and DeepSeek models by @stefanc-ai2 in #12622
- Add input_cost_per_pixel to values in ModelGroupInfo model by @Mte90 in #12604
- fix: role chaining with webauthentication for aws bedrock by @RichardoC in #12607
- (#11794) use upsert for managed object table rather than create to avoid UniqueViolationError by @yeahyung in #11795
- [Bug Fix] [Bug]: Knowledge Base Call returning error by @ishaan-jaff in #12628
- fix(router.py): use more descriptive error message + UI - enable team admins to update member role by @krrishdholakia in #12629
- fix(proxy_server.py): fixes for handling team only models on UI by @krrishdholakia in #12632
- OpenAI deepresearch models via `.completion` support by @krrishdholakia in #12627
- fix: Handle circular references in spend tracking metadata JSON serialization by @colesmcintosh in #12643
- Fix bedrock nova micro and lite info by @mnguyen96 in #12619
- [New Model] add together_ai/moonshotai/Kimi-K2-Instruct by @ishaan-jaff in #12645
- Add groq/moonshotai-kimi-k2-instruct model configuration by @colesmcintosh in #12648
- [Bug Fix] grok-4 does not support the `stop` param by @ishaan-jaff in #12646
- Add GitHub Copilot LiteLLM tutorial by @colesmcintosh in #12649
- Fix unused imports in completion_extras transformation by @colesmcintosh in #12655
- [MCP Gateway] Allow MCP access groups to be added via the config LIT-312 by @jugaldb in #12654
- [MCP Gateway] List tools from access list for keys by @jugaldb in #12657
- [MCP Gateway] Allow MCP sse and http to have namespaced url for better segregation LIT-304 by @jugaldb in #12658
- [Feat] Allow reading custom logger python scripts from s3 by @ishaan-jaff in #12623
- [Feat] UI - Add `end_user` filter on UI by @ishaan-jaff in #12663
- [Bug Fix] StandardLoggingPayload on cache_hits should track custom llm provider + DD LLM Obs span type by @ishaan-jaff in #12652
- [Bug Fix] SCIM - add GET /ServiceProviderConfig by @ishaan-jaff in #12664
- feat: add input_fidelity parameter for OpenAI image generation by @colesmcintosh in #12662
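For readers trying out the new Moonshot (Kimi) provider from #12592 above, here is a minimal, untested sketch of what a call could look like. The `moonshot/` prefix, the model id, and the `MOONSHOT_API_KEY` environment variable follow LiteLLM's usual `<provider>/<model>` conventions and are assumptions, not confirmed by these notes:

```python
import os
import litellm

# Placeholder credentials; the env var name is an assumption based on
# LiteLLM's usual <PROVIDER>_API_KEY pattern.
os.environ["MOONSHOT_API_KEY"] = "sk-..."

response = litellm.completion(
    model="moonshot/kimi-k2-0711-preview",  # assumed <provider>/<model> id; check the provider docs
    messages=[{"role": "user", "content": "Say hello from Kimi."}],
)
print(response.choices[0].message.content)
```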
New Contributors
- @AntonioKL made their first contribution in #12591
- @marcelodiaz558 made their first contribution in #12541
- @dmcaulay made their first contribution in #12463
- @demoray made their first contribution in #12587
- @staeiou made their first contribution in #12631
- @stefanc-ai2 made their first contribution in #12622
- @RichardoC made their first contribution in #12607
- @yeahyung made their first contribution in #11795
- @mnguyen96 made their first contribution in #12619
Full Changelog: v1.74.3.rc.1...v1.74.4-nightly
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.74.4-nightly
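Once the container is running, the proxy serves an OpenAI-compatible API on port 4000. A minimal sketch of calling it with the official OpenAI Python client, assuming a model has already been configured on the proxy (model name and key below are placeholders):

```python
from openai import OpenAI

# Standard OpenAI client pointed at the local LiteLLM proxy started above.
client = OpenAI(
    base_url="http://localhost:4000",  # LiteLLM proxy from the docker run command
    api_key="sk-1234",                 # placeholder; use your proxy key
)

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder; any model name configured on the proxy
    messages=[{"role": "user", "content": "ping"}],
)
print(response.choices[0].message.content)
```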
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|---|---|---|---|---|---|---|---|---|---|
| /chat/completions | Passed ✅ | 240.0 | 258.63325774080073 | 6.1786141049802525 | 0.0 | 1848 | 0 | 211.92541800002118 | 1368.992559999981 |
| Aggregated | Passed ✅ | 240.0 | 258.63325774080073 | 6.1786141049802525 | 0.0 | 1848 | 0 | 211.92541800002118 | 1368.992559999981 |