What's Changed
- Litellm release notes 07 12 2025 by @krrishdholakia in #12563
- Add Bytez to the list of providers in the docs by @inf3rnus in #12588
- [Feat] New LLM API Integration - Add Moonshot API (Kimi) (#12551) by @ishaan-jaff in #12592
- [Feat] Add ai21/jamba-1.7 model family pricing by @ishaan-jaff in #12593
- fix: add implicit caching cost calculation for Gemini 2.x models by @colesmcintosh in #12585
- Updated release notes by @krrishdholakia in #12594
- [Feat] Vector Stores - Add Vertex RAG Engine API as a provider by @ishaan-jaff in #12595
- Wildcard model filter by @NANDINI-star in #12597
- [Bug fix] [Bug]: Verbose log is enabled by default by @ishaan-jaff in #12596
- Control Plane + Data Plane support by @krrishdholakia in #12601
- Claude 4 Bedrock /invoke route support + Bedrock application inference profile tool choice support by @krrishdholakia in #12599
- refactor(prisma_migration.py): refactor to support use_prisma_migrate - for helm hook by @krrishdholakia in #12600
- feat: Add envVars and extraEnvVars support to Helm migrations job by @AntonioKL in #12591
- feat(gemini): Add custom TTL support for context caching (#9810) by @marcelodiaz558 in #12541
- fix(anthropic): fix streaming + response_format + tools bug by @dmcaulay in #12463
- [Bug Fix] Include /mcp in list of available routes on proxy by @ishaan-jaff in #12612
- Add Copy-on-Click for IDs by @NANDINI-star in #12615
- add azure blob cache support by @demoray in #12587
- refactor(mcp): Make MCP_TOOL_PREFIX_SEPARATOR configurable from env by @juancarlosm in #12603
- [Bug Fix] Add swagger docs for LiteLLM /chat/completions, /embeddings, /responses by @ishaan-jaff in #12618
- [Docs] troubleshooting SSO configs by @ishaan-jaff in #12621
- [Feat] MCP Gateway - allow using MCPs with all LLM APIs when using /responses with LiteLLM by @ishaan-jaff in #12546
- rm claude instant 1 and 1.2 from model_prices_and_context_window.json by @staeiou in #12631
- Add "keys import" command to CLI by @msabramo in #12620
- Add token pricing for Together.ai Llama-4 and DeepSeek models by @stefanc-ai2 in #12622
- Add input_cost_per_pixel to values in ModelGroupInfo model by @Mte90 in #12604
- fix: role chaining with webauthentication for aws bedrock by @RichardoC in #12607
- (#11794) use upsert for managed object table rather than create to avoid UniqueViolationError by @yeahyung in #11795
- [Bug Fix] [Bug]: Knowledge Base Call returning error by @ishaan-jaff in #12628
- fix(router.py): use more descriptive error message + UI - enable team admins to update member role by @krrishdholakia in #12629
- fix(proxy_server.py): fixes for handling team only models on UI by @krrishdholakia in #12632
- OpenAI deepresearch models via `.completion` support by @krrishdholakia in #12627
- fix: Handle circular references in spend tracking metadata JSON serialization by @colesmcintosh in #12643
- Fix bedrock nova micro and lite info by @mnguyen96 in #12619
- [New Model] add together_ai/moonshotai/Kimi-K2-Instruct by @ishaan-jaff in #12645
- Add groq/moonshotai-kimi-k2-instruct model configuration by @colesmcintosh in #12648
- [Bug Fix] grok-4 does not support the `stop` param by @ishaan-jaff in #12646
- Add GitHub Copilot LiteLLM tutorial by @colesmcintosh in #12649
- Fix unused imports in completion_extras transformation by @colesmcintosh in #12655
- [MCP Gateway] Allow MCP access groups to be added via the config LIT-312 by @jugaldb in #12654
- [MCP Gateway] List tools from access list for keys by @jugaldb in #12657
- [MCP Gateway] Allow MCP sse and http to have namespaced url for better segregation LIT-304 by @jugaldb in #12658
- [Feat] Allow reading custom logger python scripts from s3 by @ishaan-jaff in #12623
- [Feat] UI - Add `end_user` filter on UI by @ishaan-jaff in #12663
- [Bug Fix] StandardLoggingPayload on cache_hits should track custom llm provider + DD LLM Obs span type by @ishaan-jaff in #12652
- [Bug Fix] SCIM - add GET /ServiceProviderConfig by @ishaan-jaff in #12664
- feat: add input_fidelity parameter for OpenAI image generation by @colesmcintosh in #12662
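For readers trying out the new Moonshot (Kimi) provider from #12592 above, here is a minimal, untested sketch of what a call could look like. The `moonshot/` prefix, the model id, and the `MOONSHOT_API_KEY` environment variable follow LiteLLM's usual `<provider>/<model>` conventions and are assumptions, not confirmed by these notes:

```python
import os
import litellm

# Placeholder credentials; the env var name is an assumption based on
# LiteLLM's usual <PROVIDER>_API_KEY pattern.
os.environ["MOONSHOT_API_KEY"] = "sk-..."

response = litellm.completion(
    model="moonshot/kimi-k2-0711-preview",  # assumed <provider>/<model> id; check the provider docs
    messages=[{"role": "user", "content": "Say hello from Kimi."}],
)
print(response.choices[0].message.content)
```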
New Contributors
- @AntonioKL made their first contribution in #12591
- @marcelodiaz558 made their first contribution in #12541
- @dmcaulay made their first contribution in #12463
- @demoray made their first contribution in #12587
- @staeiou made their first contribution in #12631
- @stefanc-ai2 made their first contribution in #12622
- @RichardoC made their first contribution in #12607
- @yeahyung made their first contribution in #11795
- @mnguyen96 made their first contribution in #12619
Full Changelog: v1.74.3.rc.1...v1.74.4-nightly
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.74.4-nightly
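Once the container is running, the proxy serves an OpenAI-compatible API on port 4000. A minimal sketch of calling it with the official OpenAI Python client, assuming a model has already been configured on the proxy (model name and key below are placeholders):

```python
from openai import OpenAI

# Standard OpenAI client pointed at the local LiteLLM proxy started above.
client = OpenAI(
    base_url="http://localhost:4000",  # LiteLLM proxy from the docker run command
    api_key="sk-1234",                 # placeholder; use your proxy key
)

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder; any model name configured on the proxy
    messages=[{"role": "user", "content": "ping"}],
)
print(response.choices[0].message.content)
```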
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|---|---|---|---|---|---|---|---|---|---|
| /chat/completions | Passed ✅ | 240.0 | 258.63325774080073 | 6.1786141049802525 | 0.0 | 1848 | 0 | 211.92541800002118 | 1368.992559999981 |
| Aggregated | Passed ✅ | 240.0 | 258.63325774080073 | 6.1786141049802525 | 0.0 | 1848 | 0 | 211.92541800002118 | 1368.992559999981 |