github BerriAI/litellm
v1.77.7.rc.1-v2

Pre-release · 13 hours ago

What's Changed

  • merge main by @Sameerlite in #14957
  • [Fix] Fix LiteLLM model name fallback in dashboard overview by @herve-ves in #14998
  • [Fix] Use the extra_query parameter for GET requests in Azure Batch by @eycjur in #14997 (see the extra_query sketch after this list)
  • MCP - specify forwardable headers, specify allowed/disallowed tools for MCP servers by @krrishdholakia in #15002
  • [Fix] response_format bug in hosted vllm audio_transcription by @eycjur in #15010
  • feat: add ollama cloud models by @wenxi-onyx in #15008
  • fix passthrough of atranscription into kwargs going to upstream provider by @jpetrucciani in #15005
  • Update litellm docs from latest release by @berri-teddy in #15004
  • Feat: Add Javelin standalone guardrails integration for LiteLLM Proxy by @abhijitjavelin in #14983
  • Fix/remove servername prefix mcp tools tests by @uc4w6c in #14986
  • Updated the behavior of Vertex AI and Google AI Studio when using a custom api_base by @ZeroClover in #15039
  • Revert "Updated the behavior of Vertex AI and Google AI Studio when using a custom api_base" by @ishaan-jaff in #15042
  • [Feat] Add new claude-sonnet-4-5 model family by @ishaan-jaff in #15041 (see the usage sketch after this list)
  • (Feat) Add cost tracking for Vertex AI Passthrough /predict endpoint by @Sameerlite in #15019
  • Add anthropic/claude-sonnet-4-5 to model price json by @ishaan-jaff in #15049
  • [Feat] Add litellm overhead metric for VertexAI by @ishaan-jaff in #15040
  • fix: remove router inefficiencies (from O(M*N) to O(1)) - 62.5% faster P99 latency by @AlexsanderHamir in #15046
  • [Feat] LiteLLM Overhead metric tracking - Add support for tracking litellm overhead on cache hits by @ishaan-jaff in #15045
  • [Fix] Parallel Request Limiter v3 - use well known redis cluster hashing algorithm by @ishaan-jaff in #15052
  • Fix: Add /v1/messages/count_tokens to Anthropic routes for non-admin user access by @Copilot in #15034
  • [Feat] Return Cost for Responses API Streaming requests by @ishaan-jaff in #15053
  • doc: add missing api_key parameter by @uc4w6c in #15058
  • fix: resolve regression with duplicate Mcp-Protocol-Version header by @uc4w6c in #15050
  • fix: remove invalid vertex -latest models by @cedarm in #15043
  • fix: use extra_query for download results (Batch API) by @Isydmr in #15025
  • Ignore type param for gemini tools by @Sameerlite in #15022
  • [bug]: Update request handling for original exceptions by @serializer in #15013
  • Add AMD Lemonade provider support by @eddierichter-amd in #14840
  • feat: add groq/moonshotai/kimi-k2-instruct-0905 by @ishaan-jaff in #15079
  • [Feat] UI - add snowflake on UI by @ishaan-jaff in #15083
  • [Fix] Router - Remove hasattr checks by @AlexsanderHamir in #15082
  • [Fix] Router - Remove Double Lookups by @AlexsanderHamir in #15084
  • [Feature]: Replace HTTPException with ParallelRequestLimitError in parallel_request_limiter_v3 by @Copilot in #15033
  • [Fix Security] Ensure OCI secret fields not shared on /models and /v1/models endpoints by @ishaan-jaff in #15085
  • [Bug Fix] Passthrough API Endpoints - Ensure query params are forwarded from origin url to downstream request by @ishaan-jaff in #15087
  • Make UI theme settings publicly accessible for custom branding by @Jetemple in #15074
  • MCP - enforce server permissions on call tools + Teams - add model specific tpm/rpm limits to teams on LiteLLM by @krrishdholakia in #15044
  • [Performance] Reduce complexity of InMemoryCache.evict_cache from O(n*log(n)) to O(log(n)) by @malags in #15000 (see the eviction sketch after this list)
  • [Feat] Guardrails - add logging for important status fields by @ishaan-jaff in #15090
  • [Fix] Router - optimize _filter_cooldown_deployments from O(n×m + k×n) to O(n) by @AlexsanderHamir in #15091 (see the filtering sketch after this list)
  • Add support for GPT-5 Codex models by @uzaxirr in #14841
  • (Feat) Add Vertex AI Live API WebSocket Passthrough with Cost Tracking by @Sameerlite in #14956
  • Revert "[Feature]: Replace HTTPException with ParallelRequestLimitError in parallel_request_limiter_v3" by @ishaan-jaff in #15095
  • feat(gemini): Add full support for native Gemini API translation by @henryhwang in #15029
  • Add Gemini generateContent passthrough cost tracking by @Sameerlite in #15014
  • Fix missing HTTPException import by @plafleur in #15111
  • fix: model_group not always present in litellm_params, and metadata r… by @luizrennocosta in #15108
  • [Fix] Proxy Auth - Ensure LLM_API_KEYs can access pass through routes by @ishaan-jaff in #15115
  • [Bug Fix] gpt-5-chat-latest has incorrect max_input_tokens value by @ishaan-jaff in #15116
  • [Fix] LiteLLM UI - Ensure OTEL settings are saved in DB after set on UI by @ishaan-jaff in #15118
  • Gitlab based Prompt manager by @deepanshululla in #14988
  • [Feat] Fixes to dynamic rate limiter v3 - add saturation detection by @ishaan-jaff in #15119
  • Guardrails - Don't run post_call guardrail if no text returned from Bedrock by @plafleur in #15106
  • docs: use docker compose instead of docker-compose by @kowyo in #15024
  • fix (opentelemetry): use generation_name for span naming in logging method by @tyler-liner in #14799
  • add azure_ai grok-4 model family by @mubashir1osmani in #15137
  • Added railtracks to projects that are using litellm by @Amir-R25 in #15144
  • [Security Fix] fix: don't log JWT SSO token on .info() log by @ishaan-jaff in #15145
  • [Fix] Proxy: end user cost tracking in the responses API by @georg-wolflein in #15124
  • Price Fix: Add 200K prices for Sonnet 4.5 by @niharm in #15140
  • Add provider name to payload specification by @deepanshululla in #15130
  • [Fix]: Handle non-serializable objects in Langfuse logging by @ishaan-jaff in #15148
  • fix: set usage_details.total in langfuse integration by @anthony-liner in #15015
  • Top api key tags by @DrQuacks in #15151
  • Top api key tags by @DrQuacks in #15156
  • (feat) LiteLLM x TwelveLabs Bedrock [Async Invoke Support] by @Sameerlite in #14871
  • [Fix] - Router: optimize unhealthy deployment filtering in retry path (O(n*m) → O(n+m)) by @AlexsanderHamir in #15110
  • [Feat] Add Nvidia NIM Rerank Support by @ishaan-jaff in #15152 (see the rerank sketch after this list)
  • Litellm dev 10 02 2025 p1 by @krrishdholakia in #15155
  • [Feat] MCP Gateway Fine-grained Tools Addition by @rishiganesh2002 in #15153
  • Fix (critical): Preserve Whitespace Characters in Model Response Streams by @danielaskdd in #15160
  • [Fix] Session Token Cookie Infinite Logout Loop by @JVenberg in #15146
  • (Feat) Add cost tracking for /v1/messages in streaming response by @Sameerlite in #15102
  • Fix OCI Generative AI Integration when using Proxy by @speglich in #15072
  • feature/add max requests env var by @TobiMayr in #15007
  • Add "eu.anthropic.claude-sonnet-4-5-20250929-v1:0" in "model_prices_and_context_window.json" by @ishaan-jaff in #15181
  • [Feat] VertexAI - Support Google Maps grounding in Vertex AI by @ishaan-jaff in #15179
  • feat: add JP Cross-Region Inference by @uc4w6c in #15188
  • test: fix test_mcp_server.py by @uc4w6c in #15183
  • update: DeepInfra model data refresh [2025-09-26] by @Toy-97 in #14939
  • #14404 BugFix - Add support for Azure AD token-based authorization in… by @shagunb-acn in #14813
  • Fix: Authorization header to use correct "Bearer" capitalization by @daily-kim in #14764
  • [Fix] Cache - Avoiding expensive operations when cache isn't available by @AlexsanderHamir in #15182
  • [Doc] Perf: Last week improvement by @AlexsanderHamir in #15193
  • [Feat] Dynamic Rate Limiter v3 - fixes for detecting saturation + fixes for post saturation behavior by @ishaan-jaff in #15192
  • fix: empty premium fields resulting in edit key blocking by @ARajan1084 in #15184
  • Add streamGenerateContent cost tracking in passthrough by @Sameerlite in #15199
  • Add sync models GitHub documentation with Loom video and cross-refere… by @TeddyAmkie in #15191
  • UI - fix failed copy to clipboard for http ui + UI - fix logs page render logs on filter lookup + UI - fix lookup list of end users (migrate to more efficient /customers/list lookup) by @krrishdholakia in #15195
  • fix: Test key view: state on model info update by @ARajan1084 in #15197
  • Guardrails - run all guardrails before calling other post_call_success_hook + Prometheus - support custom metadata labels on key/team by @krrishdholakia in #15094
  • fix: fix ui rerender by @krrishdholakia in #15200
  • Litellm staging 10 04 2025 by @krrishdholakia in #15196
  • Revert "Add streamGenerateContent cost tracking in passthrough" by @ishaan-jaff in #15202
  • Fix "azure_ai/grok-4-fast-reasoning" entry in "model_prices_and_context_window.json" by @ishaan-jaff in #15204
  • (security) prevent user key from updating other user keys + don't return all keys with blank key alias on /v2/key/info by @krrishdholakia in #15201
  • (feat) Support 'guaranteed_throughput' when setting limits on keys belonging to a team by @krrishdholakia in #15120
  • fix(proxy_server.py): handle decrypting model list from db, when lite… by @krrishdholakia in #15154
  • (MCP - feat) UI - show health status of MCP servers, allow setting extra headers on the UI, allow editing allowed tools on the UI by @krrishdholakia in #15185
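
Usage sketches

A minimal sketch of the extra_query fixes above (#14997, #15025): query parameters passed via extra_query should now be forwarded on Azure Batch GET requests. The batch id and api-version value below are placeholders, and Azure credentials are assumed to be set in the environment.

```python
import litellm

# Hypothetical batch id and api-version; extra_query entries are appended to
# the query string of the GET request (the behavior fixed in #14997 / #15025).
# Assumes AZURE_API_KEY / AZURE_API_BASE are set in the environment.
batch = litellm.retrieve_batch(
    batch_id="batch_abc123",
    custom_llm_provider="azure",
    extra_query={"api-version": "2024-07-01-preview"},
)
print(batch.status)
```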
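
The new Claude Sonnet 4.5 family (#15041, #15049, #15181) is callable with LiteLLM's usual provider/model string. A minimal sketch, assuming ANTHROPIC_API_KEY is set:

```python
import litellm

# Model name matches the pricing entries added in this release;
# the prompt is illustrative only.
response = litellm.completion(
    model="anthropic/claude-sonnet-4-5",
    messages=[{"role": "user", "content": "Say hello in one word."}],
)
print(response.choices[0].message.content)
```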
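
On the InMemoryCache.evict_cache change (#15000): the standard way to take an eviction pass from O(n*log(n)) to O(log(n)) per eviction is a min-heap keyed on expiry, so only already-expired entries are touched. The sketch below illustrates the technique and is not LiteLLM's actual implementation.

```python
import heapq
import time

class ExpiringCache:
    """Illustrative TTL cache: a min-heap on expiry makes each eviction O(log n)."""

    def __init__(self):
        self._store = {}        # key -> (value, expiry)
        self._expiry_heap = []  # min-heap of (expiry, key)

    def set(self, key, value, ttl_seconds):
        expiry = time.time() + ttl_seconds
        self._store[key] = (value, expiry)
        heapq.heappush(self._expiry_heap, (expiry, key))  # O(log n)

    def evict_expired(self):
        now = time.time()
        # Pop only entries that have actually expired: O(log n) each, instead
        # of re-sorting every key on each pass (O(n log n)).
        while self._expiry_heap and self._expiry_heap[0][0] <= now:
            expiry, key = heapq.heappop(self._expiry_heap)
            current = self._store.get(key)
            if current is not None and current[1] == expiry:  # skip stale heap entries
                del self._store[key]
```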
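
The _filter_cooldown_deployments optimization (#15091) follows a similar pattern: build a set of cooled-down deployment ids once, then filter with O(1) membership checks. An illustrative sketch, not the exact router code:

```python
def filter_cooldown_deployments(healthy_deployments, cooldown_ids):
    # Building the set is O(m); each membership test is O(1), so the whole
    # filter is O(n + m) rather than O(n * m).
    cooldown_set = set(cooldown_ids)
    return [
        deployment
        for deployment in healthy_deployments
        if deployment["model_info"]["id"] not in cooldown_set
    ]
```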
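
For the new Nvidia NIM rerank support (#15152), a minimal sketch using LiteLLM's Cohere-style rerank API; the NIM model id shown is a placeholder, not taken from the release notes.

```python
import litellm

# Hypothetical NIM reranker id; assumes NVIDIA_NIM_API_KEY is set.
result = litellm.rerank(
    model="nvidia_nim/nvidia/nv-rerankqa-mistral-4b-v3",
    query="What is the capital of France?",
    documents=[
        "Paris is the capital of France.",
        "Berlin is the capital of Germany.",
    ],
    top_n=1,
)
print(result.results[0])
```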

New Contributors

Full Changelog: v1.77.5.rc.4...v1.77.7.rc.1-v2
