What's Changed
- Prompt Management API - new API to interact with Prompt Management integrations (no PR required) by @krrishdholakia in #17800
- [Feature] UI - Keys/Teams: Add Access Group Selector to Create and Edit Flow by @yuneng-jiang in #21234
- [Fix] Preserve key_alias and team_id metadata in /user/daily/activity/aggregated after key deletion or regeneration by @shivamrawat1 in #20684
- [Infra] Fixes + UI Build by @yuneng-jiang in #21235
- [Docs] Access Group Docs by @yuneng-jiang in #21236
- [Infra] UI - Unit Tests: Increase timeout on Long Running Tests by @yuneng-jiang in #21237
- fix: use None instead of Reasoning() for reasoning parameter by @jquinter in #21103
- chore: add .claude directory to gitignore by @jquinter in #21104
- fix: remove unused Reasoning import from transformation.py by @jquinter in #21246
- Fix: Langfuse test isolation to prevent flaky failures by @jquinter in #21214
- fix(test): resolve merge conflict and fix bedrock thinking test flakiness by @jquinter in #21216
- [test] Fix flaky tests caused by module reloading and missing mocks by @jquinter in #19747
- Fix SSO test flakiness by correctly mocking premium_user by @jquinter in #21227
- fix: make policy_resolve_endpoints importable without FastAPI by @jquinter in #21075
- fix(test): improve Langfuse test isolation to prevent flaky failures by @jquinter in #21248
- fix(test): add mock isolation for test_video_content_handler_uses_get_for_openai by @jquinter in #21251
- fix(test): add cleanup for disable_aiohttp_transport in test_extra_body_with_fallback by @jquinter in #21250
- refactor: remove dead code from Langfuse test cleanup by @jquinter in #21253
- fix(test): restore Langfuse client counter in test cleanup by @jquinter in #21254
- refactor(test): remove redundant cache flush from test_openai_env_base by @jquinter in #21257
- fix(test): add environment cleanup for Vertex AI rerank tests by @jquinter in #21268
- fix(test): update reasoning_effort test to expect dict format by @jquinter in #21271
- fix(test): add environment cleanup for Vertex AI GPT-OSS tests by @jquinter in #21272
- fix(test): add environment cleanup for Vertex AI Qwen tests by @jquinter in #21273
- fix(test): use async side_effect for client.post mock in watsonx test by @jquinter in #21275
- fix(test): mock vertexai module in GPT-OSS tests to prevent authentication by @jquinter in #21276
- fix(test): update test_other_constraints_preserved for new schema filtering by @jquinter in #21217
- fix(deps): add fakeredis for pod lock manager tests by @jquinter in #21281
- fix(test): clear tokenizer LRU cache for test isolation by @jquinter in #21279
- fix(test): mock environment variables for callback validation test by @jquinter in #21286
- Fix/21193 chatgpt codex unsupported params by @jayy-77 in #21209
- Add routing based on if reasoning is supported or not by @Sameerlite in #21302
- Fix converse anthropic usage object according to v1/messages specs by @Sameerlite in #21295
- Managed batches - Misc bug fixes by @ephrimstanley in #21157
- fix(bedrock): clamp thinking.budget_tokens to minimum 1024 by @mjkam in #21306
- Add doc for OpenAI Agents SDK with LiteLLM by @Sameerlite in #21311
- Litellm oss staging 02 14 20262 by @Sameerlite in #21307
- fix: virtual key grace period from env/UI by @Harshit28j in #20321
- fix: SSO PKCE support fails in multi-pod Kubernetes deployments by @Harshit28j in #20314
- fix(deps): add pytest-postgresql for db schema migration tests by @jquinter in #21280
- fix(test): replace caplog with custom handler for parallel execution by @jquinter in #21282
- fix(test): correct async mock for video generation logging test by @jquinter in #21283
- fix(test): add cleanup fixture and no_parallel mark for MCP tests by @jquinter in #21284
- Litellm anthropic doc beta header by @Sameerlite in #21320
- Generic Guardrails: Add a configurable fallback to handle generic guardrail endpoint connection failures by @itayov in #21245
- Fix: Exclude tool params for models without function calling support (#21125) by @AtharvaJaiswal005 in #21244
- fix: preserve metadata for custom callbacks on codex/responses path (… by @saneroen in #21243
- fix(proxy): handle missing DATABASE_URL in append_query_params by @vincentkoc in #21239
- fix(mcp): revert StreamableHTTPSessionManager to stateless mode by @michelligabriele in #21323
- fix: prevent double-counting of litellm_proxy_total_requests_metric by @shivamrawat1 in #21159
- UI - Content Filters, help edit/view categories and 1-click add categories + go to next page by @krrishdholakia in #21223
- fix(responses-api): return finish_reason='tool_calls' when response.completed contains function_call items by @felixti in #19745
- Fix OCI Grok output pricing by @ishaan-jaff in #21329
- [Infra] Bumping proxy extras version by @yuneng-jiang in #21332
- docs: add Semgrep & OOM fixes section to v1.81.12 release notes by @AlexsanderHamir in #21334
- Fix au.anthropic.claude opus 4 6 v1 by @anttttti in #20731
- Feat/playground test fallbacks by @atapia27 in #21007
- fix(proxy): fix master key rotation Prisma validation errors by @michelligabriele in #21330
- Add GDPR Art. 32 EU PII Protection Policy Template by @ishaan-jaff in #21340
- feat: EU AI Act Article 5 policy template for prohibited practices detection by @ishaan-jaff in #21342
- [Feature] UI - Usage: Allow Filtering by User by @yuneng-jiang in #21351
- fix: Make vector stores migration idempotent by @milan-berri in #21325
- feat: guardrail tracing UI - policy, detection method, match details by @ishaan-jaff in #21349
- feat(bedrock): support native structured outputs API (outputConfig.textFormat) by @ndgigliotti in #21222
- Fix: Add blog as incident report by @Sameerlite in #21356
- feat(models): add github_copilot/gpt-5.3-codex and github_copilot/claude-opus-4.6-fast by @Chesars in #21316
- fix(proxy): preserve and forward OAuth Authorization headers through proxy layer by @iamadamreed in #19912
- feat: Add IBM watsonx.ai rerank support by @MateuszOssGit in #21303
- fix: make PodLockManager.release_lock atomic compare-and-delete by @emerzon in #21226
- [Infra] v1.81.13-nightly Change Copy to main by @yuneng-jiang in #21357
- [Infra] Add Server Root Test to GitHub Actions by @yuneng-jiang in #21353
- fix: preserve provider_specific_fields from proxy responses by @sahukanishka in #21220
- perf(router): remove quadratic deployment scan in usage-based routing v2 by @emerzon in #21211
- perf(router): avoid O(n^2) membership scans in team deployment filter by @emerzon in #21210
- fix: add store to OPENAI_CHAT_COMPLETION_PARAMS by @namabile in #21195
- Fix Bedrock service_tier cost propagation by @emerzon in #21172
- fix: add missing OpenAI chat completion params to OPENAI_CHAT_COMPLETION_PARAMS by @shin-bot-litellm in #21360
- perf: increase default LRU cache size to reduce multi-model thrash by @emerzon in #21139
- fix(router): avoid O(n) alias scan for non-alias get_model_list lookups by @emerzon in #21136
- [Fix] Key Expiry Default Duration by @yuneng-jiang in #21362
- Add Databricks to supported LLM providers for response schema by @TomeHirata in #21368
- Update poetry.lock by @Sameerlite in #21383
- [feat] Add support for Openai Evals API by @Sameerlite in #21375
- Add vllm e2e test for embedding by @Sameerlite in #21382
- fix(lint): suppress PLR0915 too many statements in route_request by @jquinter in #21390
- Add Claude Sonnet 4.6 pricing by @ishaan-jaff in #21395
- add default version for opus 4.6 by @superpoussin22 in #21397
- Day 0 Support: Claude Sonnet 4.6 by @ishaan-jaff in #21401
- fix(ci): reduce parallelism and add retry logic to improve test stability by @jquinter in #21394
- fix(tests): improve conftest isolation and remove deprecation warnings by @jquinter in #21396
- Add EU AI Act Article 5 template to policy templates UI by @ishaan-jaff in #21414
- fix: remove unused asyncio imports (linting errors) by @jquinter in #21412
- fix(deps): regenerate poetry.lock after pyproject.toml changes by @jquinter in #21418
- fix(tests): resolve test isolation issue in http_handler tests by @jquinter in #21388
- fix(test): prevent flaky failure in test_log_langfuse_v2_handles_null_usage_values by @jquinter in #21419
- fix(token-counter): normalize encode() return type and handle HF tokenizer fallback by @jquinter in #21416
- fix(tests): mock prisma.Prisma in backoff retry tests to avoid 'prisma generate' by @jquinter in #21421
- fix(lakera-guardrail): avoid KeyError on missing LAKERA_API_KEY during initialization by @jquinter in #21422
- Fix EU AI Act template: add missing category_file path by @ishaan-jaff in #21424
- [Fix] /v1/models returning wildcard instead of expanded models for BYOK team keys by @shivamrawat1 in #21408
- fix: remove importlib.reload calls causing cross-test class-reference staleness by @jquinter in #21425
- Add French language support for EU AI Act Article 5 guardrail by @ishaan-jaff in #21427
- fix(token-counter): fix test isolation and encode() return type normalization by @jquinter in #21423
- fix(tests): use class-level AsyncHTTPHandler mock in vertex GPT-OSS tests by @jquinter in #21428
- fix(tests): restore disable_aiohttp_transport and force_ipv4 in isolate_litellm_state by @jquinter in #21431
- fix(test): mock enterprise license check in JWT test by @jquinter in #21285
- fix: improve test isolation for parallel execution by @jquinter in #20595
- improve(ci): enhance test stability with better isolation and distribution by @jquinter in #21277
- fix: session grouping broken for dict rows from query_raw by @ishaan-jaff in #21435
- fix: restore sys.modules after stub injection in langfuse otel test by @jquinter in #21434
- feat(ui): add guardrail jump link in log detail view by @ishaan-jaff in #21437
- move e2e to llm translation by @Sameerlite in #21387
- Add compliance checker endpoints + UI panel by @ishaan-jaff in #21432
- fix(bedrock): broaden Nova 2 model detection to support all nova-2-* variants by @ryanh-ai in #21358
- Add prompt injection detection policy template + guardrails by @ishaan-jaff in #21452
- feat: split EU AI Act Article 5 into 5 dedicated sub-guardrails by @ishaan-jaff in #21453
- Add MCP Security guardrail to block unregistered MCP servers by @ishaan-jaff in #21429
- End users - Allow giving end users access to specific mcp servers by @krrishdholakia in #21411
- Revert "End users - Allow giving end users access to specific mcp servers " by @krrishdholakia in #21461
- Add support for devstral 2512 model aliases by @stronk7 in #21372
- feat(bedrock): support nova/ and nova-2/ spec prefixes for custom imported models by @ryanh-ai in #21359
- Add native Responses API support for Databricks GPT models by @TomeHirata in #21460
- Litellm prompt registry fix by @Harshit28j in #21402
- Prompt Management API - allow integrating with LiteLLM prompt management without a PR by @krrishdholakia in #17946
- Revert "fix: make PodLockManager.release_lock atomic compare-and-delete" by @Sameerlite in #21469
- Litellm oss staging 02 16 2026 by @krrishdholakia in #21326
- Litellm oss staging 02 17 2026 by @krrishdholakia in #21361
- [Chore]Add remaining beta tests2 by @Sameerlite in #21299
- Add mapping for websearch from v1/messages to chat/completions by @Sameerlite in #21465
- Add 'reasoning' field to 'reasoning_content' field in delta by @Sameerlite in #21468
- [Feat] Add duckduckgo as search tool by @Sameerlite in #21467
- Litellm sanitise anthropic messages 2 by @Sameerlite in #21464
- Add File deletion criteria with batch references by @Sameerlite in #21456
- Incident Report: vLLM Embeddings Broken by encoding_format Parameter by @Sameerlite in #21474
- [Feat]Add day 0 claude sonnet 4.6 feat support by @Sameerlite in #21448
- Fix mock test by @Sameerlite in #21475
- fix(tests): restore proxy_server module attrs after test_proxy_admin_expired_key_from_cache by @jquinter in #21473
- fix(ci): add prisma generate step to matrix CI workflow by @jquinter in #21436
- feat(datadog): add 'team' tag to logs, metrics, and cost management by @Harshit28j in #21449
- fix(tests): resolve merge conflict in test_vertex_ai_rerank_transformation.py by @jquinter in #21478
- fix(proxy): use prisma.Json for JSON fields in _rotate_master_key create_many() by @jquinter in #21479
- fix(tests): add inference_geo to model prices JSON schema validator by @jquinter in #21477
- Add deployment affinity routing callback by @emerzon in #19143
- [Refactor] UI - Keys: Change Key Type Label by @yuneng-jiang in #21364
- Add version in claude-code-beta-headers-incident by @Sameerlite in #21485
- fix: guard against None metadata in prometheus metrics by @ishaan-jaff in #21489
- fix(tests): restore litellm.model_cost after reload endpoint test by @jquinter in #21499
- [Infra] Change Server Root Path GitHub action test to non root image by @yuneng-jiang in #21495
- fix(ci): force-reinstall enterprise package to override PyPI version by @jquinter in #21481
- fix(tests): resolve MCP test isolation failures in parallel execution by @jquinter in #21484
- fix(tests): restore default_internal_user_params instead of delattr-ing it by @jquinter in #21483
- fix: improve streaming proxy throughput by fixing middleware and logging bottlenecks by @ishaan-jaff in #21501
- fix(ci): install enterprise package into main project venv, not enterprise's own venv by @jquinter in #21506
- [Bug] Allow internal_user_viewer to access RAG endpoints; restrict ingest to existing vector stores by @shivamrawat1 in #21508
- fix(sso): preserve SSO role regardless of role_mappings config by @yuneng-jiang in #21503
- [Feature] Allow store_model_in_db to be set via database by @yuneng-jiang in #21511
- fix: CI failures - missing env key doc + streaming test by @ishaan-jaff in #21510
- Add aviation and UAE policy templates with tag-based filtering by @ishaan-jaff in #21518
- Mcp user permissions by @krrishdholakia in #21462
- feat(ui): add CSV dataset upload to compliance playground by @ishaan-jaff in #21526
- Litellm cicd 190226 by @Sameerlite in #21531
- Add support for context-1m-2025-08-07 by @Sameerlite in #21534
- fix: prevent sys.modules["langfuse"] import failures in langfuse unit tests by @jquinter in #21440
- fix(types): add = None defaults to Optional[str] fields in managed table models by @jquinter in #21500
- [Feature] UI - Models & Endpoints: Add Model Settings Modal by @yuneng-jiang in #21516
- fix(tests): restore litellm.model_cost after TestPriceDataReloadIntegration tests by @jquinter in #21505
- fix(tests): update MCP tests broken by user permissions commit (#21462) by @jquinter in #21536
- fix(mypy): resolve type errors from MCP user permissions commit by @jquinter in #21535
- fix(test): restore default_internal_user_params to None instead of delattr by @jquinter in #21439
- fix(tests): use record.getMessage() instead of record.message for LogRecord by @jquinter in #21476
- fix(ui): remove duplicate URL in tagsSpendLogsCall query string by @jquinter in #20909
- Competitor guardrails: streaming discovery, variations, pre/post split by @ishaan-jaff in #21533
- [Feature] Allow team members to view entire team usage by @yuneng-jiang in #21537
- Litellm project management apis by @Harshit28j in #21078
- fix: remove list-to-str transformation from dashscope by @ZeroAurora in #21547
- Uncomment response_model in user_info endpoint by @richardmcsong in #17430
- fix: allow github aliases to reuse upstream model metadata by @SolitudePy in #21497
- fix(proxy): prevent is_premium() debug log spam on every request by @themavik in #20841
- Convert thinking_blocks to content blocks for hosted_vllm multi-turn by @SherifWaly in #21557
- Fix usage in xai by @Sameerlite in #21559
- [Feat] Add Default usage data configuration by @Sameerlite in #21550
- Fix: add stop param as supported for openai and azure by @Sameerlite in #21539
- [Feat] Add server side compaction translation from openai to anthropic by @Sameerlite in #21555
- Add method based routing for passthrough endpoints by @Sameerlite in #21543
- fix(websearch_interception): fix pre_call_deployment_hook not triggering via proxy router by @michelligabriele in #21433
- fix(constants): add env var override support for COMPETITOR_LLM_TEMPERATURE and MAX_COMPETITOR_NAMES by @jquinter in #21564
- fix(types): fix mypy errors in pass-through endpoint query param types by @jquinter in #21566
- [Feat]Add gemini 3.1 pro preview day 0 support by @Sameerlite in #21568
- bump: version 0.4.40 → 0.4.41 by @yuneng-jiang in #21579
- [Infra] bump proxy extras by @yuneng-jiang in #21580
- [Feature] Key Last Active Tracking by @yuneng-jiang in #21545
- [Infra] Fixing Merge Artifacts by @yuneng-jiang in #21586
- [Infra] Add project_id to DeletedVerificationTable by @yuneng-jiang in #21587
- Fix release by @Sameerlite in #21588
- fix: handle deprovisioning operations without path field by @milan-berri in #21571
- fix(bedrock): add Accept header for AgentCore MCP server requests by @michelligabriele in #21551
- feat: AI policy template suggestions by @ishaan-jaff in #21589
- fix: reduce proxy overhead for large base64 payloads by @ishaan-jaff in #21594
- Add OpenAPI-to-MCP support via API and UI by @ishaan-jaff in #21575
- docs: add latency overhead troubleshooting guide by @ishaan-jaff in #21600
- docs: add latency overhead troubleshooting guide by @ishaan-jaff in #21603
- feat: add airline off-topic restriction policy template by @ishaan-jaff in #21607
- feat(policy): test playground for AI policy suggestions by @ishaan-jaff in #21608
- ci: auto-regenerate poetry.lock when pyproject.toml changes on main by @jquinter in #21610
- fix(key management): return failed_tokens in delete_verification_tokens response by @jquinter in #21609
- fix(ci): fix YAML syntax error in regenerate-poetry-lock workflow by @jquinter in #21615
- fix(ci): fall back to github.token when GH_TOKEN secret is not set by @jquinter in #21616
- feat: prompt injection guardrail policy template by @ishaan-jaff in #21520
- fix(ci): remove --no-update flag removed in Poetry 2.x by @jquinter in #21617
- fix(ci): use PAT_TOKEN_2 for gh pr create in regenerate-lock workflow by @jquinter in #21618
- feat(ui): show latency overhead for AI-suggested policy templates by @ishaan-jaff in #21620
- feat(ci): auto-approve and auto-merge the regenerated poetry.lock PR by @jquinter in #21619
- fix(ci): drop PAT_TOKEN_2 approval, use github.token for auto-merge by @jquinter in #21625
- chore: regenerate poetry.lock to match pyproject.toml by @github-actions[bot] in #21626
- [Fix] Service Account Visibility for Team Members by @yuneng-jiang in #21627
- fix: handle explicit None model_info in LowestLatencyLoggingHandler by @dkindlund in #21633
- support reasoning and effort parameters on sonnet 4.6 by @jtsaw in #21598
- fix(lint): remove unused imports in semantic_guard and policy_endpoints by @jquinter in #21639
- fix(tests): set premium_user=True in JWT tests that call user_api_key_auth by @jquinter in #21641
- fix(anthropic): empty system messages in translate_system_message by @Chesars in #21630
- fix(tests): pass host to RedisCache in test_team_update_redis by @jquinter in #21643
- fix(policy): use litellm.acompletion directly in AiPolicySuggester by @jquinter in #21638
- fix(lint): remove redundant router import in policy_endpoints init by @jquinter in #21602
- [Refactor] UI: Remove 38 unused files detected by knip by @yuneng-jiang in #21657
- Fix _map_reasoning_effort_to_thinking_level for all gemini 3 family by @Sameerlite in #21654
- Fix: api_base is required. Unable to determine the correct api_base for the request by @Sameerlite in #21658
- Fix mapping of parallel_tool_calls for bedrock converse by @Sameerlite in #21659
- [Fix]Add mcp via openapi spec by @Sameerlite in #21662
- [Feat] Add reasoning support via config by @Sameerlite in #21663
- fix(tests): update test_max_effort_rejected_for_opus_45 regex to match current error message by @jquinter in #21666
- fix(policy_endpoints): re-export private helper functions from package init.py by @jquinter in #21667
- fix(tests): skip more CI tests requiring external DB/Redis connections by @jquinter in #21671
- fix(tests): set premium_user=True in test_aasync_call_with_key_over_model_budget by @jquinter in #21668
- fix(tests): skip CI tests requiring external services (DB, API keys) by @jquinter in #21669
- fix(tests): correct medium reasoning_effort assertion for gemini-3-pro-preview by @jquinter in #21670
- fix(tests): skip project endpoint tests requiring Prisma DB connection by @jquinter in #21675
- fix(tests): skip prisma DB test and sync root schema.prisma with spec_path by @jquinter in #21676
- fix(tests): skip all remaining prisma DB tests in test_key_generate_prisma.py by @jquinter in #21679
- fix(lint): extract service_tier mapping to fix PLR0915 in converse_transformation.py by @jquinter in #21680
- fix(tests): add spec_path=None to MCP server mocks to fix Pydantic validation by @jquinter in #21681
- fix(model-pricing): add missing fireworks_ai model pricing for glm-4p7, minimax-m2p1, kimi-k2p5 by @michelligabriele in #21642
- fix(proxy): use batch_ prefix for Vertex AI batch IDs in encode_file_id_with_model by @michelligabriele in #21624
- fix(tests): skip remaining real prisma DB tests in CI and related test suites by @jquinter in #21684
- fix(tests): skip test_search_api_logging_and_cost_tracking - requires Prisma DB by @jquinter in #21682
- fix(cost-calc): use per-image pricing for Bedrock multimodal embeddings by @michelligabriele in #21646
- fix(types): resolve MyPy assignment type errors in logging_utils and vertex transformation by @jquinter in #21685
New Contributors
- @mjkam made their first contribution in #21306
- @saneroen made their first contribution in #21243
- @vincentkoc made their first contribution in #21239
- @felixti made their first contribution in #19745
- @anttttti made their first contribution in #20731
- @ndgigliotti made their first contribution in #21222
- @iamadamreed made their first contribution in #19912
- @sahukanishka made their first contribution in #21220
- @namabile made their first contribution in #21195
- @stronk7 made their first contribution in #21372
- @ZeroAurora made their first contribution in #21547
- @SolitudePy made their first contribution in #21497
- @SherifWaly made their first contribution in #21557
- @github-actions[bot] made their first contribution in #21626
- @dkindlund made their first contribution in #21633
Full Changelog: v1.81.12-nightly...litellm_langfuse-dev-v1.81.13