What's Changed
- feat(internal_user_endpoints.py): new `/user/bulk_update` endpoint by @krrishdholakia in #12720 (see the first sketch after this list)
 - fix(team_endpoints.py): ensure user id correctly added when new team … by @krrishdholakia in #12719
 - Teams - allow setting custom key duration + show how many user + service account keys have been created by @krrishdholakia in #12722
 - Regenerate Key State Management and Authentication Issues by @NANDINI-star in #12729
 - Fix AsyncMock error in team endpoints test by @colesmcintosh in #12730
 - [Feat] Add `azure_ai/grok-3` model family + Cost tracking by @ishaan-jaff in #12732
 - fixed comment in docs for anthropic provider by @jvanmelckebeke in #12725
 - [Bug Fix] QA - Use PG Vector Vector Store with LiteLLM by @ishaan-jaff in #12716
 - [Bug fix] s3 v2 log uploader crashes when using with guardrails by @ishaan-jaff in #12733
 - chore(proxy): loosen rich version from ==13.7.1 to >=13.7.1 by @jlaurendi in #12704
 - Add Hosted VLLM rerank provider integration by @jugaldb in #12738
 - `/streamGenerateContent` - non-gemini model support by @krrishdholakia in #12647
 - Anthropic - add tool cache control support by @krrishdholakia in #12668
 - Health check app on separate port by @jugaldb in #12718
 - Guardrails AI - support `llmOutput`-based guardrails as pre-call hooks by @krrishdholakia in #12674
 - [Prometheus] Move Prometheus to enterprise folder by @jugaldb in #12659
 - [jais-30b-chat] added model to prices and context window by @jugaldb in #12739
 - feat: integrate Google Cloud Model Armor guardrails by @colesmcintosh in #12492
 - Add project_id to cached credentials for VertexAI by @doublerr in #12661
 - [Feat] UI - Allow clicking into Vector Stores by @ishaan-jaff in #12741
 - fix(lowest_latency.py): Handle ZeroDivisionError with zero completion tokens by @colesmcintosh in #12734
 - build(deps): bump on-headers and compression in /docs/my-website by @dependabot[bot] in #12721
 - [LLM Translation] Change System prompts to assistant prompts as a workaround for GH Copilot by @jugaldb in #12742
 - [LLM Translation - Redis] fix: redis caching for embedding response models by @jugaldb in #12750
 - [LLM Translation] Added model name formats by @jugaldb in #12745
 - [Feat] LLM API Endpoint - Expose OpenAI Compatible `/vector_stores/{vector_store_id}/search` endpoint by @ishaan-jaff in #12749 (see the second sketch after this list)
 - [Feat] UI Vector Stores - Allow adding Vertex RAG Engine, OpenAI, Azure by @ishaan-jaff in #12752
 - feat: add v0 provider support by @colesmcintosh in #12751
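
For the new `/user/bulk_update` endpoint (#12720), here is a minimal curl sketch against a local proxy. The request body shape is an illustrative assumption (the PR defines the actual schema), and the API key and user fields are placeholders:

```shell
# Hypothetical payload: field names are assumptions, not taken from the PR.
curl -X POST 'http://localhost:4000/user/bulk_update' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{
    "users": [
      {"user_id": "user-1", "user_role": "internal_user"},
      {"user_id": "user-2", "user_role": "internal_user"}
    ]
  }'
```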
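Similarly, a sketch for the OpenAI-compatible vector store search endpoint (#12749), assuming a vector store with id `vs_abc123` already exists; `max_num_results` follows the OpenAI vector store search spec and is an assumption here:

```shell
# Search an existing vector store through the proxy (placeholder key and store id).
curl -X POST 'http://localhost:4000/vector_stores/vs_abc123/search' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{
    "query": "what is litellm?",
    "max_num_results": 3
  }'
```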
 
New Contributors
- @jvanmelckebeke made their first contribution in #12725
 - @jlaurendi made their first contribution in #12704
 - @doublerr made their first contribution in #12661
 
Full Changelog: v1.74.5.dev1...v1.74.6-nightly
Docker Run LiteLLM Proxy
```shell
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.74.6-nightly
```
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) | 
|---|---|---|---|---|---|---|---|---|---|
| /chat/completions | Passed ✅ | 220.0 | 237.64 | 6.24 | 0.0 | 1868 | 0 | 193.61 | 1787.48 |
| Aggregated | Passed ✅ | 220.0 | 237.64 | 6.24 | 0.0 | 1868 | 0 | 193.61 | 1787.48 |