## What's Changed
- feat(internal_user_endpoints.py): new `/user/bulk_update` endpoint by @krrishdholakia in #12720 (sketch after this list)
- fix(team_endpoints.py): ensure user id correctly added when new team … by @krrishdholakia in #12719
- Teams - allow setting custom key duration + show how many user + service account keys have been created by @krrishdholakia in #12722
- Regenerate Key State Management and Authentication Issues by @NANDINI-star in #12729
- Fix AsyncMock error in team endpoints test by @colesmcintosh in #12730
- [Feat] Add `azure_ai/grok-3` model family + Cost tracking by @ishaan-jaff in #12732
- fixed comment in docs for anthropic provider by @jvanmelckebeke in #12725
- [Bug Fix] QA - Use PG Vector Vector Store with LiteLLM by @ishaan-jaff in #12716
- [Bug fix] s3 v2 log uploader crashes when using with guardrails by @ishaan-jaff in #12733
- chore(proxy): loosen rich version from ==13.7.1 to >=13.7.1 by @jlaurendi in #12704
- Add Hosted VLLM rerank provider integration by @jugaldb in #12738
- /streamGenerateContent - non-gemini model support by @krrishdholakia in #12647
- Anthropic - add tool cache control support by @krrishdholakia in #12668 (sketch after this list)
- Health check app on separate port by @jugaldb in #12718
- Guardrails AI - support `llmOutput` based guardrails as pre-call hooks by @krrishdholakia in #12674
- [Prometheus] Move Prometheus to enterprise folder by @jugaldb in #12659
- [jais-30b-chat] added model to prices and context window by @jugaldb in #12739
- feat: integrate Google Cloud Model Armor guardrails by @colesmcintosh in #12492
- Add project_id to cached credentials for VertexAI by @doublerr in #12661
- [Feat] UI - Allow clicking into Vector Stores by @ishaan-jaff in #12741
- fix(lowest_latency.py): Handle ZeroDivisionError with zero completion tokens by @colesmcintosh in #12734
- build(deps): bump on-headers and compression in /docs/my-website by @dependabot[bot] in #12721
- [LLM Translation] Change System prompts to assistant prompts as a workaround for GH Copilot by @jugaldb in #12742
- [LLM Translation - Redis] fix: redis caching for embedding response models by @jugaldb in #12750
- [LLM Translation] Added model name formats by @jugaldb in #12745
- [Feat] LLM API Endpoint - Expose OpenAI Compatible /vector_stores/{vector_store_id}/search endpoint by @ishaan-jaff in #12749 (sketch after this list)
- [Feat] UI Vector Stores - Allow adding Vertex RAG Engine, OpenAI, Azure by @ishaan-jaff in #12752
- feat: add v0 provider support by @colesmcintosh in #12751
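
A minimal sketch of the new `/user/bulk_update` endpoint from #12720, assuming a proxy on `localhost:4000` and a master key of `sk-1234`; the payload fields (`users`, `user_id`, `max_budget`) are illustrative guesses modeled on the existing `/user/update` route, not the confirmed schema — check the PR for the actual request shape.

```shell
# Hypothetical payload -- field names are assumptions, not the confirmed schema (see #12720).
curl -X POST 'http://localhost:4000/user/bulk_update' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{
    "users": [
      {"user_id": "user-1", "max_budget": 100.0},
      {"user_id": "user-2", "max_budget": 50.0}
    ]
  }'
```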
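A sketch of Anthropic tool cache control (#12668) through the proxy's `/chat/completions` route. The `cache_control: {"type": "ephemeral"}` placement on the tool follows Anthropic's prompt-caching convention; the model name, key, and tool definition are placeholders, so verify the exact field placement against the PR.

```shell
# Mark a tool definition as cacheable -- cache_control placement follows
# Anthropic's ephemeral prompt-caching convention (verify against #12668).
curl -X POST 'http://localhost:4000/v1/chat/completions' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "claude-3-5-sonnet-20240620",
    "messages": [{"role": "user", "content": "What is the weather in SF?"}],
    "tools": [{
      "type": "function",
      "function": {
        "name": "get_weather",
        "parameters": {"type": "object", "properties": {"location": {"type": "string"}}}
      },
      "cache_control": {"type": "ephemeral"}
    }]
  }'
```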
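The OpenAI-compatible vector store search route from #12749 can be exercised with a plain POST; the vector store ID, key, and query below are placeholders.

```shell
# Query an existing vector store through the OpenAI-compatible route from #12749.
curl -X POST 'http://localhost:4000/v1/vector_stores/vs_123/search' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{"query": "what is litellm?"}'
```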
## New Contributors
- @jvanmelckebeke made their first contribution in #12725
- @jlaurendi made their first contribution in #12704
- @doublerr made their first contribution in #12661
**Full Changelog**: v1.74.5.dev1...v1.74.6-nightly
## Docker Run LiteLLM Proxy
```shell
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.74.6-nightly
```
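
Once the container is up, a quick smoke test against the proxy; the key and model name are placeholders for whatever your deployment configures.

```shell
# Verify the proxy answers on port 4000 -- swap in a model configured in your DB/config.
curl -X POST 'http://localhost:4000/chat/completions' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "hello"}]}'
```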
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
## Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Passed ✅ | 220.0 | 237.64 | 6.24 | 0.0 | 1868 | 0 | 193.61 | 1787.48 |
| Aggregated | Passed ✅ | 220.0 | 237.64 | 6.24 | 0.0 | 1868 | 0 | 193.61 | 1787.48 |