What's Changed
- Fix route check for non-proxy admins on jwt auth by @krrishdholakia in #9454
- docs(predibase): fix typo by @luisegarduno in #9464
- build(deps): bump next from 14.2.21 to 14.2.25 in /ui/litellm-dashboard by @dependabot in #9458
- [Feat] Add OpenAI Web Search Tool Call Support - Initial support by @ishaan-jaff in #9465
- Refactor vertex ai passthrough routes - fixes unpredictable behaviour w/ auto-setting default_vertex_region on router model add by @krrishdholakia in #9467
- [Feat] Add testing for `litellm.supports_web_search()` and render supports_web_search on model hub by @ishaan-jaff in #9469
- Litellm dev 03 22 2025 release note by @krrishdholakia in #9475
- build: add new vertex text embedding model by @krrishdholakia in #9476
- enables viewing all wildcard models on /model/info by @krrishdholakia in #9473
- Litellm redis semantic caching by @tylerhutcherson in #9356
- Log 'api_base' on spend logs by @krrishdholakia in #9509
- [Fix] Use StandardLoggingPayload for GCS Pub Sub Logging Integration by @ishaan-jaff in #9508
- [Feat] Support for exposing MCP tools on litellm proxy by @ishaan-jaff in #9426
- fix(invoke_handler.py): remove hard coded final usage chunk on bedrock streaming usage by @krrishdholakia in #9512
- Add vertexai topLogprobs support by @krrishdholakia in #9518
- Update model_prices_and_context_window.json by @superpoussin22 in #9459
- fix vertex ai multimodal embedding translation by @krrishdholakia in #9471
- ci(publish-migrations.yml): add action for publishing prisma db migrations by @krrishdholakia in #9537
- [Feat - New Model] Add VertexAI `gemini-2.0-flash-lite` and Google AI Studio `gemini-2.0-flash-lite` by @ishaan-jaff in #9523
- Support `litellm.api_base` for vertex_ai + gemini/ across completion, embedding, image_generation by @krrishdholakia in #9516
- Nova Canvas complete image generation tasks (#9177) by @krrishdholakia in #9525
- [Feature]: Support for Fine-Tuned Vertex AI LLMs by @ishaan-jaff in #9542
- feat(prisma-migrations): add baseline db migration file by @krrishdholakia in #9565
- Add Daily User Spend Aggregate view - allows UI Usage tab to work > 1m rows by @krrishdholakia in #9538
- Support Gemini audio token cost tracking + fix openai audio input token cost tracking by @krrishdholakia in #9535
- [Reliability Fixes] - Gracefully handle exceptions when DB is having an outage by @ishaan-jaff in #9533
- [Reliability Fix] - Allow pods to start up and pass /health/readiness when `allow_requests_on_db_unavailable: True` and the DB is down by @ishaan-jaff in #9569
- Add OpenAI gpt-4o-transcribe support by @krrishdholakia in #9517
- Allow viewing keyinfo on request logs by @krrishdholakia in #9568
- Allow team admins to add/update/delete models on UI + show api base and model id on request logs by @krrishdholakia in #9572
- Litellm fix db testing by @krrishdholakia in #9593
- Litellm new UI build by @krrishdholakia in #9601
- Support max_completion_tokens on Mistral by @Cmancuso in #9589
- Revert "Support max_completion_tokens on Mistral" by @krrishdholakia in #9604
- fix(mistral_chat_transformation.py): add missing comma by @krrishdholakia in #9606
- Support discovering gemini, anthropic, xai models by calling their `/v1/models` endpoint by @krrishdholakia in #9530
- Connect UI to "LiteLLM_DailyUserSpend" spend table - enables usage tab to work at 1m+ spend logs by @krrishdholakia in #9603
- Update README.md by @krrishdholakia in #9616
- fix(proxy_server.py): get master key from environment, if not set in … by @krrishdholakia in #9617
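The `allow_requests_on_db_unavailable` flag referenced in the reliability fix (#9569) is a proxy `general_settings` option. A sketch of a proxy config that opts into it, with an illustrative model entry:

```yaml
general_settings:
  # Keep serving traffic (and pass /health/readiness) during a DB outage
  allow_requests_on_db_unavailable: true

model_list:
  - model_name: gpt-4o
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY
```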
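The new web search support (#9465) follows OpenAI's request shape. Below is a minimal sketch of what such a request body looks like, assuming litellm passes OpenAI's `web_search_options` parameter through unchanged; the model name and option values are illustrative, not prescriptive:

```python
import json

# Illustrative OpenAI-style chat completion request with web search.
# "web_search_options" / "search_context_size" follow OpenAI's chat
# completions API; whether a given model supports them can be checked
# with litellm.supports_web_search(), added in this release.
request_body = {
    "model": "openai/gpt-4o-search-preview",  # assumed model name
    "messages": [
        {"role": "user", "content": "What changed in LiteLLM v1.65.0?"}
    ],
    "web_search_options": {"search_context_size": "medium"},
}

print(json.dumps(request_body, indent=2))
```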
New Contributors
- @luisegarduno made their first contribution in #9464
- @Cmancuso made their first contribution in #9589
Full Changelog: v1.63.14-stable.patch1...v1.65.0-stable
Docker Run LiteLLM Proxy
```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:litellm_stable_release_branch-v1.65.0-stable
```
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|---|---|---|---|---|---|---|---|---|---|
| /chat/completions | Passed ✅ | 200.0 | 233.43 | 6.21 | 0.0 | 1858 | 0 | 180.18 | 4614.82 |
| Aggregated | Passed ✅ | 200.0 | 233.43 | 6.21 | 0.0 | 1858 | 0 | 180.18 | 4614.82 |