What's Changed
- Litellm fix db testing by @krrishdholakia in #9593
- Litellm new UI build by @krrishdholakia in #9601
- Support max_completion_tokens on Mistral by @Cmancuso in #9589
- Revert "Support max_completion_tokens on Mistral" by @krrishdholakia in #9604
- fix(mistral_chat_transformation.py): add missing comma by @krrishdholakia in #9606
- Support discovering gemini, anthropic, xai models by calling their
/v1/model
endpoint by @krrishdholakia in #9530 - Connect UI to "LiteLLM_DailyUserSpend" spend table - enables usage tab to work at 1m+ spend logs by @krrishdholakia in #9603
- Update README.md by @krrishdholakia in #9616
- Add support to Vertex AI transformation for anyOf union type with null fields by @NickGrab in #9618
- fix(proxy_server.py): get master key from environment, if not set in … by @krrishdholakia in #9617
- fix(logging): add json formatting for uncaught exceptions (#9615) by @krrishdholakia in #9619
- fix: wrong indentation of ttlSecondsAfterFinished in chart by @Dbzman in #9611
- Fix anthropic thinking + response_format by @krrishdholakia in #9594
- Add support to Vertex AI transformation for anyOf union type with null fields by @NickGrab in #9625
- fix(openrouter/chat/transformation.py): raise informative message for openrouter key error by @krrishdholakia in #9626
- [Reliability] - Reduce DB Deadlocks by storing spend updates in Redis and then committing to DB by @ishaan-jaff in #9608
- Add bedrock latency optimized inference support + Vertex AI Multimodal embedding cost tracking by @krrishdholakia in #9623
- build(pyproject.toml): add new dev dependencies - for type checking by @krrishdholakia in #9631
- install prisma migration files - connects litellm proxy to litellm's prisma migration files by @krrishdholakia in #9637
- update docs for openwebui by @tan-yong-sheng in #9636
- Add gemini audio input support + handle special tokens in sagemaker response by @krrishdholakia in #9640
- [Docs - Release notes v0] v1.65.0-stable by @ishaan-jaff in #9643
- [Feat] - MCP improvements, add support for using SSE MCP servers by @ishaan-jaff in #9642
- [FIX] - Add password to sync sentinel client by @jmarshall-medallia in #9622
- fix: Anthropic prompt caching on GCP Vertex AI by @sammcj in #9605
- Fixes Databricks llama3.3-70b endpoint and add databricks claude 3.7 sonnet endpoint by @anton164 in #9661
- fix(docs): update xAI Grok vision model reference by @colesmcintosh in #9286
- docs(gemini): fix typo by @GabrielLoiseau in #9581
- Update all_caches.md by @KPCOFGS in #9562
- [Bug fix] - Sagemaker endpoint with inference component streaming by @ishaan-jaff in #9515
- Revert "Correct Databricks llama3.3-70b endpoint and add databricks c… by @krrishdholakia in #9668
- Revert "fix: Anthropic prompt caching on GCP Vertex AI" by @krrishdholakia in #9670
- [Refactor] - Expose litellm.messages.acreate() and litellm.messages.create() to make LLM API calls in Anthropic API spec by @ishaan-jaff in #9567
New Contributors
- @Cmancuso made their first contribution in #9589
- @Dbzman made their first contribution in #9611
- @tan-yong-sheng made their first contribution in #9636
- @jmarshall-medallia made their first contribution in #9622
- @GabrielLoiseau made their first contribution in #9581
- @KPCOFGS made their first contribution in #9562
Full Changelog: v1.64.1.dev1...v1.65.1-nightly
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.65.1-nightly
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 220.0 | 261.03979166611845 | 6.112143157921839 | 0.0 | 1827 | 0 | 196.8891020000001 | 5075.201525000011 |
Aggregated | Passed ✅ | 220.0 | 261.03979166611845 | 6.112143157921839 | 0.0 | 1827 | 0 | 196.8891020000001 | 5075.201525000011 |