## What's Changed
- Fix Azure tenant ID check from env var + `response_format` check on api_version 2025+ by @krrishdholakia in #9993
- Add `/vllm` and `/mistral` passthrough endpoints by @krrishdholakia in #10002
- CI/CD fix mock tests by @ishaan-jaff in #10003
- Setting `litellm.modify_params` via environment variables by @Eoous in #9964 (see the sketch after this list)
- Support checking provider `/models` endpoints on proxy `/v1/models` endpoint by @krrishdholakia in #9958
- Update AWS Bedrock regions by @Schnitzel in #9430
- Fix case where only system messages are passed to Gemini by @NolanTrem in #9992
- Revert "Fix case where only system messages are passed to Gemini" by @krrishdholakia in #10027
- chore(docs): Update logging.md by @mrlorentx in #10006
- build(deps): bump @babel/runtime from 7.23.9 to 7.27.0 in /ui/litellm-dashboard by @dependabot in #10001
- Fix typo: Entrata -> Entra in code by @msabramo in #9922
- Retain schema field ordering for google gemini and vertex by @adrianlyjak in #9828
- Revert "Retain schema field ordering for google gemini and vertex" by @krrishdholakia in #10038
- Add aggregate team based usage logging by @krrishdholakia in #10039
- [UI Polish] UI fixes for cache control injection settings by @ishaan-jaff in #10031
- [UI] Bug Fix - Show created_at and updated_at for Users Page by @ishaan-jaff in #10033
- [Feat - Cost Tracking improvement] Track prompt caching metrics in DailyUserSpendTransactions by @ishaan-jaff in #10029
- Fix gcs pub sub logging with env var GCS_PROJECT_ID by @krrishdholakia in #10042
- Add property ordering for vertex ai schema (#9828) + Fix combining multiple tool calls by @krrishdholakia in #10040
- [Docs] Auto prompt caching by @ishaan-jaff in #10044
- Add litellm call id passing to Aim guardrails on pre and post-hooks calls by @hxmichael in #10021
- `/utils/token_counter`: get `model_info` from deployment directly by @chaofuyang in #10047
- [Bug Fix] Azure Blob Storage fixes by @ishaan-jaff in #10059
- build(deps): bump http-proxy-middleware from 2.0.7 to 2.0.9 in /docs/my-website by @dependabot in #10064
- fix(stream_chunk_builder_utils.py): don't set index on modelresponse by @krrishdholakia in #10063
- fix(llm_http_handler.py): fix fake streaming by @krrishdholakia in #10061
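
For the `litellm.modify_params` item above, a minimal sketch of the environment-variable approach. The variable name `LITELLM_MODIFY_PARAMS` is an assumption here; confirm the exact name in #9964:

```shell
# Hedged sketch: enable modify_params without touching code. The variable name
# LITELLM_MODIFY_PARAMS is assumed -- see #9964 for the exact spelling.
export LITELLM_MODIFY_PARAMS=True

# The same flag can be passed to the proxy container alongside the env vars
# shown in the Docker section below:
#   docker run -e LITELLM_MODIFY_PARAMS=True ...
```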
## New Contributors
- @Eoous made their first contribution in #9964
- @mrlorentx made their first contribution in #10006
- @hxmichael made their first contribution in #10021
- @chaofuyang made their first contribution in #10047
**Full Changelog**: v1.66.1-nightly...v1.66.2-nightly
## Docker Run LiteLLM Proxy

```shell
docker run \
    -e STORE_MODEL_IN_DB=True \
    -p 4000:4000 \
    ghcr.io/berriai/litellm:main-v1.66.2-nightly
```
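
Once the container is up, a quick smoke test lists the models the proxy exposes via `/v1/models` (the endpoint touched by #9958). The `sk-1234` key is a placeholder for whatever master key you configure:

```shell
# Query the proxy's OpenAI-compatible model list; replace sk-1234 with your key.
curl http://localhost:4000/v1/models \
  -H "Authorization: Bearer sk-1234"
```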
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
## Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Passed ✅ | 190.0 | 244.46 | 6.14 | 0.0 | 1835 | 0 | 169.77 | 8723.87 |
| Aggregated | Passed ✅ | 190.0 | 244.46 | 6.14 | 0.0 | 1835 | 0 | 169.77 | 8723.87 |