## What's Changed
- [Docs] v1.72.2.rc by @ishaan-jaff in #11519
- Support env var vertex credentials for passthrough + ignore space id on watsonx deployment (throws Json validation errors) by @krrishdholakia in #11527
- Ensure consistent 'created' across all chunks + set tool call id for ollama streaming calls by @krrishdholakia in #11528
- Update enduser spend and budget reset date based on budget duration by @laurien16 in #8460
- feat: add .cursor to .gitignore by @colesmcintosh in #11538
- Add gpt-4o-audio-preview-2025-06-03 pricing configuration by @colesmcintosh in #11560
- [Docs] Fix incorrect reference to database_url as master_key by @fengbohello in #11547
- Update documentation for configuring web search options in config.yaml by @colesmcintosh in #11537
- [Bug fix]: aiohttp fixes for transfer encoding error on aiohttp transport by @ishaan-jaff in #11561
- [Feat] Add `reasoning_effort` support for perplexity models by @ishaan-jaff in #11562
- Make all commands show server URL by @msabramo in #10801
- Simplify `management_cli.md` CLI docs by @msabramo in #10799
- Fix: Adds support for choosing the default region based on where the model is available by @ishaan-jaff in #11566
- [Feat] Add Lasso Guardrail to LiteLLM by @ishaan-jaff in #11565
- Fix gemini tool call indexes by @lowjiansheng in #11558
- Show remaining users on UI + prevent early stream stopping for gemini requests by @krrishdholakia in #11568
- Add VertexAI `claude-opus-4` + Assign users to orgs on creation by @krrishdholakia in #11572
- Pangea/kl/udpate readme by @lapinek in #11570
- Update README.md so docker compose will work as described by @yanwork in #11586
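As a sketch of the `reasoning_effort` support added for Perplexity models in #11562: a request can pass the parameter alongside the usual chat-completions fields. The model name and field values below are illustrative, not taken from the release.

```python
# Hypothetical payload a client might send to LiteLLM's /chat/completions
# endpoint to request reasoning effort from a Perplexity model.
# "perplexity/sonar-reasoning" and the effort value are placeholder choices.
payload = {
    "model": "perplexity/sonar-reasoning",
    "messages": [{"role": "user", "content": "Why is the sky blue?"}],
    "reasoning_effort": "high",  # typically one of "low", "medium", "high"
}
```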
## New Contributors
- @laurien16 made their first contribution in #8460
- @fengbohello made their first contribution in #11547
- @lapinek made their first contribution in #11570
- @yanwork made their first contribution in #11586
**Full Changelog**: v1.72.2.rc...v1.72.3-nightly
## Docker Run LiteLLM Proxy
```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.72.3-nightly
```
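Once the container is up, the proxy exposes an OpenAI-compatible API on port 4000. A minimal sketch of building such a request with the standard library, assuming a placeholder key (`sk-1234`) and model name (`gpt-4o`):

```python
import json
import urllib.request

# Build (but do not send) an OpenAI-compatible chat-completions request
# against the proxy started by the docker command above.
req = urllib.request.Request(
    url="http://localhost:4000/chat/completions",
    data=json.dumps({
        "model": "gpt-4o",
        "messages": [{"role": "user", "content": "hello"}],
    }).encode(),
    headers={
        "Authorization": "Bearer sk-1234",  # placeholder API key
        "Content-Type": "application/json",
    },
)
# With the proxy running, urllib.request.urlopen(req) would execute the call.
```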
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
## Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Passed ✅ | 230.0 | 244.23 | 6.21 | 0.0 | 1859 | 0 | 195.40 | 1308.96 |
| Aggregated | Passed ✅ | 230.0 | 244.23 | 6.21 | 0.0 | 1859 | 0 | 195.40 | 1308.96 |