🔥 We're launching filtering LLMs by provider and max_tokens on https://models.litellm.ai 👉 View cost and max_tokens for 200+ LLMs (@LiteLLM)
🔭 [Feat] - log writing BatchSpendUpdate events on OTEL
🔑 Proxy Enterprise - security - check max request size
🛡️ [Feat Enterprise] - check max response size
✅ Feat Enterprise - set max request / response size UI
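For reference, a minimal proxy config sketch for the new request/response size limits. This is a sketch only: it assumes the Enterprise `general_settings` keys `max_request_size_mb` and `max_response_size_mb`, and the key/value choices below are placeholders — check the Enterprise docs for your version.

```yaml
# config.yaml -- sketch only; key names assume the Enterprise size-limit feature
general_settings:
  master_key: sk-1234          # placeholder proxy master key
  max_request_size_mb: 100     # reject incoming request bodies larger than 100 MB
  max_response_size_mb: 100    # reject model responses larger than 100 MB
```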
What's Changed
- feat(ollama_chat.py): support ollama tool calling by @krrishdholakia in #4918 (see the sketch after this list)
- fix(proxy_server.py): fix get secret for environment_variables by @krrishdholakia in #4907
- Fix Datadog JSON serialization by @idris in #4920
- [Fix] using airgapped license for Enterprise by @ishaan-jaff in #4921
- [Feat] - log writing BatchSpendUpdate events on OTEL by @ishaan-jaff in #4924
- Fix Canary error with `docusaurus start` by @yujonglee in #4919
- [Feature]: Allow using custom and on-demand models in Fireworks AI + update data to model_prices_and_context_window.json by @danielbichuetti in #4730
- Proxy Enterprise - security - check max request size by @ishaan-jaff in #4926
- [Feat Enterprise] - check max response size by @ishaan-jaff in #4928
- Feat Enterprise - set max request / response size UI by @ishaan-jaff in #4927
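As referenced above, a short sketch of the new ollama tool calling (#4918) from the Python SDK. The model name and tool schema are illustrative; this assumes a local ollama server with the model already pulled.

```python
import litellm

# Hypothetical tool schema (OpenAI function-calling format).
tools = [{
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = litellm.completion(
    model="ollama_chat/llama3",  # routes through the updated ollama_chat.py
    messages=[{"role": "user", "content": "What's the weather in SF?"}],
    tools=tools,
)
print(response.choices[0].message.tool_calls)
```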
Full Changelog: v1.42.4...v1.42.5
Docker Run LiteLLM Proxy
```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.42.5
```
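Once the container is up, a quick smoke test with the OpenAI Python SDK pointed at the proxy. The API key and model name below are placeholders; use a key and model configured on your proxy.

```python
import openai

# Point the standard OpenAI client at the local LiteLLM proxy.
client = openai.OpenAI(
    api_key="sk-1234",                 # placeholder; use your proxy key
    base_url="http://localhost:4000",  # the port published by docker run
)

response = client.chat.completions.create(
    model="gpt-3.5-turbo",  # placeholder; any model configured on the proxy
    messages=[{"role": "user", "content": "Hello from the LiteLLM proxy!"}],
)
print(response.choices[0].message.content)
```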
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Passed ✅ | 130.0 | 149.08 | 6.35 | 0.0 | 1901 | 0 | 107.80 | 1698.27 |
| Aggregated | Passed ✅ | 130.0 | 149.08 | 6.35 | 0.0 | 1901 | 0 | 107.80 | 1698.27 |