## What's Changed
- Internal User Endpoint - vulnerability fix + response type fix by @krrishdholakia in #8228
- Litellm UI fixes 8123 v2 (#8208) by @ishaan-jaff in #8245
- Update model_prices_and_context_window.json by @superpoussin22 in #8249
- Update model_prices_and_context_window.json by @superpoussin22 in #8256
- (dependency) - pip loosen httpx version requirement by @ishaan-jaff in #8255
- Add hyperbolic deepseek v3 model configurations by @lowjiansheng in #8232
- fix(prometheus.py): fix setting key budget metrics by @krrishdholakia in #8234
- (feat) - add supports tool choice to model cost map by @ishaan-jaff in #8265
- (feat) - track org_id in SpendLogs by @ishaan-jaff in #8253
- (Bug fix) - Langfuse / Callback settings stored in DB by @ishaan-jaff in #8251
- Fix passing top_k parameter for Bedrock Anthropic models (#8131) by @ishaan-jaff in #8269 (see the sketch after this list)
- (Feat) - Add support for structured output on `bedrock/nova` models + add util `litellm.supports_tool_choice` by @ishaan-jaff in #8264 (see the sketch after this list)
- [BETA] Support OIDC `role` based access to proxy by @krrishdholakia in #8260
- Fix deepseek calling - refactor to use base_llm_http_handler by @krrishdholakia in #8266
- allows dynamic message redaction by @krrishdholakia in #8270
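
For the Bedrock `top_k` fix in #8269, a minimal sketch of passing the parameter through `litellm.completion`; the model ID and `top_k` value below are illustrative, not taken from the PR:

```python
import litellm

# Per #8269, top_k is now forwarded correctly for Bedrock Anthropic
# models. The model ID and top_k value here are illustrative.
response = litellm.completion(
    model="bedrock/anthropic.claude-3-sonnet-20240229-v1:0",
    messages=[{"role": "user", "content": "Name three primary colors."}],
    top_k=10,
)
print(response.choices[0].message.content)
```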
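And a quick sketch of the new `litellm.supports_tool_choice` util from #8264. The exact signature is assumed here to mirror litellm's other `supports_*` helpers, and the model name is a placeholder:

```python
import litellm

# New util from #8264: checks the model cost map for tool_choice
# support before sending a tool_choice argument. Signature assumed
# to mirror litellm's other supports_* helpers.
model = "bedrock/us.amazon.nova-pro-v1:0"  # illustrative model name
if litellm.supports_tool_choice(model=model):
    print(f"{model} supports tool_choice")
else:
    print(f"{model} does not support tool_choice")
```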
Full Changelog: v1.60.2...v1.60.2-dev1
## Docker Run LiteLLM Proxy
```shell
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.60.2-dev1
```
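
Once the container is running, a minimal smoke test with the OpenAI Python client; the `api_key` and model name below are placeholders, so use your proxy's master key and a model configured on it:

```python
from openai import OpenAI

# Point an OpenAI-compatible client at the local proxy. The api_key is
# whatever master key you configured; the model name must match one
# configured on your proxy (both are placeholders here).
client = OpenAI(base_url="http://localhost:4000", api_key="sk-1234")
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "hello"}],
)
print(response.choices[0].message.content)
```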
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
## Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|---|---|---|---|---|---|---|---|---|---|
| /chat/completions | Passed ✅ | 220.0 | 251.01715871561 | 6.124144736848293 | 0.0 | 1832 | 0 | 171.2837300000274 | 3691.155395999999 |
| Aggregated | Passed ✅ | 220.0 | 251.01715871561 | 6.124144736848293 | 0.0 | 1832 | 0 | 171.2837300000274 | 3691.155395999999 |