What's Changed
- fix(utils.py): drop response_format if 'drop_params=True' for gpt-4 by @krrishdholakia in #3724
- fix(vertex_ai.py): support passing in result of tool call to vertex by @krrishdholakia in #3729
- feat(proxy_cli.py): support json logs on proxy by @krrishdholakia in #3737
Full Changelog: v1.37.16...v1.37.17
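The `drop_params=True` fix above means unsupported request parameters (here, `response_format` for gpt-4) are silently stripped instead of causing a provider error. A minimal sketch of that behavior follows; the helper name and the unsupported-params table are illustrative assumptions, not litellm's actual internals:

```python
# Hypothetical sketch of what `drop_params=True` does: strip parameters a
# given model does not support before the request is sent to the provider.

# Assumption for illustration: plain gpt-4 rejects `response_format`.
UNSUPPORTED_PARAMS = {
    "gpt-4": {"response_format"},
}

def drop_unsupported_params(model: str, params: dict, drop_params: bool) -> dict:
    """Return a copy of `params` without keys the model can't handle."""
    if not drop_params:
        return dict(params)
    blocked = UNSUPPORTED_PARAMS.get(model, set())
    return {k: v for k, v in params.items() if k not in blocked}

request = {"temperature": 0.2, "response_format": {"type": "json_object"}}
cleaned = drop_unsupported_params("gpt-4", request, drop_params=True)
print(cleaned)  # `response_format` has been dropped; `temperature` survives
```

With `drop_params=False`, the same request would be passed through unchanged and the provider would reject it.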
Docker Run LiteLLM Proxy
```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.37.17
```
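If you prefer Docker Compose, this is an equivalent sketch of the `docker run` command above (the service name is an assumption; image, env var, and port come from the command):

```yaml
services:
  litellm:
    image: ghcr.io/berriai/litellm:main-v1.37.17
    environment:
      - STORE_MODEL_IN_DB=True
    ports:
      - "4000:4000"
```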
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|---|---|---|---|---|---|---|---|---|---|
| /chat/completions | Failed ❌ | 19 | 21.89 | 1.61 | 1.61 | 481 | 481 | 17.35 | 105.27 |
| /health/liveliness | Failed ❌ | 18 | 22.19 | 15.68 | 15.68 | 4694 | 4694 | 16.80 | 1270.15 |
| /health/readiness | Failed ❌ | 18 | 22.52 | 15.69 | 15.69 | 4696 | 4696 | 16.94 | 1206.00 |
| Aggregated | Failed ❌ | 18 | 22.33 | 32.98 | 32.98 | 9871 | 9871 | 16.80 | 1270.15 |