What's Changed
- Revert "* feat(factory.py): add support for merging consecutive messages of one role when separated with empty message of another role" by @krrishdholakia in #3518
- Edit cost per input + cost per output token on UI by @krrishdholakia in #3512
- Pydantic warning conflict with protected namespace by @CyanideByte in #3519
- [Feat] send alert on cooling down deployment by @ishaan-jaff in #3532
- Add
/engines/{model}/chat/completions
endpoint by @msabramo in #3437 - feat(proxy_server.py): return litellm version in response headers by @krrishdholakia in #3535
New Contributors
Full Changelog: v1.36.2-stable...v1.36.3
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 81 | 84.12845592706023 | 1.4197437828442634 | 0.0 | 425 | 0 | 75.00172599998223 | 276.58217700002297 |
/health/liveliness | Passed ✅ | 65 | 68.02378255285612 | 15.263080808977504 | 0.003340573606692384 | 4569 | 1 | 63.314151000042784 | 1318.427387000014 |
/health/readiness | Passed ✅ | 65 | 67.54219389117722 | 15.410066047671968 | 0.003340573606692384 | 4613 | 1 | 63.400273000013385 | 1442.6363040000183 |
Aggregated | Passed ✅ | 65 | 68.50498560143636 | 32.09289063949374 | 0.006681147213384768 | 9607 | 2 | 63.314151000042784 | 1442.6363040000183 |