🚨 Alpha - 1.58.0 has various perf improvements; we recommend waiting for a stable release before bumping in production
## What's Changed
- (core sdk fix) - fix fallbacks stuck in infinite loop by @ishaan-jaff in #7751
- [Bug fix]: v1.58.0 - issue with read request body by @ishaan-jaff in #7753
- (litellm SDK perf improvements) - handle cases when unable to lookup model in model cost map by @ishaan-jaff in #7750
- (prometheus - minor bug fix) - `litellm_llm_api_time_to_first_token_metric` not populating for bedrock models by @ishaan-jaff in #7740
- (fix) health check - allow setting `health_check_model` by @ishaan-jaff in #7752 (see the sketch after this list)
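
One way to exercise the health check fix is to call the proxy's `/health` endpoint once the proxy is up (see the Docker command below). A minimal sketch, assuming the proxy is listening locally on port 4000 and `sk-1234` is your master key (both are placeholder values):

```shell
# Query the proxy's health endpoint; it runs a health check
# against each configured model and reports healthy/unhealthy endpoints.
curl -s http://localhost:4000/health \
  -H "Authorization: Bearer sk-1234"
```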
Full Changelog: v1.58.0...v1.58.1
## Docker Run LiteLLM Proxy
```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.58.1
```
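
Once the container is up, you can smoke-test it with an OpenAI-style chat completion request. A minimal sketch, assuming a model named `gpt-4o` is configured on the proxy and `sk-1234` is your master key (both are placeholder values):

```shell
# Send a single chat completion through the proxy on port 4000,
# using the same /chat/completions route as the load test below.
curl -s http://localhost:4000/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-1234" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
```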
Don't want to maintain your internal proxy? Get in touch:
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
## Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Passed ✅ | 250.0 | 294.30 | 6.05 | 0.0 | 1809 | 0 | 223.72 | 3539.42 |
| Aggregated | Passed ✅ | 250.0 | 294.30 | 6.05 | 0.0 | 1809 | 0 | 223.72 | 3539.42 |