What's Changed
- feat(prometheus_api.py): support querying prometheus metrics for all-up + key-level spend on UI by @krrishdholakia in #5782
- [Fix-Bedrock] use Bedrock converse for
"meta.llama3-8b-instruct-v1:0", "meta.llama3-70b-instruct-v1:0"
by @ishaan-jaff in #5729 - [Feat] add Groq gemma2 9b pricing by @ishaan-jaff in #5788
- LiteLLM Minor Fixes & Improvements (09/18/2024) by @krrishdholakia in #5772
- [Feat] Add Azure gpt-35-turbo-0301 pricing by @ishaan-jaff in #5790
- test: replace gpt-3.5-turbo-0613 (deprecated model) by @krrishdholakia in #5794
- [Chore-Docs] fix curl on /get team info swagger by @ishaan-jaff in #5792
Full Changelog: v1.46.6...v1.46.7
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.46.7
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 150.0 | 168.9139116122553 | 6.325020266340649 | 0.0 | 1893 | 0 | 116.5782520000107 | 1552.0026590000384 |
Aggregated | Passed ✅ | 150.0 | 168.9139116122553 | 6.325020266340649 | 0.0 | 1893 | 0 | 116.5782520000107 | 1552.0026590000384 |