What's Changed
- [Bug Fix] Use OpenAI Tool Response Spec When Converting To Gemini/VertexAI Tool Response by @andrewmjc in #4522
- feat - show key alias on prometheus metrics by @ishaan-jaff in #4545
- Deepseek coder now has 128k context by @paul-gauthier in #4541
- Cohere tool calling fix by @krrishdholakia in #4546
- fix: Include vertex_ai_beta in vertex_ai param mapping/Do not use google auth project_id by @t968914 in #4461
- [Fix] Invite Links / Onboarding flow on admin ui by @ishaan-jaff in #4548
- feat - allow looking up model_id on `/model/info` by @ishaan-jaff in #4547
- feat(internal_user_endpoints.py): expose `/user/delete` endpoint by @krrishdholakia in #4386
- Return output_vector_size in get_model_info by @tomusher in #4279
- [Feat] Add Groq/whisper-large-v3 by @ishaan-jaff in #4549
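
The `/model/info` lookup added in #4547 can be sketched as a small client helper. This is a hypothetical sketch, not code from the release: the query parameter name `litellm_model_id`, the base URL, and the API key are all assumptions for illustration.

```python
# Hypothetical sketch of looking up a single model on the proxy's
# /model/info endpoint (new in this release). The query parameter name
# and auth header are assumptions, not confirmed against LiteLLM source.
from urllib.parse import urlencode


def model_info_url(base_url: str, model_id: str) -> str:
    """Build the /model/info lookup URL for a given model_id
    (parameter name is an assumed placeholder)."""
    query = urlencode({"litellm_model_id": model_id})
    return f"{base_url.rstrip('/')}/model/info?{query}"


url = model_info_url("http://localhost:4000", "my-model-id")
print(url)

# To query a running proxy (requires the `requests` package):
# import requests
# info = requests.get(url, headers={"Authorization": "Bearer sk-1234"}).json()
```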
New Contributors
- @andrewmjc made their first contribution in #4522
- @t968914 made their first contribution in #4461
- @tomusher made their first contribution in #4279
Full Changelog: v1.41.6...v1.41.7
Docker Run LiteLLM Proxy
```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.41.7
```
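
Once the container is up, a quick smoke test can be sketched as below. This is an illustrative sketch only: the model name, port, and API key are placeholder assumptions; the `/chat/completions` path follows the proxy's OpenAI-compatible route.

```python
# Hypothetical smoke test against a locally running LiteLLM proxy.
# "gpt-3.5-turbo" and the bearer token are placeholders for whatever
# you actually configured on the proxy.
import json


def build_chat_request(model: str, prompt: str) -> dict:
    """Assemble an OpenAI-style chat completion payload."""
    return {"model": model, "messages": [{"role": "user", "content": prompt}]}


payload = build_chat_request("gpt-3.5-turbo", "Hello!")
print(json.dumps(payload))

# To send it (requires the `requests` package and a running proxy):
# import requests
# resp = requests.post("http://localhost:4000/chat/completions",
#                      json=payload,
#                      headers={"Authorization": "Bearer sk-1234"})
```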
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|---|---|---|---|---|---|---|---|---|---|
| /chat/completions | Passed ✅ | 130.0 | 152.07 | 6.42 | 0.0 | 1921 | 0 | 111.6 | 1678.76 |
| Aggregated | Passed ✅ | 130.0 | 152.07 | 6.42 | 0.0 | 1921 | 0 | 111.6 | 1678.76 |