What's Changed
- feat(vertex_ai_partner.py): Vertex AI Mistral Support by @krrishdholakia in #4925
- Support vertex mistral cost tracking by @krrishdholakia in #4929
- [Feat-Proxy] - Langfuse log /audio/transcription on langfuse by @ishaan-jaff in #4939
- Fix: #4942. Remove verbose logging when exception can be handled by @dleen in #4943
- fixes: #4947 Bedrock context exception does not have a response by @dleen in #4948
- [Feat] Bedrock add support for Bedrock Guardrails by @ishaan-jaff in #4946
- build(deps): bump fast-xml-parser from 4.3.2 to 4.4.1 in /docs/my-website by @dependabot in #4950
- ui - allow entering custom model names for all all provider (azure ai, openai, etc) by @ishaan-jaff in #4951
- Fix bug in cohere_chat.py by @pat-cohere in #4949
- Feat UI - allow using custom header for litellm api key by @ishaan-jaff in #4916
- [Feat] Add
litellm.create_fine_tuning_job()
,litellm.list_fine_tuning_jobs()
,litellm.cancel_fine_tuning_job()
finetuning endpoints by @ishaan-jaff in #4956 - [Feature]: GET /v1/batches to return list of batches by @ishaan-jaff in #4969
- [Fix-Proxy] ProxyException code as str - Make OpenAI Compatible by @ishaan-jaff in #4973
- Proxy Admin UI - switch off console logs in production mode by @ishaan-jaff in #4975
- feat(huggingface_restapi.py): Support multiple hf embedding types + async hf embeddings by @krrishdholakia in #4976
- fix(cohere.py): support async cohere embedding calls by @krrishdholakia in #4977
- fix(utils.py): fix model registeration to model cost map by @krrishdholakia in #4979
New Contributors
- @pat-cohere made their first contribution in #4949
Full Changelog: v1.42.5...v1.42.6
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.42.6
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 130.0 | 150.6083251086047 | 6.375223413611649 | 0.0 | 1906 | 0 | 105.08289299997386 | 1346.7240439999841 |
Aggregated | Passed ✅ | 130.0 | 150.6083251086047 | 6.375223413611649 | 0.0 | 1906 | 0 | 105.08289299997386 | 1346.7240439999841 |