What's Changed
- Log 'api_base' on spend logs by @krrishdholakia in #9509
- [Fix] Use StandardLoggingPayload for GCS Pub Sub Logging Integration by @ishaan-jaff in #9508
- [Feat] Support for exposing MCP tools on litellm proxy by @ishaan-jaff in #9426
- fix(invoke_handler.py): remove hard coded final usage chunk on bedrock streaming usage by @krrishdholakia in #9512
- Add vertexai topLogprobs support by @krrishdholakia in #9518
- Update model_prices_and_context_window.json by @superpoussin22 in #9459
- fix vertex ai multimodal embedding translation by @krrishdholakia in #9471
- ci(publish-migrations.yml): add action for publishing prisma db migrations by @krrishdholakia in #9537
- [Feat - New Model] Add VertexAI `gemini-2.0-flash-lite` and Google AI Studio `gemini-2.0-flash-lite` by @ishaan-jaff in #9523
- Support `litellm.api_base` for vertex_ai + gemini/ across completion, embedding, image_generation by @krrishdholakia in #9516 (see the sketch below this list)
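To illustrate the `litellm.api_base` change from #9516, here is a minimal sketch of routing Gemini traffic through a custom base URL; the gateway URL is a placeholder, not something shipped in this release:

```python
# Minimal sketch: litellm.api_base is now honored for vertex_ai + gemini/ routes.
# The gateway URL below is a placeholder; substitute your own endpoint.
import litellm

litellm.api_base = "https://my-gateway.example.com"

response = litellm.completion(
    model="gemini/gemini-2.0-flash-lite",  # model added in #9523
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)
```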
Full Changelog: 1.64.0.dev1...v1.64.1-nightly
Docker Run LiteLLM Proxy

```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.64.1-nightly
```
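Once the container is up, the proxy speaks the OpenAI API. A minimal sketch of calling it with the OpenAI Python SDK, assuming a model named "gpt-4o" has been configured on the proxy and "sk-1234" stands in for your actual key:

```python
# Minimal sketch: query the running proxy with the OpenAI SDK.
# Model name and API key are placeholders for whatever you configured.
import openai

client = openai.OpenAI(
    base_url="http://0.0.0.0:4000",  # port published by -p 4000:4000 above
    api_key="sk-1234",
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello from the proxy!"}],
)
print(response.choices[0].message.content)
```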
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms)
--- | --- | --- | --- | --- | --- | --- | --- | --- | ---
/chat/completions | Failed ❌ | 530.0 | 583.1 | 5.64 | 0.0 | 1687 | 0 | 483.31 | 5048.28
Aggregated | Failed ❌ | 530.0 | 583.1 | 5.64 | 0.0 | 1687 | 0 | 483.31 | 5048.28