github BerriAI/litellm v1.43.18-stable

What's Changed

  • [Feat] return x-litellm-key-remaining-requests-{model}: 1, x-litellm-key-remaining-tokens-{model}: None in response headers by @ishaan-jaff in #5259
  • [Feat] - Set tpm/rpm limits per Virtual Key + Model by @ishaan-jaff in #5256
  • [Feat] add prometheus metric for remaining rpm/tpm limit for (model, api_key) by @ishaan-jaff in #5257
  • [Feat] read model + API key tpm/rpm limits from db by @ishaan-jaff in #5258
  • Pass-through endpoints for Gemini - Google AI Studio by @krrishdholakia in #5260
  • Fix incorrect message length check in cost calculator by @dhlidongming in #5219
  • [PRICING] Use specific llama2 and llama3 model names in Ollama by @kiriloman in #5221
  • [Feat-Proxy] set rpm/tpm limits per api key per model by @ishaan-jaff in #5261
  • Fixes the tool_use indexes not being correctly mapped by @Penagwin in #5232
  • [Feat-Proxy] Use model access groups for teams by @ishaan-jaff in #5263
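The new rate-limit headers follow the pattern `x-litellm-key-remaining-requests-{model}` and `x-litellm-key-remaining-tokens-{model}`, with `None` indicating no configured limit for that dimension. A minimal sketch of how a client might collect them from a response's header mapping (the helper name and sample values are illustrative, not part of LiteLLM):

```python
# Sketch: parse LiteLLM's per-model key-limit headers from a response
# header mapping. The header name patterns come from the release notes;
# the function name and sample values are assumptions for illustration.

def parse_key_limits(headers: dict) -> dict:
    """Return {model: {"remaining_requests": ..., "remaining_tokens": ...}}."""
    prefixes = {
        "x-litellm-key-remaining-requests-": "remaining_requests",
        "x-litellm-key-remaining-tokens-": "remaining_tokens",
    }
    limits: dict = {}
    for name, value in headers.items():
        lname = name.lower()
        for prefix, field in prefixes.items():
            if lname.startswith(prefix):
                model = lname[len(prefix):]
                # "None" means no limit is configured for this dimension
                limits.setdefault(model, {})[field] = (
                    None if value == "None" else int(value)
                )
    return limits

# Example using the header shapes shown in the release notes
headers = {
    "x-litellm-key-remaining-requests-gpt-4": "1",
    "x-litellm-key-remaining-tokens-gpt-4": "None",
}
print(parse_key_limits(headers))
# {'gpt-4': {'remaining_requests': 1, 'remaining_tokens': None}}
```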

Full Changelog: v1.43.17...v1.43.18-stable

Docker Run LiteLLM Proxy

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.43.18-stable

Don't want to maintain your internal proxy? Get in touch 🎉

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Passed ✅ | 89 | 111.0 | 6.51 | 0.0 | 1949 | 0 | 69.99 | 2982.89 |
| Aggregated | Passed ✅ | 89 | 111.0 | 6.51 | 0.0 | 1949 | 0 | 69.99 | 2982.89 |
