BerriAI/litellm v1.46.0

What's Changed

  • [Fix] Performance - use in memory cache when downloading images from a url by @ishaan-jaff in #5657
  • [Feat - Perf Improvement] DataDog Logger 91% lower latency by @ishaan-jaff in #5687
  • (models): Added missing gemini experimental models + fixed pricing for gemini-1.5-pro-exp-0827 by @F1bos in #5693
  • LiteLLM Minor Fixes and Improvements (09/13/2024) by @krrishdholakia in #5689
  • LiteLLM Minor Fixes and Improvements (09/14/2024) by @krrishdholakia in #5697
  • Update model_prices_and_context_window.json by @Ahmet-Dedeler in #5700
  • [Feat] Add max_completion_tokens param by @ishaan-jaff in #5691 (usage sketch after this list)
  • [Feat] Stable Prs - Sep 14th (Sambanova API) by @ishaan-jaff in #5703
  • [Fix] Router cooldown logic - use % failure thresholds instead of a fixed allowed-fails count to cool down deployments by @ishaan-jaff in #5698 (configuration sketch after this list)
  • [Feat-Prometheus] Track exception status on litellm_deployment_failure_responses by @ishaan-jaff in #5706
  • [Feat-Prometheus] Add prometheus metric for tracking cooldown events by @ishaan-jaff in #5705
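
A minimal sketch of the new max_completion_tokens parameter from #5691, assuming OPENAI_API_KEY is set in the environment; the model name is illustrative:

import litellm

# Cap the number of generated tokens with the new parameter (#5691).
# "gpt-4o" is an illustrative model name; any litellm-supported model works.
response = litellm.completion(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Summarize LiteLLM in one sentence."}],
    max_completion_tokens=64,
)
print(response.choices[0].message.content)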
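
For the cooldown change in #5698, a sketch of a Router setup the new %-based logic applies to. The failure-percentage threshold itself is applied internally; the model name is illustrative, and cooldown_time is an existing Router knob for how long a cooled-down deployment sits out:

from litellm import Router

# Deployments behind one alias; with #5698, cooldowns trigger on the
# percentage of failed requests rather than a fixed allowed-fails count.
router = Router(
    model_list=[
        {"model_name": "gpt-4o", "litellm_params": {"model": "openai/gpt-4o"}},
    ],
    cooldown_time=30,  # seconds a cooled-down deployment is skipped
)
response = router.completion(
    model="gpt-4o",
    messages=[{"role": "user", "content": "ping"}],
)
print(response.choices[0].message.content)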

New Contributors

Full Changelog: v1.45.0...v1.46.0

Docker Run LiteLLM Proxy

docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.46.0
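
Once the container is up, the proxy exposes an OpenAI-compatible API on port 4000. A sketch of querying it with the OpenAI SDK; the API key is a placeholder and the model name stands in for whatever is configured on your proxy:

from openai import OpenAI

# Point the OpenAI SDK at the local proxy; "sk-placeholder" stands in for your key.
client = OpenAI(base_url="http://localhost:4000", api_key="sk-placeholder")
response = client.chat.completions.create(
    model="gpt-4o",  # illustrative; use a model configured on the proxy
    messages=[{"role": "user", "content": "Hello from the LiteLLM proxy"}],
)
print(response.choices[0].message.content)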

Don't want to maintain your internal proxy? Get in touch 🎉

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

Name              | Status    | Median (ms) | Average (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min (ms) | Max (ms)
/chat/completions | Failed ❌ | 580.0       | 1015.32      | 5.02       | 0.0        | 1503          | 0             | 71.49    | 12269.32
Aggregated        | Failed ❌ | 580.0       | 1015.32      | 5.02       | 0.0        | 1503          | 0             | 71.49    | 12269.32
