⚡️ LiteLLM Proxy - 100+ LLMs, Track Number of Requests and Avg Latency Per Model Deployment
🛠️ High Traffic Fixes - Fix for DB connection limit hits when model fallbacks occur
🚀 High Traffic Fixes - /embedding - fix for "Dictionary changed size during iteration" bug
⚡️ High Traffic Fixes - Switched off --detailed_debug in the default Dockerfile. Users now need to opt in to view --detailed_debug logs (see the startup example below the highlights). This led to a 5% decrease in avg latency across 1K concurrent calls.
📖 Docs - Fixed /user/new on the LiteLLM Proxy Swagger, showing how to set tpm/rpm limits per user (see the curl example below the highlights): https://docs.litellm.ai/docs/proxy/virtual_keys#usernew
⭐️ Admin UI - separate latency and num-requests graphs per model deployment: https://docs.litellm.ai/docs/proxy/ui
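
Since detailed debug logging is now off by default, re-enabling it requires an explicit opt-in at proxy startup. A minimal sketch (the image tag, port, and config path are illustrative and assume the image's entrypoint is the litellm CLI; adjust to your deployment):

```shell
# Pass --detailed_debug at startup to opt back in to verbose logs.
docker run -v $(pwd)/config.yaml:/app/config.yaml -p 4000:4000 \
  ghcr.io/berriai/litellm:main-latest \
  --config /app/config.yaml \
  --detailed_debug
```

Leaving the flag off keeps the lower-latency default behavior described above.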
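For the /user/new docs fix, a sketch of setting per-user tpm/rpm limits (sk-1234 is a placeholder master key and user-123 a placeholder user id; request shape per the linked docs):

```shell
# Create a user with per-user token-per-minute and request-per-minute limits.
curl -X POST 'http://0.0.0.0:4000/user/new' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{
    "user_id": "user-123",
    "tpm_limit": 1000,
    "rpm_limit": 10
  }'
```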
What's Changed
- (Fix) High Traffic Fix - handle litellm circular ref error by @ishaan-jaff in #2363
- (feat) admin UI show model avg latency, num requests by @ishaan-jaff in #2367
- (fix) admin UI swagger by @ishaan-jaff in #2371
- [FIX] 🐛 embedding - "Dictionary changed size during iteration" Debug Log by @ishaan-jaff in #2378
- [Fix] Switch off detailed_debug in default docker by @ishaan-jaff in #2375
- feat(proxy_server.py): retry if virtual key is rate limited by @krrishdholakia in #2347
- fix(caching.py): add s3 path as a top-level param by @krrishdholakia in #2379
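
For the s3 caching change in #2379, a sketch of a proxy config.yaml that sets the s3 path as a top-level cache param (bucket, region, and path values are placeholders; assumes the s3 cache_params documented for LiteLLM):

```yaml
# config.yaml - illustrative cache settings
litellm_settings:
  cache: true
  cache_params:
    type: s3
    s3_bucket_name: my-cache-bucket
    s3_region_name: us-west-2
    s3_path: litellm-proxy-cache   # prefix for cached objects within the bucket
```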
Full Changelog: v1.29.4...v1.29.7