What's Changed
- feat: Add SDK support for additional headers by @uzaxirr in #14761
- Resolve Model Armor gaps (PDFs, basic) by @TeddyAmkie in #14758
- Fix: Streaming tool call index assignment for multiple tool calls by @timelfrink in #14587
- feat: Support requestMetadata in Bedrock Converse API by @timelfrink in #14570 (usage sketch after this list)
- [Feat] Proxy CLI Auth - Allow re-using CLI auth token by @ishaan-jaff in #14780
- Add W&B Inference to LiteLLM by @xprilion in #14416
- docs(provider_specific_params.md): fix docs by @krrishdholakia in #14787
- feat: enable custom fields in mcp_info configuration by @uzaxirr in #14794
- [Feat] Proxy CLI: Create a Python method to log in using LiteLLM Proxy by @ishaan-jaff in #14782
- Add "openrouter/x-ai/grok-4-fast:free" to model_prices_and_context_window.json by @CH-GAGANRAJ in #14779
- [Feat] Support flux image edit by @eycjur in #14790
- Add service_tier-based pricing support for OpenAI [both Service & Priority support] by @Sameerlite in #14796 (usage sketch after this list)
- Fix vLLM passthrough by @otaviofbrito in #14778
- Doc updates, Sept 2025 by @TeddyAmkie in #14769
- Docs: Update model references from gemini-pro to gemini-2.5-pro by @SmartManoj in #14775
- feat: remove server_name prefix from list_tools by @uc4w6c in #14720
- Fix linting issue by @Sameerlite in #14797
- fix: get metadata info from both metadata and litellm_metadata fields by @luizrennocosta in #14783
- [Fix] Priority Reservation: keys without priority metadata no longer receive higher priority than keys with explicit priority configurations by @ishaan-jaff in #14832
- feat: add xai/grok-4-fast models by @ishaan-jaff in #14833
- fix: cache root cause by @AlexsanderHamir in #14827
- Update Vertex AI Qwen model pricing by @ishaan-jaff in #14828
- docs: Letta Guide by @mubashir1osmani in #14798
- fix: add Oracle to the providers list by @AlexsanderHamir in #14835
- fix: update sonnet 4 configs to reflect million-token context window pricing by @danielmklein in #14639
- feat: Add shared_session parameter for aiohttp ClientSession reuse by @dharamendrak in #14721 (usage sketch after this list)
- Vertex AI Context Caching: use the Vertex AI v1 API instead of v1beta1 and accept the 'cachedContent' param by @otaviofbrito in #14831
- [Feat] Fixes for LiteLLM Proxy CLI auth to the Gateway by @ishaan-jaff in #14836
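For the Bedrock `requestMetadata` change (#14570), a minimal sketch, assuming litellm forwards `requestMetadata` as a provider-specific kwarg to the Converse API; the model ID and tag values are illustrative:

```python
import litellm

# requestMetadata tags the request in Bedrock logs/metrics.
# Assumption: litellm passes this kwarg through to the Converse API.
response = litellm.completion(
    model="bedrock/anthropic.claude-3-sonnet-20240229-v1:0",
    messages=[{"role": "user", "content": "Hello from LiteLLM"}],
    requestMetadata={"team": "search", "environment": "staging"},
)
print(response.choices[0].message.content)
```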
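For the `service_tier` pricing change (#14796), a minimal sketch; `service_tier` is an existing OpenAI request parameter, and reading the computed cost from `response._hidden_params` is an assumption:

```python
import litellm

# service_tier selects OpenAI's processing tier; #14796 makes cost
# tracking tier-aware. Assumption: the computed cost is exposed via
# response._hidden_params["response_cost"].
response = litellm.completion(
    model="openai/gpt-4o",
    messages=[{"role": "user", "content": "ping"}],
    service_tier="priority",
)
print(response._hidden_params.get("response_cost"))
```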
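For the `shared_session` change (#14721), a minimal sketch, assuming `litellm.acompletion` accepts the new `shared_session` kwarg named in the PR title:

```python
import asyncio

import aiohttp
import litellm


async def main() -> None:
    # Reuse one aiohttp ClientSession (and its connection pool) across
    # calls instead of opening a new pool per request. Assumption: the
    # session is passed via the shared_session kwarg from #14721.
    async with aiohttp.ClientSession() as session:
        for prompt in ("first call", "second call"):
            response = await litellm.acompletion(
                model="gpt-4o-mini",
                messages=[{"role": "user", "content": prompt}],
                shared_session=session,
            )
            print(response.choices[0].message.content)


asyncio.run(main())
```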
New Contributors
- @uzaxirr made their first contribution in #14761
- @xprilion made their first contribution in #14416
- @CH-GAGANRAJ made their first contribution in #14779
- @otaviofbrito made their first contribution in #14778
- @danielmklein made their first contribution in #14639
Full Changelog: v1.77.3-nightly...v1.77.4-nightly
Docker Run LiteLLM Proxy
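# Note: STORE_MODEL_IN_DB=True typically also requires a database connection, e.g. -e DATABASE_URL=<your-postgres-url>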
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.77.4-nightly
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|---|---|---|---|---|---|---|---|---|---|
| /chat/completions | Failed ❌ | 100.0 | 126.52 | 6.37 | 6.37 | 1906 | 1906 | 74.63 | 2626.80 |
| Aggregated | Failed ❌ | 100.0 | 126.52 | 6.37 | 6.37 | 1906 | 1906 | 74.63 | 2626.80 |