We're launching Day 0 support for Anthropic Prompt Caching on LiteLLM π Start here: https://docs.litellm.ai/docs/providers/anthropic#prompt-caching
π Cut Costs and latency, use Anthropic prompt caching for the following scenarios:
-
Large Context Cachingβ https://docs.litellm.ai/docs/providers/anthropic#caching---large-context-caching
-
Tools definitions https://docs.litellm.ai/docs/providers/anthropic#caching---tools-definitions
-
Continuing Multi-Turn Convo https://docs.litellm.ai/docs/providers/anthropic#caching---continuing-multi-turn-convo
π οΈ [Fix-Proxy] Allow running docker, docker-database as non-root user (h/t Oz Elhassid)
π [Fix] Prometheus use 'litellm_' prefix for new deployment metrics (h/t Filipe Andujar)
β
[Feat-Proxy] Add failure logging for GCS bucket logging https://docs.litellm.ai/docs/proxy/bucket
What's Changed
- Update prices/context windows for Perplexity Llama 3.1 models by @bachya in #5206
- Allow specifying langfuse project for logging in key metadata by @krrishdholakia in #5176
- vertex_ai/claude-3-5-sonnet@20240620 support prefill by @paul-gauthier in #5203
- Enable follow redirects in ollama_chat by @fabceolin in #5148
- feat(user_api_key_auth.py): support calling langfuse with litellm user_api_key_auth by @krrishdholakia in #5192
- Use
AZURE_API_VERSION
env var as default azure openai version by @msabramo in #5211 - [Feat] Add Anthropic API Prompt Caching Support by @ishaan-jaff in #5210
New Contributors
- @fabceolin made their first contribution in #5148
Full Changelog: v1.43.12...v1.43.13
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.43.13
Don't want to maintain your internal proxy? get in touch π
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed β | 84 | 97.96550381346324 | 6.506952562817539 | 0.0 | 1946 | 0 | 66.09550899997885 | 1639.4581249999192 |
Aggregated | Passed β | 84 | 97.96550381346324 | 6.506952562817539 | 0.0 | 1946 | 0 | 66.09550899997885 | 1639.4581249999192 |