BerriAI/litellm v1.63.2-stable

Full Changelog: v1.61.20-stable...v1.63.2-stable

  1. New Models / Updated Models

    1. Add supports_pdf_input: true for specific Bedrock Claude models 
  2. LLM Translation

    1. Support /openai/ passthrough for Assistant endpoints
    2. Bedrock Claude - fix Amazon Anthropic Claude 3 tool-calling transformation on the invoke route
    3. Bedrock Claude - response_format support for Claude on the invoke route (see the first sketch after this list)
    4. Bedrock - pass description if set in response_format
    5. Bedrock - Fix passing response_format: {"type": "text"}
    6. OpenAI - Handle sending image_url as a str to OpenAI
    7. Deepseek - Fix deepseek 'reasoning_content' error
    8. Caching - Support caching on reasoning content
    9. Bedrock - handle thinking blocks in assistant message
    10. Anthropic - Return signature on Anthropic streaming; migrate to the signature field instead of signature_delta
    11. Support format param for specifying image type
    12. Anthropic - /v1/messages endpoint - thinking param support. Note: this refactors the [BETA] unified /v1/messages endpoint so it works only with the Anthropic API (see the second sketch after this list).
    13. Vertex AI - handle $id in the response schema when calling Vertex AI
  3. Spend Tracking Improvements

    1. Batches API - Fix cost calculation to run on retrieve_batch
    2. Batches API - Log batch models in spend logs / standard logging payload
  4. Management Endpoints / UI

    1. Allow team/org filters to be searchable on the Create Key Page
    2. Add created_by and updated_by fields to Keys table
    3. Show 'user_email' on key table on UI
    4. (Feat) - Show Error Logs on LiteLLM UI
    5. UI - Allow admin to control default model access for internal users
    6. (UI) - Allow Internal Users to View their own logs
    7. (UI) Fix session handling with cookies
    8. Keys Page - Show 100 Keys Per Page, Use full height, increase width of key alias
  5. Logging / Guardrail Integrations

    1. Fix Prometheus metrics with custom metrics
  6. Performance / Loadbalancing / Reliability improvements

    1. Cooldowns - Support cooldowns on models called with client-side credentials
    2. Tag-based Routing - ensures tag-based routing across all endpoints (/embeddings, /image_generation, etc.)
  7. General Proxy Improvements

    1. Raise BadRequestError when an unknown model is passed in a request
    2. Enforce model access restrictions on Azure OpenAI proxy route
    3. Reliability fix - handle emojis in text - fix orjson error
    4. Model Access Patch - don't overwrite litellm.anthropic_models when running auth checks
    5. Enable setting timezone information in docker image
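
For illustration, here is a minimal sketch of the Bedrock invoke-route response_format item from the LLM Translation list above, using the litellm Python SDK. The model ID and AWS setup are placeholders; substitute a Claude model enabled in your account and region.

```python
import litellm

# Hedged sketch: response_format on a Bedrock Claude model routed through the
# invoke API ("bedrock/invoke/..."). The model ID below is illustrative.
response = litellm.completion(
    model="bedrock/invoke/anthropic.claude-3-sonnet-20240229-v1:0",
    messages=[{"role": "user", "content": "Return three prime numbers as JSON."}],
    response_format={"type": "json_object"},
)
print(response.choices[0].message.content)
```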

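Likewise, a hedged sketch of the thinking param on the refactored /v1/messages endpoint, sent to a locally running proxy. The proxy URL, virtual key, and model name are placeholders for whatever you have configured.

```python
import requests

# Assumption: a LiteLLM proxy is listening on localhost:4000, "sk-1234" is a
# valid virtual key, and the model name maps to an Anthropic model in your config.
resp = requests.post(
    "http://localhost:4000/v1/messages",
    headers={"Authorization": "Bearer sk-1234"},
    json={
        "model": "claude-3-7-sonnet-20250219",
        "max_tokens": 2048,
        "thinking": {"type": "enabled", "budget_tokens": 1024},
        "messages": [{"role": "user", "content": "Why is the sky blue?"}],
    },
)
print(resp.json())
```
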
Docker Run LiteLLM Proxy

```shell
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:litellm_stable_release_branch-v1.63.2-stable
```
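
Once the container is up, a quick smoke test with the OpenAI Python SDK pointed at the proxy (the key and model name below are placeholders for whatever you configured):

```python
from openai import OpenAI

# Assumption: the proxy started above is listening on localhost:4000, "sk-1234"
# is your master/virtual key, and "gpt-4o" is a model name configured in the proxy.
client = OpenAI(base_url="http://localhost:4000", api_key="sk-1234")
resp = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "ping"}],
)
print(resp.choices[0].message.content)
```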

Don't want to maintain your internal proxy? Get in touch 🎉

Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

Load Test LiteLLM Proxy Results

| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Passed ✅ | 190.0 | 223.19371836864636 | 6.25209576552295 | 0.0033451555727784642 | 1869 | 1 | 89.92210900004238 | 1948.821826000028 |
| Aggregated | Passed ✅ | 190.0 | 223.19371836864636 | 6.25209576552295 | 0.0033451555727784642 | 1869 | 1 | 89.92210900004238 | 1948.821826000028 |
