What's Changed
- [Refactor - Filtering Spend Logs] Add `status` to root of SpendLogs table by @ishaan-jaff in #10661
- Filter logs on status and model by @NANDINI-star in #10670
- [Refactor] Anthropic /v1/messages endpoint - Refactor to use base llm http handler and transformations by @ishaan-jaff in #10677
- [Feat] Add support for using Bedrock Invoke models in /v1/messages format by @ishaan-jaff in #10681 (see the example request after this list)
- fix(factory.py): Handle system only message to anthropic by @krrishdholakia in #10678
- Realtime API - Set 'headers' in scope for websocket auth requests + reliability fix: infinite loop when model_name not found for realtime models by @krrishdholakia in #10679
- Extract 'thinking' from nova response + Add 'drop_params' support for gpt-image-1 by @krrishdholakia in #10680
- New azure models by @emerzon in #9956
- Add GPTLocalhost to "docs/my-website/docs/projects" by @GPTLocalhost in #10687
- Add nscale support for streaming by @tomukmatthews in #10698
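
#10677 and #10681 above both touch the /v1/messages endpoint. Below is a minimal sketch of calling it through the proxy, assuming the proxy runs on localhost:4000, uses "sk-1234" as its master key, and has a Bedrock Invoke model configured; all three are placeholders, not values shipped with this release.

```shell
# Minimal sketch: Anthropic-style /v1/messages request against a local LiteLLM proxy.
# The port, master key, and model ID below are assumptions -- adjust them to your deployment.
curl http://localhost:4000/v1/messages \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-1234" \
  -d '{
    "model": "bedrock/invoke/anthropic.claude-3-5-sonnet-20240620-v1:0",
    "max_tokens": 256,
    "messages": [{"role": "user", "content": "Hello from /v1/messages"}]
  }'
```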
New Contributors
- @GPTLocalhost made their first contribution in #10687
Full Changelog: v1.68.1.dev4...v1.68.2-nightly
Docker Run LiteLLM Proxy
```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.68.2-nightly
```
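
Once the container is up, the proxy serves OpenAI-compatible routes on port 4000. A minimal sketch of a /chat/completions request follows; the model alias and key are placeholders (a model must be added via the UI or a config before this call succeeds, and the Authorization header only matters if a master key is set).

```shell
# Minimal sketch: OpenAI-format request against the proxy started above.
# "gpt-4o" is a placeholder model alias and "sk-1234" a placeholder key --
# neither is configured by the docker command above.
curl http://localhost:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-1234" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "What changed in v1.68.2?"}]
  }'
```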
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Passed ✅ | 190.0 | 223.08 | 6.21 | 0.0033 | 1858 | 1 | 75.31 | 4978.85 |
| Aggregated | Passed ✅ | 190.0 | 223.08 | 6.21 | 0.0033 | 1858 | 1 | 75.31 | 4978.85 |