## What's Changed
- changing ollama response parsing to expected behaviour by @TheDiscoMole in #1526
- Added cost & context metadata for openrouter/anthropic/claude-3-opus by @paul-gauthier in #3382
- fix - error sending details to log on sentry by @ishaan-jaff in #3384
- [UI] Fix show latency < 0.0001 for deployments that have low latency + only show non cache hits on latency UI by @ishaan-jaff in #3388
- [UI] show slow responses + num requests per deployment by @ishaan-jaff in #3390
- [Fix + Test] Errant prints on langfuse by @ishaan-jaff in #3391
- Add langfuse `sdk_integration` by @marcklingen in #2516
- Update langfuse in `requirements.txt` by @Manouchehri in #3262
- feat(openmeter.py): add support for user billing by @krrishdholakia in #3389
- [chore] Improve type-safety in Message & Delta classes by @elisalimli in #3379
- docs: add .github/pull_request_template.md by @nobu007 in #3349
- Update contributing instructions in README.md by @DomMartin27 in #3217
- build(deps): bump @hono/node-server from 1.9.0 to 1.10.1 in /litellm-js/spend-logs by @dependabot in #3169
- build(deps): bump idna from 3.6 to 3.7 by @dependabot in #2967
- Fix Greenscale Documentation by @greenscale-nandesh in #3278
- [Fix] bug where langfuse was reinitialized on every call by @ishaan-jaff in #3392
- Fix route `/openai/deployments/{model}/chat/completions` not working properly by @msabramo in #3375
- Litellm gh 3372 by @msabramo in #3402
- Vision for Claude 3 Family + Info for Azure/GPT-4-0409 by @azohra in #3405
- Improve mocking in `test_proxy_server.py` by @msabramo in #3406
- Disambiguate invalid model name errors by @msabramo in #3403
- fix - revert init langfuse client on slack alerts by @ishaan-jaff in #3409
- Add Llama3 tokenizer and allow custom tokenizers. by @Priva28 in #3393
- [Fix] Ensure callbacks are not added to router when `store_model_in_db=True` by @ishaan-jaff in #3419
- fix(lowest_latency.py): fix the size of the latency list to 10 by default (can be modified) by @krrishdholakia in #3422
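Several entries above touch token counting — notably the Llama3 tokenizer / custom tokenizer change in #3393. As a rough illustration of the pluggable-tokenizer idea (a hypothetical helper for illustration only, not LiteLLM's actual API, whose exact signature is not shown in these notes), a counter might accept a user-supplied encode function and fall back to a naive split otherwise:

```python
from typing import Callable, List, Optional

def token_counter(text: str, encode: Optional[Callable[[str], List]] = None) -> int:
    """Count tokens in `text`.

    Hypothetical sketch: if a custom `encode` callable is supplied, token
    count is the length of its output; otherwise fall back to a naive
    whitespace split. Real tokenizers (tiktoken, HF tokenizers, etc.) would
    be plugged in via `encode`.
    """
    if encode is not None:
        return len(encode(text))
    return len(text.split())

# Usage: any callable that maps a string to a sequence works as a "tokenizer".
print(token_counter("hello world"))                # 2 (whitespace fallback)
print(token_counter("hello world", encode=list))   # 11 (character-level encoder)
```

The point of the design is that callers can swap in a model-specific tokenizer without the counting code knowing which model it is.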
## New Contributors
- @nobu007 made their first contribution in #3349
- @DomMartin27 made their first contribution in #3217
- @azohra made their first contribution in #3405
- @Priva28 made their first contribution in #3393
**Full Changelog**: 1.35.33.dev4...v1.35.36-dev2
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
## Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Passed ✅ | 41 | 43.09 | 1.67 | 0.0 | 500 | 0 | 34.43 | 214.22 |
| /health/liveliness | Passed ✅ | 25 | 28.22 | 15.44 | 0.0067 | 4624 | 2 | 23.09 | 1131.27 |
| /health/readiness | Passed ✅ | 25 | 27.76 | 15.40 | 0.0 | 4612 | 0 | 23.19 | 1376.82 |
| Aggregated | Passed ✅ | 25 | 28.77 | 32.52 | 0.0067 | 9736 | 2 | 23.09 | 1376.82 |
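As a quick sanity check on the aggregated row, the overall failure rate follows directly from the request and failure counts in the table above (a minimal sketch; the two input numbers are taken verbatim from the Aggregated row):

```python
# Aggregated load-test figures from the table above
request_count = 9736
failure_count = 2

failure_rate = failure_count / request_count
print(f"{failure_rate:.4%}")  # roughly 0.02% of requests failed
```

Both failures came from the /health/liveliness endpoint, whose max response time (about 1.1 s) is also the main outlier against a ~25 ms median.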