xorbitsai/inference v0.12.2
on GitHub



What's new in 0.12.2 (2024-06-21)

These are the changes in inference v0.12.2.

New features

FEAT: Add Tools Support for Qwen Series MOE Models by @zhanghx0905 in #1642
FEAT: [UI] Modify the deletion function of a custom model by @yiboyasss in #1656
FEAT: [UI] Custom model presents JSON data and modifies it by @yiboyasss in #1670
FEAT: Add Rerank model token input/output usage by @wxiwnd in #1657

Enhancements

ENH: Continuous batching supports all the models with transformers backend by @ChengjieLi28 in #1659

Bug fixes

BUG: Show an error when a user launches a quantized model without a supported device by @Minamiyama in #1645
BUG: Fix default rerank type by @codingl2k1 in #1649
BUG: chat_completion does not respond when more than 100 errors occur by @liuzhenghua in #1663

Tests

TST: Fix CI due to tenacity by @ChengjieLi28 in #1660

Others

CHORE: [pre-commit] Add exclude thirdparty rules by @frostyplanet in #1678

Full Changelog: v0.12.1...v0.12.2
