github xorbitsai/inference v0.12.2

What's new in 0.12.2 (2024-06-21)

These are the changes in inference v0.12.2.

New features

  • FEAT: Add Tools Support for Qwen Series MOE Models by @zhanghx0905 in #1642
  • FEAT: [UI] Modify the deletion function of a custom model. by @yiboyasss in #1656
  • FEAT: [UI] Custom model presents JSON data and modifies it. by @yiboyasss in #1670
  • FEAT: Add Rerank model token input/output usage by @wxiwnd in #1657

Enhancements

  • ENH: Continuous batching supports all models with the transformers backend by @ChengjieLi28 in #1659

Bug fixes

  • BUG: Show an error when a user launches a quantized model without a supported device by @Minamiyama in #1645
  • BUG: Fix default rerank type by @codingl2k1 in #1649
  • BUG: chat_completion does not respond once more than 100 errors have occurred by @liuzhenghua in #1663

Tests

Others

Full Changelog: v0.12.1...v0.12.2