github xorbitsai/inference v1.8.0

latest releases: v1.9.1, v1.9.0, v1.8.1...
one month ago

What's new in 1.8.0 (2025-07-20)

These are the changes in inference v1.8.0.

New features

Enhancements

Bug fixes

  • BUG: disable flash_attn for qwen3 embedding & rerank when no gpu available by @qinxuye in #3739
  • BUG: Fix bugs in del async_client by @zhcn000000 in #3753
  • BUG: add message preprocessing to ensure that content is not null by @amumu96 in #3791
  • BUG: pre check to prevent from list index out of range for FunASR family models by @leslie2046 in #3809
  • BUG: resolve issue where AI output was lost when no tool was selected for function call #3767 by @aniya105 in #3768
  • BUG: fix error in content output at reasoning_content, when using enable_thinking in chat_template_kwargs by @amumu96 in #3794

Documentation

Full Changelog: v1.7.1...v1.8.0

Don't miss a new inference release

NewReleases is sending notifications on new releases.