github xorbitsai/inference v1.8.1.rc1

pre-release, one month ago

What's new in 1.8.1.rc1 (2025-08-03)

These are the changes in inference v1.8.1.rc1.

New features

Enhancements

  • ENH: add mlu device check by @nan9126 in #3844
  • ENH: Support for the bge-m3 llama.cpp backend by @codingl2k1 in #3861
  • ENH: Added mlx support for deepseek-v3-0324 by @uebber in #3864
  • ENH: Add context length limits and automatic truncation features to vLLM embedding models. by @amumu96 in #3887
  • BLD: remove sglang from pip install xinference[all] due to dependency conflicts with vllm by @qinxuye in #3865
  • BLD: upgrade base image for dockerfile by @zwt-1234 in #3318
  • REF: add ui module that includes web and gradio UIs. by @qinxuye in #3819
  • REF: move continuous batching scheduler into model by @qinxuye in #3824
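
As a rough illustration of the "context length limits and automatic truncation" enhancement listed above (#3887), the general idea can be sketched in a few lines of Python. This is not the code from that PR: the function name `truncate_to_context` and the parameter `max_context_length` are hypothetical, and truncating from the start of the sequence is an assumption.

```python
from typing import List


def truncate_to_context(token_ids: List[int], max_context_length: int) -> List[int]:
    """Hypothetical helper: keep at most `max_context_length` tokens.

    Assumption: tokens beyond the limit are dropped from the end, so the
    beginning of the input is preserved.
    """
    if max_context_length <= 0:
        raise ValueError("max_context_length must be positive")
    # Slicing is a no-op when the input is already within the limit.
    return token_ids[:max_context_length]


if __name__ == "__main__":
    tokens = list(range(10))
    print(len(truncate_to_context(tokens, 4)))
```

An embedding backend could apply such a step after tokenization, so that over-long inputs are shortened rather than rejected.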

Bug fixes

Documentation

  • DOC: add experimental feature for virtualenv by @qinxuye in #3818
  • DOC: add doc about model virtual env settings when launching model by @qinxuye in #3885

Others

New Contributors

Full Changelog: v1.8.0...v1.8.1.rc1
