What's new in 1.8.1.rc1 (2025-08-03)
These are the changes in inference v1.8.1.rc1.
New features
- FEAT: kokoro mlx support by @qinxuye in #3823
- FEAT: Qwen3-Instruct by @Jun-Howie in #3840
- FEAT: [UI] integrate user favorites into feature model output. by @yiboyasss in #3859
- FEAT: support enable virtualenv and specify packages when launching model by @qinxuye in #3854
- FEAT: [UI] support enable virtualenv and specify packages when launching model. by @yiboyasss in #3867
- FEAT: setting max_tokens to maximum if not specified by @qinxuye in #3872
- FEAT: [model] support GLM-4.5 series by @qinxuye in #3882
- FEAT: Qwen3-30B-A3B-it by @Jun-Howie in #3886
- FEAT: Support Qwen3-Thinking by @Jun-Howie in #3888
- FEAT: Support Qwen3-Coder by @Jun-Howie in #3889
Enhancements
- ENH: add mlu device check by @nan9126 in #3844
- ENH: Support for the bge-m3 llama.cpp backend by @codingl2k1 in #3861
- ENH: Added mlx support for deepseek-v3-0324 by @uebber in #3864
- ENH: Add context length limits and automatic truncation features to vLLM embedding models. by @amumu96 in #3887
- BLD: remove sglang from pip install xinference[all] due to dependency conflicts with vllm by @qinxuye in #3865
- BLD: upgrade base image for dockerfile by @zwt-1234 in #3318
- REF: add ui module that includes web and gradio UIs. by @qinxuye in #3819
- REF: move continuous batching scheduler into model by @qinxuye in #3824
Bug fixes
- BUG: Fixed an error when using structured output in sglang #3825 by @aniya105 in #3826
- BUG: fix compatibility for old vllm by @qinxuye in #3838
- BUG: Fix abnormal GPU memory usage in Qwen3 Reranker by @JDanielWu in #3846
- BUG: fix compatibility with vllm 0.10.0 by @qinxuye in #3875
- BUG: fix version checks for vllm by @qinxuye in #3891
Documentation
- DOC: add experimental feature for virtualenv by @qinxuye in #3818
- DOC: add doc about model virtual env settings when launching model by @qinxuye in #3885
Others
- FIX: GLM4.1V Repository URL by @Jun-Howie in #3839
- CHORE: THUDM has been renamed to zai-org by @Jun-Howie in #3870
New Contributors
- @JDanielWu made their first contribution in #3846
- @uebber made their first contribution in #3864
- @zwt-1234 made their first contribution in #3318
Full Changelog: v1.8.0...v1.8.1.rc1