What's new in 2.1.0 (2026-02-14)
These are the changes in inference v2.1.0.
New features
- FEAT: [model] GLM-4.7 support by @Jun-Howie in #4565
- FEAT: [model] MinerU2.5-2509-1.2B removed by @OliverBryant in #4568
- FEAT: [model] GLM-4.7-Flash support by @OliverBryant in #4578
- FEAT: [model] Qwen3-ASR-0.6B support by @leslie2046 in #4579
- FEAT: [model] Qwen3-ASR-1.7B support by @leslie2046 in #4580
- FEAT: added support for qwen3-asr models by @leslie2046 in #4581
- FEAT: [model] MinerU2.5-2509-1.2B support by @GaoLeiA in #4569
- FEAT: [model] FLUX.2-klein-4B support by @lazariv in #4602
- FEAT: [model] FLUX.2-klein-9B support by @lazariv in #4603
- FEAT: Add support for FLUX.2-Klein-9B and -4B models by @lazariv in #4596
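The newly supported models can be launched like any other registered model. A hypothetical example with the xinference CLI follows; the exact registered model names and any required engine flags in this release may differ, so consult the model registry before copying these commands:

```shell
# Hypothetical invocations; the registered model names in v2.1.0 may differ.
xinference launch --model-name glm-4.7 --model-type LLM
xinference launch --model-name qwen3-asr --model-type audio
```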
Enhancements
- ENH: update model "DeepSeek-V3.2" JSON by @OliverBryant in #4563
- ENH: update model "DeepSeek-V3.2-Exp" JSON by @OliverBryant in #4567
- ENH: update models JSON [image] by @XprobeBot in #4606
- BLD: constrain setuptools<82 in Docker images by @qinxuye in #4607
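The setuptools constraint above applies inside the Docker images; to reproduce the same pin in a local environment, an equivalent command would be (a sketch, not the exact Dockerfile line):

```shell
# Pin setuptools below 82 to match the constraint in the Docker images.
pip install --no-cache-dir "setuptools<82"
```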
- REF: extract Pydantic request schemas from restful_api.py into xinference/api/schemas/ by @amumu96 in #4598
- REF: extract route registration into domain-specific routers/ by @amumu96 in #4600
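The two refactors above split a monolithic restful_api.py into schema modules and per-domain routers. A minimal stdlib sketch of that shape follows; the names (ChatRequest, Router, register) are illustrative only, and the real code uses Pydantic models and FastAPI-style routers rather than the plain dataclasses shown here:

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# --- sketch of a request schema module (xinference/api/schemas/) ---
@dataclass
class ChatRequest:
    model: str
    prompt: str
    max_tokens: int = 256

    def __post_init__(self) -> None:
        # Pydantic performs this validation in the real schemas;
        # a plain check stands in for it here.
        if self.max_tokens <= 0:
            raise ValueError("max_tokens must be positive")

# --- sketch of a domain-specific router (routers/) ---
@dataclass
class Router:
    prefix: str
    routes: Dict[str, Callable] = field(default_factory=dict)

    def route(self, path: str) -> Callable:
        def decorator(fn: Callable) -> Callable:
            self.routes[self.prefix + path] = fn
            return fn
        return decorator

chat_router = Router(prefix="/v1/chat")

@chat_router.route("/completions")
def create_chat_completion(req: ChatRequest) -> dict:
    # Echo handler standing in for real inference logic.
    return {"model": req.model, "echo": req.prompt[: req.max_tokens]}

# --- the central app only aggregates routers instead of defining routes ---
def register(routers: List[Router]) -> Dict[str, Callable]:
    table: Dict[str, Callable] = {}
    for r in routers:
        table.update(r.routes)
    return table

app_routes = register([chat_router])
handler = app_routes["/v1/chat/completions"]
print(handler(ChatRequest(model="glm-4.7", prompt="hi")))
# → {'model': 'glm-4.7', 'echo': 'hi'}
```

The payoff of this layout is that each domain (chat, embeddings, audio, …) owns its schemas and routes, and the central API module shrinks to a registration loop.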
Bug fixes
- BUG: vllm embedding model error by @OliverBryant in #4562
- BUG: vllm reranker score error by @OliverBryant in #4573
- BUG: handle async tokenizer in vllm core by @ace-xc in #4577
- BUG: vllm reranker model gpu release error by @OliverBryant in #4575
Others
- BUG: setuptools CI error by @OliverBryant in #4595
New Contributors
- @ace-xc made their first contribution in #4577
- @GaoLeiA made their first contribution in #4569
- @lazariv made their first contribution in #4602
Full Changelog: v2.0.0...v2.1.0