What's new in 2.1.0 (2026-02-14)
These are the changes in inference v2.1.0.
New features
- FEAT: [model] GLM-4.7 support by @Jun-Howie in #4565
- FEAT: [model] MinerU2.5-2509-1.2B removed by @OliverBryant in #4568
- FEAT: [model] GLM-4.7-Flash support by @OliverBryant in #4578
- FEAT: [model] Qwen3-ASR-0.6B support by @leslie2046 in #4579
- FEAT: [model] Qwen3-ASR-1.7B support by @leslie2046 in #4580
- FEAT: added support for qwen3-asr models by @leslie2046 in #4581
- FEAT: [model] MinerU2.5-2509-1.2B support by @GaoLeiA in #4569
- FEAT: [model] FLUX.2-klein-4B support by @lazariv in #4602
- FEAT: [model] FLUX.2-klein-9B support by @lazariv in #4603
- FEAT: Add support for FLUX.2-Klein-9B and -4B models by @lazariv in #4596
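The newly supported models can be launched like any other registered model. A hypothetical example with the xinference CLI follows; the exact registered model names and any required engine flags in this release may differ, so consult the model registry before copying these commands:

```shell
# Hypothetical invocations; the registered model names in v2.1.0 may differ.
xinference launch --model-name glm-4.7 --model-type LLM
xinference launch --model-name qwen3-asr --model-type audio
```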
Enhancements
- ENH: update model "DeepSeek-V3.2" JSON by @OliverBryant in #4563
- ENH: update model "DeepSeek-V3.2-Exp" JSON by @OliverBryant in #4567
- ENH: update models JSON [image] by @XprobeBot in #4606
- BLD: constrain setuptools<82 in Docker images by @qinxuye in #4607
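The setuptools constraint above applies inside the Docker images; to reproduce the same pin in a local environment, an equivalent command would be (a sketch, not the exact Dockerfile line):

```shell
# Pin setuptools below 82 to match the constraint in the Docker images.
pip install --no-cache-dir "setuptools<82"
```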
- REF: extract Pydantic request schemas from restful_api.py into xinference/api/schemas/ by @amumu96 in #4598
- REF: extract route registration into domain-specific routers/ by @amumu96 in #4600
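The two refactors above split a monolithic restful_api.py into schema modules and per-domain routers. A minimal stdlib sketch of that shape follows; the names (ChatRequest, Router, register) are illustrative only, and the real code uses Pydantic models and FastAPI-style routers rather than the plain dataclasses shown here:

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# --- sketch of a request schema module (xinference/api/schemas/) ---
@dataclass
class ChatRequest:
    model: str
    prompt: str
    max_tokens: int = 256

    def __post_init__(self) -> None:
        # Pydantic performs this validation in the real schemas;
        # a plain check stands in for it here.
        if self.max_tokens <= 0:
            raise ValueError("max_tokens must be positive")

# --- sketch of a domain-specific router (routers/) ---
@dataclass
class Router:
    prefix: str
    routes: Dict[str, Callable] = field(default_factory=dict)

    def route(self, path: str) -> Callable:
        def decorator(fn: Callable) -> Callable:
            self.routes[self.prefix + path] = fn
            return fn
        return decorator

chat_router = Router(prefix="/v1/chat")

@chat_router.route("/completions")
def create_chat_completion(req: ChatRequest) -> dict:
    # Echo handler standing in for real inference logic.
    return {"model": req.model, "echo": req.prompt[: req.max_tokens]}

# --- the central app only aggregates routers instead of defining routes ---
def register(routers: List[Router]) -> Dict[str, Callable]:
    table: Dict[str, Callable] = {}
    for r in routers:
        table.update(r.routes)
    return table

app_routes = register([chat_router])
handler = app_routes["/v1/chat/completions"]
print(handler(ChatRequest(model="glm-4.7", prompt="hi")))
# → {'model': 'glm-4.7', 'echo': 'hi'}
```

The payoff of this layout is that each domain (chat, embeddings, audio, …) owns its schemas and routes, and the central API module shrinks to a registration loop.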
Bug fixes
- BUG: vllm embedding model error by @OliverBryant in #4562
- BUG: vllm reranker score error by @OliverBryant in #4573
- BUG: handle async tokenizer in vllm core by @ace-xc in #4577
- BUG: vllm reranker model gpu release error by @OliverBryant in #4575
Others
- BUG: setuptools CI error by @OliverBryant in #4595
New Contributors
- @ace-xc made their first contribution in #4577
- @GaoLeiA made their first contribution in #4569
- @lazariv made their first contribution in #4602
Full Changelog: v2.0.0...v2.1.0