What's new in 1.6.0 (2025-05-16)
These are the changes in inference v1.6.0.
New features
- FEAT: [Model]XiYanSQL-QwenCoder-2504 by @Minamiyama in #3352
- FEAT: [Model]HuatuoGPT-o1 by @Minamiyama in #3353
- FEAT: [Model]DianJin-R1 by @Minamiyama in #3343
- FEAT: support image_to_video by @qinxuye in #3386
- FEAT: Qwen3-235B-A22B GPTQ Quantization Int4 Int8 by @Jun-Howie in #3422
- FEAT: use xo.wait_for instead of asyncio.wait_for for actor call by @qinxuye in #3439
- FEAT: video UI by @qinxuye in #3448
- FEAT: auto add tag when it is missing by @amumu96 in #3456
- FEAT: Support Skywork-OR1 by @Jun-Howie in #3447
- FEAT: audio UI by @qinxuye in #3457
- FEAT: Support Skywork-OR1 gptq for 32B by @Jun-Howie in #3464
- FEAT: support enable_thinking for loading qwen3 by @qinxuye in #3463
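One entry above (#3439) switches actor calls from `asyncio.wait_for` to `xo.wait_for` (xoscar's variant). The underlying pattern is the same either way: wrap a potentially slow call in a timeout so the caller can recover instead of hanging. A minimal, self-contained sketch of that pattern using plain `asyncio.wait_for` (the `slow_actor_call` name is a hypothetical stand-in, not Xinference code):

```python
import asyncio

async def slow_actor_call() -> str:
    # Hypothetical stand-in for an actor call that takes too long.
    await asyncio.sleep(10)
    return "done"

async def main() -> str:
    try:
        # Bound the call: if it exceeds the timeout, cancel it and recover.
        return await asyncio.wait_for(slow_actor_call(), timeout=0.1)
    except asyncio.TimeoutError:
        return "timed out"

print(asyncio.run(main()))  # → timed out
```

`xo.wait_for` is used in #3439 presumably because it cooperates better with xoscar's actor runtime; the timeout-and-recover shape shown here is unchanged.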
Enhancements
- ENH: Qwen/Qwen2.5-Omni-3B by @Minamiyama in #3366
- ENH: added mlx format for qwen3 & update docs by @qinxuye in #3369
- ENH: Qwen3-AWQ for 14B & 32B by @Minamiyama in #3370
- ENH: Update the activated_size_in_billions parameter in the deepseek-vl2 model by @Jun-Howie in #3380
- ENH: add mlx-community/Qwen2.5-VL-32B-Instruct by @xiaohan815 in #3405
- ENH: [UI] add a documentation link button in side menu by @Minamiyama in #3411
- ENH: QwQ use unsloth gguf by @codingl2k1 in #3408
- ENH: llama.cpp backend use xllamacpp by @codingl2k1 in #3412
- ENH: [UI] display version info in side menu by @Minamiyama in #3423
- ENH: Worker env isolation by @codingl2k1 in #3362
- ENH: Use Qwen's official quantized model repository by @Jun-Howie in #3436
- ENH: Update cosyvoice by @codingl2k1 in #3365
- BLD: isolate autoawq and GPTQModel into separate extra install by @qinxuye in #3397
- BLD: pin transformers version at 4.51.3 by @amumu96 in #3431
- REF: support loading model config in function by @Minamiyama in #3428
Bug fixes
- BUG: fix qwen3 235b spec by @qinxuye in #3375
- BUG: fix incomplete parsing of reasoning content in reasoning_parser by @amumu96 in #3391
- BUG: fix the processing logic for inference content parsing and tool calls by @amumu96 in #3394
- BUG: fix stop word handling logic in vllm model generation configuration by @amumu96 in #3414
- BUG: fix Model._get_full_prompt() takes 3 positional arguments but 4 were given by @qinxuye in #3417
- BUG: fix potential stop hang by @qinxuye in #3434
- BUG: [UI] add cpu_offload parameter to video model and fix bug in audio model's filtering function by @yiboyasss in #3461
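Two of the fixes above (#3391, #3394) concern parsing reasoning content out of model output, including the incomplete case where a reasoning block has been opened but not yet closed mid-stream. A standalone sketch of that kind of parsing, assuming a `<think>…</think>` delimiter convention; the function name and exact tag handling are illustrative, not Xinference's actual `reasoning_parser`:

```python
import re

def split_reasoning(text: str) -> tuple[str, str]:
    """Split hypothetical <think>...</think> reasoning from the final answer.

    Returns (reasoning, content). An opened but unclosed <think> block is
    treated as all-reasoning with no final content yet (the incomplete
    case the #3391 fix is about).
    """
    m = re.match(r"\s*<think>(.*?)</think>(.*)", text, flags=re.DOTALL)
    if m:
        return m.group(1).strip(), m.group(2).strip()
    stripped = text.lstrip()
    if stripped.startswith("<think>"):
        # Unclosed block: the model is still streaming its reasoning.
        return stripped[len("<think>"):].strip(), ""
    return "", text.strip()

print(split_reasoning("<think>step 1</think>The answer is 4."))
# → ('step 1', 'The answer is 4.')
```

A real streaming parser would also have to handle tags split across chunk boundaries, which is where incomplete-parsing bugs like #3391 typically arise.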
New Contributors
- @xiaohan815 made their first contribution in #3405
Full Changelog: v1.5.1...v1.6.0