What's new in 1.5.0 (2025-04-19)
These are the changes in Xinference v1.5.0.
New features
- FEAT: Support megatts3 by @codingl2k1 in #3224
- FEAT: InternVL3 by @Minamiyama in #3235
- FEAT: support paraformer-zh by @qinxuye in #3236
- FEAT: support SeaLLMs-v3 by @Jun-Howie in #3248
- FEAT: support getting download progress and cancel download by @qinxuye in #3233
- FEAT: add thinking process in gradio chat interface by @amumu96 in #3245
- FEAT: support glm4-0414 by @Jun-Howie in #3251
- FEAT: support min/max_pixels params for vision model by @amumu96 in #3242
- FEAT: support skywork-or1-preview by @Jun-Howie in #3274
- FEAT: [UI] add progress bar and ability to cancel model launch by @yiboyasss in #3276
- FEAT: add AWQ quantization support for InternVL3 by @Jun-Howie in #3285
- FEAT: Support virtualenv for models by @qinxuye in #3241
- FEAT: support qwen2.5-omni by @qinxuye in #3279
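Several items above add new request options for vision models. As an illustrative sketch only (the `min_pixels`/`max_pixels` keys and the request shape are assumptions based on the feature title, not the project's confirmed API), a vision chat request constraining image resolution might be assembled like this:

```python
def build_vision_request(model, image_url, prompt,
                         min_pixels=None, max_pixels=None):
    """Assemble an OpenAI-style chat payload with optional
    pixel-range hints for the vision preprocessor (hypothetical keys)."""
    payload = {
        "model": model,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "image_url", "image_url": {"url": image_url}},
                {"type": "text", "text": prompt},
            ],
        }],
    }
    # Only attach the resolution hints when explicitly provided.
    if min_pixels is not None:
        payload["min_pixels"] = min_pixels
    if max_pixels is not None:
        payload["max_pixels"] = max_pixels
    return payload

req = build_vision_request("qwen2.5-omni", "https://example.com/cat.png",
                           "Describe this image.",
                           min_pixels=256 * 256, max_pixels=1024 * 1024)
```

Consult the project's documentation for the actual parameter names accepted by each vision model.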
Enhancements
- ENH: Compatible with latest xllamacpp by @codingl2k1 in #3181
- ENH: Use xllamacpp by default by @codingl2k1 in #3198
- ENH: update gradio interface for chat model by @amumu96 in #3265
- ENH: Set gradio default concurrency to cpu count by @codingl2k1 in #3278
- BLD: fix compatibility for mlx-lm>=0.22.3 by @qinxuye in #3195
- BLD: upgrade gradio version for docker by @Minamiyama in #3197
- BLD: fix docker build by @qinxuye in #3207
- BLD: remove setuptools limitation in pyproject.toml by @qinxuye in #3212
- BLD: fix docker build by @amumu96 in #3289
- REF: simplify transformers model registration with decorators by @Minamiyama in #3191
Bug fixes
- BUG: fix hang on stop for vLLM engine by @qinxuye in #3202
- BUG: Fix QwQ GGUF model path by @codingl2k1 in #3232
- BUG: Fix loading multi-part models with the llama.cpp backend by @codingl2k1 in #3261
Documentation
- DOC: Add usage doc for kokoro by @codingl2k1 in #3192
- DOC: add documentation for virtual environments & update models in README by @qinxuye in #3287
Full Changelog: v1.4.1...v1.5.0