What's new in 1.5.0 (2025-04-19)
These are the changes in Xinference v1.5.0.
New features
- FEAT: Support megatts3 by @codingl2k1 in #3224
- FEAT: InternVL3 by @Minamiyama in #3235
- FEAT: support paraformer-zh by @qinxuye in #3236
- FEAT: support SeaLLMs-v3 by @Jun-Howie in #3248
- FEAT: support getting download progress and cancel download by @qinxuye in #3233
- FEAT: add thinking process in gradio chat interface by @amumu96 in #3245
- FEAT: support glm4-0414 by @Jun-Howie in #3251
- FEAT: support min/max_pixels params for vision model by @amumu96 in #3242
- FEAT: support skywork-or1-preview by @Jun-Howie in #3274
- FEAT: [UI] add progress bar and ability to cancel model launch by @yiboyasss in #3276
- FEAT: add AWQ quantization support for InternVL3 by @Jun-Howie in #3285
- FEAT: Support virtualenv for models by @qinxuye in #3241
- FEAT: support qwen2.5-omni by @qinxuye in #3279
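Several items above add new request options for vision models. As an illustrative sketch only (the `min_pixels`/`max_pixels` keys and the request shape are assumptions based on the feature title, not the project's confirmed API), a vision chat request constraining image resolution might be assembled like this:

```python
def build_vision_request(model, image_url, prompt,
                         min_pixels=None, max_pixels=None):
    """Assemble an OpenAI-style chat payload with optional
    pixel-range hints for the vision preprocessor (hypothetical keys)."""
    payload = {
        "model": model,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "image_url", "image_url": {"url": image_url}},
                {"type": "text", "text": prompt},
            ],
        }],
    }
    # Only attach the resolution hints when explicitly provided.
    if min_pixels is not None:
        payload["min_pixels"] = min_pixels
    if max_pixels is not None:
        payload["max_pixels"] = max_pixels
    return payload

req = build_vision_request("qwen2.5-omni", "https://example.com/cat.png",
                           "Describe this image.",
                           min_pixels=256 * 256, max_pixels=1024 * 1024)
```

Consult the project's documentation for the actual parameter names accepted by each vision model.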
Enhancements
- ENH: Compatible with latest xllamacpp by @codingl2k1 in #3181
- ENH: Use xllamacpp by default by @codingl2k1 in #3198
- ENH: update gradio interface for chat model by @amumu96 in #3265
- ENH: Set gradio default concurrency to cpu count by @codingl2k1 in #3278
- BLD: fix compatibility for mlx-lm>=0.22.3 by @qinxuye in #3195
- BLD: upgrade gradio version for docker by @Minamiyama in #3197
- BLD: fix docker build by @qinxuye in #3207
- BLD: remove setuptools limitation in pyproject.toml by @qinxuye in #3212
- BLD: fix docker build by @amumu96 in #3289
- REF: simplify transformers model registration with decorators by @Minamiyama in #3191
Bug fixes
- BUG: fix hang on stop for vLLM engine by @qinxuye in #3202
- BUG: Fix QwQ GGUF model path by @codingl2k1 in #3232
- BUG: Fix loading multi-part models with the llama.cpp backend by @codingl2k1 in #3261
Documentation
- DOC: Add usage doc for kokoro by @codingl2k1 in #3192
- DOC: add documentation for virtual environments & update models in README by @qinxuye in #3287
Full Changelog: v1.4.1...v1.5.0