🚀 Core Highlights
- Simplified Installation with sglang Submodule: The kvcache-ai/sglang fork is now vendored as a git submodule and published to PyPI as `sglang-kt`. Installation is reduced from a multi-step manual process to a single `./install.sh` command or `pip install ktransformers` (which auto-installs `sglang-kt`). Added daily CI auto-sync of the sglang submodule and automated PyPI publishing on version bump.
- New Model Support (Qwen3.5, GLM-5, MiniMax-M2.5, Qwen3-Coder-Next): Day-0 support for four new MoE models spanning a wide range of hardware requirements, from Qwen3-Coder-Next (1x RTX 4090, 80B-A3B) to Qwen3.5 (4x RTX 4090, 400B MoE). All models support BF16 and FP8 precision backends with CPU-GPU heterogeneous inference.
- Kimi-K2.5 Support & Mistral MoE Compatibility: Added Kimi-K2.5 deployment guides including SFT fine-tuning integration, fallback expert prefix lookup for robust weight loading, and Mistral MoE loader compatibility for broader model coverage.
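After either install path, it can be useful to confirm that both distributions actually landed in the environment. The following is a minimal sketch using only the Python standard library; the distribution names `ktransformers` and `sglang-kt` come from the notes above, while the helper names `installed_version` and `report` are illustrative and not part of ktransformers.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, List, Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


def report(dist_names: List[str]) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version (None if missing)."""
    return {name: installed_version(name) for name in dist_names}


if __name__ == "__main__":
    # Distribution names as published on PyPI according to these notes.
    for name, ver in report(["ktransformers", "sglang-kt"]).items():
        print(f"{name}: {ver or 'not installed'}")
```

A `None` entry for `sglang-kt` after installing `ktransformers` would suggest the dependency was not pulled in automatically in that environment.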
📌 Models, Hardware & Tooling
- Model support updates
- Kernel & hardware improvements
- Tooling & integration
  - Add top-level `install.sh` for one-click source installation (sglang + kt-kernel).
  - Publish sglang fork as `sglang-kt` on PyPI; kt-kernel auto-installs it as a dependency.
  - Add CI workflows: daily sglang submodule sync, automated sglang-kt PyPI publishing.
  - Align sglang-kt version with ktransformers (single `version.py` source of truth).
  - kt-cli enhancements (#1834).
  - Handle unquoted paths and special characters in model scanner (#1840).
  - Update Docker build for submodule-based sglang installation.
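The "single `version.py` source of truth" item can be illustrated with a small sketch. It assumes, hypothetically, that `version.py` contains a line of the form `__version__ = "x.y.z"`; the actual file layout in the repository is not described in these notes, and `parse_version` and `find_mismatches` are illustrative names rather than project APIs.

```python
import re
from typing import Dict

# Assumed format: a top-level assignment such as  __version__ = "0.5.2"
_VERSION_RE = re.compile(r"""^__version__\s*=\s*["']([^"']+)["']""", re.MULTILINE)


def parse_version(source: str) -> str:
    """Extract the version string from the text of a version.py-style file."""
    match = _VERSION_RE.search(source)
    if match is None:
        raise ValueError("no __version__ assignment found")
    return match.group(1)


def find_mismatches(expected: str, declared: Dict[str, str]) -> Dict[str, str]:
    """Return the packages whose declared version differs from the expected one."""
    return {name: ver for name, ver in declared.items() if ver != expected}
```

A release check could read the file text, call `parse_version` on it, and fail the build if `find_mismatches` returns a non-empty mapping for the packages being published.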
📝 Docs & Community
- Add Qwen3.5 tutorial: https://github.com/kvcache-ai/ktransformers/blob/main/doc/en/Qwen3.5.md
- Add GLM-5 tutorial: https://github.com/kvcache-ai/ktransformers/blob/main/doc/en/kt-kernel/GLM-5-Tutorial.md
- Add MiniMax-M2.5 tutorial: https://github.com/kvcache-ai/ktransformers/blob/main/doc/en/MiniMax-M2.5.md
- Add Qwen3-Coder-Next tutorial: https://github.com/kvcache-ai/ktransformers/blob/main/doc/en/kt-kernel/Qwen3-Coder-Next-Tutorial.md
- Add Kimi-K2.5 deployment & SFT guide: https://github.com/kvcache-ai/ktransformers/blob/main/doc/en/Kimi-K2.5.md
- Add maintainers list (#1837).
- Simplify sglang installation instructions across all 13 model tutorials.
🐛 Bug Fixes
- Fix Qwen3.5 FP8 load for VL detection (#1857).
- Fix k2-moe.hpp load weight issue (#1830).
- Fix wrapper import issue (#1819).
- Fix experts-sched-Tutorial.md (#1808).
- Handle unquoted paths and special characters in model scanner (#1840).
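The notes do not describe how #1840 handles paths, so the following is only a general illustration of the class of problem: a model path containing spaces or shell metacharacters breaks a command line unless it is quoted. This sketch uses the standard-library `shlex` module; the `--model-path` flag, the `launch` executable name, and both helper functions are hypothetical.

```python
import shlex
from typing import List


def build_command(executable: str, model_path: str) -> str:
    """Build a shell-safe command string; model_path may contain spaces or metacharacters."""
    # shlex.quote wraps the argument so a POSIX shell treats it as one token.
    return " ".join([shlex.quote(executable), "--model-path", shlex.quote(model_path)])


def parse_command(command: str) -> List[str]:
    """Split a command string back into tokens using POSIX shell quoting rules."""
    return shlex.split(command)
```

Round-tripping a path through `build_command` and `parse_command` should return the original path as a single token, which is a quick way to check that quoting is applied consistently.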
🌟 Contributors
- Thanks to all contributors who helped ship this release.
Full Changelog: v0.5.1...v0.5.2
CC: @ouqingliang @ErvinXie @chenht2022 @KMSorSMS @ovowei @SkqLiao @JimmyPeilinLi @mrhaoxx @james0zan