kvcache-ai/ktransformers v0.5.2.post1 on GitHub

🚀 Core Highlights

- Simplified Installation with sglang Submodule: The kvcache-ai/sglang fork is now vendored as a git submodule and published to PyPI as sglang-kt. Installation is reduced from a multi-step manual process to a single ./install.sh command or pip install ktransformers (which auto-installs sglang-kt). This release also adds a daily CI auto-sync of the sglang submodule and automated PyPI publishing on version bump.
- New Model Support (Qwen3.5, GLM-5, MiniMax-M2.5, Qwen3-Coder-Next): Day-0 support for four new MoE models spanning a wide range of hardware requirements, from Qwen3-Coder-Next (80B-A3B, 1x RTX 4090) to Qwen3.5 (400B MoE, 4x RTX 4090). All four models support BF16 and FP8 precision backends with CPU-GPU heterogeneous inference.
- Kimi-K2.5 Support & Mistral MoE Compatibility: Adds Kimi-K2.5 deployment guides (including SFT fine-tuning integration), fallback expert prefix lookup for more robust weight loading, and Mistral MoE loader compatibility for broader model coverage.
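
Since installation now pulls in several distributions at once, it can be useful to confirm what actually landed in the environment. The following is a minimal sketch, not part of the release: it assumes the installed distribution names match the package names mentioned in these notes (ktransformers, kt-kernel, sglang-kt), and the helper names installed_version and report are hypothetical.

```python
from importlib import metadata
from typing import Dict, Optional, Sequence

# Assumption: these distribution names mirror the names used in the release notes.
DEFAULT_PACKAGES = ("ktransformers", "kt-kernel", "sglang-kt")


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def report(packages: Sequence[str] = DEFAULT_PACKAGES) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version (None when missing)."""
    return {name: installed_version(name) for name in packages}


if __name__ == "__main__":
    for name, version in report().items():
        print(f"{name}: {version or 'not installed'}")
```

Because the notes state that sglang-kt is version-aligned with ktransformers, a mismatch between those two entries would be a hint that the environment mixes packages from different releases.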

📌 Models, Hardware & Tooling

Model support updates
- Add Qwen3.5 (MoE-400B) with FP8 VL detection fix.
- Add GLM-5 with BF16/FP8 precision support.
- Add MiniMax-M2.5 with FP8 weight optimization.
- Add Qwen3-Coder-Next (80B-A3B) for code generation.
- Add Mistral MoE loader compatibility (#1873).
- Add Kimi-K2.5 with fallback expert prefix lookup (#1822).
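
The idea behind a fallback expert prefix lookup can be sketched as follows. This is an illustration only, not the code from #1822: the candidate prefixes, the key layout, and the function name find_expert_prefix are all assumptions chosen to show the general pattern of trying several checkpoint key prefixes in order and using the first one that matches.

```python
from typing import Iterable, Optional, Sequence

# Hypothetical prefix templates; the real loader's candidates may differ.
CANDIDATE_PREFIXES = (
    "model.layers.{layer}.mlp.experts",
    "language_model.model.layers.{layer}.mlp.experts",
)


def find_expert_prefix(
    keys: Iterable[str],
    layer: int,
    prefixes: Sequence[str] = CANDIDATE_PREFIXES,
) -> Optional[str]:
    """Return the first candidate prefix that matches at least one weight key."""
    key_list = list(keys)
    for template in prefixes:
        prefix = template.format(layer=layer)
        # Require the trailing dot so "experts" does not match "experts_extra".
        if any(key.startswith(prefix + ".") for key in key_list):
            return prefix
    return None  # Caller decides whether a missing prefix is an error.


if __name__ == "__main__":
    checkpoint_keys = [
        "language_model.model.layers.3.mlp.experts.0.gate_proj.weight",
        "language_model.model.layers.3.mlp.experts.0.up_proj.weight",
    ]
    print(find_expert_prefix(checkpoint_keys, layer=3))
```

Ordering the candidates from most to least common keeps the usual case cheap, while the fallback entries let a checkpoint with a wrapped or renamed module tree load without a separate code path.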

Kernel & hardware improvements
- Fix k2-moe.hpp weight loading (#1830).
- Fix wrapper import issue (#1819).
- Improve CUDA code readability with explicit ele_per_blk variable (#1784).

Tooling & integration
- Add top-level install.sh for one-click source installation (sglang + kt-kernel).
- Publish the sglang fork as sglang-kt on PyPI; kt-kernel auto-installs it as a dependency.
- Add CI workflows: daily sglang submodule sync, automated sglang-kt PyPI publishing.
- Align sglang-kt version with ktransformers (single version.py source of truth).
- kt-cli enhancements (#1834).
- Handle unquoted paths and special characters in model scanner (#1840).
- Update Docker build for submodule-based sglang installation.
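
To make the model scanner item (#1840) more concrete, here is a minimal sketch of one way to handle paths that arrive quoted or unquoted and may contain spaces or shell metacharacters. It is not taken from the kt-cli source; normalize_model_path and shell_safe are hypothetical helpers, and POSIX-style paths are assumed.

```python
import shlex
from pathlib import Path


def normalize_model_path(raw: str) -> Path:
    """Strip one pair of matching surrounding quotes, then expand '~'."""
    text = raw.strip()
    if len(text) >= 2 and text[0] == text[-1] and text[0] in "'\"":
        text = text[1:-1]
    return Path(text).expanduser()


def shell_safe(path: Path) -> str:
    """Quote a path so spaces and shell metacharacters survive a shell command line."""
    return shlex.quote(str(path))


if __name__ == "__main__":
    for raw in ['"/models/My Model (v2)"', "/models/plain"]:
        path = normalize_model_path(raw)
        print(raw, "->", shell_safe(path))
```

Normalizing on input and quoting only at the point where a shell command is built keeps the stored path unambiguous and avoids double quoting.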

📝 Docs & Community

🐛 Bug Fixes

🌟 Contributors
