What's Changed
- fix: move ArtifactPath/CheckSumHash imports inside gen_moe_utils_modu… by @dierksen in #2681
- Enable sm120f compilation by @kahyunnam in #2650
- Ensure -gencode flags are in deterministic order (for ccache) by @benbarsdell in #2674
- int16 Block-Scaled State and Stochastic Rounding for SSU (mamba) by @ishovkun in #2645
- feat: add pool+indices support to gated_delta_rule_decode_pretranspose (bf16 path) by @kaixih in #2619
- chore: replace bare print() with logging across the package by @esmeetu in #2648
- fix: reduce smem allocation for tinygemm2 kernel in SM120 by @jimmyzho in #2670
- [chore] bench_moe_deepseek.py allows adjusting expert distribution by @rosenrodt in #2678
- feat: add support for more MLA head dimensions by @hypdeb in #2677
- [fp8_blockwise] Fix int32 overflow in TRTLLM fused MoE activation kernel by @charlotte12l in #2642
- Give knam codeowner override for Qwen3.5 (gdn) related directories by @kahyunnam in #2680
- HOTFIX: Skip mamba Stochastic Rounding tests on sm_120 by @ishovkun in #2699
- chore: Update CODEOWNERS by @flashinfer-bot in #2712
- feat: support mxfp4 & mxfp8 entrypoint for blackwell cutedsl dense gemm by @b8zhong in #2660
- Undo fix to AutoTuner find_nearest_profile by @danisereb in #2697
- Experiment: Add @kahyunnam as co-owner for several files by @aleozlx in #2713
- chore: Update CODEOWNERS by @flashinfer-bot in #2719
- Implement cutlass_fused_moe mxfp8 by @zianglih in #2581
New Contributors
- @benbarsdell made their first contribution in #2674
- @charlotte12l made their first contribution in #2642
- @zianglih made their first contribution in #2581
Full Changelog: v0.6.5...v0.6.6