github flashinfer-ai/flashinfer v0.5.2
Release v0.5.2

7 months ago

What's Changed

  • ci: Update cudnn version requirements in CI container by @bkryu in #2039
  • test: Mark test_fp8_prefill.py as xfail on SM90 by @bkryu in #2038
  • Update Docker CI tags to 20251104-d528f0c by @flashinfer-bot in #2041
  • bugfix: fix failed unittest test_green_ctx and test_jit_example on spark (sm_121) by @yzh119 in #1951
  • perf: Speed up fp4 quantization for small batch with swizzling for cutlass MoE by @bkryu in #2025
  • Support cc common check decorator for empty backends by @jimmyzho in #2015
  • use scalar for kv_scale in xqa by @qsang-nv in #2033
  • fix: support both pip and uv pip for finding flashinfer-python package by @djmmoss in #2043
  • test: Fix test_sampling.py on Spark by @bkryu in #2042
  • Fix dtype of output scales from mnnvl_moe_alltoallv_prepare_without_allgather by @trevor-m in #2048
  • Update trtllm-gen fused moe routing kernel and add more kernels by @jiahanc in #1955
  • chore: Update CODEOWNERS by @flashinfer-bot in #1984
  • Add support for topkPacked input in block-level renormalize by @ChristinaZ in #2051
  • test: Skip test_fp8_quantize.py on Hopper by @bkryu in #2052
  • [BUG] Fix trtllm-gen fp4 moe renormalize routing by @IwakuraRein in #2049
  • release: Bump version for v0.5.2 release by @bkryu in #2057

Full Changelog: v0.5.1...v0.5.2
