github flashinfer-ai/flashinfer v0.2.4

13 months ago

What's Changed

  • typo: fix pdl terminology by @yzh119 in #933
  • Fix "specutate" typo by @markmc in #934
  • typo: fix target_probs docs after uniform_samples removal by @markmc in #935
  • typo: remove another uniform samples leftover by @markmc in #937
  • Fix/precommit issues by @diptorupd in #931
  • ci: setup Jenkins by @yzh119 in #874
  • bugfix: fix include header name conflict by @yzh119 in #939
  • fix: Fix MLA TVM binding for the latest changes by @MasterJH5574 in #940
  • feat - support mla kvcache store by @baowendin in #888
  • Add POD-Attention to FlashInfer by @AKKamath in #858
  • bugfix: fix potential issues of FA3 template loading nans for PageAttention by @yzh119 in #945
  • fix - fix bug when not relevant seq has nan data by @baowendin in #942
  • misc: add ci-badge, update blog list by @yzh119 in #948
  • bugfix: Fix missing PyModuleDef field initializers by @sampan26 in #946
  • fix: fix pod-attention compilation time by @yzh119 in #954
  • bugfix: bugfix to #949 by @yzh119 in #951
  • misc: Temporarily disable POD from AOT wheels by @abcdabcd987 in #956
  • ci: improve jenkins by @yzh119 in #943
  • Fix compilation on cuda 12.2 by @goliaro in #961
  • doc: remove misleading docstring about non_blocking by @yzh119 in #966
  • perf: reduce torch.library dispatch overhead by @yzh119 in #968
  • [TVM] Added tvm binding for sampling kernel by @annanyapr in #958
  • perf: Fix python API overhead when CUDAGraph is not enabled by @yzh119 in #969
  • Fix POD JIT bugs by @AKKamath in #971
  • benchmark: add sampling.renorm benchmarks by @xslingcn in #970
  • perf: dual pivot top-p/top-k renorm by @xslingcn in #974
  • perf: Use 2WG pipeline design for MLA implementation on Hopper by @yzh119 in #952
  • release: bump version to v0.2.4 by @yzh119 in #980

Full Changelog: v0.2.3...v0.2.4
