github NVIDIA/cutlass v4.3.4
CUTLASS 4.3.4

10 hours ago

CuTe DSL

  • New features

  • Bug fixes and improvements

    • Fixed a frame reference-count (refcnt) issue with CUDA graphs
    • Enhanced the tvm-ffi AoT case for earlier module unload
    • Fixed an ordering issue in make_smem_layout_a in utils/hopper_helpers.py

CUTLASS C++

  • Worked around a driver bug related to TMA descriptors that can cause occasional errors on Blackwell when the tensor's backing memory allocation is smaller than 128 KB and the tensor is not dense and non-overlapping.
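
To make the two conditions in this note concrete, here is a minimal Python sketch of a check for them. It is not CUTLASS code. The helper names are hypothetical, strides are assumed to be in elements, and "dense non-overlapping" is assumed to mean that the strides, in some dimension order, tile a contiguous block exactly once. The check CUTLASS or the driver actually performs may differ.

```python
KB = 1024
SMALL_ALLOCATION_BYTES = 128 * KB  # threshold quoted in the release note


def is_dense_non_overlapping(shape, strides):
    """Return True if the strided layout covers a contiguous block exactly once.

    Assumption: strides are given in elements. Size-1 dimensions are ignored
    because their stride never affects addressing.
    """
    dims = [(stride, size) for size, stride in zip(shape, strides) if size != 1]
    if any(size == 0 for _, size in dims):
        return True  # an empty tensor has no elements that could overlap
    expected = 1
    # Walk dimensions from the smallest stride to the largest; each stride
    # must equal the number of elements spanned by the faster dimensions.
    for stride, size in sorted(dims):
        if stride != expected:
            return False
        expected *= size
    return True


def may_hit_affected_case(shape, strides, allocation_bytes):
    """Hypothetical predicate mirroring the two conditions in the note."""
    return (allocation_bytes < SMALL_ALLOCATION_BYTES
            and not is_dense_non_overlapping(shape, strides))


if __name__ == "__main__":
    # Row-major 4x8: dense, so not affected regardless of allocation size.
    print(may_hit_affected_case((4, 8), (8, 1), 128))
    # 4x8 view with rows padded to 16 elements in a 256-byte allocation.
    print(may_hit_affected_case((4, 8), (16, 1), 256))
```

A padded or broadcast view in a small allocation satisfies both conditions. A fully packed tensor, or any tensor backed by an allocation of 128 KB or more, does not.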
