github NVIDIA/cutlass v2.5.0
CUTLASS 2.5.0

3 years ago

CUTLASS 2.5 is a minor release contributing:

  • Tensor reductions
    • m-to-n reductions of tensors with affine layout
    • Specializations for reductions including the contiguous dimension
    • Specializations for reductions excluding the contiguous dimension
    • Custom reduction functors such as cutlass::logical_and
    • Large tensor support, up to 2^63 elements (however, each dimension is limited to an extent of 2^31)
  • Optimizations for 3-D convolution
    • Optimized tile iterators using precomputed delta table for 3-D convolution
    • Full coverage of forward and backward passes for 3-D convolution
  • Fused Convolution+Convolution example
  • Corrections and bug fixes reported by the CUTLASS community
    • Thank you for filing these issues!
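
To make the tensor reduction items concrete, here is a minimal host-side sketch of an m-to-n reduction over a rank-4 tensor with an affine layout. It does not use the CUTLASS API: `AffineLayout4`, `make_packed`, `Plus`, `LogicalAnd`, and `reduce` are hypothetical names introduced only for illustration (`LogicalAnd` stands in for a functor like cutlass::logical_and). The sketch assumes a layout whose linear offset is the dot product of coordinate and strides, keeps extents in 32 bits, and computes offsets in 64 bits, mirroring the stated limits. The same loop handles reductions that include or exclude the contiguous dimension, whereas the release ships separate optimized specializations for the two cases.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical rank-4 affine layout (not a CUTLASS type):
// offset = sum over i of coord[i] * stride[i].
struct AffineLayout4 {
  std::array<int32_t, 4> extent;  // per-dimension extents stay within 32 bits
  std::array<int64_t, 4> stride;  // 64-bit strides, so offsets may exceed 2^31

  int64_t offset(const std::array<int32_t, 4>& c) const {
    int64_t off = 0;
    for (int i = 0; i < 4; ++i) off += static_cast<int64_t>(c[i]) * stride[i];
    return off;
  }

  int64_t size() const {
    int64_t n = 1;
    for (int32_t e : extent) n *= e;
    return n;
  }
};

// Packed layout in which dimension 3 is contiguous (stride 1).
inline AffineLayout4 make_packed(const std::array<int32_t, 4>& extent) {
  AffineLayout4 l;
  l.extent = extent;
  l.stride[3] = 1;
  for (int i = 2; i >= 0; --i) l.stride[i] = l.stride[i + 1] * extent[i + 1];
  return l;
}

// Illustrative reduction functors operating on int elements.
struct Plus { int operator()(int a, int b) const { return a + b; } };
struct LogicalAnd { int operator()(int a, int b) const { return (a && b) ? 1 : 0; } };

// Reduces every dimension flagged in `reduced`; flagged dimensions collapse to
// extent 1 in the packed output, so a rank-4 input can yield a rank-3, rank-2,
// rank-1, or scalar result depending on the mask.
template <typename Op>
std::vector<int> reduce(const std::vector<int>& src, const AffineLayout4& layout,
                        const std::array<bool, 4>& reduced, int identity, Op op) {
  std::array<int32_t, 4> out_extent;
  for (int i = 0; i < 4; ++i) out_extent[i] = reduced[i] ? 1 : layout.extent[i];
  const AffineLayout4 out = make_packed(out_extent);
  std::vector<int> dst(static_cast<std::size_t>(out.size()), identity);

  std::array<int32_t, 4> c;
  for (c[0] = 0; c[0] < layout.extent[0]; ++c[0])
  for (c[1] = 0; c[1] < layout.extent[1]; ++c[1])
  for (c[2] = 0; c[2] < layout.extent[2]; ++c[2])
  for (c[3] = 0; c[3] < layout.extent[3]; ++c[3]) {
    std::array<int32_t, 4> oc = c;
    for (int i = 0; i < 4; ++i) if (reduced[i]) oc[i] = 0;  // collapse reduced dims
    const int64_t s = layout.offset(c);
    assert(s >= 0 && s < static_cast<int64_t>(src.size()));
    const std::size_t d = static_cast<std::size_t>(out.offset(oc));
    dst[d] = op(dst[d], src[static_cast<std::size_t>(s)]);
  }
  return dst;
}
```

For example, with a packed 2x3x4x5 tensor holding the values 0 through 119, calling `reduce` with only dimension 3 flagged and `Plus` (identity 0) produces 24 sums, the first of which is 0+1+2+3+4 = 10. Flagging only dimension 0 instead produces 60 sums and leaves the contiguous dimension intact.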
