github NVIDIA/cutlass v2.3.0
CUTLASS 2.3

3 years ago

  • NVIDIA Ampere Architecture features
    • Sparse Tensor Core GEMM kernels:
      • Direct access to Sparse Tensor Cores and maximum performance via mma.sp.sync
    • Fast SGEMM targeting GeForce RTX 30-series CUDA Cores
  • Minor Features:
    • Activation functions such as GeLU and Sigmoid
    • Small matrix and quaternion template classes in device code
    • Floating-point constants
  • NVIDIA Ampere GPU Architecture examples and documentation:
    • Tensor Float 32
    • Sparse Tensor Cores
    • Documentation added on CUTLASS efficient row-major epilogue
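
As a rough illustration of the activation functions named under Minor Features, the math behind Sigmoid and GeLU can be sketched in plain host-side C++. This is a minimal reference sketch, not CUTLASS code: the names `sigmoid_ref` and `gelu_ref` are hypothetical, no CUTLASS headers are used, and GeLU is written in its exact erf form, which may differ from whatever approximation a given kernel applies.

```cpp
#include <cmath>

// Hypothetical reference helpers for illustration only; these are NOT the
// CUTLASS functors, just the underlying scalar math.

// Sigmoid: 1 / (1 + exp(-x)).
inline float sigmoid_ref(float x) {
    return 1.0f / (1.0f + std::exp(-x));
}

// GeLU, exact form: 0.5 * x * (1 + erf(x / sqrt(2))).
inline float gelu_ref(float x) {
    const float kInvSqrt2 = 0.70710678118654752f;  // 1 / sqrt(2)
    return 0.5f * x * (1.0f + std::erf(x * kInvSqrt2));
}
```

In a GEMM epilogue, a function of this kind would typically be applied element-wise to the accumulator output, for example `gelu_ref(alpha * acc + beta * c)`, where `alpha`, `acc`, `beta`, and `c` are placeholder names for the usual linear-combination terms.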