github ggml-org/llama.cpp b8355

2 hours ago

cuda : add RDNA4-specific MMVQ parameter table for bs=1 decode (#19478)

  • mmvq: add RDNA3/RDNA4-specific parameter table (nwarps=8, rows=1)

  • mmvq: add dedicated RDNA3 parameter table

  • mmvq: exclude RDNA3.5 (gfx1150/1151) from RDNA3 table
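As a rough illustration of the table selection these commits describe, the sketch below maps a gfx target name to a parameter table. It is not the llama.cpp source: `MmvqTable`, `MmvqParams`, `select_table`, and `rdna4_bs1_params` are hypothetical names, and the prefix matching assumes the usual AMD naming (gfx11xx for RDNA3, gfx12xx for RDNA4). The only values taken from the notes are nwarps=8 and rows=1 from the first commit message, and the exclusion of gfx1150/gfx1151 from the RDNA3 table.

```cpp
#include <string>

// Hypothetical types for illustration; not the actual llama.cpp definitions.
enum class MmvqTable { Generic, Rdna3, Rdna4 };

struct MmvqParams {
    int nwarps;          // warps launched per block
    int rows_per_block;  // output rows computed per block
};

// Pick a parameter table from a gfx target name such as "gfx1201".
// Assumption: gfx12xx is RDNA4 and gfx11xx is RDNA3, except the RDNA3.5
// parts named in the release notes (gfx1150/gfx1151), which fall back to
// the generic table.
inline MmvqTable select_table(const std::string & gfx) {
    if (gfx == "gfx1150" || gfx == "gfx1151") {
        return MmvqTable::Generic;
    }
    if (gfx.rfind("gfx12", 0) == 0) {  // starts with "gfx12"
        return MmvqTable::Rdna4;
    }
    if (gfx.rfind("gfx11", 0) == 0) {  // starts with "gfx11"
        return MmvqTable::Rdna3;
    }
    return MmvqTable::Generic;
}

// Batch-size-1 decode parameters quoted in the first commit message
// (nwarps=8, rows=1). Values for the other tables are not given in the
// notes, so they are deliberately left out here.
inline MmvqParams rdna4_bs1_params() {
    return MmvqParams{8, 1};
}
```

Under these assumptions, `select_table("gfx1201")` picks the RDNA4 table, while `select_table("gfx1151")` falls through to the generic one.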

