ggml-org/llama.cpp b8011

Add a workaround for compilation with ROCWMMA_FATTN and gfx9 (#19461)

There is an upstream problem [1] with AMD's LLVM 22 fork and
rocWMMA 2.2.0 that causes compilation failures on devices without
native fp16 support (CDNA devices).

The specialized types aren't resolved properly:

/opt/rocm/include/rocwmma/internal/mfma_impl.hpp:2549:37: error: ambiguous partial specializations of 'amdgcn_mfma<__half, __half, __half, 16, 16, 16>'
 2549 |             using ARegsT = typename Impl::ARegsT;

Add a workaround that explicitly declares the types and casts when
compiling with HIP and ROCWMMA_FATTN [2]. Once the issue is fixed
upstream, version guards can be added so that the workaround only
applies to the affected rocWMMA releases.
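For illustration, a minimal sketch of the general shape of such a
workaround, using rocWMMA's standard fragment API (fragment,
load_matrix_sync, mma_sync, store_matrix_sync). The guard macros,
the half_t alias, and the kernel are assumptions for this sketch,
not the code merged in [2]:

    // Illustrative only: the guard macros and aliases below are an
    // assumption for this sketch, not the code merged in [2].
    #if defined(__HIP_PLATFORM_AMD__) && defined(GGML_HIP_ROCWMMA_FATTN)
    #include <hip/hip_fp16.h>
    #include <rocwmma/rocwmma.hpp>

    // Spell the fragment element type out explicitly instead of using
    // __half, so the compiler never has to resolve an ambiguous
    // amdgcn_mfma<__half, ...> partial specialization.
    using half_t = _Float16;
    using frag_a = rocwmma::fragment<rocwmma::matrix_a, 16, 16, 16,
                                     half_t, rocwmma::row_major>;
    using frag_b = rocwmma::fragment<rocwmma::matrix_b, 16, 16, 16,
                                     half_t, rocwmma::col_major>;
    using frag_c = rocwmma::fragment<rocwmma::accumulator, 16, 16, 16,
                                     half_t>;

    // Cast the existing __half buffers at the load/store boundary;
    // the kernel's public interface stays __half.
    __global__ void mma_16x16x16(const __half * A, const __half * B,
                                 __half * C, int lda, int ldb, int ldc) {
        frag_a a;
        frag_b b;
        frag_c c;
        rocwmma::fill_fragment(c, half_t(0));
        rocwmma::load_matrix_sync(a, reinterpret_cast<const half_t *>(A), lda);
        rocwmma::load_matrix_sync(b, reinterpret_cast<const half_t *>(B), ldb);
        rocwmma::mma_sync(c, a, b, c);
        rocwmma::store_matrix_sync(reinterpret_cast<half_t *>(C), c, ldc,
                                   rocwmma::mem_row_major);
    }
    #endif

Casting only at the load/store boundary keeps the surrounding __half
interfaces unchanged, while the MFMA selection path only ever sees the
explicitly declared element type.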

Link: ROCm/rocm-libraries#4398 [1]
Link: #19269 [2]

Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>

Prebuilt binaries: macOS/iOS, Linux, Windows, openEuler.
