ggml-org/llama.cpp b8361


ggml/hip: fix APU compatibility - soft error handling for hipMemAdviseSetCoarseGrain (#20536)

  • ggml/hip: fix APU compatibility - soft error handling for hipMemAdviseSetCoarseGrain

On AMD APU/iGPU devices, which use a unified memory architecture (UMA),
hipMemAdviseSetCoarseGrain returns hipErrorInvalidValue because the hint does
not apply to UMA systems. The previous CUDA_CHECK() call treated this as a
fatal error, causing crashes on APU systems such as AMD Strix Halo (gfx1151).

Fix: treat hipMemAdviseSetCoarseGrain as an optional performance hint; call it
without error checking and clear any resulting error with hipGetLastError(),
as sketched below.
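
The resulting pattern looks roughly like the following (a minimal sketch, not
the verbatim llama.cpp code; the helper name set_coarse_grain_hint is
hypothetical):

```cpp
#include <hip/hip_runtime.h>

// Issue the coarse-grain advice best-effort. On UMA/APU devices the call
// fails with hipErrorInvalidValue; that is expected and must not abort.
static void set_coarse_grain_hint(void * ptr, size_t size, int device) {
    // Optional performance hint: deliberately not wrapped in CUDA_CHECK().
    (void) hipMemAdvise(ptr, size, hipMemAdviseSetCoarseGrain, device);
    // Clear the sticky error state so the next error-checked HIP call
    // does not inherit the failure from this hint.
    (void) hipGetLastError();
}
```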

Also add pre-allocation debug logging (GGML_LOG_DEBUG) to help diagnose memory
issues on APU systems, and store totalGlobalMem in device info.
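
A minimal sketch of that bookkeeping, with a hypothetical device_info struct
standing in for ggml's actual device-info type and fprintf standing in for
GGML_LOG_DEBUG so the example stays self-contained:

```cpp
#include <hip/hip_runtime.h>
#include <cstdio>

struct device_info {          // hypothetical stand-in for ggml's device info
    size_t total_global_mem;  // bytes, taken from hipDeviceProp_t
};

static device_info query_device_info(int device) {
    hipDeviceProp_t prop{};
    (void) hipGetDeviceProperties(&prop, device);
    return device_info{ prop.totalGlobalMem };
}

// Log the requested size against the device total before allocating, so
// allocation failures on APU systems can be diagnosed from debug logs.
static void log_before_alloc(const device_info & info, size_t nbytes) {
    fprintf(stderr, "pre-alloc: requesting %zu bytes of %zu total\n",
            nbytes, info.total_global_mem);
}
```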

Context: AMD APUs on Windows are affected by a ROCm runtime bug that limits
hipMallocManaged to ~64GB regardless of available system RAM. A fix has been
submitted upstream: ROCm/rocm-systems#4077

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

  • ggml/hip: remove unrelated changes, keep only hipMemAdviseSetCoarseGrain fix

Co-authored-by: moonshadow-25 <moonshadow-25@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

