ggml-webgpu: fix compiler warnings and refactor FlashAttention encoding (#21052)
- Update workflows to remove dependence on llvmpipe
- Try setting Dawn_DIR
- remove c++20 initializers
- Move to proper guid
- Try avoiding segfaults on vulkan backend process exit
- Remove compiler warnings on parameter casting
- Fix soft_max and update reg_tile accumulation to f32 for better precision
- Refactor flash_attn a bit
- remove c++20 initializers and format
- Increase div precision for NVIDIA
- revert div precision and comment out ggml-ci node for now
- Formatting
- Try debugging on a failing CI node
- Revert "Try debugging on a failing CI node"
  This reverts commit 1971e33.
macOS/iOS:
- macOS Apple Silicon (arm64)
- macOS Apple Silicon (arm64, KleidiAI enabled)
- macOS Intel (x64)
- iOS XCFramework
Linux:
- Ubuntu x64 (CPU)
- Ubuntu arm64 (CPU)
- Ubuntu s390x (CPU)
- Ubuntu x64 (Vulkan)
- Ubuntu arm64 (Vulkan)
- Ubuntu x64 (ROCm 7.2)
- Ubuntu x64 (OpenVINO)
Android:
Windows:
- Windows x64 (CPU)
- Windows arm64 (CPU)
- Windows x64 (CUDA 12) - CUDA 12.4 DLLs
- Windows x64 (CUDA 13) - CUDA 13.1 DLLs
- Windows x64 (Vulkan)
- Windows x64 (SYCL)
- Windows x64 (HIP)
openEuler: