ggml-org/llama.cpp — release b8891 (latest release: b8892, 3 hours ago)

ggml-webgpu: Add fused RMS_NORM + MUL (#21983)

  • Fuse RMS_NORM + MUL into a single kernel.

  • Add GGML_WEBGPU_DISABLE_FUSION to allow disabling kernel fusion.

  • Decouple num_fused_ops from webgpu_context; miscellaneous cleanup.

  • Fix eps handling and remove disable_fusion.

  • Avoid C++20 initializers.
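The fusion above combines two element-wise passes into one. A minimal sketch of the arithmetic being fused, in plain Python rather than the actual WebGPU kernel (function names here are illustrative, not from the llama.cpp codebase): RMS_NORM divides each element by the root mean square of the row (plus eps), and MUL then scales the result by a weight vector; the fused version folds the multiply into the normalization loop so the data is read and written once.

```python
import math

def rms_norm(x, eps=1e-6):
    # RMS normalization: x / sqrt(mean(x^2) + eps)
    mean_sq = sum(v * v for v in x) / len(x)
    scale = 1.0 / math.sqrt(mean_sq + eps)
    return [v * scale for v in x]

def rms_norm_mul_unfused(x, w, eps=1e-6):
    # Two passes: normalize, then element-wise multiply by the weight.
    normed = rms_norm(x, eps)
    return [n * wi for n, wi in zip(normed, w)]

def rms_norm_mul_fused(x, w, eps=1e-6):
    # One pass: apply the normalization scale and the weight together,
    # as a fused kernel would, avoiding an intermediate tensor.
    mean_sq = sum(v * v for v in x) / len(x)
    scale = 1.0 / math.sqrt(mean_sq + eps)
    return [v * scale * wi for v, wi in zip(x, w)]
```

Both variants compute the same result; the fused form matters on GPU backends because it eliminates one round trip through memory for the intermediate normalized tensor.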

Prebuilt binaries are available for macOS/iOS, Linux, Android, Windows, and openEuler.
