github ggml-org/llama.cpp b7962

ggml-webgpu: JIT compile binary operators and handle binding overlaps (#19310)

  • ggml webgpu: port binary operators to use pre-wgsl

  • Add binary.wgsl: unified shader with conditionals for all 4 ops

  • Add gen_binary_shaders.cpp: build tool for using pre_wgsl preprocessor

  • Remove bin_op.tmpl.wgsl and the Python-templated binary.wgsl (replaced by the new unified shader)

  • Update CMake to generate binary operator shaders at build time

  • ggml-webgpu: migrate binary ops to JIT compilation with overlap handling

  • port binary operators from ahead-of-time (AOT) compilation to pre-wgsl JIT compilation

  • add src1=dst overlap handling for binary ops

  • use compile-time workgroup size defines instead of runtime overrides

  • ggml-webgpu: complete overlap handling for binary ops

  • add support for the in-place and overlap cases in binding setup

  • restructure conditional logic to handle all overlap cases

  • ensure all buffer bindings are correctly assigned for edge cases

  • ggml-webgpu: remove unused binary overlap cases

Remove src0==src1 binary overlap case that never occurs in practice.

  • keep INPLACE (src0==dst), OVERLAP (src1==dst), DEFAULT

  • remove unused src0==src1 and all-same variant

  • refactor wgsl to eliminate duplication

Prebuilt binaries: macOS/iOS, Linux, Windows, openEuler.
