ggml-org/llama.cpp b7851 on GitHub

ggml webgpu: Split shared state (webgpu_context) into global state and per-thread state (#18976)

Squashed commit of the following:

commit b3c6bf4
Author: Abhijit Ramesh abhijitramesh2k@gmail.com
Date: Mon Dec 1 18:29:00 2025 -0800

ggml webgpu: fix xielu parameter passing (#11)

The XIELU operation was incorrectly using static_cast to convert
float parameters to uint32_t, which converted numeric values instead
of preserving IEEE 754 bit patterns. This caused incorrect values
to be interpreted by the GPU shader.

* Use reinterpret_cast to preserve float bit patterns when passing
  through uint32_t params buffer
* Update WGSL shader parameter types from u32 to f32
* Re-enable XIELU support (was disabled due to numerical issues)

Fixes NMSE test failures for XIELU operation on WebGPU backend.
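
The difference between the two conversions can be sketched as follows. This is a minimal illustration, not the backend's code: the helper names are hypothetical, and `std::memcpy` is used here as a portable stand-in for the `reinterpret_cast` mentioned in the commit message.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Numeric conversion: the value is truncated toward zero, so 0.5f becomes 0
// and the fractional part is lost before it ever reaches the shader.
inline uint32_t as_u32_numeric(float v) {
    return static_cast<uint32_t>(v);
}

// Bit-preserving conversion: the four bytes of the float are copied unchanged
// into a uint32_t slot (hypothetical helper, assumed for illustration).
inline uint32_t as_u32_bits(float v) {
    static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");
    uint32_t bits;
    std::memcpy(&bits, &v, sizeof(bits));
    return bits;
}

// Reverse direction: roughly what a shader sees when it declares the field as f32.
inline float as_f32_bits(uint32_t bits) {
    float v;
    std::memcpy(&v, &bits, sizeof(v));
    return v;
}
```

With these helpers, `as_u32_numeric(0.5f)` yields `0`, whereas `as_f32_bits(as_u32_bits(0.5f))` round-trips back to `0.5f`, which is why the shader-side type also has to be `f32`.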

commit 5ca9b5e
Author: neha-ha 137219201+neha-ha@users.noreply.github.com
Date: Tue Nov 18 12:17:00 2025 -0800

Refactored pipelines and workgroup calculations (#10)

* refactored pipelines

* refactored workgroup calculation

* removed commented out block of prior maps

* Clean up ceiling division pattern
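
A ceiling-division helper of the kind this commit refers to might look like the sketch below. The function names and the workgroup-count use are assumptions for illustration; the source does not show the actual helper.

```cpp
#include <cassert>
#include <cstdint>

// Ceiling division for unsigned values: the smallest q such that q * d >= n.
// Written as quotient plus remainder check so that n + d - 1 cannot overflow.
inline uint32_t ceil_div(uint32_t n, uint32_t d) {
    assert(d != 0);
    return n / d + (n % d != 0 ? 1u : 0u);
}

// Hypothetical use: number of workgroups needed so that every element is
// covered when each workgroup processes wg_size elements.
inline uint32_t workgroups_for(uint32_t n_elements, uint32_t wg_size) {
    return ceil_div(n_elements, wg_size);
}
```

For example, 1000 elements with a workgroup size of 256 need 4 workgroups, and 1025 elements need 5.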

---------

Co-authored-by: Neha Abbas <nehaabbas@eduroam-169-233-141-223.ucsc.edu>
Co-authored-by: Reese Levine <reeselevine1@gmail.com>

Author: James Contini jamescontini@gmail.com
Date: Wed Oct 29 23:13:06 2025 -0700

formatted embed wgsl and ggml-webgpu.cpp

commit e1f6bae
Author: James Contini jamescontini@gmail.com
Date: Wed Oct 29 23:08:37 2025 -0700

implemented REPL_Template support and removed bug in unary operators kernel

commit 8c70b8f
Author: James Contini jamescontini@gmail.com
Date: Wed Oct 15 16:14:20 2025 -0700

responded and dealt with PR comments

commit f9282c6
Author: James Contini jamescontini@gmail.com
Date: Sun Oct 12 13:41:41 2025 -0700

removed unnecessary check for whether node->src[1] exists for unary operators

commit 4cf28d7
Author: James Contini jamescontini@gmail.com
Date: Sun Oct 12 13:32:45 2025 -0700

All operators (including xielu) working

commit 74c6add
Author: James Contini jamescontini@gmail.com
Date: Fri Oct 10 13:16:48 2025 -0700

fixed autoconfig

commit 3627499
Author: James Contini jamescontini@gmail.com
Date: Fri Oct 10 13:10:46 2025 -0700

removed vestigial files

commit cb08583
Author: James Contini jamescontini@gmail.com
Date: Fri Oct 10 12:59:32 2025 -0700

abides by editor-config

commit 5360e28
Author: James Contini jamescontini@gmail.com
Date: Fri Oct 10 12:45:57 2025 -0700

rms_norm double declaration bug atoned

commit 7b09baa
Merge: 8a6ec84 74b8fc1
Author: James Contini jamescontini@gmail.com
Date: Fri Oct 10 11:50:03 2025 -0700

resolving merge conflicts

commit 8a6ec84
Author: James Contini jamescontini@gmail.com
Date: Wed Oct 8 18:06:47 2025 -0700

unary operators pass ggml tests

commit c3ae382
Author: James Contini jamescontini@gmail.com
Date: Wed Oct 1 16:22:40 2025 -0700

neg passes backend test

commit aa1c9b2
Author: James Contini jamescontini@gmail.com
Date: Tue Sep 30 23:55:27 2025 -0700

neg f16xf32xip builds and runs, haven't actually run a model that uses the neg kernel yet though

Co-authored-by: James Contini jamescontini@gmail.com
Co-authored-by: Neha Abbas neabbas@ucsc.edu
Co-authored-by: Abhijit Ramesh abhijitramesh2k@gmail.com

* Remove extra code and format
* Add ops documentation (finally)
* ggml webgpu: add SOFTPLUS unary operator

  Implements SOFTPLUS (log(1 + exp(x))) with f16/f32 support. Uses f32
  precision for intermediate calculations to prevent f16 overflow.

  * Add shader implementation and 4 variants (f32/f16, inplace/non-inplace)
  * Register pipelines and device support
  * Follow Vulkan backend numerical stability pattern
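
The overflow concern can be sketched on the host side in C++. The largest finite f16 value is 65504, so exp(x) no longer fits in f16 once x exceeds roughly 11, even though the final result is close to x. The threshold of 20 and the function name below are assumptions for illustration; the source only says the shader follows the Vulkan backend's stability pattern.

```cpp
#include <cassert>
#include <cmath>

// Softplus log(1 + exp(x)), evaluated in f32.
// For large x the result is numerically indistinguishable from x, so x is
// returned directly instead of computing exp(x). The cutoff of 20 is an
// assumed value, not taken from the source.
inline float softplus_f32(float x) {
    const float threshold = 20.0f;
    if (x > threshold) {
        return x;
    }
    return std::log1p(std::exp(x));
}
```

An f16 variant under this scheme would widen its input to f32, call the function, and narrow the result back to f16 on store.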
* ggml webgpu: add EXPM1 unary operator

  Implements EXPM1 (exp(x) - 1) with f16/f32 support.

  * Add shader implementation and 4 variants (f32/f16, inplace/non-inplace)
  * Register pipelines and device support
* ggml webgpu: add FLOOR unary operator

  Implements FLOOR (rounds down to nearest integer) with f16/f32 support.

  * Add shader implementation and 4 variants (f32/f16, inplace/non-inplace)
  * Register pipelines and device support
* ggml webgpu: add CEIL unary operator

  Implements CEIL (rounds up to nearest integer) with f16/f32 support.

  * Add shader implementation and 4 variants (f32/f16, inplace/non-inplace)
  * Register pipelines and device support
* ggml webgpu: add ROUND unary operator

  Implements ROUND (rounds to nearest integer) with f16/f32 support.

  * Add shader implementation and 4 variants (f32/f16, inplace/non-inplace)
  * Register pipelines and device support
* ggml webgpu: add TRUNC unary operator

  Implements TRUNC (truncates towards zero) with f16/f32 support.

  * Add shader implementation and 4 variants (f32/f16, inplace/non-inplace)
  * Register pipelines and device support
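
The four rounding operators above differ mainly in how they treat negative inputs. The sketch below uses the C++ standard library as a reference, assuming the WebGPU operators follow the same semantics for non-tie inputs; the struct and function names are hypothetical. Halfway cases such as 2.5 are left out on purpose, because the source does not state the tie-breaking rule for ROUND.

```cpp
#include <cassert>
#include <cmath>

// Results of the four rounding modes for one input (illustrative type).
struct Rounded {
    float floor_v;  // largest integer <= x
    float ceil_v;   // smallest integer >= x
    float round_v;  // nearest integer
    float trunc_v;  // fractional part dropped, i.e. rounded toward zero
};

inline Rounded round_all(float x) {
    return { std::floor(x), std::ceil(x), std::round(x), std::trunc(x) };
}
```

For x = -1.7 this gives floor -2, ceil -1, round -2, and trunc -1; for x = 1.2 it gives floor 1, ceil 2, round 1, and trunc 1.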
* docs : update WebGPU support for unary operators (FLOOR, CEIL, ROUND, TRUNC, EXPM1, SOFTPLUS)
* Updates to webgpu get_memory
* Move shared state (webgpu_context) and device creation out of registration context, device context, and buffer context, and move into backend context
* Small cleanup
* Move Instance, Device, Adapter, Device creation, and capabilities to global state while moving Queue, pipelines, and buffers to per-thread state.
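
The split described above can be pictured with a small C++ sketch. Everything here is hypothetical: the handle types are empty placeholders rather than real WebGPU objects, and the struct and member names are assumptions that do not come from ggml-webgpu.cpp. It only illustrates the ownership shape, with one shared global state and several per-thread states that reference it.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <string>
#include <unordered_map>
#include <vector>

// Placeholder handle types standing in for WebGPU objects (NOT the real API).
struct Instance {};
struct Adapter {};
struct Device {};
struct Queue {};
struct Pipeline {};
struct Buffer {};

// Created once and shared: objects that are expensive or must be unique.
struct GlobalState {
    Instance   instance;
    Adapter    adapter;
    Device     device;
    std::mutex staging_mutex;  // guards the shared staging buffer
    Buffer     staging_buf;
};

// One per thread: objects that are mutated while encoding and submitting work.
struct ThreadState {
    std::shared_ptr<GlobalState>              global;      // non-exclusive reference
    Queue                                     queue;
    std::unordered_map<std::string, Pipeline> pipelines;
    std::vector<Buffer>                       param_bufs;
};

// Builds a per-thread state that points at the shared global state.
inline ThreadState make_thread_state(const std::shared_ptr<GlobalState> & g) {
    ThreadState t;
    t.global = g;
    return t;
}
```

Under this layout, two threads would each call `make_thread_state` with the same `GlobalState` pointer: they share the device and staging mutex but never touch each other's pipelines or parameter buffers.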
* Cleanups
* More cleanup
* Move staging_buf mutex to global context
* Resolve merge
* Resolve merge
* Resolve merge
* Clean up merge errors, delete forward declaration, and run clang-format
* Rename device_init to backend_init
* Move webgpu_context to backend_context
* Move buffer context members into global context and refactor function calls
* Run clang-format
* Remove comments
* Move parameter buffers to per-thread, add single memset_tensor param buf
* Fix CI compilation issue
* Fix builds for emscripten not supporting subgroups
* cleanup
* cleanup

Co-authored-by: Reese Levine reeselevine1@gmail.com