Warning
Release Format Update: Linux releases will soon use .tar.gz archives instead of .zip. Please make the necessary changes to your deployment scripts.
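For deployment scripts that must keep working across the switch, one option is to pick the extraction routine from the archive's file name. The following is a minimal sketch, not part of the release tooling: the function name `extract_release` and the archive names in the usage note are hypothetical, and it assumes your script already downloads the archive to a local path.

```python
import tarfile
import zipfile
from pathlib import Path


def extract_release(archive: Path, dest: Path) -> list[str]:
    """Unpack a release archive into dest and return the member names.

    Hypothetical helper: handles both the older .zip format and the
    newer .tar.gz format by looking at the file name.
    """
    dest.mkdir(parents=True, exist_ok=True)
    name = archive.name.lower()

    if name.endswith((".tar.gz", ".tgz")):
        with tarfile.open(archive, mode="r:gz") as tar:
            members = tar.getnames()
            # Use the safer "data" extraction filter when this Python
            # version provides it; fall back to the default otherwise.
            if hasattr(tarfile, "data_filter"):
                tar.extractall(dest, filter="data")
            else:
                tar.extractall(dest)
        return members

    if name.endswith(".zip"):
        with zipfile.ZipFile(archive) as zf:
            members = zf.namelist()
            zf.extractall(dest)
        return members

    raise ValueError(f"unsupported archive format: {archive.name}")
```

A script could then call `extract_release(Path("release.tar.gz"), Path("bin"))` or `extract_release(Path("release.zip"), Path("bin"))` without caring which format a given release ships; both file names here are placeholders.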
ggml webgpu: add support for emscripten builds (#17184)
- Faster tensors (#8)
Add fast matrix and matrix/vector multiplication.
- Use map for shader replacements instead of pair of strings
- Wasm (#9)
- webgpu : fix build on emscripten
- more debugging stuff
- test-backend-ops: force single thread on wasm
- fix single-thread case for init_tensor_uniform
- use jspi
- add pthread
- test: remember to set n_thread for cpu backend
- Add buffer label and enable dawn-specific toggles to turn off some checks
- Intermediate state
- Fast working f16/f32 vec4
- Working float fast mul mat
- Clean up naming of mul_mat to match logical model, start work on q mul_mat
- Setup for subgroup matrix mat mul
- Basic working subgroup matrix
- Working subgroup matrix tiling
- Handle weirder sg matrix sizes (but still % sg matrix size)
- Working start to gemv
- working f16 accumulation with shared memory staging
- Print out available subgroup matrix configurations
- Vectorize dst stores for sg matrix shader
- Gemv working scalar
- Minor set_rows optimization (#4)
- updated optimization, fixed errors
- non vectorized version now dispatches one thread per element
- Simplify
- Change logic for set_rows pipelines
Co-authored-by: Neha Abbas nehaabbas@macbookpro.lan
Co-authored-by: Neha Abbas nehaabbas@ReeseLevines-MacBook-Pro.local
Co-authored-by: Reese Levine reeselevine1@gmail.com
- Comment on dawn toggles
- Working subgroup matrix code for (semi)generic sizes
- Remove some comments
- Cleanup code
- Update dawn version and move to portable subgroup size
- Try to fix new dawn release
- Update subgroup size comment
- Only check for subgroup matrix configs if they are supported
- Add toggles for subgroup matrix/f16 support on nvidia+vulkan
- Make row/col naming consistent
- Refactor shared memory loading
- Move sg matrix stores to correct file
- Working q4_0
- Formatting
- Work with emscripten builds
- Fix test-backend-ops emscripten for f16/quantized types
- Use emscripten memory64 to support get_memory
- Add build flags and try ci
Co-authored-by: Xuan Son Nguyen son@huggingface.co
- Remove extra whitespace
- Move wasm single-thread logic out of test-backend-ops for cpu backend
- Disable multiple threads for emscripten single-thread builds in ggml_graph_plan
- Fix .gitignore
- Add memory64 option and remove unneeded macros for setting threads to 1
Co-authored-by: Xuan Son Nguyen son@huggingface.co
macOS/iOS:
Linux:
Windows:
- Windows x64 (CPU)
- Windows arm64 (CPU)
- Windows x64 (CUDA)
- Windows x64 (Vulkan)
- Windows x64 (SYCL)
- Windows x64 (HIP)
openEuler: