ggml-org/llama.cpp b8607


ggml webgpu: quantized buffers to u32 + wider browser/device support (#21046)

  • Work towards removing bitcast

  • Move rest of existing types over

  • Add timeout back to wait and remove synchronous set_tensor/memset_tensor

  • Move to unpackf16 for wider compatibility (see the sketch after this list)

  • cleanup

  • Remove deadlock condition in free_bufs
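In practice, the title change and the unpackf16 item amount to binding quantized/packed tensor data to the WGSL shaders as a plain array<u32> and widening half-precision values with the core built-in unpack2x16float, rather than bitcasting to f16 types, which requires the optional shader-f16 feature that many browsers and adapters do not expose. The WGSL below is a minimal illustrative sketch of that pattern, not the shader code from #21046; the buffer layout and the helper names load_f16_pair and load_q4 are hypothetical, and unpackf16 is assumed to be a helper built on unpack2x16float.

```wgsl
// Hypothetical sketch, not the shader from #21046.
// Quantized/packed tensor data is bound as a plain u32 array, so the shader
// compiles on devices that do not expose the optional shader-f16 feature.
@group(0) @binding(0) var<storage, read> src: array<u32>;

// Two half-precision values packed into one 32-bit word, widened to f32 with
// the core built-in unpack2x16float (available on every WebGPU device).
fn load_f16_pair(word_idx: u32) -> vec2<f32> {
    return unpack2x16float(src[word_idx]);
}

// The bitcast alternative needs `enable f16;` plus the shader-f16 feature:
//   let pair: vec2<f16> = bitcast<vec2<f16>>(src[word_idx]);

// 4-bit quantized weights, packed eight to a word, are extracted from the
// same u32 view with shifts and masks.
fn load_q4(word_idx: u32, lane: u32) -> u32 {
    return (src[word_idx] >> (4u * lane)) & 0xFu;
}
```

Because base WGSL has no 8-bit storage types and exposes f16 only behind an extension, reading everything through a u32 view like this is what makes the quantized block formats addressable on the widest range of devices.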

Prebuilt binaries for this release are published for macOS/iOS, Linux, Windows, and openEuler.
