ggml-org/llama.cpp b8698


ggml-webgpu: parameterize submission size and add iOS-specific limits (#21533)

  • Work towards removing bitcast

  • Move rest of existing types over

  • Add timeout back to wait and remove synchronous set_tensor/memset_tensor

  • Move to unpackf16 for wider compatibility

  • Cleanup

  • Remove deadlock condition in free_bufs

  • Start work on removing parameter buffer pools

  • Simplify and optimize further

  • Simplify profile futures

  • Fix stride

  • Try using a single command buffer per batch

  • Formatting

  • Add parameters for different browsers' in-flight submissions (see the sketches after this list)

  • Update handling of batch size too

  • Throttle iOS as much as possible

  • Increase timeout for llvmpipe testing
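
To illustrate the idea behind parameterized submission sizes and tighter iOS limits, here is a minimal, hypothetical sketch of per-platform throttling of in-flight queue submissions. None of the names or numbers below (webgpu_submit_ctx, webgpu_acquire_submit_slot, the limit values) are taken from the actual ggml-webgpu backend; they only show the general shape of such a mechanism.

```cpp
// Hypothetical sketch of per-platform in-flight submission limits; names and
// values are illustrative, not the real ggml-webgpu interface.
#include <condition_variable>
#include <mutex>

#if defined(__APPLE__)
#include <TargetConditionals.h>
#endif

struct webgpu_submit_ctx {
    std::mutex              mtx;
    std::condition_variable cv;
    int inflight     = 0;  // submissions handed to the queue but not yet completed
    int max_inflight = 1;  // parameterized per platform / browser
    int batch_size   = 1;  // ops encoded into one command buffer before submitting
};

// Illustrative limits: keep iOS as constrained as possible, allow more elsewhere.
static void webgpu_init_submit_limits(webgpu_submit_ctx & ctx) {
#if defined(__APPLE__) && TARGET_OS_IPHONE
    ctx.max_inflight = 1;
    ctx.batch_size   = 8;
#else
    ctx.max_inflight = 4;
    ctx.batch_size   = 64;
#endif
}

// Block until a submission slot is free, then claim it.
static void webgpu_acquire_submit_slot(webgpu_submit_ctx & ctx) {
    std::unique_lock<std::mutex> lock(ctx.mtx);
    ctx.cv.wait(lock, [&] { return ctx.inflight < ctx.max_inflight; });
    ctx.inflight++;
}

// Called from the queue's work-done callback to release the slot.
static void webgpu_release_submit_slot(webgpu_submit_ctx & ctx) {
    {
        std::lock_guard<std::mutex> lock(ctx.mtx);
        ctx.inflight--;
    }
    ctx.cv.notify_one();
}
```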
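
The timeout added back to the wait path bounds how long the backend blocks on asynchronous work while still leaving headroom for slow software rasterizers such as llvmpipe. A small sketch of a bounded wait is below; the future-based interface and the wait_ms parameter are assumptions for illustration, not the backend's real API.

```cpp
// Hypothetical sketch of a bounded wait on queue completion; the future-based
// interface is an assumption, not the backend's real one.
#include <chrono>
#include <cstdio>
#include <future>

static bool webgpu_wait_done(std::future<void> & done, int wait_ms) {
    // A generous timeout tolerates slow software rasterizers (e.g. llvmpipe in CI)
    // while still surfacing genuine hangs instead of blocking forever.
    if (done.wait_for(std::chrono::milliseconds(wait_ms)) == std::future_status::timeout) {
        std::fprintf(stderr, "ggml-webgpu: work did not complete within %d ms\n", wait_ms);
        return false;
    }
    return true;
}
```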

Prebuilt binaries are attached for macOS/iOS, Linux, Windows, and openEuler.
