ggml-org/llama.cpp b8698


ggml-webgpu: parameterize submission size and add iOS-specific limits (#21533)

  • Work towards removing bitcast

  • Move rest of existing types over

  • Add timeout back to wait and remove synchronous set_tensor/memset_tensor

  • Move to unpackf16 for wider compatibility

  • Cleanup

  • Remove deadlock condition in free_bufs

  • Start work on removing parameter buffer pools

  • Simplify and optimize further

  • Simplify profile futures

  • Fix stride

  • Try using a single command buffer per batch

  • Formatting

  • Add parameters for different browsers' in-flight submissions (see the sketches after this list)

  • Update handling of batch size too

  • Throttle iOS as much as possible

  • Increase timeout for llvmpipe testing
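
To illustrate the idea behind parameterized submission sizes and tighter iOS limits, here is a minimal, hypothetical sketch of per-platform throttling of in-flight queue submissions. None of the names or numbers below (webgpu_submit_ctx, webgpu_acquire_submit_slot, the limit values) are taken from the actual ggml-webgpu backend; they only show the general shape of such a mechanism.

```cpp
// Hypothetical sketch of per-platform in-flight submission limits; names and
// values are illustrative, not the real ggml-webgpu interface.
#include <condition_variable>
#include <mutex>

#if defined(__APPLE__)
#include <TargetConditionals.h>
#endif

struct webgpu_submit_ctx {
    std::mutex              mtx;
    std::condition_variable cv;
    int inflight     = 0;  // submissions handed to the queue but not yet completed
    int max_inflight = 1;  // parameterized per platform / browser
    int batch_size   = 1;  // ops encoded into one command buffer before submitting
};

// Illustrative limits: keep iOS as constrained as possible, allow more elsewhere.
static void webgpu_init_submit_limits(webgpu_submit_ctx & ctx) {
#if defined(__APPLE__) && TARGET_OS_IPHONE
    ctx.max_inflight = 1;
    ctx.batch_size   = 8;
#else
    ctx.max_inflight = 4;
    ctx.batch_size   = 64;
#endif
}

// Block until a submission slot is free, then claim it.
static void webgpu_acquire_submit_slot(webgpu_submit_ctx & ctx) {
    std::unique_lock<std::mutex> lock(ctx.mtx);
    ctx.cv.wait(lock, [&] { return ctx.inflight < ctx.max_inflight; });
    ctx.inflight++;
}

// Called from the queue's work-done callback to release the slot.
static void webgpu_release_submit_slot(webgpu_submit_ctx & ctx) {
    {
        std::lock_guard<std::mutex> lock(ctx.mtx);
        ctx.inflight--;
    }
    ctx.cv.notify_one();
}
```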
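
The timeout added back to the wait path bounds how long the backend blocks on asynchronous work while still leaving headroom for slow software rasterizers such as llvmpipe. A small sketch of a bounded wait is below; the future-based interface and the wait_ms parameter are assumptions for illustration, not the backend's real API.

```cpp
// Hypothetical sketch of a bounded wait on queue completion; the future-based
// interface is an assumption, not the backend's real one.
#include <chrono>
#include <cstdio>
#include <future>

static bool webgpu_wait_done(std::future<void> & done, int wait_ms) {
    // A generous timeout tolerates slow software rasterizers (e.g. llvmpipe in CI)
    // while still surfacing genuine hangs instead of blocking forever.
    if (done.wait_for(std::chrono::milliseconds(wait_ms)) == std::future_status::timeout) {
        std::fprintf(stderr, "ggml-webgpu: work did not complete within %d ms\n", wait_ms);
        return false;
    }
    return true;
}
```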

Prebuilt binaries are attached for macOS/iOS, Linux, Windows, and openEuler.
