ggml-org/llama.cpp release b7931


ggml-virtgpu: make the code thread safe (#19204)

  • ggml-virtgpu: regenerate_remoting.py: add the ability to deprecate a function

  • ggml-virtgpu: deprecate buffer_type is_host remoting

This remoting is not necessary.

  • ggml-virtgpu: stop using static vars as cache

The static initialization isn't thread-safe (see the cache sketch after this list).

  • ggml-virtgpu: protect the use of the shared memory to transfer data

  • ggml-virtgpu: make the remote calls thread-safe (see the locking sketch after this list)

  • ggml-virtgpu: backend: don't continue if the tensor memory couldn't be allocated

  • ggml-virtgpu: add a cleanup function for consistency

  • ggml-virtgpu: backend: don't crash if buft->iface.get_max_size is missing (see the guard sketch after this list)

  • fix style and ordering

  • Remove the static variable in apir_device_get_count

  • ggml-virtgpu: improve the logging

  • fix minor formatting changes from review
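
The sketches below illustrate, in broad strokes, the patterns behind the thread-safety commits above; they are illustrative only, and every name in them (apir_context, query_host_device_count, ApirConnection, and so on) is a stand-in rather than the backend's actual code. This first one shows the kind of mutable static cache being removed and a per-context replacement guarded by std::call_once.

```cpp
#include <mutex>

// Racy pattern being removed (illustrative, not the backend's real code):
//
//   static bool cached = false;
//   static int  dev_count;
//   if (!cached) { dev_count = query_host_device_count(); cached = true; }
//   return dev_count;
//
// Two threads can both observe cached == false and race on the writes.

struct apir_context {                 // hypothetical per-backend context
    std::once_flag dev_count_once;
    int            dev_count = 0;
};

static int query_host_device_count() {
    return 1;                         // stand-in for the real remote query
}

// Compute the value once, under std::call_once, instead of caching it in
// a mutable static that several threads may write concurrently.
static int apir_device_get_count(apir_context & ctx) {
    std::call_once(ctx.dev_count_once, [&ctx] {
        ctx.dev_count = query_host_device_count();
    });
    return ctx.dev_count;
}

int main() {
    apir_context ctx;
    return apir_device_get_count(ctx) > 0 ? 0 : 1;
}
```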
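The shared-memory and remote-call commits amount to serializing the guest/host communication path. A minimal sketch of that locking pattern, again with hypothetical names (ApirConnection, shared_buf, remote_call) and a std::vector standing in for the virtio-GPU shared memory:

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <mutex>
#include <vector>

// Hypothetical stand-in for the virtgpu connection state.
struct ApirConnection {
    std::mutex           mtx;        // guards the staging buffer and the call
    std::vector<uint8_t> shared_buf; // guest <-> host shared window (mocked)

    // Stage the payload and issue the remote call under the lock so that
    // concurrent backend threads cannot interleave their writes.
    void remote_call(const void * data, std::size_t size) {
        std::lock_guard<std::mutex> lock(mtx);
        shared_buf.resize(size);
        std::memcpy(shared_buf.data(), data, size);
        // ... notify the host and wait for its reply here ...
    }
};

int main() {
    ApirConnection conn;
    const char msg[] = "tensor data";
    conn.remote_call(msg, sizeof(msg));
    return 0;
}
```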
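The get_max_size fix follows the usual optional-callback guard: treat a missing callback as "no practical limit" instead of calling through a null pointer. The sketch mocks up the interface structs rather than pulling in ggml's real backend headers, so the struct and helper names here are assumptions:

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>

// Mocked-up shape of a backend buffer type; in ggml the real struct and
// its iface live in the backend implementation headers.
struct buft_iface {
    std::size_t (*get_max_size)(void * buft); // optional, may be null
};
struct buffer_type {
    buft_iface iface;
};

// Treat an absent get_max_size callback as "no practical limit" rather
// than crashing on a null function pointer.
static std::size_t buft_max_size(buffer_type * buft) {
    if (buft->iface.get_max_size == nullptr) {
        return SIZE_MAX;
    }
    return buft->iface.get_max_size(buft);
}

int main() {
    buffer_type buft = { { nullptr } };
    std::printf("max size: %zu\n", buft_max_size(&buft));
    return 0;
}
```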

Release assets are provided for macOS/iOS, Linux, Windows, and openEuler.
