github ggml-org/llama.cpp b9319


ggml: gguf_init_from_callback and gguf_init_from_buffer (#22341)

  • ggml: implement gguf_init_from_buffer

  • test: gguf_init_from_buffer

  • fix: memory breakdown for a model loaded with no_alloc from a file is consistent with being loaded from a buffer

  • fix: use GGML_UNUSED

Co-authored-by: Copilot <copilot@github.com>

  • fix: remove total_size from gguf_reader

  • fix: file offset calculation, rename offset to data_offset

Co-authored-by: Copilot <copilot@github.com>

  • refactor: extract model loader bug fixes to another PR

  • feat: add gguf_init_from_callback

  • fix: always require a max expected size

  • fix: change gguf_reader_callback_t's output type to void *, change max_expected_size and offsets to uint64_t

  • fix: harden against offset overflow in buffer read

  • fix: remove seek behavior from the callback

  • feat: max_chunk_read == 0 means SIZE_MAX

  • fix: seeking in a gguf file with no tensors


Co-authored-by: Copilot <copilot@github.com>
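
Several of the fixes listed above (a required maximum expected size, uint64_t offsets, hardening against offset overflow, no seek behavior in the callback, and max_chunk_read == 0 meaning SIZE_MAX) can be illustrated with a small standalone sketch. This is not the ggml API: read_cb_t, mem_source, range_ok, mem_read and read_exact are hypothetical names invented for illustration, and the real signatures of gguf_init_from_buffer and gguf_init_from_callback should be taken from gguf.h. The sketch assumes a stateless callback that receives an absolute offset on every call, which is one way to avoid seeking.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Hypothetical callback type (NOT the ggml signature): copy `size` bytes
// starting at absolute `offset` into `dst`; return the number of bytes copied.
typedef size_t (*read_cb_t)(void * user, uint64_t offset, void * dst, size_t size);

// Hypothetical in-memory source backing the callback.
struct mem_source {
    const uint8_t * data;
    uint64_t        size;
};

// Overflow-safe bounds check: never computes offset + size, which could wrap.
static bool range_ok(uint64_t offset, uint64_t size, uint64_t total) {
    return offset <= total && size <= total - offset;
}

// Buffer-backed callback: all-or-nothing copy, no internal cursor to seek.
static size_t mem_read(void * user, uint64_t offset, void * dst, size_t size) {
    const struct mem_source * src = user;
    if (!range_ok(offset, size, src->size)) {
        return 0; // out of range: report nothing read
    }
    memcpy(dst, src->data + offset, size);
    return size;
}

// Read `size` bytes at `offset` through `cb` in chunks of at most `max_chunk`
// bytes. max_chunk == 0 is treated as SIZE_MAX (no chunking), and
// `max_expected_size` caps the furthest byte the reader may touch.
static bool read_exact(read_cb_t cb, void * user, uint64_t offset, void * dst,
                       size_t size, size_t max_chunk, uint64_t max_expected_size) {
    if (!range_ok(offset, size, max_expected_size)) {
        return false;
    }
    if (max_chunk == 0) {
        max_chunk = SIZE_MAX;
    }
    uint8_t * out = dst;
    while (size > 0) {
        const size_t n = size < max_chunk ? size : max_chunk;
        if (cb(user, offset, out, n) != n) {
            return false; // short read from the source
        }
        offset += n;
        out    += n;
        size   -= n;
    }
    return true;
}
```

The check in range_ok compares size against total - offset rather than testing offset + size <= total, so an offset near UINT64_MAX cannot wrap around and pass. Requiring max_expected_size up front lets the reader reject an oversized request before it ever calls the callback.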
