ggml-org/llama.cpp b9319 on GitHub

ggml: gguf_init_from_callback and gguf_init_from_buffer (#22341)

ggml: implement gguf_init_from_buffer
test: gguf_init_from_buffer
fix: memory breakdown for a model loaded with no_alloc from a file is consistent with being loaded from a buffer
fix: use GGML_UNUSED

Co-authored-by: Copilot <copilot@github.com>

refactor: extract model loader bug fixes to another PR
feat: add gguf_init_from_callback
fix: always require a max expected size
fix: change gguf_reader_callback_t's output type to void *, change max_expected_size and offsets to uint64_t
fix: harden against offset overflow in buffer read
fix: remove seek behavior from the callback
feat: max_chunk_read == 0 means SIZE_MAX
fix: seeking in a gguf file with no tensors
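
Two of the fixes above (hardening against offset overflow in a buffer read, and treating max_chunk_read == 0 as SIZE_MAX) can be illustrated with a small standalone sketch. This is not the ggml implementation: demo_buffer, demo_range_ok, and demo_read are hypothetical names, and the signatures are assumptions chosen only to show the technique.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Hypothetical in-memory source; not a ggml type.
typedef struct {
    const uint8_t * data;
    uint64_t        size;
} demo_buffer;

// True only if [offset, offset + len) lies inside a buffer of `size` bytes.
// The check subtracts instead of adding, so offset + len can never wrap.
static bool demo_range_ok(uint64_t size, uint64_t offset, uint64_t len) {
    return offset <= size && len <= size - offset;
}

// Copy `len` bytes starting at `offset` into `out`, in pieces of at most
// `max_chunk` bytes. A max_chunk of 0 is treated as "no cap" (SIZE_MAX).
// Returns false, copying nothing, if the requested range is out of bounds.
static bool demo_read(const demo_buffer * buf, uint64_t offset,
                      void * out, uint64_t len, size_t max_chunk) {
    assert(buf != NULL && out != NULL);
    if (!demo_range_ok(buf->size, offset, len)) {
        return false;
    }
    if (max_chunk == 0) {
        max_chunk = SIZE_MAX;
    }
    uint8_t * dst = out;
    while (len > 0) {
        // Each piece is the smaller of the remaining length and the cap.
        size_t n = len < max_chunk ? (size_t) len : max_chunk;
        memcpy(dst, buf->data + offset, n);
        dst    += n;
        offset += n;
        len    -= n;
    }
    return true;
}
```

A naive check such as `offset + len <= size` would accept an offset near UINT64_MAX because the sum wraps around; the subtraction form rejects it.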