github ggml-org/llama.cpp b7668


llama : add use_direct_io flag for model loading (#18166)

  • Add --direct-io flag for model loading

  • Fix read_raw() calls

  • Fix Windows read_raw_at

  • Change off_t to size_t on Windows and rename functions

  • Disable direct I/O when mmap is explicitly enabled

  • Use read_raw_unsafe when upload_backend is available; not functional on some devices with Vulkan and SYCL

  • Fall back to std::fread when O_DIRECT fails due to a bad address

  • Windows: remove const keywords and unused functions

  • Update src/llama-mmap.cpp
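The direct-I/O-with-fallback behavior described above can be sketched roughly as follows. This is a minimal illustration, not llama.cpp's actual loader code: the helper name, the 4096-byte alignment constant, and the single-shot read are assumptions for the example. The idea is that O_DIRECT bypasses the page cache but demands block-aligned buffers, offsets, and sizes, so when a direct read cannot be performed the loader falls back to plain buffered std::fread.

```cpp
#define _GNU_SOURCE  // for O_DIRECT on Linux
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper: try to read up to `size` bytes from `path` using
// O_DIRECT, and fall back to buffered std::fread if direct I/O is
// unavailable or fails (e.g. EINVAL on filesystems like tmpfs, or EFAULT
// from a bad/unaligned address). Returns the number of bytes read.
static size_t read_block_direct_or_fallback(const char * path, void * dst, size_t size) {
    size_t n_read = 0;
#ifdef O_DIRECT
    int fd = open(path, O_RDONLY | O_DIRECT);
    if (fd >= 0) {
        // O_DIRECT requires the buffer (and offset/size) to be block-aligned,
        // so read into an aligned bounce buffer and copy out.
        void * aligned = nullptr;
        if (posix_memalign(&aligned, 4096, size) == 0) {
            ssize_t n = read(fd, aligned, size);
            if (n > 0) {
                memcpy(dst, aligned, (size_t) n);
                n_read = (size_t) n;
            }
            free(aligned);
        }
        close(fd);
        if (n_read > 0) {
            return n_read;
        }
    }
#endif
    // Fallback: ordinary buffered read through the page cache.
    FILE * f = fopen(path, "rb");
    if (f) {
        n_read = fread(dst, 1, size, f);
        fclose(f);
    }
    return n_read;
}
```

Keeping the std::fread path as a fallback (rather than failing hard) matters because O_DIRECT support varies by filesystem and platform, which is also why the flag is disabled when mmap is explicitly requested.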

Co-authored-by: jtischbein jtischbein@gmail.com
Co-authored-by: Georgi Gerganov ggerganov@gmail.com

Downloads: macOS/iOS, Linux, Windows, openEuler
