github ggml-org/llama.cpp b8929

4 hours ago

llama-quant : default ftype param Q5_1 --> Q8_0 (#20828)

Change the default ftype in llama_model_quantize_params from
LLAMA_FTYPE_MOSTLY_Q5_1 to LLAMA_FTYPE_MOSTLY_Q8_0.

In case some external program naively uses the default quantization
params, we should probably default to a known-good type like Q8_0 rather
than Q5_1, which is rather old.
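
To illustrate which callers this affects, here is a minimal, self-contained sketch. It does not use llama.h: every `sketch_` and `SKETCH_` name is a hypothetical stand-in, and the struct is reduced to the single `ftype` field. In the real API the defaults come from `llama_model_quantize_default_params()`. The sketch only shows that a caller passing the defaults through unmodified now ends up with Q8_0, while a caller that sets `ftype` itself sees no change.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the file-type enum in llama.h.
 * Names and values are illustrative only, not the real enumerators. */
enum sketch_ftype {
    SKETCH_FTYPE_Q5_1,
    SKETCH_FTYPE_Q8_0
};

/* Hypothetical stand-in for llama_model_quantize_params,
 * reduced to the one field this change touches. */
struct sketch_quantize_params {
    enum sketch_ftype ftype;
};

typedef struct sketch_quantize_params (*sketch_defaults_fn)(void);

/* Defaults as they were before this change. */
struct sketch_quantize_params sketch_default_params_old(void) {
    struct sketch_quantize_params p = { SKETCH_FTYPE_Q5_1 };
    return p;
}

/* Defaults as they are after this change. */
struct sketch_quantize_params sketch_default_params_new(void) {
    struct sketch_quantize_params p = { SKETCH_FTYPE_Q8_0 };
    return p;
}

/* A "naive" caller: takes the defaults and uses them unmodified,
 * so the resulting type follows whatever the library ships. */
enum sketch_ftype sketch_naive_caller(sketch_defaults_fn defaults) {
    assert(defaults != NULL);
    struct sketch_quantize_params p = defaults();
    return p.ftype;
}

/* An explicit caller: starts from the defaults but sets ftype itself,
 * so a change in the default does not alter its behavior. */
enum sketch_ftype sketch_explicit_caller(sketch_defaults_fn defaults,
                                         enum sketch_ftype wanted) {
    assert(defaults != NULL);
    struct sketch_quantize_params p = defaults();
    p.ftype = wanted;
    return p.ftype;
}
```

Programs that depend on a specific output type are probably better off setting `ftype` explicitly, as in the second helper, rather than relying on the default.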

