ggml-org/llama.cpp — release b8116


quantize : add --dry-run option (#19526)

  • clean slate for branch

  • use 6 characters for tensor dims

  • add --dry-run to llama-quantize

  • use 6 characters for tensor dims (cont.)

  • no need to re-calculate ggml_nbytes for tensor

  • fix indent

  • show model and quant BPW when quant completes

  • add example to --help

  • new function tensor_requires_imatrix, add courtesy warning about imatrix

  • missing func, move imatrix flag set

  • logic error

  • fixup tensor_requires_imatrix

  • add missing GGML_TYPEs

  • simplify and rename tensor_type_requires_imatrix

  • simplify for style

  • add back Q2_K edge case for imatrix

  • guard ftype imatrix warning

  • comment ref #12557

  • remove per @compilade

  • remove unused params parameter

  • move bool dry_run per GG

  • move bool dry_run per GG

  • Update src/llama-quant.cpp
    Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

  • Update src/llama-quant.cpp
    Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

  • Update src/llama-quant.cpp
    Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>


Release builds are available for macOS/iOS, Linux, Windows, and openEuler.
