quantize : add --dry-run option (#19526)
- clean slate for branch
- use 6 characters for tensor dims
- add --dry-run to llama-quantize
- use 6 characters for tensor dims (cont.)
- no need to re-calculate ggml_nbytes for tensor
- fix indent
- show model and quant BPW when quant completes
- add example to --help
- new function `tensor_requires_imatrix`, add courtesy warning about imatrix
- missing func, move imatrix flag set
- logic error
- fixup tensor_requires_imatrix
- add missing `GGML_TYPE`s
- simplify and rename `tensor_type_requires_imatrix`
- simplify for style
- add back Q2_K edge case for imatrix
- guard ftype imatrix warning
- comment ref #12557
- remove per @compilade
- remove unused `params` parameter
- move `bool dry_run` per GG
- move `bool dry_run` per GG
- Update src/llama-quant.cpp
  Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
- Update src/llama-quant.cpp
  Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
- Update src/llama-quant.cpp
  Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
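
The "show model and quant BPW" item refers to bits per weight: total storage bits divided by total number of weights. The sketch below illustrates that arithmetic only. It is not the code from `src/llama-quant.cpp`; the `TensorSize` struct and the `bits_per_weight` function are hypothetical names introduced for this example, and the actual implementation may account for sizes differently.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical record for illustration; not a llama.cpp type.
struct TensorSize {
    int64_t n_elements; // number of weights in the tensor
    int64_t n_bytes;    // storage size of the tensor data in bytes
};

// Average bits per weight over a set of tensors: total bits / total weights.
// Returns 0.0 for an empty set to avoid dividing by zero.
double bits_per_weight(const std::vector<TensorSize> & tensors) {
    int64_t total_elements = 0;
    int64_t total_bytes    = 0;
    for (const TensorSize & t : tensors) {
        total_elements += t.n_elements;
        total_bytes    += t.n_bytes;
    }
    if (total_elements == 0) {
        return 0.0;
    }
    return 8.0 * static_cast<double>(total_bytes) / static_cast<double>(total_elements);
}
```

For example, a 4096-element tensor stored as F16 occupies 8192 bytes, which gives 16 bits per weight. The same tensor in Q8_0, which packs 32 weights into a 34-byte block, occupies 4352 bytes, which gives 8.5 bits per weight.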
macOS/iOS:
Linux:
Windows:
- Windows x64 (CPU)
- Windows arm64 (CPU)
- Windows x64 (CUDA 12) - CUDA 12.4 DLLs
- Windows x64 (CUDA 13) - CUDA 13.1 DLLs
- Windows x64 (Vulkan)
- Windows x64 (SYCL)
- Windows x64 (HIP)
openEuler: