ggml-org/llama.cpp b8885
on GitHub


one month ago


mtmd, llama : Update HunyuanVL vision-language model support (#22037)

mtmd, llama : add HunyuanVL vision-language model support

- add LLM_ARCH_HUNYUAN_VL with M-RoPE (XD-RoPE) support
- add PROJECTOR_TYPE_HUNYUANVL with PatchMerger vision encoder
- add HunyuanVL-specific M-RoPE position encoding for image tokens
- add GGUF conversion for HunyuanVL vision and text models
- add smoke test in tools/mtmd/tests.sh
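
To illustrate what a multi-axis RoPE position layout for image tokens can look like, here is a minimal sketch. It assumes a Qwen2-VL-style three-component (t, h, w) scheme; the function name `mrope_positions` and the layout are hypothetical and are not taken from this commit. HunyuanVL's actual XD-RoPE sections, their order, and their number may differ.

```python
from typing import List, Tuple

Position = Tuple[int, int, int]  # (t, h, w); an assumed three-axis layout


def mrope_positions(n_text_before: int, grid_h: int, grid_w: int,
                    n_text_after: int) -> List[Position]:
    """Return one (t, h, w) position triple per token (illustrative only)."""
    positions: List[Position] = []

    # Text tokens: all components advance together, like ordinary 1-D RoPE.
    for i in range(n_text_before):
        positions.append((i, i, i))

    # Image tokens: the temporal component stays fixed while the h and w
    # components follow the patch grid in row-major order.
    base = n_text_before
    for row in range(grid_h):
        for col in range(grid_w):
            positions.append((base, base + row, base + col))

    # Trailing text resumes just past the largest offset the image used.
    next_pos = base + max(grid_h, grid_w)
    for i in range(n_text_after):
        positions.append((next_pos + i, next_pos + i, next_pos + i))

    return positions


if __name__ == "__main__":
    for pos in mrope_positions(2, 2, 3, 1):
        print(pos)
```

Swapping the h and w components in a layout like this still yields valid-looking positions, but rotates the wrong frequency sections, which is the kind of mistake a "section order" fix addresses.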

fix: fix HunyuanVL XD-RoPE h/w section order
fix: Remove redundant code
convert : fix HunyuanOCR / HunyuanVL conversion

Tested locally: both HunyuanOCR and HunyuanVL-4B convert to GGUF
successfully and produce correct inference output on Metal (F16 / Q8_0).

clip : fix -Werror=misleading-indentation in bilinear resize
fix CI: convert_hf_to_gguf type check error

convert_hf_to_gguf.py: give HunyuanVLTextModel.__init__ an explicit dir_model: Path parameter so that ty can infer the type passed to load_hparams instead of reporting Unknown | None.
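
The pattern behind this kind of fix can be sketched as follows. The class names `BaseModel` and `TextModel` and the `n_layers` attribute are hypothetical stand-ins, not the real classes in convert_hf_to_gguf.py. The point is only that naming and annotating `dir_model` in the subclass constructor gives a type checker a concrete `Path` to work with, where a value fished out of `*args` or `**kwargs` has no known type.

```python
import json
from pathlib import Path
from typing import Any, Dict


class BaseModel:
    """Hypothetical stand-in for a converter base class."""

    def __init__(self, dir_model: Path, *args: Any, **kwargs: Any) -> None:
        self.dir_model = dir_model

    @staticmethod
    def load_hparams(dir_model: Path) -> Dict[str, Any]:
        # Read the model's config.json from the given directory.
        with open(dir_model / "config.json", encoding="utf-8") as f:
            return json.load(f)


class TextModel(BaseModel):
    # dir_model is an explicit, annotated parameter rather than being taken
    # from *args / **kwargs, so its type is known to be Path at the call below.
    def __init__(self, dir_model: Path, *args: Any, **kwargs: Any) -> None:
        hparams = self.load_hparams(dir_model)
        self.n_layers = hparams.get("num_hidden_layers", 0)
        super().__init__(dir_model, *args, **kwargs)
```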

Co-authored-by: wendadawen <wendadawen@tencent.com>

macOS/iOS:

Linux:

Android:

Android arm64 (CPU)

Windows:

openEuler:
