Blaizzy/mlx-vlm v0.3.12

What's Changed

  • [MODEL] Support Qwen3.5 series by @JJJYmmm in #722
  • qwen3_omni_moe: Fix destructuring error by @Nixuge in #723
  • docs: clarify that some models need [torch] extra for torchvision by @Mr-Neutr0n in #716
  • Use mlx-lm provided logits processors and samplers by @AG6GR in #724 (see the first sketch after this list)
  • Fix LoRA training crash when config uses image_token_id by @JesseRod329 in #720
  • video_generate.py: handle capitalized extensions in is_video_file by @Nixuge in #719 (see the second sketch after this list)
  • Honor quantization_config by @pcuenca in #692
  • [Ministral] Add FP8 dequantization for model weights by @Blaizzy in #727
  • Fix image placeholder count in the LFM2.5VL processor compatibility patch by @mattjcly in #725
  • Add activation quantization support for QQLinear layers by @Blaizzy in #728
  • Refactor processor return types in utils.py to use ProcessorMixin by @Blaizzy in #729
  • Fix GLM (4V & 4V MoE) broadcast by @Blaizzy in #731
  • [FastVLM] fix dtype cast by @Blaizzy in #733
  • Fix Kimi broadcast by @Blaizzy in #734
  • Add custom processor for FastVLM by @Blaizzy in #736
  • Fix PaliGemma prefill and numerical precision by @Blaizzy in #737
  • Fix Idefics3, Llama3, and SmolVLM by @Blaizzy in #738
  • docs(README): fix JSON formatting in curl examples by @chsdwn in #739
  • Fix chunk prefill in Qwen3.5 MoE by @JJJYmmm in #742
  • Fix Florence-2 (processor and inference) by @Blaizzy in #743
  • Fix Qwen3.5 cast type and predicates by @Blaizzy in #744
  • [Qwen2, 2.5] Fix vision overflow by @Blaizzy in #745
  • [Ministral3] Fix multi-image by @Blaizzy in #747
  • [Phi3V] Fix dtype cast by @Blaizzy in #748
  • Add dots-ocr by @Blaizzy in #749
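
The sampling change in #724 leans on helpers that mlx-lm already ships in mlx_lm.sample_utils. Below is a minimal sketch: make_sampler and make_logits_processors are real mlx-lm functions, but the sampler/logits_processors keywords on mlx_vlm.generate, the model path, and the image file are assumptions for illustration, not confirmed API.

    from mlx_lm.sample_utils import make_logits_processors, make_sampler
    from mlx_vlm import generate, load

    # Real mlx-lm helpers; temperature / top-p / repetition values are illustrative.
    sampler = make_sampler(temp=0.7, top_p=0.9)
    logits_processors = make_logits_processors(repetition_penalty=1.1)

    # Example checkpoint and image; any mlx-vlm-supported model would work the same way.
    model, processor = load("mlx-community/Qwen2-VL-2B-Instruct-4bit")
    output = generate(
        model,
        processor,
        "Describe this image.",
        image="example.jpg",
        sampler=sampler,                      # assumed keyword after #724
        logits_processors=logits_processors,  # assumed keyword after #724
    )
    print(output)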
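
The #719 fix comes down to comparing file extensions case-insensitively. A minimal sketch of the idea; the extension set is hypothetical and the actual list in video_generate.py may differ.

    import os

    # Hypothetical extension set; the real list in video_generate.py may differ.
    VIDEO_EXTENSIONS = {".mp4", ".mov", ".avi", ".mkv", ".webm"}

    def is_video_file(path: str) -> bool:
        # Lower-casing the suffix lets ".MP4" or ".MOV" match as well,
        # which is the capitalized-extension case fixed in #719.
        return os.path.splitext(path)[1].lower() in VIDEO_EXTENSIONS

    assert is_video_file("clip.MP4")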

New Contributors

Full Changelog: v0.3.11...v0.3.12
