Blaizzy/mlx-vlm v0.3.11 on GitHub

What's Changed

Refactor input embedding handling in _generate_batch function by @Blaizzy in #694
[Qwen2-VL] Fix incorrect attention mask in Vision by @hturbe in #704
Add GLM-OCR by @Blaizzy & @mikolaj92 in #706
fix: Log underlying ImportError when model loading fails by @antonvice in #710
Fix wired limit, deepstack eval and add prefill step size argument by @Blaizzy in #699
[PaddleOCR] Fix hardcoded processor config by @Blaizzy in #712
[SmolVLM] Refactor Attention class to calculate n_kv_heads dynamically by @Blaizzy in #713

Full Changelog: v0.3.10...v0.3.11