What's Changed
- Refactor input embedding handling in _generate_batch function by @Blaizzy in #694
- [Qwen2-VL] Fix incorrect attention mask in Vision by @hturbe in #704
- Add GLM-OCR by @Blaizzy & @mikolaj92 in #706
- fix: Log underlying ImportError when model loading fails by @antonvice in #710
- Fix wired limit, deepstack eval and add prefill step size argument by @Blaizzy in #699
- [PaddleOCR] Fix hardcoded processor config by @Blaizzy in #712
- [SmolVLM] Refactor Attention class to calculate n_kv_heads dynamically by @Blaizzy in #713
New Contributors
- @hturbe made their first contribution in #704
- @mikolaj92 made their first contribution in #706
- @antonvice made their first contribution in #710
Full Changelog: v0.3.10...v0.3.11