What's Changed
- Make sure to evaluate cached MRoPE parameters on model load by @lucasnewman in #1272
- Fix Gemma4 unified long text prefill by @Blaizzy in #1280
- Add support for the Ideogram 4 image generation model by @lucasnewman in #1281
- Fix Nemotron-H Nano Omni audio crash under nvfp4 quantization by @txdadlab in #1279
- Fix multi-image prompts in get_rope_index for remaining MRoPE families by @scyyh11 in #1282
- Fix PaddleOCR-VL stale MRoPE state under batching by @Blaizzy in #1285
- Add Ideogram 4 local prompt expansion support by @omercelik in #1276
- Fix APC for single requests by @lucasnewman in #1291
- Add video input support for Gemma 4 12B by @lucasnewman in #1292
- Fix stale test_utils.py regressions + extract get_class_predicate by @mdstaff in #1071
- Add regression tests for ThinkingBudgetCriteria.apply_forced_token by @glenncameron2 in #1297
- [Cohere] Add Cohere tool-calling and thinking parsing support by @Terrencezzj in #1298
- docs(gemma4): clarify base vs instruct and thinking modes by @Girish011 in #1300
- test: add PaddleOCR-VL processor regression coverage by @jimmyzhuu in #933
- docs: add model guides for ERNIE 4.5 VL and PaddleOCR-VL by @jimmyzhuu in #934
- Fix using structured output with thinking enabled by @lucasnewman in #1299
- Accept qwen3_5_vision / qwen3_5_moe_vision vision model_type by @davidrhodus in #1302
- Don't specify a default model for the server by @lucasnewman in #1303
- Fix Gemma4 load for HF checkpoints with num_kv_shared_layers > 0 by @Blaizzy in #1301
- Update version to v0.6.2 by @Blaizzy in #1304
- Adjust Gemma4 quantization predicate by @Blaizzy in #1288
- Fix LFM2-VL checkpoint loading by @Blaizzy in #1308
- Fix Kimi VL quantized projector loading by @Blaizzy in #1309
New Contributors
- @txdadlab made their first contribution in #1279
- @scyyh11 made their first contribution in #1282
- @glenncameron2 made their first contribution in #1297
- @Girish011 made their first contribution in #1300
- @jimmyzhuu made their first contribution in #933
- @davidrhodus made their first contribution in #1302
Full Changelog: v0.6.1...v0.6.2