This patch contains the following bug fixes:
- Fix bugs in fine-tuning and batch inference for GLM-4.1V (#39090)
- [bugfix] Fix Flash Attention 2 unavailable error on Ascend NPU (#39166)
- Fix errors when using verl to train the GLM-4.1V model (#39199)
- [paged-attention] Fix off-by-one error in paged attention generation (#39258)
- [smollm3] Add tokenizer mapping for `smollm3` (#39271)
- [sliding window] Revert and deprecate (#39301)
- Fix Glm4v forward pass for batched videos (#39172)
- Add a default value for `position_ids` in masking_utils (#39310)