github vllm-project/vllm v0.11.2

8 hours ago

This release includes 4 bug fixes on top of v0.11.1:

  • [BugFix] Ray with multiple nodes (#28873)
  • [BugFix] Fix false assertion with spec-decode=[2,4,..] and TP>2 (#29036)
  • [BugFix] Fix async-scheduling + FlashAttn MLA (#28990)
  • [NVIDIA] Guard SM100 CUTLASS MoE macro to SM100 builds v2 (#28938)
