github vllm-project/vllm v0.17.1

6 hours ago

This is a patch release on top of v0.17.0 to address a few issues:

  • Fix passing of activation_type to trtllm fused MoE NVFP4 and FP8 (#36017)
  • Fix/resupport nongated fused moe triton (#36412)
  • Re-enable EP for trtllm MoE FP8 backend (#36494)
  • [Mamba][Qwen3.5] Zero freed SSM cache blocks on GPU (#35219)
  • Fix TRTLLM Block FP8 MoE Monolithic (#36296)
  • [DSV3.2][MTP] Optimize Indexer MTP handling (#36723)

Don't miss a new vllm release

NewReleases is sending notifications on new releases.