github NVIDIA/TensorRT-LLM v0.20.0rc1

pre-release, 4 months ago

Highlights

  • Features
    • PyTorch workflow
    • Part 1 of large-scale EP: Added MNNVL MoE A2A support. (#3504)
    • Added smart router for the MoE module. (#3641)
    • Added head size 72 support for QKV preprocessing kernel. (#3743)

What's Changed

Full Changelog: v0.20.0rc0...v0.20.0rc1
