TensorRT-LLM Release 0.18.0

2 days ago

Hi,

We are pleased to announce TensorRT-LLM 0.18.0. This release includes:

Key Features and Enhancements

  • Features that were previously available in the 0.18.0.dev pre-releases are not included in this release.
  • [BREAKING CHANGE] Windows platform support is deprecated as of v0.18.0. All Windows-related code and functionality will be completely removed in future releases.

Known Issues

  • The PyTorch workflow on SBSA is incompatible with bare-metal environments such as Ubuntu 24.04. Please use the PyTorch NGC Container for optimal support on SBSA platforms.

Infrastructure Changes

  • The base Docker image for TensorRT-LLM is updated to nvcr.io/nvidia/pytorch:25.03-py3.
  • The base Docker image for TensorRT-LLM Backend is updated to nvcr.io/nvidia/tritonserver:25.03-py3.
  • The dependent TensorRT version is updated to 10.9.
  • The dependent CUDA version is updated to 12.8.1.
  • The dependent NVIDIA ModelOpt version is updated to 0.25 for the Linux platform.
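
To confirm that an environment lines up with the dependency versions listed above, a check along the following lines may help. This is a minimal sketch, not part of the release: the distribution names `tensorrt-llm`, `tensorrt`, and `nvidia-modelopt` are assumptions about how the packages are installed in your environment, and `matches_prefix` and `check_versions` are hypothetical helper names. CUDA is left out because it is not queried through Python package metadata.

```python
from importlib.metadata import PackageNotFoundError, version

# Version prefixes taken from the release notes. The distribution names
# (the dictionary keys) are assumptions; adjust them if your environment
# uses different package names.
EXPECTED_PREFIXES = {
    "tensorrt-llm": "0.18.0",
    "tensorrt": "10.9",
    "nvidia-modelopt": "0.25",
}


def matches_prefix(installed: str, prefix: str) -> bool:
    """Return True if `installed` equals `prefix` or extends it at a dot boundary."""
    return installed == prefix or installed.startswith(prefix + ".")


def check_versions(expected=EXPECTED_PREFIXES, lookup=version):
    """Map each distribution name to (installed_version_or_None, ok_flag)."""
    report = {}
    for name, prefix in expected.items():
        try:
            installed = lookup(name)
        except PackageNotFoundError:
            # Package is not installed in this environment.
            report[name] = (None, False)
            continue
        report[name] = (installed, matches_prefix(installed, prefix))
    return report


if __name__ == "__main__":
    for name, (installed, ok) in check_versions().items():
        status = "ok" if ok else "MISMATCH"
        print(f"{name}: installed={installed} [{status}]")
```

Matching on a dot boundary keeps a hypothetical `10.90` build from being accepted as `10.9`, while still accepting full build strings that extend the listed version.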
