Hi,
We are very pleased to announce version 0.18.0 of TensorRT-LLM. This update includes:
Key Features and Enhancements
- Features that were previously available in the 0.18.0.dev pre-releases are not included in this release.
- [BREAKING CHANGE] Windows platform support is deprecated as of v0.18.0. All Windows-related code and functionality will be completely removed in future releases.
Known Issues
- The PyTorch workflow on SBSA is incompatible with bare metal environments like Ubuntu 24.04. Please use the PyTorch NGC Container for optimal support on SBSA platforms.
Infrastructure Changes
- The base Docker image for TensorRT-LLM is updated to nvcr.io/nvidia/pytorch:25.03-py3.
- The base Docker image for TensorRT-LLM Backend is updated to nvcr.io/nvidia/tritonserver:25.03-py3.
- The dependent TensorRT version is updated to 10.9.
- The dependent CUDA version is updated to 12.8.1.
- The dependent NVIDIA ModelOpt version is updated to 0.25 for the Linux platform.
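
To confirm that an environment matches the dependency versions listed above, a small check like the following may help. This is only a sketch, not part of the release: it assumes the TensorRT and ModelOpt Python packages are distributed as `tensorrt` and `nvidia-modelopt`, and the helper names `matches` and `check_versions` are made up for illustration. CUDA is not covered because it is not installed as a Python package.

```python
from importlib import metadata

# Versions come from the release notes; the package names are assumptions.
EXPECTED = {
    "tensorrt": "10.9",
    "nvidia-modelopt": "0.25",
}


def matches(installed: str, expected: str) -> bool:
    """True when `installed` equals `expected` or extends it with more dot-separated parts."""
    return installed == expected or installed.startswith(expected + ".")


def check_versions(expected=EXPECTED):
    """Map each package name to (installed version or None, whether it matches)."""
    report = {}
    for package, wanted in expected.items():
        try:
            found = metadata.version(package)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed in this environment
        report[package] = (found, found is not None and matches(found, wanted))
    return report


if __name__ == "__main__":
    for package, (found, ok) in check_versions().items():
        status = "OK" if ok else "MISMATCH"
        print(f"{package}: installed={found} expected={EXPECTED[package]} -> {status}")
```

The comparison is a prefix match on whole version components, so a build such as 10.9.0.34 is accepted for 10.9 while 10.90 is not.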