NVIDIA/TensorRT-LLM v0.18.0
TensorRT-LLM Release 0.18.0

2 days ago

Hi,

We are very pleased to announce the 0.18.0 version of TensorRT-LLM. This update includes:

Key Features and Enhancements

Features that were previously available in the 0.18.0.dev pre-releases are not included in this release.
[BREAKING CHANGE] Windows platform support is deprecated as of v0.18.0. All Windows-related code and functionality will be completely removed in future releases.

Known Issues

The PyTorch workflow on SBSA is incompatible with bare-metal environments such as Ubuntu 24.04. Please use the PyTorch NGC Container for optimal support on SBSA platforms.

Infrastructure Changes

The base Docker image for TensorRT-LLM is updated to nvcr.io/nvidia/pytorch:25.03-py3.
The base Docker image for TensorRT-LLM Backend is updated to nvcr.io/nvidia/tritonserver:25.03-py3.
The dependent TensorRT version is updated to 10.9.
The dependent CUDA version is updated to 12.8.1.
The dependent NVIDIA ModelOpt version is updated to 0.25 on the Linux platform.
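
As a rough illustration of how the dependency versions listed above might be checked in an existing environment, here is a minimal Python sketch. It is not part of the release. The distribution names `tensorrt` and `nvidia-modelopt` are assumptions about how these packages appear to pip in your environment, and the helpers `version_matches`, `check_versions`, and `installed_versions` are hypothetical names introduced only for this example. CUDA is left out because it is not a Python package.

```python
from importlib import metadata

# Version prefixes taken from the "Infrastructure Changes" list.
# The distribution names are assumptions; adjust them to your environment.
EXPECTED = {
    "tensorrt": "10.9",
    "nvidia-modelopt": "0.25",
}


def version_matches(installed: str, expected_prefix: str) -> bool:
    """True if `installed` equals the prefix or extends it after a dot."""
    return installed == expected_prefix or installed.startswith(expected_prefix + ".")


def check_versions(installed: dict, expected: dict = EXPECTED) -> dict:
    """Map each expected package to 'ok', 'mismatch', or 'missing'."""
    report = {}
    for name, prefix in expected.items():
        found = installed.get(name)
        if found is None:
            report[name] = "missing"
        elif version_matches(found, prefix):
            report[name] = "ok"
        else:
            report[name] = "mismatch"
    return report


def installed_versions(names) -> dict:
    """Look up installed distribution versions; absent packages are skipped."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            pass
    return found


if __name__ == "__main__":
    for name, status in check_versions(installed_versions(EXPECTED)).items():
        print(f"{name}: {status}")
```

The comparison matches on a dotted prefix rather than an exact string, so a patch-level build of 10.9 still counts as a match while a version such as 10.90 does not.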
