triton-inference-server/server v0.10.0
Release 0.10.0 beta, corresponding to NGC container 19.01


NVIDIA TensorRT Inference Server

The NVIDIA TensorRT Inference Server (TRTIS) provides a cloud inferencing solution optimized for NVIDIA GPUs. The server provides an inference service via an HTTP or gRPC endpoint, allowing remote clients to request inferencing for any model being managed by the server.
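As a quick way to confirm a running server is reachable over the HTTP endpoint, the sketch below probes a health route with libcurl. The `/api/health/live` path and the default HTTP port 8000 are assumptions based on the TRTIS v1-era HTTP API and should be verified against the 19.01 documentation for your deployment.

```c
/* Liveness probe for a running TRTIS instance using libcurl.
 * Build: cc probe.c -lcurl */
#include <stdio.h>
#include <curl/curl.h>

int main(void) {
    curl_global_init(CURL_GLOBAL_DEFAULT);
    CURL *curl = curl_easy_init();
    if (curl == NULL) return 1;

    /* Assumes the default HTTP service port 8000 and the v1-era
     * health route; adjust to match your deployment. */
    curl_easy_setopt(curl, CURLOPT_URL,
                     "http://localhost:8000/api/health/live");

    CURLcode res = curl_easy_perform(curl);
    long status = 0;
    if (res == CURLE_OK)
        curl_easy_getinfo(curl, CURLINFO_RESPONSE_CODE, &status);

    printf("live: %s (HTTP %ld)\n",
           (res == CURLE_OK && status == 200L) ? "yes" : "no", status);

    curl_easy_cleanup(curl);
    curl_global_cleanup();
    return (res == CURLE_OK && status == 200L) ? 0 : 1;
}
```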

What's New In 0.10.0 Beta

  • Custom backend support. TRTIS allows individual models to be implemented with custom backends instead of by a deep-learning framework. With a custom backend, a model can implement any logic desired while still benefiting from the GPU support, concurrent execution, dynamic batching, and other features provided by TRTIS. See the sketch after this list for the general shape of such a backend.
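To make the shape of a custom backend concrete, here is a minimal sketch of a shared library exposing initialize/execute/finalize hooks. The function names, signatures, and context type below are illustrative assumptions, not the actual TRTIS custom backend C API; the real interface is defined by the custom backend header in the server source tree.

```c
/* Hypothetical custom backend, compiled as a shared library the
 * server would load alongside the model. Names and signatures are
 * illustrative only. */
#include <stdlib.h>
#include <string.h>

typedef struct {
    int device_id;  /* GPU the server assigned to this model instance */
} BackendContext;

/* Called once when the model instance is loaded. */
int ModelInitialize(void **context, int device_id) {
    BackendContext *ctx = malloc(sizeof(BackendContext));
    if (ctx == NULL) return 1;  /* non-zero signals an error */
    ctx->device_id = device_id;
    *context = ctx;
    return 0;
}

/* Called for each (possibly dynamically batched) inference request.
 * This trivial "model" just copies the input tensor to the output. */
int ModelExecute(void *context, const void *input, size_t byte_size,
                 void *output) {
    (void)context;
    memcpy(output, input, byte_size);
    return 0;
}

/* Called once when the model instance is unloaded. */
int ModelFinalize(void *context) {
    free(context);
    return 0;
}
```

In the model repository, such a backend would be built as a shared library and referenced from the model's configuration; the documentation for this release describes the exact custom-backend interface and configuration fields.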
