github triton-inference-server/server v1.5.0
Release 1.5.0, corresponding to NGC container 19.08

4 years ago

NVIDIA TensorRT Inference Server

The NVIDIA TensorRT Inference Server provides a cloud inferencing solution optimized for NVIDIA GPUs. The server provides an inference service via an HTTP or GRPC endpoint, allowing remote clients to request inferencing for any model being managed by the server.

What's New In 1.5.0

  • Added a new execution mode that allows the inference server to start
    without loading any models from the model repository. Model loading and
    unloading are then controlled by a new GRPC/HTTP model control API.

  • Added a new instance-group mode that allows TensorFlow models that
    explicitly distribute inferencing across multiple GPUs to run in that
    manner in the inference server.

  • Improved input/output tensor reshape to allow variable-sized dimensions in
    tensors being reshaped.

  • Added a C++ wrapper around the custom backend C API to simplify the creation
    of custom backends. This wrapper is included in the custom backend SDK.

  • Improved the accuracy of the compute statistic reported for inference
    requests. Previously the compute statistic included some additional time
    beyond the actual compute time.

  • The performance client, perf_client, now reports more information for ensemble
    models, including statistics for all contained models and the entire ensemble.
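
As a rough illustration of the model control API mentioned in the first item above, the following Python sketch builds (but does not send) an HTTP request that would ask a running server to load or unload a model. The endpoint layout /api/modelcontrol/<load|unload>/<model name>, the port 8000, and the model name my_model are assumptions made for this sketch and are not stated in these release notes; check the HTTP/GRPC API documentation for this release before relying on them.

```python
from urllib.request import Request


def model_control_request(base_url, action, model_name):
    """Build (without sending) a request to load or unload a model.

    The URL layout used here is an assumption for illustration only;
    it is not taken from these release notes.
    """
    if action not in ("load", "unload"):
        raise ValueError("action must be 'load' or 'unload'")
    # Assumed path: /api/modelcontrol/<load|unload>/<model name>
    url = f"{base_url.rstrip('/')}/api/modelcontrol/{action}/{model_name}"
    # An empty body with an explicit POST method; nothing is sent here.
    return Request(url, data=b"", method="POST")


# Hypothetical server address and model name.
req = model_control_request("http://localhost:8000", "load", "my_model")
print(req.get_method(), req.full_url)
```

Passing the returned object to urllib.request.urlopen would actually issue the request; that only makes sense against a server started in the new execution mode.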

Client Libraries and Examples

Ubuntu 16.04 and Ubuntu 18.04 builds of the client libraries and examples are included in this release in the attached v1.5.0_ubuntu1604.clients.tar.gz and v1.5.0_ubuntu1804.clients.tar.gz files. See the documentation section 'Building the Client Libraries and Examples' for more information on using these files.
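
A minimal Python sketch of unpacking one of the attached archives is shown here, assuming the file has already been downloaded into the current directory. The destination directory name clients and the helper name extract_release_archive are illustrative choices, not part of the release. The same helper should work for the custom backend SDK archives.

```python
import tarfile
from pathlib import Path


def extract_release_archive(archive_path, dest_dir):
    """Extract a gzip-compressed tar archive and return its member names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        names = tar.getnames()
        # Only extract archives obtained from a source you trust.
        tar.extractall(dest)
    return names


# File name taken from this release; "clients" is an arbitrary destination.
archive = Path("v1.5.0_ubuntu1804.clients.tar.gz")
if archive.exists():
    for name in extract_release_archive(archive, "clients"):
        print(name)
```

The layout of the extracted files is not described in these notes; the documentation section referenced above covers how to use them.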

Custom Backend SDK

Ubuntu 16.04 and Ubuntu 18.04 builds of the custom backend SDK are included in this release in the attached v1.5.0_ubuntu1604.custombackend.tar.gz and v1.5.0_ubuntu1804.custombackend.tar.gz files. See the documentation section 'Building a Custom Backend' for more information on using these files.
