NVIDIA/DALI v1.8.0 on GitHub

Key Features and Enhancements

This DALI release includes the following key features and enhancements.

Added batch mode support to external_source operator with parallel callback. (#3420 and #3397)
Extended crop_mirror_normalize operator to support per-sample normalization parameters. (#3455)
Improved error messages when trying to decode images with unsupported format. (#3445)
Documentation improvements. (#3448 and #3439)
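
A minimal sketch of how the batch-mode parallel external source (#3397, #3420) might be used. It assumes that the external_source arguments batch, parallel, and batch_info, and the BatchInfo.iteration field, behave as described in the DALI documentation. The names batch_sample_indices, BatchSource, and build_pipeline are hypothetical and used only for illustration, and the data is synthetic.

```python
def batch_sample_indices(iteration, batch_size, dataset_size):
    """Return the dataset indices that make up the batch at `iteration`.

    Only full batches are produced. StopIteration signals the end of the
    epoch, which is how an external source callback tells DALI to stop.
    """
    start = iteration * batch_size
    if start + batch_size > dataset_size:
        raise StopIteration
    return list(range(start, start + batch_size))


class BatchSource:
    """Hypothetical batch-mode callback; one call returns a whole batch.

    Defined at module level so that it can be pickled and sent to the
    Python worker processes.
    """

    def __init__(self, batch_size, dataset_size):
        self.batch_size = batch_size
        self.dataset_size = dataset_size

    def __call__(self, batch_info):
        # Imported here so the index logic above works without NumPy.
        import numpy as np

        indices = batch_sample_indices(
            batch_info.iteration, self.batch_size, self.dataset_size
        )
        # Synthetic samples: each one is a 2x2 array filled with its index.
        return [np.full((2, 2), i, dtype=np.int32) for i in indices]


def build_pipeline(batch_size=4, dataset_size=16):
    """Wire the callback into a CPU-only pipeline (requires DALI)."""
    from nvidia.dali import fn, pipeline_def

    @pipeline_def(
        batch_size=batch_size,
        num_threads=2,
        device_id=None,          # CPU-only pipeline
        py_num_workers=2,        # size of the Python worker pool
        py_start_method="spawn",
    )
    def pipe():
        return fn.external_source(
            source=BatchSource(batch_size, dataset_size),
            batch=True,       # the callback returns entire batches
            parallel=True,    # run the callback in worker processes
            batch_info=True,  # pass BatchInfo instead of a plain integer
        )

    return pipe()
```

With batch_info left at its default, the callback would instead receive the iteration number as an integer, so BatchSource would need to use its argument directly rather than reading batch_info.iteration.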

Fixed Issues

This DALI release includes the following fixes:

Fixed unsound interpretation of the aspect ratio parameter in the random_bbox_crop operator, when input shape is provided. (#3425)
Fixed incorrect output shape in the experimental.readers.video operator. (#3460)

Improvements

Remove reseeding of numpy in RandomlyShapedDataIterator (#3466)
Add indexing information to TF external source tests (#3467)
Extend setup_packages.py to bind package with its dependencies (#3464)
Update dependency versions (#3457)
Optionally load plugins global symbols. (#3462)
Add NVIDIA Video Codec SDK - NVDECODE API (#3458)
CropMirrorNormalize: Add support for per-sample normalization arguments (#3455)
Support batch mode in parallel external source (#3397)
Turn off part of TL0_FW_iterators tests when sanitizers are enabled (#3456)
Read ArgValue constant arguments only once (#3453)
Rename InputRef/OutputRef to Input/Output in workspace API (#3451)
Reduce number of Workspace Input/Output APIs (#3446)
Fix error reporting in image factory (#3445)
Update custom op example for newer CMake (#3448)
Update TF dataset to 2.8 (#3442)
Fix documentation of CropMirrorNormalize dtype argument (#3439)
Bump up nvJPEG2k version to 0.4 (#3440)
Enable CUDA 11.5 builds (#3436)
Enable sanitizers in regular CI runs (#3422)
Improve the way the available Python version is determined (#3438)
RandomBBoxCrop: Fix interpretation of aspect ratio, when input shape is provided (#3425)
Change the permute function to infer the output size from the indices. (#3434)
Move to the upstream deb packages for JetPack compilation (#3432)
Change C++ standard to c++17 for non-CUDA sources (#3423)
Add epoch number to SampleInfo and introduce BatchInfo (#3420)
Separate type setting from data access in Buffer (#3414)
Make SBSA build compatible with all armv8-a CPUs (#3417)
Update TF plugin for future API change (#3415)
Replace pointers with references for ShareData parameter (#3408)
Code cleanup: remove unused variables, fix buffer overflow (#3410)
Enable usage of sanitizers in tests (#3377)
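
The per-sample normalization arguments added to CropMirrorNormalize (#3455) mean that each sample in a batch can be normalized with its own mean and standard deviation instead of one shared set. The arithmetic can be sketched in plain Python as follows. This is an illustration of the formula (value - mean) / std only, not DALI's implementation, and the function names are hypothetical; the real operator also crops, mirrors, and changes layout.

```python
def normalize_sample(pixels, mean, std):
    """Normalize one sample.

    pixels: list of pixels, each a list of channel values.
    mean, std: one value per channel for this sample.
    """
    return [
        [(value - m) / s for value, m, s in zip(pixel, mean, std)]
        for pixel in pixels
    ]


def normalize_batch(batch, means, stds):
    """Normalize every sample with its own per-channel mean and std."""
    return [
        normalize_sample(pixels, mean, std)
        for pixels, mean, std in zip(batch, means, stds)
    ]


# Two single-pixel RGB samples with different normalization parameters.
batch = [[[10.0, 20.0, 30.0]], [[10.0, 20.0, 30.0]]]
means = [[10.0, 10.0, 10.0], [0.0, 0.0, 0.0]]
stds = [[1.0, 2.0, 4.0], [10.0, 10.0, 10.0]]
normalized = normalize_batch(batch, means, stds)
```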

Bug Fixes

Update tensorflow version in conda build (#3471)
Fix STRING_VEC default arguments presentation in docs (#3470)
Remove broken class method from DALI Dataset (#3465)
Fix experimental.readers.video output shape (#3460)
Fix static analysis detected issues (#3444)
Silence output from build_per_python_lib cmake utility (#3454)
Make Workspace::Input return const reference (#3452)
Update imports from collections to collections.abc where needed (#3429)
Install boost/preprocessor headers (#3443)
Fix ShareData for TensorVector with no elements (#3435)
Update GCC version in conda recipe to 7.5 to work around GCC bug 82461. (#3431)
Add a missing state destruction for the NVJPEG HW decoder (#3416)

Breaking API changes

There are no breaking changes in this DALI release.

Deprecated features

There are no deprecated features in this DALI release.

Known issues:

The video loader operator requires that the key frames occur at a minimum every 10 to 15 frames of the video stream. If the key frames occur at a lesser frequency, then the returned frames may be out of sync.
The DALI TensorFlow plugin might not be compatible with TensorFlow versions 1.15.0 and later.
To use DALI with a TensorFlow version that does not have a prebuilt plugin binary shipped with DALI, make sure that the compiler used to build TensorFlow is available on the system during the plugin installation. (Depending on the particular version, use GCC 4.8.4, GCC 4.8.5, or GCC 5.4.)
Due to some known issues with Meltdown/Spectre mitigations and DALI, DALI shows best performance when run in Docker with escalated privileges, for example:
- privileged=yes in Extra Settings for AWS data points
- --privileged or --security-opt seccomp=unconfined for bare Docker

Binary builds

Install via pip for CUDA 10.2:
pip install --extra-index-url https://developer.download.nvidia.com/compute/redist/ nvidia-dali-cuda102==1.8.0
pip install --extra-index-url https://developer.download.nvidia.com/compute/redist/ nvidia-dali-tf-plugin-cuda102==1.8.0

or for CUDA 11:

The CUDA 11.0 build uses CUDA toolkit enhanced compatibility. It is built with the latest CUDA 11.x toolkit,
but it can run on the latest stable CUDA 11.0 capable drivers (450.80 or later).
Using the latest driver may enable additional functionality.
More details can be found in the enhanced CUDA compatibility guide.

pip install --extra-index-url https://developer.download.nvidia.com/compute/redist/ nvidia-dali-cuda110==1.8.0
pip install --extra-index-url https://developer.download.nvidia.com/compute/redist/ nvidia-dali-tf-plugin-cuda110==1.8.0
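
After installation, one way to confirm which DALI wheel and version ended up in the environment is to query the package metadata. The following is a small sketch using only the Python standard library (Python 3.8 or later). The package names are the ones from the pip commands above; the helper names installed_version and find_dali are hypothetical.

```python
from importlib import metadata

EXPECTED_VERSION = "1.8.0"
DALI_PACKAGES = ("nvidia-dali-cuda102", "nvidia-dali-cuda110")


def installed_version(package):
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def find_dali():
    """Return (package, version) for the first DALI wheel found, or None."""
    for package in DALI_PACKAGES:
        version = installed_version(package)
        if version is not None:
            return package, version
    return None


if __name__ == "__main__":
    found = find_dali()
    if found is None:
        print("No DALI wheel found in this environment")
    else:
        package, version = found
        status = "matches" if version == EXPECTED_VERSION else "differs from"
        print(f"{package} {version} {status} {EXPECTED_VERSION}")
```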

Or use direct download links (CUDA 10.2):

Or use direct download links (CUDA 11.0):

FFmpeg source code:

This software uses code of FFmpeg licensed under the LGPLv2.1 and its source can be downloaded here

Libsndfile source code:

https://developer.download.nvidia.com/compute/redist/nvidia-dali/libsndfile-1.0.31.tar.gz
