github NVIDIA/DALI v1.19.0
DALI v1.19.0

18 months ago

Key Features and Enhancements

This DALI release includes the following key features and enhancements:

  • Added the experimental.decoders.video stand-alone video decoder to decode video on GPU and CPU provided as an in-memory buffer (for example, through an external source) (#4354, #4296).
  • Added support to decode indexless videos (#4347, #4302, and #4335).
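
As an illustration of the stand-alone decoder described above, here is a minimal sketch that feeds encoded video files from memory through an external source. The helper names (`read_encoded_video`, `build_video_pipeline`), the thread count, and the device index are assumptions made for this example and do not come from the release notes. The DALI import is kept inside the builder so that the file-reading helper works on machines without DALI installed.

```python
import numpy as np


def read_encoded_video(path):
    """Read an encoded video file into a 1-D uint8 array (an in-memory buffer)."""
    return np.fromfile(path, dtype=np.uint8)


def build_video_pipeline(encoded_videos, device="mixed"):
    """Build a DALI pipeline that decodes in-memory videos.

    `encoded_videos` is a list of 1-D uint8 arrays, one per video.
    `device="mixed"` is assumed to select GPU decoding and `device="cpu"`
    CPU decoding; num_threads and device_id are illustrative values.
    """
    # Imported lazily: this part needs an installed DALI build.
    from nvidia.dali import fn, pipeline_def, types

    @pipeline_def(batch_size=len(encoded_videos), num_threads=2, device_id=0)
    def video_pipe():
        # The external source hands the same batch of encoded buffers
        # to the pipeline on every iteration.
        encoded = fn.external_source(
            source=lambda: encoded_videos, dtype=types.UINT8
        )
        return fn.experimental.decoders.video(encoded, device=device)

    pipe = video_pipe()
    pipe.build()
    return pipe
```

With DALI and a supported GPU available, `build_video_pipeline([read_encoded_video("clip.mp4")]).run()` would be expected to return one batch of decoded frame sequences; `clip.mp4` is a placeholder file name.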

Fixed Issues

The following issues were fixed in this release:

  • Fixed the handling of Caffe LMDB empty samples (without data or labels) (#4266).

Improvements

  • Exclude HEVC files from video decoder test. (#4357)
  • Fix a typo in Debug Mode documentation (#4355)
  • Parallelize gpu video decoding (#4354)
  • Make tests for DALI linked dynamically with CUDA more flexible (#4341) [categories: Other]
  • Update MXNet version used in tests (#4342)
  • Enable indexless video decoding for GPU (#4347)
  • Prevent obtaining handle values from dead unique handles and stream leases. (#4346)
  • Update broadcasting shape simplification logic (#4314)
  • Add warning about the end of support for CUDA 10.2 (#4334)
  • Frames decoder gpu without index (#4302)
  • Enable indexless decoding in CPU video decoder (#4335)
  • Update outdated links in the documentation (#4329)
  • Add Mixed VideoDecoder (#4296)
  • Update cutlass and DALI_deps revision. (#4328)
  • Fixes and performance improvements in imgcodec/nvjpeg (#4318)
  • Update Jetson build env to support CUDA 11.4 and Orin (#4250)
  • Update nvJPEG2k version to 0.6.0 (#4320)
  • Add missing documentation to (Future)DecodingResult(Promise). (#4310)
  • Update libcudacxx target macros for clang and SM90. (#4315)
  • Don't use nvjpegGetHardwareDecoderInfo in pre-11.8 toolkits. (#4325)
  • Prune static cuda libraries DALI links with from unused archs (#4317)
  • Fix clang warnings (#4312)
  • Add pass-through tracking to auto-pinning buffers (#4294)
  • Update protobuf (v21.5 to v21.7) (#4313)
  • Extended ImageDecoder tests (#4297)
  • Refactor OpSchema - move implementation to one translation unit (#4293)
  • Emit the warning about the default value change only when using the default. (#4214)
  • Reduce the batch size in RN50 data pipeline tests. (#4304)
  • Enable ROI adjustment for multi-frame inputs + cleanup. (#4303)
  • Use GPU Convert in nvJPEG decoder (#4247)
  • Aggregating ImageDecoder (#4224)
  • Support palette TIFFs (#4206)
  • Refactor video decoder for reusability (#4290)
  • Add ROI support to nvJPEG (#4244)
  • RemapKernel API (#4284)
  • Presteps to image_decoder.* APIs (#4277)
  • Add frames decoder CPU without index (#4278)
  • Add experimental.decoders.video for CPU (#4270)
  • Fix a typo in the documentation (#4258)
  • Add orientation to GPU image data Convert (#4232)
  • Fix hang in TL1_tensorflow-dali_test (#4255)
  • Make test_dltensor_operator.py consistent when the HW decoder is available (#4272)
  • Fix issues in DALI in action snippet (#4268)
  • Assure operator documentation links to enum types (#4264)
  • Support applying orientation in Convert (#4219)
  • Add image decoder registry. (#4261)
  • Support tiled TIFFs (#4201)
  • Bump up TensorFlow version in tests (#4238)

Bug Fixes

  • Fix coverity issues (#4349)
  • Revert pruning of unused architectures (#4336)
  • Fix order of access order waiting in TL's set_order (#4338)
  • Fix NVJPEG pinned buffer synchronization. (#4337)
  • Change the default order of data storage objects (#4276)
  • Fix checking of the return status of the bundle lib tests (#4330)
  • Fix executor test - add test operators (#4323)
  • Fix parameter propagation in ImageDecoder. (#4309)
  • Fix normalization when running GPU color space conversion (#4285)
  • Fix support for ANY_DATA in nvJPEG2K (#4299)
  • Fix inconsistent tensor recreation in TensorList (#4286)
  • Fix no ffmpeg build (#4288)
  • Fix libtiff error handling (#4274)
  • Fix imgcodec batched APIs and tests (#4263)
  • Fix handling of Caffe LMDB without valid data (#4266)
  • Move params in PerThreadResources move constructor (#4265)
  • Fix fusing the dimensions in SliceFlipNormalizePermutePadGpu (#4234)
  • Improve error handling in LibTiffDecoder (#4210)
  • Fix exception handling in BatchParallelDecoderImpl (#4262)
  • Make nvjpeg decoder use its own thread pool (#4241)

Breaking API changes

There are no breaking changes in this DALI release.

Deprecated features

DALI will drop support for CUDA 10.2 in an upcoming release.

Known issues:

  • The video loader operator requires that the key frames occur, at a minimum, every 10 to 15 frames of the video stream.
    If the key frames occur less frequently than that, the returned frames might be out of sync.
  • Experimental VideoReaderDecoder does not support open GOP.
    It will not report an error and might produce invalid frames. VideoReader uses a heuristic approach to detect open GOP and should work in most common cases.
  • The DALI TensorFlow plugin might not be compatible with TensorFlow versions 1.15.0 and later.
    To use DALI with a TensorFlow version that does not have a prebuilt plugin binary shipped with DALI, make sure that the compiler that was used to build TensorFlow exists on the system during the plugin installation. (Depending on the particular version, you can use GCC 4.8.4, GCC 4.8.5, or GCC 5.4.)
  • In experimental debug and eager modes, the GPU external source is not properly synchronized with DALI internal streams.
    As a workaround, you can manually synchronize the device before returning the data from the callback.
  • Due to known issues with Meltdown/Spectre mitigations, DALI shows the best performance when running in Docker with escalated privileges, for example:
    • privileged=yes in Extra Settings for AWS data points
    • --privileged or --security-opt seccomp=unconfined for bare Docker.
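
The key-frame requirement in the first known issue can be checked before handing a file to the video loader. The sketch below assumes the key-frame positions have already been extracted by some external tool (for example, a stream probe); the function names and the default threshold of 15 frames, taken from the upper end of the "10 to 15 frames" range above, are choices made for this example and are not part of DALI.

```python
def max_keyframe_gap(keyframe_indices, num_frames):
    """Return the longest run of frames that starts at a key frame (or at the
    start of the stream) and ends just before the next key frame (or the end).
    """
    keys = sorted(set(keyframe_indices))
    if not keys:
        # No key frames at all: the whole stream is one gap.
        return num_frames
    boundaries = keys + [num_frames]
    gaps = [b - a for a, b in zip(boundaries, boundaries[1:])]
    # keys[0] counts the frames that precede the first key frame.
    return max([keys[0]] + gaps)


def keyframes_dense_enough(keyframe_indices, num_frames, max_gap=15):
    """True if key frames occur at least once every `max_gap` frames."""
    return max_keyframe_gap(keyframe_indices, num_frames) <= max_gap
```

For a 30-frame clip with key frames at 0, 12, and 24, the largest gap is 12 frames, so the check passes; a 50-frame clip with key frames only at 0 and 40 has a 40-frame gap and fails it.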

Binary builds

Install via pip for CUDA 10.2:
pip install --extra-index-url https://developer.download.nvidia.com/compute/redist/ nvidia-dali-cuda102==1.19.0
pip install --extra-index-url https://developer.download.nvidia.com/compute/redist/ nvidia-dali-tf-plugin-cuda102==1.19.0

or for CUDA 11:

The CUDA 11.0 build uses CUDA toolkit enhanced compatibility. It is built with the latest CUDA 11.x toolkit
but can run on the latest stable CUDA 11.0 capable drivers (450.80 or later).
Using the latest driver may enable additional functionality.
More details can be found in the enhanced CUDA compatibility guide.

pip install --extra-index-url https://developer.download.nvidia.com/compute/redist/ nvidia-dali-cuda110==1.19.0
pip install --extra-index-url https://developer.download.nvidia.com/compute/redist/ nvidia-dali-tf-plugin-cuda110==1.19.0
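
After installation, it can be useful to confirm which of the wheels named above is present in the current environment. The following is a small sketch that uses only the Python standard library (Python 3.8 or later); the function name `installed_dali_versions` is made up for this example.

```python
from importlib import metadata

# Distribution names taken from the pip commands above.
DALI_PACKAGES = ("nvidia-dali-cuda110", "nvidia-dali-cuda102")


def installed_dali_versions(packages=DALI_PACKAGES):
    """Map each installed distribution in `packages` to its version string.

    Distributions that are not installed are left out of the result.
    """
    found = {}
    for name in packages:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            continue
    return found


if __name__ == "__main__":
    versions = installed_dali_versions()
    if versions:
        for name, version in versions.items():
            print(f"{name}=={version}")
    else:
        print("No DALI wheel found in this environment.")
```

An empty result means neither wheel is installed in the interpreter that ran the check, which often points to pip having installed into a different environment.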

Or use direct download links (CUDA 10.2):

Or use direct download links (CUDA 11.0):

FFmpeg source code:

  • This software uses code of FFmpeg licensed under the LGPLv2.1 and its source can be downloaded here

Libsndfile source code:
