DALI v1.2.0

Key Features and Enhancements

This DALI release includes the following key features and enhancements.

  • New operators:
    • noise.shot CPU and GPU operators (#2861)
    • noise.gaussian CPU and GPU operators (#2846)
    • jpeg_compression_distortion CPU and GPU operators (#2823)
  • New mathematical operations (#2853):
    • Square and cubic root (sqrt, rsqrt, and cbrt)
    • Logarithms of different bases (log2 and log10)
    • Power (** operator and pow function)
    • Absolute value (abs and fabs)
    • Roundings (ceil and floor)
    • Trigonometric functions (sin, cos, and tan)
    • Inverse trigonometric functions (asin, acos, atan, and atan2)
    • Hyperbolic functions (sinh, cosh, and tanh)
    • Inverse hyperbolic functions (asinh, acosh, and atanh)
  • Added a Python wrapper for the fn.experimental.numba_function (#2886, #2835, #2903, #2893, and #2887)
  • Image decoder improvements:
    • Enabled ROI decoding in the hardware decoder (#2734).
    • Added support for the alpha channel in PNG and JP2 decoding (#2867).
    • Added support for YCbCr and BGR in JP2 decoding (#2867).
  • Updated the CUDA version to 11.3 (#2870).
  • Improved the documentation (#2915, #2911, #2927, #2862, and #2858).

Fixed issues

This DALI release includes the following fixes:

  • Fixed the readers.numpy cache issue (#2932).
  • Fixed an error in readers.nemo_asr (#2928).
  • Fixed a bug that caused the video reader to hang (#2916).

Improvements

  • Improve Tensors docs (#2915)
  • DALI core allocation functions (#2930)
  • Update FFmpeg build guide and update DALI_deps version (#2911)
  • Default memory resources (#2890)
  • Better error message when insufficient data in cache (#2924)
  • Add a link to the TensorFlow ResNet50 training script in the Readme (#2927)
  • Numba func notebook (#2886)
  • Enable HW decoder ROI support (#2734)
  • Use a custom color space conversion kernel for all conversions (#2907)
  • Update packages used for DALI tests (#2906)
  • Refactor TF Dataset code and lint it (#2909)
  • Add ShotNoise CPU and GPU operators (#2861)
  • Remove workaround for the problem with patchelf changing TLS alignment for CUDA < 10.2 and > 11.1 (#2879)
  • Add dali_data_type_vec (#2887)
  • Composite resource + renaming. (#2891)
  • Update deps in third_party and conda (#2878)
  • Python wrapper for numba (#2835)
  • Image Decoder: Unified behavior across backends, Alpha channel support in PNG and JP2, YCbCr support in JP2 (#2867)
  • Better error handling in pipeline.py (#2864)
  • Update DALI deps (#2876)
  • Enable CUDA 11.3 based builds (#2870)
  • Updates MXNet plugin documentation regarding last_batch_policy (#2862)
  • README update with GTC2021 materials (#2860)
  • RNGBase to be used as base for noise augmentations + Add GaussianNoise operator (as an example) (#2846)
  • Pinned async resource (#2858)
  • Add more mathematical operations (#2853)
  • Add JpegCompressionDistortion CPU and GPU operators (#2823)
  • Split Python tests into smaller chunks (#2847)
  • Asynchronous pool memory resource (#2814)

Bug fixes

  • Add missing opencv-python dependency to TL2_FW_iterators_perf test (#2939)
  • Fix numpy reader header cache (#2932)
  • NemoAsrReader: Call Reset() on tensor vector holding the batch, to clear any previous shared data pointer. (#2928)
  • Fix DALI compilation for CUDA 11 pre 11.3 version (#2925)
  • Make dynlink_xxx use statically linked functions to load symbols. (#2931)
  • Fix test_detection_pipeline.py (#2929)
  • Add a missing av_bsf_flush call to a VideoReader seek function (#2916)
  • Run Optical Flow on stream 0 when running driver > 460. (#2914)
  • Fix nvcc warning about unused arguments in ResampleDepth_Channels (#2913)
  • Fix CUDA 10.0 compilation (#2917)
  • Use stream 0 in VideoDecoder when running driver >460 / CUDA >= 11.3. (#2902)
  • Fix docs and rename numba_func to numba_function (#2903)
  • Allow to specify optional args of Python-only types (#2898)
  • DALI TF install tool: Verify that a compatible prebuilt plugin is available for the required TF version before proceeding to attempt installation (#2882)
  • Fix coverity issues by adding lacking CUDA_CALL (#2888)
  • Fix failing test for Numba Func (#2893)
  • Fix double accumulation in horizontal resampling. Add test. (#2871)
  • Add epsilon to math function tests and adjust epsilon for rsqrt. (#2865)
  • Do not schedule any pipeline run when the iterator has prepare_first_batch=False (#2859)
  • Adjust the filenames of decoder test files and update licenses (#2844)

Breaking API changes

There are no breaking changes in this DALI release.

Deprecated features

There are no deprecated features in this DALI release.

Known issues

  • The video loader operator requires that key frames occur at least every 10 to 15 frames of the video stream. If key frames occur less frequently, the returned frames may be out of sync.
  • The DALI TensorFlow plugin might not be compatible with TensorFlow versions 1.15.0 and later.
    To use DALI with a TensorFlow version that does not have a prebuilt plugin binary shipped with DALI, make sure that the compiler that was used to build TensorFlow exists on the system during the plugin installation. (Depending on the particular version, use GCC 4.8.4, GCC 4.8.5, or GCC 5.4.)
  • Due to some known issues with Meltdown/Spectre mitigations and DALI, DALI shows best performance when run in Docker with escalated privileges, for example:
    • privileged=yes in Extra Settings for AWS data points
    • --privileged or --security-opt seccomp=unconfined for bare Docker

Binary builds

Install via pip for CUDA 10:
pip install --extra-index-url https://developer.download.nvidia.com/compute/redist/ nvidia-dali-cuda100==1.2.0
pip install --extra-index-url https://developer.download.nvidia.com/compute/redist/ nvidia-dali-tf-plugin-cuda100==1.2.0

or for CUDA 11:

The CUDA 11.0 build uses CUDA toolkit enhanced compatibility. It is built with the latest CUDA 11.x toolkit
but can run on the latest stable CUDA 11.0 capable drivers (450.80 or later).
Using the latest driver may enable additional functionality.
More details can be found in the enhanced CUDA compatibility guide.

pip install --extra-index-url https://developer.download.nvidia.com/compute/redist/ nvidia-dali-cuda110==1.2.0
pip install --extra-index-url https://developer.download.nvidia.com/compute/redist/ nvidia-dali-tf-plugin-cuda110==1.2.0
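The install commands above differ only in the package suffix (cuda100 or cuda110) and in whether the TensorFlow plugin is included. A minimal sketch of picking the right command in a setup script follows; the helper names `pip_commands` and `driver_supports_cuda11_build` are hypothetical, and the driver check assumes a version string of the form "450.80.02" and only encodes the "450.80 or later" requirement quoted above.

```python
INDEX_URL = "https://developer.download.nvidia.com/compute/redist/"
DALI_VERSION = "1.2.0"

# Package suffixes taken from the commands in these release notes.
CUDA_SUFFIX = {10: "cuda100", 11: "cuda110"}


def pip_commands(cuda_major, with_tf_plugin=False):
    """Return the pip command lines for the given CUDA major version."""
    if cuda_major not in CUDA_SUFFIX:
        raise ValueError(f"no DALI {DALI_VERSION} wheel listed for CUDA {cuda_major}")
    suffix = CUDA_SUFFIX[cuda_major]
    packages = [f"nvidia-dali-{suffix}"]
    if with_tf_plugin:
        packages.append(f"nvidia-dali-tf-plugin-{suffix}")
    return [
        f"pip install --extra-index-url {INDEX_URL} {pkg}=={DALI_VERSION}"
        for pkg in packages
    ]


def driver_supports_cuda11_build(driver_version):
    """Check a driver version string against the 450.80 minimum."""
    major, minor = (int(part) for part in driver_version.split(".")[:2])
    return (major, minor) >= (450, 80)


if __name__ == "__main__":
    for command in pip_commands(11, with_tf_plugin=True):
        print(command)
```

The commands are returned as strings rather than executed, so a script can log or review them before running anything.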

Or use direct download links (CUDA 10.0):

Or use direct download links (CUDA 11.0):

FFmpeg source code:

  • This software uses code of FFmpeg licensed under the LGPLv2.1 and its source can be downloaded here

Libsndfile source code:
