Key Features and Enhancements
This DALI release includes the following key features and enhancements:
- Added CUDA 11.8 support.
- Improved color conversion performance and precision (#4139).
- Laid the groundwork for ongoing conditional execution effort (#4149, #4124, #4083, #3827, #4049).
- Laid the groundwork for ongoing effort on improved decoding and processing of images.
- Documentation improvements (#4168, #4102, #4059, #4094).
Fixed Issues
The following issues were fixed in this release:
- Fixed default dtype in color twist family of operators (#4067).
- Fixed handling of TIFFs with palette (#4089).
Improvements
- Separating nvjpeg2k utils in imgcodec (#4160)
- Add NvJpeg2000Decoder (#4114)
- Port operators Python tests to nose2 (#4037)
- Refactor Tensor Vector (#4149)
- Rename ImageDecoder to ImageDecoderFactory. (#4169)
- Add section on deferred setup and shm limit to PES docs (#4168)
- Change pinned version of matplotlib (#4167)
- Add LibTIFF decoder (#4109)
- Make decoder_test_helper.h accept TensorView (#4154)
- Update dependencies (#4152)
- Add color conversion support (#4143)
- Extend the ImageDecoder testing framework to support GPU decoders (#4142)
- Add color space conversion to imgcodec (#4121)
- Fix CVE-2022-34526 (#4133)
- Copy nvjpeg utils into imgcodec (#4148)
- Fix linter for files inside the dali_tf_plugin directory (#4118)
- Add LibJpegTurboDecoder (#4099)
- Color conversion - optimizations and tests (#4139)
- Move to CUDA 11.7U1 (#4137)
- Remove pageable copies from Convolution, Transpose and Warp kernels. (#4141)
- Add AsTensor and related APIs to Tensor Vector (#4124)
- [imgcodec] Add thread index and cuda stream to Decode APIs (#4128)
- Move operator test files (#4125)
- Silence some constexpr-related warnings in NVCC 10. (#4131)
- Move libjpeg-turbo utils/impl to imgcodec directory (#4129)
- Add missing constexpr to vec and mat. (#4130)
- Parse EXIF metadata in PNG imgcodec parser (#4122)
- Add parenthesis to assert to avoid using \ (#4123)
- Fix error reported by flake8 5.0.1 (#4120)
- Turn Python linter on by default (#3997)
- Add decoder test framework (#4103)
- Add dali namespace to third_party copy of OpenCV's exif (#4112)
- Parsing EXIF metadata in WebP images (#4087)
- Add PNG parser (#4052)
- Fix OpenCV warning in jpeg compression distortion tests (#4107)
- Document unsupported external source arguments in TF Dataset (#4102)
- Add boilerplate synchronization for batch copying (#4083)
- Pin Numba version to 0.55.2 (#4108)
- Example image decoder using OpenCV (#4036)
- Remove signal handler for SIGKILL (#4015)
- Extract common functions from numpy reader (#4100)
- Add JPEG EXIF parser (#4073)
- Remove video reader warning that a frame has been seen twice (#4092)
- Remove unnecessary logging from resize checkerboard tests (#4086)
- Add Jpeg2000 parser (#4068)
- Fix flake8 warnings (#4074)
- Fix & extend formatting of collections. (#4082)
- Add inherited members to the Pytorch plugin docs (#4094)
- Adjust Doxygen configuration (#4088)
- Add imgcodec compatibility tests (#4057)
- Add restrictions to set_type (#4071)
- Add WebP parser (#4053)
- Add JPEG Parser (#4050)
- Silence buggy GCC warning about freeing non-heap objects. (#4077)
- Add a tool for testing Imgcodec against ImageMagick (#4058)
- BMP parser (#4062)
- Make endian swapping work with ADL. (#4075)
- Add utilities for swapping endianness. (#4069)
- Add PNM parser (#4044)
- Add references to image_processing/index. Add optional ordering to references. (#4059)
- Extract EXIF parser from OpenCV (#4063)
- Fix ifndef guards to be at the end of the file (#4064)
- Stop exposing internal contiguous TV storage (#3827)
- ReadValue extension to support enums (#4060)
- Propagate device_id in ShareData and SetSample APIs (#4049)
- Add TIFF parser (#4040)
- Make the DALI video reader throw an exception when the VFR video is decoded (#4022)
- Add ReadHeader util to parser baseclass (#4042)
Bug Fixes
- Prevent excessive synchronization in MakeContiguous (#4228)
- Prevent overflow in random_resized_crop tests (#4187)
- Fix invalid destruction order in decoder test helper (#4186)
- Added missing const in for loops (#4185)
- Fix coverity issues (#4164)
- Conditional compilation of TIFF Codec (#4166)
- Fix zlib CVE-2022-37434 (#4150)
- Pin matplotlib version to 3.5.2 (#4159)
- Fix parsing of grayscale bitmaps (#4147)
- Install flake8 for xavier builds (#4127)
- Fix handling of TIFFs with palette (#4089)
- Fix missing override in decoder test (#4105)
- Disable HEVC tests for FramesDecoderGpu when it is not supported by the GPU (#4084)
- Fix default dtype in color twist family of operators (#4067)
- Fix libtiff CVE-2022-2058, CVE-2022-2057, CVE-2022-2056 (#4047)
Breaking API changes
There are no breaking changes in this DALI release.
Deprecated features
There are no deprecated features in this DALI release.
Known issues:
- The video loader operator requires that the key frames occur, at a minimum, every 10 to 15 frames of the video stream.
  If the key frames occur less frequently than that, the returned frames might be out of sync.
- Experimental VideoReaderDecoder does not support open GOP.
  It will not report an error and might produce invalid frames. VideoReader uses a heuristic approach to detect open GOP and should work in most common cases.
- The DALI TensorFlow plugin might not be compatible with TensorFlow versions 1.15.0 and later.
  To use DALI with a TensorFlow version that does not have a prebuilt plugin binary shipped with DALI, make sure that the compiler that was used to build TensorFlow exists on the system during the plugin installation. (Depending on the particular version, you can use GCC 4.8.4, GCC 4.8.5, or GCC 5.4.)
- In experimental debug and eager modes, the GPU external source is not properly synchronized with DALI internal streams.
  As a workaround, you can manually synchronize the device before returning the data from the callback.
- Due to some known issues with Meltdown/Spectre mitigations and DALI, DALI shows best performance when running in Docker with escalated privileges, for example:
  - privileged=yes in Extra Settings for AWS data points
  - --privileged or --security-opt seccomp=unconfined for bare Docker
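The external source workaround listed above (synchronizing the device before the callback returns) could be sketched roughly as follows. This is only an illustration, not DALI API: `synchronize_before_return`, `produce_batch`, and `fake_device_sync` are hypothetical names, and the synchronization function is passed in so the sketch runs without a GPU. In a real callback that builds its data with CuPy, the `synchronize` argument might be `cupy.cuda.Device().synchronize`; with PyTorch it might be `torch.cuda.synchronize`. Both are assumptions about how the callback produces its data.

```python
from functools import wraps


def synchronize_before_return(callback, synchronize):
    """Wrap a callback so `synchronize` runs before its data is returned.

    Hypothetical helper: `synchronize` is any zero-argument function that
    blocks until pending device work has finished.
    """

    @wraps(callback)
    def wrapper(*args, **kwargs):
        data = callback(*args, **kwargs)
        # Wait for the work that produced `data` before handing it over.
        synchronize()
        return data

    return wrapper


# Demonstration with stand-ins (no GPU required): record the call order.
events = []


def produce_batch(sample_index):
    # Stand-in for a callback that would produce GPU data.
    events.append("produce")
    return [sample_index]


def fake_device_sync():
    # Stand-in for a real device synchronization call.
    events.append("sync")


safe_callback = synchronize_before_return(produce_batch, fake_device_sync)
result = safe_callback(3)
```

The wrapped callback can then be used wherever the original callback was passed; the only behavioral change is the blocking synchronization just before the data is returned.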
Binary builds
Install via pip for CUDA 10.2:
pip install --extra-index-url https://developer.download.nvidia.com/compute/redist/ nvidia-dali-cuda102==1.17.0
pip install --extra-index-url https://developer.download.nvidia.com/compute/redist/ nvidia-dali-tf-plugin-cuda102==1.17.0
or for CUDA 11:
The CUDA 11.0 build uses CUDA toolkit enhanced compatibility. It is built with the latest CUDA 11.x toolkit
but can run on the latest stable CUDA 11.0 capable drivers (450.80 or later).
Using the latest driver may enable additional functionality.
More details can be found in the enhanced CUDA compatibility guide.
pip install --extra-index-url https://developer.download.nvidia.com/compute/redist/ nvidia-dali-cuda110==1.17.0
pip install --extra-index-url https://developer.download.nvidia.com/compute/redist/ nvidia-dali-tf-plugin-cuda110==1.17.0
Or use direct download links (CUDA 10.2):
- https://developer.download.nvidia.com/compute/redist/nvidia-dali-cuda102/nvidia_dali_cuda102-1.17.0-5838887-py3-none-manylinux2014_x86_64.whl
- https://developer.download.nvidia.com/compute/redist/nvidia-dali-tf-plugin-cuda102/nvidia-dali-tf-plugin-cuda102-1.17.0.tar.gz
Or use direct download links (CUDA 11.0):
- https://developer.download.nvidia.com/compute/redist/nvidia-dali-cuda110/nvidia_dali_cuda110-1.17.0-5838886-py3-none-manylinux2014_x86_64.whl
- https://developer.download.nvidia.com/compute/redist/nvidia-dali-cuda110/nvidia_dali_cuda110-1.17.0-5838886-py3-none-manylinux2014_aarch64.whl
- https://developer.download.nvidia.com/compute/redist/nvidia-dali-tf-plugin-cuda110/nvidia-dali-tf-plugin-cuda110-1.17.0.tar.gz
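After installing with one of the options above, it can be useful to confirm which DALI wheel ended up in the environment. The following is a minimal sketch using only the Python standard library. The distribution names are taken from the pip commands in this section; the helper names `installed_version` and `find_dali` are hypothetical.

```python
from importlib import metadata

EXPECTED_VERSION = "1.17.0"
# Distribution names as they appear in the pip commands of this release.
DALI_DISTRIBUTIONS = ("nvidia-dali-cuda102", "nvidia-dali-cuda110")


def installed_version(distribution):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


def find_dali():
    """Map each installed DALI distribution name to its version."""
    found = {}
    for name in DALI_DISTRIBUTIONS:
        version = installed_version(name)
        if version is not None:
            found[name] = version
    return found


if __name__ == "__main__":
    installed = find_dali()
    if not installed:
        print("No DALI distribution found in this environment.")
    for name, version in installed.items():
        status = "matches" if version == EXPECTED_VERSION else "differs from"
        print(f"{name} {version} {status} {EXPECTED_VERSION}")
```

Querying package metadata avoids importing DALI itself, so the check also works on machines without a GPU or a matching driver.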
FFmpeg source code:
Libsndfile source code: