Highlights

Encoding / Decoding images

Torchvision is extending its encoding/decoding capabilities. For this version, we added a GIF decoder which is available as torchvision.io.decode_gif(raw_tensor), torchvision.io.decode_image(raw_tensor), and torchvision.io.read_image(path_to_image).

We also added support for GPU JPEG encoding in torchvision.io.encode_jpeg(). This is 10X faster than the existing CPU JPEG encoder.

Resizing according to the longest edge of an image

It is now possible to resize images by setting torchvision.transforms.v2.Resize(size=None, max_size=N): this will resize the longest edge of the image exactly to max_size, making sure that no image dimension exceeds this value. Read more in the docs!
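The arithmetic behind longest-edge resizing can be sketched in plain Python. The helper below is hypothetical and is not torchvision's implementation: it assumes the shorter edge is scaled by the same factor as the longer one and truncated to an integer.

```python
from typing import Tuple


def resize_longest_edge(height: int, width: int, max_size: int) -> Tuple[int, int]:
    """Return (new_height, new_width) with the longer edge equal to max_size.

    Hypothetical helper for illustration; the truncation of the shorter
    edge is an assumption, not confirmed library behavior.
    """
    if min(height, width, max_size) <= 0:
        raise ValueError("height, width and max_size must be positive")

    short, long = (width, height) if width <= height else (height, width)
    new_long = max_size
    # Scale the shorter edge by the same factor to keep the aspect ratio.
    new_short = int(max_size * short / long)

    # Map (long, short) back to (height, width) order.
    if width <= height:
        return new_long, new_short
    return new_short, new_long


print(resize_longest_edge(480, 640, 320))  # (240, 320)
print(resize_longest_edge(640, 480, 320))  # (320, 240)
```

Note that the longer edge is set to max_size even when the input is smaller, so this kind of resize can also upscale.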

Detailed changes

Bug Fixes

[datasets] SBDataset: Only download noval file when image_set='train_noval' (#8475)
[datasets] Update the download url in class EMNIST (#8350)
[io] Fix compilation error when there is no libjpeg (#8342)
[reference scripts] Fix use of cutmix_alpha in classification training references (#8448)
[utils] Allow K=1 in draw_keypoints (#8439)

New Features

[io] Add decoder for GIF images (decode_gif(), decode_image(), read_image()) (#8406, #8419)
[transforms] Add GaussianNoise transform (#8381)

Improvements

[transforms] Allow v2 Resize to resize longer edge exactly to max_size (#8459)
[transforms] Add min_area parameter to SanitizeBoundingBox (#7735)
[transforms] Make adjust_hue() work with numpy 2.0 (#8463)
[transforms] Enable one-hot-encoded labels in MixUp and CutMix (#8427)
[transforms] Create kernel on-device for transforms.functional.gaussian_blur (#8426)
[io] Adding GPU acceleration to encode_jpeg (10X faster than CPU encoder) (#8391)
[io] read_video: accept BytesIO objects on pyav backend (#8442)
[io] Add compatibility with FFMPEG 7.0 (#8408)
[datasets] Add extra to install gdown (#8430)
[datasets] Support encoded RLE format for COCO segmentations (#8387)
[datasets] Added binary cat vs dog classification target type to Oxford pet dataset (#8388)
[datasets] Return labels for FER2013 if possible (#8452)
[ops] Force use of torch.compile on deterministic roi_align implementation (#8436)
[utils] Add float support to utils.draw_bounding_boxes() (#8328)
[feature_extraction] Add concrete_args to feature extraction tracing (#8393)
[Docs] Various documentation improvements (#8429, #8467, #8469, #8332, #8262, #8341, #8392, #8386, #8385, #8411)
[Tests] Various testing improvements (#8454, #8418, #8480, #8455)
[Code quality] Various code quality improvements (#8404, #8402, #8345, #8335, #8481, #8334, #8384, #8451, #8470, #8413, #8414, #8416, #8412)

Contributors

We're grateful for our community, which helps us improve torchvision by submitting issues and PRs, and providing feedback and suggestions. The following persons have contributed patches for this release:

Adam J. Stewart, ahmadsharif1, AJS Payne, Andrew Lingg, Andrey Talman, Anner, Antoine Broyelle, cdzhan, deekay42, drhead, Edward Z. Yang, Emin Orhan, Fangjun Kuang, G, haarisr, Huy Do, Jack Newsom, JavaZero, Mahdi Lamb, Mantas, Nicolas Hug, nihui, Richard Barnes, Richard Zou, Richie Bendall, Robert-André Mauchin, Ross Wightman, Siddarth Ijju, vfdev

pytorch/vision v0.19.0 Torchvision 0.19 release on GitHub