github xiph/rav1e v0.4.0
v0.4.0: Happy New Year

3 years ago

rav1e 0.4.0 provides solid speed improvements on both x86_64 and aarch64.

Like the 0.3.5 release, 0.4.0 supports Apple Silicon out of the box.

The overall speedup is solid across the speed levels for both 8-bit and 10-bit encoding, with a drastic improvement for aarch64 at speed 10.

Quality-wise, for 4:2:0 video, most metrics improved across all speed levels, with speed 5 getting the largest boost.
4:2:2 and 4:4:4 video saw greater improvements in quality, as they were brought to feature parity with 4:2:0. See the improvements section below for more details.

Speed level     PSNR  PSNR Cb  PSNR Cr  PSNR HVS     SSIM  MS SSIM  CIEDE 2000   VMAF
0            -1.3542  -4.0733  -3.3946   -1.7433  -1.7734  -1.9269     -2.4361  -2.16
1            -1.0343  -3.7382  -3.4084   -1.3605  -1.2619    -1.53     -2.2265  -1.97
2            -1.0407  -3.9916  -3.6426   -1.4196  -1.6259  -1.8107     -2.3525  -2.28
3            -1.1544  -5.2352  -4.6235   -1.6259  -1.7752   -1.947      -2.672  -1.95
4             -0.548  -4.7456  -4.3344   -0.9114  -1.1915  -1.3188     -2.1232  -1.66
5            -2.3185  -4.5738  -4.4277   -2.6101  -2.9586  -2.8967     -3.2177  -3.19
6            -1.8238  -2.0511  -2.2246   -1.7811   -1.997  -1.8386     -1.9551  -2.44
7            -1.8314  -2.0694  -2.5498   -1.7675  -1.9612  -1.8752     -1.9191   -2.6
8            -1.8239  -2.2058  -2.5742    -1.795  -1.9449  -1.8676     -1.9334  -2.71
9            -1.6422  -2.0831   -2.314   -1.6198  -1.7644  -1.6923     -1.9255  -1.92
10            -0.108  -1.4077  -2.0309   -0.2935   0.1188   0.0143     -0.0963  -2.35
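
As a quick cross-check of the summary above, the following Rust sketch averages the eight metric columns for each speed level and reports which level has the most negative mean. It assumes that more negative values mean a larger improvement, which matches the text but is not stated alongside the table. The function names are illustrative only and are not part of rav1e.

```rust
/// Metric columns, in the order they appear in the table.
const METRICS: [&str; 8] = [
    "PSNR", "PSNR Cb", "PSNR Cr", "PSNR HVS", "SSIM", "MS SSIM", "CIEDE 2000", "VMAF",
];

/// Rows copied from the table: (speed level, change per metric).
const ROWS: [(u8, [f64; 8]); 11] = [
    (0, [-1.3542, -4.0733, -3.3946, -1.7433, -1.7734, -1.9269, -2.4361, -2.16]),
    (1, [-1.0343, -3.7382, -3.4084, -1.3605, -1.2619, -1.53, -2.2265, -1.97]),
    (2, [-1.0407, -3.9916, -3.6426, -1.4196, -1.6259, -1.8107, -2.3525, -2.28]),
    (3, [-1.1544, -5.2352, -4.6235, -1.6259, -1.7752, -1.947, -2.672, -1.95]),
    (4, [-0.548, -4.7456, -4.3344, -0.9114, -1.1915, -1.3188, -2.1232, -1.66]),
    (5, [-2.3185, -4.5738, -4.4277, -2.6101, -2.9586, -2.8967, -3.2177, -3.19]),
    (6, [-1.8238, -2.0511, -2.2246, -1.7811, -1.997, -1.8386, -1.9551, -2.44]),
    (7, [-1.8314, -2.0694, -2.5498, -1.7675, -1.9612, -1.8752, -1.9191, -2.6]),
    (8, [-1.8239, -2.2058, -2.5742, -1.795, -1.9449, -1.8676, -1.9334, -2.71]),
    (9, [-1.6422, -2.0831, -2.314, -1.6198, -1.7644, -1.6923, -1.9255, -1.92]),
    (10, [-0.108, -1.4077, -2.0309, -0.2935, 0.1188, 0.0143, -0.0963, -2.35]),
];

/// Mean change across the eight metrics of one row.
fn mean_change(values: &[f64; 8]) -> f64 {
    values.iter().sum::<f64>() / values.len() as f64
}

/// Speed level whose mean change is the most negative.
/// Assumption: more negative means a larger improvement.
fn largest_gain() -> u8 {
    ROWS.iter()
        .min_by(|a, b| mean_change(&a.1).total_cmp(&mean_change(&b.1)))
        .map(|(speed, _)| *speed)
        .expect("table is non-empty")
}

/// (speed level, metric) pairs whose value is positive,
/// i.e. the entries that did not improve under the same assumption.
fn regressions() -> Vec<(u8, &'static str)> {
    let mut out = Vec::new();
    for (speed, values) in ROWS.iter() {
        for (name, value) in METRICS.iter().zip(values.iter()) {
            if *value > 0.0 {
                out.push((*speed, *name));
            }
        }
    }
    out
}

fn main() {
    for (speed, values) in ROWS.iter() {
        println!("speed {:>2}: mean change {:+.3}", speed, mean_change(values));
    }
    println!("largest gain at speed {}", largest_gain());
    println!("positive entries: {:?}", regressions());
}
```

An unweighted mean over all eight metrics is a simplification chosen for illustration; a per-metric comparison may be more appropriate when one metric matters most.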

Improvements

  • Enable open partitions on frame boundaries (2% improvement to coding efficiency)
  • Use av-metrics in CLI to compute PSNR, PSNR-HVS, SSIM, MS-SSIM, and CIEDE2000 (see --metrics)
  • Enable deblocking in loop filter rate-distortion optimization (0.5% to 1.5% improvement to coding efficiency)
  • Thread CDEF loop filter with tiles (1.2% reduction in encoding time with 4 tiles)
  • Redesign the rate control API
  • Add monochrome support
  • Improve 4:2:2 support (37% reduction in encoding time, 0.8% to 5% improvement to coding efficiency)
  • Add compound prediction mode variants for drl=2 and drl=3
  • Enable NEAR_NEAR1MV and NEAR_NEAR2MV compound modes
  • Support arbitrary-SAR anamorphic video
  • Enforce a frame limit of 1 in still picture mode
  • Add a quiet mode to the CLI (--quiet flag)
  • Convert all motion vector predictors to full pixel precision
  • Update non-broken motion estimation predictors (0.28% improvement to coding efficiency)
  • Substantially rework initial motion estimation (9% reduction in encoding time)
  • Optimize predictors for multipass motion estimation (0.3% to 0.4% improvement to coding efficiency)
  • Optimize chroma quantizer offsets for 4:4:4 sampling
  • Allow opaque extra data to be attached to frames and retrieved from encoded packets via the API
  • Merge new dav1d assembly code for x86 and AArch64
  • Add/improve assembly code for distortion computations
  • Derive quantizers using linear models (0.7% to 1.7% improvement to coding efficiency)
  • Prune intra frame prediction mode list dynamically (5.5% to 12.2% reduction in encoding time at speed level 5)
  • Optimize rate-distortion optimization loop (1% reduction in encoding time)
  • Reduce memory allocation count in various areas
  • Optimize tile block access (1.5% reduction in encoding time)
  • Allow frame sizes <16x16 in still picture mode
  • Add high bit depth AVX2 assembly (9% to 31% reduction in encoding time for 10-bit video)

Bug Fixes

  • Fix rebuilding with fresh assembly output
  • Fix the chroma plane desyncs on narrow frames
  • Abort rate controlled encoding without a bitrate target in the CLI
  • Fix the -v CLI option
  • Fix a crash when using 4 tiles for 1080p 4:2:2 input
  • Fix intra edge filter desyncs with 4:2:2 and 4:4:4 input
  • Fix a symbol redefinition error for AArch64 builds using Clang
  • Fix loop restoration filter with 4:2:2 and 4:4:4 input
  • Fix incorrect quantizer index clamping
  • Fix cross-compiling for MinGW-w64 on macOS
  • Avoid a buffer underflow condition in CDEF pad_into_tmp16()
  • Properly validate minimum RDO lookahead frames value
  • Respect quantizer bounds with rate control enabled
  • Restrict still picture mode to single-frame streams

Changes

  • Bump minimum version of NASM to 2.14.02
  • Update speed presets
    • Enable full SGR search for levels 0-4 instead of 0-8
    • Enable fine directional prediction for all speed levels
    • Enable reduced transform type search for levels 6-10 instead of 5-10
    • Disable transform type RDO for inter frames
  • Rename "native" CPU feature level to "Rust" (use RAV1E_CPU_TARGET=rust at runtime)
  • Remove in-library PSNR computation feature
  • Move frame-related data structures to a separate crate (v_frame)
  • Extend dump_lookahead_data feature
    • Export the frame_subtype property
    • Use the RAV1E_DATA_PATH environment variable to determine the output path
  • Refactor CDEF to make importing dav1d CDEF assembly easier and to simplify interaction between loop filters
  • Remove leftover code ported from libaom
  • Remove unused diamond motion estimation
  • Reduce build time
    • Disable LTO by default
    • Disable code generation unit restriction
    • Allow incremental builds for the release profile
    • Inline various functions
    • Remove large stack allocations
    • Split large modules into multiple submodules
  • Add an unstable channel API feature
  • Prompt before overwriting an existing output file, and add a -y flag to override the prompt
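
Several of the items above affect how the rav1e binary is invoked: the RAV1E_CPU_TARGET=rust runtime setting, the --quiet flag, and the -y override. The Rust sketch below shows one way a wrapper program might assemble such an invocation with std::process::Command. The helper name and the file names are hypothetical, and the command is only built and printed here, not executed.

```rust
use std::process::Command;

/// Build (but do not run) a rav1e invocation.
/// `input` and `output` are placeholder paths supplied by the caller.
fn rav1e_command(input: &str, output: &str) -> Command {
    let mut cmd = Command::new("rav1e");
    cmd.env("RAV1E_CPU_TARGET", "rust") // select the "rust" CPU feature level at runtime
        .arg(input)
        .arg("-o")
        .arg(output)
        .arg("--quiet") // quiet mode added in this release
        .arg("-y"); // overwrite the output file without prompting
    cmd
}

fn main() {
    let cmd = rav1e_command("input.y4m", "output.ivf");
    // Print the assembled command for inspection.
    println!("{:?}", cmd);
    // To actually encode, a caller would run `cmd.status()` with rav1e installed.
}
```

Keeping construction separate from execution lets the arguments and environment be inspected or logged before an encode is started.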
