github openucx/ucx v1.21.0

1.21.0 (June 24, 2026)

Features:

UCP

  • Added UCX_PROTO_EMULATION_ENABLE option to force zero-copy RMA protocol selection
  • Added UCX_MAX_HCA_PER_GPU policy to limit GPU memory registrations to nearest HCAs
  • Added device lanes that can access host memory for GPU transfer fallback
  • Enabled gdr_copy for memtype endpoint transport

UCT

  • Added device channel pool support
  • Added support for using CPU memory as the AMO local buffer for device operations

RDMA CORE (IB, ROCE, etc.)

  • Added UCX_IB_GDA_RETAIN_INACTIVE_CTX option to control inactive CUDA context retention in GDAKI
  • Added configure option to enable or disable GGA transport

Build

  • Added --without-gda configure option
  • Made cuRAND an optional dependency for perftest CUDA kernels

CI/Testing

  • Added dry-run package installation checks to the release package build

Bugfixes:

Build

  • Fixed support for -Og by disabling always-inline attributes

UCP

  • Fixed progress counter to return the actual operation status
  • Fixed multi protocol minimum size handling for 1-byte operations
  • Fixed endpoint finalization when no P2P or connection-manager lane is available

UCT

  • Fixed notify callback handling by adding a NULL check

CUDA

  • Fixed CUDA IPC accessibility cache separation for local and remote rkeys
  • Fixed CUDA IPC cache/LRU invariant for referenced regions
  • Fixed DMA-BUF offsets for interior CUDA addresses

ROCM

  • Fixed hangs in HIP MPI and OMB tests
  • Fixed endpoint flush for in-flight ROCm operations

RDMA CORE (IB, ROCE, etc.)

  • Fixed GDA DMA-BUF offset handling
  • Fixed GDA WQE ordering by using CAS-based readiness marking
  • Fixed GDAKI CUDA context handling during endpoint creation
  • Fixed GDAKI NIC/GPU mapping when CUDA_VISIBLE_DEVICES hides physical GPUs

TCP

  • Fixed interface selection by skipping IPv4 link-local addresses

UCS

  • Reverted dynamically loaded external module/plugin support

Packaging

  • Fixed Debian maintainer field
  • Fixed GDA RPM build
  • Fixed GDA RPM/devel package layout for CUDA/GDA subpackages
  • Fixed RPM/DEB handling when GDA is disabled
