Release notes
Commits since last release: v25.3.0-rc.3...v25.3.0-rc.4. Changes are summarized below.
Fixes
- The logic for removing stale ComputeDomain node labels has been fixed and consolidated, which is especially important when workload pods are created and then deleted again in rapid succession (#404).
- A ComputeDomain update (an IMEX daemon Pod IP change) did not reliably lead to a daemon restart (#407).
- The IMEX daemon's liveness probe's stderr was not collected (#407).
- The IMEX daemon's log output was not reliably collected, especially around shutdown (#407).
- Controller pod and IMEX daemon pods now explicitly run with NVIDIA_VISIBLE_DEVICES=void, which addresses various error symptoms in some environments (#402).
Notable changes
- Container images are now based on nvcr.io/nvidia/cuda:12.9.1-base-ubi9.