github kubernetes-sigs/kueue v0.15.0-rc.0

pre-release, one day ago

Changes since v0.14.0:

Urgent Upgrade Notes

(No, really, you MUST read this before you upgrade)

  • MultiKueue: validate remote client kubeconfigs and reject insecure kubeconfigs by default; add feature gate MultiKueueAllowInsecureKubeconfigs to temporarily allow insecure kubeconfigs until v0.17.0.

    If you are using MultiKueue kubeconfigs that do not pass the new validation, please
    enable the MultiKueueAllowInsecureKubeconfigs feature gate and let us know, so that we can reconsider
    the deprecation plans for the feature gate. (#7439, @mszadkow)

  • The .status.flavors field in LocalQueue is deprecated and will be removed in a future release.

    Consider migrating from this field to VisibilityOnDemand. (#7337, @iomarsayed)

  • Update the DRA API used from v1beta2 to v1.

    To use the DRA integration, which is enabled by the DynamicResourceAllocation feature gate in Kueue, you need Kubernetes 1.34+. (#7212, @harche)
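
A quick pre-upgrade check for the DRA requirement above could look like the following minimal sketch. The helper names are hypothetical and not part of Kueue. The sketch only assumes that the cluster version is available as a string such as "v1.34.1", for example as reported by the API server.

```python
import re

# From the note above: the DRA integration needs Kubernetes 1.34 or newer.
MIN_DRA_VERSION = (1, 34)


def parse_major_minor(git_version: str) -> tuple:
    """Extract (major, minor) from a string such as 'v1.34.1' or 'v1.33.5-gke.1'."""
    match = re.match(r"v?(\d+)\.(\d+)", git_version)
    if match is None:
        raise ValueError(f"unrecognized version string: {git_version!r}")
    return int(match.group(1)), int(match.group(2))


def supports_dra_v1(git_version: str) -> bool:
    """Return True if the given Kubernetes version meets the 1.34+ requirement."""
    return parse_major_minor(git_version) >= MIN_DRA_VERSION
```

If the check fails, either upgrade the cluster first or leave the DynamicResourceAllocation feature gate disabled in Kueue.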

Changes by Kind

Deprecation

  • Deprecate QueueVisibility in v1beta2 (#7319, @bobsongplus)
  • Remove deprecated PodIntegrationOptions (podOptions field) from v1beta2 Configuration. Users must migrate to using managedJobsNamespaceSelector (https://kueue.sigs.k8s.io/docs/tasks/run/plain_pods/) or continue using v1beta1 for this feature. (#7406, @nerdeveloper)
  • Remove deprecated retryDelayMinutes field from v1beta2 AdmissionCheckSpec. This field has been deprecated since v0.8 and provided no functionality. Users using v1beta1 are unaffected. (#7407, @nerdeveloper)

API Change

  • Expose the v1beta2 API for CRD serving. V1beta1 remains as storage. (#7304, @mimowo)

  • FlavorFungibility: introduce MayStopSearch in place of Borrow/Preempt, which are now deprecated. (#7117, @ganczak-commits)

  • Graduate Config API to v1beta2 (#7375, @mbobrovskyi)

  • Removed the deprecated workload annotation key "kueue.x-k8s.io/queue-name".

    Please ensure you are using the workload label "kueue.x-k8s.io/queue-name" instead. (#7271, @ganczak-commits)

  • V1beta2: drop deprecated Flavors field from LocalQueueStatus (#7449, @mbobrovskyi)

  • V1beta2: graduate the visibility API (#7411, @mbobrovskyi)

  • V1beta2: introduce PriorityClassRef instead of PriorityClassSource and PriorityClassName (#7540, @mbobrovskyi)

  • V1beta2: remove deprecated .spec.admissionChecks field from ClusterQueue API in favor of .spec.admissionChecksStrategy. (#7490, @nerdeveloper)

  • Introduce the ReclaimablePods feature gate, which lets users switch the reclaimable Pods feature on and off. (#7525, @PBundyra)
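
For the removal of the "kueue.x-k8s.io/queue-name" annotation key noted above, a migration of existing manifests could be sketched as follows. This is an illustrative helper, not a Kueue tool. It assumes manifests are already loaded as plain dictionaries (for example from YAML), and the function name is hypothetical.

```python
import copy

QUEUE_NAME_KEY = "kueue.x-k8s.io/queue-name"


def migrate_queue_name(manifest: dict) -> dict:
    """Return a copy of `manifest` with the queue name moved from annotations to labels."""
    result = copy.deepcopy(manifest)
    metadata = result.setdefault("metadata", {})
    annotations = metadata.get("annotations") or {}
    if QUEUE_NAME_KEY not in annotations:
        return result  # nothing to migrate

    queue = annotations.pop(QUEUE_NAME_KEY)
    labels = metadata.get("labels") or {}
    # An existing label wins over the deprecated annotation.
    labels.setdefault(QUEUE_NAME_KEY, queue)
    metadata["labels"] = labels

    # Drop the annotations map entirely if it is now empty.
    if not annotations:
        metadata.pop("annotations", None)
    return result
```

The input manifest is left untouched, so the result can be diffed against the original before anything is applied.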

Feature

  • Add TAS support to the Kubeflow integration (#7249, @kaisoz)

  • Adjust the cluster_queue_weighted_share and cohort_weighted_share metrics to report the precise value of the
    weighted share, rather than the value rounded to an integer. Also, expand the cluster_queue_weighted_share metric
    with the "cohort" label. (#7338, @j-skiba)

  • Fix: MultiKueue now supports Topology Aware Scheduling (TAS) and ProvisioningRequest integration. (#5361, @IrvingMg)

  • Improve Preemption message: include preemptor and preemptee object paths to make it easier to locate the objects involved in a preemption. (#7522, @mszadkow)

  • JobFramework: Introduce an optional interface for custom Jobs, called JobWithCustomWorkloadActivation, which can be used to deactivate or activate a custom CRD workload. (#7199, @tg123)

  • Pod integration is now auto-enabled when using LeaderWorkerSet, StatefulSet, or Deployment frameworks. (#6736, @IrvingMg)

  • Promote AdmissionFairSharing to beta (#7463, @kannon92)

  • Promote ManagedJobsNamespaceSelectorAlwaysRespected feature to Beta (#7493, @PannagaRao)

  • Promote MultiKueueBatchJobWithManagedBy to beta. (#7341, @kannon92)

  • TAS: change the algorithm used in "unconstrained" mode (enabled by the kueue.x-k8s.io/podset-unconstrained-topology annotation, or when the "implicit" mode is used) from "BestFit" to "LeastFreeCapacity".

    This helps reduce fragmentation for workloads which don't require bin-packing. (#7416, @iomarsayed)

Bug or Regression

  • Add RBAC rules for TrainJob to kueue-batch-admin and kueue-batch-user (#7196, @kannon92)

  • Fix a bug where a workload would not get requeued after eviction due to failed hotswap. (#7376, @pajakd)

  • Fix eviction of jobs with memory requests in decimal format (#7430, @brejman)

  • Fix existing workloads not being re-evaluated when new clusters are added to MultiKueueConfig. Previously, only newly created workloads would see updated cluster lists. (#6732, @ravisantoshgudimetla)

  • Fix handling of RayJobs which specify spec.clusterSelector and the "queue-name" label for Kueue. These jobs should be ignored by Kueue, because they are submitted to an existing RayCluster, which is where the resources are used and which was likely already admitted by Kueue. There is no need to admit them twice.
    Also fix a panic on Kueue-managed jobs when spec.rayClusterSpec wasn't specified. (#7218, @laurafitzgerald)

  • Fix invalid annotations path being reported in JobSet topology validations. (#7189, @kshalot)

  • Fix malformed annotations paths being reported for RayJob and RayCluster head group specs. (#7183, @kshalot)

  • Fix a bug in the StatefulSet integration where a scale-up could get stuck if
    triggered immediately after a scale-down to zero. (#7479, @IrvingMg)

  • Fix the kueue-controller-manager startup failures.

    This fixes the Kueue CrashLoopBackOff accompanied by the log message: "Unable to setup indexes","error":"could not setup multikueue indexer: setting index on workloads admission checks: indexer conflict". (#7432, @IrvingMg)

  • Fixed a bug where Kueue would keep sending empty updates to a Workload, along with the "UpdatedWorkload" event, even if the Workload didn't change. This would happen for Workloads that set
    their priority by any mechanism other than WorkloadPriorityClass, e.g. Workloads for PodGroups. (#7299, @mbobrovskyi)

  • Fixed the bug that prevented managing workloads with duplicated environment variable names in containers. This issue manifested when creating the Workload via the API. (#7425, @mbobrovskyi)

  • Kueue now properly validates and rejects unsupported DRA (Dynamic Resource Allocation) features with clear error messages instead of silently failing or producing misleading "DeviceClass not mapped" errors. Unsupported features include: AllocationMode 'All', CEL Selectors, Device Constraints, Device Config, FirstAvailable device selection, and AdminAccess. (#7226, @harche)

  • MultiKueue x ElasticJobs: fix a webhook validation bug which prevented scale-up operations when a MultiKueue
    dispatcher other than the default "AllAtOnce" was used. (#7278, @mszadkow)

  • MultiKueue: Remove remoteClient from clusterReconciler when kubeconfig is detected as invalid or insecure, preventing workloads from being admitted to misconfigured clusters. (#7486, @mszadkow)

  • Requeue a generic job when updating the workload's PodsReady condition fails. (#7364, @olderTaoist)

  • Services: fix the setting of the app.kubernetes.io/component label to discriminate between different service components within Kueue as follows:

    • controller-manager-metrics-service for kueue-controller-manager-metrics-service
    • visibility-service for kueue-visibility-server
    • webhook-service for kueue-webhook-service (#7371, @rphillips)

  • TAS: Increase the limit on the number of Topology levels for LocalQueues and Workloads to 16 (#7423, @kannon92)

  • TAS: Introduce missing validation against using incompatible PodSet grouping configuration in JobSet, MPIJob, LeaderWorkerSet, RayJob and RayCluster.

    Now, only groups of two PodSets can be defined and one of the grouped PodSets has to have only a single Pod.
    The PodSets within a group must specify the same topology request via one of the kueue.x-k8s.io/podset-required-topology and kueue.x-k8s.io/podset-preferred-topology annotations. (#7061, @kshalot)

  • Visibility API: Fix a bug where the Config clientConnection was not respected in the visibility server. (#7223, @tenzen-y)

  • With BestEffortFIFO enabled, Kueue keeps attempting to schedule a workload as long as
    it is waiting for preemption targets to complete. This fixes a bug where an inadmissible
    workload went back to the head of the queue, in front of the preempting workload, allowing
    preempted workloads to reschedule. (#7157, @gabesaba)

  • WorkloadRequestUseMergePatch: use "strict" mode for admission patches during scheduling, which
    sends the ResourceVersion of the workload being admitted for comparison by kube-apiserver.
    This fixes a race condition in which Workload conditions added concurrently by other controllers
    could be removed during scheduling. (#7246, @mszadkow)
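
The TAS PodSet grouping rules listed above (groups of exactly two PodSets, one of them with a single Pod, both with the same topology request) can be illustrated with a small sketch. This is not Kueue's implementation. The PodSet class and its group field are hypothetical stand-ins, and "same topology request" is assumed here to mean the same annotation key with the same value.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Optional

REQUIRED = "kueue.x-k8s.io/podset-required-topology"
PREFERRED = "kueue.x-k8s.io/podset-preferred-topology"


@dataclass
class PodSet:
    name: str
    count: int
    group: Optional[str] = None  # hypothetical field naming the PodSet group
    annotations: dict = field(default_factory=dict)


def topology_request(podset: PodSet):
    """Return (annotation key, value) of the PodSet's topology request, or None."""
    for key in (REQUIRED, PREFERRED):
        if key in podset.annotations:
            return key, podset.annotations[key]
    return None


def validate_groups(podsets: list) -> list:
    """Return a list of error messages; an empty list means the grouping is valid."""
    groups = defaultdict(list)
    for podset in podsets:
        if podset.group is not None:
            groups[podset.group].append(podset)

    errors = []
    for name, members in groups.items():
        if len(members) != 2:
            errors.append(f"group {name!r}: expected exactly 2 PodSets, got {len(members)}")
            continue
        if not any(podset.count == 1 for podset in members):
            errors.append(f"group {name!r}: one PodSet must have a single Pod")
        first, second = (topology_request(podset) for podset in members)
        if first is None or first != second:
            errors.append(f"group {name!r}: PodSets must specify the same topology request")
    return errors
```

A leader PodSet with one Pod grouped with a multi-Pod worker PodSet passes as long as both carry the same annotation. Changing one of them from the required to the preferred annotation produces an error.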

Other (Cleanup or Flake)

  • Improve the messages presented to the user in scheduling events, by clarifying the reason for "insufficient quota"
    in case of workloads with multiple PodSets.

    Example:

    • before: "insufficient quota for resource-type in flavor example-flavor, request > maximum capacity (24 > 16)"
    • after: "insufficient quota for resource-type in flavor example-flavor, previously considered podsets requests (16) + current podset request (8) > maximum capacity (16)" (#7232, @iomarsayed)

  • Restrict access to secrets for the Kueue controller manager only to secrets in the Kueue system namespace, i.e.
    kueue-system by default, or the one specified during installation with Helm. (#7188, @sbgla-sas)

  • Support mutating the kueue.x-k8s.io/priority-class label when quota is reserved (#7289, @mbobrovskyi)

  • V1beta2: Removed deprecated Preempt/Borrow from FlavorFungibility API (#7527, @mbobrovskyi)
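
The arithmetic behind the improved "insufficient quota" message above can be reproduced with a short sketch. It only mirrors the message format quoted in the example. The function and parameter names are hypothetical, and this is not the scheduler's actual code.

```python
from typing import Optional


def insufficient_quota_message(
    resource: str,
    flavor: str,
    previous_requests: int,
    current_request: int,
    capacity: int,
) -> Optional[str]:
    """Return the 'after' style message, or None if the PodSet still fits.

    previous_requests is the sum already requested by earlier PodSets of the
    same workload; current_request is the PodSet being considered now.
    """
    if previous_requests + current_request <= capacity:
        return None
    return (
        f"insufficient quota for {resource} in flavor {flavor}, "
        f"previously considered podsets requests ({previous_requests}) "
        f"+ current podset request ({current_request}) "
        f"> maximum capacity ({capacity})"
    )
```

With the values from the example (16 already requested, 8 for the current PodSet, capacity 16), the sum 24 exceeds 16. The new message shows both parts of the sum instead of only the total.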
