Changelog
General
- Experimental support for DRA autoscaling is implemented, disabled by default. #7530
  - To enable it, set the --enable-dynamic-resource-allocation flag in Cluster Autoscaler, and the DynamicResourceAllocation feature gate in the cluster. Additionally, RBAC configuration must allow Cluster Autoscaler to list the following new objects: resource.k8s.io/ResourceClaim, resource.k8s.io/ResourceSlice, resource.k8s.io/DeviceClass.
  - The support is experimental and not yet intended for production use. In particular, details about how Cluster Autoscaler reacts to DRA resources might change in future releases. Most autoscaling scenarios should work, but with potentially reduced performance. Details about missing features can be found in the #7530 PR description.
- Remove legacy scale down code. #7219
  - The --parallel-drain flag was removed. To ensure only one node is drained at the same time, use --max-drain-parallelism=1.
  - The --max-empty-bulk-delete flag was deprecated. It still works, but is equivalent to and will be replaced by --max-scale-down-parallelism in a future release.
- Add v1 CRD for ProvisioningRequest. #7223
- Enable ability to set custom lease resource name using the --lease-resource-name flag. #7233
- Make DaemonSet pod eviction best effort when draining empty nodes (nodes without user pods). Cluster Autoscaler no longer waits for DaemonSet pods to be fully evicted from empty nodes. #7236
- Support for frequent loops when a ProvisioningRequest is encountered in the last loop. #7418
- Allow CheckCapacity ProvisioningRequests to be processed in batch mode with configurable max batch size and batch timebox. The feature is controlled via new flags: --check-capacity-batch-processing, --check-capacity-provisioning-request-max-batch-size, --check-capacity-provisioning-request-batch-timebox. #7283
- Add a --force-delete-long-unregistered-nodes flag, which if used allows Cluster Autoscaler to remove long unregistered nodes even if it would break the min size constraints of their node group. #7493
- New parameter for CheckCapacity ProvisioningRequests that limits the retry mechanism. #7496
- Mark ProvisioningRequest CheckCapacity conditions in parallel to increase throughput. #7561
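
For the experimental DRA support listed in this section, the RBAC requirement can be sketched as a generated ClusterRole manifest. This is only an illustration: the flag, feature gate, and the three object kinds come from the note above, while the role name `cluster-autoscaler-dra` and the extra `get` and `watch` verbs are assumptions (the note itself only requires `list`).

```python
import json

# Names taken from the release note; the "=true" form is standard boolean-flag syntax.
CA_FLAG = "--enable-dynamic-resource-allocation=true"
FEATURE_GATE = "DynamicResourceAllocation=true"


def dra_rbac_rule(verbs=("get", "list", "watch")):
    """Build one ClusterRole rule covering the three DRA object kinds.

    Only "list" is stated in the release note; "get" and "watch" are an
    assumption made here because list/watch clients commonly need them.
    """
    return {
        "apiGroups": ["resource.k8s.io"],
        # Lowercase plural resource names for ResourceClaim,
        # ResourceSlice, and DeviceClass.
        "resources": ["resourceclaims", "resourceslices", "deviceclasses"],
        "verbs": list(verbs),
    }


def dra_cluster_role(name="cluster-autoscaler-dra"):
    """Wrap the rule in a ClusterRole; the role name is hypothetical."""
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "ClusterRole",
        "metadata": {"name": name},
        "rules": [dra_rbac_rule()],
    }


if __name__ == "__main__":
    # kubectl accepts JSON manifests as well as YAML.
    print(json.dumps(dra_cluster_role(), indent=2))
```

The resulting role still has to be bound to the service account that Cluster Autoscaler runs under; an existing ClusterRole can instead be extended with the same rule.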
AWS
- Add support for Nvidia L40s GPU instances. #7181
- Add g6e EC2 instance type. #7177
- Only cache instance requirements when needed. #7383
Azure
- Certain scale from zero scenarios around labels will be more accurate. #7208
- ACTION REQUIRED: VMSS GPU Nodes are now identified by the kubernetes.azure.com/accelerator label instead of accelerator. #7235
- StrictCacheUpdates to disable proactive vmss cache updates. #7481
- Set node state to InstanceCreating to delete on CSE error. #7526
- Add flag to enable fast delete of failed VMSS. #7531
- Fix scaling of spot node pools. #7579
- Regenerate Azure static SKU list. #7614
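
The ACTION REQUIRED item in this section amounts to renaming a label key. A minimal sketch of that rename on a plain label map follows; the helper name `migrate_gpu_label` and the sample value `nvidia` are hypothetical, and which manifests or node pool settings actually carry the label depends on the cluster setup.

```python
OLD_LABEL = "accelerator"
NEW_LABEL = "kubernetes.azure.com/accelerator"


def migrate_gpu_label(labels):
    """Return a copy of a label map with the legacy GPU label key renamed.

    The value is kept unchanged. If the new key is already present, the
    map is returned as-is so an explicit new-style value is never
    overwritten.
    """
    migrated = dict(labels)
    if OLD_LABEL in migrated and NEW_LABEL not in migrated:
        migrated[NEW_LABEL] = migrated.pop(OLD_LABEL)
    return migrated


if __name__ == "__main__":
    # "nvidia" and "pool" are illustrative values, not taken from the release notes.
    print(migrate_gpu_label({"accelerator": "nvidia", "pool": "gpu"}))
```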
Exoscale
- Add support for the --nodes flag. #6771
grpc
- If the gRPC expander server returns nil for its best options, the gRPC expander client will return nil. #7351
Hetzner
- The HCLOUD_ENDPOINT environment variable is now supported to set a custom endpoint for Hetzner usage in cluster-autoscaler. #7285
- Fix Hetzner Provider not starting after 2024-09-07. #7211
- Consider label instance.hetzner.cloud/provided-by for scheduling. #7441
- Add support for specifying node pool placement groups when using HCLOUD_CLUSTER_CONFIG. #6999
OCI
- Added OCI support for node-group-auto-discovery parameter. #7403
Images
registry.k8s.io/autoscaling/cluster-autoscaler:v1.32.0
registry.k8s.io/autoscaling/cluster-autoscaler-arm64:v1.32.0
registry.k8s.io/autoscaling/cluster-autoscaler-amd64:v1.32.0
registry.k8s.io/autoscaling/cluster-autoscaler-s390x:v1.32.0