Changelog

General

Cluster Autoscaler can now provision nodes before all pending pods are created and marked as unschedulable by the scheduler. This behavior is disabled by default and can be enabled with the --enable-proactive-scaleup flag. The --pod-injection-limit flag is introduced to allow fine-tuning this behavior. (#7145)
- This functionality can significantly speed up provisioning of nodes when hundreds or thousands of pods are created at the same time, and can lead to better scale-up decisions in those cases.
- Injecting too many pods can make CA unstable, depending on the number of NodeGroups and the scalability of the particular cloud provider integration. --pod-injection-limit can help control this.
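As a rough illustration of how --pod-injection-limit can bound proactive scale-up, the sketch below caps the number of placeholder pods injected for a single controller. It assumes the limit applies to the total of already-pending plus injected pods. The function and parameter names are hypothetical, and this is not the Cluster Autoscaler implementation.

```python
def pods_to_inject(desired_replicas, created_pods, pending_pods, injection_limit):
    """Return how many placeholder pods to inject for one controller.

    Assumption: injection_limit caps pending_pods + injected pods in total.
    """
    # Pods the controller still has to create.
    missing = max(desired_replicas - created_pods, 0)
    # Room left under the limit once already-pending pods are counted.
    headroom = max(injection_limit - pending_pods, 0)
    return min(missing, headroom)
```

With a limit of 5000 and 4900 pods already pending, a controller that still has 800 pods to create would get only 100 placeholder pods in this sketch.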
Added support for ProvisioningRequest v1 API. (#7195)
Users can now use the in-cluster Kubernetes configuration when self-hosting cluster-autoscaler as a pod within their cluster. (#7156)
Faster handling of failed scale ups, useful especially with multiple quota or stockout errors across the cluster. (#7087)
Bin packing will be cut short after exceeding "maxBinpackingDuration". The "maxBinpackingDuration" is set using a new flag "--max-binpacking-time". This can prevent rare cases where CA becomes unresponsive in scenarios with a very large number of pending pods. (#6556)
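To make the idea of a time-bounded bin packing pass concrete, here is a minimal sketch of a first-fit loop that stops once a duration budget is exceeded. It is a simplified stand-in, not CA's estimator: pods and nodes are reduced to a single numeric size, and the function name and the injectable clock parameter are assumptions made for illustration.

```python
import time


def first_fit(pod_sizes, node_capacity, max_duration, clock=time.monotonic):
    """Pack pods onto nodes first-fit, stopping once max_duration seconds elapse.

    Returns (node_count, unprocessed_pods).
    """
    start = clock()
    free = []  # free capacity left on each node opened so far
    for index, size in enumerate(pod_sizes):
        # Cut the pass short when the time budget is exhausted.
        if clock() - start > max_duration:
            return len(free), pod_sizes[index:]
        if size > node_capacity:
            continue  # this pod cannot fit on any node of this shape
        for i, remaining in enumerate(free):
            if size <= remaining:
                free[i] = remaining - size
                break
        else:
            # No existing node had room, so open a new one.
            free.append(node_capacity - size)
    return len(free), []
```

The caller receives the pods that were not processed, so a later pass can pick them up instead of the loop blocking on a very large pending set.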
Added a new least-nodes expander (#6792)
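A least-nodes strategy can be pictured as choosing, among the candidate scale-up options, the ones that need the fewest new nodes. The sketch below shows that selection rule only. The Option type and its fields are hypothetical and do not mirror the expander interface in the Cluster Autoscaler code base.

```python
from dataclasses import dataclass


@dataclass
class Option:
    node_group: str  # hypothetical node group name
    node_count: int  # nodes this option would add


def least_nodes(options):
    """Return the options that require the fewest new nodes (ties are kept)."""
    if not options:
        return []
    fewest = min(option.node_count for option in options)
    return [option for option in options if option.node_count == fewest]
```

Keeping ties lets a follow-up strategy break them, which is one plausible design. Whether the real expander does this is not stated in these notes.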

AWS

Fix an issue in the Kubernetes Cluster Autoscaler where actual AWS instances could be incorrectly scaled down instead of placeholders. (#6911)
Fix an issue with reading taints on Managed Node Groups scaled to zero, which could cause scale-up of nodes with taints that pending pods don't tolerate (#6482)

Azure

ACTION REQUIRED: VMSS GPU Nodes must now also include the kubernetes.azure.com/accelerator label in addition to accelerator. (#7203)
From now on, users should refer to https://cloud-provider-azure.sigs.k8s.io/install/configs/ for configuration interface (#6947)
Fixed an issue where environment variables were not being passed in when config file exists (#6947)
Fixed an issue where some cloud provider configurations were not being validated when UseManagedIdentityExtension is set to true (#6947)
Renamed several fields in the config file; the old names are still accepted and take precedence: useWorkloadIdentityExtension to useFederatedWorkloadIdentityExtension, vmssCacheTTL to vmssCacheTTLInSeconds, vmssVmsCacheTTL to vmssVirtualMachinesCacheTTLInSeconds, enableVmssFlex to enableVmssFlexNodes (#6947)
Renamed several environment variables; the old names are still accepted and take precedence: ARM_USE_MANAGED_IDENTITY_EXTENSION to ARM_USE_FEDERATED_WORKLOAD_IDENTITY_EXTENSION, AZURE_VMSS_CACHE_TTL to AZURE_VMSS_CACHE_TTL_IN_SECONDS, AZURE_VMSS_VMS_CACHE_TTL to AZURE_VMSS_VMS_CACHE_TTL_IN_SECONDS, AZURE_ENABLE_VMSS_FLEX to AZURE_ENABLE_VMSS_FLEX_NODES (#6947)
Fix some cases where the instance cache is outdated but not refreshed (#7116)
Support cloud provider AAD certificate authentication (#7003)
The getVMSS API is now called when using spot instances, to obtain more up-to-date information (#6470)
The AZURE_CLUSTER_AUTOSCALER_USER_AGENT_SUFFIX variable can be used to customize the user agent for the Azure provider of cluster-autoscaler. Setting this to -my-user-agent results in a user agent like Go/go1.22.5 (amd64-linux) go-autorest/v14.2.1 cluster-autoscaler-my-user-agent/v1.31.0-alpha.2. (#7033)
You can now optionally specify a default min and max size for Azure VMSSs through the auto discovery tags. Explicit min and max tags on VMSSs will still be given priority over the default. (#6863).
Skips Azure-specific node labels that might mistakenly categorize nodegroups as different when, in reality, they are similar. (#6634)
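The rename entries above state that the old config and environment variable names remain accepted and take precedence over the new ones. A minimal sketch of that lookup order, using environment variable names from these notes, is shown below. The helper name resolve_setting is hypothetical, and the real provider's parsing logic may differ.

```python
def resolve_setting(env, old_name, new_name, default=None):
    """Look up a renamed setting, preferring the old name when both are set."""
    if old_name in env:
        return env[old_name]
    return env.get(new_name, default)


env = {
    "AZURE_VMSS_CACHE_TTL": "30",             # old name
    "AZURE_VMSS_CACHE_TTL_IN_SECONDS": "60",  # new name
}
ttl = resolve_setting(env, "AZURE_VMSS_CACHE_TTL", "AZURE_VMSS_CACHE_TTL_IN_SECONDS")
```

In this sketch ttl ends up as the value set under the old name, so a deployment that sets both names should expect the old one to win until it is removed.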

Cluster API

Added configurable autoscaling options to clusterapi provider allowing users to configure e.g. --scale-down-unneeded-time on a per node group level. (#6743)

GCE

The GCE cloud provider will use the Instance.List API to list MIG instances. The IGM.ListManagedInstances API will be used as a fallback mechanism and for listing instances of MIGs that have instances in creating or deleting states. This should improve performance in clusters with a large number of NodeGroups. (#6955)
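The selection between the two listing calls can be sketched roughly as follows. This only illustrates the rule described above. The MIG dictionary layout, the callables standing in for the two APIs, and the use of RuntimeError as the failure signal are all assumptions, not the provider's actual code.

```python
def list_mig_instances(mig, list_instances, list_managed_instances):
    """Pick a listing call for one MIG.

    mig: dict with hypothetical keys "name", "creating", "deleting".
    list_instances / list_managed_instances: callables standing in for
    Instance.List and IGM.ListManagedInstances.
    """
    # MIGs with instances being created or deleted use ListManagedInstances.
    if mig["creating"] > 0 or mig["deleting"] > 0:
        return list_managed_instances(mig["name"])
    try:
        return list_instances(mig["name"])
    except RuntimeError:
        # Fall back when the primary listing call fails.
        return list_managed_instances(mig["name"])
```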

Hetzner

Fixed exhausted node groups not backing off for Hetzner Provider (#6750)

Images

registry.k8s.io/autoscaling/cluster-autoscaler:v1.31.0
registry.k8s.io/autoscaling/cluster-autoscaler-arm64:v1.31.0
registry.k8s.io/autoscaling/cluster-autoscaler-amd64:v1.31.0
registry.k8s.io/autoscaling/cluster-autoscaler-s390x:v1.31.0

Full Changelog: cluster-autoscaler-1.30.0...cluster-autoscaler-1.31.0

kubernetes/autoscaler cluster-autoscaler-1.31.0 Cluster Autoscaler 1.31.0 on GitHub
