cloudposse/terraform-aws-eks-node-group 3.0.0 on GitHub

New Features, Breaking Changes

tl;dr Upgrading to this version will likely cause your node group to be replaced, but otherwise should not have much impact for most users.

The major new feature in this release is support for Amazon Linux 2023 (AL2023). EKS support for AL2023 is still evolving, and this module will evolve along with that. Some detailed configuration options (e.g. KubeletConfiguration JSON) are not yet supported, but the basic features are there.

The other big improvements are in immediately applying changes and in selecting AMIs, as explained below.

Along with that, we have dropped some outdated support and changed the eks_node_group_resources output, resulting in minor breaking changes that we expect do not affect many users.

Create Before Destroy is Now the Default

Previously, when changes forced the creation of a new node group, the default behavior for this module was to delete the existing node group and then create a replacement. This is the default for Terraform, motivated in part by the fact that the node group's name must be unique, so you cannot create the new node group with the same name as the old one while the old one still exists.

With version 2 of this module, we recommended setting create_before_destroy to true to enable this module to create a new node group (with a partially randomized name) before deleting the old one, allowing the new one to take over for the old one. For backward compatibility, and because changing this setting always results in creating a new node group, the default setting was set to false.

With this release, create_before_destroy defaults to true. If you leave it unset, any change that requires a new node group creates the new node group first and then deletes the existing one. If you have large node groups or small quotas, this can fail because both node groups must run at the same time.
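If running two node groups side by side is a problem, one option is to set the value explicitly. The following is a minimal sketch, not taken from the module's documentation: the module label, the name, the cluster name, the subnet IDs, and the sizes are all hypothetical placeholders.

```hcl
module "eks_node_group" {
  source  = "cloudposse/eks-node-group/aws"
  version = "3.0.0"

  # Hypothetical placeholder values, for illustration only
  name         = "example"
  cluster_name = "example-cluster"
  subnet_ids   = ["subnet-aaaa1111", "subnet-bbbb2222"]
  desired_size = 2
  min_size     = 1
  max_size     = 3

  # Keep the v2 default: delete the old node group first, then create the replacement
  create_before_destroy = false
}
```

If you left this input unset under version 2, setting it to false here should preserve the existing behavior and avoid a replacement caused by the change in default.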

Random name length now configurable

In order to support "create before destroy" behavior, this module uses the random_pet resource to generate a unique pet name for the node group, since the node group name must be unique, meaning the new node group must have a different name than not only the old one, but also all other node groups you have. Previously, the "random" pet name was 1 of 452 possible names, which may not be enough to avoid collisions when using a large number of node groups.

To address this, this release introduces a new variable, random_pet_length, that controls the number of pet names concatenated to form the random part of the name. The default remains 1, but now you can increase it if needed. Note that changing this value will always cause the node group name to change and therefore the node group to be replaced.
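As a rough illustration, assuming a module block like the hypothetical one below (other required inputs such as cluster_name and subnet_ids are omitted for brevity), lengthening the random part of the name could look like this:

```hcl
module "eks_node_group" {
  source  = "cloudposse/eks-node-group/aws"
  version = "3.0.0"

  # ... required inputs (cluster name, subnets, sizes) omitted for brevity ...

  create_before_destroy = true

  # Concatenate two pet names instead of one to reduce the chance of a collision.
  # Changing this value renames the node group and therefore replaces it.
  random_pet_length = 2
}
```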

Immediately Apply Launch Template Changes

This module always uses a launch template for the node group. If one is not supplied, it will be created.

In many cases, changes to the launch template are not immediately applied by EKS. Instead, they only apply to Nodes launched after the template is changed. Depending on other factors, this may mean weeks or months pass before the changes are actually applied.

This release introduces a new variable, immediately_apply_lt_changes, to address this. When set to true, any changes to the launch template will cause the node group to be replaced, ensuring that all the changes are made immediately. (Note: you may want to adjust the node_group_terraform_timeouts if you have big node groups.)

The default value for immediately_apply_lt_changes is whatever the value of create_before_destroy is.
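A sketch of how these two inputs might be set together follows. The timeout durations are arbitrary examples, and the object shape shown for node_group_terraform_timeouts (a list holding one object with create, update, and delete strings) is an assumption that should be checked against the module's variable definition. Other required inputs are omitted for brevity.

```hcl
module "eks_node_group" {
  source  = "cloudposse/eks-node-group/aws"
  version = "3.0.0"

  # ... required inputs (cluster name, subnets, sizes) omitted for brevity ...

  # Replace the node group whenever the launch template changes,
  # rather than waiting for nodes to be recycled naturally
  immediately_apply_lt_changes = true

  # Assumed shape; durations are illustrative, sized for a large node group
  node_group_terraform_timeouts = [{
    create = "60m"
    update = "60m"
    delete = "60m"
  }]
}
```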

Changes in AMI selection

Previously, if the created launch template needed to supply an AMI ID (which is only the case if you supplied kubelet or bootstrap options), unless you specified a specific AMI ID, this module picked the "newest" AMI that met the selection criteria, which in turn was based on the AMI Name. The problem with that was that the "newest" might not be the latest Kubernetes version. It might be an older version that was patched more recently, or simply finished building a little later than the latest version.

Now that AWS explicitly publishes the AMI ID corresponding to the latest (or, more accurately, "recommended") version of their AMIs via SSM Public Parameters, the module uses that instead. This is more reliable and should eliminate the version regression issues that occasionally happened before.

The `ami_release_version` input has been updated

The ami_release_version input is the value that you can supply to aws_eks_node_group to track a specific patch version of Kubernetes. The previous validation for this variable was incorrect and has been updated.

For Amazon Linux, it is the "Release version" from Amazon AMI Releases.
For Bottlerocket, it is the release tag from Bottlerocket Releases without the "v" prefix.
For Windows, it is the "AMI version" from the AWS docs.

Note that unlike AMI names, release versions never include the "v" prefix.

Examples of AMI release versions based on OS:

Amazon Linux 2 or 2023: 1.29.3-20240531
Bottlerocket: 1.18.0 or 1.18.0-7452c37e # note commit hash prefix is 8 characters, not GitHub's default 7
Windows: 1.29-2024.04.09
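For example, pinning an Amazon Linux 2023 node group to the release version listed above might look like the sketch below. It assumes that ami_release_version takes a list of strings (like the module's other optional inputs) and that "AL2023_x86_64_STANDARD" is the ami_type value for AL2023 on x86_64; both should be verified against the module's inputs. Other required inputs are omitted for brevity.

```hcl
module "eks_node_group" {
  source  = "cloudposse/eks-node-group/aws"
  version = "3.0.0"

  # ... required inputs (cluster name, subnets, sizes) omitted for brevity ...

  # Assumed ami_type value for Amazon Linux 2023 on x86_64
  ami_type = "AL2023_x86_64_STANDARD"

  # Release version without a "v" prefix; assumed to be a list of at most one string
  ami_release_version = ["1.29.3-20240531"]
}
```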

Customization via `userdata`

Unsupported `userdata` now throws an error

Node configuration via userdata is different for each OS. This module has 4 inputs related to Node configuration that end up using userdata:

before_cluster_joining_userdata
kubelet_additional_options
bootstrap_additional_options
after_cluster_joining_userdata

but they do not all work for all OSes, and none work for Bottlerocket. Previously, they were silently ignored in some cases. Now they throw an error when set for an unsupported OS.

Note that for all OSes, you can bypass all these inputs and supply your own fully-formed, base64 encoded userdata via userdata_override_base64, and this module will pass it along unmodified.

Multiple lines supported in `userdata` scripts

All the userdata inputs take lists, because they are optional inputs. Previously, lists were limited to single elements. Now the list can be any length, and the elements will be combined.
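A hedged sketch of multi-element userdata lists follows. It assumes an Amazon Linux 2 node group (ami_type "AL2_x86_64"), where these two inputs are expected to be supported, and the shell commands themselves are placeholders. Other required inputs are omitted for brevity.

```hcl
module "eks_node_group" {
  source  = "cloudposse/eks-node-group/aws"
  version = "3.0.0"

  # ... required inputs (cluster name, subnets, sizes) omitted for brevity ...

  ami_type = "AL2_x86_64"

  # Each list may now hold several elements, which the module combines
  before_cluster_joining_userdata = [
    "echo 'preparing node' >> /var/log/node-setup.log",
    "sysctl -w vm.max_map_count=262144",
  ]

  after_cluster_joining_userdata = [
    "echo 'node joined cluster' >> /var/log/node-setup.log",
  ]
}
```

Setting any of these inputs for an OS that does not support them now fails at plan time instead of being silently ignored.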

Kubernetes Version No Longer Inferred from AMI

Previously, if you specified an AMI ID, the Kubernetes version would be deduced from the AMI ID name. That is not sustainable as new OSes are launched, so the module no longer tries to do that. If you do not supply the Kubernetes version, the EKS cluster's Kubernetes version will be used.
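If you previously relied on the version being inferred from the AMI, you may want to state it explicitly. This sketch assumes that kubernetes_version takes a list of at most one string; the version value is only an example, and other required inputs are omitted for brevity.

```hcl
module "eks_node_group" {
  source  = "cloudposse/eks-node-group/aws"
  version = "3.0.0"

  # ... required inputs (cluster name, subnets, sizes) omitted for brevity ...

  # If omitted, the EKS cluster's Kubernetes version is used
  kubernetes_version = ["1.29"]
}
```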

Output `eks_node_group_resources` changed

The aws_eks_node_group.resources attribute is a "list of objects containing information about underlying resources." Previously, this was output via eks_node_group_resources as a list of lists, due to a quirk of Terraform. It is now output as a list of resources, in order to align with the other outputs.
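Code that indexed into the old list of lists will need adjusting. As a sketch, assuming the module is instantiated as module.eks_node_group (a hypothetical label) and that each resources object exposes autoscaling_groups with a name attribute, as the aws_eks_node_group resource documents, the Auto Scaling group names could be collected like this:

```hcl
output "node_group_asg_names" {
  description = "Names of the Auto Scaling groups backing the node group"

  # One level of nesting fewer than in v2: iterate the resources list directly
  value = flatten([
    for r in module.eks_node_group.eks_node_group_resources :
    [for asg in r.autoscaling_groups : asg.name]
  ])
}
```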

Special Support for Kubernetes Cluster Autoscaler removed

This module used to take some steps (mostly labeling) to try to help the Kubernetes Cluster Autoscaler. As the Cluster Autoscaler and EKS native support for it evolved, the steps taken became either redundant or ineffective, so they have been dropped.

cluster_autoscaler_enabled has been deprecated. If you set it, you will get a warning in the output, but otherwise it has no effect.

AWS Provider v5.8 or later now required

Previously, this module worked with AWS Provider v4, but no longer. Now v5.8 or later is required.
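If your root module pins the AWS provider, the constraint needs to allow v5.8 or later. A minimal sketch of such a constraint:

```hcl
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = ">= 5.8"
    }
  }
}
```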

Special Thanks

This PR builds on the work of @Darsh8790 (#178 and #180) and @QuentinBtd (#182 and #185). Thank you to both for your contributions.

🚀 Enhancements

Consolidate updates to test framework @Nuru (#177)

what

Update Go k8s client and api packages to v0.29.4
Update Go dependencies

why

Track update to Kubernetes cluster version in #173
Resolve security alerts

references

Closes #159
Closes #164
Closes #165
Closes #166
Closes #167
Closes #171
Closes #172

feat: migrates example on eks-cluster-aws-4.x @gberenice (#173)

what

Upgrade the example to use eks-cluster v4.x.x, where any dependencies on the Kubernetes provider were removed.

why

This eliminates the Terraform test error caused by the kubernetes provider issue. As a consequence, this unlocks merging the PRs. Example of the error:

Error: Received unexpected error:
FatalError{Underlying: error while running command: exit status 1; ╷
     │ Error: Value Conversion Error
     │ 
     │ with module.eks_cluster.provider["registry.terraform.io/hashicorp/kubernetes"],
     │ on .terraform/modules/eks_cluster/auth.tf line 96, in provider "kubernetes":
     │ 96: provider "kubernetes" {

references

Add support for AL2023 @Nuru (#186)

what

Add initial support for EKS under Amazon Linux 2023 (AL2023)
Improve AMI selection process
Deprecate Kubernetes Cluster Autoscaler support

why

Amazon Linux 2023 (AL2023) is the latest offering from Amazon
Previously, AMIs were selected by name and date, which occasionally led to undesirable results
The support was either redundant or ineffective

references

Documentation:

Issues and Other PRs:

Closes #155
Closes #174
Supersedes and closes #178
Supersedes and closes #180
Supersedes and closes #182
Closes #183
Supersedes and closes #185

🤖 Automatic Updates

Update release workflow to allow pull-requests: write @osterman (#184)

what

Update workflow (.github/workflows/release.yaml) to have permission to comment on PR

why

So we can support commenting on PRs with a link to the release

Update GitHub Workflows to use shared workflows from '.github' repo @osterman (#181)

what

Update workflows (.github/workflows) to use shared workflows from .github repo

why

Reduce nested levels of reusable workflows

Update GitHub Workflows to Fix ReviewDog TFLint Action @osterman (#176)

what

Update workflows (.github/workflows) to add issue: write permission needed by ReviewDog tflint action

why

The ReviewDog action will comment with line-level suggestions based on linting failures

Update GitHub workflows @osterman (#175)

what

Update workflows (.github/workflows/settings.yaml)

why

Support new readme generation workflow.
Generate banners

Use GitHub Action Workflows from `cloudposse/.github` Repo @osterman (#170)

what

Install latest GitHub Action Workflows

why

Use shared workflows from cloudposse/.github repository
Simplify management of workflows from centralized hub of configuration

Add GitHub Settings @osterman (#163)

what

Install a repository config (.github/settings.yaml)

why

Programmatically manage GitHub repo settings

Update Scaffolding @osterman (#160)

what

Reran make readme to rebuild README.md from README.yaml
Migrate to square badges
Add scaffolding for repo settings and Mergify

why

Upstream template changed in the .github repo
Work better with repository rulesets
Modernize look & feel
