Announcements
We are looking for maintainers, reach out in #5432.
Deprecation / Removal
- [metrics server] Remove addon-resizer from image list (no longer in use) (#8566, @cyril-corbon)
- Add kubeadm option to `etcd_deployment_type` to replace the `etcd_kubeadm_enabled` variable (#8317, @necatican) (See Notes 3)
- Removes runc-arm64-1.0.3 hash value for non-existent binaries (#8391, @Payback159)
- Drop containerd 1.4 support (#8780, @oomichi)
Feature / Major changes
- Add hashes for Kubernetes 1.24.0, 1.24.1, 1.21.12, v1.21.13, 1.22.8, 1.22.9, v1.22.10, 1.21.11, 1.23.5, 1.23.6, v1.23.7 and make kubernetes v1.23.7 default (#8628, #8746, #8783, #8876, #8760, @mzaian, @cristicalin)
- Add youki runtime support to CRI-O (#8411, @electrocucaracha)
- [etcd] add 0 hash for arm v3.5.2 to prevent deployment failures (#8651, @cristicalin)
- [etcd] ensure etcd is properly upgraded when managed by kubeadm (#8722, @cristicalin)
- [etcd] Add etcd_max_request_bytes option to set the request size limit of etcd (#8849, @necatican)
- [etcd] add v3.5.1 for kubernetes 1.22+ (#8588, @mzaian)
- [etcd] Added node label to etcd metrics (#8475, @fungusakafungus)
- [Cilium] Update Cilium manifests and the default version to v1.11.3 (#8717, @necatican)
- [Cilium] Add identity_allocation_mode support (#8430, @necatican)
- [Cilium] Change Cilium setting identity_allocation_mode to cilium_identity_allocation_mode (#8519, @tomberget) (see Notes 1)
- [cilium] Add the cilium ip-masq-agent configuration support (#8893, @mahjonp)
- [docker] add support for cri-dockerd as a replacement for dockershim (#8623, @cristicalin)
- Add dual-stack support to the kubelet --node-ip parameter; it takes effect when the `ip6` option is set in host vars (#8542, @kakkotetsu)
- Add ppc64le support (#8505, @mgiessing)
- Add runc v1.1.0 hash values to support multi-arch installation. (arm64, amd64) (#8447, @Payback159)
- Add support for `EventRateLimit` plugin configuration (#8711, @alegrey91)
- Add support for including annotations on aws-ebs-csi-controller (#8779, @dlouks)
- Add support for kube-vip (#8669, @sathieu)
- Add support for service-account-lookup parameter (using `kube_apiserver_service_account_lookup`) (#8781, @alegrey91)
- [ansible] add support for ansible 5 (ansible-core 2.12) (#8512, @cristicalin)
- [ansible] make ansible 5.x the new default version (#8660, @cristicalin)
- [ansible] update ansible and cryptography requirements (#8826, @cristicalin)
- [cert-manager] Update cert-manager to 1.6.1 (#8377, @electrocucaracha)
- [cert-manager] Update cert-manager to v1.7.2 (#8648, @rtsp)
- [cert-manager] Upgrade to v1.8.0 (#8688, @rtsp)
- Add Ubuntu 22.04 support (#8841, #8795, #8754, @u2216, @arno01, @oomichi)
- Add evictionHard parameter to kubelet config (variables: `eviction_hard` / `eviction_hard_control_plane`) (#8421, @cyril-corbon)
- Add hcloud as external cloud provider (#8440, @oujonny)
- Add kube_router_cluster_asn option to set ASN number of the cluster (#8837, @rosskusler)
- Add option to use UpCloud's preconfigured server plans, firewalls and managed load balancers (upgrade to 2.4.0 from 2.0.0) (#8758, @Ajarmar)
- Add possibility to remove ippools from cni config (#8845, @tomcsi)
- Add the ability to set tolerations (`cert_manager_tolerations`), nodeselector (`cert_manager_nodeselector`) and affinity (`cert_manager_affinity`) in cert-manager templates (#8389, @cyril-corbon)
- Add the possibility to use UpCloud csi-driver
  Add the possibility to use ansible_host as api ip for localhost kubeconfig (#8653, @robinAwallace)
- Add Hardening setup guide (#8868, @alegrey91)
- Add variables to manage kubelet parameters (`kubelet_streaming_connection_idle_timeout` / `kubelet_make_iptables_util_chains`) (#8796, @alegrey91)
- Added the optional prompt or delay before uncordoning nodes after upgrades (see variable `upgrade_node_post_upgrade_confirm`). (#8530, @mac-chaffee)
- Allow installation of a cluster using external CAs (kubernetes-ca, etcd-ca, kubernetes-front-proxy-ca) (#8620, @julienlefur)
- Allow the customization of snapshot controller namespace using `snapshot_controller_namespace` (#8305, @liupeng0518)
- Allow to change cert-manager leader election namespace for GKE Autopilot support (#8424, @rtsp)
- Allow to choose image pull commands based on container manager or override them (#8380, @sathieu)
- Allow to specify CA data for webhooks (using `kube_webhook_token_auth_url_skip_tls_verify` / `kube_webhook_token_auth`) (#8777, @dlouks)
- Assert that IP range is enough for the nodes (#8720, @eakyildirim)
- Bastion support now works for remove-node.yml (#8504, @roedie)
- Bump upcloud csi-driver to v0.2.1 (#8784, @robinAwallace)
- Change default kube_encryption_algorithm to "secretbox" (#8574, @Payback159) (See Notes 2)
- Explicit `container_manager` variable for Etcd hosts (#8521, @vi7)
- Improve first_kube_control_plane variable management to avoid installation failures due to variable overlapping (#8388, @unai-ttxu)
- Improve offline script `generate_list.sh` using ansible (#8538, @tmurakam)
- [ingress-nginx] upgrade to 1.2.1
- Ingress controllers and external provisioners (respectively deployed via ingress_controller and external_provisioner roles meta dependencies) are now upgraded in upgrade-cluster.yml (#8640, @mirwan)
- Local volume provisioner tolerations removed by default. (#8805, @spaced)
- Replace CLB with NLB for kube-apiserver domain in Terraform AWS contrib code (#8578, @sophalHong)
- Split kube_feature_gates variable for different kubernetes components (#8677, @alegrey91) (See Notes 4)
- Helm-apps role for installing helm charts (#8347, @VannTen)
- Upgrade azuredisk csi to v1.10.0 (#8432, @cyril-corbon)
- Upgrade metrics-server to v0.5.2 and remove NET_BIND_SERVICE capabilities (#8338, @cyril-corbon) (See Notes 5)
- Vagrant: new var $ansible_verbosity was introduced for setting the ansible verbosity level (#8639, @maciejaszek)
- [CI] Move from CentOS 8 to AlmaLinux 8 for kubespray CI, therefore CentOS 8 is no longer tested (#8297, @cristicalin)
- [CI] split molecule tests to run in parallel (#8756, @cristicalin)
- [container image] use focal (ubuntu 20.04) base image for our docker builds (#8631, @cristicalin)
- [coredns] Allow overriding the default CoreDNS zone's `cache` plugin configuration via the `coredns_default_zone_cache_block` variable (#8488, @Tristan971)
- [csi-snapshotter] upgraded to 5.0.0 (#8403, @cristicalin)
- [download] add capability to specify alternative download mirrors for files (#8474, @cristicalin)
- [mitogen] update to 0.3.2 (#8470, @cristicalin)
- [reset] remove containerd storage during reset (#8469, @cristicalin)
- [sysctl] set fs.may_detach_mounts=1 to address pods stuck in Terminating state (#8635, @cristicalin)
Network
- [Calico] upgrade calico to 3.19.4, 3.20.4 and 3.21.4 (default) and add 3.22.0 experimental support (#8544, @cristicalin)
- [Calico] add 3.22.1 (#8612, @cristicalin)
- [Calico] Add calico apiserver (using `calico_apiserver_enabled` variable) (#8690, @liupeng0518)
- [Calico] Add support for IP6_AUTODETECTION_METHOD using new variable `calico_ip6_auto_method` (#8541, @kakkotetsu)
- [Calico] upgrade default calico version to v3.22.3 (#8897, @germetist)
- [Calico] Add configurable ipam strictaffinity (using `calico_ipam_strictaffinity` param) (#8581, @eyenx)
- [Calico] Change the calico cni name from cni0 to k8s-pod-network by default (#8813, @cyclinder)
- [Calico] Fix Wireguard support for CentOS Stream 9/RHEL 9 Beta (#8625, @ThisIsQasim)
- [Calico] fix calico-kube-controllers verbs (#8847, @irizzant)
- [calico] Some commands only need to be run once (#8833, @liupeng0518)
- [calico] call calico checks early on to prevent altering the cluster with bad settings and causing traffic outages (#8707, @cristicalin)
- [calico] make calico 3.21.x the new default and drop 3.18.x (#8426, @cristicalin)
- [calico] switch default iptables backend detection to Auto (#8429, @cristicalin)
- [calico] Use vxlan instead of ipip as the default calico encapsulation mode. Existing deployments that don't explicitly set the encapsulation mode are affected and need to set calico_ipip_mode: Always and calico_network_backend: bird to avoid breaking the upgrade process. (#8434, @cristicalin)
- [calico] upgrade default calico version to v3.21.5 (#8745, @mzaian)
- [calico] Use ipamconfig instead of calico ipam command (#8839, @liupeng0518)
- [calico] don't clobber calico options set by the user (#8815, @cristicalin)
- [flannel] Use install-cni-plugin to fit upstream (#8714, @zhengtianbao)
- [kube-ovn] Sync some features with upstream (#8790, @liupeng0518)
- [kube-ovn] The network plug-in kube-ovn does not require a cluster to allocate podcidr (#8454, @chenhuazhong)
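
For deployments affected by the vxlan default change listed above, a minimal sketch of the override might look like the following. The two variables and their values come from the changelog entry itself; placing them in the cluster's inventory group variables is an assumption, so adapt it to wherever your overrides live.

```yaml
# Sketch only: keep the previous ipip encapsulation across the upgrade.
# Assumed location: the inventory's group variables for the cluster.
calico_ipip_mode: Always
calico_network_backend: bird
```

Deployments that already set the encapsulation mode explicitly should not need this override.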
Applications
- Instance customization via cloud init for openstack VMs deployed by terraform is now available. (#8394, @moss2k13) (See Notes 6)
- [MetalLB] Configure PriorityClassName for deployment (#8362, @unai-ttxu)
- [MetalLB] Improve validation conditions for BGP Peers (#8568, @kakkotetsu)
- [MetalLB] Upgrade metallb to v0.11.0 and add liveness and readiness probe (#8420, @cyril-corbon)
- [MetalLB] Allow to put node selectors and source address for each metallb peer (#8534, @hightoxicity)
- [MetalLB] Added MetalLB BGP peer password authentication option. (#8792, @Oogy)
- [MetalLB] Add images to downloads (#8715, @sathieu)
- [MetalLB] Fix wrong port name in metallb.yml.j2 (metrics not monitoring) (#8510, @binkoni)
- [OpenStack] Allow disabling port security in terraform contrib code (#8410, @cristicalin)
- [OpenStack] Updated openstack cloud controller to version `v1.22.0` (#8629, @Xartos)
- [OpenStack] Create master nodes with `for_each` for openstack. Makes it easier to switch out master nodes via terraform. (#8709, @robinAwallace)
- [OpenStack] Fixed cluster roles for openstack cloud controller (#8638, @Xartos)
- [OpenStack] Fix templating of ansible_ssh_common_args in no_floating.yml if used as TF module (#8646, @frittentheke)
- [OpenStack] allow disabling port security at port level (#8455, @cristicalin)
- [vSphere] Terraform code will need `var.vapp` when a vapp is referenced (`vsphere_hostname` is also removed) (#8441, @ceesios)
- [vsphere_csi] update to 2.5.1 and make external_vsphere_version 7.0u1 the default (#8676, @cristicalin)
- Terraform/gcp: Allow to change extra disk types (#8524, @sathieu)
- Terraform/gcp: Allow to use preemptible VM instances (using two new variables `master_preemptible` and `worker_preemptible`) (#8480, @sathieu)
- Terraform/gcp: Do not create unused subnetworks
  Terraform/gcp: Upgrade to latest google provider (#8497, @sathieu)
- [Terraform AWS] Add tag to AWS VPC subnets for automatic subnet discovery (#8705, @sophalHong)
- [terraform] use modern day equinix metal provider (#8748, @cristicalin)
Container-Managers
- Check & uninstall container engine if needed (when changing container engine defined) (#8439, @cyril-corbon)
- [Docker] Add epoch to docker-ce and docker-ce-cli packages to ensure docker upgrade (on rhel based) (#8618, @unai-ttxu)
- [containerd] make containerd_insecure_registries into a dict similar to containerd_registries (#8340, @mircyb) (See Notes 7)
- [containerd] upgrade versions to fix CVE-2022-23648 (#8597, @cristicalin)
- [containerd] Upgrade containerd to 1.6.0 and re-enable arm architecture with default options
  [runc] make 1.1.0 the default
  [nerdctl] upgrade to 0.17.0 (#8555, @cristicalin)
- [containerd] add hashes for 1.5.11 and 1.6.2 and make 1.6.2 the default (#8671, @cristicalin)
- [containerd] Update containerd to 1.5.9 (#8402, @electrocucaracha)
- [containerd] Fix containerd image download bug (#8894, @liupeng0518)
- [containerd] nerdctl insecure registry support (#8339, @mircyb)
- [containerd] Ensure containerd service unmasking (#8726, @rickerc)
- [cri-o] Update configuration of registries in cri-o (#7852, @bsloeserwij) (See Notes 8)
- [cri-o] add cri-o 1.23.x (#8599, @cristicalin)
- [crun] update to 1.4 and drop pre-1.x versions (#8330, @cristicalin)
- [crun] upgrade to 1.4.3 (#8598, @cristicalin)
Bug or Regression
- Add ETCD_EXPERIMENTAL_INITIAL_CORRUPT_CHECK flag to etcd config (default to true) (#8664, @floryut)
- Add `with_networks` variable to `external_hcloud_cloud` in ansible playbook and `network_zone` variable to Hetzner Cloud Terraform. (#8702, @Anthony-Bible)
- Allow replacement of address prefixes for all images (#8764, @ErikJiang)
- CRI-O: fix unqualified-search registries (#8496, @krystianmlynek)
- Change libvirt default disk controller from IDE to SCSI (#8656, @190ikp)
- Do not remove package in validate container engine role when FCOS (#8626, @LuckySB)
- Enable Kubespray deployment on vagrant (#8697, @oomichi)
- Enable several read-only tasks in check mode (#8584, @tjanson)
- Ensure all Kubelet required kernel values are configured when enabling protectKernelDefaults (#8692, @unai-ttxu)
- Ensure taint configuration for secondary control-plane acting both as control-plane and node (#8363, @unai-ttxu)
- Fix "Error: error parsing jsonpath {, unclosed action" (#8683, @emiran-orange)
- Fix DNS configuration when using resolvconf_mode='host_resolvconf' during scale (#8361, @unai-ttxu)
- Fix GCP PVC creation on k8s v1.22 (#8616, @lmercl)
- Fix `0090-etchosts` file when setting `override_system_hostname=false` (#7634, @liupeng0518)
- Fix: the `kube-dns` service is no longer deleted if it was not created by kubespray (#8565, @cyril-corbon)
- Fix an issue with an extra space in the kube-vip manifest. (#8831, @yankay)
- Fix an issue where users could not skip Red Hat registration by specifying -e rhel_enable_repos=False (#8871, @gleb108)
- Fix an issue where offline script could not output URLs of both containerd and krew. (#8379, @oomichi)
- Fix condition on kata_containers_version/kube_version check when kata_containers_enabled is false (#8804, @emiran-orange)
- Fix container engine still installed on dedicated etcd node even if `etcd_deployment_type: host` (#8386, @rtsp)
- Fix cri-o packages install for Rocky 8 (#8594, @brankomijuskovic)
- Fix etcd certificates reference to support `etcd_kubeadm_enabled: true` (#7766, @forselli-stratio)
- Fix imageRepository path for CoreDNS (ensure coredns repository namespace is kept) (#8572, @nicolas-goudry)
- Fix incorrect condition type (#8822, @cyclinder)
- Fix incorrect leader election namespace with cert-manager leading to insufficient permission (#8433, @rtsp)
- Fix an issue when PodSecurityPolicy is enabled: static pods are now mirrored earlier by kubelet. This was a problem when installing HA etcd via kubeadm. (#8744, @robinAwallace)
- Fix kubectl call before installing it when setting `first_kube_control_plane` / `joined_control_planes` (#8412, @floryut)
- Fix kubelet_kubelet_cgroups_cgroupfs pointing incorrectly to slice (#8500, @fungusakafungus)
- Fix print_hostnames of inventory.py (#8554, @oomichi)
- Fix remove-node.yaml playbook failing when host is unreachable (#8843, @oomichi)
- Fix failure when removing `docker-ce.repo` (#8856, @Thearas)
- Fix the condition of drain on pre-remove task (#8634, @oomichi)
- Fix typo and duplicated declaration of ingressclasses (#8591, @spaced)
- Fix vagrant default value for parameters `local_path_provisioner_enabled` / `multi_networking` (#8650, @liupeng0518)
- Fix wrong item in mitogen contrib (#8508, @kdszoom)
- Fixed a bug where hosts with NetworkManager enabled were having their /etc/resolv.conf file edited directly instead of through NM.
  Fixed a bug where DNS lookup failures would cause reset.yml or scale.yml to error out when resolvconf_mode=host_resolvconf (#8575, @mac-chaffee)
- Fixed a bug where updated versions of etcd weren't being applied. Check your etcd instances to make sure their versions are what you expect. If not, restarting all etcd members should apply the update. (#8556, @mac-chaffee)
- Fixed a bug where upgrade-cluster.yaml would not apply updates to etcd-events (#8550, @mac-chaffee)
- Fixes missing checksum for kata-containers 2.2.3 on arm architectures (#8383, @Payback159)
- Fixes the etcd node removal by pointing ETCDCTL_ENDPOINTS to localhost (127.0.0.1) (#8526, @roedie)
- Prevent removing etcd member when running in check mode (#8570, @fungusakafungus)
- Removal flow: wait until volumes are detached from the node (#8739, @rocko-n)
- Removed quotation at nerdctl_extra_flags (#8695, @T-Eberle)
- Run 0100-dhclient-hooks only if dhcpclient is enabled (#8658, @oomichi)
- Update verbs for volumeattachments resource (#8731, @moule3053)
- Use correct service name for coredns when cleanup (#8811, @weizhoublue)
- [Terraform-AWS] Fix error when creating more subnets than AZs (#8516, @sophalHong)
- [cert-manager] Fix missing RBAC rules for ClusterRole cert-manager-cainjector (#8444, @onock)
- [containerd] avoid cleanup of /usr/bin on ostree distributions (#8624, @cristicalin)
- [reset] fix task inclusion logic for network plugin (#8727, @cristicalin)
- [systemd-resolved] Fix DNS early and late stages (`dns_early` | `dns_late`) of cluster deployment
  [systemd-resolved] Add `upstream_dns_servers` to `FallbackDNS`
  [cluster-reset] Revert DNS configuration to early stage (for instance: only defined upstream nameservers) (#8561, @onock)
- Incompatible ipset protocol version (7), included in kube-proxy since k8s 1.23, is causing issues with the Fedora kernel. (#8397, @floryut) (See Notes 9)
Other (Cleanup or Flake)
- Add IPv6 listen directive to nginx if enable_dual_stack_networks (#8596, @kakkotetsu)
- Cleanup crictl configuration file during reset (#8569, @jayonlau)
- Remove `check_mode: no` from gen_certs_script.yml to prevent changing files (#8573, @fungusakafungus)
Component versions:
- Kubernetes v1.23.7
- Etcd v3.5.3
- Docker v20.10
- Containerd v1.6.4
- CRI-O v1.23
- CNI-plugins v1.1.1
- Calico v3.22.3
- Cilium v1.11.3
- Flannel v0.17.0
- Kube-ovn v1.9.2
- Kube-Router v1.4.0
- Multus v3.8
- Weave v2.8.1
- Ceph-provisioner v2.1.0-k8s1.11
- Cert-manager v1.8.0
- CoreDNS v1.8.6
- Nginx-ingress v1.2.1
Known issues
n/a
Note
1. If the Cilium setting `identity_allocation_mode` has been overridden locally, it needs to be changed to `cilium_identity_allocation_mode`.
2. If you are already encrypting secrets at rest and have not set the `kube_encryption_algorithm` flag, then you must set `kube_encryption_algorithm` to `aescbc`, since the default value has changed to the more secure `secretbox` standard.
3. `etcd_kubeadm_enabled` is deprecated. You can set `etcd_deployment_type` to `kubeadm` to get the same behaviour.
4. This adds some variables that can be defined by the user. It doesn't introduce a breaking change because the previous single variable (kube_feature_gates) still works as expected. The new variables are: kube_apiserver_feature_gates/kube_controller_feature_gates/kube_scheduler_feature_gates/kube_proxy_feature_gates/kubelet_feature_gates
5. Be careful as the metrics-server port has now moved to 4443
6. To use it, adjust `contrib/terraform/openstack/modules/compute/templates/cloudinit.yaml` before deployment. (Currently it uses one cloud init for all instances.)
7. `containerd_insecure_registries` needs to be updated or won't work anymore!
8. Update the configuration of cri-o registries to only use the crio_registries key
9. Calico and K8S 1.23 might break under some OS, please see additional details in PR
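
The variable changes described in the first few notes above could be sketched as a set of inventory overrides like the one below. This is illustrative rather than a recommended configuration: the variable names come from the notes, but the Cilium value, the feature gate name, the list format of the per-component gates, and the idea of keeping everything in one group variables file are assumptions.

```yaml
# Sketch only: example overrides reflecting the notes above.
# Assumed location: the inventory's group variables for the cluster.

# Renamed from identity_allocation_mode; "crd" is only an example value.
cilium_identity_allocation_mode: crd

# Only relevant if secrets were already encrypted at rest with the old default.
kube_encryption_algorithm: aescbc

# Replaces the deprecated etcd_kubeadm_enabled: true
etcd_deployment_type: kubeadm

# Per-component feature gates; assumed to take the same list format as
# kube_feature_gates. "ExampleGate" is a placeholder, not a real gate.
kube_apiserver_feature_gates:
  - "ExampleGate=true"
```

Set only the variables that apply to your cluster; the existing kube_feature_gates variable continues to work if per-component gates are not needed.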