Features
- Add aggregate cluster role for admin permissions (#2226) #2226 (Matt Conway)
- Add `maxPods` support to the kubelet configuration (#2261) #2261 (Jonathan Innis)
- Update Terraform getting started guide to default to multi-cluster tagging scheme (#1668) #1668 (Bryant Biggs)
- Make emptiness controller aware of scheduling (#2105) #2105 (Todd Neal)
- Do not drain when pods have no owner refs and add drain error events (#2092) #2092 (Brandon Wagner)
- Support all bottlerocket configuration (#2081) #2081 (Brandon Wagner)
- Add node termination metrics (#2139) #2139 (Jason Deal)
- Adding support for custom AMIs (#2014) #2014 (Suket Sharma)
- In the absence of instance-type/family filtering, provide a default (#2153) #2153 (Todd Neal)
- Add pod startup summary metric (#2146) #2146 (Jason Deal)
- Adding Custom AMIFamily and validations (#2154) #2154 (Suket Sharma)
- Add new grafana dashboards (#2185) #2185 (Jason Deal)
- Implemented support for GT and LT requirement operators (#2051) #2051 (Ellis Tarn)
- Override `systemReserved` when scheduling and in kubelet parameter startup (#2254) #2254 (Jonathan Innis)
- add revisionHistoryLimit option to chart (#2102) (Ido Koren) #2102
- add instance type category, generation, and local nvme storage labels (#2165) (Brandon Wagner) #2165
- Add capability to use custom podDisruptionBudget name (#2174) (Shrikant Lavhate) #2174
- Warn on not enough instance types when flexible to spot and on-demand and requesting on-demand (#2274) (Brandon Wagner) #2274
Bug Fixes
- Consider unready nodes as in flight (#2224) #2224 (Ellis Tarn)
- get node state before pending pods (#2242) #2242 (Todd Neal)
- Separate zone caching by subnet selectors (#2229) #2229 (Asim Shankar)
- check instance state on termination failure (#2253) #2253 (Jim DeWaard)
- spot error type comparison (#2260) #2260 (Brandon Wagner)
- use the correct number of pods when bin-packing if AWS_ENI_LIMITED_POD_DENSITY is false (#2019) #2019 (Todd Neal)
- restrict iam:PassRole to just the node role we create (#2008) #2008 (Todd Neal)
- add initial delay to liveness and let CP know it's run by the webhook (#2032) #2032 (Todd Neal)
- correctly handle static volumes (#2033) #2033 (Todd Neal)
- eliminate timezone from pricing date (#2049) #2049 (Todd Neal)
- hostNetwork variable is not aligned with docs in Helm chart (#2043) (#2044) #2044 (flf2ko)
- update initial pricing async (#2063) #2063 (Todd Neal)
- Prevent use of launch templates and UserData together (#2076) #2076 (Suket Sharma)
- helm version causing errors (#2096) #2096 (Vishal Vazkar)
- de-prioritize c1 (#2103) #2103 (Todd Neal)
- reduce created event count (#2113) #2113 (Todd Neal)
- de-prioritize bare metal instance types (#2119) #2119 (Todd Neal)
- revert make apply to use -B rather than --bare (#2143) #2143 (Jason Deal)
- batch create fleet calls (#2130) #2130 (Todd Neal)
- duplicate termination event metrics (#2152) #2152 (Jason Deal)
- termination test errors (#2157) #2157 (Jason Deal)
- consider pod density in memory overhead calc (#1960) #1960 (Heiko Rothe)
- record provisioner requirements as valid topology domains (#2173) #2173 (Todd Neal)
- add in a grace period when checking if pods are stuck terminating (#2175) #2175 (Ellis Tarn)
- ensure cluster state is synced prior to scheduling (#2182) #2182 (Todd Neal)
- ignore pods that have an invalid storage class (#2192) #2192 (Todd Neal)
- eliminate controller-runtime debug log per event (#2206) #2206 (Todd Neal)
- Do not log evicted pods, since they are emitted as events (#2209) #2209 (Ellis Tarn)
- Removed spurious error logging when scaling down nodes by deleting a provisioner (#2196) #2196 (Ellis Tarn)
- Emit evictions as events (#2210) #2210 (Ellis Tarn)
- termination latency dashboard unit set to seconds (#2218) #2218 (Jason Deal)
- fix release-gen output (#2277) (Todd Neal) #2277
- fix aws pricing sdk change again (#2167) (Brandon Wagner) #2167
- Drain terminated pods (#2255) (Ellis Tarn) #2255
- fix LTs under management (#2108) (Brandon Wagner) #2108
- Fixes add-snapshot-tag.sh by adding missing parameter (Ryan Maleki) #2120
- remove instance type compression (#2121) (Brandon Wagner) #2121
- Fixed make verify failure (#2124) (Jason Deal) #2124
- fix static pod and unschedulable toleration evictions (#2136) (Brandon Wagner) #2136
- Fix logging into public ECR (#2204) (Jonathan Innis) #2204
- Remove spurious logging from tests and indefinitely rely on cluster state sync (#2188) #2188 (Ellis Tarn)
- add Unsupported to unfulfillable error codes (#2147) (Brandon Wagner) #2147
- Link to main doc page (Ryan Maleki) #2135
- This should send the new tag for nightlies and stable (Ryan Maleki) #2135
- Make the message look more informational since it is (Ryan Maleki) #2135
Documentation
- Upgrade v0.14.0 (#2270) #2270 (Brandon Wagner)
- correct information regarding default instance types (#2028) #2028 (Todd Neal)
- update node version (#2039) #2039 (Todd Neal)
- clarify use of prefix assignment mode (#2048) #2048 (Jim DeWaard)
- add more information about metrics (#2058) #2058 (Todd Neal)
- fixed inaccurate note about pod affinity (#2074) #2074 (Ellis Tarn)
- adding Omaze to list of adopters (#2068) #2068 (Dan Shepard)
- ensure we don't list capacity type on our instance type docs (#2078) #2078 (Todd Neal)
- add metrics / dashboard design (#2003) #2003 (Jason Deal)
- Remove dead links from README (#2107) #2107 (Ellis Tarn)
- document some advanced scheduling techniques (#2183) #2183 (Todd Neal)
- Add docs for customAMIs, tests for UserData (#2169) #2169 (Suket Sharma)
- Update api_version in helm provider (#2201) #2201 (juangascon)
- Add a warning about misspelled custom labels (#2195) #2195 (Ellis Tarn)
- Update ADOPTERS file with Beeswax organization (#2208) #2208 (James Wojewoda)
- fix misleading description on controller max-pods flag (#2148) #2148 (Enrique González)
- Updated ALB description in FAQ (#2245) (Chris Negus) #2245
- add nth docs to getting started guide (#2160) (Brandon Wagner) #2160
- Add more details about snapshots, nightly release and include the cadence for stable releases (Ryan Maleki) #2094
- crd upgrade instruction backfill (#2093) (Brandon Wagner) #2093
- documenting defaulting behavior for architecture (#2095) (Nick Tran) #2095
- Docs update to run daemonset plugin on GPU nodes (#2125) (Chris Negus) #2125
- Fix dead link and end alert message (#2215) (Jonathan Innis) #2215
- add consolidation design doc (#2207) #2207 (Todd Neal)
- typos (#2149) (Fernando Miguel) #2149
- Documentation update under 'getting started' section (#1788) (Shrikant Lavhate) #1788
- undo doc change (Ryan Maleki) #2135
- backfill upgrade guide for providerRef crd replace (#2142) (Brandon Wagner) #2142
- Add file extension (#2134) (Ryan Maleki) #2134
Tests
- remove examples in favor of real tests (#2219) #2219 (Ellis Tarn)
- add a test for inflight nodes combined with providerRef (#2013) #2013 (Todd Neal)
- prevent test flake by not overwriting the test wide controller (#2020) #2020 (Todd Neal)
- idempotent test infra setup (#2030) #2030 (Brandon Wagner)
- add initial integration tests (#2047) #2047 (Todd Neal)
- fix batcher test flake (#2086) #2086 (Todd Neal)
- add dynamic/static PVC E2E tests (#2097) #2097 (Todd Neal)
- Expand cluster installation test configs (#2022) #2022 (Nick Tran)
- run e2etests serially (#2122) #2122 (Todd Neal)
- Add a utilization test 1pod/node, 500 pods (#2194) #2194 (Ellis Tarn)
- Add tests to verify maxPods on nodes (#2198) #2198 (Suket Sharma)
- Fix storage test for static volumes (#2214) #2214 (Suket Sharma)
- add multi-provisioner subnet selector test (#2237) (Brandon Wagner) #2237
- Adding tests for TTL (#2202) (Nick Tran) #2202
- Remove validation from test.Provisioner() since some tests test validation (#2239) #2239 (Ellis Tarn)
- Implemented ginkgo test against local kubeconfig (#2015) #2015 (Ellis Tarn)
- Add a missing test directive to pkg/scheduling (#2106) #2106 (Ellis Tarn)
- Increase randomname uniqueness to avoid collisions (#2104) #2104 (Ellis Tarn)
- Added missing env.Stop() to avoid leaking etcd/apiserver (#2114) #2114 (Ellis Tarn)
- Fixed a few bugs with running integration tests in the integration environment (#2184) #2184 (Ellis Tarn)
- Implemented pipeline for integration test suite (#2127) #2127 (Ellis Tarn)
- Added kubeconfig to test environment (#2137) #2137 (Ellis Tarn)
- Fixed a testing bug that ignored --kubeconfig (#2138) #2138 (Ellis Tarn)
- Add log captures to e2e tests (#2170) (Ellis Tarn) #2170
Chores
- add make target for setup which creates IAM roles (#2223) #2223 (Jonathan Innis)
- deps: bump autoprefixer from 10.4.7 to 10.4.8 in /website (#2234) #2234 (dependabot[bot])
- deps: bump github.com/aws/aws-sdk-go from 1.44.60 to 1.44.66 (#2231) #2231 (dependabot[bot])
- deps: bump github.com/samber/lo from 1.21.0 to 1.27.0 (#2232) #2232 (dependabot[bot])
- deps: bump github.com/onsi/gomega from 1.19.0 to 1.20.0 (#2233) #2233 (dependabot[bot])
- Testing Infra Flux Setup (#2247) #2247 (Brandon Wagner)
- add k8s 1.23 to github CI and change default (#2249) #2249 (Todd Neal)
- Add `make build` target alongside `make apply` (#2248) #2248 (Jonathan Innis)
- deps: bump github.com/pelletier/go-toml/v2 from 2.0.1 to 2.0.2 (#2035) #2035 (dependabot[bot])
- deps: bump github.com/aws/aws-sdk-go from 1.44.25 to 1.44.46 (#2038) #2038 (dependabot[bot])
- deps: bump github.com/aws/amazon-vpc-resource-controller-k8s (#2034) #2034 (dependabot[bot])
- use "go test" instead of "ginkgo" (#2057) #2057 (Ellis Tarn)
- set CODEOWNERS on API Surfaces (#2056) #2056 (Ellis Tarn)
- remove unused code (#2084) #2084 (Todd Neal)
- remove pvc update permissions (#2083) #2083 (Todd Neal)
- add pod information to debug log (#2087) #2087 (Todd Neal)
- decouple webhook logic from cloud provider API (#2079) #2079 (Ellis Tarn)
- node metrics refactor (#2031) #2031 (Jason Deal)
- update docs (#2156) #2156 (Todd Neal)
- add tyk cloud as an adopter (#2181) #2181 (gowtham)
- update default value of `dnsPolicy` in `values.yaml` to `Default` (#2199) #2199 (Jonathan Innis)
- unused import breaking e2etests (#2213) #2213 (Suket Sharma)
- 941b2cc: Refactored sets into requirements in preparation for GT/LT (#2241) (Ellis Tarn) #2241
- d43c9b2: lower spot to od threshold to 5 (#2190) (Brandon Wagner) #2190
- Add a script to parse community contributors in each release (#2236) #2236 (Ellis Tarn)
- add limit ranges (#2251) (Brandon Wagner) #2251
- remove unreachable err (#2263) (Brandon Wagner) #2263
- add prefer arm example (#2018) (Brandon Wagner) #2018
- Upgrade cosign to 1.9.0 (#2004) (Ryan Maleki) #2004
- Move the target account for nightlies and releases to the new account, and also remove snapshot-pr-approval and part of release.yaml that were attempting to make releases on PR commands and tags (#2072) (Ryan Maleki) #2072
- add hack to copy vpc limits file (#2110) (Brandon Wagner) #2110
- s/karpenter-snapshot/karpenter (Ryan Maleki) #2077
- Add notify function (Ryan Maleki) #2112
- Do not wait for user input (Ryan Maleki) #2112
- Notification for stable releases (Ryan Maleki) #2115
- Remove API Codeowners (#2126) #2126 (Ellis Tarn)
- add log for filtered instance types due to limits (#2089) (Brandon Wagner) #2089
- fix instance label description for generation (#2168) (Brandon Wagner) #2168
Commits
- b9306e7: Pull released images through pull through cache (#2262) (Ryan Maleki) #2262
- 54d9b64: add retry (#2264) (Ryan Maleki) #2264
- e1a0222: revert part of previous newnode optimization (#2267) (Todd Neal) #2267
- a844339: Add Anthropic to ADOPTERS.md (#2272) (Nova) #2272
- 9dc885c: v0.14.0 release (dewjam) #2278
- 7064c5c: AWS-context on ec2 fleet requests (#2007) (Brandon Wagner) #2007
- c7d63df: Add bug to exemptions for stalebot (Ryan Maleki) #2054
- 0027187: Revert "chore(deps): bump github.com/aws/amazon-vpc-resource-controller-k8s (#2034)" (Ryan Maleki) #2053
- 2bfefc3: V0.13.2 into main (#2075) (Todd Neal) #2075
- 0f6c1de: Revert "chore: decouple webhook logic from cloud provider API (#2079)" (#2155) (Ellis Tarn) #2155
- 7c7f339: Revert "Fix logging into public ECR (#2204)" (#2211) (Ryan Maleki) #2211