Features
- require 10 instance types when falling back from spot to OD (#2002) #2002 (Brandon Wagner)
- Update Terraform getting started guide to default to multi-cluster tagging scheme (#1668) #1668 (Bryant Biggs)
- make emptiness controller aware of scheduling (#2105) #2105 (Todd Neal)
- do not drain when pods have no owner refs and add drain error events (#2092) #2092 (Brandon Wagner)
- support all bottlerocket configuration (#2081) #2081 (Brandon Wagner)
- add node termination metrics (#2139) #2139 (Jason Deal)
- Adding support for custom AMIs (#2014) #2014 (Suket Sharma)
- in the absence of instance-type/family filtering, provide a default (#2153) #2153 (Todd Neal)
- add pod startup summary metric (#2146) #2146 (Jason Deal)
- Adding Custom AMIFamily and validations (#2154) #2154 (Suket Sharma)
- add new grafana dashboards (#2185) #2185 (Jason Deal)
Bug Fixes
- use the correct number of pods when bin-packing if AWS_ENI_LIMITED_POD_DENSITY is false (#2019) #2019 (Todd Neal)
- restrict iam:PassRole to just the node role we create (#2008) #2008 (Todd Neal)
- add initial delay to liveness and let CP know it's run by the webhook (#2032) #2032 (Todd Neal)
- correctly handle static volumes (#2033) #2033 (Todd Neal)
- eliminate timezone from pricing date (#2049) #2049 (Todd Neal)
- hostNetwork variable is not aligned with docs in Helm chart (#2043) (#2044) #2044 (flf2ko)
- update initial pricing async (#2063) #2063 (Todd Neal)
- Prevent use of launch templates and UserData together (#2076) #2076 (Suket Sharma)
- helm version causing errors (#2096) #2096 (Vishal Vazkar)
- de-prioritize c1 (#2103) #2103 (Todd Neal)
- reduce created event count (#2113) #2113 (Todd Neal)
- de-prioritize bare metal instance types (#2119) #2119 (Todd Neal)
- revert make apply to use -B rather than --bare (#2143) #2143 (Jason Deal)
- batch create fleet calls (#2130) #2130 (Todd Neal)
- duplicate termination event metrics (#2152) #2152 (Jason Deal)
- termination test errors (#2157) #2157 (Jason Deal)
- consider pod density in memory overhead calc (#1960) #1960 (Heiko Rothe)
- record provisioner requirements as valid topology domains (#2173) #2173 (Todd Neal)
- add in a grace period when checking if pods are stuck terminating (#2175) #2175 (Ellis Tarn)
- ensure cluster state is synced prior to scheduling (#2182) #2182 (Todd Neal)
- ignore pods that have an invalid storage class (#2192) #2192 (Todd Neal)
- eliminate controller-runtime debug log per event (#2206) #2206 (Todd Neal)
- Do not log evicted pods, since they are emitted as events (#2209) #2209 (Ellis Tarn)
- Removed spurious error logging when scaling down nodes by deleting a provisioner (#2196) #2196 (Ellis Tarn)
- Emit evictions as events (#2210) #2210 (Ellis Tarn)
- termination latency dashboard unit set to seconds (#2218) #2218 (Jason Deal)
Documentation
- correct information regarding default instance types (#2028) #2028 (Todd Neal)
- update node version (#2039) #2039 (Todd Neal)
- clarify use of prefix assignment mode (#2048) #2048 (Jim DeWaard)
- add more information about metrics (#2058) #2058 (Todd Neal)
- fixed inaccurate note about pod affinity (#2074) #2074 (Ellis Tarn)
- adding Omaze to list of adopters (#2068) #2068 (Dan Shepard)
- ensure we don't list capacity type on our instance type docs (#2078) #2078 (Todd Neal)
- add metrics / dashboard design (#2003) #2003 (Jason Deal)
- Remove dead links from README (#2107) #2107 (Ellis Tarn)
- document some advanced scheduling techniques (#2183) #2183 (Todd Neal)
- Add docs for customAMIs, tests for UserData (#2169) #2169 (Suket Sharma)
- Update api_version in helm provider (#2201) #2201 (juangascon)
- Add a warning about misspelled custom labels (#2195) #2195 (Ellis Tarn)
- Update ADOPTERS file with Beeswax organization (#2208) #2208 (James Wojewoda)
- fix misleading description on controller max-pods flag (#2148) #2148 (Enrique González)
Tests
- add a test for inflight nodes combined with providerRef (#2013) #2013 (Todd Neal)
- prevent test flake by not overwriting the test wide controller (#2020) #2020 (Todd Neal)
- idempotent test infra setup (#2030) #2030 (Brandon Wagner)
- add initial integration tests (#2047) #2047 (Todd Neal)
- fix batcher test flake (#2086) #2086 (Todd Neal)
- add dynamic/static PVC E2E tests (#2097) #2097 (Todd Neal)
- Expand cluster installation test configs (#2022) #2022 (Nick Tran)
- run e2etests serially (#2122) #2122 (Todd Neal)
- Add a utilization test 1pod/node, 500 pods (#2194) #2194 (Ellis Tarn)
- Add tests to verify maxPods on nodes (#2198) #2198 (Suket Sharma)
- Fix storage test for static volumes (#2214) #2214 (Suket Sharma)
Chores
- deps: bump github.com/pelletier/go-toml/v2 from 2.0.1 to 2.0.2 (#2035) #2035 (dependabot[bot])
- deps: bump github.com/aws/aws-sdk-go from 1.44.25 to 1.44.46 (#2038) #2038 (dependabot[bot])
- deps: bump github.com/aws/amazon-vpc-resource-controller-k8s (#2034) #2034 (dependabot[bot])
- use "go test" instead of "ginkgo" (#2057) #2057 (Ellis Tarn)
- set CODEOWNERS on API Surfaces (#2056) #2056 (Ellis Tarn)
- remove unused code (#2084) #2084 (Todd Neal)
- remove pvc update permissions (#2083) #2083 (Todd Neal)
- add pod information to debug log (#2087) #2087 (Todd Neal)
- decouple webhook logic from cloud provider API (#2079) #2079 (Ellis Tarn)
- node metrics refactor (#2031) #2031 (Jason Deal)
- update docs (#2156) #2156 (Todd Neal)
- add tyk cloud as an adopter (#2181) #2181 (gowtham)
- update default value of dnsPolicy in values.yaml to Default (#2199) #2199 (Jonathan Innis)
- unused import breaking e2etests (#2213) #2213 (Suket Sharma)
Commits
- 39118f6: add prefer arm example (#2018) (Brandon Wagner) #2018
- 7064c5c: AWS-context on ec2 fleet requests (#2007) (Brandon Wagner) #2007
- 68fab0f: Upgrade cosign to 1.9.0 (#2004) (Ryan Maleki) #2004
- Implemented ginkgo test against local kubeconfig (#2015) #2015 (Ellis Tarn)
- c7d63df: Add bug to exemptions for stalebot (Ryan Maleki) #2054
- 0027187: Revert "chore(deps): bump github.com/aws/amazon-vpc-resource-controller-k8s (#2034)" (Ryan Maleki) #2053
- 5215ae0: add network related bottlerocket settings (#2060) (Brandon Wagner) #2060
- 2bfefc3: V0.13.2 into main (#2075) (Todd Neal) #2075
- 8fb4970: Move the target account for nightlies and releases to the new account, and also remove snapshot-pr-approval and part of release.yaml that were attempting to make releases on PR commands and tags (#2072) (Ryan Maleki) #2072
- bf18a22: s/karpenter-snapshot/karpenter (Ryan Maleki) #2077
- 007c192: add log for filtered instance types due to limits (#2089) (Brandon Wagner) #2089
- 9db18ce: Add more details about snapshots, nightly release and include the cadence for stable releases (Ryan Maleki) #2094
- 5d1a8c0: crd upgrade instruction backfill (#2093) (Brandon Wagner) #2093
- 415c50b: documenting defaulting behavior for architecture (#2095) (Nick Tran) #2095
- Add a missing test directive to pkg/scheduling (#2106) #2106 (Ellis Tarn)
- c6c2f37: add revisionHistoryLimit option to chart (#2102) (Ido Koren) #2102
- eff4321: add hack to copy vpc limits file (#2110) (Brandon Wagner) #2110
- 69bd00d: fix LTs under management (#2108) (Brandon Wagner) #2108
- Increase randomname uniqueness to avoid collisions (#2104) #2104 (Ellis Tarn)
- Added missing env.Stop() to avoid leaking etcd/apiserver (#2114) #2114 (Ellis Tarn)
- 6b485f8: Add notify function (Ryan Maleki) #2112
- 8b887a9: Do not wait for user input (Ryan Maleki) #2112
- 7de478e: Notification for stable releases (Ryan Maleki) #2115
- 485745b: Fixes add-snapshot-tag.sh by adding missing parameter (Ryan Maleki) #2120
- 5116513: remove instance type compression (#2121) (Brandon Wagner) #2121
- dc338dc: Fixed make verify failure (#2124) (Jason Deal) #2124
- Remove API Codeowners (#2126) #2126 (Ellis Tarn)
- c830e9d: Docs update to run daemonset plugin on GPU nodes (#2125) (Chris Negus) #2125
- 996bbe5: Add file extension (#2134) (Ryan Maleki) #2134
- 2fc0300: Make the message look more informational since it is (Ryan Maleki) #2135
- dcf5999: Link to main doc page (Ryan Maleki) #2135
- 43ba786: This should send the new tag for nightlies and stable (Ryan Maleki) #2135
- 394990a: undo doc change (Ryan Maleki) #2135
- 7206640: fix static pod and unschedulable toleration evictions (#2136) (Brandon Wagner) #2136
- Added kubeconfig to test environment (#2137) #2137 (Ellis Tarn)
- Fixed a testing bug that ignored --kubeconfig (#2138) #2138 (Ellis Tarn)
- ba2d947: backfill upgrade guide for providerRef crd replace (#2142) (Brandon Wagner) #2142
- dddaa53: add Unsupported to unfulfillable error codes (#2147) (Brandon Wagner) #2147
- 37933ea: Documentation update under 'getting started' section (#1788) (Shrikant Lavhate) #1788
- 4917b57: typos (#2149) (Fernando Miguel) #2149
- 0f6c1de: Revert "chore: decouple webhook logic from cloud provider API (#2079)" (#2155) (Ellis Tarn) #2155
- 5495bfb: add nth docs to getting started guide (#2160) (Brandon Wagner) #2160
- 77ba211: add instance type category, generation, and local nvme storage labels (#2165) (Brandon Wagner) #2165
- Implemented pipeline for integration test suite (#2127) #2127 (Ellis Tarn)
- 3829e76: fix instance label description for generation (#2168) (Brandon Wagner) #2168
- 8ab2b20: fix aws pricing sdk change again (#2167) (Brandon Wagner) #2167
- 142e68d: Add capability to use custom podDisruptionBudget name (#2174) (Shrikant Lavhate) #2174
- 4d97adf: Add log captures to e2e tests (#2170) (Ellis Tarn) #2170
- Fixed a few bugs with running integration tests in the integration environment (#2184) #2184 (Ellis Tarn)
- d43c9b2: lower spot to od threshold to 5 (#2190) (Brandon Wagner) #2190
- Remove spurious logging from tests and indefinitely rely on cluster state sync (#2188) #2188 (Ellis Tarn)
- 8b24a50: Fix logging into public ECR (#2204) (Jonathan Innis) #2204
- add consolidation design doc (#2207) #2207 (Todd Neal)
- 570e2b3: Adding tests for TTL (#2202) (Nick Tran) #2202
- 7c7f339: Revert "Fix logging into public ECR (#2204)" (#2211) (Ryan Maleki) #2211
- dfa1488: Fix dead link and end alert message (#2215) (Jonathan Innis) #2215
- 167154f: Release v0.14.0-rc.0 (#2220) (Suket Sharma) #2220