Features
- add karpenter_pods_drained_total metric to track pod draining by reason (#2044) #2044 (Omer Aplatony)
- prioritize emptiness over other consolidation methods (#2180) #2180 (Reed Schalo)
- add metrics for disruption candidate validation (#2239) #2239 (Reed Schalo)
Bug Fixes
- Fix pod scheduling undecided time metric (#2147) #2147 (Jigisha Patil)
- Allow disabling syncing taints from nodeclaims to nodes (with node label) (#2125) #2125 (Jeremy Bolster)
- add runtime validation for requirements with known values (#2164) #2164 (Reed Schalo)
- Consider pending pods with preferred nodeAffinity when making provisioning decisions (#2182) #2182 (Rodrigo Fior Kuntzer)
- allow non-churn empty nodes to be disrupted (#2206) #2206 (Reed Schalo)
Documentation
- update issue triage meeting schedule (#2244) #2244 (Jason Deal)
Code Refactoring
- convert validation to an interface (#2220) #2220 (Reed Schalo)
Performance Improvements
- Parallelize node filtering (#2126) #2126 (Jonathan Innis)
- Speed-up resource checking for existing nodes (#2224) #2224 (Jonathan Innis)
- Poll for DaemonSet resources rather than watching (#2226) #2226 (Jonathan Innis)
- Don't deepcopy inside of watch handler functions (#2232) #2232 (Jonathan Innis)
- Only deep copy nodes during GetCandidates once (#2233) #2233 (DerekFrank)
- Only call .Available() once which prevents duplicate allocs (#2241) #2241 (DerekFrank)
- Avoid deepcopy when get nodePool/cluster health (#2247) #2247 (Jonathan Innis)
- Improve OrderByPrice performance (#2250) #2250 (Jonathan Innis)
Tests
- Migrate the start monitoring controller start-up to the BeforeEach and AfterEach stage (#2168) #2168 (Amanuel Engeda)
- Karpenter Integration Test Migration RFC (#2144) #2144 (Amanuel Engeda)
- Migrate the Chaos suite test from the AWS Karpenter Provider (#2167) #2167 (Amanuel Engeda)
- Migrate the Expiration suite test from the AWS Karpenter Provider (#2172) #2172 (Amanuel Engeda)
- Update NodeClaim expiration time for E2E tests (#2188) #2188 (Amanuel Engeda)
- Apply KWOK requirement to only work with KWOK nodeclass (#2189) #2189 (Amanuel Engeda)
- Migrate the Integration suite test from the AWS Karpenter Provider (#2197) #2197 (Amanuel Engeda)
- Update E2E tests to disable expiration for new nodes that have expired (#2190) #2190 (Amanuel Engeda)
- Migrate the NodeClaim suite test from the AWS Karpenter Provider (#2193) #2193 (Amanuel Engeda)
- Migrate the Drift suite test from the AWS Karpenter Provider (#2195) #2195 (Amanuel Engeda)
- Use Foreground deletion to cleanup our tests (#2209) #2209 (Amanuel Engeda)
- Add default disruption budgets for the drifted initialized failure tests (#2214) #2214 (Amanuel Engeda)
- Block NodeClaim registration for E2E tests using validating admission policy (#2216) #2216 (Amanuel Engeda)
- Migrate the Termination suite test from the AWS Karpenter Provider (#2173) #2173 (Amanuel Engeda)
- Use Pod Metadata field for test objects (#2225) #2225 (Amanuel Engeda)
- Lower resource requests for NodeClaim test (#2229) #2229 (Amanuel Engeda)
- Add random name string for NodePool and NodeClass (#2231) #2231 (Amanuel Engeda)
- Update E2E testing suite to be named Regression (#2234) #2234 (Amanuel Engeda)
- deflake NodeClaim and presubmit tests (#2240) #2240 (Reed Schalo)
- add validating admission policy for nodeclass status (#2251) #2251 (Reed Schalo)
Continuous Integration
- Enable Kind Cluster Testing on PR Merge (#2184) #2184 (Amanuel Engeda)
- Temporarily Disable Karpenter KPI Analysis Package (#2204) #2204 (Amanuel Engeda)
- Temporarily Disable Karpenter KPI actions (#2205) #2205 (Amanuel Engeda)
Chores
- Use structured errors in logger (#2136) #2136 (Jonathan Innis)
- deps: bump golang.org/x/net from 0.37.0 to 0.38.0 (#2149) #2149 (dependabot[bot])
- deps: bump github.com/docker/docker from 28.0.4+incompatible to 28.1.1+incompatible in the go-deps group (#2158) #2158 (dependabot[bot])
- Delete podSchedulableTime and associated metric if pod can't be scheduled (#2146) #2146 (Jigisha Patil)
- Update pod schedulable time (#2154) #2154 (Jigisha Patil)
- deps: bump actions/setup-python from 5.5.0 to 5.6.0 in the actions-deps group (#2174) #2174 (dependabot[bot])
- deps: bump github.com/samber/lo from 1.49.1 to 1.50.0 in the go-deps group (#2176) #2176 (dependabot[bot])
- Bump github.com/awslabs/operatorpkg to latest (#2177) #2177 (Jonathan Innis)
- Fix event log line for a pointer (#2196) #2196 (Amanuel Engeda)
- deps: bump golang.org/x/text from 0.24.0 to 0.25.0 in the go-deps group (#2199) #2199 (dependabot[bot])
- Reduce "waiting on cluster sync" spam (#2203) #2203 (Jonathan Innis)
- Remove mismatch with deps used for E2E tests (#2207) #2207 (Amanuel Engeda)
- Fix scheduling timeout error check (#2215) #2215 (Jonathan Innis)
- Update the testing README to define mechanism for cloud provider to run the integrated testing (#2208) #2208 (Amanuel Engeda)
- deps: bump actions/setup-go from 5.4.0 to 5.5.0 in /.github/actions/install-deps in the action-deps group (#2222) #2222 (dependabot[bot])