Features
- Block readiness until caches are synced (#401) #401 (Ellis Tarn)
- Provisioner Static Drift (#417) (Amanuel Engeda) #417
- Add Node Requirement Drift (#423) (Amanuel Engeda) #423
- add a metric for replacement nodes that fail to launch (#475) #475 (Nick Tran)
Bug Fixes
- prevent consolidation from pausing for 5m (#415) #415 (Todd Neal)
- Don't treat consolidation validation failures as errors, and emit a debug log instead (#437) #437 (Ellis Tarn)
- add pod overhead into pod resources calculation #449 (Todd Neal)
- Handle ephemeral volume topology (#441) #441 (Jonathan Innis)
- Ensure VolumeUsage and HostPortUsage state is consistent (#434) #434 (Ellis Tarn)
- resolve issue interpreting preferred node affinities #450 (Todd Neal)
- support pdb policies #455 (Todd Neal)
- continue for expiration/drift when pods cannot reschedule (#451) #451 (Nick Tran)
- ignore permanently pending pods during consolidation #458 (Todd Neal)
- reorder drift checks to save on computation (#464) #464 (Nick Tran)
- Translate all in-tree storage provisioners to CSI (#438) (Jonathan Innis) #438
- do not consider provisioners without a provider (#491) #491 (Nick Tran)
- register timeout metrics (#492) #492 (Nick Tran)
Performance Improvements
- improve scheduling speed #418 (Todd Neal)
- cache the default storage class for 1 minute #418 (Todd Neal)
- handle the common case of a simple requirement faster #418 (Todd Neal)
- reduce allocs when summing resource lists #418 (Todd Neal)
- Implement timeouts and Max-batch size for Consolidation (#472) #472 (Nick Tran)
Tests
- Deflake consolidation timeout testing (#476) #476 (Jonathan Innis)
Chores
- deps: bump the go-deps group with 2 updates (#410) #410 (dependabot[bot])
- deps: bump the go-deps group with 1 update (#420) #420 (dependabot[bot])
- Add conversion utils functions for NodePool/NodeClaim (#433) #433 (Jonathan Innis)
- debug log when we add requirements due to volumes #456 (Todd Neal)
- deps: bump the go-deps group with 2 updates (#468) #468 (dependabot[bot])
- Add v1beta1/NodeClaim controllers that convert from v1alpha5/Machine (#445) #445 (Jonathan Innis)
- Update drift reason (#474) #474 (Amanuel Engeda)
- Enable v1beta1/NodePool conversion in counter controller (#448) #448 (Jonathan Innis)
- Enable v1beta1/NodeClaim conversion in consistency controller (#447) #447 (Jonathan Innis)
- Garbage Collect Leaked Node Lease (#471) #471 (Amanuel Engeda)
- Enable v1beta1/NodePool provisioner hash controller (#477) #477 (Jonathan Innis)
- upgrade to go 1.21 (#480) #480 (Brandon Wagner)
- Scope Lease to namespace kube-node-lease (#482) #482 (Amanuel Engeda)
- v1beta1 Conversion Support for Provisioning/Deprovisioning/State (#443) #443 (Jonathan Innis)
- Perform metrics tracking through a MetricStore abstraction (#326) #326 (Jonathan Innis)
- Allow node-restriction.kubernetes.io/ prefix in the label set (#484) #484 (Jonathan Innis)
- Reorganize controllers into logical directories (#485) #485 (Jonathan Innis)
- Add v1beta1/NodePool metrics controller (#486) #486 (Jonathan Innis)
- Handle shared state for Machine/NodeClaim (#487) #487 (Jonathan Innis)
- Change duration type for settings (#483) #483 (Amanuel Engeda)
- Fix hash annotation key resolution for v1beta1/NodeClaim (#490) #490 (Jonathan Innis)
- single init inflight capacity / startup taints (#495) #495 (Jason Deal)
Commits
- c2bae73: Group dependabot updates together (#404) (Jonathan Innis) #404
- f32767a: Bump k8s deps to 1.26 (#180) (Jonathan Innis) #180
- c88854d: Fix dependabot grouping (#407) (Jonathan Innis) #407
- 1110559: Fix missed k8s 1.26 deps (#408) (Jonathan Innis) #408
- 747f91b: Add "ci:" to PR template (#412) (Jonathan Innis) #412
- 1cfc2ea: Fix bool check in "make verify" (#411) (Jonathan Innis) #411
- 202df1e: Add retracted version for published test version (#413) (Jonathan Innis) #413
- b7a0a8f: Adding static drift annotation (#400) (Amanuel Engeda) #400
- 34e8049: Close response to fix file descriptor leak (#416) (Jonathan Innis) #416
- 51c6998: Add more exempt PR labels (#422) (Jonathan Innis) #422
- 0b48554: Adding LeaderElectionNamespace to init of controllerruntime Manager (#424) (abeer-stripe) #424
- 6b54c40: Consider existing capacity for scheduling (#414) (Jonathan Innis) #414
- eb4d8f2: Warn instead of Error on 'no provisioners found' (#425) (Jonathan Innis) #425
- bc56099: Prevent retry.Do() from hanging controller (#427) (Jonathan Innis) #427
- 83e9671: Add v1beta1 APIs (#426) (Jonathan Innis) #426
- 353ed63: discovery cluster test label (#431) (Amanuel Engeda) #431
- e40f655: Fire an info message rather than warn (#435) (Jonathan Innis) #435
- 3aa152d: Enforce stricter compatibility for existing nodes/machines (#432) (Jonathan Innis) #432
- 599497b: Wait for Karpenter-managed node to populate provider id (#439) (Jonathan Innis) #439
- 44f8af7: cloudProvider gives a reason for drift (#446) (Amanuel Engeda) #446
- b611090: Remove not drifted (#452) (Amanuel Engeda) #452
- 36c54ad: provisioner static drift (#453) (Amanuel Engeda) #453
- 18cfbd6: Deflake tests that validate deprov ordering (#460) (Jonathan Innis) #460
- 20516d9: Reduce event spam by using command.Action() instead of command.String() (#461) (Jonathan Innis) #461
- b90dbcc: Stop firing Blocked/Unconsolidatable events for some time on startup (#462) (Jonathan Innis) #462
- c15edc9: remove node look-up on machine annotation (#463) (Amanuel Engeda) #463
- ff6de1a: disruption controller metrics (#442) (Amanuel Engeda) #442
- b688eb7: Favor RequeueAfter over event enqueueing for Counter Controller (#478) (Jonathan Innis) #478