What's New
1. Add TDM plugin
The TDM (Time Division Multiplexing) plugin provides a mechanism for sharing nodes between K8S and other clusters (such as Yarn) at different times. (#1269, @yahaa )
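The time-sharing idea can be illustrated with a minimal sketch. This is not the plugin's code: the `HH:MM-HH:MM` window format and the names `parse_window` and `node_available` are hypothetical, chosen only to show how a scheduler might decide whether a shared node is currently lent to K8S.

```python
from datetime import datetime, time
from typing import Tuple


def parse_window(window: str) -> Tuple[time, time]:
    """Parse a daily window written as 'HH:MM-HH:MM' (hypothetical format)."""
    start_s, end_s = window.split("-")
    start = datetime.strptime(start_s.strip(), "%H:%M").time()
    end = datetime.strptime(end_s.strip(), "%H:%M").time()
    return start, end


def node_available(window: str, now: datetime) -> bool:
    """Return True if `now` falls inside the daily window (start inclusive, end exclusive)."""
    start, end = parse_window(window)
    t = now.time()
    if start <= end:
        return start <= t < end
    # The window wraps past midnight, e.g. 22:00-06:00.
    return t >= start or t < end
```

Outside the window the node would go back to the other cluster, so workloads placed on it during the window have to tolerate eviction.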
2. Add SLA plugin
The SLA (Service Level Agreement) plugin supports the job resource reservation feature. Users can set an SLA on jobs to ensure that the specified jobs are scheduled in time. It provides a better design and implementation for job resource reservation. (#1303, @jiangkaihua )
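As a rough illustration of the idea, the sketch below orders pending jobs by an SLA deadline. It is an assumption-laden example, not the plugin's implementation: the `Job` type, the `max_wait` field, and the helper functions are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Job:
    name: str
    created: datetime
    # Hypothetical per-job SLA: the longest the job may stay pending. None means no SLA.
    max_wait: Optional[timedelta] = None


def deadline(job: Job) -> Optional[datetime]:
    """Latest time by which the job should have been scheduled, if it has an SLA."""
    return job.created + job.max_wait if job.max_wait is not None else None


def order_by_sla(jobs: List[Job]) -> List[Job]:
    """Jobs with an SLA come first, earliest deadline first; the rest follow by creation time."""
    return sorted(jobs, key=lambda j: (deadline(j) is None, deadline(j) or j.created))


def overdue(jobs: List[Job], now: datetime) -> List[str]:
    """Names of jobs whose SLA deadline has already passed."""
    result = []
    for j in jobs:
        d = deadline(j)
        if d is not None and now >= d:
            result.append(j.name)
    return result
```

A scheduler following this approach would handle overdue jobs ahead of the rest, which is one way to read "scheduled in time" above.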
Other Notable Changes
- improve addResourceList func in job_controller_util.go(#1332, @shinytang6 )
- update overcommit plugin(#1324, @jiangkaihua )
- add e2e for sla plugin(#1319, @jiangkaihua )
- make sure non-preemptable and revocable workload not preempt other tasks in tdm plugin(#1314, @wpeng102 )
- support only specify preemptable=true for revocable workload(#1313, @wpeng102 )
- support revocable-zone annotation for workload(#1312, @wpeng102 )
- add fail event for annotation admission(#1308, @wpeng102 )
- support min pod alive for tdm plugin(#1300, @wpeng102 )
- update enqueue action and introduce overcommit plugin to limit pending jobs from becoming inqueue(#1298, @jiangkaihua )
- build cache for revocable nodes(#1293, @yahaa )
- separate JobPipelined into two semantics for preempt action(#1288, @wpeng102 )
- support minAlive and evictMaxNum for job(#1287, @wpeng102 )
- non preemptable deployment preempt resource(#1286, @wpeng102 )
- update job-resource-reservation-design doc(#1282, @Thor-wl )
- add tdm design doc(#1277, @wpeng102 )
- refine deployment.yaml example(#1274, @wpeng102 )
- tdm plugin add victimsFn(#1276, @wpeng102 )
- add Makefile flag SUPPORT_PLUGINS(#1266, @zen-xu )
- update ssh secret when job updated(#1263, @shinytang6 )
- add job plugin example(#1254, @shinytang6 )
Bug Fixes
- replace removed command of kind when getting kube config(#1315, @rudeigerc )
- fix log in job_controller_actions.go(#1305, @gaocegege )
- correct log info in cache.go(#1302, @juchaosong )
- optimize nodeorder plugin(#1292, @huone1 )
- enhance tdm max evict step(#1290, @yahaa )
- revert ssh subpath for ssh plugin(#1280, @shinytang6 )
- fix e2e helm install timeout(#1262, @huone1 )
- fix more pods are reclaimed than required(#1260, @huone1 )
- fix CI: add hacky retry mechanism(#1248, @shinytang6 )