What's New
Support Dynamic Scheduling Based on Real Node Load
This feature schedules pods based on real node load instead of requested resources, which improves node resource utilization. Currently, a pod is scheduled based on its requested resources and the node's allocatable resources rather than the node's actual usage. This leads to unbalanced resource usage across compute nodes: a pod may be scheduled to a node with higher usage and a lower allocation rate. This is not what users expect; users expect the usage of each node to be balanced. For more details, see https://github.com/volcano-sh/volcano/blob/master/docs/design/usage-based-scheduling.md. (#2023, #2129 @william-wang )
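The difference between the two strategies can be illustrated with a minimal sketch. This is not Volcano's implementation; the `Node` class, its fields, and both `pick_*` functions are hypothetical names chosen for illustration, and only CPU is considered.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical view of a node; not a Volcano or Kubernetes type."""
    name: str
    allocatable_cpu: float  # cores the node can hand out
    requested_cpu: float    # sum of CPU requests of pods already placed
    used_cpu: float         # measured CPU usage of the node


def allocation_rate(node: Node) -> float:
    # Fraction of allocatable CPU that is already requested.
    return node.requested_cpu / node.allocatable_cpu


def usage_rate(node: Node) -> float:
    # Fraction of allocatable CPU that is actually in use.
    return node.used_cpu / node.allocatable_cpu


def pick_by_requests(nodes):
    # Request-based choice: prefer the node with the lowest allocation rate.
    return min(nodes, key=allocation_rate)


def pick_by_usage(nodes):
    # Usage-based choice: prefer the node with the lowest real usage.
    return min(nodes, key=usage_rate)


nodes = [
    Node("node-a", allocatable_cpu=8.0, requested_cpu=2.0, used_cpu=7.0),
    Node("node-b", allocatable_cpu=8.0, requested_cpu=6.0, used_cpu=2.0),
]

print(pick_by_requests(nodes).name)  # node-a (low allocation, but busy)
print(pick_by_usage(nodes).name)     # node-b (lightly loaded in practice)
```

In this example the request-based choice lands on the node that is already busy, which is the imbalance described above.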
Support Rescheduling Based on Real Node Load
This feature enables users to regularly rebalance node utilization based on real node resource usage, which is well suited to long-running workloads such as Deployments. All rescheduling policies and the check interval can be configured for custom scenarios. For more details, see https://github.com/volcano-sh/volcano/blob/master/docs/design/rescheduling.md. (#2174, #2184 @Thor-wl )
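One possible shape of such a rebalancing check is sketched below. It assumes a simple threshold policy; the function names and the `low`/`high` thresholds are hypothetical and are not taken from the Volcano rescheduling plugin or its configuration.

```python
def classify_nodes(usage_by_node, low=0.3, high=0.8):
    """Split nodes into over-utilized and under-utilized sets.

    usage_by_node maps a node name to its measured usage rate (0.0 to 1.0).
    The thresholds are illustrative assumptions, not Volcano defaults.
    """
    over = sorted(n for n, u in usage_by_node.items() if u > high)
    under = sorted(n for n, u in usage_by_node.items() if u < low)
    return over, under


def plan_rebalance(usage_by_node, low=0.3, high=0.8):
    """Return the nodes from which pods could be evicted in this round.

    Eviction only makes sense when some under-utilized node exists to
    receive the rescheduled pods; otherwise nothing is planned.
    """
    over, under = classify_nodes(usage_by_node, low, high)
    if not under:
        return []
    return over


usage = {"node-a": 0.9, "node-b": 0.2, "node-c": 0.5}
print(plan_rebalance(usage))  # ['node-a']
```

A real rescheduler would run a check like this at the configured interval and then select which pods to evict according to the configured policy.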
Support Elastic Job Scheduling
This feature allows Volcano to schedule Volcano jobs based on the [min,max] configuration in the job, which improves the resource utilization rate and shortens the execution time of training jobs. For more details, see https://github.com/volcano-sh/volcano/blob/master/docs/design/elastic-scheduler.md. (#2105, @qiankunli )
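The [min,max] idea can be sketched as follows. This is a simplified illustration under two assumptions: `min` is the number of pods the job needs before it can start at all, and any pods between `min` and `max` run only when spare capacity exists. The function name and parameters are hypothetical.

```python
def elastic_replicas(min_replicas: int, max_replicas: int, free_slots: int) -> int:
    """Return how many pods of an elastic job to run.

    free_slots is the number of pods the cluster can currently fit;
    how it is computed is outside the scope of this sketch.
    """
    if min_replicas > max_replicas:
        raise ValueError("min_replicas must not exceed max_replicas")
    if free_slots < min_replicas:
        # Not enough room for the minimum: the job does not start.
        return 0
    # Run as many pods as fit, capped at the configured maximum.
    return min(max_replicas, free_slots)


print(elastic_replicas(2, 5, 1))   # 0
print(elastic_replicas(2, 5, 3))   # 3
print(elastic_replicas(2, 5, 10))  # 5
```

When the cluster is idle the job grows toward `max` and finishes sooner; under contention it shrinks toward `min` so that other jobs can run.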
Add MPI Job Plugin
This feature provides a new Volcano job plugin: the MPI plugin. It makes it more convenient for MPI users to use Volcano jobs instead of manually setting up connections between hosts of different roles, registering the required environment variables, and so on. For more details, see https://github.com/volcano-sh/volcano/blob/master/docs/design/distributed-framework-plugins.md. (#2237, @hwdef )
Other Notable Changes
- update helm version in install.sh(#2103, @hwdef )
- modify the way to install the controller-gen(#2104, @hwdef )
- add shuffle action(#2174, @Thor-wl )
- add e2e Spark integration test(#2113, @Yikun )
- skip scoring when there is only one candidate node(#2122, @wpeng102 )
- skip verification of init container SecurityContext.Privileged(#2125, @zrss )
- add design doc for usage based scheduling(#2023, @william-wang )
- add usage based scheduling plugin(#2129, @william-wang )
- support elastic annotation in preempt/reclaim plugin(#2105, @qiankunli )
- add design doc for Enhance-Generate-PodGroup-OwnerReferences-for-Normal-Pod(#2151, @wpeng102 )
- allow no retry when task failed(#2154, @merryzhou )
- remove useless code in task-topology's manager.go(#2159, @HeGaoYuan )
- add user guidance for svc plugin(#2162, @Thor-wl )
- add user guidance of env plugin(#2153, @Thor-wl )
- add user guidance for ssh plugin(#2168, @Thor-wl )
- add user guidance about how to configure volcano scheduler(#2177, @Thor-wl )
- add user guidance about how to configure job and task policy(#2179, @Thor-wl )
- add overhead for pod request(#2170, @jiangxiaobin96 )
- rename ClusterRole from prometheus to prometheus-volcano(#2178, @SimonYang-CS )
- add image pull secret for volcano-admission-init job(#2185, @SimonYang-CS )
- add rescheduling plugin(#2184, @Thor-wl )
- feat(scheduler): support resource quota consideration during pod group enqueue procedure(#1345, @merryzhou )
- add priorityClassName for rescheduler(#2200, @jiangxiaobin96 )
- allow privilege containers to pass the admission webhook validation by default(#2222, @Thor-wl )
- clean up metrics of deleted objects(#2230, @xiaoanyunfei )
- sunset the reservation plugin and the elect and reserve actions(#2236, @william-wang )
- add more deploy switches on helm(#2267, @shinytang6 )
Bug Fixes
- fix dynamic provision ut case error(#2133, @wpeng102 )
- fix: add jobUID to the job's podgroup name to ensure the podgroup is unique(#2140, @FengXingYuXin)
- fix: add mirror for Spark Volcano IT(#2163, @Yikun )
- fix controller job cache not syncing the latest version(#2169, @wpeng102 )
- fix task MinAvailable issue(#2176, @merryzhou )
- fix inqueue resource calculation bug in opensession(#2214, @zbbkeepgoing )
- fix GPU device IDs never being deleted when the number of GPUs decreases(#2215, @WingkaiHo)
- fix NUMA divide-by-zero(#2216, @elinx)
- fix helm install(#2218, @zirain )
- fix api-server denying empty admission responses with PatchType set(#2267, @elinx)
- feat: exclude unhealthy devices(#2267, @yongjiahe)
- fix unhealthy GPU data struct array(#2267, @yongjiahe)
- fix high-priority task unable to preempt low-priority task when queue is overused(#2267, @wpeng102 )
- avoid panic when the Prometheus query returns no data(#2267, @waiterQ )
- modify prometheus.query.result judgment(#2267, @waiterQ )
- fix(scheduler): fix jobStarvingFn logic(#2271, @shinytang6 )