I often run into the scenario where I have a set of N jobs, of which only X should run in parallel.
At the moment I can achieve this kind of parallelism by using the serial_groups feature.
I create X serial_groups labels and distribute them evenly among the N jobs, so that each label is shared by roughly N/X jobs.
But the resulting parallelism is not very flexible.
- I can't easily change the number of jobs that run in parallel without editing the serial_groups labels.
- The jobs that share the serial_groups label foo may finish faster than the jobs that share the serial_groups label bar. In that case I am not using my resources efficiently, because the slot freed by foo stays idle while the remaining bar jobs run one after another.
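To make the second point concrete, here is a minimal simulation sketch. It is not taken from any scheduler's code: the function names, the round-robin label assignment, and the job durations are all assumptions chosen for illustration. It compares the total run time of statically assigned labels with that of a shared limit of X running jobs.

```python
import heapq

def makespan_serial_groups(durations, x):
    """Static workaround (assumed round-robin): job i gets label i % x,
    and jobs sharing a label run one after another."""
    totals = [0] * x
    for i, d in enumerate(durations):
        totals[i % x] += d
    return max(totals)

def makespan_shared_limit(durations, x):
    """Shared limit: the next queued job starts as soon as
    fewer than x jobs are running."""
    free_at = [0] * x  # min-heap of times at which each slot becomes free
    heapq.heapify(free_at)
    for d in durations:
        start = heapq.heappop(free_at)  # earliest free slot
        heapq.heappush(free_at, start + d)
    return max(free_at)

durations = [1, 9, 1, 9, 1, 9]  # hypothetical job durations in minutes
print(makespan_serial_groups(durations, 2))  # 27
print(makespan_shared_limit(durations, 2))   # 19
```

With these made-up durations, one label collects all the short jobs and sits idle after 3 minutes, while the other needs 27 minutes. With a shared limit the same work finishes after 19 minutes.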
My idea is to add another label type, parallel_groups, which is similar to serial_groups but does not constrain parallelism by forcing jobs into sequential brackets.
Each parallel_group would define an integer value max_in_flight_jobs. All jobs labeled with the same parallel_groups label run in parallel, constrained only by the maximum number of concurrent jobs defined by max_in_flight_jobs.
This would ensure that all jobs in a parallel_groups bracket are executed in parallel as efficiently as possible while respecting the max_in_flight_jobs constraint.
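The intended semantics can be sketched with a counting semaphore. This is only an illustration of the idea and not a proposal for the actual implementation: run_group and the job names are hypothetical, and the sleep stands in for real job execution.

```python
import asyncio

async def run_group(job_names, max_in_flight_jobs):
    """Run all jobs of one hypothetical parallel group, never more than
    max_in_flight_jobs at once. Returns the peak number of concurrent jobs."""
    limit = asyncio.Semaphore(max_in_flight_jobs)
    running = 0
    peak = 0

    async def run(name):
        nonlocal running, peak
        async with limit:  # wait here until a slot is free
            running += 1
            peak = max(peak, running)
            await asyncio.sleep(0.01)  # stand-in for the actual job
            running -= 1

    await asyncio.gather(*(run(name) for name in job_names))
    return peak

jobs = [f"job-{i}" for i in range(6)]
peak = asyncio.run(run_group(jobs, max_in_flight_jobs=2))
print(peak)  # 2
```

Changing the degree of parallelism then means changing a single number instead of relabeling jobs.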
If there is interest, I would like to contribute a PR.