I have a cluster in AWS that runs Concourse 5.3.0. The CPUs (workers) are all running at 100%, even when there’s hardly anything running. I have 3 workers on m5.xlarge instances, so they’re not tiny). This is causing jobs to take forever to complete, some jobs seemingly get stuck forever, and sometimes the workers stall because of it. It seems to happen increasingly often, and I’m a bit stuck on how to find out what’s causing it (and how to fix it).
Restarting the Concourse worker doesn’t seem to achieve much as CPU is still just as high after a restart.
Any help/thoughts/suggestions would be greatly appreciated!