Due to various requirements I ended up running gdn (the one shipped with the Concourse release) as well as containerd as separate systemd services on a bare-metal worker running Fedora.
Currently I am experiencing very long delays between a successful resource check and task execution: 30 minutes or more can easily pass. The same delay also occurs between two tasks or parallel task groups.
None of the worker logs, top, iotop, or the InfluxDB logs indicate where the problem originates. Essentially the worker is idle until the actual tasks kick in (duration ~30 seconds) and then idle again for 30 minutes.
What’s the best approach to debug this? Help would be much appreciated, thanks!
Note that the previous setup did not use containerd but relied on direct runc execution, which was snappy.