Determine the worker used by a specific job

Hi

I am looking for a way to determine on which worker a job ran.

At present we have a set of jobs failing intermittently, and I am trying to determine whether this only happens on specific worker nodes (we have around 48 workers).

I have tried trawling through the logs on the ATC, but I can’t find anything that lets me correlate a build with the worker it ran on.

Any help is appreciated

Thanks

Are you consuming any metrics? It’s possible they’re surfaced there as tags/dimensions (and if not, might be a good feature request).

Hi,

I’m not consuming any metrics specific to Concourse. I will raise a feature request.

Thanks

Cool - To be clear, I’m saying if the metrics don’t surface that, it might be a good feature request to enhance the metrics.

Start with the metrics first if you haven’t already :)

We run the following command at the beginning of most tasks to help identify the worker:

printf "[Worker: %s]\n" "$(curl -s http://169.254.169.254/metadata/v1/hostname)"
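A slightly more defensive variant of the command above might look like the sketch below. It assumes the workers run on a platform that serves a hostname at that link-local metadata URL (the URL is taken from the command above). The `worker_tag` function name, the 2-second timeout, and the `unknown` fallback are illustrative choices and not part of Concourse.

```shell
# Print a "[Worker: <name>]" line for the build log.
# Assumption: the metadata URL returns the worker's hostname as plain text.
# worker_tag is a hypothetical helper name, not a Concourse feature.
worker_tag() {
  url="${1:-http://169.254.169.254/metadata/v1/hostname}"
  # -s: silent, -f: fail on HTTP errors, --max-time: do not hang the task
  name="$(curl -sf --max-time 2 "$url" 2>/dev/null)" || name=""
  # Fall back to a fixed marker when the endpoint is unreachable or empty
  [ -n "$name" ] || name="unknown"
  printf '[Worker: %s]\n' "$name"
}
```

Calling `worker_tag` with no arguments as the first line of a task script keeps the log format the same as the one-liner, while a missing or slow metadata endpoint produces `[Worker: unknown]` rather than an empty tag or a stalled task.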

Hi,

That’s useful. However, we don’t control the pipelines or their tasks, so we would be dependent on developer buy-in and subject to their timelines.

Ideally this would be implemented at the platform level, with a switch to turn it on or off.

As an aside, we managed to get our issue resolved, as we were able to catch a failing job while it was still running. We were then able to narrow down the node by viewing its containers, and as expected the issue was localised to a couple of worker nodes.
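For anyone trying the same kind of narrowing, here is a rough sketch that filters a container listing down to the workers holding containers for one job. It assumes the listing comes from `fly containers` and that its whitespace-separated columns start with handle, worker, pipeline, job; please check that against the output of your own fly version. The `workers_for_job` name and the target, pipeline and job names in the usage line are made up for illustration.

```shell
# Read a container listing on stdin and print the unique worker names
# that hold containers for the given pipeline and job.
# Assumption: columns are handle, worker, pipeline, job, ... (whitespace-separated).
# workers_for_job is a hypothetical helper name.
workers_for_job() {
  pipeline="$1"
  job="$2"
  awk -v p="$pipeline" -v j="$job" '$3 == p && $4 == j { print $2 }' | sort -u
}
```

Usage would be something like `fly -t my-target containers | workers_for_job my-pipeline my-job`. Containers only exist while a build is running or shortly afterwards, so this has to be run while the failing job is still in progress.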

Thanks

Thanks for raising this - just wanted to let you know that surfacing more detailed information about when and where a task is running is something that the team is thinking about.

We’re figuring out how to display additional data in a more structured way (and trying to avoid quick fixes, like piping more info into stdout). The best way to follow updates is this issue thread: https://github.com/concourse/concourse/issues/4337#issuecomment-537115941