Concourse worker not starting

Hi Guys,

Our production Concourse system is down. Below are the errors from the worker instance, which is hosted on AWS EC2.

Jul 10 17:27:57 ip-10-10-4-163.us-west-2.compute.internal concourse_worker[3643]: {"timestamp":"2020-07-10T17:27:57.761716754Z","level":"error","source":"worker","message":"worker.beacon-runner.beacon.failed-to-connect-to-tsa","data":{"error":"dial tcp 10.10.4.112:2222: i/o timeout","session":"4.1"}}
Jul 10 17:27:57 ip-10-10-4-163.us-west-2.compute.internal concourse_worker[3643]: {"timestamp":"2020-07-10T17:27:57.761777725Z","level":"error","source":"worker","message":"worker.beacon-runner.beacon.dial.failed-to-connect-to-any-tsa","data":{"error":"all worker SSH gateways unreachable","session":"4.1.1"}}
Jul 10 17:27:57 ip-10-10-4-163.us-west-2.compute.internal concourse_worker[3643]: {"timestamp":"2020-07-10T17:27:57.761795888Z","level":"error","source":"worker","message":"worker.beacon-runner.beacon.failed-to-dial","data":{"error":"all worker SSH gateways unreachable","session":"4.1"}}
Jul 10 17:27:57 ip-10-10-4-163.us-west-2.compute.internal concourse_worker[3643]: {"timestamp":"2020-07-10T17:27:57.761813432Z","level":"error","source":"worker","message":"worker.beacon-runner.beacon.exited-with-error","data":{"error":"all worker SSH gateways unreachable","session":"4.1"}}
Jul 10 17:27:57 ip-10-10-4-163.us-west-2.compute.internal concourse_worker[3643]: {"timestamp":"2020-07-10T17:27:57.761837227Z","level":"error","source":"worker","message":"worker.beacon-runner.failed","data":{"error":"all worker SSH gateways unreachable","session":"4"}}

The web node is also logging a DB connection error:

Jul 10 18:01:06 ip-10-10-4-112.us-west-2.compute.internal concourse_web[9066]: {"timestamp":"2020-07-10T18:01:06.196192398Z","level":"error","source":"atc","message":"atc.db.failed-to-open-db-retrying","data":{"error":"dial tcp 10.10.6.28:5432: connect: connection timed out","session":"3"}}

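Check your web node first, since its log shows it cannot reach the database at 10.10.6.28:5432. Try connecting with psql from the web node to confirm whether the DB is reachable.
The worker won’t run until it can reach the TSA on the web node (10.10.4.112:2222 in your worker log).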
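
If psql isn't handy, a plain TCP check can at least tell you whether the ports are reachable. Here is a rough sketch in Python. The addresses and ports are copied from the logs above, and the names `can_connect`, `describe` and `TARGETS` are made up for this example. It only tests that a TCP connection opens, not that Concourse or Postgres is healthy behind it.

```python
import socket

# Endpoints taken from the log lines in this thread; adjust for your setup.
TARGETS = {
    "TSA on web node (check from the worker)": ("10.10.4.112", 2222),
    "Postgres (check from the web node)": ("10.10.6.28", 5432),
}


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        # create_connection raises OSError on refusal, timeout, or no route.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def describe(targets: dict, timeout: float = 3.0) -> list:
    """Build one human-readable status line per target."""
    lines = []
    for label, (host, port) in targets.items():
        status = "reachable" if can_connect(host, port, timeout) else "NOT reachable"
        lines.append(f"{label}: {host}:{port} is {status}")
    return lines
```

Run `print("\n".join(describe(TARGETS)))` on the worker to test the TSA port and on the web node to test the database port. Both errors in your logs are timeouts rather than refused connections, which often (though not always) points at something on the network path dropping traffic, such as security group or firewall rules, rather than at a stopped service.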
