I’ve deployed Concourse via a Terraform module we wrote, with web and worker running on a single node for testing. You can find it here: https://ci.7fdev.io.
It’s Concourse 3.14.1, and things appear to work until I execute a build using fly. Then I get the dreaded “Unable to connect to the docker registry” issue. I suspect it might be a networking problem in AWS, but I need some help confirming. I’m running web/worker like this:
sudo docker run -d --network host --name concourse_worker --privileged=true --restart=unless-stopped -v /etc/concourse/keys/:/concourse-keys -v /tmp/:/concourse-tmp concourse/concourse:3.14.1 worker --peer-ip 126.96.36.199 --bind-ip 0.0.0.0 --garden-dns-server 188.8.131.52 --baggageclaim-bind-ip 0.0.0.0 --garden-bind-ip 0.0.0.0 --tsa-host conc-lb-us-east-1-1472385510.us-east-1.elb.amazonaws.com:2222 --work-dir /concourse-tmp
sudo docker run -d --network host --name concourse_web --restart=unless-stopped -v /etc/concourse/keys/:/concourse-keys concourse/concourse:3.14.1 web --peer-url http://184.108.40.206:8080 --postgres-data-source nope --external-url https://ci.7fdev.io --github-auth-client-id nope --github-auth-client-secret nope --github-auth-organization 7Factor
This is the error the build gives me:
resource script '/opt/resource/check ' failed: exit status 1
stderr:
failed to ping registry: 2 error(s) occurred:
* ping https: Get https://registry-1.docker.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
* ping http: Get http://registry-1.docker.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
errored
Webs and workers live in a subnet with access to a NAT gateway, as indicated in the picture below. Anyone have any ideas? I can curl whatever I want from the host and from inside the container, so currently I’m pretty baffled. Either I suck at understanding Docker networking or I’m missing something small.
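
For anyone who wants to reproduce the curl test, here is a minimal sketch of a probe that hits the same URL as the failing registry ping. It is not the exact command I ran. The function names `classify_registry_status` and `probe_registry` are made up for illustration, and the 10 second timeout is an arbitrary choice. It assumes curl is installed wherever it runs. An unauthenticated request to a registry's /v2/ endpoint normally answers 200 or 401, so either counts as reachable, while curl reports 000 when it gets no HTTP response at all.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not part of Concourse or Docker).

# Map an HTTP status code from GET <registry>/v2/ to a short verdict.
# 200 or 401 means the registry answered; 000 is what curl prints
# when no HTTP response was received (timeout, DNS failure, refused).
classify_registry_status() {
  case "$1" in
    200|401) echo "reachable" ;;
    000)     echo "no-connection" ;;
    *)       echo "unexpected:$1" ;;
  esac
}

# Probe a registry URL (defaults to the one in the error above) and
# print the verdict. The 10 second limit is an assumed value.
probe_registry() {
  url="${1:-https://registry-1.docker.io/v2/}"
  code=$(curl -s -o /dev/null --max-time 10 -w '%{http_code}' "$url")
  classify_registry_status "${code:-000}"
}
```

Sourcing this and running `probe_registry` on the host, and again inside the worker container, gives a quick side-by-side: "reachable" in both places would match what I'm seeing with curl, and "no-connection" anywhere would point at that layer.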