Installation Issues


#1

Having a hell of a time getting a web node to come up. I have the software installed, the concourse user created, and a systemd unit set up for it. PostgreSQL is running on an external RDS instance, and I have confirmed that I can connect to it from this instance using the creds in the file. I can't even get it to tell me what exactly isn't working. Any insight would be great. Ideally I would love to get the LDAP stuff configured, but I can't even get the first step to work.

ubuntu@ip-172-21-8-99:~$ concourse --version
4.2.1
ubuntu@ip-172-21-8-99:~$ which concourse
/usr/local/bin/concourse

ubuntu@ip-172-21-8-99:~$ ls -l /etc/concourse/
total 32
-rw-r--r-- 1 concourse concourse  393 Sep 21 18:20 authorized_worker_keys
-rw-r--r-- 1 concourse concourse 1823 Sep 21 15:42 session_signing_key
-rw-r--r-- 1 concourse concourse  393 Sep 21 15:42 session_signing_key.pub
-rw-r--r-- 1 concourse concourse 1823 Sep 21 15:42 tsa_host_key
-rw-r--r-- 1 concourse concourse  393 Sep 21 15:42 tsa_host_key.pub
-rw------- 1 concourse concourse 1424 Sep 21 19:24 web_environment
-rw-r--r-- 1 concourse concourse 1823 Sep 21 15:42 worker_key
-rw-r--r-- 1 concourse concourse  393 Sep 21 15:42 worker_key.pub

ubuntu@ip-172-21-8-99:~$ cat /etc/systemd/system/concourse-web.service 
[Unit]
Description=Concourse CI web process (ATC and TSA)

[Service]
User=concourse
Group=concourse
Type=simple
Restart=on-failure
ExecStart=/usr/local/bin/concourse web \
        --log-level debug \
        --session-signing-key /etc/concourse/session_signing_key \
        --tsa-host-key /etc/concourse/tsa_host_key \
        --tsa-authorized-keys /etc/concourse/authorized_worker_keys \
        --postgres-host stuff.cluster-cf48tnhs47l5.us-east-1.rds.amazonaws.com \
        --postgres-user concourse \
        --postgres-password PW_here \
        --postgres-database concourse \
        --add-local-user admin:pw_here \
        --main-team-local-user admin \
        --external-url http://concourse.cicd.whatever.com \
        --peer-url http://172.21.8.99:8080


[Install]
WantedBy=multi-user.target

Sep 21 23:43:57 ip-172-21-8-99 systemd[1]: Started Concourse CI web process (ATC and TSA).
Sep 21 23:43:58 ip-172-21-8-99 systemd[1]: concourse-web.service: Main process exited, code=exited, status=1/FAILURE
Sep 21 23:43:58 ip-172-21-8-99 systemd[1]: concourse-web.service: Unit entered failed state.
Sep 21 23:43:58 ip-172-21-8-99 systemd[1]: concourse-web.service: Failed with result 'exit-code'.
Sep 21 23:43:58 ip-172-21-8-99 systemd[1]: concourse-web.service: Service hold-off time over, scheduling restart.
Sep 21 23:43:58 ip-172-21-8-99 systemd[1]: Stopped Concourse CI web process (ATC and TSA).
Sep 21 23:43:58 ip-172-21-8-99 systemd[1]: Started Concourse CI web process (ATC and TSA).
Sep 21 23:43:58 ip-172-21-8-99 systemd[1]: concourse-web.service: Main process exited, code=exited, status=1/FAILURE
Sep 21 23:43:58 ip-172-21-8-99 systemd[1]: concourse-web.service: Unit entered failed state.
Sep 21 23:43:58 ip-172-21-8-99 systemd[1]: concourse-web.service: Failed with result 'exit-code'.
Sep 21 23:43:58 ip-172-21-8-99 systemd[1]: concourse-web.service: Service hold-off time over, scheduling restart.
Sep 21 23:43:58 ip-172-21-8-99 systemd[1]: Stopped Concourse CI web process (ATC and TSA).
Sep 21 23:43:58 ip-172-21-8-99 systemd[1]: Started Concourse CI web process (ATC and TSA).
Sep 21 23:43:59 ip-172-21-8-99 systemd[1]: concourse-web.service: Main process exited, code=exited, status=1/FAILURE
Sep 21 23:43:59 ip-172-21-8-99 systemd[1]: concourse-web.service: Unit entered failed state.
Sep 21 23:43:59 ip-172-21-8-99 systemd[1]: concourse-web.service: Failed with result 'exit-code'.
Sep 21 23:43:59 ip-172-21-8-99 systemd[1]: concourse-web.service: Service hold-off time over, scheduling restart.
Sep 21 23:43:59 ip-172-21-8-99 systemd[1]: Stopped Concourse CI web process (ATC and TSA).
Sep 21 23:43:59 ip-172-21-8-99 systemd[1]: Started Concourse CI web process (ATC and TSA).
Sep 21 23:43:59 ip-172-21-8-99 systemd[1]: concourse-web.service: Main process exited, code=exited, status=1/FAILURE
Sep 21 23:43:59 ip-172-21-8-99 systemd[1]: concourse-web.service: Unit entered failed state.
Sep 21 23:43:59 ip-172-21-8-99 systemd[1]: concourse-web.service: Failed with result 'exit-code'.
Sep 21 23:43:59 ip-172-21-8-99 systemd[1]: concourse-web.service: Service hold-off time over, scheduling restart.
Sep 21 23:43:59 ip-172-21-8-99 systemd[1]: Stopped Concourse CI web process (ATC and TSA).
Sep 21 23:43:59 ip-172-21-8-99 systemd[1]: Started Concourse CI web process (ATC and TSA).
Sep 21 23:44:00 ip-172-21-8-99 systemd[1]: concourse-web.service: Main process exited, code=exited, status=1/FAILURE
Sep 21 23:44:00 ip-172-21-8-99 systemd[1]: concourse-web.service: Unit entered failed state.
Sep 21 23:44:00 ip-172-21-8-99 systemd[1]: concourse-web.service: Failed with result 'exit-code'.
Sep 21 23:44:00 ip-172-21-8-99 systemd[1]: concourse-web.service: Service hold-off time over, scheduling restart.
Sep 21 23:44:00 ip-172-21-8-99 systemd[1]: Stopped Concourse CI web process (ATC and TSA).
Sep 21 23:44:00 ip-172-21-8-99 systemd[1]: concourse-web.service: Start request repeated too quickly.
Sep 21 23:44:00 ip-172-21-8-99 systemd[1]: Failed to start Concourse CI web process (ATC and TSA).

#2

The first step is to run, from the shell, the same command line you have in ExecStart and see what happens. I think not all of the output is going to the log.


#3
ubuntu@ip-172-21-8-99:~$ /usr/local/bin/concourse web \
>         --log-level debug \
>         --session-signing-key /etc/concourse/session_signing_key \
>         --tsa-host-key /etc/concourse/tsa_host_key \
>         --tsa-authorized-keys /etc/concourse/authorized_worker_keys \
>         --postgres-host stuff20180921151035119500000001.cluster-cf48tnhs47l5.us-east-1.rds.amazonaws.com \
>         --postgres-user concourse \
>         --postgres-password pw_here \
>         --postgres-database concourse \
>         --add-local-user user:password \
>         --main-team-local-user user \
>         --external-url http://url.com \
>         --peer-url http://172.21.8.99:8080
invalid argument for flag `--session-signing-key' (expected *flag.PrivateKey): asn1: structure error: tags don't match (16 vs {class:1 tag:15 length:112 isCompound:true}) {optional:false explicit:false application:false private:false defaultValue:<nil> tag:<nil> stringType:0 timeType:0 set:false omitEmpty:false} pkcs8 @2


ubuntu@ip-172-21-8-99:~$ ls -lh /etc/concourse/
total 32K
-rw-r--r-- 1 concourse concourse  393 Sep 21 18:20 authorized_worker_keys
-rw-r--r-- 1 concourse concourse 1.8K Sep 21 15:42 session_signing_key
-rw-r--r-- 1 concourse concourse  393 Sep 21 15:42 session_signing_key.pub
-rw-r--r-- 1 concourse concourse 1.8K Sep 21 15:42 tsa_host_key
-rw-r--r-- 1 concourse concourse  393 Sep 21 15:42 tsa_host_key.pub
-rw------- 1 concourse concourse 1.4K Sep 21 19:24 web_environment
-rw-r--r-- 1 concourse concourse 1.8K Sep 21 15:42 worker_key
-rw-r--r-- 1 concourse concourse  393 Sep 21 15:42 worker_key.pub

That key was created like this on my local workstation, uploaded to S3, and then pulled down before Concourse is run.

sudo ssh-keygen -t rsa -q -N '' -f /etc/concourse/tsa_host_key
sudo ssh-keygen -t rsa -q -N '' -f /etc/concourse/worker_key
sudo ssh-keygen -t rsa -q -N '' -f /etc/concourse/session_signing_key

Am I hitting this?

Edit: it was the above bug. Re-creating the keys on this box worked. I had initially created them on an Arch Linux box.
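
One possible explanation, offered as an assumption rather than a confirmed diagnosis: newer OpenSSH releases write private keys in their own container format by default (first line `-----BEGIN OPENSSH PRIVATE KEY-----`), while the error above comes from an ASN.1/PKCS#8 parser that expects a PEM key. A minimal sketch for checking which format a key file is in; `key_format` is a hypothetical helper name, not part of Concourse or OpenSSH:

```shell
# Hypothetical helper: report a private key's container format,
# judged only by the first line of the file.
key_format() {
    case "$(head -n 1 "$1")" in
        "-----BEGIN OPENSSH PRIVATE KEY-----") echo "openssh" ;;
        "-----BEGIN RSA PRIVATE KEY-----")     echo "pkcs1-pem" ;;
        "-----BEGIN PRIVATE KEY-----")         echo "pkcs8-pem" ;;
        *)                                     echo "unknown" ;;
    esac
}

# Example use against the directory from this thread (adjust the path):
# for k in /etc/concourse/*_key; do
#     printf '%s: %s\n' "$k" "$(key_format "$k")"
# done
```

If a key reports `openssh`, that would be consistent with the parse error; it does not prove it is the only cause.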


#4

Glad it worked! Could you please accept your own reply, so that people stumbling on this thread will know immediately what worked?

Since you are the creator of this thread, you will have a checkbox available:

solves


#5

That still doesn't really help. It's a showstopper at the moment. I need to be able to bring up masters on the fly, and this pretty much kills that, since when I copy those keys to a new install the error just begins again.


#6

Let’s see if I understand correctly.

When you say “I need to be able to bring up masters on the fly”, you mean “I need to be able to bring up Concourse ATCs (that is, the binary invoked with concourse web) on the fly”.

If the problem with the keys happens only when creating them on Arch Linux, why not create them on Ubuntu or anything else? You could even run Ubuntu as a Docker container on your Arch Linux machine. Would this work?


#7

I went back and created them from an Ubuntu instance, and it had the same problem when autoscaling out a second web node.


#8

I used Fedora, and this worked just fine about 4 months ago. So this might be a change in ssh-keygen, a change in how that Go lib parses the keys, or both.


#9

@jseiser this works for me:

for i in tsa_host_key session_signing_key worker_key; do
    # Generate the private key
    openssl genpkey -algorithm RSA -out "$i" -pkeyopt rsa_keygen_bits:2048
    # Extract the public key in an SSH-compatible format
    ssh-keygen -f "$i" -y > "$i.pub"
done
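
An alternative sketch that stays with ssh-keygen, assuming the root cause is the OpenSSH private key format rather than something else: the `-m PEM` option asks ssh-keygen to write the private key in the older PEM format. `gen_pem_key` is a hypothetical helper name, and whether Concourse 4.2.1 accepts the result has not been verified in this thread:

```shell
# Hypothetical helper: generate an unencrypted 2048-bit RSA key pair,
# asking ssh-keygen for the legacy PEM private key format.
# Writes the private key to "$1" and the public key to "$1.pub".
gen_pem_key() {
    ssh-keygen -t rsa -b 2048 -m PEM -q -N '' -f "$1"
}

# Example use, mirroring the key names in this thread:
# for k in tsa_host_key session_signing_key worker_key; do
#     gen_pem_key "$k"
# done
```

Afterwards, the first line of each private key should read `-----BEGIN RSA PRIVATE KEY-----` rather than `-----BEGIN OPENSSH PRIVATE KEY-----`.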

#10

@jseiser could you please change the title of this post to mention the problem with the RSA key directly? Thanks.