Running workers on ephemeral volumes

Hello,

The Helm chart for Concourse suggests running concourse-worker as a StatefulSet with PVs, which you can map to a specific StorageClass. The StatefulSet choice seems to be mostly about stable hostnames and worker registration, so that intercept etc. can be done reliably and scheduling can be efficient with respect to resources. Given the stateless nature of the worker, are there good reasons why I should not use ephemeral volumes (apart from the obvious caching benefits)? There are a few benefits with respect to avoiding depletion of backing resources, and with upgrades of the Concourse version (re-mapping to old PVs sometimes causes problems like “cannot create volume” or “volume not found”).
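For reference, this is roughly the kind of values.yaml override I have in mind — a sketch only, assuming the chart exposes a persistence toggle and falls back to an emptyDir when it is off (the exact key names may differ between chart versions):

```yaml
# Hypothetical overrides for the Concourse Helm chart.
# Key names are assumptions and may vary by chart version.
persistence:
  enabled: false   # skip the PVC / StorageClass mapping entirely

worker:
  replicas: 3
  # With persistence off, the worker's work dir would fall back to an
  # emptyDir, so all build caches are lost whenever a pod is rescheduled.
```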


We came across something along these lines due to a bug with AWS / K8s persistent volumes.

After a while we noticed the helm chart configures the pods to actually `rm -rf` the content of the persistent volume on startup. (See here in the helm chart)

We ended up turning on hardAntiAffinity so each pod would start on a different k8s node, and then had dedicated k8s nodes with adequate storage, which allowed us to turn off PVs entirely.
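For anyone wanting to replicate this, the overrides looked roughly like the following — a sketch from memory, so the exact key paths are assumptions and may differ in newer chart releases:

```yaml
# Sketch of the overrides we used; key paths assumed from the chart
# version we were running at the time.
worker:
  hardAntiAffinity: true   # required-during-scheduling anti-affinity:
                           # no two worker pods land on the same node
persistence:
  enabled: false           # rely on the node's local disk instead of PVs
```

With one worker per node, each worker just uses the node's own disk, so sizing storage becomes a matter of picking the right node type.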

Doesn’t really solve the problem but the setup works well for our needs.

We still got “volume not found” errors after this, however.

To solve this we had to increase the “terminationGracePeriodSeconds” to give the worker adequate time to retire before k8s killed it.
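Concretely (and as far as I can tell from the chart), the worker pod retires itself on shutdown, and k8s only waits terminationGracePeriodSeconds before sending SIGKILL — so the value has to cover the longest retire you expect to see. Something along these lines:

```yaml
worker:
  # Give the worker time to retire (drain its in-flight containers and
  # deregister) before k8s force-kills the pod. The default was too
  # short for us; 3600 is an illustrative value, not a recommendation.
  terminationGracePeriodSeconds: 3600
```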