How to find out what slows down my task?

I have a task and I’d like to find out what slows it down. I actually have a suspect, but the log output does not help to confirm it. The actual work in the task takes 2m30s to run, but the overall task needs much more time, and the overall job takes 1h46m.
What I have found out so far is that it took 1h30m from the last log entry of the task execution until it was finally marked with a green tick. Meanwhile there was a cp process running in the container.

My suspicion is that I have used caches in my task, and they are quite big (30 GB). It’s the build directory of an embedded Linux, which does cross compiling. Without the cache the task needs 2h to complete on its own; with the cache it’s really fast.

Unfortunately I did not find much documentation about the details of the caches. Do they really need to be copied, or are they just mounted? What else could that long-running copy command be? (Hint: the last copy command of my script copies an embedded Linux image, which is approximately 250 MB in size.)

Here are my build scripts for reference:

build-for-segin.yml:

platform: linux

image_resource:
  type: docker-image
  source: 
    repository: docker-reg:5000/yocto-build-environment
    insecure_registries: [ "docker-reg:5000" ]

inputs:
- name: twms-yocto-base
- name: twms-yocto-base-prepared

caches:
- path: work/build/cache
- path: work/build/downloads
- path: work/build/sstate-cache
- path: work/build/tmp

outputs:
- name: twms-segin

run:
  dir: twms-yocto-base
  path: ci/tasks/build-for-segin.sh

build-for-segin.sh:

#!/usr/bin/env bash
set -e
set -u
set -x
mkdir -p ../work
chown ci:ci ../work
cd ../work
chown ci:ci ./build || :
chown ci:ci ./build/* || :
rm -rf ./build/tmp/deploy/images
sudo -u ci tar -x --no-same-owner -f ../twms-yocto-base-prepared/twms-yocto-base-prepared.tar
# call the script in the root directory
sudo -u ci ./build-for-segin.sh
cp -r build/tmp/deploy/images ../twms-segin/

Interesting problem!

Shot in the dark: what happens if you avoid the chown and sudo bits? I was thinking about this because the original ticket that introduced the task cache (https://github.com/concourse/concourse/issues/230) talks about copy-on-write semantics (I might be far off, since I didn’t look at the code!).

Another thing that might help is checking whether you really need the 30 GB of cache, or whether there is a sort of progressive “leaking” of the cache size.

From a diagnostics point of view, in case you are not aware, I found the following (example invocations are shown after the list):

  • you can clear a task cache, see section “fly clear-task-cache” at https://concourse-ci.org/managing-jobs.html
  • fly volumes will show task cache volumes as task-cache and show the name of the task the cache is for.
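
For reference, hedged examples of both commands (the fly target and the pipeline/job/step names are made up for illustration):

# list all volumes on the workers; task caches show up with type task-cache
fly -t main volumes

# clear a single task cache for a given job and step
fly -t main clear-task-cache -j my-pipeline/build-for-segin -s build-for-segin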

Hi marco-m,

Thank you for your response!

  • I had also hoped that it was copy-on-write. If that were the case, there would be no bulk copy action at the end.
  • The copy of 250 MB (the last command) should definitely not take ~1h30m. There was nothing else running on the machine, especially no other disk IO, and plenty of RAM was free, so no IO for swap either.
  • I cleared the task cache before the previous run. That build collected those 30 GB, and the cache should have kept them for the current run. And it was kept; it’s just that after the task there was this long-running cp command. So the cache size came from a single run, fetching tons of repositories with sources for building a Yocto Linux for an embedded device.
  • Without the chown and sudo bits, everything runs as root inside the container, which would be okay for me. However, some of the configure scripts used in the Yocto build process refuse to run as root “for security reasons” - they were not developed with Docker in mind - probably some of them were developed before Docker was in anyone’s mind…
  • Luckily, except for the path of the cached directory itself, the cache seems to keep the ownership of the cached files, so the chown is not executed recursively and does not touch all the files.

ATM I am doing another run. It’s the following command which consumes that much time:

cp -a /worker-state/volumes/live/bf598b42-7de6-4c27-4f25-028ddf4f22ef/volume/. /worker-state/volumes/init/9844566e-c77b-447b-5831-eb1e73833e7f/volume

Can anyone tell from this whether it belongs to a cache operation, or tell me how to find out what this copy is for? It’s maxing out disk IO at 95%.
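
For what it’s worth, here is how I have been trying to map that path back to a Concourse volume - a guess, based on the assumption that the directory names under volumes/live/ are the volume handles that fly volumes reports (the target name main is just a placeholder):

# look up the source volume of the copy by its handle
fly -t main volumes | grep bf598b42
# if the type column shows task-cache, the copy is reading from one of the task caches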

Was an answer ever found to this query? We are using the builder-task (https://github.com/concourse/builder-task) to build Docker images, making use of a cache path to preserve the img state. We’ve slightly modified the resource to prevent copying to and from the /scratch volume, so the cache is now shared between concurrent worker containers and no copy is required there. However, we are also noticing a delay between the img build task completing and Concourse moving on to the next task. There is a long-running cp process which needs to complete before Concourse moves on. Is Concourse doing something with that cache volume? Is it not simply mounted? Why is this copy happening?

That’s the builder task syncing state to and from the cache. It’s pretty slow and clunky.

You could try this task instead: https://github.com/issho-ni/concourse-buildkit - it doesn’t have to copy the cache around, so it should be faster. I’m thinking about revising the builder task to take the same approach.

Hi vito - thanks for the reply

Yes, I’ve actually worked around the state-to-cache syncing by reworking the build script from builder-task in a custom image. The script I’m using:

#!/bin/bash

set -e -u

function sanitize_cgroups() {
  mkdir -p /sys/fs/cgroup
  mountpoint -q /sys/fs/cgroup || \
    mount -t tmpfs -o uid=0,gid=0,mode=0755 cgroup /sys/fs/cgroup

  mount -o remount,rw none /sys/fs/cgroup

  sed -e 1d /proc/cgroups | while read sys hierarchy num enabled; do
    if [ "$enabled" != "1" ]; then
      # subsystem disabled; skip
      continue
    fi

    grouping="$(cat /proc/self/cgroup | cut -d: -f2 | grep "\\<$sys\\>")" || true
    if [ -z "$grouping" ]; then
      # subsystem not mounted anywhere; mount it on its own
      grouping="$sys"
    fi

    mountpoint="/sys/fs/cgroup/$grouping"

    mkdir -p "$mountpoint"

    # clear out existing mount to make sure new one is read-write
    if mountpoint -q "$mountpoint"; then
      umount "$mountpoint"
    fi

    mount -n -t cgroup -o "$grouping" cgroup "$mountpoint"

    if [ "$grouping" != "$sys" ]; then
      if [ -L "/sys/fs/cgroup/$sys" ]; then
        rm "/sys/fs/cgroup/$sys"
      fi

      ln -s "$mountpoint" "/sys/fs/cgroup/$sys"
    fi
  done

  if ! test -e /sys/fs/cgroup/systemd ; then
    mkdir /sys/fs/cgroup/systemd
    mount -t cgroup -o none,name=systemd none /sys/fs/cgroup/systemd
  fi
}

sanitize_cgroups

stty columns 80

ran_progress="false"

function progress() {
  if [ "$ran_progress" = "true" ]; then
    echo ""
  fi

  ran_progress="true"

  echo $'\e[1m'"$@"$'\e[0m'
}

TAG=${TAG:-latest}
TAG_FILE=${TAG_FILE:-}
CONTEXT=${CONTEXT:-.}
DOCKERFILE=${DOCKERFILE:-$CONTEXT/Dockerfile}
TARGET=${TARGET:-}
TARGET_FILE=${TARGET_FILE:-}
BUILD_ARGS_OPT=$(env | awk '/BUILD_ARG_/ {gsub(/BUILD_ARG_/, "--build-arg "); printf "%s ",$0}')
BUILD_ARGS_FILE=${BUILD_ARGS_FILE:-}
IMAGE_NAME=$(echo $REPOSITORY | sed 's/.*\///')


tag_name=""
if [ -n "$TAG_FILE" ]; then
  if [ ! -f "$TAG_FILE" ]; then
    echo "tag file '$TAG_FILE' does not exist"
    exit 1
  fi
  tag_name="$(cat $TAG_FILE)"
else
  tag_name="$TAG"
fi


target_arg=""
if [ -n "$TARGET_FILE" ]; then
  if [ ! -f "$TARGET_FILE" ]; then
    echo "target file '$TARGET_FILE' does not exist"
    exit 1
  fi
  target_arg="--target $(cat $TARGET_FILE)"
elif [ -n "$TARGET" ]; then
  target_arg="--target $TARGET"
fi


build_args=""
if [ -n "$BUILD_ARGS_FILE" ]; then
  if [ ! -f "$BUILD_ARGS_FILE" ]; then
    echo "build_args file '$BUILD_ARGS_FILE' does not exist"
    exit 1
  fi

  expanded_build_args=()
  while IFS= read -r line || [ -n "$line" ];
  do
    if [[ ! -z "$line" ]]; then
      expanded_build_args+=("--build-arg")
      expanded_build_args+=("${line}")
    fi
  done < $BUILD_ARGS_FILE
  build_args="${expanded_build_args[@]}"
else
  build_args="$BUILD_ARGS_OPT"
fi

progress "building and pushing"
# symlink the state - https://github.com/concourse/builder-task/issues/1
mkdir -p /root/.local/share
ln -s $(pwd)/cache/ /root/.local/share/img

img build --no-console -t $REPOSITORY:$tag_name -f $DOCKERFILE $target_arg $build_args $CONTEXT
img push --insecure-registry $REPOSITORY:$tag_name

# remove symlink
unlink /root/.local/share/img

Note the workaround described at https://github.com/concourse/builder-task/issues/1 for allowing img to use the Concourse cache directly. However, there is still a copy phase at the end of the task, like this: cp -a /worker-state/volumes/live/bf598b42-7de6-4c27-4f25-028ddf4f22ef/volume/. /worker-state/volumes/init/9844566e-c77b-447b-5831-eb1e73833e7f/volume, which I’ve observed via htop on the worker. The time it takes to execute seems to depend on the size of the Docker image that was just built.
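
In case it helps anyone else reproduce this, a hedged sketch of how I watch for that copy on the worker host (it assumes shell access to the worker and that pidstat from the sysstat package is installed; the pattern just matches the cp shown above):

# find the post-task copy (-o picks the oldest match) and report its disk IO every 5 seconds
pidstat -d -p "$(pgrep -o -f 'cp -a /worker-state/volumes')" 5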

I’m going to try the resource you’ve suggested anyway and see if the symptoms are the same - will post back here.

I should probably also note that I’m doing docker image builds in parallel, like this:

 - in_parallel:
    - task: docker-build-1
      privileged: true
      image: builder
      config:
        platform: linux
        params:
          REPOSITORY: local-registry:5000/repo/docker-1
          TAG: latest
        inputs:
        - name: repo
        caches:
        - path: cache
        run: {path: build}
    - task: docker-build-2
      privileged: true
      image: builder
      config:
        platform: linux
        params:
          REPOSITORY: local-registry:5000/repo/docker-2
          TAG: latest
        inputs:
        - name: repo
        caches:
        - path: cache
        run: {path: build}

This setup works; I get cache hits from img on both builds. However, as mentioned, the time savings from using the cache are almost completely lost when this long-running cp happens afterwards.

Any ideas?

Ahh, maybe it’s Concourse turning the ‘copy on write’ cache volume back into a flat volume - that would explain the long cp from a live volume into an init volume that you both observed. The current cache implementation works this way because otherwise caches would continuously accumulate more and more copy-on-write layers, and we didn’t want to hit an upper bound.

Maybe one alternative approach we could take is to just keep layering until we hit some preconfigured number of cache hits, after which we stop using the cache? This could be a nice way of keeping the caches from accumulating cruft forever. :thinking:

Does that imply that you would have to explicitly clear the cache in order to start using it again? If so, that sounds like a reasonable approach for this use case, since we could schedule the clearing of the cache and execute it with fly during quiet times. Come to think of it, we could schedule clearing the cache and then kick off the job, which would warm it again.
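
Something along these lines, perhaps (a hedged sketch; the fly target and the images/build-images pipeline/job names are placeholders, docker-build-1 is the step name from the snippet above, and it assumes re-triggering the job is enough to warm the cache again):

# during quiet hours: clear the task cache, then re-run the job to warm it
fly -t main clear-task-cache -j images/build-images -s docker-build-1
fly -t main trigger-job -j images/build-images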

Although, is it conceivable to have the flattening happen outside of the job’s normal flow? If Concourse can tell that a cache won’t be used again in the job, it could kick off the flattening process but let it happen outside of the job, so that subsequent tasks don’t have to wait for it. Actually, I suppose this could happen at any arbitrary time, kind of like a ‘cache optimisation’ background task that Concourse performs from time to time on all pipeline caches. If a job happened to be kicked off during an optimisation affecting one of its caches, it would have to wait. I’m not sure how easy this would be to implement, as I’m not too clued up on whether Concourse already does anything in the background.