Handling GitHub pull requests in a monorepo


#1

Hello, everyone! I am trying to set up continuous integration for GitHub pull requests against a monorepo using Concourse and the github-pullrequest-resource.

Background

Let’s say there is a GitHub repository with the following structure:

/
  app-one/
  app-two/
  ci/

The two applications are decoupled from each other, apart from some Protobuf files in app-one which are also referenced in app-two. Also, the CI checks for app-one are very slow, while the CI checks for app-two are very fast. Both of these points are important and will be referenced later in this post.

I started off by creating a very simple pipeline that checks only app-one. Here is a sanitized version:

resource_types:
  - name: pull-request
    type: docker-image
    source:
      repository: jtarchie/pr

resources:
  # webhook_token is a resource-level field; connection settings belong under source.
  - name: ci-config
    type: git
    webhook_token: REPO_PUSHED_TOKEN
    source:
      uri: git@github.com:org/repo.git
      branch: master
      private_key: ((github-ssh-key))
      paths:
        - ci/*

  - name: pull-requests
    type: pull-request
    webhook_token: REPO_PUSHED_TOKEN
    source:
      repo: org/repo
      uri: git@github.com:org/repo.git
      private_key: ((github-ssh-key))
      access_token: ((github-access-token))
      base: master
      paths:
        - app-one/*

jobs:
  - name: tests
    plan:
      - aggregate:
          - get: ci-config
          - get: pull-requests
            trigger: true

      - put: pull-requests
        params:
          path: pull-requests
          status: pending

      - task: fetch-and-test
        file: ci-config/ci/tasks/fetch-and-test.yaml
        input_mapping:
          source: pull-requests
        on_failure:
          put: pull-requests
          params:
            path: pull-requests
            status: failure
        on_abort:
          put: pull-requests
          params:
            path: pull-requests
            status: error
        on_success:
          put: pull-requests
          params:
            path: pull-requests
            status: success

This pipeline works well, but I ran into difficulties once I tried to add checks for app-two.

Problem description

Initially, I added app-two/* to the paths key of the pull-requests resource and modified the tests job to run tests for both applications in an aggregate step, like so:

jobs:
  - name: tests
    plan:
      - aggregate:
          - get: ci-config
          - get: pull-requests
            trigger: true

      - put: pull-requests
        params:
          path: pull-requests
          status: pending

      - aggregate:
        - task: test-app-one
          file: ci-config/ci/tasks/app-one/fetch-and-test.yaml
          input_mapping:
            source: pull-requests
          on_failure:
            put: pull-requests
            params:
              path: pull-requests
              status: failure
          on_abort:
            put: pull-requests
            params:
              path: pull-requests
              status: error

        - task: test-app-two
          file: ci-config/ci/tasks/app-two/fetch-and-test.yaml
          input_mapping:
            source: pull-requests
          on_failure:
            put: pull-requests
            params:
              path: pull-requests
              status: failure
          on_abort:
            put: pull-requests
            params:
              path: pull-requests
              status: error

      - put: pull-requests
        params:
          path: pull-requests
          status: success

This new pipeline technically works as it should, but it’s incredibly inefficient in some cases. As mentioned earlier, the app-one CI checks are much slower than the app-two checks. If I create a pull request that only affects app-two, I will be forced to sit through the long app-one checks (which aren’t really necessary) to see the status update on GitHub.

Is there any way I can fix this? I was considering splitting the pull-requests resource and the tests job into separate pipelines, i.e. an app-one-pr resource with a test-app-one job, and an app-two-pr resource with a test-app-two job. But given how the put step of github-pullrequest-resource works, I don’t think that would be the best idea.
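To make the idea concrete, a rough sketch of that split might look like the following. The resource and job names are the ones mentioned above. Credentials other than the access token and all status put steps are left out for brevity, and app-one/proto/* is a hypothetical location for the shared Protobuf files, included only so that changes to them would also trigger the app-two checks:

```yaml
resources:
  # One pull-request resource per application, each filtered by path.
  - name: app-one-pr
    type: pull-request
    webhook_token: REPO_PUSHED_TOKEN
    source:
      repo: org/repo
      access_token: ((github-access-token))
      base: master
      paths:
        - app-one/*

  - name: app-two-pr
    type: pull-request
    webhook_token: REPO_PUSHED_TOKEN
    source:
      repo: org/repo
      access_token: ((github-access-token))
      base: master
      paths:
        - app-two/*
        # Hypothetical path: wherever the shared Protobuf files actually live.
        - app-one/proto/*

jobs:
  # Slow checks: triggered only by pull requests that touch app-one.
  - name: test-app-one
    plan:
      - aggregate:
          - get: ci-config
          - get: app-one-pr
            trigger: true
      - task: test-app-one
        file: ci-config/ci/tasks/app-one/fetch-and-test.yaml
        input_mapping:
          source: app-one-pr

  # Fast checks: triggered by app-two changes or by the shared Protobuf files.
  - name: test-app-two
    plan:
      - aggregate:
          - get: ci-config
          - get: app-two-pr
            trigger: true
      - task: test-app-two
        file: ci-config/ci/tasks/app-two/fetch-and-test.yaml
        input_mapping:
          source: app-two-pr
```

With this layout, a pull request that only touches app-two would never start the slow app-one job.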

What are your thoughts?


#2

That seems like it should be separate jobs. Have you tried it? I can’t think of a good reason the put would break.

FWIW https://github.com/concourse/concourse/issues/1707 (“Spaces”) will have a big impact on pull request pipelines. There are a few posts about it on our Medium.


#3

I had a few concerns about splitting them up into separate jobs, mainly that both jobs would put to the same GitHub pull request through the github-pullrequest-resource, so its status would be set twice. Which one would take precedence over the other?

Spaces look quite interesting! Thanks for the heads up. I’ll be following the progress very closely as it advances.
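
On the precedence question: GitHub tracks commit statuses per context, and within a single context the most recently posted status is the one shown. One possible way to keep the two jobs from overwriting each other, assuming the optional context param on the resource's put step behaves as its README describes, is to give each job its own context. The context values app-one and app-two below are illustrative, and the resource names are the ones proposed earlier in the thread:

```yaml
# Final step of the test-app-one job
- put: app-one-pr
  params:
    path: app-one-pr
    status: success
    context: app-one   # expected to appear on GitHub as concourse-ci/app-one

# Final step of the test-app-two job
- put: app-two-pr
  params:
    path: app-two-pr
    status: success
    context: app-two   # expected to appear on GitHub as concourse-ci/app-two
```

The pending, failure, and error puts in each job would need the same context value as that job's success put. One caveat: if branch protection marks both contexts as required, a pull request that touches only one application would never receive the other status and would stay blocked.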