Git – Puts with Paths


#1

I have a pipeline set up to pull from a git repo, do some work, and then put to that repo. I have some other jobs in the pipeline that need to be triggered based on changes that have passed through the previous job, but they need to utilize the git resource “paths” option. For the git resource, there is the following note that discusses this:

Note that if you want to push commits that change these files via a put, the commit will still be “detected”, as check and put both introduce versions. To avoid this you should define a second resource that you use for commits that change files that you don’t want to feed back into your pipeline - think of one as read-only (with ignore_paths) and one as write-only (which shouldn’t need it).

I am having a hard time wrapping my head around this entirely. Basically, I have it set up now so that the first job gets a repo (e.g., git-repo-1) and then puts to that repo at the end of the job. I then have other jobs that read from the same git repo URL, but through an entirely different git resource (e.g., git-repo-with-specific-path). However, I cannot figure out how to utilize the “passed” flag when getting the repo in those jobs.


#2

I’m not sure I fully understand your problem but I’ll try to provide some information.

The paths option on the git resource only acts as a filter on what changes to the repo generate a new version in Concourse. For example, if you have the following resource:

- name: a-repo
  type: git
  source:
    uri: git@github.com:org/repo
    branch: master
    private_key: ((github_private_key))
    paths:
    - manifest.yml

Then pushing a commit to the repo changing manifest.yml would result in a new version being picked up by your pipeline. Pushing a commit that doesn’t modify manifest.yml would not be detected as a new version. In both cases, any time you get the resource you will get all the files in the repo at the SHA of the latest detected version. You don’t just get the file(s) listed in paths.

The part you’ve quoted is only relevant when you get and put to the same resource. With the resource defined above, suppose you have the following job in your pipeline:

- name: detect-versions
  serial_groups: [deployment]
  serial: true
  plan:
  - get: a-repo
    trigger: true
  - task: modify
    # modify other-file.yml and output to repo-out
  - put: a-repo
    params:
      repository: repo-out

In this case the put at the end will push a modified other-file.yml which will be picked up as a new version even though manifest.yml has not changed. This will re-trigger the job.

The solution here is to define one resource with no paths and another (with the same git repo URL) with paths. You use the latter to read from (get) and the former to write to (put). You would then pass the paths resource through all the jobs that need to trigger off of it.
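As a rough sketch of that split, assuming the same repo as in the first example: the names a-repo-read, a-repo-write and downstream, and the task file path, are made up for illustration and are not from anyone's actual pipeline.

```yaml
resources:
  # Read side: only commits that touch manifest.yml produce new versions.
  - name: a-repo-read            # hypothetical name
    type: git
    source:
      uri: git@github.com:org/repo
      branch: master
      private_key: ((github_private_key))
      paths:
        - manifest.yml

  # Write side: same repo, no paths filter. Only ever used in put steps.
  - name: a-repo-write           # hypothetical name
    type: git
    source:
      uri: git@github.com:org/repo
      branch: master
      private_key: ((github_private_key))

jobs:
  - name: detect-versions
    serial: true
    plan:
      - get: a-repo-read
        trigger: true
      - task: modify
        # Hypothetical task file; it is assumed to take a-repo-read as an
        # input and write the modified checkout to an output named repo-out.
        file: a-repo-read/ci/modify.yml
      - put: a-repo-write
        params:
          repository: repo-out

  - name: downstream             # hypothetical downstream job
    plan:
      - get: a-repo-read
        trigger: true
        passed: [detect-versions]
```

The put only creates versions on a-repo-write, and no job gets that resource, so the push should not re-trigger detect-versions unless the pushed commit also touches manifest.yml.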

It’s a bit hard to wrap your head around, and you can sometimes hit merge conflicts: if the file listed under paths doesn’t get updated frequently, that resource falls behind the branch you are pushing to. Hopefully this helps.


#3

@crsimmons: Sorry if I wasn’t clear, but I’ve struggled to figure out how to word it, so thanks for the additional information!

So to clarify, I am basically in a situation where I do have a get and a put on the same resource. I then have that resource feeding into another job. For that other job, I only want it to fire if particular files match (i.e., using the paths option for the git resource). I’ve figured out a way to make it work, but it’s not exactly how I want it.

I basically have the first job do a get and a put on the same git resource, similar to your second YAML example. My second job then does a get to the same repo using a different git resource that utilizes the paths parameter. The piece that is missing for me is that there’s no way to set the passed parameter for the repos in the second job, meaning they’ll respond to commits that haven’t already passed the previous job. This also makes the pipeline less intuitive in the UI, as the first and second job are no longer connected.
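To make the missing piece concrete, this is roughly what I would like to write in the second job. As far as I can tell it can’t work, because do-something only gets and puts a-repo and never touches a-repo-file-1, so no version of a-repo-file-1 has ever passed through that job:

```yaml
# Inside the second job's plan (a sketch of what I'd like, not a working config):
- get: a-repo-file-1
  trigger: true
  passed: [do-something]   # do-something never gets or puts a-repo-file-1
```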

Below is an example of the pipeline in the UI. What I really want to have is that a-repo-file-1 and a-repo-file-2 come out of the do-something job and feed into their respective jobs.

jobs:
  - name: do-something
    plan:
      - get: a-repo
        trigger: true
      - task: modify-repo-file
        config:
          inputs:  [a-repo]
          outputs: [repo-out]
      - put: a-repo
        params:
          repository: repo-out
  - name: something-1
    plan:
      - get: a-repo-file-1
        trigger: true
      - task: do-something
        config:
          inputs: [a-repo-file-1]
  - name: something-2
    plan:
      - get: a-repo-file-2
        trigger: true
      - task: something
        config:
          inputs: [a-repo-file-2]

resources:
  - name: a-repo
    type: git
    source:
      uri: git@github.com:jgoodhouse/a-repo.git
      branch: master
      private_key: ((private_key))
  - name: a-repo-file-1
    type: git
    source:
      uri: git@github.com:jgoodhouse/a-repo.git
      branch: master
      private_key: ((private_key))
      paths: [a-file-1.yaml]
  - name: a-repo-file-2
    type: git
    source:
      uri: git@github.com:jgoodhouse/a-repo.git
      branch: master
      private_key: ((private_key))
      paths: [a-file-2.yaml]

Does that make sense? Is that something that’s possible to do?