Using Concourse as a job scheduler


#1

Hi, first post. We use Concourse as our CI tool of choice, but recently there has been some discussion around using it as a standard enterprise-style job scheduler. Do you think it is a good fit? One concern I have is inter-pipeline dependencies.

Cheers

Dave


#2

We’ve always joked that Concourse is “cloud make”… so it’s definitely possible. Some teams at Pivotal use Concourse for a lot of automated tasks. I guess the answer is ultimately “it depends”.


#3

Haha, good answer. But what do you think? Can you have inter-pipeline dependencies and other scheduler-type stuff? We are looking to use it to handle the running of all our batch ETL work and all the interdependencies between these bits. I’m not sure it’s the right tool for the job. Great for CI, but… It would be nice to have a personal perspective from someone with deep Concourse knowledge, as we are newbies. Cheers


#4

I haven’t done what you are asking, but here are my thoughts based on my Concourse experience:

  • it depends on the complexity of the logic. If you have branching (if/then/else), Concourse may not be well suited to your use case.
  • inter-pipeline dependencies are possible but somewhat implicit. By this I mean that pipeline A can put something on an S3 bucket and pipeline B can observe that bucket and be triggered by changes (see the first sketch after this list). You can do it, but you cannot see it in the Concourse UI. Whether this is important or not depends on your users.
  • be careful about error handling. Currently a pipeline has a mechanism equivalent to a try/catch/finally block, see https://concourse-ci.org/jobs.html#job-ensure (and the second sketch after this list). But if the worker on which the ensure task is scheduled fails mid-way (I saw this happen, for example, due to SSM throttling, which gave life to the Pool Boy, see [ANN] Concourse Pool Boy (detect and release stale pool resource locks)), then the whole pipeline will fail without recovery. There is ongoing work to tackle this issue, see https://github.com/concourse/concourse/issues/2581. So if your job MUST run to completion, think twice about using the current Concourse. If, on the other hand, the job is written to tolerate failure, as any well-written batch job should be, then the next run of the pipeline will fix the (rare) failure scenario I mentioned.
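Here is a minimal sketch of the S3 bridging idea. All names (bucket, file patterns, credential variables) are illustrative, and the task bodies are trivial placeholders for your real ETL work:

```yaml
# pipeline-a.yml — the producer side. Bucket, regexp and credential
# variable names are all hypothetical.
resources:
- name: handoff
  type: s3
  source:
    bucket: my-handoff-bucket
    regexp: results/output-(.*).json
    access_key_id: ((aws_access_key_id))
    secret_access_key: ((aws_secret_access_key))

jobs:
- name: produce
  plan:
  - task: run-etl          # stand-in for the real ETL work
    config:
      platform: linux
      image_resource:
        type: registry-image
        source: {repository: busybox}
      outputs: [{name: out}]
      run: {path: sh, args: ["-c", "date +%s > out/output-$(date +%s).json"]}
  - put: handoff           # uploading a new object is the "signal"
    params: {file: out/output-*.json}
```

```yaml
# pipeline-b.yml — the consumer side watches the same bucket.
resources:
- name: handoff
  type: s3
  source:
    bucket: my-handoff-bucket
    regexp: results/output-(.*).json
    access_key_id: ((aws_access_key_id))
    secret_access_key: ((aws_secret_access_key))

jobs:
- name: consume
  plan:
  - get: handoff
    trigger: true          # fires whenever pipeline A puts a new version
  - task: process
    config:
      platform: linux
      image_resource:
        type: registry-image
        source: {repository: busybox}
      inputs: [{name: handoff}]
      run: {path: sh, args: ["-c", "cat handoff/output-*.json"]}
```

Note that each pipeline only sees its own half of the dependency, which is exactly the “implicit” aspect mentioned above.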
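And a minimal sketch of the job-level ensure mechanism, with an illustrative cleanup task. It behaves like a “finally”: the ensure step runs whether the plan succeeds or fails (though, as noted above, not if the worker running it dies mid-way):

```yaml
jobs:
- name: batch-job
  plan:
  - task: do-work
    config: &trivial-task
      platform: linux
      image_resource:
        type: registry-image
        source: {repository: busybox}
      run: {path: echo, args: [working]}
  ensure:
    task: release-lock     # e.g. hand back a pool resource lock
    config: *trivial-task
```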

To summarize: why not try a proof of concept? It will also help you clarify in your mind how to use Concourse primitives to obtain this “job scheduler” behavior.


#5

Thanks for the detailed response. Much appreciated. We are doing some POC work, but we are up against it time-wise, so we are just trying to short-circuit some bits to get early answers. Your advice has been very helpful, so thanks for taking the time.


#6

As long as you trust your Concourse’s availability, it is definitely suited to running things when things change.

It has retries, conditional execution, composability, visibility, … In terms of flow control, you should be covered.
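Retries, for instance, are a one-line step modifier. A minimal sketch, where the failing command is just a stand-in that fails roughly half the time:

```yaml
jobs:
- name: flaky-job
  plan:
  - task: sometimes-fails
    attempts: 3            # re-run this step up to 3 times before failing it
    config:
      platform: linux
      image_resource:
        type: registry-image
        source: {repository: busybox}
      run: {path: sh, args: ["-c", "test $(($(date +%s) % 2)) -eq 0"]}
```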

It’s very easy to have a job run after another only if it passed. If they’re in the same pipeline it’s trivial; if they’re in different pipelines, you just bridge them with a resource that is an output for one and an input to the other, typically carrying some metadata about the job results.
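The in-pipeline case is just the `passed:` constraint on a get step. A minimal sketch, using the built-in time resource as a stand-in for a real trigger:

```yaml
resources:
- name: nightly
  type: time
  source: {interval: 24h}

jobs:
- name: job-a
  plan:
  - get: nightly
    trigger: true
  - task: first-step
    config: &trivial-task
      platform: linux
      image_resource:
        type: registry-image
        source: {repository: busybox}
      run: {path: echo, args: [ok]}

- name: job-b
  plan:
  - get: nightly
    trigger: true
    passed: [job-a]        # only versions that made it through job-a
  - task: second-step
    config: *trivial-task
```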

Jobs themselves can be arbitrarily complex in terms of orchestrating the getting of data, processing it, and putting it somewhere else: you can compose parallel, serial, try, on_success, on_failure and all that as you wish (mostly). Your jobs can run on specific workers, since steps and workers are matched on tags that you can apply to both, so you can isolate the throughput-oriented from the latency-oriented.
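A sketch of that kind of composition, with illustrative resource, tag and task names, and deliberately trivial task bodies:

```yaml
resources:
- name: nightly
  type: time
  source: {interval: 24h}

jobs:
- name: etl
  plan:
  - in_parallel:           # run these two steps concurrently
    - get: nightly
      trigger: true
    - task: fetch-reference-data
      config: &trivial-task
        platform: linux
        image_resource:
          type: registry-image
          source: {repository: busybox}
        run: {path: echo, args: [ok]}
  - try:                   # if this step fails, the job carries on regardless
      task: optional-stats
      config: *trivial-task
  - task: transform
    tags: [batch-workers]  # only schedule on workers tagged batch-workers
    attempts: 2            # and retry it once on failure
    config: *trivial-task
    on_failure:            # runs only if transform ultimately fails
      task: notify
      config: *trivial-task
```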

Granted, it’s not Mesos, YARN or Airflow, but it sure can run your scheduled jobs, and perhaps avoid the added complexity of bringing another tool to the table just for them.