Description including problem, use cases, benefits, and/or goals
When pipeline runs are queueing, it's often a waste to run tests on every incremental change on the same branch. On the gitlab-ce repo we push often and end up in long build queues. Let's introduce a feature that says: run the build only if the commit is the HEAD of the branch.
Proposal
There are lots of ways to do this. Some options:
When adding a pipeline to the queue, check if there are other pipelines already queued for the same ref, and if so, delete the old ones (or don't queue the new one) such that only one run will be queued at a time.
When adding a pipeline to the queue, not only delete other queued entries for the same ref, but kill any running pipelines too. This is more aggressive, but runs the systemic risk of never finishing a pipeline.
When pulling builds off of the queue, abort any for a SHA that is non-HEAD. This leaves builds in the queue and puts the cleanup at pull-time rather than push-time.
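As a rough illustration of the first and third options, here is a minimal Python sketch. The names (QueuedPipeline, enqueue_latest_only, pull_head_only) are hypothetical and not taken from the GitLab codebase; the queue is modelled as a plain in-memory deque, which is an assumption made only to keep the example small.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class QueuedPipeline:
    """Hypothetical queue entry: one pipeline waiting to run."""
    ref: str  # branch or tag name
    sha: str  # commit the pipeline was created for


def enqueue_latest_only(queue, new):
    """Push-time cleanup: drop queued entries for the same ref, then append."""
    kept = deque(p for p in queue if p.ref != new.ref)
    kept.append(new)
    return kept


def pull_head_only(queue, heads):
    """Pull-time cleanup: discard entries whose SHA is no longer the ref's HEAD.

    `heads` maps each ref to its current HEAD SHA (assumed to be known).
    Returns the first runnable entry, or None if the queue runs dry.
    """
    while queue:
        candidate = queue.popleft()
        if heads.get(candidate.ref) == candidate.sha:
            return candidate
    return None
```

The push-time variant keeps the queue short at all times; the pull-time variant leaves stale entries in place and only pays the cleanup cost when a runner asks for work.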
We already have the awesome when keyword.
We can extend it with a new value: if_latest.
It will make the pending build run only if the commit is the HEAD of the branch. If it's not, the build will be marked as canceled (or skipped, which would require introducing an additional status).
Example:
job:
  script: echo test
  when: if_latest
  only:
    - branches
Current implementation
We keep it simple. Whenever we receive a push and create a pipeline, we also check whether there are other pipelines from the same branch that are created or pending and non-HEAD, and cancel all of them.
This is controlled by a new project setting: "Auto-cancel redundant, pending pipelines". Pipelines cancelled this way have a tooltip showing that they were auto-cancelled.
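The behaviour described above can be sketched roughly as follows. This is an illustrative Python model, not the actual GitLab implementation (which lives in the Rails application); the Pipeline class, the auto_canceled flag, and the enabled parameter standing in for the project setting are all hypothetical names.

```python
from dataclasses import dataclass

# Only pipelines that have not started yet are touched; running ones keep running.
CANCELABLE_STATUSES = {"created", "pending"}


@dataclass
class Pipeline:
    """Hypothetical stand-in for a pipeline record."""
    ref: str
    sha: str
    status: str
    auto_canceled: bool = False  # assumed flag that would drive the tooltip


def auto_cancel_redundant(pipelines, ref, head_sha, enabled=True):
    """On a push to `ref`, cancel created/pending pipelines that are not on HEAD.

    `enabled` models the "Auto-cancel redundant, pending pipelines" setting.
    Returns the list of pipelines that were cancelled.
    """
    if not enabled:
        return []
    canceled = []
    for pipeline in pipelines:
        same_ref = pipeline.ref == ref
        non_head = pipeline.sha != head_sha
        if same_ref and non_head and pipeline.status in CANCELABLE_STATUSES:
            pipeline.status = "canceled"
            pipeline.auto_canceled = True
            canceled.append(pipeline)
    return canceled
```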
I think it may be a nice feature to make it possible to run all builds that have been triggered on master (even if not on HEAD anymore), while building only HEAD-attached builds in other branches (probably the best approach is to make this behavior configurable). What do you think?
Maybe also worth considering: If some builds were skipped in favour of a later commit and this build fails, it may be beneficial to also build commits in between to find the latest "buildable" commit.
Personally, I'm not sure I like declaring this at the job level. You generally don't want some jobs running and others skipped. We should apply it to the entire pipeline.
To further expound on the point by @markpundsack this should surely not be defined using 'when', since the other uses of that field would still be necessary in conjunction with this one.
i.e. building HEAD only doesn't preclude the need to run a job 'on_failure' or 'always' for example.
Also an elegant approach I've seen to this elsewhere is to allow a filter, where any pushes passed by the filter will cause the build to be cancelled.
For example:
cancel_on:
  only:
    - master
Would then cancel this build on any successive pushes to master.
This would also provide the necessary block (cancel_on) to add 'except' conditions to, such as not cancelling if '[ci skip]' were in the commit message, as requested by @DouweM.
I imagine the machinery for filtering pushes appropriately for this is partially in place, in order to implement the existing 'only' and 'ci skip' support.
Hi there. I'm willing to work on this but I'm not that familiar with the gitlab-ce code. Could you give me a few pointers on the source to get started? Also, #21967 (closed) may be a duplicate of this one.
If there will be an option for cancelling the running build, some of these issues should be resolved so that there is no rubbish left on the server: #15578 (moved) and #21940 (moved).
I just want to say that I would like GitLab to be careful about canceling a build.
Our CI builds stuff inside a Docker runner using some shared cache to speed things up; even with this it can take half an hour to go through a pipeline. If a build is running and is being canceled, the cache will be corrupted and needs to be completely deleted on the runner.
If a pipeline is not running but is in the queue, I think it's fine if it gets removed from the queue and never run. If a build is in progress I would prefer to at least wait until it is done to cancel the pipeline.
Any chance we'd be able to consider https://gitlab.com/gitlab-org/gitlab-ce/issues/25697 with this feature? (proposal to extend the when to include the ability to run scripts) I feel like they're fairly related and may be easier to implement together.
Skipping non-HEAD pipelines would be the simplest first iteration for us, but I hope we eventually go more in the direction of auto-canceling all redundant builds, including canceling currently running pipelines once a new commit is pushed to the same branch.
I know that some people will want an option to not kill deploy builds that have already started because it can leave an environment in an inconsistent state. Perhaps we can be smart about detecting jobs with an environment specified, and don't auto-cancel them, but auto-cancel every other type of job.
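A minimal sketch of that idea, assuming each job carries an optional environment name. The Job class and jobs_to_auto_cancel function are hypothetical and only illustrate the selection rule, not how GitLab models jobs.

```python
from dataclasses import dataclass
from typing import Optional

# Statuses in which a job could still be cancelled (assumed for this sketch).
ACTIVE_STATUSES = {"created", "pending", "running"}


@dataclass
class Job:
    """Hypothetical job record."""
    name: str
    status: str
    environment: Optional[str] = None  # set for deploy-style jobs


def jobs_to_auto_cancel(jobs):
    """Select active jobs that do not target an environment.

    Jobs with an environment are left alone so that a deploy is never
    interrupted mid-way.
    """
    return [
        job for job in jobs
        if job.status in ACTIVE_STATUSES and job.environment is None
    ]
```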
In any case, I feel we should have an option. For example, at times I do want the old jobs to finish: the whole test suite takes a long time to run, and I really want to see the result of my 50% done tests. I know that my new changes are not going to affect most of the results, so they should be the same, and I want to see them sooner rather than later.
If there's no option for this, then I might hesitate to push if I want to know the current result.
Here's a joke, we could detect if the commit message contains [keep last ci]. This is certainly stupid but something like that would be great.
I don't think I'm in favor of having a per-commit or UI option. I was thinking more of a project-level setting or a .gitlab-ci.yml setting. Ideally I wouldn't have any option though, we'd just do the "right thing".
Yeah, I could see that. The most important thing I feel is that we can't just change it and give no options to go back to original behaviour. Being smart and doing it right is certainly the best, but given this feature, I think it really depends on workflow so I don't feel there would be a one-size-fits-all solution.
Of course I am not saying we should have a per-commit option, that's probably not worth it at all. Even [skip ci] would cause some issues right now.
The current implementation is that whenever there's a new push to a branch or tag, it would try to cancel all non-HEAD created or pending pipelines for the given ref. Running pipelines would keep running.
If we can't decide what would work best right now, I propose we implement it in the least harmful/incompatible way, because choosing an option might be a tough decision, and having something working would also give us some ideas for taking a step further.
I propose:
Ignore options for now
Ignore protected branches for now
Cancel only created or pending pipelines. We don't lose any progress here anyway.
@godfat I don't think we should cancel any pipeline without an explicit request issued by a user stating that this is the expected behavior. For me personally .gitlab-ci.yml would work better than setting this in Project -> Settings.
A new push can be a [ci skip] docs-only commit, in which case we probably would want the commit before that to be built.
Yes, we should handle that.
Maybe also worth considering: If some builds were skipped in favour of a later commit and this build fails, it may be beneficial to also build commits in between to find the latest "buildable" commit.
I'm going to argue against that. If the current state of the branch is broken, then it should be sufficient to indicate the status as broken. You should be able to retry the auto-canceled build if you really need to debug when it broke, but hopefully your test output is sufficient to tell you what you need to do.
Ignore options for now
@godfat Which options are you referring to here? I think having a single project-level setting to enable/disable the feature would make it easier to introduce this with lower risk.
Ignore protected branches for now
@godfat Can you elaborate on your concern for protected branches? Just "in case"? I'm not aware of any reasons protected branches should be exempt here, especially since we're not talking about canceling a running pipeline (that may contain a deploy which you don't want to abort mid-way).
@grzesiek I don't like the .gitlab-ci.yml syntax. Since this is defining how two different pipeline runs interact, it's unclear which version of .gitlab-ci.yml controls the behavior. Do you define when a pipeline can be canceled, or do you define when a pipeline can cancel another pipeline? Or from another lens, is this option necessary to build, test, and deploy the app? Is it something that needs to be in version control so that pipeline runs of old code have old behavior (for reproducibility), and forks of projects have the same behavior? Is it something that needs to differ for individual jobs within a pipeline? If yes to any of those, then it should be in .gitlab-ci.yml.
I'd argue this is a meta choice that feels more natural as a project setting.
@dimitrieh Can you mockup what this would look like in CI/CD Pipeline settings?
@godfat Which options are you referring to here? I think having a single project-level setting to enable/disable the feature would make it easier to introduce this with lower risk.
@markpundsack I was referring to the option to disable/enable this behaviour, because it seems we don't yet have a conclusion for how we want to do that, and I feel this would be harmless anyway, therefore I propose to just do it without an option. But as long as we have a good conclusion, of course having options would be better as I mentioned at https://gitlab.com/gitlab-org/gitlab-ce/issues/8998#note_20833551
@godfat Can you elaborate on your concern for protected branches? Just "in case"? I'm not aware of any reasons protected branches should be exempt here, especially since we're not talking about canceling a running pipeline (that may contain a deploy which you don't want to abort mid-way).
I was thinking exactly of deployments and environments and the like. I was just trying to simplify the implementation and reduce the risk of having no options. If we're going to add options, then I think there's no point in ignoring protected branches. So please disregard :P
I don't like the .gitlab-ci.yml syntax. Since this is defining how two different pipeline runs interact, it's unclear which version of .gitlab-ci.yml controls the behavior.
Good point! So perhaps this setting should be similar to protected branches, which could have glob like stable-*.
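As a sketch of what glob-style matching could look like, the snippet below uses Python's standard fnmatch module. The function name auto_cancel_applies and the idea that the setting is a list of patterns are assumptions for illustration. Note also that fnmatch understands ? and [...] in addition to *, which may be broader than what the protected-branches wildcard accepts.

```python
from fnmatch import fnmatchcase


def auto_cancel_applies(ref, patterns):
    """Return True if `ref` matches any of the configured branch patterns.

    `patterns` is a hypothetical project setting, e.g. ["stable-*"].
    fnmatchcase is used so matching stays case-sensitive on every platform.
    """
    return any(fnmatchcase(ref, pattern) for pattern in patterns)
```

Whether a match should enable or exempt auto-cancellation for that branch is a separate product decision; the matching logic is the same either way.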
I don't like the .gitlab-ci.yml syntax. Since this is defining how two different pipeline runs interact, it's unclear which version of .gitlab-ci.yml controls the behavior.
It would be that each .gitlab-ci.yml decides whether its own pipeline can be cancelled. So every time we want to cancel a particular pipeline automatically, we'll need to ask that pipeline. For example, if a later update removed that option, then the pipeline for that commit should not be cancelled automatically.
This means we might need to add a new column for pipelines (or the pipeline would need to examine the code in .gitlab-ci.yml).
@grzesiek I think we need to be thoughtful about which configurations belong in .gitlab-ci.yml vs project settings. We can't just blindly move all settings into .gitlab-ci.yml.
@godfat Yeah, but that feels like bad UX. I shouldn't have to even think about what that means. A project-level setting is easier to understand. It just applies to all pipelines.
@dimitrieh Not sure I like the "quota strategy" part. While shared-runner quota may be a factor in deciding to enable or not, there are other reasons too, especially when considering project-specific runners and on-prem installations with few, if any, shared runners.
I think the feature should be about auto-canceling redundant pipelines. To start, we'll only cancel unstarted pipelines, but eventually, I'd like to have an option to cancel pipelines that have actually started too (being mindful of letting deploy jobs finish). I don't think people should have to parse what "non-HEAD" means.
Can we make it an enable/disable rather than a choice between options?
@innerwhisper Yeah, that works for me. The wording might be a bit awkward for people. I'm not sure "non-HEAD" is as meaningful until after you've thought about it for a while.
Something like "Auto-cancel redundant, pending pipelines" and "New pipelines will cancel older, pending pipelines on the same branch".
Kamil Trzciński changed title from Make builds run only if they are on HEAD of the branch being build to Cancel potentially redundant pipelines automatically
@axil I'll update the description to add a new "Current implementation" section. I am saying this to make sure that we're not stepping on each other's toes. I'll let you know when it's updated. /cc @ayufan
@ayufan @godfat This feature only cancels non-HEAD pipelines? What happens if there are multiple redundant pipelines on the HEAD (same sha)? That can happen when pipelines are triggered by the API and not by a push.
@alexispires We don't cancel them automatically because it's hard to detect if they're really identical pipelines given that they could have different variables. We'll need more discussion to move that forward. This is the safest approach right now.
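To illustrate why the SHA alone is not enough, here is a small sketch in which two pipelines only count as identical when their ref, SHA, and variables all match. This is not something the current implementation does; TriggeredPipeline, fingerprint, and is_redundant are hypothetical names, and variables are assumed to be plain string key/value pairs.

```python
from dataclasses import dataclass, field


@dataclass
class TriggeredPipeline:
    """Hypothetical pipeline created by a push or an API trigger."""
    ref: str
    sha: str
    variables: dict = field(default_factory=dict)


def fingerprint(pipeline):
    """Build a comparable key from ref, SHA, and the sorted variables."""
    return (pipeline.ref, pipeline.sha, tuple(sorted(pipeline.variables.items())))


def is_redundant(older, newer):
    """Treat `older` as redundant only if it is identical in every input."""
    return fingerprint(older) == fingerprint(newer)
```

Two triggers on the same commit with different variables produce different fingerprints, so neither would be cancelled under this rule.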
I have an issue similar to what @alexispires mentioned. I have a trigger that builds a pipeline. The trigger, however, is fired twice due to the current policies on GitLab (on change of build instead of success). Thus, the pipeline is triggered twice.
However, the cancellation of redundant pipelines is not working, even after checking the option of canceling redundant pipelines.