We would like to start using review apps to deploy multiple staging environments for GitLab.com. The issue we foresee, and are already seeing in www-gitlab-com, is that currently the only way a review app gets cleaned up is by deleting the branch on which the MR is based.
Sadly, in www-gitlab-com we have seen that people do not care much about deleting those branches. This could be because people simply forget to click "remove branch on merge", but that is still not good enough for us: deploying a static website of a couple of megabytes is one thing; building a full-blown GitLab instance with storage and a database and leaving it running, forgotten, is quite another.
Proposal
We need to have a way of defining a strict lifecycle for review apps. We can build the deploy mechanism, but we will need to enforce cleanup to avoid bleeding resources like crazy.
A simple way would be to trigger the deploy of the review app manually (so people can work on the MR and only click deploy when they are confident enough; this could be covered by WIP, though) and then apply a defined lifecycle, for example dropping the review app 2 hours after the deploy finishes.
This way we would enforce resource reaping, which is critical for a project as large as GitLab-CE.
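A minimal sketch of the "drop the review app 2 hours after the deploy finishes" rule, just to make the lifecycle concrete. The names `REVIEW_APP_TTL`, `expires_at`, and `is_expired` are hypothetical and not part of GitLab; the 2-hour value is only the example figure from the proposal.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical policy value: the 2-hour example from the proposal.
REVIEW_APP_TTL = timedelta(hours=2)


def expires_at(deploy_finished_at, ttl=REVIEW_APP_TTL):
    """Return the moment at which a review app should be torn down."""
    return deploy_finished_at + ttl


def is_expired(deploy_finished_at, now, ttl=REVIEW_APP_TTL):
    """True once the TTL has fully elapsed since the deploy finished."""
    return now >= expires_at(deploy_finished_at, ttl)


if __name__ == "__main__":
    finished = datetime(2017, 1, 1, 12, 0, tzinfo=timezone.utc)
    print(expires_at(finished))
    print(is_expired(finished, datetime(2017, 1, 1, 13, 0, tzinfo=timezone.utc)))
```

A reaper would call `is_expired` for each running review app and stop the ones for which it returns true.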
Another thing to think about is that we need to prevent people from using review apps from the outside without our permission. Think of an abuser pushing multiple MRs to this project and consuming lots of resources: this could easily become an attack vector.
I don't think the manual deploy and drop after 2 hours is reasonable. It'll lose a lot of its value.
For internal usage, we should close them on merge and not just on branch deletion. I'm also fine with shutting them down if there is no activity or new commit in an MR for a certain amount of time (a day?).
Consider that a single unpackaged GitLab-CE takes 1 GB: it's only a matter of time until we have a lot of these lying around and have to keep attaching drives to store lots and lots of dead instances that no one is actually using.
Good point. We can figure out a way to remove old instances in general. It is also possible to use a base image with incremental changes and auto-sleep of instances, which should improve bin-packing of review apps.
Set project default that all MRs delete source branch on merge (#25076 (moved)).
Only create review apps when an MR is created, not just when a branch is created, and destroy them when the MR is closed, regardless of branch deletion.
Expire review apps after X days.
(Optional) Only create review apps after the WIP label is removed. (I'm torn on this since I often want to see the results during WIP.)
I think the most bang-for-buck would be the first item. The second item won't help that much, but is a pretty important step for the product. For .com, the third item might be the kicker.
In the meantime, I wonder if we should use a manual flow where people have to explicitly choose to create a review app for an MR, and perhaps have those environments self-destruct after X hours. That wouldn't be built into the product, just something we'd configure on our own jobs. GitLab's awareness of the review apps would then get out of sync, which would be annoying, unless the reaper also notified us via an API (that might not exist).
I wonder if we could make a scheduled job that would stop all review environments nightly. Again, not a product feature, just a job with some API access.
@ayufan Perhaps we can prioritize the APIs necessary to make this work, if there isn't enough already? Actually, we can already query environments for a project and find manual action jobs, right? We use that for ChatOps. Are all the components there to automatically stop all review app environments via a scheduled pipeline (or even just a regular cron job)?
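For illustration, the nightly reaper could look roughly like the sketch below. It assumes the environments API as documented for current GitLab versions (`GET /projects/:id/environments` to list and `POST /projects/:id/environments/:environment_id/stop` to stop), that each listed environment carries `id`, `name`, and `state` fields, and that review environments are named with a `review/` prefix. The helper names, the prefix, and the numeric `project_id` are assumptions; whether these endpoints existed when this was written would need checking.

```python
import json
import urllib.request


def review_environments(environments, prefix="review/"):
    """Keep environments that are still running and look like review apps.

    The "review/" prefix is an assumed naming convention.
    """
    return [
        env for env in environments
        if env.get("state") == "available"
        and env.get("name", "").startswith(prefix)
    ]


def stop_url(api_base, project_id, environment_id):
    """Build the URL of the environment stop endpoint."""
    return f"{api_base}/projects/{project_id}/environments/{environment_id}/stop"


def stop_all_review_environments(api_base, project_id, token):
    """List environments and stop every running review app.

    Only the first page (up to 100 entries) is fetched; a real job
    would have to follow pagination.
    """
    headers = {"PRIVATE-TOKEN": token}
    list_request = urllib.request.Request(
        f"{api_base}/projects/{project_id}/environments?per_page=100",
        headers=headers,
    )
    with urllib.request.urlopen(list_request) as response:
        environments = json.load(response)

    stopped = []
    for env in review_environments(environments):
        stop_request = urllib.request.Request(
            stop_url(api_base, project_id, env["id"]),
            headers=headers,
            method="POST",
        )
        with urllib.request.urlopen(stop_request):
            stopped.append(env["name"])
    return stopped
```

Run from a scheduled pipeline or a plain cron job, `stop_all_review_environments` would need a token with API access to the project. Since stopping goes through GitLab's own API, GitLab's view of the environments should stay in sync.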
Presuming items 1 and 2 help address the storage problem, maybe we treat item 3 as a "snooze" state? The review app can expire X days after the last commit, but then you get a big "start" button on the MR page to kick it off again as needed. It would then live another X days before stopping.
The only real issue would be potential loss of data you generated while initially testing. To solve that, maybe eventually we could even have a "suspend" state that cleans up everything but storage/DB/PVCs. If no "suspend" state exists, then expiry calls "stop".
@joshlambert I think that a review app is transient by design, therefore I'm not concerned about any form of data loss here. We can burn them with fire anytime.
The manual lifecycle also makes sense, but we do need the timeout. I can easily imagine people forgetting to clean up those environments, just as we have people not deleting merged branches right now.
@pcarranza I do agree we need some automated controls here. Especially with us metering build minutes, people won't want to waste money on a review app that no one is touching for days.
Maybe we have a default "Suspend" timer, where if no commits or discussions happen on an MR for X period, the app is suspended. On any subsequent commit it would restart, as it would if someone manually pushed the "Resume" button. The timer then starts again.
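A rough sketch of that suspend timer as a tiny state machine. Everything here is hypothetical (the `ReviewApp` class, the event names, the `tick` function); it only illustrates the rule that commits and discussions reset the idle timer, while a commit or a manual "Resume" brings a suspended app back.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class ReviewApp:
    last_activity_at: datetime
    suspended: bool = False


def record_event(app, kind, at):
    """Reset the idle timer; commits and manual resumes also wake the app.

    kind is one of "commit", "discussion", or "resume" (assumed names).
    """
    app.last_activity_at = at
    if kind in ("commit", "resume"):
        app.suspended = False


def tick(app, now, idle_limit):
    """Suspend the app once it has been idle for idle_limit or longer.

    Returns whether the app is suspended after the check.
    """
    if not app.suspended and now - app.last_activity_at >= idle_limit:
        app.suspended = True
    return app.suspended
```

A periodic job would call `tick` for every open MR's review app, and the MR page's "Resume" button would map to `record_event(app, "resume", now)`.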
I'm fine with that to start; it is certainly the boring solution. If we could persist some state at some point, though, that would be neat. I know I tend to do some config with GDK, for example, to get it into a state I need for easy testing.
@joshlambert Small thing, but review apps don't consume build minutes while they're running, just while they're building and deploying. You're responsible for bringing your own target infrastructure.
You don't really need to worry about resuming since review apps should generally be configured to just re-create themselves on next push. Sure, you lose whatever data was there in between, but that should be a non-issue. This isn't a production system; the data should be ephemeral. Any state you need for testing should be in your seeds or other setup config. But yeah, when testing a new feature, you might not have seeds for that feature yet (although, really, the MR should have seeds that exercise the MR by the time it's merged; it's just hard to require that at the beginning, but then, the review app won't be deleted while it's actively being developed).
@markpundsack Good point on GitLab.com build minutes. I keep seeing these various "creative" uses of CI/CD, but that is not the general case!
My thought process on offering a button to "Resume" was for broader team usage. For example, say a developer commits an initial cut of a feature on Thursday and requests initial feedback from the broader team. It could be a couple of days before QA, Product, UX, etc. all jump in and have a chance to review and pass feedback, especially if that window overlaps with a weekend.
I would think that this flow is not that uncommon, especially in companies with longer release cycles. Having a manual button to restart the environment without a commit seems handy to me; it should allow shorter expiries, making customer resource usage more efficient.
Maybe we just need scheduled jobs? Adding a very simple action that would allow us to run stop_review on demand, after a specific time. Wouldn't that be enough for us?
@ayufan Seems like we could auto-create the scheduled job in the background maybe? Ideally I'd think we could then have a button to restart it if the MR is still open, though that can come later.
GitLab is moving all development for both GitLab Community Edition
and Enterprise Edition into a single codebase. The current
gitlab-ce repository will become a read-only mirror, without any
proprietary code. All development is moved to the current
gitlab-ee repository, which we will rename to just gitlab in the
coming weeks. As part of this migration, issues will be moved to the
current gitlab-ee project.
If you have any questions about all of this, please ask them in our
dedicated FAQ issue.
Using "gitlab" and "gitlab-ce" would be confusing, so we decided to
rename gitlab-ce to gitlab-foss to make the purpose of this FOSS
repository more clear.
I created a merge request for CE, and this got closed. What do I
need to do?
Everything in the ee/ directory is proprietary. Everything else is
free and open source software. If your merge request does not change
anything in the ee/ directory, the process of contributing changes
is the same as when using the gitlab-ce repository.
Will you accept merge requests on the gitlab-ce/gitlab-foss project
after it has been renamed?
No. Merge requests submitted to this project will be closed automatically.
Will I still be able to view old issues and merge requests in
gitlab-ce/gitlab-foss?
Yes.
How will this affect users of GitLab CE using Omnibus?
No changes will be necessary, as the packages built remain the same.
How will this affect users of GitLab CE that build from source?
Once the project has been renamed, you will need to change your Git
remotes to use this new URL. GitLab will take care of redirecting Git
operations so there is no hard deadline, but we recommend doing this
as soon as the projects have been renamed.
Where can I see a timeline of the remaining steps?