It's not possible (in most cases) to merge a binary file (a non-text file). This causes issue when you make changes to one, but someone pushes a change to the same file before you were able to push your changes. You end up with an unmergeable situation.
To prevent this, we propose to implement file locking. File locking allows you to prevent other people from modifying one of multiple files, while still allowing you to push changes. This prevents situations where files become unmergeable.
Use cases
Anyone using binary files in their repository can use this. For instance, game developers are likely to have many unmergeable files in their repositories like PSD's, images, 3d models, etc. They would greatly benefit from this feature.
As the use of LFS increases, people are more likely to have binary files in their repositories, increasing the odds someone will use this feature.
Implementation
If we add the project.path_locked?(path) check into Gitlab::GitAccess, it will automatically apply to web edits, git pushes and LFS pushes.
So this feature entails:
Adding model for locked files or directories
Adding "Lock"/"Unlock" button in UI. A locked file or directory can only be unlocked by the lock user or a project master
Disable edit, replace, delete, new file buttons with a tooltip as appropriate
Show lock icon by any locked file/dir with a tooltip on that lock icon with text like "Locked by Douwe Maan 5 minutes ago"
Add page to list all locked files/dirs to get overview in case people forget to unlock
Add path_locked? check en error message in Gitlab::GitAccess
@DouweM EE option. I'm not sure whether this should be an EE option. It is definitely interesting, but because it's so integrated I worry about the implications on the code and the UI. I also wonder about the business viability of this as an option
Deny push which affect this file unless pushed by person who locked.
If the goal is to prevent accidental changes, we should block pushes even by the person who locked it. Make them unlock first before allowing the change.
A locked file or directory can only be unlocked by the lock user or a project master
As a simplification, how about only allowing project masters to lock and unlock? Then you don't need to record who did the locking. I don't know how customers intend on using this feature, so I don't know if that's a palatable solution or not. But it would be simpler. :) And it might be easier for people to reason about.
Eventually, if this is a really useful feature, I could imagine people asking for a new role specifically for this, in between developer and master. But if it's lightly used, then requiring master might be sufficient.
If the goal is to prevent accidental changes, we should block pushes even by the person who locked it. Make them unlock first before allowing the change.
@markpundsack@sytses
This feature originated from the problem with LFS and binary files, where if two people edit a binary file, the files becomes unmergeable after the first person pushed. So you want to be able to lock a file to edit it safely, preventing your colleagues from editing the file before you manage to push. So not only do you want to be able to push to the file you locked, we might even want to specific who locked the file.
So developers and up should be able to lock any file and push to it, preventing everyone else from pushing.
I'd expect users to use this largely as @sytses said, as a way of checking things out and not necessarily as a way of preventing change. But the latter is something I would absolutely expect to see in some cases. I'd also say from experience that as soon as it is introduced users will lock a file and then disappear (for a beer, for a vacation, etc.). So, a well-defined way of allowing an administrator to easily unlock files is absolutely required.
@markpundsack I'm not sure of the infrastructure costs associated with keeping track of who created the lock is, but it would absolutely be useful to keep track of this. When overriding/removing a lock knowing who did what and why is valuable information.
In monolithic tools such as Perforce this sort of feature is regularly used by default on files with defined extensions and/or paths (.jpg, .pdf, .exe, .gz, etc.) One may consider it bad practice but as LFS makes storage of files far beyond the realm of source code and text files easier, some organisations are going to start putting everything in.
So, a well-defined way of allowing an administrator to easily unlock files is absolutely required.
...
I'm not sure of the infrastructure costs associated with keeping track of who created the lock is, but it would absolutely be useful to keep track of this.
I, too, like the proposal. I would change it slightly when it comes to "Show lock icon by any locked file/dir with a tooltip" to add an indication on the file without a tooltip. Having to hover over the file entry to see whether it is locked isn't user friendly if one is looking at a large list of files. I think a slightly different file icon, with a lock on/next to it, would be a better way to implement this.
@DouweM -- does my file icon with lock proposal (see previous message) seem okay? I'm not sure what the protocol is for suggesting changes to implementation proposals.
OK, now that I've thought about it a bit more, with a better understanding of the use cases, I'm not sure the proposal actually works. Let me know if I've missed something.
The git-lfs proposal gets to the heart of it. You need to make sure that you're on HEAD when you lock (which is hard for a web interface that has no idea what you've got locally) and that any subsequent pushes to the file in question are based off other revisions in a linear fashion.
Consider the following examples:
Sequential
__________________________ \________/ \_____/ A B
Encompassed
______________________________ \ \_____/ / \ B / \___________________/ A
Overlapping
__________________________ \ \ _/____/ \ / B \______/ A
In (1), there's no problem, assuming A and B both pull before they start to write changes because the changes are sequential and there is no lock contention.
(2) and (3) aren't problems IF A and B try to grab the lock before making any changes since B would be blocked from grabbing the lock while A has it. Yay, right!
The problem is that the files are read-write locally. Other VCSs would have your local version be read-only until you checked out the file, and at that time, they would block other users from checking out. But git doesn't work that way.
It's going to be really easy to forget to grab the lock before you start modifying it. So back to case (2), if you only try to grab the lock at checkin time, B will grab the lock, commit the file, then release the lock. Then some time later, A finally goes to commit the long-running changes, grabs the lock, commits (overwriting previous changes) and then releases the lock.
How does the proposal above protect against this? Or do we consider it not worth protecting against this?
An alternate proposal might be a bit more complicated. For example, you could check every push on the file to see if it is derived from master/HEAD and block anything that isn't. I think you need a protected? attribute in addition to knowing if it is currently locked?. That way you do the HEAD check for every push, not just when it's currently locked. It's precisely when it's not locked that the merge conflicts come in. At this point, locked? is really just a courtesy to other people; the real work is done by enforcing linear changes for every push.
As a side note, if locking is available via the API and you use something like git lab lock foo, the CLI could make sure foo is at master/HEAD before proceeding, which might help reduce surprises. This is likely something best solved in git itself. :)
How does the proposal above protect against this? Or do we consider it not worth protecting against this?
@markpundsack I think we should keep it simple as a first iteration and only lock the remote file, not protecting against diverging remotes.
Situation 1 would be covered.
Situation 2 (encompassed), would act as you describe (A overwrite B). This is a typical danger with long-diverging remotes and one that you should take into account anyway. Eventually we should think about warning the user.
Situation 3 is similar. I think for the first iteration we can assume that before work starts, you update your local copy, therefore checking the lock.
I think this is the simplest issue we can solve right now: preventing conflicts by blocking users from pushes that make changes to the file that is locked. It's relatively easy to implement and gives us plenty of space to improve. As a means to alleviate the issues you mentioned, we can think about reporting to the user whether certain files are locked on fetching and making it obvious who locked what and when.
CLI for this would be excellent, but if it's git itself we don't have the engineering capacity at the moment.
@JobV OK, if we're not worried about the robustness of the lock, we can tell people to lock the file before they start making any changes, and sync to HEAD before making any changes. It'll be almost purely a policy thing rather than enforcement. So if users don't lock before starting changes, they likely will run into problems. But if everyone follows the policy, then the lock will help prevent accidental problems.
We can consider releasing most of the code of this feature in CE and allow additional management as an EE option (override lock by owner/master/admin).
OK, how is it related to current feature?! File Lock feature relies on the same functionality as git hooks. In other words until git hooks is not fixed the file lock feature will not work when you change file using UI.
@JobV Why is this being implemented as an option vs. as a feature? I think it's correct that this belongs in EE -- but I'm curious why we think this shouldn't be part of the base EE product.
@deilering we'll use Lab CLI for that, planned for 8.10 :)
@xyzzy we think this is something that is not necessarily interesting for all teams and offers substantial value for larger teams working with media. If we're wrong, we can always merge it back in EE or even CE, but the other direction is not possible; therefore we start with it as EE option.
We might consider one day offering the basic feature in CE and offer advanced access control as EE option.