Investigate possibility of supporting LFS content stored in object storage
Description
We should consider moving LFS content off of local/NFS storage into object storage, for three primary reasons:
- Object storage scales linearly, whereas NFS and local storage are more difficult to scale.
- Object storage is cheaper than NFS or local storage.
- Object storage is easy to make geographically redundant.
For example, consider the cost for GitLab.com, and our competitors' pricing:
- BitBucket charges $0.10/GB and does not appear to meter bandwidth ($5 for 50 GB of storage).
- GitHub charges $0.10/GB of combined bandwidth and storage ($5 for 50 GB of storage and bandwidth).
Considering AWS as an example (and assuming the GitLab server is hosted in AWS), there are three main storage types:
- S3: $0.023/GB
- EBS: $0.025/GB (Cold HDD) to $0.10/GB (SSD)
- EFS: $0.30/GB
When comparing EBS Cold HDD to S3, be aware of the difference in durability, as well as the fact that S3 is reachable from any GitLab worker node. (There are also cheaper S3 storage classes such as S3-IA.)
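To put the per-GB prices above in perspective, here is a back-of-the-envelope monthly storage cost for a hypothetical 10 TB of LFS data (request and egress charges are ignored, so real bills would be higher):

```python
# Rough monthly cost of storing 10 TB (10,240 GB) of LFS data at the
# per-GB prices quoted above; request and bandwidth charges are ignored.
PRICES_PER_GB = {
    "S3": 0.023,
    "EBS Cold HDD": 0.025,
    "EBS SSD": 0.10,
    "EFS": 0.30,
}

def monthly_cost(gb, price_per_gb):
    """Storage-only monthly cost in dollars."""
    return gb * price_per_gb

for name, price in PRICES_PER_GB.items():
    print(f"{name}: ${monthly_cost(10240, price):,.2f}/month")
```

At this scale the gap is large: S3 comes in around $235/month versus roughly $3,000/month for EFS.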
Proposal
When setting up GitLab, we could have a server-level configuration option that takes S3 credential information (or that of another object storage provider, although many are S3-compatible). Once connectivity is verified, we could use that account for storing LFS data. This could also reuse the same settings as the CI artifacts counterpart, with checkboxes to store CI artifacts, LFS objects, or both in that account.
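For Omnibus installs, the configuration could look roughly like the following gitlab.rb sketch. The option names here are hypothetical; the actual keys would be decided during implementation, and an S3-compatible provider would reuse the same connection block:

```ruby
# Hypothetical gitlab.rb settings for LFS object storage (names are
# illustrative, not final). Credentials would be verified on reconfigure.
gitlab_rails['lfs_object_store_enabled'] = true
gitlab_rails['lfs_object_store_remote_directory'] = "lfs-objects"
gitlab_rails['lfs_object_store_connection'] = {
  'provider' => 'AWS',
  'region' => 'us-east-1',
  'aws_access_key_id' => '<AWS_ACCESS_KEY_ID>',
  'aws_secret_access_key' => '<AWS_SECRET_ACCESS_KEY>'
}
```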
One challenge is user authentication. Git LFS primarily uses HTTP Basic Auth, which S3, for example, doesn't support. One option is to have GitLab proxy the data on the backend. If GitLab is hosted in the same cloud provider as the object storage, this additional hop of bandwidth is typically free.
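The proxying idea can be sketched as follows: GitLab terminates the Basic Auth that the Git LFS client sends, and only then fetches the object from the store on the user's behalf. This is a minimal illustration with a hypothetical in-memory user table standing in for GitLab's real auth layer:

```python
import base64

# Hypothetical credential store; GitLab's real auth layer goes here.
USERS = {"alice": "s3cret"}

def parse_basic_auth(header):
    """Return (user, password) from an HTTP Basic Authorization header, or None."""
    if not header or not header.startswith("Basic "):
        return None
    try:
        decoded = base64.b64decode(header[len("Basic "):]).decode("utf-8")
        user, _, password = decoded.partition(":")
        return (user, password)
    except (ValueError, UnicodeDecodeError):
        return None

def authorize_lfs_download(auth_header):
    """Decide whether to proxy an LFS object from object storage.

    Git LFS speaks HTTP Basic Auth, which S3 does not understand, so the
    GitLab side must terminate auth and fetch the object itself (or hand
    the client a pre-signed URL).
    """
    creds = parse_basic_auth(auth_header)
    if creds is None:
        return (401, None)
    user, password = creds
    if USERS.get(user) != password:
        return (403, None)
    # A real implementation would now stream the object from the
    # configured bucket, or redirect to a short-lived pre-signed URL.
    return (200, "proxy-from-object-store")
```

A pre-signed-URL redirect avoids proxying the bytes through GitLab entirely, at the cost of exposing the object storage endpoint to clients.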
We should also consider migration, as this may be a nice feature for larger organizations who have been testing on CE or EES and need to migrate data. A productized solution would also help GitLab.com migrate. For example, we could have a flag indicating where a particular object lives (local vs. object storage), which we update as the migration proceeds. Once complete, the local admin can handle the remaining local files as needed (delete, archive, etc.).
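The per-object flag could work roughly as sketched below, with an in-memory record list standing in for what would really be a column on the LFS objects table, and a stand-in copy function in place of the actual upload:

```python
# Sketch: each LFS object record carries a storage flag that a background
# migration flips from "local" to "object" as files are copied over.
LOCAL, OBJECT = "local", "object"

def migrate(records, copy_to_object_store):
    """Copy each still-local object to object storage, flipping its flag on success."""
    for rec in records:
        if rec["store"] == LOCAL:
            copy_to_object_store(rec["oid"])
            rec["store"] = OBJECT
    return records

# Usage: a list's append stands in for the real S3 upload, so we can see
# that only objects still marked "local" get copied.
records = [
    {"oid": "abc123", "store": LOCAL},
    {"oid": "def456", "store": OBJECT},
]
copied = []
migrate(records, copied.append)
```

Because reads consult the flag, LFS downloads keep working from whichever location holds the object while the migration is in flight.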