Added LFS support to SSH
What does this MR do?
Adds the much requested LFS support to SSH. It also allows DeployKeys to download LFS files without having to do anything extra!
HTTP access is still required, since that is what is used to retrieve the LFS data, but authentication happens with the user's SSH Key, so there is no need to input any credentials.
These merge request depends on GitLab Shell being updated with gitlab-shell!86 (merged) .
What are the relevant issue numbers?
Screenshots (if relevant)


Merge request reports
Activity
cc @DouweM
mentioned in merge request gitlab-shell!86 (merged)
Added 1 commit:
- c26d0420 - Added LFS support to SSH
Added 1 commit:
- 97290649 - Added LFS support to SSH
Reassigned to @dbalexandre
@dbalexandre pretty please?
Added 1 commit:
- 12d6bd8f - Added CHANGELOG item and documentation.
So it looks like this MR adds an LFS token, something resembling a personal access token, to each user and deploy key. This LFS token never expires and cannot be revoked. Do I read that correctly @patricio ?
Can the holder of this token also do regular Git push and pull over HTTP? That is maybe not something to be concerned about but we should think about it.
I am not sure if having tokens that never expire and cannot be revoked is a good idea. Because there is no human being who has to remember tokens we can use ones that expire quickly without causing inconvenience. Some possible mechanisms would be Redis (token = redis key with ttl) or JWT (token = jwt claim with expiry time: be careful to verify whether the token has expired).
Also, nice job finding a simple solution that works so quickly @patricio !
@jacobvosmaer-gitlab you are correct in your assumptions.
I agree with you that we could use a token that expires quickly instead of a permanent token, specially since this token would also allow for regular Git over HTTP actions. How would using a Redis key as a token work, though? I have never done this before, so I would really appreciate your help :)
@jacobvosmaer-gitlab regarding Kerberos, how would it affect LFS? If the special port is used, would
project.http_url_to_repo
return the URL with the special port? If that is the case, then would LFS also work on that port? I think it will just workTM, since the authentication headers are populated from the initial request, but without a working Kerberos setup I cannot test it.@patricio testing with Kerberos is not hard. You need a Linux server that runs your modified version of GitLab (can be a local VM, DO, whatever) and you can use your Macbook as the client. There is documentation at https://gitlab.com/gitlab-cookbooks/kerberos-test-server/blob/master/README.md ; ask me on Slack if you need help with that.
Redis: maybe better if you ask me on Slack, I don't know which part needs explaining and I don't want to write a long story here. :)
@jacobvosmaer-gitlab I will deploy a test environment this week to see how my changes behave with Kerberos. I'll focus first on getting it working with a Redis key.
- Resolved by Patricio Cano
- Resolved by Patricio Cano
- Resolved by Patricio Cano
- Resolved by Patricio Cano
- Resolved by Patricio Cano
- Resolved by Patricio Cano
- Resolved by Patricio Cano
Reassigned to @patricio
Added 42 commits:
-
12d6bd8f...0693fee2 - 39 commits from branch
master
- 0fb96f83 - Added LFS support to SSH
- 6a8c335c - Added CHANGELOG item and documentation.
- aa744f73 - Refactor LFS token logic to use a Redis key instead of a DB field, making it a 1 use only token.
Toggle commit list-
12d6bd8f...0693fee2 - 39 commits from branch
Added 1 commit:
- bea8b983 - Refactor LFS token logic to use a Redis key instead of a DB field, making it a 1 use only token.
@jacobvosmaer-gitlab I've been looking at the logs, at it looks like LFS will execute the
git-lfs-authenticate
command every time it needs to perform an action against the LFS server, which would mean that an expiry time of 3600 seconds for the Redis key is really high, specially since it gets regenerated every time the/discover
endpoint is accessed.In the example I'm posting below, I'm downloading 2 files via LFS. The request flow seems to be as follows:
- Run
git-lfs-authenticate
via SSH, receive authentication header. - POST to
/info/lfs/objects/batch
to get information about the objects - Receive the first object
- GET
/gitlab-lfs/objects/:oid
to validate object - Repeat for the second object.
So by the time it tries to fetch the second object, the Auth header is regenerated with a new key.
What do you think, should I reduce the expiry time or leave it as is?
14:39:02 sshd.1 | Connection from 127.0.0.1 port 54935 on 127.0.0.1 port 2222 14:39:02 sshd.1 | Failed none for pato from 127.0.0.1 port 54935 ssh2 14:39:02 sshd.1 | Postponed publickey for pato from 127.0.0.1 port 54935 ssh2 14:39:02 sshd.1 | Accepted publickey for pato from 127.0.0.1 port 54935 ssh2: RSA SHA256:mnHR9nCnGOoS69Qu/qAsIPxJZO6DYoPp+Hwed0QRWXs 14:39:02 sshd.1 | BSM audit: bsm_audit_session_setup: setaudit_addr failed: Operation not permitted 14:39:02 sshd.1 | Starting session: forced-command (key-option) '/Users/pato/RubyMineProjects/gdk/gitlab-shell/bin/gitlab-shell key-19' for pato from 127.0.0.1 port 54935 id 0 14:39:02 sshd.1 | Close session: user pato from 127.0.0.1 port 54935 id 0 14:39:02 sshd.1 | Received disconnect from 127.0.0.1 port 54935:11: disconnected by user 14:39:02 sshd.1 | Disconnected from 127.0.0.1 port 54935 14:39:02 sshd.1 | Connection from 127.0.0.1 port 54938 on 127.0.0.1 port 2222 14:39:02 sshd.1 | Failed none for pato from 127.0.0.1 port 54938 ssh2 14:39:02 sshd.1 | Postponed publickey for pato from 127.0.0.1 port 54938 ssh2 14:39:02 sshd.1 | Accepted publickey for pato from 127.0.0.1 port 54938 ssh2: RSA SHA256:mnHR9nCnGOoS69Qu/qAsIPxJZO6DYoPp+Hwed0QRWXs 14:39:02 sshd.1 | BSM audit: bsm_audit_session_setup: setaudit_addr failed: Operation not permitted 14:39:02 sshd.1 | Starting session: forced-command (key-option) '/Users/pato/RubyMineProjects/gdk/gitlab-shell/bin/gitlab-shell key-19' for pato from 127.0.0.1 port 54938 id 0 14:39:03 sshd.1 | Received disconnect from 127.0.0.1 port 54938:11: disconnected by user 14:39:03 sshd.1 | Disconnected from 127.0.0.1 port 54938 14:39:03 gitlab-workhorse.1 | localhost:3000 127.0.0.1:54939 - - [2016-08-29 14:39:03.285717592 -0500 CDT] "POST /root/test-lfs-bug.git/info/lfs/objects/batch HTTP/1.1" 200 379 "" "git-lfs/1.4.0 (GitHub; darwin amd64; go 1.7)" 0.329941 14:39:03 gitlab-workhorse.1 | 2016/08/29 14:39:03 Send file "/Users/pato/RubyMineProjects/gdk/gitlab/shared/lfs-objects/fd/e8/911461fe322d34a813a20ff806476a6bcd4c0cddb8af5b109ffa3eb919e4" for GET "/root/test-lfs-bug.git/gitlab-lfs/objects/fde8911461fe322d34a813a20ff806476a6bcd4c0cddb8af5b109ffa3eb919e4" 14:39:05 gitlab-workhorse.1 | localhost:3000 127.0.0.1:54939 - - [2016-08-29 14:39:03.616589146 -0500 CDT] "GET /root/test-lfs-bug.git/gitlab-lfs/objects/fde8911461fe322d34a813a20ff806476a6bcd4c0cddb8af5b109ffa3eb919e4 HTTP/1.1" 200 433489920 "" "git-lfs/1.4.0 (GitHub; darwin amd64; go 1.7)" 2.347567 14:39:06 sshd.1 | Connection from 127.0.0.1 port 54942 on 127.0.0.1 port 2222 14:39:06 sshd.1 | Failed none for pato from 127.0.0.1 port 54942 ssh2 14:39:06 sshd.1 | Postponed publickey for pato from 127.0.0.1 port 54942 ssh2 14:39:06 sshd.1 | Accepted publickey for pato from 127.0.0.1 port 54942 ssh2: RSA SHA256:mnHR9nCnGOoS69Qu/qAsIPxJZO6DYoPp+Hwed0QRWXs 14:39:06 sshd.1 | BSM audit: bsm_audit_session_setup: setaudit_addr failed: Operation not permitted 14:39:06 sshd.1 | Starting session: forced-command (key-option) '/Users/pato/RubyMineProjects/gdk/gitlab-shell/bin/gitlab-shell key-19' for pato from 127.0.0.1 port 54942 id 0 14:39:07 sshd.1 | Received disconnect from 127.0.0.1 port 54942:11: disconnected by user 14:39:07 sshd.1 | Disconnected from 127.0.0.1 port 54942 14:39:07 gitlab-workhorse.1 | localhost:3000 127.0.0.1:54943 - - [2016-08-29 14:39:07.420328882 -0500 CDT] "POST /root/test-lfs-bug.git/info/lfs/objects/batch HTTP/1.1" 200 380 "" "git-lfs/1.4.0 (GitHub; darwin amd64; go 1.7)" 0.241582 14:39:07 gitlab-workhorse.1 | 2016/08/29 14:39:07 Send file "/Users/pato/RubyMineProjects/gdk/gitlab/shared/lfs-objects/45/f7/63efd3e09c68e3b6128bd3dbdefbde1360d446cba8d2e06345eeff407321" for GET "/root/test-lfs-bug.git/gitlab-lfs/objects/45f763efd3e09c68e3b6128bd3dbdefbde1360d446cba8d2e06345eeff407321" 14:39:12 gitlab-workhorse.1 | localhost:3000 127.0.0.1:54943 - - [2016-08-29 14:39:07.662789667 -0500 CDT] "GET /root/test-lfs-bug.git/gitlab-lfs/objects/45f763efd3e09c68e3b6128bd3dbdefbde1360d446cba8d2e06345eeff407321 HTTP/1.1" 200 1146093568 "" "git-lfs/1.4.0 (GitHub; darwin amd64; go 1.7)" 4.373223
- Run
Reassigned to @dbalexandre
@patricio regardless of key expiry, it occurred to after we talked yesterday that it would be better to have a separate API endpoint for git-lfs-authenticate. Right now we squeeze the two pieces of information it needs (user/pass and repo URL) into /allowed and /discover. I think it would be nicer to have /lfs-authenticate for that.
I know that some LFS interactions generate a lot of requests. Depending on the Git version,
git clone
can be bad because if fetches each LFS object one at a time. But have you tried what happens when you add 100 files,git lfs track
them, and then push? Does that hit lfs-authenticate 100 times?@jacobvosmaer-gitlab that is an interesting point. I'm going to try that today. Now, for the reason you want to have
git-lfs-authenticate
have it's own API enpoint, is it just to keep the code clean, or do you anticipate it requiring less API calls?If the
git-lfs-command
gets triggered once for every file, then all API endpoints will be called as well./discover
will still be called to get the key info,/allowed
would still be called to get project info and if we add/lfs-authenticate
as well, it would also be triggered.I'll report in a few what happens when uploading hundreds of files.
- Resolved by Patricio Cano
- Resolved by Patricio Cano
- Resolved by Patricio Cano
@patricio It looks good! I had just few minor comments.
Reassigned to @patricio
@jacobvosmaer-gitlab I tried uploading 117 image files and it only required 2
git-lfs-authenticate
calls. The logs are in this snippet, as they are quite long, but I did see a lot of 401s at first and then a lot of 200s.git clone
generates 117 differentgit-lfs-authenticate
calls, whilegit lfs fetch
is smart and authenticates once and fetches the files in bulk.@patricio that sounds about right. It is a limitation of the way LFS is integrated into Git: during
git clone
it runs a 'smudge filter' (I think) for each file, and each of those makes a separate LFS call.Re separate API: this would also have the advantage of not creating (mostly unused) LFS tokens for every new gitlab-shell connection. I understand that the end result would be more API calls from gitlab-shell (3 instead of 2 as it is now). I think my main motivation is clearer code organization and not creating tons of 99% unused LFS tokens in Redis.
@jacobvosmaer-gitlab you are right. I hadn't consider that we are creating tokens left and right every time the
/discover
endpoint is touched. It makes perfect sense to add an extra API endpoint, in that case. I also like the idea of keeping related code better organized. I'm going to add it right nowAdded 1 commit:
- 93d1ec11 - Refactored LFS auth logic when using SSH to use its own API endpoint `/lfs_authe…
@dbalexandre thank you for giving this a review, and sorry for all the changes happening as you were reviewing. I think this should be ready for a final miniboss review. Can you have another look?
Reassigned to @dbalexandre
- Resolved by Patricio Cano
- Resolved by Patricio Cano
- Resolved by Patricio Cano
@patricio It looks good! I had few remarks.
Reassigned to @patricio
Added 1 commit:
-
a209faf3 - Refactored handling of the
LfsToken
and added functionality to it to simplify external code.
-
a209faf3 - Refactored handling of the
Added 222 commits:
-
a209faf3...9ba9e098 - 217 commits from branch
master
- c7b16637 - Added LFS support to SSH
- ff5c6272 - Added CHANGELOG item and documentation.
- fdfdd8f6 - Refactor LFS token logic to use a Redis key instead of a DB field, making it a 1 use only token.
- aeaf9c51 - Refactored LFS auth logic when using SSH to use its own API endpoint `/lfs_authe…
-
6394894e - Refactored handling of the
LfsToken
and added functionality to it to simplify external code.
Toggle commit list-
a209faf3...9ba9e098 - 217 commits from branch
- Resolved by Patricio Cano
- Resolved by Patricio Cano
Added 1 commit:
- 3a103b0a - Improve string handling.
- Resolved by Patricio Cano
- Resolved by Patricio Cano
Added 38 commits:
-
3a103b0a...08ab410f - 32 commits from branch
master
- a127c464 - Added LFS support to SSH
- 5443d9b4 - Added CHANGELOG item and documentation.
- b0a7ca60 - Refactor LFS token logic to use a Redis key instead of a DB field, making it a 1 use only token.
- 10184e05 - Refactored LFS auth logic when using SSH to use its own API endpoint `/lfs_authe…
-
8881abd2 - Refactored handling of the
LfsToken
and added functionality to it to simplify external code. - c72cda71 - Improve string handling.
Toggle commit list-
3a103b0a...08ab410f - 32 commits from branch
@jacobvosmaer-gitlab thank you for reviewing the code as well. I'm going to deploy these changes with EE to a test server to test the Kerberos interaction. Once that is done, I'll assign the MR to an endboss for final review.
Added 170 commits:
-
c72cda71...9f7349f6 - 164 commits from branch
master
- 0dd1c163 - Added LFS support to SSH
- 60b84094 - Added CHANGELOG item and documentation.
- a63cc652 - Refactor LFS token logic to use a Redis key instead of a DB field, making it a 1 use only token.
- 16d2cb13 - Refactored LFS auth logic when using SSH to use its own API endpoint `/lfs_authe…
-
331e3dfb - Refactored handling of the
LfsToken
and added functionality to it to simplify external code. - ba6fdb9c - Improve string handling.
Toggle commit list-
c72cda71...9f7349f6 - 164 commits from branch
@jacobvosmaer-gitlab the assumptions we had during our call today were correct. Even with the Kerberos Alternate Port enabled, the
project.http_url_to_repo
returns the correct URL, which means LFS objects can be pushed via HTTP using the credentials retrieved via SSH withgit-lfs-authenticate
.I also made sure that the following code does not affect the authentication flow:
def allow_basic_auth? if Gitlab.config.kerberos.enabled && Gitlab.config.kerberos.use_dedicated_port !request_uses_kerberos_dedicated_port? else true end end
In our case we do have Kerberos enabled and it has a special port, so we enter the first case of the
if
clause. But since the HTTP request is made to the regular HTTP port and not the Kerberos HTTP port,!request_uses_kerberos_dedicated_port?
will return true, allowing us to continue with the request.As we saw today, LFS is broken when cloning and pushing via Kerberos, so I created an issue for it: gitlab-org/gitlab-ee#968
I think this is ready for final review.
Reassigned to @DouweM
@DouweM can you help me with an endboss review?
WIP until https://gitlab.com/gitlab-org/gitlab-shell/merge_requests/86 is mergedLooks like it's the other way around!
Edited by Douwe Maan- Resolved by Patricio Cano
- Resolved by Patricio Cano
- Resolved by Patricio Cano
- Resolved by Patricio Cano
- Resolved by Patricio Cano
- Resolved by Patricio Cano
- Resolved by Patricio Cano
- Resolved by Patricio Cano
25 25 def lfs_download_access? 26 26 return false unless project.lfs_enabled? 27 27 28 project.public? || ci? || (user && user.can?(:download_code, project)) 28 return true if project.public? || ci? || lfs_deploy_key? I agree that would be nice but it may be too much for this MR.
One interesting problem to solve there is that GitAccess does authorization but in GitHttpClientController we are also in the middle authentication. Specifically, we skip authentication on read access to public projects. SSH access to GitLab on the other hand is always fully authenticated before it hits authorization.
Edited by Jacob Vosmaer (GitLab)
- Resolved by Patricio Cano
- Resolved by Patricio Cano
Reassigned to @patricio
- Resolved by Douwe Maan
Added 1 commit:
- 5f035973 - Better authentication handling, syntax fixes and better actor handling for LFS Tokens
Added 1 commit:
- 941bbc2c - Better authentication handling, syntax fixes and better actor handling for LFS Tokens
Reassigned to @DouweM
@DouweM I addressed (most of) your comments, please have another look
- Resolved by Patricio Cano
- Resolved by Patricio Cano
- Resolved by Patricio Cano
- Resolved by Patricio Cano
Added 1 commit:
-
2bbdab88 - Use special characters for
lfs+deploy-key
to prevent a someone from creating a…
-
2bbdab88 - Use special characters for
mentioned in issue #13541 (moved)
- Resolved by Patricio Cano
- Resolved by Patricio Cano
- Resolved by Patricio Cano
- Resolved by Patricio Cano
- Resolved by Patricio Cano
Reassigned to @patricio
Added 252 commits:
-
2bbdab88...5533fd17 - 243 commits from branch
master
- adcda5ca - Added LFS support to SSH
- beb563d1 - Added CHANGELOG item and documentation.
- 3f70ddff - Refactor LFS token logic to use a Redis key instead of a DB field, making it a 1 use only token.
- b913628f - Refactored LFS auth logic when using SSH to use its own API endpoint `/lfs_authe…
-
2ef97a45 - Refactored handling of the
LfsToken
and added functionality to it to simplify external code. - fb487ffa - Improve string handling.
- 3bf0e73b - Better authentication handling, syntax fixes and better actor handling for LFS Tokens
-
d9af8462 - Use special characters for
lfs+deploy-key
to prevent a someone from creating a… - 07e1eaea - Further refactoring of authentication code, and code style fixes.
Toggle commit list-
2bbdab88...5533fd17 - 243 commits from branch
Added 1 commit:
- 489381cf - Further refactoring of authentication code, and code style fixes.
Added 152 commits:
-
489381cf...3de4e8b5 - 143 commits from branch
master
- 4b4b240d - Added LFS support to SSH
- 3e197ce6 - Added CHANGELOG item and documentation.
- 05ffdc56 - Refactor LFS token logic to use a Redis key instead of a DB field, making it a 1 use only token.
- 9ccf672f - Refactored LFS auth logic when using SSH to use its own API endpoint `/lfs_authe…
-
ee65e8f0 - Refactored handling of the
LfsToken
and added functionality to it to simplify external code. - 11e12249 - Improve string handling.
- 8bc55ff7 - Better authentication handling, syntax fixes and better actor handling for LFS Tokens
-
abd5ed33 - Use special characters for
lfs+deploy-key
to prevent a someone from creating a… - ba544866 - Further refactoring of authentication code, and code style fixes.
Toggle commit list-
489381cf...3de4e8b5 - 143 commits from branch
Reassigned to @DouweM
@DouweM can you give this another review, please?
- Resolved by Patricio Cano
- Resolved by Patricio Cano
- Resolved by Patricio Cano
- Resolved by Patricio Cano
- Resolved by Patricio Cano
118 110 @ci.present? 119 111 end 120 112 113 def user - Resolved by Patricio Cano
- Resolved by Patricio Cano
- Resolved by Patricio Cano
- Resolved by Patricio Cano
Reassigned to @patricio
Mentioned in merge request !5735 (closed)
Added 1 commit:
- 93556477 - Refactored authentication code to make it a bit clearer, added test for wrong SSH key.
Reassigned to @DouweM
@DouweM done. I truly believe we can merge this now
Added 150 commits:
-
93556477...f8bd9625 - 140 commits from branch
master
- e40e3fdc - Added LFS support to SSH
- 372be2d2 - Added CHANGELOG item and documentation.
- cb85cf1f - Refactor LFS token logic to use a Redis key instead of a DB field, making it a 1 use only token.
- 48f1a61f - Refactored LFS auth logic when using SSH to use its own API endpoint `/lfs_authe…
-
c25630ee - Refactored handling of the
LfsToken
and added functionality to it to simplify external code. - 85152f02 - Improve string handling.
- c144db29 - Better authentication handling, syntax fixes and better actor handling for LFS Tokens
-
71aff7f6 - Use special characters for
lfs+deploy-key
to prevent a someone from creating a… - de24075e - Further refactoring of authentication code, and code style fixes.
- be09bcf0 - Refactored authentication code to make it a bit clearer, added test for wrong SSH key.
Toggle commit list-
93556477...f8bd9625 - 140 commits from branch
- Resolved by Patricio Cano
Added 1 commit:
- 3d933082 - Refactored authentication code to make it a bit clearer, added test for wrong SSH key.
A few comments:
- there's no specs for testing usage of
:lfs_token
, - there's only read test for
lfs+deploy-key
, - there's no test for denial of push access of
lfs+deploy-key
, - (minor) the
:lfs_token
and:lfs_deploy_token
allows to access all GIT repositories, where it should probably be limited only for fetching LFS objects (it seems pretty unlikely to steal this token), - (minor) the
:lfs_token
allows to push to GIT repositories as an user, - (probably not important) current implementation doesn't allow to concurrently run two
git clones
, because token is overwritten,
The implementation seems to be secure, it only exposes GIT repositories for
:lfs_token
and:lfs_deploy_token
.- there's no specs for testing usage of
Mentioned in commit 92a2a0fe
Mentioned in commit 6d43c95b
Mentioned in merge request !6409 (merged)
Mentioned in commit fe084819
Wow, I didn't merge this merge request, I only merged !6409 (merged)... :(
Ok. We hit the into messy situation.
My !6409 (merged) had changes from this branch the base changes for Authorization module, with a commit changes related to LFS reverted. This lead to GitLab to discover that this MR is also merged.
@ayufan is the code of this MR fully merged, or just partially?
EDIT: Going by the commits in !6409 (merged), it would seem this is fully merged.
Edited by Patricio CanoMentioned in commit 3c1bb343
@ayufan you reverted the commits, but the MR still shows as merged. Should I resubmit? Rebase? The commits/changes are still there, they just have been reverted by new commits. What is the best way forward here?
cc @rymai
Edited by Patricio CanoMentioned in issue #22334 (closed)
Mentioned in merge request !6413 (merged)
- Edited by Kamil Trzcińśki
Mentioned in commit 7c7c89ac
Mentioned in commit 667d2350
Mentioned in merge request gitlab-shell!86 (merged)
Mentioned in commit gitlab-shell@6c7f7b4d
Mentioned in commit 7fd0e153
Mentioned in issue #14278 (closed)
Just wanted to thank all involved in getting this implemented. I can now use Unity Cloud build for my projects. Very grateful for all your hard work
Edited by username-removed-90271