Skip to content
Snippets Groups Projects

Added LFS support to SSH

Merged Patricio Cano requested to merge lfs-support-for-ssh into master
2 unresolved threads

What does this MR do?

Adds the much requested LFS support to SSH. It also allows DeployKeys to download LFS files without having to do anything extra!

HTTP access is still required, since that is what is used to retrieve the LFS data, but authentication happens with the user's SSH Key, so there is no need to input any credentials.

These merge request depends on GitLab Shell being updated with gitlab-shell!86 (merged) .

What are the relevant issue numbers?

#3589 (closed)

Screenshots (if relevant)

Screen_Shot_2016-08-24_at_7.36.22_PM Screen_Shot_2016-08-24_at_7.36.22_PM

cc @jacobvosmaer-gitlab @marin

Merge request reports

Loading
Loading

Activity

Filter activity
  • Approvals
  • Assignees & reviewers
  • Comments (from bots)
  • Comments (from users)
  • Commits & branches
  • Edits
  • Labels
  • Lock status
  • Mentions
  • Merge request status
  • Tracking
  • Patricio Cano Added 42 commits:

    Added 42 commits:

    • 12d6bd8f...0693fee2 - 39 commits from branch master
    • 0fb96f83 - Added LFS support to SSH
    • 6a8c335c - Added CHANGELOG item and documentation.
    • aa744f73 - Refactor LFS token logic to use a Redis key instead of a DB field, making it a 1 use only token.
  • Patricio Cano Resolved all discussions

    Resolved all discussions

  • Patricio Cano Added 1 commit:

    Added 1 commit:

    • bea8b983 - Refactor LFS token logic to use a Redis key instead of a DB field, making it a 1 use only token.
  • Author Contributor

    @jacobvosmaer-gitlab I've been looking at the logs, at it looks like LFS will execute the git-lfs-authenticate command every time it needs to perform an action against the LFS server, which would mean that an expiry time of 3600 seconds for the Redis key is really high, specially since it gets regenerated every time the /discover endpoint is accessed.

    In the example I'm posting below, I'm downloading 2 files via LFS. The request flow seems to be as follows:

    1. Run git-lfs-authenticate via SSH, receive authentication header.
    2. POST to /info/lfs/objects/batch to get information about the objects
    3. Receive the first object
    4. GET /gitlab-lfs/objects/:oid to validate object
    5. Repeat for the second object.

    So by the time it tries to fetch the second object, the Auth header is regenerated with a new key.

    What do you think, should I reduce the expiry time or leave it as is?

    14:39:02 sshd.1                  | Connection from 127.0.0.1 port 54935 on 127.0.0.1 port 2222
    14:39:02 sshd.1                  | Failed none for pato from 127.0.0.1 port 54935 ssh2
    14:39:02 sshd.1                  | Postponed publickey for pato from 127.0.0.1 port 54935 ssh2
    14:39:02 sshd.1                  | Accepted publickey for pato from 127.0.0.1 port 54935 ssh2: RSA SHA256:mnHR9nCnGOoS69Qu/qAsIPxJZO6DYoPp+Hwed0QRWXs
    14:39:02 sshd.1                  | BSM audit: bsm_audit_session_setup: setaudit_addr failed: Operation not permitted
    14:39:02 sshd.1                  | Starting session: forced-command (key-option) '/Users/pato/RubyMineProjects/gdk/gitlab-shell/bin/gitlab-shell key-19' for pato from 127.0.0.1 port 54935 id 0
    14:39:02 sshd.1                  | Close session: user pato from 127.0.0.1 port 54935 id 0
    14:39:02 sshd.1                  | Received disconnect from 127.0.0.1 port 54935:11: disconnected by user
    14:39:02 sshd.1                  | Disconnected from 127.0.0.1 port 54935
    14:39:02 sshd.1                  | Connection from 127.0.0.1 port 54938 on 127.0.0.1 port 2222
    14:39:02 sshd.1                  | Failed none for pato from 127.0.0.1 port 54938 ssh2
    14:39:02 sshd.1                  | Postponed publickey for pato from 127.0.0.1 port 54938 ssh2
    14:39:02 sshd.1                  | Accepted publickey for pato from 127.0.0.1 port 54938 ssh2: RSA SHA256:mnHR9nCnGOoS69Qu/qAsIPxJZO6DYoPp+Hwed0QRWXs
    14:39:02 sshd.1                  | BSM audit: bsm_audit_session_setup: setaudit_addr failed: Operation not permitted
    14:39:02 sshd.1                  | Starting session: forced-command (key-option) '/Users/pato/RubyMineProjects/gdk/gitlab-shell/bin/gitlab-shell key-19' for pato from 127.0.0.1 port 54938 id 0
    14:39:03 sshd.1                  | Received disconnect from 127.0.0.1 port 54938:11: disconnected by user
    14:39:03 sshd.1                  | Disconnected from 127.0.0.1 port 54938
    14:39:03 gitlab-workhorse.1      | localhost:3000 127.0.0.1:54939 - - [2016-08-29 14:39:03.285717592 -0500 CDT] "POST /root/test-lfs-bug.git/info/lfs/objects/batch HTTP/1.1" 200 379 "" "git-lfs/1.4.0 (GitHub; darwin amd64; go 1.7)" 0.329941
    14:39:03 gitlab-workhorse.1      | 2016/08/29 14:39:03 Send file "/Users/pato/RubyMineProjects/gdk/gitlab/shared/lfs-objects/fd/e8/911461fe322d34a813a20ff806476a6bcd4c0cddb8af5b109ffa3eb919e4" for GET "/root/test-lfs-bug.git/gitlab-lfs/objects/fde8911461fe322d34a813a20ff806476a6bcd4c0cddb8af5b109ffa3eb919e4"
    14:39:05 gitlab-workhorse.1      | localhost:3000 127.0.0.1:54939 - - [2016-08-29 14:39:03.616589146 -0500 CDT] "GET /root/test-lfs-bug.git/gitlab-lfs/objects/fde8911461fe322d34a813a20ff806476a6bcd4c0cddb8af5b109ffa3eb919e4 HTTP/1.1" 200 433489920 "" "git-lfs/1.4.0 (GitHub; darwin amd64; go 1.7)" 2.347567
    14:39:06 sshd.1                  | Connection from 127.0.0.1 port 54942 on 127.0.0.1 port 2222
    14:39:06 sshd.1                  | Failed none for pato from 127.0.0.1 port 54942 ssh2
    14:39:06 sshd.1                  | Postponed publickey for pato from 127.0.0.1 port 54942 ssh2
    14:39:06 sshd.1                  | Accepted publickey for pato from 127.0.0.1 port 54942 ssh2: RSA SHA256:mnHR9nCnGOoS69Qu/qAsIPxJZO6DYoPp+Hwed0QRWXs
    14:39:06 sshd.1                  | BSM audit: bsm_audit_session_setup: setaudit_addr failed: Operation not permitted
    14:39:06 sshd.1                  | Starting session: forced-command (key-option) '/Users/pato/RubyMineProjects/gdk/gitlab-shell/bin/gitlab-shell key-19' for pato from 127.0.0.1 port 54942 id 0
    14:39:07 sshd.1                  | Received disconnect from 127.0.0.1 port 54942:11: disconnected by user
    14:39:07 sshd.1                  | Disconnected from 127.0.0.1 port 54942
    14:39:07 gitlab-workhorse.1      | localhost:3000 127.0.0.1:54943 - - [2016-08-29 14:39:07.420328882 -0500 CDT] "POST /root/test-lfs-bug.git/info/lfs/objects/batch HTTP/1.1" 200 380 "" "git-lfs/1.4.0 (GitHub; darwin amd64; go 1.7)" 0.241582
    14:39:07 gitlab-workhorse.1      | 2016/08/29 14:39:07 Send file "/Users/pato/RubyMineProjects/gdk/gitlab/shared/lfs-objects/45/f7/63efd3e09c68e3b6128bd3dbdefbde1360d446cba8d2e06345eeff407321" for GET "/root/test-lfs-bug.git/gitlab-lfs/objects/45f763efd3e09c68e3b6128bd3dbdefbde1360d446cba8d2e06345eeff407321"
    14:39:12 gitlab-workhorse.1      | localhost:3000 127.0.0.1:54943 - - [2016-08-29 14:39:07.662789667 -0500 CDT] "GET /root/test-lfs-bug.git/gitlab-lfs/objects/45f763efd3e09c68e3b6128bd3dbdefbde1360d446cba8d2e06345eeff407321 HTTP/1.1" 200 1146093568 "" "git-lfs/1.4.0 (GitHub; darwin amd64; go 1.7)" 4.373223
  • Reassigned to @dbalexandre

  • @patricio regardless of key expiry, it occurred to after we talked yesterday that it would be better to have a separate API endpoint for git-lfs-authenticate. Right now we squeeze the two pieces of information it needs (user/pass and repo URL) into /allowed and /discover. I think it would be nicer to have /lfs-authenticate for that.

    I know that some LFS interactions generate a lot of requests. Depending on the Git version, git clone can be bad because if fetches each LFS object one at a time. But have you tried what happens when you add 100 files, git lfs track them, and then push? Does that hit lfs-authenticate 100 times?

  • Author Contributor

    @jacobvosmaer-gitlab that is an interesting point. I'm going to try that today. Now, for the reason you want to have git-lfs-authenticate have it's own API enpoint, is it just to keep the code clean, or do you anticipate it requiring less API calls?

    If the git-lfs-command gets triggered once for every file, then all API endpoints will be called as well. /discover will still be called to get the key info, /allowed would still be called to get project info and if we add /lfs-authenticate as well, it would also be triggered.

    I'll report in a few what happens when uploading hundreds of files.

  • username-removed-283999
  • @patricio It looks good! I had just few minor comments.

  • Author Contributor

    @jacobvosmaer-gitlab I tried uploading 117 image files and it only required 2 git-lfs-authenticate calls. The logs are in this snippet, as they are quite long, but I did see a lot of 401s at first and then a lot of 200s.

    git clonegenerates 117 different git-lfs-authenticate calls, while git lfs fetch is smart and authenticates once and fetches the files in bulk.

  • @patricio that sounds about right. It is a limitation of the way LFS is integrated into Git: during git clone it runs a 'smudge filter' (I think) for each file, and each of those makes a separate LFS call.

    Re separate API: this would also have the advantage of not creating (mostly unused) LFS tokens for every new gitlab-shell connection. I understand that the end result would be more API calls from gitlab-shell (3 instead of 2 as it is now). I think my main motivation is clearer code organization and not creating tons of 99% unused LFS tokens in Redis.

  • Author Contributor

    @jacobvosmaer-gitlab you are right. I hadn't consider that we are creating tokens left and right every time the /discover endpoint is touched. It makes perfect sense to add an extra API endpoint, in that case. I also like the idea of keeping related code better organized. I'm going to add it right now :smiley:

  • Patricio Cano Added 1 commit:

    Added 1 commit:

    • 93d1ec11 - Refactored LFS auth logic when using SSH to use its own API endpoint `/lfs_authe…
  • Patricio Cano Resolved all discussions

    Resolved all discussions

  • Author Contributor

    @dbalexandre thank you for giving this a review, and sorry for all the changes happening as you were reviewing. I think this should be ready for a final miniboss review. Can you have another look?

  • Reassigned to @dbalexandre

  • @patricio It looks good! I had few remarks.

  • Patricio Cano Added 1 commit:

    Added 1 commit:

    • a209faf3 - Refactored handling of the LfsToken and added functionality to it to simplify external code.
  • Patricio Cano Resolved all discussions

    Resolved all discussions

  • Patricio Cano Added 222 commits:

    Added 222 commits:

    • a209faf3...9ba9e098 - 217 commits from branch master
    • c7b16637 - Added LFS support to SSH
    • ff5c6272 - Added CHANGELOG item and documentation.
    • fdfdd8f6 - Refactor LFS token logic to use a Redis key instead of a DB field, making it a 1 use only token.
    • aeaf9c51 - Refactored LFS auth logic when using SSH to use its own API endpoint `/lfs_authe…
    • 6394894e - Refactored handling of the LfsToken and added functionality to it to simplify external code.
  • Patricio Cano Added 1 commit:

    Added 1 commit:

  • Patricio Cano Resolved all discussions

    Resolved all discussions

  • Patricio Cano Added 38 commits:

    Added 38 commits:

    • 3a103b0a...08ab410f - 32 commits from branch master
    • a127c464 - Added LFS support to SSH
    • 5443d9b4 - Added CHANGELOG item and documentation.
    • b0a7ca60 - Refactor LFS token logic to use a Redis key instead of a DB field, making it a 1 use only token.
    • 10184e05 - Refactored LFS auth logic when using SSH to use its own API endpoint `/lfs_authe…
    • 8881abd2 - Refactored handling of the LfsToken and added functionality to it to simplify external code.
    • c72cda71 - Improve string handling.
  • Author Contributor

    @jacobvosmaer-gitlab thank you for reviewing the code as well. I'm going to deploy these changes with EE to a test server to test the Kerberos interaction. Once that is done, I'll assign the MR to an endboss for final review.

  • Patricio Cano Added 170 commits:

    Added 170 commits:

    • c72cda71...9f7349f6 - 164 commits from branch master
    • 0dd1c163 - Added LFS support to SSH
    • 60b84094 - Added CHANGELOG item and documentation.
    • a63cc652 - Refactor LFS token logic to use a Redis key instead of a DB field, making it a 1 use only token.
    • 16d2cb13 - Refactored LFS auth logic when using SSH to use its own API endpoint `/lfs_authe…
    • 331e3dfb - Refactored handling of the LfsToken and added functionality to it to simplify external code.
    • ba6fdb9c - Improve string handling.
  • Author Contributor

    @jacobvosmaer-gitlab the assumptions we had during our call today were correct. Even with the Kerberos Alternate Port enabled, the project.http_url_to_repo returns the correct URL, which means LFS objects can be pushed via HTTP using the credentials retrieved via SSH with git-lfs-authenticate.

    I also made sure that the following code does not affect the authentication flow:

    def allow_basic_auth?
      if Gitlab.config.kerberos.enabled && Gitlab.config.kerberos.use_dedicated_port
        !request_uses_kerberos_dedicated_port?
      else
        true
      end
    end

    In our case we do have Kerberos enabled and it has a special port, so we enter the first case of the if clause. But since the HTTP request is made to the regular HTTP port and not the Kerberos HTTP port, !request_uses_kerberos_dedicated_port? will return true, allowing us to continue with the request.

    As we saw today, LFS is broken when cloning and pushing via Kerberos, so I created an issue for it: gitlab-org/gitlab-ee#968

    I think this is ready for final review.

  • Reassigned to @DouweM

  • Author Contributor

    @DouweM can you help me with an endboss review?

  • Douwe Maan Marked this merge request as a Work In Progress

    Marked this merge request as a Work In Progress

  • WIP until https://gitlab.com/gitlab-org/gitlab-shell/merge_requests/86 is merged

    Looks like it's the other way around!

    Edited by Douwe Maan
  • Douwe Maan Unmarked this merge request as a Work In Progress

    Unmarked this merge request as a Work In Progress

  • Douwe Maan
  • Douwe Maan
  • Douwe Maan
  • Douwe Maan
  • Douwe Maan
  • Douwe Maan
  • Douwe Maan
  • Douwe Maan
  • Douwe Maan
    Douwe Maan @DouweM started a thread on the diff
  • 25 25 def lfs_download_access?
    26 26 return false unless project.lfs_enabled?
    27 27
    28 project.public? || ci? || (user && user.can?(:download_code, project))
    28 return true if project.public? || ci? || lfs_deploy_key?
    • I wish we could reuse Gitlab::GitAccess here and in lfs_check_access!, because that already has all of the checks for deploy keys etc.

    • Author Contributor

      Any ideas how we could achieve that?

    • I agree that would be nice but it may be too much for this MR.

      One interesting problem to solve there is that GitAccess does authorization but in GitHttpClientController we are also in the middle authentication. Specifically, we skip authentication on read access to public projects. SSH access to GitLab on the other hand is always fully authenticated before it hits authorization.

      Edited by Jacob Vosmaer (GitLab)
    • Yeah, it would be a bit much for this MR, but there is an opportunity for future refactoring there.

    • Please register or sign in to reply
  • Douwe Maan
  • Douwe Maan
  • Reassigned to @patricio

  • Douwe Maan
  • Patricio Cano Added 1 commit:

    Added 1 commit:

    • 5f035973 - Better authentication handling, syntax fixes and better actor handling for LFS Tokens
  • Patricio Cano Added 1 commit:

    Added 1 commit:

    • 941bbc2c - Better authentication handling, syntax fixes and better actor handling for LFS Tokens
  • Reassigned to @DouweM

  • Author Contributor

    @DouweM I addressed (most of) your comments, please have another look :smile:

  • Patricio Cano Added 1 commit:

    Added 1 commit:

    • 2bbdab88 - Use special characters for lfs+deploy-key to prevent a someone from creating a…
  • mentioned in issue #13541 (moved)

  • Douwe Maan
  • Douwe Maan
  • Douwe Maan
  • Douwe Maan
  • Douwe Maan
  • Reassigned to @patricio

  • Patricio Cano Added 252 commits:

    Added 252 commits:

    • 2bbdab88...5533fd17 - 243 commits from branch master
    • adcda5ca - Added LFS support to SSH
    • beb563d1 - Added CHANGELOG item and documentation.
    • 3f70ddff - Refactor LFS token logic to use a Redis key instead of a DB field, making it a 1 use only token.
    • b913628f - Refactored LFS auth logic when using SSH to use its own API endpoint `/lfs_authe…
    • 2ef97a45 - Refactored handling of the LfsToken and added functionality to it to simplify external code.
    • fb487ffa - Improve string handling.
    • 3bf0e73b - Better authentication handling, syntax fixes and better actor handling for LFS Tokens
    • d9af8462 - Use special characters for lfs+deploy-key to prevent a someone from creating a…
    • 07e1eaea - Further refactoring of authentication code, and code style fixes.
  • Patricio Cano Added 1 commit:

    Added 1 commit:

    • 489381cf - Further refactoring of authentication code, and code style fixes.
  • Patricio Cano Added 152 commits:

    Added 152 commits:

    • 489381cf...3de4e8b5 - 143 commits from branch master
    • 4b4b240d - Added LFS support to SSH
    • 3e197ce6 - Added CHANGELOG item and documentation.
    • 05ffdc56 - Refactor LFS token logic to use a Redis key instead of a DB field, making it a 1 use only token.
    • 9ccf672f - Refactored LFS auth logic when using SSH to use its own API endpoint `/lfs_authe…
    • ee65e8f0 - Refactored handling of the LfsToken and added functionality to it to simplify external code.
    • 11e12249 - Improve string handling.
    • 8bc55ff7 - Better authentication handling, syntax fixes and better actor handling for LFS Tokens
    • abd5ed33 - Use special characters for lfs+deploy-key to prevent a someone from creating a…
    • ba544866 - Further refactoring of authentication code, and code style fixes.
  • Reassigned to @DouweM

  • Author Contributor

    @DouweM can you give this another review, please?

  • Douwe Maan
  • Douwe Maan
  • Douwe Maan
  • Douwe Maan
  • Douwe Maan
  • Douwe Maan
    Douwe Maan @DouweM started a thread on the diff
  • 118 110 @ci.present?
    119 111 end
    120 112
    113 def user
  • Douwe Maan
  • Douwe Maan
  • Douwe Maan
  • Douwe Maan
  • Reassigned to @patricio

  • Mentioned in merge request !5735 (closed)

  • Patricio Cano Added 1 commit:

    Added 1 commit:

    • 93556477 - Refactored authentication code to make it a bit clearer, added test for wrong SSH key.
  • Reassigned to @DouweM

  • Author Contributor

    @DouweM done. I truly believe we can merge this now :smiley:

  • Patricio Cano Added 150 commits:

    Added 150 commits:

    • 93556477...f8bd9625 - 140 commits from branch master
    • e40e3fdc - Added LFS support to SSH
    • 372be2d2 - Added CHANGELOG item and documentation.
    • cb85cf1f - Refactor LFS token logic to use a Redis key instead of a DB field, making it a 1 use only token.
    • 48f1a61f - Refactored LFS auth logic when using SSH to use its own API endpoint `/lfs_authe…
    • c25630ee - Refactored handling of the LfsToken and added functionality to it to simplify external code.
    • 85152f02 - Improve string handling.
    • c144db29 - Better authentication handling, syntax fixes and better actor handling for LFS Tokens
    • 71aff7f6 - Use special characters for lfs+deploy-key to prevent a someone from creating a…
    • de24075e - Further refactoring of authentication code, and code style fixes.
    • be09bcf0 - Refactored authentication code to make it a bit clearer, added test for wrong SSH key.
  • Kamil Trzcińśki
  • Patricio Cano Added 1 commit:

    Added 1 commit:

    • 3d933082 - Refactored authentication code to make it a bit clearer, added test for wrong SSH key.
  • A few comments:

    • there's no specs for testing usage of :lfs_token,
    • there's only read test for lfs+deploy-key,
    • there's no test for denial of push access of lfs+deploy-key,
    • (minor) the :lfs_token and :lfs_deploy_token allows to access all GIT repositories, where it should probably be limited only for fetching LFS objects (it seems pretty unlikely to steal this token),
    • (minor) the :lfs_token allows to push to GIT repositories as an user,
    • (probably not important) current implementation doesn't allow to concurrently run two git clones, because token is overwritten,

    The implementation seems to be secure, it only exposes GIT repositories for :lfs_token and :lfs_deploy_token.

  • Mentioned in commit 92a2a0fe

  • Mentioned in commit 6d43c95b

  • Kamil Trzcińśki Mentioned in merge request !6409 (merged)

    Mentioned in merge request !6409 (merged)

  • Mentioned in commit fe084819

  • username-removed-128633 Status changed to merged

    Status changed to merged

  • Wow, I didn't merge this merge request, I only merged !6409 (merged)... :(

  • Ok. We hit the into messy situation.

    My !6409 (merged) had changes from this branch the base changes for Authorization module, with a commit changes related to LFS reverted. This lead to GitLab to discover that this MR is also merged.

  • Author Contributor

    @ayufan is the code of this MR fully merged, or just partially?

    EDIT: Going by the commits in !6409 (merged), it would seem this is fully merged.

    Edited by Patricio Cano
  • Mentioned in commit 3c1bb343

  • Author Contributor

    @ayufan you reverted the commits, but the MR still shows as merged. Should I resubmit? Rebase? The commits/changes are still there, they just have been reverted by new commits. What is the best way forward here?

    cc @rymai

    Edited by Patricio Cano
  • Patricio Cano Added ~149423 label

    Added ~149423 label

  • Mentioned in issue #22334 (closed)

  • Kamil Trzcińśki Mentioned in merge request !6413 (merged)

    Mentioned in merge request !6413 (merged)

  • Mentioned in commit 7c7c89ac

  • Douwe Maan Mentioned in commit 667d2350

    Mentioned in commit 667d2350

  • Mentioned in merge request gitlab-shell!86 (merged)

  • Kamil Trzcińśki Removed ~149423 label

    Removed ~149423 label

  • Mentioned in commit gitlab-shell@6c7f7b4d

  • Douwe Maan Mentioned in commit 7fd0e153

    Mentioned in commit 7fd0e153

  • Stan Hu Mentioned in issue #14278 (closed)

    Mentioned in issue #14278 (closed)

  • Just wanted to thank all involved in getting this implemented. I can now use Unity Cloud build for my projects. Very grateful for all your hard work

    Edited by username-removed-90271
  • Please register or sign in to reply
    Loading