The only way to optimize code like this is to either use a faster storage mechanism, or run it directly on the physical disk(s) storing the Git repository. The latter in turn requires some kind of API/service to run on said servers that takes a Git action to run (e.g. via HTTP) and spits out the result.
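To make the second option concrete, here is a rough sketch of the core of such a service, written as if it ran on the host that stores the repository. Everything in it is an assumption for illustration: the action names (`diff_stat`, `rev_parse`), the helper names, and the parameter keys are hypothetical and are not part of any GitLab or Gitaly API. The HTTP layer is left out; the idea is only that a request is mapped to an allowlisted Git command that runs against local disk.

```ruby
require 'open3'

# Hypothetical allowlist: action name => builder for the git arguments.
# These names are illustrative only, not an existing API.
ALLOWED_ACTIONS = {
  'diff_stat' => ->(params) { ['diff', '--stat', params.fetch('from'), params.fetch('to')] },
  'rev_parse' => ->(params) { ['rev-parse', '--verify', params.fetch('ref')] }
}.freeze

# Accept only values that look like refs or object IDs, so a caller
# cannot smuggle in extra command-line options (e.g. "--upload-pack=...").
SAFE_REVISION = %r{\A[A-Za-z0-9][A-Za-z0-9._/-]*\z}

# Turn a requested action into an argv array; raises on anything unexpected.
def build_git_command(repo_path, action, params)
  builder = ALLOWED_ACTIONS.fetch(action) { raise ArgumentError, "unknown action: #{action}" }
  params.each_value do |value|
    raise ArgumentError, "unsafe revision: #{value.inspect}" unless SAFE_REVISION.match?(value)
  end
  ['git', '-C', repo_path, *builder.call(params)]
end

# Run the command locally and return a plain hash that an HTTP handler
# could serialize. The runner is injectable so it can be stubbed in tests.
def run_git_action(repo_path, action, params, runner: Open3.method(:capture3))
  stdout, stderr, status = runner.call(*build_git_command(repo_path, action, params))
  { ok: status.success?, output: stdout, error: stderr }
end
```

A handler would call something like `run_git_action('/path/to/repo.git', 'rev_parse', { 'ref' => 'master' })` and send the resulting hash back to the client, so only the result crosses the network rather than the pack file contents.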
Rugged::Diff#each and Rugged::Diff#each_patch can be quite slow (up to 30 seconds).
I think this is a case where Gitaly will come into play, unless we can somehow optimize libgit2 or NFS. Looking at the strace logs, I believe it just takes a long time to seek/mmap big files over a networked file system, just as we saw in https://gitlab.com/gitlab-org/gitlab-ee/issues/1811#note_26109097. If you look at the strace logs, you see large blocks of time taken like this:
libgit2 is trying to memory-map large pack files over NFS, which causes a lot of round-trips and network transfer. The more files there are in the diff, the more times we have to do this dance.
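One way to check whether the cost really scales with the number of files is to time how long the iterator takes to produce each entry. The helper below is only a sketch: `time_between_yields` is a hypothetical name, and it works on anything that responds to `each`, so it makes no claims about Rugged internals.

```ruby
# Measure how long the source takes to produce each item, excluding the
# time spent in the caller's own block. Returns [item, seconds] pairs.
def time_between_yields(enumerable)
  timings = []
  last = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  enumerable.each do |item|
    now = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    timings << [item, now - last]
    yield item if block_given?
    # Restart the clock after the caller's block so only producer time counts.
    last = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  end
  timings
end
```

Assuming `Rugged::Diff#each_patch` returns an enumerator when called without a block (worth verifying against the Rugged version in use), passing `diff.each_patch` to this helper would show whether a few patches dominate the 30 seconds or whether the time is spread evenly across all files.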
@stanhu thanks. I can just about tell from the screenshot that the problem appears to be in getting the changes count (?), as we load the changes tab async on the new MR page. I agree with everything in your previous comment, but is there anything here we think was caused by 9.0 specifically?
Error 502 when creating a merge request through the UI or API. Some of these are on the gitlab-ce project.
For the UI I've had 502s for #new, when changing source branch and might also have had it for create. For the API it would have been the equivalent of #create.
This may or may not correlate with high load. A few times there have been questions about 502s for merge requests a couple of mins before a more general outage.
@eReGeBe we tend to refer to migrations by their GitLab::Git migration site. Do you know where Rugged::Diff#find_similar! is being called from in our codebase?
@andrewn Gitlab::Git::Repository#diff_patches. I don't see it on the migration board, so I'm guessing we haven't included it. It looks easy to do with gitaly-ruby.
@eReGeBe it needs to be migrated and with gitaly-ruby it is now possible. But the way the code is structured makes it hard as it is. Too many lazy interactions.
GitLab is moving all development for both GitLab Community Edition
and Enterprise Edition into a single codebase. The current
gitlab-ce repository will become a read-only mirror, without any
proprietary code. All development is moved to the current
gitlab-ee repository, which we will rename to just gitlab in the
coming weeks. As part of this migration, issues will be moved to the
current gitlab-ee project.
If you have any questions about all of this, please ask them in our
dedicated FAQ issue.
Using "gitlab" and "gitlab-ce" would be confusing, so we decided to
rename gitlab-ce to gitlab-foss to make the purpose of this FOSS
repository more clear.
I created a merge request for CE, and this got closed. What do I
need to do?
Everything in the ee/ directory is proprietary. Everything else is
free and open source software. If your merge request does not change
anything in the ee/ directory, the process of contributing changes
is the same as when using the gitlab-ce repository.
Will you accept merge requests on the gitlab-ce/gitlab-foss project
after it has been renamed?
No. Merge requests submitted to this project will be closed automatically.
Will I still be able to view old issues and merge requests in
gitlab-ce/gitlab-foss?
Yes.
How will this affect users of GitLab CE using Omnibus?
No changes will be necessary, as the packages built remain the same.
How will this affect users of GitLab CE that build from source?
Once the project has been renamed, you will need to change your Git
remotes to use this new URL. GitLab will take care of redirecting Git
operations so there is no hard deadline, but we recommend doing this
as soon as the projects have been renamed.
Where can I see a timeline of the remaining steps?