Resolve "GitHub import should fetch 100 results per page to limit the change to hit rate limiting"
What does this MR do?
It makes the GitHub importer fetch 100 results per page from the GitHub API instead of the default 30, so each import needs fewer list requests. See commits for details.
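For illustration only, a minimal sketch of the idea using Octokit (the Ruby GitHub client); the token source and the repository below are placeholders, and this is not the exact code changed in this MR:

```ruby
require 'octokit'

# Sketch only: ask Octokit for 100 results per page instead of GitHub's
# default of 30, so list endpoints need roughly a third of the requests.
client = Octokit::Client.new(
  access_token: ENV['GITHUB_TOKEN'], # placeholder token source
  per_page: 100                      # default page size for all list calls
)

# One request now returns up to 100 pull requests; further pages are still
# reachable via client.last_response.rels[:next] when present.
pulls = client.pull_requests('guard/guard', state: 'open')
puts "First page: #{pulls.size} open pull requests"
```

Setting `per_page` once on the client keeps every list call on the larger page size instead of passing the option to each request.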
When importing https://github.com/guard/guard/pulls?q=is%3Apr+sort%3Acreated-desc+is%3Aopen, before the change:
```
Fetched pull requests in 93.480000 12.430000 117.930000 (1051.885330)
Fetched issues in 62.570000 2.390000 64.960000 (1142.657154)
Import finished. Timings: 158.910000 15.060000 186.280000 (2208.732847)
```
After the change:
```
Fetched pull requests in 89.090000 12.090000 113.160000 (698.703639)
Fetched issues in 65.940000 2.540000 68.480000 (1134.679727)
Import finished. Timings: 158.230000 14.830000 185.350000 (1843.907216)
```
That's a ~17% improvement in total wall-clock time (2208.7 s → 1843.9 s)!
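The timing lines read like Ruby Benchmark output (user, system, and total CPU seconds, then wall-clock time in parentheses). A hedged sketch of how such a line could be produced, with `fetch_pull_requests` as a hypothetical stand-in for the importer's fetch step:

```ruby
require 'benchmark'

# Hypothetical stand-in for the importer's fetch step.
def fetch_pull_requests
  sleep 0.1
end

tms = Benchmark.measure { fetch_pull_requests }
# Benchmark::Tms#to_s prints user, system, and total CPU seconds, followed
# by the wall-clock time in parentheses, matching the format above.
puts "Fetched pull requests in #{tms}"
```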
Are there points in the code the reviewer needs to double check?
From https://gitlab.com/gitlab-org/gitlab-ce/issues/38198#note_40975277:
We do need to check that this does not increase memory usage much, though.
Locally, peak memory usage went from 302 MB to 441 MB with this change; before the change, it went from 301 MB to 432 MB.
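As a rough way to repeat such a memory check locally (an assumption about methodology, not necessarily how the figures above were collected), the process's resident set size can be sampled around the import:

```ruby
# Sample this process's resident set size (RSS) in megabytes via ps
# (works on Linux and macOS; ps reports rss in kilobytes).
def rss_mb
  `ps -o rss= -p #{Process.pid}`.to_i / 1024
end

before = rss_mb
# ... run the GitHub import here ...
after = rss_mb
puts "RSS grew from #{before} MB to #{after} MB"
```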
Why was this MR needed?
To improve the performance of the GitHub import and to reduce the number of API requests it makes, lowering the chance of hitting GitHub's rate limits.
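To make the rate-limiting argument concrete: listing the same number of objects at 100 per page takes roughly a third of the requests needed at the default 30 per page, leaving more of the hourly quota for the other calls the importer makes. A small illustrative sketch (the 3,000-issue figure is invented for the example); `rate_limit` is Octokit's standard quota helper:

```ruby
require 'octokit'

# Fewer list requests per imported object: ~3,000 issues (illustrative
# number) need ~100 requests at 30 per page but only 30 at 100 per page.
issues = 3_000
[30, 100].each do |per_page|
  puts "per_page=#{per_page}: #{(issues / per_page.to_f).ceil} list requests"
end

# Octokit also exposes the current quota, so an importer can see how close
# it is to the hourly limit before sending more requests.
client = Octokit::Client.new(access_token: ENV['GITHUB_TOKEN'])
limit  = client.rate_limit
puts "#{limit.remaining}/#{limit.limit} requests left, resets in #{limit.resets_in}s"
```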
Does this MR meet the acceptance criteria?
- Changelog entry added, if necessary
- Review
  - Has been reviewed by Backend
  - Has been reviewed by Database
- Conform by the merge request performance guides
- Conform by the style guides
- Squashed related commits together
What are the relevant issue numbers?
Closes #38198 and implements the first (simplest) solution of #38200.