Reschedule MR diff background migrations
On GitLab.com, we haven't migrated all of the MR diff rows: https://gitlab.com/gitlab-com/infrastructure/issues/2663#note_39244715
As that comment says, we'll try to fix this in 10.1:
- We'll create a new migration to do the scheduling. This will reschedule any un-migrated rows. For users who haven't finished the previous migration (say, they upgraded 9.4 -> 10.1) this will lead to double scheduling, but the migration is already robust to retries, so any duplicate jobs will finish quickly.
- This migration will schedule with a smaller batch size - possibly much smaller.
- We will keep substantial gaps between jobs, as we expect the leftover jobs to be the slowest: https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/13661#note_37915891
- We will change the background migration so that it no longer does everything in a single transaction, and instead breaks the inserts into separate COMMITs that are as small as is sensible (1,000 rows each).
  - This also involves some basic duplicate checking; alternatively, we could just insert and handle the error if a uniqueness constraint fails.
- We will also add a custom exception class to wrap any errors thrown by the background migration. This will help us:
  - Log those errors in an easily searchable way.
  - Find them more easily in Sentry.
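
The plan above can be sketched in plain Ruby. This is only an illustration, not the code in this merge request: `MergeRequestDiffMigrationError`, `plan_jobs`, and `insert_in_chunks` are hypothetical names, the 300-second interval and batch size of 4 are made-up values, and the block passed to `insert_in_chunks` stands in for whatever performs one transactional insert.

```ruby
# Hypothetical sketch: plain Ruby, no Rails or Sidekiq required.

# Wraps any error raised by the background migration so that all failures
# share one class that is easy to search for in logs and in Sentry.
class MergeRequestDiffMigrationError < StandardError
  attr_reader :original_exception

  def initialize(original_exception)
    @original_exception = original_exception
    super("#{original_exception.class}: #{original_exception.message}")
  end
end

# Plans one job per batch of IDs, with `interval` seconds between jobs.
# Returns [delay_in_seconds, first_id, last_id] triples.
def plan_jobs(ids, batch_size:, interval:)
  ids.sort.each_slice(batch_size).each_with_index.map do |batch, index|
    [(index + 1) * interval, batch.first, batch.last]
  end
end

# Yields rows in chunks of `chunk_size`. In a real migration each yielded
# chunk would be inserted in its own transaction (one COMMIT per chunk).
def insert_in_chunks(rows, chunk_size: 1_000)
  rows.each_slice(chunk_size) { |chunk| yield chunk }
rescue StandardError => e
  raise MergeRequestDiffMigrationError.new(e)
end

# Example: a small batch size with a gap between jobs.
plan_jobs((1..10).to_a, batch_size: 4, interval: 300).each do |delay, first, last|
  puts "in #{delay}s: migrate diffs #{first}..#{last}"
end
```

Because each chunk is committed separately, a retry only needs to redo the chunks that did not complete, which is why the duplicate check (or handling the uniqueness error) matters.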