Geo: Backfill repositories from primary node without using rsync
Our documentation currently requires anyone using Geo (#76) to rsync repositories manually in order to backfill a new node, because nothing yet detects automatically that a node needs backfilling.
There are two simple ways to automate the backfill:
1. From the secondary node, we can iterate over every `Project` we currently know from the database and schedule the same update job we use when receiving an update notification. This has the benefit of letting us "detect" whether a repository is empty or not yet created and update only those (a non-empty project has already received a recent update, so it does not need to be backfilled for the purpose of initial replication).
2. From the primary node, we can iterate over every `Project` and send notifications to one specific secondary node. This is a more "brute force" way of handling the situation, but it can be triggered right from the interface as part of the "enabling a new node" step.
(The job on the secondary node could be triggered that way too, but it is simply more convenient to do it from the primary.)
Both approaches should be implemented using "jobs scheduling more jobs".
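As a rough illustration of the "jobs scheduling more jobs" idea, here is a minimal sketch in Python using a toy in-memory queue. Everything in it is hypothetical and only meant to show the shape of the pattern: `JobQueue`, `backfill_batch_job`, `update_repository_job`, the `repository_empty` flag, and the batch size are assumptions, not existing Geo code. A parent job handles one batch of projects, enqueues one update job per empty or missing repository, and then reschedules itself for the next batch.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Project:
    """Hypothetical stand-in for a project row known to the node's database."""
    id: int
    repository_empty: bool  # True when the repository is empty or was never created


class JobQueue:
    """Toy in-memory queue standing in for a real background job system."""

    def __init__(self):
        self.pending = deque()
        self.log = []  # (job name, arguments) of every executed job, in order

    def enqueue(self, job, *args):
        self.pending.append((job, args))

    def run_all(self):
        # Jobs may enqueue further jobs while the queue is being drained.
        while self.pending:
            job, args = self.pending.popleft()
            self.log.append((job.__name__, args))
            job(self, *args)


def update_repository_job(queue, project_id):
    """Placeholder for the existing update job; a real one would fetch from the primary."""


def backfill_batch_job(queue, projects, batch_size, offset=0):
    """Parent job: handle one batch, then reschedule itself for the next one."""
    batch = projects[offset:offset + batch_size]
    if not batch:
        return  # nothing left to schedule
    for project in batch:
        # Assumption: only empty or missing repositories need the initial backfill.
        if project.repository_empty:
            queue.enqueue(update_repository_job, project.id)
    queue.enqueue(backfill_batch_job, projects, batch_size, offset + batch_size)


def run_backfill(projects, batch_size=2):
    """Start the parent job and return the ids that got an update job."""
    queue = JobQueue()
    queue.enqueue(backfill_batch_job, projects, batch_size)
    queue.run_all()
    return [args[0] for name, args in queue.log if name == "update_repository_job"]


if __name__ == "__main__":
    projects = [Project(1, True), Project(2, False), Project(3, True), Project(4, True)]
    print(run_backfill(projects))  # [1, 3, 4]
```

Keeping each parent job limited to one batch means no single job has to walk the whole project list, and the per-project work stays in the same update job that already handles notifications. The "brute force" variant from the primary would look the same with the `repository_empty` check removed.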
cc @sytses @stanhu @patricio (based on the discussion: https://gitlab.com/gitlab-com/infrastructure/issues/415#note_17370229)