Geo: when clone_url_prefix is nil synchronization fails without useful error message
Summary
As part of the Geo Log Cursor changes we've introduced the clone_url_prefix setting in the database. This allows secondary nodes to figure out where to clone from. Previously we would send this information through System Hooks.
Steps to reproduce
If you are using 9.3.x without geo_*_role['enable'] = true on the primary node, an important flag is never set in gitlab.yml, so clone_url_prefix is not updated when the machine is restarted.
What is the current bug behavior?
Repository sync will not work, and the statistics will show none of your repositories as being in sync.
When checking the logs you will find something similar to this in the gitlab-rails production.log:
Geo::RepositorySyncService: Updating wiki sync information for project gitlab/gitlab-ee (3)
Geo::RepositorySyncService: Finished repository sync for project gitlab/gitlab-ee (3)
Geo::RepositorySyncService: Releasing leases to sync repository for project gitlab/gitlab-ee (3)
Geo::RepositorySyncService: Trying to obtain lease to sync repository for project gitlab-org/gitlab-ce (5)
Geo::RepositorySyncService: Started repository sync for project gitlab-org/gitlab-ce (5)
Geo::RepositorySyncService: Fetching project repository for project gitlab-org/gitlab-ce (5)
Geo::RepositorySyncService: Error syncing repository for project gitlab-org/gitlab-ce: fatal: 'gitlab-org/gitlab-ce.git' does not appear to be a git repository
fatal: Could not read from remote repository.
Please make sure you have the correct access rights
and the repository exists.
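The bare 'gitlab-org/gitlab-ce.git' in the error above is consistent with a remote URL built by putting a nil prefix in front of the project path. A minimal sketch of that effect, assuming the URL is assembled by string interpolation (build_clone_url and the example host are hypothetical, not GitLab's actual code):

```ruby
# Hypothetical helper, not GitLab's actual code: builds a remote URL by
# interpolating the prefix in front of the project path.
def build_clone_url(clone_url_prefix, path_with_namespace)
  "#{clone_url_prefix}#{path_with_namespace}.git"
end

# With a prefix set (example value), the result is a usable remote.
with_prefix = build_clone_url("git@primary.example.com:", "gitlab-org/gitlab-ce")

# With a nil prefix, Ruby interpolates nil as an empty string, so only the
# bare path remains and git treats it as a local path that does not exist.
without_prefix = build_clone_url(nil, "gitlab-org/gitlab-ce")
```

No exception is raised in the nil case, which is why the failure only surfaces later as a git error.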
What is the expected correct behavior?
Instead of trying to clone from a remote without the clone_url_prefix part, we should error out with something like: "Cannot sync repositories as there is no clone_url_prefix in the primary node database. Check Geo installation details or try restarting your primary node"
This is a fallback for the %9.3 branch, as we intend to fix the underlying cause in the %9.4 branch. See https://gitlab.com/gitlab-org/gitlab-ee/issues/2826
Possible fixes
We should check first if the clone_url_prefix is defined and abort the sync with a reasonable error message telling the user why, instead of trying to clone from there.
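The check described above could look roughly like the following. This is only a sketch under stated assumptions: MissingClonePrefixError, PrimaryNode and remote_url_for are hypothetical names for illustration, and the real sync service would need to hook the check in wherever it builds the remote URL.

```ruby
# Hypothetical names throughout; a sketch, not GitLab's implementation.
class MissingClonePrefixError < StandardError
  MESSAGE = "Cannot sync repositories as there is no clone_url_prefix " \
            "in the primary node database. Check Geo installation details " \
            "or try restarting your primary node"

  def initialize(msg = MESSAGE)
    super
  end
end

# Stand-in for the primary node record stored in the database.
PrimaryNode = Struct.new(:clone_url_prefix)

# Returns the remote URL, or aborts early with a descriptive error when
# the prefix is nil or blank, instead of handing git a bare path.
def remote_url_for(primary_node, path_with_namespace)
  prefix = primary_node.clone_url_prefix
  raise MissingClonePrefixError if prefix.nil? || prefix.strip.empty?

  "#{prefix}#{path_with_namespace}.git"
end
```

Raising before any fetch is attempted means the log would carry the actionable message rather than git's generic "does not appear to be a git repository" output.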