Geo: Files are being created on disc during sync with wrong contents
Summary
We just added LFS objects and attachments to the Geo testbed: https://gitlab.com/gitlab-com/infrastructure/issues/2805#note_42261401
Tracking LFS object with database ID 2559490 (LfsObject.find(2559490)
), I noticed that this genuinely doesn't exist on prod.geo for whatever reason. A job has run to synchronize it on sync.geo
, which failed with a 404, but there exists a file in the right place on sync.geo.gitlab.com
with these contents:
{"code":"not_found","message":"LFS object does not have a file"}
Steps to reproduce
This will be seen whenever a response to a file sync attempt (of any object type) returns anything but the expected file contents.
What is the current bug behavior?
Incorrect data is written to disc
What is the expected correct behavior?
If the object is not found on the primary, we should not write anything to disc on the secondary
Possible fixes
https://gitlab.com/gitlab-org/gitlab-ee/blob/v10.0.0-ee/lib/gitlab/geo/transfer.rb#L63
If the response code is anything but a 200, we can discard the request body (possibly after logging it) without streaming it to disc.
Streaming to disc is the right thing to do in the 200 case as it means we don't hold the entire response in memory. If receiving the response headers is somehow inseparable from receiving the response body (and it shouldn't be!), we can stream to a temporary file then mv
it into place.
I expect that if we try to download one of these unsuccessful files from the sync node, we'll get back the bad data rather than a 404 or 500 response. This suggests that when we fix this, we'll need to include some means of wiping the bad files :-/
/cc @stanhu @dbalexandre @to1ne @brodock
/cc @stanhu @