
Add foreign keys to various tables that point to the "projects" table

Merged yorickpeterse-staging requested to merge foreign-keys-for-project-model into master

This adds foreign keys to various tables that have project_id columns referring to the projects table. All these foreign keys have an ON DELETE CASCADE clause set, making it much easier and faster to remove data associated with a project (while also enforcing consistency). The MR includes a rather big migration that does all of this without requiring downtime, while making sure no orphaned data exists.

Some associations are still removed by Rails. For example, LFS objects are still removed one by one because for every row we also need to remove data on the file system, and there's no easy way of doing this in bulk. The same applies to CI artifacts and traces, which first need to be migrated to a per-project directory layout (taken care of in https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/11641).

The EE version of this MR (to deal with EE code such as ElasticSearch) can be found here: https://gitlab.com/gitlab-org/gitlab-ee/merge_requests/2223

Related issues/MRs:

Migration Timings

Migration                                Time on Staging
ProjectForeignKeysWithCascadingDeletes   60 minutes at least
CorrectProtectedBranchesForeignKeys      1.6 seconds
AddForeignKeyForMergeRequestDiffs        60 seconds

The ProjectForeignKeysWithCascadingDeletes migration had to be run three times. The first run did not take care of orphans in the protected_branch_push_access_levels table, so it failed when it tried to remove orphans from protected_branches. The second run failed because a table had acquired new orphans after the last removal. The third run took 30 minutes to complete.

Edited by yorickpeterse-staging

Activity

  • mentioned in merge request !8196 (merged)

  • added 1571 commits

  • added 1 commit

    • be647dd0 - Add many foreign keys to the projects table

  • added 1 commit

    • e10d7c1a - Add many foreign keys to the projects table

  • yorickpeterse-staging changed milestone to %9.1

  • added 1779 commits

  • mentioned in commit 47237ad7

  • mentioned in commit 1b620a4d

  • mentioned in commit 962bf01e

  • yorickpeterse-staging changed milestone to %9.2

  • added availability ~18308 labels

  • To recap, this is currently blocked by CI builds having data on the file system in multiple places. To allow PostgreSQL to remove ci_builds rows we need to be able to remove these build files (e.g. traces) without having to rely on DB rows (as these are removed at this point). The easiest way to do so is to store all these files in a directory scoped per project ID, that way we can just nuke the entire directory in one go. This however requires that we first move all existing files into the right place.

    Continuing to remove CI builds one by one in Rails is not an option, as this can still have a negative impact on both performance and availability (e.g. try removing 20 000 rows that way).

  • In other words, we need a file structure that looks like this:

    shared/
      ci/
        123/
          artifacts/
          traces/
          kittens/
        13083/
          artifacts/
            123.txt
          traces/
            456.txt
          kittens/

    In this setup removing the files is just a matter of rm -rf shared/ci/13083 with 13083 being the ID of the project to remove.
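The per-project layout described above can be sketched in a few lines of shell. This is only an illustration under stated assumptions: the scratch directory created with `mktemp` stands in for the real `shared/` root, and the `remove_project_ci_files` helper and its ID check are hypothetical, not part of the merge request.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch root standing in for the real shared/ directory.
root=$(mktemp -d)

# Recreate a simplified version of the layout above for two project IDs.
for project_id in 123 13083; do
  mkdir -p "$root/shared/ci/$project_id/artifacts" \
           "$root/shared/ci/$project_id/traces"
done
touch "$root/shared/ci/13083/artifacts/123.txt" \
      "$root/shared/ci/13083/traces/456.txt"

# Hypothetical helper: remove every CI file of one project in a single call,
# without consulting any database rows.
remove_project_ci_files() {
  project_id=$1
  # Accept only plain digits, so an empty or malformed ID can never
  # turn the command into "rm -rf shared/ci/" or escape the directory.
  case $project_id in
    ''|*[!0-9]*)
      echo "invalid project id: $project_id" >&2
      return 1
      ;;
  esac
  rm -rf -- "$root/shared/ci/$project_id"
}

remove_project_ci_files 13083

# Lists the project directories that are still present.
ls "$root/shared/ci"
```

Because the directory name is the only input, the removal does not depend on any ci_builds rows still existing, which is the property the cascading delete needs.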

  • Looking at the code and @ayufan's suggestion above, I think we're not blocked. Instead, we can use globs and run something like this:

    rm -rf shared/artifacts/*/project-id-or-ci-id-here
    rm -rf shared/builds/*/project-id-or-ci-id-here

    This should take care of removing the trace data, the artifacts file, and the artifacts metadata.

    Edited by yorickpeterse-staging
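The glob-based removal in the comment above can be tried out safely against a scratch directory. The sketch below assumes one intermediate directory level between `shared/artifacts` (or `shared/builds`) and the per-ID directory; the bucket names `2017_05` and `2017_06`, the IDs, and the `remove_by_glob` helper are made up for illustration.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch root standing in for the real shared/ directory.
root=$(mktemp -d)

# Assumed layout: <kind>/<bucket>/<id>/..., where <bucket> is the level
# that the "*" in the glob is meant to match.
mkdir -p "$root/shared/artifacts/2017_05/42" \
         "$root/shared/artifacts/2017_06/42" \
         "$root/shared/artifacts/2017_06/7" \
         "$root/shared/builds/2017_05/42" \
         "$root/shared/builds/2017_06/7"
touch "$root/shared/artifacts/2017_05/42/artifacts.zip" \
      "$root/shared/builds/2017_05/42/1.log"

# Hypothetical helper: remove all directories named after one ID,
# whichever bucket they live in.
remove_by_glob() {
  id=$1
  # Accept only plain digits so the glob cannot widen unexpectedly.
  case $id in
    ''|*[!0-9]*)
      echo "invalid id: $id" >&2
      return 1
      ;;
  esac
  # The "*" is deliberately unquoted so the shell expands it; everything
  # else stays quoted. With -f, a pattern that matches nothing is ignored.
  rm -rf -- "$root"/shared/artifacts/*/"$id" "$root"/shared/builds/*/"$id"
}

remove_by_glob 42
```

Directories belonging to other IDs (here `7`) are left in place, since the glob only varies the bucket level and not the final path component.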