Commit 71e34356 authored by Shubham Kumar, committed by GitLab

Add and backfill project_id for container_repository_states

## What does this MR do and why?

Add and backfill project_id for container_repository_states.

This table has a
[desired sharding key](https://docs.gitlab.com/ee/development/database/multiple_databases.html#define-a-desired_sharding_key-to-automatically-backfill-a-sharding_key)
configured ([view configuration](https://gitlab.com/gitlab-org/gitlab/-/blob/master/db/docs/container_repository_states.yml)).

This merge request is the first step towards transforming the desired sharding key into a
[sharding key](https://docs.gitlab.com/ee/development/database/multiple_databases.html#defining-a-sharding-key-for-all-cell-local-tables).

This involves three changes:

- Adding a new column that will serve as the sharding key (along with the relevant index and foreign key).
- Populating the sharding key when new records are created by adding a database function and trigger.
- Scheduling a [batched background migration](https://docs.gitlab.com/ee/development/database/batched_background_migrations.html)
  to set the sharding key for existing records.

Once the background migration has completed, a second merge request will be created to finalize the background
migration and validate the not null constraint.

## How to verify

We have assigned a random backend engineer from ~"group::geo" to review these changes. Please review this merge
request from a ~backend perspective. The main thing we are looking to verify is that the added column and association
match the values specified by the [desired sharding key](https://gitlab.com/gitlab-org/gitlab/-/blob/master/db/docs/container_repository_states.yml)
configuration and that backfilling the column from `container_repositories` makes sense in the context of this feature.

When you are finished, please:

1. Trigger the [database testing pipeline](https://docs.gitlab.com/ee/development/database/database_migration_pipeline.html)
   as instructed by Danger.
1. Request a review from the ~backend maintainer and ~database reviewer suggested by Danger.

If you have any questions or concerns, reach out to `@tigerwnz` or @shubhamkrai.

This merge request was generated by a one-off keep implemented in
https://gitlab.com/gitlab-org/gitlab/-/merge_requests/143774

This change was generated by
[gitlab-housekeeper](https://gitlab.com/gitlab-org/gitlab/-/tree/master/gems/gitlab-housekeeper)
using the Keeps::BackfillDesiredShardingKeySmallTable keep.

To provide feedback on your experience with `gitlab-housekeeper` please create an issue with the
label ~"GitLab Housekeeper" and consider pinging the author of this keep.

Changelog: other
parent f4b8d893
203 additions and 0 deletions
---
migration_job_name: BackfillContainerRepositoryStatesProjectId
description: Backfills sharding key `container_repository_states.project_id` from `container_repositories`.
feature_category: geo_replication
introduced_by_url: https://gitlab.com/gitlab-org/gitlab/-/merge_requests/169240
milestone: '17.6'
queued_migration_version: 20241015082361
finalized_by: # version of the migration that finalized this BBM
@@ -19,3 +19,4 @@ desired_sharding_key:
table: container_repositories
sharding_key: project_id
belongs_to: container_repository
desired_sharding_key_migration_job_name: BackfillContainerRepositoryStatesProjectId
# frozen_string_literal: true

class AddProjectIdToContainerRepositoryStates < Gitlab::Database::Migration[2.2]
  milestone '17.6'

  def change
    add_column :container_repository_states, :project_id, :bigint
  end
end
# frozen_string_literal: true

class IndexContainerRepositoryStatesOnProjectId < Gitlab::Database::Migration[2.2]
  milestone '17.6'
  disable_ddl_transaction!

  INDEX_NAME = 'index_container_repository_states_on_project_id'

  def up
    add_concurrent_index :container_repository_states, :project_id, name: INDEX_NAME
  end

  def down
    remove_concurrent_index_by_name :container_repository_states, INDEX_NAME
  end
end
# frozen_string_literal: true

class AddContainerRepositoryStatesProjectIdFk < Gitlab::Database::Migration[2.2]
  milestone '17.6'
  disable_ddl_transaction!

  def up
    add_concurrent_foreign_key :container_repository_states, :projects, column: :project_id, on_delete: :cascade
  end

  def down
    with_lock_retries do
      remove_foreign_key :container_repository_states, column: :project_id
    end
  end
end
# frozen_string_literal: true

class AddContainerRepositoryStatesProjectIdTrigger < Gitlab::Database::Migration[2.2]
  milestone '17.6'

  def up
    install_sharding_key_assignment_trigger(
      table: :container_repository_states,
      sharding_key: :project_id,
      parent_table: :container_repositories,
      parent_sharding_key: :project_id,
      foreign_key: :container_repository_id
    )
  end

  def down
    remove_sharding_key_assignment_trigger(
      table: :container_repository_states,
      sharding_key: :project_id,
      parent_table: :container_repositories,
      parent_sharding_key: :project_id,
      foreign_key: :container_repository_id
    )
  end
end
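
The trigger installed by the migration above generates a PL/pgSQL function (`trigger_fd4a1be98713` in `structure.sql`) that fills `project_id` from the parent row only when the incoming row does not already carry one. As a rough illustration of that rule, here is a plain-Ruby model. The `assign_sharding_key` method and the in-memory `CONTAINER_REPOSITORIES` hash are hypothetical stand-ins for the trigger and the parent table, not GitLab code.

```ruby
# Illustrative model of the BEFORE INSERT OR UPDATE trigger; not GitLab code.
# CONTAINER_REPOSITORIES is a hypothetical stand-in for the parent table,
# keyed by container_repositories.id.
CONTAINER_REPOSITORIES = {
  1 => { project_id: 10 },
  2 => { project_id: 20 }
}.freeze

# Mirrors the trigger body: look up the parent only when project_id is NULL.
def assign_sharding_key(new_row, parents = CONTAINER_REPOSITORIES)
  return new_row unless new_row[:project_id].nil?

  parent = parents[new_row[:container_repository_id]]
  # Like SELECT ... INTO, the value stays NULL when no parent row matches.
  new_row.merge(project_id: parent && parent[:project_id])
end
```

Under this model, a row written with an explicit `project_id` is left untouched, which is why the trigger and the background backfill can coexist without overwriting each other.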
# frozen_string_literal: true

class QueueBackfillContainerRepositoryStatesProjectId < Gitlab::Database::Migration[2.2]
  milestone '17.6'
  restrict_gitlab_migration gitlab_schema: :gitlab_main_cell

  MIGRATION = "BackfillContainerRepositoryStatesProjectId"
  DELAY_INTERVAL = 2.minutes
  BATCH_SIZE = 1000
  SUB_BATCH_SIZE = 100

  def up
    queue_batched_background_migration(
      MIGRATION,
      :container_repository_states,
      :container_repository_id,
      :project_id,
      :container_repositories,
      :project_id,
      :container_repository_id,
      job_interval: DELAY_INTERVAL,
      batch_size: BATCH_SIZE,
      sub_batch_size: SUB_BATCH_SIZE
    )
  end

  def down
    delete_batched_background_migration(
      MIGRATION,
      :container_repository_states,
      :container_repository_id,
      [
        :project_id,
        :container_repositories,
        :project_id,
        :container_repository_id
      ]
    )
  end
end
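
To give a feel for what `BATCH_SIZE = 1000` and `SUB_BATCH_SIZE = 100` mean in the migration above, the following plain-Ruby sketch splits a list of ids into batches and sub-batches. The `plan_batches` helper is hypothetical and works on an in-memory array. The real batched background migration framework derives batch bounds in the database, runs one job per batch with at least the configured interval between jobs, and may adjust sizes, so treat this as an approximation only.

```ruby
# Hypothetical helper, not part of GitLab: splits ids into batches of
# `batch_size`, then each batch into sub-batch id ranges of `sub_batch_size`.
def plan_batches(ids, batch_size:, sub_batch_size:)
  ids.sort.each_slice(batch_size).map do |batch|
    batch.each_slice(sub_batch_size).map { |sub_batch| (sub_batch.first..sub_batch.last) }
  end
end

# Example with 2,500 contiguous ids and the sizes used in this migration.
plan = plan_batches((1..2500).to_a, batch_size: 1000, sub_batch_size: 100)
plan.each_with_index do |sub_batches, index|
  puts "batch #{index + 1}: #{sub_batches.size} sub-batches, ids #{sub_batches.first.first}..#{sub_batches.last.last}"
end
```

With contiguous ids, each full batch yields ten sub-batches, so every update statement touches at most 100 rows.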
b9f5d2ca9d39c79c8f959cf10ef337e4cb8c9f814db7b75a1df31905ade33092
\ No newline at end of file
d30752181eecdebf3b831f661041f0334bf77e93955fe4e38518e40c92179368
\ No newline at end of file
aedd6ae050b120a58d190d304cd5291c58e2580174a42775a6d0f89d81a16f88
\ No newline at end of file
a4ae12a361d0ec188a8ee0fba7befd753016014bf7e0e31008bdfb7592a75e18
\ No newline at end of file
3748a53af91a3292c72687ac815b8a1291dd54f73337b6d2d584e4946a095e8a
\ No newline at end of file
@@ -2635,6 +2635,22 @@ RETURN NEW;
END
$$;
 
CREATE FUNCTION trigger_fd4a1be98713() RETURNS trigger
    LANGUAGE plpgsql
    AS $$
BEGIN
IF NEW."project_id" IS NULL THEN
  SELECT "project_id"
  INTO NEW."project_id"
  FROM "container_repositories"
  WHERE "container_repositories"."id" = NEW."container_repository_id";
END IF;

RETURN NEW;

END
$$;

CREATE FUNCTION trigger_ff16c1fd43ea() RETURNS trigger
LANGUAGE plpgsql
AS $$
@@ -9889,6 +9905,7 @@ CREATE TABLE container_repository_states (
verification_retry_count smallint DEFAULT 0 NOT NULL,
verification_checksum bytea,
verification_failure text,
project_id bigint,
CONSTRAINT check_c96417dbc5 CHECK ((char_length(verification_failure) <= 255))
);
 
@@ -28714,6 +28731,8 @@ CREATE INDEX index_container_repository_states_failed_verification ON container_
 
CREATE INDEX index_container_repository_states_needs_verification ON container_repository_states USING btree (verification_state) WHERE ((verification_state = 0) OR (verification_state = 3));
 
CREATE INDEX index_container_repository_states_on_project_id ON container_repository_states USING btree (project_id);
CREATE INDEX index_container_repository_states_on_verification_state ON container_repository_states USING btree (verification_state);
 
CREATE INDEX index_container_repository_states_pending_verification ON container_repository_states USING btree (verified_at NULLS FIRST) WHERE (verification_state = 0);
@@ -33992,6 +34011,8 @@ CREATE TRIGGER trigger_fbd42ed69453 BEFORE INSERT OR UPDATE ON external_status_c
 
CREATE TRIGGER trigger_fbd8825b3057 BEFORE INSERT OR UPDATE ON boards_epic_board_labels FOR EACH ROW EXECUTE FUNCTION trigger_fbd8825b3057();
 
CREATE TRIGGER trigger_fd4a1be98713 BEFORE INSERT OR UPDATE ON container_repository_states FOR EACH ROW EXECUTE FUNCTION trigger_fd4a1be98713();
CREATE TRIGGER trigger_ff16c1fd43ea BEFORE INSERT OR UPDATE ON geo_event_log FOR EACH ROW EXECUTE FUNCTION trigger_ff16c1fd43ea();
 
CREATE TRIGGER trigger_fff8735b6b9a BEFORE INSERT OR UPDATE ON vulnerability_finding_signatures FOR EACH ROW EXECUTE FUNCTION trigger_fff8735b6b9a();
@@ -34652,6 +34673,9 @@ ALTER TABLE ONLY ci_pipeline_chat_data
ALTER TABLE ONLY cluster_agent_tokens
ADD CONSTRAINT fk_64f741f626 FOREIGN KEY (project_id) REFERENCES projects(id) ON DELETE CASCADE;
 
ALTER TABLE ONLY container_repository_states
ADD CONSTRAINT fk_6591698505 FOREIGN KEY (project_id) REFERENCES projects(id) ON DELETE CASCADE;
ALTER TABLE ONLY import_placeholder_memberships
ADD CONSTRAINT fk_66286fb5e6 FOREIGN KEY (group_id) REFERENCES namespaces(id) ON DELETE CASCADE;
 
# frozen_string_literal: true

module Gitlab
  module BackgroundMigration
    class BackfillContainerRepositoryStatesProjectId < BackfillDesiredShardingKeyJob
      operation_name :backfill_container_repository_states_project_id
      feature_category :geo_replication
    end
  end
end
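
The job class above is empty because the behaviour comes from `BackfillDesiredShardingKeyJob` together with the job arguments passed when the migration is queued. The sketch below shows the kind of `UPDATE` statement a backfill of this shape could issue for one sub-batch. The `backfill_sql` helper, its parameter names, and the exact `WHERE` clauses are assumptions made for illustration; the parent class in the GitLab codebase may build its query differently.

```ruby
# Hypothetical helper, not the GitLab implementation: builds an UPDATE that
# copies the sharding key from the parent table for one range of ids.
def backfill_sql(batch_table:, batch_column:, backfill_column:, via_table:, via_column:, via_foreign_key:, id_range:)
  # Coerce the bounds to integers so only numbers are interpolated.
  first_id = Integer(id_range.first)
  last_id = Integer(id_range.last)

  <<~SQL
    UPDATE #{batch_table}
    SET #{backfill_column} = #{via_table}.#{via_column}
    FROM #{via_table}
    WHERE #{via_table}.id = #{batch_table}.#{via_foreign_key}
      AND #{batch_table}.#{backfill_column} IS NULL
      AND #{batch_table}.#{batch_column} BETWEEN #{first_id} AND #{last_id}
  SQL
end

# The arguments mirror the ones queued by this merge request.
puts backfill_sql(
  batch_table: :container_repository_states,
  batch_column: :container_repository_id,
  backfill_column: :project_id,
  via_table: :container_repositories,
  via_column: :project_id,
  via_foreign_key: :container_repository_id,
  id_range: 1..100
)
```

Restricting the update to rows where `project_id IS NULL` keeps the sketch idempotent and avoids rewriting rows that the insert trigger has already populated.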
# frozen_string_literal: true

require 'spec_helper'

RSpec.describe Gitlab::BackgroundMigration::BackfillContainerRepositoryStatesProjectId,
  feature_category: :geo_replication,
  schema: 20241015082357 do
  include_examples 'desired sharding key backfill job' do
    let(:batch_table) { :container_repository_states }
    let(:batch_column) { :container_repository_id }
    let(:backfill_column) { :project_id }
    let(:backfill_via_table) { :container_repositories }
    let(:backfill_via_column) { :project_id }
    let(:backfill_via_foreign_key) { :container_repository_id }
  end
end
# frozen_string_literal: true

require 'spec_helper'
require_migration!

RSpec.describe QueueBackfillContainerRepositoryStatesProjectId, feature_category: :geo_replication do
  let!(:batched_migration) { described_class::MIGRATION }

  it 'schedules a new batched migration' do
    reversible_migration do |migration|
      migration.before -> {
        expect(batched_migration).not_to have_scheduled_batched_migration
      }

      migration.after -> {
        expect(batched_migration).to have_scheduled_batched_migration(
          table_name: :container_repository_states,
          column_name: :container_repository_id,
          interval: described_class::DELAY_INTERVAL,
          batch_size: described_class::BATCH_SIZE,
          sub_batch_size: described_class::SUB_BATCH_SIZE,
          gitlab_schema: :gitlab_main_cell,
          job_arguments: [
            :project_id,
            :container_repositories,
            :project_id,
            :container_repository_id
          ]
        )
      }
    end
  end
end