Add mirror related indexes
These indexes are needed because of the two SLAs set in https://gitlab.com/gitlab-org/gitlab-ee/issues/3641#note_42534211, which are described in more detail in https://gitlab.com/gitlab-cookbooks/gitlab-prometheus/merge_requests/319 and https://gitlab.com/gitlab-cookbooks/gitlab-prometheus/merge_requests/320.
Before on staging:
                                            QUERY PLAN
---------------------------------------------------------------------------------------------------
 Aggregate  (cost=87173.87..87173.88 rows=1 width=8)
   ->  Nested Loop  (cost=0.43..87173.86 rows=1 width=0)
         ->  Seq Scan on project_mirror_data  (cost=0.00..1594.16 rows=23895 width=4)
               Filter: (next_execution_timestamp <= (now() - '00:10:00'::interval))
         ->  Index Scan using projects_pkey on projects  (cost=0.43..3.57 rows=1 width=4)
               Index Cond: (id = project_mirror_data.project_id)
               Filter: (mirror AND ((import_status)::text <> ALL ('{scheduled,started}'::text[])))
After on staging:
                                            QUERY PLAN
---------------------------------------------------------------------------------------------------
 Aggregate  (cost=7.34..7.35 rows=1 width=8)
   ->  Nested Loop  (cost=0.41..7.34 rows=1 width=0)
         ->  Index Scan using stark_tmp4 on projects  (cost=0.12..3.02 rows=1 width=4)
               Filter: ((import_status)::text <> ALL ('{scheduled,started}'::text[]))
         ->  Index Scan using index_project_mirror_data_on_project_id on project_mirror_data  (cost=0.29..4.31 rows=1 width=4)
               Index Cond: (project_id = projects.id)
               Filter: (next_execution_timestamp <= (now() - '00:10:00'::interval))
One of these may also address one of the heavy hitter queries from production:
SELECT COUNT(*) FROM "projects" INNER JOIN "project_mirror_data" ON "project_mirror_data"."project_id" = "projects"."id" WHERE "projects"."mirror" = ? AND ("projects"."import_status" NOT IN (?,?)) AND (next_execution_timestamp <= ?);
However, testing this on staging proved fruitless: all mirroring is turned off there, so every next_execution_timestamp field is set to an old value.
== 20171006114203 AddIndexToProjectMirrorDataNextExecutionTimestamp: migrating
-- transaction_open?()
   -> 0.0000s
-- execute("SET statement_timeout TO 0")
   -> 0.0004s
-- add_index(:project_mirror_data, :next_execution_timestamp, {:algorithm=>:concurrently})
   -> 0.0438s
== 20171006114203 AddIndexToProjectMirrorDataNextExecutionTimestamp: migrated (0.0445s)
== 20171006121407 AddIndexToProjectMirrorLastUpdateAt: migrating ==============
-- transaction_open?()
   -> 0.0000s
-- execute("SET statement_timeout TO 0")
   -> 0.0003s
-- add_index(:projects, [:mirror, :mirror_last_update_at], {:where=>"mirror", :algorithm=>:concurrently})
   -> 0.0127s
== 20171006121407 AddIndexToProjectMirrorLastUpdateAt: migrated (0.0132s) =====
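For reviewers who want to see what the two add_index calls above amount to in SQL, here is a rough sketch. It assumes Rails' default index naming (index_<table>_on_<columns>), since the migration output does not show explicit names, and the index_sql helper is hypothetical: it is not part of Rails or GitLab, and the statement Rails actually emits may differ in whitespace or quoting.

```ruby
# Hypothetical helper (not a Rails or GitLab API): renders approximately the
# statement that add_index with algorithm: :concurrently asks PostgreSQL to run.
def index_sql(table, columns, where: nil)
  columns = Array(columns).map(&:to_s)

  # Assumption: no explicit :name was given, so Rails' default naming applies.
  name = "index_#{table}_on_#{columns.join('_and_')}"
  quoted_columns = columns.map { |column| "\"#{column}\"" }.join(', ')

  sql = format('CREATE INDEX CONCURRENTLY "%s" ON "%s" (%s)', name, table, quoted_columns)

  # A :where option turns the index into a partial index.
  sql += " WHERE #{where}" if where
  sql
end

NEXT_EXECUTION_SQL = index_sql(:project_mirror_data, :next_execution_timestamp)
MIRROR_UPDATE_SQL  = index_sql(:projects, [:mirror, :mirror_last_update_at], where: 'mirror')

puts NEXT_EXECUTION_SQL
puts MIRROR_UPDATE_SQL
```

The second index is partial (WHERE mirror), so it only covers rows for mirrored projects, which is the subset the query above filters on. CREATE INDEX CONCURRENTLY cannot run inside a transaction block, which is presumably why the migration output starts with a transaction_open?() check.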
Database Checklist
When adding migrations:
- Updated db/schema.rb
- Added a down method so the migration can be reverted
- Added the output of the migration(s) to the MR body
- Added the execution time of the migration(s) to the MR body
- Added tests for the migration in spec/migrations if necessary (e.g. when migrating data)
- Made sure the migration won't interfere with a running GitLab cluster, for example by disabling transactions for long running migrations
When adding or modifying queries to improve performance:
- Included the raw SQL queries of the relevant queries
- Included the output of EXPLAIN ANALYZE and execution timings of the relevant queries
- Added tests for the relevant changes
When adding indexes:
- Described the need for these indexes in the MR body
- Made sure existing indexes can not be reused instead
When adding foreign keys to existing tables:
- Included a migration to remove orphaned rows in the source table
- Removed any instances of dependent: ... that may no longer be necessary
When adding tables:
- Ordered columns based on their type sizes in descending order
- Added foreign keys if necessary
- Added indexes if necessary
When removing columns, tables, indexes or other structures:
- Removed these in a post-deployment migration
- Made sure the application no longer uses (or ignores) these structures
General Checklist
- Changelog entry added, if necessary
- Documentation created/updated
- API support added
- Tests added for this feature/bug
- Review
  - Has been reviewed by UX
  - Has been reviewed by Frontend
  - Has been reviewed by Backend
  - Has been reviewed by Database
- Conform by the merge request performance guides
- Conform by the style guides
- Squashed related commits together