Skip to content

Add (partial) index on Labels.template

Gregory Stark requested to merge add-labels-template-index into master

The top query of sequential i/o is:

SELECT "labels".* FROM "labels" WHERE "labels"."template" = 't'  ORDER BY "labels"."title" ASC

We have zero rows in gitlab.com where template but even on a site which uses templates heavily this will be a small portion of the labels table so this query is doing a lot of i/o for nothing. This MR creates a partial index WHERE template which will be empty or tiny and make this query go from being a top i/o consumer to being nearly free.

== 20170927122209 AddPartialIndexForLabelsTemplate: migrating =================
-- transaction_open?()
   -> 0.0000s
-- execute("SET statement_timeout TO 0")
   -> 0.0004s
-- add_index("labels", ["template"], {:where=>"template", :algorithm=>:concurrently})
   -> 0.0241s
== 20170927122209 AddPartialIndexForLabelsTemplate: migrated (0.0249s) ========

In staging:

gitlabhq_production=# explain analyze SELECT "labels".* FROM "labels" WHERE "labels"."template" = 't'  ORDER BY "labels"."title" ASC;
                                                   QUERY PLAN                                                   
----------------------------------------------------------------------------------------------------------------
 Sort  (cost=76263.82..76263.82 rows=1 width=127) (actual time=271.830..271.830 rows=0 loops=1)
   Sort Key: title
   Sort Method: quicksort  Memory: 25kB
   ->  Seq Scan on labels  (cost=0.00..76263.81 rows=1 width=127) (actual time=271.799..271.799 rows=0 loops=1)
         Filter: template
         Rows Removed by Filter: 1772576
 Planning time: 0.094 ms
 Execution time: 271.879 ms
(8 rows)

gitlabhq_production=# begin;
BEGIN

gitlabhq_production=# create index tmp on labels (template) where (template);
CREATE INDEX

gitlabhq_production=# explain analyze SELECT "labels".* FROM "labels" WHERE "labels"."template" = 't'  ORDER BY "labels"."title" ASC;
                                                     QUERY PLAN                                                     
--------------------------------------------------------------------------------------------------------------------
 Sort  (cost=2.15..2.15 rows=1 width=127) (actual time=0.011..0.011 rows=0 loops=1)
   Sort Key: title
   Sort Method: quicksort  Memory: 25kB
   ->  Index Scan using tmp on labels  (cost=0.12..2.14 rows=1 width=127) (actual time=0.002..0.002 rows=0 loops=1)
 Planning time: 0.361 ms
 Execution time: 0.041 ms
(6 rows)

gitlabhq_production=# abort;
ROLLBACK

Database Checklist

When adding migrations:

  • Updated db/schema.rb
  • Added a down method so the migration can be reverted
  • Added the output of the migration(s) to the MR body
  • Added the execution time of the migration(s) to the MR body
  • Added tests for the migration in spec/migrations if necessary (e.g. when migrating data)
  • Made sure the migration won't interfere with a running GitLab cluster, for example by disabling transactions for long running migrations

When adding or modifying queries to improve performance:

  • Included the raw SQL queries of the relevant queries
  • Included the output of EXPLAIN ANALYZE and execution timings of the relevant queries
  • Added tests for the relevant changes

When adding indexes:

  • Described the need for these indexes in the MR body
  • Made sure existing indexes can not be reused instead

When adding foreign keys to existing tables:

  • Included a migration to remove orphaned rows in the source table
  • Removed any instances of dependent: ... that may no longer be necessary

When adding tables:

  • Ordered columns based on their type sizes in descending order
  • Added foreign keys if necessary
  • Added indexes if necessary

When removing columns, tables, indexes or other structures:

  • Removed these in a post-deployment migration
  • Made sure the application no longer uses (or ignores) these structures

General Checklist

Edited by yorickpeterse-staging

Merge request reports