Skip to content
Snippets Groups Projects
Select Git revision
  • master default protected
  • 12-9-stable
  • 12-7-stable
  • 12-6-stable
  • 12-8-stable
  • github/fork/Kloppi313/patch-1
  • 12-5-stable
  • 12-4-stable
  • github/fork/ramalokesh8477/master
  • 12-1-stable
  • 12-2-stable
  • 12-0-stable
  • 12-3-stable
  • 42-42-stable
  • github/fork/hussamgit398/patch-2
  • 12-3-auto-deploy-20190911
  • 12-3-auto-deploy-20190916
  • 12-3-auto-deploy-20190908
  • 12-3-auto-deploy-20190901
  • 12-3-auto-deploy-20190901-32664
  • v12.10.0.pre
  • v12.9.0
  • v12.9.0-rc42
  • v12.8.7
  • v12.8.6
  • v12.8.5
  • v12.8.4
  • v12.8.3
  • v12.6.8
  • v12.7.7
  • v12.8.2
  • v12.8.1
  • v12.9.0.pre
  • v12.8.0
  • v12.8.0-rc42
  • v12.5.10
  • v12.7.6
  • v12.6.7
  • v12.7.5
  • v12.5.9
40 results

projects_finder.rb

  • Toon Claes's avatar
    01c6323d
    UNION of SELECT/WHERE is faster than WHERE on UNION · 01c6323d
    Toon Claes authored
    Instead of applying WHERE on a UNION, apply the WHERE on each of the seperate
    SELECT statements, and do UNION on that.
    
    Local tests with about 2_000_000 projects:
     - 1_500_000 private projects
     -    40_000 internal projects
     -   400_000 public projects
    
    For the API endpoint `/api/v4/projects?visibility=private` the slowest query was:
    
    ```sql
    SELECT "projects".*
    FROM "projects"
    WHERE ...
    ```
    
    The original query took 1073.8ms.
    The query refactored to UNION of SELECT/WHERE took 2.3ms.
    
    The original query was:
    
    ```sql
    SELECT "projects".*
    FROM "projects"
    WHERE "projects"."pending_delete" = $1
      AND (projects.id IN
             (SELECT "projects"."id"
              FROM "projects"
              INNER JOIN "project_authorizations" ON "projects"."id" = "project_authorizations"."project_id"
              WHERE "projects"."pending_delete" = 'f'
                AND "project_authorizations"."user_id" = 23
              UNION SELECT "projects"."id"
              FROM "projects"
              WHERE "projects"."visibility_level" IN (20,
                                                      10)))
      AND "projects"."visibility_level" = $2
      AND "projects"."archived" = $3
    ORDER BY "projects"."created_at" DESC
    LIMIT 20
    OFFSET 0 [["pending_delete", "f"],
           ["visibility_level", 0],
           ["archived", "f"]]
    ```
    
    The refactored query:
    ```sql
    SELECT "projects".*
    FROM "projects"
    WHERE "projects"."pending_delete" = $1
      AND (projects.id IN
             (SELECT "projects"."id"
              FROM "projects"
              INNER JOIN "project_authorizations" ON "projects"."id" = "project_authorizations"."project_id"
              WHERE "projects"."pending_delete" = 'f'
                AND "project_authorizations"."user_id" = 23
                AND "projects"."visibility_level" = 0
                AND "projects"."archived" = 'f'
              UNION SELECT "projects"."id"
              FROM "projects"
              WHERE "projects"."visibility_level" IN (20,
                                                      10)
                AND "projects"."visibility_level" = 0
                AND "projects"."archived" = 'f'))
    ORDER BY "projects"."created_at" DESC
    LIMIT 20
    OFFSET 0 [["pending_delete", "f"]]
    ```
    01c6323d
    History
    UNION of SELECT/WHERE is faster than WHERE on UNION
    Toon Claes authored
    Instead of applying WHERE on a UNION, apply the WHERE on each of the seperate
    SELECT statements, and do UNION on that.
    
    Local tests with about 2_000_000 projects:
     - 1_500_000 private projects
     -    40_000 internal projects
     -   400_000 public projects
    
    For the API endpoint `/api/v4/projects?visibility=private` the slowest query was:
    
    ```sql
    SELECT "projects".*
    FROM "projects"
    WHERE ...
    ```
    
    The original query took 1073.8ms.
    The query refactored to UNION of SELECT/WHERE took 2.3ms.
    
    The original query was:
    
    ```sql
    SELECT "projects".*
    FROM "projects"
    WHERE "projects"."pending_delete" = $1
      AND (projects.id IN
             (SELECT "projects"."id"
              FROM "projects"
              INNER JOIN "project_authorizations" ON "projects"."id" = "project_authorizations"."project_id"
              WHERE "projects"."pending_delete" = 'f'
                AND "project_authorizations"."user_id" = 23
              UNION SELECT "projects"."id"
              FROM "projects"
              WHERE "projects"."visibility_level" IN (20,
                                                      10)))
      AND "projects"."visibility_level" = $2
      AND "projects"."archived" = $3
    ORDER BY "projects"."created_at" DESC
    LIMIT 20
    OFFSET 0 [["pending_delete", "f"],
           ["visibility_level", 0],
           ["archived", "f"]]
    ```
    
    The refactored query:
    ```sql
    SELECT "projects".*
    FROM "projects"
    WHERE "projects"."pending_delete" = $1
      AND (projects.id IN
             (SELECT "projects"."id"
              FROM "projects"
              INNER JOIN "project_authorizations" ON "projects"."id" = "project_authorizations"."project_id"
              WHERE "projects"."pending_delete" = 'f'
                AND "project_authorizations"."user_id" = 23
                AND "projects"."visibility_level" = 0
                AND "projects"."archived" = 'f'
              UNION SELECT "projects"."id"
              FROM "projects"
              WHERE "projects"."visibility_level" IN (20,
                                                      10)
                AND "projects"."visibility_level" = 0
                AND "projects"."archived" = 'f'))
    ORDER BY "projects"."created_at" DESC
    LIMIT 20
    OFFSET 0 [["pending_delete", "f"]]
    ```