Trending projects query performs sequence scans

The controller method Explore::ProjectsController#trending uses the following query for counting statistics (as per New Relic):

SELECT COUNT(*) AS count_all, projects.id AS projects_id 
FROM ? 
INNER JOIN ? ON ?.? = ?.? 
WHERE ?.? IN (?, ?) 
AND (notes.created_at > ?) 
AND ?.? = ? 
GROUP BY projects.id  
ORDER BY count(notes.id) DESC 
LIMIT ? 
OFFSET ?

Replacing the placeholders with actual values can yield a query such as the following:

SELECT COUNT(*) AS count_all, projects.id AS projects_id
FROM "projects"
INNER JOIN "notes" ON "notes"."project_id" = "projects"."id"
WHERE "projects"."visibility_level" IN (0, 20)
AND (notes.created_at > '2014-01-01')
AND "projects"."archived" is false
GROUP BY projects.id
ORDER BY count(notes.id) DESC
LIMIT 100
OFFSET 0;

If we run this through EXPLAIN ANALYZE we get the following query plan:

                                                                           QUERY PLAN                                                                            
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=203.89..203.90 rows=6 width=8) (actual time=3.821..3.826 rows=6 loops=1)
   ->  Sort  (cost=203.89..203.90 rows=6 width=8) (actual time=3.818..3.821 rows=6 loops=1)
         Sort Key: (count(notes.id))
         Sort Method: quicksort  Memory: 25kB
         ->  GroupAggregate  (cost=33.47..203.81 rows=6 width=8) (actual time=0.097..3.796 rows=6 loops=1)
               Group Key: projects.id
               ->  Merge Join  (cost=33.47..184.20 rows=2607 width=8) (actual time=0.037..3.332 rows=3506 loops=1)
                     Merge Cond: (notes.project_id = projects.id)
                     ->  Index Scan using index_notes_on_project_id on notes  (cost=0.28..396.88 rows=3910 width=8) (actual time=0.007..2.283 rows=3910 loops=1)
                           Filter: (created_at > '2014-01-01 00:00:00'::timestamp without time zone)
                     ->  Sort  (cost=33.19..33.21 rows=6 width=4) (actual time=0.028..0.200 rows=671 loops=1)
                           Sort Key: projects.id
                           Sort Method: quicksort  Memory: 25kB
                           ->  Seq Scan on projects  (cost=0.00..33.11 rows=6 width=4) (actual time=0.004..0.020 rows=6 loops=1)
                                 Filter: ((archived IS FALSE) AND (visibility_level = ANY ('{0,20}'::integer[])))
                                 Rows Removed by Filter: 3
 Planning time: 0.271 ms
 Execution time: 3.886 ms
(18 rows)

This query performs a regular sequence scan on the "projects" table due to a lack of indexes on the following columns:

projects.visibility_level
projects.archived

New Relic in turn shows numerous transactions where running this query took 3-5 seconds, for example:

Adding indexes on these columns will remove the need for full sequence scans, in turn allowing PostgreSQL to perform bitmap heap scans. These aren't as fast as full index scans, but they can be way faster than regular sequence scans.

Admin message

Admin message

Trending projects query performs sequence scans