Start requiring zero downtime deploys
In almost every 8.7 RC we needed downtime due to migrations. Except for maybe 2 or 3, all of these migrations could have been run online with some minor changes. For example, when adding a column to a large table we should not add a default value right away, as doing so locks the entire table. Another example is creating indexes without creating them concurrently, which again requires a table lock.
Since these migrations require around 15-20 minutes of downtime per deploy, which is profoundly annoying for both those deploying and our users, I propose that starting with 8.8 we require zero-downtime migrations unless there's no way around it. This requires us to implement a few things:
- An easy way of adding indexes concurrently on PostgreSQL
- An easy way of creating a column, setting the default value for every row in batches, then setting the default at the table level
- A way to handle deploys that essentially require 3 steps: a migration, a deploy to load new code, and another migration to update any data modified/created in the meantime (e.g. when dealing with bugs that create bogus data)
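To make the first two items more concrete, here is a rough sketch of the statements such helpers might end up running. It is only an illustration under a few assumptions: the helper names `concurrent_index_sql` and `add_column_with_default_sql` are hypothetical, the table is assumed to have an integer `id` primary key, and the default is passed in as an already-quoted SQL literal. The code only builds SQL strings and does not talk to a database.

```ruby
# Hypothetical helpers for illustration only; they build SQL strings and
# never connect to a database.

# Statement for building an index on PostgreSQL without a long table lock.
# Note: CREATE INDEX CONCURRENTLY cannot run inside a transaction block.
def concurrent_index_sql(table, column)
  "CREATE INDEX CONCURRENTLY index_#{table}_on_#{column} ON #{table} (#{column})"
end

# Statements for adding a column with a default in three phases:
#   1. add the column without a default,
#   2. backfill existing rows in batches of ids,
#   3. set the default at the table level.
# Assumes an integer `id` primary key; `default` must be a valid SQL literal.
def add_column_with_default_sql(table, column, type, default, max_id:, batch_size: 1000)
  statements = ["ALTER TABLE #{table} ADD COLUMN #{column} #{type}"]

  # Walk the id space in fixed-size ranges so each UPDATE touches few rows.
  (1..max_id).step(batch_size) do |start_id|
    end_id = [start_id + batch_size - 1, max_id].min
    statements << "UPDATE #{table} SET #{column} = #{default} " \
                  "WHERE id BETWEEN #{start_id} AND #{end_id} AND #{column} IS NULL"
  end

  statements << "ALTER TABLE #{table} ALTER COLUMN #{column} SET DEFAULT #{default}"
  statements
end

# Example usage with made-up table and column names.
puts concurrent_index_sql("issues", "confidential")
puts add_column_with_default_sql("issues", "confidential", "boolean", "FALSE", max_id: 2500)
```

Rows inserted while the backfill is running would still get NULL with this ordering, so a real helper would probably need a final pass after setting the default, or would set the default before backfilling. In a Rails migration the index part can likely be expressed with `add_index ..., algorithm: :concurrently` together with `disable_ddl_transaction!`.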
The first two items are fairly easy to solve and I'll take care of those starting next week. The third item is harder and I'm not sure yet what the solution to this problem is.
We may not be able to eliminate every case that requires downtime, but we can certainly solve 9 out of 10 of them.