Soft removals mess up query plans
For issues, merge requests, and various other tables we use soft removals. Instead of moving soft removed data to a separate table we use deleted_at
timestamp columns that we filter by along the lines of WHERE deleted_at IS NULL
.
In almost all cases this will lead to a less than ideal query plan where PostgreSQL will use a Filter for this column instead of being able to perform a regular index scan. The most recent example of this can be found in https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/13995. The exact performance overhead is hard to estimate because this depends on the amount of soft removed data.
This particular problem is similar to that of storing both active and pending members in the same table, then filtering them out using a column: https://gitlab.com/gitlab-org/gitlab-ce/issues/30704. In both cases the solution is very simple: store the data in a separate table and you never have to filter by it. Or perhaps even better: just hard remove it.
In the past we had rogue bots trying to remove users. While soft removals would have prevented physical data loss the current setup (and in my opinion soft removals in general) are the wrong approach. Something as dangerous as physically removing a user is something that should never happen unattended in the first place, soft removals or not. This is like removing a production database and then blaming the file system for not having a transaction log that you can replay/revert.
This might be a bit of an extreme approach (and I don't know if we can revert soft removals), so an alternative would be to store data in a separate table. This table could either be a 1:1 copy structure wise, or use JSON. I'm genuinely not a fan of storing structured data but this might be one of few cases where it could be useful for storing the row's details without having to replicate the entire schema.
A downside is that when migrating the source table we might also have to migrate removed rows as we may otherwise be unable to restore them. In general however I would advise just getting rid of soft removals all together as it removes a lot of complexity.
I'm labelling this AP2 not because it impacts timings so much, but because it impacts every single query that goes to issues
or merge_requests
. This means that getting rid of this would potentially improve a lot of queries (or at the very least make their plans simpler).