ClearDatabaseCacheWorker keeps retrying and causes a large amount of table bloat

We ran ClearDatabaseCacheWorker manually via a Rake task. The job failed and retried, which caused a large amount of table bloat and replication lag. For more details, see: https://gitlab.com/gitlab-com/infrastructure/issues/1576#note_27127622.

It appears that the Sidekiq job retried multiple times over the course of 24 hours.

Several problems here:

The job should not retry so many times
We should limit the amount of table bloat the job causes in production
After a failure, the job should resume from where it left off rather than re-clear the same database rows
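
To make the first and third points concrete, here is a rough sketch in plain Ruby of a capped retry loop around a batch job that remembers how far it got. Everything in it is hypothetical: `ResumableCacheClearer`, `run_with_retries`, the in-memory `rows` hash standing in for a table, and the batch size and attempt cap are illustration only and are not taken from the actual worker. In a real Sidekiq worker the cap would more likely be set with `sidekiq_options retry: <n>`, and the checkpoint would need to live somewhere durable.

```ruby
# Hypothetical sketch; none of these names come from the GitLab codebase.
class ResumableCacheClearer
  attr_reader :checkpoint

  # rows: { id => { cache: "..." } }, an in-memory stand-in for a table.
  # before_batch: optional hook called with each batch of ids before it is
  # cleared (used here only to simulate failures).
  def initialize(rows, batch_size: 2, &before_batch)
    @rows = rows
    @batch_size = batch_size
    @before_batch = before_batch
    @checkpoint = 0 # highest id already cleared; assumed to be persisted in practice
  end

  # Clears rows in id order, one batch at a time, starting after the checkpoint.
  # If a batch raises, the checkpoint still points at the last finished batch,
  # so the next call to run skips everything that was already cleared.
  def run
    loop do
      ids = @rows.keys.select { |id| id > @checkpoint }.sort.first(@batch_size)
      break if ids.empty?

      @before_batch&.call(ids)
      ids.each { |id| @rows[id][:cache] = nil }
      @checkpoint = ids.last
    end
  end
end

# Runs the clearer at most max_attempts times, then re-raises the last error.
def run_with_retries(clearer, max_attempts: 3)
  attempts = 0
  begin
    attempts += 1
    clearer.run
  rescue StandardError
    retry if attempts < max_attempts
    raise
  end
end
```

Working in small batches would probably also help with the second point, since each attempt rewrites fewer rows, but that is a guess and would need to be measured against the real table.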

Any other ideas, @pcarranza?
