Skip to content

Retry known idempotent SELECT queries on connection-related exceptions

Created by: adrianna-chang-shopify

Motivation / Background

Take 2 of https://github.com/rails/rails/pull/51166, but rather than assuming that any SQL coming from the #select methods is safe to retry, we retry only queries we have constructed and thus know to be idempotent.

Detail

This PR makes two types of queries retry-able by opting into our allow_retry flag:

  1. SELECT queries we construct by walking the Arel tree via #to_sql_and_binds. We use a new retryable attribute on collector classes, which defaults to true for most node types, but will be set to false for non-idempotent node types (functions, SQL literals, update / delete / insert statements, etc). The retryable value is returned from #to_sql_and_binds and used by #select_all and passed down the call stack, eventually reaching the adapter's #internal_exec_query method.

  2. #find and #find_by queries with known attributes. We set allow_retry: true in #cached_find_by, and pass this down to #find_by_sql and #_query_by_sql.

These changes ensure that queries we know are safe to retry can be retried automatically.

Additional information

To verify:

  • Is test coverage sufficient? It's tricky to test every possible Active Record API that could lead to a given SQL query. I've tried to cover the most commonly used ones. I was hoping to test that inserts / updates / deletes are non-retryable, but since these are wrapped in transactions, we end up reconnecting on the BEGIN.
  • Are there Arel node types I've missed that might be non-idempotent? Note that we're considering all NamedFunction and SqlLiteral nodes to be non-idempotent, which likely excludes a lot of potential queries that would be retry-able, but this is the safest option for now.
  • Is cached_find_by always guaranteed to produce an idempotent query? This is the only way to ensure Post.find / Post.find_by(title: "foo") etc. are retryable, and I definitely think we should make these types of queries retryable. find_by with a SQL fragment supplied as args will be flagged as non-retryable.

cc @matthewd

Checklist

Before submitting the PR make sure the following are checked:

  • This Pull Request is related to one change. Changes that are unrelated should be opened in separate PRs.
  • Commit message has a detailed description of what changed and why. If this PR fixes a related issue include it in the commit message. Ex: [Fix #issue-number]
  • Tests are added or updated if you fix a bug or add a feature.
  • CHANGELOG files are updated for the changed libraries if there is a behavior change or additional feature. Minor bug fixes and documentation changes should not be included.

Merge request reports