WIP: Bundle pgpool-II 3.6.1
General
-
Get the absolute most basic pgpool setup up and running: 1 pgpool instance, one host to forward queries to (in staging) -
Figure out how to add hosts to pgpool, and if this can be done dynamically - This seems to require updating the pgpool configuration file (http://www.pgpool.net/docs/latest/en/html/runtime-config-backend-settings.html#GUC-BACKEND-HOSTNAME for example), though some options can be set using SQL commands as well (but they're not persisted/flushed to disk) as per http://www.pgpool.net/docs/latest/en/html/sql-pgpool-set.html
- PCP commands can also be used for this, but as far as I can tell they are not persisted: http://www.pgpool.net/docs/latest/en/html/pcp-attach-node.html
-
Log to STDERR so runit can take care of it - Logging to STDERR is the default
-
Figure out if relation caching is useful to us (this might speed up application bootups) - Used for caching system catalog information, doesn't seem to be flushed automatically upon an
ALTER table
; requires a timeout instead. This could speed up queries used by Rails to get the columns for every model, though it's not clear how great that overhead is to begin with
- Used for caching system catalog information, doesn't seem to be flushed automatically upon an
-
Figure out what pgpool uses in-memory caching for, and if it's useful - This appears to be used for caching the output of SELECT queries. This could be quite useful for caching frequently executed queries. The cache appears to be flushed automatically.
-
Figure out if we use temporary tables, disable pgpool checking this if we don't use them (reduces master load) - As far as I can tell we only use this in
lib/gitlab/database/median.rb
and only for MySQL
- As far as I can tell we only use this in
PostgreSQL
-
Move PostgreSQL to a different port (not accessible from the outside), and have pgpool listen on 5432 -
Automatically enable the pgpool_recovery
extension as mentioned on http://www.pgpool.net/docs/latest/en/html/install-pgpool-recovery.html -
adjusting PostgreSQL settings as mentioned on http://www.pgpool.net/docs/latest/en/html/install-pgpool-recovery.html -
Set up PostgreSQL user for pgpool, used for health checks, etc. This user will either need a password, or we somehow allow this user to connect without one (including on remote hosts) -
Ensure a replication user is in place, with a password (or another way to connect to other DB hosts). Currently we use gitlab_replicator
but this is managed by https://gitlab.com/gitlab-cookbooks/gitlab-pgsql-ha/ if I'm not mistaken
pgpool
-
pgpool configuration generation/management ( pgpool.conf
andpgpool_hba.conf
) -
Set up pgpool watchdog (required for managing multiple nodes, reaching consensus, failovers, etc: http://www.pgpool.net/docs/latest/en/html/tutorial-watchdog-intro.html) -
Create/configure the PID directory (if necessary when using runit) -
Support & enable pgpool health checks -
Support & enable pgpool replication checks -
Create/configure the socket directory
Replication
The following is required to set up replication, which is necessary to ensure all load balanced nodes have the same data:
-
Synchronous replication so changes don't vanish between requests: https://www.postgresql.org/docs/9.6/static/warm-standby.html#SYNCHRONOUS-REPLICATION -
Generate a recovery.conf
file in the PostgreSQL data directory. This file should contain the right recovery settings for a standby node. -
Being able to persist node addresses, primaries, etc in Chef