Create a pgbouncer troubleshooting runbook
Today, after the NFS disaster, the PostgreSQL database ran out of connections, which impacted the whole site with 500 errors.
When this happened, I went to the runbooks to find out how to deal with it and found nothing except a troubleshooting section that said to bounce the process.
I know that we have better tooling than this for dealing with this kind of outage, so I think we are missing at least 2 things:
Alerting for when the database runs out of active connections.
A proper troubleshooting runbook for pgbouncer explaining how to deal with this kind of situation.
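
As a starting point for both items, here is a minimal sketch of the kind of check an alert or a runbook step could be built on. It assumes the rows come from pgbouncer's admin console command `SHOW POOLS` and uses a subset of its columns (`cl_waiting`, `sv_idle`, `maxwait`, and so on). The `PoolRow` type, the `find_saturated_pools` helper, the 5 second threshold, and the database name used in any example are hypothetical, not something we run today. Fetching the rows from pgbouncer is left out on purpose.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PoolRow:
    """Subset of the columns returned by pgbouncer's SHOW POOLS."""
    database: str
    user: str
    cl_active: int   # client connections linked to a server connection
    cl_waiting: int  # client connections waiting for a server connection
    sv_active: int   # server connections currently in use by clients
    sv_idle: int     # server connections idle and ready for reuse
    maxwait: int     # seconds the oldest waiting client has been queued


def find_saturated_pools(rows: List[PoolRow], max_wait_seconds: int = 5) -> List[str]:
    """Return one message per pool that looks like it is out of connections.

    A pool is flagged when clients are queued and either no idle server
    connection is left or the oldest client has waited too long.
    The threshold is an assumption and would need tuning.
    """
    alerts = []
    for row in rows:
        if row.cl_waiting == 0:
            continue  # nobody is queued, so the pool is keeping up
        if row.sv_idle == 0 or row.maxwait >= max_wait_seconds:
            alerts.append(
                f"{row.database}/{row.user}: {row.cl_waiting} clients waiting, "
                f"maxwait={row.maxwait}s"
            )
    return alerts
```

A runbook could start from the same signals: check `SHOW POOLS` for waiting clients first, and only then decide whether bouncing the process is actually needed.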