[WIP] Centralized Alerting Framework
Description
As GitLab adds more monitoring features and capabilities, a key foundation will be the ability to notify users and administrators of events that need attention. These events could come from a variety of sources:
APM
- Defined alerting thresholds for metrics
- Automated anomaly detection (https://gitlab.com/gitlab-org/gitlab-ee/issues/3610)
- Behavior change after release (https://gitlab.com/gitlab-org/gitlab-ee/issues/3555)
Logging
- Auto Log Alerts (https://gitlab.com/gitlab-org/gitlab-ee/issues/3626)
GitLab
- GitLab Service Alerts
Rather than building alerting and its supporting functionality into each of these areas separately, we can build a single centralized alerting framework. This would reduce the total amount of work and offer a single UI to manage notifications across all of these event types.
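One way to picture the centralized approach is a single alert record that every source maps its events into. The sketch below is illustrative only: the `Alert` class, the `Source` values, the field names, and the two adapter functions are assumptions made for this example, not an existing GitLab API.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum


class Source(Enum):
    """Where an alert originated (hypothetical names mirroring the list above)."""
    APM = "apm"
    LOGGING = "logging"
    GITLAB = "gitlab"


@dataclass
class Alert:
    """A common shape for alerts, regardless of which feature raised them."""
    source: Source
    title: str
    details: str = ""
    created_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


def from_metric_threshold(metric: str, value: float, threshold: float) -> Alert:
    """Hypothetical adapter: turn a crossed metric threshold into an Alert."""
    return Alert(
        source=Source.APM,
        title=f"{metric} above threshold",
        details=f"value={value}, threshold={threshold}",
    )


def from_log_match(pattern: str, line: str) -> Alert:
    """Hypothetical adapter: turn a matching log line into an Alert."""
    return Alert(
        source=Source.LOGGING,
        title=f"log matched {pattern!r}",
        details=line,
    )
```

With a shared record like this, each new source only needs a small adapter, while delivery and the management UI can be written once against `Alert`.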
Proposal
- Alerts should support going out over configured chat services, like Slack or Mattermost
- Add support for notifications via SMS
- Add support for responding to and acknowledging alerts through these notification methods
- Alerts should also feed into the Service Status Dashboard and Internal Ops Dashboard (https://gitlab.com/gitlab-org/gitlab-ee/issues/3541), if acknowledged as a problem
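The proposal items above could fit together roughly as follows. This is a minimal sketch under stated assumptions: `AlertCenter`, `register_channel`, `raise_alert`, and `acknowledge` are hypothetical names, and the channel callables stand in for real chat or SMS integrations, which are not implemented here.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Alert:
    id: int
    title: str
    acknowledged: bool = False


@dataclass
class AlertCenter:
    # Channel name -> callable that delivers a message. These are
    # hypothetical stand-ins for Slack, Mattermost, or SMS senders.
    channels: Dict[str, Callable[[str], None]] = field(default_factory=dict)
    alerts: Dict[int, Alert] = field(default_factory=dict)
    # Alerts confirmed as real problems; a dashboard would read from here.
    dashboard: List[Alert] = field(default_factory=list)

    def register_channel(self, name: str, send: Callable[[str], None]) -> None:
        self.channels[name] = send

    def raise_alert(self, alert: Alert) -> None:
        """Store the alert and fan it out to every configured channel."""
        self.alerts[alert.id] = alert
        for send in self.channels.values():
            send(f"[alert #{alert.id}] {alert.title}")

    def acknowledge(self, alert_id: int, is_problem: bool) -> None:
        """Mark an alert as acknowledged; publish it only if it is a problem."""
        alert = self.alerts[alert_id]
        alert.acknowledged = True
        if is_problem:
            self.dashboard.append(alert)
```

In this sketch an acknowledgement would arrive from whichever channel the user replied on, and only alerts acknowledged as a problem reach the dashboard list, matching the last proposal item.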
Edited by Joshua Lambert