sentry.gitlap.com down
At 11:22:52 UTC, the Sentry processes (managed by supervisord) started restarting in a loop. It took a while to notice that sentry.gitlap.com was down: because the processes kept going up and down, the service was never continuously down, so the alert didn't trigger.
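The alerting gap described above (a check that only fires when the service is continuously down will miss a service that keeps restarting) can be sketched as follows. This is a minimal illustration, not the monitoring setup actually in use; the function names, the sample window and the threshold are all hypothetical.

```python
from typing import Sequence


def count_transitions(samples: Sequence[bool]) -> int:
    """Count up<->down state changes in a series of health-check results."""
    return sum(1 for prev, cur in zip(samples, samples[1:]) if prev != cur)


def is_flapping(samples: Sequence[bool], max_transitions: int = 3) -> bool:
    """Return True when the service changed state more often than allowed."""
    return count_transitions(samples) > max_transitions


def is_down(samples: Sequence[bool]) -> bool:
    """A naive 'down' alert: fires only if every sample in the window failed."""
    return len(samples) > 0 and not any(samples)


if __name__ == "__main__":
    # Hypothetical window of probe results: True = up, False = down.
    window = [True, False, True, False, False, True, False, True]
    print("down alert:", is_down(window))
    print("flapping alert:", is_flapping(window))
```

With a window like the one above, the naive "down" check stays quiet while the transition count flags the problem, which is the behaviour we would have wanted here.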
We followed the runbook at https://gitlab.com/gitlab-com/runbooks/blob/master/troubleshooting/sentry-is-down.md and confirmed that both redis and postgres were up and running.
There was nothing in the supervisor logs (apart from the up/down messages), so it was difficult to work out what was happening.
Also, upon login, there were some ongoing issues with /dev/null on the box:
*** System restart required ***
Last login: Mon Jul 3 11:53:54 2017 from 92.191.19.251
-bash: /dev/null: Permission denied
-bash: /dev/null: Permission denied
-bash: /dev/null: Permission denied
-bash: /dev/null: Permission denied
-bash: /dev/null: Permission denied
-bash: /dev/null: Permission denied
-bash: /dev/null: Permission denied
Issuing sudo service supervisor restart and/or sudo supervisorctl restart all didn't help.
@ilyaf fixed the /dev/null issues:
root@sentry:/home/ilya# ls -la /dev/null
crw------- 1 man root 1, 3 May 17 14:32 /dev/null
root@sentry:/home/ilya# chmod 0666 /dev/null
root@sentry:/home/ilya# chown root:root /dev/null
root@sentry:/home/ilya# ls -la /dev/null
crw-rw-rw- 1 root root 1, 3 May 17 14:32 /dev/null
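The manual check above (character device, mode 0666, owned by root:root) could be automated along these lines. This is only a sketch; `null_device_problems` and `check_null_device` are hypothetical helper names, and nothing like this ran on the box.

```python
import os
import stat


def null_device_problems(st_mode: int, uid: int, gid: int) -> list:
    """Return a list of problems for the given stat fields (empty if OK)."""
    problems = []
    if not stat.S_ISCHR(st_mode):
        problems.append("not a character device")
    mode = stat.S_IMODE(st_mode)
    if mode != 0o666:
        problems.append("mode is %o, expected 666" % mode)
    if uid != 0 or gid != 0:
        problems.append("owner is %d:%d, expected 0:0" % (uid, gid))
    return problems


def check_null_device(path: str = "/dev/null") -> list:
    """Stat the device and report anything that differs from the expected state."""
    st = os.stat(path)
    return null_device_problems(st.st_mode, st.st_uid, st.st_gid)


if __name__ == "__main__":
    for problem in check_null_device():
        print("/dev/null:", problem)
```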
We restarted the Sentry processes again, but fixing /dev/null didn't help.
Given that the processes were flapping, we ran strace against the sentry-web process:
$ sudo strace -f -v -s10240 -p$(ps -ef | grep -i sentry | grep -v grep | awk '{print $2}')
[...]
read(10, "from __future__ import absolute_import\n\nfrom sentry_plugins.base import assert_package_not_installed\n\nassert_package_not_installed('sentry-gitlab')\n", 4096) = 148
write(2, " ", 4) = 4
write(2, "assert_package_not_installed('sentry-gitlab')\n", 46) = 46
close(10) = 0
munmap(0x7fe8172c4000, 4096) = 0
write(2, " File \"/usr/share/nginx/sentry/local/lib/python2.7/site-packages/sentry_plugins/base.py\", line 26, in assert_package_not_installed\n", 132) = 132
open("/usr/share/nginx/sentry/local/lib/python2.7/site-packages/sentry_plugins/base.py", O_RDONLY) = 10
fstat(10, {st_dev=makedev(8, 1), st_ino=535388, st_mode=S_IFREG|0644, st_nlink=1, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=8, st_size=784, st_atime=2017/07/03-11:22:58, st_mtime=2017/06/26-04:35:58, st_ctime=2017/06/26-04:35:58}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fe8172c4000
read(10, "from __future__ import absolute_import\n\nimport pkg_resources\nimport sentry_plugins\n\n\nclass CorePluginMixin(object):\n author = 'Sentry Team'\n author_url = 'https://github.com/getsentry/sentry-plugins'\n version = sentry_plugins.VERSION\n resource_links = [\n ('Bug Tracker', 'https://github.com/getsentry/sentry-plugins/issues'),\n ('Source', 'https://github.com/getsentry/sentry-plugins'),\n ]\n\n # HACK(dcramer): work around MRO issue with plugin metaclass\n logger = None\n\n\ndef assert_package_not_installed(name):\n try:\n pkg_resources.get_distribution(name)\n except pkg_resources.DistributionNotFound:\n return\n else:\n raise RuntimeError(\"Found %r. This has been superseded by 'sentry-plugins', so please uninstall.\" % name)\n", 4096) = 784
write(2, " ", 4) = 4
write(2, "raise RuntimeError(\"Found %r. This has been superseded by 'sentry-plugins', so please uninstall.\" % name)\n", 106) = 106
close(10) = 0
munmap(0x7fe8172c4000, 4096) = 0
write(2, "RuntimeError", 12) = 12
write(2, ": ", 2) = 2
write(2, "Found 'sentry-gitlab'. This has been superseded by 'sentry-plugins', so please uninstall.", 89) = 89
write(2, "\n", 1) = 1
rt_sigaction(SIGINT, {SIG_DFL, [], SA_RESTORER, 0x7fe816e99330}, {0x45a0f5, [], SA_RESTORER, 0x7fe816e99330}, 8) = 0
munmap(0x7fe80b0d2000, 4230504) = 0
munmap(0x7fe80be67000, 6279096) = 0
munmap(0x7fe80bc44000, 2237640) = 0
munmap(0x7fe80ba1b000, 2264288) = 0
munmap(0x7fe80cbe0000, 2481976) = 0
munmap(0x7fe81506c000, 262144) = 0
close(5) = 0
exit_group(1) = ?
+++ exited with 1 +++
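The useful part of a trace like the one above is what the process wrote to stderr (file descriptor 2) just before exiting. A rough sketch of pulling only those writes out of saved strace output is shown below. The helper names are hypothetical, and the parsing assumes the default strace line format seen above (optionally prefixed with `[pid N]` when `-f` is used), with string arguments long enough not to be truncated.

```python
import re

# Matches lines such as: write(2, "RuntimeError", 12) = 12
STDERR_WRITE = re.compile(
    r'^(?:\[pid\s+\d+\]\s+)?write\(2, "((?:[^"\\]|\\.)*)"(?:\.\.\.)?, \d+\)'
)


def decode_c_string(literal: str) -> str:
    """Turn strace's C-style escapes (\\n, \\", octal, ...) back into text."""
    return literal.encode("latin-1", "backslashreplace").decode("unicode_escape")


def stderr_from_strace(lines) -> str:
    """Concatenate everything the traced process wrote to fd 2."""
    chunks = []
    for line in lines:
        match = STDERR_WRITE.match(line.strip())
        if match:
            chunks.append(decode_c_string(match.group(1)))
    return "".join(chunks)


if __name__ == "__main__":
    import sys

    # Usage (assumed): strace ... 2> trace.log; python this_script.py < trace.log
    sys.stdout.write(stderr_from_strace(sys.stdin))
```

Run over the trace above, this would leave only the traceback text instead of the interleaved read/mmap/munmap calls.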
Sentry was trying to load the sentry-gitlab pip module, but that module is superseded by sentry-plugins. A similar bug (which we also hit) can be found here. According to the documentation, we just needed to get rid of the pip module and let sentry-plugins do its job.
We made a backup of the virtualenv in my home folder, just in case, then removed the offending module:
# load python venv
source /usr/share/nginx/sentry/bin/activate
# remove
pip uninstall sentry-gitlab
After that we ran sudo supervisorctl start sentry-web and it worked.
However, sentry-cron and sentry-worker were still flapping. We ran another strace on sentry-worker, only to find out that there was a similar problem with the sentry-slack module. Following the same steps as above made it work again.
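Since we hit the same failure twice, checking the environment for superseded packages up front would have saved the second strace. A minimal sketch of such a check follows. It assumes Python 3.8+ for `importlib.metadata`, whereas the affected virtualenv runs Python 2.7, where `pkg_resources.get_distribution` (as used by sentry_plugins/base.py in the trace) would be the equivalent. The package list only contains the two modules we actually hit; there may be others.

```python
from importlib import metadata

# Packages this incident showed to conflict with sentry-plugins.
# Limited to the two we hit; not an authoritative list.
SUPERSEDED = ("sentry-gitlab", "sentry-slack")


def installed_superseded(names=SUPERSEDED):
    """Return the subset of `names` that is installed in the current environment."""
    found = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            continue
        found.append(name)
    return found


if __name__ == "__main__":
    # Run with the virtualenv activated so its site-packages are inspected.
    for name in installed_superseded():
        print("superseded package still installed: %s" % name)
```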
We are still not sure what caused the problem.
@stanhu you logged into the box a few moments before Sentry started to throw 502 errors when saving events. Did you issue an upgrade or something?