GitLab VM Unresponsive
Summary
I run GitLab on an Ubuntu VM in Azure, and it has been working great. Every once in a while, though, GitLab's web interface becomes unresponsive. Short stalls are fairly frequent and usually clear up after a refresh or two, but occasionally one lasts much longer. When that happens I try to SSH into the VM to restart GitLab, only to find that the entire system is unresponsive: SSH does not connect and nothing on the VM works at all. I then reboot the VM from the Azure Portal. That normally works, although the portal shows an error saying it cannot confirm that the VM actually rebooted.
I have also found that, given enough time (about 30 minutes), the problem goes away on its own and the system returns to normal. You can look at this status page to see how often it happens and for how long. Almost all of my GitLab outages are caused by this issue, and it is becoming more frequent.
Steps to reproduce
Run Omnibus GitLab on an Azure Standard D1 v2 VM (1 core, 3.5 GB memory) with Ubuntu 14.04 and leave it running for a long time. Other than that, I'm not entirely sure how to reproduce this.
What is the current bug behavior?
GitLab overloads the VM, and the entire system becomes unresponsive.
What is the expected correct behavior?
GitLab should NOT overload and kill the VM.
Relevant logs and/or screenshots
I use Sentry to monitor errors in GitLab. Each time it crashes, Sentry sends me these errors:
Redis::CommandError
lib/gitlab/workhorse.rb in block in set_key_and_notify at line 178
MISCONF Redis is configured to save RDB snapshots, but is currently not able to persist on disk. Commands that may modify the data set are disabled. Please check Redis logs for details about the error.
And
PG::ConnectionBad
lib/gitlab/request_context.rb in call at line 18
FATAL: the database system is in recovery mode
Other logs seem strangely normal.
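Both errors would be consistent with the kernel running out of memory, though that is only a hypothesis: Redis reports MISCONF when a background save fails, and PostgreSQL enters recovery mode after one of its processes terminates abnormally. One way to test the idea is to look for OOM-killer entries in the kernel log. The sketch below is a minimal helper for that, not something taken from GitLab. The function name `find_oom_kills`, the log path `/var/log/kern.log`, and the exact message wording it matches are assumptions that may differ by kernel version.

```python
import re

# Assumed kernel wording: older kernels log "Out of memory: Kill process <pid> (<name>)",
# newer ones log "Out of memory: Killed process <pid> (<name>)". Adjust if yours differs.
OOM_RE = re.compile(r"Out of memory: Kill(?:ed)? process (\d+) \(([^)]+)\)")


def find_oom_kills(lines):
    """Return a list of (pid, process_name) for each OOM-killer line in `lines`."""
    kills = []
    for line in lines:
        match = OOM_RE.search(line)
        if match:
            kills.append((int(match.group(1)), match.group(2)))
    return kills
```

To use it, pass an open log file, for example `find_oom_kills(open("/var/log/kern.log", errors="replace"))`. An empty result does not rule out memory pressure; it only means no matching kill messages were found in that file.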
Output of checks
Results of GitLab environment info
System information
System: Ubuntu 14.04
Current User: git
Using RVM: no
Ruby Version: 2.3.3p222
Gem Version: 2.6.6
Bundler Version: 1.13.7
Rake Version: 10.5.0
Redis Version: 3.2.5
Git Version: 2.11.1
Sidekiq Version: 5.0.0
GitLab information
Version: 9.2.6
Revision: 332a71d
Directory: /opt/gitlab/embedded/service/gitlab-rails
DB Adapter: postgresql
URL: https://gitlab.filiosoft.com
HTTP Clone URL: https://gitlab.filiosoft.com/some-group/some-project.git
SSH Clone URL: git@gitlab.filiosoft.com:some-group/some-project.git
Using LDAP: no
Using Omniauth: yes
Omniauth Providers: azure_oauth2
GitLab Shell
Version: 5.0.4
Repository storage paths:
- default: /var/opt/gitlab/git-data/repositories
Hooks: /opt/gitlab/embedded/service/gitlab-shell/hooks
Git: /opt/gitlab/embedded/bin/git
Results of GitLab application Check
Checking GitLab Shell ...
GitLab Shell version >= 5.0.4 ? ... OK (5.0.4)
Repo base directory exists?
default... yes
Repo storage directories are symlinks?
default... no
Repo paths owned by git:root, or git:git?
default... yes
Repo paths access is drwxrws---?
default... yes
hooks directories in repos are links: ...
6/4 ... ok 6/5 ... ok 6/8 ... ok 6/9 ... ok 6/10 ... ok 6/11 ... ok 6/26 ... ok 6/27 ... ok 13/29 ... ok 6/32 ... ok 6/34 ... ok 2/36 ... ok 6/37 ... ok 6/38 ... ok 6/39 ... ok 6/41 ... ok 13/49 ... ok 6/51 ... ok 14/52 ... ok 14/53 ... ok 6/54 ... ok 14/62 ... ok 6/63 ... ok 6/64 ... ok 2/65 ... ok 6/69 ... ok 17/71 ... ok 17/72 ... ok 6/75 ... ok 18/76 ... ok 17/77 ... ok 18/79 ... ok 2/80 ... ok 6/81 ... ok 20/82 ... ok 6/83 ... ok
Running /opt/gitlab/embedded/service/gitlab-shell/bin/check
Check GitLab API access: OK
Access to /var/opt/gitlab/.ssh/authorized_keys: OK
Send ping to redis server: OK
gitlab-shell self-check successful
Checking GitLab Shell ... Finished
Checking Sidekiq ...
Running? ... yes
Number of Sidekiq processes ... 1
Checking Sidekiq ... Finished
Checking Reply by email ...
Reply by email is disabled in config/gitlab.yml
Checking Reply by email ... Finished
Checking LDAP ...
LDAP is disabled in config/gitlab.yml
Checking LDAP ... Finished
Checking GitLab ...
Git configured with autocrlf=input? ... yes
Database config exists? ... yes
All migrations up? ... yes
Database contains orphaned GroupMembers? ... no
GitLab config exists? ... yes
GitLab config outdated? ... no
Log directory writable? ... yes
Tmp directory writable? ... yes
Uploads directory setup correctly? ... yes
Init script exists? ... skipped (omnibus-gitlab has no init script)
Init script up-to-date? ... skipped (omnibus-gitlab has no init script)
projects have namespace: ...
6/4 ... yes 6/5 ... yes 6/8 ... yes 6/9 ... yes 6/10 ... yes 6/11 ... yes 6/26 ... yes 6/27 ... yes 13/29 ... yes 6/32 ... yes 6/34 ... yes 2/36 ... yes 6/37 ... yes 6/38 ... yes 6/39 ... yes 6/41 ... yes 13/49 ... yes 6/51 ... yes 14/52 ... yes 14/53 ... yes 6/54 ... yes 14/62 ... yes 6/63 ... yes 6/64 ... yes 2/65 ... yes 6/69 ... yes 17/71 ... yes 17/72 ... yes 6/75 ... yes 18/76 ... yes 17/77 ... yes 18/79 ... yes 2/80 ... yes 6/81 ... yes 20/82 ... yes 6/83 ... yes
Redis version >= 2.8.0? ... yes
Ruby version >= 2.1.0 ? ... yes (2.3.3)
Your git bin path is "/opt/gitlab/embedded/bin/git"
Git version >= 2.7.3 ? ... yes (2.11.1)
Active users: 4
Checking GitLab ... Finished
Possible fixes
A possible fix could be moving to a larger VM size, since the current one runs out of memory and CPU. I'm not sure that this would actually help, though.
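Before resizing, it may be worth confirming that memory really is the bottleneck. The sketch below is one hedged way to do that: it parses the text of `/proc/meminfo` and reports how much memory is still available and whether any swap is configured. The function names (`parse_meminfo`, `available_kb`, `memory_report`) are illustrative, not part of GitLab or any library. The fallback for kernels that lack the `MemAvailable` field (it was added in Linux 3.14, so an older Ubuntu 14.04 kernel may not have it) is only a rough approximation.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo content into {field: number} (kB for most fields)."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines without a "Name: value" shape
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def available_kb(info):
    """Estimate memory available to new processes, in kB."""
    if "MemAvailable" in info:
        return info["MemAvailable"]
    # Assumption: on kernels without MemAvailable, free + buffers + page cache
    # is a rough upper bound on what could be reclaimed.
    return info.get("MemFree", 0) + info.get("Buffers", 0) + info.get("Cached", 0)


def memory_report(text):
    """Summarize total, available, and swap memory from /proc/meminfo text."""
    info = parse_meminfo(text)
    total = info.get("MemTotal", 0)
    avail = available_kb(info)
    return {
        "total_kb": total,
        "available_kb": avail,
        "available_pct": 100.0 * avail / total if total else 0.0,
        "swap_total_kb": info.get("SwapTotal", 0),
    }
```

Calling `memory_report(open("/proc/meminfo").read())` every minute from cron and appending the result to a file would show whether available memory drops toward zero before each outage. A `swap_total_kb` of 0 means the VM has no swap, in which case adding swap might be a cheaper experiment than a larger VM.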