Intermittent Errors in gitaly-cluster Jobs: Error running gitlab-ctl reconfigure - gitlab-kas failing with exit code 1
The gitaly-cluster job in our E2E Omnibus GitLab EE master pipelines has been failing intermittently this week with the following errors:
Recipe: gitlab-kas::enable
* runit_service[gitlab-kas] action restart
================================================================================
Error executing action `restart` on resource 'runit_service[gitlab-kas]'
================================================================================
Mixlib::ShellOut::ShellCommandFailed
------------------------------------
Expected process to exit with [0], but received '1'
---- Begin output of /opt/gitlab/embedded/bin/sv restart /opt/gitlab/service/gitlab-kas ----
STDOUT: timeout: down: /opt/gitlab/service/gitlab-kas: 0s, normally up, want up
STDERR:
---- End output of /opt/gitlab/embedded/bin/sv restart /opt/gitlab/service/gitlab-kas ----
Ran /opt/gitlab/embedded/bin/sv restart /opt/gitlab/service/gitlab-kas returned 1
Cookbook Trace: (most recent call first)
----------------------------------------
/opt/gitlab/embedded/cookbooks/cache/cookbooks/runit/libraries/helpers.rb:136:in `safe_sv_shellout!'
/opt/gitlab/embedded/cookbooks/cache/cookbooks/runit/libraries/helpers.rb:164:in `restart_service'
/opt/gitlab/embedded/cookbooks/cache/cookbooks/runit/libraries/provider_runit_service.rb:368:in `block in <class:RunitService>'
Resource Declaration:
---------------------
suppressed sensitive resource output
Compiled Resource:
------------------
suppressed sensitive resource output
System Info:
------------
chef_version=17.10.0
platform=ubuntu
platform_version=22.04
ruby=ruby 3.0.6p216 (2023-03-30 revision 23a532679b) [x86_64-linux]
program_name=/opt/gitlab/embedded/bin/cinc-client
executable=/opt/gitlab/embedded/bin/cinc-client
There was an error running gitlab-ctl reconfigure:
runit_service[gitlab-kas] (gitlab-kas::enable line 148) had an error: Mixlib::ShellOut::ShellCommandFailed: Expected process to exit with [0], but received '1'
---- Begin output of /opt/gitlab/embedded/bin/sv restart /opt/gitlab/service/gitlab-kas ----
STDOUT: timeout: down: /opt/gitlab/service/gitlab-kas: 0s, normally up, want up
STDERR:
---- End output of /opt/gitlab/embedded/bin/sv restart /opt/gitlab/service/gitlab-kas ----
Ran /opt/gitlab/embedded/bin/sv restart /opt/gitlab/service/gitlab-kas returned 1
Running handlers complete
[2023-07-14T16:16:48+00:00] ERROR: Exception handlers complete
Infra Phase failed. 249 resources updated in 01 minutes 09 seconds
[2023-07-14T16:16:48+00:00] FATAL: Stacktrace dumped to /opt/gitlab/embedded/cookbooks/cache/cinc-stacktrace.out
[2023-07-14T16:16:48+00:00] FATAL: ---------------------------------------------------------------------------------------
[2023-07-14T16:16:48+00:00] FATAL: PLEASE PROVIDE THE CONTENTS OF THE stacktrace.out FILE (above) IF YOU FILE A BUG REPORT
[2023-07-14T16:16:48+00:00] FATAL: ---------------------------------------------------------------------------------------
[2023-07-14T16:16:48+00:00] FATAL: Mixlib::ShellOut::ShellCommandFailed: runit_service[gitlab-kas] (gitlab-kas::enable line 148) had an error: Mixlib::ShellOut::ShellCommandFailed: Expected process to exit with [0], but received '1'
---- Begin output of /opt/gitlab/embedded/bin/sv restart /opt/gitlab/service/gitlab-kas ----
STDOUT: timeout: down: /opt/gitlab/service/gitlab-kas: 0s, normally up, want up
STDERR:
---- End output of /opt/gitlab/embedded/bin/sv restart /opt/gitlab/service/gitlab-kas ----
Ran /opt/gitlab/embedded/bin/sv restart /opt/gitlab/service/gitlab-kas returned 1
Example of a failed job: https://gitlab.com/gitlab-org/gitlab/-/jobs/4656289568
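
For anyone triaging more of these job logs, the key line is the `sv restart` output: `sv` timed out waiting for the restart, and gitlab-kas was still down (for 0s) even though it is wanted up. Below is a minimal sketch of pulling those fields out of a log line so failures can be grouped. The helper name `parse_sv_status` and the regular expression are assumptions for illustration, based only on the line shown in this log and the usual `sv` status layout; this is not part of Omnibus or runit.

```python
import re
from typing import NamedTuple, Optional, Tuple


class SvStatus(NamedTuple):
    timed_out: bool         # True if sv prefixed the status with "timeout:"
    state: str              # e.g. "run", "down", "finish"
    service: str            # service directory as printed by sv
    seconds: int            # seconds the service has been in this state
    flags: Tuple[str, ...]  # trailing notes such as "normally up", "want up"


# Hypothetical pattern: tolerates the "STDOUT: " prefix that Chef adds
# and an optional "(pid N)" field that sv prints for running services.
_SV_LINE = re.compile(
    r"^(?:STDOUT:\s*)?"
    r"(?P<timeout>timeout:\s+)?"
    r"(?P<state>run|down|finish):\s+"
    r"(?P<service>\S+):\s+"
    r"(?:\(pid \d+\)\s+)?"
    r"(?P<seconds>\d+)s"
    r"(?P<flags>(?:,\s*[^,;]+)*)"
)


def parse_sv_status(line: str) -> Optional[SvStatus]:
    """Parse one sv status line; return None if it does not match."""
    match = _SV_LINE.match(line.strip())
    if match is None:
        return None
    flags = tuple(
        part.strip() for part in match.group("flags").split(",") if part.strip()
    )
    return SvStatus(
        timed_out=match.group("timeout") is not None,
        state=match.group("state"),
        service=match.group("service"),
        seconds=int(match.group("seconds")),
        flags=flags,
    )


if __name__ == "__main__":
    sample = (
        "STDOUT: timeout: down: /opt/gitlab/service/gitlab-kas: 0s, "
        "normally up, want up"
    )
    print(parse_sv_status(sample))
```

Run against saved job logs, this would make it easier to confirm that every intermittent failure has the same shape (timeout, state `down`, service gitlab-kas) rather than several different causes sharing one exit code.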