gitlab-unicorn metrics endpoint down
Summary
The gitlab-unicorn metrics endpoint is very fragile and is down most of the time.
Steps to reproduce
- spin up a new gitlab 10.0.3-ce.0 instance (e.g. using rgl/gitlab-vagrant)
- enable metrics
- make prometheus listen at 0.0.0.0 (e.g. prometheus['listen_address'] = '0.0.0.0:9090')
- open the prometheus targets status page at http://gitlab.example.com:9090/targets to see that the gitlab-unicorn target is down.
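The endpoint can also be probed directly, without going through the Prometheus UI. The following is only a rough sketch: it requests the endpoint several times and tallies the status codes it gets back. The URL http://127.0.0.1:8080/-/metrics is an assumption (the logs only show that requests for /-/metrics arrive from 127.0.0.1, and the unicorn port may differ on your install), and tally_statuses is a hypothetical helper name.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: call `fetch` `attempts` times and count the values
# it returns. `fetch` is any callable returning an HTTP status code string.
def tally_statuses(attempts, fetch)
  counts = Hash.new(0)
  attempts.times { counts[fetch.call] += 1 }
  counts
end

# Assumed URL; adjust host and port to match your instance.
METRICS_URI = URI('http://127.0.0.1:8080/-/metrics')

# Fetcher that performs a plain GET against the metrics endpoint and
# returns the status code, or a short marker if the request itself fails.
http_fetch = lambda do
  begin
    Net::HTTP.get_response(METRICS_URI).code
  rescue StandardError => e
    "error: #{e.class}"
  end
end

# Example usage, run on the GitLab host:
#   p tally_statuses(10, http_fetch)
```

If the endpoint is as flaky as described, the tally should show a mix of 200 and 500 responses rather than a single status code.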
Relevant logs
I've got two different errors, as seen below.
==> /var/log/gitlab/gitlab-rails/production_json.log <==
{"method":"GET","path":"/-/metrics","format":"html","controller":"MetricsController","action":"index","status":500,"error":"ArgumentError: @ outside of string","duration":87.88,"view":0.0,"db":1.5,"time":"2017-10-07T12:55:18.030Z","params":{},"remote_ip":null,"user_id":null,"username":null}
==> /var/log/gitlab/gitlab-rails/production.log <==
Started GET "/-/metrics" for 127.0.0.1 at 2017-10-07 13:55:18 +0100
Processing by MetricsController#index as HTML
Completed 500 Internal Server Error in 85ms (ActiveRecord: 1.5ms)
ArgumentError (@ outside of string):
app/services/metrics_service.rb:14:in `prometheus_metrics_text'
app/services/metrics_service.rb:24:in `metrics_text'
app/controllers/metrics_controller.rb:9:in `index'
lib/gitlab/middleware/multipart.rb:93:in `call'
lib/gitlab/request_profiler/middleware.rb:14:in `call'
lib/gitlab/middleware/go.rb:17:in `call'
lib/gitlab/etag_caching/middleware.rb:11:in `call'
lib/gitlab/middleware/rails_queue_duration.rb:20:in `call'
lib/gitlab/metrics/rack_middleware.rb:29:in `block in call'
lib/gitlab/metrics/transaction.rb:49:in `run'
lib/gitlab/metrics/rack_middleware.rb:29:in `call'
lib/gitlab/request_context.rb:18:in `call'
lib/gitlab/metrics/requests_rack_middleware.rb:27:in `call'
==> /var/log/gitlab/gitlab-rails/production_json.log <==
{"method":"GET","path":"/-/metrics","format":"html","controller":"MetricsController","action":"index","status":500,"error":"Oj::ParseError: NULL byte in string at line 1, column 117 [parse.c:340]","duration":39.72,"view":0.0,"db":0.72,"time":"2017-10-07T16:57:36.857Z","params":{},"remote_ip":null,"user_id":null,"username":null}
==> /var/log/gitlab/gitlab-rails/production.log <==
Started GET "/-/metrics" for 127.0.0.1 at 2017-10-07 17:57:36 +0100
Processing by MetricsController#index as HTML
Completed 500 Internal Server Error in 39ms (ActiveRecord: 0.7ms)
Oj::ParseError (NULL byte in string at line 1, column 117 [parse.c:340]):
app/services/metrics_service.rb:14:in `prometheus_metrics_text'
app/services/metrics_service.rb:24:in `metrics_text'
app/controllers/metrics_controller.rb:9:in `index'
lib/gitlab/middleware/multipart.rb:93:in `call'
lib/gitlab/request_profiler/middleware.rb:14:in `call'
lib/gitlab/middleware/go.rb:17:in `call'
lib/gitlab/etag_caching/middleware.rb:11:in `call'
lib/gitlab/middleware/rails_queue_duration.rb:20:in `call'
lib/gitlab/metrics/rack_middleware.rb:29:in `block in call'
lib/gitlab/metrics/transaction.rb:49:in `run'
lib/gitlab/metrics/rack_middleware.rb:29:in `call'
lib/gitlab/request_context.rb:18:in `call'
lib/gitlab/metrics/requests_rack_middleware.rb:27:in `call'
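To see how often each of the two errors occurs, the JSON log can be filtered for failed requests to the metrics endpoint. This is a minimal sketch, assuming the log has one JSON object per line with the path, status and error keys shown in the entries above; failed_metrics_requests is a hypothetical helper name, not part of GitLab.

```ruby
require 'json'

# Hypothetical helper: given an enumerable of log lines, return the parsed
# entries for requests to /-/metrics that ended with a 5xx status.
def failed_metrics_requests(lines)
  lines.each_with_object([]) do |line, failures|
    begin
      entry = JSON.parse(line)
    rescue JSON::ParserError
      next # skip lines that are not valid JSON
    end
    next unless entry.is_a?(Hash)
    next unless entry['path'] == '/-/metrics' && entry['status'].to_i >= 500
    failures << entry
  end
end

# Example usage, counting failures per error message:
#   log = '/var/log/gitlab/gitlab-rails/production_json.log'
#   failed_metrics_requests(File.foreach(log))
#     .group_by { |entry| entry['error'] }
#     .each { |error, entries| puts "#{entries.size}\t#{error}" }
```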
Results of GitLab environment info
System information
System: Ubuntu 16.04
Current User: git
Using RVM: no
Ruby Version: 2.3.5p376
Gem Version: 2.6.13
Bundler Version:1.13.7
Rake Version: 12.0.0
Redis Version: 3.2.5
Git Version: 2.13.5
Sidekiq Version:5.0.4
Go Version: unknown
GitLab information
Version: 10.0.3
Revision: 8895150
Directory: /opt/gitlab/embedded/service/gitlab-rails
DB Adapter: postgresql
URL: https://gitlab.example.com
HTTP Clone URL: https://gitlab.example.com/some-group/some-project.git
SSH Clone URL: git@gitlab.example.com:some-group/some-project.git
Using LDAP: yes
Using Omniauth: no
GitLab Shell
Version: 5.9.0
Repository storage paths:
- default: /var/opt/gitlab/git-data/repositories
Hooks: /opt/gitlab/embedded/service/gitlab-shell/hooks
Git: /opt/gitlab/embedded/bin/git