I noticed that since I upgraded to 8.x two weeks ago (from 7.14), merging can also take ages in my case (a few hours), and the web app is not properly updated.
The reason for that is that some Sidekiq jobs take hours to complete:
As suggested by @stanhu in the ML, it might be related to counting the number of commits in the repository.
I maintain a GitLab instance for our university. We currently have 685 projects and 537 users. Most projects are student projects, so they do not contain that many commits.
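For context on the commit-counting theory: counting commits means walking the entire history of the repository. Here is a minimal sketch using the rugged gem (the libgit2 binding GitLab's git layer relied on at the time); the repository path is only an example:

```ruby
# Minimal sketch: count every commit reachable from HEAD with Rugged.
require 'rugged'

repo   = Rugged::Repository.new('/home/git/repositories/group/project.git') # example path
walker = Rugged::Walker.new(repo)
walker.push(repo.head.target_id)

# The walker enumerates every reachable commit, so the cost grows with history size.
puts walker.count
```

Even so, on small student projects such a walk should finish in seconds, which makes the hours-long jobs all the more puzzling.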
What's odd to me is that all of those services are taking a long time. PostReceive should take only moments to process each new push. ProjectCacheWorker updates the repository size, the commit count, and the cache (branch names etc.). This shouldn't take very long.
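Roughly speaking, that worker boils down to something like the following sketch (simplified and illustrative; the model method names are assumptions, not GitLab's exact code):

```ruby
# Simplified, illustrative sketch of what ProjectCacheWorker is described as doing above.
class ProjectCacheWorker
  include Sidekiq::Worker

  def perform(project_id)
    project = Project.find(project_id)

    project.update_repository_size    # disk size of the bare repository (assumed helper)
    project.update_commit_count       # total commit count, needs a history walk (assumed helper)
    project.repository.expire_cache   # branch names, tags, etc. rebuilt on next access (assumed helper)
  end
end
```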
Is it possible there is something going on with this machine? Do you see unusual CPU or disk activity while these workers are in progress?
I updated our redis server to 2.8 and still have the same issue, so it is unrelated to #3649 (closed).
We do not have any proxy in front of our database. I increased the pool size from 10 to 20, but it does not solve the problem.
GitLab is currently not usable for project management in my case: I need to wait hours to get notifications, see activity updates on the dashboard, etc.
When I check Sidekiq logs on the server, I can see a lot of ActiveRecord issues, such as:
2015-11-30T10:24:49.425Z 9118 TID-ow4ogslmo WARN: Couldn't find Note with 'id'=1683
2015-11-30T10:24:49.425Z 9118 TID-ow4ogslmo WARN: /home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/activerecord-4.1.12/lib/active_record/relation/finder_methods.rb:320:in `raise_record_not_found_exception!'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/activerecord-4.1.12/lib/active_record/relation/finder_methods.rb:429:in `find_one'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/activerecord-4.1.12/lib/active_record/relation/finder_methods.rb:413:in `find_with_ids'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/activerecord-4.1.12/lib/active_record/relation/finder_methods.rb:68:in `find'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/activerecord-4.1.12/lib/active_record/querying.rb:3:in `find'
/home/git/gitlab/app/mailers/emails/notes.rb:19:in `note_issue_email'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/actionpack-4.1.12/lib/abstract_controller/base.rb:189:in `process_action'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/actionpack-4.1.12/lib/abstract_controller/callbacks.rb:20:in `block in process_action'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/activesupport-4.1.12/lib/active_support/callbacks.rb:82:in `run_callbacks'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/actionpack-4.1.12/lib/abstract_controller/callbacks.rb:19:in `process_action'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/actionpack-4.1.12/lib/abstract_controller/base.rb:136:in `process'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/actionview-4.1.12/lib/action_view/rendering.rb:30:in `process'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/actionmailer-4.1.12/lib/action_mailer/base.rb:580:in `block in process'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/activesupport-4.1.12/lib/active_support/notifications.rb:159:in `block in instrument'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/activesupport-4.1.12/lib/active_support/notifications/instrumenter.rb:20:in `instrument'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/activesupport-4.1.12/lib/active_support/notifications.rb:159:in `instrument'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/actionmailer-4.1.12/lib/action_mailer/base.rb:577:in `process'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/actionmailer-4.1.12/lib/action_mailer/base.rb:568:in `initialize'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/actionmailer-4.1.12/lib/action_mailer/base.rb:551:in `new'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/actionmailer-4.1.12/lib/action_mailer/base.rb:551:in `method_missing'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/sidekiq-3.3.0/lib/sidekiq/extensions/action_mailer.rb:17:in `public_send'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/sidekiq-3.3.0/lib/sidekiq/extensions/action_mailer.rb:17:in `perform'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/sidekiq-3.3.0/lib/sidekiq/processor.rb:75:in `execute_job'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/sidekiq-3.3.0/lib/sidekiq/processor.rb:52:in `block (2 levels) in process'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/sidekiq-3.3.0/lib/sidekiq/middleware/chain.rb:127:in `call'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/sidekiq-3.3.0/lib/sidekiq/middleware/chain.rb:127:in `block in invoke'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/sidetiq-0.6.3/lib/sidetiq/middleware/history.rb:8:in `call'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/sidekiq-3.3.0/lib/sidekiq/middleware/chain.rb:129:in `block in invoke'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/sidekiq-3.3.0/lib/sidekiq/middleware/server/active_record.rb:6:in `call'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/sidekiq-3.3.0/lib/sidekiq/middleware/chain.rb:129:in `block in invoke'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/sidekiq-3.3.0/lib/sidekiq/middleware/server/retry_jobs.rb:74:in `call'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/sidekiq-3.3.0/lib/sidekiq/middleware/chain.rb:129:in `block in invoke'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/sidekiq-3.3.0/lib/sidekiq/middleware/server/logging.rb:11:in `block in call'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/sidekiq-3.3.0/lib/sidekiq/logging.rb:22:in `with_context'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/sidekiq-3.3.0/lib/sidekiq/middleware/server/logging.rb:7:in `call'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/sidekiq-3.3.0/lib/sidekiq/middleware/chain.rb:129:in `block in invoke'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/sidekiq-3.3.0/lib/sidekiq/middleware/chain.rb:132:in `call'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/sidekiq-3.3.0/lib/sidekiq/middleware/chain.rb:132:in `invoke'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/sidekiq-3.3.0/lib/sidekiq/processor.rb:51:in `block in process'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/sidekiq-3.3.0/lib/sidekiq/processor.rb:98:in `stats'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/sidekiq-3.3.0/lib/sidekiq/processor.rb:50:in `process'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/celluloid-0.16.0/lib/celluloid/calls.rb:26:in `public_send'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/celluloid-0.16.0/lib/celluloid/calls.rb:26:in `dispatch'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/celluloid-0.16.0/lib/celluloid/calls.rb:122:in `dispatch'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/celluloid-0.16.0/lib/celluloid/cell.rb:60:in `block in invoke'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/celluloid-0.16.0/lib/celluloid/cell.rb:71:in `block in task'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/celluloid-0.16.0/lib/celluloid/actor.rb:357:in `block in task'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/celluloid-0.16.0/lib/celluloid/tasks.rb:57:in `block in initialize'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/celluloid-0.16.0/lib/celluloid/tasks/task_fiber.rb:15:in `block in create'
2015-11-30T10:26:16.480Z 9118 TID-ow4ogslmo Sidekiq::Extensions::DelayedMailer JID-bb38fb8e928fe6d3815219ff INFO: start
2015-11-30T10:26:16.486Z 9118 TID-ow4ogslmo Sidekiq::Extensions::DelayedMailer JID-bb38fb8e928fe6d3815219ff INFO: fail: 0.006 sec
and
2015-11-30T10:23:24.228Z 9118 TID-ow4o10hd0 WARN: /home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/activesupport-4.1.12/lib/active_support/dependencies.rb:478:in `load_missing_constant'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/activesupport-4.1.12/lib/active_support/dependencies.rb:180:in `const_missing'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/activesupport-4.1.12/lib/active_support/dependencies.rb:512:in `load_missing_constant'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/activesupport-4.1.12/lib/active_support/dependencies.rb:180:in `const_missing'
/home/git/gitlab/app/models/repository.rb:606:in `cache'
/home/git/gitlab/app/models/repository.rb:184:in `block in expire_cache'
/home/git/gitlab/app/models/repository.rb:183:in `each'
/home/git/gitlab/app/models/repository.rb:183:in `expire_cache'
/home/git/gitlab/app/services/git_push_service.rb:21:in `execute'
/home/git/gitlab/app/workers/post_receive.rb:41:in `block in perform'
/home/git/gitlab/app/workers/post_receive.rb:28:in `each'
/home/git/gitlab/app/workers/post_receive.rb:28:in `perform'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/sidekiq-3.3.0/lib/sidekiq/processor.rb:75:in `execute_job'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/sidekiq-3.3.0/lib/sidekiq/processor.rb:52:in `block (2 levels) in process'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/sidekiq-3.3.0/lib/sidekiq/middleware/chain.rb:127:in `call'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/sidekiq-3.3.0/lib/sidekiq/middleware/chain.rb:127:in `block in invoke'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/sidetiq-0.6.3/lib/sidetiq/middleware/history.rb:8:in `call'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/sidekiq-3.3.0/lib/sidekiq/middleware/chain.rb:129:in `block in invoke'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/sidekiq-3.3.0/lib/sidekiq/middleware/server/active_record.rb:6:in `call'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/sidekiq-3.3.0/lib/sidekiq/middleware/chain.rb:129:in `block in invoke'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/sidekiq-3.3.0/lib/sidekiq/middleware/server/retry_jobs.rb:74:in `call'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/sidekiq-3.3.0/lib/sidekiq/middleware/chain.rb:129:in `block in invoke'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/sidekiq-3.3.0/lib/sidekiq/middleware/server/logging.rb:11:in `block in call'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/sidekiq-3.3.0/lib/sidekiq/logging.rb:22:in `with_context'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/sidekiq-3.3.0/lib/sidekiq/middleware/server/logging.rb:7:in `call'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/sidekiq-3.3.0/lib/sidekiq/middleware/chain.rb:129:in `block in invoke'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/sidekiq-3.3.0/lib/sidekiq/middleware/chain.rb:132:in `call'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/sidekiq-3.3.0/lib/sidekiq/middleware/chain.rb:132:in `invoke'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/sidekiq-3.3.0/lib/sidekiq/processor.rb:51:in `block in process'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/sidekiq-3.3.0/lib/sidekiq/processor.rb:98:in `stats'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/sidekiq-3.3.0/lib/sidekiq/processor.rb:50:in `process'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/celluloid-0.16.0/lib/celluloid/calls.rb:26:in `public_send'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/celluloid-0.16.0/lib/celluloid/calls.rb:26:in `dispatch'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/celluloid-0.16.0/lib/celluloid/calls.rb:122:in `dispatch'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/celluloid-0.16.0/lib/celluloid/cell.rb:60:in `block in invoke'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/celluloid-0.16.0/lib/celluloid/cell.rb:71:in `block in task'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/celluloid-0.16.0/lib/celluloid/actor.rb:357:in `block in task'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/celluloid-0.16.0/lib/celluloid/tasks.rb:57:in `block in initialize'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/celluloid-0.16.0/lib/celluloid/tasks/task_fiber.rb:15:in `block in create'
The logs are typically not updated correctly on the web app at the moment, so I cannot rely on the admin panel to check what's going on.
Could it be that the exceptions raised by ActiveRecord simply exhaust the database connections, thus blocking the other processes that require a database connection?
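One way to check that hypothesis while the jobs are stalled would be to inspect the pool from a Rails console; this is only a sketch, but the accessors used exist in ActiveRecord 4.x:

```ruby
# Rails console sketch: see whether the ActiveRecord connection pool is exhausted.
pool = ActiveRecord::Base.connection_pool
puts "configured pool size: #{pool.size}"
puts "open connections:     #{pool.connections.size}"
puts "checked out (in use): #{pool.connections.count(&:in_use?)}"
```

If the checked-out count stays pinned at the configured pool size, the pool really is exhausted.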
After upgrading to 8.3.2, I still have the same problem: the jobs appear to get blocked at some point, and the only way to restore normal behavior is to run service gitlab restart.
It never happened before 8.0 (I had been running GitLab for almost two years before that).
@daniel.leberre How many Sidekiq workers are busy when GitLab is stuck?
In my case I am using v8.2.1, and when GitLab blocks, all Sidekiq workers are busy (25/25).
I even thought that the problem could be Sidekiq itself, but the latest versions of GitLab use the most recent version of Sidekiq.
@fernandoneto Only 5, on two different queues (post_receive and default). This is why I do not understand why the jobs are stalled. They do eventually complete, after thousands of seconds, but this is unacceptable for daily operations.
Yes, I am running Sidekiq 4.0.1; I hoped it would solve the problem. It does not.
@daniel.leberre I had high hopes for Sidekiq 4.0.1. One thing we can all do is work together and share information to understand the origin of the problem; since only a few users are hitting it, it could well be a configuration problem.
I wonder if the issue I am seeing is in fact due to several different problems which all end up raising an exception when the job is executed. Here is the kind of error I get in sidekiq.log:
2016-01-05T15:32:19.688Z 8874 TID-oryzc9og4 WARN: RuntimeError: Circular dependency detected while autoloading constant Banzai::LazyReference
2016-01-05T15:32:19.688Z 8874 TID-oryzc9og4 WARN: /home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/activesupport-4.2.4/lib/active_support/dependencies.rb:492:in `load_missing_constant'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/activesupport-4.2.4/lib/active_support/dependencies.rb:184:in `const_missing'
/home/git/gitlab/lib/banzai/reference_extractor.rb:21:in `lazily'
/home/git/gitlab/lib/banzai/reference_extractor.rb:46:in `references'
/home/git/gitlab/lib/gitlab/reference_extractor.rb:33:in `issues'
/home/git/gitlab/lib/gitlab/closing_issue_extractor.rb:25:in `closed_by_message'
/home/git/gitlab/app/models/commit.rb:174:in `closes_issues'
/home/git/gitlab/app/services/git_push_service.rb:92:in `block in process_commit_messages'
/home/git/gitlab/app/services/git_push_service.rb:82:in `each'
/home/git/gitlab/app/services/git_push_service.rb:82:in `process_commit_messages'
/home/git/gitlab/app/services/git_push_service.rb:52:in `execute'
/home/git/gitlab/app/workers/post_receive.rb:41:in `block in perform'
/home/git/gitlab/app/workers/post_receive.rb:28:in `each'
/home/git/gitlab/app/workers/post_receive.rb:28:in `perform'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/sidekiq-4.0.1/lib/sidekiq/processor.rb:150:in `execute_job'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/sidekiq-4.0.1/lib/sidekiq/processor.rb:132:in `block (2 levels) in process'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/sidekiq-4.0.1/lib/sidekiq/middleware/chain.rb:127:in `block in invoke'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/sidekiq-4.0.1/lib/sidekiq/middleware/server/active_record.rb:6:in `call'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/sidekiq-4.0.1/lib/sidekiq/middleware/chain.rb:129:in `block in invoke'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/sidekiq-4.0.1/lib/sidekiq/middleware/server/retry_jobs.rb:74:in `call'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/sidekiq-4.0.1/lib/sidekiq/middleware/chain.rb:129:in `block in invoke'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/sidekiq-4.0.1/lib/sidekiq/middleware/server/logging.rb:11:in `block in call'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/sidekiq-4.0.1/lib/sidekiq/logging.rb:30:in `with_context'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/sidekiq-4.0.1/lib/sidekiq/middleware/server/logging.rb:7:in `call'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/sidekiq-4.0.1/lib/sidekiq/middleware/chain.rb:129:in `block in invoke'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/sidekiq-4.0.1/lib/sidekiq/middleware/chain.rb:132:in `call'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/sidekiq-4.0.1/lib/sidekiq/middleware/chain.rb:132:in `invoke'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/sidekiq-4.0.1/lib/sidekiq/processor.rb:127:in `block in process'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/sidekiq-4.0.1/lib/sidekiq/processor.rb:166:in `stats'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/sidekiq-4.0.1/lib/sidekiq/processor.rb:126:in `process'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/sidekiq-4.0.1/lib/sidekiq/processor.rb:79:in `process_one'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/sidekiq-4.0.1/lib/sidekiq/processor.rb:67:in `run'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/sidekiq-4.0.1/lib/sidekiq/util.rb:16:in `watchdog'
/home/git/gitlab/vendor/bundle/ruby/2.1.0/gems/sidekiq-4.0.1/lib/sidekiq/util.rb:24:in `block in safe_thread'
@fernandoneto I agree that it is probably a configuration problem. I have been updating my config files every month since Sept 2013: I might have overlooked a change at some point that only matters now.
@daniel.leberre One suggestion I have for you: use something like ruby-prof to profile the different Sidekiq tasks (e.g. ProjectCacheWorker in app/workers/project_cache_worker.rb), save the outputs to different files, and see if there's an obvious place where all the time is being spent. I used it in this example.
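Something along these lines should work; this is just a sketch, assuming the worker's perform takes the project id (as the logs in this thread suggest), and the id 2 and output path are arbitrary examples:

```ruby
# Sketch: profile one ProjectCacheWorker run from a Rails console with ruby-prof.
require 'ruby-prof'

result = RubyProf.profile do
  ProjectCacheWorker.new.perform(2)   # 2 = a project id, example only
end

# Write a flat report so the hot spots stand out.
File.open('/tmp/project_cache_worker.prof.txt', 'w') do |f|
  RubyProf::FlatPrinter.new(result).print(f, min_percent: 1)
end
```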
@stanhu Thanks for the suggestion. I will try that. Note however that the issue is not AFAIK the task itself, since restarting gitlab will just allow the tasks to complete normally, i.e. in reasonable time (a few seconds).
The issue is that Sidekiq stops executing jobs altogether; they just keep being enqueued and eventually fill up the queue. After thousands of seconds, the jobs are executed again.
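In case it helps others debug the same stall, the backlog can be watched from a Rails console with Sidekiq's API; a sketch, using the two queues mentioned above:

```ruby
# Console sketch: check backlog size and latency of the affected queues.
require 'sidekiq/api'

%w[post_receive default].each do |name|
  queue = Sidekiq::Queue.new(name)
  puts "#{name}: #{queue.size} jobs queued, oldest waiting #{queue.latency.round} s"
end

# Jobs the busy workers are currently holding on to.
Sidekiq::Workers.new.each do |_process, _thread, work|
  puts "#{work['queue']}: busy since #{Time.at(work['run_at'])}"
end
```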
Thanks for the pointers. I noticed in the comments that several people ran into issues while trying to connect to Redis or other third-party tools. I also believe that the issue is due to the jobs trying to access our db (MySQL), Redis, or gitlab-workhorse and failing to do so.
From time to time we found issues related to one of those components in the logs, but we could not isolate any of them as the source of our problem.
Do you know whether an exception raised during a Sidekiq job could really pause Sidekiq?
I finally found the root of all my problems: a firewall was blocking database connections after 3600 s of inactivity. It looks like release 8.x changed the way database connections are used, thus increasing the number of times this happened. The system stalled whenever GitLab tried to use a connection that had been blocked by the firewall.
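For anyone who wants to confirm a similar setup, the stall can be reproduced from a Rails console. This is only a sketch; the 3600 s cutoff is simply the idle timeout our firewall happened to use:

```ruby
# Console sketch: hold a pooled connection idle past the firewall cutoff, then reuse it.
conn = ActiveRecord::Base.connection
conn.execute('SELECT 1')   # works while the connection is fresh

sleep 3700                 # stay idle longer than the 3600 s firewall timeout

begin
  conn.execute('SELECT 1') # hangs for a long time or raises once the socket is silently dropped
rescue => e
  puts "connection dropped while idle: #{e.class}: #{e.message}"
end
```

Once a connection has gone stale like this, any worker that picks it up from the pool blocks until the TCP timeout expires, which would explain jobs hanging for thousands of seconds.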
Process                    TID        JID                       Queue    Job                 Arguments  Started
domain:10072:280c4e3b546b  osqql6ntw  9ea7d02778732df9ed35363d  default  ProjectCacheWorker  2          23 minutes ago
domain:10072:280c4e3b546b  osqpyim44  510bf490ae19b377e83b43b7  default  ProjectCacheWorker  2          11 minutes ago
I think I found the problem.
The task "ProjectCacheWorker" with argument "2" takes a long time.
And the more of them accumulated in the queue, the longer everything else took to be handled.
I understand roughly how the cache is built, and judging by the code, the argument "2" is the project_id.
I know which project it is and that it is big, but this did not happen before.
How can I find out why this task takes so long?
In turn, I allocated a separate Sidekiq process with a single thread to the cache queue.
Now, the basic tasks are performed quickly.
But there are two questions:
Why is a second cache job for the same project queued while the first one has not finished yet? Either the old job should be killed and replaced with the new one, or the new one should not be enqueued at all (see the sketch after these questions).
Why cache all the commits of a project? Just one of our projects now has 146,000 commits!
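On the first question, one way such deduplication could work is a Redis lock taken at enqueue time. This is only a sketch under my own assumptions (the perform_async_once helper and the TTL are made up; it is not how GitLab actually schedules the worker):

```ruby
# Sketch of enqueue-side deduplication: skip scheduling another ProjectCacheWorker
# while one is still pending or running for the same project. Illustrative only.
require 'sidekiq'

class ProjectCacheWorker
  include Sidekiq::Worker

  LOCK_TTL = 15 * 60 # seconds; assumed upper bound for one cache refresh

  def self.perform_async_once(project_id)
    key = "project-cache-worker:#{project_id}"
    # SET NX EX acts as a lock: only the first enqueue within LOCK_TTL goes through.
    got_lock = Sidekiq.redis { |redis| redis.set(key, 1, nx: true, ex: LOCK_TTL) }
    perform_async(project_id) if got_lock
  end

  def perform(project_id)
    # ... refresh repository size, commit count, branch/tag caches ...
  ensure
    Sidekiq.redis { |redis| redis.del("project-cache-worker:#{project_id}") }
  end
end
```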