just updated from latest 8.x to gitlab 9.0 and runner 9.0
Actual behavior
See summary.
Expected behavior
build should fetch content and run build
Relevant logs and/or screenshots
Running with gitlab-ci-multi-runner 9.0.0 (08a9e6f) on docker (c06ca919)
Using Docker executor with image docker.kay-strobach.de/docker/php:7.0 ...
ERROR: Preparation failed: Error reading remote info: json: cannot unmarshal number into Go value of type bool
Will be retried in 3s ...
Using Docker executor with image docker.kay-strobach.de/docker/php:7.0 ...
ERROR: Preparation failed: Error reading remote info: json: cannot unmarshal number into Go value of type bool
Will be retried in 3s ...
Using Docker executor with image docker.kay-strobach.de/docker/php:7.0 ...
ERROR: Preparation failed: Error reading remote info: json: cannot unmarshal number into Go value of type bool
Will be retried in 3s ...
ERROR: Job failed (system failure): Error reading remote info: json: cannot unmarshal number into Go value of type bool
Environment description
gitlab omnibus
debian Linux 4.4.27-x86_64-jb1 #1 SMP Thu Oct 27 13:51:17 CEST 2016 x86_64 GNU/Linux
Debian GNU/Linux 8
no apache ... nginx with some other vhosts -> disabled for testing old behaviour
@kaystrobach If you can, then sure, please try to reproduce this on GitLab.com. So far I have been unable to reproduce your problem on my local development version of GitLab at the v9.0.0 tag. I'm now going to install GitLab on a fresh host from the Omnibus package. Maybe I will reproduce the error there.
I'm still unable to reproduce the error, so I think I will prepare an extended build on top of 9.0.0 that prints the JSON payload when the error occurs. This would help us a lot. But it will also print all secrets that are sent to Runner (token, repository URL, secure variables), so the output will need to be cleaned before it is published. I'll post a URL to the patched version in a few moments.
For us a single job out of many cannot be run, it is the only job with an after_script:
lint:javascript:
  stage: test
  image: node:7
  tags:
    - docker
  script:
    - npm --silent run eslint
  after_script:
    - npm --silent run eslint-report
  artifacts:
    name: eslint-report
    expire_in: 1d
    paths:
      - eslint-report.html
gitlab-runner[5257]: time="2017-03-23T13:54:33+01:00" level=warning msg="Checking for jobs... failed" runner=f1b9ec48 status="Error decoding json payload json: cannot unmarshal array into Go value of type string" #012<nil>
gitlab-ci-multi-runner[5257]: time="2017-03-23T13:54:33+01:00" level=warning msg="Checking for jobs... failed" runner=f1b9ec48 status="Error decoding json payload json: cannot unmarshal array into Go value of type string"
Probably related:
The job stays in status "running", but it should be set to failed whenever something like this happens.
We realized something was wrong when no notification mails arrived about the pipeline status (as this job was still running).
@kaystrobach Looking again at your initial report, I now see that your issue is different from the others. after_script generates an invalid job payload and it fails the job even before anything is printed in the trace (that's why everyone else is posting logs from Runner, not from the job's trace).
So we have two issues here. The first will need a fix in GitLab CE/EE. The second will need a little more investigation and then a fix in Runner or in your configuration/environment.
Can you post output of docker info and curl -s --unix-socket /var/run/docker.sock http://localhost/info? In both cases please hide the value of ID:.
@TheCapsLock @kaystrobach Can you also post the output of docker version? In 9.0 we've changed the docker library to the official one (github.com/docker/docker/client/). The output of info is quite different from what I get on my personal computer, and I'm starting to suspect that the docker-engine version you are using has an older API than the version supported by the library.
OK, I think I know what the problem is. Your docker-engines are using API v1.18. Runner sets the API version it uses to v1.18, so requests are targeting a proper endpoint. However... https://docs.docker.com/engine/api/version-history/#v119-api-changes - here we can see that in v1.19 of the API, GET /info was changed:
GET /info The fields Debug, IPv4Forwarding, MemoryLimit, and SwapLimit are now returned as boolean instead of as an int.
In v1.18 these fields are returned as integers, and that is confirmed by the data you posted above. Runner, through the docker/docker/client library, requests /v1.18/info and receives a JSON object with these fields as integers. It then tries to decode this JSON data into a github.com/docker/docker/api/types.Info struct, where Debug, SwapLimit, MemoryLimit and IPv4Forwarding are defined as booleans!
If all changes after v1.18 only added new fields, then the JSON decoder would handle this without problems and just skip fields that are not sent by docker engine. But in this case the data is sent, only in a different format, and that's why we see cannot unmarshal number into Go value of type bool.
The sad news is that I don't see an easy way to fix the issue other than upgrading docker-engine to at least 1.7 or switching back to Runner 1.11.x.
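For anyone who wants to see the decoding failure in isolation, here is a minimal sketch. The Info struct below is a hypothetical, trimmed-down stand-in for the library's types.Info (only the four fields named above), and decodeInfo is a helper made up for this illustration; it is not Runner's actual code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Info is a hypothetical stand-in for the docker library's types.Info,
// reduced to the four fields whose type changed in API v1.19.
type Info struct {
	Debug          bool
	IPv4Forwarding bool
	MemoryLimit    bool
	SwapLimit      bool
}

// decodeInfo decodes a GET /info response body into Info.
func decodeInfo(payload string) (Info, error) {
	var info Info
	err := json.Unmarshal([]byte(payload), &info)
	return info, err
}

func main() {
	// Shape of a v1.18 response: the flags arrive as integers,
	// so decoding into bool fields returns an error.
	_, err := decodeInfo(`{"Debug":0,"IPv4Forwarding":1,"MemoryLimit":1,"SwapLimit":0}`)
	fmt.Println("v1.18 style:", err)

	// Shape of a v1.19+ response: the flags arrive as booleans
	// and decoding succeeds.
	info, err := decodeInfo(`{"Debug":false,"IPv4Forwarding":true,"MemoryLimit":true,"SwapLimit":false}`)
	fmt.Println("v1.19 style:", info, err)
}
```

Unknown or missing fields would be ignored by encoding/json; it is only the type mismatch on fields that are present that makes the whole decode fail.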
Running with gitlab-ci-multi-runner 9.0.0 (08a9e6f) on docker (c06ca919)
Using Docker executor with image docker.kay-strobach.de/docker/php:7.0 ...
Using docker image sha256:344bf826be5884d21a6fe8ebc1a0804910357a6c4825729835cd266ed87403ff ID=sha256:344bf826be5884d21a6fe8ebc1a0804910357a6c4825729835cd266ed87403ff for predefined container...
Pulling docker image docker.kay-strobach.de/docker/php:7.0 ...
Using docker image docker.kay-strobach.de/docker/php:7.0 ID=sha256:d778d4d40c05c502337a453102ca4c49a0d342482fb387209478498c3b5c73db for build container...
Running on runner-c06ca919-project-39-concurrent-0 via git.4viewture.eu...
Cloning repository...
Cloning into '/builds/flooridoo/Shop'...
warning: You appear to have cloned an empty repository.
Checking out 07ec8780 as master...
fatal: reference is not a tree: 07ec8780fba77ac299fe181d073b20dc54c46a73
ERROR: Job failed: exit code 1
so maybe you can add a check to omnibus / runner package and "magically fix" docker for people having these problems ... it's just a couple of shell commands
It would be nice if we can keep using versions provided by debian jessie at least
But this will possibly hit all Debian users ...
I understand that. But we have no power to force Debian to update the Docker version it provides, and for us using an old and non-official library to access Docker is not possible any more. This just doesn't scale :(
You can also stick to Runner 1.11.x. This one will be able to work with GitLab >= 9.0 up to August 2017. Until then we will still support the old API used by Runner to talk with GitLab. After August that API will be removed, and from the next GitLab release on the only way to work will be to update Runner.
The root cause of this problem is in the docker library and the breaking change in Docker API v1.19. The cleanest way would be to add a fix in the library: if v1.18 is used, then JSON decoding should translate the integers into booleans. I can open an issue in the docker issue tracker with such a proposal, but I don't know if anyone would want to add such a hack to support an API which is almost two years old.
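To make the idea of a translating decoder more concrete, here is a rough sketch of what such a compatibility shim could look like. FlexBool, LegacyInfo and decodeLegacyInfo are hypothetical names invented for this example; nothing like this exists in the docker library as far as this thread shows, and a real fix would have to live inside the library itself.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// FlexBool is a hypothetical type that accepts both JSON booleans
// (API v1.19+) and JSON integers (API v1.18). Any non-zero integer
// is treated as true.
type FlexBool bool

// UnmarshalJSON implements json.Unmarshaler.
func (b *FlexBool) UnmarshalJSON(data []byte) error {
	// Try a regular boolean first.
	var asBool bool
	if err := json.Unmarshal(data, &asBool); err == nil {
		*b = FlexBool(asBool)
		return nil
	}
	// Fall back to the integer form used by API v1.18.
	var asInt int
	if err := json.Unmarshal(data, &asInt); err != nil {
		return fmt.Errorf("value %s is neither a bool nor an integer: %w", data, err)
	}
	*b = FlexBool(asInt != 0)
	return nil
}

// LegacyInfo is a hypothetical subset of the /info response.
type LegacyInfo struct {
	Debug          FlexBool
	IPv4Forwarding FlexBool
	MemoryLimit    FlexBool
	SwapLimit      FlexBool
}

// decodeLegacyInfo decodes a /info body in either format.
func decodeLegacyInfo(payload string) (LegacyInfo, error) {
	var info LegacyInfo
	err := json.Unmarshal([]byte(payload), &info)
	return info, err
}

func main() {
	info, err := decodeLegacyInfo(`{"Debug":0,"IPv4Forwarding":1,"MemoryLimit":1,"SwapLimit":0}`)
	fmt.Println(info, err)
}
```

The trade-off is that every affected field needs the custom type, which is exactly the kind of special-casing for an old API that upstream may not want to carry.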
@kaystrobach What do you mean by "magically fix docker"?
As for the problem of "You appear to have cloned an empty repository" there is already discussion at omnibus-gitlab#2119 (closed) how to fix this issue.
As for the problem of Docker API v1.18 incompatibility - we can either state that Runner needs Docker 1.7 or newer (and leave the code as it is, hoping that the library will be updated), or change the Docker API version used by Runner to 1.19, which should end with a clear error that the available docker-engine API is incompatible (this needs to be checked). An "automagic update" of the docker-engine installed on a user's machine is out of scope for our package.
I mean if the wrong docker version is installed, fix it with the dependencies defined in the gitlab runner package ... that's it and nobody will notice the problem, which is mainly caused by the frequent renaming of the docker package in Debian
@tmaczukin I agree with @kaystrobach; I see only two correct solutions (leaving it as-is will break all GitLab installations on those distros after a standard upgrade is run):
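As an illustration of the "clear error" option, a pre-flight check could look roughly like the sketch below. It only compares version strings; how the engine's API version is obtained is left out (in this thread it came from the docker version output). checkAPIVersion, parseAPIVersion and minAPIVersion are hypothetical names, not existing Runner code, and the error text is made up.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// minAPIVersion is the first Docker API version whose /info response
// uses booleans, according to the version history quoted earlier.
const minAPIVersion = "1.19"

// parseAPIVersion splits a "major.minor" string into two integers.
func parseAPIVersion(v string) (major, minor int, err error) {
	parts := strings.SplitN(v, ".", 2)
	if len(parts) != 2 {
		return 0, 0, fmt.Errorf("malformed API version %q", v)
	}
	if major, err = strconv.Atoi(parts[0]); err != nil {
		return 0, 0, fmt.Errorf("malformed API version %q", v)
	}
	if minor, err = strconv.Atoi(parts[1]); err != nil {
		return 0, 0, fmt.Errorf("malformed API version %q", v)
	}
	return major, minor, nil
}

// checkAPIVersion returns a descriptive error when the engine's API
// version is older than minAPIVersion. Versions are compared
// numerically, so "1.9" is correctly treated as older than "1.19".
func checkAPIVersion(engine string) error {
	eMaj, eMin, err := parseAPIVersion(engine)
	if err != nil {
		return err
	}
	mMaj, mMin, err := parseAPIVersion(minAPIVersion)
	if err != nil {
		return err
	}
	if eMaj < mMaj || (eMaj == mMaj && eMin < mMin) {
		return fmt.Errorf("docker-engine API v%s is too old: API v%s or newer (Docker 1.7+) is required",
			engine, minAPIVersion)
	}
	return nil
}

func main() {
	fmt.Println(checkAPIVersion("1.18"))
	fmt.Println(checkAPIVersion("1.24"))
}
```

Failing early like this would at least replace the confusing unmarshal message with something that tells the user what to upgrade.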
prevent upgrade to gitlab v9 (by requiring specific docker engine version)
upgrade docker engine version
Doing the latter assumes that docker engine is required only by gitlab-runner, which is wrong for some of us. Do we want to force that on sysadmins? I don't think so.
@dedsm Which error do you see? Because we were discussing two problems here:
There is no trace and in Runner's log after requesting for a job you can see Error decoding json payload json: cannot unmarshal array into Go value of type string. This is a bug in GitLab and it will be fixed with GitLab CE/EE 9.0.1 (gitlab-ce!10185).
Job starts but it fails with Error reading remote info: json: cannot unmarshal number into Go value of type bool. This is the problem of too old Docker API.
Since you have a quite new Docker version I suppose that you're hit by the first problem. In that case you need to wait for GitLab CE/EE 9.0.1 which should be released today or tomorrow.
@codenamemahi 9.0.1 was tagged, so it should be released today. The release will be announced with a blog post on https://about.GitLab.com/blog/. For patch releases of the current version (like this one) we also update GitLab.com each time a release is ready, so you can follow https://twitter.com/gitlabstatus. We will mention the start and finish of the deployment there.
I've even tried to set the gitlab-ci.yml file with only these lines:
jessietest:
  script:
    - "echo Hello world"
As you can see I've launched multiple tries and it only worked when I installed back the 1.1.11 gitlab-ci-multi-runner.
Maybe the error comes from another spot ? I've thought about the config.toml file, but I've still same error even removing my specific lines. I'm using these configurations:
Not sure if this is exactly the same issue I'm experiencing because my logs are slightly different:
Running with gitlab-ci-multi-runner 9.5.0 (413da38) on gitlabcinode (5a14a308)
Using Docker executor with image node:6 ...
ERROR: Preparation failed: Error reading remote info: json: cannot unmarshal number into Go struct field Info.Debug of type bool
Will be retried in 3s ...
Using Docker executor with image node:6 ...
ERROR: Preparation failed: Error reading remote info: json: cannot unmarshal number into Go struct field Info.Debug of type bool
Will be retried in 3s ...
Using Docker executor with image node:6 ...
ERROR: Preparation failed: Error reading remote info: json: cannot unmarshal number into Go struct field Info.Debug of type bool
Will be retried in 3s ...
ERROR: Job failed (system failure): Error reading remote info: json: cannot unmarshal number into Go struct field Info.Debug of type bool
Docker version 1.6.2, build 7c8fca2
gitlab-ci-multi-runner 9.5.0
GitLab 9.5.2
But all my builds have been failing for months. No upgrade has fixed this yet.