WIP: Added functionality to create a network per build-ID and attach containers to it.
What does this MR do?
Add a docker network for each build.
Why was this MR needed?
Docker links are deprecated in favor of networks. Networks also give bidirectional communication between containers.
Are there points in the code the reviewer needs to double check?
It isn't finished!
You may want to run `docker network prune` after running tests.
TODO

- Delete networks after they are finished with.
- Test that config.toml's network settings aren't overwritten.
- Add integration tests.
- Having run `go test gitlab.com/gitlab-org/gitlab-ci-multi-runner/executors/docker` there are many docker networks called build-0. I was expecting one, so it looks like docker.CreateNetwork can create many networks with the same name but different IDs. We probably want to list networks before creating one to avoid this (see the sketch below). Also change tests to use unique build IDs to keep them isolated if run in parallel.
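A list-before-create guard might look roughly like this. This is a minimal sketch assuming the fsouza/go-dockerclient package (the apparent source of the `docker.CreateNetwork` call mentioned above); the `createBuildNetwork` helper and the `build-0` name are illustrative, not part of the MR:

```go
package main

import (
	"fmt"

	docker "github.com/fsouza/go-dockerclient"
)

// createBuildNetwork returns an existing network with the given name, or
// creates one if none exists. Docker happily creates several networks with
// the same name (each with a distinct ID), hence the list-first check.
func createBuildNetwork(client *docker.Client, name string) (*docker.Network, error) {
	networks, err := client.ListNetworks()
	if err != nil {
		return nil, err
	}
	for i := range networks {
		if networks[i].Name == name {
			return &networks[i], nil
		}
	}
	return client.CreateNetwork(docker.CreateNetworkOptions{
		Name:   name,
		Driver: "bridge",
	})
}

func main() {
	client, err := docker.NewClientFromEnv()
	if err != nil {
		panic(err)
	}
	network, err := createBuildNetwork(client, "build-0")
	if err != nil {
		panic(err)
	}
	fmt.Println("using network", network.ID)
}
```

Note that this check still races if two processes create the same name concurrently; `CreateNetworkOptions` also exposes a `CheckDuplicate` flag that asks the daemon itself to refuse duplicate names, which may be the more robust option.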
Does this MR meet the acceptance criteria?

- Documentation created/updated
- Tests
  - Added for this feature/bug
  - All builds are passing
- Branch has no merge conflicts with master (if you do - rebase it please)
What are the relevant issue numbers?
Activity
#1592 is the same network issue that I ran into during testing.
Thanks @dooferlad. I'll look into the `docker:dind` issue with custom networks. Could you try to finish the base functionality?

- containers joining the created network,
- the network being removed during cleanup,
- tests making sure that this flow is executed properly.
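For reference, that join/cleanup flow might look roughly like this - a sketch assuming the fsouza/go-dockerclient API; the `attachAndCleanUp` helper is illustrative, and in the runner this logic would be split between container creation and the executor's Cleanup step:

```go
package buildnetwork

import docker "github.com/fsouza/go-dockerclient"

// attachAndCleanUp connects a build container to the per-build network and
// removes the network again once the build is done.
func attachAndCleanUp(client *docker.Client, networkID, containerID string) error {
	// Join the container to the build network.
	err := client.ConnectNetwork(networkID, docker.NetworkConnectionOptions{
		Container: containerID,
	})
	if err != nil {
		return err
	}

	// ... run the build ...

	// Disconnect the container first; Docker refuses to remove a network
	// that still has endpoints attached.
	err = client.DisconnectNetwork(networkID, docker.NetworkConnectionOptions{
		Container: containerID,
		Force:     true,
	})
	if err != nil {
		return err
	}
	return client.RemoveNetwork(networkID)
}
```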
- Resolved by username-removed-753629
- Resolved by username-removed-753629
- Resolved by username-removed-753629
@dooferlad @ayufan I left some comments :)
Remove use of DinD from tests, since `TestDockerPrivilegedServiceAccessingBuildsFolder` doesn't work with networks due to it using `br_netfilter`, which can't be loaded from within a container.

This is an interesting problem, I'll also try to find time tomorrow to check what is happening there. Currently I can confirm that all tests from `./executors/docker/` excluding this one are working. There needs to be some way to fix this without removing DinD usage (which will be hard or even impossible for our test and build environment).

@tmaczukin thanks for your comments! I completely agree about testing, hence the WIP tag.
Before I do more I just want some clarity on what you use DinD for, since its use is very much discouraged. It may be better to remove DinD and replace it with something supported - such as docker-compose, to create a group of containers that you can remove in a single command - than to work around its limitations.
Ehh... everyone points to this article without noticing that it's quite old and that it discourages DinD only in some cases and with some goals in mind.

Most of the problems described in this article are related to the infrastructure used, the OS, and its configuration. Because we've decided to use DinD, we've prepared our infrastructure so that it doesn't have these problems. We're not using BTRFS on AUFS (or the other filesystem combinations mentioned there). We're not using different machines with SELinux randomly enabled or not - we have one base system, configured for our needs.
The article suggests mounting `/var/run/docker.sock` into the container because it "looks like Docker-in-Docker, feels like Docker-in-Docker, but it's not Docker-in-Docker". And then the overall conclusion seems to be that the biggest and most expected benefit is that your docker cache is not lost after the container is removed. Well, for some people this may be a benefit; we decided that a bigger benefit for us is being able to emulate a standard host with a docker engine (with a version of our choice!) inside a test docker container. Also - for shared runners we're using auto-scaling, where the machine is removed after one build. Using the host's docker cache would not be helpful in that case :)

To not repeat what was already written - please read this issue: gitlab-ce#17769. You will find arguments there describing why we've decided to use DinD.
Now, going back to this MR. All tests work properly with DinD; only this one is an exception, and it only fails when the custom network is used. We should look at how we can fix this one failing test (it would be best to do it by changing the configuration of the environment, not by rewriting the test - but maybe the test is wrong!) instead of changing a whole test environment where everything is working properly :)
Let's focus on finishing the main part of this MR, because it's mostly not related to DinD at all, and it will work in every environment where Docker is available. We'll get back to the failing `TestDockerPrivilegedServiceAccessingBuildsFolder` later :)

Sorry for the delay getting more done - life happened (too much of it - a virus invaded mine)! Still feeling grotty, but I thought I should check in.
@tmaczukin thanks for the DinD issue ref. I seem to have some gaps in my understanding of the workflow still, but we don't need to get into those now.
My time is limited today, but if I can get anything done I will, else it will be tomorrow, assuming you want to continue.
added 1 commit
- 117117d5 - Actually test that networks are created based on whether the network API is supported
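A check like the one this commit describes could be gated on the daemon's API version. A minimal sketch, assuming the fsouza/go-dockerclient API; the `supportsNetworks` helper is illustrative, and the cutoff reflects that the /networks endpoints appeared in Remote API v1.21 (Docker 1.9):

```go
package buildnetwork

import docker "github.com/fsouza/go-dockerclient"

// networksAPIVersion is the first Docker Remote API version exposing the
// /networks endpoints (Docker 1.9).
var networksAPIVersion = docker.APIVersion{1, 21}

// supportsNetworks reports whether the daemon is new enough to offer the
// networks API.
func supportsNetworks(client *docker.Client) (bool, error) {
	env, err := client.Version()
	if err != nil {
		return false, err
	}
	apiVersion, err := docker.NewAPIVersion(env.Get("ApiVersion"))
	if err != nil {
		return false, err
	}
	return apiVersion.GreaterThanOrEqualTo(networksAPIVersion), nil
}
```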
changed milestone to %v1.11
added executor docker improvement labels
@dooferlad Thanks for the update. I'll try to find some time this week to review this. I definitely want to continue and release this with v1.11 in February :)
mentioned in issue #2127
changed milestone to %v9.0
changed milestone to %Backlog
mentioned in issue #2511
mentioned in merge request !609