While we have CI runners that can operate in a distributed environment and run test suites in parallel, an interesting idea may be to introduce a mechanism that splits test suites even further and runs tests even faster.
It may be particularly useful for large organizations whose test suites run for hours. This would require introducing some kind of abstraction that could be implemented in various programming languages (like PHP, Java, C++). That implementation would then be consumed somehow by GitLab CI/runners.
I think this is pretty much language-specific and not easily handled by a CI system that doesn't care what is run.
It would be nice to think of something that would make it possible: reconfigure builds to retry broken tests, but it will still be hard since we don't know about all project requirements :(
@grzesiek I don't know about something automatic. People use so many different languages, frameworks, dependencies. We should try to build something that works for many people. It would be hard to cover everyone.
Essentially, .gitlab-ci.yml already offers an abstracting interface for parallelisation. It's up to the user to leverage this.
However, parallelization is interesting. Maybe you can think of ways that we can make it easier for people to run things in parallel?
document more languages? (This is something @ayufan and @axil have worked on)
It is surely something that is not easy. I agree with both of you that we need something that works for many people and is less dependent on particular technologies, languages etc. But it is not impossible to come up with something interesting and valuable, as long as we write down ideas and wait for a good solution to hatch from them. This is what this issue is for, as I don't have a good solution for this now.
Today we talked with @ayufan about this, and it seems to be quite a reasonable approach to provide a mechanism that would automatically provision and orchestrate CI runners depending on the output of a configuration build. The configuration build would provide information on how to orchestrate the rest of the builds.
With this approach the user would be able to create a build that splits their test suite into smaller parts, and it would be up to the user how the test suite is partitioned and how it is executed.
For example, if we have an rspec suite, the user would be required to create a task like rake rspec:split that would return data such as a JSON string containing metadata and information on how to execute each part of the build, something like:
This is an rspec example above, but the idea is not tied to a specific technology.
With this approach it is the user's responsibility to split their test suite so that it can be executed in parallel, and GitLab CI mechanisms merely give the user the ability to provision builds/runners automatically and orchestrate the entire environment to run test suites in parallel.
This is not meant to be done right now of course, this is only another idea on how to speed up builds.
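A minimal sketch of what such a split task could emit, written in Python rather than as a rake task. The function name `split_suite`, the round-robin partitioning, and the JSON keys (`builds`, `name`, `script`) are all assumptions for illustration; nothing here is an agreed format.

```python
import json


def split_suite(spec_files, parts):
    """Partition spec files round-robin and describe each part as a build."""
    # Every `parts`-th file, starting at offset i, goes into bucket i.
    buckets = [spec_files[i::parts] for i in range(parts)]
    builds = [
        {"name": f"rspec {i + 1}/{parts}", "script": "rspec " + " ".join(files)}
        for i, files in enumerate(buckets)
        if files  # skip empty parts when there are fewer files than parts
    ]
    # Hypothetical output shape: a list of builds the CI could schedule.
    return json.dumps({"builds": builds}, indent=2)


if __name__ == "__main__":
    print(split_suite(["spec/a_spec.rb", "spec/b_spec.rb", "spec/c_spec.rb"], 2))
```

The point is only that the user's own task decides the partitioning, and the CI side just reads the resulting description and provisions one build per entry.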
Yes, it is a good idea to think about how we could split that. I'm a little against looking at a list of files, because this will work only with some languages.
Automatic parallelization is a huge user benefit, but also not easy. And like most magic, it's awesome when it works, and horrible when it doesn't. Some approaches:
(1) Let the language/framework figure it out and just support it at the highest level.
(2) File-based splitting
(3) Test-based splitting
(1) might be the easiest for us to start with. I imagine it involves people making complex .gitlab-ci.yml files that might be brittle, but at least it would be functional. @ayufan has a great prototype for configuring it for gitlab-ce and it involves caching test runner output so the tool can auto-balance itself after each run. We could take that approach and clean up the syntax in .gitlab-ci.yml to reduce repetition.
(2) would be somewhat easy for us to do, and would scale when people add new files, but of course isn't valuable for every framework. Also, splitting up 20 files into 4 buckets of 5 files doesn't automatically result in the most efficient split. Many test runners support JUnit-style XML output which includes the time taken for each test, which once analyzed, can provide information for auto-balancing so each runner aims to take about the same time. This then becomes a documentation and onboarding challenge to teach people to use JUnit output (which is not the default for most frameworks, even if it is possible).
(3) would be much more complex, and likely require a different solution for each test framework. It would also depend on something like JUnit output for balancing. But of course, this results in the most optimized test runs.
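To make the auto-balancing idea concrete, here is a rough Python sketch: sum the reported time per file from JUnit-style XML, then greedily hand the slowest file to the currently lightest bucket. It assumes each `testcase` element carries a `file` attribute (some reporters emit one, others only give `classname`), and the function names are made up. This is a simple greedy heuristic, not necessarily what the gitlab-ce prototype does.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def file_durations(junit_xml):
    """Sum the reported time per file from JUnit-style XML text."""
    totals = defaultdict(float)
    for case in ET.fromstring(junit_xml).iter("testcase"):
        # Assumption: the reporter records the source file in a `file`
        # attribute; fall back to `classname` when it does not.
        key = case.get("file") or case.get("classname") or "unknown"
        totals[key] += float(case.get("time", 0))
    return dict(totals)


def balance(durations, buckets):
    """Greedy longest-first: the slowest file goes to the lightest bucket."""
    plan = [{"files": [], "total": 0.0} for _ in range(buckets)]
    ordered = sorted(durations.items(), key=lambda kv: (-kv[1], kv[0]))
    for name, seconds in ordered:
        lightest = min(plan, key=lambda bucket: bucket["total"])
        lightest["files"].append(name)
        lightest["total"] += seconds
    return plan
```

Feeding the previous run's report into `balance` before each run is what would let the split adjust itself as files get slower or faster.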
Some other thoughts while I'm at it. One challenge is to avoid thinking of parallelizing at a single level. e.g. the naive approach would be to give each runner a number and provide the runner number and total number of runners as env vars. But practical configurations require something like 5 manually specified commands plus 1 or more commands that can be parallelized further. e.g. run lint on a single runner then split rspec into 5 runners. With another vendor, I've personally run into problems where the system only knows how to split up the tests amongst 5 parallel runners, and lint ends up running redundantly on each one. Again, our own gitlab-ce tests are a great example of what we need to support. There's a bunch of manually split specs, then spinach which should be auto-split, then a collection of one-off tests (rubocop, teaspoon, flog, etc.).
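One way to picture the two-level idea, as a hedged Python sketch: each job declares its own parallel count, and the index/total pair is scoped to that job rather than to the whole pipeline. The names `expand`, `env_for`, `RUNNER_INDEX` and `RUNNER_TOTAL` are hypothetical, not existing GitLab variables.

```python
def expand(jobs):
    """Expand {command: parallel_count} into one assignment per runner."""
    assignments = []
    for command, count in jobs.items():
        for index in range(count):
            # index/total are per job, so a job with count 1 runs exactly once.
            assignments.append({"command": command, "index": index, "total": count})
    return assignments


def env_for(assignment):
    """Environment a runner would receive (hypothetical variable names)."""
    return {
        "RUNNER_INDEX": str(assignment["index"]),
        "RUNNER_TOTAL": str(assignment["total"]),
    }


if __name__ == "__main__":
    for assignment in expand({"rubocop": 1, "rspec": 5}):
        print(assignment["command"], env_for(assignment))
```

With `{"rubocop": 1, "rspec": 5}` this yields six assignments, and lint appears once instead of being repeated on every rspec runner.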
Other vendors take some of the parallelization configuration out of config files and let developers use a slider to configure how parallel you want things to run. This works for them because they charge based on the max number of containers used. Since we don't charge this way, this makes less sense. I still like the semantics of sliding a slider to increase your parallelization, but it would be more practical to simply include that in .gitlab-ci.yml itself.
FYI, JUnit is also known more generically as XUnit, but that term doesn't seem as popular. I hate calling it JUnit though since it implies Java, which really turns many people off (and misleads them into thinking it's not relevant for them).
So if we want the simplest possible option, I think that we should stick with option 1. Trying to support 2 and 3 seems to be troublesome and also not needed at this point.
Yes, now imagine we have plugins and example .gitlab-ci.yml files for popular languages, and someone creates a new ruby project and clicks on "Set up CI" and we detect they're running rspec and spinach and we auto-generate:
What about glob-based job definitions such as #23455 (moved), a bit similar to Make? For example, the following would be interpreted as many jobs, one for each match based on the glob. Script would execute from the context.
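As a rough Python sketch of what glob-based expansion could mean (this is not the syntax proposed in #23455; the function name, the `{match}` placeholder and the `test:` job-name prefix are made up for illustration): one job template plus a pattern turns into one job per matching path.

```python
from fnmatch import fnmatch


def expand_glob_job(pattern, script_template, paths):
    """Create one job per path matching the pattern, like a Make pattern rule."""
    return {
        # Hypothetical naming scheme: job name is derived from the matched path.
        f"test:{path}": {"script": [script_template.format(match=path)]}
        for path in sorted(paths)
        if fnmatch(path, pattern)
    }


if __name__ == "__main__":
    repo_files = ["spec/a_spec.rb", "lib/a.rb", "spec/b_spec.rb"]
    print(expand_glob_job("spec/*_spec.rb", "rspec {match}", repo_files))
```

Note that `fnmatch` lets `*` match across `/`, so nested spec directories would also be picked up by this pattern.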