GitLab QA Expansion Plan

TL;DR At this moment, GitLab QA project development and usability are limited. Most of the problems we have stem from the fact that GitLab QA test scenarios are decoupled from the products they are meant to run against. Without a solution like multi-project pipelines, we are vulnerable to fragile tests and synchronization problems. The solution may be to use a separate Docker image containing integration test scenarios.

1. Problems we have now

The main problems we have now are the fragile tests problem (see #6 (closed)) and the CE/EE synchronization problem (see #21 (closed)). These problems stem from the fact that GitLab QA is a separate project, with test scenarios implemented for both CE and EE. Example scenario:

feature 'create a new project', :ce, :ee, :staging do
  scenario 'user creates a new project' do
    # Sign in through the UI using the configured credentials.
    Page::Main::Entry.act { sign_in_using_credentials }

    # Create the project by clicking through the UI.
    Scenario::Gitlab::Project::Create.perform do
      with_project_name('some-project')
      with_project_description 'Some test project created by GitLab QA'
    end

    expect(page).to have_content(/Project \S?some-project\S+ was successfully created/)
  end
end

This is a simple test scenario that checks whether we can successfully create a new project in a GitLab CE or GitLab EE instance. The underlying implementation of Page::Main::Entry and Scenario::Gitlab::Project::Create uses Capybara to click its way through GitLab and create the project. It therefore needs to use HTML selectors and know about the structure of pages, menus, etc.
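To make the coupling to HTML selectors concrete, here is a minimal sketch of what such a page object might look like. It is not the actual GitLab QA implementation: the class name, the field IDs and the button label are assumptions chosen for illustration. The only real API it relies on is Capybara's fill_in and click_on.

```ruby
# Illustrative page object. The field IDs and the button label below are
# assumptions, not the selectors GitLab actually uses.
class NewProjectPage
  NAME_FIELD        = 'project_path'
  DESCRIPTION_FIELD = 'project_description'
  SUBMIT_BUTTON     = 'Create project'

  # `session` is anything that responds to Capybara's fill_in and click_on,
  # such as a Capybara::Session.
  def initialize(session)
    @session = session
  end

  # Fills in the form and submits it using the selectors defined above.
  def create(name:, description:)
    @session.fill_in(NAME_FIELD, with: name)
    @session.fill_in(DESCRIPTION_FIELD, with: description)
    @session.click_on(SUBMIT_BUTTON)
  end
end
```

If any of the three constants stops matching the markup in GitLab CE or EE, every scenario that uses this page object fails, even though the feature itself still works.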

Imagine a situation where someone changes HTML selectors on the project page in GitLab CE. The test will fail, and because the GitLab QA pipeline is not integrated into the GitLab CE/EE pipeline (we have neither multi-project pipelines nor the ability to use Omnibus to build a Docker image with GitLab from a particular SHA), the developer who introduced the change does not know that it is a breaking change for GitLab QA. This means that after each such change we have red pipelines in GitLab QA, and we need to fix the selectors in test scenarios manually.

This leads to another problem. When we fix selectors in GitLab QA so that they match the selectors in GitLab CE, tests for GitLab EE are still broken because of the synchronization delay. Selectors in GitLab EE can remain different for as long as a week, until someone merges CE into EE.

The most boring solution here may seem to be merging the entire GitLab QA project into CE and EE, but this won't work, because GitLab QA is designed to execute tests against a Docker image created by Omnibus. It needs to run against the product as a whole: it bootstraps the entire environment and then executes tests. As a matter of fact, GitLab QA consists of two scenario layers. The first layer contains scenarios for bootstrapping the environment; the second layer contains instance scenarios. A good example that may help to understand this is the scenario for an instance upgrade. This is how it works:

spin up the GitLab CE/EE Docker image with the latest tag, with data volumes mounted in some directory
run instance tests against the GitLab container to populate the database with data
tear down the GitLab container
spin up the GitLab CE/EE Docker image with the nightly tag and mount the same data volumes
this will trigger Omnibus upgrade scripts, migrations and everything else we want to test
because tests are idempotent, execute them again against the upgraded instance
if everything is fine, we report success
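The steps above can be sketched roughly as follows. This is a simplified illustration, not the real GitLab QA code: the class name, the container name, the rspec path and the single mounted volume are all assumptions. Commands are passed to an injected callable, so the sketch can be dry-run without Docker.

```ruby
# Hypothetical sketch of the upgrade scenario; all names are illustrative.
class UpgradeScenario
  def initialize(image:, volumes_dir:, shell:)
    @image = image             # e.g. 'gitlab/gitlab-ce'
    @volumes_dir = volumes_dir # host directory that keeps data between runs
    @shell = shell             # callable that executes (or records) a command
  end

  # Runs the same instance tests against the `from` tag and then the `to` tag.
  def perform(from: 'latest', to: 'nightly')
    [from, to].each do |tag|
      start(tag)         # starting the newer tag on old data triggers the upgrade
      run_instance_tests # tests are idempotent, so the same suite runs twice
      teardown
    end
  end

  private

  def start(tag)
    # Simplification: only one data volume is mounted here.
    @shell.call("docker run -d --name gitlab-qa -v #{@volumes_dir}:/var/opt/gitlab #{@image}:#{tag}")
  end

  def run_instance_tests
    @shell.call('rspec qa/specs/features') # assumed location of instance specs
  end

  def teardown
    @shell.call('docker rm -f gitlab-qa')
  end
end

# Dry run: record the commands instead of executing them.
commands = []
UpgradeScenario.new(
  image: 'gitlab/gitlab-ce',
  volumes_dir: '/tmp/gitlab-data',
  shell: ->(command) { commands << command }
).perform
puts commands
```

Injecting the shell keeps the environment layer separate from the instance tests it triggers, which mirrors the two scenario layers described above.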

You may notice one additional problem with the workflow presented here. What if HTML selectors in the latest image are different from those in the nightly image? This is another problem we need to solve.

Take a look at another test scenario:

spin up GitLab CE container
run instance test scenario that checks integration with GitLab Runner
test scenario logs in to GitLab and creates the project
test scenario goes to the project settings -> runners and gets the runner token
test scenario spins up the Docker container with gitlab-runner:latest
the runner registration token is used to register the runner
all remaining runner-related test scenarios are executed
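A rough sketch of this flow is shown below. The `gitlab` object stands in for the Capybara-driven page objects and is purely hypothetical, as are the class name, the container name and the choice of the shell executor. The `gitlab-runner register` flags used are existing ones, but the exact invocation is an assumption.

```ruby
# Hypothetical sketch of the runner integration scenario.
class RunnerIntegrationScenario
  RUNNER_IMAGE = 'gitlab/gitlab-runner:latest'

  # `gitlab` must respond to url, sign_in, create_project and runner_token;
  # `shell` is a callable that executes (or records) a command.
  def initialize(gitlab:, shell:)
    @gitlab = gitlab
    @shell = shell
  end

  def perform(project_name: 'runner-test')
    # Instance-level steps, done by clicking through the UI.
    @gitlab.sign_in
    @gitlab.create_project(project_name)
    token = @gitlab.runner_token(project_name)

    # Environment-level steps: start a runner container and register it.
    @shell.call("docker run -d --name qa-runner #{RUNNER_IMAGE}")
    @shell.call(
      'docker exec qa-runner gitlab-runner register --non-interactive ' \
      "--url #{@gitlab.url} --registration-token #{token} --executor shell"
    )
  end
end
```

Note that a single instance scenario touches both the GitLab UI and the surrounding Docker environment, which is the point the next paragraph makes.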

This means that instance test scenarios are also responsible for creating the environment around GitLab:

we use environment test scenarios to bootstrap GitLab instance, configure Omnibus and external services
we use instance test scenarios to configure a project within existing instance with external services

It should now be clear that GitLab QA has a layered architecture.

2. What are our objectives?

At this point we should aim to find the most viable solution to the problems we have now.

Make it possible to extend the test suite.
Prevent fragile tests and synchronization problems.
Make GitLab QA useful.

3. How to solve that?

1. Multi-project pipelines

One solution may be shipping multi-project pipelines, which are on our roadmap. Unfortunately, at this point we do not yet know what the shape and specification of multi-project pipelines would be, and we have no timeframe in which we can expect them. We are also still not sure whether they would solve all the problems GitLab QA has now. Multi-project pipelines were our primary objective when we started work on GitLab QA, but soon after, the Idea to Production vision took over and, at this moment, we do not know when we are going to work on multi-project pipelines. In my opinion we should not wait for a feature that we are not even sure we will ever have. We need to find the most viable solution to the problems we have now, and we need to do it now.

2. Making integration tests a product feature

Another solution I have been thinking about recently is making the QA test suite a GitLab feature. Imagine having a Settings -> Integration tests page with an ANSI terminal, where you can click "Run" and see the output of the internal RSpec test suite we already have. This would solve all three problems:

we won't have fragile tests, because the code will live in the GitLab CE/EE repository
as soon as we make it possible to build a Docker image using Omnibus and a SHA from the repository on gitlab.com, we can add this to the CI (probably using a manual CI action)
this will also solve synchronization problems, because internal integration tests would live in the gitlab/qa directory
this will make it possible to support integration tests for ALL versions, so we won't have problems with running instance upgrade scenarios

The possible implementation needs some explanation. This is only a seed of the concept. It probably needs refinement, but I think it should be doable.

In order to make it work, we would need to make some assumptions:

the instance test scenario layer should be moved to the instance itself, so this is the code that should be moved to the GitLab CE/EE repository
internal instance test scenarios would still need to manage the environment; we should harness Kubernetes, with which we integrate closely anyway, and make it possible to instantiate GitLab Runner from within the GitLab CE/EE instance (using a test scenario only, no need to implement or devise something new)
environment test scenarios need to remain a separate layer; this would still need to be an external project / gem, capable of configuring Omnibus in a Docker image with GitLab and triggering internal instance test scenarios.

This approach has numerous benefits:

Most importantly, we would solve all the problems GitLab QA has now.
We provide an additional feature for customers: internal integration tests.
We already have internal instance integration tests implemented in GitLab QA.
We already have environment test scenarios implemented in GitLab QA.
With this approach we would be able to test upgrade from 8.16 to 10.X.
We preserve the flexibility of external QA tool while increasing robustness and usability.
We will be able to extend test suite and involve developers in doing that as well.
We would be able to run QA on staging, before the release, without the need to click through it.

3. Use a docker image with instance test scenarios

This is a modification of the previous approach. We can build a Docker image with tests for each version of GitLab (building from a Dockerfile when a new tag is pushed). This means we will not need to bring new dependencies into Omnibus; all dependencies will live in a separate Docker image. The external GitLab QA tool will still be used to run tests against any instance, or against an instance configured locally.
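As a minimal sketch of how the external tool might select the matching test image, assume one image tag per GitLab version. The image name below is hypothetical (the proposal does not name it), and passing the instance address as the container argument is also an assumption.

```ruby
# Hypothetical image name; the proposal does not say what the image is called.
QA_IMAGE = 'example/gitlab-qa-scenarios'

# Builds the command that runs the instance scenarios shipped for `version`
# against the GitLab instance reachable at `address`.
# Assumption: the image accepts the instance address as its only argument.
def qa_scenarios_command(version:, address:)
  "docker run --rm #{QA_IMAGE}:#{version} #{address}"
end

puts qa_scenarios_command(version: '8.16.0', address: 'http://gitlab.example.com')
```

Because the tag follows the GitLab version, the selectors inside the image always match the instance under test, which would address both the latest/nightly mismatch and the CE/EE synchronization delay.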

At this point this is only a seed of a concept. I think it is doable, and it would not be an enormous effort, since we already have most of the implementation done.

Please take a look at this @stanhu @rymai @ayufan @pcarranza @marin. Thanks in advance!
