Skip to content

Add poll interval and timeout parameters for Kubernetes runner

What does this MR do?

This MR adds two new parameters that can be specified in the [runners.kubernetes] configuration:

  • poll_interval: How frequently, in seconds, the runner will poll the Kubernetes cluster to check on the status of the container it has attempted to create. [Default: 3]
  • poll_timeout: The amount of time, in seconds, that needs to pass before the runner will timeout attempting to connect to the container it has just tried to create. [Default: 180]

Why was this MR needed?

These parameters help with the use case whereby you have a Kubernetes cluster that has potential for high workload, and there isn't enough resources to run every task as soon as they come in (i.e. the Kubernetes scheduler could be reporting that that pod hasn't started because there isn't enough resources to run the container). Currently, there are some magic constants that define that the runner will poll Kubernetes every 3 seconds, at most 60 times, before timing out, but it allows for more functionality if these values can be configured.

This is not the perfect solution for this situation; ideally GitLab CI would be able to talk to the Kubernetes scheduler when it was about to run a Kubernetes task, and hold off on actually scheduling the task if it knew there wasn't resources currently available. However, this seems like a much more complex undertaking, and may go against the GitLab CI flow quite a bit (but if people think this is possible in future, I can open an issue for it).

In the meantime, the addition of these parameters should help those in use cases like myself where there's a potential for backlog due to full resources, and don't want the tasks timing out too early.

Does this MR meet the acceptance criteria?

  • Documentation created/updated
  • Tests
  • Added for this feature/bug
  • All builds are passing
  • Branch has no merge conflicts with master (if you do - rebase it please)

Merge request reports