I don't like the current approach of manually specifying volumes. How about using template files that would be pre-filled with GitLab-specific data? This gives us full flexibility in configuring volumes (of any type), labels (of any value), and anything else in the spec:

We would load a pod template and fill it with our own data. For each container type we would load a container template, fill it with our own data, and add it to the pod's list of containers. The pod template could also define additional containers that are specified manually rather than created dynamically.
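As a rough illustration of the idea, such a pod template might look like the sketch below. Everything here is hypothetical (the file name, labels, volume and sidecar container are made up for illustration); the runner would inject its own data and append the dynamically created build/helper/service containers:

```yaml
# pod-template.yaml (hypothetical) - the runner would load this,
# inject GitLab-specific data, and append its own containers.
apiVersion: v1
kind: Pod
metadata:
  labels:
    app: gitlab-runner-build   # the runner could add/override labels here
spec:
  volumes:
    - name: cache
      hostPath:
        path: /var/cache/gitlab-runner
  containers:
    # Containers listed here are the "manual" ones kept as-is;
    # build/helper/service containers would be appended dynamically.
    - name: sidecar
      image: busybox:latest
      command: ["sleep", "infinity"]
      volumeMounts:
        - name: cache
          mountPath: /cache
```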
@ayufan sorry for the delay, it's been very busy in the run-up to KubeCon here.
I'm not a big fan of this design for a few reasons:
- With this, there's no way for a user to mount a volume into a service container. Currently, volumes are mounted into every container within the pod, so if a user wants to configure/tweak parameters or some config in a service, they can.
- If the Kubernetes API changes, we now have to version gitlab-runner against not only GitLab and the helper Docker image, but also the additional volume types.
- Where does this kind of templating end? There's a complexity vs. flexibility tradeoff here, and I feel like the proposal favours flexibility over simplicity, meaning it's more powerful but also a steeper learning curve for users (having to understand the k8s manifest format, for one).
- Kubernetes itself already has an abstraction for what we're trying to do here in the form of PersistentVolumeClaims (https://kubernetes.io/docs/user-guide/persistent-volumes/#persistentvolumeclaims). If the number of different volume types being added here is a concern, we could support just PVCs as volumes. If users want to mount a volume into their build, they can create a PVC in the destination Kubernetes cluster (and namespace) and reference it from within their config. PVCs can be created from within the Kubernetes dashboard UI too, making it nice and simple for users with no experience writing Kubernetes manifests.
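To make the PVC-only approach concrete, a user would create a claim like the sketch below once in the target namespace and then reference it by name from the runner config. The claim name and sizes are illustrative only:

```yaml
# Hypothetical PVC created once by the user in the target
# cluster and namespace, e.g. via kubectl or the dashboard UI.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: build-cache
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 10Gi
```

The runner config would then only need the claim's name and a mount path, keeping all volume-type specifics out of gitlab-runner.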
> With this, there's no way for a user to mount a volume into a service container. Currently, volumes are mounted into every container within the pod. So if a user wants to configure/tweak parameters or some config in a service, they can.

You would have a separate definition for the service container, which can have different volume mounts, so this is not fully correct.
> If the Kubernetes API changes, we now have to version gitlab-runner against not only GitLab and the helper Docker image, but also the additional volume types.

It would just be a recompile of the binary with a bump of the k8s integration. Since we do not implement new functionality, we only parse existing specifications, this will not be the biggest problem. And since the API there doesn't change, you can be quite sure it will work as before.
> Kubernetes itself already has an abstraction for what we're trying to do here in the form of PersistentVolumeClaims (https://kubernetes.io/docs/user-guide/persistent-volumes/#persistentvolumeclaims). If the number of different volume types being added here is a concern, we could support just PVCs as volumes. If users want to mount a volume into their build, they can create a PVC in the destination Kubernetes cluster (and namespace) and reference it from within their config. PVCs can be created from within the Kubernetes dashboard UI too, making it nice and simple for users with no experience writing Kubernetes manifests.

I want to avoid adding support for specific volumes, and either rely on parsing a template into structures that are provided by someone else, or keep it simple. So either we stick with templating, which is a very generic way of supporting everything, or we allow only: pvc, hostPath, configMap and secret. I don't expect to support more, but I still don't like that we have to replicate and rewrite an API.
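A `config.toml` sketch of the limited-set option might look like the following. The section and key names here are purely hypothetical, not existing runner options:

```toml
# Hypothetical config.toml entries for the four proposed volume types.
[[runners.kubernetes.volumes.pvc]]
  name = "build-cache"        # name of an existing PVC in the namespace
  mount_path = "/cache"

[[runners.kubernetes.volumes.host_path]]
  name = "repo-mirror"
  host_path = "/mnt/mirror"   # path on the node
  mount_path = "/mirror"

[[runners.kubernetes.volumes.config_map]]
  name = "app-config"
  mount_path = "/etc/app"

[[runners.kubernetes.volumes.secret]]
  name = "deploy-key"
  mount_path = "/secrets"
```

Each entry would map almost one-to-one onto the corresponding Kubernetes volume source, so the runner only translates a handful of fields rather than parsing full manifests.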
I have a dilemma. I understand @munnerz's concerns, but I would also prefer to include some YAML file in k8s format instead of re-implementing such a thing in Runner. The main problem I see with this approach is what @munnerz already wrote:
> it's (...) a steeper learning curve for users (having to understand the k8s manifest format, for one)
Having a configuration struct with 4-5 fields, instead of a full k8s manifest which may be unknown to the user, would be much easier to learn for someone who doesn't have experience with either k8s or Runner.
I'm very interested in this:

> Kubernetes itself already has abstractions for what we're trying to do here in the form of PersistentVolumeClaims.
From the linked documentation I don't fully see how this would be an abstraction layer for accessing different volumes. If I understand the documentation correctly, with a PVC I can define the type of the volume and the k8s cluster will prepare the volume for me. That means I have no power to decide, for example, which directory from NFS/S3/any other source I can mount. For environments and tasks where I'm interested only in providing persistent storage this will be OK. But what if I need to mount a specific target (e.g. the persistent storage of some WWW pod, to "deploy" a new version of a website with my job)? Will this work? If yes, then I agree with the last proposition from @ayufan: let's support only pvc, hostPath, configMap and secret with some clean configuration structures in the config.toml file, and allow users to access other volume types through pvc. If a PVC can be configured with the k8s dashboard, it will also be easier for a new user to understand than a YAML manifest.
Is there currently a workaround for this? I added some source code myself to implement this feature (copied from elsewhere), but the Dockerfile in this repo is broken, so I couldn't build an image to use.
@tmaczukin sorry for not responding to your message here sooner.
> From the linked documentation I don't fully see how this would be an abstraction layer for accessing different volumes. If I understand the documentation correctly, with a PVC I can define the type of the volume and the k8s cluster will prepare the volume for me. That means I have no power to decide, for example, which directory from NFS/S3/any other source I can mount.
It's worth noting that PVCs do contain 'selectors', which can be used to match specific PVs configured in the target Kubernetes cluster (more info: https://kubernetes.io/docs/concepts/storage/persistent-volumes/#selector). It would also be possible, in the pvc config structure in config.toml, to specifically allow/disallow dynamic volume provisioning.
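For illustration, a claim using a selector to bind to a specific pre-created PV (instead of triggering dynamic provisioning) could look like this; the labels and sizes are hypothetical:

```yaml
# Hypothetical PVC bound to a specific PV via label selector.
# This addresses the "mount a specific target" concern above:
# the admin labels the PV, and the claim matches it exactly.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: www-content
spec:
  accessModes: ["ReadWriteMany"]
  resources:
    requests:
      storage: 5Gi
  selector:
    matchLabels:
      content: www-production   # label set on the pre-created PV
```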
Once we've got a consensus on how we want to implement this I'm happy to jump on this work. I know the big outcry at the minute is for the ability to pass through /var/run/docker.sock, and I too am keen to get that out!
@tmaczukin @ayufan it's also worth noting the work around PodPreset in Kubernetes, which provides a k8s-native way to achieve what we want here (kind of). It's currently in alpha, however, so there are no stability guarantees and the feature won't be available on GKE.
(sorry for the 4th comment here!) To follow up with @ashernevins: you could actually use the PodPreset feature of Kubernetes to do this today, if you are running in a cluster with the settings.k8s.io/v1alpha1 PodPreset API type enabled. Have a read over the above link to get an idea of how.
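A sketch of what such a PodPreset could look like for the docker.sock use case; the names and the label the build pods are assumed to carry are hypothetical:

```yaml
# Hypothetical PodPreset (alpha API) injecting a hostPath volume
# into every pod whose labels match the selector.
apiVersion: settings.k8s.io/v1alpha1
kind: PodPreset
metadata:
  name: mount-docker-sock
spec:
  selector:
    matchLabels:
      app: gitlab-runner-build   # build pods must carry this label
  volumeMounts:
    - name: docker-sock
      mountPath: /var/run/docker.sock
  volumes:
    - name: docker-sock
      hostPath:
        path: /var/run/docker.sock
```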
@munnerz I actually tried that, but building from the Dockerfile in this project is currently broken. I opened an issue for it already.
I run on GKE, so PodPreset is unfortunately not available to me.
Is there any kind of timeline for implementing the volumes feature? I'd be happy to offer a bounty for quick delivery. This is a pretty core feature that's blocking us from using GitLab CI at the moment.
@munnerz I decided to try your PodPreset suggestion. I created an alpha cluster and a PodPreset, but I can't find any documentation on how to have the runner create pods with specific labels so the PodPreset applies. Can you point me in the right direction?
If there are no strong arguments for splitting the configuration of a mount into two entries (mounts and volumes.{type}) instead of one (volumes.{type}), then let's go with the second option.
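To make the two shapes being compared concrete, here is a hypothetical `config.toml` sketch (neither shape is an existing option; key names are illustrative):

```toml
# Option 1: mount and volume defined separately, linked by name:
#
#   [[runners.kubernetes.mounts]]
#     name = "docker-sock"
#     mount_path = "/var/run/docker.sock"
#   [[runners.kubernetes.volumes.host_path]]
#     name = "docker-sock"
#     host_path = "/var/run/docker.sock"
#
# Option 2 (the one being chosen): a single entry per volume type
# that carries its own mount path:
[[runners.kubernetes.volumes.host_path]]
  name = "docker-sock"
  host_path = "/var/run/docker.sock"
  mount_path = "/var/run/docker.sock"
```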
It would be interesting to have support for EmptyDir, as I don't see a way of providing a tmpfs to a service container at the moment, but I might be wrong.
Yeah, adding EmptyDir support is probably also wise. It has a single parameter, Medium (a string), which when set to "Memory" makes the volume an in-memory disk. Otherwise, if the field is not set, it will be an empty temporary directory on the node's storage.
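Following the single-entry-per-type shape agreed above, an EmptyDir entry might be sketched like this (key names are hypothetical):

```toml
# Hypothetical empty_dir entry; medium = "Memory" backs the
# volume with tmpfs, matching the EmptyDir Medium parameter.
[[runners.kubernetes.volumes.empty_dir]]
  name = "scratch"
  mount_path = "/scratch"
  medium = "Memory"   # omit for a plain on-disk temporary directory
```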