Create a Cluster page to add a new cluster on GKE

changed the description

mentioned in issue #35956

marked this issue as related to #35956

@markpundsack looking at the design of https://gitlab.com/gitlab-org/gitlab-ce/issues/27888 I can see this working here in a similar fashion. However I wonder about a few things:

The person that creates the cluster may not have connected their account with their google profile, so I think we'd need a way to let them authenticate this... before showing options of creating a cluster.
https://gitlab.com/gitlab-org/gitlab-ce/issues/27888 starts off by "enabling" a cluster.. like it will just be created... this almost implies to a group level enabled google account. Do you envision a similar thing for project based gke cluster creation?
What about selecting an already existent gke cluster?
You want us to prefill the info, after using the google way of connecting? or rather show it in a different way?
"Creation should let the user pick the number of nodes and the region and let them know what size machines will be used" indicates that if we selected it with the google way.. additional info is present.. Can we detect this info on any cluster, not just gke?

mentioned in issue #35616 (moved)

The person that creates the cluster may not have connected their account with their google profile, so I think we'd need a way to let them authenticate this... before showing options of creating a cluster.

@dimitrieh Turning that around, we should show the option to create a cluster, but then drive them through an OAuth and permissions flow if necessary. i.e. show the goal first, and the technical hurdle second. Hiding the option to create the cluster until they've authenticated would bury the capability and hinder adoption.

https://gitlab.com/gitlab-org/gitlab-ce/issues/27888 starts off by "enabling" a cluster.. like it will just be created... this almost implies to a group level enabled google account. Do you envision a similar thing for project based gke cluster creation?

"Enabling" there is an indication of how frictionless we want the experience to be. In reality, cluster creation can take several minutes so maybe we should make that clearer. It does not imply a group-level Google account. We assumed the same kind of OAuth-when-necessary dance.

What about selecting an already existent gke cluster?

Yeah, that's an interesting question that we haven't explicitly tackled. It should be possible to connect to an existing GKE cluster, and thus get certain admin/monitoring capabilities for it. But the k8s integration already lets you add creds to any existing cluster (GKE or not), so it's not a practical concern yet. Well, except that people might be more comfortable/happier doing an OAuth dance and selecting from a list instead of copying/pasting k8s creds. Perhaps that's a further iteration enhancement, or something after https://gitlab.com/gitlab-org/gitlab-ce/issues/35616.

You want us to prefill the info, after using the google way of connecting? or rather show it in a different way?

I think what you're asking is whether, after the cluster is created, we just fill in the regular k8s integration values, or do something special. I hadn't thought much about it. I guess I was assuming more the latter, like we do for the Mattermost integration. There, after you click the button, we just acknowledge that it's configured, but hide the details. If it's easier for a first iteration to use the existing k8s variables, then I'd be open to it, but I think a good implementation would treat the GKE cluster as something more first-class.

"Creation should let the user pick the number of nodes and the region and let them know what size machines will be used" indicates that if we selected it with the google way.. additional info is present

Yeah, it does imply that. :)

Can we detect this info on any cluster, not just gke?

We could detect the number of nodes, but I'm not sure about the machine type. I mean, it's buried in node labels like beta.kubernetes.io/instance-type: n1-standard-2, but I have no idea if this is a convention we can count on for all providers. We certainly wouldn't let them edit the number of nodes unless it's GKE.
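To make the node-label idea concrete, here is a rough sketch of reading the node count and machine type from a parsed Kubernetes node list (the shape returned by GET /api/v1/nodes). The helper name summarize_nodes and the sample data are hypothetical, and relying on this label outside GKE is exactly the open question above.

```ruby
require 'json'

# Label observed on GKE nodes; other providers may not set it (assumption).
INSTANCE_TYPE_LABEL = 'beta.kubernetes.io/instance-type'

# Summarizes a parsed Kubernetes node list.
# Nodes without the label are counted but contribute no machine type.
def summarize_nodes(node_list)
  items = node_list.fetch('items', [])
  machine_types = items.map { |node| node.dig('metadata', 'labels', INSTANCE_TYPE_LABEL) }
  { node_count: items.size, machine_types: machine_types.compact.uniq }
end

# Hypothetical sample data: two labeled nodes and one without the label.
SAMPLE_NODE_LIST = JSON.parse(<<~JSON)
  {
    "items": [
      {"metadata": {"name": "node-1", "labels": {"beta.kubernetes.io/instance-type": "n1-standard-2"}}},
      {"metadata": {"name": "node-2", "labels": {"beta.kubernetes.io/instance-type": "n1-standard-2"}}},
      {"metadata": {"name": "node-3", "labels": {}}}
    ]
  }
JSON

puts summarize_nodes(SAMPLE_NODE_LIST).inspect
```

If the label is missing on every node, machine_types comes back empty, which the UI could treat as "unknown machine type".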

added Deliverable label

@tauriedavis would you be able to chime in here? (as I prob don't have enough time for this issue, and you did previous work with me on the google stuff) cc: @sarrahvesselov

I'm not sure I will get to this this week, and I will be on vacation next week. I will keep this in my todos in case I have time this week, but will leave it unassigned in case someone else is able to pick it up.

we should show the option to create a cluster, but then drive them through an OAuth and permissions flow if necessary. i.e. show the goal first, and the technical hurdle second. Hiding the option to create the cluster until they've authenticated would bury the capability and hinder adoption.

I agree with this @markpundsack.

What about selecting an already existent gke cluster?

I am in agreement with Mark here, this should be tackled in a future iteration.

assigned to @tauriedavis

@markpundsack Can you clarify this portion:

Creation should let the user pick the number of nodes and the region and let them know what size machines will be used (e.g. n1-standard-2).

What do you mean by region? How is the machine size specified? Those don't seem to be included in https://gitlab.com/gitlab-org/gitlab-ce/issues/27888.

@tauriedavis Yeah, that was feedback that our mockups didn't include enough configurability and that selecting region was actually really important. (And I agree, but just didn't think about it when we did the first mockups.)

You can see regions here: https://cloud.google.com/about/locations/

They're things like us-central1. The dropdown for Zones looks like this.

Actually, it looks like you need to specify a "Zone", which is a sub-thing under regions. e.g. us-central1-a.

We could let them specify the machine size, but I think we might be better off picking a recommended size. Not sure about it, but it's simpler to pick a reasonable default and give them fewer choices. But if we pick a reasonable size, like n1-standard-2, we should still tell them what size we're using as it affects cost and might impact their choice of number of nodes.

Standard machine types look like this.

added frontend label

added backend label

@cperessini Since you have some more availability, do you want to pick up this issue?

assigned to @cperessini and unassigned @tauriedavis

@tauriedavis I'll pick it up, thanks!

assigned to @filipa

assigned to @dosuken123

@cperessini I learned today there are some old mockups for a similar thing: https://gitlab.com/gitlab-org/gitlab-ce/issues/27888 Not sure if this helps!

Thanks @filipa! I'm gonna look into this today

@ayufan Here is the summary of technical challenges on this issue.

Creates a cluster

We can let users create a cluster with the projects.zones.clusters.create method in the GKE API. It exposes almost everything we can do in the GCP Web Console.

Example

We create a cluster in a GCP project "test-autodevops"
The zone of the new cluster is "us-central1-a"
The name of the new cluster is "test-api-creation"
The node count of the new cluster is 1

curl -H "Content-Type: application/json" \
-X POST -d '{"cluster":{"name":"test-api-creation","initial_node_count":"1"}}' \
https://container.googleapis.com/v1/projects/test-autodevops/zones/us-central1-a/clusters

=>

{
  "name":"operation-xxxxxxxx",
  "operationType":"CREATE_CLUSTER",
  "selfLink":"https://container.googleapis.com/v1/projects/xxxxxxxx/zones/us-central1-a/operations/operation-xxxxxxxx",
  "startTime":"2017-09-13T16:49:13.055601589Z",
  "status":"RUNNING",
  "targetLink":"https://container.googleapis.com/v1/projects/xxxxxxxx/zones/us-central1-a/clusters/test-api-creation",
  "zone":"us-central1-a"
}

Note

The GCP project and other prerequisites for creating a cluster (e.g. billing info) must be set up before the cluster is created.
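For reference, the same create call can be sketched in Ruby with only the standard library. This builds the request without sending it. The helper name build_create_cluster_request is hypothetical, the body mirrors the curl example above, and the Bearer header assumes an OAuth access token obtained through the consent flow.

```ruby
require 'json'
require 'net/http'
require 'uri'

GKE_API_BASE = 'https://container.googleapis.com/v1'

# Builds (but does not send) the POST request for projects.zones.clusters.create.
# The access token is assumed to come from the OAuth flow.
def build_create_cluster_request(project_id:, zone:, name:, node_count:, access_token:)
  uri = URI("#{GKE_API_BASE}/projects/#{project_id}/zones/#{zone}/clusters")
  request = Net::HTTP::Post.new(uri)
  request['Content-Type'] = 'application/json'
  request['Authorization'] = "Bearer #{access_token}"
  # Same body as the curl example above.
  request.body = JSON.generate(cluster: { name: name, initial_node_count: node_count.to_s })
  request
end

request = build_create_cluster_request(
  project_id: 'test-autodevops',
  zone: 'us-central1-a',
  name: 'test-api-creation',
  node_count: 1,
  access_token: 'placeholder-token' # hypothetical value
)
puts "#{request.method} #{request.path}"
puts request.body
```

Sending it would be a matter of passing the request to Net::HTTP with use_ssl enabled. In GitLab itself we would go through the google-api-client gem instead.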

Gets the details of a specific cluster

We can get the cluster details with the projects.zones.clusters.get method in the GKE API. The response contains the data needed to integrate with k8s, such as the endpoint, CA certificate, username/password, and status.

Example

GCP project is "test-autodevops"
Cluster zone is "us-central1-a"
Cluster name is "cluster-1"

curl https://container.googleapis.com/v1/projects/test-autodevops/zones/us-central1-a/clusters/cluster-1

=>

{
  ...
  "endpoint":"104.xxx.xxx.xxx",
  ...
  "masterAuth":{
    "clientCertificate":"xxxxxxxxxxxxxxx",
    "clientKey":"xxxxxxxxxxxxxx",
    "clusterCaCertificate":"xxxxxxxxxxxxxx",
    "password":"xxxxxxxxxxxxxx",
    "username":"xxxxxxxxxxxxxx"
  },
  ...
  "status":"RUNNING",
  ...
}

Note

We can get a cluster username/password, but we cannot get a cluster token. Currently, the GitLab k8s integration uses a token for authentication, so we need to extend it to accept a username/password instead of a token.
clusterCaCertificate is encoded in Base64
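As a rough illustration of the two notes above, this sketch pulls the values the k8s integration would need out of a clusters.get response and decodes the CA certificate. The helper name kubernetes_credentials and the sample values are hypothetical. Prefixing the endpoint with https:// is also an assumption.

```ruby
# Extracts the fields needed for the k8s integration from a parsed clusters.get response.
def kubernetes_credentials(cluster)
  auth = cluster.fetch('masterAuth')
  {
    api_url: "https://#{cluster.fetch('endpoint')}", # assumed URL form
    # unpack1('m') is the core-Ruby equivalent of Base64.decode64.
    ca_pem: auth.fetch('clusterCaCertificate').unpack1('m'),
    username: auth.fetch('username'),
    password: auth.fetch('password')
  }
end

# Placeholder certificate and credentials, for illustration only.
FAKE_CA_PEM = "-----BEGIN CERTIFICATE-----\nplaceholder\n-----END CERTIFICATE-----\n"
SAMPLE_CLUSTER = {
  'endpoint' => '203.0.113.10',
  'masterAuth' => {
    'clusterCaCertificate' => [FAKE_CA_PEM].pack('m0'), # Base64-encode, as the API does
    'username' => 'admin',
    'password' => 'placeholder-password'
  },
  'status' => 'RUNNING'
}.freeze

credentials = kubernetes_credentials(SAMPLE_CLUSTER)
puts credentials[:api_url]
puts credentials[:ca_pem]
```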

Authentication/Authorization

We need to authenticate as an end user for the above methods. This works the same way as our integrations with other services (i.e. we present a consent screen to the user).

Note

We use https://www.googleapis.com/auth/cloud-platform OAuth scope for both clusters.get and clusters.create.
We also have another option: service accounts. These are tied to the GCP project as opposed to a single user account.
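To show what sending the user to the consent screen could look like, here is a minimal sketch that builds the authorization URL with the cloud-platform scope. It assumes Google's standard OAuth 2.0 authorization-code flow. The helper name gcp_authorization_url, the client ID, and the redirect URI are hypothetical.

```ruby
require 'securerandom'
require 'uri'

GOOGLE_AUTH_ENDPOINT = 'https://accounts.google.com/o/oauth2/v2/auth'
CLOUD_PLATFORM_SCOPE = 'https://www.googleapis.com/auth/cloud-platform'

# Builds the URL of the consent screen for the cloud-platform scope.
# `state` is an opaque value echoed back on the callback to guard against CSRF.
def gcp_authorization_url(client_id:, redirect_uri:, state:)
  query = URI.encode_www_form(
    client_id: client_id,
    redirect_uri: redirect_uri,
    response_type: 'code',
    scope: CLOUD_PLATFORM_SCOPE,
    state: state
  )
  "#{GOOGLE_AUTH_ENDPOINT}?#{query}"
end

puts gcp_authorization_url(
  client_id: 'example-client-id.apps.googleusercontent.com', # hypothetical
  redirect_uri: 'https://gitlab.example.com/callback',        # hypothetical
  state: SecureRandom.hex(16)
)
```

The callback would then exchange the returned code for an access token, which is the part a library such as omniauth normally handles.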

Others

Users have to enable the Google Container Engine API before GitLab can access the resource via the API.
The API rate limit is 10 requests per second
Resource Quotas

Libraries

We use google/google-api-ruby-client. It's already been included in Gemfile in master branch in CE repo. I used this library to test the above methods. Probably we need to update the gem.

References

zone

Sample code for getting the details of a specific cluster

```ruby
# BEFORE RUNNING:
# ---------------
# 1. If not already done, enable the Google Container Engine API
#    and check the quota for your project at
#    https://console.developers.google.com/apis/api/container
# 2. This sample uses Application Default Credentials for authentication.
#    If not already done, install the gcloud CLI from
#    https://cloud.google.com/sdk and run
#    `gcloud beta auth application-default login`.
#    For more information, see
#    https://developers.google.com/identity/protocols/application-default-credentials
# 3. Install the Ruby client library and Application Default Credentials
#    library by running `gem install google-api-client` and
#    `gem install googleauth`

require 'googleauth'
require 'google/apis/container_v1'

service = Google::Apis::ContainerV1::ContainerService.new

service.authorization =
  Google::Auth.get_application_default(['https://www.googleapis.com/auth/cloud-platform'])

# The Google Developers Console [project ID or project
# number](https://support.google.com/cloud/answer/6158840).
project_id = 'test-autodevops' # TODO: Update placeholder value.

# The name of the Google Compute Engine zone in which the cluster
# resides.
zone = 'us-central1-a' # TODO: Update placeholder value.

# The name of the cluster to retrieve.
cluster_id = 'cluster-1' # TODO: Update placeholder value.

response = service.get_zone_cluster(project_id, zone, cluster_id)

# TODO: Change code below to process the `response` object:
puts response.to_json
```


Note

We don't use the above authentication code. We use this OAuth authentication instead.
This was picked from the GKE API doc.

If you have any questions, please let me know.

I'll work on a PoC for the auth flow with the GKE API.

@markpundsack @filipa let me know if this proposal does what we need:

We add radio buttons to the Kubernetes integration page so you can choose how you want to set up your cluster: Use Google Container Engine or Set up manually

The Details panel changes its fields according to the selected option above.

The Manual panel has all the same fields that you can find in the Kubernetes integration page today.

The GKE panel has two fields: Number of nodes and Region. Both are pre-filled with a value chosen by us. The panel also informs the user of the machine size that will be used.

A Sign in with Google button is shown at the bottom. This button doesn't care if the user has already signed in with Google. Once they click it, the OAuth mechanism will take care of that difference.

GKE	Manual

Once the OAuth flow is finished, GitLab will show a banner message saying the cluster is being created:

Once the cluster is created, the panel will display the access details to the cluster.

The user must still check the Active checkbox and click Save changes to use the newly created cluster. This is reflected in the new banner message:

When the user visits this page after the cluster has been set up, the Details panel shows the current configuration.

I'm not sure if we'll be able to modify the cluster's setup on GKE from GitLab. If we can't, we should disable the input fields:

We need a field "GCP project name" to choose which GCP project owns the new cluster. We also need to specify "cluster name", however, it's not so important for the integration, so we can autofill it, I guess.

Couple of questions

Do we need to fetch an array of available GCP project names for a dropdown/combobox instead of a text field?
Do we need to fetch an array of available zone names for a dropdown/combobox instead of a text field?
Do we need a button to delete a cluster?
Do we need a "Create a new cluster" button instead of "Sign in with Google"? If the user can't use OAuth, do we show a login page?

I think "Zone" is better than "Region" because we specify zone as parameter. Region can't be used as parameter. (Reference: https://cloud.google.com/compute/docs/regions-zones/regions-zones)
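If we start with a plain text field for the zone, a light format check could catch the common mistake of entering a region. This is only a heuristic sketch based on names like us-central1-a. The pattern and the helper name parse_zone are assumptions, not something taken from the GCP docs.

```ruby
# Heuristic: a zone looks like "<region>-<letter>", where the region ends in digits.
ZONE_PATTERN = /\A([a-z]+(?:-[a-z]+)*\d+)-([a-z])\z/

# Returns the zone and its parent region, or nil if the input does not look like a zone.
def parse_zone(input)
  match = ZONE_PATTERN.match(input.to_s.strip)
  return nil unless match

  { zone: match[0], region: match[1] }
end

p parse_zone('us-central1-a')
p parse_zone('us-central1')
```

A region such as us-central1 is rejected because it lacks the trailing zone letter, so the form can ask the user for a zone instead.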

Instead of Use Google Container Engine, how about Create on Google Container Engine. You can "use" GKE with manual configuration. The difference is that we're creating it for them.

It seems wrong to not activate the cluster automatically.

We need a field "GCP project name" to choose which GCP project owns the new cluster.

Good point. I wonder if we can create a project for them too though, since people may not have a project already. But perhaps allow them to specify a project if they do have one, since the project has the billing relationship.

@cperessini What happens if they change the radio button to manual after creating a cluster, and then go back?

I wonder if we should use a create button pattern instead of radio buttons. Then after the cluster is created, it just fills in the fields for you. Or maybe hides them, I'm not sure. Thinking of how we did the Mattermost slash commands integration, we had a button that did the magic for you, then changed the display of page to hide the configuration.

Maybe we should provide links to manage the cluster directly if needed, especially if we don't let them delete the cluster.

Eventually we want to let them change the number of nodes. Not sure how that would fit, but it's beyond scope currently.

I wonder if we can create a project for them too though, since people may not have a project already. But perhaps allow them to specify a project if they do have one, since the project has the billing relationship.

The technical difficulty of creating a GCP project would be high, since it has to be connected to a credit card to enable GKE. So I propose to tackle it in another iteration. However, I think we should read all available GCP projects the user owns and show them as a list to select from in the form.

@ayufan I haven't checked whether GCP projects can be read via the API, and I feel we should include it in the first iteration. Should I spend time researching it?

@cperessini it looks good to me but @markpundsack has a better view of the product goal :)

@cperessini @markpundsack Can we maybe decouple Kubernetes cluster from integration view? Maybe we could start with proposing the separate view within the future Cluster view, as part of a separate page.

I'm asking, as adding that today to Kubernetes/integration is troublesome, and we know that we will want to make Auto DevOps/Cluster a first-class thing, later also including installation of Tiller, Deploy Apps, Prometheus and Runner. Doing that as part of CI / CD > Cluster | Auto DevOps would make a lot of sense to me, as we then easily extend that view with additional data.

Internally we would, once the cluster is created, configure Kubernetes/integration, but this would be basically a separate entity and flow stored in the database.

Can we maybe decouple Kubernetes cluster from integration view?

@ayufan I'm sorry, I'm having trouble understanding your comment. Could you rephrase?

@markpundsack @dosuken123 thanks for your reviews, they were really helpful! Holding off on creating new designs until we know if we'll go in the direction @ayufan brought up.

@markpundsack

The current design changes Integrations > Kubernetes. My proposal is to create a new top-level page and leave this setting as-is.

The new page would for now be only used to Create a new cluster, but then we can extend that for cluster management.

@ayufan It sounds like what you're saying is to do https://gitlab.com/gitlab-org/gitlab-ce/issues/35616, but start with only project-level Clusters, and cut other scope (like only have cluster creation, and not management). That's fine, as long as we get it done in 10.1, and it doesn't slow down the rest of the iterations. Cutting https://gitlab.com/gitlab-org/gitlab-ce/issues/35616 was an attempt at focusing on a smaller MVP, so if you're saying it's actually faster to do that, great!

One question: will this support multiple clusters per project? If not, having a page to manage clusters seems kinda silly.

As this isn't designed yet, there's risk in changing direction this late in the process. You'll need to manage that risk carefully. Let's get a design asap so we can see if it's viable (technically, and from a product perspective). /cc @cperessini

@markpundsack @dosuken123 @ayufan @filipa here are designs for decoupling the Kubernetes integration page from clusters.

We'd need to add a few pages and a new Cluster entity. They're not huge changes, but probably a lot bigger than initially scoped for.

Integration page

Since clusters will be their own entity, there's no need to fill those details in the integration page.

We replace the input fields with just one dropdown where you can select any existing clusters or add a new one.

Integration page	Dropdown	Empty dropdown

Add new cluster

We add a page similar to the New issue or New merge request pages.

The first option on this page is to create a cluster on GCP. Clicking the Google Sign In button will take you to a different page if authentication succeeds. If authentication fails we show an error on this page.

The following section allows you to add an existing cluster by filling in the same information you can enter in the Integration page today.

I added a new Cluster name field so users can later identify the clusters they add to GitLab.

Add cluster	Authentication failed

Create cluster on GCP

If authentication succeeds, you are taken to this page, where you can specify the parameters for your new cluster on GCP.

We really could simplify all of this and just ask for a Cluster name, but I included the fields in case that's the direction we go in.

Some notes about the new fields:

Cluster name: This is the name that will be used by GitLab. I'm not sure if we can also communicate this name to GCP.
GCP Project name: This field is optional. If the user enters one, we create the cluster under that project on GCP. Otherwise, we create a new project with a generic name.

When the user clicks Create, the new GCP cluster gets added to the list of GitLab clusters too.

Clusters list

We add a new Clusters section under CI / CD in the navigation. For now, this will just be a list of clusters, where each row has the following information:

Cluster ID
Cluster name
Kubernetes integration status. If this is the selected cluster on the integration page, it will say Yes here. Otherwise it will say No.
Edit button
Delete button

There is no detail page for each individual cluster.

Edit cluster

The edit cluster page has the same fields as the current Integration page, plus the cluster name field.

This page shows the same information whether you created the cluster on GCP or added an existing one.

In a future iteration we can make this page smarter and add GCP-specific fields.

Some notes

The migration to 10.1 would need to create a cluster that uses the info specified in the current K8s integration page.

Do we need to fetch an array of available GCP project names for a dropdown/combobox instead of a text field?

Addressed in the design

Do we need to fetch an array of available zone names for a dropdown/combobox instead of a text field?

If fetching the list is complicated, we can choose 5-10 regions manually and offer those without having to go to Google.

We can also use a text field, but that will make things more complicated for users

Do we need a button to delete a cluster?

Not on GKE, but we'll need to be able to delete the clusters we add.

Do we need a "Create a new cluster" button instead of "Sign in with Google"? If the user can't use OAuth, do we show a login page?

Solved in the new designs

Thanks for the detailed mockups with all the status @cperessini!

@ayufan I created a minimal Ruby app to test the whole process, from authorizing the user to getting cluster information: https://gitlab.com/dosuken123/gke_integration_poc. I've already confirmed that authorized users can access their cluster info. It uses an OAuth token when making requests to the API. The OAuth token is issued when the user logs in with a Gmail account.

This app uses devise, omniauth and omniauth-google-oauth2, which are already used in the GitLab codebase, so it should be easy to adopt in CE/EE.

One hurdle remains, though. The sample app currently authorizes users when they log in, but we want to authorize them when they access clusters. In other words, we authorize users with the GCP API OAuth scope (https://www.googleapis.com/auth/cloud-platform) when they create a cluster. We don't include the scope when they log in with a Google account (i.e. we don't touch config.omniauth). I'm currently investigating this. Maybe we need to use Signet.

FYI, we use the google-api-client gem as the GCP/GKE API library. It is developed by Google, so we don't need to develop an API library from scratch.

@cperessini

Can we not touch integration page at all and leave it as-is?

Can we focus on showing single cluster today? We don't support multiple anyway, today.

@dosuken123 Awesome work! I will take a look at it now!

@cperessini Thank you for creating awesome mockups!

Here are a couple of comments/questions.

Add new cluster page

I think showing "Sign in with Google" first is not ideal. How about putting the "Create cluster on GCP" form on this page?
About the "Sign in with Google" button: authentication should be done only once. We don't show the authentication button every time. When the user accepts the consent screen, we save the auth_token in GitLab, and we call the GCP API on behalf of the user with that auth_token.
Do we need "Project namespace (optional/unique)"? This seems useless for manual configuration.
Please extend the "Token" field to "Token" or "Username" and "Password" fields. This is because when we create a new cluster we can't get a k8s token, only a k8s username/password. If the user clicks the "Create a new cluster" button, we save the "Username" and "Password" instead of a "Token", and we authenticate against the new cluster with the username/password. "Token" is still used for the manual configuration, as described in the https://docs.gitlab.com/ce/topics/autodevops/quick_start_guide.html flow.

Create cluster on GCP page

GCP Project name should be required, and the user should have a valid GCP project before they continue. We don't support creating a GCP project in this iteration, as the technical difficulty would be high (https://gitlab.com/gitlab-org/gitlab-ce/issues/35954#note_40587560). At a minimum, the user needs a GCP project with billing and API access enabled. We should note that beside the GCP Project name field.
Please change "Region" to "Zone". We can't use "Region" as a parameter.
About the Cluster name field: just to be clear, each cluster already has a cluster name, which must be unique within its GCP project. Ideally, the GitLab-customizable cluster name and the real cluster name should be identical. In this automated flow we can keep them in sync, but if the user sets up the k8s integration manually, they can differ. This could cause unexpected behavior.
About the Cluster name field: I think we should make this field required. We need the cluster name when we check the cluster details via the API. FYI, validation will fail when another cluster has already taken the name.

Clusters list page

Do we need Cluster ID? It won't be used or referenced anywhere.
"Integrated with Kubernetes" would be confusing because the cluster itself is k8s. How about "Used in this project" or something like that?
About the Delete button: we should make clear that the cluster itself will NOT be deleted (it only deletes the associated data in GitLab, because we don't implement cluster deletion in this iteration). This should be shown as an alert. It should also be documented.

Edit cluster page

We should link to the cluster in case the user wants to edit cluster parameters (e.g. number of nodes).

@ayufan @markpundsack Thoughts?

Summary for this iteration

Do we create a new GCP project if the user doesn't have one? -> No. But we should document how to create a valid GCP project (e.g. billing is necessary, etc).
Do we create a new cluster? -> Yes
Do we update the cluster? (e.g. number of nodes) -> No. But we link to the cluster. The user updates it in the GCP Console.
Do we delete the cluster? -> No. But we link to the cluster. The user deletes it in the GCP Console.
Do we need to fetch an array of available GCP project names for a dropdown/combobox instead of a text field? -> No? But show a link to the GCP project list?
Do we need to fetch an array of available zone names for a dropdown/combobox instead of a text field? -> No? Do we add a link to the zone list?
Do we enable the Kubernetes integration automatically when the user adds a new cluster, regardless of manual or automatic creation/configuration? -> Yes? (Important)
etc etc etc

@cperessini Please prioritize @ayufan and @markpundsack comments, not mine.

First iteration

We add CI / CD > Clusters,
We use single page for the cluster integration,
We show a form that lets the user disable the cluster integration, specify manual cluster parameters (a replica of the Kubernetes integration), or create a Kubernetes cluster,
At the start, we ask the user to log in to Google Cloud,
When creating a cluster we ask for: GCP project name (required), Cluster name (required), Cluster size, Machine type, Number of nodes, Zone, Project namespace (similar to Kubernetes integration),
When the user clicks Create cluster, we redirect them to the same page,
We show cluster details with the cluster creation status or cluster creation failure,
In the details, we show the API URL, CA Certificate, project namespace (copy-paste of Kubernetes integration), login and password,
Password should probably be masked,
On the details page, we provide a link to Google Cloud which allows us to access cluster details,
When a Google Cloud cluster is used, the Kubernetes integration page shows that the integration is managed by the cluster and links to it,

Frontend

The frontend shows either the cluster creation form or the details form, depending on whether the cluster has been created,
I expect to use HAML with Javascript,
On the details page, the frontend periodically polls the cluster status using the API provided by the backend,

Backend

We create a new database model which holds google_cloud_clusters, or just clusters,
We store all data about cluster, including a unique identifier that allows us to interact with cluster,
We perform all cluster operations asynchronously: creation in a Sidekiq job, and status polling via reactive cache,
For the frontend, we provide an API to query the cluster status and an API to log in to Google Cloud,
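As a rough sketch of the status polling described above: the loop below checks an operation status until it reports DONE or the attempts run out. It assumes the create operation eventually reports DONE, and it takes the status source as a lambda so it can run without network access. The name wait_for_operation and its parameters are hypothetical. In GitLab this logic would live in the Sidekiq job or reactive cache rather than in a blocking loop.

```ruby
# Polls `fetch_status` until it returns 'DONE' or `max_attempts` is reached.
# Returns :done or :timeout. `sleeper` is injectable so the loop can be tested without waiting.
def wait_for_operation(fetch_status, interval: 5, max_attempts: 60,
                       sleeper: ->(seconds) { sleep(seconds) })
  max_attempts.times do
    return :done if fetch_status.call == 'DONE'

    sleeper.call(interval)
  end
  :timeout
end

# Simulated status source standing in for a call to the GKE operations API.
statuses = %w[PENDING RUNNING RUNNING DONE]
result = wait_for_operation(-> { statuses.shift }, sleeper: ->(_seconds) {})
puts result
```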

Next iterations

We allow to soft-delete cluster (unlink from database), hard-delete cluster (delete locally, and remotely from Google Cloud),
We make a form for creating cluster to fetch: machine types, zone, GCP projects from GCE,
We allow to edit/upgrade cluster to newer version,

Next, next iteration

We extend cluster page with the ability to install and configure Prometheus, Runners, etc.

Next, next, next iteration

We allow to create multiple clusters and define them per-group (not per-project).

@bikebilly

I propose CI/CD > Clusters, to be consistent with other menu entries and looking at the future where we will use that page for multi-cluster management.

Works for me.

I'm not sure it is security-wise. It would be enough to give the link, people will log in if needed. It is out of our "auto cluster" thing, and then they have to perform a bunch of manual operations that are not related to us, so it is also fine that they have to enter credentials.

Works for me.

Since we are creating a "clone" of that page in the CI/CD > Clusters page, do we really need to keep the old one? Maybe it is enough to put a static statement there saying that the K8s integration is now managed on the other page, with a link, even for simple custom configuration.

If we remove current Kubernetes/integration we will also have to change all sales demo scripts which rely on that page. I consider that it will be better if we do it gracefully.

@ayufan Regarding the PoC auth flow

One hurdle remains, though. The sample app currently authorizes users when they log in, but we want to authorize them when they access clusters. In other words, we authorize users with the GCP API OAuth scope (https://www.googleapis.com/auth/cloud-platform) when they create a cluster. We don't include the scope when they log in with a Google account (i.e. we don't touch config.omniauth). I'm currently investigating this. Maybe we need to use Signet.

I was able to implement this with the Signet gem. I pushed it to https://gitlab.com/dosuken123/gke_integration_poc. We need to store an access_token in the DB. The access_token is issued after the user agrees on the consent screen.

AFAIK, every potential difficulty has been shown to be solvable. I'll start production coding.

Oh god, please don't show cluster IDs.

If it helps, we can cut scope and not support manual form to add k8s creds in the cluster page. Just use the cluster page for GKE-managed clusters, and people doing other things can use the existing k8s service integration.

To be clear, the goal is to kill the k8s service integration and have all clusters in the Clusters page, but let's stick to the smallest, easiest thing first and iterate.

Going further, let's only support creating a cluster, not connecting to an existing cluster. Obviously connecting to a cluster is useful, especially since these are created at the project level and connecting to a cluster could be a good way to share clusters between projects. But again, keep it minimal.

I don't mind UX completing a minimal design and a future design, in case we can do another iteration before 10.1 ships. But it's critical that at least one iteration ships in 10.1. And it's OK if the second iteration throws out work from the first iteration. So don't overdesign or overengineer the first iteration.

Likewise, start with a field to enter in the zone. Add a link later, if you have time. Fetch the list if you have time. But merge a first version into master that just asks for a text field.

I'm really confused by the interplay between the Clusters page and the Kubernetes service integration. As mentioned above, there's no reason for both to live eventually. If as a technical shortcut, we fetch the default secrets from the cluster and populate the existing k8s service integration, then sure, go for it. But don't go out of your way to keep that integration in sync if that's not just the easiest (boring) solution. If the new Clusters page stores that information in some other way though, then don't worry about it.

If we remove the current Kubernetes integration, we will also have to change all sales demo scripts that rely on that page. I think it would be better to do it gracefully.

We can leave the integration as is. If Clusters is in beta, then leaving both alive for a time is fine. Even if Clusters is GA immediately, we can deprecate the old integration and move on. Some time later, we can make a migration that moves existing (active) integrations into the Clusters page.

@pedroms What's the absolute minimal design for a page that:

Shows when you don't have a cluster
Lets you create a cluster (with various parameters)
Shows when you do have a cluster, with small summary of the cluster (possibly in separate detail page if warranted), and a link to GCP to admin it

Actually, now that I think of it, even information like size can change because they admin it in GCP, so we can cut scope further by not showing that information in the summary. Just the cluster name with a link to admin it.

@bikebilly I know the reasoning behind saying Clusters, but maybe we should more accurately reflect that there's only one. When we shipped issue boards, it was first limited to only one, but we didn't call it "boards" because we'd later expand it. Remember, we may never expand it to support multiple clusters, so let's not design for that case.

@bikebilly @markpundsack @filipa @dosuken123 @ayufan this is a version that combines everything under the new Cluster page.

No cluster is set up

When no cluster is set up, you are shown a panel where you can add a cluster to your project. You can choose to create a new cluster on GKE or add an existing Kubernetes cluster.

The help text has links to GKE documentation about the requisites to create a cluster.

Create on GKE

After Google sign in succeeds, the form for cluster creation appears inline (if possible).

I used dropdowns for project name, zone and machine type for the mockups. This is just for illustration purposes and can be easily changed to text fields or a fixed value.
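As a rough illustration of what the form values turn into, the sketch below builds the endpoint and JSON body for a create-cluster call against the GKE v1 REST API. The helper name and the default node count and machine type are assumptions for the example, not values decided in this issue.

```ruby
require "json"
require "uri"

# Hypothetical helper: builds the endpoint URI and JSON body for a GKE
# "create cluster" call from the values entered in the form.
# The defaults for node_count and machine_type are illustrative only.
def gke_create_cluster_request(project_id:, zone:, name:, node_count: 3, machine_type: "n1-standard-2")
  uri = URI("https://container.googleapis.com/v1/projects/#{project_id}/zones/#{zone}/clusters")
  body = {
    cluster: {
      name: name,
      initialNodeCount: node_count,
      nodeConfig: { machineType: machine_type }
    }
  }
  [uri, JSON.generate(body)]
end
```

The request would be sent as a POST with the user's OAuth access token in an `Authorization: Bearer` header.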

Use existing Kubernetes cluster

This is the same form that you get in the Kubernetes integration page.

For the new UI elements (tab title and confirmation button), I tried to use language that reflects that you're not creating anything new and that this is just about Kubernetes.

Viewing the project's cluster

Once the integration is set up, visiting this page shows you the information that's already been entered.

There's an enable/disable switch and a Save button to make it take effect.

If the cluster was created on GKE, there's an extra section with a link to the cluster on GCP. The fields with the cluster information will be disabled, as they cannot be modified from GitLab. I added 'Copy to clipboard' buttons like we do on other read-only fields, but this is just a nice-to-have.

If the k8s integration was entered manually, the details can be edited. The Save button is used to save any changes made to these fields too.

Mockups: GKE | Manual integration

No permissions

If a user doesn't have the permissions to edit the project's cluster integration, they don't really have access to anything that's being displayed on this page.

For that reason, I think it makes sense to make this a first-level settings page instead of a page under CI / CD. Putting it under settings would take care of the permissions issue automatically.

Once we expand the cluster integration functionality, we'll have read-only information that can be seen by users without permissions. I think then will be an appropriate time to move this to a section under CI / CD in the navigation.

Kubernetes integration

Finally, when cluster integration is set up, the Kubernetes integration page is disabled. It shows all the up-to-date information like it does today, but it cannot be modified.

A message in the top section lets the user know what's going on

@cperessini Thank you for the hard work! This looks awesome. @ayufan What do you think? I think this mock is 90% accurate with your latest proposal, and we can start production coding based on this design.

I just left a couple of nitpicks.

Do we need the "Save Changes" and "Cancel" buttons in "Viewing the project's cluster" > "GKE"?
Do we need the "Cancel" button in "Viewing the project's cluster" > "Manual integration"?

Important. Since we decided to support soft delete in the next iteration, users can click the "Create cluster" button only once. If they make a mistake, they have no recovery path.

mentioned in issue #38286 (closed)

marked this issue as related to #38286 (closed)

mentioned in merge request !14470 (merged)

added In dev label

Do we need the "Save Changes" and "Cancel" buttons in "Viewing the project's cluster" > "GKE"?

These buttons affect the Enable/Disable switch, so I think they are necessary.

Do we need the "Cancel" button in "Viewing the project's cluster" > "Manual integration"?

I included the Cancel button so users could exit the dialog knowing the changes they made won't be applied. But thanks to your comment I realized that we don't use a Cancel button in Settings pages.

It's better to remove the Cancel button in both situations, thanks! I'll update the mockups.

Important. Since we decided to support soft delete in the next iteration, users can click the "Create cluster" button only once. If they make a mistake, they have no recovery path.

This is a good point. We could include a Remove cluster integration panel below the Save button. If it is possible from the backend side, I would recommend doing this.

@cperessini Thanks!

Important. Since we decided to support soft delete in the next iteration, users can click the "Create cluster" button only once. If they make a mistake, they have no recovery path.

@ayufan @bikebilly What do you think?

@bikebilly I don't think we should have an 'overwrite' behavior. I think users should go through a Remove flow and then they can create a new cluster. This way they have a consistent experience and they know what outcome they can expect from each step.

This can be achieved with a very simple UX:

@cperessini

In general, I think tabbed interfaces during creation are bad. It's not obvious that the other tab exists, which hides those options.

I've already suggested we can just cut handling existing k8s clusters, so you can just drop that tab completely.

Eventually, I imagine a list of clusters, with two buttons to either create a new cluster or add an existing one. Or a single button with a drop-down alternate button. Perhaps you could use buttons now. But either way, I'd rather not see a tabbed creation show up now or later.

The mention of Kubernetes auto deployment doesn't make sense to me. This isn't just about Auto DevOps; it's about any deployment to k8s, auto or not.

There should be some instruction about needing an existing project, and in the likely case that people don't have a project, how to create one. I expect that will be the biggest stumbling block to adoption. People will have an existing gmail account, but never used GCP and thus not have a project. And it's not trivial to create one.

Project namespace should not be filled in by default and should be user-editable even if it's a GKE cluster.

For that reason, I think it makes sense to make this a first-level settings page instead of a page under CI / CD. Putting it under settings would take care of the permissions issue automatically.

Once we expand the cluster integration functionality, we'll have read-only information that can be seen by users without permissions. I think then will be an appropriate time to move this to a section under CI / CD in the navigation.

Good point, but since it's useful for non-masters to know whether k8s is configured, it's valuable to keep it out of settings. Also, we want the prominence of having it be a main menu item. For now, the one piece of (read-only) information we could/should show for non-masters is the project namespace. Actually, also the cluster name. For that matter, why not the other information? It seems that focusing on both GKE-created clusters and existing clusters is clouding our design. So again, I'll suggest dropping handling the existing cluster completely so you can design a good experience around creating (and viewing) clusters.

Finally, when cluster integration is set up, the Kubernetes integration page is disabled. It shows all the up-to-date information like it does today, but it cannot be modified.

From a scope perspective, we don't need to do this. We can just deprecate that page eventually. If in the meantime it shows inconsistent information, that's fine. The plan is to eventually kill it (after migrating content, of course).

I've already suggested we can just cut handling existing k8s clusters, so you can just drop that tab completely.

That makes the design much simpler.

There should be some instruction about needing an existing project, and in the likely case that people don't have a project, how to create one

I think we can include this in the bullet points. We can point at GKE's documentation on how to set up a project.

Mockups: Before sign in | Form

The mention of Kubernetes auto deployment doesn't make sense to me. This isn't just about Auto DevOps; it's about any deployment to k8s, auto or not.

This was just some sample text. I'm not the most familiar with this side of GitLab, so it'd be great if product could supply the copy for this feature.

Project namespace should not be filled in by default and should be user-editable even if it's a GKE cluster.

I may have included a value in a mockup, but I meant to leave all fields empty with a placeholder.

I made the field editable in the view page and moved it to the top to make it more obvious.

Good point, but since it's useful for non-masters to know whether k8s is configured, it's valuable to keep it out of settings. Also, we want the prominence of having it be a main menu item. For now, the one piece of (read-only) information we could/should show for non-masters is the project namespace. Actually, also the cluster name. For that matter, why not the other information?

What about public projects like GitLab CE? It's probably okay to show all the info to project members, but not to the wide community.

Which fields do we want to show once the project has been created? Which of those should be available to each permissions level?

From a scope perspective, we don't need to do this. We can just deprecate that page eventually. If in the meantime it shows inconsistent information, that's fine. The plan is to eventually kill it (after migrating content, of course).

This was part of the plan that we came up with in the sync-up call. I agree that it's not a crucial point, but having this will avoid some confusion in case users are not aware of the new functionality.

@ayufan @fabio @markpundsack It seems the proposals conflict with each other. Could you settle the following items for 10.1?

Do we need a form for handling existing k8s clusters? (Note: this is just a replica of Setting > Integration > k8s)
Do we support soft delete? (https://gitlab.com/gitlab-org/gitlab-ce/issues/35954#note_41328740)
Do we disable the Kubernetes integration page when cluster integration is set up?

/cc @cperessini

mentioned in merge request !14496

@bikebilly sounds good to me. That's very similar to https://gitlab.com/gitlab-org/gitlab-ce/issues/35954#note_41335028, except you want to remove the cluster details once the cluster has been created, correct?

Following @markpundsack's comment, we should show the project namespace field, though.

What information do we want to show to those who have fewer permissions than Master?

@bikebilly I like your proposal of cutting the scope. Internally, once the cluster is created, we will fill the KubernetesService with the cluster details, as this is the MVP of that change.
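A minimal sketch of that hand-off, assuming the cluster details come from the GKE cluster resource (its `endpoint` and `masterAuth` fields). The output keys (`api_url`, `ca_pem`, `username`, `password`) are illustrative names chosen for the example; the actual fields on KubernetesService may differ.

```ruby
require "base64"

# Hypothetical mapping from a GKE cluster resource (parsed JSON) to the
# connection details a Kubernetes integration would need.
# clusterCaCertificate arrives base64-encoded, so it is decoded to PEM here.
def kubernetes_connection_details(cluster)
  auth = cluster.fetch("masterAuth", {})
  {
    api_url: "https://#{cluster.fetch('endpoint')}",
    ca_pem: Base64.decode64(auth.fetch("clusterCaCertificate", "")),
    username: auth["username"],
    password: auth["password"]
  }
end
```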

Following @markpundsack's comment, we should show the project namespace field, though.

@cperessini @bikebilly You can cut that scope too. That was more an exploration of what information we should show for non-masters, not what we need to show for first iteration.

@ayufan If the easiest technical thing is to fill in the KubernetesService after creating the cluster, that sounds fine. But @dosuken123 is only getting back username/password from GCP currently and is looking to augment the service with those fields. I think that is complicating things and feels like the wrong direction. I don't know the technical details, so feel free to ignore me, but I imagine that you're supposed to use the username/password to connect to the k8s cluster and then fetch the default credentials and fill those values into the KubernetesService. Thus we have parity with the existing flow.

We've talked about eventually creating our own creds with appropriate scope just for the project and passing those limited creds to the runner. We can keep the username/password in GitLab core, but not pass it to the runner. With that eventual goal, it seems the wrong direction to start passing username/password today.

We've talked about eventually creating our own creds with appropriate scope just for the project and passing those limited creds to the runner. We can keep the username/password in GitLab core, but not pass it to the runner. With that eventual goal, it seems the wrong direction to start passing username/password today.

I find it complicated to do today: implementing fetching the default secret from the Kubernetes cluster and using it as one-off credentials. The current proposal is to allow using username/password. This leaves us room to improve security later by generating one-off credentials that would be used by the runner and attached to the cluster.
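To make the "fetch the default secret" option concrete, here is a small sketch of the extraction step only. It assumes the secrets of the `default` namespace have already been fetched (for example from `GET /api/v1/namespaces/default/secrets`, authenticated with the username/password) and parsed into a Hash, and that the cluster auto-creates service-account token secrets. The helper name is hypothetical.

```ruby
require "base64"

SERVICE_ACCOUNT_TOKEN_TYPE = "kubernetes.io/service-account-token"

# Hypothetical helper: picks the first service-account token out of a
# parsed Kubernetes SecretList. Secret data is base64-encoded, so the
# token is decoded before it is returned. Returns nil when no such
# secret is present.
def default_service_account_token(secret_list)
  secret = secret_list.fetch("items", []).find do |item|
    item["type"] == SERVICE_ACCOUNT_TOKEN_TYPE && item.dig("data", "token")
  end
  secret && Base64.decode64(secret.dig("data", "token"))
end
```

The extraction itself is small; the added complexity discussed here is in making the authenticated call and handling clusters that are not ready yet.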

@bikebilly

I don't see any security problem showing cluster name and project namespace, so let's keep the same view for non-masters. Only limitation is on actions (backend also will check).

Today "Project namespace" is readable only by masters. Do we lose security?
GCP project, zone, and cluster name are sensitive information that can be used for deployment. Do we want to expose them to developers?
This authentication flow should have two steps: first, checking the user's role in GitLab, and second, checking that the user can read the target cluster in GKE. Masters who don't have access to the cluster should not see anything, IMO.
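The two-step check proposed above could be sketched as follows. This is an illustration of the proposal, not agreed behavior: the method name and its parameters are hypothetical, and the numeric value for Master is assumed to be 40.

```ruby
# Assumed numeric access level for the Master role.
MASTER_ACCESS = 40

# Hypothetical two-step check: the user must be at least a Master in the
# GitLab project, and must also be able to read the target cluster in GKE.
def can_view_cluster_details?(gitlab_access_level:, gke_cluster_readable:)
  gitlab_access_level >= MASTER_ACCESS && gke_cluster_readable == true
end
```

With this rule, a Master without access to the cluster on GKE sees nothing, and a Developer sees nothing regardless of GKE access.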

I'm checking if we can get a token from the k8s API.

changed the description

I just updated the description with final designs. Assigning UX ready

@dosuken123 for the GCP project field I made the placeholder Project ID instead of Project name. We require the ID, correct?

I think this is out of scope now, but I made these two designs because they were part of the original conversation:

While the cluster is being created on GKE, a banner is shown in the View page.
If the user does not close the page, that message changes once the creation succeeds. No need to show this message if the user visits the page after the creation has succeeded.

Mockups: Creating | Created

added UX ready and removed UX labels

We should avoid taking a username and password for a google cloud account. That comes with avoidable risks - we need to keep the credentials secure, and the costs are high if they are lost.

If I'm reading Google's documentation correctly: https://cloud.google.com/compute/docs/api/how-tos/authorization

They recommend we use "Application default credentials" for creating the GKE cluster and interacting with it. This requires the user to generate a service account and give us the token directly.

An alternative would be to use OAuth instead of taking a username+password. With the permissions that grants us, we can generate the service account ourselves and store the token for that.

Hmm. We could fetch application default credentials from Kubernetes cluster after it is created.

One last thing I'm not sure we already discussed: do you think we need to hide the cluster name when the integration is disabled? It could be useful to show it, so you can see which cluster is assigned to the project without enabling it.

I think either option makes sense. If you think there are situations where it may be useful, let's include it. I'll add it to the mockup

changed the description

Edited the description with the changes from https://gitlab.com/gitlab-org/gitlab-ce/issues/35954#note_41801654

When entering the cluster information, each item should have a link (to list, or to documentation about what it means).

Adding a link next to each field name would overload the page too much. How about including a single link to a help page at the top? That way we can also include help information for the fields that need to link to a list.

mentioned in issue #34834 (moved)

mentioned in issue #3691 (moved)

Ideally it should be a form to select an existing project or create one, but for now this is the only way to do it.

changed the description

Edited the description so all mockups have the latest copy from https://gitlab.com/gitlab-org/gitlab-ce/issues/35954#note_41685798

Added ID to the GCP Project field name per https://gitlab.com/gitlab-org/gitlab-ce/issues/35954#note_42134744

Added a mockup for the situation where OAuth is not configured in the instance. It represents the solution that was reached with this discussion: https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/14470#note_42104092

marked this issue as related to #38668

When the user clicks the Remove integration button, we should show a browser alert to give the user a chance to confirm the destructive action. The text for the alert will be:

Are you sure you want to remove cluster integration from this project?
This will not delete your cluster on Google Container Engine

I spoke with @filipa and she said this is easy to do.

@dosuken123 Once the user clicks OK and the action succeeds, the page should refresh and become a creation form again. According to @filipa, we'd need a backend redirect for that, would that be possible? It seems we are currently redirecting the user to the login page, which is not ideal, but it's okay if there's no time to change it.

If possible, it would be good to include banners informing the user of success or failure:

Success: A blue banner across the top with the message Cluster integration was removed
Failure: A red banner across the top with the message Something went wrong while removing cluster integration

@dosuken123 Once the user clicks OK and the action succeeds, the page should refresh and become a creation form again. According to @filipa, we'd need a backend redirect for that, would that be possible? It seems we are currently redirecting the user to the login page, which is not ideal, but it's okay if there's no time to change it.

@cperessini we already redirect to creation form.

marked this issue as related to #38728 (closed)

This is missing and I am not sure if we'll have time:

If possible, it would be good to include banners informing the user of success or failure:

Success: A blue banner across the top with the message Cluster integration was removed

Failure: A red banner across the top with the message Something went wrong while removing cluster integration

This is missing and I am not sure if we'll have time

@filipa I don't think it's a dealbreaker if we can't include the banners. Users will be able to infer what happened, but it'd be nice to be explicit.

If you had time to do one of the two, the failure banner is more important.

@cperessini both done

You are awesome @filipa!

Should I start UX review?

@cperessini yes please! I will add screenshots asap! Please check slack channel to know how to run this locally ;) you need to run some migrations :)

mentioned in merge request !14712

closed via commit fb70fada

closed via merge request !14470 (merged)

mentioned in commit fb70fada

