"One of the things we like least about GitLab is their REST API. It’s awful to use. We originally had a bot in our Slack channel that was capable of triggering builds remotely. After our migration to using Environments to orchestrate deployments within our pipelines, we realised that the only way to manage the builds would be via the UI.
I believe the GitLab team are currently working on an improved version of the pipeline API, but don’t quote me on that. Improved and easier-to-use APIs are certainly an area where we feel GitLab could improve."
Solution:
Start using the API for all XHR requests
Have Sidekiq use the API too, instead of directly using the DB
One of the things we like least about GitLab is their REST API. It’s
awful to use.
I agree: compared with GitHub's API, for example, we lag behind a lot.
Start using the API for all XHR requests
I had proposed this in #21049 (closed) already. Currently, we have an "internal"
API for the frontend which is more focused on the data that the frontend
views need.
I think !2397 would improve our API a lot already because we would then
have a standardized way of documenting it, the option to offer client
libraries for different languages and also the ability to try out API
calls from the Swagger UI.
I always thought it strange that we rely on custom endpoints for XHR requests on the site rather than consuming our own API.
The objections seemed to be pertaining to security concerns and not wanting to expose API tokens to the frontend code but this is a solvable problem. We could create transient, session-based tokens to use from the frontend that can be made valid for API requests. Surely this is better than duplicating work by effectively creating multiple endpoints that do the same thing.
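The "transient, session-based tokens" idea above could work roughly like a signed, expiring credential. Here is a minimal sketch in plain Ruby, assuming an HMAC over the session ID plus an expiry timestamp; all names (`issue_frontend_token`, `SECRET`, etc.) are illustrative, not actual GitLab code:

```ruby
require "openssl"

# In reality this would be derived from the Rails secret_key_base.
SECRET = "server-side-secret"

# Issue a short-lived token bound to the session: "session_id:expires_at:signature".
def issue_frontend_token(session_id, ttl: 900, now: Time.now.to_i)
  expires_at = now + ttl
  payload = "#{session_id}:#{expires_at}"
  sig = OpenSSL::HMAC.hexdigest("SHA256", SECRET, payload)
  "#{payload}:#{sig}"
end

# Accept the token only if the signature matches and it has not expired.
def valid_frontend_token?(token, now: Time.now.to_i)
  session_id, expires_at, sig = token.split(":", 3)
  return false if session_id.nil? || expires_at.nil? || sig.nil?
  expected = OpenSSL::HMAC.hexdigest("SHA256", SECRET, "#{session_id}:#{expires_at}")
  sig == expected && now < expires_at.to_i
end
```

The frontend would receive such a token when the page is rendered and attach it to API requests, so no long-lived personal access token is ever exposed to the browser.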
I wonder if we should consider using all the exposed APIs instead of having duplicates in the Rails controllers:
1. Less code to maintain
2. More useful API for users
douwe:
@stanhu We did that originally, but they weren’t optimized to give us just the attributes we needed
And that too had the versioning problem
remy:
My 2 cents: I like the non-versioned API, and it’s convenient from a developer perspective to be able to iterate on a rolling basis (i.e. not changing everything only on major release).
@stanhu That’s something that Dailymotion is/was doing, but as a consumer you could choose which fields to query (Graph API), and caching was done per field on the backend. That’s actually something that I really like, but as @douwe said, we’re far from being able to do that. Still, IMO we shouldn’t forget the idea.
I agree 100%. I don't know about the previous decision against it or why we are not doing it, but we should get to the point of using our own API everywhere. It also helps decouple BE and FE work a lot for future issues. The possibility of having a GraphQL format should also be considered, as this would give us so much freedom in building stuff.
As @stanhu points out, we've talked about this before, and it's not so easy.
Using specialized endpoints instead of the generic API means that we get exactly the data we need and nothing more, which reduces payload size, improves performance, and avoids needing multiple requests to fetch an object and its related data, which a specialized endpoint can return in one request. It also means that we can tweak and optimize certain endpoints at will for our specific usage without needing to keep incrementing the global API version and maintain backward compatibility.
GraphQL is really cool, but also non-trivial to implement if you want to cache certain things and prevent the N+1 query problem.
It also means that we can tweak and optimize certain endpoints at will for our specific usage without needing to keep incrementing the global API version and maintain backward compatibility.
We can maintain a non-stable, next-API-version-preview that we use in the frontend if breaking changes are necessary. Once we have tweaked it, it can become the stable API.
That wouldn't be too different from the current situation, except that everything would happen at /api/....
Using specialized endpoints instead of the generic API means that we get exactly the data we need and nothing more
We can achieve this by using subresources or parameters. For example /users/:id vs. /users/:id/short or /users/:id?format=short.
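The subresource/parameter idea above can be sketched in a few lines: one endpoint, two representations, selected by a `format` parameter. The field lists and the `user_payload` helper are made up for illustration:

```ruby
# Fields returned by the default and the "short" representation (hypothetical).
FULL_FIELDS  = %i[id username name email bio last_activity].freeze
SHORT_FIELDS = %i[id username name].freeze

# Build the response payload for /users/:id, honoring ?format=short.
def user_payload(user, format: "full")
  fields = format == "short" ? SHORT_FIELDS : FULL_FIELDS
  user.select { |key, _| fields.include?(key) }
end
```

A frontend view that only renders avatars and names would request `format=short` and never pay for serializing `bio` or `last_activity`.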
The GitHub API uses custom content types to define the amount of data to receive and for versioning only single endpoints rather than the whole API.
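In that GitHub style, both the API version and the desired representation live in a vendor media type such as `application/vnd.gitlab.v4.short+json`. A GitLab media type like this does not actually exist; the parser below is only a sketch of how such a header could be interpreted:

```ruby
# Hypothetical vendor media type: application/vnd.gitlab.v<N>[.<variant>]+json
MEDIA_TYPE = %r{\Aapplication/vnd\.gitlab\.v(?<version>\d+)(?:\.(?<variant>\w+))?\+json\z}

# Parse an Accept header into version + representation, with assumed defaults.
def parse_accept(header)
  match = MEDIA_TYPE.match(header)
  return { version: 4, variant: "full" } unless match
  { version: match[:version].to_i, variant: match[:variant] || "full" }
end
```

This is what lets single endpoints evolve independently: a client pinning `v4.short` keeps working even if `v5` changes the full representation.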
One thing that would improve both developing for the API and using the API would be scheduling !2397 to be finished. Some time ago I asked a former product manager, but no luck.
I know auto generated API docs are not sexy for the release post, but it will help all parties to have consistent docs, without missing keys, or too many keys.
On GraphQL, I remember @razer6 was working on a WIP MR, but that has not surfaced yet. It's not trivial, but looking at how many of our endpoints have N+1 query problems now, it might be no worse than what we're doing now.
The custom media type header seems like a good intermediate step before going to a full-blown GraphQL implementation.
Correct me if I'm wrong, but it seems to me one issue is that the API endpoints don't use the same session tokens that we currently use. Is the first step to do that?
Then perhaps we should start with the most-used API endpoints, which appears to be the merge_check endpoint for the MR widget.
@nick.thomas Sorry that I don't remember the details, but could we use the cookie to access the API now? Or was that removed because of some security concern?
Correct me if I'm wrong, it seems to me one issue is that the API endpoints don't use the same session tokens that we currently use.
@stanhu Sorry if I misunderstand something here (I'm really bad at tokenizing): is the CSRF token the one you are talking about? It gets automatically passed around by the jquery-rails gem we are using.
Also if it answers your question better: We are already using API endpoints in the frontend. Please see api.js.
The session cookie is automatically passed by the browser.
We can use it for GET and HEAD requests, but not currently for PUT, POST, DELETE, PATCH, etc, due to CSRF concerns.
We can pass the rails CSRF token along with the API request and validate it for requests made using session cookie authentication if we want to add support for those methods.
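The check described above can be sketched in plain Ruby, assuming cookie-authenticated requests send the Rails CSRF token in an `X-CSRF-Token` header. The method names are illustrative, not the actual GitLab helpers:

```ruby
# Compare the session's CSRF token with the one sent in the request header.
def csrf_token_valid?(session_token, header_token)
  return false if session_token.nil? || header_token.nil?
  return false unless session_token.bytesize == header_token.bytesize
  # Constant-time comparison, to avoid leaking the token via timing.
  diff = 0
  session_token.bytes.zip(header_token.bytes) { |a, b| diff |= a ^ b }
  diff.zero?
end

# Only cookie-authenticated, state-changing requests need the CSRF check;
# token-based authentication (PAT, OAuth) is unaffected, as noted above.
def verify_api_request!(authenticated_via_cookie:, session_token:, header_token:, method:)
  return true unless authenticated_via_cookie
  return true if %w[GET HEAD].include?(method)
  csrf_token_valid?(session_token, header_token) || raise("CSRF token mismatch")
end
```

This keeps the other authentication paths working without a CSRF token, which is the caveat raised later in the thread.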
Consuming the public API can be awesome and leads to unexpected benefits, such as making it easy to have review apps that change the UI but consume production data. But to do it right, you basically still need two focuses: a large set of unknown users (the public API) and a small set of known users (our frontend devs). Sometimes that means different APIs, or an overlapping set of APIs with some distinct APIs only used by one side. GraphQL certainly tries to solve some of that.
I really like the idea of using the API for our UI. The problem is development speed and API compatibility. Back then, it was simply not an option to move fast and have a stable API. Now that we keep a separate v3 API for compatibility, we can be more aggressive with the edge API.
Solution:
Start using the API for all XHR requests
Good one. Not sure about all, but we can start by using the API for XHR requests that require minimal API adaptation, because we don't want GitLab-UI-specific code in the API:
```ruby
def comments
  if params[:some_gitlab_ui_argument]
    # ...
  else
    # ...
  end
end
```
Have Sidekiq use the API too, instead of directly using the DB
I don't think so. Sidekiq workers depend heavily on GitLab classes and ActiveRecord. Basically, many of our workers look like this:
```ruby
class Worker
  def perform(data)
    Service.new(data).execute
  end
end
```
Rewriting Worker or Service to use the API and parse JSON, instead of using application code directly, seems dubious to me. The main benefit of having Sidekiq use the Rails environment is the ability to write a minimal amount of code in Worker and pass everything to application code. If we write workers that use the API and are self-sufficient, it makes little sense to load the Rails environment inside them.
@dzaporozhets This is what GraphQL would solve: endpoints could request only the things they need. I think that's the right solution. What are the complications with starting to move more of our endpoints to GraphQL?
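The GraphQL property being pointed at here is that the client names the fields it wants and the server resolves only those. This toy illustration (deliberately not graphql-ruby, and with made-up field names) shows the mechanic: an expensive field is only computed when explicitly requested:

```ruby
# Per-field resolvers; "notes_count" stands in for an expensive computation
# that a fixed REST payload would pay for on every request.
RESOLVERS = {
  "title"       => ->(issue) { issue[:title] },
  "author"      => ->(issue) { issue[:author] },
  "notes_count" => ->(issue) { issue[:notes].size },
}.freeze

# Resolve only the fields the client asked for; unknown fields are ignored.
def resolve(issue, requested_fields)
  requested_fields.each_with_object({}) do |field, result|
    resolver = RESOLVERS[field]
    result[field] = resolver.call(issue) if resolver
  end
end
```

The caching and N+1 concerns raised earlier in the thread live inside those resolver lambdas, which is exactly why a real implementation is non-trivial.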
We can use it for GET and HEAD requests, but not currently for PUT, POST, DELETE, PATCH, etc, due to CSRF concerns.
We can pass the rails CSRF token along with the API request and validate it for requests made using session cookie authentication if we want to add support for those methods.
Are we fine with improving the API to validate requests by CSRF token?
Let's make sure we don't confuse terms here: I think you mean CSRF verification, not CSRF authentication. Authentication should still be handled by the existing mechanisms (see https://gitlab.com/gitlab-org/gitlab-ce/blob/v9.2.2/lib/api/helpers.rb#L334-340 for more details). I think a separate issue needs to be created for this.
@blackst0ne it's perfectly safe to add, the only thing to be aware of is that the other forms of authentication should continue to work without the CSRF token.
I wonder if we should implicitly add processing of HEAD requests to our API endpoints, for cases where a user just wants to check whether a resource is available (branch exists, issue exists, etc.) without loading any information. 🤔
It would be great to check in logs how many HEAD requests are processed by the API endpoints on gitlab.com.
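In plain Rack terms, the implicit-HEAD idea would mean answering a HEAD request like the matching GET but dropping the body, so an existence check costs no serialization. This sketch uses a hard-coded branch list as a stand-in for a real repository lookup:

```ruby
require "json"

# Stand-in for a real branch lookup (hypothetical data).
BRANCHES = %w[master develop].freeze

# Rack-style triple: [status, headers, body]. HEAD gets the status only.
def branch_endpoint(method, name)
  status = BRANCHES.include?(name) ? 200 : 404
  body = method == "HEAD" ? "" : { name: name, exists: status == 200 }.to_json
  [status, { "Content-Type" => "application/json" }, body]
end
```

A client checking "does this branch exist?" would issue `HEAD /branches/master` and look only at the status code.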
I don't know a lot about GraphQL, but I don't think it will directly address the point of this ticket, which is that whatever API we have, the GitLab interface should be using it, to ensure it is performant (which the current API is not in many cases: loading a project list can take half a minute with only 100 projects). I suppose we could open a new ticket saying the interface should use the new GraphQL API when it exists, but for now I think this is still relevant. Or maybe, as a simpler solution, it would make more sense to just open tickets to make each API endpoint performant! In fact, I will go do that for the project list interface right now if I can't find one ;)
GitLab is moving all development for both GitLab Community Edition
and Enterprise Edition into a single codebase. The current
gitlab-ce repository will become a read-only mirror, without any
proprietary code. All development is moved to the current
gitlab-ee repository, which we will rename to just gitlab in the
coming weeks. As part of this migration, issues will be moved to the
current gitlab-ee project.
If you have any questions about all of this, please ask them in our
dedicated FAQ issue.
Using "gitlab" and "gitlab-ce" would be confusing, so we decided to
rename gitlab-ce to gitlab-foss to make the purpose of this FOSS
repository clearer.
I created a merge request for CE, and it got closed. What do I
need to do?
Everything in the ee/ directory is proprietary. Everything else is
free and open source software. If your merge request does not change
anything in the ee/ directory, the process of contributing changes
is the same as when using the gitlab-ce repository.
Will you accept merge requests on the gitlab-ce/gitlab-foss project
after it has been renamed?
No. Merge requests submitted to this project will be closed automatically.
Will I still be able to view old issues and merge requests in
gitlab-ce/gitlab-foss?
Yes.
How will this affect users of GitLab CE using Omnibus?
No changes will be necessary, as the packages built remain the same.
How will this affect users of GitLab CE that build from source?
Once the project has been renamed, you will need to change your Git
remotes to use this new URL. GitLab will take care of redirecting Git
operations so there is no hard deadline, but we recommend doing this
as soon as the projects have been renamed.
Where can I see a timeline of the remaining steps?