"One of the things we like least about GitLab is their REST API. It’s awful to use. We originally had a bot in our Slack channel that was capable of triggering builds remotely. After our migration to using Environments to orchestrate deployments within our pipelines, we realised that the only way to manage the builds would be via the UI.
I believe the GitLab team are currently working on an improved version of the pipeline API, but don’t quote me on that. Improved and easier-to-use APIs are certainly an area where we feel GitLab could improve."
Solution:
Start using the API for all XHR requests
Have Sidekiq use the API too, instead of directly using the DB
One of the things we like least about GitLab is their REST API. It’s
awful to use.
I agree: compared with GitHub's API, for example, we lag behind a lot.
Start using the API for all XHR requests
I had proposed this in #21049 (closed) already. Currently, we have an "internal"
API for the frontend which is more focused on the data that the frontend
views need.
I think !2397 would improve our API a lot already because we would then
have a standardized way of documenting it, the option to offer client
libraries for different languages and also the ability to try out API
calls from the Swagger UI.
I always thought it strange that we rely on custom endpoints for XHR requests on the site rather than consuming our own API.
The objections seemed to be pertaining to security concerns and not wanting to expose API tokens to the frontend code but this is a solvable problem. We could create transient, session-based tokens to use from the frontend that can be made valid for API requests. Surely this is better than duplicating work by effectively creating multiple endpoints that do the same thing.
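The "transient, session-based tokens" idea above could work roughly like a signed, expiring credential. Here is a minimal sketch in plain Ruby, assuming an HMAC over the session ID plus an expiry timestamp; all names (`issue_frontend_token`, `SECRET`, etc.) are illustrative, not actual GitLab code:

```ruby
require "openssl"

# In reality this would be derived from the Rails secret_key_base.
SECRET = "server-side-secret"

# Issue a short-lived token bound to the session: "session_id:expires_at:signature".
def issue_frontend_token(session_id, ttl: 900, now: Time.now.to_i)
  expires_at = now + ttl
  payload = "#{session_id}:#{expires_at}"
  sig = OpenSSL::HMAC.hexdigest("SHA256", SECRET, payload)
  "#{payload}:#{sig}"
end

# Accept the token only if the signature matches and it has not expired.
def valid_frontend_token?(token, now: Time.now.to_i)
  session_id, expires_at, sig = token.split(":", 3)
  return false if session_id.nil? || expires_at.nil? || sig.nil?
  expected = OpenSSL::HMAC.hexdigest("SHA256", SECRET, "#{session_id}:#{expires_at}")
  sig == expected && now < expires_at.to_i
end
```

The frontend would receive such a token when the page is rendered and attach it to API requests, so no long-lived personal access token is ever exposed to the browser.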
I wonder if we should consider using all the exposed APIs instead of having duplicates in the Rails controllers:
1. Less code to maintain
2. More useful API for users
douwe:
@stanhu We did that originally, but they weren’t optimized to give us just the attributes we needed
And that too had the versioning problem
remy:
My 2 cents: I like the non-versioned API, and it’s convenient from a developer perspective to be able to iterate on a rolling basis (i.e. not changing everything only on major release).
@stanhu That’s something that Dailymotion is/was doing, but as a consumer you could choose which fields to query (Graph API), and caching was done per field on the backend. That’s actually something that I really like, but as @douwe said, we’re far from being able to do that. Still, IMO we shouldn’t forget the idea.
I agree 100%. I don't know about the previous decision against it or why we are not doing it, but we should get to the point of using our own API everywhere. It also helps decouple BE and FE work a lot for future issues. The possibility of having a GraphQL format should also be considered, as this would give us so much freedom in building stuff.
As @stanhu points out, we've talked about this before, and it's not so easy.
Using specialized endpoints instead of the generic API means that we get exactly the data we need and nothing more, which reduces payload size, improves performance, and avoids needing multiple requests to fetch an object and its related data, which a specialized endpoint can return in one request. It also means that we can tweak and optimize certain endpoints at will for our specific usage without needing to keep incrementing the global API version and maintain backward compatibility.
GraphQL is really cool, but also non-trivial to implement if you want to cache certain things and prevent the N+1 query problem.
It also means that we can tweak and optimize certain endpoints at will for our specific usage without needing to keep incrementing the global API version and maintain backward compatibility.
We can maintain a non-stable, next-API-version-preview that we use in the frontend if breaking changes are necessary. Once we have tweaked it, it can become the stable API.
That wouldn't be too different from the current situation, except that everything would happen at /api/....
Using specialized endpoints instead of the generic API means that we get exactly the data we need and nothing more
We can achieve this by using subresources or parameters. For example /users/:id vs. /users/:id/short or /users/:id?format=short.
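The subresource/parameter idea above can be sketched in a few lines: one endpoint, two representations, selected by a `format` parameter. The field lists and the `user_payload` helper are made up for illustration:

```ruby
# Fields returned by the default and the "short" representation (hypothetical).
FULL_FIELDS  = %i[id username name email bio last_activity].freeze
SHORT_FIELDS = %i[id username name].freeze

# Build the response payload for /users/:id, honoring ?format=short.
def user_payload(user, format: "full")
  fields = format == "short" ? SHORT_FIELDS : FULL_FIELDS
  user.select { |key, _| fields.include?(key) }
end
```

A frontend view that only renders avatars and names would request `format=short` and never pay for serializing `bio` or `last_activity`.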
The GitHub API uses custom content types to define the amount of data to receive and for versioning only single endpoints rather than the whole API.
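In that GitHub style, both the API version and the desired representation live in a vendor media type such as `application/vnd.gitlab.v4.short+json`. A GitLab media type like this does not actually exist; the parser below is only a sketch of how such a header could be interpreted:

```ruby
# Hypothetical vendor media type: application/vnd.gitlab.v<N>[.<variant>]+json
MEDIA_TYPE = %r{\Aapplication/vnd\.gitlab\.v(?<version>\d+)(?:\.(?<variant>\w+))?\+json\z}

# Parse an Accept header into version + representation, with assumed defaults.
def parse_accept(header)
  match = MEDIA_TYPE.match(header)
  return { version: 4, variant: "full" } unless match
  { version: match[:version].to_i, variant: match[:variant] || "full" }
end
```

This is what lets single endpoints evolve independently: a client pinning `v4.short` keeps working even if `v5` changes the full representation.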
One thing that would improve both developing for the API and using the API would be scheduling !2397 to be finished. Some time ago I asked a former product manager, but no luck.
I know auto generated API docs are not sexy for the release post, but it will help all parties to have consistent docs, without missing keys, or too many keys.
On GraphQL, I remember @razer6 was working on a WIP MR, but that has not surfaced yet. It's not trivial, but looking at how many of our endpoints have N+1 query problems now, it might be no worse than what we're doing now.
The custom media type header seems like a good intermediate step before going to a full-blown GraphQL implementation.
Correct me if I'm wrong, but it seems to me one issue is that the API endpoints don't use the same session tokens that we currently use. Is the first step to do that?
Then perhaps we should start with the most-used API endpoints, which appears to be the merge_check endpoint for the MR widget.
@nick.thomas Sorry that I don't remember the details, but could we use the cookie to access the API now? Or was that removed because of some security concern?
Correct me if I'm wrong, it seems to me one issue is that the API endpoints don't use the same session tokens that we currently use.
@stanhu Sorry if I misunderstand something here (I'm really bad at tokenizing): is the CSRF token the one you are talking about? It gets automatically passed around by the jquery-rails gem we are using.
Also if it answers your question better: We are already using API endpoints in the frontend. Please see api.js.
The session cookie is automatically passed by the browser.
We can use it for GET and HEAD requests, but not currently for PUT, POST, DELETE, PATCH, etc, due to CSRF concerns.
We can pass the rails CSRF token along with the API request and validate it for requests made using session cookie authentication if we want to add support for those methods.
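The check described above can be sketched in plain Ruby, assuming cookie-authenticated requests send the Rails CSRF token in an `X-CSRF-Token` header. The method names are illustrative, not the actual GitLab helpers:

```ruby
# Compare the session's CSRF token with the one sent in the request header.
def csrf_token_valid?(session_token, header_token)
  return false if session_token.nil? || header_token.nil?
  return false unless session_token.bytesize == header_token.bytesize
  # Constant-time comparison, to avoid leaking the token via timing.
  diff = 0
  session_token.bytes.zip(header_token.bytes) { |a, b| diff |= a ^ b }
  diff.zero?
end

# Only cookie-authenticated, state-changing requests need the CSRF check;
# token-based authentication (PAT, OAuth) is unaffected, as noted above.
def verify_api_request!(authenticated_via_cookie:, session_token:, header_token:, method:)
  return true unless authenticated_via_cookie
  return true if %w[GET HEAD].include?(method)
  csrf_token_valid?(session_token, header_token) || raise("CSRF token mismatch")
end
```

This keeps the other authentication paths working without a CSRF token, which is the caveat raised later in the thread.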
Consuming the public API can be awesome and leads to unexpected benefits, such as making it easy to have review apps that change the UI but consume production data. But to do it right, you basically still need two focuses: a large set of unknown users (the public API) and a small set of known users (our frontend devs). Sometimes that means different APIs, or an overlapping set of APIs with some distinct APIs only used by one side. GraphQL certainly tries to solve some of that.
I really like the idea of using the API for our UI. The problem is development speed and API compatibility. Back then, it was simply not an option to move fast and have a stable API. Now that we keep a separate v3 API for compatibility, we can be more aggressive with the edge API.
Solution:
Start using the API for all XHR requests
Good one. Not sure about all, but we can start by using the API for XHR requests that require minimal API adaptation, because we don't want GitLab-UI-specific code in the API:
```ruby
def comments
  if params[:some_gitlab_ui_argument]
    # ...
  else
    # ...
  end
end
```
Have Sidekiq use the API too, instead of directly using the DB
I don't think so. Sidekiq workers depend heavily on GitLab classes and ActiveRecord. Basically, many of our workers look like this:
```ruby
class Worker
  def perform(data)
    Service.new(data).execute
  end
end
```
Rewriting Worker or Service to use the API and parse JSON, instead of using application code directly, seems dubious to me. The main benefit of having Sidekiq use the Rails environment is the ability to write a minimal amount of code in Worker and pass everything to application code. If we write workers that use the API and are self-sufficient, it makes little sense to load the Rails environment inside them.
@dzaporozhets This is what GraphQL would solve: endpoints could request only the things they need. I think that's the right solution. What are the complications with starting to move more of our endpoints to GraphQL?
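The GraphQL property being pointed at here is that the client names the fields it wants and the server resolves only those. This toy illustration (deliberately not graphql-ruby, and with made-up field names) shows the mechanic: an expensive field is only computed when explicitly requested:

```ruby
# Per-field resolvers; "notes_count" stands in for an expensive computation
# that a fixed REST payload would pay for on every request.
RESOLVERS = {
  "title"       => ->(issue) { issue[:title] },
  "author"      => ->(issue) { issue[:author] },
  "notes_count" => ->(issue) { issue[:notes].size },
}.freeze

# Resolve only the fields the client asked for; unknown fields are ignored.
def resolve(issue, requested_fields)
  requested_fields.each_with_object({}) do |field, result|
    resolver = RESOLVERS[field]
    result[field] = resolver.call(issue) if resolver
  end
end
```

The caching and N+1 concerns raised earlier in the thread live inside those resolver lambdas, which is exactly why a real implementation is non-trivial.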
We can use it for GET and HEAD requests, but not currently for PUT, POST, DELETE, PATCH, etc, due to CSRF concerns.
We can pass the rails CSRF token along with the API request and validate it for requests made using session cookie authentication if we want to add support for those methods.
Are we fine with improving the API to validate requests by CSRF token?
Let's make sure we don't confuse terms here: I think you mean CSRF verification, not CSRF authentication. Authentication should still be handled by the existing mechanisms (see https://gitlab.com/gitlab-org/gitlab-ce/blob/v9.2.2/lib/api/helpers.rb#L334-340 for more details). I think a separate issue needs to be created for this.
@blackst0ne it's perfectly safe to add, the only thing to be aware of is that the other forms of authentication should continue to work without the CSRF token.
I wonder if we should implicitly add processing of HEAD requests to our API endpoints, for cases where a user just wants to check whether a resource is available (branch exists, issue exists, etc.) without loading any information. 🤔
It would be great to check in logs how many HEAD requests are processed by the API endpoints on gitlab.com.
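In plain Rack terms, the implicit-HEAD idea would mean answering a HEAD request like the matching GET but dropping the body, so an existence check costs no serialization. This sketch uses a hard-coded branch list as a stand-in for a real repository lookup:

```ruby
require "json"

# Stand-in for a real branch lookup (hypothetical data).
BRANCHES = %w[master develop].freeze

# Rack-style triple: [status, headers, body]. HEAD gets the status only.
def branch_endpoint(method, name)
  status = BRANCHES.include?(name) ? 200 : 404
  body = method == "HEAD" ? "" : { name: name, exists: status == 200 }.to_json
  [status, { "Content-Type" => "application/json" }, body]
end
```

A client checking "does this branch exist?" would issue `HEAD /branches/master` and look only at the status code.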
I don't know a lot about GraphQL, but I don't think it will directly address the point of this ticket, which is that whatever API we have, the GitLab interface should be using it, to ensure it is performant (which the current API is not in many cases: loading a project list can take half a minute with only 100 projects). I suppose we could open a new ticket saying the interface should use the new GraphQL API when it exists, but for now I think this is still relevant. Or maybe, as a simpler solution, it would make more sense to just open tickets to make each API endpoint performant! In fact, I will go do that for the project list interface right now if I can't find one ;)
GitLab is moving all development for both GitLab Community Edition
and Enterprise Edition into a single codebase. The current
gitlab-ce repository will become a read-only mirror, without any
proprietary code. All development is moved to the current
gitlab-ee repository, which we will rename to just gitlab in the
coming weeks. As part of this migration, issues will be moved to the
current gitlab-ee project.
If you have any questions about all of this, please ask them in our
dedicated FAQ issue.
Using "gitlab" and "gitlab-ce" would be confusing, so we decided to
rename gitlab-ce to gitlab-foss to make the purpose of this FOSS
repository clearer.
I created a merge request for CE, and it got closed. What do I
need to do?
Everything in the ee/ directory is proprietary. Everything else is
free and open source software. If your merge request does not change
anything in the ee/ directory, the process of contributing changes
is the same as when using the gitlab-ce repository.
Will you accept merge requests on the gitlab-ce/gitlab-foss project
after it has been renamed?
No. Merge requests submitted to this project will be closed automatically.
Will I still be able to view old issues and merge requests in
gitlab-ce/gitlab-foss?
Yes.
How will this affect users of GitLab CE using Omnibus?
No changes will be necessary, as the packages built remain the same.
How will this affect users of GitLab CE that build from source?
Once the project has been renamed, you will need to change your Git
remotes to use this new URL. GitLab will take care of redirecting Git
operations so there is no hard deadline, but we recommend doing this
as soon as the projects have been renamed.
Where can I see a timeline of the remaining steps?