We also send 'Authorization' headers but I think Sentry filters those on the server side. That is still not good because it means we hand over unencrypted user passwords to Sentry. Edit: it looks like the Ruby Sentry client scrubs 'Authorization' client-side, before sending it to Sentry.
In the case of gitlab.com this is bad, but at least we 'own' the Sentry server (we run an 'on premises' version). For other GitLab installations that integrate with Sentry.io the problem is worse, because they have been sending passwords and API tokens (which are equivalent to passwords) to a third-party SaaS.
@stanhu I see some code in ApplicationController that accepts the Private-Token header, yet I see nothing that scrubs it. So I am not sure if this is an API-only problem.
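For illustration, this is a hypothetical sketch of the kind of filter I mean; the names and structure are assumptions, not the actual GitLab code:

```ruby
# Hypothetical sketch: if the regular Rails controllers accept the
# Private-Token header like this, the leak is not limited to the Grape API.
class ApplicationController < ActionController::Base
  before_action :authenticate_user_from_private_token!

  private

  def authenticate_user_from_private_token!
    token = params[:private_token].presence || request.headers['PRIVATE-TOKEN'].presence
    return unless token

    user = User.find_by(authentication_token: token)
    sign_in(user, store: false) if user
  end
end
```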
@stanhu from working with the raven-go client I know that Sentry likes to add an 'http context' to each message it sends out. This is a separate thing from the embedded exception. I think the exception rarely contains HTTP headers.
This cat is already sort of out of the bag because I have some of the required scrubbing code sitting in a public gitlab-workhorse MR.
What we need is a hook that lets us delete headers from the HTTP context that Sentry builds. Then do a security release, and nuke all Sentry 'projects' that contain this data. Perhaps even nuke and secure-erase the entire sentry.gitlap.com server.
The client scrubs the HTTP “Authorization” header of requests before sending them to Sentry, to prevent sensitive credentials from being sent. You can specify additional HTTP headers to ignore:
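For reference, in our case the initializer change would presumably look something along these lines (a minimal sketch, assuming the raven-ruby sanitize_http_headers option; the header list is illustrative):

```ruby
# config/initializers/sentry.rb (sketch; DSN setup omitted)
Raven.configure do |config|
  # Scrub the GitLab API token client-side, in addition to the
  # 'Authorization' header that raven-ruby already scrubs by default.
  config.sanitize_http_headers = %w[Private-Token]
end
```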
- this should be an 8.12 security patch release (cc @rymai)
- code fix is simple; blacklist headers in sentry initializer
- needs a clear discussion of the impact in the blog post
- we may choose not to delete any data at all because we control sentry.gitlap.com
- options for others depend on whether they use sentry.io or self-hosted, and on whether they mind losing exception history: all we can do is inform and explain the options
I will prepare the code patch and a blurb for the blog post. Help with the full release process welcome.
This is what I have so far for the explanatory blurb.
This vulnerability only affects GitLab instances that use Sentry exception tracking. This feature is off by default in GitLab.
As a GitLab administrator you have the option to integrate your GitLab instance with Sentry, an external exception tracking system. When this feature is enabled, you can see details of each error ('500 page') that occurs on your GitLab server. These details include the HTTP headers of the request that experienced the exception. Prior to GitLab 8.12.X, when an exception occurred in the GitLab API (a URL starting with /api/v3/), GitLab would inadvertently include the 'Private-Token' header used to authenticate with the GitLab API in the error report sent to Sentry. This means that when you view a Sentry error report for an exception that occurred during a GitLab API request, you can see the API token of the user that performed the request. It also means that if there is a data breach at your Sentry server, GitLab user API tokens may be exposed.
The holder of the API token for a GitLab user can impersonate that user in GitLab via the API. That includes writing comments, adding SSH keys, and creating repositories. The holder of the API token of a GitLab administrator can do much more, for instance create new user accounts. Individual GitLab users can reset their API token at gitlab.example.com/profile/account, but there is no way for administrators to do a full reset of all user API tokens on their GitLab server. GitLab API tokens also never expire.
From what I've read so far, it feels irresponsible to suggest not nuking the exception history. The best response would be to invalidate all API tokens, but this is invasive for users and I am not sure how to do it (an SQL UPDATE to null the column?). Leaving an unknown number of never-expiring API tokens, including admin tokens, just lying around on the Sentry server because we own the Sentry server and we think it's safe... doesn't feel right to me.
Also, is it really that bad to lose exception history? We usually care most about what is going on right now or in the past few days. Do we really look back more than a few days to see what exceptions there are?
I will have a look tomorrow at nulling the private token column.
What I am hoping is that if we null the private token column then when a user visits their 'account settings' page a new token is generated automatically. If we are unlucky we get a 500 error instead.
Generating new tokens for each individual user is not going to fly.
My apologies for misreading your question. The writeup looks good. From what I've read I think it's safe to assume that anyone who had this feature enabled suffered leakage of their API tokens and should reset them. Between nuking the exception history and nulling the tokens I'd lean towards nulling the tokens. The exception history is out there and the tokens may have leaked already.
Is it possible to create a rake task or a simple SQL script that could be downloaded for those who wish to go with the null token option?
> Between nuking the exception history and nulling the tokens I'd lean towards nulling the tokens. The exception history is out there and the tokens may have leaked already.
I agree.
> Is it possible to create a rake task or a simple SQL script that could be downloaded for those who wish to go with the null token option?
Thanks @briann, @rymai. Will investigate now if nulling tokens is a viable option.
Edit:
Looks like this is OK for the user:
- Nice message and 'generate token' button instead of a 500 error on /profile/account
- When I update some other user attribute a token is generated automatically (so the NULL is not preventing SQL updates to the user because of some failing validation)
- API requests fail with a 401 when the token is null, which is an appropriate response
HOWEVER. It seems there are some front-end features which rely on the User.private_token, which is an alias for User.authentication_token. When that value is null, those features might stop working (until the user gets a new token).
@rymai the question is when does that token get generated. I think a lot of people don't even know this token exists. So for users who do use the atom feed, it would just mysteriously stop working.
If we null the authentication tokens on gitlab.com then for the majority of users the Atom feed breaks, and it will only fix itself once the user triggers something that causes validations to be run.
@jacobvosmaer-gitlab Good points! :/ Do we have an acceptable solution other than regenerating a new token for every user (this could/should be done in background jobs)?
Yay! So the user impact of nulling the users.authentication_token column should be acceptable.
For small GitLab servers, nulling that column is trivial. For gitlab.com however it is a problem; running User.update_all(authentication_token: nil) is probably going to freeze the database in some bad way.
@jacobvosmaer-gitlab What would be the benefit of doing this in a worker instead of doing this in a Rake task directly? I think the latter would be simpler while achieving the same results.
To figure out if we can scrub the Sentry data, I looked at the Sentry DB schema and code base briefly (thankfully, my Django experience pays off here).
It looks like it won't be a simple task of running a SQL UPDATE because Sentry stores this data in the sentry_message table in a column that is gzipped (possibly as a JSON or Python dictionary):
@yorickpeterse I think the idea was to use .find_in_batches, in a Rake task, indeed.
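A rough sketch of what such a Rake task could look like (the task name, file path and batch size are assumptions, not a final implementation):

```ruby
# lib/tasks/gitlab/reset_authentication_tokens.rake (hypothetical)
namespace :gitlab do
  desc 'Null all user authentication tokens, in batches'
  task reset_authentication_tokens: :environment do
    # Batching avoids one huge UPDATE that could lock the users table
    # for a long time on a large instance like gitlab.com.
    User.select(:id).find_in_batches(batch_size: 1_000) do |batch|
      User.where(id: batch.map(&:id)).update_all(authentication_token: nil)
      print '.'
    end
    puts ' done'
  end
end
```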
@stanhu Thanks! Although, if we choose to advertise the "nuke all the tokens in the GitLab DB" option, we shouldn't need to scrub the data in the Sentry DB?
Btw, I merged security into master after yesterday's security release; my bad, the fix for this issue was already merged to security, so now it's in master... That said, I believe this issue is not directly exploitable, so I now think we should try to release it either today, tomorrow, or early next week. Note for the future: we should not merge security into master, but only cherry-pick the security commits into master once they are released as part of a tagged release. I will update the security guide.