We also send 'Authorization' headers but I think Sentry filters those on the server side. That is still not good because it means we hand over unencrypted user passwords to Sentry. Edit: it looks like the Ruby Sentry client scrubs 'Authorization' client-side, before sending it to Sentry.
In the case of gitlab.com this is bad, but at least we 'own' the Sentry server (we run an 'on premises' version). For other GitLab installations that integrate with Sentry.io the problem is worse, because they have been sending passwords and API tokens (which are equivalent to passwords) to a third-party SaaS.
@stanhu I see some code in ApplicationController that accepts the Private-Token header, yet I see nothing that scrubs it. So I am not sure if this is an API-only problem.
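For illustration, this is a hypothetical sketch of the kind of filter I mean; the names and structure are assumptions, not the actual GitLab code:

```ruby
# Hypothetical sketch: if the regular Rails controllers accept the
# Private-Token header like this, the leak is not limited to the Grape API.
class ApplicationController < ActionController::Base
  before_action :authenticate_user_from_private_token!

  private

  def authenticate_user_from_private_token!
    token = params[:private_token].presence || request.headers['PRIVATE-TOKEN'].presence
    return unless token

    user = User.find_by(authentication_token: token)
    sign_in(user, store: false) if user
  end
end
```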
@stanhu from working with the raven-go client I know that Sentry likes to add an 'http context' to each message it sends out. This is a separate thing from the embedded exception. I think the exception rarely contains HTTP headers.
This cat is already sort of out of the bag because I have some of the required scrubbing code sitting in a public gitlab-workhorse MR.
What we need is a hook that lets us delete headers from the HTTP context that Sentry builds. Then do a security release, and nuke all Sentry 'projects' that contain this data. Perhaps even nuke and secure-erase the entire sentry.gitlap.com server.
The client scrubs the HTTP “Authorization” header of requests before sending them to Sentry, to prevent sensitive credentials from being sent. You can specify additional HTTP headers to ignore:
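For reference, in our case the initializer change would presumably look something along these lines (a minimal sketch, assuming the raven-ruby sanitize_http_headers option; the header list is illustrative):

```ruby
# config/initializers/sentry.rb (sketch; DSN setup omitted)
Raven.configure do |config|
  # Scrub the GitLab API token client-side, in addition to the
  # 'Authorization' header that raven-ruby already scrubs by default.
  config.sanitize_http_headers = %w[Private-Token]
end
```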
- this should be an 8.12 security patch release (cc @rymai)
- code fix is simple; blacklist headers in sentry initializer
- needs a clear discussion of the impact in the blog post
- we may choose not to delete any data at all because we control sentry.gitlap.com
- options for others depend on whether they use sentry.io or self-hosted, and on whether they mind losing exception history: all we can do is inform and explain the options
I will prepare the code patch and a blurb for the blog post. Help with the full release process welcome.
This is what I have so far for the explanatory blurb.
This vulnerability only affects GitLab instances that use Sentry exception tracking. This feature is off by default in GitLab.
As a GitLab administrator you have the option to integrate your GitLab instance with Sentry, an external exception tracking system. When this feature is enabled, you can see details of each error ('500 page') that occurs on your GitLab server. These details include the HTTP headers of the request that experienced the exception. Prior to GitLab 8.12.X, when an exception occurred in the GitLab API (a URL starting with /api/v3/), GitLab would inadvertently include the 'Private-Token' header used to authenticate with the GitLab API in the error report sent to Sentry. This means that when you view a Sentry error report for an exception that occurred during a GitLab API request, you can see the API token of the user that performed the request. It also means that if there is a data breach at your Sentry server, GitLab user API tokens may be exposed.
The holder of the API token for a GitLab user can impersonate that user in GitLab via the API. That includes writing comments, adding SSH keys, and creating repositories. The holder of the API token of a GitLab administrator can do much more, for instance create new user accounts. Individual GitLab users can reset their API token at gitlab.example.com/profile/account, but there is no way for administrators to do a full reset of all user API tokens on their GitLab server. GitLab API tokens also never expire.
From what I've read so far, it feels irresponsible to suggest not nuking the exception history. The best response would be to invalidate all API tokens, but this is invasive for users and I am not sure how to do it (an SQL UPDATE to null the column?). Leaving an unknown number of never-expiring API tokens, including admin tokens, just lying around on the Sentry server because we own the Sentry server and we think it's safe... doesn't feel right to me.
Also, is it really that bad to lose exception history? We usually care most about what is going on right now or in the past few days. Do we really look back more than a few days to see what exceptions there are?
I will have a look tomorrow at nulling the private token column.
What I am hoping is that if we null the private token column then when a user visits their 'account settings' page a new token is generated automatically. If we are unlucky we get a 500 error instead.
Generating new tokens for each individual user is not going to fly.
My apologies for misreading your question. The writeup looks good. From what I've read I think it's safe to assume that anyone who had this feature enabled suffered leakage of their API tokens and should reset them. Between nuking the exception history and nulling the tokens I'd lean towards nulling the tokens. The exception history is out there and the tokens may have leaked already.
Is it possible to create a rake task or a simple SQL script that could be downloaded for those who wish to go with the null token option?
> Between nuking the exception history and nulling the tokens I'd lean towards nulling the tokens. The exception history is out there and the tokens may have leaked already.
I agree.
> Is it possible to create a rake task or a simple SQL script that could be downloaded for those who wish to go with the null token option?
Thanks @briann, @rymai. Will investigate now if nulling tokens is a viable option.
Edit:
Looks like this is OK for the user:
- Nice message and 'generate token' button instead of a 500 error on /profile/account
- When I update some other user attribute a token is generated automatically (so the NULL is not preventing SQL updates to the user because of some failing validation)
- API requests fail with a 401 when the token is null, which is an appropriate response
HOWEVER. It seems there are some front-end features which rely on the User.private_token, which is an alias for User.authentication_token. When that value is null, those features might stop working (until the user gets a new token).
@rymai the question is when does that token get generated. I think a lot of people don't even know this token exists. So for users who do use the atom feed, it would just mysteriously stop working.
If we null the authentication tokens on gitlab.com then for the majority of users the Atom feed breaks, and it will only fix itself once the user triggers something that causes validations to be run.
@jacobvosmaer-gitlab Good points! :/ Do we have an acceptable solution other than regenerating a new token for every user (this could/should be done in background jobs)?
Yay! So the user impact of nulling the users.authentication_token column should be acceptable.
For small GitLab servers, nulling that column is trivial. For gitlab.com however it is a problem; running User.update_all(authentication_token: nil) is probably going to freeze the database in some bad way.
@jacobvosmaer-gitlab What would be the benefit of doing this in a worker instead of doing this in a Rake task directly? I think the latter would be simpler while achieving the same results.
To figure out if we can scrub the Sentry data, I looked at the Sentry DB schema and code base briefly (thankfully, my Django experience pays off here).
It looks like it won't be a simple task of running a SQL UPDATE because Sentry stores this data in the sentry_message table in a column that is gzipped (possibly as a JSON or Python dictionary):
@yorickpeterse I think the idea was to use .find_in_batches, in a Rake task, indeed.
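A rough sketch of what such a Rake task could look like (the task name, file path and batch size are assumptions, not a final implementation):

```ruby
# lib/tasks/gitlab/reset_authentication_tokens.rake (hypothetical)
namespace :gitlab do
  desc 'Null all user authentication tokens, in batches'
  task reset_authentication_tokens: :environment do
    # Batching avoids one huge UPDATE that could lock the users table
    # for a long time on a large instance like gitlab.com.
    User.select(:id).find_in_batches(batch_size: 1_000) do |batch|
      User.where(id: batch.map(&:id)).update_all(authentication_token: nil)
      print '.'
    end
    puts ' done'
  end
end
```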
@stanhu Thanks! Although, if we choose to advertise the "nuke all the tokens in the GitLab DB" option, we shouldn't need to scrub the data in the Sentry DB?
Btw, I merged security into master after yesterday's security release; my bad, the fix for this issue was already merged to security, so now it's in master... That said, I believe this issue is not directly exploitable, so I now think we should try to release it either today, tomorrow, or early next week. Note for the future: we should not merge security into master, but only cherry-pick the security commits into master once they are released as part of a tagged release. I will update the security guide.