I agree with @ayufan on this: do not log/print passwords, ever (your log statement could output the first and the last character of the password if you are in doubt whether the right one is used).
And fiddling around with the console log did lead to problems in Jenkins in my experience.
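For what it's worth, the "first and last character" hint mentioned above could look roughly like this. It is only a sketch: `password_hint` and the length threshold are made up for illustration, and the hint deliberately hides the secret's length.

```python
def password_hint(secret: str, min_length: int = 8) -> str:
    """Return a log-safe hint: first and last character with a fixed filler."""
    if len(secret) < min_length:
        # For short secrets, two characters would reveal too much.
        return "***"
    # A fixed filler is used so the log does not reveal the secret's length.
    return f"{secret[0]}***{secret[-1]}"


print(password_hint("correct-horse-battery"))  # c***y
print(password_hint("abc"))                    # ***
```

That is enough to tell two candidate passwords apart without ever writing the whole value to the log.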
An option to mask variables in build logs seems valuable. Perhaps that should even be the default behavior. It doesn't actually secure the variable, but it helps protect from accidental display. Perhaps we could let people reveal masked variables (if they have the right permissions).
I wouldn't be opposed to seeing that. Of course, it's always possible to put together a build script to reveal the secrets, but at least it's a layer of security.
This would accomplish another thing: it could provide project owners with the possibility of keeping the build logs public but redacting sensitive information from them. Which is currently impossible.
Example scenario: the user wants to set up CI so that it pushes to a mirror repository.
1. The user creates a personal access token.
2. He adds his token to the project variables.
3. He sets up CI so it uses the project variable (his token) to authenticate and push from a CI run to the mirror repository.
4. The authentication step fails (for whatever reason, maybe a typo in his username), and the token is publicly shown.
Like this:
Now he could always make the pipeline private, but what if he wanted to have it public but prevent his sensitive information from leaking?
BTW, I think the best way would be to create an authentication token per project, in addition to keeping it secret and not showing it in build logs. This way, even if revealed, it wouldn't affect that user's other projects. With only one authentication token per user, a leak could be disastrous.
I believe that we are working on deploy keys allowed to be used to push.
That's great!
Secondly we already do have repository mirroring as part of EE.
I know, but this is not the problem I'm seeing. The major issue is the security risk of having "secret keys" revealed in build logs, as @cmattrex showed above. That's my concern. We think we're safe by storing the secret variables in the project's settings > variables, but they aren't hidden at all when, for example, the build fails. From my pov, they should be displayed as "xxxxx" or something like that, but we should never expose a secret key.
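To make the "xxxxx" idea concrete, here is a minimal sketch of what such a log filter could do. It is not how GitLab implements anything; `mask_secrets`, the token value and the log line are all invented for the example.

```python
def mask_secrets(log_text: str, secrets: list[str], placeholder: str = "xxxxx") -> str:
    """Replace every literal occurrence of each secret with a placeholder."""
    # Longest first, so a secret that contains a shorter one is masked whole.
    for secret in sorted(secrets, key=len, reverse=True):
        if secret:  # an empty string would match at every position
            log_text = log_text.replace(secret, placeholder)
    return log_text


# Hypothetical failed push that echoes the remote URL, token included.
log = "fatal: Authentication failed for 'https://user:s3cr3t-t0ken@example.com/mirror.git/'"
print(mask_secrets(log, ["s3cr3t-t0ken"]))
# fatal: Authentication failed for 'https://user:xxxxx@example.com/mirror.git/'
```

This only catches the secret when it appears verbatim, so it protects against accidents rather than against someone who wants to extract the value.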
just my 2 cents, this is non-trivial and affects all CI solutions, I'd prefer to see pluggable solutions/approaches/recommendations, rather than a specific attempt to fix it. Starring out passwords in the build log is fine until someone decides to pipe everything into a file for later review - can't star those...
In CI land, preventing others from seeing the secrets is difficult and essentially becomes a matter of trust.
I'm trialing HashiCorp Vault, because it can be set up to provide temporary credentials based on different backends (AWS etc.), which means that the exposure of credentials is only an issue for the lease duration (e.g. 10 min). It also provides central auditing and the ability to revoke access in one place. This doesn't solve the problem of someone checking out the repo and branching it with a modified .gitlab-ci.yml, possibly affecting live environments. If that level of control is required (or a lack of trust exists), then our aim is to use a dual repo model: one repo that has the code and build logic, and one that deploys it. That way we can limit who can deploy and affect environments to those who have access to the deploy repo.
@ayufan I just wanted to make my point and raise awareness for something that I consider... hmmm... sensitive. I came across this issue a couple days ago, and realized that I could have leaked a personal access token. Can you imagine the disaster it could cause if someone with bad intentions accessed my account with my token? he/she could make an epic mess with all the projects I have access to. That could happen to anyone, using any GitLab instance. It's implicit in the name "secret variable" that it should be secret, we trust that.
I thought it wouldn't be such a big deal to improve this, since we already have it working for the variable $CI_BUILD_TOKEN:
But this is as far as I can go with this discussion, I don't have the knowledge to go any further on a feature proposal, or anything like that :( Up to you guys to know what to do, and when to do! :D
$CI_BUILD_TOKEN is short-lived, and even if it leaks it allows only very limited access. Credentials stored as Secret Variables are different: they can be multiline, which makes them pretty much impossible to mask properly.
I don't know, but my thinking was not to introduce features that can give you a false sense of security, and "masking" is this kind of thing. Preventing secrets from leaking is a very hard thing to do, and the only reliable way to prevent it is to restrict access to CI build logs, CI runner machines and CI artifacts. So, if you are concerned, you should protect these parts of the system.
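A small sketch of why multiline values are awkward, assuming the log is filtered one line at a time as it streams (an assumption made for the example, not a statement about the runner). The key material and `mask_line` are made up.

```python
def mask_line(line: str, secret: str) -> str:
    """Mask one log line; this is all a line-by-line filter gets to see."""
    return line.replace(secret, "[MASKED]")


# Hypothetical multiline secret, standing in for an SSH private key.
secret = "-----BEGIN KEY-----\nAAAA\nBBBB\n-----END KEY-----"
log = "$ env\nSSH_KEY=" + secret + "\n"

# Line by line, the full secret never appears within a single line,
# so nothing is replaced and the key body goes through untouched.
masked_lines = [mask_line(line, secret) for line in log.splitlines()]
leaked_streaming = any("AAAA" in line for line in masked_lines)

# Masking the complete text works, but requires buffering the whole log.
leaked_buffered = "AAAA" in log.replace(secret, "[MASKED]")

print(leaked_streaming, leaked_buffered)  # True False
```

Masking each line of the secret separately would help here, but short lines then start matching unrelated log output.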
The idea of having separate repositories is another way to mitigate the problem, you basically limit access to mentioned parts of the system.
Maybe @briann can jump in and add a few words about other best practices, but generally I consider this problem as very complicated to do right.
@ayufan Would it be possible to check the result code from git and then if it's not zero run the error through a filter that parses any URL and strips the credentials?
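Something along these lines is what I have in mind for the filter. It is a sketch only: the regex, `strip_url_credentials` and the sample git message are mine, and how it would be wired to git's exit code is left out.

```python
import re

# Matches "scheme://userinfo@", where userinfo contains no "/", "@" or whitespace.
# Limitation: a password containing an unencoded "@" is only partly removed.
_URL_CREDENTIALS = re.compile(r"(?P<scheme>[a-zA-Z][a-zA-Z0-9+.-]*://)[^/@\s]+@")


def strip_url_credentials(text: str) -> str:
    """Replace the userinfo part of any URL in the text with a placeholder."""
    return _URL_CREDENTIALS.sub(r"\g<scheme>[MASKED]@", text)


error = ("fatal: unable to access "
         "'https://bob:tok3n@gitlab.example.com/group/mirror.git/': "
         "The requested URL returned error: 403")
print(strip_url_credentials(error))
```

URLs without credentials are left alone, so the filter could be applied to all error output rather than only to known failure messages.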
I'm not concerned about my own projects or my account, bc I didn't include any secret variable that could cause a problem, just a runner token for a test project. But I'm concerned for other users that could fall into this issue.
I'm looking at something like this for access to third-party systems. For instance, we publish artifacts to an artifactory instance. To do that, we need to provide an API key to the system. That token, in the wrong hands, could do some damage to our libraries, so we want to keep it secure.
Having a system that can create ephemeral keys for access would be nice, but since this is outside of Gitlab, I think that becomes a harder problem. What might be nice is a way to get those credentials from a third party system in a scripted fashion. But I don't think that needs to be a gitlab feature, it should be easy enough to script something up that can be called from the build script to retrieve those.
guys/girls: there exists some kind of masking, i've seen it in my logs! i think $CI_BUILD_TOKEN gets masked in env output somehow. what's the current state of masking? perhaps some links to actual implementation to get better picture?
I use https://www.vaultproject.io/. I have a script that generates Vault tokens and stores them as secrets in all our repos; the token is changed every hour and is different in each repo.
Then, in my custom build image, I have the Vault client installed to access those secrets. This way I can also access secrets that are generated dynamically and that Vault automatically destroys after a few minutes, like PKI certificates for my Kubernetes cluster.
This probably will not work with static secrets, but I don't see why you would want static secrets in your CI environment anyway.
I hate beating a dead horse, but in this case I think it's somewhat necessary, since I don't think I saw any sort of conclusion here: I think this feature should be seriously considered and implemented. There are solutions for this in other CI platforms (namely Jenkins and Travis), so not having this in GitLab is a major detractor for the security-minded folks around IMHO.
We need to differentiate here - there are two things we want protection for.
1. Revealing secret variables by mistake - I have already stepped into this: debugging my CI scripts with bash -x and env to show env variables, and bang, my CI runner SSH private key leaked in the output.
It's very useful to at least protect users from this happening.
2. Active attack - I agree with @criloz that this might be more difficult, and using better tools to handle authentication is probably the better way here. But I think for those users who don't use those tools, we should at least increase the security for them.
Solving 1. would be the first step in the right direction.
Totally agree with @varac in that solving for 1 is a step in the right direction.
But 2, if I understand @criloz correctly, can still be mitigated. This all depends on how the GitLab folks decide to implement it, though. And this isn't an easy problem to solve.
The best comparison I think I can draw is how Travis seemed to solve it with encrypted environment variables. The idea is you encrypt your sensitive data in the .travis.yml file, and then those environment variables get decrypted in the runner (simple enough, right?). However, encrypted variables can't be used from forked repositories, which stops an attacker from forking your repo and making a PR that will dump your secrets; only branched pull requests work. It can be problematic, but I do think that approach mitigates the problem.
They are decrypted, but can still be printed with simple printenv. So you are back into step one :)
Encryption of variables secures them in transit, but does not prevent them from being printed.
Correct; however what I'm saying is that Travis CI won't run jobs with encrypted variables from PRs from forked repos, making it impossible for an attacker to fork your PR and dump your secrets through a pull request.
Instead Travis will only run jobs with encrypted variables if the PR was made from a branch on the original repository. This limits the attack surface by preventing random attackers from forking/PRing/dumping and trusting only those with access to create branches on the repository.
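As a rough illustration of that rule (my own sketch of the policy as described here, not Travis code; `PullRequest` and `job_environment` are invented names):

```python
from dataclasses import dataclass


@dataclass
class PullRequest:
    source_repo: str  # repository the PR branch lives in
    target_repo: str  # repository the PR is opened against


def job_environment(pr: PullRequest, secrets: dict, public_vars: dict) -> dict:
    """Build the job environment, exposing secrets only to same-repo PRs."""
    env = dict(public_vars)
    if pr.source_repo == pr.target_repo:
        # The branch was pushed by someone with write access to the repo.
        env.update(secrets)
    return env


secrets = {"DEPLOY_TOKEN": "s3cr3t"}
public_vars = {"CI": "true"}

fork_pr = PullRequest(source_repo="attacker/project", target_repo="owner/project")
branch_pr = PullRequest(source_repo="owner/project", target_repo="owner/project")

print(job_environment(fork_pr, secrets, public_vars))    # {'CI': 'true'}
print(job_environment(branch_pr, secrets, public_vars))  # {'CI': 'true', 'DEPLOY_TOKEN': 's3cr3t'}
```

Anyone who can push a branch can still print the secret, so this narrows who can attack rather than hiding the value.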
We have our secure vars as env vars on the runners, set in the config.toml. A developer was able to echo that creatively via echo -n "$PASSWORD" | base64, and then just copy that base64 encoded value to his laptop and decode it there. This was an ethical hacking scenario that we want to avoid via further security on GitLab's part.
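The same trick, reproduced as a sketch to show why literal masking would not have stopped it. The password and the `mask` helper are made up; the only assumption is that masking works by replacing the exact secret string in the log.

```python
import base64


def mask(log_text: str, secret: str) -> str:
    """Literal masking: replace exact occurrences of the secret."""
    return log_text.replace(secret, "[MASKED]")


password = "hunter2-deploy"  # hypothetical secret
encoded = base64.b64encode(password.encode()).decode()

# What the job log would contain after the developer's command runs.
log = 'echo -n "$PASSWORD" | base64\n' + encoded + "\n"
masked = mask(log, password)

# The literal password never appears, so the filter has nothing to replace,
# yet the encoded value is still in the log and decodes back to the secret.
assert password not in masked
assert encoded in masked
assert base64.b64decode(encoded).decode() == password
```

Any reversible transformation (hex, reversing the string, printing one character per line) works the same way, which is why masking only helps against accidents.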
@bikebilly that is true. The bulk of our runner config is a shared environment. We've begun to segregate some of them now. We have an inconvenient issue that is outside of this scope: we need to allow a lot of people to be master to run manual jobs, but would prefer they don't have access to see the encrypted vars, which they can unmask in the UI.
@bikebilly yes, it's taken our culture a bit to accept that masters should be trusted etc. Second point, we've begun to implement that sort of pattern too, but only in a limited way so far. We're hoping to wait until premium has custom CI/project templates to roll a lot of improvements out across the board.