This is impacting me as well. I'm not sure why the "only_if" block is passing... I'm running the container in OpenShift Enterprise 3 but when I run id -Z it returns "id: --context (-Z) works only on an SELinux-enabled kernel" as expected. (I checked and echo $? returns 1, an error code. Unless Chef is interpreting that as 'true'.)
It still tries to run the chcon command, which fails (as expected) because the filesystem is NFS and the mount does not support SELinux labels.
I think there are a few options:
Try to run the chcon command, but append || true so that a failure is ignored either way
Use some sort of Chef exception handling and throw out a warning
Add an environment variable to control the execution of that part of the script. (Especially for container users to turn it off.)
My preference is for the error/warning with the possibility of an environment variable. I'd accept the || true just to get past it.
ls -dZ #{ssh_home} | egrep '(fuse|nfs|cifs)' <-- returns something like drwx------. git git system_u:object_r:fusefs_t:s0 /var/opt/gitlab/.ssh if and only if the current SELinux 'type' context includes any of the strings "fuse", "nfs", or "cifs"; if the egrep fails then the command will have a bash return code of 1, thus not satisfying the "not_if" condition.
In my case, the .ssh directory for the git user is a fusefs_t context as it's a FUSE mountpoint. In chrisruffalo's case, the user home directory is NFS. The egrep would prevent both use-cases from having failure events. The context-declaration at mount-time would only remediate my own.
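To make the exit-code behaviour concrete, here is a minimal shell sketch of the same check. The function name has_remote_fs_context is hypothetical (it is not part of the omnibus recipe), it uses grep -E in place of egrep, and it takes the ls -dZ line as an argument instead of running ls itself, so it can be tried on a machine without SELinux.

```shell
# Hypothetical helper: returns 0 when the given `ls -dZ` output line mentions
# a fuse, nfs or cifs context, and 1 otherwise. This mirrors the exit code
# that the not_if guard would see from the pipeline.
has_remote_fs_context() {
  printf '%s\n' "$1" | grep -Eq '(fuse|nfs|cifs)'
}

# Example with the line quoted above:
#   has_remote_fs_context "drwx------. git git system_u:object_r:fusefs_t:s0 /var/opt/gitlab/.ssh"
#   echo $?
```

Note that the pattern is matched against the whole line, so a path that happens to contain one of those strings would match as well.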
I have never seen a complete list of possible remote/SELinux-context-forcing filesystems, but a non-comprehensive list that should cover most cases would include:
nfs
fusefs
cifs
sshfs (usually covered by fusefs context)
ecryptfs
A more robust test would be, in order:
1. Check if SELinux is enabled. id -Z seems to be sufficient for this, so long as we check for unconfined_u or system_u in its output as a secondary validation.
2. Upon passing condition 1, check the current SELinux context of the #{ssh_home} directory for any of nfs_t, fusefs_t, cifs_t, or ecryptfs_t.
3. Upon finding any one of the above, run setsebool -P use_#{selinux_fstype}_home_dirs on, where "selinux_fstype" is the value before _t being checked for in condition 2 (note that for cifs_t the boolean is named use_samba_home_dirs).
4. Upon passing condition 1 but finding no hits in condition 2, use the chcon --recursive --type ssh_home_t #{ssh_home} command.
5. Only if all steps after 1 return failures do we consider the context-enforcing to be a failed step.
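The steps above could be sketched in shell roughly as follows. This is only an illustration, not recipe code: selinux_fix_for is a hypothetical name, the function only prints the command it would choose instead of running it, and the use_ecryptfs_home_dirs boolean and the GNU stat -c %C call in the usage comment are assumptions that should be checked against the target system.

```shell
# Hypothetical helper: print the command the steps above would pick for a
# given SELinux context string such as "system_u:object_r:nfs_t:s0".
# Nothing is executed here; the caller decides whether to run the result.
selinux_fix_for() {
  local context="$1" ssh_home="$2" setype
  # The SELinux type is the third colon-separated field of the context.
  setype="$(printf '%s\n' "$context" | cut -d: -f3)"
  case "$setype" in
    nfs_t)      echo "setsebool -P use_nfs_home_dirs on" ;;
    fusefs_t)   echo "setsebool -P use_fusefs_home_dirs on" ;;
    cifs_t)     echo "setsebool -P use_samba_home_dirs on" ;;
    ecryptfs_t) echo "setsebool -P use_ecryptfs_home_dirs on" ;;  # assumed boolean name
    *)          echo "chcon --recursive --type ssh_home_t $ssh_home" ;;
  esac
}

# Usage on a real host (condition 1 is the `id -Z` check):
#   if id -Z >/dev/null 2>&1; then
#     selinux_fix_for "$(stat -c %C /var/opt/gitlab/.ssh)" /var/opt/gitlab/.ssh
#   fi
```

Looking at the type field alone, instead of grepping the whole ls -dZ line, avoids false matches on path names.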
I apologize for not writing Chef code for this, but that is not an area I have any strength with.
As a quick-and-dirty fix, we can simply skip performing the chcon in those cases where SELinux is enabled but there is a 'special' filesystem context on #{ssh_home}. Insert the following into the "gitlab-shell.rb" recipe (in my omnibus install, this is located at /opt/gitlab/embedded/cookbooks/gitlab/recipes/gitlab-shell.rb), on the line immediately after only_if "id -Z" (on my system, that is line 75):
not_if "ls -dZ /var/opt/gitlab/.ssh | egrep '(fusefs\|nfs\|cifs\|ecryptfs)'"
The not_if approach is the method I have used, and this should cover almost all use-cases; though it does leave setting the sebool to the system administrator.
@twk3 Can you please take a look? Someone has a proposed fix in the related MR. We have a customer running into this on an HA setup. We had to comment out code, which isn't ideal at all. Thanks.
In the customer's case, they have a bind mount where /var/opt/gitlab/.ssh/ binds to an NFS share. I know we've tested this with other customers and it works so I wonder why this particular bind mount won't allow chcon. Is it something with the bind mount or NFS?
The gitlab omnibus install sets the git user's home directory to /var/opt/gitlab, and as such /var/opt/gitlab/.ssh is the proper directory for the SSH daemon to look at for the git user.
If the filesystem is local and SELinux enabled, then no problems. chcon works as expected. If the filesystem however is provided via a remote system (NFS, glusterfs, CIFS), then the SELinux type context is forced to a value that reflects the remote nature of the filesystem in question, and thus cannot be changed to ssh_home_t (although in some cases you can actually use a mounting option to specify that for the entire mountpoint).
If that is not an option (or simply not preferred) then specific SELinux booleans must be toggled based on what type of remote filesystem is providing git's ~/.ssh directory:
use_fusefs_home_dirs
use_nfs_home_dirs
use_samba_home_dirs
(An example command to run for enabling the use of an NFS filesystem for a user's home directory for SSH would be setsebool -P use_nfs_home_dirs on)
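If you want to toggle a boolean only when it is currently off, something like the following sketch could work. Both function names are hypothetical, and the parsing assumes the usual getsebool output format of "name --> on" or "name --> off".

```shell
# Hypothetical helper: succeed when a `getsebool` output line such as
# "use_nfs_home_dirs --> on" reports the boolean as enabled.
sebool_line_is_on() {
  case "$1" in
    *" --> on") return 0 ;;
    *)          return 1 ;;
  esac
}

# Hypothetical wrapper: enable a boolean persistently (-P) only if it is
# not already on. Needs root and an SELinux-enabled host to actually run.
ensure_sebool_on() {
  local name="$1"
  if ! sebool_line_is_on "$(getsebool "$name")"; then
    setsebool -P "$name" on
  fi
}

# Usage: ensure_sebool_on use_nfs_home_dirs
```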
In my case I have modified the gitlab-shell.rb chef recipe to include the following line after line 73 (only_if "id -Z"): not_if "ls -Z /var/opt/gitlab/.ssh | egrep '(fuse\|nfs\|cifs)'"
and that has allowed me to successfully run gitlab-ctl reconfigure.
bash[Set proper security context on ssh files for selinux] action run [execute] chcon: failed to change context of `/var/opt/gitlab/.ssh/authorized_keys' to `system_u:object_r:sshd_key_t:s0': Operation not supported
We also tried to set directly the value for the authorized keys in the gitlab.rb configuration file to
For anyone who has experienced this issue, can you let me know what your mount options are for your gitlab nfs mounts in /etc/fstab? Specifically, I'm looking for whether or not your options include a context= option that ensures your context mounts remain persistent (and also the link for RHEL 6 docs).
I believe this is the key to the reconfigure failing, and have only just realized it myself. If this is the case, no changes would be needed for the reconfigure recipe, rather updates to the nfs mount options documentation would be needed, perhaps adding a section on selinux enabled hosts similar to the section for selinux hosts on ci-runner docker configuration.
An enhancement to chef probably could be made, allowing the gitlab.rb configuration to specify remote host, mount options, etc, and then ensure that the correct options were inserted into /etc/fstab for boot-time support, in addition to waiting for those mounts to be available before certain steps of the reconfigure process. I'm not experienced enough with chef to make a MR/suggestion for that functionality.
Also likely related - #1461. I think it's also likely to cause the loss of ssh clone functionality after an upgrade if /var/opt/gitlab is managed via lvm. The same issues occur in that scenario, but in this case local chcon execution or using semanage fcontext just once will likely permanently resolve the issue.
It seems there are 2 solutions, one that users can take now, and one that requires a chef update:
Update context mounts in /etc/fstab so nfs mounts survive relabeling (happens whenever restorecond blasts away chcon that set any filesystems to non-default contexts). Also use it to ensure use_nfs_home_dirs is on if selinux is enabled.
Update chef to use semanage fcontext so that context changes are permanent, not temporary.
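As a rough illustration of the second option, the sketch below prints the pair of commands that would make a label persistent on a local filesystem. The helper name persistent_label_commands is hypothetical, and the ssh_home_t type and the .ssh path in the usage comment are taken from earlier in this thread; this is not what the recipe currently does.

```shell
# Hypothetical helper: print the commands for a persistent label on a
# directory tree. Nothing is executed; review the output before running it.
persistent_label_commands() {
  local dir="$1" setype="$2"
  # semanage fcontext records the rule in the local policy; restorecon then
  # applies it, so a later relabel keeps the type instead of resetting it.
  printf "semanage fcontext -a -t %s '%s(/.*)?'\n" "$setype" "$dir"
  printf "restorecon -R %s\n" "$dir"
}

# Usage: persistent_label_commands /var/opt/gitlab/.ssh ssh_home_t
```

Unlike a plain chcon, a rule added this way is what restorecon itself consults, which is why the label should survive a relabel.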
Overall, I think gitlab.rb.erb might need to have a selinux section moving forward, to allow users to manage selinux through the omnibus chef configuration on selinux managed machines.
I've been able to use the information I shared above to consistently reproduce the issue, and am including a full log for gitlab-ctl reconfigure as well as a log for ls -Z /var/opt/gitlab
Output from `gitlab-ctl reconfigure`
$ sudo gitlab-ctl reconfigure
Starting Chef Client, version 12.12.15
resolving cookbooks for run list: ["gitlab-ee"]
Synchronizing Cookbooks:
bash[Set proper security context on ssh files for selinux] action run
[execute] chcon: failed to change context of ‘gitsync.pub’ to ‘system_u:object_r:ssh_home_t:s0’: Operation not supported
chcon: failed to change context of ‘authorized_keys’ to ‘system_u:object_r:ssh_home_t:s0’: Operation not supported
chcon: failed to change context of ‘/var/opt/gitlab/.ssh’ to ‘system_u:object_r:ssh_home_t:s0’: Operation not supported
chcon: failed to change context of ‘/var/opt/gitlab/.ssh/authorized_keys’ to ‘system_u:object_r:sshd_key_t:s0’: Operation not supported
================================================================================
Error executing action run on resource 'bash[Set proper security context on ssh files for selinux]'
Mixlib::ShellOut::ShellCommandFailed
------------------------------------
Expected process to exit with [0], but received '1'
---- Begin output of "bash" "/tmp/chef-script20170821-1979-1ikum8t" ----
STDOUT:
STDERR: chcon: failed to change context of ‘gitsync.pub’ to ‘system_u:object_r:ssh_home_t:s0’: Operation not supported
chcon: failed to change context of ‘authorized_keys’ to ‘system_u:object_r:ssh_home_t:s0’: Operation not supported
chcon: failed to change context of ‘/var/opt/gitlab/.ssh’ to ‘system_u:object_r:ssh_home_t:s0’: Operation not supported
chcon: failed to change context of ‘/var/opt/gitlab/.ssh/authorized_keys’ to ‘system_u:object_r:sshd_key_t:s0’: Operation not supported
---- End output of "bash" "/tmp/chef-script20170821-1979-1ikum8t" ----
Ran "bash" "/tmp/chef-script20170821-1979-1ikum8t" returned 1
If you're using NFS mounts on an SELinux-enabled host (specifically, if the git user's home dir is NFS-mounted), you need to run setsebool -P use_nfs_home_dirs on, and then set up your NFS mounts with a specific context.
For now, I'm recommending individually managing each location mount so that you don't create two different contexts for the same directory, and then adding context="system_u:object_r:user_home_dir_t:s0" to the mount options of the git user's home dir.
If you're not using NFS mounts, but otherwise have selinux enabled, then chef needs to use semanage fcontext instead of chcon to make sure restorecond doesn't reset the labels on gitlab data directories.
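For the NFS case, an /etc/fstab entry carrying the context= option might be generated along these lines. This is a sketch under assumptions: fstab_line_with_context is a hypothetical helper, the server and export names in the usage comment are placeholders, and "defaults" stands in for whatever other mount options your setup already uses.

```shell
# Hypothetical helper: print an /etc/fstab entry for an NFS mount with a
# fixed SELinux context. It only prints the line; it does not edit fstab.
fstab_line_with_context() {
  local source="$1" mountpoint="$2" context="$3"
  # The context value is double-quoted so that it stays a single option.
  printf '%s %s nfs defaults,context="%s" 0 0\n' "$source" "$mountpoint" "$context"
}

# Usage (placeholder server and export):
#   fstab_line_with_context nfs.example.com:/export/git /var/opt/gitlab \
#     system_u:object_r:user_home_dir_t:s0
```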
This will require a documentation fix, as well as a chef recipe change. I can submit a MR for the doc changes, but would need to work with someone on the gitlab team about the scheduling of the git stuff, since the information in the docs will be different before/after the updates to omnibus chef.
A co-worker just mentioned a good point... authorized_keys will no longer be supported moving forward.
Throughout all the work I've done testing all of this, I've realized that the only directory that causes issues with gitlab-ctl reconfigure is git user's home directory.
Will the git user's home directory still be a managed directory moving forward? If not, it might be worth classifying as a do-not-fix, and to just refer to the instructions above for any <10.0 installations.