=========
SafeSetID
=========
SafeSetID is an LSM module that gates the setid family of syscalls to restrict
UID/GID transitions from a given UID/GID to only those approved by a
system-wide allowlist. These restrictions also prohibit the given UIDs/GIDs
from obtaining auxiliary privileges associated with CAP_SET{U/G}ID, such as
allowing a user to set up user namespace UID/GID mappings.


Background
==========
In the absence of file capabilities, processes spawned on a Linux system that
need to switch to a different user must be spawned with CAP_SETUID privileges.
CAP_SETUID is granted to programs running as root or those running as a non-root
user that have been explicitly given the CAP_SETUID runtime capability. It is
often preferable to use Linux runtime capabilities rather than file
capabilities, because running a program with elevated privileges through file
capabilities opens up possible security holes: any user with access to the
file can exec() that program to gain the elevated privileges.

While it is possible to implement a tree of processes by giving full
CAP_SET{U/G}ID capabilities, this is often at odds with the goals of running a
tree of processes under non-root user(s) in the first place. Specifically,
since CAP_SETUID allows changing to any user on the system, including the root
user, it is an overpowered capability for what is needed in this scenario,
especially since programs often only call setuid() to drop privileges to a
lesser-privileged user, not to elevate privileges. Unfortunately, there is no
generally feasible way in Linux to restrict the potential UIDs that a user can
switch to through setuid() beyond allowing a switch to any user on the system.
The SafeSetID LSM seeks to provide a solution for restricting setid
capabilities in such a way.

The main use case for this LSM is to allow a non-root program to transition to
other untrusted UIDs without full-blown CAP_SETUID capabilities. The non-root
program would still need CAP_SETUID to do any kind of transition, but the
additional restrictions imposed by this LSM would mean it is a "safer" version
of CAP_SETUID, since the non-root program cannot take advantage of CAP_SETUID to
do any unapproved actions (e.g. setuid to UID 0 or create or enter a new user
namespace). The higher-level goal is to allow for UID-based sandboxing of system
services without having to give out CAP_SETUID all over the place just so that
non-root programs can drop to even-lesser-privileged UIDs. This is especially
relevant when one non-root daemon on the system should be allowed to spawn other
processes as different UIDs, but it is undesirable to give the daemon a
basically-root-equivalent CAP_SETUID.


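As a rough illustration of the allowlist idea described in this section, the
following Python sketch models the decision for a single UID transition. It is
a userspace model only, not the kernel implementation: the names Verdict,
check_transition and EXAMPLE_POLICY are hypothetical, and the sketch assumes
that a UID with no policy entries is left unrestricted by the LSM.

```python
from enum import Enum


class Verdict(Enum):
    DEFAULT = "default"   # no rule names this source ID; the model does not restrict it
    ALLOWED = "allowed"   # a rule matches the exact (source, destination) pair
    DENIED = "denied"     # the source ID is restricted and no rule matches


def check_transition(policy, src, dst):
    """Classify a src -> dst ID change against a set of (src, dst) rules."""
    if (src, dst) in policy:
        return Verdict.ALLOWED
    if any(rule_src == src for rule_src, _ in policy):
        return Verdict.DENIED
    return Verdict.DEFAULT


# Hypothetical policy: UID 1000 may only become UID 2000 or UID 2001.
EXAMPLE_POLICY = {(1000, 2000), (1000, 2001)}
```

In this model, check_transition(EXAMPLE_POLICY, 1000, 0) yields Verdict.DENIED,
which corresponds to the "cannot setuid to UID 0" property described above,
while a UID that appears in no rule falls through to Verdict.DEFAULT.
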
Other Approaches Considered
===========================

Solve this problem in userspace
-------------------------------
For candidate applications that would like to have restricted setid capabilities
as implemented in this LSM, an alternative option would be to simply take away
setid capabilities from the application completely and refactor the process
spawning semantics in the application (e.g. by using a privileged helper program
to do process spawning and UID/GID transitions). Unfortunately, there are a
number of semantics around process spawning that would be affected by this, such
as fork() calls where the program doesn't immediately call exec() after the
fork(), parent processes specifying custom environment variables or command line
args for spawned child processes, or inheritance of file handles across a
fork()/exec(). Because of this, a solution that uses a privileged helper in
userspace would likely be less appealing to incorporate into existing projects
that rely on certain process-spawning semantics in Linux.

Use user namespaces
-------------------
Another possible approach would be to run a given process tree in its own user
namespace and give programs in the tree setid capabilities. In this way,
programs in the tree could change to any desired UID/GID in the context of their
own user namespace, and only approved UIDs/GIDs could be mapped back to the
initial system user namespace, effectively preventing privilege escalation.
Unfortunately, it is not generally feasible to use user namespaces in isolation,
without pairing them with other namespace types, which is not always an option.
Linux checks for capabilities based on the user namespace that "owns" some
entity. For example, Linux has the notion that network namespaces are owned by
the user namespace in which they were created. A consequence of this is that
capability checks for access to a given network namespace are done by checking
whether a task has the given capability in the context of the user namespace
that owns the network namespace, not necessarily the user namespace under
which the given task runs. Therefore, spawning a process in a new user namespace
effectively prevents it from accessing the network namespace owned by the
initial user namespace. This is a deal-breaker for any application that expects
to retain the CAP_NET_ADMIN capability for the purpose of adjusting network
configurations. Using user namespaces in isolation also causes problems with
other system interactions, including use of pid namespaces and device creation.

Use an existing LSM
-------------------
None of the other in-tree LSMs have the capability to gate setid transitions, or
even employ the security_task_fix_setuid hook at all. SELinux says of that hook:
"Since setuid only affects the current process, and since the SELinux controls
are not based on the Linux identity attributes, SELinux does not need to control
this operation."


Directions for use
==================
This LSM hooks the setid syscalls to make sure transitions are allowed if an
applicable restriction policy is in place. Policies are configured through
securityfs by writing to the safesetid/uid_allowlist_policy and
safesetid/gid_allowlist_policy files at the location where securityfs is
mounted. The format for adding a policy is '<UID>:<UID>' or '<GID>:<GID>',
using literal numbers, and ending with a newline character such as '123:456\n'.
The first number is the UID/GID being restricted and the second is a UID/GID it
is allowed to transition to. Writing an empty string "" will flush the policy.
Again, configuring a policy for a UID/GID will prevent that UID/GID from
obtaining auxiliary setid privileges, such as allowing a user to set up user
namespace UID/GID mappings.

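A minimal Python sketch of writing a UID policy in the format described in
this section is shown below. Several things here are assumptions rather than
part of this document: that securityfs is mounted at /sys/kernel/security,
that the caller is privileged enough to write the file, and that all rules
for the policy are submitted in a single write(). The helper names
format_policy and write_policy are hypothetical.

```python
import os

# Assumption: securityfs is mounted at its conventional location.
SECURITYFS_ROOT = "/sys/kernel/security"
UID_POLICY_FILE = os.path.join(SECURITYFS_ROOT, "safesetid", "uid_allowlist_policy")


def format_policy(rules):
    """Render (src, dst) integer pairs as '<ID>:<ID>' lines, each newline-terminated."""
    lines = []
    for src, dst in rules:
        if src < 0 or dst < 0:
            raise ValueError("IDs must be non-negative integers")
        lines.append(f"{int(src)}:{int(dst)}\n")
    return "".join(lines)


def write_policy(path, rules):
    """Submit the rendered policy to the given file with one write() call."""
    data = format_policy(rules).encode("ascii")
    fd = os.open(path, os.O_WRONLY)
    try:
        os.write(fd, data)
    finally:
        os.close(fd)
```

For example, write_policy(UID_POLICY_FILE, [(123, 456)]) would submit the
line '123:456' followed by a newline. The same helpers could be pointed at
safesetid/gid_allowlist_policy for GID rules.
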
Note on GID policies and setgroups()
====================================
In v5.9 we are adding support for limiting CAP_SETGID privileges as was done
previously for CAP_SETUID. However, for compatibility with common
sandboxing-related code conventions in userspace, we currently allow arbitrary
setgroups() calls for processes with CAP_SETGID restrictions. Until we add
support in a future release for restricting setgroups() calls, these GID
policies add no meaningful security. setgroups() restrictions will be enforced
once we have the policy checking code in place, which will rely on GID policy
configuration code added in v5.9.