.. SPDX-License-Identifier: GPL-2.0
.. Copyright © 2017-2020 Mickaël Salaün <mic@digikod.net>
.. Copyright © 2019-2020 ANSSI
.. Copyright © 2021-2022 Microsoft Corporation

=====================================
Landlock: unprivileged access control
=====================================

:Author: Mickaël Salaün
:Date: May 2022

The goal of Landlock is to restrict ambient rights (e.g. global filesystem
access) for a set of processes.  Because Landlock is a stackable LSM, it makes
it possible to create safe security sandboxes as new security layers in
addition to the existing system-wide access-controls.  This kind of sandbox is
expected to help mitigate the security impact of bugs or unexpected/malicious
behaviors in user space applications.  Landlock empowers any process,
including unprivileged ones, to securely restrict themselves.

We can quickly make sure that Landlock is enabled in the running system by
looking for "landlock: Up and running" in kernel logs (as root): ``dmesg | grep
landlock || journalctl -kg landlock``.  Developers can also easily check for
Landlock support with a :ref:`related system call <landlock_abi_versions>`.  If
Landlock is not currently supported, we need to :ref:`configure the kernel
appropriately <kernel_support>`.

Landlock rules
==============

A Landlock rule describes an action on an object.  An object is currently a
file hierarchy, and the related filesystem actions are defined with `access
rights`_.  A set of rules is aggregated in a ruleset, which can then restrict
the thread enforcing it, and its future children.

Defining and enforcing a security policy
----------------------------------------

We first need to define the ruleset that will contain our rules.  For this
example, the ruleset will contain rules that only allow read actions, but write
actions will be denied.  The ruleset then needs to handle both of these kinds
of actions.  This is required for backward and forward compatibility (i.e. the
kernel and user space may not know each other's supported restrictions), hence
the need to be explicit about the denied-by-default access rights.

.. code-block:: c

    struct landlock_ruleset_attr ruleset_attr = {
        .handled_access_fs =
            LANDLOCK_ACCESS_FS_EXECUTE |
            LANDLOCK_ACCESS_FS_WRITE_FILE |
            LANDLOCK_ACCESS_FS_READ_FILE |
            LANDLOCK_ACCESS_FS_READ_DIR |
            LANDLOCK_ACCESS_FS_REMOVE_DIR |
            LANDLOCK_ACCESS_FS_REMOVE_FILE |
            LANDLOCK_ACCESS_FS_MAKE_CHAR |
            LANDLOCK_ACCESS_FS_MAKE_DIR |
            LANDLOCK_ACCESS_FS_MAKE_REG |
            LANDLOCK_ACCESS_FS_MAKE_SOCK |
            LANDLOCK_ACCESS_FS_MAKE_FIFO |
            LANDLOCK_ACCESS_FS_MAKE_BLOCK |
            LANDLOCK_ACCESS_FS_MAKE_SYM |
            LANDLOCK_ACCESS_FS_REFER,
    };

Because we may not know on which kernel version an application will be
executed, it is safer to follow a best-effort security approach.  Indeed, we
should try to protect users as much as possible whatever the kernel they are
using.  To avoid binary enforcement (i.e. either all security features or
none), we can leverage a dedicated Landlock command to get the current version
of the Landlock ABI and adapt the handled accesses.  Let's check if we should
remove the `LANDLOCK_ACCESS_FS_REFER` access right, which is only supported
starting with the second version of the ABI.

.. code-block:: c

    int abi;

    abi = landlock_create_ruleset(NULL, 0, LANDLOCK_CREATE_RULESET_VERSION);
    if (abi < 2) {
        ruleset_attr.handled_access_fs &= ~LANDLOCK_ACCESS_FS_REFER;
    }

This enables the creation of an inclusive ruleset that will contain our rules.

.. code-block:: c

    int ruleset_fd;

    ruleset_fd = landlock_create_ruleset(&ruleset_attr, sizeof(ruleset_attr), 0);
    if (ruleset_fd < 0) {
        perror("Failed to create a ruleset");
        return 1;
    }

We can now add a new rule to this ruleset thanks to the returned file
descriptor referring to this ruleset.  The rule will only allow reading the
file hierarchy ``/usr``.  Without another rule, write actions would then be
denied by the ruleset.  To add ``/usr`` to the ruleset, we open it with the
``O_PATH`` flag and fill the &struct landlock_path_beneath_attr with this file
descriptor.

.. code-block:: c

    int err;
    struct landlock_path_beneath_attr path_beneath = {
        .allowed_access =
            LANDLOCK_ACCESS_FS_EXECUTE |
            LANDLOCK_ACCESS_FS_READ_FILE |
            LANDLOCK_ACCESS_FS_READ_DIR,
    };

    path_beneath.parent_fd = open("/usr", O_PATH | O_CLOEXEC);
    if (path_beneath.parent_fd < 0) {
        perror("Failed to open file");
        close(ruleset_fd);
        return 1;
    }
    err = landlock_add_rule(ruleset_fd, LANDLOCK_RULE_PATH_BENEATH,
                            &path_beneath, 0);
    close(path_beneath.parent_fd);
    if (err) {
        perror("Failed to update ruleset");
        close(ruleset_fd);
        return 1;
    }

It may also be required to create rules following the same logic as explained
for the ruleset creation, by filtering access rights according to the Landlock
ABI version.  In this example, this is not required because
`LANDLOCK_ACCESS_FS_REFER` is not allowed by any rule.

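The filtering described above can be factored into a single helper that is
applied to the ruleset's handled accesses and to each rule's allowed accesses.
The following is only a sketch: `supported_access_fs()` is a hypothetical name
that is not part of the Landlock API, and the fallback definition of
`LANDLOCK_ACCESS_FS_REFER` is only there for UAPI headers that predate the
second ABI version.

.. code-block:: c

    #include <stdint.h>
    #include <linux/landlock.h>

    /* Fallback for UAPI headers older than ABI version 2. */
    #ifndef LANDLOCK_ACCESS_FS_REFER
    #define LANDLOCK_ACCESS_FS_REFER (1ULL << 13)
    #endif

    /*
     * Hypothetical helper: returns @access without the access rights that
     * the given Landlock ABI version does not know about.
     */
    static uint64_t supported_access_fs(uint64_t access, int abi)
    {
        if (abi < 1)
            return 0; /* Landlock is unavailable: nothing can be handled. */
        if (abi < 2)
            access &= ~LANDLOCK_ACCESS_FS_REFER;
        return access;
    }

Passing both `ruleset_attr.handled_access_fs` and every
`path_beneath.allowed_access` through such a helper keeps each rule a subset of
the handled accesses, which sys_landlock_add_rule() otherwise rejects with
EINVAL.  A result of zero means there is nothing to enforce on this kernel.
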
We now have a ruleset with one rule allowing read access to ``/usr`` while
denying all other handled accesses for the filesystem.  The next step is to
restrict the current thread from gaining more privileges (e.g. thanks to a SUID
binary).

.. code-block:: c

    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0)) {
        perror("Failed to restrict privileges");
        close(ruleset_fd);
        return 1;
    }

The current thread is now ready to sandbox itself with the ruleset.

.. code-block:: c

    if (landlock_restrict_self(ruleset_fd, 0)) {
        perror("Failed to enforce ruleset");
        close(ruleset_fd);
        return 1;
    }
    close(ruleset_fd);

If the `landlock_restrict_self` system call succeeds, the current thread is now
restricted and this policy will be enforced on all its subsequently created
children as well.  Once a thread is landlocked, there is no way to remove its
security policy; only adding more restrictions is allowed.  These threads are
now in a new Landlock domain, which is the merge of their parent domain (if
any) with the new ruleset.

Full working code can be found in `samples/landlock/sandboxer.c`_.

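The snippets above call `landlock_create_ruleset`, `landlock_add_rule` and
`landlock_restrict_self` as ordinary functions.  The C library may not provide
such wrappers, in which case they can be defined on top of
:manpage:`syscall(2)`.  A minimal sketch, assuming kernel headers recent enough
to ship ``linux/landlock.h`` and the ``__NR_landlock_*`` syscall numbers:

.. code-block:: c

    #define _GNU_SOURCE
    #include <linux/landlock.h>
    #include <stddef.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    /* Thin wrappers, only defined if nothing else already provides them. */
    #ifndef landlock_create_ruleset
    static inline int
    landlock_create_ruleset(const struct landlock_ruleset_attr *const attr,
                            const size_t size, const __u32 flags)
    {
        return syscall(__NR_landlock_create_ruleset, attr, size, flags);
    }
    #endif

    #ifndef landlock_add_rule
    static inline int landlock_add_rule(const int ruleset_fd,
                                        const enum landlock_rule_type rule_type,
                                        const void *const rule_attr,
                                        const __u32 flags)
    {
        return syscall(__NR_landlock_add_rule, ruleset_fd, rule_type,
                       rule_attr, flags);
    }
    #endif

    #ifndef landlock_restrict_self
    static inline int landlock_restrict_self(const int ruleset_fd,
                                             const __u32 flags)
    {
        return syscall(__NR_landlock_restrict_self, ruleset_fd, flags);
    }
    #endif

As with any :manpage:`syscall(2)` based wrapper, a failure is reported as a
return value of -1 with `errno` set accordingly.
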
Good practices
--------------

It is recommended to set access rights on file hierarchy leaves as much as
possible.  For instance, it is better to be able to have ``~/doc/`` as a
read-only hierarchy and ``~/tmp/`` as a read-write hierarchy, compared to
``~/`` as a read-only hierarchy and ``~/tmp/`` as a read-write hierarchy.
Following this good practice leads to self-sufficient hierarchies that don't
depend on their location (i.e. parent directories).  This is particularly
relevant when we want to allow linking or renaming.  Indeed, having consistent
access rights per directory makes it possible to change the location of such a
directory without relying on the destination directory access rights (except
those that are required for this operation, see `LANDLOCK_ACCESS_FS_REFER`
documentation).  Having self-sufficient hierarchies also helps to tighten the
required access rights to the minimal set of data.  This also helps avoid
sinkhole directories, i.e. directories where data can be linked to but not
linked from.  However, this depends on data organization, which might not be
controlled by developers.  In this case, granting read-write access to
``~/tmp/``, instead of write-only access, would potentially allow moving
``~/tmp/`` to a non-readable directory while still keeping the ability to list
the content of ``~/tmp/``.

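The ``~/doc/`` and ``~/tmp/`` example can be written down as a small table of
leaf rules.  This is only an illustrative sketch: `struct leaf_rule`, the two
access groupings and the paths (taken relative to the home directory) are
assumptions made for this example, not part of the Landlock API.

.. code-block:: c

    #include <linux/landlock.h>

    /* Fallback for UAPI headers older than ABI version 2. */
    #ifndef LANDLOCK_ACCESS_FS_REFER
    #define LANDLOCK_ACCESS_FS_REFER (1ULL << 13)
    #endif

    /* Illustrative groupings of access rights. */
    #define ACCESS_FS_ROUGHLY_READ ( \
        LANDLOCK_ACCESS_FS_EXECUTE | \
        LANDLOCK_ACCESS_FS_READ_FILE | \
        LANDLOCK_ACCESS_FS_READ_DIR)

    #define ACCESS_FS_ROUGHLY_WRITE ( \
        LANDLOCK_ACCESS_FS_WRITE_FILE | \
        LANDLOCK_ACCESS_FS_REMOVE_DIR | \
        LANDLOCK_ACCESS_FS_REMOVE_FILE | \
        LANDLOCK_ACCESS_FS_MAKE_CHAR | \
        LANDLOCK_ACCESS_FS_MAKE_DIR | \
        LANDLOCK_ACCESS_FS_MAKE_REG | \
        LANDLOCK_ACCESS_FS_MAKE_SOCK | \
        LANDLOCK_ACCESS_FS_MAKE_FIFO | \
        LANDLOCK_ACCESS_FS_MAKE_BLOCK | \
        LANDLOCK_ACCESS_FS_MAKE_SYM | \
        LANDLOCK_ACCESS_FS_REFER)

    /* Hypothetical description of one rule tied to a leaf hierarchy. */
    struct leaf_rule {
        const char *path; /* relative to the home directory */
        __u64 allowed_access;
    };

    /* One self-sufficient rule per leaf, none on the common parent. */
    static const struct leaf_rule leaf_rules[] = {
        { "doc", ACCESS_FS_ROUGHLY_READ },
        { "tmp", ACCESS_FS_ROUGHLY_READ | ACCESS_FS_ROUGHLY_WRITE },
    };

Each entry would then be opened with ``O_PATH`` and added with
`landlock_add_rule` as shown earlier, after removing the access rights that the
running ABI version does not support.
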
Layers of file path access rights
---------------------------------

Each time a thread enforces a ruleset on itself, it updates its Landlock domain
with a new layer of policy.  Indeed, this complementary policy is stacked with
the potentially other rulesets already restricting this thread.  A sandboxed
thread can then safely add more constraints to itself with a new enforced
ruleset.

One policy layer grants access to a file path if at least one of its rules
encountered on the path grants the access.  A sandboxed thread can only access
a file path if all its enforced policy layers grant the access as well as all
the other system access controls (e.g. filesystem DAC, other LSM policies,
etc.).

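The evaluation described above can be illustrated with a toy user space model.
This is not how the kernel implements it: the `toy_*` types, the string-based
path matching and the `TOY_READ`/`TOY_WRITE` bits are assumptions made only for
this sketch.  Within a layer, the rules found on the path are combined; across
layers, every layer must agree.

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    #define TOY_READ  (1ULL << 0)
    #define TOY_WRITE (1ULL << 1)

    struct toy_rule {
        const char *hierarchy; /* absolute path, no trailing slash */
        uint64_t allowed;
    };

    struct toy_layer {
        uint64_t handled; /* accesses denied by default by this layer */
        const struct toy_rule *rules;
        size_t num_rules;
    };

    /* True if @path is @hierarchy itself or lies beneath it. */
    static bool is_beneath(const char *path, const char *hierarchy)
    {
        const size_t len = strlen(hierarchy);

        if (len == 1 && hierarchy[0] == '/')
            return path[0] == '/';
        if (strncmp(path, hierarchy, len) != 0)
            return false;
        return path[len] == '\0' || path[len] == '/';
    }

    /* A layer grants what the rules encountered on the path grant. */
    static bool layer_allows(const struct toy_layer *layer, const char *path,
                             uint64_t request)
    {
        uint64_t granted = 0;

        for (size_t i = 0; i < layer->num_rules; i++) {
            if (is_beneath(path, layer->rules[i].hierarchy))
                granted |= layer->rules[i].allowed;
        }
        /* Only handled accesses can be denied by this layer. */
        return ((request & layer->handled) & ~granted) == 0;
    }

    /* Every stacked layer must grant the requested access. */
    static bool domain_allows(const struct toy_layer *layers, size_t num_layers,
                              const char *path, uint64_t request)
    {
        for (size_t i = 0; i < num_layers; i++) {
            if (!layer_allows(&layers[i], path, request))
                return false;
        }
        return true;
    }

For example, with a first layer allowing read and write beneath ``/tmp`` and a
second layer that handles writes but only allows reads beneath ``/tmp/build``,
a write to ``/tmp/build/a.o`` is granted by the first layer alone but denied
once both layers are stacked.
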
Bind mounts and OverlayFS
-------------------------

Landlock makes it possible to restrict access to file hierarchies, which means
that these access rights can be propagated with bind mounts (cf.
Documentation/filesystems/sharedsubtree.rst) but not with
Documentation/filesystems/overlayfs.rst.

A bind mount mirrors a source file hierarchy to a destination.  The destination
hierarchy is then composed of the exact same files, on which Landlock rules can
be tied, either via the source or the destination path.  These rules restrict
access when they are encountered on a path, which means that they can restrict
access to multiple file hierarchies at the same time, whether these hierarchies
are the result of bind mounts or not.

An OverlayFS mount point consists of upper and lower layers.  These layers are
combined in a merge directory, which is the result of the mount point.  This
merge hierarchy may include files from the upper and lower layers, but
modifications performed on the merge hierarchy are only reflected on the upper
layer.  From a Landlock policy point of view, each OverlayFS layer and merge
hierarchy is standalone and contains its own set of files and directories,
which is different from bind mounts.  A policy restricting an OverlayFS layer
will not restrict the resulting merged hierarchy, and vice versa.  Landlock
users should then only think about file hierarchies they want to allow access
to, regardless of the underlying filesystem.

Inheritance
-----------

Every new thread resulting from a :manpage:`clone(2)` inherits Landlock domain
restrictions from its parent.  This is similar to the seccomp inheritance (cf.
Documentation/userspace-api/seccomp_filter.rst) or any other LSM dealing with
task's :manpage:`credentials(7)`.  For instance, one process's thread may apply
Landlock rules to itself, but they will not be automatically applied to other
sibling threads (unlike POSIX thread credential changes, cf.
:manpage:`nptl(7)`).

When a thread sandboxes itself, we have the guarantee that the related security
policy will stay enforced on all this thread's descendants.  This allows
creating standalone and modular security policies per application, which will
automatically be composed between themselves according to their runtime parent
policies.

Ptrace restrictions
-------------------

A sandboxed process has fewer privileges than a non-sandboxed process and must
then be subject to additional restrictions when manipulating another process.
To be allowed to use :manpage:`ptrace(2)` and related syscalls on a target
process, a sandboxed process should have a subset of the target process rules,
which means the tracee must be in a sub-domain of the tracer.

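The sub-domain condition can be pictured with a toy model in which every
enforced ruleset creates a new domain whose parent is the previous one.  The
`toy_domain` type and `may_ptrace()` are hypothetical names for this sketch and
only model the rule stated above, not the kernel code.

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>

    /* Toy Landlock domain: one node per enforced ruleset. */
    struct toy_domain {
        const struct toy_domain *parent; /* NULL for the first layer */
    };

    /*
     * True if a tracer in domain @tracer may trace a task in domain
     * @tracee.  A NULL domain stands for a task that is not sandboxed.
     */
    static bool may_ptrace(const struct toy_domain *tracer,
                           const struct toy_domain *tracee)
    {
        const struct toy_domain *walker;

        if (!tracer)
            return true; /* Landlock adds no restriction here. */
        /* The tracee must be in the tracer's domain or beneath it. */
        for (walker = tracee; walker; walker = walker->parent) {
            if (walker == tracer)
                return true;
        }
        return false;
    }

In this model a sandboxed tracer can trace tasks in its own domain or in a
domain nested below it, but neither its non-sandboxed parent nor a task in an
unrelated domain.
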
Compatibility
=============

Backward and forward compatibility
----------------------------------

Landlock is designed to be compatible with past and future versions of the
kernel.  This is achieved thanks to the system call attributes and the
associated bitflags, particularly the ruleset's `handled_access_fs`.  Making
handled access rights explicit enables the kernel and user space to have a
clear contract with each other.  This is required to make sure sandboxing will
not get stricter with a system update, which could break applications.

Developers can subscribe to the `Landlock mailing list
<https://subspace.kernel.org/lists.linux.dev.html>`_ to knowingly update and
test their applications with the latest available features.  In the interest of
users, and because they may use different kernel versions, it is strongly
encouraged to follow a best-effort security approach by checking the Landlock
ABI version at runtime and only enforcing the supported features.

.. _landlock_abi_versions:

Landlock ABI versions
---------------------

The Landlock ABI version can be read with the sys_landlock_create_ruleset()
system call:

.. code-block:: c

    int abi;

    abi = landlock_create_ruleset(NULL, 0, LANDLOCK_CREATE_RULESET_VERSION);
    if (abi < 0) {
        switch (errno) {
        case ENOSYS:
            printf("Landlock is not supported by the current kernel.\n");
            break;
        case EOPNOTSUPP:
            printf("Landlock is currently disabled.\n");
            break;
        }
        return 0;
    }
    if (abi >= 2) {
        printf("Landlock supports LANDLOCK_ACCESS_FS_REFER.\n");
    }

The following kernel interfaces are implicitly supported by the first ABI
version.  Features only supported from a specific version are explicitly marked
as such.

Kernel interface
================

Access rights
-------------

.. kernel-doc:: include/uapi/linux/landlock.h
    :identifiers: fs_access

Creating a new ruleset
----------------------

.. kernel-doc:: security/landlock/syscalls.c
    :identifiers: sys_landlock_create_ruleset

.. kernel-doc:: include/uapi/linux/landlock.h
    :identifiers: landlock_ruleset_attr

Extending a ruleset
-------------------

.. kernel-doc:: security/landlock/syscalls.c
    :identifiers: sys_landlock_add_rule

.. kernel-doc:: include/uapi/linux/landlock.h
    :identifiers: landlock_rule_type landlock_path_beneath_attr

Enforcing a ruleset
-------------------

.. kernel-doc:: security/landlock/syscalls.c
    :identifiers: sys_landlock_restrict_self

Current limitations
===================

Filesystem topology modification
--------------------------------

As for file renaming and linking, a sandboxed thread cannot modify its
filesystem topology, whether via :manpage:`mount(2)` or
:manpage:`pivot_root(2)`.  However, :manpage:`chroot(2)` calls are not denied.

Special filesystems
-------------------

Access to regular files and directories can be restricted by Landlock,
according to the handled accesses of a ruleset.  However, files that do not
come from a user-visible filesystem (e.g. pipe, socket), but can still be
accessed through ``/proc/<pid>/fd/*``, cannot currently be explicitly
restricted.  Likewise, some special kernel filesystems such as nsfs, which can
be accessed through ``/proc/<pid>/ns/*``, cannot currently be explicitly
restricted.  However, thanks to the `ptrace restrictions`_, access to such
sensitive ``/proc`` files is automatically restricted according to domain
hierarchies.  Future Landlock evolutions could still make it possible to
explicitly restrict such paths with dedicated ruleset flags.

Ruleset layers
--------------

There is a limit of 16 layers of stacked rulesets.  This can be an issue for a
task willing to enforce a new ruleset in complement to its 16 inherited
rulesets.  Once this limit is reached, sys_landlock_restrict_self() returns
E2BIG.  It is then strongly suggested to carefully build rulesets once in the
life of a thread, especially for applications able to launch other applications
that may also want to sandbox themselves (e.g. shells, container managers,
etc.).

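An application that may be started from an already deeply sandboxed parent can
report the E2BIG case separately from other failures.  A minimal sketch,
assuming kernel headers that define ``__NR_landlock_restrict_self``;
`restrict_self_or_explain()` is a hypothetical helper name and the messages are
only examples.

.. code-block:: c

    #define _GNU_SOURCE
    #include <errno.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    /*
     * Hypothetical helper: tries to enforce @ruleset_fd on the calling
     * thread.  Returns 0 on success, -1 otherwise.
     */
    static int restrict_self_or_explain(const int ruleset_fd)
    {
        int err;

        if (syscall(__NR_landlock_restrict_self, ruleset_fd, 0) == 0)
            return 0;

        err = errno; /* saved before stdio can overwrite it */
        if (err == E2BIG)
            fprintf(stderr, "Layer limit reached: inherited rulesets "
                    "stay enforced, no new layer was added\n");
        else
            fprintf(stderr, "Failed to enforce ruleset: %s\n",
                    strerror(err));
        errno = err;
        return -1;
    }

Whether to carry on with only the inherited restrictions or to abort is a
policy decision left to the application.
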
Memory usage
------------

Kernel memory allocated to create rulesets is accounted and can be restricted
by the Documentation/admin-guide/cgroup-v1/memory.rst.

Previous limitations
====================

File renaming and linking (ABI 1)
---------------------------------

Because Landlock targets unprivileged access controls, it needs to properly
handle composition of rules.  Such property also implies rules nesting.
Properly handling multiple layers of rulesets, each one of them able to
restrict access to files, also implies inheritance of the ruleset restrictions
from a parent to its hierarchy.  Because files are identified and restricted by
their hierarchy, moving or linking a file from one directory to another implies
propagation of the hierarchy constraints, or restriction of these actions
according to the potentially lost constraints.  To protect against privilege
escalations through renaming or linking, and for the sake of simplicity,
Landlock previously limited linking and renaming to the same directory.
Starting with the Landlock ABI version 2, it is now possible to securely
control renaming and linking thanks to the new `LANDLOCK_ACCESS_FS_REFER`
access right.

.. _kernel_support:

Kernel support
==============

Landlock was first introduced in Linux 5.13 but it must be configured at build
time with `CONFIG_SECURITY_LANDLOCK=y`.  Landlock must also be enabled at boot
time like the other security modules.  The list of security modules enabled by
default is set with `CONFIG_LSM`.  The kernel configuration should then
contain `CONFIG_LSM=landlock,[...]` with `[...]` as the list of other
potentially useful security modules for the running system (see the
`CONFIG_LSM` help).

If the running kernel doesn't have `landlock` in `CONFIG_LSM`, then we can
still enable it by adding ``lsm=landlock,[...]`` to
Documentation/admin-guide/kernel-parameters.rst thanks to the bootloader
configuration.

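Both `CONFIG_LSM` and the ``lsm=`` parameter are comma-separated lists, and the
list of security modules active on a running kernel can be read in the same
form from ``/sys/kernel/security/lsm`` when securityfs is mounted.  The
following sketch checks such a list for one entry; `lsm_list_contains()` is a
hypothetical helper name introduced for this example.

.. code-block:: c

    #include <stdbool.h>
    #include <string.h>

    /* True if @name is one of the comma-separated entries of @list. */
    static bool lsm_list_contains(const char *list, const char *name)
    {
        const size_t name_len = strlen(name);

        while (*list) {
            /* Length of the current entry, up to a comma or newline. */
            const size_t entry_len = strcspn(list, ",\n");

            if (entry_len == name_len &&
                strncmp(list, name, name_len) == 0)
                return true;
            list += entry_len;
            if (*list)
                list++; /* skip the separator */
        }
        return false;
    }

Comparing whole entries rather than searching for a substring avoids matching
a module whose name merely contains ``landlock``.
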
Questions and answers
=====================

What about user space sandbox managers?
---------------------------------------

Using a user space process to enforce restrictions on kernel resources can
lead to race conditions or inconsistent evaluations (i.e. `Incorrect mirroring
of the OS code and state
<https://www.ndss-symposium.org/ndss2003/traps-and-pitfalls-practical-problems-system-call-interposition-based-security-tools/>`_).

What about namespaces and containers?
-------------------------------------

Namespaces can help create sandboxes but they are not designed for
access-control and therefore lack useful features for such a use case (e.g. no
fine-grained restrictions).  Moreover, their complexity can lead to security
issues, especially when untrusted processes can manipulate them (cf.
`Controlling access to user namespaces <https://lwn.net/Articles/673597/>`_).

Additional documentation
========================

* Documentation/security/landlock.rst
* https://landlock.io

.. Links
.. _samples/landlock/sandboxer.c:
   https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/samples/landlock/sandboxer.c
0437 
0438 Additional documentation
0439 ========================
0440 
0441 * Documentation/security/landlock.rst
0442 * https://landlock.io
0443 
0444 .. Links
0445 .. _samples/landlock/sandboxer.c:
0446    https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/samples/landlock/sandboxer.c