.. SPDX-License-Identifier: GPL-2.0

==============
Nitro Enclaves
==============

Overview
========

Nitro Enclaves (NE) is an Amazon Elastic Compute Cloud (EC2) capability that
allows customers to carve out isolated compute environments within EC2
instances [1].

For example, an application that processes sensitive data and runs in a VM can
be separated from other applications running in the same VM. This application
then runs in a VM separate from the primary VM, namely an enclave. The enclave
runs alongside the VM that spawned it. This setup matches the needs of
low-latency applications.

The currently supported architectures for the NE kernel driver, available in
the upstream Linux kernel, are x86 and ARM64.

The resources that are allocated for the enclave, such as memory and CPUs, are
carved out of the primary VM. Each enclave is mapped to a process running in the
primary VM that communicates with the NE kernel driver via an ioctl interface.

In this sense, there are two components:

1. An enclave abstraction process - a user space process running in the primary
   VM guest that uses the provided ioctl interface of the NE driver to spawn an
   enclave VM (that's 2 below).

   There is an NE emulated PCI device exposed to the primary VM. The driver for
   this PCI device is included in the NE driver.

   The ioctl logic is mapped to PCI device commands, e.g. the NE_START_ENCLAVE
   ioctl maps to an enclave start PCI command. The PCI device commands are then
   translated into actions taken on the hypervisor side, that is, by the Nitro
   hypervisor running on the host where the primary VM is running. The Nitro
   hypervisor is based on core KVM technology.

2. The enclave itself - a VM running on the same host as the primary VM that
   spawned it. Memory and CPUs are carved out of the primary VM and are
   dedicated to the enclave VM. An enclave does not have persistent storage
   attached.

The memory regions carved out of the primary VM and given to an enclave need to
be aligned, physically contiguous memory regions of 2 MiB or 1 GiB (or a
multiple of this size, e.g. 8 MiB). The memory can be allocated, for example,
by using hugetlbfs from user space [2][3][7]. The memory size for an enclave
needs to be at least 64 MiB. The enclave memory and CPUs need to be from the
same NUMA node.
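The size rules above can be pre-checked in user space before any memory is
handed to the driver. The following C sketch is only an illustration: the
macro and function names are hypothetical, the alignment check applies to the
user space address rather than the physical one, and physical contiguity and
NUMA placement cannot be verified this way.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NE_MIB (1024ULL * 1024ULL)
#define NE_MEM_REGION_GRANULE (2ULL * NE_MIB)	/* smallest region size */
#define NE_MIN_ENCLAVE_MEM (64ULL * NE_MIB)	/* minimum total enclave memory */

/* A region passes if its address and non-zero size are multiples of 2 MiB. */
bool ne_mem_region_is_valid(uint64_t addr, uint64_t size)
{
	if (size == 0)
		return false;

	return (addr % NE_MEM_REGION_GRANULE) == 0 &&
	       (size % NE_MEM_REGION_GRANULE) == 0;
}

/* Sum all region sizes and compare the total against the 64 MiB minimum. */
bool ne_enclave_mem_is_sufficient(const uint64_t *sizes, size_t nr_regions)
{
	uint64_t total = 0;
	size_t i;

	for (i = 0; i < nr_regions; i++)
		total += sizes[i];

	return total >= NE_MIN_ENCLAVE_MEM;
}
```

With these helpers, an 8 MiB region at a 2 MiB aligned address is accepted,
while a 3 MiB region is rejected because it is not a multiple of 2 MiB.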

An enclave runs on dedicated cores. CPU 0 and its CPU siblings need to remain
available for the primary VM. A CPU pool has to be set for NE purposes by a
user with admin capability. See the cpu list section of the kernel
documentation [4] for the CPU pool format.
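As an illustration of the CPU pool constraints, the following C sketch parses
a simplified cpu list and checks that CPU 0 is left out of it. It is a sketch
under stated assumptions: only the plain "N" and "N-M" forms of the cpu list
format are handled, the 64 CPU limit is arbitrary, the function names are
hypothetical, and the siblings of CPU 0 are not checked because that requires
topology information from the running system.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define NE_MAX_CPUS 64	/* illustrative limit so a pool fits in one uint64_t */

/*
 * Parse a cpu list such as "2,3,6-7" into a bitmask (bit N set means CPU N
 * is in the pool). Returns 0 on success and -1 on an empty or malformed
 * list, a reversed range, or a CPU id >= NE_MAX_CPUS.
 */
int ne_parse_cpu_list(const char *list, uint64_t *mask)
{
	const char *p = list;

	*mask = 0;
	if (*p == '\0')
		return -1;

	while (*p != '\0') {
		unsigned long first, last, cpu;
		char *end;

		if (*p < '0' || *p > '9')
			return -1;
		first = strtoul(p, &end, 10);
		last = first;
		p = end;

		if (*p == '-') {
			p++;
			if (*p < '0' || *p > '9')
				return -1;
			last = strtoul(p, &end, 10);
			p = end;
		}

		if (first > last || last >= NE_MAX_CPUS)
			return -1;
		for (cpu = first; cpu <= last; cpu++)
			*mask |= UINT64_C(1) << cpu;

		if (*p == ',') {
			p++;
			if (*p == '\0')
				return -1;	/* trailing comma */
		} else if (*p != '\0') {
			return -1;
		}
	}

	return 0;
}

/* CPU 0 has to stay with the primary VM, so it must not be in the pool. */
bool ne_cpu_pool_keeps_cpu0(uint64_t mask)
{
	return (mask & UINT64_C(1)) == 0;
}
```

For example, the list "2,3,6-7" yields a mask with bits 2, 3, 6 and 7 set and
passes the CPU 0 check, while "0-1" parses correctly but fails it.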

An enclave communicates with the primary VM via a local communication channel,
using virtio-vsock [5]. The primary VM has a virtio-pci vsock emulated device,
while the enclave VM has a virtio-mmio vsock emulated device. The vsock device
uses eventfd for signaling. The enclave VM sees the usual interfaces - local
APIC and IOAPIC - to get interrupts from the virtio-vsock device. The
virtio-mmio device is placed in memory below the typical 4 GiB.

The application that runs in the enclave needs to be packaged in an enclave
image together with the OS (e.g. kernel, ramdisk, init) that will run in the
enclave VM. The enclave VM has its own kernel and follows the standard Linux
boot protocol [6][8].

The kernel bzImage, the kernel command line and the ramdisk(s) are part of the
Enclave Image Format (EIF), together with an EIF header that includes metadata
such as the magic number, EIF version, image size and CRC.

Hash values are computed for the entire enclave image (EIF), the kernel and the
ramdisk(s). They are used, for example, to check that the enclave image loaded
in the enclave VM is the one that was intended to be run.

These cryptographic measurements are included in a signed attestation document
generated by the Nitro Hypervisor and further used to prove the identity of the
enclave; KMS is an example of a service that NE is integrated with and that
checks the attestation document.

The enclave image (EIF) is loaded in the enclave memory at offset 8 MiB. The
init process in the enclave connects to the vsock CID of the primary VM and a
predefined port - 9000 - to send a heartbeat value - 0xb7. This mechanism lets
the primary VM check that the enclave has booted. The CID of the primary VM
is 3.
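The primary VM side of this heartbeat could look roughly like the C sketch
below. The CID, port and heartbeat value come from the paragraph above; the
socket setup (a stream vsock socket bound to that CID and port, reading a
single byte) is an assumption, and the macro and function names are
hypothetical.

```c
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#include <linux/vm_sockets.h>

#define NE_HEARTBEAT_CID 3	/* CID of the primary VM */
#define NE_HEARTBEAT_PORT 9000
#define NE_HEARTBEAT_VALUE 0xb7

/* Fill in the vsock address on which the primary VM expects the heartbeat. */
void ne_heartbeat_addr(struct sockaddr_vm *addr)
{
	memset(addr, 0, sizeof(*addr));
	addr->svm_family = AF_VSOCK;
	addr->svm_cid = NE_HEARTBEAT_CID;
	addr->svm_port = NE_HEARTBEAT_PORT;
}

/*
 * Create a listening vsock socket for the heartbeat. Returns the socket fd,
 * or -1 if vsock is unavailable or the address cannot be bound.
 */
int ne_heartbeat_listen(void)
{
	struct sockaddr_vm addr;
	int fd;

	fd = socket(AF_VSOCK, SOCK_STREAM, 0);
	if (fd < 0)
		return -1;

	ne_heartbeat_addr(&addr);
	if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
	    listen(fd, 1) < 0) {
		close(fd);
		return -1;
	}

	return fd;
}

/*
 * Accept one connection and read a single byte from it. Returns 1 if the
 * byte equals the heartbeat value, 0 if it does not, and -1 on I/O errors.
 */
int ne_heartbeat_wait(int listen_fd)
{
	uint8_t value = 0;
	ssize_t n;
	int conn_fd;

	conn_fd = accept(listen_fd, NULL, NULL);
	if (conn_fd < 0)
		return -1;

	n = read(conn_fd, &value, sizeof(value));
	close(conn_fd);
	if (n != (ssize_t)sizeof(value))
		return -1;

	return value == NE_HEARTBEAT_VALUE;
}
```

In this sketch the listening socket would be created before the enclave is
started, so that the connection from the enclave init process is not missed.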

If the enclave VM crashes or gracefully exits, an interrupt event is received by
the NE driver. This event is then forwarded to the user space enclave process
running in the primary VM via a poll notification mechanism. The user space
enclave process can then exit.
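The poll notification described above can be consumed with poll(2) on the
enclave file descriptor. The C sketch below assumes that the exit event is
reported as a hang-up (POLLHUP), which the text above does not state;
ne_wait_for_enclave_exit() is a hypothetical helper name.

```c
#include <poll.h>

/*
 * Wait up to timeout_ms for an event on the enclave fd.
 * Returns 1 if a hang-up was reported (taken here to mean that the enclave
 * exited or crashed), 0 on timeout or any other event, and -1 if poll()
 * itself failed.
 */
int ne_wait_for_enclave_exit(int enclave_fd, int timeout_ms)
{
	struct pollfd pfd = {
		.fd = enclave_fd,
		.events = POLLIN | POLLERR | POLLHUP,
	};
	int rc;

	rc = poll(&pfd, 1, timeout_ms);
	if (rc < 0)
		return -1;
	if (rc == 0)
		return 0;

	return (pfd.revents & POLLHUP) ? 1 : 0;
}
```

The user space enclave process could call this helper in a loop and release
its resources and exit once it returns 1.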

[1] https://aws.amazon.com/ec2/nitro/nitro-enclaves/
[2] https://www.kernel.org/doc/html/latest/admin-guide/mm/hugetlbpage.html
[3] https://lwn.net/Articles/807108/
[4] https://www.kernel.org/doc/html/latest/admin-guide/kernel-parameters.html
[5] https://man7.org/linux/man-pages/man7/vsock.7.html
[6] https://www.kernel.org/doc/html/latest/x86/boot.html
[7] https://www.kernel.org/doc/html/latest/arm64/hugetlbpage.html
[8] https://www.kernel.org/doc/html/latest/arm64/booting.html