0001 MDS - Microarchitectural Data Sampling
0002 ======================================
0003
0004 Microarchitectural Data Sampling is a hardware vulnerability which allows
0005 unprivileged speculative access to data which is available in various CPU
0006 internal buffers.
0007
0008 Affected processors
0009 -------------------
0010
0011 This vulnerability affects a wide range of Intel processors. The
0012 vulnerability is not present on:
0013
0014 - Processors from AMD, Centaur and other non-Intel vendors
0015
0016 - Older processor models, where the CPU family is < 6
0017
0018 - Some Atoms (Bonnell, Saltwell, Goldmont, GoldmontPlus)
0019
0020 - Intel processors which have the ARCH_CAP_MDS_NO bit set in the
0021   IA32_ARCH_CAPABILITIES MSR.
0022
0023 Whether a processor is affected or not can be read out from the MDS
0024 vulnerability file in sysfs. See :ref:`mds_sys_info`.
0025
0026 Not all processors are affected by all variants of MDS, but the mitigation
0027 is identical for all of them, so the kernel treats them as a single
0028 vulnerability.
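
The ARCH_CAP_MDS_NO criterion from the list above amounts to a single bit
test. The following is a minimal sketch, assuming that the bit is bit 5 of
IA32_ARCH_CAPABILITIES (MSR 0x10a) as defined in the kernel's msr-index.h.
The helper name is hypothetical, and obtaining the raw MSR value (for
example via the msr driver as root) is left out.

```python
# Assumed constants, mirroring arch/x86/include/asm/msr-index.h.
MSR_IA32_ARCH_CAPABILITIES = 0x10A
ARCH_CAP_MDS_NO = 1 << 5


def cpu_reports_mds_no(arch_capabilities: int) -> bool:
    """Return True if the raw MSR value has the MDS_NO bit set.

    Hypothetical helper: the caller supplies the 64-bit value read from
    MSR_IA32_ARCH_CAPABILITIES.
    """
    return bool(arch_capabilities & ARCH_CAP_MDS_NO)


if __name__ == "__main__":
    # Illustrative raw values only, not taken from real hardware.
    for raw in (0x20, 0x0B):
        print(hex(raw), cpu_reports_mds_no(raw))
```

In practice the sysfs file described in :ref:`mds_sys_info` is the simpler
way to find out whether a processor is affected.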
0029
0030 Related CVEs
0031 ------------
0032
0033 The following CVE entries are related to the MDS vulnerability:
0034
0035 ============== ===== ===================================================
0036 CVE-2018-12126 MSBDS Microarchitectural Store Buffer Data Sampling
0037 CVE-2018-12130 MFBDS Microarchitectural Fill Buffer Data Sampling
0038 CVE-2018-12127 MLPDS Microarchitectural Load Port Data Sampling
0039 CVE-2019-11091 MDSUM Microarchitectural Data Sampling Uncacheable Memory
0040 ============== ===== ===================================================
0041
0042 Problem
0043 -------
0044
0045 When performing store, load, or L1 refill operations, processors write data
0046 into temporary microarchitectural structures (buffers). The data in the
0047 buffer can be forwarded to load operations as an optimization.
0048
0049 Under certain conditions, usually a fault/assist caused by a load
0050 operation, data unrelated to the load memory address can be speculatively
0051 forwarded from the buffers. Because the load operation causes a fault or
0052 assist and its result will be discarded, the forwarded data will not cause
0053 incorrect program execution or state changes. But a malicious operation
0054 may be able to forward this speculative data to a disclosure gadget, which
0055 in turn allows the value to be inferred via a cache side channel attack.
0056
0057 Because the buffers are potentially shared between Hyper-Threads, cross
0058 Hyper-Thread attacks are possible.
0059
0060 Deeper technical information is available in the MDS specific x86
0061 architecture section: :ref:`Documentation/x86/mds.rst <mds>`.
0062
0063
0064 Attack scenarios
0065 ----------------
0066
0067 Attacks against the MDS vulnerabilities can be mounted from malicious
0068 unprivileged user space applications running on hosts or guests. Malicious
0069 guest OSes can obviously mount attacks as well.
0070
0071 Contrary to other speculation based vulnerabilities, the MDS vulnerability
0072 does not allow the attacker to control the memory target address. As a
0073 consequence, the attacks are purely sampling based, but as demonstrated with
0074 the TLBleed attack, samples can be postprocessed successfully.
0075
0076 Web browsers
0077 ^^^^^^^^^^^^
0078
0079 It's unclear whether attacks through web browsers are possible at
0080 all. Exploitation through JavaScript is considered very unlikely,
0081 but other widely used web technologies like WebAssembly could possibly be
0082 abused.
0083
0084
0085 .. _mds_sys_info:
0086
0087 MDS system information
0088 -----------------------
0089
0090 The Linux kernel provides a sysfs interface to enumerate the current MDS
0091 status of the system: whether the system is vulnerable, and which
0092 mitigations are active. The relevant sysfs file is:
0093
0094 /sys/devices/system/cpu/vulnerabilities/mds
0095
0096 The possible values in this file are:
0097
0098 .. list-table::
0099
0100    * - 'Not affected'
0101      - The processor is not vulnerable
0102    * - 'Vulnerable'
0103      - The processor is vulnerable, but no mitigation is enabled
0104    * - 'Vulnerable: Clear CPU buffers attempted, no microcode'
0105      - The processor is vulnerable but microcode is not updated.
0106
0107        The mitigation is enabled on a best effort basis. See :ref:`vmwerv`.
0108    * - 'Mitigation: Clear CPU buffers'
0109      - The processor is vulnerable and the CPU buffer clearing mitigation is
0110        enabled.
0111
0112 If the processor is vulnerable then the following information is appended
0113 to the above information:
0114
0115 ======================== ============================================
0116 'SMT vulnerable'         SMT is enabled
0117 'SMT mitigated'          SMT is enabled and mitigated
0118 'SMT disabled'           SMT is disabled
0119 'SMT Host state unknown' Kernel runs in a VM, Host SMT state unknown
0120 ======================== ============================================
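
A monitoring script can split the file content into the mitigation state
and the appended SMT information. The following is a minimal sketch, not
kernel code. It assumes that the SMT part is separated from the state by a
semicolon, as in "Mitigation: Clear CPU buffers; SMT vulnerable"; the
function and type names are hypothetical.

```python
from pathlib import Path
from typing import NamedTuple, Optional

MDS_SYSFS = Path("/sys/devices/system/cpu/vulnerabilities/mds")


class MdsStatus(NamedTuple):
    state: str           # e.g. 'Mitigation: Clear CPU buffers'
    smt: Optional[str]   # e.g. 'SMT vulnerable', or None if absent


def parse_mds_status(text: str) -> MdsStatus:
    """Split one line of the sysfs file into state and SMT information."""
    # Assumption: a ';' separates the state from the SMT suffix.
    state, _, smt = text.strip().partition(";")
    return MdsStatus(state.strip(), smt.strip() or None)


def read_mds_status(path: Path = MDS_SYSFS) -> Optional[MdsStatus]:
    """Return the parsed status, or None if the file cannot be read."""
    try:
        return parse_mds_status(path.read_text())
    except OSError:
        return None


if __name__ == "__main__":
    print(read_mds_status())
```

On kernels without MDS reporting or on non-Linux systems the file does not
exist, which is why the reader returns None instead of raising.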
0121
0122 .. _vmwerv:
0123
0124 Best effort mitigation mode
0125 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
0126
0127 If the processor is vulnerable, but the availability of the microcode based
0128 mitigation mechanism is not advertised via CPUID, the kernel selects a best
0129 effort mitigation mode. This mode invokes the mitigation instructions
0130 without a guarantee that they clear the CPU buffers.
0131
0132 This is done to address virtualization scenarios where the host has the
0133 microcode update applied, but the hypervisor is not yet updated to expose
0134 the CPUID to the guest. If the host has updated microcode, the protection
0135 takes effect; otherwise a few CPU cycles are wasted pointlessly.
0136
0137 The state in the mds sysfs file reflects this situation accordingly.
0138
0139
0140 Mitigation mechanism
0141 -------------------------
0142
0143 The kernel detects the affected CPUs and the presence of the microcode
0144 which is required.
0145
0146 If a CPU is affected and the microcode is available, then the kernel
0147 enables the mitigation by default. The mitigation can be controlled at boot
0148 time via a kernel command line option. See
0149 :ref:`mds_mitigation_control_command_line`.
0150
0151 .. _cpu_buffer_clear:
0152
0153 CPU buffer clearing
0154 ^^^^^^^^^^^^^^^^^^^
0155
0156 The mitigation for MDS clears the affected CPU buffers on return to user
0157 space and when entering a guest.
0158
0159 If SMT is enabled, it also clears the buffers on idle entry when the CPU
0160 is only affected by MSBDS and not any other MDS variant, because the
0161 other variants cannot be protected against cross Hyper-Thread attacks.
0162
0163 For CPUs which are only affected by MSBDS the user space, guest and idle
0164 transition mitigations are sufficient and SMT is not affected.
0165
0166 .. _virt_mechanism:
0167
0168 Virtualization mitigation
0169 ^^^^^^^^^^^^^^^^^^^^^^^^^
0170
0171 The protection for host to guest transition depends on the L1TF
0172 vulnerability of the CPU:
0173
0174 - CPU is affected by L1TF:
0175
0176   If the L1D flush mitigation is enabled and up to date microcode is
0177   available, the L1D flush mitigation automatically protects the
0178   guest transition.
0179
0180   If the L1D flush mitigation is disabled then the MDS mitigation is
0181   invoked explicitly when the host MDS mitigation is enabled.
0182
0183   For details on L1TF and virtualization see:
0184   :ref:`Documentation/admin-guide/hw-vuln/l1tf.rst <mitigation_control_kvm>`.
0185
0186 - CPU is not affected by L1TF:
0187
0188   CPU buffers are flushed before entering the guest when the host MDS
0189   mitigation is enabled.
0190
0191 The resulting MDS protection matrix for the host to guest transition:
0192
0193 ============ ===== ============= ============ =================
0194 L1TF         MDS   VMX-L1FLUSH   Host MDS     MDS-State
0195
0196 Don't care   No    Don't care    N/A          Not affected
0197
0198 Yes          Yes   Disabled      Off          Vulnerable
0199
0200 Yes          Yes   Disabled      Full         Mitigated
0201
0202 Yes          Yes   Enabled       Don't care   Mitigated
0203
0204 No           Yes   N/A           Off          Vulnerable
0205
0206 No           Yes   N/A           Full         Mitigated
0207 ============ ===== ============= ============ =================
0208
0209 This only covers the host to guest transition, i.e. prevents leakage from
0210 host to guest, but does not protect the guest internally. Guests need to
0211 have their own protections.
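
The protection matrix above can be read as a small decision function. The
following is an illustrative sketch of the table only, not kernel code;
the function and parameter names are hypothetical, and
``vmx_l1d_flush=True`` is assumed to imply that up to date microcode is
available.

```python
def guest_entry_mds_state(l1tf: bool, mds: bool,
                          vmx_l1d_flush: bool, host_mds: str) -> str:
    """Return the MDS state for the host to guest transition.

    l1tf, mds      -- whether the CPU is affected by L1TF / MDS
    vmx_l1d_flush  -- whether the VMX L1D flush mitigation is enabled
    host_mds       -- host MDS mitigation, 'off' or 'full'
    """
    if not mds:
        return "Not affected"
    # On L1TF affected CPUs an enabled L1D flush already covers the
    # guest transition, regardless of the host MDS setting.
    if l1tf and vmx_l1d_flush:
        return "Mitigated"
    # Otherwise the CPU buffers are only cleared before guest entry
    # when the host MDS mitigation is enabled.
    return "Mitigated" if host_mds == "full" else "Vulnerable"


if __name__ == "__main__":
    print(guest_entry_mds_state(l1tf=True, mds=True,
                                vmx_l1d_flush=False, host_mds="off"))
```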
0212
0213 .. _xeon_phi:
0214
0215 XEON PHI specific considerations
0216 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
0217
0218 The XEON PHI processor family is affected by MSBDS which can be exploited
0219 cross Hyper-Threads when entering idle states. Some XEON PHI variants allow
0220 the use of MWAIT in user space (Ring 3), which opens a potential attack vector
0221 for malicious user space. The exposure can be disabled on the kernel
0222 command line with the 'ring3mwait=disable' command line option.
0223
0224 XEON PHI is not affected by the other MDS variants and MSBDS is mitigated
0225 before the CPU enters an idle state. As XEON PHI is not affected by L1TF
0226 either, disabling SMT is not required for full protection.
0227
0228 .. _mds_smt_control:
0229
0230 SMT control
0231 ^^^^^^^^^^^
0232
0233 All MDS variants except MSBDS can be attacked cross Hyper-Threads. That
0234 means on CPUs which are affected by MFBDS or MLPDS it is necessary to
0235 disable SMT for full protection. These are most of the affected CPUs; the
0236 exception is XEON PHI, see :ref:`xeon_phi`.
0237
0238 Disabling SMT can have a significant performance impact, but the impact
0239 depends on the type of workloads.
0240
0241 See the relevant chapter in the L1TF mitigation documentation for details:
0242 :ref:`Documentation/admin-guide/hw-vuln/l1tf.rst <smt_control>`.
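
The rule stated in this section can be summarized in a few lines. This is
a minimal sketch under the assumption that the set of MDS variants
affecting the CPU is already known; the function name is hypothetical and
the variant names are the abbreviations from the CVE table.

```python
from typing import Iterable

# Per this section, MSBDS is the only variant that does not require
# SMT to be disabled for full protection.
SMT_SAFE_VARIANTS = {"MSBDS"}


def smt_disable_required(affected_variants: Iterable[str]) -> bool:
    """Return True if full protection requires disabling SMT."""
    return bool(set(affected_variants) - SMT_SAFE_VARIANTS)


if __name__ == "__main__":
    print(smt_disable_required(["MSBDS"]))
    print(smt_disable_required(["MSBDS", "MFBDS", "MLPDS"]))
```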
0243
0244
0245 .. _mds_mitigation_control_command_line:
0246
0247 Mitigation control on the kernel command line
0248 ---------------------------------------------
0249
0250 The kernel command line allows the MDS mitigations to be controlled at boot
0251 time with the option "mds=". The valid arguments for this option are:
0252
0253 ============ =============================================================
0254 full         If the CPU is vulnerable, enable all available mitigations
0255              for the MDS vulnerability: CPU buffer clearing on exit to
0256              userspace and when entering a VM. Idle transitions are
0257              protected as well if SMT is enabled.
0258
0259              It does not automatically disable SMT.
0260
0261 full,nosmt   The same as mds=full, with SMT disabled on vulnerable
0262              CPUs. This is the complete mitigation.
0263
0264 off          Disables MDS mitigations completely.
0265
0266 ============ =============================================================
0267
0268 Not specifying this option is equivalent to "mds=full". For processors
0269 that are affected by both TAA (TSX Asynchronous Abort) and MDS,
0270 specifying just "mds=off" without an accompanying "tsx_async_abort=off"
0271 will have no effect as the same mitigation is used for both
0272 vulnerabilities.
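
The interaction between "mds=" and "tsx_async_abort=" can be illustrated
with a small model. This is a sketch of the rules described above, not the
kernel's option parser. It assumes an MDS affected CPU, that the last
occurrence of an option wins, that unrecognized arguments are ignored, and
that "tsx_async_abort=" also defaults to "full"; all names are
hypothetical.

```python
from typing import Dict

VALID_ARGS = ("full", "full,nosmt", "off")


def parse_options(cmdline: str) -> Dict[str, str]:
    """Extract the effective mds= and tsx_async_abort= arguments."""
    # Assumed defaults: an unspecified option behaves like 'full'.
    opts = {"mds": "full", "tsx_async_abort": "full"}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        # Assumption: unknown arguments leave the previous value in place.
        if sep and key in opts and value in VALID_ARGS:
            opts[key] = value
    return opts


def buffer_clearing_enabled(cmdline: str, taa_affected: bool) -> bool:
    """Return True if CPU buffer clearing stays active on an MDS CPU."""
    opts = parse_options(cmdline)
    if opts["mds"] != "off":
        return True
    # The same mitigation serves TAA, so it stays on for TAA affected
    # CPUs unless tsx_async_abort=off is given as well.
    return taa_affected and opts["tsx_async_abort"] != "off"


if __name__ == "__main__":
    print(buffer_clearing_enabled("quiet mds=off", taa_affected=True))
```

To actually turn the mitigation off on a CPU affected by both
vulnerabilities, both options have to be given, for example
"mds=off tsx_async_abort=off".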
0273
0274 Mitigation selection guide
0275 --------------------------
0276
0277 1. Trusted userspace
0278 ^^^^^^^^^^^^^^^^^^^^
0279
0280 If all userspace applications are from a trusted source and do not
0281 execute untrusted code which is supplied externally, then the mitigation
0282 can be disabled.
0283
0284
0285 2. Virtualization with trusted guests
0286 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
0287
0288 The same considerations as above for trusted user space apply.
0289
0290 3. Virtualization with untrusted guests
0291 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
0292
0293 The protection depends on the state of the L1TF mitigations.
0294 See :ref:`virt_mechanism`.
0295
0296 If the MDS mitigation is enabled and SMT is disabled, guest to host and
0297 guest to guest attacks are prevented.
0298
0299 .. _mds_default_mitigations:
0300
0301 Default mitigations
0302 -------------------
0303
0304 The kernel default mitigations for vulnerable processors are:
0305
0306 - Enable CPU buffer clearing
0307
0308 The kernel does not by default enforce the disabling of SMT, which leaves
0309 SMT systems vulnerable when running untrusted code. The same rationale as
0310 for L1TF applies.
0311 See :ref:`Documentation/admin-guide/hw-vuln/l1tf.rst <default_mitigations>`.