MDS - Microarchitectural Data Sampling
======================================

Microarchitectural Data Sampling is a hardware vulnerability which allows
unprivileged speculative access to data which is available in various CPU
internal buffers.

Affected processors
-------------------

This vulnerability affects a wide range of Intel processors. The
vulnerability is not present on:

   - Processors from AMD, Centaur and other non Intel vendors

   - Older processor models, where the CPU family is < 6

   - Some Atoms (Bonnell, Saltwell, Goldmont, GoldmontPlus)

   - Intel processors which have the ARCH_CAP_MDS_NO bit set in the
     IA32_ARCH_CAPABILITIES MSR.

Whether a processor is affected or not can be read out from the MDS
vulnerability file in sysfs. See :ref:`mds_sys_info`.

Not all processors are affected by all variants of MDS, but the mitigation
is identical for all of them so the kernel treats them as a single
vulnerability.

Related CVEs
------------

The following CVE entries are related to the MDS vulnerability:

   ==============  =====  ===================================================
   CVE-2018-12126  MSBDS  Microarchitectural Store Buffer Data Sampling
   CVE-2018-12130  MFBDS  Microarchitectural Fill Buffer Data Sampling
   CVE-2018-12127  MLPDS  Microarchitectural Load Port Data Sampling
   CVE-2019-11091  MDSUM  Microarchitectural Data Sampling Uncacheable Memory
   ==============  =====  ===================================================

Problem
-------

When performing store, load or L1 refill operations, processors write data
into temporary microarchitectural structures (buffers). The data in the
buffer can be forwarded to load operations as an optimization.

Under certain conditions, usually a fault/assist caused by a load
operation, data unrelated to the load memory address can be speculatively
forwarded from the buffers. Because the load operation causes a fault or
assist and its result will be discarded, the forwarded data will not cause
incorrect program execution or state changes. However, a malicious operation
may be able to forward this speculative data to a disclosure gadget, which
in turn allows the value to be inferred via a cache side channel attack.

Because the buffers are potentially shared between Hyper-Threads, cross
Hyper-Thread attacks are possible.

Deeper technical information is available in the MDS specific x86
architecture section: :ref:`Documentation/x86/mds.rst <mds>`.


Attack scenarios
----------------

Attacks against the MDS vulnerabilities can be mounted from malicious
unprivileged user space applications running on hosts or guests. Malicious
guest OSes can obviously mount attacks as well.

Contrary to other speculation based vulnerabilities, the MDS vulnerability
does not allow the attacker to control the memory target address. As a
consequence the attacks are purely sampling based, but as demonstrated with
the TLBleed attack, samples can be postprocessed successfully.

Web-Browsers
^^^^^^^^^^^^

  It's unclear whether attacks through Web-Browsers are possible at
  all. The exploitation through JavaScript is considered very unlikely,
  but other widely used web technologies like WebAssembly could possibly be
  abused.


.. _mds_sys_info:

MDS system information
----------------------

The Linux kernel provides a sysfs interface to enumerate the current MDS
status of the system: whether the system is vulnerable, and which
mitigations are active. The relevant sysfs file is:

/sys/devices/system/cpu/vulnerabilities/mds

The possible values in this file are:

  .. list-table::

     * - 'Not affected'
       - The processor is not vulnerable
     * - 'Vulnerable'
       - The processor is vulnerable, but no mitigation is enabled
     * - 'Vulnerable: Clear CPU buffers attempted, no microcode'
       - The processor is vulnerable but microcode is not updated.

         The mitigation is enabled on a best effort basis. See :ref:`vmwerv`
     * - 'Mitigation: Clear CPU buffers'
       - The processor is vulnerable and the CPU buffer clearing mitigation is
         enabled.

If the processor is vulnerable then the following information is appended
to the above information:

    ========================  ============================================
    'SMT vulnerable'          SMT is enabled
    'SMT mitigated'           SMT is enabled and mitigated
    'SMT disabled'            SMT is disabled
    'SMT Host state unknown'  Kernel runs in a VM, Host SMT state unknown
    ========================  ============================================
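
As an illustration, the status string can be read and split from user
space. The following is a minimal sketch and not part of the kernel: the
helper names ``parse_mds_status`` and ``read_mds_status`` are hypothetical,
and the ``"; "`` separator between the mitigation state and the appended
SMT information is an assumption about how the two parts are joined.

```python
from pathlib import Path

# sysfs file named in the text above
MDS_SYSFS = Path("/sys/devices/system/cpu/vulnerabilities/mds")


def parse_mds_status(text):
    """Split a status line into (state, smt_info).

    Assumption: the SMT information, when present, is appended after "; ".
    smt_info is None when nothing is appended (e.g. 'Not affected').
    """
    state, sep, smt_info = text.strip().partition("; ")
    return state, (smt_info if sep else None)


def read_mds_status(path=MDS_SYSFS):
    """Return the parsed status, or None if the file cannot be read."""
    try:
        return parse_mds_status(path.read_text())
    except OSError:
        # Not Linux, or a kernel that does not provide this file.
        return None
```

On a system where the file does not exist, ``read_mds_status()`` returns
``None`` rather than raising.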

.. _vmwerv:

Best effort mitigation mode
^^^^^^^^^^^^^^^^^^^^^^^^^^^

  If the processor is vulnerable, but the availability of the microcode based
  mitigation mechanism is not advertised via CPUID, the kernel selects a best
  effort mitigation mode.  This mode invokes the mitigation instructions
  without a guarantee that they clear the CPU buffers.

  This is done to address virtualization scenarios where the host has the
  microcode update applied, but the hypervisor is not yet updated to expose
  the CPUID to the guest. If the host has updated microcode the protection
  takes effect; otherwise a few CPU cycles are wasted pointlessly.

  The state in the mds sysfs file reflects this situation accordingly.


Mitigation mechanism
--------------------

The kernel detects the affected CPUs and the presence of the microcode
which is required.

If a CPU is affected and the microcode is available, then the kernel
enables the mitigation by default. The mitigation can be controlled at boot
time via a kernel command line option. See
:ref:`mds_mitigation_control_command_line`.

.. _cpu_buffer_clear:

CPU buffer clearing
^^^^^^^^^^^^^^^^^^^

  The mitigation for MDS clears the affected CPU buffers on return to user
  space and when entering a guest.

  If SMT is enabled it also clears the buffers on idle entry when the CPU
  is only affected by MSBDS and not any other MDS variant, because the
  other variants cannot be protected against cross Hyper-Thread attacks.

  For CPUs which are only affected by MSBDS the user space, guest and idle
  transition mitigations are sufficient and SMT is not affected.

.. _virt_mechanism:

Virtualization mitigation
^^^^^^^^^^^^^^^^^^^^^^^^^

  The protection for host to guest transition depends on the L1TF
  vulnerability of the CPU:

  - CPU is affected by L1TF:

    If the L1D flush mitigation is enabled and up to date microcode is
    available, the L1D flush mitigation is automatically protecting the
    guest transition.

    If the L1D flush mitigation is disabled then the MDS mitigation is
    invoked explicitly when the host MDS mitigation is enabled.

    For details on L1TF and virtualization see:
    :ref:`Documentation/admin-guide/hw-vuln/l1tf.rst <mitigation_control_kvm>`.

  - CPU is not affected by L1TF:

    CPU buffers are flushed before entering the guest when the host MDS
    mitigation is enabled.

  The resulting MDS protection matrix for the host to guest transition:

  ============ ===== ============= ============ =================
   L1TF         MDS   VMX-L1FLUSH   Host MDS     MDS-State

   Don't care   No    Don't care    N/A          Not affected

   Yes          Yes   Disabled      Off          Vulnerable

   Yes          Yes   Disabled      Full         Mitigated

   Yes          Yes   Enabled       Don't care   Mitigated

   No           Yes   N/A           Off          Vulnerable

   No           Yes   N/A           Full         Mitigated
  ============ ===== ============= ============ =================

  This only covers the host to guest transition, i.e. prevents leakage from
  host to guest, but does not protect the guest internally. Guests need to
  have their own protections.
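
The protection matrix can also be written down as a small lookup function.
This is only an illustrative sketch of the table, not kernel code: the
function name and the argument encoding are hypothetical, and the sketch
does not model the microcode requirement of the L1D flush mitigation.

```python
def guest_entry_mds_state(l1tf_affected, mds_affected, vmx_l1d_flush, host_mds):
    """Return the MDS-State column for the host to guest transition.

    l1tf_affected: True if the CPU is affected by L1TF.
    mds_affected:  True if the CPU is affected by MDS.
    vmx_l1d_flush: True if the L1D flush on guest entry is enabled
                   (only meaningful when the CPU is affected by L1TF).
    host_mds:      "full" or "off", the host MDS mitigation setting.
    """
    if not mds_affected:
        return "Not affected"
    if l1tf_affected and vmx_l1d_flush:
        # The L1D flush already protects the guest transition.
        return "Mitigated"
    # Otherwise protection depends on the host MDS mitigation alone.
    return "Mitigated" if host_mds == "full" else "Vulnerable"
```

For example, ``guest_entry_mds_state(True, True, False, "off")`` corresponds
to the second row of the matrix and yields ``"Vulnerable"``.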

.. _xeon_phi:

XEON PHI specific considerations
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

  The XEON PHI processor family is affected by MSBDS which can be exploited
  cross Hyper-Threads when entering idle states. Some XEON PHI variants allow
  MWAIT to be used in user space (Ring 3), which opens a potential attack
  vector for malicious user space. The exposure can be disabled on the kernel
  command line with the 'ring3mwait=disable' command line option.

  XEON PHI is not affected by the other MDS variants and MSBDS is mitigated
  before the CPU enters an idle state. As XEON PHI is not affected by L1TF
  either, disabling SMT is not required for full protection.

.. _mds_smt_control:

SMT control
^^^^^^^^^^^

  All MDS variants except MSBDS can be attacked cross Hyper-Threads. That
  means on CPUs which are affected by MFBDS or MLPDS it is necessary to
  disable SMT for full protection. These are most of the affected CPUs; the
  exception is XEON PHI, see :ref:`xeon_phi`.

  Disabling SMT can have a significant performance impact, but the impact
  depends on the type of workloads.

  See the relevant chapter in the L1TF mitigation documentation for details:
  :ref:`Documentation/admin-guide/hw-vuln/l1tf.rst <smt_control>`.


.. _mds_mitigation_control_command_line:

Mitigation control on the kernel command line
---------------------------------------------

The MDS mitigations can be controlled at boot time on the kernel command
line with the option "mds=". The valid arguments for this option are:

  ============  =============================================================
  full          If the CPU is vulnerable, enable all available mitigations
                for the MDS vulnerability, CPU buffer clearing on exit to
                userspace and when entering a VM. Idle transitions are
                protected as well if SMT is enabled.

                It does not automatically disable SMT.

  full,nosmt    The same as mds=full, with SMT disabled on vulnerable
                CPUs.  This is the complete mitigation.

  off           Disables MDS mitigations completely.

  ============  =============================================================

Not specifying this option is equivalent to "mds=full". For processors
that are affected by both TAA (TSX Asynchronous Abort) and MDS,
specifying just "mds=off" without an accompanying "tsx_async_abort=off"
will have no effect as the same mitigation is used for both
vulnerabilities.
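
The rules above can be sketched as a small helper that inspects a command
line string, for example the contents of ``/proc/cmdline``. This is a
hedged illustration rather than the kernel's own parser: the function names
are hypothetical, and both the handling of repeated options (the last valid
one wins) and of unrecognized arguments (ignored) are assumptions.

```python
VALID_MDS_ARGS = ("full", "full,nosmt", "off")


def mds_option(cmdline):
    """Return the effective "mds=" argument for a kernel command line."""
    value = "full"  # not specifying the option is equivalent to mds=full
    for token in cmdline.split():
        if token.startswith("mds="):
            arg = token[len("mds="):]
            # Assumption: unknown arguments are ignored and a later valid
            # occurrence overrides an earlier one.
            if arg in VALID_MDS_ARGS:
                value = arg
    return value


def mds_off_takes_effect(cmdline, affected_by_taa):
    """Tell whether "mds=off" actually disables the buffer clearing.

    On CPUs affected by both TAA and MDS, "mds=off" has no effect unless
    "tsx_async_abort=off" is given as well.
    """
    if mds_option(cmdline) != "off":
        return False
    if not affected_by_taa:
        return True
    return "tsx_async_abort=off" in cmdline.split()
```

A command line without any "mds=" option is reported as ``"full"``, matching
the default described above.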

Mitigation selection guide
--------------------------

1. Trusted userspace
^^^^^^^^^^^^^^^^^^^^

   If all userspace applications are from a trusted source and do not
   execute untrusted code which is supplied externally, then the mitigation
   can be disabled.


2. Virtualization with trusted guests
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

   The same considerations as above for trusted user space apply.

3. Virtualization with untrusted guests
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

   The protection depends on the state of the L1TF mitigations.
   See :ref:`virt_mechanism`.

   If the MDS mitigation is enabled and SMT is disabled, guest to host and
   guest to guest attacks are prevented.

.. _mds_default_mitigations:

Default mitigations
-------------------

  The kernel default mitigations for vulnerable processors are:

  - Enable CPU buffer clearing

  The kernel does not by default enforce the disabling of SMT, which leaves
  SMT systems vulnerable when running untrusted code. The same rationale as
  for L1TF applies.
  See :ref:`Documentation/admin-guide/hw-vuln/l1tf.rst <default_mitigations>`.
0302 -------------------
0303 
0304   The kernel default mitigations for vulnerable processors are:
0305 
0306   - Enable CPU buffer clearing
0307 
0308   The kernel does not by default enforce the disabling of SMT, which leaves
0309   SMT systems vulnerable when running untrusted code. The same rationale as
0310   for L1TF applies.
0311   See :ref:`Documentation/admin-guide/hw-vuln//l1tf.rst <default_mitigations>`.