.. SPDX-License-Identifier: GPL-2.0

Overview
========
The Linux kernel contains a variety of code for running as a fully
enlightened guest on Microsoft's Hyper-V hypervisor. Hyper-V
consists primarily of a bare-metal hypervisor plus a virtual machine
management service running in the parent partition (roughly
equivalent to KVM and QEMU, for example). Guest VMs run in child
partitions. In this documentation, references to Hyper-V usually
encompass both the hypervisor and the VMM service without making a
distinction about which functionality is provided by which
component.

Hyper-V runs on x86/x64 and arm64 architectures, and Linux guests
are supported on both. The functionality and behavior of Hyper-V are
generally the same on both architectures unless noted otherwise.

Linux Guest Communication with Hyper-V
--------------------------------------
Linux guests communicate with Hyper-V in four different ways:

* Implicit traps: As defined by the x86/x64 or arm64 architecture,
  some guest actions trap to Hyper-V. Hyper-V emulates the action and
  returns control to the guest. This behavior is generally invisible
  to the Linux kernel.

* Explicit hypercalls: Linux makes an explicit function call to
  Hyper-V, passing parameters. Hyper-V performs the requested action
  and returns control to the caller. Parameters are passed in
  processor registers or in memory shared between the Linux guest and
  Hyper-V. On x86/x64, hypercalls use a Hyper-V specific calling
  sequence. On arm64, hypercalls use the ARM standard SMCCC calling
  sequence.

* Synthetic register access: Hyper-V implements a variety of
  synthetic registers. On x86/x64 these registers appear as MSRs in
  the guest, and the Linux kernel can read or write these MSRs using
  the normal mechanisms defined by the x86/x64 architecture. On
  arm64, these synthetic registers must be accessed using explicit
  hypercalls.

* VMbus: VMbus is a higher-level software construct that is built on
  the other three mechanisms. It is a message passing interface between
  the Hyper-V host and the Linux guest. It uses memory that is shared
  between Hyper-V and the guest, along with various signaling
  mechanisms.

The first three communication mechanisms are documented in the
`Hyper-V Top Level Functional Spec (TLFS)`_. The TLFS describes
general Hyper-V functionality and provides details on the hypercalls
and synthetic registers. The TLFS is currently written for the
x86/x64 architecture only.

.. _Hyper-V Top Level Functional Spec (TLFS): https://docs.microsoft.com/en-us/virtualization/hyper-v-on-windows/tlfs/tlfs

VMbus is not documented. This documentation provides a high-level
overview of VMbus and how it works, but the details can be discerned
only from the code.

Sharing Memory
--------------
Many aspects of communication between Hyper-V and Linux are based
on sharing memory. Such sharing is generally accomplished as
follows:

* Linux allocates memory from its physical address space using
  standard Linux mechanisms.

* Linux tells Hyper-V the guest physical address (GPA) of the
  allocated memory. Many shared areas are kept to 1 page so that a
  single GPA is sufficient. Larger shared areas require a list of
  GPAs, which usually do not need to be contiguous in the guest
  physical address space. How Hyper-V is told about the GPA or list
  of GPAs varies. In some cases, a single GPA is written to a
  synthetic register. In other cases, a GPA or list of GPAs is sent
  in a VMbus message.

* Hyper-V translates the GPAs into "real" physical memory addresses,
  and creates a virtual mapping that it can use to access the memory.

* Linux can later revoke sharing it has previously established by
  telling Hyper-V to set the shared GPA to zero.
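
As a rough illustration of the single-GPA case, the following userspace
sketch packs a page-aligned GPA into a register value and later revokes
it. The layout assumed here (an enable flag in bit 0, with the page
address in bits 12 and above) follows what the TLFS describes for some
per-page synthetic registers, but not every shared area uses it. The
macro and function names are hypothetical and are not kernel interfaces.

```c
#include <assert.h>
#include <stdint.h>

/* Hyper-V pages are 4 Kbytes, so the low 12 bits of a page address are zero. */
#define EX_PAGE_MASK   UINT64_C(0xFFF)
/* Assumed layout: bit 0 of the register value enables the sharing. */
#define EX_ENABLE_BIT  UINT64_C(1)

/* Build the register value that shares the page at the given GPA. */
static inline uint64_t ex_share_page(uint64_t gpa)
{
	assert((gpa & EX_PAGE_MASK) == 0);	/* must be 4 Kbyte aligned */
	return gpa | EX_ENABLE_BIT;
}

/* Recover the shared GPA from a register value. */
static inline uint64_t ex_shared_gpa(uint64_t reg)
{
	return reg & ~EX_PAGE_MASK;
}

/*
 * Revoke the sharing: set the GPA to zero and clear the enable flag,
 * leaving the other low bits as they were.
 */
static inline uint64_t ex_revoke_page(uint64_t reg)
{
	return reg & EX_PAGE_MASK & ~EX_ENABLE_BIT;
}
```

Under these assumptions, sharing a page at GPA 0x12345000 produces the
value 0x12345001, and revoking that value produces 0.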

Hyper-V operates with a page size of 4 Kbytes. GPAs communicated to
Hyper-V may be in the form of page numbers, and always describe a
range of 4 Kbytes. Since the Linux guest page size on x86/x64 is
also 4 Kbytes, the mapping from guest page to Hyper-V page is 1-to-1.
On arm64, Hyper-V supports guests with 4/16/64 Kbyte pages as
defined by the arm64 architecture. If Linux is using 16 or 64
Kbyte pages, Linux code must be careful to communicate with Hyper-V
only in terms of 4 Kbyte pages. HV_HYP_PAGE_SIZE and related macros
are used in code that communicates with Hyper-V so that it works
correctly in all configurations.
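
The arithmetic behind this rule can be sketched in userspace C. The two
macros below are local stand-ins that mirror the kernel's
HV_HYP_PAGE_SHIFT and HV_HYP_PAGE_SIZE; the helper functions are
hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel macros of the same name. */
#define HV_HYP_PAGE_SHIFT 12
#define HV_HYP_PAGE_SIZE  (UINT64_C(1) << HV_HYP_PAGE_SHIFT)

/* Number of 4 Kbyte Hyper-V pages contained in one guest page. */
static inline uint64_t ex_hv_pages_per_guest_page(unsigned int guest_page_shift)
{
	assert(guest_page_shift >= HV_HYP_PAGE_SHIFT);
	return UINT64_C(1) << (guest_page_shift - HV_HYP_PAGE_SHIFT);
}

/* First Hyper-V page number covered by a guest page frame number. */
static inline uint64_t ex_guest_pfn_to_hv_pfn(uint64_t guest_pfn,
					      unsigned int guest_page_shift)
{
	assert(guest_page_shift >= HV_HYP_PAGE_SHIFT);
	return guest_pfn << (guest_page_shift - HV_HYP_PAGE_SHIFT);
}

/* Number of Hyper-V pages touched by len bytes starting at gpa. */
static inline uint64_t ex_hv_page_count(uint64_t gpa, uint64_t len)
{
	uint64_t first = gpa >> HV_HYP_PAGE_SHIFT;
	uint64_t last;

	assert(len > 0);
	last = (gpa + len - 1) >> HV_HYP_PAGE_SHIFT;
	return last - first + 1;
}
```

With 64 Kbyte guest pages (a shift of 16), one guest page spans 16
Hyper-V pages, so guest page frame 3 starts at Hyper-V page number 48.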

As described in the TLFS, a few memory pages shared between Hyper-V
and the Linux guest are "overlay" pages. With overlay pages, Linux
uses the usual approach of allocating guest memory and telling
Hyper-V the GPA of the allocated memory. But Hyper-V then replaces
that physical memory page with a page it has allocated, and the
original physical memory page is no longer accessible in the guest
VM. Linux may access the memory normally as if it were the memory
that it originally allocated. The "overlay" behavior is visible
only because the contents of the page (as seen by Linux) change at
the time that Linux originally establishes the sharing and the
overlay page is inserted. Similarly, the contents change if Linux
revokes the sharing, in which case Hyper-V removes the overlay page,
and the guest page originally allocated by Linux becomes visible
again.

Before Linux does a kexec to a kdump kernel or any other kernel,
memory shared with Hyper-V should be revoked. Hyper-V could modify
a shared page or remove an overlay page after the new kernel is
using the page for a different purpose, corrupting the new kernel.
Hyper-V does not provide a single "revoke everything" operation to
guest VMs, so Linux code must individually revoke all sharing before
doing kexec. See hv_kexec_handler() and hv_crash_handler(). But
the crash/panic path still has holes in cleanup because some shared
pages are set using per-CPU synthetic registers and there's no
mechanism to revoke the shared pages for CPUs other than the CPU
running the panic path.

CPU Management
--------------
Hyper-V does not have the ability to hot-add or hot-remove a CPU
from a running VM. However, Windows Server 2019 Hyper-V and
earlier versions may provide guests with ACPI tables that indicate
more CPUs than are actually present in the VM. As is normal, Linux
treats these additional CPUs as potential hot-add CPUs, and reports
them as such even though Hyper-V will never actually hot-add them.
Starting in Windows Server 2022 Hyper-V, the ACPI tables reflect
only the CPUs actually present in the VM, so Linux does not report
any hot-add CPUs.

A Linux guest CPU may be taken offline using the normal Linux
mechanisms, provided no VMbus channel interrupts are assigned to
the CPU. See the section on VMbus Interrupts for more details
on how VMbus channel interrupts can be re-assigned to permit
taking a CPU offline.
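
One of the normal Linux mechanisms is the per-CPU sysfs ``online`` file.
The following is a minimal userspace sketch, assuming the standard
/sys/devices/system/cpu/cpuN/online path and root privileges. The
function names are hypothetical. On a Hyper-V guest, the write would be
expected to fail while VMbus channel interrupts are still assigned to
the CPU, in which case ex_cpu_set_online() returns -1.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format the sysfs path of a CPU's "online" file; 0 on success, -1 if buf is too small. */
static inline int ex_cpu_online_path(unsigned int cpu, char *buf, size_t len)
{
	int n = snprintf(buf, len, "/sys/devices/system/cpu/cpu%u/online", cpu);

	return (n > 0 && (size_t)n < len) ? 0 : -1;
}

/* Write "0" (offline) or "1" (online) to the file; 0 on success, -1 on any error. */
static inline int ex_cpu_set_online(unsigned int cpu, int online)
{
	char path[64];
	FILE *f;
	int ok;

	if (ex_cpu_online_path(cpu, path, sizeof(path)) != 0)
		return -1;
	f = fopen(path, "w");
	if (!f)
		return -1;
	ok = fputs(online ? "1" : "0", f) >= 0;
	/* The kernel may reject the request when the buffered data is flushed. */
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}
```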

32-bit and 64-bit
-----------------
On x86/x64, Hyper-V supports 32-bit and 64-bit guests, and Linux
will build and run in either version. While the 32-bit version is
expected to work, it is used rarely and may suffer from undetected
regressions.

On arm64, Hyper-V supports only 64-bit guests.

Endian-ness
-----------
All communication between Hyper-V and guest VMs uses little-endian
format on both x86/x64 and arm64. Big-endian format on arm64 is not
supported by Hyper-V, and Linux code does not use endian-ness macros
when accessing data shared with Hyper-V.
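
For contrast, the userspace sketch below places an explicit
little-endian decode next to a direct load. On a little-endian CPU the
two give the same result, which is why code that only ever runs
little-endian can use the direct form. All function names are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Return 1 if the CPU running this code stores integers little-endian. */
static inline int ex_cpu_is_little_endian(void)
{
	const uint16_t probe = 1;
	uint8_t first_byte;

	memcpy(&first_byte, &probe, 1);
	return first_byte == 1;
}

/* Endian-independent decode of a 32-bit little-endian field. */
static inline uint32_t ex_read_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Direct load in the CPU's native byte order, with no conversion. */
static inline uint32_t ex_read_native32(const uint8_t *p)
{
	uint32_t v;

	memcpy(&v, p, sizeof(v));
	return v;
}
```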

Versioning
----------
Current Linux kernels operate correctly with older versions of
Hyper-V back to Windows Server 2012 Hyper-V. Support for running
on the original Hyper-V release in Windows Server 2008/2008 R2
has been removed.

A Linux guest on Hyper-V outputs in dmesg the version of Hyper-V
it is running on. This version is in the form of a Windows build
number and is for display purposes only. Linux code does not
test this version number at runtime to determine available features
and functionality. Hyper-V indicates feature/function availability
via flags in synthetic MSRs that Hyper-V provides to the guest,
and the guest code tests these flags.
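
The pattern of testing a feature flag instead of a version number can be
sketched as follows. The flag names, bit positions, and structure are
hypothetical placeholders; the real definitions come from the TLFS and
the kernel's Hyper-V headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits, for illustration only. */
#define EX_FEAT_TIMER  (UINT32_C(1) << 1)
#define EX_FEAT_SYNIC  (UINT32_C(1) << 2)

struct ex_hv_info {
	uint32_t build_number;	/* for display only, never used in decisions */
	uint32_t features;	/* flags reported by the hypervisor */
};

/* True only if every bit in flag is set in the reported features. */
static inline bool ex_hv_has_feature(const struct ex_hv_info *info, uint32_t flag)
{
	return (info->features & flag) == flag;
}
```

A caller would gate optional functionality on ex_hv_has_feature() and
never on build_number.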

VMbus has its own protocol version that is negotiated during the
initial VMbus connection from the guest to Hyper-V. This version
number is also output to dmesg during boot. This version number
is checked in a few places in the code to determine if specific
functionality is present.

Furthermore, each synthetic device on VMbus also has a protocol
version that is separate from the VMbus protocol version. Device
drivers for these synthetic devices typically negotiate the device
protocol version, and may test that protocol version to determine
if specific device functionality is present.

Code Packaging
--------------
Hyper-V related code appears in the Linux kernel code tree in three
main areas:

1. drivers/hv

2. arch/x86/hyperv and arch/arm64/hyperv

3. individual device driver areas such as drivers/scsi, drivers/net,
   drivers/clocksource, etc.

A few miscellaneous files appear elsewhere. See the full list under
"Hyper-V/Azure CORE AND DRIVERS" and "DRM DRIVER FOR HYPERV
SYNTHETIC VIDEO DEVICE" in the MAINTAINERS file.

The code in #1 and #2 is built only when CONFIG_HYPERV is set.
Similarly, the code for most Hyper-V related drivers is built only
when CONFIG_HYPERV is set.

Most Hyper-V related code in #1 and #3 can be built as a module.
The architecture specific code in #2 must be built-in. Also,
drivers/hv/hv_common.c is low-level code that is common across
architectures and must be built-in.