0001 .. SPDX-License-Identifier: GPL-2.0
0002 
0003 Overview
0004 ========
0005 The Linux kernel contains a variety of code for running as a fully
0006 enlightened guest on Microsoft's Hyper-V hypervisor.  Hyper-V
0007 consists primarily of a bare-metal hypervisor plus a virtual machine
0008 management service running in the parent partition (roughly
0009 equivalent to KVM and QEMU, for example).  Guest VMs run in child
0010 partitions.  In this documentation, references to Hyper-V usually
0011 encompass both the hypervisor and the VMM service without making a
0012 distinction about which functionality is provided by which
0013 component.
0014 
0015 Hyper-V runs on x86/x64 and arm64 architectures, and Linux guests
0016 are supported on both.  The functionality and behavior of Hyper-V are
0017 generally the same on both architectures unless noted otherwise.
0018 
0019 Linux Guest Communication with Hyper-V
0020 --------------------------------------
0021 Linux guests communicate with Hyper-V in four different ways:
0022 
0023 * Implicit traps: As defined by the x86/x64 or arm64 architecture,
0024   some guest actions trap to Hyper-V.  Hyper-V emulates the action and
0025   returns control to the guest.  This behavior is generally invisible
0026   to the Linux kernel.
0027 
0028 * Explicit hypercalls: Linux makes an explicit function call to
0029   Hyper-V, passing parameters.  Hyper-V performs the requested action
0030   and returns control to the caller.  Parameters are passed in
0031   processor registers or in memory shared between the Linux guest and
0032   Hyper-V.   On x86/x64, hypercalls use a Hyper-V specific calling
0033   sequence.  On arm64, hypercalls use the ARM standard SMCCC calling
0034   sequence.
0035 
0036 * Synthetic register access: Hyper-V implements a variety of
0037   synthetic registers.  On x86/x64 these registers appear as MSRs in
0038   the guest, and the Linux kernel can read or write these MSRs using
0039   the normal mechanisms defined by the x86/x64 architecture.  On
0040   arm64, these synthetic registers must be accessed using explicit
0041   hypercalls.
0042 
0043 * VMbus: VMbus is a higher-level software construct that is built on
0044   the other three mechanisms.  It is a message passing interface between
0045   the Hyper-V host and the Linux guest.  It uses memory that is shared
0046   between Hyper-V and the guest, along with various signaling
0047   mechanisms.
0048 
0049 The first three communication mechanisms are documented in the
0050 `Hyper-V Top Level Functional Spec (TLFS)`_.  The TLFS describes
0051 general Hyper-V functionality and provides details on the hypercalls
0052 and synthetic registers.  The TLFS is currently written for the
0053 x86/x64 architecture only.
0054 
0055 .. _Hyper-V Top Level Functional Spec (TLFS): https://docs.microsoft.com/en-us/virtualization/hyper-v-on-windows/tlfs/tlfs
0056 
0057 VMbus is not documented.  This documentation provides a high-level
0058 overview of VMbus and how it works, but the details can be discerned
0059 only from the code.
0060 
0061 Sharing Memory
0062 --------------
0063 Many aspects of communication between Hyper-V and Linux are based
0064 on sharing memory.  Such sharing is generally accomplished as
0065 follows:
0066 
0067 * Linux allocates memory from its physical address space using
0068   standard Linux mechanisms.
0069 
0070 * Linux tells Hyper-V the guest physical address (GPA) of the
0071   allocated memory.  Many shared areas are kept to 1 page so that a
0072   single GPA is sufficient.   Larger shared areas require a list of
0073   GPAs, which usually do not need to be contiguous in the guest
0074   physical address space.  How Hyper-V is told about the GPA or list
0075   of GPAs varies.  In some cases, a single GPA is written to a
0076   synthetic register.  In other cases, a GPA or list of GPAs is sent
0077   in a VMbus message.
0078 
0079 * Hyper-V translates the GPAs into "real" physical memory addresses,
0080   and creates a virtual mapping that it can use to access the memory.
0081 
0082 * Linux can later revoke sharing it has previously established by
0083   telling Hyper-V to set the shared GPA to zero.
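
For the single-page case, the steps above can be sketched in user-space C. Everything here is a stand-in: ``fake_synth_reg``, ``share_page()`` and ``revoke_page()`` are hypothetical names, and the register layout (an enable flag in bit 0, the page address in the bits from 12 upward) is an assumption made for illustration, not the definition of any particular synthetic register.

```c
#include <assert.h>
#include <stdint.h>

#define HV_PAGE_SHIFT 12                       /* Hyper-V pages are 4 Kbytes */
#define HV_PAGE_SIZE  (1ULL << HV_PAGE_SHIFT)
#define REG_ENABLE    1ULL                     /* assumed: bit 0 enables sharing */

/*
 * Stand-in for one synthetic register.  A real guest would write an MSR
 * (x86/x64) or make a hypercall (arm64) instead of storing to a variable.
 */
static uint64_t fake_synth_reg;

/* Tell the simulated hypervisor the GPA of a page and enable sharing. */
static void share_page(uint64_t gpa)
{
	assert((gpa & (HV_PAGE_SIZE - 1)) == 0);   /* GPA must be 4K aligned */
	fake_synth_reg = gpa | REG_ENABLE;
}

/* Revoke sharing by setting the shared GPA to zero. */
static void revoke_page(void)
{
	fake_synth_reg = 0;
}

/* Return the GPA currently held in the simulated register. */
static uint64_t shared_gpa(void)
{
	return fake_synth_reg & ~(HV_PAGE_SIZE - 1);
}

/* Return nonzero if the simulated register has sharing enabled. */
static int is_shared(void)
{
	return (fake_synth_reg & REG_ENABLE) != 0;
}
```

Calling ``share_page(0x12345000)`` followed by ``revoke_page()`` models one establish/revoke cycle.  Shares that span multiple pages would pass a list of GPAs instead, which this sketch does not model.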
0084 
0085 Hyper-V operates with a page size of 4 Kbytes. GPAs communicated to
0086 Hyper-V may be in the form of page numbers, and always describe a
0087 range of 4 Kbytes.  Since the Linux guest page size on x86/x64 is
0088 also 4 Kbytes, the mapping from guest page to Hyper-V page is 1-to-1.
0089 On arm64, Hyper-V supports guests with 4/16/64 Kbyte pages as
0090 defined by the arm64 architecture.   If Linux is using 16 or 64
0091 Kbyte pages, Linux code must be careful to communicate with Hyper-V
0092 only in terms of 4 Kbyte pages.  HV_HYP_PAGE_SIZE and related macros
0093 are used in code that communicates with Hyper-V so that it works
0094 correctly in all configurations.
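
As a rough illustration of the page-size arithmetic, the user-space sketch below expands one guest page frame number into the 4 Kbyte Hyper-V page frame numbers that cover it.  Only the name HV_HYP_PAGE_SIZE comes from the text above; its value is redefined locally so the example builds outside the kernel, and the helper functions are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Local redefinitions for a user-space build (values assumed: 4 Kbytes). */
#define HV_HYP_PAGE_SHIFT 12
#define HV_HYP_PAGE_SIZE  (1UL << HV_HYP_PAGE_SHIFT)

/* Number of 4 Kbyte Hyper-V pages contained in one guest page. */
static unsigned long hv_pages_per_guest_page(unsigned long guest_page_size)
{
	assert(guest_page_size >= HV_HYP_PAGE_SIZE);
	assert(guest_page_size % HV_HYP_PAGE_SIZE == 0);
	return guest_page_size / HV_HYP_PAGE_SIZE;
}

/*
 * Fill hv_pfns[] with the Hyper-V page frame numbers covering one guest
 * page.  Returns the number of entries written.
 */
static unsigned long guest_pfn_to_hv_pfns(uint64_t guest_pfn,
					  unsigned long guest_page_size,
					  uint64_t *hv_pfns,
					  unsigned long max_entries)
{
	unsigned long n = hv_pages_per_guest_page(guest_page_size);
	uint64_t first = guest_pfn * n;
	unsigned long i;

	assert(n <= max_entries);
	for (i = 0; i < n; i++)
		hv_pfns[i] = first + i;
	return n;
}
```

With 4 Kbyte guest pages the expansion is 1-to-1; with 64 Kbyte guest pages each guest page frame number maps to 16 consecutive Hyper-V page frame numbers.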
0095 
0096 As described in the TLFS, a few memory pages shared between Hyper-V
0097 and the Linux guest are "overlay" pages.  With overlay pages, Linux
0098 uses the usual approach of allocating guest memory and telling
0099 Hyper-V the GPA of the allocated memory.  But Hyper-V then replaces
0100 that physical memory page with a page it has allocated, and the
0101 original physical memory page is no longer accessible in the guest
0102 VM.  Linux may access the memory normally as if it were the memory
0103 that it originally allocated.  The "overlay" behavior is visible
0104 only because the contents of the page (as seen by Linux) change at
0105 the time that Linux originally establishes the sharing and the
0106 overlay page is inserted.  Similarly, the contents change if Linux
0107 revokes the sharing, in which case Hyper-V removes the overlay page,
0108 and the guest page originally allocated by Linux becomes visible
0109 again.
0110 
0111 Before Linux does a kexec to a kdump kernel or any other kernel,
0112 all memory sharing with Hyper-V should be revoked.  Otherwise,
0113 Hyper-V could modify a shared page or remove an overlay page after
0114 the new kernel is using the page for a different purpose, corrupting
0115 the new kernel.  Hyper-V does not provide a single "revoke
0116 everything" operation to guest VMs, so Linux code must individually
0117 revoke all sharing before doing kexec.  See hv_kexec_handler() and
0118 hv_crash_handler().  However, the crash/panic path still has holes in
0119 cleanup because some shared pages are set using per-CPU synthetic
0120 registers and there is no mechanism to revoke the shared pages for
0121 CPUs other than the CPU running the panic path.
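
The need to revoke each share individually can be pictured with the user-space sketch below.  It is not how hv_kexec_handler() or hv_crash_handler() are implemented; the table and the functions ``share_register()``, ``revoke_all_before_kexec()`` and ``shares_outstanding()`` are hypothetical bookkeeping invented for the example, and the per-CPU limitation described above is not modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_SHARES 8

/* One simulated sharing registration; a GPA of 0 means "not shared". */
struct share_slot {
	uint64_t gpa;
};

static struct share_slot shares[MAX_SHARES];

/* Record a share in the first free slot.  Returns the slot index or -1. */
static int share_register(uint64_t gpa)
{
	size_t i;

	assert(gpa != 0);
	for (i = 0; i < MAX_SHARES; i++) {
		if (shares[i].gpa == 0) {
			shares[i].gpa = gpa;
			return (int)i;
		}
	}
	return -1;
}

/* Count the shares that are still established. */
static int shares_outstanding(void)
{
	size_t i;
	int count = 0;

	for (i = 0; i < MAX_SHARES; i++)
		if (shares[i].gpa != 0)
			count++;
	return count;
}

/*
 * Revoke every share one at a time, since there is no bulk operation.
 * Returns the number of shares revoked.
 */
static int revoke_all_before_kexec(void)
{
	size_t i;
	int revoked = 0;

	for (i = 0; i < MAX_SHARES; i++) {
		if (shares[i].gpa != 0) {
			shares[i].gpa = 0;   /* set the shared GPA to zero */
			revoked++;
		}
	}
	return revoked;
}
```

The point of the sketch is only that every share must be walked and cleared explicitly before control passes to the new kernel.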
0122 
0123 CPU Management
0124 --------------
0125 Hyper-V does not have the ability to hot-add or hot-remove a CPU
0126 from a running VM.  However, Windows Server 2019 Hyper-V and
0127 earlier versions may provide guests with ACPI tables that indicate
0128 more CPUs than are actually present in the VM.  As is normal, Linux
0129 treats these additional CPUs as potential hot-add CPUs, and reports
0130 them as such even though Hyper-V will never actually hot-add them.
0131 Starting in Windows Server 2022 Hyper-V, the ACPI tables reflect
0132 only the CPUs actually present in the VM, so Linux does not report
0133 any hot-add CPUs.
0134 
0135 A Linux guest CPU may be taken offline using the normal Linux
0136 mechanisms, provided no VMbus channel interrupts are assigned to
0137 the CPU.  See the section on VMbus Interrupts for more details
0138 on how VMbus channel interrupts can be re-assigned to permit
0139 taking a CPU offline.
0140 
0141 32-bit and 64-bit
0142 -----------------
0143 On x86/x64, Hyper-V supports 32-bit and 64-bit guests, and Linux
0144 will build and run in either version. While the 32-bit version is
0145 expected to work, it is used rarely and may suffer from undetected
0146 regressions.
0147 
0148 On arm64, Hyper-V supports only 64-bit guests.
0149 
0150 Endian-ness
0151 -----------
0152 All communication between Hyper-V and guest VMs uses little-endian
0153 format on both x86/x64 and arm64.  Big-endian format on arm64 is not
0154 supported by Hyper-V, and Linux code does not use endian-ness macros
0155 when accessing data shared with Hyper-V.
0156 
0157 Versioning
0158 ----------
0159 Current Linux kernels operate correctly with older versions of
0160 Hyper-V back to Windows Server 2012 Hyper-V. Support for running
0161 on the original Hyper-V release in Windows Server 2008/2008 R2
0162 has been removed.
0163 
0164 A Linux guest on Hyper-V outputs in dmesg the version of Hyper-V
0165 it is running on.  This version is in the form of a Windows build
0166 number and is for display purposes only. Linux code does not
0167 test this version number at runtime to determine available features
0168 and functionality. Hyper-V indicates feature/function availability
0169 via flags in synthetic MSRs that Hyper-V provides to the guest,
0170 and the guest code tests these flags.
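
The difference between testing a feature flag and testing a version number can be sketched as follows.  The structure, the flag names and the bit positions are all hypothetical; real flag definitions come from the TLFS and the kernel headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical feature bits, for illustration only. */
#define FEAT_EXAMPLE_CLOCK (1u << 0)
#define FEAT_EXAMPLE_TIMER (1u << 1)

/* Hypothetical snapshot of what the hypervisor reported at boot. */
struct hv_info_sketch {
	uint32_t build_number;   /* display only, never tested */
	uint32_t features;       /* availability flags reported by Hyper-V */
};

/* Decide availability from the flags, not from the build number. */
static int has_feature(const struct hv_info_sketch *hv, uint32_t flag)
{
	assert(hv != NULL);
	return (hv->features & flag) == flag;
}
```

Code written this way keeps working if a feature is added to or withheld from a given Hyper-V build, because the build number never enters the decision.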
0171 
0172 VMbus has its own protocol version that is negotiated during the
0173 initial VMbus connection from the guest to Hyper-V. This version
0174 number is also output to dmesg during boot.  This version number
0175 is checked in a few places in the code to determine if specific
0176 functionality is present.
0177 
0178 Furthermore, each synthetic device on VMbus also has a protocol
0179 version that is separate from the VMbus protocol version. Device
0180 drivers for these synthetic devices typically negotiate the device
0181 protocol version, and may test that protocol version to determine
0182 if specific device functionality is present.
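
Protocol version negotiation, whether for VMbus itself or for a synthetic device, can be pictured with the user-space sketch below.  It assumes the guest offers versions from newest to oldest until one is accepted; that ordering, the version numbers used, and the functions ``host_accepts()``, ``negotiate_version()`` and ``version_at_least()`` are assumptions for illustration, not the actual VMbus code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simulated host: accepts any version no newer than host_max. */
static int host_accepts(uint32_t version, uint32_t host_max)
{
	return version <= host_max;
}

/*
 * Offer each version in turn (the array is ordered newest first) and
 * return the first one the host accepts, or 0 if none is accepted.
 */
static uint32_t negotiate_version(const uint32_t *versions, size_t count,
				  uint32_t host_max)
{
	size_t i;

	assert(versions != NULL);
	for (i = 0; i < count; i++) {
		if (host_accepts(versions[i], host_max))
			return versions[i];
	}
	return 0;
}

/* Later code gates optional functionality on the negotiated version. */
static int version_at_least(uint32_t negotiated, uint32_t needed)
{
	return negotiated >= needed;
}
```

After negotiation, a check such as ``version_at_least(negotiated, needed)`` stands in for the few places where code tests the protocol version to decide whether specific functionality is present.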
0183 
0184 Code Packaging
0185 --------------
0186 Hyper-V related code appears in the Linux kernel code tree in three
0187 main areas:
0188 
0189 1. drivers/hv
0190 
0191 2. arch/x86/hyperv and arch/arm64/hyperv
0192 
0193 3. individual device driver areas such as drivers/scsi, drivers/net,
0194    drivers/clocksource, etc.
0195 
0196 A few miscellaneous files appear elsewhere. See the full list under
0197 "Hyper-V/Azure CORE AND DRIVERS" and "DRM DRIVER FOR HYPERV
0198 SYNTHETIC VIDEO DEVICE" in the MAINTAINERS file.
0199 
0200 The code in #1 and #2 is built only when CONFIG_HYPERV is set.
0201 Similarly, the code for most Hyper-V related drivers is built only
0202 when CONFIG_HYPERV is set.
0203 
0204 Most Hyper-V related code in #1 and #3 can be built as a module.
0205 The architecture specific code in #2 must be built-in.  Also,
0206 drivers/hv/hv_common.c is low-level code that is common across
0207 architectures and must be built-in.