0001 .. SPDX-License-Identifier: GPL-2.0
0002 
0003 ======================
0004 Generic vcpu interface
0005 ======================
0006 
0007 The virtual cpu "device" also accepts the ioctls KVM_SET_DEVICE_ATTR,
0008 KVM_GET_DEVICE_ATTR, and KVM_HAS_DEVICE_ATTR. The interface uses the same struct
0009 kvm_device_attr as other devices, but targets VCPU-wide settings and controls.
0010 
0011 The groups and attributes per virtual cpu, if any, are architecture specific.
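
As an illustration of the calling convention described above, the following is a minimal sketch that fills a struct kvm_device_attr and issues the ioctls on a vcpu file descriptor. The helper names make_vcpu_attr and vcpu_set_attr are hypothetical, and vcpu_fd is assumed to be a descriptor obtained from KVM_CREATE_VCPU. The group and attribute values come from the architecture-specific sections of this document.

```c
#include <linux/kvm.h>   /* struct kvm_device_attr, KVM_*_DEVICE_ATTR */
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>

/* Hypothetical helper: build the attribute descriptor for a vcpu. */
struct kvm_device_attr make_vcpu_attr(uint32_t group, uint64_t attr,
                                      void *payload)
{
    struct kvm_device_attr da;

    memset(&da, 0, sizeof(da));              /* no flags are used here */
    da.group = group;                        /* e.g. a *_CTRL group */
    da.attr = attr;                          /* attribute within the group */
    da.addr = (uint64_t)(uintptr_t)payload;  /* userspace pointer to the value */
    return da;
}

/*
 * Hypothetical helper: probe for the attribute, then set it.
 * Returns 0 on success, -1 with errno set on failure.
 */
int vcpu_set_attr(int vcpu_fd, uint32_t group, uint64_t attr, void *payload)
{
    struct kvm_device_attr da = make_vcpu_attr(group, attr, payload);

    if (ioctl(vcpu_fd, KVM_HAS_DEVICE_ATTR, &da) < 0)
        return -1;
    return ioctl(vcpu_fd, KVM_SET_DEVICE_ATTR, &da);
}
```

Probing with KVM_HAS_DEVICE_ATTR first lets userspace distinguish an unsupported attribute from a rejected value before attempting the write.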
0012 
0013 1. GROUP: KVM_ARM_VCPU_PMU_V3_CTRL
0014 ==================================
0015 
0016 :Architectures: ARM64
0017 
0018 1.1. ATTRIBUTE: KVM_ARM_VCPU_PMU_V3_IRQ
0019 ---------------------------------------
0020 
0021 :Parameters: kvm_device_attr.addr holds a pointer to an int containing the PMU
0022              overflow interrupt number
0023 
0024 Returns:
0025 
0026          =======  ========================================================
0027          -EBUSY   The PMU overflow interrupt is already set
0028          -EFAULT  Error reading interrupt number
0029          -ENXIO   PMUv3 not supported or the overflow interrupt not set
0030                   when attempting to get it
0031          -ENODEV  KVM_ARM_VCPU_PMU_V3 feature missing from VCPU
0032          -EINVAL  Invalid PMU overflow interrupt number supplied or
0033                   trying to set the IRQ number without using an in-kernel
0034                   irqchip.
0035          =======  ========================================================
0036 
0037 A value describing the PMUv3 (Performance Monitor Unit v3) overflow interrupt
0038 number for this vcpu. This interrupt can be a PPI or an SPI, but the interrupt
0039 type must be the same for each vcpu. As a PPI, the interrupt number is the same
0040 for all vcpus, while as an SPI it must be a separate number per vcpu.
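
The PPI/SPI rule above can be checked in userspace before any attribute is set. The sketch below is not KVM code: the function name pmu_irqs_consistent is hypothetical, and the interrupt ID ranges used (PPIs 16..31, SPIs 32..1019) are the standard GIC ranges, which this document does not itself state for the PMU.

```c
#include <stdbool.h>
#include <stddef.h>

/* GIC interrupt ID ranges: PPIs are 16..31, SPIs are 32..1019. */
static bool is_ppi(int intid) { return intid >= 16 && intid < 32; }
static bool is_spi(int intid) { return intid >= 32 && intid < 1020; }

/*
 * Hypothetical check: irq[i] is the overflow interrupt planned for vcpu i.
 * All vcpus must use the same interrupt type. PPIs must share one number;
 * SPIs must each be distinct.
 */
bool pmu_irqs_consistent(const int *irq, size_t nr_vcpus)
{
    if (nr_vcpus == 0)
        return true;

    if (is_ppi(irq[0])) {
        for (size_t i = 1; i < nr_vcpus; i++)
            if (irq[i] != irq[0])
                return false;
        return true;
    }

    for (size_t i = 0; i < nr_vcpus; i++) {
        if (!is_spi(irq[i]))
            return false;
        for (size_t j = 0; j < i; j++)
            if (irq[j] == irq[i])
                return false;
    }
    return true;
}
```

For example, three vcpus all using PPI 23 pass the check, while two vcpus sharing SPI 40 do not.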
0041 
0042 1.2 ATTRIBUTE: KVM_ARM_VCPU_PMU_V3_INIT
0043 ---------------------------------------
0044 
0045 :Parameters: no additional parameter in kvm_device_attr.addr
0046 
0047 Returns:
0048 
0049          =======  ======================================================
0050          -EEXIST  Interrupt number already used
0051          -ENODEV  PMUv3 not supported or GIC not initialized
0052          -ENXIO   PMUv3 not supported, missing VCPU feature or interrupt
0053                   number not set
0054          -EBUSY   PMUv3 already initialized
0055          =======  ======================================================
0056 
0057 Request the initialization of the PMUv3.  If using the PMUv3 with an in-kernel
0058 virtual GIC implementation, this must be done after initializing the in-kernel
0059 irqchip.
0060 
0061 1.3 ATTRIBUTE: KVM_ARM_VCPU_PMU_V3_FILTER
0062 -----------------------------------------
0063 
0064 :Parameters: kvm_device_attr.addr holds a pointer to a struct
0065              kvm_pmu_event_filter describing the PMU event filter
0066 
0067 :Returns:
0068 
0069          =======  ======================================================
0070          -ENODEV  PMUv3 not supported or GIC not initialized
0071          -ENXIO   PMUv3 not properly configured or in-kernel irqchip not
0072                   configured as required prior to calling this attribute
0073          -EBUSY   PMUv3 already initialized or a VCPU has already run
0074          -EINVAL  Invalid filter range
0075          =======  ======================================================
0076 
0077 Request the installation of a PMU event filter described as follows::
0078 
0079     struct kvm_pmu_event_filter {
0080             __u16       base_event;
0081             __u16       nevents;
0082 
0083     #define KVM_PMU_EVENT_ALLOW 0
0084     #define KVM_PMU_EVENT_DENY  1
0085 
0086             __u8        action;
0087             __u8        pad[3];
0088     };
0089 
0090 A filter range is defined as the range [@base_event, @base_event + @nevents),
0091 together with an @action (KVM_PMU_EVENT_ALLOW or KVM_PMU_EVENT_DENY). The
0092 first registered range defines the global policy (global ALLOW if the first
0093 @action is DENY, global DENY if the first @action is ALLOW). Multiple ranges
0094 can be programmed, and must fit within the event space defined by the PMU
0095 architecture (10 bits on ARMv8.0, 16 bits from ARMv8.1 onwards).
0096 
0097 Note: "Cancelling" a filter by registering the opposite action for the same
0098 range doesn't change the default action. For example, installing an ALLOW
0099 filter for event range [0:10) as the first filter and then applying a DENY
0100 action for the same range will leave the whole range as disabled.
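
The default-policy rule and the "cancelling" note can be made concrete with a small userspace model. This is a sketch of the semantics described above, not the kernel's implementation: struct filter_model and both functions are hypothetical, the model assumes the 16-bit event space, it assumes every event is allowed while no filter is installed, and it does not model the SW_INCR and CHAIN special cases.

```c
#include <stdbool.h>
#include <stdint.h>

#define PMU_NR_EVENTS 65536u  /* assumed 16-bit event space */
#define EVENT_ALLOW   0       /* mirrors KVM_PMU_EVENT_ALLOW */
#define EVENT_DENY    1       /* mirrors KVM_PMU_EVENT_DENY */

/* Hypothetical model of the per-VM filter state. */
struct filter_model {
    bool initialized;              /* has a first range been registered? */
    bool allowed[PMU_NR_EVENTS];   /* per-event verdict */
};

/* Apply one range. Returns 0 on success, -1 for an invalid range or action. */
int filter_apply(struct filter_model *m, uint16_t base_event,
                 uint16_t nevents, uint8_t action)
{
    uint32_t end = (uint32_t)base_event + nevents;

    if (end > PMU_NR_EVENTS || (action != EVENT_ALLOW && action != EVENT_DENY))
        return -1;

    if (!m->initialized) {
        /* The first range fixes the global policy: the opposite action. */
        bool default_allowed = (action == EVENT_DENY);

        for (uint32_t e = 0; e < PMU_NR_EVENTS; e++)
            m->allowed[e] = default_allowed;
        m->initialized = true;
    }

    for (uint32_t e = base_event; e < end; e++)
        m->allowed[e] = (action == EVENT_ALLOW);
    return 0;
}

/* Assumption: with no filter installed, every event is allowed. */
bool filter_event_allowed(const struct filter_model *m, uint16_t event)
{
    return !m->initialized || m->allowed[event];
}
```

In this model, applying ALLOW to [0, 10) and then DENY to the same range leaves events 0 through 9 denied, and every other event stays denied because the first ALLOW set the global policy to DENY.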
0101 
0102 Restrictions: Event 0 (SW_INCR) is never filtered, as it doesn't count a
0103 hardware event. Filtering event 0x1E (CHAIN) has no effect either, as it
0104 isn't strictly speaking an event. Filtering the cycle counter is possible
0105 using event 0x11 (CPU_CYCLES).
0106 
0107 1.4 ATTRIBUTE: KVM_ARM_VCPU_PMU_V3_SET_PMU
0108 ------------------------------------------
0109 
0110 :Parameters: kvm_device_attr.addr holds a pointer to an int containing the PMU
0111              identifier.
0112 
0113 :Returns:
0114 
0115          =======  ====================================================
0116          -EBUSY   PMUv3 already initialized, a VCPU has already run or
0117                   an event filter has already been set
0118          -EFAULT  Error accessing the PMU identifier
0119          -ENXIO   PMU not found
0120          -ENODEV  PMUv3 not supported or GIC not initialized
0121          -ENOMEM  Could not allocate memory
0122          =======  ====================================================
0123 
0124 Request that the VCPU uses the specified hardware PMU when creating guest events
0125 for the purpose of PMU emulation. The PMU identifier can be read from the "type"
0126 file for the desired PMU instance under /sys/devices (or, equivalently,
0127 /sys/bus/event_source). This attribute is particularly useful on heterogeneous
0128 systems where there are at least two CPU PMUs on the system. The PMU that is set
0129 for one VCPU will be used by all the other VCPUs. It isn't possible to set a PMU
0130 if a PMU event filter is already present.
0131 
0132 Note that KVM will not make any attempts to run the VCPU on the physical CPUs
0133 associated with the PMU specified by this attribute. This is entirely left to
0134 userspace. However, attempting to run the VCPU on a physical CPU not supported
0135 by the PMU will fail and KVM_RUN will return with
0136 exit_reason = KVM_EXIT_FAIL_ENTRY and populate the fail_entry struct by setting
0137 the hardware_entry_failure_reason field to KVM_EXIT_FAIL_ENTRY_CPU_UNSUPPORTED
0138 and the cpu field to the processor id.
0139 
0140 2. GROUP: KVM_ARM_VCPU_TIMER_CTRL
0141 =================================
0142 
0143 :Architectures: ARM64
0144 
0145 2.1. ATTRIBUTES: KVM_ARM_VCPU_TIMER_IRQ_VTIMER, KVM_ARM_VCPU_TIMER_IRQ_PTIMER
0146 -----------------------------------------------------------------------------
0147 
0148 :Parameters: kvm_device_attr.addr holds a pointer to an int containing the timer
0149              interrupt number
0150 
0151 Returns:
0152 
0153          =======  =================================
0154          -EINVAL  Invalid timer interrupt number
0155          -EBUSY   One or more VCPUs have already run
0156          =======  =================================
0157 
0158 A value describing the architected timer interrupt number when connected to an
0159 in-kernel virtual GIC.  These must be a PPI (16 <= intid < 32).  Setting the
0160 attribute overrides the default values (see below).
0161 
0162 =============================  ==========================================
0163 KVM_ARM_VCPU_TIMER_IRQ_VTIMER  The EL1 virtual timer intid (default: 27)
0164 KVM_ARM_VCPU_TIMER_IRQ_PTIMER  The EL1 physical timer intid (default: 30)
0165 =============================  ==========================================
0166 
0167 Setting the same PPI for different timers will prevent the VCPUs from running.
0168 Setting the interrupt number on a VCPU configures all VCPUs created at that
0169 time to use the number provided for a given timer, overwriting any previously
0170 configured values on other VCPUs.  Userspace should configure the interrupt
0171 numbers on at least one VCPU after creating all VCPUs and before running any
0172 VCPUs.
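
A VMM can validate its chosen timer interrupt numbers against the constraints above before setting them. The following is a minimal sketch with hypothetical names (struct timer_irqs, timer_irqs_valid). It covers only the two timers listed in the table and encodes the PPI range and the requirement that the two timers use different PPIs.

```c
#include <stdbool.h>

#define DEFAULT_VTIMER_INTID 27   /* default EL1 virtual timer intid */
#define DEFAULT_PTIMER_INTID 30   /* default EL1 physical timer intid */

/* Hypothetical container for the interrupt numbers a VMM plans to set. */
struct timer_irqs {
    int vtimer;
    int ptimer;
};

/* A PPI satisfies 16 <= intid < 32. */
static bool intid_is_ppi(int intid)
{
    return intid >= 16 && intid < 32;
}

/* Both timers must be PPIs, and they must not share one PPI. */
bool timer_irqs_valid(const struct timer_irqs *t)
{
    return intid_is_ppi(t->vtimer) &&
           intid_is_ppi(t->ptimer) &&
           t->vtimer != t->ptimer;
}
```

The defaults (27 and 30) pass this check. A configuration such as vtimer = 27 and ptimer = 27 fails it, which corresponds to the case that would prevent the VCPUs from running.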
0173 
0174 3. GROUP: KVM_ARM_VCPU_PVTIME_CTRL
0175 ==================================
0176 
0177 :Architectures: ARM64
0178 
0179 3.1 ATTRIBUTE: KVM_ARM_VCPU_PVTIME_IPA
0180 --------------------------------------
0181 
0182 :Parameters: 64-bit base address
0183 
0184 Returns:
0185 
0186          =======  ======================================
0187          -ENXIO   Stolen time not implemented
0188          -EEXIST  Base address already set for this VCPU
0189          -EINVAL  Base address not 64 byte aligned
0190          =======  ======================================
0191 
0192 Specifies the base address of the stolen time structure for this VCPU. The
0193 base address must be 64 byte aligned and exist within a valid guest memory
0194 region. See Documentation/virt/kvm/arm/pvtime.rst for more information
0195 including the layout of the stolen time structure.
0196 
0197 4. GROUP: KVM_VCPU_TSC_CTRL
0198 ===========================
0199 
0200 :Architectures: x86
0201 
0202 4.1 ATTRIBUTE: KVM_VCPU_TSC_OFFSET
0203 ----------------------------------
0204 :Parameters: 64-bit unsigned TSC offset
0205 
0206 Returns:
0207 
0208          ======= ======================================
0209          -EFAULT Error reading/writing the provided
0210                  parameter address.
0211          -ENXIO  Attribute not supported
0212          ======= ======================================
0213 
0214 Specifies the guest's TSC offset relative to the host's TSC. The guest's
0215 TSC is then derived by the following equation:
0216 
0217   guest_tsc = host_tsc + KVM_VCPU_TSC_OFFSET
0218 
0219 This attribute is useful to adjust the guest's TSC on live migration,
0220 so that the TSC counts the time during which the VM was paused. The
0221 following describes a possible algorithm to use for this purpose.
0222 
0223 From the source VMM process:
0224 
0225 1. Invoke the KVM_GET_CLOCK ioctl to record the host TSC (tsc_src),
0226    kvmclock nanoseconds (guest_src), and host CLOCK_REALTIME nanoseconds
0227    (host_src).
0228 
0229 2. Read the KVM_VCPU_TSC_OFFSET attribute for every vCPU to record the
0230    guest TSC offset (ofs_src[i]).
0231 
0232 3. Invoke the KVM_GET_TSC_KHZ ioctl to record the frequency of the
0233    guest's TSC (freq).
0234 
0235 From the destination VMM process:
0236 
0237 4. Invoke the KVM_SET_CLOCK ioctl, providing the source nanoseconds from
0238    kvmclock (guest_src) and CLOCK_REALTIME (host_src) in their respective
0239    fields.  Ensure that the KVM_CLOCK_REALTIME flag is set in the provided
0240    structure.
0241 
0242    KVM will advance the VM's kvmclock to account for elapsed time since
0243    recording the clock values.  Note that this will cause problems in
0244    the guest (e.g., timeouts) unless CLOCK_REALTIME is synchronized
0245    between the source and destination, and a reasonably short time passes
0246    between the source pausing the VMs and the destination executing
0247    steps 4-7.
0248 
0249 5. Invoke the KVM_GET_CLOCK ioctl to record the host TSC (tsc_dest) and
0250    kvmclock nanoseconds (guest_dest).
0251 
0252 6. Adjust the guest TSC offsets for every vCPU to account for (1) time
0253    elapsed since recording state and (2) difference in TSCs between the
0254    source and destination machine:
0255 
0256    ofs_dst[i] = ofs_src[i] -
0257      (guest_src - guest_dest) * freq +
0258      (tsc_src - tsc_dest)
0259 
0260    ("ofs[i] + tsc - guest * freq" is the guest TSC value corresponding to
0261    a time of 0 in kvmclock.  The above formula ensures that it is the
0262    same on the destination as it was on the source).
0263 
0264 7. Write the KVM_VCPU_TSC_OFFSET attribute for every vCPU with the
0265    respective value derived in the previous step.
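
The adjustment in step 6 can be sketched as plain arithmetic. The function names below are hypothetical. The formula writes "guest * freq" without units, so this sketch assumes that the guest clock values are in nanoseconds and that freq is the value in kHz reported by KVM_GET_TSC_KHZ, which gives cycles = ns * kHz / 1000000. Offsets are treated as unsigned 64-bit values with wrap-around arithmetic.

```c
#include <stdint.h>

/*
 * Convert a signed nanosecond interval to TSC cycles at tsc_khz.
 * Assumed unit conversion: cycles = ns * kHz / 1000000. The interval is
 * split into whole and fractional milliseconds to avoid 64-bit overflow.
 */
int64_t ns_to_cycles(int64_t ns, uint32_t tsc_khz)
{
    int64_t whole = ns / 1000000;
    int64_t frac = ns % 1000000;

    return whole * tsc_khz + frac * tsc_khz / 1000000;
}

/*
 * ofs_dst = ofs_src - (guest_src - guest_dest) * freq + (tsc_src - tsc_dest)
 * All TSC quantities wrap modulo 2^64.
 */
uint64_t tsc_offset_dst(uint64_t ofs_src,
                        uint64_t tsc_src, uint64_t tsc_dest,
                        uint64_t guest_src_ns, uint64_t guest_dest_ns,
                        uint32_t tsc_khz)
{
    int64_t delta_ns = (int64_t)(guest_src_ns - guest_dest_ns);
    int64_t delta_cycles = ns_to_cycles(delta_ns, tsc_khz);

    return ofs_src - (uint64_t)delta_cycles + (tsc_src - tsc_dest);
}
```

With a 1 GHz guest TSC (tsc_khz = 1000000), ofs_src = 100, tsc_src = 5000, tsc_dest = 2000, guest_src = 1000 and guest_dest = 1500, the function returns 3600. The quantity "ofs + tsc - guest * freq" is then 4100 on both sides, as the formula requires.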