.. SPDX-License-Identifier: GPL-2.0

============================================================
Hardware-Feedback Interface for scheduling on Intel Hardware
============================================================

Overview
--------

Intel has described the Hardware Feedback Interface (HFI) in the Intel 64 and
IA-32 Architectures Software Developer's Manual (Intel SDM) Volume 3 Section
14.6 [1]_.

The HFI gives the operating system performance and energy efficiency
capability data for each CPU in the system. Linux can use the information from
the HFI to influence task placement decisions.

The Hardware Feedback Interface
-------------------------------

The Hardware Feedback Interface provides the operating system with information
about the performance and energy efficiency of each CPU in the system. Each
capability is given as a unit-less quantity in the range [0-255]. Higher values
indicate higher capability. Energy efficiency and performance are reported in
separate capabilities. Even though these two metrics may be related on some
systems, the Intel SDM specifies them as independent capabilities.

These capabilities may change at runtime as a result of changes in the
operating conditions of the system or the action of external factors. The rate
at which these capabilities are updated is specific to each processor model. On
some models, capabilities are set at boot time and never change. On others,
capabilities may change at intervals of tens of milliseconds. For instance, a
remote mechanism may be used to lower Thermal Design Power. Such a change can
be reflected in the HFI. Likewise, if the system needs to be throttled due to
excessive heat, the HFI may reflect reduced performance on specific CPUs.

The kernel or a userspace policy daemon can use these capabilities to modify
task placement decisions. For instance, if either the performance or energy
efficiency capability of a given logical processor becomes zero, the hardware
is recommending that the operating system not schedule any tasks on that
processor, for performance or energy efficiency reasons, respectively.
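
As a rough illustration of this recommendation, the following userspace C
sketch classifies CPUs from their two capability values. The structure
hfi_caps and the helper names are hypothetical and exist only for this
example; they are not the kernel's data structures. The sketch assumes only
what is stated above: each capability is an 8-bit quantity, higher is better,
and zero means the hardware recommends against scheduling on that CPU.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-CPU capability pair; each value lies in [0-255]. */
struct hfi_caps {
	uint8_t perf;	/* performance capability */
	uint8_t ee;	/* energy efficiency capability */
};

/* True if the hardware recommends avoiding this CPU for performance reasons. */
static bool hfi_avoid_for_perf(const struct hfi_caps *caps)
{
	return caps->perf == 0;
}

/* True if the hardware recommends avoiding this CPU for efficiency reasons. */
static bool hfi_avoid_for_ee(const struct hfi_caps *caps)
{
	return caps->ee == 0;
}

/*
 * Pick the CPU with the highest performance capability, skipping CPUs whose
 * performance capability is zero. Returns -1 if no CPU qualifies.
 */
static int hfi_pick_fastest(const struct hfi_caps *caps, int nr_cpus)
{
	int best = -1;

	for (int cpu = 0; cpu < nr_cpus; cpu++) {
		if (hfi_avoid_for_perf(&caps[cpu]))
			continue;
		if (best < 0 || caps[cpu].perf > caps[best].perf)
			best = cpu;
	}
	return best;
}
```

A real policy would weigh many more inputs than this; the point is only that
the two capabilities are read independently and that zero is treated as a
recommendation to avoid the CPU rather than as merely the lowest rank.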

Implementation details for Linux
--------------------------------

The infrastructure to handle thermal event interrupts has two parts. In the
Local Vector Table of a CPU's local APIC, there is a Thermal Monitor Register.
This register controls how interrupts are delivered to a CPU when the thermal
monitor generates an interrupt. Further details can be found in the Intel SDM
Vol. 3 Section 10.5 [1]_.

The thermal monitor may generate interrupts per CPU or per package. The HFI
generates package-level interrupts. This monitor is configured and initialized
via a set of model-specific registers (MSRs). Specifically, the HFI interrupt
and status are controlled via designated bits in the
IA32_PACKAGE_THERM_INTERRUPT and IA32_PACKAGE_THERM_STATUS registers,
respectively. There is one HFI table per package. Further details can be found
in the Intel SDM Vol. 3 Section 14.9 [1]_.
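
The sketch below shows the kind of bit manipulation involved, written as plain
userspace C that operates on register values passed in as integers rather than
on real MSRs. The bit positions (bit 25 as the HFI interrupt enable in
IA32_PACKAGE_THERM_INTERRUPT, bit 26 as the HFI status in
IA32_PACKAGE_THERM_STATUS) are assumptions recalled from the Linux
msr-index.h header and are not stated in this document; check them against
the Intel SDM [1]_ before relying on them. The macro and function names are
hypothetical, as is the premise that clearing the status bit acknowledges the
update.

```c
#include <stdbool.h>
#include <stdint.h>

/* ASSUMPTION: bit positions, to be checked against the Intel SDM. */
#define EX_PKG_THERM_INT_HFI_ENABLE	(UINT64_C(1) << 25)
#define EX_PKG_THERM_STATUS_HFI_UPDATED	(UINT64_C(1) << 26)

/* Return the interrupt register value with the HFI interrupt enabled. */
static uint64_t ex_hfi_enable_irq(uint64_t therm_int)
{
	return therm_int | EX_PKG_THERM_INT_HFI_ENABLE;
}

/* True if the status register value reports an updated HFI table. */
static bool ex_hfi_table_updated(uint64_t therm_status)
{
	return (therm_status & EX_PKG_THERM_STATUS_HFI_UPDATED) != 0;
}

/* Return the status register value with the HFI status bit cleared. */
static uint64_t ex_hfi_ack(uint64_t therm_status)
{
	return therm_status & ~EX_PKG_THERM_STATUS_HFI_UPDATED;
}
```

All three helpers leave the other bits of the register value untouched, which
matters because the same registers carry unrelated package thermal controls
and status flags.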

The hardware issues an HFI interrupt after it has updated the HFI table and the
table is ready for the operating system to consume. CPUs receive such an
interrupt via the thermal entry in the Local APIC's Local Vector Table.

When servicing such an interrupt, the HFI driver parses the updated table and
relays the update to userspace using the thermal notification framework. Given
that there may be many HFI updates every second, the updates relayed to
userspace are throttled to at most one every CONFIG_HZ jiffies, that is, one
per second.
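
The throttling can be pictured with the following userspace C sketch. It is
not the driver's implementation: the relay_state structure, the EX_HZ constant
and the tick counter standing in for jiffies are all hypothetical. The sketch
lets an update through only if at least EX_HZ ticks have elapsed since the
last relayed update, and otherwise marks it as pending.

```c
#include <stdbool.h>
#include <stdint.h>

#define EX_HZ 250	/* ASSUMPTION: stand-in for CONFIG_HZ */

struct relay_state {
	uint64_t last_relay;	/* tick of the last relayed update */
	bool relayed_once;	/* false until the first update is relayed */
	bool pending;		/* an update arrived but was held back */
};

/*
 * Called for each HFI update at tick 'now'. Returns true if the update
 * should be relayed to userspace now, false if it must wait.
 */
static bool ex_should_relay(struct relay_state *st, uint64_t now)
{
	if (st->relayed_once && now - st->last_relay < EX_HZ) {
		st->pending = true;
		return false;
	}
	st->last_relay = now;
	st->relayed_once = true;
	st->pending = false;
	return true;
}
```

The sketch covers only the rate check. A complete design would also need a
deferred path that delivers a pending update once the interval expires, so
that the most recent table contents are not lost.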

References
----------

.. [1] https://www.intel.com/sdm