.. SPDX-License-Identifier: GPL-2.0

.. _perf_index:

====
Perf
====

Perf Event Attributes
=====================

:Author: Andrew Murray <andrew.murray@arm.com>
:Date: 2019-03-06

exclude_user
------------

This attribute excludes userspace.

Userspace always runs at EL0, so this attribute excludes EL0.


exclude_kernel
--------------

This attribute excludes the kernel.

The host kernel runs at EL2 with VHE and at EL1 without it. Guest kernels
always run at EL1.

For the host, this attribute excludes EL1 and, on a VHE system, also EL2.

For the guest, this attribute excludes EL1. Note that EL2 is never counted
within a guest.


exclude_hv
----------

This attribute excludes the hypervisor.

For a VHE host, this attribute is ignored because we consider the host kernel
to be the hypervisor.

For a non-VHE host, this attribute excludes EL2 because we consider the
hypervisor to be any code that runs at EL2, which is predominantly used for
guest/host transitions.

For the guest, this attribute has no effect. Note that EL2 is never counted
within a guest.
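
The rules for exclude_user, exclude_kernel and exclude_hv can be summarised as
a small model. The following Python sketch is illustrative only: it restates
the prose above and is not taken from the PMU driver. The function name
``counted_els`` and its parameters are hypothetical.

.. code-block:: python

  def counted_els(exclude_user=False, exclude_kernel=False, exclude_hv=False,
                  vhe=False, guest=False):
      """Return the exception levels an event counts, per the rules above.

      Hypothetical model: 'vhe' describes the host and is ignored when
      'guest' is True (the event is opened inside a guest).
      """
      # EL2 is never counted within a guest.
      els = {"EL0", "EL1"} if guest else {"EL0", "EL1", "EL2"}

      if exclude_user:
          els.discard("EL0")      # userspace always runs at EL0

      if exclude_kernel:
          els.discard("EL1")      # non-VHE host kernel and guest kernels
          if vhe and not guest:
              els.discard("EL2")  # a VHE host kernel runs at EL2

      # exclude_hv only has an effect on a non-VHE host.
      if exclude_hv and not guest and not vhe:
          els.discard("EL2")

      return els

For example, ``counted_els(exclude_kernel=True, vhe=True)`` returns
``{"EL0"}``, whereas on a non-VHE host the same attribute leaves EL0 and EL2
counted.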


exclude_host / exclude_guest
----------------------------

These attributes exclude the KVM host and guest, respectively.

The KVM host may run at EL0 (userspace), EL1 (non-VHE kernel) and EL2 (VHE
kernel or non-VHE hypervisor).

The KVM guest may run at EL0 (userspace) and EL1 (kernel).

Because the exception levels used by the host and guests overlap, we cannot
rely exclusively on the PMU's hardware exception level filtering. We must
therefore enable/disable counting on entry to and exit from the guest. This
is performed differently on VHE and non-VHE systems.

For non-VHE systems we exclude EL2 for exclude_host. Upon entering and
exiting the guest we disable/enable the event as appropriate based on the
exclude_host and exclude_guest attributes.

For VHE systems we exclude EL1 for exclude_guest and exclude both EL0 and EL2
for exclude_host. Upon entering and exiting the guest we modify the event
to include/exclude EL0 as appropriate based on the exclude_host and
exclude_guest attributes.

The statements above also apply when these attributes are used within a
non-VHE guest. Note, however, that EL2 is never counted within a guest.
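
The VHE case can be pictured with a small model. The Python sketch below is
one reading of the paragraph on VHE systems, assuming that EL0 is included
while running in whichever context (host or guest) is not excluded. It is not
driver code, and the name ``vhe_filtered_els`` is hypothetical.

.. code-block:: python

  def vhe_filtered_els(exclude_host, exclude_guest, in_guest):
      """Exception levels the event filter lets through on a VHE system.

      Hypothetical model of the prose above; 'in_guest' says whether the
      CPU is currently running the guest or the host.
      """
      els = {"EL0", "EL1", "EL2"}

      if exclude_guest:
          els.discard("EL1")  # guest kernels run at EL1
      if exclude_host:
          els.discard("EL2")  # the VHE host kernel runs at EL2

      # EL0 is shared by host and guest userspace, so the filter cannot be
      # static: it is modified on guest entry and exit.
      if (in_guest and exclude_guest) or (not in_guest and exclude_host):
          els.discard("EL0")

      return els

With ``exclude_host`` set, the model returns ``{"EL1"}`` while in the host,
where nothing runs at EL1 on VHE, and ``{"EL0", "EL1"}`` while in the guest.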


Accuracy
--------

On non-VHE hosts we enable/disable counters on the entry/exit of the
host/guest transition at EL2. However, there is a period of time between
enabling/disabling the counters and entering/exiting the guest. When counting
guest events, we prevent counters from counting host events on the boundaries
of guest entry/exit by filtering out EL2 for exclude_host. However, when using
!exclude_hv there is a small blackout window at guest entry/exit where host
events are not captured.

On VHE systems there are no blackout windows.

Perf Userspace PMU Hardware Counter Access
==========================================

Overview
--------
The perf userspace tool relies on the PMU to monitor events. It offers an
abstraction layer over the hardware counters since the underlying
implementation is CPU-dependent.
Arm64 allows userspace tools to directly access the registers that store the
hardware counters' values.

This specifically targets self-monitoring tasks, reducing overhead by
accessing the registers directly without having to go through the kernel.

How-to
------
The focus is on the Armv8 PMUv3, which makes sure that access to the PMU
registers is enabled and that userspace has access to the relevant
information in order to use them.

In order to have access to the hardware counters, the global sysctl
kernel/perf_user_access must first be enabled:

.. code-block:: sh

  echo 1 > /proc/sys/kernel/perf_user_access
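
A self-monitoring tool may want to check the sysctl before attempting direct
access. The Python sketch below is a minimal illustration: the helper name
``perf_user_access_enabled`` is hypothetical, and it assumes the file holds
``1`` when access is enabled, as set by the command above.

.. code-block:: python

  from pathlib import Path

  def perf_user_access_enabled(path="/proc/sys/kernel/perf_user_access"):
      """Return True if the sysctl file at 'path' contains '1'."""
      try:
          return Path(path).read_text().strip() == "1"
      except OSError:
          # Missing or unreadable file: treat access as not enabled.
          return False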

The event must be opened using the perf tool interface with the config1:1
attr bit set. The sys_perf_event_open syscall returns an fd which can
subsequently be used with the mmap syscall in order to retrieve a page of memory
containing information about the event. The PMU driver uses this page to expose
to the user the hardware counter's index and other necessary data. Using this
index enables the user to access the PMU registers using the ``mrs`` instruction.
Access to the PMU registers is only valid while the sequence lock is unchanged.
In particular, the PMSELR_EL0 register is zeroed each time the sequence lock is
changed.
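
The sequence lock implies a retry loop around every counter read. The Python
sketch below models that loop, following the read sequence described in the
``perf_event_mmap_page`` comments of ``include/uapi/linux/perf_event.h``. The
``UserPage`` class is a hypothetical stand-in for the mmap'ed page (only the
fields used here), and ``read_counter`` is a hypothetical callable that takes
the place of the ``mrs`` instruction.

.. code-block:: python

  from dataclasses import dataclass

  @dataclass
  class UserPage:
      """Hypothetical stand-in for the fields of the mmap'ed event page."""
      lock: int             # sequence lock, changes when the page is updated
      index: int            # hardware counter index + 1; 0 means unavailable
      offset: int           # value to add to the raw hardware counter
      pmc_width: int        # number of valid bits in the raw counter
      cap_user_rdpmc: bool  # True if direct counter reads are permitted

  def read_count(page, read_counter):
      """Return the event count, or None if direct access is unavailable."""
      while True:
          seq = page.lock
          index = page.index
          count = None
          if page.cap_user_rdpmc and index:
              width = page.pmc_width
              # Keep only the valid bits, then sign-extend from 'width' bits.
              raw = read_counter(index - 1) & ((1 << width) - 1)
              if raw & (1 << (width - 1)):
                  raw -= 1 << width
              count = page.offset + raw
          # The values are only valid if the sequence lock did not change.
          if page.lock == seq:
              return count

In real code the reads of ``lock`` also need compiler barriers, and a result
of ``None`` here corresponds to falling back to reading the event through
the fd.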

The userspace access is supported in libperf using the perf_evsel__mmap()
and perf_evsel__read() functions. See `tools/lib/perf/tests/test-evsel.c`_ for
an example.

About heterogeneous systems
---------------------------
On heterogeneous systems such as big.LITTLE, userspace PMU counter access can
only be enabled when the tasks are pinned to a homogeneous subset of cores and
the corresponding PMU instance is opened by specifying the 'type' attribute.
The use of generic event types is not supported in this case.

Have a look at `tools/perf/arch/arm64/tests/user-events.c`_ for an example. It
can be run using the perf tool to check that the access to the registers works
correctly from userspace:

.. code-block:: sh

  perf test -v user

About chained events and counter sizes
--------------------------------------
The user can request either a 32-bit (config1:0 == 0) or 64-bit (config1:0 == 1)
counter along with userspace access. The sys_perf_event_open syscall will fail
if a 64-bit counter is requested and the hardware doesn't support 64-bit
counters. Chained events are not supported in conjunction with userspace counter
access. If a 32-bit counter is requested on hardware with 64-bit counters, then
userspace must treat the upper 32 bits read from the counter as UNKNOWN. The
'pmc_width' field in the user page will indicate the valid width of the counter
and should be used to mask the upper bits as needed.
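
The config1 bits and the masking step can be sketched as follows. This is a
minimal Python illustration of the rules above: the constant and function
names are hypothetical, and only the bit positions (config1:0 for a 64-bit
counter, config1:1 for userspace access) come from this document.

.. code-block:: python

  CONFIG1_LONG = 1 << 0         # config1:0, request a 64-bit counter
  CONFIG1_USER_ACCESS = 1 << 1  # config1:1, request userspace access

  def make_config1(long_counter, user_access=True):
      """Build the config1 value for the requested counter properties."""
      config1 = 0
      if long_counter:
          config1 |= CONFIG1_LONG
      if user_access:
          config1 |= CONFIG1_USER_ACCESS
      return config1

  def mask_counter(raw, pmc_width):
      """Drop the bits above 'pmc_width', which must be treated as UNKNOWN."""
      return raw & ((1 << pmc_width) - 1)

For example, a 32-bit counter with userspace access uses a config1 value of
``0x2``, and ``mask_counter(0xDEADBEEF00000007, 32)`` evaluates to ``7``.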

.. Links
.. _tools/perf/arch/arm64/tests/user-events.c:
   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/arch/arm64/tests/user-events.c
.. _tools/lib/perf/tests/test-evsel.c:
   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/lib/perf/tests/test-evsel.c