======================================================
HiSilicon SoC uncore Performance Monitoring Unit (PMU)
======================================================

The HiSilicon SoC chip includes various independent system device PMUs
such as L3 cache (L3C), Hydra Home Agent (HHA) and DDRC. Each of these PMUs
has its own hardware logic to gather statistics and performance
information.

The HiSilicon SoC encapsulates multiple CPU and IO dies. Each CPU cluster
(CCL) is made up of 4 CPU cores sharing one L3 cache; each CPU die is
called a Super CPU cluster (SCCL) and is made up of 6 CCLs. Each SCCL has
two HHAs (0 - 1) and four DDRCs (0 - 3).

HiSilicon SoC uncore PMU driver
-------------------------------

Each device PMU has separate registers for event counting, control and
interrupts, and the PMU driver registers perf PMU drivers for devices such
as L3C, HHA and DDRC. The available events and configuration options are
described in sysfs under either of::

  /sys/devices/hisi_sccl{X}_<l3c{Y}/hha{Y}/ddrc{Y}>/
  /sys/bus/event_source/devices/hisi_sccl{X}_<l3c{Y}/hha{Y}/ddrc{Y}>

The "perf list" command lists the available events from sysfs.

Each L3C, HHA and DDRC is registered as a separate PMU with perf. The PMU
name appears in the event listing as hisi_sccl<sccl-id>_module<index-id>,
where "sccl-id" is the identifier of the SCCL and "index-id" is the index of
the module.

e.g. hisi_sccl3_l3c0/rd_hit_cpipe is the READ_HIT_CPIPE event of L3C index #0
in SCCL ID #3.

e.g. hisi_sccl1_hha0/rx_operations is the RX_OPERATIONS event of HHA index #0
in SCCL ID #1.
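
The naming scheme above can be sketched as a small shell helper. This is an
illustration only: the function name hisi_pmu_event is hypothetical and is not
part of perf or of the driver. It merely assembles the event string from an
SCCL ID, a module name, a module index and an event name.

```shell
# Hypothetical helper (not part of perf or the driver): prints an event
# string of the form hisi_sccl<sccl-id>_<module><index-id>/<event>/.
hisi_pmu_event() {
    sccl_id=$1
    module=$2
    index_id=$3
    event=$4
    printf 'hisi_sccl%s_%s%s/%s/\n' "$sccl_id" "$module" "$index_id" "$event"
}

# Possible usage on a system that exposes this PMU:
#   perf stat -a -e "$(hisi_pmu_event 3 l3c 0 rd_hit_cpipe)" sleep 5
```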

The driver also provides a "cpumask" sysfs attribute, which shows the CPU core
ID used to count the uncore PMU events.
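
As a minimal sketch, the "cpumask" attribute of every registered HiSilicon
uncore PMU could be read as shown below. The function name hisi_pmu_cpumasks
and its optional directory argument are assumptions made for this example;
only the sysfs location and the attribute name come from the text above.

```shell
# Prints "<pmu-name> <cpumask>" for every HiSilicon uncore PMU found.
# The optional argument overrides the sysfs directory, which is only
# meant for trying the function on a machine without these PMUs.
hisi_pmu_cpumasks() {
    root=${1:-/sys/bus/event_source/devices}
    for dir in "$root"/hisi_sccl*; do
        # Skip an unexpanded glob and any PMU without a readable attribute.
        [ -r "$dir/cpumask" ] || continue
        printf '%s %s\n' "${dir##*/}" "$(cat "$dir/cpumask")"
    done
}
```

On a system with these PMUs, calling hisi_pmu_cpumasks without arguments
prints one line per PMU.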

Example usage of perf::

  $# perf list
  hisi_sccl3_l3c0/rd_hit_cpipe/ [kernel PMU event]
  ------------------------------------------
  hisi_sccl3_l3c0/wr_hit_cpipe/ [kernel PMU event]
  ------------------------------------------
  hisi_sccl1_l3c0/rd_hit_cpipe/ [kernel PMU event]
  ------------------------------------------
  hisi_sccl1_l3c0/wr_hit_cpipe/ [kernel PMU event]
  ------------------------------------------

  $# perf stat -a -e hisi_sccl3_l3c0/rd_hit_cpipe/ sleep 5
  $# perf stat -a -e hisi_sccl3_l3c0/config=0x02/ sleep 5

For HiSilicon uncore PMU v2, whose identifier is 0x30, the topology is the
same as PMU v1, but some new functions are added to the hardware.

(a) L3C PMU supports filtering by core/thread within the cluster, which can be
specified as a bitmap::

  $# perf stat -a -e hisi_sccl3_l3c0/config=0x02,tt_core=0x3/ sleep 5

This will only count the operations from core/thread 0 and 1 in this cluster.
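
The tt_core bitmap sets one bit per core/thread index, so the value 0x3 above
selects indexes 0 and 1. A small sketch of that arithmetic follows; the helper
name tt_core_bitmap is hypothetical and exists only for this example.

```shell
# Hypothetical helper: turns a list of core/thread indexes within the
# cluster into the hexadecimal bitmap passed as tt_core.
tt_core_bitmap() {
    mask=0
    for core in "$@"; do
        mask=$(( mask | (1 << core) ))   # set the bit for this index
    done
    printf '0x%x\n' "$mask"
}

# Possible usage:
#   perf stat -a -e "hisi_sccl3_l3c0/config=0x02,tt_core=$(tt_core_bitmap 0 1)/" sleep 5
```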

(b) Tracetag allows the user to choose to count only read, write or atomic
operations via the tt_req parameter in perf. The default value counts all
operations. tt_req is 3 bits wide: 3'b100 represents read operations, 3'b101
represents write operations, 3'b110 represents atomic store operations and
3'b111 represents atomic non-store operations. Other values are reserved::

  $# perf stat -a -e hisi_sccl3_l3c0/config=0x02,tt_req=0x4/ sleep 5

This will only count the read operations in this cluster.
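
The tt_req encodings listed above can be captured in a lookup helper, sketched
here. The function name tt_req_value and the operation names it accepts are
assumptions chosen for this example; only the numeric encodings come from the
text.

```shell
# Maps an operation type to the tt_req encoding described above.
# The type names used here are illustrative, not perf keywords.
tt_req_value() {
    case $1 in
        read)             echo 0x4 ;;   # 3'b100
        write)            echo 0x5 ;;   # 3'b101
        atomic-store)     echo 0x6 ;;   # 3'b110
        atomic-non-store) echo 0x7 ;;   # 3'b111
        *)
            echo "unknown operation type: $1" >&2
            return 1
            ;;
    esac
}

# Possible usage, counting only write operations:
#   perf stat -a -e "hisi_sccl3_l3c0/config=0x02,tt_req=$(tt_req_value write)/" sleep 5
```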

(c) Datasrc allows the user to check where the data comes from. It is 5 bits
wide. Some important codes, among others, are as follows:

- 5'b00001: comes from L3C in this die;
- 5'b01000: comes from L3C in the cross-die;
- 5'b01001: comes from L3C which is in another socket;
- 5'b01110: comes from the local DDR;
- 5'b01111: comes from the cross-die DDR;
- 5'b10000: comes from cross-socket DDR;

It is mainly useful for finding out whether the data comes from the source
nearest to the CPU cores. If datasrc_cfg is used on a multi-chip system,
datasrc_skt shall be configured in the perf command::

  $# perf stat -a -e hisi_sccl3_l3c0/config=0xb9,datasrc_cfg=0xE/ \
       -e hisi_sccl3_l3c0/config=0xb9,datasrc_cfg=0xF/ sleep 5
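
The datasrc codes are listed in binary, while the perf command takes them in
hexadecimal (5'b01110 is 0xE and 5'b01111 is 0xF). The conversion can be
sketched as follows; the helper name bin_to_hex is hypothetical and is not a
perf feature.

```shell
# Hypothetical helper: converts a string of binary digits, such as the
# datasrc codes above without the 5'b prefix, into hexadecimal.
bin_to_hex() {
    case $1 in
        ''|*[!01]*) return 1 ;;          # reject anything but 0 and 1
    esac
    bits=$1
    value=0
    while [ -n "$bits" ]; do
        rest=${bits#?}                   # all but the first digit
        digit=${bits%"$rest"}            # the first digit
        value=$(( value * 2 + digit ))
        bits=$rest
    done
    printf '0x%x\n' "$value"
}
```

The helper prints lower-case digits, so bin_to_hex 01110 gives 0xe rather than
the 0xE spelling used in the command above.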

(d) Some HiSilicon SoCs encapsulate multiple CPU and IO dies. Each CPU die
contains several Compute Clusters (CCLs). The I/O dies are called Super I/O
clusters (SICL) and contain multiple I/O clusters (ICLs). Each CCL/ICL in the
SoC has a unique ID. Each ID is 11 bits wide and includes a 6-bit SCCL-ID and
a 5-bit CCL/ICL-ID. For an I/O die, the ICL-ID is encoded as follows:

- 5'b00000: I/O_MGMT_ICL;
- 5'b00001: Network_ICL;
- 5'b00011: HAC_ICL;
- 5'b10000: PCIe_ICL;

Users can configure IDs to count data coming from a specific CCL/ICL by
setting srcid_cmd and srcid_msk, and data destined for a specific CCL/ICL by
setting tgtid_cmd and tgtid_msk. A set bit in srcid_msk/tgtid_msk means the
PMU will not check the bit when matching against the srcid_cmd/tgtid_cmd.
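
The mask semantics described above can be modelled in a few lines of shell.
This is a sketch under a stated assumption: the text gives the field widths
but not the bit order, so the example assumes that the 6-bit SCCL-ID occupies
the upper bits and the 5-bit CCL/ICL-ID the lower bits of the 11-bit ID. The
function names make_id and id_matches are hypothetical.

```shell
# ASSUMPTION: SCCL-ID in bits [10:5], CCL/ICL-ID in bits [4:0].
# The bit order is not stated in this document.
make_id() {
    printf '0x%x\n' $(( (($1 & 0x3f) << 5) | ($2 & 0x1f) ))
}

# Succeeds when ID equals CMD on every bit that is NOT set in MSK,
# mirroring "a set bit in the mask means the bit is not checked".
id_matches() {
    [ $(( ($1 ^ $2) & ~$3 & 0x7ff )) -eq 0 ]
}
```

Under that assumption, a command value of 0x60 with a mask of 0x1f would match
every CCL in SCCL 3, because the five CCL-ID bits are excluded from the
comparison.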

If all of these options are disabled, the PMU uses the default values, which
do not filter on any condition or ID information, and the PMU counters return
the total counts.

The current driver does not support sampling, so "perf record" is unsupported.
Attaching to a task is also unsupported because all the events are uncore
events.

Note: Please contact the maintainer for a complete list of the events
supported by the PMU devices in the SoC and for further information if needed.