0001 .. SPDX-License-Identifier: GPL-2.0
0002 
0003 ============================================================
0004 Intel(R) Speed Select Technology User Guide
0005 ============================================================
0006 
0007 The Intel(R) Speed Select Technology (Intel(R) SST) provides a powerful new
0008 collection of features that give more granular control over CPU performance.
0009 With Intel(R) SST, one server can be configured for power and performance for a
0010 variety of diverse workload requirements.
0011 
0012 Refer to the links below for an overview of the technology:
0013 
0014 - https://www.intel.com/content/www/us/en/architecture-and-technology/speed-select-technology-article.html
0015 - https://builders.intel.com/docs/networkbuilders/intel-speed-select-technology-base-frequency-enhancing-performance.pdf
0016 
0017 These capabilities are further enhanced in some of the newer generations of
0018 server platforms where these features can be enumerated and controlled
0019 dynamically without pre-configuring via BIOS setup options. This dynamic
0020 configuration is done via mailbox commands to the hardware. One way to enumerate
0021 and configure these features is by using the Intel Speed Select utility.
0022 
0023 This document explains how to use the Intel Speed Select tool to enumerate and
0024 control Intel(R) SST features. This document gives example commands and explains
0025 how these commands change the power and performance profile of the system under
0026 test. Using this tool as an example, customers can replicate the messaging
0027 implemented in the tool in their production software.
0028 
0029 intel-speed-select configuration tool
0030 ======================================
0031 
Linux distribution packages may include the "intel-speed-select" tool. If not,
it can be built from the Linux kernel source tree, which can be downloaded from
kernel.org. The tool can be built without building the full kernel.
0035 
0036 From the kernel tree, run the following commands::
0037 
0038 # cd tools/power/x86/intel-speed-select/
0039 # make
0040 # make install
0041 
0042 Getting Help
0043 ------------
0044 
0045 To get help with the tool, execute the command below::
0046 
0047 # intel-speed-select --help
0048 
0049 The top-level help describes arguments and features. Notice that there is a
0050 multi-level help structure in the tool. For example, to get help for the feature "perf-profile"::
0051 
0052 # intel-speed-select perf-profile --help
0053 
To get help on a command, another level of help is provided. For example, for the command "info"::
0055 
0056 # intel-speed-select perf-profile info --help
0057 
0058 Summary of platform capability
0059 ------------------------------
0060 To check the current platform and driver capabilities, execute::
0061 
# intel-speed-select --info
0063 
0064 For example on a test system::
0065 
0066  # intel-speed-select --info
0067  Intel(R) Speed Select Technology
0068  Executing on CPU model: X
0069  Platform: API version : 1
0070  Platform: Driver version : 1
0071  Platform: mbox supported : 1
0072  Platform: mmio supported : 1
0073  Intel(R) SST-PP (feature perf-profile) is supported
0074  TDP level change control is unlocked, max level: 4
0075  Intel(R) SST-TF (feature turbo-freq) is supported
0076  Intel(R) SST-BF (feature base-freq) is not supported
0077  Intel(R) SST-CP (feature core-power) is supported
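
When this check is scripted, the supported features can be picked out of the
report with a small filter. The following is a minimal sketch, not part of the
tool: the function name "sst_supported_features" is hypothetical, and it
assumes the report lines keep the exact "is supported" / "is not supported"
wording shown above.

```shell
# Hypothetical helper: read "intel-speed-select --info" output on stdin and
# print the name of every feature reported as supported, one per line.
# Lines that say "is not supported" do not match and are skipped.
sst_supported_features() {
    sed -n 's/^ *Intel(R) SST-[A-Z]* (feature \([a-z-]*\)) is supported$/\1/p'
}
```

A possible invocation is "intel-speed-select --info 2>&1 | sst_supported_features".
The "2>&1" is included on the assumption that the tool may write its report to
standard error. For the report above, the filter would print perf-profile,
turbo-freq and core-power.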
0078 
0079 Intel(R) Speed Select Technology - Performance Profile (Intel(R) SST-PP)
0080 ------------------------------------------------------------------------
0081 
This feature allows configuration of a server dynamically based on workload
performance requirements. This helps users during deployment as they do not have
to choose a specific server configuration statically. The Intel(R) Speed Select
Technology - Performance Profile (Intel(R) SST-PP) feature introduces a mechanism
that allows multiple optimized performance profiles per system. Each profile
defines the set of CPUs that may be online, with the rest offline, to sustain a
guaranteed base frequency. Once the user issues a command to use a specific
performance profile and meets the CPU online/offline requirement, the base
frequency changes dynamically. This feature is called "perf-profile" when using
the Intel Speed Select tool.
0092 
Number of performance levels
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
0095 
0096 There can be multiple performance profiles on a system. To get the number of
0097 profiles, execute the command below::
0098 
0099  # intel-speed-select perf-profile get-config-levels
0100  Intel(R) Speed Select Technology
0101  Executing on CPU model: X
0102  package-0
0103   die-0
0104     cpu-0
0105         get-config-levels:4
0106  package-1
0107   die-0
0108     cpu-14
0109         get-config-levels:4
0110 
0111 On this system under test, there are 4 performance profiles in addition to the
0112 base performance profile (which is performance level 0).
0113 
0114 Lock/Unlock status
0115 ~~~~~~~~~~~~~~~~~~
0116 
Even if there are multiple performance profiles, it is possible that they
are locked. If they are locked, users cannot issue a command to change the
performance state. A BIOS setup option may be available to unlock them;
otherwise, check with your system vendor.
0121 
0122 To check if the system is locked, execute the following command::
0123 
0124  # intel-speed-select perf-profile get-lock-status
0125  Intel(R) Speed Select Technology
0126  Executing on CPU model: X
0127  package-0
0128   die-0
0129     cpu-0
0130         get-lock-status:0
0131  package-1
0132   die-0
0133     cpu-14
0134         get-lock-status:0
0135 
0136 In this case, lock status is 0, which means that the system is unlocked.
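
On multi-package systems it can be convenient to reduce this output to one line
per package. The following is a minimal sketch, assuming the indented
"package-N / die-N / field:value" layout shown above; the helper name
"sst_field" is hypothetical and not part of the tool.

```shell
# Hypothetical helper: read intel-speed-select output on stdin and print
# "<package> <die> <value>" for every line of the form "<field>:<value>".
# Only the first whitespace-delimited token of the value is kept.
sst_field() {
    awk -v field="$1" '
        $1 ~ /^package-/ { pkg = $1 }     # remember the current package
        $1 ~ /^die-/     { die = $1 }     # remember the current die
        index($1, field ":") == 1 {
            print pkg, die, substr($1, length(field) + 2)
        }
    '
}
```

For example, "intel-speed-select perf-profile get-lock-status 2>&1 | sst_field get-lock-status"
would be expected to print "package-0 die-0 0" and "package-1 die-0 0" for the
output above. The "2>&1" assumes that the tool may write to standard error.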
0137 
0138 Properties of a performance level
0139 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
0140 
To get the properties of a specific performance level (level 0 in this example), execute::
0142 
0143  # intel-speed-select perf-profile info -l 0
0144  Intel(R) Speed Select Technology
0145  Executing on CPU model: X
0146  package-0
0147   die-0
0148     cpu-0
0149       perf-profile-level-0
0150         cpu-count:28
0151         enable-cpu-mask:000003ff,f0003fff
0152         enable-cpu-list:0,1,2,3,4,5,6,7,8,9,10,11,12,13,28,29,30,31,32,33,34,35,36,37,38,39,40,41
0153         thermal-design-power-ratio:26
0154         base-frequency(MHz):2600
0155         speed-select-turbo-freq:disabled
0156         speed-select-base-freq:disabled
0157         ...
0158         ...
0159 
Here, the -l option specifies a performance level, so the above command prints
the properties of performance level 0.

If the -l option is omitted, the command prints information about all
performance levels.

For this performance profile, at most the CPUs listed in
"enable-cpu-mask/enable-cpu-list" can be online. When that condition is met,
the base frequency of 2600 MHz can be maintained. To understand more, execute
"intel-speed-select perf-profile info" for performance level 4::
0171 
0172  # intel-speed-select perf-profile info -l 4
0173  Intel(R) Speed Select Technology
0174  Executing on CPU model: X
0175  package-0
0176   die-0
0177     cpu-0
0178       perf-profile-level-4
0179         cpu-count:28
0180         enable-cpu-mask:000000fa,f0000faf
0181         enable-cpu-list:0,1,2,3,5,7,8,9,10,11,28,29,30,31,33,35,36,37,38,39
0182         thermal-design-power-ratio:28
0183         base-frequency(MHz):2800
0184         speed-select-turbo-freq:disabled
0185         speed-select-base-freq:unsupported
0186         ...
0187         ...
0188 
0189 There are fewer CPUs in the "enable-cpu-mask/enable-cpu-list". Consequently, if
0190 the user only keeps these CPUs online and the rest "offline," then the base
0191 frequency is increased to 2.8 GHz compared to 2.6 GHz at performance level 0.
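
The "enable-cpu-mask" and "enable-cpu-list" fields carry the same information.
The sketch below shows how the mask can be decoded, assuming it is a
comma-separated list of 32-bit hexadecimal words with the most significant word
first (which is consistent with the two outputs above). The function name
"mask_to_cpu_list" is hypothetical.

```shell
# Hypothetical helper: decode a CPU mask such as "000003ff,f0003fff" into a
# comma-separated CPU list. Words are processed from the rightmost (least
# significant) to the leftmost, 32 CPUs per word.
mask_to_cpu_list() {
    rest=$1
    base=0
    out=
    while [ -n "$rest" ]; do
        word=${rest##*,}                 # last remaining 32-bit word
        case $rest in
            *,*) rest=${rest%,*} ;;      # drop the word just taken
            *)   rest= ;;
        esac
        value=$(( 0x$word ))
        bit=0
        while [ "$bit" -lt 32 ]; do
            if [ $(( (value >> bit) & 1 )) -eq 1 ]; then
                out="${out:+$out,}$(( base + bit ))"
            fi
            bit=$(( bit + 1 ))
        done
        base=$(( base + 32 ))
    done
    printf '%s\n' "$out"
}
```

Calling "mask_to_cpu_list 000000fa,f0000faf" reproduces the enable-cpu-list
shown for performance level 4.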
0192 
0193 Get current performance level
0194 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
0195 
0196 To get the current performance level, execute::
0197 
0198  # intel-speed-select perf-profile get-config-current-level
0199  Intel(R) Speed Select Technology
0200  Executing on CPU model: X
0201  package-0
0202   die-0
0203     cpu-0
0204         get-config-current_level:0
0205 
0206 First verify that the base_frequency displayed by the cpufreq sysfs is correct::
0207 
0208  # cat /sys/devices/system/cpu/cpu0/cpufreq/base_frequency
0209  2600000
0210 
This matches the base-frequency(MHz) field value displayed by the
"perf-profile info" command for performance level 0 (the cpufreq frequency is
in kHz).
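
The unit conversion can be folded into a small check. This is an illustrative
sketch only: the function name "base_freq_matches" is hypothetical, and the
expected value is whatever the "perf-profile info" command reported.

```shell
# Hypothetical helper: succeed when the cpufreq base_frequency file (in kHz)
# matches the expected base frequency given in MHz.
base_freq_matches() {
    sysfs_file=$1      # e.g. /sys/devices/system/cpu/cpu0/cpufreq/base_frequency
    expected_mhz=$2    # e.g. 2600, taken from "perf-profile info"
    khz=$(cat "$sysfs_file") || return 2
    [ $(( khz / 1000 )) -eq "$expected_mhz" ]
}
```

For the system above,
"base_freq_matches /sys/devices/system/cpu/cpu0/cpufreq/base_frequency 2600"
would be expected to succeed at performance level 0.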
0214 
0215 To check if the average frequency is equal to the base frequency for a 100% busy
0216 workload, disable turbo::
0217 
0218 # echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo
0219 
Then run a busy workload on all CPUs, for example::

# stress -c 64
0223 
0224 To verify the base frequency, run turbostat::
0225 
0226  #turbostat -c 0-13 --show Package,Core,CPU,Bzy_MHz -i 1
0227 
0228   Package       Core    CPU     Bzy_MHz
0229                 -       -       2600
0230   0             0       0       2600
0231   0             1       1       2600
0232   0             2       2       2600
0233   0             3       3       2600
0234   0             4       4       2600
0235   .             .       .       .
0236 
0237 
0238 Changing performance level
0239 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
0240 
To change the performance level to 4, execute::
0242 
0243  # intel-speed-select -d perf-profile set-config-level -l 4 -o
0244  Intel(R) Speed Select Technology
0245  Executing on CPU model: X
0246  package-0
0247   die-0
0248     cpu-0
0249       perf-profile
0250         set_tdp_level:success
0251 
In the command above, "-o" is optional. If it is specified, the tool also
takes offline the CPUs that are not present in the enable-cpu-mask for this
performance level.
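
When "-o" is not used, the CPUs to take offline have to be worked out by hand.
The sketch below computes them for one package as the difference between two
"enable-cpu-list" values, for example the lists of level 0 and level 4. The
function name "cpus_to_offline" is hypothetical, and the sketch only prints the
CPU numbers; it does not change any CPU state.

```shell
# Hypothetical helper: print the CPUs that appear in the first comma-separated
# list (candidates) but not in the second (CPUs allowed to stay online).
cpus_to_offline() {
    enabled=",$2,"
    out=
    old_ifs=$IFS
    IFS=,
    for cpu in $1; do
        case $enabled in
            *",$cpu,"*) ;;                        # listed, stays online
            *) out="${out:+$out }$cpu" ;;         # not listed, take offline
        esac
    done
    IFS=$old_ifs
    printf '%s\n' "$out"
}
```

With the level 0 and level 4 lists shown earlier, this prints
"4 6 12 13 32 34 40 41". Each of these CPUs could then be taken offline through
the standard Linux CPU hotplug file /sys/devices/system/cpu/cpuN/online.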
0255 
0256 Now if the base_frequency is checked::
0257 
0258  #cat /sys/devices/system/cpu/cpu0/cpufreq/base_frequency
0259  2800000
0260 
This shows that the base frequency has increased from 2600 MHz at performance
level 0 to 2800 MHz at performance level 4. As a result, any workload that can
run on fewer CPUs can see a boost of 200 MHz compared to performance level 0.
0264 
0265 Changing performance level via BMC Interface
0266 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
0267 
It is possible to change the SST-PP level using an out-of-band (OOB) agent (via
a remote management console, through the BMC "Baseboard Management Controller"
interface). This mode is supported from the Sapphire Rapids processor
generation. The kernel and tool changes to support this mode were added in
Linux kernel version 5.18. To enable this feature, the kernel config
"CONFIG_INTEL_HFI_THERMAL" is required. The minimum version of the tool that
supports this feature is "v1.12", which is part of Linux kernel version 5.18.
0275 
To support such a configuration, the tool can run as a daemon when the command
line option --oob is added::
0278 
0279  # intel-speed-select --oob
0280  Intel(R) Speed Select Technology
0281  Executing on CPU model:143[0x8f]
0282  OOB mode is enabled and will run as daemon
0283 
0284 In this mode the tool will online/offline CPUs based on the new performance
0285 level.
0286 
0287 Check presence of other Intel(R) SST features
0288 ---------------------------------------------
0289 
Each of the performance profiles also specifies whether the other two
Intel(R) SST features are supported: Intel(R) Speed Select Technology - Base
Frequency (Intel(R) SST-BF) and Intel(R) Speed Select Technology - Turbo
Frequency (Intel(R) SST-TF).
0294 
0295 For example, from the output of "perf-profile info" above, for level 0 and level
0296 4:
0297 
For level 0::

       speed-select-turbo-freq:disabled
       speed-select-base-freq:disabled

For level 4::

       speed-select-turbo-freq:disabled
       speed-select-base-freq:unsupported
0305 
0306 Given these results, the "speed-select-base-freq" (Intel(R) SST-BF) in level 4
0307 changed from "disabled" to "unsupported" compared to performance level 0.
0308 
This means that at performance level 4, the "speed-select-base-freq" feature is
not supported. However, at performance level 0, this feature is "supported", but
currently "disabled", meaning the user has not activated this feature. In
contrast, "speed-select-turbo-freq" (Intel(R) SST-TF) is supported at both
performance levels, but is currently not activated by the user.
0314 
0315 The Intel(R) SST-BF and the Intel(R) SST-TF features are built on a foundation
0316 technology called Intel(R) Speed Select Technology - Core Power (Intel(R) SST-CP).
0317 The platform firmware enables this feature when Intel(R) SST-BF or Intel(R) SST-TF
0318 is supported on a platform.
0319 
0320 Intel(R) Speed Select Technology Core Power (Intel(R) SST-CP)
0321 ---------------------------------------------------------------
0322 
Intel(R) Speed Select Technology Core Power (Intel(R) SST-CP) is an interface that
allows users to define per-core priority. It provides a mechanism to distribute
power among cores in a power-constrained scenario, based on a class of service
(CLOS) configuration.

The user can configure up to 4 classes of service. Each CLOS group
configuration allows the definition of parameters that affect how the frequency
can be limited and how power is distributed. Each CPU core can be tied to a
class of service and hence to an associated priority. The granularity is at the
core level, not at the per-CPU level.
0333 
0334 Enable CLOS based prioritization
0335 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
0336 
To use the CLOS based prioritization feature, firmware must be informed to
enable it and to use a priority type. There is a default per-platform priority
type, which can be changed with an optional command line parameter.
0340 
0341 To enable and check the options, execute::
0342 
0343  # intel-speed-select core-power enable --help
0344  Intel(R) Speed Select Technology
0345  Executing on CPU model: X
0346  Enable core-power for a package/die
0347         Clos Enable: Specify priority type with [--priority|-p]
0348                  0: Proportional, 1: Ordered
0349 
There are two priority types:

- Ordered

Priority for ordered throttling is defined based on the index of the assigned
CLOS group, where CLOS0 gets the highest priority (throttled last).

Priority order is:
CLOS0 > CLOS1 > CLOS2 > CLOS3.
0359 
0360 - Proportional
0361 
When proportional priority is used, there is an additional parameter called
frequency_weight, which can be specified per CLOS group. The goal of
proportional priority is to provide each core with the requested minimum, then
distribute all remaining (excess/deficit) budgets in proportion to a defined
weight. Proportional priority can be configured using the "core-power config"
command.
0368 
0369 To enable with the platform default priority type, execute::
0370 
0371  # intel-speed-select core-power enable
0372  Intel(R) Speed Select Technology
0373  Executing on CPU model: X
0374  package-0
0375   die-0
0376     cpu-0
0377       core-power
0378         enable:success
0379  package-1
0380   die-0
0381     cpu-6
0382       core-power
0383         enable:success
0384 
The scope of this enable is per package, or per die when a package contains
multiple dies. To check if CLOS is enabled and to get the priority type, the
"core-power info" command can be used. For example, to check the status of the
core-power feature on CPU 0, execute::
0389 
0390  # intel-speed-select -c 0 core-power info
0391  Intel(R) Speed Select Technology
0392  Executing on CPU model: X
0393  package-0
0394   die-0
0395     cpu-0
0396       core-power
0397         support-status:supported
0398         enable-status:enabled
0399         clos-enable-status:enabled
0400         priority-type:proportional
0401  package-1
0402   die-0
0403     cpu-24
0404       core-power
0405         support-status:supported
0406         enable-status:enabled
0407         clos-enable-status:enabled
0408         priority-type:proportional
0409 
0410 Configuring CLOS groups
0411 ~~~~~~~~~~~~~~~~~~~~~~~
0412 
Each CLOS group has its own attributes, including min, max, freq_weight and
desired. These parameters can be configured with the "core-power config" command.
Defaults are used for any parameter the user does not set, except the clos id,
which is mandatory. To check the core-power config options, execute::
0417 
0418  # intel-speed-select core-power config --help
0419  Intel(R) Speed Select Technology
0420  Executing on CPU model: X
0421  Set core-power configuration for one of the four clos ids
0422         Specify targeted clos id with [--clos|-c]
0423         Specify clos Proportional Priority [--weight|-w]
0424         Specify clos min in MHz with [--min|-n]
0425         Specify clos max in MHz with [--max|-m]
0426 
0427 For example::
0428 
0429  # intel-speed-select core-power config -c 0
0430  Intel(R) Speed Select Technology
0431  Executing on CPU model: X
0432  clos epp is not specified, default: 0
0433  clos frequency weight is not specified, default: 0
0434  clos min is not specified, default: 0 MHz
0435  clos max is not specified, default: 25500 MHz
0436  clos desired is not specified, default: 0
0437  package-0
0438   die-0
0439     cpu-0
0440       core-power
0441         config:success
0442  package-1
0443   die-0
0444     cpu-6
0445       core-power
0446         config:success
0447 
The user has the option to change the defaults. For example, the user can set
"min" to the base frequency so that the guaranteed base frequency is always
available.
0450 
0451 Get the current CLOS configuration
0452 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
0453 
0454 To check the current configuration, "core-power get-config" can be used. For
0455 example, to get the configuration of CLOS 0::
0456 
0457  # intel-speed-select core-power get-config -c 0
0458  Intel(R) Speed Select Technology
0459  Executing on CPU model: X
0460  package-0
0461   die-0
0462     cpu-0
0463       core-power
0464         clos:0
0465         epp:0
0466         clos-proportional-priority:0
0467         clos-min:0 MHz
0468         clos-max:Max Turbo frequency
0469         clos-desired:0 MHz
0470  package-1
0471   die-0
0472     cpu-24
0473       core-power
0474         clos:0
0475         epp:0
0476         clos-proportional-priority:0
0477         clos-min:0 MHz
0478         clos-max:Max Turbo frequency
0479         clos-desired:0 MHz
0480 
0481 Associating a CPU with a CLOS group
0482 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
0483 
To associate a CPU with a CLOS group, the "core-power assoc" command can be used::
0485 
0486  # intel-speed-select core-power assoc --help
0487  Intel(R) Speed Select Technology
0488  Executing on CPU model: X
0489  Associate a clos id to a CPU
0490         Specify targeted clos id with [--clos|-c]
0491 
0492 
For example, to associate CPU 10 with CLOS group 3, execute::
0494 
0495  # intel-speed-select -c 10 core-power assoc -c 3
0496  Intel(R) Speed Select Technology
0497  Executing on CPU model: X
0498  package-0
0499   die-0
0500     cpu-10
0501       core-power
0502         assoc:success
0503 
Once a CPU is associated, its sibling CPUs are also associated with the same
CLOS group. Once associated, avoid changing the Linux "cpufreq" subsystem
scaling frequency limits.
0507 
0508 To check the existing association for a CPU, "core-power get-assoc" command can
0509 be used. For example, to get association of CPU 10, execute::
0510 
0511  # intel-speed-select -c 10 core-power get-assoc
0512  Intel(R) Speed Select Technology
0513  Executing on CPU model: X
0514  package-1
0515   die-0
0516     cpu-10
0517       get-assoc
0518         clos:3
0519 
This shows that CPU 10 is part of CLOS group 3.
0521 
0522 
0523 Disable CLOS based prioritization
0524 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
0525 
0526 To disable, execute::
0527 
0528 # intel-speed-select core-power disable
0529 
Some features like Intel(R) SST-TF can only be enabled when CLOS based prioritization
is enabled. For this reason, disabling it while Intel(R) SST-TF is enabled can cause
Intel(R) SST-TF to fail, so the "disable" command displays an error if
Intel(R) SST-TF is already enabled. To disable CLOS based prioritization, the
Intel(R) SST-TF feature must be disabled first.
0535 
0536 Intel(R) Speed Select Technology - Base Frequency (Intel(R) SST-BF)
0537 -------------------------------------------------------------------
0538 
The Intel(R) Speed Select Technology - Base Frequency (Intel(R) SST-BF) feature lets
the user control base frequency. If some critical workload threads demand
constant high guaranteed performance, then this feature can be used to execute
the thread at higher base frequency on specific sets of CPUs (high priority
CPUs) at the cost of lower base frequency (low priority CPUs) on other CPUs.
This feature does not require taking the low priority CPUs offline.

The support of Intel(R) SST-BF depends on the Intel(R) Speed Select Technology -
Performance Profile (Intel(R) SST-PP) performance level configuration. It is
possible that only certain performance levels support Intel(R) SST-BF. It is also
possible that only the base performance level (level = 0) supports
Intel(R) SST-BF. Consequently, first select the desired performance level to
enable this feature.
0552 
In the system under test here, Intel(R) SST-BF is supported at the base
performance level 0, but is currently disabled, as shown for level 0::
0555 
0556  # intel-speed-select -c 0 perf-profile info -l 0
0557  Intel(R) Speed Select Technology
0558  Executing on CPU model: X
0559  package-0
0560   die-0
0561     cpu-0
0562       perf-profile-level-0
0563         ...
0564 
0565         speed-select-base-freq:disabled
0566         ...
0567 
Before enabling Intel(R) SST-BF and measuring its impact on workload
performance, execute a workload and measure its performance to get a baseline
to compare against.

Here the user wants more guaranteed performance. For this reason, it is likely
that turbo is disabled. To disable turbo, execute::

# echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo

Based on the output of "intel-speed-select perf-profile info -l 0", the
guaranteed base frequency is 2600 MHz.
0579 
0580 
0581 Measure baseline performance for comparison
0582 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
0583 
To compare, pick a multi-threaded workload where each thread can be scheduled on
a separate CPU. The "Hackbench pipe" test is a good example of how to improve
performance using Intel(R) SST-BF.
0587 
0588 Below, the workload is measuring average scheduler wakeup latency, so a lower
0589 number means better performance::
0590 
0591  # taskset -c 3,4 perf bench -r 100 sched pipe
0592  # Running 'sched/pipe' benchmark:
0593  # Executed 1000000 pipe operations between two processes
0594      Total time: 6.102 [sec]
0595        6.102445 usecs/op
0596          163868 ops/sec
0597 
While running the above test, the turbostat output shows that 2 of the CPUs are
busy and reaching the maximum frequency (which is the base frequency, as turbo
is disabled). The turbostat output::
0601 
0602  #turbostat -c 0-13 --show Package,Core,CPU,Bzy_MHz -i 1
0603  Package        Core    CPU     Bzy_MHz
0604  0              0       0       1000
0605  0              1       1       1005
0606  0              2       2       1000
0607  0              3       3       2600
0608  0              4       4       2600
0609  0              5       5       1000
0610  0              6       6       1000
0611  0              7       7       1005
0612  0              8       8       1005
0613  0              9       9       1000
0614  0              10      10      1000
0615  0              11      11      995
0616  0              12      12      1000
0617  0              13      13      1000
0618 
0619 From the above turbostat output, both CPU 3 and 4 are very busy and reaching
0620 full guaranteed frequency of 2600 MHz.
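
Picking the busy CPUs out of a longer turbostat listing can be automated. The
sketch below assumes the four-column layout requested above (Package, Core, CPU,
Bzy_MHz). The function name "busy_cpus_at_or_above" is hypothetical.

```shell
# Hypothetical helper: read turbostat output with the columns
# "Package Core CPU Bzy_MHz" on stdin and print every CPU whose Bzy_MHz is at
# least the given threshold. Header and summary rows are skipped because their
# CPU column is not a number.
busy_cpus_at_or_above() {
    awk -v min="$1" '
        $1 == "Package" { next }
        $3 ~ /^[0-9]+$/ && $4 + 0 >= min + 0 { print $3 }
    '
}
```

Feeding the listing above to "busy_cpus_at_or_above 2600" prints CPU 3 and
CPU 4, one per line.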
0621 
0622 Intel(R) SST-BF Capabilities
0623 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
0624 
0625 To get capabilities of Intel(R) SST-BF for the current performance level 0,
0626 execute::
0627 
0628  # intel-speed-select base-freq info -l 0
0629  Intel(R) Speed Select Technology
0630  Executing on CPU model: X
0631  package-0
0632   die-0
0633     cpu-0
0634       speed-select-base-freq
0635         high-priority-base-frequency(MHz):3000
0636         high-priority-cpu-mask:00000216,00002160
0637         high-priority-cpu-list:5,6,8,13,33,34,36,41
0638         low-priority-base-frequency(MHz):2400
0639         tjunction-temperature(C):125
0640         thermal-design-power(W):205
0641 
The above capabilities show that there are some CPUs on this system that can
offer a base frequency of 3000 MHz, compared to the standard base frequency at
this performance level. These CPUs are fixed, and they are presented via
high-priority-cpu-list/high-priority-cpu-mask. If the Intel(R) SST-BF feature
is selected, the low priority CPUs (which are not in high-priority-cpu-list)
can only offer up to 2400 MHz. If this clipping of low priority CPUs is
acceptable, the user can enable the Intel(R) SST-BF feature. It suits the above
"sched pipe" workload in particular: since only two CPUs are used, they can be
scheduled on high priority CPUs and get a boost of 400 MHz.
0652 
0653 Enable Intel(R) SST-BF
0654 ~~~~~~~~~~~~~~~~~~~~~~
0655 
0656 To enable Intel(R) SST-BF feature, execute::
0657 
0658  # intel-speed-select base-freq enable -a
0659  Intel(R) Speed Select Technology
0660  Executing on CPU model: X
0661  package-0
0662   die-0
0663     cpu-0
0664       base-freq
0665         enable:success
0666  package-1
0667   die-0
0668     cpu-14
0669       base-freq
0670         enable:success
0671 
In this case, the -a option is optional. When used, it not only enables
Intel(R) SST-BF, but also adjusts the priority of cores using Intel(R) Speed
Select Technology Core Power (Intel(R) SST-CP) features. This option sets the
minimum performance of each Intel(R) Speed Select Technology - Performance
Profile (Intel(R) SST-PP) class to maximum performance so that the hardware
will give maximum performance possible for each CPU.

If the -a option is not used, then the following steps are required before
enabling Intel(R) SST-BF:
0681 
0682 - Discover Intel(R) SST-BF and note low and high priority base frequency
0683 - Note the high priority CPU list
0684 - Enable CLOS using core-power feature set
0685 - Configure CLOS parameters. Use CLOS.min to set to minimum performance
0686 - Subscribe desired CPUs to CLOS groups
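
The manual steps above can be sketched as a dry run that only prints the
commands, so they can be reviewed before anything is executed. Everything beyond
the subcommands and options documented in this guide is an assumption made for
illustration: the function name "plan_sst_bf_enable" is hypothetical, and so is
the choice of CLOS 0 for high priority CPUs and CLOS 3 for the others, with
"min" set to the high and low priority base frequencies.

```shell
# Hypothetical dry-run helper: print (do not run) a possible command sequence
# for enabling SST-BF without the -a option.
# Arguments: high priority CPU list, high priority base frequency (MHz),
#            low priority base frequency (MHz), all from "base-freq info".
plan_sst_bf_enable() {
    high_cpus=$1
    high_mhz=$2
    low_mhz=$3
    echo "intel-speed-select core-power enable"
    # ASSUMPTION: CLOS 0 serves high priority CPUs, CLOS 3 serves the rest.
    echo "intel-speed-select core-power config -c 0 --min $high_mhz"
    echo "intel-speed-select core-power config -c 3 --min $low_mhz"
    old_ifs=$IFS
    IFS=,
    for cpu in $high_cpus; do
        echo "intel-speed-select -c $cpu core-power assoc -c 0"
    done
    IFS=$old_ifs
    # Low priority CPUs would be associated with CLOS 3 in the same way.
    echo "intel-speed-select base-freq enable"
}
```

For the capabilities shown earlier, "plan_sst_bf_enable 5,6,8,13,33,34,36,41 3000 2400"
prints one "core-power assoc" line per high priority CPU between the CLOS
configuration and the final "base-freq enable" command.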
0687 
With this configuration, execute the same workload pinned to high priority
CPUs (CPU 5 and 6 in this case)::
0690 
0691  #taskset -c 5,6 perf bench -r 100 sched pipe
0692  # Running 'sched/pipe' benchmark:
0693  # Executed 1000000 pipe operations between two processes
0694      Total time: 5.627 [sec]
0695        5.627922 usecs/op
0696          177685 ops/sec
0697 
This way, by enabling Intel(R) SST-BF, the performance of this benchmark is
improved (latency reduced) by about 7.8%. From the turbostat output, it can be
observed that the high priority CPUs reached 3000 MHz compared to 2600 MHz.
The turbostat output::
0702 
0703  #turbostat -c 0-13 --show Package,Core,CPU,Bzy_MHz -i 1
0704  Package        Core    CPU     Bzy_MHz
0705  0              0       0       2151
0706  0              1       1       2166
0707  0              2       2       2175
0708  0              3       3       2175
0709  0              4       4       2175
0710  0              5       5       3000
0711  0              6       6       3000
0712  0              7       7       2180
0713  0              8       8       2662
0714  0              9       9       2176
0715  0              10      10      2175
0716  0              11      11      2176
0717  0              12      12      2176
0718  0              13      13      2661
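
The latency reduction quoted for this benchmark can be recomputed from the two
"usecs/op" values reported by "perf bench" before and after enabling
Intel(R) SST-BF. A minimal sketch, with a hypothetical function name:

```shell
# Hypothetical helper: percentage reduction from a baseline latency to a new
# latency, printed with two decimals.
latency_reduction_pct() {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.2f\n", (before - after) / before * 100 }'
}
```

Calling "latency_reduction_pct 6.102445 5.627922" uses the baseline and the
SST-BF results from the two benchmark runs in this section.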
0719 
0720 Disable Intel(R) SST-BF
0721 ~~~~~~~~~~~~~~~~~~~~~~~
0722 
0723 To disable the Intel(R) SST-BF feature, execute::
0724 
0725 # intel-speed-select base-freq disable -a
0726 
0727 
0728 Intel(R) Speed Select Technology - Turbo Frequency (Intel(R) SST-TF)
0729 --------------------------------------------------------------------
0730 
This feature makes it possible to set different "All core turbo ratio limits"
for cores based on priority. By using this feature, some cores can be
configured to get higher turbo frequency by designating them as high priority at
the cost of lower or no turbo frequency on the low priority cores.

For this reason, this feature is only useful when the system is busy utilizing
all CPUs, but the user wants some configurable option to get high performance
on some CPUs.

The support of Intel(R) Speed Select Technology - Turbo Frequency (Intel(R) SST-TF)
depends on the Intel(R) Speed Select Technology - Performance Profile
(Intel(R) SST-PP) performance level configuration. It is possible that only a
certain performance level supports Intel(R) SST-TF. It is also possible that
only the base performance level (level = 0) has the support of Intel(R) SST-TF.
Hence, first select the desired performance level to enable this feature.
0746 
0747 In the system under test here, Intel(R) SST-TF is supported at the base
0748 performance level 0, but currently disabled::
0749 
0750  # intel-speed-select -c 0 perf-profile info -l 0
0751  Intel(R) Speed Select Technology
0752  package-0
0753   die-0
0754     cpu-0
0755       perf-profile-level-0
0756         ...
0757         ...
0758         speed-select-turbo-freq:disabled
0759         ...
0760         ...
0761 
0762 
To check if performance can be improved using the Intel(R) SST-TF feature, get
the turbo frequency properties with Intel(R) SST-TF enabled and compare them to
the base turbo capability of this system.

Get Base turbo capability
~~~~~~~~~~~~~~~~~~~~~~~~~

To get the base turbo capability of performance level 0, execute::

 # intel-speed-select perf-profile info -l 0
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 package-0
  die-0
    cpu-0
      perf-profile-level-0
        ...
        ...
        turbo-ratio-limits-sse
          bucket-0
            core-count:2
            max-turbo-frequency(MHz):3200
          bucket-1
            core-count:4
            max-turbo-frequency(MHz):3100
          bucket-2
            core-count:6
            max-turbo-frequency(MHz):3100
          bucket-3
            core-count:8
            max-turbo-frequency(MHz):3100
          bucket-4
            core-count:10
            max-turbo-frequency(MHz):3100
          bucket-5
            core-count:12
            max-turbo-frequency(MHz):3100
          bucket-6
            core-count:14
            max-turbo-frequency(MHz):3100
          bucket-7
            core-count:16
            max-turbo-frequency(MHz):3100
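
The bucket list above can also be read programmatically. The following is a
minimal sketch, not part of the intel-speed-select tool: it parses an abridged
copy of the sample output and looks up the turbo limit for a given number of
busy cores. The function names are hypothetical, and the lookup rule (the first
bucket whose core-count covers the busy cores applies) is an assumption made
for illustration.

```python
import re

# Abridged copy of the "turbo-ratio-limits-sse" section shown above.
SAMPLE = """\
turbo-ratio-limits-sse
  bucket-0
    core-count:2
    max-turbo-frequency(MHz):3200
  bucket-1
    core-count:4
    max-turbo-frequency(MHz):3100
  bucket-7
    core-count:16
    max-turbo-frequency(MHz):3100
"""


def parse_turbo_buckets(text):
    """Return [(core_count, max_turbo_mhz), ...] in the order printed."""
    counts = re.findall(r"core-count:(\d+)", text)
    freqs = re.findall(r"max-turbo-frequency\(MHz\):(\d+)", text)
    return [(int(c), int(f)) for c, f in zip(counts, freqs)]


def turbo_limit(buckets, busy_cores):
    """Assumed rule: the first bucket whose core-count covers busy_cores applies."""
    for core_count, mhz in buckets:
        if busy_cores <= core_count:
            return mhz
    return buckets[-1][1]  # more busy cores than any bucket lists


buckets = parse_turbo_buckets(SAMPLE)
print(turbo_limit(buckets, 2), turbo_limit(buckets, 14))
```

With the sample values, two busy cores map to the 3200 MHz bucket and fourteen
busy cores map to the 3100 MHz bucket.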

Based on the data above, when all the CPUs are busy, a maximum frequency of 3100
MHz can be achieved. Suppose there is a busy workload on CPUs 0-11 (e.g. stress)
and the "hackbench pipe" workload is executed on CPUs 12 and 13::

 # taskset -c 12,13 perf bench -r 100 sched pipe
 # Running 'sched/pipe' benchmark:
 # Executed 1000000 pipe operations between two processes
     Total time: 5.705 [sec]
       5.705488 usecs/op
         175269 ops/sec

The turbostat output::

 # turbostat -c 0-13 --show Package,Core,CPU,Bzy_MHz -i 1
 Package        Core    CPU     Bzy_MHz
 0              0       0       3000
 0              1       1       3000
 0              2       2       3000
 0              3       3       3000
 0              4       4       3000
 0              5       5       3100
 0              6       6       3100
 0              7       7       3000
 0              8       8       3100
 0              9       9       3000
 0              10      10      3000
 0              11      11      3000
 0              12      12      3100
 0              13      13      3100

Based on the turbostat output, performance is limited by the frequency cap of 3100
MHz. To check if the hackbench performance can be improved for CPU 12 and CPU
13, first check the capability of the Intel(R) SST-TF feature for this performance
level.
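
The same check can be scripted. The sketch below is illustrative only: it
parses an abridged copy of the turbostat table shown earlier and lists the CPUs
that already run at the 3100 MHz cap. The helper name is hypothetical, and the
parser assumes plain numeric rows (real turbostat output can contain a summary
row that would need to be skipped).

```python
# Abridged copy of the turbostat table shown above.
SAMPLE = """\
Package        Core    CPU     Bzy_MHz
0              0       0       3000
0              5       5       3100
0              12      12      3100
0              13      13      3100
"""


def busy_mhz_by_cpu(text):
    """Map CPU number -> Bzy_MHz, locating columns by header name."""
    lines = [line.split() for line in text.strip().splitlines()]
    header, rows = lines[0], lines[1:]
    cpu_col = header.index("CPU")
    mhz_col = header.index("Bzy_MHz")
    return {int(row[cpu_col]): int(row[mhz_col]) for row in rows}


freqs = busy_mhz_by_cpu(SAMPLE)
cap_mhz = 3100  # all-core turbo limit taken from the perf-profile output
at_cap = sorted(cpu for cpu, mhz in freqs.items() if mhz >= cap_mhz)
print(at_cap)
```

CPUs 12 and 13 appear in the resulting list, which matches the observation that
the benchmark CPUs cannot exceed the all-core limit without Intel(R) SST-TF.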

Get Intel(R) SST-TF Capability
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To get the capability, the "turbo-freq info" command can be used::

 # intel-speed-select turbo-freq info -l 0
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 package-0
  die-0
    cpu-0
      speed-select-turbo-freq
          bucket-0
            high-priority-cores-count:2
            high-priority-max-frequency(MHz):3200
            high-priority-max-avx2-frequency(MHz):3200
            high-priority-max-avx512-frequency(MHz):3100
          bucket-1
            high-priority-cores-count:4
            high-priority-max-frequency(MHz):3100
            high-priority-max-avx2-frequency(MHz):3000
            high-priority-max-avx512-frequency(MHz):2900
          bucket-2
            high-priority-cores-count:6
            high-priority-max-frequency(MHz):3100
            high-priority-max-avx2-frequency(MHz):3000
            high-priority-max-avx512-frequency(MHz):2900
          speed-select-turbo-freq-clip-frequencies
            low-priority-max-frequency(MHz):2600
            low-priority-max-avx2-frequency(MHz):2400
            low-priority-max-avx512-frequency(MHz):2100

Based on the output above, there is an Intel(R) SST-TF bucket with two high
priority cores. If only two high priority cores are set, then the max. turbo
frequency on those cores can be increased to 3200 MHz. This is 100 MHz more
than the base turbo capability for all cores.

In turn, for the hackbench workload, two CPUs can be set as high priority and
the rest as low priority. One side effect is that once enabled, the low priority
cores will be clipped to a lower frequency of 2600 MHz.
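
The trade-off described above can be written out as simple arithmetic. This
sketch uses only the numbers from the sample outputs in this section. The
function names are hypothetical, and the bucket selection rule (the first
bucket that can hold the requested number of high priority cores) is an
assumption for illustration.

```python
# Values copied from the sample outputs above; other systems will differ.
BASE_ALL_CORE_MHZ = 3100      # all-core limit from turbo-ratio-limits-sse
LOW_PRIORITY_CLIP_MHZ = 2600  # low-priority-max-frequency
# (high-priority-cores-count, high-priority-max-frequency) per SST-TF bucket
TF_BUCKETS = [(2, 3200), (4, 3100), (6, 3100)]


def pick_bucket(buckets, hp_cores):
    """Return the first bucket that can hold hp_cores cores, or None."""
    for count, mhz in buckets:
        if hp_cores <= count:
            return count, mhz
    return None


def summarize(hp_cores):
    """Return the MHz gained by high priority cores and lost by the rest."""
    bucket = pick_bucket(TF_BUCKETS, hp_cores)
    if bucket is None:
        return None  # more high priority cores than any bucket allows
    _, hp_mhz = bucket
    return {
        "hp_gain_mhz": hp_mhz - BASE_ALL_CORE_MHZ,
        "lp_loss_mhz": BASE_ALL_CORE_MHZ - LOW_PRIORITY_CLIP_MHZ,
    }


print(summarize(2))
print(summarize(4))
```

For two high priority cores the gain is 100 MHz, while every low priority core
gives up 500 MHz of turbo headroom. With four high priority cores the sample
system shows no gain over the base all-core limit, so the low priority clip is
the only effect.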

Enable Intel(R) SST-TF
~~~~~~~~~~~~~~~~~~~~~~

To enable Intel(R) SST-TF, execute::

 # intel-speed-select -c 12,13 turbo-freq enable -a
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 package-0
  die-0
    cpu-12
      turbo-freq
        enable:success
 package-0
  die-0
    cpu-13
      turbo-freq
        enable:success
 package--1
  die-0
    cpu-63
      turbo-freq --auto
        enable:success

In this case, the option "-a" is optional. If set, it enables the Intel(R) SST-TF
feature and also sets the CPUs to high and low priority using Intel Speed
Select Technology Core Power (Intel(R) SST-CP) features. The CPU numbers passed
with the "-c" argument are marked as high priority, including their siblings.

If the "-a" option is not used, then the following steps are required before
enabling Intel(R) SST-TF:

- Discover Intel(R) SST-TF and note buckets of high priority cores and maximum frequency

- Enable CLOS using core-power feature set

- Configure CLOS parameters

- Subscribe desired CPUs to CLOS groups making sure that high priority cores are set to the maximum frequency
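
The last step above amounts to splitting the CPUs into a high priority group
(the requested CPUs plus their siblings) and a low priority group. The sketch
below only illustrates that split. The function name and the sibling map are
hypothetical, and the sketch does not issue any intel-speed-select commands.

```python
def split_priorities(all_cpus, requested, siblings):
    """Return (high, low) CPU lists; siblings of requested CPUs count as high."""
    high = set()
    for cpu in requested:
        high.add(cpu)
        high.update(siblings.get(cpu, ()))
    low = set(all_cpus) - high
    return sorted(high), sorted(low)


# Hypothetical 4-core, 8-thread topology: CPU n and CPU n+4 share a core.
siblings = {n: [(n + 4) % 8] for n in range(8)}
high, low = split_priorities(range(8), [2, 3], siblings)
print("high priority:", high)
print("low priority:", low)
```

On a real system the sibling map could be built from
``/sys/devices/system/cpu/cpuN/topology/thread_siblings_list`` rather than
hard-coded as it is here.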

If the same hackbench workload is executed, schedule hackbench threads on high
priority CPUs::

 # taskset -c 12,13 perf bench -r 100 sched pipe
 # Running 'sched/pipe' benchmark:
 # Executed 1000000 pipe operations between two processes
     Total time: 5.510 [sec]
       5.510165 usecs/op
         180826 ops/sec

This improved performance by around 3.3% on a busy system. The turbostat output
shows that CPU 12 and CPU 13 are getting a 100 MHz boost::

 # turbostat -c 0-13 --show Package,Core,CPU,Bzy_MHz -i 1
 Package        Core    CPU     Bzy_MHz
 ...
 0              12      12      3200
 0              13      13      3200
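
The improvement figure can be reproduced from the two benchmark runs. This
short sketch uses only the numbers printed by the two "sched pipe" runs in this
section; the variable and function names are illustrative.

```python
def percent_change(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0


# Throughput (ops/sec) and latency (usecs/op) from the two runs above.
ops_gain = percent_change(175269, 180826)
latency_saving = -percent_change(5.705488, 5.510165)

print(f"throughput gain: {ops_gain:.2f}%")
print(f"latency saving:  {latency_saving:.2f}%")
```

Throughput rises by about 3.2% and per-operation latency falls by about 3.4%,
which is consistent with the figure of around 3.3% quoted above.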