0001 .. SPDX-License-Identifier: GPL-2.0
0002
0003 ============================================================
0004 Intel(R) Speed Select Technology User Guide
0005 ============================================================
0006
0007 The Intel(R) Speed Select Technology (Intel(R) SST) provides a powerful new
0008 collection of features that give more granular control over CPU performance.
0009 With Intel(R) SST, one server can be configured for power and performance for a
0010 variety of diverse workload requirements.
0011
0012 Refer to the links below for an overview of the technology:
0013
0014 - https://www.intel.com/content/www/us/en/architecture-and-technology/speed-select-technology-article.html
0015 - https://builders.intel.com/docs/networkbuilders/intel-speed-select-technology-base-frequency-enhancing-performance.pdf
0016
0017 These capabilities are further enhanced in some of the newer generations of
0018 server platforms where these features can be enumerated and controlled
0019 dynamically without pre-configuring via BIOS setup options. This dynamic
0020 configuration is done via mailbox commands to the hardware. One way to enumerate
0021 and configure these features is by using the Intel Speed Select utility.
0022
0023 This document explains how to use the Intel Speed Select tool to enumerate and
0024 control Intel(R) SST features. This document gives example commands and explains
0025 how these commands change the power and performance profile of the system under
0026 test. Using this tool as an example, customers can replicate the messaging
0027 implemented in the tool in their production software.
0028
0029 intel-speed-select configuration tool
0030 ======================================
0031
Many Linux distributions package the "intel-speed-select" tool. If it is not
available, it can be built from the Linux kernel tree, which can be downloaded
from kernel.org. The tool can be built without building the full kernel.
0035
0036 From the kernel tree, run the following commands::
0037
0038 # cd tools/power/x86/intel-speed-select/
0039 # make
0040 # make install
0041
0042 Getting Help
0043 ------------
0044
0045 To get help with the tool, execute the command below::
0046
0047 # intel-speed-select --help
0048
0049 The top-level help describes arguments and features. Notice that there is a
0050 multi-level help structure in the tool. For example, to get help for the feature "perf-profile"::
0051
0052 # intel-speed-select perf-profile --help
0053
Another level of help is available for each command. For example, for the
"info" command::
0055
0056 # intel-speed-select perf-profile info --help
0057
0058 Summary of platform capability
0059 ------------------------------
0060 To check the current platform and driver capabilities, execute::
0061
# intel-speed-select --info
0063
0064 For example on a test system::
0065
0066 # intel-speed-select --info
0067 Intel(R) Speed Select Technology
0068 Executing on CPU model: X
0069 Platform: API version : 1
0070 Platform: Driver version : 1
0071 Platform: mbox supported : 1
0072 Platform: mmio supported : 1
0073 Intel(R) SST-PP (feature perf-profile) is supported
0074 TDP level change control is unlocked, max level: 4
0075 Intel(R) SST-TF (feature turbo-freq) is supported
0076 Intel(R) SST-BF (feature base-freq) is not supported
0077 Intel(R) SST-CP (feature core-power) is supported
0078
0079 Intel(R) Speed Select Technology - Performance Profile (Intel(R) SST-PP)
0080 ------------------------------------------------------------------------
0081
0082 This feature allows configuration of a server dynamically based on workload
0083 performance requirements. This helps users during deployment as they do not have
0084 to choose a specific server configuration statically. This Intel(R) Speed Select
0085 Technology - Performance Profile (Intel(R) SST-PP) feature introduces a mechanism
that allows multiple optimized performance profiles per system. Each profile
defines the set of CPUs that may be online, with the rest offline, to sustain a
guaranteed base frequency. Once the user issues a command to use a specific
performance profile and meets the CPU online/offline requirement, the base
frequency changes dynamically. This feature is called "perf-profile" when using
the Intel Speed Select tool.
0092
Number of performance levels
0094 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
0095
0096 There can be multiple performance profiles on a system. To get the number of
0097 profiles, execute the command below::
0098
0099 # intel-speed-select perf-profile get-config-levels
0100 Intel(R) Speed Select Technology
0101 Executing on CPU model: X
0102 package-0
0103 die-0
0104 cpu-0
0105 get-config-levels:4
0106 package-1
0107 die-0
0108 cpu-14
0109 get-config-levels:4
0110
0111 On this system under test, there are 4 performance profiles in addition to the
0112 base performance profile (which is performance level 0).
0113
0114 Lock/Unlock status
0115 ~~~~~~~~~~~~~~~~~~
0116
Even if there are multiple performance profiles, it is possible that they
are locked. If they are locked, users cannot issue a command to change the
performance state. A BIOS setup option may be available to unlock them;
otherwise, check with your system vendor.
0121
0122 To check if the system is locked, execute the following command::
0123
0124 # intel-speed-select perf-profile get-lock-status
0125 Intel(R) Speed Select Technology
0126 Executing on CPU model: X
0127 package-0
0128 die-0
0129 cpu-0
0130 get-lock-status:0
0131 package-1
0132 die-0
0133 cpu-14
0134 get-lock-status:0
0135
0136 In this case, lock status is 0, which means that the system is unlocked.
0137
0138 Properties of a performance level
0139 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
0140
To get the properties of a specific performance level (level 0 in this example),
execute the command below::
0142
0143 # intel-speed-select perf-profile info -l 0
0144 Intel(R) Speed Select Technology
0145 Executing on CPU model: X
0146 package-0
0147 die-0
0148 cpu-0
0149 perf-profile-level-0
0150 cpu-count:28
0151 enable-cpu-mask:000003ff,f0003fff
0152 enable-cpu-list:0,1,2,3,4,5,6,7,8,9,10,11,12,13,28,29,30,31,32,33,34,35,36,37,38,39,40,41
0153 thermal-design-power-ratio:26
0154 base-frequency(MHz):2600
0155 speed-select-turbo-freq:disabled
0156 speed-select-base-freq:disabled
0157 ...
0158 ...
0159
Here the -l option is used to specify a performance level.

If the -l option is omitted, the command prints information about all
performance levels. The command above prints the properties of performance
level 0.
0165
For this performance profile, at most the CPUs listed in
"enable-cpu-mask/enable-cpu-list" can be online. When that condition is met,
the base frequency of 2600 MHz can be maintained. To understand more, execute
"intel-speed-select perf-profile info" for performance level 4::
0171
0172 # intel-speed-select perf-profile info -l 4
0173 Intel(R) Speed Select Technology
0174 Executing on CPU model: X
0175 package-0
0176 die-0
0177 cpu-0
0178 perf-profile-level-4
0179 cpu-count:28
0180 enable-cpu-mask:000000fa,f0000faf
0181 enable-cpu-list:0,1,2,3,5,7,8,9,10,11,28,29,30,31,33,35,36,37,38,39
0182 thermal-design-power-ratio:28
0183 base-frequency(MHz):2800
0184 speed-select-turbo-freq:disabled
0185 speed-select-base-freq:unsupported
0186 ...
0187 ...
0188
0189 There are fewer CPUs in the "enable-cpu-mask/enable-cpu-list". Consequently, if
0190 the user only keeps these CPUs online and the rest "offline," then the base
0191 frequency is increased to 2.8 GHz compared to 2.6 GHz at performance level 0.
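
The relationship between "enable-cpu-mask" and "enable-cpu-list" can be checked
with a few lines of Python. This is only a sketch: it assumes the mask is a
comma-separated list of 32-bit hexadecimal words with the most significant word
first, which is consistent with the two outputs above. The function name
``cpus_from_mask`` is made up for this illustration and is not part of the tool.

```python
def cpus_from_mask(mask: str) -> list[int]:
    """Return the CPU numbers whose bits are set in a comma-separated hex mask."""
    # Assumption: words are listed most significant first, 8 hex digits each,
    # so joining them yields one large hexadecimal number.
    value = int(mask.replace(",", ""), 16)
    return [cpu for cpu in range(value.bit_length()) if (value >> cpu) & 1]


level0 = cpus_from_mask("000003ff,f0003fff")  # mask from performance level 0
level4 = cpus_from_mask("000000fa,f0000faf")  # mask from performance level 4
print(level0)
print(level4)
```

Both decoded lists match the "enable-cpu-list" fields printed by the tool for
levels 0 and 4.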
0192
0193 Get current performance level
0194 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
0195
0196 To get the current performance level, execute::
0197
0198 # intel-speed-select perf-profile get-config-current-level
0199 Intel(R) Speed Select Technology
0200 Executing on CPU model: X
0201 package-0
0202 die-0
0203 cpu-0
0204 get-config-current_level:0
0205
0206 First verify that the base_frequency displayed by the cpufreq sysfs is correct::
0207
0208 # cat /sys/devices/system/cpu/cpu0/cpufreq/base_frequency
0209 2600000
0210
This matches the base-frequency (MHz) field value displayed by the
"perf-profile info" command for performance level 0 (the cpufreq frequency is
in kHz).
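
The same comparison can be scripted. The sketch below reads the cpufreq
attribute shown above and converts it from kHz to MHz; the helper names
``khz_to_mhz`` and ``read_base_frequency_mhz`` and the ``cpu_root`` parameter are
hypothetical and exist only for this example.

```python
from pathlib import Path


def khz_to_mhz(khz: int) -> int:
    """Convert a cpufreq value (kHz) to the MHz unit used by intel-speed-select."""
    return khz // 1000


def read_base_frequency_mhz(cpu: int = 0,
                            cpu_root: str = "/sys/devices/system/cpu") -> int:
    """Read base_frequency for one CPU from cpufreq sysfs and return it in MHz."""
    path = Path(cpu_root) / f"cpu{cpu}" / "cpufreq" / "base_frequency"
    return khz_to_mhz(int(path.read_text().strip()))
```

On the system under test, ``read_base_frequency_mhz(0)`` would be expected to
agree with the base-frequency(MHz) value of the current performance level.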
0214
0215 To check if the average frequency is equal to the base frequency for a 100% busy
0216 workload, disable turbo::
0217
0218 # echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo
0219
Then run a busy workload on all CPUs, for example::

# stress -c 64

To verify the base frequency, run turbostat::

# turbostat -c 0-13 --show Package,Core,CPU,Bzy_MHz -i 1
0227
0228 Package Core CPU Bzy_MHz
0229 - - 2600
0230 0 0 0 2600
0231 0 1 1 2600
0232 0 2 2 2600
0233 0 3 3 2600
0234 0 4 4 2600
0235 . . . .
0236
0237
0238 Changing performance level
0239 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
0240
To change the performance level to 4, execute::
0242
0243 # intel-speed-select -d perf-profile set-config-level -l 4 -o
0244 Intel(R) Speed Select Technology
0245 Executing on CPU model: X
0246 package-0
0247 die-0
0248 cpu-0
0249 perf-profile
0250 set_tdp_level:success
0251
In the command above, "-o" is optional. If it is specified, the tool also takes
offline the CPUs that are not present in the enable-cpu-mask for this
performance level.
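
As an illustration of which CPUs are affected, the set of CPUs to take offline
is the difference between the CPUs currently online and the enable-cpu-list of
the target level. The sketch below assumes that the CPUs online on package 0
are exactly the level 0 enable-cpu-list shown earlier; ``cpus_to_offline`` is a
hypothetical helper and does not reproduce the tool's implementation.

```python
def cpus_to_offline(online: set[int], enabled: set[int]) -> list[int]:
    """Return the online CPUs that are not enabled at the target level."""
    return sorted(online - enabled)


# enable-cpu-list values copied from the level 0 and level 4 outputs above.
level0_cpus = set(range(0, 14)) | set(range(28, 42))
level4_cpus = {0, 1, 2, 3, 5, 7, 8, 9, 10, 11,
               28, 29, 30, 31, 33, 35, 36, 37, 38, 39}

print(cpus_to_offline(level0_cpus, level4_cpus))
```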
0255
Now if the base_frequency is checked::

# cat /sys/devices/system/cpu/cpu0/cpufreq/base_frequency
2800000

This shows that the base frequency has now increased from 2600 MHz at
performance level 0 to 2800 MHz at performance level 4. As a result, any
workload that can use fewer CPUs can see a boost of 200 MHz compared to
performance level 0.
0264
0265 Changing performance level via BMC Interface
0266 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
0267
It is possible to change the SST-PP level using an out-of-band (OOB) agent (via
a remote management console, through the BMC "Baseboard Management Controller"
interface). This mode is supported starting with the Sapphire Rapids processor
generation. The kernel and tool changes to support this mode were added in
Linux kernel version 5.18. To enable this feature, the kernel config
"CONFIG_INTEL_HFI_THERMAL" is required. The minimum version of the tool that
supports this feature is "v1.12", which is part of Linux kernel version 5.18.

To support such a configuration, this tool can be used as a daemon. Add
the command line option --oob::
0278
0279 # intel-speed-select --oob
0280 Intel(R) Speed Select Technology
0281 Executing on CPU model:143[0x8f]
0282 OOB mode is enabled and will run as daemon
0283
0284 In this mode the tool will online/offline CPUs based on the new performance
0285 level.
0286
0287 Check presence of other Intel(R) SST features
0288 ---------------------------------------------
0289
Each of the performance profiles also specifies whether the other two
Intel(R) SST features (Intel(R) Speed Select Technology - Base Frequency
(Intel(R) SST-BF) and Intel(R) Speed Select Technology - Turbo Frequency
(Intel(R) SST-TF)) are supported.
0294
0295 For example, from the output of "perf-profile info" above, for level 0 and level
0296 4:
0297
0298 For level 0::
0299 speed-select-turbo-freq:disabled
0300 speed-select-base-freq:disabled
0301
0302 For level 4::
0303 speed-select-turbo-freq:disabled
0304 speed-select-base-freq:unsupported
0305
0306 Given these results, the "speed-select-base-freq" (Intel(R) SST-BF) in level 4
0307 changed from "disabled" to "unsupported" compared to performance level 0.
0308
This means that at performance level 4, the "speed-select-base-freq" feature is
not supported. However, at performance level 0, this feature is "supported", but
currently "disabled", meaning the user has not activated this feature. In
contrast, "speed-select-turbo-freq" (Intel(R) SST-TF) is supported at both
performance levels, but is currently not activated by the user.
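
A script can make the same distinction by reading the two status fields from
the "perf-profile info" output. The following is a minimal sketch, assuming the
"key:value" line format shown above. The "enabled" state is assumed to be the
counterpart of "disabled"; it does not appear in the outputs in this section.
Both function names are hypothetical.

```python
def feature_states(info_output: str) -> dict[str, str]:
    """Map each speed-select-* field in "perf-profile info" output to its state."""
    states = {}
    for line in info_output.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep and key.startswith("speed-select-"):
            states[key] = value.strip()
    return states


def describe(state: str) -> str:
    """Explain a state string in the terms used by this guide."""
    meanings = {
        "unsupported": "not supported at this performance level",
        "disabled": "supported, but not activated by the user",
        "enabled": "supported and activated",  # assumed counterpart of "disabled"
    }
    return meanings.get(state, "unknown state")


# Fields copied from the level 4 output above.
level4_output = """\
base-frequency(MHz):2800
speed-select-turbo-freq:disabled
speed-select-base-freq:unsupported
"""
for name, state in feature_states(level4_output).items():
    print(f"{name}: {describe(state)}")
```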
0314
0315 The Intel(R) SST-BF and the Intel(R) SST-TF features are built on a foundation
0316 technology called Intel(R) Speed Select Technology - Core Power (Intel(R) SST-CP).
0317 The platform firmware enables this feature when Intel(R) SST-BF or Intel(R) SST-TF
0318 is supported on a platform.
0319
0320 Intel(R) Speed Select Technology Core Power (Intel(R) SST-CP)
0321 ---------------------------------------------------------------
0322
Intel(R) Speed Select Technology Core Power (Intel(R) SST-CP) is an interface that
allows users to define per-core priority. It defines a mechanism to distribute
power among cores in a power-constrained scenario, using a class of service
(CLOS) configuration.

The user can configure up to 4 class of service configurations. Each CLOS group
configuration allows the definition of parameters that affect how the frequency
can be limited and how power is distributed. Each CPU core can be tied to a
class of service and hence an associated priority. The granularity is at the
core level, not at the per-CPU level.
0333
0334 Enable CLOS based prioritization
0335 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
0336
To use the CLOS based prioritization feature, firmware must be informed to
enable it and to use a priority type. There is a default per-platform priority
type, which can be changed with an optional command line parameter.
0340
0341 To enable and check the options, execute::
0342
0343 # intel-speed-select core-power enable --help
0344 Intel(R) Speed Select Technology
0345 Executing on CPU model: X
0346 Enable core-power for a package/die
0347 Clos Enable: Specify priority type with [--priority|-p]
0348 0: Proportional, 1: Ordered
0349
There are two priority types:
0351
0352 - Ordered
0353
Priority for ordered throttling is defined based on the index of the assigned
CLOS group, where CLOS0 gets the highest priority (throttled last).
0356
0357 Priority order is:
0358 CLOS0 > CLOS1 > CLOS2 > CLOS3.
0359
0360 - Proportional
0361
When proportional priority is used, there is an additional parameter called
frequency_weight, which can be specified per CLOS group. The goal of
proportional priority is to provide each core with the requested minimum, then
distribute all remaining (excess/deficit) budgets in proportion to a defined
weight. This proportional priority can be configured using the "core-power
config" command.
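
To make the ordered priority type concrete, the sketch below sorts CPUs by the
order in which they would be candidates for throttling: CPUs in CLOS3 first and
CPUs in CLOS0 last. The throttling itself is done by the hardware; this only
illustrates the ordering rule stated above. The CPU-to-CLOS mapping and the
tie-break by CPU number are assumptions made for the example.

```python
def throttle_order(cpu_to_clos: dict[int, int]) -> list[int]:
    """Return CPUs sorted from first throttled (CLOS3) to last throttled (CLOS0)."""
    # Higher CLOS index means lower priority; ties are broken by CPU number
    # only to make the result deterministic (an assumption of this sketch).
    return sorted(cpu_to_clos, key=lambda cpu: (-cpu_to_clos[cpu], cpu))


# Hypothetical association of five CPUs to CLOS groups.
assoc = {0: 0, 1: 3, 2: 1, 3: 3, 4: 2}
print(throttle_order(assoc))
```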
0368
0369 To enable with the platform default priority type, execute::
0370
0371 # intel-speed-select core-power enable
0372 Intel(R) Speed Select Technology
0373 Executing on CPU model: X
0374 package-0
0375 die-0
0376 cpu-0
0377 core-power
0378 enable:success
0379 package-1
0380 die-0
0381 cpu-6
0382 core-power
0383 enable:success
0384
The scope of this enable is per package, or per die when a package contains
multiple dies. To check if CLOS is enabled and get the priority type, the
"core-power info" command can be used. For example, to check the status of the
core-power feature on CPU 0, execute::
0389
0390 # intel-speed-select -c 0 core-power info
0391 Intel(R) Speed Select Technology
0392 Executing on CPU model: X
0393 package-0
0394 die-0
0395 cpu-0
0396 core-power
0397 support-status:supported
0398 enable-status:enabled
0399 clos-enable-status:enabled
0400 priority-type:proportional
0401 package-1
0402 die-0
0403 cpu-24
0404 core-power
0405 support-status:supported
0406 enable-status:enabled
0407 clos-enable-status:enabled
0408 priority-type:proportional
0409
0410 Configuring CLOS groups
0411 ~~~~~~~~~~~~~~~~~~~~~~~
0412
Each CLOS group has its own attributes including min, max, freq_weight and
desired. These parameters can be configured with the "core-power config"
command. Defaults are used for any parameter the user does not set, except the
clos id, which is mandatory. To check core-power config options, execute::
0417
0418 # intel-speed-select core-power config --help
0419 Intel(R) Speed Select Technology
0420 Executing on CPU model: X
0421 Set core-power configuration for one of the four clos ids
0422 Specify targeted clos id with [--clos|-c]
0423 Specify clos Proportional Priority [--weight|-w]
0424 Specify clos min in MHz with [--min|-n]
0425 Specify clos max in MHz with [--max|-m]
0426
0427 For example::
0428
0429 # intel-speed-select core-power config -c 0
0430 Intel(R) Speed Select Technology
0431 Executing on CPU model: X
0432 clos epp is not specified, default: 0
0433 clos frequency weight is not specified, default: 0
0434 clos min is not specified, default: 0 MHz
0435 clos max is not specified, default: 25500 MHz
0436 clos desired is not specified, default: 0
0437 package-0
0438 die-0
0439 cpu-0
0440 core-power
0441 config:success
0442 package-1
0443 die-0
0444 cpu-6
0445 core-power
0446 config:success
0447
The user has the option to change the defaults. For example, the user can set
"min" to the base frequency so that the guaranteed base frequency is always
available.
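
The frequencies in these outputs are whole multiples of 100 MHz: the level 0
profile reports thermal-design-power-ratio 26 with a base frequency of 2600
MHz, and the default CLOS max of 25500 MHz equals 255 x 100 MHz. The sketch
below converts between ratio and MHz under the assumption that this 100 MHz
step applies; the constant and function names are hypothetical.

```python
BUS_CLOCK_MHZ = 100  # assumption: one ratio step corresponds to 100 MHz


def ratio_to_mhz(ratio: int) -> int:
    """Convert a frequency ratio to MHz."""
    return ratio * BUS_CLOCK_MHZ


def mhz_to_ratio(mhz: int) -> int:
    """Convert MHz to a ratio, rejecting values that are not a whole step."""
    if mhz % BUS_CLOCK_MHZ:
        raise ValueError(f"{mhz} MHz is not a multiple of {BUS_CLOCK_MHZ} MHz")
    return mhz // BUS_CLOCK_MHZ


print(ratio_to_mhz(26))     # base frequency ratio at performance level 0
print(mhz_to_ratio(25500))  # default CLOS max shown above
```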
0450
0451 Get the current CLOS configuration
0452 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
0453
0454 To check the current configuration, "core-power get-config" can be used. For
0455 example, to get the configuration of CLOS 0::
0456
0457 # intel-speed-select core-power get-config -c 0
0458 Intel(R) Speed Select Technology
0459 Executing on CPU model: X
0460 package-0
0461 die-0
0462 cpu-0
0463 core-power
0464 clos:0
0465 epp:0
0466 clos-proportional-priority:0
0467 clos-min:0 MHz
0468 clos-max:Max Turbo frequency
0469 clos-desired:0 MHz
0470 package-1
0471 die-0
0472 cpu-24
0473 core-power
0474 clos:0
0475 epp:0
0476 clos-proportional-priority:0
0477 clos-min:0 MHz
0478 clos-max:Max Turbo frequency
0479 clos-desired:0 MHz
0480
0481 Associating a CPU with a CLOS group
0482 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
0483
To associate a CPU with a CLOS group, the "core-power assoc" command can be
used::
0485
0486 # intel-speed-select core-power assoc --help
0487 Intel(R) Speed Select Technology
0488 Executing on CPU model: X
0489 Associate a clos id to a CPU
0490 Specify targeted clos id with [--clos|-c]
0491
0492
For example, to associate CPU 10 with CLOS group 3, execute::
0494
0495 # intel-speed-select -c 10 core-power assoc -c 3
0496 Intel(R) Speed Select Technology
0497 Executing on CPU model: X
0498 package-0
0499 die-0
0500 cpu-10
0501 core-power
0502 assoc:success
0503
Once a CPU is associated, its sibling CPUs are also associated with the same
CLOS group. Once associated, avoid changing the Linux "cpufreq" subsystem
scaling frequency limits.
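
To see which sibling CPUs are affected by an association, the CPU topology
files in sysfs can be consulted. The sketch below parses the CPU list format
used by ``thread_siblings_list``; the helper names and the ``cpu_root``
parameter are hypothetical, and the sample string "10,38" is an illustrative
value rather than output from the system under test.

```python
from pathlib import Path


def parse_cpu_list(text: str) -> list[int]:
    """Parse a kernel CPU list such as "0-3,8" into a list of CPU numbers."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            cpus.extend(range(int(first), int(last) + 1))
        else:
            cpus.append(int(part))
    return cpus


def thread_siblings(cpu: int, cpu_root: str = "/sys/devices/system/cpu") -> list[int]:
    """Return the hardware threads that share a core with the given CPU."""
    path = Path(cpu_root) / f"cpu{cpu}" / "topology" / "thread_siblings_list"
    return parse_cpu_list(path.read_text())


print(parse_cpu_list("10,38"))  # illustrative sibling pair
```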
0507
To check the existing association for a CPU, the "core-power get-assoc" command
can be used. For example, to get the association of CPU 10, execute::
0510
0511 # intel-speed-select -c 10 core-power get-assoc
0512 Intel(R) Speed Select Technology
0513 Executing on CPU model: X
0514 package-1
0515 die-0
0516 cpu-10
0517 get-assoc
0518 clos:3
0519
This shows that CPU 10 is part of CLOS group 3.
0521
0522
0523 Disable CLOS based prioritization
0524 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
0525
0526 To disable, execute::
0527
0528 # intel-speed-select core-power disable
0529
Some features like Intel(R) SST-TF can only be enabled when CLOS based
prioritization is enabled. For this reason, disabling CLOS while Intel(R) SST-TF
is enabled can cause Intel(R) SST-TF to fail, so the "disable" command displays
an error if Intel(R) SST-TF is already enabled. To disable CLOS based
prioritization, disable the Intel(R) SST-TF feature first.
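
The ordering requirement can be captured in a small helper that builds the
commands to run. This is a sketch only: the "turbo-freq disable" command is not
shown in this document and is assumed to mirror "base-freq disable", and the
function name is hypothetical.

```python
def core_power_disable_commands(sst_tf_enabled: bool) -> list[list[str]]:
    """Return the commands, in order, needed to disable CLOS based prioritization."""
    commands = []
    if sst_tf_enabled:
        # Assumption: "turbo-freq disable" exists and mirrors "base-freq disable".
        commands.append(["intel-speed-select", "turbo-freq", "disable"])
    commands.append(["intel-speed-select", "core-power", "disable"])
    return commands


for command in core_power_disable_commands(sst_tf_enabled=True):
    print(" ".join(command))
```

Each returned list could be passed to ``subprocess.run(command, check=True)``
by a script running with sufficient privileges.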
0535
0536 Intel(R) Speed Select Technology - Base Frequency (Intel(R) SST-BF)
0537 -------------------------------------------------------------------
0538
The Intel(R) Speed Select Technology - Base Frequency (Intel(R) SST-BF) feature lets
the user control base frequency. If some critical workload threads demand
constant high guaranteed performance, then this feature can be used to execute
those threads at a higher base frequency on specific sets of CPUs (high priority
CPUs) at the cost of a lower base frequency on the other CPUs (low priority
CPUs). This feature does not require taking the low priority CPUs offline.

The support of Intel(R) SST-BF depends on the Intel(R) Speed Select Technology -
Performance Profile (Intel(R) SST-PP) performance level configuration. It is
possible that only certain performance levels support Intel(R) SST-BF. It is also
possible that only the base performance level (level = 0) supports Intel(R)
SST-BF. Consequently, first select the desired performance level to enable this
feature.
0552
In the system under test here, Intel(R) SST-BF is supported at the base
performance level 0, but currently disabled. For example, for level 0::
0555
0556 # intel-speed-select -c 0 perf-profile info -l 0
0557 Intel(R) Speed Select Technology
0558 Executing on CPU model: X
0559 package-0
0560 die-0
0561 cpu-0
0562 perf-profile-level-0
0563 ...
0564
0565 speed-select-base-freq:disabled
0566 ...
0567
Before enabling Intel(R) SST-BF and measuring its impact on workload
performance, execute some workload and measure its performance to get a
baseline to compare against.

Here the user wants more guaranteed performance. For this reason, it is likely
that turbo is disabled. To disable turbo, execute::

# echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo

Based on the output of "intel-speed-select perf-profile info -l 0", the
guaranteed base frequency is 2600 MHz.
0579
0580
0581 Measure baseline performance for comparison
0582 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
0583
To compare, pick a multi-threaded workload where each thread can be scheduled on
separate CPUs. The "Hackbench pipe" test is a good example of how performance
can be improved using Intel(R) SST-BF.

Below, the workload measures average scheduler wakeup latency, so a lower
number means better performance::
0590
0591 # taskset -c 3,4 perf bench -r 100 sched pipe
0592 # Running 'sched/pipe' benchmark:
0593 # Executed 1000000 pipe operations between two processes
0594 Total time: 6.102 [sec]
0595 6.102445 usecs/op
0596 163868 ops/sec
0597
While running the above test, the turbostat output shows that 2 of the CPUs are
busy and reaching the max. frequency (which is the base frequency, as turbo is
disabled). The turbostat output::

# turbostat -c 0-13 --show Package,Core,CPU,Bzy_MHz -i 1
0603 Package Core CPU Bzy_MHz
0604 0 0 0 1000
0605 0 1 1 1005
0606 0 2 2 1000
0607 0 3 3 2600
0608 0 4 4 2600
0609 0 5 5 1000
0610 0 6 6 1000
0611 0 7 7 1005
0612 0 8 8 1005
0613 0 9 9 1000
0614 0 10 10 1000
0615 0 11 11 995
0616 0 12 12 1000
0617 0 13 13 1000
0618
From the above turbostat output, both CPU 3 and 4 are very busy and reaching
the full guaranteed frequency of 2600 MHz.
0621
0622 Intel(R) SST-BF Capabilities
0623 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
0624
0625 To get capabilities of Intel(R) SST-BF for the current performance level 0,
0626 execute::
0627
0628 # intel-speed-select base-freq info -l 0
0629 Intel(R) Speed Select Technology
0630 Executing on CPU model: X
0631 package-0
0632 die-0
0633 cpu-0
0634 speed-select-base-freq
0635 high-priority-base-frequency(MHz):3000
0636 high-priority-cpu-mask:00000216,00002160
0637 high-priority-cpu-list:5,6,8,13,33,34,36,41
0638 low-priority-base-frequency(MHz):2400
0639 tjunction-temperature(C):125
0640 thermal-design-power(W):205
0641
The above capabilities show that there are some CPUs on this system that can
offer a base frequency of 3000 MHz compared to the standard base frequency at
this performance level. These CPUs are fixed, and they are presented via
high-priority-cpu-list/high-priority-cpu-mask. If the Intel(R) SST-BF feature is
selected, the low priority CPUs (which are not in high-priority-cpu-list) can
only offer up to 2400 MHz. If this clipping of low priority CPUs is acceptable,
then the user can enable the Intel(R) SST-BF feature. It is particularly suited
to the above "sched pipe" workload: since only two CPUs are used, the threads
can be scheduled on high priority CPUs and get a boost of 400 MHz.
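
Choosing CPUs for such a workload amounts to picking entries from
high-priority-cpu-list and passing them to taskset. The sketch below takes the
first entries of the list reported above; ``pick_high_priority_cpus`` is a
hypothetical helper. It does not check whether the chosen CPUs are hardware
thread siblings of each other, which a real deployment may want to avoid.

```python
def pick_high_priority_cpus(high_priority: list[int], needed: int) -> str:
    """Return a taskset-style CPU list with the first `needed` high priority CPUs."""
    if needed > len(high_priority):
        raise ValueError("not enough high priority CPUs for this workload")
    return ",".join(str(cpu) for cpu in high_priority[:needed])


# high-priority-cpu-list from the "base-freq info" output above.
high_priority_cpus = [5, 6, 8, 13, 33, 34, 36, 41]
print(pick_high_priority_cpus(high_priority_cpus, 2))
```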
0652
0653 Enable Intel(R) SST-BF
0654 ~~~~~~~~~~~~~~~~~~~~~~
0655
0656 To enable Intel(R) SST-BF feature, execute::
0657
0658 # intel-speed-select base-freq enable -a
0659 Intel(R) Speed Select Technology
0660 Executing on CPU model: X
0661 package-0
0662 die-0
0663 cpu-0
0664 base-freq
0665 enable:success
0666 package-1
0667 die-0
0668 cpu-14
0669 base-freq
0670 enable:success
0671
In this case, the -a option is optional. It not only enables Intel(R) SST-BF, but
also adjusts the priority of cores using Intel(R) Speed Select Technology Core
Power (Intel(R) SST-CP) features. This option sets the minimum performance of each
Intel(R) SST-CP class of service (CLOS) to maximum performance so that the
hardware will give the maximum performance possible for each CPU.

If the -a option is not used, then the following steps are required before
enabling Intel(R) SST-BF:
0681
0682 - Discover Intel(R) SST-BF and note low and high priority base frequency
0683 - Note the high priority CPU list
0684 - Enable CLOS using core-power feature set
- Configure CLOS parameters. Use CLOS.min to set the minimum performance
0686 - Subscribe desired CPUs to CLOS groups
0687
With this configuration, execute the same workload pinned to high priority CPUs
(CPU 5 and 6 in this case)::

# taskset -c 5,6 perf bench -r 100 sched pipe
0692 # Running 'sched/pipe' benchmark:
0693 # Executed 1000000 pipe operations between two processes
0694 Total time: 5.627 [sec]
0695 5.627922 usecs/op
0696 177685 ops/sec
0697
This way, by enabling Intel(R) SST-BF, the performance of this benchmark is
improved (latency reduced) by about 7.8%. From the turbostat output, it can be
observed that the high priority CPUs reached 3000 MHz compared to 2600 MHz.
The turbostat output::

# turbostat -c 0-13 --show Package,Core,CPU,Bzy_MHz -i 1
0704 Package Core CPU Bzy_MHz
0705 0 0 0 2151
0706 0 1 1 2166
0707 0 2 2 2175
0708 0 3 3 2175
0709 0 4 4 2175
0710 0 5 5 3000
0711 0 6 6 3000
0712 0 7 7 2180
0713 0 8 8 2662
0714 0 9 9 2176
0715 0 10 10 2175
0716 0 11 11 2176
0717 0 12 12 2176
0718 0 13 13 2661
0719
0720 Disable Intel(R) SST-BF
0721 ~~~~~~~~~~~~~~~~~~~~~~~
0722
0723 To disable the Intel(R) SST-BF feature, execute::
0724
0725 # intel-speed-select base-freq disable -a
0726
0727
0728 Intel(R) Speed Select Technology - Turbo Frequency (Intel(R) SST-TF)
0729 --------------------------------------------------------------------
0730
0731 This feature enables the ability to set different "All core turbo ratio limits"
0732 to cores based on the priority. By using this feature, some cores can be
0733 configured to get higher turbo frequency by designating them as high priority at
0734 the cost of lower or no turbo frequency on the low priority cores.
0735
For this reason, this feature is only useful when the system is busy utilizing
all CPUs, but the user wants some configurable option to get high performance on
some CPUs.

The support of Intel(R) Speed Select Technology - Turbo Frequency (Intel(R) SST-TF)
depends on the Intel(R) Speed Select Technology - Performance Profile (Intel(R)
SST-PP) performance level configuration. It is possible that only a certain
performance level supports Intel(R) SST-TF. It is also possible that only the base
performance level (level = 0) supports Intel(R) SST-TF. Hence, first select the
desired performance level to enable this feature.
0746
0747 In the system under test here, Intel(R) SST-TF is supported at the base
0748 performance level 0, but currently disabled::
0749
0750 # intel-speed-select -c 0 perf-profile info -l 0
0751 Intel(R) Speed Select Technology
0752 package-0
0753 die-0
0754 cpu-0
0755 perf-profile-level-0
0756 ...
0757 ...
0758 speed-select-turbo-freq:disabled
0759 ...
0760 ...
0761
0762
To check if performance can be improved using the Intel(R) SST-TF feature, get
the turbo frequency properties with Intel(R) SST-TF enabled and compare them to
the base turbo capability of this system.
0766
0767 Get Base turbo capability
0768 ~~~~~~~~~~~~~~~~~~~~~~~~~
0769
0770 To get the base turbo capability of performance level 0, execute::
0771
0772 # intel-speed-select perf-profile info -l 0
0773 Intel(R) Speed Select Technology
0774 Executing on CPU model: X
0775 package-0
0776 die-0
0777 cpu-0
0778 perf-profile-level-0
0779 ...
0780 ...
0781 turbo-ratio-limits-sse
0782 bucket-0
0783 core-count:2
0784 max-turbo-frequency(MHz):3200
0785 bucket-1
0786 core-count:4
0787 max-turbo-frequency(MHz):3100
0788 bucket-2
0789 core-count:6
0790 max-turbo-frequency(MHz):3100
0791 bucket-3
0792 core-count:8
0793 max-turbo-frequency(MHz):3100
0794 bucket-4
0795 core-count:10
0796 max-turbo-frequency(MHz):3100
0797 bucket-5
0798 core-count:12
0799 max-turbo-frequency(MHz):3100
0800 bucket-6
0801 core-count:14
0802 max-turbo-frequency(MHz):3100
0803 bucket-7
0804 core-count:16
0805 max-turbo-frequency(MHz):3100
0806
Based on the data above, when all the CPUs are busy, a max. frequency of 3100
MHz can be achieved. For example, run a busy workload (e.g. stress) on CPUs
0 - 11, and execute the "hackbench pipe" workload on CPUs 12 and 13::
0810
0811 # taskset -c 12,13 perf bench -r 100 sched pipe
0812 # Running 'sched/pipe' benchmark:
0813 # Executed 1000000 pipe operations between two processes
0814 Total time: 5.705 [sec]
0815 5.705488 usecs/op
0816 175269 ops/sec
0817
0818 The turbostat output::
0819
# turbostat -c 0-13 --show Package,Core,CPU,Bzy_MHz -i 1
0821 Package Core CPU Bzy_MHz
0822 0 0 0 3000
0823 0 1 1 3000
0824 0 2 2 3000
0825 0 3 3 3000
0826 0 4 4 3000
0827 0 5 5 3100
0828 0 6 6 3100
0829 0 7 7 3000
0830 0 8 8 3100
0831 0 9 9 3000
0832 0 10 10 3000
0833 0 11 11 3000
0834 0 12 12 3100
0835 0 13 13 3100
0836
Based on the turbostat output, the performance is limited by a frequency cap of
3100 MHz. To check if the hackbench performance can be improved for CPU 12 and
CPU 13, first check the capability of the Intel(R) SST-TF feature for this
performance level.
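
The 3100 MHz cap follows from the turbo-ratio-limits-sse buckets listed earlier
in this section. The sketch below looks up the limit for a given number of
active cores, assuming each bucket gives the maximum turbo frequency when up to
"core-count" cores are active and that the last bucket applies beyond its core
count. The names ``SSE_BUCKETS`` and ``max_turbo_mhz`` are hypothetical.

```python
# (core-count, max-turbo-frequency in MHz) pairs from the level 0 output above.
SSE_BUCKETS = [(2, 3200), (4, 3100), (6, 3100), (8, 3100),
               (10, 3100), (12, 3100), (14, 3100), (16, 3100)]


def max_turbo_mhz(active_cores: int, buckets=SSE_BUCKETS) -> int:
    """Return the turbo limit of the first bucket that covers the active cores."""
    for core_count, frequency in buckets:
        if active_cores <= core_count:
            return frequency
    # Assumption: more active cores than the last bucket still get its limit.
    return buckets[-1][1]


print(max_turbo_mhz(2))   # only two cores busy
print(max_turbo_mhz(14))  # all cores of the package busy
```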
0841
0842 Get Intel(R) SST-TF Capability
0843 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
0844
0845 To get the capability, the "turbo-freq info" command can be used::
0846
0847 # intel-speed-select turbo-freq info -l 0
0848 Intel(R) Speed Select Technology
0849 Executing on CPU model: X
0850 package-0
0851 die-0
0852 cpu-0
0853 speed-select-turbo-freq
0854 bucket-0
0855 high-priority-cores-count:2
0856 high-priority-max-frequency(MHz):3200
0857 high-priority-max-avx2-frequency(MHz):3200
0858 high-priority-max-avx512-frequency(MHz):3100
0859 bucket-1
0860 high-priority-cores-count:4
0861 high-priority-max-frequency(MHz):3100
0862 high-priority-max-avx2-frequency(MHz):3000
0863 high-priority-max-avx512-frequency(MHz):2900
0864 bucket-2
0865 high-priority-cores-count:6
0866 high-priority-max-frequency(MHz):3100
0867 high-priority-max-avx2-frequency(MHz):3000
0868 high-priority-max-avx512-frequency(MHz):2900
0869 speed-select-turbo-freq-clip-frequencies
0870 low-priority-max-frequency(MHz):2600
0871 low-priority-max-avx2-frequency(MHz):2400
0872 low-priority-max-avx512-frequency(MHz):2100
0873
0874 Based on the output above, there is an Intel(R) SST-TF bucket for which there are
0875 two high priority cores. If only two high priority cores are set, then max.
0876 turbo frequency on those cores can be increased to 3200 MHz. This is 100 MHz
0877 more than the base turbo capability for all cores.
0878
In turn, for the hackbench workload, two CPUs can be set as high priority and
the rest as low priority. One side effect is that once enabled, the low priority
cores will be clipped to a lower frequency of 2600 MHz.
0882
0883 Enable Intel(R) SST-TF
0884 ~~~~~~~~~~~~~~~~~~~~~~
0885
0886 To enable Intel(R) SST-TF, execute::
0887
0888 # intel-speed-select -c 12,13 turbo-freq enable -a
0889 Intel(R) Speed Select Technology
0890 Executing on CPU model: X
0891 package-0
0892 die-0
0893 cpu-12
0894 turbo-freq
0895 enable:success
0896 package-0
0897 die-0
0898 cpu-13
0899 turbo-freq
0900 enable:success
0901 package--1
0902 die-0
0903 cpu-63
0904 turbo-freq --auto
0905 enable:success
0906
In this case, the option "-a" is optional. If set, it enables the Intel(R) SST-TF
feature and also sets the CPUs to high and low priority using Intel(R) Speed
Select Technology Core Power (Intel(R) SST-CP) features. The CPU numbers passed
with the "-c" argument are marked as high priority, including their siblings.

If the -a option is not used, then the following steps are required before
enabling Intel(R) SST-TF:
0914
0915 - Discover Intel(R) SST-TF and note buckets of high priority cores and maximum frequency
0916
- Enable CLOS using core-power feature set

- Configure CLOS parameters
0918
0919 - Subscribe desired CPUs to CLOS groups making sure that high priority cores are set to the maximum frequency
0920
Execute the same hackbench workload, scheduling the hackbench threads on the
high priority CPUs::

# taskset -c 12,13 perf bench -r 100 sched pipe
0925 # Running 'sched/pipe' benchmark:
0926 # Executed 1000000 pipe operations between two processes
0927 Total time: 5.510 [sec]
0928 5.510165 usecs/op
0929 180826 ops/sec
0930
This improves performance by around 3.3% on a busy system. Here the turbostat
output shows that CPU 12 and CPU 13 are getting a 100 MHz boost.
The turbostat output::

# turbostat -c 0-13 --show Package,Core,CPU,Bzy_MHz -i 1
0936 Package Core CPU Bzy_MHz
0937 ...
0938 0 12 12 3200
0939 0 13 13 3200