=================================
Power allocator governor tunables
=================================

Trip points
-----------

The governor works optimally with the following two passive trip points:

1.  "switch on" trip point: temperature above which the governor
    control loop starts operating. This is the first passive trip
    point of the thermal zone.

2.  "desired temperature" trip point: it should be higher than the
    "switch on" trip point. This is the target temperature the governor
    is controlling for. This is the last passive trip point of the
    thermal zone.

PID Controller
--------------

The power allocator governor implements a
Proportional-Integral-Derivative controller (PID controller) with
temperature as the control input and power as the controlled output::

    P_max = k_p * e + k_i * err_integral + k_d * diff_err + sustainable_power

where

-  e = desired_temperature - current_temperature
-  err_integral is the sum of previous errors
-  diff_err = e - previous_error

It is similar to the one depicted below::

                                      k_d
                                       |
  current_temp                         |
       |                               v
       |              +----------+   +---+
       |       +----->| diff_err |-->| X |------+
       |       |      +----------+   +---+      |
       |       |                                |      tdp        actor
       |       |                      k_i       |       |  get_requested_power()
       |       |                       |        |       |        |     |
       |       |                       |        |       |        |     | ...
       v       |                       v        v       v        v     v
     +---+     |      +-------+      +---+    +---+   +---+   +----------+
     | S |-----+----->| sum e |----->| X |--->| S |-->| S |-->|power     |
     +---+     |      +-------+      +---+    +---+   +---+   |allocation|
       ^       |                                ^             +----------+
       |       |                                |                |     |
       |       |        +---+                   |                |     |
       |       +------->| X |-------------------+                v     v
       |                +---+                               granted performance
  desired_temperature     ^
                          |
                          |
                      k_po/k_pu
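
As a rough illustration of the equation above, one controller step could
be sketched as follows. This is a minimal sketch, not the governor's
actual code: `struct pid_state`, `struct pid_gains` and `pid_step()` are
hypothetical names, plain integers stand in for whatever arithmetic the
governor really uses, and updating `err_integral` only after the output
is computed is an assumption made so that it holds the sum of *previous*
errors.

.. code-block:: c

    #include <stdint.h>

    /* Hypothetical controller state carried between steps. */
    struct pid_state {
        int64_t err_integral;   /* sum of previous errors */
        int64_t prev_err;       /* error seen on the previous step */
    };

    /* Hypothetical tunables, named after the terms in the text. */
    struct pid_gains {
        int64_t k_po;           /* k_p while overshooting (e < 0) */
        int64_t k_pu;           /* k_p while undershooting (e >= 0) */
        int64_t k_i;
        int64_t k_d;
        int64_t sustainable_power;
    };

    static int64_t pid_step(struct pid_state *s, const struct pid_gains *g,
                            int64_t desired_temp, int64_t current_temp)
    {
        int64_t e = desired_temp - current_temp;
        int64_t k_p = e < 0 ? g->k_po : g->k_pu;
        int64_t diff_err = e - s->prev_err;
        int64_t p_max = k_p * e + g->k_i * s->err_integral +
                        g->k_d * diff_err + g->sustainable_power;

        /* Assumption: fold the current error in after computing P_max. */
        s->err_integral += e;
        s->prev_err = e;
        return p_max;
    }

Selecting `k_p` from the sign of the error mirrors the `k_po`/`k_pu`
input shown at the bottom of the diagram.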

Sustainable power
-----------------

An estimate of the sustainable dissipatable power (in mW) should be
provided while registering the thermal zone. This estimates the
sustained power that can be dissipated at the desired control
temperature. This is the maximum sustained power for allocation at
the desired maximum temperature. The actual sustained power can vary
for a number of reasons. The closed loop controller will take care of
variations such as environmental conditions, and some factors related
to the speed-grade of the silicon. `sustainable_power` is therefore
simply an estimate, and may be tuned to affect the aggressiveness of
the thermal ramp. For reference, the sustainable power of a 4" phone
is typically 2000 mW, while on a 10" tablet it is around 4500 mW (may
vary depending on screen size). It is possible to have the power value
expressed in an abstract scale. The sustained power should be aligned
to the scale used by the related cooling devices.

If you are using device tree, add it as a property of the
thermal-zone. For example::

    thermal-zones {
        soc_thermal {
            polling-delay = <1000>;
            polling-delay-passive = <100>;
            sustainable-power = <2500>;
            ...

If instead the thermal zone is registered from the platform code, pass a
`thermal_zone_params` that has a `sustainable_power`. If no
`thermal_zone_params` was being passed before, then something like the
following will suffice::

    static const struct thermal_zone_params tz_params = {
        .sustainable_power = 3500,
    };

and then pass `tz_params` as the 5th parameter to
`thermal_zone_device_register()`.

k_po and k_pu
-------------

The implementation of the PID controller in the power allocator
thermal governor allows the configuration of two proportional term
constants: `k_po` and `k_pu`. `k_po` is the proportional term
constant during temperature overshoot periods (current temperature is
above "desired temperature" trip point). Conversely, `k_pu` is the
proportional term constant during temperature undershoot periods
(current temperature below "desired temperature" trip point).

These controls are intended as the primary mechanism for configuring
the permitted thermal "ramp" of the system. For instance, a lower
`k_pu` value will provide a slower ramp, at the cost of capping
available capacity at a low temperature. On the other hand, a high
value of `k_pu` will result in the governor granting very high power
while temperature is low, and may lead to temperature overshooting.

The default value for `k_pu` is::

    2 * sustainable_power / (desired_temperature - switch_on_temp)

This means that at `switch_on_temp` the output of the controller's
proportional term will be 2 * `sustainable_power`. The default value
for `k_po` is::

    sustainable_power / (desired_temperature - switch_on_temp)

Focusing on the proportional and feed forward values of the PID
controller equation we have::

    P_max = k_p * e + sustainable_power

The proportional term is proportional to the difference between the
desired temperature and the current one. When the current temperature
is the desired one, then the proportional component is zero and
`P_max` = `sustainable_power`. That is, the system should operate in
thermal equilibrium under constant load. `sustainable_power` is only
an estimate, which is the reason for closed-loop control such as this.

Expanding `k_pu` we get::

    P_max = 2 * sustainable_power * (T_set - T) / (T_set - T_on) +
            sustainable_power

where:

-  T_set is the desired temperature
-  T is the current temperature
-  T_on is the switch on temperature

When the current temperature is the switch_on temperature, the above
formula becomes::

    P_max = 2 * sustainable_power * (T_set - T_on) / (T_set - T_on) +
            sustainable_power = 2 * sustainable_power + sustainable_power =
            3 * sustainable_power

Therefore, the proportional term alone linearly decreases power from
3 * `sustainable_power` to `sustainable_power` as the temperature
rises from the switch on temperature to the desired temperature.
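
The arithmetic above can be checked with a small sketch. The helper
names are hypothetical, and temperatures are taken in whole degrees with
plain integer math purely so that the divisions in the example stay
exact; this is an assumption of the sketch, not a statement about the
governor's internal units or arithmetic.

.. code-block:: c

    #include <stdint.h>

    /* Default k_pu from the text. Requires t_set > t_on. */
    static int64_t default_k_pu(int64_t sustainable_power,
                                int64_t t_set, int64_t t_on)
    {
        return 2 * sustainable_power / (t_set - t_on);
    }

    /* Default k_po from the text. Requires t_set > t_on. */
    static int64_t default_k_po(int64_t sustainable_power,
                                int64_t t_set, int64_t t_on)
    {
        return sustainable_power / (t_set - t_on);
    }

    /* Proportional plus feed-forward part only: k_p * e + sustainable_power. */
    static int64_t p_max_proportional(int64_t sustainable_power,
                                      int64_t t_set, int64_t t_on, int64_t t)
    {
        int64_t e = t_set - t;
        int64_t k_p = e < 0 ?
            default_k_po(sustainable_power, t_set, t_on) :
            default_k_pu(sustainable_power, t_set, t_on);

        return k_p * e + sustainable_power;
    }

With `sustainable_power` = 2500, T_on = 75 and T_set = 85, this gives
7500 (3 * `sustainable_power`) at T = 75, 5000 at T = 80 and 2500 at
T = 85, matching the linear decrease described above.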

k_i and integral_cutoff
-----------------------

`k_i` configures the PID loop's integral term constant. This term
allows the PID controller to compensate for long term drift and for
the quantized nature of the output control: cooling devices can't set
the exact power that the governor requests. When the temperature
error is below `integral_cutoff`, errors are accumulated in the
integral term. This term is then multiplied by `k_i` and the result
added to the output of the controller. Typically `k_i` is set low (1
or 2) and `integral_cutoff` is 0.
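
The accumulation rule described above might look like the following
sketch. `struct integral_sketch` and `integral_term()` are hypothetical
names, and folding the current error into the sum before multiplying by
`k_i` is an assumption; any further limiting a real implementation might
apply to the integral is left out.

.. code-block:: c

    #include <stdint.h>

    /* Hypothetical holder for the integral tunables and running sum. */
    struct integral_sketch {
        int64_t k_i;
        int64_t integral_cutoff;
        int64_t err_integral;
    };

    /* Returns this step's integral contribution to the controller output. */
    static int64_t integral_term(struct integral_sketch *s, int64_t err)
    {
        /* Only errors below the cutoff are accumulated. */
        if (err < s->integral_cutoff)
            s->err_integral += err;

        return s->k_i * s->err_integral;
    }

With `integral_cutoff` set to 0, only negative errors (temperature above
the desired temperature) are accumulated in this sketch, so the integral
term can only pull the granted power down.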

k_d
---

`k_d` configures the PID loop's derivative term constant. It's
recommended to leave it as the default: 0.

Cooling device power API
========================

Cooling devices controlled by this governor must supply the additional
"power" API in their `cooling_device_ops`. It consists of three ops:

1. ::

    int get_requested_power(struct thermal_cooling_device *cdev,
                            struct thermal_zone_device *tz, u32 *power);


@cdev:
    The `struct thermal_cooling_device` pointer
@tz:
    thermal zone in which we are currently operating
@power:
    pointer in which to store the calculated power

`get_requested_power()` calculates the power requested by the device
in milliwatts and stores it in @power. It should return 0 on
success, -E* on failure. This is currently used by the power
allocator governor to calculate how much power to give to each cooling
device.

2. ::

    int state2power(struct thermal_cooling_device *cdev, struct
                    thermal_zone_device *tz, unsigned long state,
                    u32 *power);

@cdev:
    The `struct thermal_cooling_device` pointer
@tz:
    thermal zone in which we are currently operating
@state:
    A cooling device state
@power:
    pointer in which to store the equivalent power

Convert cooling device state @state into power consumption in
milliwatts and store it in @power. It should return 0 on success, -E*
on failure. This is currently used by the thermal core to calculate the
maximum power that an actor can consume.

3. ::

    int power2state(struct thermal_cooling_device *cdev, u32 power,
                    unsigned long *state);

@cdev:
    The `struct thermal_cooling_device` pointer
@power:
    power in milliwatts
@state:
    pointer in which to store the resulting state

Calculate a cooling device state that would make the device consume at
most @power mW and store it in @state. It should return 0 on success,
-E* on failure. This is currently used by the thermal core to convert
a given power set by the power allocator governor to a state that the
cooling device can set. It is a function because this conversion may
depend on external factors that may change, so this function should
return the best conversion given "current circumstances".
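
To make the two conversions concrete, here is a minimal user-space
sketch built on a static lookup table. Everything in it is hypothetical:
the table values, the `sketch_` names, and the choice to fall back to
the deepest state when the budget is below every table entry. The `cdev`
and `tz` arguments of the real ops are omitted so the example stays
self-contained; a real driver would implement the ops with the
signatures shown above.

.. code-block:: c

    #include <errno.h>
    #include <stddef.h>
    #include <stdint.h>

    /*
     * Hypothetical device: state 0 is unthrottled, and each higher
     * cooling state consumes less power (in milliwatts).
     */
    static const uint32_t sketch_power_mw[] = { 2000, 1500, 900, 400 };
    #define SKETCH_NR_STATES \
        (sizeof(sketch_power_mw) / sizeof(sketch_power_mw[0]))

    /* Convert a cooling state into the power consumed in that state. */
    static int sketch_state2power(unsigned long state, uint32_t *power)
    {
        if (state >= SKETCH_NR_STATES)
            return -EINVAL;

        *power = sketch_power_mw[state];
        return 0;
    }

    /* Find the shallowest state that consumes at most @power mW. */
    static int sketch_power2state(uint32_t power, unsigned long *state)
    {
        unsigned long i;

        for (i = 0; i < SKETCH_NR_STATES; i++) {
            if (sketch_power_mw[i] <= power) {
                *state = i;
                return 0;
            }
        }

        /* Assumption: nothing fits the budget, use the deepest state. */
        *state = SKETCH_NR_STATES - 1;
        return 0;
    }

A static table is the simplest case. As noted above, the conversion may
depend on changing external factors, in which case the table would have
to be recomputed or replaced by a calculation.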

Cooling device weights
----------------------

Weights are a mechanism to bias the allocation among cooling
devices. They express the relative power efficiency of different
cooling devices. Higher weight can be used to express higher power
efficiency. Weighting is relative such that if each cooling device
has a weight of one they are considered equal. This is particularly
useful in heterogeneous systems where two cooling devices may perform
the same kind of compute, but with different efficiency. For example,
a system with two different types of processors.

If the thermal zone is registered using
`thermal_zone_device_register()` (i.e., platform code), then weights
are passed as part of the thermal zone's `thermal_bind_params`.
If the thermal zone is registered using device tree, then they are
passed as the `contribution` property of each map in the
`cooling-maps` node.
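
One way to picture how weights can bias an allocation is the sketch
below, which scales each device's requested power by its weight and then
splits a power budget in proportion to the scaled requests. This is an
illustrative assumption about how such a bias could be applied, not a
description of the governor's exact algorithm; `sketch_weighted_split()`
and its parameters are hypothetical.

.. code-block:: c

    #include <stddef.h>
    #include <stdint.h>

    /*
     * Split @budget among @n devices in proportion to weight * request.
     * Assumes the inputs are small enough that the products fit in
     * 64 bits.
     */
    static void sketch_weighted_split(const uint32_t *req_power,
                                      const uint32_t *weight, size_t n,
                                      uint32_t budget, uint32_t *granted)
    {
        uint64_t total = 0;
        size_t i;

        for (i = 0; i < n; i++)
            total += (uint64_t)req_power[i] * weight[i];

        for (i = 0; i < n; i++) {
            uint64_t weighted = (uint64_t)req_power[i] * weight[i];

            /* No device requested anything: grant nothing. */
            granted[i] = total ? (uint32_t)(weighted * budget / total) : 0;
        }
    }

With two devices each requesting 1000 mW and a 3000 mW budget, equal
weights of one give each device 1500 mW, while weights of 3 and 1 give
2250 mW and 750 mW respectively.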

Limitations of the power allocator governor
===========================================

The power allocator governor's PID controller works best if there is a
periodic tick. If a driver calls `thermal_zone_device_update()` (or
anything that ends up calling the governor's `throttle()` function)
repeatedly outside that tick, the governor's response won't be very
good. Note that this is not particular to this governor: step-wise
will also misbehave if you call its `throttle()` faster than the
normal thermal framework tick (due to interrupts, for example), as it
will overreact.

Energy Model requirements
=========================

The power values provided by the cooling devices must use a consistent
scale. All of the cooling devices in a single thermal zone should
report their power values either in milliwatts or scaled to the same
'abstract scale'.