0001 ==================================================
0002 Runtime Power Management Framework for I/O Devices
0003 ==================================================
0004 
0005 (C) 2009-2011 Rafael J. Wysocki <rjw@sisk.pl>, Novell Inc.
0006 
0007 (C) 2010 Alan Stern <stern@rowland.harvard.edu>
0008 
0009 (C) 2014 Intel Corp., Rafael J. Wysocki <rafael.j.wysocki@intel.com>
0010 
0011 1. Introduction
0012 ===============
0013 
0014 Support for runtime power management (runtime PM) of I/O devices is provided
0015 at the power management core (PM core) level by means of:
0016 
0017 * The power management workqueue pm_wq in which bus types and device drivers can
0018   put their PM-related work items.  It is strongly recommended that pm_wq be
0019   used for queuing all work items related to runtime PM, because this allows
0020   them to be synchronized with system-wide power transitions (suspend to RAM,
0021   hibernation and resume from system sleep states).  pm_wq is declared in
0022   include/linux/pm_runtime.h and defined in kernel/power/main.c.
0023 
0024 * A number of runtime PM fields in the 'power' member of 'struct device' (which
0025   is of the type 'struct dev_pm_info', defined in include/linux/pm.h) that can
0026   be used for synchronizing runtime PM operations with one another.
0027 
0028 * Three device runtime PM callbacks in 'struct dev_pm_ops' (defined in
0029   include/linux/pm.h).
0030 
0031 * A set of helper functions defined in drivers/base/power/runtime.c that can be
0032   used for carrying out runtime PM operations in such a way that the
0033   synchronization between them is taken care of by the PM core.  Bus types and
0034   device drivers are encouraged to use these functions.
0035 
0036 The runtime PM callbacks present in 'struct dev_pm_ops', the device runtime PM
0037 fields of 'struct dev_pm_info' and the core helper functions provided for
0038 runtime PM are described below.
0039 
0040 2. Device Runtime PM Callbacks
0041 ==============================
0042 
0043 There are three device runtime PM callbacks defined in 'struct dev_pm_ops'::
0044 
0045   struct dev_pm_ops {
0046         ...
0047         int (*runtime_suspend)(struct device *dev);
0048         int (*runtime_resume)(struct device *dev);
0049         int (*runtime_idle)(struct device *dev);
0050         ...
0051   };
0052 
0053 The ->runtime_suspend(), ->runtime_resume() and ->runtime_idle() callbacks
0054 are executed by the PM core for the device's subsystem, which is the first of
0055 the following that applies:
0056 
0057   1. PM domain of the device, if the device's PM domain object, dev->pm_domain,
0058      is present.
0059 
0060   2. Device type of the device, if both dev->type and dev->type->pm are present.
0061 
0062   3. Device class of the device, if both dev->class and dev->class->pm are
0063      present.
0064 
0065   4. Bus type of the device, if both dev->bus and dev->bus->pm are present.
0066 
0067 If the subsystem chosen by applying the above rules doesn't provide the relevant
0068 callback, the PM core will invoke the corresponding driver callback stored in
0069 dev->driver->pm directly (if present).
0070 
0071 The PM core always checks which callback to use in the order given above, so the
0072 priority order of callbacks from high to low is: PM domain, device type, class
0073 and bus type.  Moreover, the high-priority one will always take precedence over
0074 a low-priority one.  The PM domain, bus type, device type and class callbacks
0075 are referred to as subsystem-level callbacks in what follows.
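
The selection rules above can be sketched as a small user-space model.  This is
not kernel code: every type and function prefixed with model_ is hypothetical,
and a single 'struct model_subsys' stands in for the device type, class and bus
type objects.  Only the lookup order and the driver fallback are taken from the
text.

```c
#include <assert.h>
#include <stddef.h>

struct model_device;
typedef int (*model_pm_cb)(struct model_device *dev);

/* Hypothetical, cut-down counterpart of 'struct dev_pm_ops'. */
struct model_pm_ops {
	model_pm_cb runtime_suspend;
	model_pm_cb runtime_resume;
	model_pm_cb runtime_idle;
};

/* One stand-in for the device type, class and bus type objects. */
struct model_subsys {
	const struct model_pm_ops *pm;
};

struct model_pm_domain {
	struct model_pm_ops ops;
};

struct model_device {
	const struct model_pm_domain *pm_domain;
	const struct model_subsys *type;
	const struct model_subsys *cls;        /* models dev->class */
	const struct model_subsys *bus;
	const struct model_pm_ops *driver_pm;  /* models dev->driver->pm */
};

/* Return the ops of the highest-priority subsystem that is present. */
const struct model_pm_ops *model_subsys_ops(const struct model_device *dev)
{
	if (dev->pm_domain)
		return &dev->pm_domain->ops;
	if (dev->type && dev->type->pm)
		return dev->type->pm;
	if (dev->cls && dev->cls->pm)
		return dev->cls->pm;
	if (dev->bus && dev->bus->pm)
		return dev->bus->pm;
	return NULL;
}

/* Return the suspend callback that would be run, or NULL if there is none. */
model_pm_cb model_pick_suspend(const struct model_device *dev)
{
	const struct model_pm_ops *ops;
	model_pm_cb cb;

	assert(dev != NULL);
	ops = model_subsys_ops(dev);
	cb = ops ? ops->runtime_suspend : NULL;

	/* The chosen subsystem has no callback: fall back to the driver. */
	if (!cb && dev->driver_pm)
		cb = dev->driver_pm->runtime_suspend;
	return cb;
}

/* Example callbacks, only used to tell the candidates apart. */
int model_domain_suspend(struct model_device *dev) { (void)dev; return 0; }
int model_bus_suspend(struct model_device *dev) { (void)dev; return 0; }
int model_driver_suspend(struct model_device *dev) { (void)dev; return 0; }
```

Note that in this model a PM domain without a suspend callback falls back to
the driver callback, not to the bus type, because the subsystem is chosen first
and only then inspected for the callback.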
0076 
0077 By default, the callbacks are always invoked in process context with interrupts
0078 enabled.  However, the pm_runtime_irq_safe() helper function can be used to tell
0079 the PM core that it is safe to run the ->runtime_suspend(), ->runtime_resume()
0080 and ->runtime_idle() callbacks for the given device in atomic context with
0081 interrupts disabled.  This implies that the callback routines in question must
0082 not block or sleep, but it also means that the synchronous helper functions
0083 listed at the end of Section 4 may be used for that device within an interrupt
0084 handler or generally in an atomic context.
0085 
0086 The subsystem-level suspend callback, if present, is **entirely responsible**
0087 for handling the suspend of the device as appropriate, which may, but need not
0088 include executing the device driver's own ->runtime_suspend() callback (from the
0089 PM core's point of view it is not necessary to implement a ->runtime_suspend()
0090 callback in a device driver as long as the subsystem-level suspend callback
0091 knows what to do to handle the device).
0092 
0093   * Once the subsystem-level suspend callback (or the driver suspend callback,
0094     if invoked directly) has completed successfully for the given device, the PM
0095     core regards the device as suspended, which need not mean that it has been
0096     put into a low power state.  It is supposed to mean, however, that the
0097     device will not process data and will not communicate with the CPU(s) and
0098     RAM until the appropriate resume callback is executed for it.  The runtime
0099     PM status of a device after successful execution of the suspend callback is
0100     'suspended'.
0101 
0102   * If the suspend callback returns -EBUSY or -EAGAIN, the device's runtime PM
0103     status remains 'active', which means that the device _must_ be fully
0104     operational afterwards.
0105 
0106   * If the suspend callback returns an error code different from -EBUSY and
0107     -EAGAIN, the PM core regards this as a fatal error and will refuse to run
0108     the helper functions described in Section 4 for the device until its status
0109     is directly set to either 'active' or 'suspended' (the PM core provides
0110     special helper functions for this purpose).
0111 
0112 In particular, if the driver requires remote wakeup capability (i.e. a hardware
0113 mechanism allowing the device to request a change of its power state, such as
0114 PCI PME) for proper functioning and device_can_wakeup() returns 'false' for the
0115 device, then ->runtime_suspend() should return -EBUSY.  On the other hand, if
0116 device_can_wakeup() returns 'true' for the device and the device is put into a
0117 low-power state during the execution of the suspend callback, it is expected
0118 that remote wakeup will be enabled for the device.  Generally, remote wakeup
0119 should be enabled for all input devices put into low-power states at run time.
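
The handling of suspend callback return values described above can be
illustrated with a user-space model.  All names prefixed with model_ are
hypothetical.  Keeping the status 'active' after a fatal error is an assumption
of the sketch; the text only states that the helper functions are refused until
the status is set directly.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

enum model_rpm_status { MODEL_RPM_ACTIVE, MODEL_RPM_SUSPENDED };

struct model_rpm_state {
	enum model_rpm_status status;
	int runtime_error;	/* nonzero: helpers refuse to run */
};

/* Apply the return value of a suspend callback to the modelled state. */
void model_after_suspend_cb(struct model_rpm_state *st, int ret)
{
	assert(st != NULL);

	if (ret == 0) {
		st->status = MODEL_RPM_SUSPENDED;
		st->runtime_error = 0;
	} else if (ret == -EBUSY || ret == -EAGAIN) {
		/* Not fatal: the device must remain fully operational. */
		st->status = MODEL_RPM_ACTIVE;
		st->runtime_error = 0;
	} else {
		/* Fatal: remember the code (status choice is an assumption). */
		st->status = MODEL_RPM_ACTIVE;
		st->runtime_error = ret;
	}
}

/* Nonzero if the modelled helper functions would agree to run. */
int model_helpers_usable(const struct model_rpm_state *st)
{
	return st->runtime_error == 0;
}

/* Models setting the status directly, which also clears the error. */
void model_set_status(struct model_rpm_state *st, enum model_rpm_status s)
{
	st->runtime_error = 0;
	st->status = s;
}
```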
0120 
0121 The subsystem-level resume callback, if present, is **entirely responsible** for
0122 handling the resume of the device as appropriate, which may, but need not
0123 include executing the device driver's own ->runtime_resume() callback (from the
0124 PM core's point of view it is not necessary to implement a ->runtime_resume()
0125 callback in a device driver as long as the subsystem-level resume callback knows
0126 what to do to handle the device).
0127 
0128   * Once the subsystem-level resume callback (or the driver resume callback, if
0129     invoked directly) has completed successfully, the PM core regards the device
0130     as fully operational, which means that the device _must_ be able to complete
0131     I/O operations as needed.  The runtime PM status of the device is then
0132     'active'.
0133 
0134   * If the resume callback returns an error code, the PM core regards this as a
0135     fatal error and will refuse to run the helper functions described in Section
0136     4 for the device, until its status is directly set to either 'active', or
0137     'suspended' (by means of special helper functions provided by the PM core
0138     for this purpose).
0139 
0140 The idle callback (a subsystem-level one, if present, or the driver one) is
0141 executed by the PM core whenever the device appears to be idle, which is
0142 indicated to the PM core by two counters, the device's usage counter and the
0143 counter of 'active' children of the device.
0144 
0145   * If any of these counters is decreased using a helper function provided by
0146     the PM core and it turns out to be equal to zero, the other counter is
0147     checked.  If that counter also is equal to zero, the PM core executes the
0148     idle callback with the device as its argument.
0149 
0150 The action performed by the idle callback is totally dependent on the subsystem
0151 (or driver) in question, but the expected and recommended action is to check
0152 if the device can be suspended (i.e. if all of the conditions necessary for
0153 suspending the device are satisfied) and to queue up a suspend request for the
0154 device in that case.  If there is no idle callback, or if the callback returns
0155 0, then the PM core will attempt to carry out a runtime suspend of the device,
0156 also respecting devices configured for autosuspend.  In essence this means a
0157 call to pm_runtime_autosuspend() (do note that drivers need to update the
0158 device last busy mark, pm_runtime_mark_last_busy(), to control the delay under
0159 this circumstance).  To prevent this (for example, if the callback routine has
0160 started a delayed suspend), the routine must return a non-zero value.  Negative
0161 error return codes are ignored by the PM core.
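
The interplay of the two counters and the idle callback's return value can be
sketched as follows.  This is a user-space model with hypothetical model_ names,
not the PM core's implementation; the autosuspend attempt is reduced to a
counter so that the behaviour can be observed.

```c
#include <assert.h>
#include <stddef.h>

struct model_dev {
	int usage_count;
	int child_count;	/* number of 'active' children */
	int (*runtime_idle)(struct model_dev *dev);	/* may be NULL */
	int idle_calls;			/* how often the callback ran */
	int autosuspend_attempts;	/* how often a suspend was attempted */
};

/* Run the idle callback, if any; attempt a suspend if it is absent or returns 0. */
void model_idle(struct model_dev *dev)
{
	int ret = 0;

	if (dev->runtime_idle) {
		dev->idle_calls++;
		ret = dev->runtime_idle(dev);
	}
	if (ret == 0)
		dev->autosuspend_attempts++;
}

/* Drop a usage reference; the device appears idle once both counters are 0. */
void model_put(struct model_dev *dev)
{
	assert(dev->usage_count > 0);
	if (--dev->usage_count == 0 && dev->child_count == 0)
		model_idle(dev);
}

/* An idle callback that lets the core go on to suspend the device. */
int model_idle_allow(struct model_dev *dev) { (void)dev; return 0; }

/* An idle callback that has, say, started its own delayed suspend. */
int model_idle_veto(struct model_dev *dev) { (void)dev; return 1; }
```

With model_idle_veto() installed, dropping the last reference runs the callback
but does not lead to a suspend attempt; with no callback at all, the suspend
attempt happens directly.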
0162 
0163 The helper functions provided by the PM core, described in Section 4, guarantee
0164 that the following constraints are met with respect to runtime PM callbacks for
0165 one device:
0166 
0167 (1) The callbacks are mutually exclusive (e.g. it is forbidden to execute
0168     ->runtime_suspend() in parallel with ->runtime_resume() or with another
0169     instance of ->runtime_suspend() for the same device) with the exception that
0170     ->runtime_suspend() or ->runtime_resume() can be executed in parallel with
0171     ->runtime_idle() (although ->runtime_idle() will not be started while any
0172     of the other callbacks is being executed for the same device).
0173 
0174 (2) ->runtime_idle() and ->runtime_suspend() can only be executed for 'active'
0175     devices (i.e. the PM core will only execute ->runtime_idle() or
0176     ->runtime_suspend() for the devices the runtime PM status of which is
0177     'active').
0178 
0179 (3) ->runtime_idle() and ->runtime_suspend() can only be executed for a device
0180     the usage counter of which is equal to zero _and_ either the counter of
0181     'active' children of which is equal to zero, or the 'power.ignore_children'
0182     flag of which is set.
0183 
0184 (4) ->runtime_resume() can only be executed for 'suspended' devices  (i.e. the
0185     PM core will only execute ->runtime_resume() for the devices the runtime
0186     PM status of which is 'suspended').
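
Constraints (2) to (4) amount to two preconditions, written out below as
predicates over a simplified device state.  The struct and function names are
hypothetical and only mirror the fields named in the text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum model_status { MODEL_ACTIVE, MODEL_SUSPENDED };

struct model_counters {
	enum model_status status;
	int usage_count;
	int child_count;	/* number of 'active' children */
	bool ignore_children;	/* models 'power.ignore_children' */
};

/* Constraints (2) and (3): ->runtime_idle() or ->runtime_suspend() may run. */
bool model_may_idle_or_suspend(const struct model_counters *c)
{
	assert(c != NULL);
	return c->status == MODEL_ACTIVE &&
	       c->usage_count == 0 &&
	       (c->child_count == 0 || c->ignore_children);
}

/* Constraint (4): ->runtime_resume() may run. */
bool model_may_resume(const struct model_counters *c)
{
	assert(c != NULL);
	return c->status == MODEL_SUSPENDED;
}
```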
0187 
0188 Additionally, the helper functions provided by the PM core obey the following
0189 rules:
0190 
0191   * If ->runtime_suspend() is about to be executed or there's a pending request
0192     to execute it, ->runtime_idle() will not be executed for the same device.
0193 
0194   * A request to execute or to schedule the execution of ->runtime_suspend()
0195     will cancel any pending requests to execute ->runtime_idle() for the same
0196     device.
0197 
0198   * If ->runtime_resume() is about to be executed or there's a pending request
0199     to execute it, the other callbacks will not be executed for the same device.
0200 
0201   * A request to execute ->runtime_resume() will cancel any pending or
0202     scheduled requests to execute the other callbacks for the same device,
0203     except for scheduled autosuspends.
0204 
0205 3. Runtime PM Device Fields
0206 ===========================
0207 
0208 The following device runtime PM fields are present in 'struct dev_pm_info', as
0209 defined in include/linux/pm.h:
0210 
0211   `struct timer_list suspend_timer;`
0212     - timer used for scheduling (delayed) suspend and autosuspend requests
0213 
0214   `unsigned long timer_expires;`
0215     - timer expiration time, in jiffies (if this is different from zero, the
0216       timer is running and will expire at that time, otherwise the timer is not
0217       running)
0218 
0219   `struct work_struct work;`
0220     - work structure used for queuing up requests (i.e. work items in pm_wq)
0221 
0222   `wait_queue_head_t wait_queue;`
0223     - wait queue used if any of the helper functions needs to wait for another
0224       one to complete
0225 
0226   `spinlock_t lock;`
0227     - lock used for synchronization
0228 
0229   `atomic_t usage_count;`
0230     - the usage counter of the device
0231 
0232   `atomic_t child_count;`
0233     - the count of 'active' children of the device
0234 
0235   `unsigned int ignore_children;`
0236     - if set, the value of child_count is ignored (but still updated)
0237 
0238   `unsigned int disable_depth;`
0239     - used for disabling the helper functions (they work normally if this is
0240       equal to zero); the initial value of it is 1 (i.e. runtime PM is
0241       initially disabled for all devices)
0242 
0243   `int runtime_error;`
0244     - if set, there was a fatal error (one of the callbacks returned an error
0245       code as described in Section 2), so the helper functions will not work until
0246       this flag is cleared; this is the error code returned by the failing
0247       callback
0248 
0249   `unsigned int idle_notification;`
0250     - if set, ->runtime_idle() is being executed
0251 
0252   `unsigned int request_pending;`
0253     - if set, there's a pending request (i.e. a work item queued up into pm_wq)
0254 
0255   `enum rpm_request request;`
0256     - type of request that's pending (valid if request_pending is set)
0257 
0258   `unsigned int deferred_resume;`
0259     - set if ->runtime_resume() is about to be run while ->runtime_suspend() is
0260       being executed for that device and it is not practical to wait for the
0261       suspend to complete; means "start a resume as soon as you've suspended"
0262 
0263   `enum rpm_status runtime_status;`
0264     - the runtime PM status of the device; this field's initial value is
0265       RPM_SUSPENDED, which means that each device is initially regarded by the
0266       PM core as 'suspended', regardless of its real hardware status
0267 
0268   `enum rpm_status last_status;`
0269     - the last runtime PM status of the device captured before disabling runtime
0270       PM for it (invalid initially and when disable_depth is 0)
0271 
0272   `unsigned int runtime_auto;`
0273     - if set, indicates that the user space has allowed the device driver to
0274       power manage the device at run time via the /sys/devices/.../power/control
0275       `interface;` it may only be modified with the help of the
0276       pm_runtime_allow() and pm_runtime_forbid() helper functions
0277 
0278   `unsigned int no_callbacks;`
0279     - indicates that the device does not use the runtime PM callbacks (see
0280       Section 8); it may be modified only by the pm_runtime_no_callbacks()
0281       helper function
0282 
0283   `unsigned int irq_safe;`
0284     - indicates that the ->runtime_suspend() and ->runtime_resume() callbacks
0285       will be invoked with the spinlock held and interrupts disabled
0286 
0287   `unsigned int use_autosuspend;`
0288     - indicates that the device's driver supports delayed autosuspend (see
0289       Section 9); it may be modified only by the
0290       pm_runtime{_dont}_use_autosuspend() helper functions
0291 
0292   `unsigned int timer_autosuspends;`
0293     - indicates that the PM core should attempt to carry out an autosuspend
0294       when the timer expires rather than a normal suspend
0295 
0296   `int autosuspend_delay;`
0297     - the delay time (in milliseconds) to be used for autosuspend
0298 
0299   `unsigned long last_busy;`
0300     - the time (in jiffies) when the pm_runtime_mark_last_busy() helper
0301       function was last called for this device; used in calculating inactivity
0302       periods for autosuspend
0303 
0304 All of the above fields are members of the 'power' member of 'struct device'.
0305 
0306 4. Runtime PM Device Helper Functions
0307 =====================================
0308 
0309 The following runtime PM helper functions are defined in
0310 drivers/base/power/runtime.c and include/linux/pm_runtime.h:
0311 
0312   `void pm_runtime_init(struct device *dev);`
0313     - initialize the device runtime PM fields in 'struct dev_pm_info'
0314 
0315   `void pm_runtime_remove(struct device *dev);`
0316     - make sure that the runtime PM of the device will be disabled after
0317       removing the device from device hierarchy
0318 
0319   `int pm_runtime_idle(struct device *dev);`
0320     - execute the subsystem-level idle callback for the device; returns an
0321       error code on failure, where -EINPROGRESS means that ->runtime_idle() is
0322       already being executed; if there is no callback or the callback returns 0
0323       then run pm_runtime_autosuspend(dev) and return its result
0324 
0325   `int pm_runtime_suspend(struct device *dev);`
0326     - execute the subsystem-level suspend callback for the device; returns 0 on
0327       success, 1 if the device's runtime PM status was already 'suspended', or
0328       error code on failure, where -EAGAIN or -EBUSY means it is safe to attempt
0329       to suspend the device again in future and -EACCES means that
0330       'power.disable_depth' is different from 0
0331 
0332   `int pm_runtime_autosuspend(struct device *dev);`
0333     - same as pm_runtime_suspend() except that the autosuspend delay is taken
0334       into account; if pm_runtime_autosuspend_expiration() says the delay has
0335       not yet expired then an autosuspend is scheduled for the appropriate time
0336       and 0 is returned
0337 
0338   `int pm_runtime_resume(struct device *dev);`
0339     - execute the subsystem-level resume callback for the device; returns 0 on
0340       success, 1 if the device's runtime PM status is already 'active' (also if
0341       'power.disable_depth' is nonzero, but the status was 'active' when it was
0342       changing from 0 to 1) or error code on failure, where -EAGAIN means it may
0343       be safe to attempt to resume the device again in future, but
0344       'power.runtime_error' should be checked additionally, and -EACCES means
0345       that the callback could not be run, because 'power.disable_depth' was
0346       different from 0
0347 
0348   `int pm_runtime_resume_and_get(struct device *dev);`
0349     - run pm_runtime_resume(dev) and, if successful, increment the device's
0350       usage counter; return 0 on success or a negative error code on failure
0351 
0352   `int pm_request_idle(struct device *dev);`
0353     - submit a request to execute the subsystem-level idle callback for the
0354       device (the request is represented by a work item in pm_wq); returns 0 on
0355       success or error code if the request has not been queued up
0356 
0357   `int pm_request_autosuspend(struct device *dev);`
0358     - schedule the execution of the subsystem-level suspend callback for the
0359       device when the autosuspend delay has expired; if the delay has already
0360       expired then the work item is queued up immediately
0361 
0362   `int pm_schedule_suspend(struct device *dev, unsigned int delay);`
0363     - schedule the execution of the subsystem-level suspend callback for the
0364       device in future, where 'delay' is the time to wait before queuing up a
0365       suspend work item in pm_wq, in milliseconds (if 'delay' is zero, the work
0366       item is queued up immediately); returns 0 on success, 1 if the device's PM
0367       runtime status was already 'suspended', or error code if the request
0368       hasn't been scheduled (or queued up if 'delay' is 0); if the execution of
0369       ->runtime_suspend() is already scheduled and not yet expired, the new
0370       value of 'delay' will be used as the time to wait
0371 
0372   `int pm_request_resume(struct device *dev);`
0373     - submit a request to execute the subsystem-level resume callback for the
0374       device (the request is represented by a work item in pm_wq); returns 0 on
0375       success, 1 if the device's runtime PM status was already 'active', or
0376       error code if the request hasn't been queued up
0377 
0378   `void pm_runtime_get_noresume(struct device *dev);`
0379     - increment the device's usage counter
0380 
0381   `int pm_runtime_get(struct device *dev);`
0382     - increment the device's usage counter, run pm_request_resume(dev) and
0383       return its result
0384 
0385   `int pm_runtime_get_sync(struct device *dev);`
0386     - increment the device's usage counter, run pm_runtime_resume(dev) and
0387       return its result;
0388       note that it does not drop the device's usage counter on errors, so
0389       consider using pm_runtime_resume_and_get() instead of it, especially
0390       if its return value is checked by the caller, as this is likely to
0391       result in cleaner code.
0392 
0393   `int pm_runtime_get_if_in_use(struct device *dev);`
0394     - return -EINVAL if 'power.disable_depth' is nonzero; otherwise, if the
0395       runtime PM status is RPM_ACTIVE and the runtime PM usage counter is
0396       nonzero, increment the counter and return 1; otherwise return 0 without
0397       changing the counter
0398 
0399   `int pm_runtime_get_if_active(struct device *dev, bool ign_usage_count);`
0400     - return -EINVAL if 'power.disable_depth' is nonzero; otherwise, if the
0401       runtime PM status is RPM_ACTIVE, and either ign_usage_count is true
0402       or the device's usage_count is non-zero, increment the counter and
0403       return 1; otherwise return 0 without changing the counter
0404 
0405   `void pm_runtime_put_noidle(struct device *dev);`
0406     - decrement the device's usage counter
0407 
0408   `int pm_runtime_put(struct device *dev);`
0409     - decrement the device's usage counter; if the result is 0 then run
0410       pm_request_idle(dev) and return its result
0411 
0412   `int pm_runtime_put_autosuspend(struct device *dev);`
0413     - decrement the device's usage counter; if the result is 0 then run
0414       pm_request_autosuspend(dev) and return its result
0415 
0416   `int pm_runtime_put_sync(struct device *dev);`
0417     - decrement the device's usage counter; if the result is 0 then run
0418       pm_runtime_idle(dev) and return its result
0419 
0420   `int pm_runtime_put_sync_suspend(struct device *dev);`
0421     - decrement the device's usage counter; if the result is 0 then run
0422       pm_runtime_suspend(dev) and return its result
0423 
0424   `int pm_runtime_put_sync_autosuspend(struct device *dev);`
0425     - decrement the device's usage counter; if the result is 0 then run
0426       pm_runtime_autosuspend(dev) and return its result
0427 
0428   `void pm_runtime_enable(struct device *dev);`
0429     - decrement the device's 'power.disable_depth' field; if that field is equal
0430       to zero, the runtime PM helper functions can execute subsystem-level
0431       callbacks described in Section 2 for the device
0432 
0433   `int pm_runtime_disable(struct device *dev);`
0434     - increment the device's 'power.disable_depth' field (if the value of that
0435       field was previously zero, this prevents subsystem-level runtime PM
0436       callbacks from being run for the device), make sure that all of the
0437       pending runtime PM operations on the device are either completed or
0438       canceled; returns 1 if there was a resume request pending and it was
0439       necessary to execute the subsystem-level resume callback for the device
0440       to satisfy that request, otherwise 0 is returned
0441 
0442   `int pm_runtime_barrier(struct device *dev);`
0443     - check if there's a resume request pending for the device and resume it
0444       (synchronously) in that case, cancel any other pending runtime PM requests
0445       regarding it and wait for all runtime PM operations on it in progress to
0446       complete; returns 1 if there was a resume request pending and it was
0447       necessary to execute the subsystem-level resume callback for the device to
0448       satisfy that request, otherwise 0 is returned
0449 
0450   `void pm_suspend_ignore_children(struct device *dev, bool enable);`
0451     - set/unset the power.ignore_children flag of the device
0452 
0453   `int pm_runtime_set_active(struct device *dev);`
0454     - clear the device's 'power.runtime_error' flag, set the device's runtime
0455       PM status to 'active' and update its parent's counter of 'active'
0456       children as appropriate (it is only valid to use this function if
0457       'power.runtime_error' is set or 'power.disable_depth' is greater than
0458       zero); it will fail and return error code if the device has a parent
0459       which is not active and the 'power.ignore_children' flag of which is unset
0460 
0461   `void pm_runtime_set_suspended(struct device *dev);`
0462     - clear the device's 'power.runtime_error' flag, set the device's runtime
0463       PM status to 'suspended' and update its parent's counter of 'active'
0464       children as appropriate (it is only valid to use this function if
0465       'power.runtime_error' is set or 'power.disable_depth' is greater than
0466       zero)
0467 
0468   `bool pm_runtime_active(struct device *dev);`
0469     - return true if the device's runtime PM status is 'active' or its
0470       'power.disable_depth' field is not equal to zero, or false otherwise
0471 
0472   `bool pm_runtime_suspended(struct device *dev);`
0473     - return true if the device's runtime PM status is 'suspended' and its
0474       'power.disable_depth' field is equal to zero, or false otherwise
0475 
0476   `bool pm_runtime_status_suspended(struct device *dev);`
0477     - return true if the device's runtime PM status is 'suspended'
0478 
0479   `void pm_runtime_allow(struct device *dev);`
0480     - set the power.runtime_auto flag for the device and decrease its usage
0481       counter (used by the /sys/devices/.../power/control interface to
0482       effectively allow the device to be power managed at run time)
0483 
0484   `void pm_runtime_forbid(struct device *dev);`
0485     - unset the power.runtime_auto flag for the device and increase its usage
0486       counter (used by the /sys/devices/.../power/control interface to
0487       effectively prevent the device from being power managed at run time)
0488 
0489   `void pm_runtime_no_callbacks(struct device *dev);`
0490     - set the power.no_callbacks flag for the device and remove the runtime
0491       PM attributes from /sys/devices/.../power (or prevent them from being
0492       added when the device is registered)
0493 
0494   `void pm_runtime_irq_safe(struct device *dev);`
0495     - set the power.irq_safe flag for the device, causing the runtime-PM
0496       callbacks to be invoked with interrupts off
0497 
0498   `bool pm_runtime_is_irq_safe(struct device *dev);`
0499     - return true if power.irq_safe flag was set for the device, causing
0500       the runtime-PM callbacks to be invoked with interrupts off
0501 
0502   `void pm_runtime_mark_last_busy(struct device *dev);`
0503     - set the power.last_busy field to the current time
0504 
0505   `void pm_runtime_use_autosuspend(struct device *dev);`
0506     - set the power.use_autosuspend flag, enabling autosuspend delays; call
0507       pm_runtime_get_sync if the flag was previously cleared and
0508       power.autosuspend_delay is negative
0509 
0510   `void pm_runtime_dont_use_autosuspend(struct device *dev);`
0511     - clear the power.use_autosuspend flag, disabling autosuspend delays;
0512       decrement the device's usage counter if the flag was previously set and
0513       power.autosuspend_delay is negative; call pm_runtime_idle
0514 
0515   `void pm_runtime_set_autosuspend_delay(struct device *dev, int delay);`
0516     - set the power.autosuspend_delay value to 'delay' (expressed in
0517       milliseconds); if 'delay' is negative then runtime suspends are
0518       prevented; if power.use_autosuspend is set, pm_runtime_get_sync may be
0519       called or the device's usage counter may be decremented and
0520       pm_runtime_idle called depending on whether power.autosuspend_delay is
0521       changed to or from a negative value; if power.use_autosuspend is clear,
0522       pm_runtime_idle is called
0523 
0524   `unsigned long pm_runtime_autosuspend_expiration(struct device *dev);`
0525     - calculate the time when the current autosuspend delay period will expire,
0526       based on power.last_busy and power.autosuspend_delay; if the delay time
0527       is 1000 ms or larger then the expiration time is rounded up to the
0528       nearest second; returns 0 if the delay period has already expired or
0529       power.use_autosuspend isn't set, otherwise returns the expiration time
0530       in jiffies
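
The difference between pm_runtime_get_sync() and pm_runtime_resume_and_get()
on the error path, noted in their descriptions above, can be made concrete with
a user-space model.  The model_ functions are hypothetical stand-ins, the resume
result is injected through a field, and normalising a successful result to 0 is
an assumption of the sketch.

```c
#include <assert.h>
#include <errno.h>

struct model_dev {
	int usage_count;
	int resume_result;	/* what the modelled resume returns */
};

/* Stand-in for pm_runtime_resume(): 0, 1 or a negative error code. */
int model_resume(struct model_dev *dev)
{
	return dev->resume_result;
}

/* Like pm_runtime_put_noidle(): drop a reference and do nothing else. */
void model_put_noidle(struct model_dev *dev)
{
	assert(dev->usage_count > 0);
	dev->usage_count--;
}

/* Like pm_runtime_get_sync(): the reference is kept even on error. */
int model_get_sync(struct model_dev *dev)
{
	dev->usage_count++;
	return model_resume(dev);
}

/* Like pm_runtime_resume_and_get(): the reference is kept only on success. */
int model_resume_and_get(struct model_dev *dev)
{
	int ret = model_get_sync(dev);

	if (ret < 0) {
		model_put_noidle(dev);	/* undo the increment */
		return ret;
	}
	return 0;	/* success normalised to 0 (assumption of this sketch) */
}
```

A caller of the first variant has to drop the reference itself when an error is
returned, which is the extra step the second variant removes.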
0531 
0532 It is safe to execute the following helper functions from interrupt context:
0533 
0534 - pm_request_idle()
0535 - pm_request_autosuspend()
0536 - pm_schedule_suspend()
0537 - pm_request_resume()
0538 - pm_runtime_get_noresume()
0539 - pm_runtime_get()
0540 - pm_runtime_put_noidle()
0541 - pm_runtime_put()
0542 - pm_runtime_put_autosuspend()
0543 - pm_runtime_enable()
0544 - pm_suspend_ignore_children()
0545 - pm_runtime_set_active()
0546 - pm_runtime_set_suspended()
0547 - pm_runtime_suspended()
0548 - pm_runtime_mark_last_busy()
0549 - pm_runtime_autosuspend_expiration()
0550 
0551 If pm_runtime_irq_safe() has been called for a device then the following helper
0552 functions may also be used in interrupt context:
0553 
0554 - pm_runtime_idle()
0555 - pm_runtime_suspend()
0556 - pm_runtime_autosuspend()
0557 - pm_runtime_resume()
0558 - pm_runtime_get_sync()
0559 - pm_runtime_put_sync()
0560 - pm_runtime_put_sync_suspend()
0561 - pm_runtime_put_sync_autosuspend()
0562 
0563 5. Runtime PM Initialization, Device Probing and Removal
0564 ========================================================
0565 
0566 Initially, runtime PM is disabled for all devices, which means that the
0567 majority of the runtime PM helper functions described in Section 4 will return
0568 -EACCES until pm_runtime_enable() is called for the device.
0569 
0570 In addition to that, the initial runtime PM status of all devices is
0571 'suspended', but it need not reflect the actual physical state of the device.
0572 Thus, if the device is initially active (i.e. it is able to process I/O), its
0573 runtime PM status must be changed to 'active', with the help of
0574 pm_runtime_set_active(), before pm_runtime_enable() is called for the device.
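
The initial state and the set-active-then-enable ordering can be modelled in
user space as shown below.  The model_ names are hypothetical, the parent
device is left out, and the error value returned by model_set_active() when
runtime PM is enabled is an assumption.  The -EACCES result follows the
Section 4 description of pm_runtime_suspend() for a nonzero
'power.disable_depth'.

```c
#include <assert.h>
#include <errno.h>

enum model_status { MODEL_SUSPENDED, MODEL_ACTIVE };

struct model_dev {
	enum model_status status;
	unsigned int disable_depth;
};

/* Initial state: regarded as 'suspended', runtime PM disabled. */
void model_init(struct model_dev *dev)
{
	dev->status = MODEL_SUSPENDED;
	dev->disable_depth = 1;
}

/* Like pm_runtime_enable(): helpers work once the depth reaches zero. */
void model_enable(struct model_dev *dev)
{
	assert(dev->disable_depth > 0);
	dev->disable_depth--;
}

/* Only valid while runtime PM is disabled (error flag not modelled). */
int model_set_active(struct model_dev *dev)
{
	if (dev->disable_depth == 0)
		return -EAGAIN;	/* error value is an assumption */
	dev->status = MODEL_ACTIVE;
	return 0;
}

/* Like pm_runtime_suspend(): 0 on success, 1 if already suspended. */
int model_suspend(struct model_dev *dev)
{
	if (dev->disable_depth > 0)
		return -EACCES;
	if (dev->status == MODEL_SUSPENDED)
		return 1;
	dev->status = MODEL_SUSPENDED;
	return 0;
}
```

A probe path for a device that is already powered would call
model_set_active() and then model_enable(), in that order.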
0575 
0576 However, if the device has a parent and the parent's runtime PM is enabled,
0577 calling pm_runtime_set_active() for the device will affect the parent, unless
0578 the parent's 'power.ignore_children' flag is set.  Namely, in that case the
0579 parent won't be able to suspend at run time, using the PM core's helper
0580 functions, as long as the child's status is 'active', even if the child's
0581 runtime PM is still disabled (i.e. pm_runtime_enable() hasn't been called for
0582 the child yet or pm_runtime_disable() has been called for it).  For this reason,
0583 once pm_runtime_set_active() has been called for the device, pm_runtime_enable()
0584 should be called for it too as soon as reasonably possible or its runtime PM
0585 status should be changed back to 'suspended' with the help of
0586 pm_runtime_set_suspended().
0587 
0588 If the default initial runtime PM status of the device (i.e. 'suspended')
0589 reflects the actual state of the device, its bus type's or its driver's
0590 ->probe() callback will likely need to wake it up using one of the PM core's
0591 helper functions described in Section 4.  In that case, pm_runtime_resume()
0592 should be used.  Of course, for this purpose the device's runtime PM has to be
0593 enabled earlier by calling pm_runtime_enable().
0594 
Note that if runtime PM helpers may be called for the device during probing
(for example, if it is registered with a subsystem that may call back into the
driver), a pm_runtime_get_sync() call paired with a pm_runtime_put() call is
appropriate to ensure that the device is not put back to sleep during the
probe.  This can happen with subsystems such as the network device layer.
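
A minimal sketch of that pairing is shown below.  It is a userspace model under stated assumptions: the `pm_runtime_*()` functions are simplified stand-ins for the kernel helpers of the same names, the model's pm_runtime_put() suspends immediately (the real helper only queues an idle request), and `foo_register()` and `foo_callback_saw_active` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified userspace stand-in for the kernel's struct device. */
struct device {
	bool enabled;
	bool suspended;
	int usage_count;
};

static void pm_runtime_enable(struct device *dev)
{
	dev->enabled = true;
}

/* Bump the usage counter and resume the device before returning. */
static int pm_runtime_get_sync(struct device *dev)
{
	dev->usage_count++;
	dev->suspended = false;
	return 0;
}

/* Drop the usage counter without any idle check. */
static void pm_runtime_put_noidle(struct device *dev)
{
	dev->usage_count--;
}

/*
 * Drop a reference.  The real helper queues an idle request when the
 * counter reaches zero; this model suspends right away instead.
 */
static int pm_runtime_put(struct device *dev)
{
	if (--dev->usage_count == 0)
		dev->suspended = true;
	return 0;
}

/* Hypothetical: records what the subsystem's call-back observed. */
bool foo_callback_saw_active;

/* Hypothetical registration step during which the subsystem calls back. */
static int foo_register(struct device *dev)
{
	pm_runtime_get_sync(dev);
	pm_runtime_put(dev);
	foo_callback_saw_active = !dev->suspended;
	return 0;
}

int foo_probe(struct device *dev)
{
	int ret;

	assert(dev != NULL);
	pm_runtime_enable(dev);

	/* Hold a reference so nested put calls cannot suspend the device. */
	ret = pm_runtime_get_sync(dev);
	if (ret < 0) {
		pm_runtime_put_noidle(dev);
		return ret;
	}

	ret = foo_register(dev);

	/* Probing is done: allow the device to be suspended again. */
	pm_runtime_put(dev);
	return ret;
}
```

Because the probe routine holds its own reference, the get/put pair issued from the call-back never drops the usage counter to zero, so the device stays active until the final pm_runtime_put().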
0600 
0601 It may be desirable to suspend the device once ->probe() has finished.
0602 Therefore the driver core uses the asynchronous pm_request_idle() to submit a
0603 request to execute the subsystem-level idle callback for the device at that
0604 time.  A driver that makes use of the runtime autosuspend feature may want to
0605 update the last busy mark before returning from ->probe().
0606 
0607 Moreover, the driver core prevents runtime PM callbacks from racing with the bus
0608 notifier callback in __device_release_driver(), which is necessary because the
0609 notifier is used by some subsystems to carry out operations affecting the
0610 runtime PM functionality.  It does so by calling pm_runtime_get_sync() before
0611 driver_sysfs_remove() and the BUS_NOTIFY_UNBIND_DRIVER notifications.  This
0612 resumes the device if it's in the suspended state and prevents it from
0613 being suspended again while those routines are being executed.
0614 
0615 To allow bus types and drivers to put devices into the suspended state by
0616 calling pm_runtime_suspend() from their ->remove() routines, the driver core
0617 executes pm_runtime_put_sync() after running the BUS_NOTIFY_UNBIND_DRIVER
0618 notifications in __device_release_driver().  This requires bus types and
0619 drivers to make their ->remove() callbacks avoid races with runtime PM directly,
0620 but it also allows more flexibility in the handling of devices during the
0621 removal of their drivers.
0622 
In their ->remove() callbacks, drivers should undo the runtime PM changes made
in ->probe().  Usually this means calling pm_runtime_disable(),
pm_runtime_dont_use_autosuspend(), etc.
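
As a rough illustration, the sketch below pairs a probe routine with a remove routine that undoes it.  It is a userspace model: the `pm_runtime_*()` functions are simplified stand-ins for the kernel helpers of the same names, the 2000 ms delay is an arbitrary example value, and undoing the steps in reverse order is an assumption of this sketch, not a requirement stated in this document.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified userspace stand-in for the kernel's struct device. */
struct device {
	bool enabled;
	bool use_autosuspend;
	int autosuspend_delay;	/* milliseconds */
};

static void pm_runtime_enable(struct device *dev)
{
	dev->enabled = true;
}

static void pm_runtime_disable(struct device *dev)
{
	dev->enabled = false;
}

static void pm_runtime_use_autosuspend(struct device *dev)
{
	dev->use_autosuspend = true;
}

static void pm_runtime_dont_use_autosuspend(struct device *dev)
{
	dev->use_autosuspend = false;
}

static void pm_runtime_set_autosuspend_delay(struct device *dev, int delay)
{
	dev->autosuspend_delay = delay;
}

int foo_probe(struct device *dev)
{
	assert(dev != NULL);

	pm_runtime_set_autosuspend_delay(dev, 2000);
	pm_runtime_use_autosuspend(dev);
	pm_runtime_enable(dev);
	return 0;
}

void foo_remove(struct device *dev)
{
	assert(dev != NULL);

	/* Undo the probe-time runtime PM setup, last step first. */
	pm_runtime_dont_use_autosuspend(dev);
	pm_runtime_disable(dev);
}
```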
0626 
User space can effectively prevent the driver of the device from power managing
it at run time by changing the value of its /sys/devices/.../power/control
attribute to "on", which causes pm_runtime_forbid() to be called.  In principle,
this mechanism may also be used by the driver to effectively turn off the
runtime power management of the device until user space turns it on.
Namely, during the initialization the driver can make sure that the runtime PM
status of the device is 'active' and call pm_runtime_forbid().  It should be
noted, however, that if user space has already intentionally changed the
value of /sys/devices/.../power/control to "auto" to allow the driver to power
manage the device at run time, the driver may confuse it by using
pm_runtime_forbid() this way.
0638 
0639 6. Runtime PM and System Sleep
0640 ==============================
0641 
0642 Runtime PM and system sleep (i.e., system suspend and hibernation, also known
0643 as suspend-to-RAM and suspend-to-disk) interact with each other in a couple of
0644 ways.  If a device is active when a system sleep starts, everything is
0645 straightforward.  But what should happen if the device is already suspended?
0646 
0647 The device may have different wake-up settings for runtime PM and system sleep.
0648 For example, remote wake-up may be enabled for runtime suspend but disallowed
0649 for system sleep (device_may_wakeup(dev) returns 'false').  When this happens,
0650 the subsystem-level system suspend callback is responsible for changing the
0651 device's wake-up setting (it may leave that to the device driver's system
0652 suspend routine).  It may be necessary to resume the device and suspend it again
0653 in order to do so.  The same is true if the driver uses different power levels
0654 or other settings for runtime suspend and system sleep.
0655 
0656 During system resume, the simplest approach is to bring all devices back to full
0657 power, even if they had been suspended before the system suspend began.  There
0658 are several reasons for this, including:
0659 
0660   * The device might need to switch power levels, wake-up settings, etc.
0661 
0662   * Remote wake-up events might have been lost by the firmware.
0663 
0664   * The device's children may need the device to be at full power in order
0665     to resume themselves.
0666 
0667   * The driver's idea of the device state may not agree with the device's
0668     physical state.  This can happen during resume from hibernation.
0669 
0670   * The device might need to be reset.
0671 
0672   * Even though the device was suspended, if its usage counter was > 0 then most
0673     likely it would need a runtime resume in the near future anyway.
0674 
0675 If the device had been suspended before the system suspend began and it's
0676 brought back to full power during resume, then its runtime PM status will have
0677 to be updated to reflect the actual post-system sleep status.  The way to do
0678 this is:
0679 
0680          - pm_runtime_disable(dev);
0681          - pm_runtime_set_active(dev);
0682          - pm_runtime_enable(dev);
0683 
0684 The PM core always increments the runtime usage counter before calling the
0685 ->suspend() callback and decrements it after calling the ->resume() callback.
0686 Hence disabling runtime PM temporarily like this will not cause any runtime
0687 suspend attempts to be permanently lost.  If the usage count goes to zero
0688 following the return of the ->resume() callback, the ->runtime_idle() callback
0689 will be invoked as usual.
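
The three-step status update can be sketched as a system resume callback.  This is a userspace model: `struct device` and the helpers are simplified stand-ins, and `foo_resume()` is a hypothetical driver callback.  The stand-in for pm_runtime_set_active() refuses to change the status while runtime PM is enabled, which is modeled on the kernel's behavior and illustrates why the disable/enable bracket is needed.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Simplified userspace stand-ins for the kernel's types and helpers. */
enum rpm_status { RPM_SUSPENDED, RPM_ACTIVE };

struct device {
	enum rpm_status runtime_status;
	int disable_depth;	/* 0 means runtime PM is enabled */
};

static void pm_runtime_disable(struct device *dev)
{
	dev->disable_depth++;
}

static void pm_runtime_enable(struct device *dev)
{
	if (dev->disable_depth > 0)
		dev->disable_depth--;
}

/* The status may only be set directly while runtime PM is disabled. */
static int pm_runtime_set_active(struct device *dev)
{
	if (dev->disable_depth == 0)
		return -EAGAIN;
	dev->runtime_status = RPM_ACTIVE;
	return 0;
}

/* Hypothetical system resume callback of a driver. */
int foo_resume(struct device *dev)
{
	int ret;

	assert(dev != NULL);

	/* ... bring the hardware back to full power here ... */

	/* Make the runtime PM status match the physical state. */
	pm_runtime_disable(dev);
	ret = pm_runtime_set_active(dev);
	pm_runtime_enable(dev);

	return ret;
}
```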
0690 
0691 On some systems, however, system sleep is not entered through a global firmware
0692 or hardware operation.  Instead, all hardware components are put into low-power
0693 states directly by the kernel in a coordinated way.  Then, the system sleep
0694 state effectively follows from the states the hardware components end up in
0695 and the system is woken up from that state by a hardware interrupt or a similar
0696 mechanism entirely under the kernel's control.  As a result, the kernel never
0697 gives control away and the states of all devices during resume are precisely
0698 known to it.  If that is the case and none of the situations listed above takes
0699 place (in particular, if the system is not waking up from hibernation), it may
0700 be more efficient to leave the devices that had been suspended before the system
0701 suspend began in the suspended state.
0702 
0703 To this end, the PM core provides a mechanism allowing some coordination between
0704 different levels of device hierarchy.  Namely, if a system suspend .prepare()
0705 callback returns a positive number for a device, that indicates to the PM core
0706 that the device appears to be runtime-suspended and its state is fine, so it
0707 may be left in runtime suspend provided that all of its descendants are also
0708 left in runtime suspend.  If that happens, the PM core will not execute any
0709 system suspend and resume callbacks for all of those devices, except for the
0710 .complete() callback, which is then entirely responsible for handling the device
0711 as appropriate.  This only applies to system suspend transitions that are not
0712 related to hibernation (see Documentation/driver-api/pm/devices.rst for more
0713 information).
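
A .prepare() callback using this mechanism might look like the sketch below.  It is a userspace model: `struct device` and pm_runtime_suspended() are simplified stand-ins for the kernel's versions, and `foo_prepare()` is hypothetical.  A real driver would additionally check that the device's current settings (wake-up configuration, power level) are acceptable for system sleep before returning a positive value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified userspace stand-ins for the kernel's types and helpers. */
enum rpm_status { RPM_SUSPENDED, RPM_ACTIVE };

struct device {
	enum rpm_status runtime_status;
	int disable_depth;	/* 0 means runtime PM is enabled */
};

/* True only if runtime PM is enabled and the status is 'suspended'. */
static bool pm_runtime_suspended(struct device *dev)
{
	return dev->runtime_status == RPM_SUSPENDED && dev->disable_depth == 0;
}

/* Hypothetical system suspend .prepare() callback. */
int foo_prepare(struct device *dev)
{
	assert(dev != NULL);

	/*
	 * A positive return value tells the PM core that the device may be
	 * left in runtime suspend; 0 requests the normal suspend path.
	 */
	return pm_runtime_suspended(dev) ? 1 : 0;
}
```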
0714 
0715 The PM core does its best to reduce the probability of race conditions between
0716 the runtime PM and system suspend/resume (and hibernation) callbacks by carrying
0717 out the following operations:
0718 
0719   * During system suspend pm_runtime_get_noresume() is called for every device
0720     right before executing the subsystem-level .prepare() callback for it and
0721     pm_runtime_barrier() is called for every device right before executing the
0722     subsystem-level .suspend() callback for it.  In addition to that the PM core
0723     calls __pm_runtime_disable() with 'false' as the second argument for every
0724     device right before executing the subsystem-level .suspend_late() callback
0725     for it.
0726 
0727   * During system resume pm_runtime_enable() and pm_runtime_put() are called for
0728     every device right after executing the subsystem-level .resume_early()
0729     callback and right after executing the subsystem-level .complete() callback
0730     for it, respectively.
0731 
7. Generic subsystem callbacks
==============================
0733 
Subsystems may wish to conserve code space by using the set of generic power
management callbacks provided by the PM core, defined in
drivers/base/power/generic_ops.c:
0737 
0738   `int pm_generic_runtime_suspend(struct device *dev);`
0739     - invoke the ->runtime_suspend() callback provided by the driver of this
0740       device and return its result, or return 0 if not defined
0741 
0742   `int pm_generic_runtime_resume(struct device *dev);`
0743     - invoke the ->runtime_resume() callback provided by the driver of this
0744       device and return its result, or return 0 if not defined
0745 
0746   `int pm_generic_suspend(struct device *dev);`
0747     - if the device has not been suspended at run time, invoke the ->suspend()
0748       callback provided by its driver and return its result, or return 0 if not
0749       defined
0750 
0751   `int pm_generic_suspend_noirq(struct device *dev);`
0752     - if pm_runtime_suspended(dev) returns "false", invoke the ->suspend_noirq()
0753       callback provided by the device's driver and return its result, or return
0754       0 if not defined
0755 
0756   `int pm_generic_resume(struct device *dev);`
0757     - invoke the ->resume() callback provided by the driver of this device and,
0758       if successful, change the device's runtime PM status to 'active'
0759 
0760   `int pm_generic_resume_noirq(struct device *dev);`
0761     - invoke the ->resume_noirq() callback provided by the driver of this device
0762 
0763   `int pm_generic_freeze(struct device *dev);`
0764     - if the device has not been suspended at run time, invoke the ->freeze()
0765       callback provided by its driver and return its result, or return 0 if not
0766       defined
0767 
0768   `int pm_generic_freeze_noirq(struct device *dev);`
0769     - if pm_runtime_suspended(dev) returns "false", invoke the ->freeze_noirq()
0770       callback provided by the device's driver and return its result, or return
0771       0 if not defined
0772 
0773   `int pm_generic_thaw(struct device *dev);`
0774     - if the device has not been suspended at run time, invoke the ->thaw()
0775       callback provided by its driver and return its result, or return 0 if not
0776       defined
0777 
0778   `int pm_generic_thaw_noirq(struct device *dev);`
0779     - if pm_runtime_suspended(dev) returns "false", invoke the ->thaw_noirq()
0780       callback provided by the device's driver and return its result, or return
0781       0 if not defined
0782 
0783   `int pm_generic_poweroff(struct device *dev);`
0784     - if the device has not been suspended at run time, invoke the ->poweroff()
0785       callback provided by its driver and return its result, or return 0 if not
0786       defined
0787 
0788   `int pm_generic_poweroff_noirq(struct device *dev);`
0789     - if pm_runtime_suspended(dev) returns "false", run the ->poweroff_noirq()
0790       callback provided by the device's driver and return its result, or return
0791       0 if not defined
0792 
0793   `int pm_generic_restore(struct device *dev);`
0794     - invoke the ->restore() callback provided by the driver of this device and,
0795       if successful, change the device's runtime PM status to 'active'
0796 
0797   `int pm_generic_restore_noirq(struct device *dev);`
0798     - invoke the ->restore_noirq() callback provided by the device's driver
0799 
0800 These functions are the defaults used by the PM core if a subsystem doesn't
0801 provide its own callbacks for ->runtime_idle(), ->runtime_suspend(),
0802 ->runtime_resume(), ->suspend(), ->suspend_noirq(), ->resume(),
0803 ->resume_noirq(), ->freeze(), ->freeze_noirq(), ->thaw(), ->thaw_noirq(),
0804 ->poweroff(), ->poweroff_noirq(), ->restore(), ->restore_noirq() in the
0805 subsystem-level dev_pm_ops structure.
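
The "invoke the driver's callback and return its result, or return 0 if not defined" behavior listed above can be sketched as follows.  This is a userspace model with heavily reduced versions of `struct device`, `struct device_driver` and `struct dev_pm_ops`; `sketch_generic_runtime_suspend()` and `foo_runtime_suspend()` are hypothetical names, and the code is an approximation of the idea, not the contents of generic_ops.c.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct device;

/* Heavily reduced stand-ins for the kernel structures. */
struct dev_pm_ops {
	int (*runtime_suspend)(struct device *dev);
};

struct device_driver {
	const struct dev_pm_ops *pm;
};

struct device {
	struct device_driver *driver;
};

/* Call the driver's ->runtime_suspend() if there is one, else return 0. */
int sketch_generic_runtime_suspend(struct device *dev)
{
	const struct dev_pm_ops *pm;

	assert(dev != NULL);
	pm = dev->driver ? dev->driver->pm : NULL;

	return pm && pm->runtime_suspend ? pm->runtime_suspend(dev) : 0;
}

/* Hypothetical driver callback that refuses to suspend. */
int foo_runtime_suspend(struct device *dev)
{
	(void)dev;
	return -EBUSY;
}
```

A device with no driver, a driver with no dev_pm_ops, and a dev_pm_ops with a NULL callback pointer all take the "return 0" path in this sketch.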
0806 
0807 Device drivers that wish to use the same function as a system suspend, freeze,
0808 poweroff and runtime suspend callback, and similarly for system resume, thaw,
0809 restore, and runtime resume, can achieve this with the help of the
0810 UNIVERSAL_DEV_PM_OPS macro defined in include/linux/pm.h (possibly setting its
0811 last argument to NULL).
0812 
0813 8. "No-Callback" Devices
0814 ========================
0815 
0816 Some "devices" are only logical sub-devices of their parent and cannot be
0817 power-managed on their own.  (The prototype example is a USB interface.  Entire
0818 USB devices can go into low-power mode or send wake-up requests, but neither is
0819 possible for individual interfaces.)  The drivers for these devices have no
0820 need of runtime PM callbacks; if the callbacks did exist, ->runtime_suspend()
0821 and ->runtime_resume() would always return 0 without doing anything else and
0822 ->runtime_idle() would always call pm_runtime_suspend().
0823 
0824 Subsystems can tell the PM core about these devices by calling
0825 pm_runtime_no_callbacks().  This should be done after the device structure is
0826 initialized and before it is registered (although after device registration is
0827 also okay).  The routine will set the device's power.no_callbacks flag and
0828 prevent the non-debugging runtime PM sysfs attributes from being created.
0829 
0830 When power.no_callbacks is set, the PM core will not invoke the
0831 ->runtime_idle(), ->runtime_suspend(), or ->runtime_resume() callbacks.
0832 Instead it will assume that suspends and resumes always succeed and that idle
0833 devices should be suspended.
0834 
0835 As a consequence, the PM core will never directly inform the device's subsystem
0836 or driver about runtime power changes.  Instead, the driver for the device's
0837 parent must take responsibility for telling the device's driver when the
0838 parent's power state changes.
0839 
Note that in some cases it may not be desirable for subsystems/drivers to call
pm_runtime_no_callbacks() for their devices.  This could be because a subset of
the runtime PM callbacks needs to be implemented, because a platform-dependent
PM domain could get attached to the device, or because the device is power
managed through a supplier device link.  For these reasons, and to avoid
boilerplate code in subsystems/drivers, the PM core allows runtime PM callbacks
to be unassigned.  More precisely, if a callback pointer is NULL, the PM core
will act as though there was a callback and it returned 0.
0848 
0849 9. Autosuspend, or automatically-delayed suspends
0850 =================================================
0851 
0852 Changing a device's power state isn't free; it requires both time and energy.
0853 A device should be put in a low-power state only when there's some reason to
0854 think it will remain in that state for a substantial time.  A common heuristic
0855 says that a device which hasn't been used for a while is liable to remain
0856 unused; following this advice, drivers should not allow devices to be suspended
0857 at runtime until they have been inactive for some minimum period.  Even when
0858 the heuristic ends up being non-optimal, it will still prevent devices from
0859 "bouncing" too rapidly between low-power and full-power states.
0860 
0861 The term "autosuspend" is an historical remnant.  It doesn't mean that the
0862 device is automatically suspended (the subsystem or driver still has to call
0863 the appropriate PM routines); rather it means that runtime suspends will
0864 automatically be delayed until the desired period of inactivity has elapsed.
0865 
0866 Inactivity is determined based on the power.last_busy field.  Drivers should
0867 call pm_runtime_mark_last_busy() to update this field after carrying out I/O,
0868 typically just before calling pm_runtime_put_autosuspend().  The desired length
0869 of the inactivity period is a matter of policy.  Subsystems can set this length
0870 initially by calling pm_runtime_set_autosuspend_delay(), but after device
0871 registration the length should be controlled by user space, using the
0872 /sys/devices/.../power/autosuspend_delay_ms attribute.
0873 
0874 In order to use autosuspend, subsystems or drivers must call
0875 pm_runtime_use_autosuspend() (preferably before registering the device), and
0876 thereafter they should use the various `*_autosuspend()` helper functions
0877 instead of the non-autosuspend counterparts::
0878 
0879         Instead of: pm_runtime_suspend    use: pm_runtime_autosuspend;
0880         Instead of: pm_schedule_suspend   use: pm_request_autosuspend;
0881         Instead of: pm_runtime_put        use: pm_runtime_put_autosuspend;
0882         Instead of: pm_runtime_put_sync   use: pm_runtime_put_sync_autosuspend.
0883 
0884 Drivers may also continue to use the non-autosuspend helper functions; they
0885 will behave normally, which means sometimes taking the autosuspend delay into
0886 account (see pm_runtime_idle).
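
The setup and the typical I/O path can be sketched as below.  This is a userspace model: the `pm_runtime_*()` functions are simplified stand-ins for the kernel helpers of the same names, `model_now_ms` is a hypothetical clock replacing kernel timekeeping, `timer_expires` is a field invented for the model to show when the delayed suspend would fire, and the 2000 ms delay is an arbitrary example value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified userspace stand-in for the kernel's struct device. */
struct device {
	bool use_autosuspend;
	int autosuspend_delay;		/* milliseconds */
	int usage_count;
	unsigned long last_busy;	/* model time, milliseconds */
	unsigned long timer_expires;	/* 0 when no autosuspend is pending */
};

/* Hypothetical model clock, set by the caller. */
unsigned long model_now_ms;

static void pm_runtime_set_autosuspend_delay(struct device *dev, int delay)
{
	dev->autosuspend_delay = delay;
}

static void pm_runtime_use_autosuspend(struct device *dev)
{
	dev->use_autosuspend = true;
}

static int pm_runtime_get_sync(struct device *dev)
{
	dev->usage_count++;
	dev->timer_expires = 0;	/* a busy device has no pending suspend */
	return 0;
}

static void pm_runtime_mark_last_busy(struct device *dev)
{
	dev->last_busy = model_now_ms;
}

/* Drop a reference; at zero, schedule a suspend after the delay. */
static int pm_runtime_put_autosuspend(struct device *dev)
{
	if (--dev->usage_count == 0 && dev->use_autosuspend)
		dev->timer_expires = dev->last_busy + dev->autosuspend_delay;
	return 0;
}

/* Done once, preferably before the device is registered. */
void foo_setup(struct device *dev)
{
	assert(dev != NULL);
	pm_runtime_set_autosuspend_delay(dev, 2000);
	pm_runtime_use_autosuspend(dev);
}

void foo_do_io(struct device *dev)
{
	assert(dev != NULL);
	pm_runtime_get_sync(dev);
	/* ... carry out the I/O ... */
	pm_runtime_mark_last_busy(dev);
	pm_runtime_put_autosuspend(dev);
}
```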
0887 
0888 Under some circumstances a driver or subsystem may want to prevent a device
0889 from autosuspending immediately, even though the usage counter is zero and the
0890 autosuspend delay time has expired.  If the ->runtime_suspend() callback
0891 returns -EAGAIN or -EBUSY, and if the next autosuspend delay expiration time is
0892 in the future (as it normally would be if the callback invoked
0893 pm_runtime_mark_last_busy()), the PM core will automatically reschedule the
0894 autosuspend.  The ->runtime_suspend() callback can't do this rescheduling
0895 itself because no suspend requests of any kind are accepted while the device is
0896 suspending (i.e., while the callback is running).
0897 
0898 The implementation is well suited for asynchronous use in interrupt contexts.
0899 However such use inevitably involves races, because the PM core can't
0900 synchronize ->runtime_suspend() callbacks with the arrival of I/O requests.
0901 This synchronization must be handled by the driver, using its private lock.
0902 Here is a schematic pseudo-code example::
0903 
        void foo_read_or_write(struct foo_priv *foo, void *data)
        {
                lock(&foo->private_lock);
                add_request_to_io_queue(foo, data);
                /* The first pending request takes a runtime PM reference. */
                if (foo->num_pending_requests++ == 0)
                        pm_runtime_get(&foo->dev);
                if (!foo->is_suspended)
                        foo_process_next_request(foo);
                unlock(&foo->private_lock);
        }

        void foo_io_completion(struct foo_priv *foo, void *req)
        {
                lock(&foo->private_lock);
                if (--foo->num_pending_requests == 0) {
                        /* Last request done: drop the reference. */
                        pm_runtime_mark_last_busy(&foo->dev);
                        pm_runtime_put_autosuspend(&foo->dev);
                } else {
                        foo_process_next_request(foo);
                }
                unlock(&foo->private_lock);
                /* Send req result back to the user ... */
        }

        int foo_runtime_suspend(struct device *dev)
        {
                struct foo_priv *foo = container_of(dev, ...);
                int ret = 0;

                lock(&foo->private_lock);
                if (foo->num_pending_requests > 0) {
                        /* New I/O arrived in the meantime: refuse. */
                        ret = -EBUSY;
                } else {
                        /* ... suspend the device ... */
                        foo->is_suspended = 1;
                }
                unlock(&foo->private_lock);
                return ret;
        }

        int foo_runtime_resume(struct device *dev)
        {
                struct foo_priv *foo = container_of(dev, ...);

                lock(&foo->private_lock);
                /* ... resume the device ... */
                foo->is_suspended = 0;
                pm_runtime_mark_last_busy(&foo->dev);
                /* Restart any I/O that was queued while suspended. */
                if (foo->num_pending_requests > 0)
                        foo_process_next_request(foo);
                unlock(&foo->private_lock);
                return 0;
        }
0957 
0958 The important point is that after foo_io_completion() asks for an autosuspend,
0959 the foo_runtime_suspend() callback may race with foo_read_or_write().
0960 Therefore foo_runtime_suspend() has to check whether there are any pending I/O
0961 requests (while holding the private lock) before allowing the suspend to
0962 proceed.
0963 
0964 In addition, the power.autosuspend_delay field can be changed by user space at
0965 any time.  If a driver cares about this, it can call
0966 pm_runtime_autosuspend_expiration() from within the ->runtime_suspend()
0967 callback while holding its private lock.  If the function returns a nonzero
0968 value then the delay has not yet expired and the callback should return
0969 -EAGAIN.
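
A sketch of that check follows.  It is a userspace model: pm_runtime_autosuspend_expiration() is a simplified stand-in that works in milliseconds of a hypothetical `model_now_ms` clock, `foo_lock()`/`foo_unlock()` are no-op placeholders for the driver's private lock, and the callback takes the hypothetical `struct foo_priv` directly instead of deriving it from a `struct device` pointer as a real callback would.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified userspace stand-in for the kernel's struct device. */
struct device {
	bool use_autosuspend;
	int autosuspend_delay;		/* milliseconds, negative disables */
	unsigned long last_busy;	/* model time, milliseconds */
};

/* Hypothetical model clock, set by the caller. */
unsigned long model_now_ms;

/* Return the expiration time if it is still in the future, else 0. */
static unsigned long pm_runtime_autosuspend_expiration(struct device *dev)
{
	unsigned long expires;

	if (!dev->use_autosuspend || dev->autosuspend_delay < 0)
		return 0;

	expires = dev->last_busy + dev->autosuspend_delay;
	return expires > model_now_ms ? expires : 0;
}

struct foo_priv {
	struct device dev;
	bool is_suspended;
};

/* Placeholders for the driver's private lock (single-threaded model). */
static void foo_lock(struct foo_priv *foo) { (void)foo; }
static void foo_unlock(struct foo_priv *foo) { (void)foo; }

int foo_runtime_suspend(struct foo_priv *foo)
{
	int ret = 0;

	assert(foo != NULL);

	foo_lock(foo);
	if (pm_runtime_autosuspend_expiration(&foo->dev) != 0) {
		/* The delay was extended and has not expired yet. */
		ret = -EAGAIN;
	} else {
		/* ... suspend the device ... */
		foo->is_suspended = true;
	}
	foo_unlock(foo);

	return ret;
}
```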