===================================================================
delays - Information on the various kernel delay / sleep mechanisms
===================================================================

This document seeks to answer the common question: "What is the
RightWay (TM) to insert a delay?"

This question is most often faced by driver writers who have to
deal with hardware delays and who may not be the most intimately
familiar with the inner workings of the Linux Kernel.


Inserting Delays
----------------

The first, and most important, question you need to ask is "Is my
code in an atomic context?" This should be followed closely by "Does
it really need to delay in atomic context?" If so...

ATOMIC CONTEXT:
You must use the `*delay` family of functions. These
functions use the jiffies-based estimate of clock speed
and will busy-wait for enough loop cycles to achieve
the desired delay:

    ndelay(unsigned long nsecs)
    udelay(unsigned long usecs)
    mdelay(unsigned long msecs)

udelay is the generally preferred API; ndelay-level
precision may not actually exist on many non-PC devices.

mdelay is a macro wrapper around udelay, to account for
possible overflow when passing large arguments to udelay.
In general, use of mdelay is discouraged and code should
be refactored to allow for the use of msleep.
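
As a rough illustration of why mdelay exists, the sketch below models
its looping path as a series of 1000 us udelay calls, so that no single
udelay argument grows large enough to overflow. It is a userspace model,
not the kernel source: `udelay_stub`, `mdelay_sketch`, and `total_usecs`
are hypothetical names, and the stub only records the requested time
instead of busy-waiting.

```c
/*
 * Userspace model of the loop behind mdelay(); not the kernel source.
 */

/* Hypothetical bookkeeping: total time requested from the stub, in us. */
static unsigned long total_usecs;

/* Stand-in for udelay(): records the request instead of busy-waiting. */
static void udelay_stub(unsigned long usecs)
{
	total_usecs += usecs;
}

/* Delay for msecs milliseconds by issuing one 1000 us delay per ms. */
static void mdelay_sketch(unsigned long msecs)
{
	while (msecs--)
		udelay_stub(1000);
}
```

Every millisecond of such a delay is spent spinning on the CPU, which is
why refactoring towards msleep is preferred whenever the code can sleep.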

NON-ATOMIC CONTEXT:
You should use the `*sleep[_range]` family of functions.
There are a few more options here. While any of them may
work correctly, using the "right" sleep function will
help the scheduler, power management, and just make your
driver better :)

-- Backed by busy-wait loop:

    udelay(unsigned long usecs)

-- Backed by hrtimers:

    usleep_range(unsigned long min, unsigned long max)

-- Backed by jiffies / legacy_timers:

    msleep(unsigned int msecs)
    msleep_interruptible(unsigned int msecs)

Unlike the `*delay` family, the underlying mechanism
driving each of these calls varies, thus there are
quirks you should be aware of.


SLEEPING FOR "A FEW" USECS ( < ~10us? ):
* Use udelay

- Why not usleep?
On slower systems (embedded, or perhaps a speed-stepped
PC!), the overhead of setting up the hrtimers for
usleep *may* not be worth it. Such an evaluation will
obviously depend on your specific situation, but it is
something to be aware of.

SLEEPING FOR ~USECS OR SMALL MSECS ( 10us - 20ms ):
* Use usleep_range

- Why not msleep for (1ms - 20ms)?
Explained originally here:
https://lore.kernel.org/r/15327.1186166232@lwn.net

msleep(1~20) may not do what the caller intends, and
will often sleep longer (~20 ms actual sleep for any
value given in the 1~20ms range), because msleep is
backed by jiffies-granularity timers. In many cases
this is not the desired behavior.
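
The overshoot can be estimated with a small model. The sketch assumes
that msleep rounds the request up to whole jiffies and then adds one
more jiffy; that is an assumption about the implementation and may not
hold for every kernel version. `msleep_upper_bound_ms` is a hypothetical
helper, not a kernel function, and `hz` stands for the configured tick
rate.

```c
/*
 * Model of msleep() granularity; hz is the timer tick rate (CONFIG_HZ).
 * Assumption: the request is rounded up to whole jiffies, plus one jiffy.
 */
static unsigned int msleep_upper_bound_ms(unsigned int msecs, unsigned int hz)
{
	/* Round the request up to a whole number of ticks. */
	unsigned int ticks = (msecs * hz + 999) / 1000;

	/* One extra tick ensures at least the requested time elapses. */
	ticks += 1;

	/* Convert the tick count back to milliseconds. */
	return ticks * 1000 / hz;
}
```

Under this model, with a tick rate of 100, requests of 1 ms and 10 ms
both come out at 20 ms, while with a tick rate of 1000 a 1 ms request is
bounded by 2 ms.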

- Why is there no "usleep" / What is a good range?
Since usleep_range is built on top of hrtimers, the
wakeup will be very precise (ish), thus a simple
usleep function would likely introduce a large number
of undesired interrupts.

With the introduction of a range, the scheduler is
free to coalesce your wakeup with any other wakeup
that may have happened for other reasons, or in the
worst case, fire an interrupt for your upper bound.

The larger the range you supply, the greater the chance
that you will not trigger an interrupt; this should
be balanced with what is an acceptable upper bound on
delay / performance for your specific code path. Exact
tolerances here are very situation-specific, thus it
is left to the caller to determine a reasonable range.
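
The coalescing argument can be made concrete with a toy model. It
assumes a single wakeup that is already going to fire and asks whether a
sleep window [min, max] can reuse it. `needs_new_interrupt` and its
parameters are hypothetical and do not correspond to any kernel
function; the real timer code is considerably more involved.

```c
#include <stdbool.h>

/*
 * Toy model: all times are microseconds from "now".
 * pending_wakeup is a timer interrupt that will fire anyway.
 */
static bool needs_new_interrupt(unsigned long pending_wakeup,
				unsigned long min, unsigned long max)
{
	/* A wakeup inside [min, max] can serve this sleeper too. */
	if (pending_wakeup >= min && pending_wakeup <= max)
		return false;

	/* Otherwise a dedicated interrupt fires, in the worst case at max. */
	return true;
}
```

In a driver the corresponding call might look like
usleep_range(1000, 1500). Widening the window makes the first case more
likely, at the cost of a later worst-case wakeup.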

SLEEPING FOR LARGER MSECS ( 10ms+ ):
* Use msleep or possibly msleep_interruptible

- What's the difference?
msleep sets the current task to TASK_UNINTERRUPTIBLE,
whereas msleep_interruptible sets the current task to
TASK_INTERRUPTIBLE before scheduling the sleep. In
short, the difference is whether the sleep can be ended
early by a signal. In general, just use msleep unless
you know you have a need for the interruptible variant.
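
One practical consequence is that a caller of msleep_interruptible has
to handle an early wakeup. The sketch below relies on
msleep_interruptible returning the number of milliseconds left when a
signal ends the sleep, and 0 otherwise. To stay compilable outside the
kernel it uses a stand-in, `msleep_interruptible_stub`, driven by the
hypothetical variable `fake_remaining_ms`; `wait_settle` and the 100 ms
figure are likewise illustrative.

```c
#include <errno.h>

/* Hypothetical test knob: milliseconds "left" when the stub returns. */
static unsigned long fake_remaining_ms;

/* Stand-in for msleep_interruptible(): returns the unslept time in ms. */
static unsigned long msleep_interruptible_stub(unsigned int msecs)
{
	(void)msecs;
	return fake_remaining_ms;
}

/* Wait 100 ms for hardware to settle; give up if a signal arrives. */
static int wait_settle(void)
{
	if (msleep_interruptible_stub(100))
		return -EINTR;	/* sleep was cut short by a signal */

	return 0;		/* full delay elapsed */
}
```

Plain msleep returns nothing and cannot be cut short, so no such check
is needed there.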

FLEXIBLE SLEEPING (any delay, uninterruptible):
* Use fsleep
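
As a sketch of what "flexible" means here, the model below picks a
mechanism from the requested duration, using the thresholds given
earlier in this document (about 10 us and 20 ms). The enum, the name
`pick_delay`, and the exact cut-offs are assumptions for illustration;
the real fsleep may use different limits.

```c
enum delay_kind {
	DELAY_UDELAY,		/* busy-wait */
	DELAY_USLEEP_RANGE,	/* hrtimer-backed sleep */
	DELAY_MSLEEP,		/* jiffies-backed sleep */
};

/* Choose a mechanism for a sleepable context from a duration in us. */
static enum delay_kind pick_delay(unsigned long usecs)
{
	if (usecs <= 10)
		return DELAY_UDELAY;
	if (usecs <= 20000)
		return DELAY_USLEEP_RANGE;
	return DELAY_MSLEEP;
}
```

A caller that only knows its delay at run time can then issue a single
call and let the helper select the appropriate mechanism, instead of
open-coding the same comparison in every driver.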