=========================
Hardware Latency Detector
=========================

Introduction
-------------

The hwlat_detector tracer is a special purpose tracer that is used to
detect large system latencies induced by the behavior of certain underlying
hardware or firmware, independent of Linux itself. The code was originally
developed to detect SMIs (System Management Interrupts) on x86 systems,
but there is nothing x86 specific about it. It was originally written for
use by the "RT" patch, since the Real Time kernel is highly latency
sensitive.

SMIs are not serviced by the Linux kernel, which means that it does not
even know that they are occurring. SMIs are instead set up by BIOS code
and are serviced by BIOS code, usually for "critical" events such as
management of thermal sensors and fans. Sometimes, though, SMIs are used for
other tasks, and those tasks can spend an inordinate amount of time in the
handler (sometimes measured in milliseconds). Obviously this is a problem if
you are trying to keep event service latencies down in the microsecond range.

The hardware latency detector works by hogging one of the CPUs for configurable
amounts of time (with interrupts disabled), polling the CPU Time Stamp Counter
(TSC) for some period, then looking for gaps in the TSC data. Any gap indicates
a time when the polling was interrupted, and since interrupts are disabled,
the only thing that could do that would be an SMI or other hardware hiccup
(or an NMI, but those can be tracked).

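The gap search described above can be illustrated on a plain list of counter
readings. This is only a sketch of the idea, not the kernel's sampling loop:
``find_gaps`` is a hypothetical helper, and the readings and threshold are
made-up numbers in arbitrary units::

        # find_gaps THRESHOLD T0 T1 T2 ...
        # Print "previous current delta" for every pair of consecutive
        # readings whose difference exceeds THRESHOLD.
        find_gaps() {
            threshold=$1
            shift
            prev=$1
            shift
            for cur in "$@"; do
                delta=$((cur - prev))
                if [ "$delta" -gt "$threshold" ]; then
                    echo "$prev $cur $delta"
                fi
                prev=$cur
            done
        }

        # Back-to-back readings normally differ by a small amount; the
        # jump from 104 to 150 stands for time stolen from the poller.
        find_gaps 10 100 102 104 150 152    # prints: 104 150 46
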
Note that the hwlat detector should *NEVER* be used in a production environment.
It is intended to be run manually to determine if the hardware platform has a
problem with long system firmware service routines.

Usage
------

Write the ASCII text "hwlat" into the current_tracer file of the tracing system
(mounted at /sys/kernel/tracing or /sys/kernel/debug/tracing). It is possible to
redefine the threshold in microseconds (us) above which latency spikes will
be taken into account.

Example::

        # echo hwlat > /sys/kernel/tracing/current_tracer
        # echo 100 > /sys/kernel/tracing/tracing_thresh

The /sys/kernel/tracing/hwlat_detector interface contains the following files:

  - width - time period to sample with CPUs held (usecs)
            must be less than the total window size (enforced)
  - window - total period of sampling, width being inside (usecs)

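A minimal sketch of changing both values from a shell follows.
``hwlat_set_timing`` is a hypothetical helper, not part of the kernel. It
assumes the files hold plain decimal numbers and that a write is rejected
whenever it would leave width greater than or equal to window, so the two
writes are ordered to keep width below window at every step::

        # hwlat_set_timing TRACEFS WIDTH WINDOW    (values in usecs)
        hwlat_set_timing() {
            dir=$1/hwlat_detector
            width=$2
            window=$3
            if [ "$width" -ge "$window" ]; then
                echo "width must be less than window" >&2
                return 1
            fi
            cur_width=$(cat "$dir/width") || return 1
            if [ "$window" -gt "$cur_width" ]; then
                # New window still covers the old width: set window first.
                echo "$window" > "$dir/window" && echo "$width" > "$dir/width"
            else
                # New window is too small for the old width: set width first.
                echo "$width" > "$dir/width" && echo "$window" > "$dir/window"
            fi
        }

        # Usage, as root, to spin for 0.2s out of every 2s:
        #   hwlat_set_timing /sys/kernel/tracing 200000 2000000
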
By default the width is set to 500,000 and window to 1,000,000, meaning that
for every 1,000,000 usecs (1s) the hwlat detector will spin for 500,000 usecs
(0.5s). If tracing_thresh contains zero when the hwlat tracer is enabled, it
will change to a default of 10 usecs. If any latencies that exceed the
threshold are observed, then the data will be written to the tracing ring
buffer.

The minimum sleep time between periods is 1 millisecond. This applies even
if width is less than 1 millisecond short of window, so that the system is
not totally starved.

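The arithmetic in the two paragraphs above can be checked with a few lines of
shell. This is a sketch only: ``hwlat_sleep_usecs`` is a hypothetical helper
that restates the rule described here (sleep for window minus width, but never
less than 1 millisecond). It does not read anything from the kernel::

        # hwlat_sleep_usecs WIDTH WINDOW
        # Print how long the detector sleeps after each spin, in usecs.
        hwlat_sleep_usecs() {
            sleep_us=$(($2 - $1))
            if [ "$sleep_us" -lt 1000 ]; then
                sleep_us=1000    # minimum of 1 millisecond
            fi
            echo "$sleep_us"
        }

        hwlat_sleep_usecs 500000 1000000    # defaults: prints 500000
        hwlat_sleep_usecs 999500 1000000    # 500 usecs apart: prints 1000
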
If tracing_thresh was zero when the hwlat detector was started, it will be set
back to zero if another tracer is loaded. Note that the last value in
tracing_thresh that the hwlat detector had will be saved, and this value will
be restored in tracing_thresh if it is still zero when the hwlat detector is
started again.

The following tracing directory files are used by the hwlat_detector:

In /sys/kernel/tracing:

 - tracing_thresh       - minimum latency value to be considered (usecs)
 - tracing_max_latency  - maximum hardware latency actually observed (usecs)
 - tracing_cpumask      - the CPUs to move the hwlat thread across
 - hwlat_detector/width - specified amount of time to spin within window (usecs)
 - hwlat_detector/window - amount of time between (width) runs (usecs)
 - hwlat_detector/mode  - the thread mode

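As an illustration, the threshold and the largest observed latency can be read
back after the detector has run for a while. ``hwlat_report`` is a hypothetical
helper. It assumes the tracing directory is readable (normally root only) and
that each of the two files holds a single decimal number::

        # hwlat_report TRACEFS
        # Print the threshold and the largest latency seen so far (usecs).
        hwlat_report() {
            thresh=$(cat "$1/tracing_thresh") || return 1
            max=$(cat "$1/tracing_max_latency") || return 1
            echo "threshold=$thresh max_latency=$max"
            if [ "$max" -gt "$thresh" ]; then
                echo "latency above threshold observed; see $1/trace"
            fi
        }

        # Usage, as root:
        #   hwlat_report /sys/kernel/tracing
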
By default, one hwlat detector kernel thread will migrate across each CPU
specified in tracing_cpumask at the beginning of a new window, in a
round-robin fashion. This behavior can be changed by changing the thread mode.
The available options are:

 - none:        do not force migration
 - round-robin: migrate across each CPU specified in tracing_cpumask [default]
 - per-cpu:     create one thread for each CPU in tracing_cpumask
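
Selecting a mode could look like the sketch below. ``hwlat_set_mode`` is a
hypothetical helper that only accepts the three names listed above before
writing to the mode file. It must be run as root::

        # hwlat_set_mode TRACEFS MODE
        hwlat_set_mode() {
            case $2 in
            none|round-robin|per-cpu)
                echo "$2" > "$1/hwlat_detector/mode"
                ;;
            *)
                echo "unknown mode: $2" >&2
                return 1
                ;;
            esac
        }

        # Usage, as root:
        #   hwlat_set_mode /sys/kernel/tracing per-cpu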