========
CPU load
========

Linux exports various bits of information via ``/proc/stat`` and
``/proc/uptime`` that userland tools, such as top(1), use to calculate
the average time the system spent in a particular state, for example::

    $ iostat
    Linux 2.6.18.3-exp (linmac)     02/20/2007

    avg-cpu:  %user   %nice %system %iowait  %steal   %idle
              10.01    0.00    2.92    5.44    0.00   81.63

    ...

Here the system reports that, over the default sampling period, it
spent 10.01% of the time doing work in user space and 2.92% in the
kernel, and was idle for 81.63% of the time.
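
As a rough illustration of how a userland tool might turn these
counters into percentages, the sketch below parses the aggregate
``cpu`` line of ``/proc/stat`` and computes the share of one state
between two samples.  The names ``cpu_sample``, ``parse_cpu_line`` and
``cpu_percent`` are hypothetical and are not taken from top(1) or
iostat; the field order assumed is user, nice, system, idle, iowait,
irq, softirq, steal.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical names; order follows the aggregate "cpu" line. */
enum { CPU_USER, CPU_NICE, CPU_SYSTEM, CPU_IDLE, CPU_IOWAIT,
       CPU_IRQ, CPU_SOFTIRQ, CPU_STEAL, CPU_NFIELDS };

struct cpu_sample {
        unsigned long long t[CPU_NFIELDS];
};

/* Parse an aggregate "cpu ..." line; return 0 on success, -1 on error. */
static int parse_cpu_line(const char *line, struct cpu_sample *s)
{
        int n;

        /* Reject per-CPU lines such as "cpu0 ..." and anything else. */
        if (strncmp(line, "cpu ", 4) != 0)
                return -1;

        memset(s, 0, sizeof(*s));
        n = sscanf(line + 4, "%llu %llu %llu %llu %llu %llu %llu %llu",
                   &s->t[CPU_USER], &s->t[CPU_NICE], &s->t[CPU_SYSTEM],
                   &s->t[CPU_IDLE], &s->t[CPU_IOWAIT], &s->t[CPU_IRQ],
                   &s->t[CPU_SOFTIRQ], &s->t[CPU_STEAL]);

        /* Assumption: at least the first four fields must be present. */
        return n >= 4 ? 0 : -1;
}

/* Percentage of the interval between two samples spent in one state. */
static double cpu_percent(const struct cpu_sample *prev,
                          const struct cpu_sample *cur, int field)
{
        unsigned long long total = 0;
        int i;

        if (field < 0 || field >= CPU_NFIELDS)
                return 0.0;

        for (i = 0; i < CPU_NFIELDS; i++)
                total += cur->t[i] - prev->t[i];
        if (total == 0)
                return 0.0;

        return 100.0 * (double)(cur->t[field] - prev->t[field]) /
               (double)total;
}
```

Feeding it two lines read some time apart gives the share for that
interval.  This is only a sketch; real tools may handle details such
as guest time or counter wraparound differently.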

In most cases the ``/proc/stat`` information reflects reality quite
closely.  However, because of how and when the kernel collects this
data, sometimes it cannot be trusted at all.

So how is this information collected?  Whenever a timer interrupt is
signalled, the kernel looks at what kind of task was running at that
moment and increments the counter that corresponds to this task's
kind/state.  The problem with this is that the system could have
switched between various states multiple times between two timer
interrupts, yet the counter is incremented only for the last state.
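
The accounting described above can be modelled in a few lines.  This
is a simplified sketch and not kernel code: the names ``cpu_state``,
``tick_counters``, ``account_tick`` and ``replay_interval`` are made up
for illustration.

```c
/* Hypothetical, simplified model of tick-based accounting. */
enum cpu_state { STATE_USER, STATE_SYSTEM, STATE_IDLE, STATE_COUNT };

struct tick_counters {
        unsigned long ticks[STATE_COUNT];
};

/*
 * Called once per timer interrupt: the whole tick is charged to
 * whatever state the CPU happens to be in at that moment.
 */
static void account_tick(struct tick_counters *c, enum cpu_state now)
{
        c->ticks[now]++;
}

/*
 * Replay one interval between two timer interrupts.  "states" lists
 * every state the CPU passed through, in order; only the final entry
 * is visible when the interrupt fires.
 */
static void replay_interval(struct tick_counters *c,
                            const enum cpu_state *states, int nstates)
{
        if (nstates > 0)
                account_tick(c, states[nstates - 1]);
}
```

Replaying an interval that went user, system, user, idle charges the
whole tick to idle, no matter how much of the interval was spent
working.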


Example
-------

If we imagine a system with one task that periodically burns cycles
in the following manner::

     time line between two timer interrupts
    |--------------------------------------|
     ^                                    ^
     |_ something begins working          |
                                          |_ something goes to sleep
                                         (only to be awakened quite soon)

In the above situation the system will be 0% loaded according to
``/proc/stat`` (since the timer interrupt will always happen when the
system is executing the idle handler), but in reality the load is
closer to 99%.
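
The effect can be put into numbers with a small simulation.  The
sketch below is a model under stated assumptions rather than a
measurement: ``simulate`` and ``load_estimate`` are hypothetical names,
time is in arbitrary units, the timer interrupt fires at every multiple
of ``tick_period``, and the task is busy during
``[busy_start, busy_end)`` of each ``task_period``.

```c
/* Hypothetical simulation of sampled versus actual load. */
struct load_estimate {
        double sampled_pct;     /* what tick sampling would report */
        double actual_pct;      /* true share of time spent busy */
};

static struct load_estimate simulate(unsigned long tick_period,
                                     unsigned long task_period,
                                     unsigned long busy_start,
                                     unsigned long busy_end,
                                     unsigned long nticks)
{
        struct load_estimate e = { 0.0, 0.0 };
        unsigned long busy_ticks = 0;
        unsigned long k;

        /* Reject inputs the model cannot represent. */
        if (tick_period == 0 || task_period == 0 || nticks == 0 ||
            busy_start > busy_end || busy_end > task_period)
                return e;

        for (k = 1; k <= nticks; k++) {
                /* Where in the task's cycle does this tick land? */
                unsigned long phase = (k * tick_period) % task_period;

                if (phase >= busy_start && phase < busy_end)
                        busy_ticks++;
        }

        e.sampled_pct = 100.0 * (double)busy_ticks / (double)nticks;
        e.actual_pct = 100.0 * (double)(busy_end - busy_start) /
                       (double)task_period;
        return e;
}
```

With ``tick_period`` and ``task_period`` both 1000 and a busy window of
``[5, 995)``, no tick ever lands inside the busy window, so the sampled
load is 0% while the actual load is 99%.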
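
One can imagine many more situations where this behavior of the kernel
will lead to quite erratic information inside ``/proc/stat``.  The
following program provokes it by burning cycles for roughly two thirds
of each timer period and then sleeping until the next ``SIGALRM``::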

        /* gcc -o hog smallhog.c */
        #include <time.h>
        #include <limits.h>
        #include <signal.h>
        #include <sys/time.h>
        #define HIST 10

        /* Set by the SIGALRM handler to end the current busy loop. */
        static volatile sig_atomic_t stop;

        static void sighandler(int signr)
        {
                (void) signr;
                stop = 1;
        }

        /* Spin for up to niters iterations; return how many were left. */
        static unsigned long hog(unsigned long niters)
        {
                stop = 0;
                while (!stop && --niters);
                return niters;
        }

        int main(void)
        {
                int i;
                /* Periodic timer, nominally firing every microsecond. */
                struct itimerval it = {
                        .it_interval = { .tv_sec = 0, .tv_usec = 1 },
                        .it_value    = { .tv_sec = 0, .tv_usec = 1 } };
                sigset_t set;
                unsigned long v[HIST];
                double tmp = 0.0;
                unsigned long n;
                signal(SIGALRM, &sighandler);
                setitimer(ITIMER_REAL, &it, NULL);

                /* Calibrate: average iterations between two SIGALRMs. */
                hog(ULONG_MAX);
                for (i = 0; i < HIST; ++i) v[i] = ULONG_MAX - hog(ULONG_MAX);
                for (i = 0; i < HIST; ++i) tmp += v[i];
                tmp /= HIST;
                /* Burn about two thirds of each timer period. */
                n = tmp - (tmp / 3.0);

                sigemptyset(&set);
                sigaddset(&set, SIGALRM);

                /* Work, then sleep until the next SIGALRM arrives. */
                for (;;) {
                        hog(n);
                        sigwait(&set, &i);
                }
                return 0;
        }


References
----------

- https://lore.kernel.org/r/loom.20070212T063225-663@post.gmane.org
- Documentation/filesystems/proc.rst (1.8)


Thanks
------

Con Kolivas, Pavel Machek