=========================
Process Number Controller
=========================

Abstract
--------

The process number controller is used to allow a cgroup hierarchy to stop any
new tasks from being fork()'d or clone()'d after a certain limit is reached.

Since it is trivial to hit the task limit without hitting any kmemcg limits in
place, PIDs are a fundamental resource. As such, PID exhaustion must be
preventable in the scope of a cgroup hierarchy by allowing resource limiting of
the number of tasks in a cgroup.

Usage
-----

To use the `pids` controller, set the maximum number of tasks in pids.max
(this file is not available in the root cgroup). The number of processes
currently in the cgroup is given by pids.current.

Organisational operations are not blocked by cgroup policies, so it is possible
to have pids.current > pids.max. This can be done by either setting the limit to
be smaller than pids.current, or attaching enough processes to the cgroup such
that pids.current > pids.max. However, it is not possible to violate a cgroup
policy through fork() or clone(). fork() and clone() will return -EAGAIN if the
creation of a new process would cause a cgroup policy to be violated.

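The fork-time rule above can be modelled with a small shell helper. This is
only a sketch: `pids_fork_allowed` is a hypothetical name, it looks at a single
cgroup, and it takes the pids.current and pids.max values as arguments rather
than reading them from the kernel.

```shell
# Hypothetical helper: model of the fork-time check for a single cgroup.
# usage: pids_fork_allowed CURRENT MAX   (MAX may be the string "max")
pids_fork_allowed() {
    current=$1
    limit=$2
    # "max" means no limit, so one more task always fits.
    if [ "$limit" = "max" ]; then
        return 0
    fi
    # One more task must not push the count above the limit.
    [ $((current + 1)) -le "$limit" ]
}
```

With pids.current at 2 and pids.max at 2, `pids_fork_allowed 2 2` returns a
non-zero status, which corresponds to fork() failing with EAGAIN.
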
To set a cgroup to have no limit, set pids.max to "max". This is the default for
all new cgroups. Note that PID limits are hierarchical, so the most stringent
limit in the hierarchy is followed.

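As an illustration of "the most stringent limit is followed", the sketch below
computes the limit that effectively applies to a cgroup from the pids.max
values of that cgroup and its ancestors. `pids_effective_limit` is a
hypothetical helper, and the values are passed in by hand rather than read from
the hierarchy.

```shell
# Hypothetical helper: effective limit given the pids.max value of a
# cgroup and of each of its ancestors, passed as arguments.
pids_effective_limit() {
    effective=max
    for limit in "$@"; do
        # "max" places no constraint, so it never lowers the result.
        if [ "$limit" = "max" ]; then
            continue
        fi
        if [ "$effective" = "max" ] || [ "$limit" -lt "$effective" ]; then
            effective=$limit
        fi
    done
    echo "$effective"
}
```

For a child with pids.max set to "max" under a parent limited to 2,
`pids_effective_limit max 2` prints 2.
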
pids.current tracks all child cgroup hierarchies, so parent/pids.current is a
superset of parent/child/pids.current.

The pids.events file contains event counters:

- max: Number of times fork() failed because the limit was hit.

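A monitoring script might extract the counter as sketched below. This assumes
that pids.events uses the flat "key value" layout common to cgroup event files,
with one line of the form ``max N``; `pids_events_max` is a hypothetical helper
name.

```shell
# Hypothetical helper: print the "max" counter from a pids.events file.
# Assumes one "key value" pair per line.
# usage: pids_events_max FILE
pids_events_max() {
    # Match the key exactly so that other keys are ignored.
    awk '$1 == "max" { print $2 }' "$1"
}
```

For example, `pids_events_max /sys/fs/cgroup/pids/parent/pids.events` would
print how many forks have been refused in that cgroup so far.
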
Example
-------

First, we mount the pids controller::

    # mkdir -p /sys/fs/cgroup/pids
    # mount -t cgroup -o pids none /sys/fs/cgroup/pids

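The mount above can only succeed if the running kernel provides the pids
controller and has it enabled. A minimal pre-flight check is sketched here,
assuming the four-column layout of /proc/cgroups (subsys_name, hierarchy,
num_cgroups, enabled); `pids_controller_enabled` is a hypothetical helper name,
and the optional argument exists only so the check can be pointed at another
file.

```shell
# Hypothetical helper: succeed if the pids controller is listed as enabled.
# usage: pids_controller_enabled [FILE]   (FILE defaults to /proc/cgroups)
pids_controller_enabled() {
    # Column 1 is the controller name, column 4 is the enabled flag.
    awk '$1 == "pids" && $4 == 1 { found = 1 } END { exit !found }' \
        "${1:-/proc/cgroups}"
}
```

A script could call `pids_controller_enabled || exit 1` before attempting the
mount.
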
Then we create a hierarchy, set limits and attach processes to it::

    # mkdir -p /sys/fs/cgroup/pids/parent/child
    # echo 2 > /sys/fs/cgroup/pids/parent/pids.max
    # echo $$ > /sys/fs/cgroup/pids/parent/cgroup.procs
    # cat /sys/fs/cgroup/pids/parent/pids.current
    2
    #

Note that attempts to exceed the set limit (2 in this case) will fail::

    # cat /sys/fs/cgroup/pids/parent/pids.current
    2
    # ( /bin/echo "Here's some processes for you." | cat )
    sh: fork: Resource temporarily unavailable
    #

Even if we migrate to a child cgroup (which doesn't have a set limit), we will
not be able to overcome the most stringent limit in the hierarchy (in this case,
parent's)::

    # echo $$ > /sys/fs/cgroup/pids/parent/child/cgroup.procs
    # cat /sys/fs/cgroup/pids/parent/pids.current
    2
    # cat /sys/fs/cgroup/pids/parent/child/pids.current
    2
    # cat /sys/fs/cgroup/pids/parent/child/pids.max
    max
    # ( /bin/echo "Here's some processes for you." | cat )
    sh: fork: Resource temporarily unavailable
    #

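To see how many more tasks a cgroup can hold once its ancestors are taken into
account, a script can walk up the directory tree and keep the smallest
remaining headroom. The following is a sketch only: `pids_headroom` is a
hypothetical helper, it relies on nothing but the pids.max and pids.current
files described above, and it takes the top of the hierarchy as an explicit
argument.

```shell
# Hypothetical helper: smallest remaining headroom between a cgroup
# directory and the top of the hierarchy ("max" if nothing is limited).
# usage: pids_headroom CGROUP_DIR TOP_DIR
pids_headroom() {
    dir=$1
    top=$2
    headroom=max
    while :; do
        # Levels without pids.max (such as the root cgroup) are skipped.
        if [ -r "$dir/pids.max" ] && [ -r "$dir/pids.current" ]; then
            limit=$(cat "$dir/pids.max")
            current=$(cat "$dir/pids.current")
            if [ "$limit" != "max" ]; then
                room=$((limit - current))
                # pids.current may exceed pids.max; report that as 0.
                if [ "$room" -lt 0 ]; then room=0; fi
                if [ "$headroom" = "max" ] || [ "$room" -lt "$headroom" ]; then
                    headroom=$room
                fi
            fi
        fi
        if [ "$dir" = "$top" ]; then break; fi
        parent=$(dirname "$dir")
        # Stop if dirname can go no higher (e.g. "/" or ".").
        if [ "$parent" = "$dir" ]; then break; fi
        dir=$parent
    done
    echo "$headroom"
}
```

In the situation shown above, `pids_headroom /sys/fs/cgroup/pids/parent/child
/sys/fs/cgroup/pids` would print 0, because the parent is already at its limit
even though the child's own pids.max is "max".
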
We can set a limit that is smaller than pids.current, which will stop any new
processes from being forked at all (note that the shell itself counts towards
pids.current)::

    # echo 1 > /sys/fs/cgroup/pids/parent/pids.max
    # /bin/echo "We can't even spawn a single process now."
    sh: fork: Resource temporarily unavailable
    # echo 0 > /sys/fs/cgroup/pids/parent/pids.max
    # /bin/echo "We can't even spawn a single process now."
    sh: fork: Resource temporarily unavailable
    #