Commit | Line | Data |
---|---|---|
014cb63b | 1 | .\" Copyright (C) 2015 Serge Hallyn <serge@hallyn.com> |
43df1ab3 | 2 | .\" and Copyright (C) 2016 Michael Kerrisk <mtk.manpages@gmail.com> |
014cb63b MK |
3 | .\" |
4 | .\" %%%LICENSE_START(VERBATIM) | |
5 | .\" Permission is granted to make and distribute verbatim copies of this | |
6 | .\" manual provided the copyright notice and this permission notice are | |
7 | .\" preserved on all copies. | |
8 | .\" | |
9 | .\" Permission is granted to copy and distribute modified versions of this | |
10 | .\" manual under the conditions for verbatim copying, provided that the | |
11 | .\" entire resulting derived work is distributed under the terms of a | |
12 | .\" permission notice identical to this one. | |
13 | .\" | |
14 | .\" Since the Linux kernel and libraries are constantly changing, this | |
15 | .\" manual page may be incorrect or out-of-date. The author(s) assume no | |
16 | .\" responsibility for errors or omissions, or for damages resulting from | |
17 | .\" the use of the information contained herein. The author(s) may not | |
18 | .\" have taken the same level of care in the production of this manual, | |
19 | .\" which is licensed free of charge, as they might when working | |
20 | .\" professionally. | |
21 | .\" | |
22 | .\" Formatted or processed versions of this manual, if unaccompanied by | |
23 | .\" the source, must acknowledge the copyright and authors of this work. | |
24 | .\" %%%LICENSE_END | |
25 | .\" | |
21f0d132 MK |
26 | .TH CGROUPS 7 2016-04-24 "Linux" "Linux Programmer's Manual" |
27 | .SH NAME | |
28 | cgroups \- Linux control groups | |
29 | .SH DESCRIPTION | |
30 | Control groups, usually referred to as cgroups, | |
31 | are a Linux kernel feature which provides for grouping of tasks and | |
32 | resource tracking and limitations for those groups. | |
effa83ce MK |
33 | While several systems have been introduced to help in configuring and |
34 | managing cgroups, the kernel's cgroup interface is provided through | |
21f0d132 MK |
35 | a pseudo-filesystem called cgroupfs. |
36 | Task grouping is implemented in the core cgroup kernel code, | |
37 | while resource tracking and limits are implemented in | |
38 | a set of per-resource-type subsystems (memory, CPU, and so on) which may be | |
effa83ce | 39 | enabled as separate hierarchies, or joined into comounted hierarchies. |
21f0d132 MK |
40 | |
41 | Each hierarchy constitutes a separate mount of the cgroup filesystem, | |
effa83ce | 42 | with the subsystems enabled in that hierarchy listed in the mount options. |
21f0d132 MK |
43 | For each mounted hierarchy, |
44 | the directory tree mirrors the control group hierarchy. | |
effa83ce MK |
45 | Each control group is represented by a directory, with each of its child |
46 | control groups represented as a child directory. | |
21f0d132 MK |
47 | For instance, |
48 | .IR /user/joe/1.session | |
49 | represents control group | |
50 | .IR 1.session , | |
51 | which is a child of cgroup | |
52 | .IR joe , | |
53 | which is a child of | |
54 | .IR /user . | |
55 | Under each cgroup directory is a set of files which can be read or | |
effa83ce MK |
56 | written to, reflecting resource limits and a few general cgroup |
57 | properties. | |
58 | ||
21f0d132 MK |
59 | In general, cgroup limits are hierarchical, meaning that the limits placed on |
60 | .IR /user/joe | |
61 | cannot be exceeded by | |
62 | .IR /user/joe/1.session . | |
63 | There are currently exceptions to this rule, | |
64 | but stricter adherence is a goal as cgroups are being largely reworked. | |
65 | ||
effa83ce | 66 | In addition, cgroups can be mounted with no bound subsystem, in which case |
21f0d132 MK |
67 | they serve only to track processes. |
68 | An example of this is the | |
69 | .I name=systemd | |
70 | cgroup which is used by | |
71 | .BR systemd (1) | |
72 | to track services and user sessions. | |
73 | .\" | |
176a4211 MK |
74 | .SS Terminology |
75 | A | |
76 | .I cgroup | |
77 | is a collection of processes that are bound to a set of | |
78 | limits or parameters defined via the cgroup filesystem. | |
79 | ||
80 | A | |
81 | .I subsystem | |
82 | is a kernel component that modifies the behavior of | |
83 | the processes in a cgroup. | |
84 | Various subsystems have been implemented, making it possible to do things | |
85 | such as limiting the amount of CPU time and memory available to a cgroup, | |
86 | accounting for the CPU time used by a cgroup, | |
87 | and freezing and resuming execution of the processes in a cgroup. | |
88 | Subsystems are sometimes also known as | |
89 | .IR "resource controllers" | |
90 | (or simply, controllers). | |
91 | ||
92 | The cgroups for a subsystem are arranged in a | |
93 | .IR hierarchy . | |
94 | This hierarchy is defined by creating, removing, and | |
95 | renaming subdirectories within the cgroup filesystem. | |
96 | At each level of the hierarchy, attributes (e.g., limits) can be defined; | |
97 | these attributes may govern or propagate | |
98 | to child cgroups and their descendants in the hierarchy. | |
99 | .\" | |
43df1ab3 MK |
100 | .SS Cgroups version 1 and version 2 |
101 | The initial release of the cgroups implementation was in Linux 2.6.24. | |
176a4211 | 102 | Over time, various cgroup subsystems have been added |
43df1ab3 MK |
103 | to allow the management of various types of resources. |
104 | However, the development of these subsystems was largely uncoordinated, | |
105 | with the result that many inconsistencies arose between subsystems | |
106 | and management of the cgroup hierarchies became rather complex. | |
107 | (A longer description of these problems can be found in | |
108 | the kernel source file | |
109 | .IR Documentation/cgroup-v2.txt .) | |
110 | ||
111 | Because of the problems with the initial cgroups implementation, | |
112 | now known as cgroups version 1, | |
113 | starting in Linux 3.10, work began on a new, | |
114 | orthogonal implementation to remedy these problems. | |
115 | Initially marked experimental, and hidden behind the | |
116 | .I "\-o\ __DEVEL__sane_behavior" | |
117 | mount option, the new version (cgroups version 2) | |
118 | was eventually made official with the release of Linux 4.5. | |
119 | Differences between the two versions are described in the text below. | |
120 | ||
121 | Although cgroups v2 is intended as a replacement for cgroups v1, | |
122 | the older system continues to exist | |
123 | (and for compatibility reasons is unlikely to be removed). | |
124 | Currently, cgroups v2 implements only a subset of the controllers | |
125 | available in cgroups v1. | |
126 | The two systems are implemented so that both v1 controllers and | |
127 | v2 controllers can be mounted on the same system. | |
128 | Thus, for example, it is possible to use those controllers | |
129 | that are supported under version 2, | |
130 | while also using version 1 controllers | |
131 | where version 2 does not yet support those controllers. | |
132 | .\" | |
21f0d132 | 133 | .SS Mounting |
effa83ce | 134 | To be available, a given cgroup subsystem must be compiled into the |
21f0d132 MK |
135 | kernel. |
136 | Since they are exposed through a virtual filesystem, subsystems | |
137 | must be mounted before they can be controlled. | |
138 | The usual place for this is under | |
139 | .IR /sys/fs/cgroup . | |
94eeedfd | 140 | If all the desired subsystems can be comounted, |
effa83ce MK |
141 | then they can all be mounted with a single command: |
142 | ||
21f0d132 | 143 | mount -t cgroup cgroup /sys/fs/cgroup |
effa83ce MK |
144 | |
145 | If multiple, separately mounted subsystems are desired, then this is | |
21f0d132 MK |
146 | usually done in per-subsystem subdirectories. |
147 | This requires first mounting a tmpfs under | |
148 | .I /sys/fs/cgroup | |
149 | so that subdirectories can be created. | |
150 | For instance, one could mount | |
151 | .IR cpu , | |
152 | .IR memory , | |
153 | and | |
154 | .I devices | |
155 | cgroups as follows: | |
156 | ||
157 | .nf | |
158 | .in +4n | |
159 | mount -t tmpfs -o size=100000,mode=755 cgroups /sys/fs/cgroup | |
160 | for s in cpu memory devices; do | |
161 | mkdir /sys/fs/cgroup/$s | |
162 | mount -t cgroup -o $s $s /sys/fs/cgroup/$s | |
163 | done | |
164 | .in | |
165 | .fi | |
effa83ce | 166 | |
94eeedfd MK |
167 | Comounting subsystems has the effect that a task is in the same cgroup for |
168 | all comounted subsystems. | |
21f0d132 MK |
169 | Separately mounting subsystems allows a task to |
170 | be in cgroup | |
171 | .I /foo1 | |
172 | for one subsystem while being in | |
173 | .I /foo2/foo3 | |
174 | for another. | |
175 | .\" | |
176 | .SS Introspection | |
effa83ce | 177 | The list of subsystems compiled into the kernel can be seen in the file |
21f0d132 MK |
178 | .IR /proc/cgroups . |
179 | The file | |
180 | .I /proc/pid/cgroup | |
181 | lists the task's current cgroup | |
effa83ce | 182 | membership for each mounted hierarchy. |
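
As a rough illustration of reading the membership file: each line of /proc/pid/cgroup has the form hierarchy-ID:controller-list:cgroup-path. The sketch below prints the path for one controller. The helper name cgroup_of is hypothetical, and its optional third argument (a file to read instead of the /proc entry) is an assumption added only so that the function can be tried on sample data.

```shell
# Print the cgroup path of a process for one controller.
# Input lines look like "4:cpu,cpuacct:/user/joe/1.session".
# cgroup_of is a hypothetical helper, not part of any standard tool.
cgroup_of() {
    controller=$1
    pid=${2:-self}
    file=${3:-/proc/$pid/cgroup}
    # Split each line on ':'; the last field keeps any further colons.
    while IFS=: read -r id controllers path; do
        # Wrap the list in commas so that "cpu" does not match "cpuacct".
        case ",$controllers," in
            *",$controller,"*) printf '%s\n' "$path"; return 0 ;;
        esac
    done < "$file"
    # No mounted hierarchy lists this controller.
    return 1
}
```

For example, `cgroup_of memory` would report the calling shell's cgroup in the memory hierarchy, and `cgroup_of name=systemd 1` the named hierarchy membership of PID 1, assuming those hierarchies are mounted.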
21f0d132 MK |
183 | .\" |
184 | .SS Creating cgroups and moving tasks | |
effa83ce | 185 | The system begins with a single root cgroup (per hierarchy), '/', which all tasks belong to. |
21f0d132 MK |
186 | A new cgroup is created by creating a directory in the cgroup filesystem: |
187 | ||
188 | mkdir /sys/fs/cgroup/cpu/cg1 | |
189 | ||
190 | This creates a new empty cgroup. | |
191 | Tasks may be moved to this cgroup by writing | |
192 | their PIDs into the cgroup's | |
193 | .I cgroup.procs | |
194 | or (deprecated) | |
195 | .I tasks | |
196 | file: | |
197 | ||
198 | echo $$ > /sys/fs/cgroup/cpu/cg1/cgroup.procs | |
199 | ||
200 | The same file can be read to obtain a list of the processes currently in | |
201 | .IR cg1 . | |
202 | By using the | |
203 | .I cgroup.procs | |
204 | file instead of the | |
205 | .I tasks | |
206 | file, all tasks in the | |
207 | thread group are moved into the new cgroup at once. | |
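
The two steps above (create the directory, then write a PID) can be combined in a small helper. This is only a sketch: move_to_cgroup and the CGROOT variable are hypothetical names, the default path assumes the cpu hierarchy is mounted as in the Mounting section, and on a real system the writes normally require appropriate privileges.

```shell
# Create a cgroup under a mounted hierarchy and move one process into it.
# CGROOT is a hypothetical override for the hierarchy's mount point.
move_to_cgroup() {
    cgroup=$1
    pid=$2
    root=${CGROOT:-/sys/fs/cgroup/cpu}
    # Creating the directory creates the (initially empty) cgroup.
    mkdir -p "$root/$cgroup" || return 1
    # Writing a PID to cgroup.procs moves the whole thread group.
    printf '%s\n' "$pid" > "$root/$cgroup/cgroup.procs"
}
```

A call such as `move_to_cgroup cg1 $$` would then place the current shell in cg1.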
208 | ||
209 | On | |
210 | .BR fork (2), | |
211 | the new child is created as a member of the parent's cgroup, | |
212 | leading to implicit grouping of process hierarchies. | |
effa83ce MK |
213 | |
214 | Note: in the upcoming unified hierarchy, a new restriction is imposed such | |
21f0d132 MK |
215 | that tasks may only exist in leaf cgroups. |
216 | For instance, if cgroup | |
217 | .I /cg1/cg2 | |
218 | exists, then a task may exist in | |
219 | .IR /cg1/cg2 , | |
220 | but not in | |
221 | .IR /cg1 . | |
222 | This is to avoid the current ambiguity in the delegation of resources | |
223 | between tasks in | |
224 | .I /cg1 | |
225 | and its child cgroups. | |
226 | The recommended workaround is to create a subdirectory called | |
227 | .I leaf | |
228 | for any non-leaf cgroup which should contain tasks, and make sure not to | |
229 | create child cgroups of it. | |
230 | In the above example, tasks which previously would have gone into | |
231 | .I /cg1 | |
232 | would now go into | |
233 | .IR /cg1/leaf . | |
234 | This has the advantage of making explicit the relationship between tasks in | |
235 | .I /cg1/leaf | |
236 | and | |
237 | .IR /cg1 's | |
238 | other children. | |
239 | .\" | |
240 | .SS Removing cgroups | |
241 | A cgroup can be removed only if it has no child cgroups and contains no tasks. | |
242 | So long as that is the case, | |
243 | the cgroup can be removed by removing the corresponding directory. | |
244 | ||
245 | A special file in each cgroup hierarchy, | |
246 | .IR release_agent , | |
247 | can be used to register a program to handle cgroups which become newly empty. | |
248 | The program will be called each time a cgroup marked for | |
249 | autoremove becomes empty and childless. | |
250 | The cgroup path will be provided as the first command-line argument. | |
251 | The cgroup must be marked as eligible for autoremove by writing '1' into its | |
252 | .IR notify_on_release | |
253 | file; | |
effa83ce MK |
254 | this value is inherited by newly created child cgroups. |
255 | ||
43df1ab3 | 256 | A new feature in cgroups v2 is the |
21f0d132 MK |
257 | .I cgroup.populated |
258 | file. | |
259 | This reads 0 if there are no tasks in the cgroup or its descendants, | |
260 | and 1 otherwise. | |
261 | It can be watched for changes using | |
262 | .BR inotify (7). | |
263 | This allows user-space applications to efficiently watch cgroups | |
264 | for autoremove conditions. | |
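
A minimal sketch of testing this file from a script follows. It uses the file name and the 0/1 contents exactly as described in the text; cgroup_is_empty is a hypothetical helper, and kernels that expose the populated state differently would need a different check.

```shell
# Succeed (exit status 0) if the cgroup directory given as $1 reports that
# neither it nor its descendants contain tasks.
# cgroup_is_empty is a hypothetical helper name.
cgroup_is_empty() {
    # The file is described as reading 0 when empty and 1 otherwise.
    [ "$(cat "$1/cgroup.populated")" = "0" ]
}
```

A cleanup script could, for instance, run `cgroup_is_empty "$dir" && rmdir "$dir"` once a change to the file has been observed.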
265 | .\" | |
43df1ab3 MK |
266 | .SS Cgroups version 2 |
267 | In cgroups v2, | |
268 | all mounted controllers reside in a single unified hierarchy. | |
269 | While (different) controllers may be simultaneously | |
270 | mounted under the v1 and v2 hierarchies, | |
271 | it is not possible to mount the same controller simultaneously | |
272 | under both the v1 and the v2 hierarchies. | |
effa83ce | 273 | |
43df1ab3 | 274 | The new behaviors in cgroups v2 are summarized below: |
21f0d132 MK |
275 | .TP 3 |
276 | 1. Tasks only in leaf nodes | |
277 | With the exception of the root cgroup, tasks may only reside in leaf nodes. | |
effa83ce MK |
278 | This avoids the need to decide how to partition resources between tasks which |
279 | are members of cgroup A and tasks in child cgroups of A. | |
21f0d132 | 280 | .TP |
effa83ce | 281 | 2. Active controllers must be specified
21f0d132 MK |
282 | The unified hierarchy presents two new files, |
283 | .IR cgroup.controllers | |
284 | and | |
285 | .IR cgroup.subtree_control . | |
286 | When a cgroup | |
287 | .I /A/B | |
288 | is created, its | |
289 | .IR cgroup.controllers | |
effa83ce | 290 | file contains the list of controllers which were active in its parent, A. |
21f0d132 MK |
291 | This is the list of controllers which are available to this cgroup. |
292 | No controllers are active until they are enabled through the | |
293 | .IR cgroup.subtree_control | |
294 | file, by writing a space-separated list of controller names, | |
295 | each preceded by '+' (to enable) or '-' (to disable). | |
296 | If the | |
297 | .I freezer | |
298 | controller is not enabled in | |
299 | .IR /A/B , | |
300 | then it cannot be enabled in | |
301 | .IR /A/B/C . | |
302 | .TP | |
effa83ce | 303 | 3. No "tasks" or "cgroup.clone_children" files |
21f0d132 | 304 | .TP |
effa83ce | 305 | 4. Empty cgroup notification |
21f0d132 MK |
306 | A new file, |
307 | .IR cgroup.populated , | |
308 | under each cgroup contains '0' when the | |
309 | cgroup is empty, and 1 when it is populated. | |
310 | It therefore may be watched to detect when a cgroup becomes (non-)empty. | |
311 | This replaces the original notify-on-release mechanism. | |
312 | ||
313 | For more changes, please see the | |
314 | .I Documentation/cgroups/unified-hierarchy | |
effa83ce | 315 | file in the kernel source. |
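
The '+' and '-' syntax used with cgroup.subtree_control (item 2 in the list above) can be illustrated with a small formatting helper. This is a sketch only: subtree_request is a hypothetical name, and the function merely builds the string to be written; it does not touch the cgroup filesystem itself.

```shell
# Build a request string for cgroup.subtree_control.
# $1 is '+' (enable) or '-' (disable); the remaining arguments are
# controller names. subtree_request is a hypothetical helper.
subtree_request() {
    sign=$1
    shift
    out=
    for c in "$@"; do
        # Separate entries with a single space, each prefixed by the sign.
        out="${out:+$out }$sign$c"
    done
    printf '%s\n' "$out"
}
```

The output would be redirected into the parent's control file, for example `subtree_request + memory pids > A/cgroup.subtree_control`, run from wherever the unified hierarchy is mounted (the path here is illustrative).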
21f0d132 | 316 | .\" |
43df1ab3 | 317 | .SS Cgroups version 1 subsystems |
21f0d132 | 318 | .TP |
f0d27655 | 319 | .I cpu |
94eeedfd | 320 | Cgroups can be guaranteed a minimum number of "CPU shares" |
f0d27655 MK |
321 | when a system is busy. |
322 | This does not limit a cgroup's CPU usage if the CPUs are not busy. | |
323 | .TP | |
324 | .I cpuacct | |
325 | This provides accounting for CPU usage by groups of tasks. | |
326 | .TP | |
488c879a | 327 | .I cpuset |
21f0d132 MK |
328 | This cgroup can be used to bind the tasks in a cgroup to |
329 | a specified set of CPUs and NUMA nodes. | |
330 | .TP | |
f0d27655 MK |
331 | .I memory |
332 | The memory controller supports reporting and limiting of process memory, kernel | |
333 | memory, and swap used by cgroups. | |
21f0d132 MK |
334 | .TP |
335 | .I devices | |
effa83ce | 336 | This supports controlling which tasks may create (mknod) devices as |
21f0d132 MK |
337 | well as open them for reading or writing. |
338 | The policies may be specified as whitelists and blacklists. | |
339 | Hierarchy is enforced, so new rules must not | |
effa83ce | 340 | violate existing rules for the target or ancestor cgroups. |
21f0d132 MK |
341 | .TP |
342 | .I freezer | |
343 | The | |
344 | .I freezer | |
345 | cgroup can suspend and restore (resume) all tasks in a cgroup. | |
346 | Freezing a cgroup | |
347 | .I /A | |
348 | also causes its children, for example, tasks in | |
349 | .IR /A/B , | |
effa83ce | 350 | to be frozen. |
21f0d132 | 351 | .TP |
21f0d132 | 352 | .I net_cls |
effa83ce | 353 | This places a classid, specified for the cgroup, on network packets |
21f0d132 MK |
354 | created by a cgroup. |
355 | These classids can then be used in firewall rules, | |
356 | as well as used to shape traffic using | |
357 | .BR tc (8). | |
358 | This only applies to packets | |
effa83ce | 359 | leaving the cgroup, not to traffic arriving at the cgroup. |
21f0d132 | 360 | .TP |
f0d27655 MK |
361 | .I blkio |
362 | The | |
363 | .I blkio | |
364 | cgroup controls and limits access to specified block devices by | |
365 | applying IO control in the form of throttling and upper limits against leaf | |
366 | nodes and intermediate nodes in the storage hierarchy. | |
367 | ||
368 | Two policies are available. | |
369 | The first is a proportional-weight time-based division | |
370 | of disk implemented with CFQ. | |
371 | This is in effect for leaf nodes using CFQ. | |
372 | The second is a throttling policy which specifies | |
373 | upper I/O rate limits on a device. | |
374 | .TP | |
375 | .I perf_event | |
43df1ab3 MK |
376 | This controller allows |
377 | .I perf | |
378 | monitoring of the set of processes grouped in a cgroup. | |
f0d27655 | 379 | .TP |
21f0d132 MK |
380 | .I net_prio |
381 | This allows priorities to be specified, per network interface, for cgroups. | |
382 | .TP | |
f0d27655 MK |
383 | .I hugetlb |
384 | This supports limiting the use of huge pages by cgroups. | |
0d293858 MK |
385 | .TP |
386 | .I pids | |
387 | This controller permits limiting the number of processes that may be created | |
388 | in a cgroup (and its descendants). | |
bbfdf727 | 389 | .SH SEE ALSO |
69c47536 | 390 | .BR cpuset (7), |
bbfdf727 | 391 | .BR namespaces (7) |