Commit | Line | Data |
---|---|---|
014cb63b | 1 | .\" Copyright (C) 2015 Serge Hallyn <serge@hallyn.com> |
43df1ab3 | 2 | .\" and Copyright (C) 2016 Michael Kerrisk <mtk.manpages@gmail.com> |
014cb63b MK |
3 | .\" |
4 | .\" %%%LICENSE_START(VERBATIM) | |
5 | .\" Permission is granted to make and distribute verbatim copies of this | |
6 | .\" manual provided the copyright notice and this permission notice are | |
7 | .\" preserved on all copies. | |
8 | .\" | |
9 | .\" Permission is granted to copy and distribute modified versions of this | |
10 | .\" manual under the conditions for verbatim copying, provided that the | |
11 | .\" entire resulting derived work is distributed under the terms of a | |
12 | .\" permission notice identical to this one. | |
13 | .\" | |
14 | .\" Since the Linux kernel and libraries are constantly changing, this | |
15 | .\" manual page may be incorrect or out-of-date. The author(s) assume no | |
16 | .\" responsibility for errors or omissions, or for damages resulting from | |
17 | .\" the use of the information contained herein. The author(s) may not | |
18 | .\" have taken the same level of care in the production of this manual, | |
19 | .\" which is licensed free of charge, as they might when working | |
20 | .\" professionally. | |
21 | .\" | |
22 | .\" Formatted or processed versions of this manual, if unaccompanied by | |
23 | .\" the source, must acknowledge the copyright and authors of this work. | |
24 | .\" %%%LICENSE_END | |
25 | .\" | |
3df541c0 | 26 | .TH CGROUPS 7 2016-07-17 "Linux" "Linux Programmer's Manual" |
21f0d132 MK |
27 | .SH NAME |
28 | cgroups \- Linux control groups | |
29 | .SH DESCRIPTION | |
30 | Control groups, usually referred to as cgroups, |
a15e0673 | 31 | are a Linux kernel feature that allows processes to |
8bff7140 MK |
32 | be organized into hierarchical groups whose usage of |
33 | various types of resources can then be limited and monitored. | |
34 | The kernel's cgroup interface is provided through | |
21f0d132 | 35 | a pseudo-filesystem called cgroupfs. |
6398ca15 | 36 | Grouping is implemented in the core cgroup kernel code, |
21f0d132 | 37 | while resource tracking and limits are implemented in |
8bff7140 | 38 | a set of per-resource-type subsystems (memory, CPU, and so on). |
21f0d132 | 39 | .\" |
176a4211 MK |
40 | .SS Terminology |
41 | A | |
42 | .I cgroup | |
43 | is a collection of processes that are bound to a set of | |
44 | limits or parameters defined via the cgroup filesystem. | |
45 | ||
46 | A | |
47 | .I subsystem | |
48 | is a kernel component that modifies the behavior of | |
49 | the processes in a cgroup. | |
50 | Various subsystems have been implemented, making it possible to do things | |
51 | such as limiting the amount of CPU time and memory available to a cgroup, | |
52 | accounting for the CPU time used by a cgroup, | |
53 | and freezing and resuming execution of the processes in a cgroup. | |
54 | Subsystems are sometimes also known as | |
55 | .IR "resource controllers" | |
56 | (or simply, controllers). | |
57 | ||
55f52de8 | 58 | The cgroups for a controller are arranged in a |
176a4211 MK |
59 | .IR hierarchy . |
60 | This hierarchy is defined by creating, removing, and | |
61 | renaming subdirectories within the cgroup filesystem. | |
8fc9db1e MK |
62 | At each level of the hierarchy, attributes (e.g., limits) can be defined. |
63 | The limits, control, and accounting provided by cgroups generally have | |
64 | effect throughout the subhierarchy underneath the cgroup where the | |
65 | attributes are defined. | |
8bff7140 MK |
66 | Thus, for example, the limits placed on |
67 | a cgroup at a higher level in the hierarchy cannot be exceeded | |
68 | by descendant cgroups. | |
176a4211 | 69 | .\" |
43df1ab3 MK |
70 | .SS Cgroups version 1 and version 2 |
71 | The initial release of the cgroups implementation was in Linux 2.6.24. | |
55f52de8 | 72 | Over time, various cgroup controllers have been added |
43df1ab3 | 73 | to allow the management of various types of resources. |
55f52de8 MK |
74 | However, the development of these controllers was largely uncoordinated, |
75 | with the result that many inconsistencies arose between controllers | |
43df1ab3 MK |
76 | and management of the cgroup hierarchies became rather complex. |
77 | (A longer description of these problems can be found in | |
78 | the kernel source file | |
0a837899 | 79 | .IR Documentation/cgroup\-v2.txt .) |
43df1ab3 | 80 | |
813d9220 MK |
81 | Because of the problems with the initial cgroups implementation |
82 | (cgroups version 1), | |
43df1ab3 MK |
83 | starting in Linux 3.10, work began on a new, |
84 | orthogonal implementation to remedy these problems. | |
85 | Initially marked experimental, and hidden behind the | |
86 | .I "\-o\ __DEVEL__sane_behavior" | |
87 | mount option, the new version (cgroups version 2) | |
88 | was eventually made official with the release of Linux 4.5. | |
89 | Differences between the two versions are described in the text below. | |
90 | ||
91 | Although cgroups v2 is intended as a replacement for cgroups v1, | |
92 | the older system continues to exist | |
93 | (and for compatibility reasons is unlikely to be removed). | |
94 | Currently, cgroups v2 implements only a subset of the controllers | |
95 | available in cgroups v1. | |
96 | The two systems are implemented so that both v1 controllers and | |
97 | v2 controllers can be mounted on the same system. | |
98 | Thus, for example, it is possible to use those controllers | |
99 | that are supported under version 2, | |
100 | while also using version 1 controllers | |
101 | where version 2 does not yet support those controllers. | |
1a90a85e MK |
102 | The only restriction here is that a controller can't be simultaneously |
103 | employed in both a cgroups v1 hierarchy and in the cgroups v2 hierarchy. | |
43df1ab3 | 104 | .\" |
8bff7140 MK |
105 | .SS Cgroups version 1 |
106 | Under cgroups v1, each controller may be mounted against a separate | |
107 | cgroup filesystem that provides its own hierarchical organization of the | |
108 | processes on the system. | |
109 | It is also possible to comount multiple (or even all) cgroups v1 controllers | |
110 | against the same cgroup filesystem, meaning that the comounted controllers | |
111 | manage the same hierarchical organization of processes. | |
112 | ||
113 | For each mounted hierarchy, | |
114 | the directory tree mirrors the control group hierarchy. | |
115 | Each control group is represented by a directory, with each of its child | |
116 | cgroups represented as a child directory. | |
117 | For instance, | |
118 | .IR /user/joe/1.session | |
119 | represents control group | |
120 | .IR 1.session , | |
121 | which is a child of cgroup | |
122 | .IR joe , | |
123 | which is a child of | |
124 | .IR /user . | |
125 | Under each cgroup directory is a set of files which can be read or | |
126 | written to, reflecting resource limits and a few general cgroup | |
127 | properties. | |
128 | ||
129 | In addition, in cgroups v1, | |
55f52de8 | 130 | cgroups can be mounted with no bound controller, in which case |
8bff7140 | 131 | they serve only to track processes. |
59dabd75 | 132 | (See the discussion of release notification below.) |
8bff7140 MK |
133 | An example of this is the |
134 | .I name=systemd | |
135 | cgroup which is used by | |
136 | .BR systemd (1) | |
137 | to track services and user sessions. | |
138 | .\" | |
6398ca15 | 139 | .SS Tasks (threads) versus processes |
c775bca2 MK |
140 | In cgroups v1, a distinction is drawn between |
141 | .I processes | |
142 | and | |
143 | .IR tasks . | |
144 | In this view, a process can consist of multiple tasks | |
6398ca15 MK |
145 | (more commonly called threads, from a user-space perspective, |
146 | and called such in the remainder of this man page). | |
0ec74e08 | 147 | In cgroups v1, it is possible to independently manipulate |
6398ca15 | 148 | the cgroup memberships of the threads in a process. |
c775bca2 MK |
149 | Because this ability caused certain problems, |
150 | .\" FIXME Add some text describing why this was a problem. | |
151 | the ability to independently manipulate the cgroup memberships | |
6398ca15 | 152 | of the threads in a process has been removed in cgroups v2. |
c775bca2 MK |
153 | Cgroups v2 allows manipulation of cgroup membership only for processes |
154 | (which has the effect of changing the cgroup membership of | |
6398ca15 | 155 | all threads in the process). |
c775bca2 | 156 | .\" |
77e0a626 MK |
157 | .SS Mounting v1 controllers |
158 | The use of cgroups requires a kernel built with the | |
159 | .BR CONFIG_CGROUPS " option." | |
160 | In addition, each of the v1 controllers has an associated | |
161 | configuration option that must be set in order to employ that controller. | |
effa83ce | 162 | |
77e0a626 MK |
163 | In order to use a v1 controller, |
164 | it must be mounted against a cgroup filesystem. | |
4e07c70f MK |
165 | The usual place for such mounts is under a |
166 | .BR tmpfs (5) | |
167 | filesystem mounted at | |
77e0a626 MK |
168 | .IR /sys/fs/cgroup . |
169 | Thus, one might mount the | |
170 | .I cpu | |
171 | controller as follows: | |
34d725f6 | 172 | |
77e0a626 MK |
173 | .nf |
174 | .in +4n | |
175 | mount \-t cgroup \-o cpu none /sys/fs/cgroup/cpu | |
176 | .in | |
177 | .fi | |
effa83ce | 178 | |
77e0a626 MK |
179 | It is possible to comount multiple controllers against the same hierarchy. |
180 | For example, here the | |
181 | .IR cpu | |
21f0d132 | 182 | and |
77e0a626 MK |
183 | .IR cpuacct |
184 | controllers are comounted against a single hierarchy: | |
21f0d132 MK |
185 | |
186 | .nf | |
187 | .in +4n | |
77e0a626 | 188 | mount \-t cgroup \-o cpu,cpuacct none /sys/fs/cgroup/cpu,cpuacct |
21f0d132 MK |
189 | .in |
190 | .fi | |
effa83ce | 191 | |
55f52de8 | 192 | Comounting controllers has the effect that a process is in the same cgroup for |
77e0a626 | 193 | all of the comounted controllers. |
55f52de8 | 194 | Separately mounting controllers allows a process to |
21f0d132 MK |
195 | be in cgroup |
196 | .I /foo1 | |
55f52de8 | 197 | for one controller while being in |
21f0d132 MK |
198 | .I /foo2/foo3 |
199 | for another. | |
77e0a626 MK |
200 | |
201 | It is possible to comount all v1 controllers against the same hierarchy: | |
202 | ||
203 | .nf | |
204 | .in +4n | |
205 | mount \-t cgroup \-o all cgroup /sys/fs/cgroup | |
206 | .in | |
207 | .fi | |
208 | ||
209 | (One can achieve the same result by omitting | |
210 | .IR "\-o all" , | |
211 | since it is the default if no controllers are explicitly specified.) | |
212 | ||
31ec2a5c MK |
213 | It is not possible to mount the same controller |
214 | against multiple cgroup hierarchies. | |
215 | For example, it is not possible to mount both the | |
216 | .I cpu | |
217 | and | |
218 | .I cpuacct | |
219 | controllers against one hierarchy, and to mount the | |
220 | .I cpu | |
221 | controller alone against another hierarchy. | |
222 | It is possible to create multiple mount points with exactly | |
223 | the same set of comounted controllers. | |
224 | However, in this case all that results is multiple mount points | |
225 | providing a view of the same hierarchy. | |
226 | ||
77e0a626 MK |
227 | Note that on many systems, the v1 controllers are automatically mounted under |
228 | .IR /sys/fs/cgroup ; | |
229 | in particular, | |
230 | .BR systemd (1) | |
231 | automatically creates such mount points. | |
21f0d132 | 232 | .\" |
860573ad MK |
233 | .SS Cgroups version 1 controllers |
234 | Each of the cgroups version 1 controllers is governed | |
235 | by a kernel configuration option (listed below). | |
236 | Additionally, the availability of the cgroups feature is governed by the | |
237 | .BR CONFIG_CGROUPS | |
238 | kernel configuration option. | |
239 | .TP | |
240 | .IR cpu " (since Linux 2.6.24; " \fBCONFIG_CGROUP_SCHED\fP ) | |
241 | Cgroups can be guaranteed a minimum number of "CPU shares" | |
242 | when a system is busy. | |
243 | This does not limit a cgroup's CPU usage if the CPUs are not busy. | |
4ad9a706 MK |
244 | For further information, see |
245 | .IR Documentation/scheduler/sched-design-CFS.txt . | |
860573ad | 246 | |
4ad9a706 MK |
247 | In Linux 3.2, |
248 | this controller was extended to provide CPU "bandwidth" control. | |
249 | If the kernel is configured with | |
250 | .BR CONFIG_CFS_BANDWIDTH , | |
251 | then within each scheduling period | |
252 | (defined via a file in the cgroup directory), it is possible to define | |
253 | an upper limit on the CPU time allocated to the processes in a cgroup. | |
254 | This upper limit applies even if there is no other competition for the CPU. | |
860573ad MK |
255 | Further information can be found in the kernel source file |
256 | .IR Documentation/scheduler/sched\-bwc.txt . | |
257 | .TP | |
258 | .IR cpuacct " (since Linux 2.6.24; " \fBCONFIG_CGROUP_CPUACCT\fP ) | |
259 | This provides accounting for CPU usage by groups of processes. | |
260 | ||
261 | Further information can be found in the kernel source file | |
262 | .IR Documentation/cgroup\-v1/cpuacct.txt . | |
263 | .TP | |
264 | .IR cpuset " (since Linux 2.6.24; " \fBCONFIG_CPUSETS\fP ) | |
265 | This controller can be used to bind the processes in a cgroup to | |
266 | a specified set of CPUs and NUMA nodes. | |
267 | ||
268 | Further information can be found in the kernel source file | |
269 | .IR Documentation/cgroup\-v1/cpusets.txt . | |
270 | .TP | |
271 | .IR memory " (since Linux 2.6.25; " \fBCONFIG_MEMCG\fP ) | |
272 | The memory controller supports reporting and limiting of process memory, kernel | |
273 | memory, and swap used by cgroups. | |
274 | ||
275 | Further information can be found in the kernel source file | |
276 | .IR Documentation/cgroup\-v1/memory.txt . | |
277 | .TP | |
278 | .IR devices " (since Linux 2.6.26; " \fBCONFIG_CGROUP_DEVICE\fP ) | |
279 | This supports controlling which processes may create (mknod) devices as | |
280 | well as open them for reading or writing. | |
281 | The policies may be specified as whitelists and blacklists. | |
282 | Hierarchy is enforced, so new rules must not | |
283 | violate existing rules for the target or ancestor cgroups. | |
284 | ||
285 | Further information can be found in the kernel source file | |
286 | .IR Documentation/cgroup-v1/devices.txt . | |
287 | .TP | |
288 | .IR freezer " (since Linux 2.6.28; " \fBCONFIG_CGROUP_FREEZER\fP ) | |
289 | The | |
290 | .IR freezer | |
291 | cgroup can suspend and restore (resume) all processes in a cgroup. | |
292 | Freezing a cgroup | |
293 | .I /A | |
294 | also causes its children, for example, processes in | |
295 | .IR /A/B , | |
296 | to be frozen. | |
297 | ||
298 | Further information can be found in the kernel source file | |
299 | .IR Documentation/cgroup-v1/freezer-subsystem.txt . | |
300 | .TP | |
301 | .IR net_cls " (since Linux 2.6.29; " \fBCONFIG_CGROUP_NET_CLASSID\fP ) | |
302 | This places a classid, specified for the cgroup, on network packets | |
303 | created by a cgroup. | |
304 | These classids can then be used in firewall rules, | |
305 | as well as used to shape traffic using | |
306 | .BR tc (8). | |
307 | This applies only to packets | |
308 | leaving the cgroup, not to traffic arriving at the cgroup. | |
309 | ||
310 | Further information can be found in the kernel source file | |
311 | .IR Documentation/cgroup-v1/net_cls.txt . | |
312 | .TP | |
313 | .IR blkio " (since Linux 2.6.33; " \fBCONFIG_BLK_CGROUP\fP ) | |
314 | The | |
315 | .I blkio | |
316 | cgroup controls and limits access to specified block devices by | |
317 | applying IO control in the form of throttling and upper limits against leaf | |
318 | nodes and intermediate nodes in the storage hierarchy. | |
319 | ||
320 | Two policies are available. | |
321 | The first is a proportional-weight time-based division | |
322 | of disk implemented with CFQ. | |
323 | This is in effect for leaf nodes using CFQ. | |
324 | The second is a throttling policy which specifies | |
325 | upper I/O rate limits on a device. | |
326 | ||
327 | Further information can be found in the kernel source file | |
328 | .IR Documentation/cgroup-v1/blkio-controller.txt . | |
329 | .TP | |
330 | .IR perf_event " (since Linux 2.6.39; " \fBCONFIG_CGROUP_PERF\fP ) | |
331 | This controller allows | |
332 | .I perf | |
333 | monitoring of the set of processes grouped in a cgroup. | |
334 | ||
335 | Further information can be found in the kernel source file | |
c174eb6a | 336 | .IR tools/perf/Documentation/perf-record.txt . |
860573ad MK |
337 | .TP |
338 | .IR net_prio " (since Linux 3.3; " \fBCONFIG_CGROUP_NET_PRIO\fP ) | |
339 | This allows priorities to be specified, per network interface, for cgroups. | |
340 | ||
341 | Further information can be found in the kernel source file | |
342 | .IR Documentation/cgroup-v1/net_prio.txt . | |
343 | .TP | |
344 | .IR hugetlb " (since Linux 3.5; " \fBCONFIG_CGROUP_HUGETLB\fP ) | |
345 | This supports limiting the use of huge pages by cgroups. | |
346 | ||
347 | Further information can be found in the kernel source file | |
348 | .IR Documentation/cgroup-v1/hugetlb.txt . | |
349 | .TP | |
350 | .IR pids " (since Linux 4.3; " \fBCONFIG_CGROUP_PIDS\fP ) | |
351 | This controller permits limiting the number of processes that may be created | |
352 | in a cgroup (and its descendants). | |
353 | ||
354 | Further information can be found in the kernel source file | |
355 | .IR Documentation/cgroup-v1/pids.txt . | |
356 | .\" | |
6398ca15 | 357 | .SS Creating cgroups and moving processes |
9ed582ac | 358 | A cgroup filesystem initially contains a single root cgroup, '/', |
6398ca15 | 359 | which all processes belong to. |
21f0d132 MK |
360 | A new cgroup is created by creating a directory in the cgroup filesystem: |
361 | ||
362 | mkdir /sys/fs/cgroup/cpu/cg1 | |
363 | ||
364 | This creates a new empty cgroup. | |
f524e7f8 MK |
365 | |
366 | A process may be moved to this cgroup by writing its PID into the cgroup's | |
21f0d132 | 367 | .I cgroup.procs |
21f0d132 MK |
368 | file: |
369 | ||
370 | echo $$ > /sys/fs/cgroup/cpu/cg1/cgroup.procs | |
371 | ||
f524e7f8 MK |
372 | Only one PID at a time should be written to this file. |
373 | ||
374 | Writing the value 0 to a | |
375 | .IR cgroup.procs | |
376 | file causes the writing process to be moved to the corresponding cgroup. | |
377 | ||
6398ca15 MK |
378 | When writing a PID into |
379 | .IR cgroup.procs , | |
87402a2e MK |
380 | all threads in the process are moved into the new cgroup at once. |
381 | ||
f524e7f8 MK |
382 | Within a hierarchy, a process can be a member of exactly one cgroup. |
383 | Writing a process's PID to a | |
384 | .IR cgroup.procs | |
385 | file automatically removes it from the cgroup of | |
386 | which it was previously a member. | |
387 | ||
388 | The | |
389 | .I cgroup.procs | |
390 | file can be read to obtain a list of the processes that are | |
391 | members of a cgroup. | |
392 | The returned list of PIDs is not guaranteed to be in order. | |
393 | Nor is it guaranteed to be free of duplicates. | |
394 | (For example, a PID may be recycled while reading from the list.) | |
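Because the list is neither guaranteed to be ordered nor free of duplicates, a reader of the file may want to normalize it first. The following is a minimal sketch, assuming a POSIX shell with sort(1) available; the function name cgroup_pids is hypothetical and not part of any cgroups interface.

```shell
# Print the PIDs listed in a cgroup.procs file, sorted numerically and
# with duplicates removed.  $1 is the pathname of the cgroup.procs file.
cgroup_pids() {
    sort -n -u "$1"
}
```

For the example cgroup created above, this might be invoked as cgroup_pids /sys/fs/cgroup/cpu/cg1/cgroup.procs. The result is still only a snapshot: processes can enter or leave the cgroup after the file has been read.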
395 | ||
87402a2e MK |
396 | In cgroups v1 (but not cgroups v2), an individual thread can be moved to |
397 | another cgroup by writing its thread ID | |
398 | (i.e., the kernel thread ID returned by | |
399 | .BR clone (2) | |
400 | and | |
401 | .BR gettid (2)) | |
402 | to the | |
403 | .IR tasks | |
404 | file in a cgroup directory. | |
405 | This file can be read to discover the set of threads | |
406 | that are members of the cgroup. | |
407 | This file is not present in cgroup v2 directories. | |
b43be47e MK |
408 | .\" |
409 | .SS Removing cgroups | |
410 | To remove a cgroup, | |
411 | it must first have no child cgroups and contain no (nonzombie) processes. | |
412 | So long as that is the case, one can simply | |
413 | remove the corresponding directory pathname. | |
414 | Note that files in a cgroup directory cannot and need not be | |
415 | removed. | |
416 | .\" | |
88afe701 | 417 | .SS Cgroups v1 release notification |
23388d41 MK |
418 | Two files can be used to determine whether the kernel provides |
419 | notifications when a cgroup becomes empty. | |
420 | A cgroup is considered to be empty when it contains no child | |
421 | cgroups and no member processes. | |
422 | ||
423 | A special file in the root directory of each cgroup hierarchy, | |
88afe701 | 424 | .IR release_agent , |
23388d41 MK |
425 | can be used to register the pathname of a program that may be invoked when |
426 | a cgroup in the hierarchy becomes empty. | |
427 | The pathname of the newly empty cgroup (relative to the cgroup mount point) | |
428 | is provided as the sole command-line argument when the | |
429 | .IR release_agent | |
430 | program is invoked. | |
431 | The | |
432 | .IR release_agent | |
433 | program might remove the cgroup directory, | |
434 | or perhaps repopulate it with a process. | |
435 | ||
436 | The default value of the | |
437 | .IR release_agent | |
438 | file is empty, meaning that no release agent is invoked. | |
439 | ||
440 | Whether or not the | |
441 | .IR release_agent | |
442 | program is invoked when a particular cgroup becomes empty is determined | |
443 | by the value in the | |
88afe701 | 444 | .IR notify_on_release |
23388d41 MK |
445 | file in the corresponding cgroup directory. |
446 | If this file contains the value 0, then the | |
447 | .IR release_agent | |
448 | program is not invoked. | |
449 | If it contains the value 1, the | |
450 | .IR release_agent | |
451 | program is invoked. | |
452 | The default value for this file in the root cgroup is 0. | |
453 | At the time when a new cgroup is created, | |
454 | the value in this file is inherited from the corresponding file | |
455 | in the parent cgroup. | |
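As an illustration of the mechanism described above, a release agent that removes the newly empty cgroup directory could be sketched as follows. This is a sketch under stated assumptions, not a reference implementation: the function name remove_empty_cgroup is hypothetical, and because the kernel passes only the relative cgroup pathname, the mount point is taken here as a separate argument that a real agent would have to know in advance.

```shell
# Hypothetical release agent helper: remove the directory of a cgroup
# that has just become empty.
#   $1  mount point of the hierarchy (an assumption: the kernel does not
#       pass it, so a real agent would have it configured or hard-coded)
#   $2  pathname of the empty cgroup, relative to the mount point
remove_empty_cgroup() {
    mnt=$1
    rel=${2#/}                  # tolerate a leading slash in the path
    [ -n "$rel" ] || return 1   # never try to remove the root cgroup
    rmdir "$mnt/$rel"
}
```

A script registered in release_agent might then consist of a single call such as remove_empty_cgroup /sys/fs/cgroup/cpu "$1", where the mount point shown is only an example. The agent runs only for cgroups whose notify_on_release file contains 1.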
88afe701 | 456 | .\" |
b43be47e MK |
457 | .SS Cgroups version 2 |
458 | In cgroups v2, | |
459 | all mounted controllers reside in a single unified hierarchy. | |
460 | While (different) controllers may be simultaneously | |
461 | mounted under the v1 and v2 hierarchies, | |
462 | it is not possible to mount the same controller simultaneously | |
463 | under both the v1 and the v2 hierarchies. | |
464 | ||
2befa495 MK |
465 | The new behaviors in cgroups v2 are summarized here, |
466 | and in some cases elaborated in the following subsections. | |
467 | .IP 1. 3 | |
a15e0673 | 468 | Cgroups v2 provides a unified hierarchy against |
dddb7ea1 MK |
469 | which all controllers are mounted. |
470 | .IP 2. | |
2befa495 MK |
471 | "Internal" processes are not permitted. |
472 | With the exception of the root cgroup, processes may reside | |
473 | only in leaf nodes (cgroups that do not themselves contain child cgroups). | |
dddb7ea1 | 474 | .IP 3. |
2befa495 MK |
475 | Active controllers must be specified via the files |
476 | .IR cgroup.controllers | |
477 | and | |
478 | .IR cgroup.subtree_control . | |
dddb7ea1 | 479 | .IP 4. |
2befa495 MK |
480 | The |
481 | .I tasks | |
482 | file has been removed. | |
483 | In addition, the | |
484 | .I cgroup.clone_children | |
485 | file that is employed by the | |
486 | .I cpuset | |
487 | controller has been removed. | |
dddb7ea1 | 488 | .IP 5. |
2befa495 MK |
489 | An improved mechanism for notification of empty cgroups is provided by the |
490 | .IR cgroup.events | |
491 | file. | |
492 | .PP | |
493 | For more changes, see the | |
494 | .I Documentation/cgroup-v2.txt | |
495 | file in the kernel source. | |
496 | .\" | |
dddb7ea1 MK |
497 | .SS Cgroups v2 unified hierarchy |
498 | In cgroups v1, the ability to mount different controllers | |
499 | against different hierarchies was intended to allow great flexibility | |
500 | for application design. | |
501 | In practice, though, the flexibility turned out to be less useful than expected, | |
502 | and in many cases added complexity. | |
503 | Therefore, in cgroups v2, | |
504 | all available controllers are mounted against a single hierarchy. | |
505 | The available controllers are automatically mounted, | |
506 | meaning that it is not necessary (or possible) to specify the controllers | |
507 | when mounting the cgroup v2 filesystem using a command such as the following: | |
508 | ||
509 | mount \-t cgroup2 none /mnt/cgroup2 | |
510 | ||
511 | A cgroup v2 controller is available only if it is not currently in use | |
512 | via a mount against a cgroup v1 hierarchy. | |
513 | Or, to put things another way, it is not possible to employ | |
514 | the same controller against both a v1 hierarchy and the unified v2 hierarchy. | |
515 | .\" | |
e4c759bc | 516 | .SS Cgroups v2 """no internal processes""" rule |
2befa495 MK |
517 | With the exception of the root cgroup, processes may reside |
518 | only in leaf nodes (cgroups that do not themselves contain child cgroups). | |
b43be47e MK |
519 | This avoids the need to decide how to partition resources between |
520 | processes which are members of cgroup A and processes in child cgroups of A. | |
effa83ce | 521 | |
21f0d132 MK |
522 | For instance, if cgroup |
523 | .I /cg1/cg2 | |
6398ca15 | 524 | exists, then a process may reside in |
21f0d132 MK |
525 | .IR /cg1/cg2 , |
526 | but not in | |
527 | .IR /cg1 . | |
5b38b21d | 528 | This is to avoid an ambiguity in cgroups v1 |
3ddb25ac | 529 | with respect to the delegation of resources between processes in |
21f0d132 MK |
530 | .I /cg1 |
531 | and its child cgroups. | |
3ddb25ac | 532 | The recommended approach in cgroups v2 is to create a subdirectory called |
21f0d132 | 533 | .I leaf |
3ddb25ac MK |
534 | for any nonleaf cgroup which should contain processes, but no child cgroups. |
535 | Thus, processes which previously would have gone into | |
21f0d132 MK |
536 | .I /cg1 |
537 | would now go into | |
538 | .IR /cg1/leaf . | |
3ddb25ac MK |
539 | This has the advantage of making explicit |
540 | the relationship between processes in | |
21f0d132 MK |
541 | .I /cg1/leaf |
542 | and | |
543 | .IR /cg1 's | |
544 | other children. | |
2befa495 MK |
545 | .\" |
546 | .SS Cgroups v2 subtree control | |
21f0d132 MK |
547 | When a cgroup |
548 | .I /A/B | |
549 | is created, its | |
550 | .IR cgroup.controllers | |
effa83ce | 551 | file contains the list of controllers which were active in its parent, A. |
21f0d132 MK |
552 | This is the list of controllers which are available to this cgroup. |
553 | No controllers are active until they are enabled through the | |
554 | .IR cgroup.subtree_control | |
df6f53cc | 555 | file, by writing the list of space-delimited names of the controllers, |
0a837899 | 556 | each preceded by '+' (to enable) or '\-' (to disable). |
21f0d132 MK |
557 | If the |
558 | .I freezer | |
559 | controller is not enabled in | |
560 | .IR /A/B , | |
561 | then it cannot be enabled in | |
562 | .IR /A/B/C . | |
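The string written to cgroup.subtree_control, as described above, is a space-delimited list of controller names, each with a '+' or '-' prefix. A minimal sketch of building such a string follows; the function name subtree_control_request is hypothetical, and the controller names in the usage note are only examples of names that might appear in cgroup.controllers.

```shell
# Build a request string for cgroup.subtree_control.
#   $1      '+' to enable or '-' to disable
#   $2...   controller names
# Prints the names space-delimited, each preceded by the chosen sign.
subtree_control_request() {
    op=$1
    shift
    out=
    for c in "$@"; do
        out="${out:+$out }$op$c"
    done
    printf '%s\n' "$out"
}
```

For example, subtree_control_request + memory pids prints "+memory +pids", which could then be redirected into the cgroup.subtree_control file of the parent cgroup. The write succeeds only for controllers that are listed in that cgroup's cgroup.controllers file.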
21f0d132 | 563 | .\" |
754f4cf5 MK |
564 | .SS Cgroups v2 cgroup.events file |
565 | With cgroups v2, a new mechanism is provided to obtain notification | |
566 | about when a cgroup becomes empty. | |
567 | The cgroups v1 | |
568 | .IR release_agent | |
569 | and | |
570 | .IR notify_on_release | |
571 | files are removed, and replaced by a new, more general-purpose file, | |
572 | .IR cgroup.events . | |
573 | This file contains key-value pairs | |
574 | (delimited by newline characters, with the key and value separated by spaces) | |
575 | that identify events or state for a cgroup. | |
576 | Currently, only one key appears in this file, | |
577 | .IR populated , | |
578 | which has either the value 0, | |
579 | meaning that the cgroup (and its descendants) | |
580 | contain no (nonzombie) processes, | |
581 | or 1, meaning that the cgroup (or one of its descendants) contains member processes. | |
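Reading the populated key described above can be sketched with a short helper. This is an illustration only, assuming a POSIX shell with awk(1); the function name cgroup_is_populated is hypothetical. It matches on the key name rather than on line position, so that it would keep working if further keys were added to the file.

```shell
# Succeed (exit status 0) if the cgroup.events file named by $1 reports
# "populated 1", and fail otherwise.
cgroup_is_populated() {
    [ "$(awk '$1 == "populated" { print $2 }' "$1")" = 1 ]
}
```

A caller might use it as cgroup_is_populated /mnt/cgroup2/cg1/cgroup.events || echo empty, where the pathname is an example based on the mount point used earlier in this page.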
582 | ||
583 | The | |
584 | .IR cgroup.events | |
585 | file can be monitored, in order to receive notification when a cgroup | |
586 | transitions between the populated and unpopulated states. | |
587 | When monitoring this file using | |
588 | .BR inotify (7), | |
589 | transitions generate | |
590 | .BR IN_MODIFY | |
591 | events, and when monitoring the file using | |
592 | .BR poll (2), | |
593 | transitions generate | |
594 | .B POLLPRI | |
595 | events. | |
596 | ||
597 | The cgroups v2 | |
598 | .IR cgroup.events | |
599 | mechanism offers at least two advantages over the cgroups v1 | |
600 | .IR release_agent | |
601 | mechanism. | |
602 | First, it allows for cheaper notification, | |
603 | since a single process can monitor multiple | |
604 | .IR cgroup.events | |
605 | files. | |
606 | By contrast, the cgroups v1 mechanism requires the creation | |
607 | of a process for each notification. | |
a15e0673 | 608 | Second, notification can be delegated to a process that lives inside |
754f4cf5 | 609 | a container associated with the newly empty cgroup. |
c91a9f8a | 610 | .\" |
5c2181ad MK |
611 | .SS /proc files |
612 | .TP | |
34eb3340 | 613 | .IR /proc/cgroups " (since Linux 2.6.24)" |
92bb6d36 | 614 | This file contains information about the controllers |
1a4f7d59 | 615 | that are compiled into the kernel. |
34eb3340 MK |
616 | An example of the contents of this file (reformatted for readability) |
617 | is the following: | |
618 | ||
619 | .nf | |
620 | .in +4n | |
4580c2f6 MK |
621 | #subsys_name hierarchy num_cgroups enabled |
622 | cpuset 4 1 1 | |
623 | cpu 8 1 1 | |
624 | cpuacct 8 1 1 | |
625 | blkio 6 1 1 | |
626 | memory 3 1 1 | |
627 | devices 10 84 1 | |
628 | freezer 7 1 1 | |
629 | net_cls 9 1 1 | |
630 | perf_event 5 1 1 | |
631 | net_prio 9 1 1 | |
632 | hugetlb 0 1 0 | |
633 | pids 2 1 1 | |
34eb3340 MK |
634 | .in |
635 | .fi | |
636 | ||
637 | The fields in this file are, from left to right: | |
638 | .RS | |
639 | .IP 1. 3 | |
640 | The name of the controller. | |
641 | .IP 2. | |
92bb6d36 | 642 | The unique ID of the cgroup hierarchy on which this controller is mounted. |
11c0797f | 643 | If multiple cgroups v1 controllers are bound to the same hierarchy, |
34eb3340 | 644 | then each will show the same hierarchy ID in this field. |
92bb6d36 MK |
645 | The value in this field will be 0 if: |
646 | .RS 5 | |
647 | .IP a) 3 | |
648 | the controller is not mounted on a cgroups v1 hierarchy; | |
649 | .IP b) | |
650 | the controller is bound to the cgroups v2 single unified hierarchy; or | |
651 | .IP c) | |
652 | the controller is disabled (see below). | |
653 | .RE | |
34eb3340 MK |
654 | .IP 3. |
655 | The number of control groups in this hierarchy using this controller. | |
656 | .IP 4. | |
657 | This field contains the value 1 if this controller is enabled, | |
658 | or 0 if it has been disabled (via the | |
659 | .IR cgroup_disable | |
660 | kernel command-line boot parameter). | |
661 | .RE | |
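The field layout above lends itself to simple filtering. The following sketch, assuming a POSIX shell with awk(1), lists the controllers that are enabled and currently bound to a cgroups v1 hierarchy (nonzero hierarchy ID); the function name v1_mounted_controllers is hypothetical.

```shell
# List controllers that are enabled (field 4 is 1) and bound to a
# cgroups v1 hierarchy (field 2, the hierarchy ID, is nonzero).
# $1 optionally names a file in /proc/cgroups format; the default is
# /proc/cgroups itself.
v1_mounted_controllers() {
    awk '!/^#/ && $2 != 0 && $4 == 1 { print $1 }' "${1:-/proc/cgroups}"
}
```

Applied to the example contents shown above, this would print every controller except hugetlb, which has a hierarchy ID of 0 and is disabled.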
662 | .TP | |
5c2181ad | 663 | .IR /proc/[pid]/cgroup " (since Linux 2.6.24)" |
f5faa016 MK |
664 | This file describes control groups to which the process |
665 | with the corresponding PID belongs. | |
5f8a7eb2 | 666 | The displayed information differs for |
2c4fbe35 | 667 | cgroups version 1 and version 2 hierarchies. |
5f8a7eb2 MK |
668 | |
669 | For each cgroup hierarchy of which the process is a member, | |
670 | there is one entry containing three | |
5c2181ad | 671 | colon-separated fields of the form: |
5f8a7eb2 | 672 | |
55f52de8 | 673 | hierarchy-ID:controller-list:cgroup-path |
5f8a7eb2 MK |
674 | |
675 | For example: | |
5c2181ad MK |
676 | .nf |
677 | .ft CW | |
678 | ||
679 | 5:cpuacct,cpu,cpuset:/daemons | |
680 | .ft | |
681 | .fi | |
682 | .IP | |
683 | The colon-separated fields are, from left to right: | |
5f8a7eb2 | 684 | .RS |
5c2181ad | 685 | .IP 1. 3 |
5f8a7eb2 MK |
686 | For cgroups version 1 hierarchies, |
687 | this field contains a unique hierarchy ID number | |
688 | that can be matched to a hierarchy ID in | |
689 | .IR /proc/cgroups . | |
690 | For the cgroups version 2 hierarchy, this field contains the value 0. | |
5c2181ad | 691 | .IP 2. |
5f8a7eb2 | 692 | For cgroups version 1 hierarchies, |
55f52de8 | 693 | this field contains a comma-separated list of the controllers |
5f8a7eb2 MK |
694 | bound to the hierarchy. |
695 | For the cgroups version 2 hierarchy, this field is empty. | |
5c2181ad | 696 | .IP 3. |
5f8a7eb2 MK |
697 | This field contains the pathname of the control group in the hierarchy |
698 | to which the process belongs. | |
699 | This pathname is relative to the mount point of the hierarchy. | |
5c2181ad | 700 | .RE |
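The three-field format described above can be parsed with the shell's own field splitting. The sketch below is an illustration under stated assumptions: the function name cgroup_path_for is hypothetical, input is read from standard input, and an empty controller name is used to select the cgroups v2 entry (hierarchy ID 0, empty controller list).

```shell
# Print the cgroup pathname for one controller, given input in
# /proc/[pid]/cgroup format on standard input.
#   $1  controller name, or "" to select the cgroups v2 entry
# Returns 1 if no matching entry is found.
cgroup_path_for() {
    want=$1
    # Split on the first two colons only; the rest of the line,
    # including any further colons, lands in "path".
    while IFS=: read -r id ctrls path; do
        if [ -z "$want" ]; then
            # v2 entry: hierarchy ID 0 and an empty controller list
            if [ "$id" = 0 ] && [ -z "$ctrls" ]; then
                printf '%s\n' "$path"
                return 0
            fi
        else
            # v1 entry: search the comma-separated controller list
            case ",$ctrls," in
                *,"$want",*)
                    printf '%s\n' "$path"
                    return 0
                    ;;
            esac
        fi
    done
    return 1
}
```

For example, cgroup_path_for cpu < /proc/self/cgroup prints the pathname of the calling shell's cgroup in the hierarchy to which the cpu controller is bound, if there is one. With the example entry shown above, the result would be /daemons.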
2e23a9b2 MK |
701 | .SH ERRORS |
702 | The following errors can occur for | |
703 | .BR mount (2): | |
704 | .TP | |
705 | .B EBUSY | |
706 | An attempt to mount a cgroup version 1 filesystem specified neither the | |
707 | .I name= | |
708 | option (to mount a named hierarchy) nor a controller name (or | |
28bcfee9 | 709 | .IR all ). |
15ce4b0c MK |
710 | .SH NOTES |
711 | A child process created via | |
712 | .BR fork (2) | |
713 | inherits its parent's cgroup memberships. | |
714 | A process's cgroup memberships are preserved across | |
715 | .BR execve (2). | |
bbfdf727 | 716 | .SH SEE ALSO |
ebbc83be | 717 | .BR prlimit (1), |
f60a5da2 | 718 | .BR systemd (1), |
325b7eb0 | 719 | .BR clone (2), |
ebbc83be MK |
720 | .BR ioprio_set (2), |
721 | .BR perf_event_open (2), | |
722 | .BR setrlimit (2), | |
cff6de30 | 723 | .BR cgroup_namespaces (7), |
69c47536 | 724 | .BR cpuset (7), |
ebbc83be MK |
725 | .BR namespaces (7), |
726 | .BR sched (7), | |
727 | .BR user_namespaces (7) |