.\" Copyright (C) 2015 Serge Hallyn <serge@hallyn.com>
.\" and Copyright (C) 2016, 2017 Michael Kerrisk <mtk.manpages@gmail.com>
.\"
.\" SPDX-License-Identifier: Linux-man-pages-copyleft
.\"
.TH cgroups 7 (date) "Linux man-pages (unreleased)"
.SH NAME
cgroups \- Linux control groups
.SH DESCRIPTION
Control groups, usually referred to as cgroups,
are a Linux kernel feature which allows processes to
be organized into hierarchical groups whose usage of
various types of resources can then be limited and monitored.
The kernel's cgroup interface is provided through
a pseudo-filesystem called cgroupfs.
Grouping is implemented in the core cgroup kernel code,
while resource tracking and limits are implemented in
a set of per-resource-type subsystems (memory, CPU, and so on).
.\"
.SS Terminology
A
.I cgroup
is a collection of processes that are bound to a set of
limits or parameters defined via the cgroup filesystem.
.P
A
.I subsystem
is a kernel component that modifies the behavior of
the processes in a cgroup.
Various subsystems have been implemented, making it possible to do things
such as limiting the amount of CPU time and memory available to a cgroup,
accounting for the CPU time used by a cgroup,
and freezing and resuming execution of the processes in a cgroup.
Subsystems are sometimes also known as
.I resource controllers
(or simply, controllers).
.P
The cgroups for a controller are arranged in a
.IR hierarchy .
This hierarchy is defined by creating, removing, and
renaming subdirectories within the cgroup filesystem.
At each level of the hierarchy, attributes (e.g., limits) can be defined.
The limits, control, and accounting provided by cgroups generally have
effect throughout the subhierarchy underneath the cgroup where the
attributes are defined.
Thus, for example, the limits placed on
a cgroup at a higher level in the hierarchy cannot be exceeded
by descendant cgroups.
.\"
.SS Cgroups version 1 and version 2
The initial release of the cgroups implementation was in Linux 2.6.24.
Over time, various cgroup controllers have been added
to allow the management of various types of resources.
However, the development of these controllers was largely uncoordinated,
with the result that many inconsistencies arose between controllers
and management of the cgroup hierarchies became rather complex.
A longer description of these problems can be found in the kernel
source file
.I Documentation/admin\-guide/cgroup\-v2.rst
(or
.I Documentation/cgroup\-v2.txt
in Linux 4.17 and earlier).
.P
Because of the problems with the initial cgroups implementation
(cgroups version 1),
starting in Linux 3.10, work began on a new,
orthogonal implementation to remedy these problems.
Initially marked experimental, and hidden behind the
.I "\-o\ __DEVEL__sane_behavior"
mount option, the new version (cgroups version 2)
was eventually made official with the release of Linux 4.5.
Differences between the two versions are described in the text below.
The file
.IR cgroup.sane_behavior ,
present in cgroups v1, is a relic of this mount option.
The file always reports "0" and is only retained for backward compatibility.
.P
Although cgroups v2 is intended as a replacement for cgroups v1,
the older system continues to exist
(and for compatibility reasons is unlikely to be removed).
Currently, cgroups v2 implements only a subset of the controllers
available in cgroups v1.
The two systems are implemented so that both v1 controllers and
v2 controllers can be mounted on the same system.
Thus, for example, it is possible to use those controllers
that are supported under version 2,
while also using version 1 controllers
where version 2 does not yet support those controllers.
The only restriction here is that a controller can't be simultaneously
employed in both a cgroups v1 hierarchy and in the cgroups v2 hierarchy.
.\"
.SH CGROUPS VERSION 1
Under cgroups v1, each controller may be mounted against a separate
cgroup filesystem that provides its own hierarchical organization of the
processes on the system.
It is also possible to comount multiple (or even all) cgroups v1 controllers
against the same cgroup filesystem, meaning that the comounted controllers
manage the same hierarchical organization of processes.
.P
For each mounted hierarchy,
the directory tree mirrors the control group hierarchy.
Each control group is represented by a directory, with each of its child
control groups represented as a child directory.
For instance,
.I /user/joe/1.session
represents control group
.IR 1.session ,
which is a child of cgroup
.IR joe ,
which is a child of
.IR /user .
Under each cgroup directory is a set of files which can be read or
written to, reflecting resource limits and a few general cgroup
properties.
.\"
.SS Tasks (threads) versus processes
In cgroups v1, a distinction is drawn between
.I processes
and
.IR tasks .
In this view, a process can consist of multiple tasks
(more commonly called threads, from a user-space perspective,
and called such in the remainder of this man page).
In cgroups v1, it is possible to independently manipulate
the cgroup memberships of the threads in a process.
.P
The cgroups v1 ability to split threads across different cgroups
caused problems in some cases.
For example, it made no sense for the
.I memory
controller,
since all of the threads of a process share a single address space.
Because of these problems,
the ability to independently manipulate the cgroup memberships
of the threads in a process was removed in the initial cgroups v2
implementation, and subsequently restored in a more limited form
(see the discussion of "thread mode" below).
.\"
.SS Mounting v1 controllers
The use of cgroups requires a kernel built with the
.B CONFIG_CGROUPS
option.
In addition, each of the v1 controllers has an associated
configuration option that must be set in order to employ that controller.
.P
Before a v1 controller can be used,
it must be mounted against a cgroup filesystem.
The usual place for such mounts is under a
.BR tmpfs (5)
filesystem mounted at
.IR /sys/fs/cgroup .
Thus, one might mount the
.I cpu
controller as follows:
.P
.in +4n
.EX
mount \-t cgroup \-o cpu none /sys/fs/cgroup/cpu
.EE
.in
.P
It is possible to comount multiple controllers against the same hierarchy.
For example, here the
.I cpu
and
.I cpuacct
controllers are comounted against a single hierarchy:
.P
.in +4n
.EX
mount \-t cgroup \-o cpu,cpuacct none /sys/fs/cgroup/cpu,cpuacct
.EE
.in
.P
Comounting controllers has the effect that a process is in the same cgroup for
all of the comounted controllers.
Separately mounting controllers allows a process to
be in cgroup
.I /foo1
for one controller while being in
.I /foo2/foo3
for another.
.P
It is possible to comount all v1 controllers against the same hierarchy:
.P
.in +4n
.EX
mount \-t cgroup \-o all cgroup /sys/fs/cgroup
.EE
.in
.P
(One can achieve the same result by omitting
.IR "\-o all" ,
since it is the default if no controllers are explicitly specified.)
.P
It is not possible to mount the same controller
against multiple cgroup hierarchies.
For example, it is not possible to mount both the
.I cpu
and
.I cpuacct
controllers against one hierarchy, and to mount the
.I cpu
controller alone against another hierarchy.
It is possible to create multiple mounts with exactly
the same set of comounted controllers.
However, in this case all that results is multiple mount points
providing a view of the same hierarchy.
.P
Note that on many systems, the v1 controllers are automatically mounted under
.IR /sys/fs/cgroup ;
in particular,
.BR systemd (1)
automatically creates such mounts.
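.P
To see how the controllers are mounted on a particular system,
one can inspect the mount table.
The following is a minimal sketch, not a definitive tool:
the function name is hypothetical, and the optional argument
(a saved copy of the mount table) is an assumption added so that
the function can be tried on a system without cgroup v1 mounts.
The mount options that it prints include the names of
the comounted controllers.

```shell
#!/bin/sh
# list_v1_cgroup_mounts [FILE]
# Print "mount-point<TAB>mount-options" for each filesystem of type
# "cgroup" (that is, cgroups v1) listed in FILE (default: /proc/mounts).
# Fields in /proc/mounts: device, mount point, type, options, dump, pass.
list_v1_cgroup_mounts() {
    awk '$3 == "cgroup" { printf "%s\t%s\n", $2, $4 }' "${1:-/proc/mounts}"
}

# Show the v1 mounts on the running system, if the mount table is readable.
if [ -r /proc/mounts ]; then
    list_v1_cgroup_mounts
fi
```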
.\"
.SS Unmounting v1 controllers
A mounted cgroup filesystem can be unmounted using the
.BR umount (8)
command, as in the following example:
.P
.in +4n
.EX
umount /sys/fs/cgroup/pids
.EE
.in
.P
.IR "But note well" :
a cgroup filesystem is unmounted only if it is not busy,
that is, it has no child cgroups.
If this is not the case, then the only effect of the
.BR umount (8)
is to make the mount invisible.
Thus, to ensure that the mount is really removed,
one must first remove all child cgroups,
which in turn can be done only after all member processes
have been moved from those cgroups to the root cgroup.
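.P
Whether a hierarchy is still busy can be checked before unmounting it.
The sketch below is only an illustration:
both function names are hypothetical, and it assumes a
.BR find (1)
that supports the \-mindepth option (as GNU and BSD versions do).
Child cgroups are listed deepest first, which is the order in which
they would have to be removed.

```shell
#!/bin/sh
# list_child_cgroups MOUNTPOINT
# Print every child cgroup (subdirectory) below MOUNTPOINT, deepest first,
# which is the order in which they would have to be removed with rmdir(1).
list_child_cgroups() {
    find "$1" -mindepth 1 -depth -type d
}

# hierarchy_is_idle MOUNTPOINT
# Succeed only if the hierarchy has no child cgroups left.
hierarchy_is_idle() {
    [ -z "$(list_child_cgroups "$1")" ]
}

# Hypothetical usage (requires privilege):
#   hierarchy_is_idle /sys/fs/cgroup/pids && umount /sys/fs/cgroup/pids
```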
.\"
.SS Cgroups version 1 controllers
Each of the cgroups version 1 controllers is governed
by a kernel configuration option (listed below).
Additionally, the availability of the cgroups feature is governed by the
.B CONFIG_CGROUPS
kernel configuration option.
.TP
.IR cpu " (since Linux 2.6.24; " \fBCONFIG_CGROUP_SCHED\fP )
Cgroups can be guaranteed a minimum number of "CPU shares"
when a system is busy.
This does not limit a cgroup's CPU usage if the CPUs are not busy.
For further information, see
.I Documentation/scheduler/sched\-design\-CFS.rst
(or
.I Documentation/scheduler/sched\-design\-CFS.txt
in Linux 5.2 and earlier).
.IP
In Linux 3.2,
this controller was extended to provide CPU "bandwidth" control.
If the kernel is configured with
.BR CONFIG_CFS_BANDWIDTH ,
then within each scheduling period
(defined via a file in the cgroup directory), it is possible to define
an upper limit on the CPU time allocated to the processes in a cgroup.
This upper limit applies even if there is no other competition for the CPU.
Further information can be found in the kernel source file
.I Documentation/scheduler/sched\-bwc.rst
(or
.I Documentation/scheduler/sched\-bwc.txt
in Linux 5.2 and earlier).
.TP
.IR cpuacct " (since Linux 2.6.24; " \fBCONFIG_CGROUP_CPUACCT\fP )
This provides accounting for CPU usage by groups of processes.
.IP
Further information can be found in the kernel source file
.I Documentation/admin\-guide/cgroup\-v1/cpuacct.rst
(or
.I Documentation/cgroup\-v1/cpuacct.txt
in Linux 5.2 and earlier).
.TP
.IR cpuset " (since Linux 2.6.24; " \fBCONFIG_CPUSETS\fP )
This controller can be used to bind the processes in a cgroup to
a specified set of CPUs and NUMA nodes.
.IP
Further information can be found in the kernel source file
.I Documentation/admin\-guide/cgroup\-v1/cpusets.rst
(or
.I Documentation/cgroup\-v1/cpusets.txt
in Linux 5.2 and earlier).
.
.TP
.IR memory " (since Linux 2.6.25; " \fBCONFIG_MEMCG\fP )
The memory controller supports reporting and limiting of process memory, kernel
memory, and swap used by cgroups.
.IP
Further information can be found in the kernel source file
.I Documentation/admin\-guide/cgroup\-v1/memory.rst
(or
.I Documentation/cgroup\-v1/memory.txt
in Linux 5.2 and earlier).
.TP
.IR devices " (since Linux 2.6.26; " \fBCONFIG_CGROUP_DEVICE\fP )
This supports controlling which processes may create (mknod) devices as
well as open them for reading or writing.
The policies may be specified as allow-lists and deny-lists.
Hierarchy is enforced, so new rules must not
violate existing rules for the target or ancestor cgroups.
.IP
Further information can be found in the kernel source file
.I Documentation/admin\-guide/cgroup\-v1/devices.rst
(or
.I Documentation/cgroup\-v1/devices.txt
in Linux 5.2 and earlier).
.TP
.IR freezer " (since Linux 2.6.28; " \fBCONFIG_CGROUP_FREEZER\fP )
The
.I freezer
controller can suspend and restore (resume) all processes in a cgroup.
Freezing a cgroup
.I /A
also causes its children, for example, processes in
.IR /A/B ,
to be frozen.
.IP
Further information can be found in the kernel source file
.I Documentation/admin\-guide/cgroup\-v1/freezer\-subsystem.rst
(or
.I Documentation/cgroup\-v1/freezer\-subsystem.txt
in Linux 5.2 and earlier).
.TP
.IR net_cls " (since Linux 2.6.29; " \fBCONFIG_CGROUP_NET_CLASSID\fP )
This places a classid, specified for the cgroup, on network packets
created by a cgroup.
These classids can then be used in firewall rules,
as well as used to shape traffic using
.BR tc (8).
This applies only to packets
leaving the cgroup, not to traffic arriving at the cgroup.
.IP
Further information can be found in the kernel source file
.I Documentation/admin\-guide/cgroup\-v1/net_cls.rst
(or
.I Documentation/cgroup\-v1/net_cls.txt
in Linux 5.2 and earlier).
.TP
.IR blkio " (since Linux 2.6.33; " \fBCONFIG_BLK_CGROUP\fP )
The
.I blkio
controller controls and limits access to specified block devices by
applying I/O control in the form of throttling and upper limits against leaf
nodes and intermediate nodes in the storage hierarchy.
.IP
Two policies are available.
The first is a proportional-weight time-based division
of disk implemented with CFQ.
This is in effect for leaf nodes using CFQ.
The second is a throttling policy which specifies
upper I/O rate limits on a device.
.IP
Further information can be found in the kernel source file
.I Documentation/admin\-guide/cgroup\-v1/blkio\-controller.rst
(or
.I Documentation/cgroup\-v1/blkio\-controller.txt
in Linux 5.2 and earlier).
.TP
.IR perf_event " (since Linux 2.6.39; " \fBCONFIG_CGROUP_PERF\fP )
This controller allows
.I perf
monitoring of the set of processes grouped in a cgroup.
.IP
Further information can be found in the kernel source files
.TP
.IR net_prio " (since Linux 3.3; " \fBCONFIG_CGROUP_NET_PRIO\fP )
This allows priorities to be specified, per network interface, for cgroups.
.IP
Further information can be found in the kernel source file
.I Documentation/admin\-guide/cgroup\-v1/net_prio.rst
(or
.I Documentation/cgroup\-v1/net_prio.txt
in Linux 5.2 and earlier).
.TP
.IR hugetlb " (since Linux 3.5; " \fBCONFIG_CGROUP_HUGETLB\fP )
This supports limiting the use of huge pages by cgroups.
.IP
Further information can be found in the kernel source file
.I Documentation/admin\-guide/cgroup\-v1/hugetlb.rst
(or
.I Documentation/cgroup\-v1/hugetlb.txt
in Linux 5.2 and earlier).
.TP
.IR pids " (since Linux 4.3; " \fBCONFIG_CGROUP_PIDS\fP )
This controller permits limiting the number of processes that may be created
in a cgroup (and its descendants).
.IP
Further information can be found in the kernel source file
.I Documentation/admin\-guide/cgroup\-v1/pids.rst
(or
.I Documentation/cgroup\-v1/pids.txt
in Linux 5.2 and earlier).
.TP
.IR rdma " (since Linux 4.11; " \fBCONFIG_CGROUP_RDMA\fP )
The RDMA controller permits limiting the use of
RDMA/IB-specific resources per cgroup.
.IP
Further information can be found in the kernel source file
.I Documentation/admin\-guide/cgroup\-v1/rdma.rst
(or
.I Documentation/cgroup\-v1/rdma.txt
in Linux 5.2 and earlier).
.\"
.SS Creating cgroups and moving processes
A cgroup filesystem initially contains a single root cgroup, '/',
to which all processes belong.
A new cgroup is created by creating a directory in the cgroup filesystem:
.P
.in +4n
.EX
mkdir /sys/fs/cgroup/cpu/cg1
.EE
.in
.P
This creates a new empty cgroup.
.P
A process may be moved to this cgroup by writing its PID into the cgroup's
.I cgroup.procs
file:
.P
.in +4n
.EX
echo $$ > /sys/fs/cgroup/cpu/cg1/cgroup.procs
.EE
.in
.P
Only one PID at a time should be written to this file.
.P
Writing the value 0 to a
.I cgroup.procs
file causes the writing process to be moved to the corresponding cgroup.
.P
When a PID is written into a
.I cgroup.procs
file,
all threads in the process are moved into the new cgroup at once.
.P
Within a hierarchy, a process can be a member of exactly one cgroup.
Writing a process's PID to a
.I cgroup.procs
file automatically removes it from the cgroup of
which it was previously a member.
.P
The
.I cgroup.procs
file can be read to obtain a list of the processes that are
members of a cgroup.
The returned list of PIDs is not guaranteed to be in order.
Nor is it guaranteed to be free of duplicates.
(For example, a PID may be recycled while reading from the list.)
.P
In cgroups v1, an individual thread can be moved to
another cgroup by writing its thread ID
(i.e., the kernel thread ID returned by
.BR clone (2)
and
.BR gettid (2))
to the
.I tasks
file in a cgroup directory.
This file can be read to discover the set of threads
that are members of the cgroup.
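.P
The rules above (one PID per write; an unordered list that may contain
duplicates) can be wrapped in two small helpers.
This is a sketch under stated assumptions rather than a recommended tool:
the function names are hypothetical, and the cgroup directory is assumed
to exist already.

```shell
#!/bin/sh
# move_pids CGROUP_DIR PID...
# Move each PID into the cgroup, issuing one write per PID.
move_pids() {
    cg=$1
    shift
    for pid in "$@"; do
        echo "$pid" > "$cg/cgroup.procs" || return 1
    done
}

# list_members CGROUP_DIR
# Print the member PIDs in numeric order with duplicates removed.
list_members() {
    sort -n -u "$1/cgroup.procs"
}

# Hypothetical usage (requires privilege):
#   mkdir /sys/fs/cgroup/cpu/cg1
#   move_pids /sys/fs/cgroup/cpu/cg1 "$$"
#   list_members /sys/fs/cgroup/cpu/cg1
```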
.\"
.SS Removing cgroups
A cgroup can be removed only if
it has no child cgroups and contains no (nonzombie) processes.
So long as that is the case, one can simply
remove the corresponding directory pathname.
Note that files in a cgroup directory cannot and need not be
removed.
.\"
.SS Cgroups v1 release notification
Two files can be used to determine whether the kernel provides
notifications when a cgroup becomes empty.
A cgroup is considered to be empty when it contains no child
cgroups and no member processes.
.P
A special file in the root directory of each cgroup hierarchy,
.IR release_agent ,
can be used to register the pathname of a program that may be invoked when
a cgroup in the hierarchy becomes empty.
The pathname of the newly empty cgroup (relative to the cgroup mount point)
is provided as the sole command-line argument when the
.I release_agent
program is invoked.
The
.I release_agent
program might remove the cgroup directory,
or perhaps repopulate it with a process.
.P
The default value of the
.I release_agent
file is empty, meaning that no release agent is invoked.
.P
The content of the
.I release_agent
file can also be specified via a mount option when the
cgroup filesystem is mounted:
.P
.in +4n
.EX
mount \-o release_agent=pathname ...
.EE
.in
.P
Whether or not the
.I release_agent
program is invoked when a particular cgroup becomes empty is determined
by the value in the
.I notify_on_release
file in the corresponding cgroup directory.
If this file contains the value 0, then the
.I release_agent
program is not invoked.
If it contains the value 1, the
.I release_agent
program is invoked.
The default value for this file in the root cgroup is 0.
At the time when a new cgroup is created,
the value in this file is inherited from the corresponding file
in the parent cgroup.
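.P
A release agent that simply removes the newly empty cgroup
might look like the following sketch.
It is hypothetical throughout:
the variable CGROUP_MOUNT and its default value are assumptions,
needed because the agent is given only the pathname relative to
the mount point, and the function name is invented for illustration.

```shell
#!/bin/sh
# Hypothetical release agent that removes the cgroup that has just
# become empty.  The agent is passed only the cgroup pathname relative to
# the mount point, so the mount point itself must be known in advance;
# the default used here is an assumption.
CGROUP_MOUNT=${CGROUP_MOUNT:-/sys/fs/cgroup/cpu}

# remove_empty_cgroup RELATIVE_PATH
remove_empty_cgroup() {
    rel=${1#/}                      # tolerate a leading slash
    [ -n "$rel" ] || return 1       # never try to remove the root cgroup
    rmdir "$CGROUP_MOUNT/$rel"
}

# When run as the release agent, there is exactly one argument.
if [ "$#" -eq 1 ]; then
    remove_empty_cgroup "$1"
fi
```

Such a script would take effect only after its pathname has been written to the
.I release_agent
file in the root directory of the hierarchy and the value 1 has been written to the
.I notify_on_release
file of each cgroup that is to be watched.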
.\"
.SS Cgroup v1 named hierarchies
In cgroups v1,
it is possible to mount a cgroup hierarchy that has no attached controllers:
.P
.in +4n
.EX
mount \-t cgroup \-o none,name=somename none /some/mount/point
.EE
.in
.P
Multiple instances of such hierarchies can be mounted;
each hierarchy must have a unique name.
The only purpose of such hierarchies is to track processes.
(See the discussion of release notification above.)
An example of this is the
.I name=systemd
cgroup hierarchy that is used by
.BR systemd (1)
to track services and user sessions.
.P
Since Linux 5.0, the
.I cgroup_no_v1
kernel boot option (described below) can be used to disable cgroup v1
named hierarchies, by specifying
.IR cgroup_no_v1=named .
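.P
The named hierarchies in which a process is being tracked can be read from
its /proc/pid/cgroup file.
The following sketch assumes that each line of that file has the form
hierarchy-ID:controller-list:cgroup-path, and that a named hierarchy
appears with name=somename in the second field;
the function name and the optional file argument are hypothetical.

```shell
#!/bin/sh
# list_named_hierarchies [FILE]
# Print "name=NAME PATH" for each named hierarchy in FILE
# (default: /proc/self/cgroup), whose lines are assumed to have the form
# hierarchy-ID:controller-list:cgroup-path.
list_named_hierarchies() {
    sed -n 's/^[0-9]*:\(name=[^:]*\):\(.*\)$/\1 \2/p' "${1:-/proc/self/cgroup}"
}

# Show the named hierarchies of the current process, if any.
if [ -r /proc/self/cgroup ]; then
    list_named_hierarchies
fi
```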
.\"
.SH CGROUPS VERSION 2
In cgroups v2,
all mounted controllers reside in a single unified hierarchy.
While (different) controllers may be simultaneously
mounted under the v1 and v2 hierarchies,
it is not possible to mount the same controller simultaneously
under both the v1 and the v2 hierarchies.
.P
The new behaviors in cgroups v2 are summarized here,
and in some cases elaborated in the following subsections.
.IP \[bu] 3
Cgroups v2 provides a unified hierarchy against
which all controllers are mounted.
.IP \[bu]
"Internal" processes are not permitted.
With the exception of the root cgroup, processes may reside
only in leaf nodes (cgroups that do not themselves contain child cgroups).
The details are somewhat more subtle than this, and are described below.
.IP \[bu]
Active controllers must be specified via the files
.I cgroup.controllers
and
.IR cgroup.subtree_control .
.IP \[bu]
The
.I tasks
file has been removed.
In addition, the
.I cgroup.clone_children
file that is employed by the
.I cpuset
controller has been removed.
.IP \[bu]
An improved mechanism for notification of empty cgroups is provided by the
.I cgroup.events
file.
.P
For more changes, see the
.I Documentation/admin\-guide/cgroup\-v2.rst
file in the kernel source
(or
.I Documentation/cgroup\-v2.txt
in Linux 4.17 and earlier).
.
.P
Some of the new behaviors listed above saw subsequent modification with
the addition in Linux 4.14 of "thread mode" (described below).
.\"
.SS Cgroups v2 unified hierarchy
In cgroups v1, the ability to mount different controllers
against different hierarchies was intended to allow great flexibility
for application design.
In practice, though,
the flexibility turned out to be less useful than expected,
and in many cases added complexity.
Therefore, in cgroups v2,
all available controllers are mounted against a single hierarchy.
The available controllers are automatically mounted,
meaning that it is not necessary (or possible) to specify the controllers
when mounting the cgroup v2 filesystem using a command such as the following:
.P
.in +4n
.EX
mount \-t cgroup2 none /mnt/cgroup2
.EE
.in
.P
A cgroup v2 controller is available only if it is not currently in use
via a mount against a cgroup v1 hierarchy.
Or, to put things another way, it is not possible to employ
the same controller against both a v1 hierarchy and the unified v2 hierarchy.
This means that it may be necessary first to unmount a v1 controller
(as described above) before that controller is available in v2.
Since
.BR systemd (1)
makes heavy use of some v1 controllers by default,
it can in some cases be simpler to boot the system with
selected v1 controllers disabled.
To do this, specify the
.I cgroup_no_v1=list
option on the kernel boot command line;
.I list
is a comma-separated list of the names of the controllers to disable,
or the word
.I all
to disable all v1 controllers.
(This situation is correctly handled by
.BR systemd (1),
which falls back to operating without the specified controllers.)
.P
Note that on many modern systems,
.BR systemd (1)
automatically mounts the
.I cgroup2
filesystem at
.I /sys/fs/cgroup/unified
during the boot process.
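.P
Both settings discussed above can be inspected on a running system.
The sketch below is illustrative only:
the function names are hypothetical, and the optional file arguments
are assumptions added so that the functions can be tried against
saved copies of /proc/cmdline and /proc/mounts.

```shell
#!/bin/sh
# cgroup_no_v1_setting [FILE]
# Print the value of the cgroup_no_v1= boot option found in FILE
# (default: /proc/cmdline); print nothing if the option is absent.
cgroup_no_v1_setting() {
    tr ' ' '\n' < "${1:-/proc/cmdline}" | sed -n 's/^cgroup_no_v1=//p'
}

# cgroup2_mount_points [FILE]
# Print the mount point of each cgroup2 filesystem listed in FILE
# (default: /proc/mounts).
cgroup2_mount_points() {
    awk '$3 == "cgroup2" { print $2 }' "${1:-/proc/mounts}"
}
```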
.\"
.SS Cgroups v2 mount options
The following options
.RI ( mount\~\-o )
can be specified when mounting the cgroup v2 filesystem:
.TP
.IR nsdelegate " (since Linux 4.15)"
Treat cgroup namespaces as delegation boundaries.
For details, see below.
.TP
.IR memory_localevents " (since Linux 5.2)"
.\" commit 9852ae3fe5293264f01c49f2571ef7688f7823ce
The
.I memory.events
file should show statistics only for the cgroup itself,
and not for any descendant cgroups.
This was the behavior before Linux 5.2.
Starting in Linux 5.2,
the default behavior is to include statistics for descendant cgroups in
.IR memory.events ,
and this mount option can be used to revert to the legacy behavior.
This option is system wide and can be set on mount or
modified through remount only from the initial mount namespace;
it is silently ignored in noninitial namespaces.
.\"
.SS Cgroups v2 controllers
The following controllers, documented in the kernel source file
.I Documentation/admin\-guide/cgroup\-v2.rst
(or
.I Documentation/cgroup\-v2.txt
in Linux 4.17 and earlier),
are supported in cgroups version 2:
.TP
.IR cpu " (since Linux 4.15)"
This is the successor to the version 1
.I cpu
and
.I cpuacct
controllers.
.TP
.IR cpuset " (since Linux 5.0)"
This is the successor of the version 1
.I cpuset
controller.
.TP
.IR freezer " (since Linux 5.2)"
.\" commit 76f969e8948d82e78e1bc4beb6b9465908e74873
This is the successor of the version 1
.I freezer
controller.
.TP
.IR hugetlb " (since Linux 5.6)"
This is the successor of the version 1
.I hugetlb
controller.
.TP
.IR io " (since Linux 4.5)"
This is the successor of the version 1
.I blkio
controller.
.TP
.IR memory " (since Linux 4.5)"
This is the successor of the version 1
.I memory
controller.
.TP
.IR perf_event " (since Linux 4.11)"
This is the same as the version 1
.I perf_event
controller.
.TP
.IR pids " (since Linux 4.5)"
This is the same as the version 1
.I pids
controller.
.TP
.IR rdma " (since Linux 4.11)"
This is the same as the version 1
.I rdma
controller.
.P
There is no direct equivalent of the
.I net_cls
and
.I net_prio
controllers from cgroups version 1.
Instead, support has been added to
.BR iptables (8)
to allow eBPF filters that hook on cgroup v2 pathnames to make decisions
about network traffic on a per-cgroup basis.
.P
The v2
.I devices
controller provides no interface files;
instead, device control is gated by attaching an eBPF
.RB ( BPF_CGROUP_DEVICE )
program to a v2 cgroup.
.\"
.SS Cgroups v2 subtree control
Each cgroup in the v2 hierarchy contains the following two files:
.TP
.I cgroup.controllers
This read-only file exposes a list of the controllers that are
.I available
in this cgroup.
The contents of this file match the contents of the
.I cgroup.subtree_control
file in the parent cgroup.
.TP
.I cgroup.subtree_control
This is a list of controllers that are
.I active
.RI ( enabled )
in the cgroup.
The set of controllers in this file is a subset of the set in the
.I cgroup.controllers
file of this cgroup.
The set of active controllers is modified by writing strings to this file
containing space-delimited controller names,
each preceded by '+' (to enable a controller)
or '\-' (to disable a controller), as in the following example:
.IP
.in +4n
.EX
echo \[aq]+pids \-memory\[aq] > x/y/cgroup.subtree_control
.EE
.in
.IP
An attempt to enable a controller
that is not present in
.I cgroup.controllers
leads to an
.B ENOENT
error when writing to the
.I cgroup.subtree_control
file.
.P
Because the list of controllers in
.I cgroup.subtree_control
is a subset of those in
.IR cgroup.controllers ,
a controller that has been disabled in one cgroup in the hierarchy
can never be re-enabled in the subtree below that cgroup.
.P
A cgroup's
.I cgroup.subtree_control
file determines the set of controllers that are exercised in the
.I child
cgroups.
When a controller (e.g.,
.IR pids )
is present in the
.I cgroup.subtree_control
file of a parent cgroup,
then the corresponding controller-interface files (e.g.,
.IR pids.max )
are automatically created in the children of that cgroup
and can be used to exert resource control in the child cgroups.
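.P
Because enabling an unavailable controller fails,
a script may prefer to check for availability first.
The following is a minimal sketch:
the function name is hypothetical, and it assumes that
.I cgroup.controllers
holds a space-separated list of controller names.

```shell
#!/bin/sh
# enable_controller CGROUP_DIR CONTROLLER
# Enable CONTROLLER for the children of the cgroup, but only if it is
# listed as available in the cgroup's cgroup.controllers file.
enable_controller() {
    cg=$1
    ctrl=$2
    if tr ' ' '\n' < "$cg/cgroup.controllers" | grep -qxF -- "$ctrl"; then
        echo "+$ctrl" > "$cg/cgroup.subtree_control"
    else
        echo "$ctrl: not listed in $cg/cgroup.controllers" >&2
        return 1
    fi
}

# Hypothetical usage (requires privilege):
#   enable_controller /sys/fs/cgroup/unified pids
```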
21f0d132 807.\"
2468f14e
MK
808.SS Cgroups v2 """no internal processes""" rule
809Cgroups v2 enforces a so-called "no internal processes" rule.
810Roughly speaking, this rule means that,
811with the exception of the root cgroup, processes may reside
812only in leaf nodes (cgroups that do not themselves contain child cgroups).
813This avoids the need to decide how to partition resources between
814processes which are members of cgroup A and processes in child cgroups of A.
c6d039a3 815.P
816For instance, if cgroup
817.I /cg1/cg2
818exists, then a process may reside in
819.IR /cg1/cg2 ,
820but not in
821.IR /cg1 .
822This is to avoid an ambiguity in cgroups v1
823with respect to the delegation of resources between processes in
824.I /cg1
825and its child cgroups.
826The recommended approach in cgroups v2 is to create a subdirectory called
827.I leaf
for any nonleaf cgroup that should contain processes;
that subdirectory holds the processes and has no child cgroups of its own.
829Thus, processes which previously would have gone into
830.I /cg1
831would now go into
832.IR /cg1/leaf .
833This has the advantage of making explicit
834the relationship between processes in
835.I /cg1/leaf
836and
837.IR /cg1 's
838other children.
c6d039a3 839.P
840The "no internal processes" rule is in fact more subtle than stated above.
841More precisely, the rule is that a (nonroot) cgroup can't both
842(1) have member processes, and
36546c38 843(2) distribute resources into child cgroups\[em]that is, have a nonempty
844.I cgroup.subtree_control
845file.
846Thus, it
847.I is
848possible for a cgroup to have both member processes and child cgroups,
849but before controllers can be enabled for that cgroup,
850the member processes must be moved out of the cgroup
851(e.g., perhaps into the child cgroups).
c6d039a3 852.P
853With the Linux 4.14 addition of "thread mode" (described below),
854the "no internal processes" rule has been relaxed in some cases.
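.P
The more precise form of the rule can be expressed as a small predicate.
This is only a sketch of the rule as stated in this section:
the function name is hypothetical and thread mode is not modeled.

```python
def violates_no_internal_processes(is_root, has_member_processes, subtree_control):
    """Return True if the described state breaks the rule.

    subtree_control is the set of controllers enabled for child cgroups
    (the contents of cgroup.subtree_control). Thread mode is not modeled.
    """
    if is_root:
        return False  # the root cgroup is exempt from the rule
    # A nonroot cgroup may not have member processes and also
    # distribute resources into child cgroups.
    return bool(has_member_processes) and bool(subtree_control)
```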
2468f14e 855.\"
754f4cf5 856.SS Cgroups v2 cgroup.events file
857Each nonroot cgroup in the v2 hierarchy contains a read-only file,
858.IR cgroup.events ,
859whose contents are key-value pairs
754f4cf5 860(delimited by newline characters, with the key and value separated by spaces)
e00e18a2 861providing state information about the cgroup:
c6d039a3 862.P
863.in +4n
864.EX
865$ \fBcat mygrp/cgroup.events\fP
866populated 1
c309dee7 867frozen 0
868.EE
869.in
c6d039a3 870.P
871The following keys may appear in this file:
872.TP
1ae6b2c7 873.I populated
874The value of this key is either 1,
875if this cgroup or any of its descendants has member processes,
876or otherwise 0.
877.TP
878.IR frozen " (since Linux 5.2)"
879.\" commit 76f969e8948d82e78e1bc4beb6b9465908e7487
880The value of this key is 1 if this cgroup is currently frozen,
881or 0 if it is not.
c6d039a3 882.P
754f4cf5 883The
1ae6b2c7 884.I cgroup.events
885file can be monitored, in order to receive notification when the value of
886one of its keys changes.
887Such monitoring can be done using
754f4cf5 888.BR inotify (7),
71e2545e 889which notifies changes as
1ae6b2c7 890.B IN_MODIFY
71e2545e 891events, or
754f4cf5 892.BR poll (2),
71e2545e 893which notifies changes by returning the
754f4cf5 894.B POLLPRI
895and
896.B POLLERR
71e2545e 897bits in the
1ae6b2c7 898.I revents
7747ed97 899field.
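.P
The monitoring technique based on
.BR poll (2)
might be sketched in Python as follows.
The helper names are hypothetical,
and the code assumes a Linux system where
.I select.poll
is available;
only the parsing helper is exercised without a real cgroup filesystem.

```python
import select


def parse_events(text):
    """Parse the key-value lines of a cgroup.events file into a dict."""
    state = {}
    for line in text.splitlines():
        if line.strip():
            key, value = line.split()
            state[key] = int(value)
    return state


def wait_for_change(path):
    """Block until a change is signaled on path, then return the new state."""
    with open(path) as f:
        f.read()  # consume the current contents before waiting
        poller = select.poll()
        poller.register(f, select.POLLPRI | select.POLLERR)
        poller.poll()  # returns once POLLPRI/POLLERR is reported
        f.seek(0)      # reread the file from the start
        return parse_events(f.read())
```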
900.\"
901.SS Cgroup v2 release notification
902Cgroups v2 provides a new mechanism for obtaining notification
903when a cgroup becomes empty.
904The cgroups v1
1ae6b2c7 905.I release_agent
71e2545e 906and
1ae6b2c7 907.I notify_on_release
71e2545e 908files are removed, and replaced by the
ccb1a262 909.I populated
71e2545e 910key in the
1ae6b2c7 911.I cgroup.events
912file.
913This key either has the value 0,
914meaning that the cgroup (and its descendants)
915contain no (nonzombie) member processes,
916or 1, meaning that the cgroup (or one of its descendants)
917contains member processes.
c6d039a3 918.P
71e2545e 919The cgroups v2 release-notification mechanism
daf57a6a 920offers the following advantages over the cgroups v1
1ae6b2c7 921.I release_agent
daf57a6a 922mechanism:
cdede5cd 923.IP \[bu] 3
daf57a6a 924It allows for cheaper notification,
754f4cf5 925since a single process can monitor multiple
1ae6b2c7 926.I cgroup.events
71e2545e 927files (using the techniques described earlier).
928By contrast, the cgroups v1 mechanism requires the expense of creating
929a process for each notification.
cdede5cd 930.IP \[bu]
931Notification for different cgroup subhierarchies can be delegated
932to different processes.
933By contrast, the cgroups v1 mechanism allows only one release agent
934for an entire hierarchy.
c91a9f8a 935.\"
936.SS Cgroups v2 cgroup.stat file
937.\" commit ec39225cca42c05ac36853d11d28f877fde5c42e
938Each cgroup in the v2 hierarchy contains a read-only
1ae6b2c7 939.I cgroup.stat
940file (first introduced in Linux 4.14)
941that consists of lines containing key-value pairs.
942The following keys currently appear in this file:
943.TP
944.I nr_descendants
945This is the total number of visible (i.e., living) descendant cgroups
946underneath this cgroup.
947.TP
948.I nr_dying_descendants
949This is the total number of dying descendant cgroups
950underneath this cgroup.
951A cgroup enters the dying state after being deleted.
952It remains in that state for an undefined period
953(which will depend on system load)
954while resources are freed before the cgroup is destroyed.
955Note that the presence of some cgroups in the dying state is normal,
956and is not indicative of any problem.
957.IP
958A process can't be made a member of a dying cgroup,
959and a dying cgroup can't be brought back to life.
960.\"
961.SS Limiting the number of descendant cgroups
962Each cgroup in the v2 hierarchy contains the following files,
963which can be used to view and set limits on the number
964of descendant cgroups under that cgroup:
965.TP
966.IR cgroup.max.depth " (since Linux 4.14)"
967.\" commit 1a926e0bbab83bae8207d05a533173425e0496d1
968This file defines a limit on the depth of nesting of descendant cgroups.
969A value of 0 in this file means that no descendant cgroups can be created.
970An attempt to create a descendant whose nesting level exceeds
971the limit fails
972.RI ( mkdir (2)
973fails with the error
974.BR EAGAIN ).
975.IP
976Writing the string
1ae6b2c7 977.I """max"""
978to this file means that no limit is imposed.
979The default value in this file is
21259e57 980.IR """max""" .
981.TP
982.IR cgroup.max.descendants " (since Linux 4.14)"
983.\" commit 1a926e0bbab83bae8207d05a533173425e0496d1
984This file defines a limit on the number of live descendant cgroups that
985this cgroup may have.
986An attempt to create more descendants than allowed by the limit fails
987.RI ( mkdir (2)
988fails with the error
989.BR EAGAIN ).
990.IP
991Writing the string
1ae6b2c7 992.I """max"""
993to this file means that no limit is imposed.
994The default value in this file is
995.IR """max""" .
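.P
The two limits can be modeled with a short Python sketch.
The function names are hypothetical,
and the sketch assumes that a direct child of the limited cgroup
counts as nesting level 1
(consistent with a depth limit of 0 forbidding any descendants).

```python
def parse_limit(text):
    """Interpret the contents of cgroup.max.depth or cgroup.max.descendants.

    Returns None for "max" (no limit), otherwise the limit as an int.
    """
    text = text.strip()
    return None if text == "max" else int(text)


def mkdir_would_succeed(max_depth, max_descendants, new_level, live_descendants):
    """Model whether creating one more descendant cgroup is allowed.

    new_level:        nesting level of the new cgroup below the limited cgroup
    live_descendants: number of live descendants that already exist
    """
    if max_depth is not None and new_level > max_depth:
        return False  # mkdir(2) would fail with EAGAIN
    if max_descendants is not None and live_descendants + 1 > max_descendants:
        return False  # mkdir(2) would fail with EAGAIN
    return True
```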
996.\"
4b1c2041 997.SH CGROUPS DELEGATION: DELEGATING A HIERARCHY TO A LESS PRIVILEGED USER
998In the context of cgroups,
999delegation means passing management of some subtree
51629a30 1000of the cgroup hierarchy to a nonprivileged user.
1001Cgroups v1 provides support for delegation based on file permissions
1002in the cgroup hierarchy but with less strict containment rules than v2
1003(as noted below).
1004Cgroups v2 supports delegation with containment by explicit design.
1005The focus of the discussion in this section is on delegation in cgroups v2,
1006with some differences for cgroups v1 noted along the way.
c6d039a3 1007.P
1008Some terminology is required in order to describe delegation.
1009A
1010.I delegater
1011is a privileged user (i.e., root) who owns a parent cgroup.
1012A
1013.I delegatee
1014is a nonprivileged user who will be granted the permissions needed
1015to manage some subhierarchy under that parent cgroup,
1016known as the
1017.IR "delegated subtree" .
c6d039a3 1018.P
1019To perform delegation,
1020the delegater makes certain directories and files writable by the delegatee,
1021typically by changing the ownership of the objects to be the user ID
1022of the delegatee.
1023Assuming that we want to delegate the hierarchy rooted at (say)
1024.I /dlgt_grp
1025and that there are not yet any child cgroups under that cgroup,
1026the ownership of the following is changed to the user ID of the delegatee:
1027.TP
1ae6b2c7 1028.I /dlgt_grp
1029Changing the ownership of the root of the subtree means that any new
1030cgroups created under the subtree (and the files they contain)
1031will also be owned by the delegatee.
1032.TP
1ae6b2c7 1033.I /dlgt_grp/cgroup.procs
f7286edc 1034Changing the ownership of this file means that the delegatee
1035can move processes into the root of the delegated subtree.
1036.TP
4b1c2041 1037.IR /dlgt_grp/cgroup.subtree_control " (cgroups v2 only)"
15f2303d 1038Changing the ownership of this file means that the delegatee
e5936eb6 1039can enable controllers (that are present in
0735069b 1040.IR /dlgt_grp/cgroup.controllers )
4242dfbe 1041in order to further redistribute resources at lower levels in the subtree.
1042(As an alternative to changing the ownership of this file,
1043the delegater might instead add selected controllers to this file.)
639b6c8c 1044.TP
4b1c2041 1045.IR /dlgt_grp/cgroup.threads " (cgroups v2 only)"
1046Changing the ownership of this file is necessary if a threaded subtree
1047is being delegated (see the description of "thread mode", below).
7b327dd5 1048This permits the delegatee to write thread IDs to the file.
1049(The ownership of this file can also be changed when delegating
1050a domain subtree, but currently this serves no purpose,
1051since, as described below, it is not possible to move a thread between
1052domain cgroups by writing its thread ID to the
1ae6b2c7 1053.I cgroup.threads
cd7f4c49 1054file.)
1055.IP
1056In cgroups v1, the corresponding file that should instead be delegated is the
1057.I tasks
1058file.
c6d039a3 1059.P
The delegater should
.I not
change the ownership of any of the controller interface files (e.g.,
.IR pids.max ,
.IR memory.high )
in
.IR /dlgt_grp .
1067Those files are used from the next level above the delegated subtree
1068in order to distribute resources into the subtree,
1069and the delegatee should not have permission to change
1070the resources that are distributed into the delegated subtree.
c6d039a3 1071.P
668ef765 1072See also the discussion of the
1ae6b2c7 1073.I /sys/kernel/cgroup/delegate
4b1c2041 1074file in NOTES for information about further delegatable files in cgroups v2.
c6d039a3 1075.P
1076After the aforementioned steps have been performed,
1077the delegatee can create child cgroups within the delegated subtree
1078(the cgroup subdirectories and the files they contain
1079will be owned by the delegatee)
1080and move processes between cgroups in the subtree.
If some controllers are present in
.IR /dlgt_grp/cgroup.subtree_control ,
4242dfbe 1083or the ownership of that file was passed to the delegatee,
f7286edc 1084the delegatee can also control the further redistribution
4242dfbe 1085of the corresponding resources into the delegated subtree.
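.P
The ownership changes described above could be scripted along the following lines.
This is a minimal sketch, not a complete delegation tool:
the function name and the file list constant are hypothetical,
and the caller is assumed to have the privilege needed to change ownership.

```python
import os

# Files whose ownership is passed to the delegatee, per the list above.
# Some of these exist only in cgroups v2, so missing files are skipped.
DELEGATED_FILES = ("cgroup.procs", "cgroup.subtree_control", "cgroup.threads")


def delegate(cgroup_dir, uid, gid=-1):
    """Change ownership of cgroup_dir and the delegatable files inside it.

    Controller interface files (e.g., pids.max) are deliberately left alone.
    A gid of -1 leaves the group ownership unchanged.
    Returns the list of paths whose ownership was changed.
    """
    os.chown(cgroup_dir, uid, gid)
    changed = [cgroup_dir]
    for name in DELEGATED_FILES:
        path = os.path.join(cgroup_dir, name)
        if os.path.exists(path):
            os.chown(path, uid, gid)
            changed.append(path)
    return changed
```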
27b086e9 1086.\"
ed3f4f34 1087.SS Cgroups v2 delegation: nsdelegate and cgroup namespaces
1088Starting with Linux 4.13,
1089.\" commit 5136f6365ce3eace5a926e10f16ed2a233db5ba9
4b1c2041 1090there is a second way to perform cgroup delegation in the cgroups v2 hierarchy.
07361828 1091This is done by mounting or remounting the cgroup v2 filesystem with the
ed3f4f34 1092.I nsdelegate
1093mount option.
1094For example, if the cgroup v2 filesystem has already been mounted,
1095we can remount it with the
1096.I nsdelegate
1097option as follows:
c6d039a3 1098.P
1099.in +4n
1100.EX
fb6d2c09 1101mount \-t cgroup2 \-o remount,nsdelegate \e
07361828 1102 none /sys/fs/cgroup/unified
1103.EE
1104.in
07361828 1105.\"
6a0aa2ec 1106.\" Alternatively, we could boot the kernel with the options:
1107.\"
1108.\" cgroup_no_v1=all systemd.legacy_systemd_cgroup_controller
1109.\"
1110.\" The effect of the latter option is to prevent systemd from employing
1111.\" its "hybrid" cgroup mode, where it tries to make use of cgroups v2.
c6d039a3 1112.P
dc581e07 1113The effect of this mount option is to cause cgroup namespaces
1114to automatically become delegation boundaries.
1115More specifically,
1116the following restrictions apply for processes inside the cgroup namespace:
cdede5cd 1117.IP \[bu] 3
446d1643 1118Writes to controller interface files in the root directory of the namespace
1119will fail with the error
1120.BR EPERM .
1121Processes inside the cgroup namespace can still write to delegatable
446d1643 1122files in the root directory of the cgroup namespace such as
1ae6b2c7 1123.I cgroup.procs
1124and
1125.IR cgroup.subtree_control ,
and can create a subhierarchy underneath the root directory.
cdede5cd 1127.IP \[bu]
1128Attempts to migrate processes across the namespace boundary are denied
1129(with the error
1130.BR ENOENT ).
1131Processes inside the cgroup namespace can still
1132(subject to the containment rules described below)
1133move processes between cgroups
1134.I within
1135the subhierarchy under the namespace root.
c6d039a3 1136.P
1137The ability to define cgroup namespaces as delegation boundaries
1138makes cgroup namespaces more useful.
1139To understand why, suppose that we already have one cgroup hierarchy
1140that has been delegated to a nonprivileged user,
1141.IR cecilia ,
1142using the older delegation technique described above.
1143Suppose further that
1144.I cecilia
1145wanted to further delegate a subhierarchy
1146under the existing delegated hierarchy.
1147(For example, the delegated hierarchy might be associated with
1148an unprivileged container run by
1149.IR cecilia .)
1150Even if a cgroup namespace was employed,
1151because both hierarchies are owned by the unprivileged user
1152.IR cecilia ,
1153the following illegitimate actions could be performed:
cdede5cd 1154.IP \[bu] 3
ed3f4f34 1155A process in the inferior hierarchy could change the
619dbe1c 1156resource controller settings in the root directory of that hierarchy.
1157(These resource controller settings are intended to allow control to
1158be exercised from the
1159.I parent
1160cgroup;
1161a process inside the child cgroup should not be allowed to modify them.)
cdede5cd 1162.IP \[bu]
1163A process inside the inferior hierarchy could move processes
1164into and out of the inferior hierarchy if the cgroups in the
1165superior hierarchy were somehow visible.
c6d039a3 1166.P
1167Employing the
1168.I nsdelegate
1169mount option prevents both of these possibilities.
c6d039a3 1170.P
1171The
1172.I nsdelegate
1173mount option only has an effect when performed in
1174the initial mount namespace;
1175in other mount namespaces, the option is silently ignored.
c6d039a3 1176.P
1177.IR Note :
1178On some systems,
1179.BR systemd (1)
1180automatically mounts the cgroup v2 filesystem.
1181In order to experiment with the
1182.I nsdelegate
1183operation, it may be useful to boot the kernel with
1184the following command-line options:
c6d039a3 1185.P
1186.in +4n
1187.EX
1188cgroup_no_v1=all systemd.legacy_systemd_cgroup_controller
1189.EE
1190.in
c6d039a3 1191.P
These options cause the kernel to boot with the cgroups v1 controllers
disabled (meaning that the controllers are available in the v2 hierarchy),
and tell
1195.BR systemd (1)
1196not to mount and use the cgroup v2 hierarchy,
1197so that the v2 hierarchy can be manually mounted
1198with the desired options after boot-up.
ed3f4f34 1199.\"
4b1c2041 1200.SS Cgroup delegation containment rules
4242dfbe 1201Some delegation
1ae6b2c7 1202.I containment rules
1203ensure that the delegatee can move processes between cgroups within the
1204delegated subtree,
1205but can't move processes from outside the delegated subtree into
1206the subtree or vice versa.
1207A nonprivileged process (i.e., the delegatee) can write the PID of
1208a "target" process into a
1ae6b2c7 1209.I cgroup.procs
4242dfbe 1210file only if all of the following are true:
cdede5cd 1211.IP \[bu] 3
1212The writer has write permission on the
1213.I cgroup.procs
1214file in the destination cgroup.
cdede5cd 1215.IP \[bu]
1216The writer has write permission on the
1217.I cgroup.procs
396761ee 1218file in the nearest common ancestor of the source and destination cgroups.
1219Note that in some cases,
1220the nearest common ancestor may be the source or destination cgroup itself.
1221This requirement is not enforced for cgroups v1 hierarchies,
1222with the consequence that containment in v1 is less strict than in v2.
1223(For example, in cgroups v1 the user that owns two distinct
1224delegated subhierarchies can move a process between the hierarchies.)
cdede5cd 1225.IP \[bu]
1226If the cgroup v2 filesystem was mounted with the
1227.I nsdelegate
7b574df5 1228option, the writer must be able to see the source and destination cgroups
ed3f4f34 1229from its cgroup namespace.
cdede5cd 1230.IP \[bu]
4b1c2041 1231In cgroups v1:
1232the effective UID of the writer (i.e., the delegatee) matches the
1233real user ID or the saved set-user-ID of the target process.
Before Linux 4.11,
.\" commit 576dd464505fc53d501bb94569db76f220104d28
this requirement also applied in cgroups v2.
(This was a historical requirement inherited from cgroups v1
1238that was later deemed unnecessary,
1239since the other rules suffice for containment in cgroups v2.)
c6d039a3 1240.P
1241.IR Note :
1242one consequence of these delegation containment rules is that the
1243unprivileged delegatee can't place the first process into
1244the delegated subtree;
1245instead, the delegater must place the first process
1246(a process owned by the delegatee) into the delegated subtree.
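.P
The first two containment rules for cgroups v2 can be modeled as follows.
This Python sketch treats cgroups as path strings and represents
the writer's permissions as a set of cgroup paths;
both the representation and the function names are assumptions made for illustration,
and the
.I nsdelegate
visibility rule is not modeled.

```python
import os.path


def nearest_common_ancestor(src, dst):
    """Return the nearest common ancestor of two cgroup paths."""
    return os.path.commonpath([src, dst])


def may_move(writable_procs, src, dst):
    """Model the cgroup.procs write-permission checks for a migration.

    writable_procs: set of cgroup paths whose cgroup.procs file
                    the writer has permission to write
    src, dst:       source and destination cgroup paths
    """
    ancestor = nearest_common_ancestor(src, dst)
    return dst in writable_procs and ancestor in writable_procs
```

With a delegated subtree rooted at
.IR /dlgt_grp ,
a move between two cgroups inside the subtree passes the check,
while a move from a cgroup outside the subtree fails it,
because the nearest common ancestor lies outside the delegated subtree.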
4242dfbe 1247.\"
75e83bc2 1248.SH CGROUPS VERSION 2 THREAD MODE
1249Among the restrictions imposed by cgroups v2 that were not present
1250in cgroups v1 are the following:
cdede5cd 1251.IP \[bu] 3
1252.IR "No thread-granularity control" :
1253all of the threads of a process must be in the same cgroup.
cdede5cd 1254.IP \[bu]
1255.IR "No internal processes" :
1256a cgroup can't both have member processes and
1257exercise controllers on child cgroups.
c6d039a3 1258.P
1259Both of these restrictions were added because
1260the lack of these restrictions had caused problems
1261in cgroups v1.
1262In particular, the cgroups v1 ability to allow thread-level granularity
1263for cgroup membership made no sense for some controllers.
1264(A notable example was the
1265.I memory
1266controller: since threads share an address space,
1267it made no sense to split threads across different
1268.I memory
1269cgroups.)
c6d039a3 1270.P
1271Notwithstanding the initial design decision in cgroups v2,
1272there were use cases for certain controllers, notably the
1ae6b2c7 1273.I cpu
1274controller,
1275for which thread-level granularity of control was meaningful and useful.
1276To accommodate such use cases, Linux 4.14 added
1277.I "thread mode"
1278for cgroups v2.
c6d039a3 1279.P
c8902e25 1280Thread mode allows the following:
cdede5cd 1281.IP \[bu] 3
c8902e25 1282The creation of
1ae6b2c7 1283.I threaded subtrees
1284in which the threads of a process may
1285be spread across cgroups inside the tree.
1286(A threaded subtree may contain multiple multithreaded processes.)
cdede5cd 1287.IP \[bu]
c8902e25 1288The concept of
1ae6b2c7 1289.IR "threaded controllers" ,
c8902e25 1290which can distribute resources across the cgroups in a threaded subtree.
cdede5cd 1291.IP \[bu]
A relaxation of the "no internal processes" rule,
1293so that, within a threaded subtree,
1294a cgroup can both contain member threads and
1295exercise resource control over child cgroups.
c6d039a3 1296.P
1297With the addition of thread mode,
1298each nonroot cgroup now contains a new file,
1299.IR cgroup.type ,
1300that exposes, and in some circumstances can be used to change,
1301the "type" of a cgroup.
1302This file contains one of the following type values:
1303.TP
1ae6b2c7 1304.I domain
1305This is a normal v2 cgroup that provides process-granularity control.
1306If a process is a member of this cgroup,
1307then all threads of the process are (by definition) in the same cgroup.
1308This is the default cgroup type,
1309and provides the same behavior that was provided for
1310cgroups in the initial cgroups v2 implementation.
1311.TP
1ae6b2c7 1312.I threaded
1313This cgroup is a member of a threaded subtree.
1314Threads can be added to this cgroup,
1315and controllers can be enabled for the cgroup.
1316.TP
1ae6b2c7 1317.I domain threaded
1318This is a domain cgroup that serves as the root of a threaded subtree.
1319This cgroup type is also known as "threaded root".
1320.TP
1ae6b2c7 1321.I domain invalid
1322This is a cgroup inside a threaded subtree
1323that is in an "invalid" state.
1324Processes can't be added to the cgroup,
1325and controllers can't be enabled for the cgroup.
1326The only thing that can be done with this cgroup (other than deleting it)
1327is to convert it to a
1ae6b2c7 1328.I threaded
c8902e25 1329cgroup by writing the string
1ae6b2c7 1330.I """threaded"""
1331to the
1332.I cgroup.type
1333file.
1334.IP
1335The rationale for the existence of this "interim" type
1336during the creation of a threaded subtree
1337(rather than the kernel simply immediately converting all cgroups
1338under the threaded root to the type
1339.IR threaded )
is to allow for
possible future extensions to the thread mode model.
1342.\"
1343.SS Threaded versus domain controllers
With the addition of thread mode,
1345cgroups v2 now distinguishes two types of resource controllers:
cdede5cd 1346.IP \[bu] 3
c8902e25 1347.I Threaded
2cd9bbfa 1348.\" In the kernel source, look for ".threaded[ \t]*= true" in
218eadf4 1349.\" initializations of struct cgroup_subsys
1350controllers: these controllers support thread-granularity for
1351resource control and can be enabled inside threaded subtrees,
1352with the result that the corresponding controller-interface files
1353appear inside the cgroups in the threaded subtree.
aa2c3623 1354As at Linux 4.19, the following controllers are threaded:
1355.IR cpu ,
1356.IR perf_event ,
1357and
1358.IR pids .
cdede5cd 1359.IP \[bu]
1360.I Domain
1361controllers: these controllers support only process granularity
1362for resource control.
1363From the perspective of a domain controller,
1364all threads of a process are always in the same cgroup.
1365Domain controllers can't be enabled inside a threaded subtree.
1366.\"
1367.SS Creating a threaded subtree
1368There are two pathways that lead to the creation of a threaded subtree.
1369The first pathway proceeds as follows:
22356d97 1370.IP (1) 5
c8902e25 1371We write the string
1ae6b2c7 1372.I """threaded"""
1373to the
1374.I cgroup.type
1375file of a cgroup
1ae6b2c7 1376.I y/z
1377that currently has the type
1378.IR domain .
1379This has the following effects:
1380.RS
cdede5cd 1381.IP \[bu] 3
c8902e25 1382The type of the cgroup
1ae6b2c7 1383.I y/z
1384becomes
1385.IR threaded .
cdede5cd 1386.IP \[bu]
1387The type of the parent cgroup,
1388.IR y ,
1389becomes
1390.IR "domain threaded" .
1391The parent cgroup is the root of a threaded subtree
1392(also known as the "threaded root").
cdede5cd 1393.IP \[bu]
c8902e25 1394All other cgroups under
1ae6b2c7 1395.I y
c8902e25 1396that were not already of type
1ae6b2c7 1397.I threaded
1398(because they were inside already existing threaded subtrees
1399under the new threaded root)
1400are converted to type
1401.IR "domain invalid" .
1402Any subsequently created cgroups under
1403.I y
1404will also have the type
1405.IR "domain invalid" .
1406.RE
22356d97 1407.IP (2)
c8902e25 1408We write the string
1ae6b2c7 1409.I """threaded"""
c8902e25 1410to each of the
1ae6b2c7 1411.I domain invalid
1412cgroups under
1413.IR y ,
1414in order to convert them to the type
1415.IR threaded .
As a consequence of this step, all cgroups under the threaded root
1417now have the type
1ae6b2c7 1418.I threaded
1419and the threaded subtree is now fully usable.
1420The requirement to write
1ae6b2c7 1421.I """threaded"""
1422to each of these cgroups is somewhat cumbersome,
1423but allows for possible future extensions to the thread-mode model.
c6d039a3 1424.P
c8902e25 1425The second way of creating a threaded subtree is as follows:
22356d97 1426.IP (1) 5
1427In an existing cgroup,
1428.IR z ,
1429that currently has the type
1430.IR domain ,
1431we (1.1) enable one or more threaded controllers and
1432(1.2) make a process a member of
1433.IR z .
1434(These two steps can be done in either order.)
1435This has the following consequences:
1436.RS
cdede5cd 1437.IP \[bu] 3
1438The type of
1439.I z
1440becomes
1441.IR "domain threaded" .
cdede5cd 1442.IP \[bu]
All of the descendant cgroups of
.I z
7a1cddd2 1445that were not already of type
1ae6b2c7 1446.I threaded
1447are converted to type
1448.IR "domain invalid" .
1449.RE
22356d97 1450.IP (2)
c8902e25 1451As before, we make the threaded subtree usable by writing the string
1ae6b2c7 1452.I """threaded"""
c8902e25 1453to each of the
1ae6b2c7 1454.I domain invalid
cgroups under
.IR z ,
1457in order to convert them to the type
1458.IR threaded .
c6d039a3 1459.P
1460One of the consequences of the above pathways to creating a threaded subtree
1461is that the threaded root cgroup can be a parent only to
1462.I threaded
1463(and
1464.IR "domain invalid" )
1465cgroups.
1466The threaded root cgroup can't be a parent of a
1467.I domain
cgroup, and a
1469.I threaded
1470cgroup
1471can't have a sibling that is a
1472.I domain
1473cgroup.
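.P
The type changes in the first pathway can be traced with a small model.
The following Python sketch is purely illustrative:
it represents a hierarchy as a dictionary mapping cgroup paths to type strings,
the function name is hypothetical,
and it covers only the two steps described for that pathway.

```python
def write_threaded(types, path):
    """Model writing "threaded" to the cgroup.type file of cgroup path.

    types maps cgroup paths such as "y/z" to their type strings.
    Returns a new mapping; the input mapping is not modified.
    """
    if "/" not in path:
        raise ValueError("path must name a cgroup that has a parent")
    new = dict(types)
    if types[path] == "domain invalid":
        # Step (2): only this cgroup is converted.
        new[path] = "threaded"
        return new
    # Step (1): the cgroup becomes threaded and its parent becomes
    # the threaded root.
    parent = path.rsplit("/", 1)[0]
    new[path] = "threaded"
    new[parent] = "domain threaded"
    for other, kind in types.items():
        under_parent = other.startswith(parent + "/")
        if under_parent and other != path and kind != "threaded":
            new[other] = "domain invalid"
    return new
```

Applying the function first to
.I y/z
and then to each remaining
.I domain invalid
cgroup, working from the top of the subtree down,
mirrors steps (1) and (2).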
1474.\"
1475.SS Using a threaded subtree
1476Within a threaded subtree, threaded controllers can be enabled
1477in each subgroup whose type has been changed to
1478.IR threaded ;
1479upon doing so, the corresponding controller interface files
1480appear in the children of that cgroup.
c6d039a3 1481.P
1482A process can be moved into a threaded subtree by writing its PID to the
1483.I cgroup.procs
1484file in one of the cgroups inside the tree.
1485This has the effect of making all of the threads
1486in the process members of the corresponding cgroup
1487and makes the process a member of the threaded subtree.
1488The threads of the process can then be spread across
1489the threaded subtree by writing their thread IDs (see
1490.BR gettid (2))
1491to the
b2c3e720 1492.I cgroup.threads
1493files in different cgroups inside the subtree.
1494The threads of a process must all reside in the same threaded subtree.
c6d039a3 1495.P
1496As with writing to
1497.IR cgroup.procs ,
1498some containment rules apply when writing to the
b2c3e720 1499.I cgroup.threads
d84e558e 1500file:
cdede5cd 1501.IP \[bu] 3
The writer must have write permission on the
.I cgroup.threads
1504file in the destination cgroup.
cdede5cd 1505.IP \[bu]
1506The writer must have write permission on the
1507.I cgroup.procs
1508file in the common ancestor of the source and destination cgroups.
1509(In some cases,
1510the common ancestor may be the source or destination cgroup itself.)
cdede5cd 1511.IP \[bu]
1512The source and destination cgroups must be in the same threaded subtree.
1513(Outside a threaded subtree, an attempt to move a thread by writing
1514its thread ID to the
1515.I cgroup.threads
1516file in a different
1517.I domain
1518cgroup fails with the error
1519.BR EOPNOTSUPP .)
c6d039a3 1520.P
1521The
1522.I cgroup.threads
1523file is present in each cgroup (including
1524.I domain
1525cgroups) and can be read in order to discover the set of threads
1526that is present in the cgroup.
1527The set of thread IDs obtained when reading this file
1528is not guaranteed to be ordered or free of duplicates.
c6d039a3 1529.P
1530The
1531.I cgroup.procs
1532file in the threaded root shows the PIDs of all processes
1533that are members of the threaded subtree.
1534The
1535.I cgroup.procs
1536files in the other cgroups in the subtree are not readable.
c6d039a3 1537.P
1538Domain controllers can't be enabled in a threaded subtree;
1539no controller-interface files appear inside the cgroups underneath the
1540threaded root.
1541From the point of view of a domain controller,
1542threaded subtrees are invisible:
1543a multithreaded process inside a threaded subtree appears to a domain
1544controller as a process that resides in the threaded root cgroup.
c6d039a3 1545.P
1546Within a threaded subtree, the "no internal processes" rule does not apply:
a cgroup can both contain member processes (or threads)
1548and exercise controllers on child cgroups.
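.P
Moving a thread within a threaded subtree amounts to writing its thread ID to a
.I cgroup.threads
file.
A minimal Python sketch follows;
the function name is hypothetical,
and the default thread ID comes from
.IR threading.get_native_id (),
which requires Python 3.8 or later.
On a real system the write is subject to the containment rules listed above.

```python
import os
import threading


def move_thread(cgroup_dir, tid=None):
    """Write a thread ID to the cgroup.threads file in cgroup_dir.

    If tid is None, the kernel thread ID of the calling thread is used.
    Returns the thread ID that was written.
    """
    if tid is None:
        tid = threading.get_native_id()
    with open(os.path.join(cgroup_dir, "cgroup.threads"), "w") as f:
        f.write(str(tid))
    return tid
```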
1549.\"
1550.SS Rules for writing to cgroup.type and creating threaded subtrees
1551A number of rules apply when writing to the
1552.I cgroup.type
1553file:
cdede5cd 1554.IP \[bu] 3
c8902e25 1555Only the string
1ae6b2c7 1556.I """threaded"""
1557may be written.
1558In other words, the only explicit transition that is possible is to convert a
1559.I domain
1560cgroup to type
1561.IR threaded .
cdede5cd 1562.IP \[bu]
6c9aa5ad 1563The effect of writing
1ae6b2c7 1564.I """threaded"""
1565depends on the current value in
1566.IR cgroup.type ,
1567as follows:
c8902e25 1568.RS
cdede5cd 1569.IP \[bu] 3
1ae6b2c7 1570.I domain
1571or
1572.IR "domain threaded" :
1573start the creation of a threaded subtree
1574(whose root is the parent of this cgroup) via
c8902e25 1575the first of the pathways described above;
cdede5cd 1576.IP \[bu]
6c9aa5ad 1577.IR "domain\ invalid" :
4644794c 1578convert this cgroup (which is inside a threaded subtree) to a usable (i.e.,
1579.IR threaded )
1580state;
cdede5cd 1581.IP \[bu]
1582.IR threaded :
1583no effect (a "no-op").
c8902e25 1584.RE
cdede5cd 1585.IP \[bu]
1586We can't write to a
1587.I cgroup.type
1588file if the parent's type is
1589.IR "domain invalid" .
1590In other words, the cgroups of a threaded subtree must be converted to the
1591.I threaded
1592state in a top-down manner.
c6d039a3 1593.P
00c27092 1594There are also some constraints that must be satisfied
1595in order to create a threaded subtree rooted at the cgroup
1596.IR x :
cdede5cd 1597.IP \[bu] 3
1598There can be no member processes in the descendant cgroups of
1599.IR x .
1600(The cgroup
1601.I x
1602can itself have member processes.)
cdede5cd 1603.IP \[bu]
1604No domain controllers may be enabled in
1605.IR x 's
1ae6b2c7 1606.I cgroup.subtree_control
c8902e25 1607file.
c6d039a3 1608.P
c8902e25 1609If any of the above constraints is violated, then an attempt to write
1ae6b2c7 1610.I """threaded"""
c8902e25 1611to a
1ae6b2c7 1612.I cgroup.type
c8902e25
MK
1613file fails with the error
1614.BR ENOTSUP .
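.P
The rules for writing to
.I cgroup.type
can be sketched as a small Python model.
This is an illustrative sketch under stated assumptions, not kernel code:
the names
.IR write_type (),
.IR parent_of (),
and
.I CgroupTypeError
are hypothetical,
cgroups are represented as a mapping from path to type,
and only the type of the written cgroup is updated
(side effects on the parent and on descendants are not modelled).
.P
.in +4n
.EX
```python
DOMAIN = "domain"
DOMAIN_THREADED = "domain threaded"
DOMAIN_INVALID = "domain invalid"
THREADED = "threaded"


class CgroupTypeError(Exception):
    """Hypothetical error raised by this model; it is not a kernel errno."""


# What writing "threaded" does, keyed by the current cgroup.type value.
EFFECTS = {
    DOMAIN: "start a threaded subtree rooted at the parent",
    DOMAIN_THREADED: "start a threaded subtree rooted at the parent",
    DOMAIN_INVALID: "convert to a usable (threaded) state",
    THREADED: "no-op",
}


def parent_of(path):
    """Return the parent path, or None for a top-level entry."""
    head, sep, _ = path.rpartition("/")
    return head if sep else None


def write_type(types, path, value):
    """Model a write of value to the cgroup.type file of path.

    types maps cgroup paths to their current type and is updated in place.
    Only the written cgroup's own type changes in this model.
    """
    if value != THREADED:
        raise CgroupTypeError("only the string 'threaded' may be written")
    parent = parent_of(path)
    if parent is not None and types.get(parent) == DOMAIN_INVALID:
        # Conversion must proceed top-down through the threaded subtree.
        raise CgroupTypeError("parent is 'domain invalid'; convert it first")
    effect = EFFECTS[types[path]]
    types[path] = THREADED
    return effect
```
.EE
.in
.P
In this model, a write to
.I x/a/b
is rejected while
.I x/a
is still
.IR "domain invalid" ,
and is accepted once
.I x/a
has been converted, which mirrors the top-down rule.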
.\"
.SS The """domain threaded""" cgroup type
According to the pathways described above,
the type of a cgroup can change to
.I domain threaded
in either of the following cases:
.IP \[bu] 3
The string
.I """threaded"""
is written to a child cgroup.
.IP \[bu]
A threaded controller is enabled inside the cgroup and
a process is made a member of the cgroup.
.P
A
.I domain threaded
cgroup,
.IR x ,
can revert to the type
.I domain
if the above conditions no longer hold true\[em]that is, if all
.I threaded
child cgroups of
.I x
are removed and either
.I x
no longer has threaded controllers enabled or
no longer has member processes.
.P
When a
.I domain threaded
cgroup
.I x
reverts to the type
.IR domain :
.IP \[bu] 3
All
.I domain invalid
descendants of
.I x
that are not in lower-level threaded subtrees revert to the type
.IR domain .
.IP \[bu]
The root cgroups in any lower-level threaded subtrees revert to the type
.IR "domain threaded" .
.\"
.SS Exceptions for the root cgroup
The root cgroup of the v2 hierarchy is treated exceptionally:
it can be the parent of both
.I domain
and
.I threaded
cgroups.
If the string
.I """threaded"""
is written to the
.I cgroup.type
file of one of the children of the root cgroup, then
.IP \[bu] 3
The type of that cgroup becomes
.IR threaded .
.IP \[bu]
The type of any descendants of that cgroup that
are not part of lower-level threaded subtrees changes to
.IR "domain invalid" .
.P
Note that in this case, there is no cgroup whose type becomes
.IR "domain threaded" .
(Notionally, the root cgroup can be considered as the threaded root
for the cgroup whose type was changed to
.IR threaded .)
.P
The aim of this exceptional treatment for the root cgroup is to
allow a threaded cgroup that employs the
.I cpu
controller to be placed as high as possible in the hierarchy,
so as to minimize the (small) cost of traversing the cgroup hierarchy.
.\"
.SS The cgroups v2 """cpu""" controller and realtime threads
As at Linux 4.19, the cgroups v2
.I cpu
controller does not support control of realtime threads
(specifically threads scheduled under any of the policies
.BR SCHED_FIFO ,
.BR SCHED_RR ,
and
.BR SCHED_DEADLINE ;
see
.BR sched (7)).
Therefore, the
.I cpu
controller can be enabled in the root cgroup only
if all realtime threads are in the root cgroup.
(If there are realtime threads in nonroot cgroups, then a
.BR write (2)
of the string
.I """+cpu"""
to the
.I cgroup.subtree_control
file fails with the error
.BR EINVAL .)
.P
On some systems,
.BR systemd (1)
places certain realtime threads in nonroot cgroups in the v2 hierarchy.
On such systems,
these threads must first be moved to the root cgroup before the
.I cpu
controller can be enabled.
.\"
.SH ERRORS
The following errors can occur for
.BR mount (2):
.TP
.B EBUSY
An attempt to mount a cgroup version 1 filesystem specified neither the
.I name=
option (to mount a named hierarchy) nor a controller name (or
.IR all ).
.SH NOTES
A child process created via
.BR fork (2)
inherits its parent's cgroup memberships.
A process's cgroup memberships are preserved across
.BR execve (2).
.P
The
.BR clone3 (2)
.B CLONE_INTO_CGROUP
flag can be used to create a child process that begins its life in
a different version 2 cgroup from the parent process.
.\"
.SS /proc files
.TP
.IR /proc/cgroups " (since Linux 2.6.24)"
This file contains information about the controllers
that are compiled into the kernel.
An example of the contents of this file (reformatted for readability)
is the following:
.IP
.in +4n
.EX
#subsys_name    hierarchy      num_cgroups    enabled
cpuset          4              1              1
cpu             8              1              1
cpuacct         8              1              1
blkio           6              1              1
memory          3              1              1
devices         10             84             1
freezer         7              1              1
net_cls         9              1              1
perf_event      5              1              1
net_prio        9              1              1
hugetlb         0              1              0
pids            2              1              1
.EE
.in
.IP
The fields in this file are, from left to right:
.RS
.IP [1] 5
The name of the controller.
.IP [2]
The unique ID of the cgroup hierarchy on which this controller is mounted.
If multiple cgroups v1 controllers are bound to the same hierarchy,
then each will show the same hierarchy ID in this field.
The value in this field will be 0 if:
.RS
.IP \[bu] 3
the controller is not mounted on a cgroups v1 hierarchy;
.IP \[bu]
the controller is bound to the cgroups v2 single unified hierarchy; or
.IP \[bu]
the controller is disabled (see below).
.RE
.IP [3]
The number of control groups in this hierarchy using this controller.
.IP [4]
This field contains the value 1 if this controller is enabled,
or 0 if it has been disabled (via the
.I cgroup_disable
kernel command-line boot parameter).
.RE
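.IP
As a minimal sketch of how the four fields might be consumed,
the following Python fragment parses text in the format of
.I /proc/cgroups
into records.
The names
.IR Controller ,
.IR parse_proc_cgroups (),
and
.IR shared_hierarchies ()
are hypothetical and chosen only for illustration.
.IP
.in +4n
.EX
```python
from typing import NamedTuple


class Controller(NamedTuple):
    name: str          # field [1]
    hierarchy_id: int  # field [2]; 0 if not on a v1 hierarchy or disabled
    num_cgroups: int   # field [3]
    enabled: bool      # field [4]


def parse_proc_cgroups(text):
    """Parse text in the /proc/cgroups format into Controller records."""
    controllers = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the "#subsys_name ..." header line.
        if not line or line.startswith("#"):
            continue
        name, hierarchy_id, num_cgroups, enabled = line.split()
        controllers.append(
            Controller(name, int(hierarchy_id), int(num_cgroups), enabled == "1")
        )
    return controllers


def shared_hierarchies(controllers):
    """Return {hierarchy ID: [names]} for IDs used by several controllers."""
    groups = {}
    for c in controllers:
        if c.hierarchy_id != 0:
            groups.setdefault(c.hierarchy_id, []).append(c.name)
    return {hid: names for hid, names in groups.items() if len(names) > 1}
```
.EE
.in
.IP
Applied to the example contents shown above,
.IR shared_hierarchies ()
would group
.I cpu
with
.I cpuacct
(hierarchy ID 8) and
.I net_cls
with
.I net_prio
(hierarchy ID 9).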
.TP
.IR /proc/ pid /cgroup " (since Linux 2.6.24)"
This file describes control groups to which the process
with the corresponding PID belongs.
The displayed information differs for
cgroups version 1 and version 2 hierarchies.
.IP
For each cgroup hierarchy of which the process is a member,
there is one entry containing three colon-separated fields:
.IP
.in +4n
.EX
hierarchy\-ID:controller\-list:cgroup\-path
.EE
.in
.IP
For example:
.IP
.in +4n
.EX
5:cpuacct,cpu,cpuset:/daemons
.EE
.in
.IP
The colon-separated fields are, from left to right:
.RS
.IP [1] 5
For cgroups version 1 hierarchies,
this field contains a unique hierarchy ID number
that can be matched to a hierarchy ID in
.IR /proc/cgroups .
For the cgroups version 2 hierarchy, this field contains the value 0.
.IP [2]
For cgroups version 1 hierarchies,
this field contains a comma-separated list of the controllers
bound to the hierarchy.
For the cgroups version 2 hierarchy, this field is empty.
.IP [3]
This field contains the pathname of the control group in the hierarchy
to which the process belongs.
This pathname is relative to the mount point of the hierarchy.
.RE
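.IP
The three-field format can be split as in the following Python sketch.
The names
.IR Membership ,
.IR parse_pid_cgroup (),
and
.IR v2_path ()
are hypothetical;
the sketch assumes only the format described above,
and splits on the first two colons because
the pathname in the third field may itself contain colons.
.IP
.in +4n
.EX
```python
from typing import NamedTuple


class Membership(NamedTuple):
    hierarchy_id: int   # field [1]; 0 for the cgroups v2 hierarchy
    controllers: tuple  # field [2]; empty for the cgroups v2 hierarchy
    path: str           # field [3]; relative to the hierarchy mount point


def parse_pid_cgroup(text):
    """Parse text in the /proc/pid/cgroup format into Membership records."""
    entries = []
    for line in text.splitlines():
        if not line:
            continue
        # Split on the first two colons only; the path may contain colons.
        hierarchy_id, controller_list, path = line.split(":", 2)
        controllers = tuple(controller_list.split(",")) if controller_list else ()
        entries.append(Membership(int(hierarchy_id), controllers, path))
    return entries


def v2_path(entries):
    """Return the path in the v2 hierarchy, or None if there is no v2 entry."""
    for entry in entries:
        if entry.hierarchy_id == 0 and not entry.controllers:
            return entry.path
    return None
```
.EE
.in
.IP
For the example entry shown above,
.IR parse_pid_cgroup ()
would yield one record with hierarchy ID 5,
the controllers
.IR cpuacct ,
.IR cpu ,
and
.IR cpuset ,
and the path
.IR /daemons .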
.\"
.SS /sys/kernel/cgroup files
.TP
.IR /sys/kernel/cgroup/delegate " (since Linux 4.15)"
.\" commit 01ee6cfb1483fe57c9cbd8e73817dfbf9bacffd3
This file exports a list of the cgroups v2 files
(one per line) that are delegatable
(i.e., whose ownership should be changed to the user ID of the delegatee).
In the future, the set of delegatable files may change or grow,
and this file provides a way for the kernel to inform
user-space applications of which files must be delegated.
As at Linux 4.15, one sees the following when inspecting this file:
.IP
.in +4n
.EX
$ \fBcat /sys/kernel/cgroup/delegate\fP
cgroup.procs
cgroup.subtree_control
cgroup.threads
.EE
.in
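.IP
A user-space tool might consume this list roughly as in the following
Python sketch, rather than hard-coding the file names.
The function names
.IR delegatable_paths ()
and
.IR delegate ()
are hypothetical,
and the decision to change the ownership of the cgroup directory itself
as well as the listed files is an assumption of this sketch.
.IP
.in +4n
.EX
```python
import os


def delegatable_paths(delegate_text, cgroup_dir):
    """Map each file name listed in the delegate file into cgroup_dir."""
    names = [line.strip() for line in delegate_text.splitlines() if line.strip()]
    return [os.path.join(cgroup_dir, name) for name in names]


def delegate(cgroup_dir, uid, delegate_file="/sys/kernel/cgroup/delegate"):
    """Give uid ownership of cgroup_dir and of its delegatable files."""
    with open(delegate_file) as f:
        paths = delegatable_paths(f.read(), cgroup_dir)
    # Assumption: the directory itself is handed over along with the files.
    for path in [cgroup_dir] + paths:
        # A listed file may be absent from a given cgroup, so check first.
        if os.path.exists(path):
            os.chown(path, uid, -1)  # -1 leaves the group ID unchanged
    return paths
```
.EE
.in
.IP
Reading the list at run time means that the tool picks up any files
that later kernels add to the delegatable set.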
.TP
.IR /sys/kernel/cgroup/features " (since Linux 4.15)"
.\" commit 5f2e673405b742be64e7c3604ed4ed3ac14f35ce
Over time, the set of cgroups v2 features that are provided by the
kernel may change or grow,
or some features may not be enabled by default.
This file provides a way for user-space applications to discover what
features the running kernel supports and has enabled.
Features are listed one per line:
.IP
.in +4n
.EX
$ \fBcat /sys/kernel/cgroup/features\fP
nsdelegate
memory_localevents
.EE
.in
.IP
The entries that can appear in this file are:
.RS
.TP
.IR memory_localevents " (since Linux 5.2)"
The kernel supports the
.I memory_localevents
mount option.
.TP
.IR nsdelegate " (since Linux 4.15)"
The kernel supports the
.I nsdelegate
mount option.
.TP
.IR memory_recursiveprot " (since Linux 5.7)"
.\" commit 8a931f801340c2be10552c7b5622d5f4852f3a36
The kernel supports the
.I memory_recursiveprot
mount option.
.RE
.SH SEE ALSO
.BR prlimit (1),
.BR systemd (1),
.BR systemd\-cgls (1),
.BR systemd\-cgtop (1),
.BR clone (2),
.BR ioprio_set (2),
.BR perf_event_open (2),
.BR setrlimit (2),
.BR cgroup_namespaces (7),
.BR cpuset (7),
.BR namespaces (7),
.BR sched (7),
.BR user_namespaces (7)
.P
The kernel source file
.IR Documentation/admin\-guide/cgroup\-v2.rst .