.\" Copyright (c) 2016 by Michael Kerrisk <mtk.manpages@gmail.com>
.\"
.\" SPDX-License-Identifier: Linux-man-pages-copyleft
.\"
.\"
.TH cgroup_namespaces 7 (date) "Linux man-pages (unreleased)"
.SH NAME
cgroup_namespaces \- overview of Linux cgroup namespaces
.SH DESCRIPTION
For an overview of namespaces, see
.BR namespaces (7).
.PP
Cgroup namespaces virtualize the view of a process's cgroups (see
.BR cgroups (7))
as seen via
.IR /proc/ pid /cgroup
and
.IR /proc/ pid /mountinfo .
.PP
Each cgroup namespace has its own set of cgroup root directories.
These root directories are the base points for the relative
locations displayed in the corresponding records in the
.IR /proc/ pid /cgroup
file.
When a process creates a new cgroup namespace using
.BR clone (2)
or
.BR unshare (2)
with the
.B CLONE_NEWCGROUP
flag, its current
cgroup directories become the cgroup root directories
of the new namespace.
(This applies both to the cgroups version 1 hierarchies
and to the cgroups version 2 unified hierarchy.)
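.PP
The following is a minimal sketch, not a definitive implementation,
of creating a new cgroup namespace from a running process with
.BR unshare (2).
It assumes a Linux system whose C library exports unshare(),
and a caller that holds the CAP_SYS_ADMIN capability.
The helper name unshare_cgroup_namespace is hypothetical
and is used here only for illustration.
.PP
.in +4n
.EX
```python
import ctypes
import os

# Value of CLONE_NEWCGROUP as defined in <linux/sched.h>.
CLONE_NEWCGROUP = 0x02000000


def unshare_cgroup_namespace():
    """Move the calling process into a new cgroup namespace.

    Hypothetical helper for illustration. On success, the caller's
    current cgroup directories become the cgroup root directories
    of the new namespace. Typically requires CAP_SYS_ADMIN.
    """
    # Load the C library lazily so that importing this sketch has no
    # side effects; use_errno=True lets us read errno after the call.
    libc = ctypes.CDLL(None, use_errno=True)
    if libc.unshare(CLONE_NEWCGROUP) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
```
.EE
.in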
.PP
When reading the cgroup memberships of a "target" process from
.IR /proc/ pid /cgroup ,
the pathname shown in the third field of each record will be
relative to the reading process's root directory
for the corresponding cgroup hierarchy.
If the cgroup directory of the target process lies outside
the root directory of the reading process's cgroup namespace,
then the pathname will show
.I ../
entries for each ancestor level in the cgroup hierarchy.
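.PP
The display rule described above can be modeled with a few lines of code.
This is only an illustrative sketch of the rule, not the kernel's implementation.
The function name displayed_cgroup_path and the sample directories are assumptions;
both arguments are taken to be absolute cgroup directories
within a single hierarchy, as seen from the initial cgroup namespace.
.PP
.in +4n
.EX
```python
import posixpath


def displayed_cgroup_path(target_dir, reader_root):
    """Model the third field of a /proc/pid/cgroup record.

    target_dir:  absolute cgroup directory of the target process
    reader_root: cgroup root directory of the reader's cgroup namespace
    """
    rel = posixpath.relpath(target_dir, reader_root)
    # relpath() returns "." when both directories coincide; that case
    # is displayed as "/", and every other result gets a leading "/".
    return "/" if rel == "." else "/" + rel


print(displayed_cgroup_path("/sub", "/sub"))   # target is at the root
print(displayed_cgroup_path("/", "/sub"))      # target is one level above
print(displayed_cgroup_path("/sub2", "/sub"))  # target is in a sibling
```
.EE
.in
.PP
The three calls print "/", "/..", and "/../sub2" respectively:
one "../" component appears for each ancestor level
between the reader's root directory and the target's directory.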
.PP
The following shell session demonstrates the effect of creating
a new cgroup namespace.
.PP
First, (as superuser) in a shell in the initial cgroup namespace,
we create a child cgroup in the
.I freezer
hierarchy, and place a process in that cgroup that we will
use as part of the demonstration below:
.PP
.in +4n
.EX
# \fBmkdir \-p /sys/fs/cgroup/freezer/sub2\fP
# \fBsleep 10000 &\fP     # Create a process that lives for a while
[1] 20124
# \fBecho 20124 > /sys/fs/cgroup/freezer/sub2/cgroup.procs\fP
.EE
.in
.PP
We then create another child cgroup in the
.I freezer
hierarchy and put the shell into that cgroup:
.PP
.in +4n
.EX
# \fBmkdir \-p /sys/fs/cgroup/freezer/sub\fP
# \fBecho $$\fP                      # Show PID of this shell
30655
# \fBecho 30655 > /sys/fs/cgroup/freezer/sub/cgroup.procs\fP
# \fBcat /proc/self/cgroup | grep freezer\fP
7:freezer:/sub
.EE
.in
.PP
Next, we use
.BR unshare (1)
to create a process running a new shell in new cgroup and mount namespaces:
.PP
.in +4n
.EX
# \fBPS1="sh2# " unshare \-Cm bash\fP
.EE
.in
.PP
From the new shell started by
.BR unshare (1),
we then inspect the
.IR /proc/ pid /cgroup
files of, respectively, the new shell,
a process that is in the initial cgroup namespace
.RI ( init ,
with PID 1), and the process in the sibling cgroup
.RI ( sub2 ):
.PP
.in +4n
.EX
sh2# \fBcat /proc/self/cgroup | grep freezer\fP
7:freezer:/
sh2# \fBcat /proc/1/cgroup | grep freezer\fP
7:freezer:/..
sh2# \fBcat /proc/20124/cgroup | grep freezer\fP
7:freezer:/../sub2
.EE
.in
.PP
From the output of the first command,
we see that the freezer cgroup membership of the new shell
(which is in the same cgroup as the initial shell)
is shown relative to the freezer cgroup root directory
that was established when the new cgroup namespace was created.
(In absolute terms,
the new shell is in the
.I /sub
freezer cgroup,
and the root directory of the freezer cgroup hierarchy
in the new cgroup namespace is also
.IR /sub .
Thus, the new shell's cgroup membership is displayed as \(aq/\(aq.)
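.PP
The records shown above can also be examined programmatically.
The sketch below assumes the record layout described in
.BR cgroups (7):
three colon-separated fields holding the hierarchy ID,
a comma-separated controller list, and the pathname.
The names CgroupRecord, parse_cgroup_record, and outside_namespace_root
are hypothetical and serve only to illustrate the idea.
.PP
.in +4n
.EX
```python
from typing import List, NamedTuple


class CgroupRecord(NamedTuple):
    hierarchy_id: int
    controllers: List[str]
    path: str


def parse_cgroup_record(record):
    """Split one /proc/pid/cgroup record (without its newline)."""
    # Split at most twice: the pathname is the last field and may
    # itself contain colons.
    hierarchy_id, controllers, path = record.split(":", 2)
    # The controller list is empty for the cgroups v2 hierarchy.
    names = controllers.split(",") if controllers else []
    return CgroupRecord(int(hierarchy_id), names, path)


def outside_namespace_root(record):
    """True if the pathname climbs above the reader's cgroup root."""
    return record.path == "/.." or record.path.startswith("/../")


for text in ["7:freezer:/", "7:freezer:/..", "7:freezer:/../sub2"]:
    rec = parse_cgroup_record(text)
    print(rec.path, outside_namespace_root(rec))
```
.EE
.in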
.PP
However, when we look in
.I /proc/self/mountinfo
we see the following anomaly:
.PP
.in +4n
.EX
sh2# \fBcat /proc/self/mountinfo | grep freezer\fP
155 145 0:32 /.. /sys/fs/cgroup/freezer ...
.EE
.in
.PP
The fourth field of this line
.RI ( /.. )
should show the
directory in the cgroup filesystem which forms the root of this mount.
Since, by the definition of cgroup namespaces, the process's current
freezer cgroup directory became its root freezer cgroup directory,
we should see \(aq/\(aq in this field.
The problem here is that we are seeing a mount entry for the cgroup
filesystem corresponding to the initial cgroup namespace
(whose cgroup filesystem is indeed rooted at the parent directory of
.IR sub ).
To fix this problem, we must remount the freezer cgroup filesystem
from the new shell (i.e., perform the mount from a process that is in the
new cgroup namespace), after which we see the expected results:
.PP
.in +4n
.EX
sh2# \fBmount \-\-make\-rslave /\fP     # Don\(aqt propagate mount events
                               # to other namespaces
sh2# \fBumount /sys/fs/cgroup/freezer\fP
sh2# \fBmount \-t cgroup \-o freezer freezer /sys/fs/cgroup/freezer\fP
sh2# \fBcat /proc/self/mountinfo | grep freezer\fP
155 145 0:32 / /sys/fs/cgroup/freezer rw,relatime ...
.EE
.in
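.PP
The check performed by eye in the two mountinfo listings above
can be sketched in code.
This is an illustration under stated assumptions rather than a complete parser:
it relies only on the fourth and fifth whitespace-separated fields
(the root of the mount and the mount point, as described in
.BR proc (5)),
and the function names are hypothetical.
The two sample lines are the truncated ones from the session above.
.PP
.in +4n
.EX
```python
def mount_root_and_point(mountinfo_line):
    """Return fields 4 and 5 of a mountinfo line: (root, mount point)."""
    fields = mountinfo_line.split()
    return fields[3], fields[4]


def rooted_at_namespace_root(mountinfo_line):
    """True if the mount's root field is "/" for the reading process."""
    root, _mount_point = mount_root_and_point(mountinfo_line)
    return root == "/"


before = "155 145 0:32 /.. /sys/fs/cgroup/freezer ..."
after = "155 145 0:32 / /sys/fs/cgroup/freezer rw,relatime ..."

print(rooted_at_namespace_root(before))  # entry from the initial namespace
print(rooted_at_namespace_root(after))   # entry after remounting
```
.EE
.in
.PP
The first call prints False and the second prints True,
which corresponds to the anomaly and to the expected result after the remount.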
.\"
.SH STANDARDS
Namespaces are a Linux-specific feature.
.SH NOTES
Use of cgroup namespaces requires a kernel that is configured with the
.B CONFIG_CGROUPS
option.
.PP
The virtualization provided by cgroup namespaces serves a number of purposes:
.IP \(bu 3
It prevents information leaks whereby cgroup directory paths outside of
a container would otherwise be visible to processes in the container.
Such leakages could, for example,
reveal information about the container framework
to containerized applications.
.IP \(bu
It eases tasks such as container migration.
The virtualization provided by cgroup namespaces
allows containers to be isolated from knowledge of
the pathnames of ancestor cgroups.
Without such isolation, the full cgroup pathnames (displayed in
.IR /proc/self/cgroup )
would need to be replicated on the target system when migrating a container;
those pathnames would also need to be unique,
so that they don't conflict with other pathnames on the target system.
.IP \(bu
It allows better confinement of containerized processes,
because it is possible to mount the container's cgroup filesystems such that
the container processes can't gain access to ancestor cgroup directories.
Consider, for example, the following scenario:
.RS
.IP \(bu 3
We have a cgroup directory,
.IR /cg/1 ,
that is owned by user ID 9000.
.IP \(bu
We have a process,
.IR X ,
also owned by user ID 9000,
that is namespaced under the cgroup
.I /cg/1/2
(i.e.,
.I X
was placed in a new cgroup namespace via
.BR clone (2)
or
.BR unshare (2)
with the
.B CLONE_NEWCGROUP
flag).
.RE
.IP
In the absence of cgroup namespacing, because the cgroup directory
.I /cg/1
is owned (and writable) by user ID 9000 and process
.I X
is also owned by user ID 9000, process
.I X
would be able to modify the contents of cgroup files
(i.e., change cgroup settings) not only in
.I /cg/1/2
but also in the ancestor cgroup directory
.IR /cg/1 .
Namespacing process
.I X
under the cgroup directory
.IR /cg/1/2 ,
in combination with suitable mount operations
for the cgroup filesystem (as shown above),
prevents it from modifying files in
.IR /cg/1 ,
since it cannot even see the contents of that directory
(or of further removed cgroup ancestor directories).
Combined with correct enforcement of hierarchical limits,
this prevents process
.I X
from escaping the limits imposed by ancestor cgroups.
.SH SEE ALSO
.BR unshare (1),
.BR clone (2),
.BR setns (2),
.BR unshare (2),
.BR proc (5),
.BR cgroups (7),
.BR credentials (7),
.BR namespaces (7),
.BR user_namespaces (7)