.\" Copyright (c) 2013, 2016, 2017 by Michael Kerrisk <mtk.manpages@gmail.com>
.\" and Copyright (c) 2012 by Eric W. Biederman <ebiederm@xmission.com>
.\"
.\" SPDX-License-Identifier: Linux-man-pages-copyleft
.\"
.\"
.TH NAMESPACES 7 2021-08-27 "Linux" "Linux Programmer's Manual"
.SH NAME
namespaces \- overview of Linux namespaces
.SH DESCRIPTION
A namespace wraps a global system resource in an abstraction that
makes it appear to the processes within the namespace that they
have their own isolated instance of the global resource.
Changes to the global resource are visible to other processes
that are members of the namespace, but are invisible to other processes.
One use of namespaces is to implement containers.
.PP
This page provides pointers to information on the various namespace types,
describes the associated
.I /proc
files, and summarizes the APIs for working with namespaces.
.\"
.SS Namespace types
The following table shows the namespace types available on Linux.
The second column of the table shows the flag value that is used to specify
the namespace type in various APIs.
The third column identifies the manual page that provides details
on the namespace type.
The last column is a summary of the resources that are isolated by
the namespace type.
.ad l
.nh
.TS
lB lB lB lB
l1 lB1 l1 l.
Namespace	Flag	Page	Isolates
Cgroup	CLONE_NEWCGROUP	\fBcgroup_namespaces\fP(7)	T{
Cgroup root directory
T}
IPC	CLONE_NEWIPC	\fBipc_namespaces\fP(7)	T{
System V IPC,
POSIX message queues
T}
Network	CLONE_NEWNET	\fBnetwork_namespaces\fP(7)	T{
Network devices,
stacks, ports, etc.
T}
Mount	CLONE_NEWNS	\fBmount_namespaces\fP(7)	Mount points
PID	CLONE_NEWPID	\fBpid_namespaces\fP(7)	Process IDs
Time	CLONE_NEWTIME	\fBtime_namespaces\fP(7)	T{
Boot and monotonic
clocks
T}
User	CLONE_NEWUSER	\fBuser_namespaces\fP(7)	T{
User and group IDs
T}
UTS	CLONE_NEWUTS	\fButs_namespaces\fP(7)	T{
Hostname and NIS
domain name
T}
.TE
.hy
.ad
.\"
.\" ==================== The namespaces API ====================
.\"
.SS The namespaces API
As well as various
.I /proc
files described below,
the namespaces API includes the following system calls:
.TP
.BR clone (2)
The
.BR clone (2)
system call creates a new process.
If the
.I flags
argument of the call specifies one or more of the
.B CLONE_NEW*
flags listed above, then new namespaces are created for each flag,
and the child process is made a member of those namespaces.
(This system call also implements a number of features
unrelated to namespaces.)
.TP
.BR setns (2)
The
.BR setns (2)
system call allows the calling process to join an existing namespace.
The namespace to join is specified via a file descriptor that refers to
one of the
.IR /proc/ pid /ns
files described below.
.TP
.BR unshare (2)
The
.BR unshare (2)
system call moves the calling process to a new namespace.
If the
.I flags
argument of the call specifies one or more of the
.B CLONE_NEW*
flags listed above, then new namespaces are created for each flag,
and the calling process is made a member of those namespaces.
(This system call also implements a number of features
unrelated to namespaces.)
.TP
.BR ioctl (2)
Various
.BR ioctl (2)
operations can be used to discover information about namespaces.
These operations are described in
.BR ioctl_ns (2).
.PP
Creation of new namespaces using
.BR clone (2)
and
.BR unshare (2)
in most cases requires the
.B CAP_SYS_ADMIN
capability, since, in the new namespace,
the creator will have the power to change global resources
that are visible to other processes that are subsequently created in,
or join, the namespace.
User namespaces are the exception: since Linux 3.8,
no privilege is required to create a user namespace.
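.PP
The privilege rule above can be probed with a minimal sketch
such as the following.
The hypothetical helper
.IR try_unshare ()
calls
.BR unshare (2)
in a forked child, so that the caller's own namespace memberships
are left untouched, and reports the resulting error number.
The helper name and its return convention are assumptions
made for illustration, not part of any API.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that tries to move itself into new namespaces of the
   types given in 'flags' (an OR of CLONE_NEW* values).
   Returns 0 if unshare(2) succeeded in the child, a positive errno
   value if it failed there, or -1 if the fork or wait itself failed.
   (Hypothetical helper, not part of any API.) */
int
try_unshare(int flags)
{
    pid_t child;
    int status;

    assert(flags != 0);     /* Nothing to try without at least one flag */

    child = fork();
    if (child == -1)
        return -1;

    if (child == 0) {
        /* Child: report the outcome through the exit status.
           The errno values of interest here (such as EPERM,
           EINVAL, and ENOSPC) fit in the 8 bits available. */
        if (unshare(flags) == -1)
            _exit(errno & 0xff);
        _exit(0);
    }

    if (waitpid(child, &status, 0) == -1 || !WIFEXITED(status))
        return -1;

    return WEXITSTATUS(status);
}
```

.PP
Under these assumptions,
.I try_unshare(CLONE_NEWUSER)
would typically return 0 on Linux 3.8 and later
(unless system policy restricts user namespaces), whereas
.I try_unshare(CLONE_NEWUTS)
would typically return
.B EPERM
when the caller lacks
.BR CAP_SYS_ADMIN .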
.\"
.\" ==================== The /proc/[pid]/ns/ directory ====================
.\"
.SS The /proc/[pid]/ns/ directory
Each process has a
.IR /proc/ pid /ns/
.\" See commit 6b4e306aa3dc94a0545eb9279475b1ab6209a31f
subdirectory containing one entry for each namespace that
supports being manipulated by
.BR setns (2):
.PP
.in +4n
.EX
$ \fBls \-l /proc/$$/ns | awk \(aq{print $1, $9, $10, $11}\(aq\fP
total 0
lrwxrwxrwx. cgroup \-> cgroup:[4026531835]
lrwxrwxrwx. ipc \-> ipc:[4026531839]
lrwxrwxrwx. mnt \-> mnt:[4026531840]
lrwxrwxrwx. net \-> net:[4026531969]
lrwxrwxrwx. pid \-> pid:[4026531836]
lrwxrwxrwx. pid_for_children \-> pid:[4026531834]
lrwxrwxrwx. time \-> time:[4026531834]
lrwxrwxrwx. time_for_children \-> time:[4026531834]
lrwxrwxrwx. user \-> user:[4026531837]
lrwxrwxrwx. uts \-> uts:[4026531838]
.EE
.in
.PP
Bind mounting (see
.BR mount (2))
one of the files in this directory
to somewhere else in the filesystem keeps
the corresponding namespace of the process specified by
.I pid
alive even if all processes currently in the namespace terminate.
.PP
Opening one of the files in this directory
(or a file that is bind mounted to one of these files)
returns a file handle for
the corresponding namespace of the process specified by
.IR pid .
As long as this file descriptor remains open,
the namespace will remain alive,
even if all processes in the namespace terminate.
The file descriptor can be passed to
.BR setns (2).
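.PP
The open-then-join sequence described above might be sketched as follows.
The helper
.IR join_namespace ()
is hypothetical, as are its parameters and its
return convention (0 on success, \-1 on failure).
Joining most namespace types requires privilege (see
.BR setns (2)),
so failure of the call is a normal outcome for an unprivileged caller.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <sched.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Join the namespace that /proc/<pid>/ns/<name> refers to.
   'nstype' is the expected CLONE_NEW* type, or 0 to accept any type.
   Returns 0 on success, or -1 on failure.
   (Hypothetical helper, not part of any API.) */
int
join_namespace(pid_t pid, const char *name, int nstype)
{
    char path[64];
    int fd, ret;

    assert(name != NULL);

    if (snprintf(path, sizeof(path), "/proc/%ld/ns/%s",
                 (long) pid, name) >= (int) sizeof(path))
        return -1;          /* Path would have been truncated */

    fd = open(path, O_RDONLY | O_CLOEXEC);
    if (fd == -1)
        return -1;

    /* While 'fd' is open, the namespace remains alive even if all
       of its member processes terminate. */
    ret = setns(fd, nstype);

    /* After a successful join, the caller's membership keeps the
       namespace alive, so the descriptor is no longer needed. */
    close(fd);
    return ret;
}
```

.PP
For example, a suitably privileged process might call
.I join_namespace(pid, "net", CLONE_NEWNET)
to enter the network namespace of the process
.IR pid .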
.PP
In Linux 3.7 and earlier, these files were visible as hard links.
Since Linux 3.8,
.\" commit bf056bfa80596a5d14b26b17276a56a0dcb080e5
they appear as symbolic links.
If two processes are in the same namespace,
then the device IDs and inode numbers of their
.IR /proc/ pid /ns/ xxx
symbolic links will be the same; an application can check this using the
.I stat.st_dev
.\" Eric Biederman: "I reserve the right for st_dev to be significant
.\" when comparing namespaces."
.\" https://lore.kernel.org/lkml/87poky5ca9.fsf@xmission.com/
.\" Re: Documenting the ioctl interfaces to discover relationships...
.\" Date: Mon, 12 Dec 2016 11:30:38 +1300
and
.I stat.st_ino
fields returned by
.BR stat (2).
The content of this symbolic link is a string containing
the namespace type and inode number as in the following example:
.PP
.in +4n
.EX
$ \fBreadlink /proc/$$/ns/uts\fP
uts:[4026531838]
.EE
.in
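.PP
The two checks just described can be sketched in C as follows.
The helper names
.IR same_namespace ()
and
.IR namespace_inode ()
are hypothetical; the first compares the
.I st_dev
and
.I st_ino
fields returned by
.BR stat (2),
and the second parses the inode number out of the link text
returned by
.BR readlink (2).

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Return 1 if the two namespace files refer to the same namespace,
   0 if they refer to different namespaces, or -1 if stat(2) fails.
   (Hypothetical helper, not part of any API.) */
int
same_namespace(const char *path1, const char *path2)
{
    struct stat sb1, sb2;

    assert(path1 != NULL && path2 != NULL);

    /* stat(2) follows the symbolic link to the namespace itself */
    if (stat(path1, &sb1) == -1 || stat(path2, &sb2) == -1)
        return -1;

    return sb1.st_dev == sb2.st_dev && sb1.st_ino == sb2.st_ino;
}

/* Parse the inode number out of link text such as "uts:[4026531838]".
   Returns 0 and stores the number in '*ino' on success,
   or -1 on failure.  (Hypothetical helper.) */
int
namespace_inode(const char *path, unsigned long *ino)
{
    char buf[64];
    ssize_t n;

    assert(path != NULL && ino != NULL);

    n = readlink(path, buf, sizeof(buf) - 1);
    if (n == -1)
        return -1;
    buf[n] = '\0';          /* readlink(2) does not null-terminate */

    /* Skip the type name up to ':', then match "[<number>" */
    return sscanf(buf, "%*[^:]:[%lu]", ino) == 1 ? 0 : -1;
}
```

.PP
A caller might, for instance, pass
.I /proc/self/ns/net
and
.IR /proc/ pid /ns/net
to
.IR same_namespace ()
to test whether it shares a network namespace with another process.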
.PP
The symbolic links in this subdirectory are as follows:
.TP
.IR /proc/ pid /ns/cgroup " (since Linux 4.6)"
This file is a handle for the cgroup namespace of the process.
.TP
.IR /proc/ pid /ns/ipc " (since Linux 3.0)"
This file is a handle for the IPC namespace of the process.
.TP
.IR /proc/ pid /ns/mnt " (since Linux 3.8)"
.\" commit 8823c079ba7136dc1948d6f6dcb5f8022bde438e
This file is a handle for the mount namespace of the process.
.TP
.IR /proc/ pid /ns/net " (since Linux 3.0)"
This file is a handle for the network namespace of the process.
.TP
.IR /proc/ pid /ns/pid " (since Linux 3.8)"
.\" commit 57e8391d327609cbf12d843259c968b9e5c1838f
This file is a handle for the PID namespace of the process.
This handle is permanent for the lifetime of the process
(i.e., a process's PID namespace membership never changes).
.TP
.IR /proc/ pid /ns/pid_for_children " (since Linux 4.12)"
.\" commit eaa0d190bfe1ed891b814a52712dcd852554cb08
This file is a handle for the PID namespace of
child processes created by this process.
This can change as a consequence of calls to
.BR unshare (2)
and
.BR setns (2)
(see
.BR pid_namespaces (7)),
so the file may differ from
.IR /proc/ pid /ns/pid .
The symbolic link gains a value only after the first child process
is created in the namespace.
(Beforehand,
.BR readlink (2)
of the symbolic link will return an empty buffer.)
.TP
.IR /proc/ pid /ns/time " (since Linux 5.6)"
This file is a handle for the time namespace of the process.
.TP
.IR /proc/ pid /ns/time_for_children " (since Linux 5.6)"
This file is a handle for the time namespace of
child processes created by this process.
This can change as a consequence of calls to
.BR unshare (2)
and
.BR setns (2)
(see
.BR time_namespaces (7)),
so the file may differ from
.IR /proc/ pid /ns/time .
.TP
.IR /proc/ pid /ns/user " (since Linux 3.8)"
.\" commit cde1975bc242f3e1072bde623ef378e547b73f91
This file is a handle for the user namespace of the process.
.TP
.IR /proc/ pid /ns/uts " (since Linux 3.0)"
This file is a handle for the UTS namespace of the process.
.PP
Permission to dereference or read
.RB ( readlink (2))
these symbolic links is governed by a ptrace access mode
.B PTRACE_MODE_READ_FSCREDS
check; see
.BR ptrace (2).
.\"
.\" ==================== The /proc/sys/user directory ====================
.\"
.SS The /proc/sys/user directory
The files in the
.I /proc/sys/user
directory (which is present since Linux 4.9) expose limits
on the number of namespaces of various types that can be created.
The files are as follows:
.TP
.I max_cgroup_namespaces
The value in this file defines a per-user limit on the number of
cgroup namespaces that may be created in the user namespace.
.TP
.I max_ipc_namespaces
The value in this file defines a per-user limit on the number of
IPC namespaces that may be created in the user namespace.
.TP
.I max_mnt_namespaces
The value in this file defines a per-user limit on the number of
mount namespaces that may be created in the user namespace.
.TP
.I max_net_namespaces
The value in this file defines a per-user limit on the number of
network namespaces that may be created in the user namespace.
.TP
.I max_pid_namespaces
The value in this file defines a per-user limit on the number of
PID namespaces that may be created in the user namespace.
.TP
.IR max_time_namespaces " (since Linux 5.7)"
.\" commit eeec26d5da8248ea4e240b8795bb4364213d3247
The value in this file defines a per-user limit on the number of
time namespaces that may be created in the user namespace.
.TP
.I max_user_namespaces
The value in this file defines a per-user limit on the number of
user namespaces that may be created in the user namespace.
.TP
.I max_uts_namespaces
The value in this file defines a per-user limit on the number of
UTS namespaces that may be created in the user namespace.
.PP
Note the following details about these files:
.IP * 3
The values in these files are modifiable by privileged processes.
.IP *
The values exposed by these files are the limits for the user namespace
in which the opening process resides.
.IP *
The limits are per-user.
Each user in the same user namespace
can create namespaces up to the defined limit.
.IP *
The limits apply to all users, including UID 0.
.IP *
These limits apply in addition to any other per-namespace
limits (such as those for PID and user namespaces) that may be enforced.
.IP *
Upon encountering these limits,
.BR clone (2)
and
.BR unshare (2)
fail with the error
.BR ENOSPC .
.IP *
For the initial user namespace,
the default value in each of these files is half the limit on the number
of threads that may be created
.RI ( /proc/sys/kernel/threads\-max ).
In all descendant user namespaces, the default value in each file is
.BR MAXINT .
.IP *
When a namespace is created, the object is also accounted
against ancestor namespaces.
More precisely:
.RS
.IP + 3
Each user namespace has a creator UID.
.IP +
When a namespace is created,
it is accounted against the creator UIDs in each of the
ancestor user namespaces,
and the kernel ensures that the corresponding namespace limit
for the creator UID in the ancestor namespace is not exceeded.
.IP +
The aforementioned point ensures that creating a new user namespace
cannot be used as a means to escape the limits in force
in the current user namespace.
.RE
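.PP
As a minimal sketch of how a program might inspect one of these limits,
the following hypothetical helper,
.IR read_ns_limit (),
reads a single value from a file in
.IR /proc/sys/user .
The helper name and its return convention (0 on success, \-1 on failure)
are assumptions made for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Read the limit stored in /proc/sys/user/<file>, for example
   "max_user_namespaces".  Returns 0 and stores the value in
   '*limit' on success, or -1 if the file cannot be opened or
   parsed.  (Hypothetical helper, not part of any API.) */
int
read_ns_limit(const char *file, long *limit)
{
    char path[128];
    FILE *fp;
    int matched;

    assert(file != NULL && limit != NULL);

    if (snprintf(path, sizeof(path), "/proc/sys/user/%s", file)
            >= (int) sizeof(path))
        return -1;          /* Path would have been truncated */

    fp = fopen(path, "r");
    if (fp == NULL)
        return -1;          /* For example, a kernel before Linux 4.9 */

    matched = fscanf(fp, "%ld", limit);
    fclose(fp);

    return matched == 1 ? 0 : -1;
}
```

.PP
The value obtained is the limit for the user namespace of the caller;
a program that reaches the limit sees
.BR clone (2)
or
.BR unshare (2)
fail with
.BR ENOSPC ,
as noted above.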
.\"
.SS Namespace lifetime
Absent any other factors,
a namespace is automatically torn down when the last process in
the namespace terminates or leaves the namespace.
However, there are a number of other factors that may pin
a namespace into existence even though it has no member processes.
These factors include the following:
.IP * 3
An open file descriptor or a bind mount exists for the corresponding
.IR /proc/ pid /ns/*
file.
.IP *
The namespace is hierarchical (i.e., a PID or user namespace),
and has a child namespace.
.IP *
It is a user namespace that owns one or more nonuser namespaces.
.IP *
It is a PID namespace,
and there is a process that refers to the namespace via a
.IR /proc/ pid /ns/pid_for_children
symbolic link.
.IP *
It is a time namespace,
and there is a process that refers to the namespace via a
.IR /proc/ pid /ns/time_for_children
symbolic link.
.IP *
It is an IPC namespace, and a corresponding mount of an
.I mqueue
filesystem (see
.BR mq_overview (7))
refers to this namespace.
.IP *
It is a PID namespace, and a corresponding mount of a
.BR proc (5)
filesystem refers to this namespace.
.SH EXAMPLES
See
.BR clone (2)
and
.BR user_namespaces (7).
.SH SEE ALSO
.BR nsenter (1),
.BR readlink (1),
.BR unshare (1),
.BR clone (2),
.BR ioctl_ns (2),
.BR setns (2),
.BR unshare (2),
.BR proc (5),
.BR capabilities (7),
.BR cgroup_namespaces (7),
.BR cgroups (7),
.BR credentials (7),
.BR ipc_namespaces (7),
.BR mount_namespaces (7),
.BR network_namespaces (7),
.BR pid_namespaces (7),
.BR time_namespaces (7),
.BR user_namespaces (7),
.BR uts_namespaces (7),
.BR lsns (8),
.BR switch_root (8)
418 .BR uts_namespaces (7),
419 .BR lsns (8),
420 .BR switch_root (8)