.\" Copyright (c) 2013, 2016, 2017 by Michael Kerrisk <mtk.manpages@gmail.com>
.\" and Copyright (c) 2012 by Eric W. Biederman <ebiederm@xmission.com>
.\"
.\" %%%LICENSE_START(VERBATIM)
.\" Permission is granted to make and distribute verbatim copies of this
.\" manual provided the copyright notice and this permission notice are
.\" preserved on all copies.
.\"
.\" Permission is granted to copy and distribute modified versions of this
.\" manual under the conditions for verbatim copying, provided that the
.\" entire resulting derived work is distributed under the terms of a
.\" permission notice identical to this one.
.\"
.\" Since the Linux kernel and libraries are constantly changing, this
.\" manual page may be incorrect or out-of-date. The author(s) assume no
.\" responsibility for errors or omissions, or for damages resulting from
.\" the use of the information contained herein. The author(s) may not
.\" have taken the same level of care in the production of this manual,
.\" which is licensed free of charge, as they might when working
.\" professionally.
.\"
.\" Formatted or processed versions of this manual, if unaccompanied by
.\" the source, must acknowledge the copyright and authors of this work.
.\" %%%LICENSE_END
.\"
.\"
.TH NAMESPACES 7 2019-03-06 "Linux" "Linux Programmer's Manual"
.SH NAME
namespaces \- overview of Linux namespaces
.SH DESCRIPTION
A namespace wraps a global system resource in an abstraction that
makes it appear to the processes within the namespace that they
have their own isolated instance of the global resource.
Changes to the global resource are visible to other processes
that are members of the namespace, but are invisible to other processes.
One use of namespaces is to implement containers.
.PP
Linux provides the following namespaces:
.TS
lB lB lB
l lB l.
Namespace	Constant	Isolates
Cgroup	CLONE_NEWCGROUP	Cgroup root directory
IPC	CLONE_NEWIPC	System V IPC, POSIX message queues
Network	CLONE_NEWNET	Network devices, stacks, ports, etc.
Mount	CLONE_NEWNS	Mount points
PID	CLONE_NEWPID	Process IDs
User	CLONE_NEWUSER	User and group IDs
UTS	CLONE_NEWUTS	Hostname and NIS domain name
.TE
.PP
This page describes the various namespaces and the associated
.I /proc
files, and summarizes the APIs for working with namespaces.
.\"
.\" ==================== The namespaces API ====================
.\"
.SS The namespaces API
As well as various
.I /proc
files described below,
the namespaces API includes the following system calls:
.TP
.BR clone (2)
The
.BR clone (2)
system call creates a new process.
If the
.I flags
argument of the call specifies one or more of the
.B CLONE_NEW*
flags listed below, then new namespaces are created for each flag,
and the child process is made a member of those namespaces.
(This system call also implements a number of features
unrelated to namespaces.)
.TP
.BR setns (2)
The
.BR setns (2)
system call allows the calling process to join an existing namespace.
The namespace to join is specified via a file descriptor that refers to
one of the
.IR /proc/[pid]/ns
files described below.
.TP
.BR unshare (2)
The
.BR unshare (2)
system call moves the calling process to a new namespace.
If the
.I flags
argument of the call specifies one or more of the
.B CLONE_NEW*
flags listed below, then new namespaces are created for each flag,
and the calling process is made a member of those namespaces.
(This system call also implements a number of features
unrelated to namespaces.)
.TP
.BR ioctl (2)
Various
.BR ioctl (2)
operations can be used to discover information about namespaces.
These operations are described in
.BR ioctl_ns (2).
.PP
Creation of new namespaces using
.BR clone (2)
and
.BR unshare (2)
in most cases requires the
.BR CAP_SYS_ADMIN
capability, since, in the new namespace,
the creator will have the power to change global resources
that are visible to other processes that are subsequently created in,
or join, the namespace.
User namespaces are the exception: since Linux 3.8,
no privilege is required to create a user namespace.
.\"
.\" ==================== The /proc/[pid]/ns/ directory ====================
.\"
.SS The /proc/[pid]/ns/ directory
Each process has a
.IR /proc/[pid]/ns/
.\" See commit 6b4e306aa3dc94a0545eb9279475b1ab6209a31f
subdirectory containing one entry for each namespace that
supports being manipulated by
.BR setns (2):
.PP
.in +4n
.EX
$ \fBls \-l /proc/$$/ns\fP
total 0
lrwxrwxrwx. 1 mtk mtk 0 Apr 28 12:46 cgroup \-> cgroup:[4026531835]
lrwxrwxrwx. 1 mtk mtk 0 Apr 28 12:46 ipc \-> ipc:[4026531839]
lrwxrwxrwx. 1 mtk mtk 0 Apr 28 12:46 mnt \-> mnt:[4026531840]
lrwxrwxrwx. 1 mtk mtk 0 Apr 28 12:46 net \-> net:[4026531969]
lrwxrwxrwx. 1 mtk mtk 0 Apr 28 12:46 pid \-> pid:[4026531836]
lrwxrwxrwx. 1 mtk mtk 0 Apr 28 12:46 pid_for_children \-> pid:[4026531834]
lrwxrwxrwx. 1 mtk mtk 0 Apr 28 12:46 user \-> user:[4026531837]
lrwxrwxrwx. 1 mtk mtk 0 Apr 28 12:46 uts \-> uts:[4026531838]
.EE
.in
.PP
Bind mounting (see
.BR mount (2))
one of the files in this directory
to somewhere else in the filesystem keeps
the corresponding namespace of the process specified by
.I pid
alive even if all processes currently in the namespace terminate.
.PP
Opening one of the files in this directory
(or a file that is bind mounted to one of these files)
returns a file handle for
the corresponding namespace of the process specified by
.IR pid .
As long as this file descriptor remains open,
the namespace will remain alive,
even if all processes in the namespace terminate.
The file descriptor can be passed to
.BR setns (2).
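.PP
As a rough illustration of the paragraph above,
the following Python sketch opens one of these files and checks that
the open descriptor and the symbolic link refer to the same object.
The helper names open_namespace_handle and handle_matches_link
are invented for this example and are not part of any API.
The sketch assumes a Linux system with /proc mounted.

```python
import os


def open_namespace_handle(pid, ns_type):
    """Open /proc/<pid>/ns/<ns_type> and return a read-only descriptor.

    While the descriptor stays open, the namespace it refers to stays alive.
    """
    return os.open(f"/proc/{pid}/ns/{ns_type}", os.O_RDONLY)


def handle_matches_link(fd, pid, ns_type):
    """Return True if fd refers to the namespace named by the /proc link."""
    opened = os.fstat(fd)
    linked = os.stat(f"/proc/{pid}/ns/{ns_type}")  # stat follows the link
    return (opened.st_dev, opened.st_ino) == (linked.st_dev, linked.st_ino)


if __name__ == "__main__" and os.path.exists("/proc/self/ns/uts"):
    fd = open_namespace_handle("self", "uts")
    try:
        print(handle_matches_link(fd, "self", "uts"))
    finally:
        os.close(fd)  # closing the descriptor drops this reference
```

A descriptor obtained this way is the kind of value that setns(2) expects;
the sketch only inspects it and does not change namespace membership.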
.PP
In Linux 3.7 and earlier, these files were visible as hard links.
Since Linux 3.8,
.\" commit bf056bfa80596a5d14b26b17276a56a0dcb080e5
they appear as symbolic links.
If two processes are in the same namespace,
then the device IDs and inode numbers of their
.IR /proc/[pid]/ns/xxx
symbolic links will be the same; an application can check this using the
.I stat.st_dev
and
.I stat.st_ino
fields returned by
.BR stat (2).
The content of this symbolic link is a string containing
the namespace type and inode number as in the following example:
.PP
.in +4n
.EX
$ \fBreadlink /proc/$$/ns/uts\fP
uts:[4026531838]
.EE
.in
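.PP
The two checks described above can be sketched in Python as follows.
This is a minimal sketch, not a definitive implementation:
the function names parse_ns_link and same_namespace are invented here,
and the code assumes that the link target has the
type:[inode] form shown in the example.

```python
import os


def parse_ns_link(target):
    """Split a link target such as 'uts:[4026531838]' into (type, inode)."""
    ns_type, sep, rest = target.partition(":[")
    if not sep or not rest.endswith("]") or not rest[:-1].isdigit():
        raise ValueError(f"unexpected namespace link target: {target!r}")
    return ns_type, int(rest[:-1])


def same_namespace(pid_a, pid_b, ns_type):
    """Compare device ID and inode number of two /proc/<pid>/ns/<type> links."""
    a = os.stat(f"/proc/{pid_a}/ns/{ns_type}")
    b = os.stat(f"/proc/{pid_b}/ns/{ns_type}")
    return (a.st_dev, a.st_ino) == (b.st_dev, b.st_ino)


if __name__ == "__main__" and os.path.exists("/proc/self/ns/uts"):
    print(parse_ns_link(os.readlink("/proc/self/ns/uts")))
    print(same_namespace("self", os.getpid(), "uts"))
```

Comparing both the device ID and the inode number, rather than the
inode number alone, follows the wording of the text above.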
.PP
The symbolic links in this subdirectory are as follows:
.TP
.IR /proc/[pid]/ns/cgroup " (since Linux 4.6)"
This file is a handle for the cgroup namespace of the process.
.TP
.IR /proc/[pid]/ns/ipc " (since Linux 3.0)"
This file is a handle for the IPC namespace of the process.
.TP
.IR /proc/[pid]/ns/mnt " (since Linux 3.8)"
.\" commit 8823c079ba7136dc1948d6f6dcb5f8022bde438e
This file is a handle for the mount namespace of the process.
.TP
.IR /proc/[pid]/ns/net " (since Linux 3.0)"
This file is a handle for the network namespace of the process.
.TP
.IR /proc/[pid]/ns/pid " (since Linux 3.8)"
.\" commit 57e8391d327609cbf12d843259c968b9e5c1838f
This file is a handle for the PID namespace of the process.
This handle is permanent for the lifetime of the process
(i.e., a process's PID namespace membership never changes).
.TP
.IR /proc/[pid]/ns/pid_for_children " (since Linux 4.12)"
.\" commit eaa0d190bfe1ed891b814a52712dcd852554cb08
This file is a handle for the PID namespace of
child processes created by this process.
This can change as a consequence of calls to
.BR unshare (2)
and
.BR setns (2)
(see
.BR pid_namespaces (7)),
so the file may differ from
.IR /proc/[pid]/ns/pid .
The symbolic link gains a value only after the first child process
is created in the namespace.
(Beforehand,
.BR readlink (2)
of the symbolic link will return an empty buffer.)
.TP
.IR /proc/[pid]/ns/user " (since Linux 3.8)"
.\" commit cde1975bc242f3e1072bde623ef378e547b73f91
This file is a handle for the user namespace of the process.
.TP
.IR /proc/[pid]/ns/uts " (since Linux 3.0)"
This file is a handle for the UTS namespace of the process.
.PP
Permission to dereference or read
.RB ( readlink (2))
these symbolic links is governed by a ptrace access mode
.B PTRACE_MODE_READ_FSCREDS
check; see
.BR ptrace (2).
.\"
.\" ==================== The /proc/sys/user directory ====================
.\"
.SS The /proc/sys/user directory
The files in the
.I /proc/sys/user
directory (which is present since Linux 4.9) expose limits
on the number of namespaces of various types that can be created.
The files are as follows:
.TP
.IR max_cgroup_namespaces
The value in this file defines a per-user limit on the number of
cgroup namespaces that may be created in the user namespace.
.TP
.IR max_ipc_namespaces
The value in this file defines a per-user limit on the number of
IPC namespaces that may be created in the user namespace.
.TP
.IR max_mnt_namespaces
The value in this file defines a per-user limit on the number of
mount namespaces that may be created in the user namespace.
.TP
.IR max_net_namespaces
The value in this file defines a per-user limit on the number of
network namespaces that may be created in the user namespace.
.TP
.IR max_pid_namespaces
The value in this file defines a per-user limit on the number of
PID namespaces that may be created in the user namespace.
.TP
.IR max_user_namespaces
The value in this file defines a per-user limit on the number of
user namespaces that may be created in the user namespace.
.TP
.IR max_uts_namespaces
The value in this file defines a per-user limit on the number of
UTS namespaces that may be created in the user namespace.
.PP
Note the following details about these files:
.IP * 3
The values in these files are modifiable by privileged processes.
.IP *
The values exposed by these files are the limits for the user namespace
in which the opening process resides.
.IP *
The limits are per-user.
Each user in the same user namespace
can create namespaces up to the defined limit.
.IP *
The limits apply to all users, including UID 0.
.IP *
These limits apply in addition to any other per-namespace
limits (such as those for PID and user namespaces) that may be enforced.
.IP *
Upon encountering these limits,
.BR clone (2)
and
.BR unshare (2)
fail with the error
.BR ENOSPC .
.IP *
For the initial user namespace,
the default value in each of these files is half the limit on the number
of threads that may be created
.RI ( /proc/sys/kernel/threads-max ).
In all descendant user namespaces, the default value in each file is
.BR MAXINT .
.IP *
When a namespace is created, the object is also accounted
against ancestor namespaces.
More precisely:
.RS
.IP + 3
Each user namespace has a creator UID.
.IP +
When a namespace is created,
it is accounted against the creator UIDs in each of the
ancestor user namespaces,
and the kernel ensures that the corresponding namespace limit
for the creator UID in the ancestor namespace is not exceeded.
.IP +
The aforementioned point ensures that creating a new user namespace
cannot be used as a means to escape the limits in force
in the current user namespace.
.RE
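.PP
To inspect the limits that apply in the current user namespace,
a program can simply read the files listed above.
The following Python sketch shows one way to do so;
the function name read_namespace_limits is invented for this example,
and the code assumes that each file holds a single decimal integer.

```python
import os

USER_LIMITS_DIR = "/proc/sys/user"


def read_namespace_limits(directory=USER_LIMITS_DIR):
    """Return {file name: limit} for the max_*_namespaces files.

    An empty dict is returned if the directory cannot be listed,
    for example on a kernel older than Linux 4.9.
    """
    limits = {}
    try:
        names = sorted(os.listdir(directory))
    except OSError:
        return limits
    for name in names:
        # Skip any other files that the directory may contain.
        if not (name.startswith("max_") and name.endswith("_namespaces")):
            continue
        try:
            with open(os.path.join(directory, name)) as f:
                limits[name] = int(f.read().strip())
        except (OSError, ValueError):
            continue  # unreadable or unexpected content: leave it out
    return limits


if __name__ == "__main__":
    for name, value in read_namespace_limits().items():
        print(name, value)
```

The values read are those for the user namespace of the reading process,
as noted in the list above.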
.\"
.\" ==================== Cgroup namespaces ====================
.\"
.SS Cgroup namespaces (CLONE_NEWCGROUP)
See
.BR cgroup_namespaces (7).
.\"
.\" ==================== IPC namespaces ====================
.\"
.SS IPC namespaces (CLONE_NEWIPC)
IPC namespaces isolate certain IPC resources,
namely, System V IPC objects (see
.BR sysvipc (7))
and (since Linux 2.6.30)
.\" commit 7eafd7c74c3f2e67c27621b987b28397110d643f
.\" https://lwn.net/Articles/312232/
POSIX message queues (see
.BR mq_overview (7)).
The common characteristic of these IPC mechanisms is that IPC
objects are identified by mechanisms other than filesystem
pathnames.
.PP
Each IPC namespace has its own set of System V IPC identifiers and
its own POSIX message queue filesystem.
Objects created in an IPC namespace are visible to all other processes
that are members of that namespace,
but are not visible to processes in other IPC namespaces.
.PP
The following
.I /proc
interfaces are distinct in each IPC namespace:
.IP * 3
The POSIX message queue interfaces in
.IR /proc/sys/fs/mqueue .
.IP *
The System V IPC interfaces in
.IR /proc/sys/kernel ,
namely:
.IR msgmax ,
.IR msgmnb ,
.IR msgmni ,
.IR sem ,
.IR shmall ,
.IR shmmax ,
.IR shmmni ,
and
.IR shm_rmid_forced .
.IP *
The System V IPC interfaces in
.IR /proc/sysvipc .
.PP
When an IPC namespace is destroyed
(i.e., when the last process that is a member of the namespace terminates),
all IPC objects in the namespace are automatically destroyed.
.PP
Use of IPC namespaces requires a kernel that is configured with the
.B CONFIG_IPC_NS
option.
.\"
.\" ==================== Network namespaces ====================
.\"
.SS Network namespaces (CLONE_NEWNET)
See
.BR network_namespaces (7).
.\"
.\" ==================== Mount namespaces ====================
.\"
.SS Mount namespaces (CLONE_NEWNS)
See
.BR mount_namespaces (7).
.\"
.\" ==================== PID namespaces ====================
.\"
.SS PID namespaces (CLONE_NEWPID)
See
.BR pid_namespaces (7).
.\"
.\" ==================== User namespaces ====================
.\"
.SS User namespaces (CLONE_NEWUSER)
See
.BR user_namespaces (7).
.\"
.\" ==================== UTS namespaces ====================
.\"
.SS UTS namespaces (CLONE_NEWUTS)
UTS namespaces provide isolation of two system identifiers:
the hostname and the NIS domain name.
These identifiers are set using
.BR sethostname (2)
and
.BR setdomainname (2),
and can be retrieved using
.BR uname (2),
.BR gethostname (2),
and
.BR getdomainname (2).
.PP
When a process creates a new UTS namespace using
.BR clone (2)
or
.BR unshare (2)
with the
.BR CLONE_NEWUTS
flag, the hostname and NIS domain name of the new UTS namespace are copied
from the corresponding values in the caller's UTS namespace.
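.PP
The copying behavior described above can be observed with a short experiment.
The following Python sketch forks a child that calls unshare(2) through
ctypes and reports the hostname it then sees.
It is a sketch under stated assumptions: the function name
hostname_in_new_uts_namespace is invented here,
the numeric value of CLONE_NEWUTS is taken from the Linux headers,
and the call is expected to fail with EPERM when the child lacks
CAP_SYS_ADMIN.

```python
import ctypes
import errno
import os
import socket

CLONE_NEWUTS = 0x04000000  # value from <linux/sched.h>


def hostname_in_new_uts_namespace():
    """Fork a child, unshare its UTS namespace, and report what it sees.

    Returns ("ok", hostname) if the child created a new UTS namespace, or
    ("err", reason) if unshare(2) failed, for example with EPERM.
    """
    libc = ctypes.CDLL(None, use_errno=True)
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: only this process is moved into the new namespace.
        status = 1
        try:
            os.close(read_fd)
            if libc.unshare(CLONE_NEWUTS) == 0:
                message = "ok " + socket.gethostname()
            else:
                name = errno.errorcode.get(ctypes.get_errno(), "unknown")
                message = "err " + name
            os.write(write_fd, message.encode())
            status = 0
        finally:
            os._exit(status)
    # Parent: read the child's report, then reap the child.
    os.close(write_fd)
    with os.fdopen(read_fd) as pipe:
        data = pipe.read()
    os.waitpid(pid, 0)
    kind, _, value = data.partition(" ")
    if kind not in ("ok", "err"):
        return "err", "no result from child"
    return kind, value


if __name__ == "__main__":
    print(hostname_in_new_uts_namespace())
```

When the call succeeds, the reported hostname should equal the parent's,
since the child does not call sethostname(2) after unsharing.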
.PP
Use of UTS namespaces requires a kernel that is configured with the
.B CONFIG_UTS_NS
option.
.\"
.SS Namespace lifetime
Absent any other factors,
a namespace is automatically torn down when the last process in
the namespace terminates or leaves the namespace.
However, there are a number of other factors that may pin
a namespace into existence even though it has no member processes.
These factors include the following:
.IP * 3
An open file descriptor or a bind mount exists for the corresponding
.IR /proc/[pid]/ns/*
file.
.IP *
The namespace is hierarchical (i.e., a PID or user namespace),
and has a child namespace.
.IP *
It is a user namespace that owns one or more nonuser namespaces.
.IP *
It is a PID namespace,
and there is a process that refers to the namespace via a
.IR /proc/[pid]/ns/pid_for_children
symbolic link.
.IP *
It is an IPC namespace, and a corresponding mount of an
.I mqueue
filesystem (see
.BR mq_overview (7))
refers to this namespace.
.IP *
It is a PID namespace, and a corresponding mount of a
.BR proc (5)
filesystem refers to this namespace.
.SH EXAMPLE
See
.BR clone (2)
and
.BR user_namespaces (7).
.SH SEE ALSO
.BR nsenter (1),
.BR readlink (1),
.BR unshare (1),
.BR clone (2),
.BR ioctl_ns (2),
.BR setns (2),
.BR unshare (2),
.BR proc (5),
.BR capabilities (7),
.BR cgroup_namespaces (7),
.BR cgroups (7),
.BR credentials (7),
.BR network_namespaces (7),
.BR pid_namespaces (7),
.BR user_namespaces (7),
.BR lsns (8),
.BR pam_namespace (8),
.BR switch_root (8)
487 .BR lsns (8),
488 .BR pam_namespace (8),
489 .BR switch_root (8)