Commit | Line | Data |
---|---|---|
020357e8 | 1 | .\" Copyright (c) 2013 by Michael Kerrisk <mtk.manpages@gmail.com> |
7a30282c | 2 | .\" and Copyright (c) 2012 by Eric W. Biederman <ebiederm@xmission.com> |
020357e8 | 3 | .\" |
c228b4b4 | 4 | .\" %%%LICENSE_START(VERBATIM) |
020357e8 MK |
5 | .\" Permission is granted to make and distribute verbatim copies of this |
6 | .\" manual provided the copyright notice and this permission notice are | |
7 | .\" preserved on all copies. | |
8 | .\" | |
9 | .\" Permission is granted to copy and distribute modified versions of this | |
10 | .\" manual under the conditions for verbatim copying, provided that the | |
11 | .\" entire resulting derived work is distributed under the terms of a | |
12 | .\" permission notice identical to this one. | |
13 | .\" | |
14 | .\" Since the Linux kernel and libraries are constantly changing, this | |
15 | .\" manual page may be incorrect or out-of-date. The author(s) assume no | |
16 | .\" responsibility for errors or omissions, or for damages resulting from | |
17 | .\" the use of the information contained herein. The author(s) may not | |
18 | .\" have taken the same level of care in the production of this manual, | |
19 | .\" which is licensed free of charge, as they might when working | |
20 | .\" professionally. | |
21 | .\" | |
22 | .\" Formatted or processed versions of this manual, if unaccompanied by | |
23 | .\" the source, must acknowledge the copyright and authors of this work. | |
c228b4b4 | 24 | .\" %%%LICENSE_END |
020357e8 MK |
25 | .\" |
26 | .\" | |
9ba01802 | 27 | .TH NAMESPACES 7 2019-03-06 "Linux" "Linux Programmer's Manual" |
020357e8 MK |
28 | .SH NAME |
29 | namespaces \- overview of Linux namespaces | |
30 | .SH DESCRIPTION | |
31 | A namespace wraps a global system resource in an abstraction that | |
32 | makes it appear to the processes within the namespace that they | |
33 | have their own isolated instance of the global resource. | |
34 | Changes to the global resource are visible to other processes | |
35 | that are members of the namespace, but are invisible to processes outside it. | |
36 | One use of namespaces is to implement containers. | |
77eaf052 | 37 | .PP |
0b497138 | 38 | Linux provides the following namespaces: |
0b497138 MK |
39 | .TS |
40 | lB lB lB | |
41 | l lB l. | |
42 | Namespace Constant Isolates | |
d4d37f0a | 43 | Cgroup CLONE_NEWCGROUP Cgroup root directory |
b23c9a79 | 44 | IPC CLONE_NEWIPC System V IPC, POSIX message queues |
0b497138 MK |
45 | Network CLONE_NEWNET Network devices, stacks, ports, etc. |
46 | Mount CLONE_NEWNS Mount points | |
47 | PID CLONE_NEWPID Process IDs | |
48 | User CLONE_NEWUSER User and group IDs | |
49 | UTS CLONE_NEWUTS Hostname and NIS domain name | |
50 | .TE | |
77eaf052 | 51 | .PP |
020357e8 MK |
52 | This page describes the various namespaces and the associated |
53 | .I /proc | |
54 | files, and summarizes the APIs for working with namespaces. | |
6be09bd8 MK |
55 | .\" |
56 | .\" ==================== The namespaces API ==================== | |
57 | .\" | |
020357e8 | 58 | .SS The namespaces API |
020357e8 MK |
59 | As well as various |
60 | .I /proc | |
61 | files described below, | |
291e9237 | 62 | the namespaces API includes the following system calls: |
020357e8 MK |
63 | .TP |
64 | .BR clone (2) | |
65 | The | |
66 | .BR clone (2) | |
67 | system call creates a new process. | |
68 | If the | |
69 | .I flags | |
70 | argument of the call specifies one or more of the | |
71 | .B CLONE_NEW* | |
72 | flags listed below, then new namespaces are created for each flag, | |
73 | and the child process is made a member of those namespaces. | |
74 | (This system call also implements a number of features | |
75 | unrelated to namespaces.) | |
020357e8 MK |
76 | .TP |
77 | .BR setns (2) | |
78 | The | |
79 | .BR setns (2) | |
80 | system call allows the calling process to join an existing namespace. | |
81 | The namespace to join is specified via a file descriptor that refers to | |
82 | one of the | |
83 | .IR /proc/[pid]/ns | |
84 | files described below. | |
020357e8 MK |
85 | .TP |
86 | .BR unshare (2) | |
87 | The | |
88 | .BR unshare (2) | |
89 | system call moves the calling process to a new namespace. | |
90 | If the | |
91 | .I flags | |
92 | argument of the call specifies one or more of the | |
93 | .B CLONE_NEW* | |
94 | flags listed below, then new namespaces are created for each flag, | |
95 | and the calling process is made a member of those namespaces. | |
96 | (This system call also implements a number of features | |
97 | unrelated to namespaces.) | |
3426f62c MK |
98 | .TP |
99 | .BR ioctl (2) | |
100 | Various | |
101 | .BR ioctl (2) | |
102 | operations can be used to discover information about namespaces. | |
5a2ed9ee | 103 | These operations are described in |
3426f62c | 104 | .BR ioctl_ns (2). |
3c7103af | 105 | .PP |
027a0716 MK |
106 | Creation of new namespaces using |
107 | .BR clone (2) | |
108 | and | |
109 | .BR unshare (2) | |
110 | in most cases requires the | |
111 | .BR CAP_SYS_ADMIN | |
d45e85a9 MK |
112 | capability, since, in the new namespace, |
113 | the creator will have the power to change global resources | |
114 | that are visible to other processes that are subsequently created in, | |
043aaa94 | 115 | or join, the namespace. |
027a0716 | 116 | User namespaces are the exception: since Linux 3.8, |
2a4cbd77 | 117 | no privilege is required to create a user namespace. |
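The privilege rule above can be observed from user space. The following is a minimal sketch, not a definitive implementation: it assumes a Linux system where the C library can be loaded through ctypes, the flag values are copied from <linux/sched.h>, and the helper name probe_unshare is hypothetical. The call is made in a forked child so that the calling process keeps its own namespaces.

```python
import ctypes
import errno
import os

# Flag values as defined in <linux/sched.h>.
CLONE_NEWUTS = 0x04000000
CLONE_NEWUSER = 0x10000000

# Load the C library once, before forking (assumes a Linux libc).
_libc = ctypes.CDLL(None, use_errno=True)
_libc.unshare.argtypes = [ctypes.c_int]
_libc.unshare.restype = ctypes.c_int


def probe_unshare(flags):
    """Call unshare(flags) in a throwaway child.

    Returns 0 on success, the errno value on failure, or -1 if the
    child did not exit normally (for example, killed by a filter).
    """
    pid = os.fork()
    if pid == 0:
        # Child: attempt to create the namespaces, report via exit status.
        if _libc.unshare(flags) == 0:
            os._exit(0)
        os._exit(ctypes.get_errno() or errno.EIO)
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -1


if __name__ == "__main__":
    for name, flag in (("user", CLONE_NEWUSER), ("uts", CLONE_NEWUTS)):
        err = probe_unshare(flag)
        print(name, "ok" if err == 0 else f"failed ({err})")
```

For an unprivileged caller one would typically expect the UTS probe to fail with EPERM, while the user namespace probe may succeed, depending on kernel version and local policy.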
6be09bd8 MK |
118 | .\" |
119 | .\" ==================== The /proc/[pid]/ns/ directory ==================== | |
120 | .\" | |
cf8bfe6d | 121 | .SS The /proc/[pid]/ns/ directory |
f5d401dd | 122 | Each process has a |
cf8bfe6d MK |
123 | .IR /proc/[pid]/ns/ |
124 | .\" See commit 6b4e306aa3dc94a0545eb9279475b1ab6209a31f | |
125 | subdirectory containing one entry for each namespace that | |
126 | supports being manipulated by | |
f2752f90 | 127 | .BR setns (2): |
77eaf052 | 128 | .PP |
f2752f90 | 129 | .in +4n |
b8302363 | 130 | .EX |
ced6277a | 131 | $ \fBls \-l /proc/$$/ns\fP |
f2752f90 | 132 | total 0 |
ced6277a MK |
133 | lrwxrwxrwx. 1 mtk mtk 0 Apr 28 12:46 cgroup \-> cgroup:[4026531835] |
134 | lrwxrwxrwx. 1 mtk mtk 0 Apr 28 12:46 ipc \-> ipc:[4026531839] | |
135 | lrwxrwxrwx. 1 mtk mtk 0 Apr 28 12:46 mnt \-> mnt:[4026531840] | |
136 | lrwxrwxrwx. 1 mtk mtk 0 Apr 28 12:46 net \-> net:[4026531969] | |
137 | lrwxrwxrwx. 1 mtk mtk 0 Apr 28 12:46 pid \-> pid:[4026531836] | |
99e2f752 | 138 | lrwxrwxrwx. 1 mtk mtk 0 Apr 28 12:46 pid_for_children \-> pid:[4026531834] |
ced6277a MK |
139 | lrwxrwxrwx. 1 mtk mtk 0 Apr 28 12:46 user \-> user:[4026531837] |
140 | lrwxrwxrwx. 1 mtk mtk 0 Apr 28 12:46 uts \-> uts:[4026531838] | |
b8302363 | 141 | .EE |
f2752f90 | 142 | .in |
77eaf052 | 143 | .PP |
cf8bfe6d MK |
144 | Bind mounting (see |
145 | .BR mount (2)) | |
146 | one of the files in this directory | |
ab3311aa | 147 | to somewhere else in the filesystem keeps |
cf8bfe6d MK |
148 | the corresponding namespace of the process specified by |
149 | .I pid | |
150 | alive even if all processes currently in the namespace terminate. | |
77eaf052 | 151 | .PP |
cf8bfe6d MK |
152 | Opening one of the files in this directory |
153 | (or a file that is bind mounted to one of these files) | |
154 | returns a file descriptor for | |
155 | the corresponding namespace of the process specified by | |
156 | .IR pid . | |
157 | As long as this file descriptor remains open, | |
158 | the namespace will remain alive, | |
159 | even if all processes in the namespace terminate. | |
160 | The file descriptor can be passed to | |
161 | .BR setns (2). | |
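As an illustration of the paragraph above, the sketch below opens one of these files and inspects the resulting descriptor. The helper names open_namespace and namespace_inode are hypothetical, and the code assumes that /proc is mounted. While the descriptor is open, the namespace it refers to stays alive.

```python
import os


def open_namespace(pid, ns_type):
    """Open /proc/<pid>/ns/<ns_type> and return a file descriptor.

    pid may be a number or the string "self".
    """
    path = f"/proc/{pid}/ns/{ns_type}"
    return os.open(path, os.O_RDONLY | os.O_CLOEXEC)


def namespace_inode(fd):
    """Return the inode number of the namespace that fd refers to."""
    return os.fstat(fd).st_ino


if __name__ == "__main__":
    fd = open_namespace("self", "uts")
    try:
        print("UTS namespace inode:", namespace_inode(fd))
    finally:
        # Closing the descriptor drops this reference to the namespace.
        os.close(fd)
```

A descriptor obtained this way is the kind of argument that setns(2) expects; joining a namespace usually requires additional privilege and is not attempted here.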
77eaf052 | 162 | .PP |
cf8bfe6d | 163 | In Linux 3.7 and earlier, these files were visible as hard links. |
1dc3d91d MK |
164 | Since Linux 3.8, |
165 | .\" commit bf056bfa80596a5d14b26b17276a56a0dcb080e5 | |
166 | they appear as symbolic links. | |
075f5e65 MK |
167 | If two processes are in the same namespace, |
168 | then the device IDs and inode numbers of their | |
cf8bfe6d MK |
169 | .IR /proc/[pid]/ns/xxx |
170 | symbolic links will be the same; an application can check this using the | |
075f5e65 MK |
171 | .I stat.st_dev |
172 | and | |
cf8bfe6d | 173 | .I stat.st_ino |
075f5e65 | 174 | fields returned by |
cf8bfe6d MK |
175 | .BR stat (2). |
176 | The content of this symbolic link is a string containing | |
177 | the namespace type and inode number as in the following example: | |
77eaf052 | 178 | .PP |
cf8bfe6d | 179 | .in +4n |
b8302363 | 180 | .EX |
cf8bfe6d MK |
181 | $ \fBreadlink /proc/$$/ns/uts\fP |
182 | uts:[4026531838] | |
b8302363 | 183 | .EE |
cf8bfe6d | 184 | .in |
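The two checks described above (comparing device IDs and inode numbers, and reading the link content) might be sketched as follows. The function names are hypothetical, and the link format is assumed to be exactly the "type:[inode]" form shown in the example.

```python
import os
import re

# Matches link targets such as "uts:[4026531838]".
_NS_LINK = re.compile(r"^([a-z_]+):\[(\d+)\]$")


def parse_ns_link(target):
    """Split a link target into (namespace type, inode number)."""
    match = _NS_LINK.match(target)
    if match is None:
        raise ValueError(f"unexpected namespace link target: {target!r}")
    return match.group(1), int(match.group(2))


def same_namespace(pid_a, pid_b, ns_type):
    """Return True if both processes are in the same ns_type namespace."""
    stat_a = os.stat(f"/proc/{pid_a}/ns/{ns_type}")
    stat_b = os.stat(f"/proc/{pid_b}/ns/{ns_type}")
    # Same device ID and inode number means the same namespace.
    return (stat_a.st_dev, stat_a.st_ino) == (stat_b.st_dev, stat_b.st_ino)


if __name__ == "__main__":
    print(parse_ns_link(os.readlink("/proc/self/ns/uts")))
    print(same_namespace("self", 1, "uts"))
```

Examining another process's links is subject to the ptrace access mode check, so the second call may fail with a permission error for an unprivileged caller.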
77eaf052 | 185 | .PP |
7575dbc5 | 186 | The symbolic links in this subdirectory are as follows: |
cf8bfe6d | 187 | .TP |
d4d37f0a MK |
188 | .IR /proc/[pid]/ns/cgroup " (since Linux 4.6)" |
189 | This file is a handle for the cgroup namespace of the process. | |
190 | .TP | |
cf8bfe6d MK |
191 | .IR /proc/[pid]/ns/ipc " (since Linux 3.0)" |
192 | This file is a handle for the IPC namespace of the process. | |
cf8bfe6d MK |
193 | .TP |
194 | .IR /proc/[pid]/ns/mnt " (since Linux 3.8)" | |
7eb8372d | 195 | .\" commit 8823c079ba7136dc1948d6f6dcb5f8022bde438e |
cf8bfe6d | 196 | This file is a handle for the mount namespace of the process. |
cf8bfe6d MK |
197 | .TP |
198 | .IR /proc/[pid]/ns/net " (since Linux 3.0)" | |
199 | This file is a handle for the network namespace of the process. | |
cf8bfe6d MK |
200 | .TP |
201 | .IR /proc/[pid]/ns/pid " (since Linux 3.8)" | |
7eb8372d | 202 | .\" commit 57e8391d327609cbf12d843259c968b9e5c1838f |
97a1e5b2 MK |
203 | This file is a handle for the PID namespace of the process. |
204 | This handle is permanent for the lifetime of the process | |
205 | (i.e., a process's PID namespace membership never changes). | |
99e2f752 KT |
206 | .TP |
207 | .IR /proc/[pid]/ns/pid_for_children " (since Linux 4.12)" | |
208 | .\" commit eaa0d190bfe1ed891b814a52712dcd852554cb08 | |
97a1e5b2 MK |
209 | This file is a handle for the PID namespace of |
210 | child processes created by this process. | |
211 | This can change as a consequence of calls to | |
212 | .BR unshare (2) | |
213 | and | |
214 | .BR setns (2) | |
215 | (see | |
216 | .BR pid_namespaces (7)), | |
217 | so the file may differ from | |
218 | .IR /proc/[pid]/ns/pid . | |
1a7e08e3 MK |
219 | The symbolic link gains a value only after the first child process |
220 | is created in the namespace. | |
221 | (Beforehand, | |
222 | .BR readlink (2) | |
223 | of the symbolic link will return an empty buffer.) | |
cf8bfe6d MK |
224 | .TP |
225 | .IR /proc/[pid]/ns/user " (since Linux 3.8)" | |
7eb8372d | 226 | .\" commit cde1975bc242f3e1072bde623ef378e547b73f91 |
cf8bfe6d | 227 | This file is a handle for the user namespace of the process. |
cf8bfe6d MK |
228 | .TP |
229 | .IR /proc/[pid]/ns/uts " (since Linux 3.0)" | |
258e6b6c | 230 | This file is a handle for the UTS namespace of the process. |
33a1ab5d MK |
231 | .PP |
232 | Permission to dereference or read | |
233 | .RB ( readlink (2)) | |
234 | these symbolic links is governed by a ptrace access mode | |
235 | .B PTRACE_MODE_READ_FSCREDS | |
236 | check; see | |
237 | .BR ptrace (2). | |
6be09bd8 | 238 | .\" |
5046cb72 MK |
239 | .\" ==================== The /proc/sys/user directory ==================== |
240 | .\" | |
241 | .SS The /proc/sys/user directory | |
242 | The files in the | |
243 | .I /proc/sys/user | |
244 | directory (which is present since Linux 4.9) expose limits | |
245 | on the number of namespaces of various types that can be created. | |
246 | The files are as follows: | |
247 | .TP | |
248 | .IR max_cgroup_namespaces | |
249 | The value in this file defines a per-user limit on the number of | |
250 | cgroup namespaces that may be created in the user namespace. | |
251 | .TP | |
252 | .IR max_ipc_namespaces | |
253 | The value in this file defines a per-user limit on the number of | |
254 | IPC namespaces that may be created in the user namespace. | |
255 | .TP | |
256 | .IR max_mnt_namespaces | |
257 | The value in this file defines a per-user limit on the number of | |
258 | mount namespaces that may be created in the user namespace. | |
259 | .TP | |
260 | .IR max_net_namespaces | |
261 | The value in this file defines a per-user limit on the number of | |
262 | network namespaces that may be created in the user namespace. | |
263 | .TP | |
264 | .IR max_pid_namespaces | |
265 | The value in this file defines a per-user limit on the number of | |
266 | PID namespaces that may be created in the user namespace. | |
267 | .TP | |
268 | .IR max_user_namespaces | |
269 | The value in this file defines a per-user limit on the number of | |
270 | user namespaces that may be created in the user namespace. | |
271 | .TP | |
272 | .IR max_uts_namespaces | |
273 | The value in this file defines a per-user limit on the number of | |
69fc6c67 | 274 | UTS namespaces that may be created in the user namespace. |
5046cb72 MK |
275 | .PP |
276 | Note the following details about these files: | |
277 | .IP * 3 | |
278 | The values in these files are modifiable by privileged processes. | |
279 | .IP * | |
280 | The values exposed by these files are the limits for the user namespace | |
281 | in which the opening process resides. | |
282 | .IP * | |
283 | The limits are per-user. | |
284 | Each user in the same user namespace | |
285 | can create namespaces up to the defined limit. | |
286 | .IP * | |
287 | The limits apply to all users, including UID 0. | |
288 | .IP * | |
289 | These limits apply in addition to any other per-namespace | |
290 | limits (such as those for PID and user namespaces) that may be enforced. | |
291 | .IP * | |
292 | Upon encountering these limits, | |
293 | .BR clone (2) | |
294 | and | |
295 | .BR unshare (2) | |
296 | fail with the error | |
297 | .BR ENOSPC . | |
298 | .IP * | |
299 | For the initial user namespace, | |
300 | the default value in each of these files is half the limit on the number | |
301 | of threads that may be created | |
302 | .RI ( /proc/sys/kernel/threads-max ). | |
303 | In all descendant user namespaces, the default value in each file is | |
304 | .BR MAXINT . | |
305 | .IP * | |
306 | When a namespace is created, the object is also accounted | |
307 | against ancestor namespaces. | |
308 | More precisely: | |
309 | .RS | |
310 | .IP + 3 | |
311 | Each user namespace has a creator UID. | |
312 | .IP + | |
313 | When a namespace is created, | |
314 | it is accounted against the creator UIDs in each of the | |
315 | ancestor user namespaces, | |
316 | and the kernel ensures that the corresponding namespace limit | |
317 | for the creator UID in the ancestor namespace is not exceeded. | |
318 | .IP + | |
319 | The aforementioned point ensures that creating a new user namespace | |
320 | cannot be used as a means to escape the limits in force | |
321 | in the current user namespace. | |
322 | .RE | |
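The limits that apply in the caller's user namespace can be collected with a few lines of code. The following is a sketch only: the function name read_namespace_limits and its base parameter are hypothetical (the parameter exists so that the function can be pointed at another directory), and files that are absent, as on kernels before Linux 4.9, are skipped.

```python
import os

# Namespace types that have a max_*_namespaces file on this page.
NAMESPACE_TYPES = ("cgroup", "ipc", "mnt", "net", "pid", "user", "uts")


def read_namespace_limits(base="/proc/sys/user"):
    """Return a dict mapping namespace type to its per-user limit."""
    limits = {}
    for ns_type in NAMESPACE_TYPES:
        path = os.path.join(base, f"max_{ns_type}_namespaces")
        try:
            with open(path) as f:
                limits[ns_type] = int(f.read().strip())
        except FileNotFoundError:
            # The file is not present on this kernel; skip it.
            continue
    return limits


if __name__ == "__main__":
    for ns_type, limit in read_namespace_limits().items():
        print(f"{ns_type:8} {limit}")
```

The values reported are those for the user namespace of the process that opens the files, so the same code may print different numbers when run inside a container.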
5046cb72 | 323 | .\" |
d4d37f0a MK |
324 | .\" ==================== Cgroup namespaces ==================== |
325 | .\" | |
326 | .SS Cgroup namespaces (CLONE_NEWCGROUP) | |
a2ee61a3 MK |
327 | See |
328 | .BR cgroup_namespaces (7). | |
d4d37f0a | 329 | .\" |
6be09bd8 MK |
330 | .\" ==================== IPC namespaces ==================== |
331 | .\" | |
020357e8 | 332 | .SS IPC namespaces (CLONE_NEWIPC) |
020357e8 MK |
333 | IPC namespaces isolate certain IPC resources, |
334 | namely, System V IPC objects (see | |
335 | .BR svipc (7)) | |
9343f8e7 MK |
336 | and (since Linux 2.6.30) |
337 | .\" commit 7eafd7c74c3f2e67c27621b987b28397110d643f | |
338 | .\" https://lwn.net/Articles/312232/ | |
339 | POSIX message queues (see | |
f7611a00 | 340 | .BR mq_overview (7)). |
9343f8e7 | 341 | The common characteristic of these IPC mechanisms is that IPC |
ab3311aa | 342 | objects are identified by mechanisms other than filesystem |
9343f8e7 | 343 | pathnames. |
77eaf052 | 344 | .PP |
020357e8 | 345 | Each IPC namespace has its own set of System V IPC identifiers and |
ab3311aa | 346 | its own POSIX message queue filesystem. |
9343f8e7 MK |
347 | Objects created in an IPC namespace are visible to all other processes |
348 | that are members of that namespace, | |
349 | but are not visible to processes in other IPC namespaces. | |
77eaf052 | 350 | .PP |
f344e055 MK |
351 | The following |
352 | .I /proc | |
353 | interfaces are distinct in each IPC namespace: | |
354 | .IP * 3 | |
355 | The POSIX message queue interfaces in | |
356 | .IR /proc/sys/fs/mqueue . | |
357 | .IP * | |
beb9df9e | 358 | The System V IPC interfaces in |
f344e055 MK |
359 | .IR /proc/sys/kernel , |
360 | namely: | |
361 | .IR msgmax , | |
362 | .IR msgmnb , | |
363 | .IR msgmni , | |
364 | .IR sem , | |
365 | .IR shmall , | |
366 | .IR shmmax , | |
367 | .IR shmmni , | |
368 | and | |
369 | .IR shm_rmid_forced . | |
370 | .IP * | |
beb9df9e | 371 | The System V IPC interfaces in |
f344e055 MK |
372 | .IR /proc/sysvipc . |
373 | .PP | |
9343f8e7 MK |
374 | When an IPC namespace is destroyed |
375 | (i.e., when the last process that is a member of the namespace terminates), | |
376 | all IPC objects in the namespace are automatically destroyed. | |
77eaf052 | 377 | .PP |
9343f8e7 MK |
378 | Use of IPC namespaces requires a kernel that is configured with the |
379 | .B CONFIG_IPC_NS | |
380 | option. | |
6be09bd8 MK |
381 | .\" |
382 | .\" ==================== Network namespaces ==================== | |
383 | .\" | |
020357e8 | 384 | .SS Network namespaces (CLONE_NEWNET) |
2685b303 MK |
385 | See |
386 | .BR network_namespaces (7). | |
6be09bd8 MK |
387 | .\" |
388 | .\" ==================== Mount namespaces ==================== | |
389 | .\" | |
357002ec | 390 | .SS Mount namespaces (CLONE_NEWNS) |
da031af1 MK |
391 | See |
392 | .BR mount_namespaces (7). | |
6be09bd8 MK |
393 | .\" |
394 | .\" ==================== PID namespaces ==================== | |
395 | .\" | |
020357e8 | 396 | .SS PID namespaces (CLONE_NEWPID) |
024d6a84 MK |
397 | See |
398 | .BR pid_namespaces (7). | |
6be09bd8 MK |
399 | .\" |
400 | .\" ==================== User namespaces ==================== | |
401 | .\" | |
020357e8 | 402 | .SS User namespaces (CLONE_NEWUSER) |
67d1131f MK |
403 | See |
404 | .BR user_namespaces (7). | |
6be09bd8 MK |
405 | .\" |
406 | .\" ==================== UTS namespaces ==================== | |
407 | .\" | |
020357e8 | 408 | .SS UTS namespaces (CLONE_NEWUTS) |
020357e8 MK |
409 | UTS namespaces provide isolation of two system identifiers: |
410 | the hostname and the NIS domain name. | |
411 | These identifiers are set using | |
412 | .BR sethostname (2) | |
413 | and | |
414 | .BR setdomainname (2), | |
415 | and can be retrieved using | |
416 | .BR uname (2), | |
417 | .BR gethostname (2), | |
418 | and | |
419 | .BR getdomainname (2). | |
77eaf052 | 420 | .PP |
83d9e9b2 MK |
421 | Use of UTS namespaces requires a kernel that is configured with the |
422 | .B CONFIG_UTS_NS | |
423 | option. | |
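The isolation provided by a UTS namespace can be demonstrated with a short program. This is a sketch under stated assumptions: the C library is reached through ctypes, the flag value is copied from <linux/sched.h>, and the function name set_hostname_in_new_uts is hypothetical. The child changes its hostname only after unshare(2) has succeeded, so the hostname seen by the parent is never touched.

```python
import ctypes
import os
import socket

CLONE_NEWUTS = 0x04000000  # as defined in <linux/sched.h>

# Load the C library once, before forking (assumes a Linux libc).
_libc = ctypes.CDLL(None, use_errno=True)
_libc.unshare.argtypes = [ctypes.c_int]
_libc.unshare.restype = ctypes.c_int


def set_hostname_in_new_uts(name):
    """Fork a child that moves to a new UTS namespace and renames itself.

    Returns True if the child managed to change its own hostname.
    """
    pid = os.fork()
    if pid == 0:
        ok = False
        try:
            # Only touch the hostname once we are in a private namespace.
            if _libc.unshare(CLONE_NEWUTS) == 0:
                socket.sethostname(name)
                ok = socket.gethostname() == name
        except OSError:
            ok = False
        finally:
            os._exit(0 if ok else 1)
    _, status = os.waitpid(pid, 0)
    return os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0


if __name__ == "__main__":
    before = socket.gethostname()
    changed = set_hostname_in_new_uts("ns-sketch")
    print("child changed its hostname:", changed)
    print("parent hostname unchanged:", socket.gethostname() == before)
```

Without CAP_SYS_ADMIN the unshare(2) call is expected to fail, in which case the function returns False and no hostname is modified.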
9a6d888c MK |
424 | .\" |
425 | .SS Namespace lifetime | |
426 | Absent any other factors, | |
427 | a namespace is automatically torn down when the last process in | |
428 | the namespace terminates or leaves the namespace. | |
429 | However, there are a number of other factors that may pin | |
430 | a namespace into existence even though it has no member processes. | |
431 | These factors include the following: | |
432 | .IP * 3 | |
433 | An open file descriptor or a bind mount exists for the corresponding | |
434 | .IR /proc/[pid]/ns/* | |
435 | file. | |
436 | .IP * | |
437 | The namespace is hierarchical (i.e., a PID or user namespace), | |
438 | and has a child namespace. | |
439 | .IP * | |
440 | It is a user namespace that owns one or more nonuser namespaces. | |
441 | .IP * | |
442 | It is a PID namespace, | |
443 | and there is a process that refers to the namespace via a | |
444 | .IR /proc/[pid]/ns/pid_for_children | |
445 | symbolic link. | |
446 | .IP * | |
447 | It is an IPC namespace, and a corresponding mount of an | |
448 | .I mqueue | |
449 | filesystem (see | |
450 | .BR mq_overview (7)) | |
451 | refers to this namespace. | |
452 | .IP * | |
68bd4ad9 | 453 | It is a PID namespace, and a corresponding mount of a |
9a6d888c MK |
454 | .BR proc (5) |
455 | filesystem refers to this namespace. | |
fa88d1a4 | 456 | .SH EXAMPLE |
e0ab72cb | 457 | See |
94dd730b MK |
458 | .BR clone (2) |
459 | and | |
fa88d1a4 | 460 | .BR user_namespaces (7). |
020357e8 | 461 | .SH SEE ALSO |
86499a6b | 462 | .BR nsenter (1), |
020357e8 | 463 | .BR readlink (1), |
86499a6b | 464 | .BR unshare (1), |
020357e8 | 465 | .BR clone (2), |
e0ab72cb | 466 | .BR ioctl_ns (2), |
020357e8 MK |
467 | .BR setns (2), |
468 | .BR unshare (2), | |
469 | .BR proc (5), | |
029ae9e3 | 470 | .BR capabilities (7), |
a2ee61a3 | 471 | .BR cgroup_namespaces (7), |
35fae0aa | 472 | .BR cgroups (7), |
10f8f8cb | 473 | .BR credentials (7), |
2685b303 | 474 | .BR network_namespaces (7), |
024d6a84 | 475 | .BR pid_namespaces (7), |
67d1131f | 476 | .BR user_namespaces (7), |
8512495a | 477 | .BR lsns (8), |
2c1608c2 | 478 | .BR pam_namespace (8), |
029ae9e3 | 479 | .BR switch_root (8) |