Commit | Line | Data |
---|---|---|
930e2ffa | 1 | .\" Copyright (c) 2013, 2016, 2017 by Michael Kerrisk <mtk.manpages@gmail.com> |
7a30282c | 2 | .\" and Copyright (c) 2012 by Eric W. Biederman <ebiederm@xmission.com> |
020357e8 | 3 | .\" |
5fbde956 | 4 | .\" SPDX-License-Identifier: Linux-man-pages-copyleft |
020357e8 MK |
5 | .\" |
6 | .\" | |
45186a5d | 7 | .TH NAMESPACES 7 2021-08-27 "Linux man-pages (unreleased)" |
020357e8 MK |
8 | .SH NAME |
9 | namespaces \- overview of Linux namespaces | |
10 | .SH DESCRIPTION | |
11 | A namespace wraps a global system resource in an abstraction that | |
12 | makes it appear to the processes within the namespace that they | |
13 | have their own isolated instance of the global resource. | |
14 | Changes to the global resource are visible to other processes | |
15 | that are members of the namespace, but are invisible to other processes. | |
16 | One use of namespaces is to implement containers. | |
77eaf052 | 17 | .PP |
ee81d7e4 MK |
18 | This page provides pointers to information on the various namespace types, |
19 | describes the associated | |
020357e8 MK |
20 | .I /proc |
21 | files, and summarizes the APIs for working with namespaces. | |
6be09bd8 | 22 | .\" |
ee81d7e4 | 23 | .SS Namespace types |
ee81d7e4 MK |
24 | The following table shows the namespace types available on Linux. |
25 | The second column of the table shows the flag value that is used to specify | |
26 | the namespace type in various APIs. | |
27 | The third column identifies the manual page that provides details | |
28 | on the namespace type. | |
29 | The last column is a summary of the resources that are isolated by | |
30 | the namespace type. | |
0b174fe0 MK |
31 | .ad l |
32 | .nh | |
ee81d7e4 MK |
33 | .TS |
34 | lB lB lB lB | |
35 | l1 lB1 l1 l. | |
36 | Namespace Flag Page Isolates | |
0b174fe0 MK |
37 | Cgroup CLONE_NEWCGROUP \fBcgroup_namespaces\fP(7) T{ |
38 | Cgroup root directory | |
39 | T} | |
ee81d7e4 MK |
40 | IPC CLONE_NEWIPC \fBipc_namespaces\fP(7) T{ |
41 | System V IPC, | |
ee81d7e4 MK |
42 | POSIX message queues |
43 | T} | |
44 | Network CLONE_NEWNET \fBnetwork_namespaces\fP(7) T{ | |
45 | Network devices, | |
ee81d7e4 MK |
46 | stacks, ports, etc. |
47 | T} | |
48 | Mount CLONE_NEWNS \fBmount_namespaces\fP(7) Mount points | |
49 | PID CLONE_NEWPID \fBpid_namespaces\fP(7) Process IDs | |
19e8f797 MK |
50 | Time CLONE_NEWTIME \fBtime_namespaces\fP(7) T{ |
51 | Boot and monotonic | |
19e8f797 MK |
52 | clocks |
53 | T} | |
1b8089e1 MW |
54 | User CLONE_NEWUSER \fBuser_namespaces\fP(7) T{ |
55 | User and group IDs | |
0b174fe0 | 56 | T} |
ee81d7e4 MK |
57 | UTS CLONE_NEWUTS \fButs_namespaces\fP(7) T{ |
58 | Hostname and NIS | |
ee81d7e4 MK |
59 | domain name |
60 | T} | |
61 | .TE | |
0b174fe0 MK |
62 | .hy |
63 | .ad | |
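As a rough illustration of how the flag column is used, the following Python sketch maps the entry names found under /proc/pid/ns to the corresponding CLONE_NEW* constants and ORs them into one flags value of the kind passed to clone(2) or unshare(2). The numeric values are an assumption taken from the Linux `<linux/sched.h>` header, and the helper name `combine_flags` is hypothetical, not part of any API.

```python
# Numeric values assumed from the Linux <linux/sched.h> header.
CLONE_FLAGS = {
    "cgroup": 0x02000000,  # CLONE_NEWCGROUP
    "ipc":    0x08000000,  # CLONE_NEWIPC
    "net":    0x40000000,  # CLONE_NEWNET
    "mnt":    0x00020000,  # CLONE_NEWNS
    "pid":    0x20000000,  # CLONE_NEWPID
    "time":   0x00000080,  # CLONE_NEWTIME
    "user":   0x10000000,  # CLONE_NEWUSER
    "uts":    0x04000000,  # CLONE_NEWUTS
}


def combine_flags(names):
    """OR together the CLONE_NEW* flags for the given namespace names.

    Hypothetical helper; raises KeyError for an unknown name.
    """
    flags = 0
    for name in names:
        flags |= CLONE_FLAGS[name]
    return flags
```

For example, `combine_flags(["uts", "ipc"])` yields the value of `CLONE_NEWUTS | CLONE_NEWIPC`.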
ee81d7e4 | 64 | .\" |
6be09bd8 MK |
65 | .\" ==================== The namespaces API ==================== |
66 | .\" | |
020357e8 | 67 | .SS The namespaces API |
020357e8 MK |
68 | As well as various |
69 | .I /proc | |
70 | files described below, | |
291e9237 | 71 | the namespaces API includes the following system calls: |
020357e8 MK |
72 | .TP |
73 | .BR clone (2) | |
74 | The | |
75 | .BR clone (2) | |
76 | system call creates a new process. | |
77 | If the | |
78 | .I flags | |
79 | argument of the call specifies one or more of the | |
80 | .B CLONE_NEW* | |
f05d7043 | 81 | flags listed above, then new namespaces are created for each flag, |
020357e8 MK |
82 | and the child process is made a member of those namespaces. |
83 | (This system call also implements a number of features | |
84 | unrelated to namespaces.) | |
020357e8 MK |
85 | .TP |
86 | .BR setns (2) | |
87 | The | |
88 | .BR setns (2) | |
89 | system call allows the calling process to join an existing namespace. | |
90 | The namespace to join is specified via a file descriptor that refers to | |
91 | one of the | |
1ae6b2c7 | 92 | .IR /proc/ pid /ns |
020357e8 | 93 | files described below. |
020357e8 MK |
94 | .TP |
95 | .BR unshare (2) | |
96 | The | |
97 | .BR unshare (2) | |
98 | system call moves the calling process to a new namespace. | |
99 | If the | |
100 | .I flags | |
101 | argument of the call specifies one or more of the | |
102 | .B CLONE_NEW* | |
f05d7043 | 103 | flags listed above, then new namespaces are created for each flag, |
020357e8 MK |
104 | and the calling process is made a member of those namespaces. |
105 | (This system call also implements a number of features | |
106 | unrelated to namespaces.) | |
3426f62c MK |
107 | .TP |
108 | .BR ioctl (2) | |
109 | Various | |
110 | .BR ioctl (2) | |
111 | operations can be used to discover information about namespaces. | |
5a2ed9ee | 112 | These operations are described in |
3426f62c | 113 | .BR ioctl_ns (2). |
3c7103af | 114 | .PP |
027a0716 MK |
115 | Creation of new namespaces using |
116 | .BR clone (2) | |
117 | and | |
118 | .BR unshare (2) | |
119 | in most cases requires the | |
1ae6b2c7 | 120 | .B CAP_SYS_ADMIN |
d45e85a9 MK |
121 | capability, since, in the new namespace, |
122 | the creator will have the power to change global resources | |
123 | that are visible to other processes that are subsequently created in, | |
043aaa94 | 124 | or join, the namespace. |
027a0716 | 125 | User namespaces are the exception: since Linux 3.8, |
2a4cbd77 | 126 | no privilege is required to create a user namespace. |
6be09bd8 MK |
127 | .\" |
128 | .\" ==================== The /proc/[pid]/ns/ directory ==================== | |
129 | .\" | |
cf8bfe6d | 130 | .SS The /proc/[pid]/ns/ directory |
f5d401dd | 131 | Each process has a |
1ae6b2c7 | 132 | .IR /proc/ pid /ns/ |
cf8bfe6d MK |
133 | .\" See commit 6b4e306aa3dc94a0545eb9279475b1ab6209a31f |
134 | subdirectory containing one entry for each namespace that | |
135 | supports being manipulated by | |
f2752f90 | 136 | .BR setns (2): |
77eaf052 | 137 | .PP |
f2752f90 | 138 | .in +4n |
b8302363 | 139 | .EX |
05d2e9d0 | 140 | $ \fBls \-l /proc/$$/ns | awk \(aq{print $1, $9, $10, $11}\(aq\fP |
f2752f90 | 141 | total 0 |
05d2e9d0 MK |
142 | lrwxrwxrwx. cgroup \-> cgroup:[4026531835] |
143 | lrwxrwxrwx. ipc \-> ipc:[4026531839] | |
144 | lrwxrwxrwx. mnt \-> mnt:[4026531840] | |
145 | lrwxrwxrwx. net \-> net:[4026531969] | |
146 | lrwxrwxrwx. pid \-> pid:[4026531836] | |
147 | lrwxrwxrwx. pid_for_children \-> pid:[4026531834] | |
d064d41a MK |
148 | lrwxrwxrwx. time \-> time:[4026531834] |
149 | lrwxrwxrwx. time_for_children \-> time:[4026531834] | |
05d2e9d0 MK |
150 | lrwxrwxrwx. user \-> user:[4026531837] |
151 | lrwxrwxrwx. uts \-> uts:[4026531838] | |
b8302363 | 152 | .EE |
f2752f90 | 153 | .in |
77eaf052 | 154 | .PP |
cf8bfe6d MK |
155 | Bind mounting (see |
156 | .BR mount (2)) | |
157 | one of the files in this directory | |
ab3311aa | 158 | to somewhere else in the filesystem keeps |
cf8bfe6d MK |
159 | the corresponding namespace of the process specified by |
160 | .I pid | |
161 | alive even if all processes currently in the namespace terminate. | |
77eaf052 | 162 | .PP |
cf8bfe6d MK |
163 | Opening one of the files in this directory |
164 | (or a file that is bind mounted to one of these files) | |
165 | returns a file descriptor for | |
166 | the corresponding namespace of the process specified by | |
167 | .IR pid . | |
168 | As long as this file descriptor remains open, | |
169 | the namespace will remain alive, | |
170 | even if all processes in the namespace terminate. | |
171 | The file descriptor can be passed to | |
172 | .BR setns (2). | |
77eaf052 | 173 | .PP |
cf8bfe6d | 174 | In Linux 3.7 and earlier, these files were visible as hard links. |
1dc3d91d MK |
175 | Since Linux 3.8, |
176 | .\" commit bf056bfa80596a5d14b26b17276a56a0dcb080e5 | |
177 | they appear as symbolic links. | |
075f5e65 MK |
178 | If two processes are in the same namespace, |
179 | then the device IDs and inode numbers of their | |
1ae6b2c7 | 180 | .IR /proc/ pid /ns/ xxx |
cf8bfe6d | 181 | symbolic links will be the same; an application can check this using the |
075f5e65 | 182 | .I stat.st_dev |
0dd34252 MK |
183 | .\" Eric Biederman: "I reserve the right for st_dev to be significant |
184 | .\" when comparing namespaces." | |
185 | .\" https://lore.kernel.org/lkml/87poky5ca9.fsf@xmission.com/ | |
186 | .\" Re: Documenting the ioctl interfaces to discover relationships... | |
187 | .\" Date: Mon, 12 Dec 2016 11:30:38 +1300 | |
075f5e65 | 188 | and |
cf8bfe6d | 189 | .I stat.st_ino |
075f5e65 | 190 | fields returned by |
cf8bfe6d MK |
191 | .BR stat (2). |
192 | The content of this symbolic link is a string containing | |
193 | the namespace type and inode number as in the following example: | |
77eaf052 | 194 | .PP |
cf8bfe6d | 195 | .in +4n |
b8302363 | 196 | .EX |
cf8bfe6d MK |
197 | $ \fBreadlink /proc/$$/ns/uts\fP |
198 | uts:[4026531838] | |
b8302363 | 199 | .EE |
cf8bfe6d | 200 | .in |
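The check described above can be sketched in Python roughly as follows. This is a minimal sketch, not a reference implementation: the function names `parse_ns_link` and `same_namespace` are hypothetical, and the regular expression assumes that link targets always have the `type:[inode]` shape shown in the example.

```python
import os
import re


def parse_ns_link(target):
    """Split a link target such as 'uts:[4026531838]' into (type, inode)."""
    match = re.fullmatch(r"([a-z_]+):\[(\d+)\]", target)
    if match is None:
        raise ValueError(f"unexpected namespace link: {target!r}")
    return match.group(1), int(match.group(2))


def same_namespace(path_a, path_b):
    """Return True if both paths resolve to the same device ID and inode.

    os.stat() follows symbolic links, so for /proc/pid/ns/xxx entries
    the comparison is made on the namespace objects themselves.
    """
    a = os.stat(path_a)
    b = os.stat(path_b)
    return (a.st_dev, a.st_ino) == (b.st_dev, b.st_ino)
```

A caller might compare `/proc/self/ns/uts` with the same entry of another process, given sufficient permission to access that process's files.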
77eaf052 | 201 | .PP |
7575dbc5 | 202 | The symbolic links in this subdirectory are as follows: |
cf8bfe6d | 203 | .TP |
1ae6b2c7 | 204 | .IR /proc/ pid /ns/cgroup " (since Linux 4.6)" |
d4d37f0a MK |
205 | This file is a handle for the cgroup namespace of the process. |
206 | .TP | |
1ae6b2c7 | 207 | .IR /proc/ pid /ns/ipc " (since Linux 3.0)" |
cf8bfe6d | 208 | This file is a handle for the IPC namespace of the process. |
cf8bfe6d | 209 | .TP |
1ae6b2c7 | 210 | .IR /proc/ pid /ns/mnt " (since Linux 3.8)" |
7eb8372d | 211 | .\" commit 8823c079ba7136dc1948d6f6dcb5f8022bde438e |
cf8bfe6d | 212 | This file is a handle for the mount namespace of the process. |
cf8bfe6d | 213 | .TP |
1ae6b2c7 | 214 | .IR /proc/ pid /ns/net " (since Linux 3.0)" |
cf8bfe6d | 215 | This file is a handle for the network namespace of the process. |
cf8bfe6d | 216 | .TP |
1ae6b2c7 | 217 | .IR /proc/ pid /ns/pid " (since Linux 3.8)" |
7eb8372d | 218 | .\" commit 57e8391d327609cbf12d843259c968b9e5c1838f |
97a1e5b2 MK |
219 | This file is a handle for the PID namespace of the process. |
220 | This handle is permanent for the lifetime of the process | |
221 | (i.e., a process's PID namespace membership never changes). | |
99e2f752 | 222 | .TP |
1ae6b2c7 | 223 | .IR /proc/ pid /ns/pid_for_children " (since Linux 4.12)" |
99e2f752 | 224 | .\" commit eaa0d190bfe1ed891b814a52712dcd852554cb08 |
97a1e5b2 MK |
225 | This file is a handle for the PID namespace of |
226 | child processes created by this process. | |
227 | This can change as a consequence of calls to | |
228 | .BR unshare (2) | |
229 | and | |
230 | .BR setns (2) | |
231 | (see | |
232 | .BR pid_namespaces (7)), | |
233 | so the file may differ from | |
1ae6b2c7 | 234 | .IR /proc/ pid /ns/pid . |
1a7e08e3 MK |
235 | The symbolic link gains a value only after the first child process |
236 | is created in the namespace. | |
237 | (Beforehand, | |
238 | .BR readlink (2) | |
239 | of the symbolic link will return an empty buffer.) | |
cf8bfe6d | 240 | .TP |
1ae6b2c7 | 241 | .IR /proc/ pid /ns/time " (since Linux 5.6)" |
19e8f797 MK |
242 | This file is a handle for the time namespace of the process. |
243 | .TP | |
1ae6b2c7 | 244 | .IR /proc/ pid /ns/time_for_children " (since Linux 5.6)" |
19e8f797 MK |
245 | This file is a handle for the time namespace of |
246 | child processes created by this process. | |
247 | This can change as a consequence of calls to | |
248 | .BR unshare (2) | |
249 | and | |
250 | .BR setns (2) | |
251 | (see | |
252 | .BR time_namespaces (7)), | |
253 | so the file may differ from | |
1ae6b2c7 | 254 | .IR /proc/ pid /ns/time . |
19e8f797 | 255 | .TP |
1ae6b2c7 | 256 | .IR /proc/ pid /ns/user " (since Linux 3.8)" |
7eb8372d | 257 | .\" commit cde1975bc242f3e1072bde623ef378e547b73f91 |
cf8bfe6d | 258 | This file is a handle for the user namespace of the process. |
cf8bfe6d | 259 | .TP |
1ae6b2c7 | 260 | .IR /proc/ pid /ns/uts " (since Linux 3.0)" |
258e6b6c | 261 | This file is a handle for the UTS namespace of the process. |
33a1ab5d MK |
262 | .PP |
263 | Permission to dereference or read | |
264 | .RB ( readlink (2)) | |
265 | these symbolic links is governed by a ptrace access mode | |
266 | .B PTRACE_MODE_READ_FSCREDS | |
267 | check; see | |
268 | .BR ptrace (2). | |
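To illustrate, the symbolic links listed above could be collected with a short Python helper along these lines. The name `read_ns_links` is hypothetical. Recording `None` for a link that cannot be read (for example, when the ptrace access mode check fails) is a design choice made for this sketch only.

```python
import os


def read_ns_links(ns_dir):
    """Map each entry name in ns_dir to its link target.

    Entries whose link cannot be read are mapped to None.
    Listing the directory itself may still raise OSError.
    """
    links = {}
    for name in sorted(os.listdir(ns_dir)):
        try:
            links[name] = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            links[name] = None
    return links
```

Calling `read_ns_links("/proc/self/ns")` on a Linux system would be expected to return entries such as `uts` and `ipc`, depending on the kernel version.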
6be09bd8 | 269 | .\" |
5046cb72 MK |
270 | .\" ==================== The /proc/sys/user directory ==================== |
271 | .\" | |
272 | .SS The /proc/sys/user directory | |
273 | The files in the | |
274 | .I /proc/sys/user | |
275 | directory (which is present since Linux 4.9) expose limits | |
276 | on the number of namespaces of various types that can be created. | |
277 | The files are as follows: | |
278 | .TP | |
1ae6b2c7 | 279 | .I max_cgroup_namespaces |
5046cb72 MK |
280 | The value in this file defines a per-user limit on the number of |
281 | cgroup namespaces that may be created in the user namespace. | |
282 | .TP | |
1ae6b2c7 | 283 | .I max_ipc_namespaces |
5046cb72 MK |
284 | The value in this file defines a per-user limit on the number of |
285 | IPC namespaces that may be created in the user namespace. | |
286 | .TP | |
1ae6b2c7 | 287 | .I max_mnt_namespaces |
5046cb72 MK |
288 | The value in this file defines a per-user limit on the number of |
289 | mount namespaces that may be created in the user namespace. | |
290 | .TP | |
1ae6b2c7 | 291 | .I max_net_namespaces |
5046cb72 MK |
292 | The value in this file defines a per-user limit on the number of |
293 | network namespaces that may be created in the user namespace. | |
294 | .TP | |
1ae6b2c7 | 295 | .I max_pid_namespaces |
5046cb72 | 296 | The value in this file defines a per-user limit on the number of |
c96bc205 | 297 | PID namespaces that may be created in the user namespace. |
5046cb72 | 298 | .TP |
c8bbab9a MK |
299 | .IR max_time_namespaces " (since Linux 5.7)" |
300 | .\" commit eeec26d5da8248ea4e240b8795bb4364213d3247 | |
301 | The value in this file defines a per-user limit on the number of | |
302 | time namespaces that may be created in the user namespace. | |
303 | .TP | |
1ae6b2c7 | 304 | .I max_user_namespaces |
5046cb72 MK |
305 | The value in this file defines a per-user limit on the number of |
306 | user namespaces that may be created in the user namespace. | |
307 | .TP | |
1ae6b2c7 | 308 | .I max_uts_namespaces |
5046cb72 | 309 | The value in this file defines a per-user limit on the number of |
69fc6c67 | 310 | UTS namespaces that may be created in the user namespace. |
5046cb72 MK |
311 | .PP |
312 | Note the following details about these files: | |
313 | .IP * 3 | |
314 | The values in these files are modifiable by privileged processes. | |
315 | .IP * | |
316 | The values exposed by these files are the limits for the user namespace | |
317 | in which the opening process resides. | |
318 | .IP * | |
319 | The limits are per-user. | |
320 | Each user in the same user namespace | |
321 | can create namespaces up to the defined limit. | |
322 | .IP * | |
323 | The limits apply to all users, including UID 0. | |
324 | .IP * | |
325 | These limits apply in addition to any other per-namespace | |
326 | limits (such as those for PID and user namespaces) that may be enforced. | |
327 | .IP * | |
328 | Upon encountering these limits, | |
329 | .BR clone (2) | |
330 | and | |
331 | .BR unshare (2) | |
332 | fail with the error | |
333 | .BR ENOSPC . | |
334 | .IP * | |
335 | For the initial user namespace, | |
336 | the default value in each of these files is half the limit on the number | |
337 | of threads that may be created | |
cd415e73 | 338 | .RI ( /proc/sys/kernel/threads\-max ). |
5046cb72 MK |
339 | In all descendant user namespaces, the default value in each file is |
340 | .BR MAXINT . | |
341 | .IP * | |
342 | When a namespace is created, the object is also accounted | |
343 | against ancestor namespaces. | |
344 | More precisely: | |
345 | .RS | |
346 | .IP + 3 | |
347 | Each user namespace has a creator UID. | |
348 | .IP + | |
349 | When a namespace is created, | |
350 | it is accounted against the creator UIDs in each of the | |
351 | ancestor user namespaces, | |
352 | and the kernel ensures that the corresponding namespace limit | |
353 | for the creator UID in the ancestor namespace is not exceeded. | |
354 | .IP + | |
355 | The aforementioned point ensures that creating a new user namespace | |
356 | cannot be used as a means to escape the limits in force | |
357 | in the current user namespace. | |
2fe33a0d | 358 | .RE |
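As an illustration, the limits described in this section could be read with a Python sketch like the one below. The function name `read_namespace_limits` is hypothetical. The file name filter is there because the directory may contain entries other than the namespace limits listed here.

```python
import os


def read_namespace_limits(user_dir="/proc/sys/user"):
    """Return {file name: limit} for the max_*_namespaces files in user_dir.

    The values reflect the user namespace of the calling process.
    """
    limits = {}
    for name in sorted(os.listdir(user_dir)):
        if name.startswith("max_") and name.endswith("_namespaces"):
            with open(os.path.join(user_dir, name)) as f:
                limits[name] = int(f.read().strip())
    return limits
```

The `user_dir` parameter exists only so that the sketch can be tried against a directory other than `/proc/sys/user`.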
9a6d888c MK |
359 | .\" |
360 | .SS Namespace lifetime | |
361 | Absent any other factors, | |
362 | a namespace is automatically torn down when the last process in | |
363 | the namespace terminates or leaves the namespace. | |
364 | However, there are a number of other factors that may pin | |
365 | a namespace into existence even though it has no member processes. | |
366 | These factors include the following: | |
367 | .IP * 3 | |
368 | An open file descriptor or a bind mount exists for the corresponding | |
1ae6b2c7 | 369 | .IR /proc/ pid /ns/* |
9a6d888c MK |
370 | file. |
371 | .IP * | |
372 | The namespace is hierarchical (i.e., a PID or user namespace), | |
373 | and has a child namespace. | |
374 | .IP * | |
375 | It is a user namespace that owns one or more nonuser namespaces. | |
376 | .IP * | |
377 | It is a PID namespace, | |
378 | and there is a process that refers to the namespace via a | |
1ae6b2c7 | 379 | .IR /proc/ pid /ns/pid_for_children |
9a6d888c MK |
380 | symbolic link. |
381 | .IP * | |
a1a8c63f MK |
382 | It is a time namespace, |
383 | and there is a process that refers to the namespace via a | |
1ae6b2c7 | 384 | .IR /proc/ pid /ns/time_for_children |
a1a8c63f MK |
385 | symbolic link. |
386 | .IP * | |
9a6d888c MK |
387 | It is an IPC namespace, and a corresponding mount of an |
388 | .I mqueue | |
389 | filesystem (see | |
390 | .BR mq_overview (7)) | |
391 | refers to this namespace. | |
392 | .IP * | |
68bd4ad9 | 393 | It is a PID namespace, and a corresponding mount of a |
9a6d888c MK |
394 | .BR proc (5) |
395 | filesystem refers to this namespace. | |
a14af333 | 396 | .SH EXAMPLES |
e0ab72cb | 397 | See |
94dd730b MK |
398 | .BR clone (2) |
399 | and | |
fa88d1a4 | 400 | .BR user_namespaces (7). |
020357e8 | 401 | .SH SEE ALSO |
86499a6b | 402 | .BR nsenter (1), |
020357e8 | 403 | .BR readlink (1), |
86499a6b | 404 | .BR unshare (1), |
020357e8 | 405 | .BR clone (2), |
e0ab72cb | 406 | .BR ioctl_ns (2), |
020357e8 MK |
407 | .BR setns (2), |
408 | .BR unshare (2), | |
409 | .BR proc (5), | |
029ae9e3 | 410 | .BR capabilities (7), |
a2ee61a3 | 411 | .BR cgroup_namespaces (7), |
35fae0aa | 412 | .BR cgroups (7), |
10f8f8cb | 413 | .BR credentials (7), |
25e96f04 | 414 | .BR ipc_namespaces (7), |
2685b303 | 415 | .BR network_namespaces (7), |
024d6a84 | 416 | .BR pid_namespaces (7), |
67d1131f | 417 | .BR user_namespaces (7), |
30e022e5 | 418 | .BR uts_namespaces (7), |
8512495a | 419 | .BR lsns (8), |
029ae9e3 | 420 | .BR switch_root (8) |