1 .\" Copyright (c) 2016, 2019 by Michael Kerrisk <mtk.manpages@gmail.com>
2 .\"
3 .\" %%%LICENSE_START(VERBATIM)
4 .\" Permission is granted to make and distribute verbatim copies of this
5 .\" manual provided the copyright notice and this permission notice are
6 .\" preserved on all copies.
7 .\"
8 .\" Permission is granted to copy and distribute modified versions of this
9 .\" manual under the conditions for verbatim copying, provided that the
10 .\" entire resulting derived work is distributed under the terms of a
11 .\" permission notice identical to this one.
12 .\"
13 .\" Since the Linux kernel and libraries are constantly changing, this
14 .\" manual page may be incorrect or out-of-date. The author(s) assume no
15 .\" responsibility for errors or omissions, or for damages resulting from
16 .\" the use of the information contained herein. The author(s) may not
17 .\" have taken the same level of care in the production of this manual,
18 .\" which is licensed free of charge, as they might when working
19 .\" professionally.
20 .\"
21 .\" Formatted or processed versions of this manual, if unaccompanied by
22 .\" the source, must acknowledge the copyright and authors of this work.
23 .\" %%%LICENSE_END
24 .\"
25 .\"
26 .TH MOUNT_NAMESPACES 7 2019-08-02 "Linux" "Linux Programmer's Manual"
27 .SH NAME
28 mount_namespaces \- overview of Linux mount namespaces
29 .SH DESCRIPTION
30 For an overview of namespaces, see
31 .BR namespaces (7).
32 .PP
33 Mount namespaces provide isolation of the list of mount points seen
34 by the processes in each namespace instance.
35 Thus, the processes in each of the mount namespace instances
36 will see distinct single-directory hierarchies.
37 .PP
38 The views provided by the
39 .IR /proc/[pid]/mounts ,
40 .IR /proc/[pid]/mountinfo ,
41 and
42 .IR /proc/[pid]/mountstats
43 files (all described in
44 .BR proc (5))
45 correspond to the mount namespace in which the process with the PID
46 .IR [pid]
47 resides.
48 (All of the processes that reside in the same mount namespace
49 will see the same view in these files.)
50 .PP
51 A new mount namespace is created using either
52 .BR clone (2)
53 or
54 .BR unshare (2)
55 with the
56 .BR CLONE_NEWNS
57 flag.
58 When a new mount namespace is created,
59 its mount point list is initialized as follows:
60 .IP * 3
61 If the namespace is created using
62 .BR clone (2),
63 the mount point list of the child's namespace is a copy
64 of the mount point list in the parent's namespace.
65 .IP *
66 If the namespace is created using
67 .BR unshare (2),
68 the mount point list of the new namespace is a copy of
69 the mount point list in the caller's previous mount namespace.
70 .PP
71 Subsequent modifications to the mount point list
72 .RB ( mount (2)
73 and
74 .BR umount (2))
75 in either mount namespace will not (by default) affect the
76 mount point list seen in the other namespace
77 (but see the following discussion of shared subtrees).
78 .\"
79 .\" ============================================================
80 .\"
81 .SS Restrictions on mount namespaces
82 Note the following points with respect to mount namespaces:
83 .IP * 3
84 Each mount namespace has an owner user namespace.
85 As explained above, when a new mount namespace is created,
86 its mount point list is initialized as a copy of the mount point list
87 of another mount namespace.
88 If the new namespace and the namespace from which the mount point list
89 was copied are owned by different user namespaces,
90 then the new mount namespace is considered
91 .IR "less privileged" .
92 .IP *
93 When creating a less privileged mount namespace,
94 shared mounts are reduced to slave mounts.
95 (Shared and slave mounts are discussed below.)
96 This ensures that mappings performed in less
97 privileged mount namespaces will not propagate to more privileged
98 mount namespaces.
99 .IP *
100 .\" FIXME .
101 .\" What does "come as a single unit from more privileged mount" mean?
Mounts that come as a single unit from a more privileged mount namespace are
locked together and may not be separated in a less privileged mount
namespace.
105 (The
106 .BR unshare (2)
107 .B CLONE_NEWNS
108 operation brings across all of the mounts from the original
109 mount namespace as a single unit,
110 and recursive mounts that propagate between
111 mount namespaces propagate as a single unit.)
112 .IP *
113 The
114 .BR mount (2)
115 flags
116 .BR MS_RDONLY ,
117 .BR MS_NOSUID ,
118 .BR MS_NOEXEC ,
119 and the "atime" flags
120 .RB ( MS_NOATIME ,
121 .BR MS_NODIRATIME ,
122 .BR MS_RELATIME )
123 settings become locked
124 .\" commit 9566d6742852c527bf5af38af5cbb878dad75705
125 .\" Author: Eric W. Biederman <ebiederm@xmission.com>
126 .\" Date: Mon Jul 28 17:26:07 2014 -0700
127 .\"
128 .\" mnt: Correct permission checks in do_remount
129 .\"
130 when propagated from a more privileged to
131 a less privileged mount namespace,
132 and may not be changed in the less privileged mount namespace.
133 .IP *
134 .\" (As of 3.18-rc1 (in Al Viro's 2014-08-30 vfs.git#for-next tree))
A file or directory that is a mount point in one namespace but is not
a mount point in another namespace may be renamed, unlinked, or removed
137 .RB ( rmdir (2))
138 in the mount namespace in which it is not a mount point
139 (subject to the usual permission checks).
140 Consequently, the mount point is removed in the mount namespace
141 where it was a mount point.
142 .IP
143 Previously (before Linux 3.18),
144 .\" mtk: The change was in Linux 3.18, I think, with this commit:
145 .\" commit 8ed936b5671bfb33d89bc60bdcc7cf0470ba52fe
146 .\" Author: Eric W. Biederman <ebiederman@twitter.com>
147 .\" Date: Tue Oct 1 18:33:48 2013 -0700
148 .\"
149 .\" vfs: Lazily remove mounts on unlinked files and directories.
150 attempting to unlink, rename, or remove a file or directory
151 that was a mount point in another mount namespace would result in the error
152 .BR EBUSY .
153 That behavior had technical problems of enforcement (e.g., for NFS)
and permitted denial-of-service attacks against more privileged users
(i.e., preventing individual files from being updated
by bind mounting on top of them).
157 .\"
158 .SH SHARED SUBTREES
159 After the implementation of mount namespaces was completed,
160 experience showed that the isolation that they provided was,
161 in some cases, too great.
162 For example, in order to make a newly loaded optical disk
163 available in all mount namespaces,
164 a mount operation was required in each namespace.
165 For this use case, and others,
166 the shared subtree feature was introduced in Linux 2.6.15.
167 This feature allows for automatic, controlled propagation of mount and unmount
168 .I events
169 between namespaces
170 (or, more precisely, between the members of a
171 .IR "peer group"
172 that are propagating events to one another).
173 .PP
174 Each mount point is marked (via
175 .BR mount (2))
176 as having one of the following
177 .IR "propagation types" :
178 .TP
179 .BR MS_SHARED
180 This mount point shares events with members of a peer group.
181 Mount and unmount events immediately under this mount point will propagate
182 to the other mount points that are members of the peer group.
183 .I Propagation
184 here means that the same mount or unmount will automatically occur
185 under all of the other mount points in the peer group.
186 Conversely, mount and unmount events that take place under
187 peer mount points will propagate to this mount point.
188 .TP
189 .BR MS_PRIVATE
190 This mount point is private; it does not have a peer group.
191 Mount and unmount events do not propagate into or out of this mount point.
192 .TP
193 .BR MS_SLAVE
194 Mount and unmount events propagate into this mount point from
195 a (master) shared peer group.
196 Mount and unmount events under this mount point do not propagate to any peer.
197 .IP
198 Note that a mount point can be the slave of another peer group
199 while at the same time sharing mount and unmount events
200 with a peer group of which it is a member.
201 (More precisely, one peer group can be the slave of another peer group.)
202 .TP
203 .BR MS_UNBINDABLE
204 This is like a private mount,
205 and in addition this mount can't be bind mounted.
206 Attempts to bind mount this mount
207 .RB ( mount (2)
208 with the
209 .BR MS_BIND
210 flag) will fail.
211 .IP
212 When a recursive bind mount
213 .RB ( mount (2)
214 with the
215 .BR MS_BIND
216 and
217 .BR MS_REC
218 flags) is performed on a directory subtree,
any unbindable mounts within the subtree are automatically pruned
220 (i.e., not replicated)
221 when replicating that subtree to produce the target subtree.
222 .PP
223 For a discussion of the propagation type assigned to a new mount,
224 see NOTES.
225 .PP
226 The propagation type is a per-mount-point setting;
227 some mount points may be marked as shared
228 (with each shared mount point being a member of a distinct peer group),
229 while others are private
230 (or slaved or unbindable).
231 .PP
232 Note that a mount's propagation type determines whether
233 mounts and unmounts of mount points
234 .I "immediately under"
235 the mount point are propagated.
236 Thus, the propagation type does not affect propagation of events for
237 grandchildren and further removed descendant mount points.
238 What happens if the mount point itself is unmounted is determined by
239 the propagation type that is in effect for the
240 .I parent
241 of the mount point.
242 .PP
243 Members are added to a
244 .IR "peer group"
245 when a mount point is marked as shared and either:
246 .IP * 3
247 the mount point is replicated during the creation of a new mount namespace; or
248 .IP *
249 a new bind mount is created from the mount point.
250 .PP
251 In both of these cases, the new mount point joins the peer group
252 of which the existing mount point is a member.
253 .PP
254 A new peer group is also created when a child mount point is created under
255 an existing mount point that is marked as shared.
256 In this case, the new child mount point is also marked as shared and
257 the resulting peer group consists of all the mount points
that are replicated under the peers of the parent mount.
259 .PP
260 A mount ceases to be a member of a peer group when either
261 the mount is explicitly unmounted,
262 or when the mount is implicitly unmounted because a mount namespace is removed
263 (because it has no more member processes).
264 .PP
265 The propagation type of the mount points in a mount namespace
266 can be discovered via the "optional fields" exposed in
267 .IR /proc/[pid]/mountinfo .
268 (See
269 .BR proc (5)
270 for details of this file.)
271 The following tags can appear in the optional fields
272 for a record in that file:
273 .TP
274 .I shared:X
275 This mount point is shared in peer group
276 .IR X .
277 Each peer group has a unique ID that is automatically
278 generated by the kernel,
279 and all mount points in the same peer group will show the same ID.
280 (These IDs are assigned starting from the value 1,
281 and may be recycled when a peer group ceases to have any members.)
282 .TP
283 .I master:X
284 This mount is a slave to shared peer group
285 .IR X .
286 .TP
287 .IR propagate_from:X " (since Linux 2.6.26)"
288 .\" commit 97e7e0f71d6d948c25f11f0a33878d9356d9579e
289 This mount is a slave and receives propagation from shared peer group
290 .IR X .
291 This tag will always appear in conjunction with a
292 .IR master:X
293 tag.
294 Here,
295 .IR X
296 is the closest dominant peer group under the process's root directory.
297 If
298 .IR X
299 is the immediate master of the mount,
300 or if there is no dominant peer group under the same root,
301 then only the
302 .IR master:X
303 field is present and not the
304 .IR propagate_from:X
305 field.
306 For further details, see below.
307 .TP
308 .IR unbindable
309 This is an unbindable mount.
310 .PP
311 If none of the above tags is present, then this is a private mount.
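.PP
As an illustration of how these tags can be interpreted,
the following Python sketch classifies a single mountinfo record
by its optional fields.
The function name propagation_type is hypothetical
and not part of any library.
The sketch assumes the record layout described in
.BR proc (5):
six fixed fields, then the optional fields,
ending at a lone "\-" separator when one is present.

```python
def propagation_type(record):
    """Classify one mountinfo record by its optional-field tags."""
    fields = record.split()
    tags = {}
    # Fields 1-6 are fixed; the optional fields follow until a lone "-".
    for field in fields[6:]:
        if field == "-":
            break
        name, _, value = field.partition(":")
        tags[name] = value
    if "unbindable" in tags:
        return "unbindable"
    if "shared" in tags and "master" in tags:
        return "slave+shared"
    if "shared" in tags:
        return "shared"
    if "master" in tags:
        return "slave"
    # No propagation tag at all means a private mount.
    return "private"
```

.PP
The function accepts both complete records and records
that have been truncated at the separator,
as is done by the sed commands in the examples on this page.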
312 .SS MS_SHARED and MS_PRIVATE example
313 Suppose that on a terminal in the initial mount namespace,
314 we mark one mount point as shared and another as private,
315 and then view the mounts in
316 .IR /proc/self/mountinfo :
317 .PP
318 .in +4n
319 .EX
320 sh1# \fBmount \-\-make\-shared /mntS\fP
321 sh1# \fBmount \-\-make\-private /mntP\fP
322 sh1# \fBcat /proc/self/mountinfo | grep \(aq/mnt\(aq | sed \(aqs/ \- .*//\(aq\fP
323 77 61 8:17 / /mntS rw,relatime shared:1
324 83 61 8:15 / /mntP rw,relatime
325 .EE
326 .in
327 .PP
328 From the
329 .IR /proc/self/mountinfo
330 output, we see that
331 .IR /mntS
332 is a shared mount in peer group 1, and that
333 .IR /mntP
334 has no optional tags, indicating that it is a private mount.
335 The first two fields in each record in this file are the unique
336 ID for this mount, and the mount ID of the parent mount.
337 We can further inspect this file to see that the parent mount point of
338 .IR /mntS
339 and
340 .IR /mntP
341 is the root directory,
342 .IR / ,
343 which is mounted as private:
344 .PP
345 .in +4n
346 .EX
347 sh1# \fBcat /proc/self/mountinfo | awk \(aq$1 == 61\(aq | sed \(aqs/ \- .*//\(aq\fP
348 61 0 8:2 / / rw,relatime
349 .EE
350 .in
351 .PP
352 On a second terminal,
353 we create a new mount namespace where we run a second shell
354 and inspect the mounts:
355 .PP
356 .in +4n
357 .EX
358 $ \fBPS1=\(aqsh2# \(aq sudo unshare \-m \-\-propagation unchanged sh\fP
359 sh2# \fBcat /proc/self/mountinfo | grep \(aq/mnt\(aq | sed \(aqs/ \- .*//\(aq\fP
360 222 145 8:17 / /mntS rw,relatime shared:1
361 225 145 8:15 / /mntP rw,relatime
362 .EE
363 .in
364 .PP
365 The new mount namespace received a copy of the initial mount namespace's
366 mount points.
367 These new mount points maintain the same propagation types,
368 but have unique mount IDs.
369 (The
370 .IR \-\-propagation\ unchanged
371 option prevents
372 .BR unshare (1)
373 from marking all mounts as private when creating a new mount namespace,
374 .\" Since util-linux 2.27
375 which it does by default.)
376 .PP
377 In the second terminal, we then create submounts under each of
378 .IR /mntS
379 and
380 .IR /mntP
381 and inspect the set-up:
382 .PP
383 .in +4n
384 .EX
385 sh2# \fBmkdir /mntS/a\fP
386 sh2# \fBmount /dev/sdb6 /mntS/a\fP
387 sh2# \fBmkdir /mntP/b\fP
388 sh2# \fBmount /dev/sdb7 /mntP/b\fP
389 sh2# \fBcat /proc/self/mountinfo | grep \(aq/mnt\(aq | sed \(aqs/ \- .*//\(aq\fP
390 222 145 8:17 / /mntS rw,relatime shared:1
391 225 145 8:15 / /mntP rw,relatime
392 178 222 8:22 / /mntS/a rw,relatime shared:2
393 230 225 8:23 / /mntP/b rw,relatime
394 .EE
395 .in
396 .PP
397 From the above, it can be seen that
398 .IR /mntS/a
399 was created as shared (inheriting this setting from its parent mount) and
400 .IR /mntP/b
401 was created as a private mount.
402 .PP
403 Returning to the first terminal and inspecting the set-up,
404 we see that the new mount created under the shared mount point
405 .IR /mntS
406 propagated to its peer mount (in the initial mount namespace),
407 but the new mount created under the private mount point
408 .IR /mntP
409 did not propagate:
410 .PP
411 .in +4n
412 .EX
413 sh1# \fBcat /proc/self/mountinfo | grep \(aq/mnt\(aq | sed \(aqs/ \- .*//\(aq\fP
414 77 61 8:17 / /mntS rw,relatime shared:1
415 83 61 8:15 / /mntP rw,relatime
416 179 77 8:22 / /mntS/a rw,relatime shared:2
417 .EE
418 .in
419 .\"
420 .SS MS_SLAVE example
421 Making a mount point a slave allows it to receive propagated
422 mount and unmount events from a master shared peer group,
423 while preventing it from propagating events to that master.
424 This is useful if we want to (say) receive a mount event when
425 an optical disk is mounted in the master shared peer group
426 (in another mount namespace),
427 but want to prevent mount and unmount events under the slave mount
428 from having side effects in other namespaces.
429 .PP
430 We can demonstrate the effect of slaving by first marking
431 two mount points as shared in the initial mount namespace:
432 .PP
433 .in +4n
434 .EX
435 sh1# \fBmount \-\-make\-shared /mntX\fP
436 sh1# \fBmount \-\-make\-shared /mntY\fP
437 sh1# \fBcat /proc/self/mountinfo | grep \(aq/mnt\(aq | sed \(aqs/ \- .*//\(aq\fP
438 132 83 8:23 / /mntX rw,relatime shared:1
439 133 83 8:22 / /mntY rw,relatime shared:2
440 .EE
441 .in
442 .PP
443 On a second terminal,
444 we create a new mount namespace and inspect the mount points:
445 .PP
446 .in +4n
447 .EX
$ \fBPS1=\(aqsh2# \(aq sudo unshare \-m \-\-propagation unchanged sh\fP
449 sh2# \fBcat /proc/self/mountinfo | grep \(aq/mnt\(aq | sed \(aqs/ \- .*//\(aq\fP
450 168 167 8:23 / /mntX rw,relatime shared:1
451 169 167 8:22 / /mntY rw,relatime shared:2
452 .EE
453 .in
454 .PP
455 In the new mount namespace, we then mark one of the mount points as a slave:
456 .PP
457 .in +4n
458 .EX
459 sh2# \fBmount \-\-make\-slave /mntY\fP
460 sh2# \fBcat /proc/self/mountinfo | grep \(aq/mnt\(aq | sed \(aqs/ \- .*//\(aq\fP
461 168 167 8:23 / /mntX rw,relatime shared:1
462 169 167 8:22 / /mntY rw,relatime master:2
463 .EE
464 .in
465 .PP
466 From the above output, we see that
467 .IR /mntY
468 is now a slave mount that is receiving propagation events from
469 the shared peer group with the ID 2.
470 .PP
471 Continuing in the new namespace, we create submounts under each of
472 .IR /mntX
473 and
474 .IR /mntY :
475 .PP
476 .in +4n
477 .EX
478 sh2# \fBmkdir /mntX/a\fP
479 sh2# \fBmount /dev/sda3 /mntX/a\fP
480 sh2# \fBmkdir /mntY/b\fP
481 sh2# \fBmount /dev/sda5 /mntY/b\fP
482 .EE
483 .in
484 .PP
485 When we inspect the state of the mount points in the new mount namespace,
486 we see that
487 .IR /mntX/a
488 was created as a new shared mount
489 (inheriting the "shared" setting from its parent mount) and
490 .IR /mntY/b
491 was created as a private mount:
492 .PP
493 .in +4n
494 .EX
495 sh2# \fBcat /proc/self/mountinfo | grep \(aq/mnt\(aq | sed \(aqs/ \- .*//\(aq\fP
496 168 167 8:23 / /mntX rw,relatime shared:1
497 169 167 8:22 / /mntY rw,relatime master:2
498 173 168 8:3 / /mntX/a rw,relatime shared:3
499 175 169 8:5 / /mntY/b rw,relatime
500 .EE
501 .in
502 .PP
503 Returning to the first terminal (in the initial mount namespace),
504 we see that the mount
505 .IR /mntX/a
506 propagated to the peer (the shared
507 .IR /mntX ),
508 but the mount
509 .IR /mntY/b
510 was not propagated:
511 .PP
512 .in +4n
513 .EX
514 sh1# \fBcat /proc/self/mountinfo | grep \(aq/mnt\(aq | sed \(aqs/ \- .*//\(aq\fP
515 132 83 8:23 / /mntX rw,relatime shared:1
516 133 83 8:22 / /mntY rw,relatime shared:2
517 174 132 8:3 / /mntX/a rw,relatime shared:3
518 .EE
519 .in
520 .PP
521 Now we create a new mount point under
522 .IR /mntY
523 in the first shell:
524 .PP
525 .in +4n
526 .EX
527 sh1# \fBmkdir /mntY/c\fP
528 sh1# \fBmount /dev/sda1 /mntY/c\fP
sh1# \fBcat /proc/self/mountinfo | grep \(aq/mnt\(aq | sed \(aqs/ \- .*//\(aq\fP
530 132 83 8:23 / /mntX rw,relatime shared:1
531 133 83 8:22 / /mntY rw,relatime shared:2
532 174 132 8:3 / /mntX/a rw,relatime shared:3
533 178 133 8:1 / /mntY/c rw,relatime shared:4
534 .EE
535 .in
536 .PP
537 When we examine the mount points in the second mount namespace,
538 we see that in this case the new mount has been propagated
539 to the slave mount point,
540 and that the new mount is itself a slave mount (to peer group 4):
541 .PP
542 .in +4n
543 .EX
544 sh2# \fBcat /proc/self/mountinfo | grep \(aq/mnt\(aq | sed \(aqs/ \- .*//\(aq\fP
545 168 167 8:23 / /mntX rw,relatime shared:1
546 169 167 8:22 / /mntY rw,relatime master:2
547 173 168 8:3 / /mntX/a rw,relatime shared:3
548 175 169 8:5 / /mntY/b rw,relatime
549 179 169 8:1 / /mntY/c rw,relatime master:4
550 .EE
551 .in
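.PP
The direction of propagation seen in the shared and slave examples
can be sketched with a small model.
This is a simplification and not the kernel's algorithm:
each mount is reduced to a pair (peer group ID, master group ID),
and the hypothetical function propagation_targets returns the IDs
of the mounts that would receive an event
that occurs immediately under a given mount.
The model ignores mount points, namespaces,
and the propagation type given to the propagated copy.

```python
def propagation_targets(mounts, origin):
    """Return IDs of mounts that receive events from under `origin`.

    `mounts` maps mount ID -> (peer_group, master_group); either
    element may be None.
    """
    group, _ = mounts[origin]
    if group is None:
        # Private mounts and pure slaves do not send events anywhere.
        return set()
    targets = set()
    pending = [group]
    visited = set()
    while pending:
        current = pending.pop()
        if current in visited:
            continue
        visited.add(current)
        for mount_id, (peer_group, master_group) in mounts.items():
            if mount_id == origin or mount_id in targets:
                continue
            # Peers of the group and slaves of the group both receive.
            if peer_group == current or master_group == current:
                targets.add(mount_id)
                # A slave that is also shared passes the event on.
                if peer_group is not None:
                    pending.append(peer_group)
    return targets
```

.PP
With the mounts from the MS_SLAVE example
(132 and 168 in peer group 1, 133 in peer group 2,
and 169 a slave of peer group 2),
an event under 168 reaches 132,
an event under 133 reaches 169,
and an event under 169 reaches no other mount.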
552 .\"
553 .SS MS_UNBINDABLE example
554 One of the primary purposes of unbindable mounts is to avoid
555 the "mount point explosion" problem when repeatedly performing bind mounts
556 of a higher-level subtree at a lower-level mount point.
557 The problem is illustrated by the following shell session.
558 .PP
559 Suppose we have a system with the following mount points:
560 .PP
561 .in +4n
562 .EX
563 # \fBmount | awk \(aq{print $1, $2, $3}\(aq\fP
564 /dev/sda1 on /
565 /dev/sdb6 on /mntX
566 /dev/sdb7 on /mntY
567 .EE
568 .in
569 .PP
570 Suppose furthermore that we wish to recursively bind mount
571 the root directory under several users' home directories.
572 We do this for the first user, and inspect the mount points:
573 .PP
574 .in +4n
575 .EX
576 # \fBmount \-\-rbind / /home/cecilia/\fP
577 # \fBmount | awk \(aq{print $1, $2, $3}\(aq\fP
578 /dev/sda1 on /
579 /dev/sdb6 on /mntX
580 /dev/sdb7 on /mntY
581 /dev/sda1 on /home/cecilia
582 /dev/sdb6 on /home/cecilia/mntX
583 /dev/sdb7 on /home/cecilia/mntY
584 .EE
585 .in
586 .PP
587 When we repeat this operation for the second user,
588 we start to see the explosion problem:
589 .PP
590 .in +4n
591 .EX
592 # \fBmount \-\-rbind / /home/henry\fP
593 # \fBmount | awk \(aq{print $1, $2, $3}\(aq\fP
594 /dev/sda1 on /
595 /dev/sdb6 on /mntX
596 /dev/sdb7 on /mntY
597 /dev/sda1 on /home/cecilia
598 /dev/sdb6 on /home/cecilia/mntX
599 /dev/sdb7 on /home/cecilia/mntY
600 /dev/sda1 on /home/henry
601 /dev/sdb6 on /home/henry/mntX
602 /dev/sdb7 on /home/henry/mntY
603 /dev/sda1 on /home/henry/home/cecilia
604 /dev/sdb6 on /home/henry/home/cecilia/mntX
605 /dev/sdb7 on /home/henry/home/cecilia/mntY
606 .EE
607 .in
608 .PP
609 Under
610 .IR /home/henry ,
611 we have not only recursively added the
612 .IR /mntX
613 and
614 .IR /mntY
615 mounts, but also the recursive mounts of those directories under
616 .IR /home/cecilia
617 that were created in the previous step.
618 Upon repeating the step for a third user,
619 it becomes obvious that the explosion is exponential in nature:
620 .PP
621 .in +4n
622 .EX
623 # \fBmount \-\-rbind / /home/otto\fP
624 # \fBmount | awk \(aq{print $1, $2, $3}\(aq\fP
625 /dev/sda1 on /
626 /dev/sdb6 on /mntX
627 /dev/sdb7 on /mntY
628 /dev/sda1 on /home/cecilia
629 /dev/sdb6 on /home/cecilia/mntX
630 /dev/sdb7 on /home/cecilia/mntY
631 /dev/sda1 on /home/henry
632 /dev/sdb6 on /home/henry/mntX
633 /dev/sdb7 on /home/henry/mntY
634 /dev/sda1 on /home/henry/home/cecilia
635 /dev/sdb6 on /home/henry/home/cecilia/mntX
636 /dev/sdb7 on /home/henry/home/cecilia/mntY
637 /dev/sda1 on /home/otto
638 /dev/sdb6 on /home/otto/mntX
639 /dev/sdb7 on /home/otto/mntY
640 /dev/sda1 on /home/otto/home/cecilia
641 /dev/sdb6 on /home/otto/home/cecilia/mntX
642 /dev/sdb7 on /home/otto/home/cecilia/mntY
643 /dev/sda1 on /home/otto/home/henry
644 /dev/sdb6 on /home/otto/home/henry/mntX
645 /dev/sdb7 on /home/otto/home/henry/mntY
646 /dev/sda1 on /home/otto/home/henry/home/cecilia
647 /dev/sdb6 on /home/otto/home/henry/home/cecilia/mntX
648 /dev/sdb7 on /home/otto/home/henry/home/cecilia/mntY
649 .EE
650 .in
651 .PP
652 The mount explosion problem in the above scenario can be avoided
653 by making each of the new mounts unbindable.
654 The effect of doing this is that recursive mounts of the root
655 directory will not replicate the unbindable mounts.
656 We make such a mount for the first user:
657 .PP
658 .in +4n
659 .EX
660 # \fBmount \-\-rbind \-\-make\-unbindable / /home/cecilia\fP
661 .EE
662 .in
663 .PP
664 Before going further, we show that unbindable mounts are indeed unbindable:
665 .PP
666 .in +4n
667 .EX
668 # \fBmkdir /mntZ\fP
669 # \fBmount \-\-bind /home/cecilia /mntZ\fP
670 mount: wrong fs type, bad option, bad superblock on /home/cecilia,
671 missing codepage or helper program, or other error
672
673 In some cases useful info is found in syslog \- try
674 dmesg | tail or so.
675 .EE
676 .in
677 .PP
678 Now we create unbindable recursive bind mounts for the other two users:
679 .PP
680 .in +4n
681 .EX
682 # \fBmount \-\-rbind \-\-make\-unbindable / /home/henry\fP
683 # \fBmount \-\-rbind \-\-make\-unbindable / /home/otto\fP
684 .EE
685 .in
686 .PP
687 Upon examining the list of mount points,
688 we see there has been no explosion of mount points,
689 because the unbindable mounts were not replicated
690 under each user's directory:
691 .PP
692 .in +4n
693 .EX
694 # \fBmount | awk \(aq{print $1, $2, $3}\(aq\fP
695 /dev/sda1 on /
696 /dev/sdb6 on /mntX
697 /dev/sdb7 on /mntY
698 /dev/sda1 on /home/cecilia
699 /dev/sdb6 on /home/cecilia/mntX
700 /dev/sdb7 on /home/cecilia/mntY
701 /dev/sda1 on /home/henry
702 /dev/sdb6 on /home/henry/mntX
703 /dev/sdb7 on /home/henry/mntY
704 /dev/sda1 on /home/otto
705 /dev/sdb6 on /home/otto/mntX
706 /dev/sdb7 on /home/otto/mntY
707 .EE
708 .in
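.PP
The growth in the number of mounts can be reproduced
with a rough simulation.
The sketch below assumes a simplified model in which a mount is a
record with a device, a path, and an unbindable flag;
the helper names is_under, rbind, and simulate are hypothetical.
A recursive bind mount copies every mount under the source,
except for unbindable mounts and anything mounted beneath them.

```python
def is_under(path, root):
    """True if path equals root or lies beneath it."""
    if root == "/":
        return True
    return path == root or path.startswith(root + "/")


def rbind(mounts, source, target, unbindable=False):
    """Return the mount list after a simulated recursive bind mount."""
    # Unbindable mounts, and everything mounted beneath them, are pruned.
    pruned_roots = [m["path"] for m in mounts if m["unbindable"]]
    copies = []
    for m in mounts:
        if not is_under(m["path"], source):
            continue
        if any(is_under(m["path"], root) for root in pruned_roots):
            continue
        relative = m["path"][len(source):].strip("/")
        new_path = target.rstrip("/") + ("/" + relative if relative else "")
        copies.append({"dev": m["dev"], "path": new_path,
                       "unbindable": unbindable})
    return mounts + copies


def simulate(users, unbindable):
    """Mount counts after binding / under each user's home directory."""
    mounts = [
        {"dev": "/dev/sda1", "path": "/", "unbindable": False},
        {"dev": "/dev/sdb6", "path": "/mntX", "unbindable": False},
        {"dev": "/dev/sdb7", "path": "/mntY", "unbindable": False},
    ]
    counts = []
    for user in users:
        mounts = rbind(mounts, "/", "/home/" + user, unbindable)
        counts.append(len(mounts))
    return counts, mounts
```

.PP
For the three users above, the model yields 6, 12, and 24 mounts
when the new mounts are left bindable,
and 6, 9, and 12 mounts when they are made unbindable,
which matches the listings shown in this section.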
709 .\"
710 .SS Propagation type transitions
711 The following table shows the effect that applying a new propagation type
712 (i.e.,
713 .IR "mount \-\-make\-xxxx")
714 has on the existing propagation type of a mount point.
715 The rows correspond to existing propagation types,
716 and the columns are the new propagation settings.
717 For reasons of space, "private" is abbreviated as "priv" and
718 "unbindable" as "unbind".
.TS
tab(@);
lb2 lb2 lb2 lb2 lb1
lb l l l l l.
@make-shared@make-slave@make-priv@make-unbind
shared@shared@slave/priv [1]@priv@unbind
slave@slave+shared@slave [2]@priv@unbind
slave+shared@slave+shared@slave@priv@unbind
private@shared@priv [2]@priv@unbind
unbindable@shared@unbind [2]@priv@unbind
.TE
729 .sp 1
Note the following details about the table:
731 .IP [1] 4
732 If a shared mount is the only mount in its peer group,
733 making it a slave automatically makes it private.
734 .IP [2]
735 Slaving a nonshared mount has no effect on the mount.
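.PP
The transition table can also be written as a lookup.
The following Python sketch is only a restatement of the table
and its two notes; the names TRANSITIONS and apply_propagation
are hypothetical, and the sole_peer parameter stands for the
condition in note [1]
(the mount is the only member of its peer group).

```python
TRANSITIONS = {
    # existing type: {requested change: resulting type}
    "shared": {"make-shared": "shared", "make-slave": "slave",
               "make-private": "private", "make-unbindable": "unbindable"},
    "slave": {"make-shared": "slave+shared", "make-slave": "slave",
              "make-private": "private", "make-unbindable": "unbindable"},
    "slave+shared": {"make-shared": "slave+shared", "make-slave": "slave",
                     "make-private": "private",
                     "make-unbindable": "unbindable"},
    "private": {"make-shared": "shared", "make-slave": "private",
                "make-private": "private", "make-unbindable": "unbindable"},
    "unbindable": {"make-shared": "shared", "make-slave": "unbindable",
                   "make-private": "private",
                   "make-unbindable": "unbindable"},
}


def apply_propagation(current, request, sole_peer=False):
    """Return the propagation type after applying `request`."""
    # Note [1]: a shared mount with no peers becomes private when slaved.
    if current == "shared" and request == "make-slave" and sole_peer:
        return "private"
    # Note [2] is encoded in the table: slaving a nonshared mount
    # leaves its type unchanged.
    return TRANSITIONS[current][request]
```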
736 .\"
737 .SS Bind (MS_BIND) semantics
738 Suppose that the following command is performed:
739 .PP
740 .in +4n
741 .EX
742 mount \-\-bind A/a B/b
743 .EE
744 .in
745 .PP
746 Here,
747 .I A
748 is the source mount point,
749 .I B
750 is the destination mount point,
751 .I a
752 is a subdirectory path under the mount point
753 .IR A ,
754 and
755 .I b
756 is a subdirectory path under the mount point
757 .IR B .
758 The propagation type of the resulting mount,
759 .IR B/b ,
760 depends on the propagation types of the mount points
761 .IR A
762 and
763 .IR B ,
764 and is summarized in the following table.
765 .PP
.TS
tab(@);
lb2 lb1 lb2 lb2 lb2 lb0
lb2 lb1 lb2 lb2 lb2 lb0
lb lb l l l l l.
@@@@source(A)
@@@shared@private@slave@unbind
_
dest(B)@shared@|@shared@shared@slave+shared@invalid
@nonshared@|@shared@private@slave@invalid
.TE
776 .sp 1
777 Note that a recursive bind of a subtree follows the same semantics
778 as for a bind operation on each mount in the subtree.
779 (Unbindable mounts are automatically pruned at the target mount point.)
780 .PP
781 For further details, see
782 .I Documentation/filesystems/sharedsubtree.txt
783 in the kernel source tree.
784 .\"
785 .SS Move (MS_MOVE) semantics
786 Suppose that the following command is performed:
787 .PP
788 .in +4n
789 .EX
790 mount \-\-move A B/b
791 .EE
792 .in
793 .PP
794 Here,
795 .I A
796 is the source mount point,
797 .I B
798 is the destination mount point, and
799 .I b
800 is a subdirectory path under the mount point
801 .IR B .
802 The propagation type of the resulting mount,
803 .IR B/b ,
804 depends on the propagation types of the mount points
805 .IR A
806 and
807 .IR B ,
808 and is summarized in the following table.
809 .PP
.TS
tab(@);
lb2 lb1 lb2 lb2 lb2 lb0
lb2 lb1 lb2 lb2 lb2 lb0
lb lb l l l l l.
@@@@source(A)
@@@shared@private@slave@unbind
_
dest(B)@shared@|@shared@shared@slave+shared@invalid
@nonshared@|@shared@private@slave@unbindable
.TE
820 .sp 1
821 Note: moving a mount that resides under a shared mount is invalid.
822 .PP
823 For further details, see
824 .I Documentation/filesystems/sharedsubtree.txt
825 in the kernel source tree.
826 .\"
827 .SS Mount semantics
828 Suppose that we use the following command to create a mount point:
829 .PP
830 .in +4n
831 .EX
832 mount device B/b
833 .EE
834 .in
835 .PP
836 Here,
837 .I B
838 is the destination mount point, and
839 .I b
840 is a subdirectory path under the mount point
841 .IR B .
842 The propagation type of the resulting mount,
843 .IR B/b ,
844 follows the same rules as for a bind mount,
845 where the propagation type of the source mount
846 is considered always to be private.
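.PP
The bind and move tables, together with the rule for a new mount,
can be restated as a short Python sketch.
The names bind_result, move_result, and new_mount_result
are hypothetical;
the value None stands for the "invalid" entries in the tables,
and dest_shared says whether the destination mount B is shared.

```python
BIND_TABLE = {
    # (source type, destination is shared): resulting type of B/b
    ("shared", True): "shared",
    ("private", True): "shared",
    ("slave", True): "slave+shared",
    ("unbindable", True): None,
    ("shared", False): "shared",
    ("private", False): "private",
    ("slave", False): "slave",
    ("unbindable", False): None,
}


def bind_result(source, dest_shared):
    """Propagation type of B/b after a bind mount of A/a on B/b."""
    return BIND_TABLE[(source, dest_shared)]


def move_result(source, dest_shared):
    """Propagation type of B/b after moving A to B/b."""
    # The move table differs from the bind table in one cell: an
    # unbindable mount can be moved to a nonshared destination.
    if source == "unbindable" and not dest_shared:
        return "unbindable"
    return BIND_TABLE[(source, dest_shared)]


def new_mount_result(dest_shared):
    """Propagation type of a new mount: bind rules, private source."""
    return bind_result("private", dest_shared)
```

.PP
Under this restatement, a new mount is shared
if the destination mount is shared, and private otherwise.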
847 .\"
848 .SS Unmount semantics
849 Suppose that we use the following command to tear down a mount point:
850 .PP
851 .in +4n
852 .EX
umount A
854 .EE
855 .in
856 .PP
857 Here,
858 .I A
859 is a mount point on
860 .IR B/b ,
861 where
862 .I B
863 is the parent mount and
864 .I b
865 is a subdirectory path under the mount point
866 .IR B .
867 If
.I B
is shared, then the most-recently-mounted mount at
.I b
on each mount that receives propagation from mount
.I B
is also unmounted, provided that it has no submounts under it.
874 .\"
875 .SS The /proc/[pid]/mountinfo "propagate_from" tag
876 The
877 .I propagate_from:X
878 tag is shown in the optional fields of a
879 .IR /proc/[pid]/mountinfo
880 record in cases where a process can't see a slave's immediate master
881 (i.e., the pathname of the master is not reachable from
882 the filesystem root directory)
883 and so cannot determine the
884 chain of propagation between the mounts it can see.
885 .PP
886 In the following example, we first create a two-link master-slave chain
887 between the mounts
888 .IR /mnt ,
889 .IR /tmp/etc ,
890 and
891 .IR /mnt/tmp/etc .
892 Then the
893 .BR chroot (1)
894 command is used to make the
895 .IR /tmp/etc
896 mount point unreachable from the root directory,
897 creating a situation where the master of
898 .IR /mnt/tmp/etc
899 is not reachable from the (new) root directory of the process.
900 .PP
901 First, we bind mount the root directory onto
902 .IR /mnt
903 and then bind mount
904 .IR /proc
905 at
906 .IR /mnt/proc
907 so that after the later
908 .BR chroot (1)
909 the
910 .BR proc (5)
911 filesystem remains visible at the correct location
912 in the chroot-ed environment.
913 .PP
914 .in +4n
915 .EX
916 # \fBmkdir \-p /mnt/proc\fP
917 # \fBmount \-\-bind / /mnt\fP
918 # \fBmount \-\-bind /proc /mnt/proc\fP
919 .EE
920 .in
921 .PP
922 Next, we ensure that the
923 .IR /mnt
924 mount is a shared mount in a new peer group (with no peers):
925 .PP
926 .in +4n
927 .EX
928 # \fBmount \-\-make\-private /mnt\fP # Isolate from any previous peer group
929 # \fBmount \-\-make\-shared /mnt\fP
930 # \fBcat /proc/self/mountinfo | grep \(aq/mnt\(aq | sed \(aqs/ \- .*//\(aq\fP
931 239 61 8:2 / /mnt ... shared:102
932 248 239 0:4 / /mnt/proc ... shared:5
933 .EE
934 .in
935 .PP
936 Next, we bind mount
937 .IR /mnt/etc
938 onto
939 .IR /tmp/etc :
940 .PP
941 .in +4n
942 .EX
943 # \fBmkdir \-p /tmp/etc\fP
944 # \fBmount \-\-bind /mnt/etc /tmp/etc\fP
945 # \fBcat /proc/self/mountinfo | egrep \(aq/mnt|/tmp/\(aq | sed \(aqs/ \- .*//\(aq\fP
946 239 61 8:2 / /mnt ... shared:102
947 248 239 0:4 / /mnt/proc ... shared:5
948 267 40 8:2 /etc /tmp/etc ... shared:102
949 .EE
950 .in
951 .PP
952 Initially, these two mount points are in the same peer group,
but we then make
954 .IR /tmp/etc
955 a slave of
956 .IR /mnt/etc ,
957 and then make
958 .IR /tmp/etc
959 shared as well,
960 so that it can propagate events to the next slave in the chain:
961 .PP
962 .in +4n
963 .EX
964 # \fBmount \-\-make\-slave /tmp/etc\fP
965 # \fBmount \-\-make\-shared /tmp/etc\fP
966 # \fBcat /proc/self/mountinfo | egrep \(aq/mnt|/tmp/\(aq | sed \(aqs/ \- .*//\(aq\fP
967 239 61 8:2 / /mnt ... shared:102
968 248 239 0:4 / /mnt/proc ... shared:5
969 267 40 8:2 /etc /tmp/etc ... shared:105 master:102
970 .EE
971 .in
972 .PP
973 Then we bind mount
974 .IR /tmp/etc
975 onto
976 .IR /mnt/tmp/etc .
977 Again, the two mount points are initially in the same peer group,
978 but we then make
979 .IR /mnt/tmp/etc
980 a slave of
981 .IR /tmp/etc :
982 .PP
983 .in +4n
984 .EX
985 # \fBmkdir \-p /mnt/tmp/etc\fP
986 # \fBmount \-\-bind /tmp/etc /mnt/tmp/etc\fP
987 # \fBmount \-\-make\-slave /mnt/tmp/etc\fP
988 # \fBcat /proc/self/mountinfo | egrep \(aq/mnt|/tmp/\(aq | sed \(aqs/ \- .*//\(aq\fP
989 239 61 8:2 / /mnt ... shared:102
990 248 239 0:4 / /mnt/proc ... shared:5
991 267 40 8:2 /etc /tmp/etc ... shared:105 master:102
992 273 239 8:2 /etc /mnt/tmp/etc ... master:105
993 .EE
994 .in
995 .PP
996 From the above, we see that
997 .IR /mnt
998 is the master of the slave
999 .IR /tmp/etc ,
1000 which in turn is the master of the slave
1001 .IR /mnt/tmp/etc .
1002 .PP
1003 We then
1004 .BR chroot (1)
1005 to the
1006 .IR /mnt
1007 directory, which renders the mount with ID 267 unreachable
1008 from the (new) root directory:
1009 .PP
1010 .in +4n
1011 .EX
1012 # \fBchroot /mnt\fP
1013 .EE
1014 .in
1015 .PP
1016 When we examine the state of the mounts inside the chroot-ed environment,
1017 we see the following:
1018 .PP
1019 .in +4n
1020 .EX
1021 # \fBcat /proc/self/mountinfo | sed \(aqs/ \- .*//\(aq\fP
1022 239 61 8:2 / / ... shared:102
1023 248 239 0:4 / /proc ... shared:5
1024 273 239 8:2 /etc /tmp/etc ... master:105 propagate_from:102
1025 .EE
1026 .in
1027 .PP
1028 Above, we see that the mount with ID 273
1029 is a slave whose master is the peer group 105.
1030 The mount point for that master is unreachable, and so a
1031 .IR propagate_from
1032 tag is displayed, indicating that the closest dominant peer group
1033 (i.e., the nearest reachable mount in the slave chain)
1034 is the peer group with the ID 102 (corresponding to the
1035 .IR /mnt
1036 mount point before the
1037 .BR chroot (1)
1038 was performed).
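.PP
The propagation tags discussed above can also be read programmatically.
The following is a minimal sketch, not part of any library: the structure
.I propagation
and the function
.I parse_propagation
are hypothetical names introduced here for illustration.
It assumes the
.I mountinfo
layout described in
.BR proc (5),
in which six fixed fields are followed by zero or more optional fields
and then a single hyphen as separator.

```c
#include <stdio.h>
#include <string.h>

/* Propagation tags of one mount; an ID of 0 means "tag not present". */
struct propagation {
    int shared;          /* peer group ID from "shared:N" */
    int master;          /* master peer group ID from "master:N" */
    int propagate_from;  /* peer group ID from "propagate_from:N" */
    int unbindable;      /* 1 if the "unbindable" tag is present */
};

/*
 * Parse the optional fields of a single mountinfo line.
 * Returns 0 on success, or -1 if the "-" separator that ends the
 * optional fields was not found.
 */
int
parse_propagation(const char *line, struct propagation *p)
{
    int field = 0;

    memset(p, 0, sizeof(*p));

    while (*line != '\0') {
        size_t len;

        line += strspn(line, " \n");     /* skip field separators */
        len = strcspn(line, " \n");      /* length of this field */
        if (len == 0)
            break;
        field++;

        /* Fields 1 to 6 are fixed; optional fields start at field 7. */
        if (field > 6) {
            if (len == 1 && line[0] == '-')
                return 0;                /* end of optional fields */

            if (sscanf(line, "shared:%d", &p->shared) == 1) {
                /* value stored by sscanf() */
            } else if (sscanf(line, "master:%d", &p->master) == 1) {
                /* value stored by sscanf() */
            } else if (sscanf(line, "propagate_from:%d",
                              &p->propagate_from) == 1) {
                /* value stored by sscanf() */
            } else if (len == 10 &&
                       strncmp(line, "unbindable", 10) == 0) {
                p->unbindable = 1;
            }
        }
        line += len;
    }

    return -1;
}
```

Applied to a complete line for the mount with ID 273 above,
this sketch would report a master of 105 and a
.I propagate_from
of 102, with no
.I shared
tag.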
1039 .\"
1040 .SH VERSIONS
1041 Mount namespaces first appeared in Linux 2.4.19.
1042 .SH CONFORMING TO
1043 Namespaces are a Linux-specific feature.
1044 .\"
1045 .SH NOTES
1046 The propagation type assigned to a new mount point depends
1047 on the propagation type of the parent mount.
1048 If the mount point has a parent (i.e., it is a non-root mount
1049 point) and the propagation type of the parent is
1050 .BR MS_SHARED ,
1051 then the propagation type of the new mount is also
1052 .BR MS_SHARED .
1053 Otherwise, the propagation type of the new mount is
1054 .BR MS_PRIVATE .
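.PP
The rule above can be restated as a small decision function.
This is only an illustrative sketch: the function name
.I default_propagation
is hypothetical, and the kernel does not expose such a call.

```c
#include <stdbool.h>
#include <sys/mount.h>

/*
 * Return the propagation type that a new mount receives, given
 * whether it has a parent mount and, if so, the parent's
 * propagation type (one of the MS_* propagation flags).
 */
unsigned long
default_propagation(bool has_parent, unsigned long parent_type)
{
    if (has_parent && parent_type == MS_SHARED)
        return MS_SHARED;

    /* Root mounts, and mounts under a non-shared parent. */
    return MS_PRIVATE;
}
```

Note that a parent whose propagation type is
.BR MS_SLAVE
falls into the second branch, so the new mount is private.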
1055 .PP
1056 Notwithstanding the fact that the default propagation type
1057 for new mount points is in many cases
1058 .BR MS_PRIVATE ,
1059 .BR MS_SHARED
1060 is typically more useful.
1061 For this reason,
1062 .BR systemd (1)
1063 automatically remounts all mount points as
1064 .BR MS_SHARED
1065 on system startup.
1066 Thus, on most modern systems, the default propagation type is in practice
1067 .BR MS_SHARED .
1068 .PP
1069 Since, when one uses
1070 .BR unshare (1)
1071 to create a mount namespace,
1072 the goal is commonly to provide full isolation of the mount points
1073 in the new namespace,
1074 .BR unshare (1)
1075 (since
1076 .IR util-linux
1077 version 2.27) in turn reverses the step performed by
1078 .BR systemd (1),
1079 by making all mount points private in the new namespace.
1080 That is,
1081 .BR unshare (1)
1082 performs the equivalent of the following in the new mount namespace:
1083 .PP
1084 .in +4n
1085 .EX
1086 mount \-\-make\-rprivate /
1087 .EE
1088 .in
1089 .PP
1090 To prevent this, one can use the
1091 .IR "\-\-propagation\ unchanged"
1092 option to
1093 .BR unshare (1).
1094 .PP
1095 An application that creates a new mount namespace directly using
1096 .BR clone (2)
1097 or
1098 .BR unshare (2)
1099 may want to prevent propagation of mount events to other mount namespaces
1100 (as is done by
1101 .BR unshare (1)).
1102 This can be done by changing the propagation type of
1103 mount points in the new namespace to either
1104 .BR MS_SLAVE
1105 or
1106 .BR MS_PRIVATE ,
1107 using a call such as the following:
1108 .PP
1109 .in +4n
1110 .EX
1111 mount(NULL, "/", NULL, MS_SLAVE | MS_REC, NULL);
1112 .EE
1113 .in
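.PP
Put together with the creation of the namespace, the steps might look
like the following sketch.
The helper name
.I isolate_mount_namespace
is hypothetical, and rejecting any type other than
.BR MS_SLAVE
or
.BR MS_PRIVATE
is a choice made for this example, not a kernel requirement.
The caller is assumed to have the
.BR CAP_SYS_ADMIN
capability.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for unshare() and CLONE_NEWNS */
#endif
#include <errno.h>
#include <sched.h>
#include <sys/mount.h>

/*
 * Move the caller into a new mount namespace, then recursively set
 * the propagation type of all mounts in it to 'type', which must be
 * MS_SLAVE or MS_PRIVATE.
 * Returns 0 on success, or -1 with errno set.
 */
int
isolate_mount_namespace(unsigned long type)
{
    if (type != MS_SLAVE && type != MS_PRIVATE) {
        errno = EINVAL;
        return -1;
    }

    if (unshare(CLONE_NEWNS) == -1)
        return -1;

    /*
     * When changing the propagation type, mount(2) ignores the
     * source, filesystemtype, and data arguments.
     */
    return mount(NULL, "/", NULL, type | MS_REC, NULL);
}
```

Passing
.BR MS_PRIVATE
corresponds to what
.BR unshare (1)
does by default; with
.BR MS_SLAVE ,
mount events from the original namespace still propagate into the
new one, but not in the other direction.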
1114 .PP
1115 For a discussion of propagation types when moving mounts
1116 .RB ( MS_MOVE )
1117 and creating bind mounts
1118 .RB ( MS_BIND ),
1119 see
1120 .IR Documentation/filesystems/sharedsubtree.txt .
1121 .SH EXAMPLE
1122 See
1123 .BR pivot_root (2).
1124 .SH SEE ALSO
1125 .BR unshare (1),
1126 .BR clone (2),
1127 .BR mount (2),
1128 .BR pivot_root (2),
1129 .BR setns (2),
1130 .BR umount (2),
1131 .BR unshare (2),
1132 .BR proc (5),
1133 .BR namespaces (7),
1134 .BR user_namespaces (7),
1135 .BR findmnt (8),
1136 .BR pivot_root (8)
1137 .PP
1138 .IR Documentation/filesystems/sharedsubtree.txt
1139 in the kernel source tree.