Commit | Line | Data |
---|---|---|
ee54e5d5 | 1 | .\" Copyright (c) 2016, 2019, 2021 by Michael Kerrisk <mtk.manpages@gmail.com> |
98c28960 | 2 | .\" |
5fbde956 | 3 | .\" SPDX-License-Identifier: Linux-man-pages-copyleft |
98c28960 MK |
4 | .\" |
5 | .\" | |
6e00b7a8 | 6 | .TH MOUNT_NAMESPACES 7 2021-08-27 "Linux" "Linux Programmer's Manual" |
98c28960 MK |
7 | .SH NAME |
8 | mount_namespaces \- overview of Linux mount namespaces | |
9 | .SH DESCRIPTION | |
10 | For an overview of namespaces, see | |
11 | .BR namespaces (7). | |
a721e8b2 | 12 | .PP |
8c9a8274 | 13 | Mount namespaces provide isolation of the list of mounts seen |
98c28960 MK |
14 | by the processes in each namespace instance. |
15 | Thus, the processes in each of the mount namespace instances | |
16 | will see distinct single-directory hierarchies. | |
a721e8b2 | 17 | .PP |
98c28960 | 18 | The views provided by the |
1ae6b2c7 AC |
19 | .IR /proc/ pid /mounts , |
20 | .IR /proc/ pid /mountinfo , | |
98c28960 | 21 | and |
1ae6b2c7 | 22 | .IR /proc/ pid /mountstats |
98c28960 MK |
23 | files (all described in |
24 | .BR proc (5)) | |
25 | correspond to the mount namespace in which the process with the PID | |
1ae6b2c7 | 26 | .I pid |
98c28960 MK |
27 | resides. |
28 | (All of the processes that reside in the same mount namespace | |
29 | will see the same view in these files.) | |
a721e8b2 | 30 | .PP |
534755ee | 31 | A new mount namespace is created using either |
98c28960 MK |
32 | .BR clone (2) |
33 | or | |
34 | .BR unshare (2) | |
35 | with the | |
1ae6b2c7 | 36 | .B CLONE_NEWNS |
534755ee MK |
37 | flag. |
38 | When a new mount namespace is created, | |
8c9a8274 | 39 | its mount list is initialized as follows: |
534755ee MK |
40 | .IP * 3 |
41 | If the namespace is created using | |
42 | .BR clone (2), | |
8c9a8274 | 43 | the mount list of the child's namespace is a copy |
82357e60 | 44 | of the mount list in the parent process's mount namespace. |
534755ee MK |
45 | .IP * |
46 | If the namespace is created using | |
47 | .BR unshare (2), | |
8c9a8274 MK |
48 | the mount list of the new namespace is a copy of |
49 | the mount list in the caller's previous mount namespace. | |
534755ee | 50 | .PP |
8c9a8274 | 51 | Subsequent modifications to the mount list |
98c28960 MK |
52 | .RB ( mount (2) |
53 | and | |
54 | .BR umount (2)) | |
55 | in either mount namespace will not (by default) affect the | |
8c9a8274 | 56 | mount list seen in the other namespace |
98c28960 MK |
57 | (but see the following discussion of shared subtrees). |
58 | .\" | |
59 | .SH SHARED SUBTREES | |
60 | After the implementation of mount namespaces was completed, | |
61 | experience showed that the isolation that they provided was, | |
62 | in some cases, too great. | |
63 | For example, in order to make a newly loaded optical disk | |
64 | available in all mount namespaces, | |
65 | a mount operation was required in each namespace. | |
66 | For this use case, and others, | |
67 | the shared subtree feature was introduced in Linux 2.6.15. | |
68 | This feature allows for automatic, controlled propagation of mount and unmount | |
69 | .I events | |
70 | between namespaces | |
24483c27 | 71 | (or, more precisely, between the mounts that are members of a |
1ae6b2c7 | 72 | .I peer group |
98c28960 | 73 | that are propagating events to one another). |
a721e8b2 | 74 | .PP |
8c9a8274 | 75 | Each mount is marked (via |
98c28960 MK |
76 | .BR mount (2)) |
77 | as having one of the following | |
78 | .IR "propagation types" : | |
79 | .TP | |
1ae6b2c7 | 80 | .B MS_SHARED |
8c9a8274 MK |
81 | This mount shares events with members of a peer group. |
82 | Mount and unmount events immediately under this mount will propagate | |
83 | to the other mounts that are members of the peer group. | |
98c28960 MK |
84 | .I Propagation |
85 | here means that the same mount or unmount will automatically occur | |
8c9a8274 | 86 | under all of the other mounts in the peer group. |
98c28960 | 87 | Conversely, mount and unmount events that take place under |
8c9a8274 | 88 | peer mounts will propagate to this mount. |
98c28960 | 89 | .TP |
1ae6b2c7 | 90 | .B MS_PRIVATE |
8c9a8274 MK |
91 | This mount is private; it does not have a peer group. |
92 | Mount and unmount events do not propagate into or out of this mount. | |
98c28960 | 93 | .TP |
1ae6b2c7 | 94 | .B MS_SLAVE |
8c9a8274 | 95 | Mount and unmount events propagate into this mount from |
98c28960 | 96 | a (master) shared peer group. |
8c9a8274 | 97 | Mount and unmount events under this mount do not propagate to any peer. |
a721e8b2 | 98 | .IP |
8c9a8274 | 99 | Note that a mount can be the slave of another peer group |
98c28960 MK |
100 | while at the same time sharing mount and unmount events |
101 | with a peer group of which it is a member. | |
102 | (More precisely, one peer group can be the slave of another peer group.) | |
103 | .TP | |
1ae6b2c7 | 104 | .B MS_UNBINDABLE |
98c28960 MK |
105 | This is like a private mount, |
106 | and in addition this mount can't be bind mounted. | |
107 | Attempts to bind mount this mount | |
108 | .RB ( mount (2) | |
109 | with the | |
1ae6b2c7 | 110 | .B MS_BIND |
98c28960 | 111 | flag) will fail. |
a721e8b2 | 112 | .IP |
98c28960 MK |
113 | When a recursive bind mount |
114 | .RB ( mount (2) | |
115 | with the | |
1ae6b2c7 | 116 | .B MS_BIND |
98c28960 | 117 | and |
1ae6b2c7 | 118 | .B MS_REC |
98c28960 MK |
119 | flags) is performed on a directory subtree, |
120 | any unbindable mounts within the subtree are automatically pruned | |
121 | (i.e., not replicated) | |
122 | when replicating that subtree to produce the target subtree. | |
123 | .PP | |
3dcc463a MK |
124 | For a discussion of the propagation type assigned to a new mount, |
125 | see NOTES. | |
a721e8b2 | 126 | .PP |
98c28960 | 127 | The propagation type is a per-mount-point setting; |
8c9a8274 MK |
128 | some mounts may be marked as shared |
129 | (with each shared mount being a member of a distinct peer group), | |
98c28960 MK |
130 | while others are private |
131 | (or slaved or unbindable). | |
a721e8b2 | 132 | .PP |
98c28960 | 133 | Note that a mount's propagation type determines whether |
8c9a8274 | 134 | mount and unmount events |
1ae6b2c7 | 135 | .I immediately under |
8c9a8274 | 136 | the mount are propagated. |
98c28960 | 137 | Thus, the propagation type does not affect propagation of events for |
8c9a8274 MK |
138 | grandchildren and further removed descendant mounts. |
139 | What happens if the mount itself is unmounted is determined by | |
98c28960 MK |
140 | the propagation type that is in effect for the |
141 | .I parent | |
8c9a8274 | 142 | of the mount. |
a721e8b2 | 143 | .PP |
98c28960 | 144 | Members are added to a |
1ae6b2c7 | 145 | .I peer group |
8c9a8274 | 146 | when a mount is marked as shared and either: |
98c28960 | 147 | .IP * 3 |
8c9a8274 | 148 | the mount is replicated during the creation of a new mount namespace; or |
98c28960 | 149 | .IP * |
8c9a8274 | 150 | a new bind mount is created from the mount. |
98c28960 | 151 | .PP |
8c9a8274 MK |
152 | In both of these cases, the new mount joins the peer group |
153 | of which the existing mount is a member. | |
46af7198 | 154 | .PP |
8c9a8274 MK |
155 | A new peer group is also created when a child mount is created under |
156 | an existing mount that is marked as shared. | |
157 | In this case, the new child mount is also marked as shared and | |
158 | the resulting peer group consists of all the mounts | |
9428bb9d | 159 | that are replicated under the peers of the parent mount. |
6b49df22 | 160 | .PP |
98c28960 MK |
161 | A mount ceases to be a member of a peer group when either |
162 | the mount is explicitly unmounted, | |
163 | or when the mount is implicitly unmounted because a mount namespace is removed | |
164 | (because it has no more member processes). | |
a721e8b2 | 165 | .PP |
8c9a8274 | 166 | The propagation type of the mounts in a mount namespace |
98c28960 | 167 | can be discovered via the "optional fields" exposed in |
1ae6b2c7 | 168 | .IR /proc/ pid /mountinfo . |
98c28960 MK |
169 | (See |
170 | .BR proc (5) | |
171 | for details of this file.) | |
172 | The following tags can appear in the optional fields | |
173 | for a record in that file: | |
174 | .TP | |
175 | .I shared:X | |
8c9a8274 | 176 | This mount is shared in peer group |
98c28960 | 177 | .IR X . |
d9cdf357 | 178 | Each peer group has a unique ID that is automatically |
98c28960 | 179 | generated by the kernel, |
8c9a8274 | 180 | and all mounts in the same peer group will show the same ID. |
d9cdf357 MK |
181 | (These IDs are assigned starting from the value 1, |
182 | and may be recycled when a peer group ceases to have any members.) | |
98c28960 MK |
183 | .TP |
184 | .I master:X | |
185 | This mount is a slave to shared peer group | |
186 | .IR X . | |
187 | .TP | |
188 | .IR propagate_from:X " (since Linux 2.6.26)" | |
189 | .\" commit 97e7e0f71d6d948c25f11f0a33878d9356d9579e | |
190 | This mount is a slave and receives propagation from shared peer group | |
191 | .IR X . | |
192 | This tag will always appear in conjunction with a | |
1ae6b2c7 | 193 | .I master:X |
98c28960 MK |
194 | tag. |
195 | Here, | |
1ae6b2c7 | 196 | .I X |
98c28960 MK |
197 | is the closest dominant peer group under the process's root directory. |
198 | If | |
1ae6b2c7 | 199 | .I X |
98c28960 MK |
200 | is the immediate master of the mount, |
201 | or if there is no dominant peer group under the same root, | |
202 | then only the | |
1ae6b2c7 | 203 | .I master:X |
98c28960 | 204 | field is present and not the |
1ae6b2c7 | 205 | .I propagate_from:X |
98c28960 | 206 | field. |
e2109196 | 207 | For further details, see below. |
98c28960 | 208 | .TP |
1ae6b2c7 | 209 | .I unbindable |
98c28960 MK |
210 | This is an unbindable mount. |
211 | .PP | |
212 | If none of the above tags is present, then this is a private mount. | |
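As a rough illustration of the tag rules above, the following shell sketch classifies each record of a mountinfo-format stream by its optional fields. The function name classify_mounts is hypothetical, and the sketch assumes the field layout described in proc(5): the optional fields begin at the seventh field and end at the "-" separator.

```shell
# classify_mounts (hypothetical helper): read mountinfo-format records on
# stdin and print "<mount point> <propagation type>" for each record.
classify_mounts() {
    awk '{
        shared = 0; slave = 0; unbind = 0
        # Optional fields start at field 7 and stop at the "-" separator
        # (or at end of line if the record was truncated before it).
        for (i = 7; i <= NF && $i != "-"; i++) {
            if ($i ~ /^shared:/)         shared = 1
            else if ($i ~ /^master:/)    slave = 1
            else if ($i == "unbindable") unbind = 1
        }
        if (unbind)               type = "unbindable"
        else if (slave && shared) type = "slave+shared"
        else if (slave)           type = "slave"
        else if (shared)          type = "shared"
        else                      type = "private"   # no tag present
        print $5, type
    }'
}
```

A typical invocation would be "classify_mounts < /proc/self/mountinfo". The propagate_from:X tag is deliberately ignored here, because it only ever accompanies a master:X tag.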
213 | .SS MS_SHARED and MS_PRIVATE example | |
214 | Suppose that on a terminal in the initial mount namespace, | |
8c9a8274 | 215 | we mark one mount as shared and another as private, |
98c28960 MK |
216 | and then view the mounts in |
217 | .IR /proc/self/mountinfo : | |
a721e8b2 | 218 | .PP |
98c28960 | 219 | .in +4n |
b8302363 | 220 | .EX |
d9cdf357 MK |
221 | sh1# \fBmount \-\-make\-shared /mntS\fP |
222 | sh1# \fBmount \-\-make\-private /mntP\fP | |
f481726d | 223 | sh1# \fBcat /proc/self/mountinfo | grep \(aq/mnt\(aq | sed \(aqs/ \- .*//\(aq\fP |
d9cdf357 MK |
224 | 77 61 8:17 / /mntS rw,relatime shared:1 |
225 | 83 61 8:15 / /mntP rw,relatime | |
b8302363 | 226 | .EE |
e646a1ba | 227 | .in |
a721e8b2 | 228 | .PP |
98c28960 | 229 | From the |
1ae6b2c7 | 230 | .I /proc/self/mountinfo |
98c28960 | 231 | output, we see that |
1ae6b2c7 | 232 | .I /mntS |
98c28960 | 233 | is a shared mount in peer group 1, and that |
1ae6b2c7 | 234 | .I /mntP |
98c28960 MK |
235 | has no optional tags, indicating that it is a private mount. |
236 | The first two fields in each record in this file are the unique | |
237 | ID for this mount, and the mount ID of the parent mount. | |
8c9a8274 | 238 | We can further inspect this file to see that the parent mount of |
1ae6b2c7 | 239 | .I /mntS |
98c28960 | 240 | and |
1ae6b2c7 | 241 | .I /mntP |
98c28960 MK |
242 | is the root directory, |
243 | .IR / , | |
244 | which is mounted as private: | |
a721e8b2 | 245 | .PP |
98c28960 | 246 | .in +4n |
b8302363 | 247 | .EX |
98c28960 MK |
248 | sh1# \fBcat /proc/self/mountinfo | awk \(aq$1 == 61\(aq | sed \(aqs/ \- .*//\(aq\fP |
249 | 61 0 8:2 / / rw,relatime | |
b8302363 | 250 | .EE |
e646a1ba | 251 | .in |
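The parent lookup performed by hand above can be sketched as a small shell function. The name parent_of is hypothetical; the input is assumed to be in mountinfo format, where the first field is the mount ID, the second is the parent mount ID, and the fifth is the mount point.

```shell
# parent_of MOUNTPOINT (hypothetical helper): read mountinfo-format records
# on stdin and print the mount point of the parent of MOUNTPOINT.
# If several mounts are stacked on MOUNTPOINT, the last record wins.
parent_of() {
    awk -v target="$1" '
        {
            point[$1] = $5      # mount ID -> mount point
            parent[$5] = $2     # mount point -> parent mount ID
        }
        END {
            # Print nothing if the mount or its parent is not listed.
            if ((target in parent) && (parent[target] in point))
                print point[parent[target]]
        }'
}
```

For example, "parent_of /mntS < /proc/self/mountinfo" would report the mount point whose ID appears in the second field of the /mntS record.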
a721e8b2 | 252 | .PP |
98c28960 MK |
253 | On a second terminal, |
254 | we create a new mount namespace where we run a second shell | |
255 | and inspect the mounts: | |
a721e8b2 | 256 | .PP |
98c28960 | 257 | .in +4n |
b8302363 | 258 | .EX |
98c28960 | 259 | $ \fBPS1=\(aqsh2# \(aq sudo unshare \-m \-\-propagation unchanged sh\fP |
f481726d | 260 | sh2# \fBcat /proc/self/mountinfo | grep \(aq/mnt\(aq | sed \(aqs/ \- .*//\(aq\fP |
d9cdf357 MK |
261 | 222 145 8:17 / /mntS rw,relatime shared:1 |
262 | 225 145 8:15 / /mntP rw,relatime | |
b8302363 | 263 | .EE |
e646a1ba | 264 | .in |
a721e8b2 | 265 | .PP |
98c28960 | 266 | The new mount namespace received a copy of the initial mount namespace's |
8c9a8274 MK |
267 | mounts. |
268 | These new mounts maintain the same propagation types, | |
98c28960 MK |
269 | but have unique mount IDs. |
270 | (The | |
1ae6b2c7 | 271 | .I \-\-propagation\~unchanged |
98c28960 MK |
272 | option prevents |
273 | .BR unshare (1) | |
274 | from marking all mounts as private when creating a new mount namespace, | |
275 | .\" Since util-linux 2.27 | |
276 | which it does by default.) | |
a721e8b2 | 277 | .PP |
98c28960 | 278 | In the second terminal, we then create submounts under each of |
1ae6b2c7 | 279 | .I /mntS |
98c28960 | 280 | and |
1ae6b2c7 | 281 | .I /mntP |
98c28960 | 282 | and inspect the set-up: |
a721e8b2 | 283 | .PP |
98c28960 | 284 | .in +4n |
b8302363 | 285 | .EX |
d9cdf357 MK |
286 | sh2# \fBmkdir /mntS/a\fP |
287 | sh2# \fBmount /dev/sdb6 /mntS/a\fP | |
288 | sh2# \fBmkdir /mntP/b\fP | |
289 | sh2# \fBmount /dev/sdb7 /mntP/b\fP | |
f481726d | 290 | sh2# \fBcat /proc/self/mountinfo | grep \(aq/mnt\(aq | sed \(aqs/ \- .*//\(aq\fP |
d9cdf357 MK |
291 | 222 145 8:17 / /mntS rw,relatime shared:1 |
292 | 225 145 8:15 / /mntP rw,relatime | |
293 | 178 222 8:22 / /mntS/a rw,relatime shared:2 | |
294 | 230 225 8:23 / /mntP/b rw,relatime | |
b8302363 | 295 | .EE |
e646a1ba | 296 | .in |
a721e8b2 | 297 | .PP |
98c28960 | 298 | From the above, it can be seen that |
1ae6b2c7 | 299 | .I /mntS/a |
98c28960 | 300 | was created as shared (inheriting this setting from its parent mount) and |
1ae6b2c7 | 301 | .I /mntP/b |
98c28960 | 302 | was created as a private mount. |
a721e8b2 | 303 | .PP |
98c28960 | 304 | Returning to the first terminal and inspecting the set-up, |
8c9a8274 | 305 | we see that the new mount created under the shared mount |
1ae6b2c7 | 306 | .I /mntS |
98c28960 | 307 | propagated to its peer mount (in the initial mount namespace), |
8c9a8274 | 308 | but the new mount created under the private mount |
1ae6b2c7 | 309 | .I /mntP |
98c28960 | 310 | did not propagate: |
a721e8b2 | 311 | .PP |
98c28960 | 312 | .in +4n |
b8302363 | 313 | .EX |
f481726d | 314 | sh1# \fBcat /proc/self/mountinfo | grep \(aq/mnt\(aq | sed \(aqs/ \- .*//\(aq\fP |
d9cdf357 MK |
315 | 77 61 8:17 / /mntS rw,relatime shared:1 |
316 | 83 61 8:15 / /mntP rw,relatime | |
317 | 179 77 8:22 / /mntS/a rw,relatime shared:2 | |
b8302363 | 318 | .EE |
e646a1ba | 319 | .in |
98c28960 MK |
320 | .\" |
321 | .SS MS_SLAVE example | |
8c9a8274 | 322 | Making a mount a slave allows it to receive propagated |
98c28960 | 323 | mount and unmount events from a master shared peer group, |
d9cdf357 | 324 | while preventing it from propagating events to that master. |
98c28960 MK |
325 | This is useful if we want to (say) receive a mount event when |
326 | an optical disk is mounted in the master shared peer group | |
327 | (in another mount namespace), | |
328 | but want to prevent mount and unmount events under the slave mount | |
329 | from having side effects in other namespaces. | |
a721e8b2 | 330 | .PP |
98c28960 | 331 | We can demonstrate the effect of slaving by first marking |
8c9a8274 | 332 | two mounts as shared in the initial mount namespace: |
a721e8b2 | 333 | .PP |
98c28960 | 334 | .in +4n |
b8302363 | 335 | .EX |
98c28960 MK |
336 | sh1# \fBmount \-\-make\-shared /mntX\fP |
337 | sh1# \fBmount \-\-make\-shared /mntY\fP | |
338 | sh1# \fBcat /proc/self/mountinfo | grep \(aq/mnt\(aq | sed \(aqs/ \- .*//\(aq\fP | |
339 | 132 83 8:23 / /mntX rw,relatime shared:1 | |
340 | 133 83 8:22 / /mntY rw,relatime shared:2 | |
b8302363 | 341 | .EE |
e646a1ba | 342 | .in |
a721e8b2 | 343 | .PP |
98c28960 | 344 | On a second terminal, |
8c9a8274 | 345 | we create a new mount namespace and inspect the mounts: |
a721e8b2 | 346 | .PP |
98c28960 | 347 | .in +4n |
b8302363 | 348 | .EX |
98c28960 MK |
349 | sh2# \fBunshare \-m \-\-propagation unchanged sh\fP |
350 | sh2# \fBcat /proc/self/mountinfo | grep \(aq/mnt\(aq | sed \(aqs/ \- .*//\(aq\fP | |
351 | 168 167 8:23 / /mntX rw,relatime shared:1 | |
352 | 169 167 8:22 / /mntY rw,relatime shared:2 | |
b8302363 | 353 | .EE |
e646a1ba | 354 | .in |
a721e8b2 | 355 | .PP |
8c9a8274 | 356 | In the new mount namespace, we then mark one of the mounts as a slave: |
a721e8b2 | 357 | .PP |
98c28960 | 358 | .in +4n |
b8302363 | 359 | .EX |
98c28960 MK |
360 | sh2# \fBmount \-\-make\-slave /mntY\fP |
361 | sh2# \fBcat /proc/self/mountinfo | grep \(aq/mnt\(aq | sed \(aqs/ \- .*//\(aq\fP | |
362 | 168 167 8:23 / /mntX rw,relatime shared:1 | |
363 | 169 167 8:22 / /mntY rw,relatime master:2 | |
b8302363 | 364 | .EE |
e646a1ba | 365 | .in |
a721e8b2 | 366 | .PP |
98c28960 | 367 | From the above output, we see that |
1ae6b2c7 | 368 | .I /mntY |
98c28960 MK |
369 | is now a slave mount that is receiving propagation events from |
370 | the shared peer group with the ID 2. | |
a721e8b2 | 371 | .PP |
98c28960 | 372 | Continuing in the new namespace, we create submounts under each of |
1ae6b2c7 | 373 | .I /mntX |
98c28960 MK |
374 | and |
375 | .IR /mntY : | |
a721e8b2 | 376 | .PP |
98c28960 | 377 | .in +4n |
b8302363 | 378 | .EX |
d9cdf357 MK |
379 | sh2# \fBmkdir /mntX/a\fP |
380 | sh2# \fBmount /dev/sda3 /mntX/a\fP | |
381 | sh2# \fBmkdir /mntY/b\fP | |
382 | sh2# \fBmount /dev/sda5 /mntY/b\fP | |
b8302363 | 383 | .EE |
e646a1ba | 384 | .in |
a721e8b2 | 385 | .PP |
8c9a8274 | 386 | When we inspect the state of the mounts in the new mount namespace, |
98c28960 | 387 | we see that |
1ae6b2c7 | 388 | .I /mntX/a |
98c28960 MK |
389 | was created as a new shared mount |
390 | (inheriting the "shared" setting from its parent mount) and | |
1ae6b2c7 | 391 | .I /mntY/b |
98c28960 | 392 | was created as a private mount: |
a721e8b2 | 393 | .PP |
98c28960 | 394 | .in +4n |
b8302363 | 395 | .EX |
98c28960 MK |
396 | sh2# \fBcat /proc/self/mountinfo | grep \(aq/mnt\(aq | sed \(aqs/ \- .*//\(aq\fP |
397 | 168 167 8:23 / /mntX rw,relatime shared:1 | |
398 | 169 167 8:22 / /mntY rw,relatime master:2 | |
d9cdf357 MK |
399 | 173 168 8:3 / /mntX/a rw,relatime shared:3 |
400 | 175 169 8:5 / /mntY/b rw,relatime | |
b8302363 | 401 | .EE |
e646a1ba | 402 | .in |
a721e8b2 | 403 | .PP |
98c28960 MK |
404 | Returning to the first terminal (in the initial mount namespace), |
405 | we see that the mount | |
1ae6b2c7 | 406 | .I /mntX/a |
98c28960 MK |
407 | propagated to the peer (the shared |
408 | .IR /mntX ), | |
409 | but the mount | |
1ae6b2c7 | 410 | .I /mntY/b |
98c28960 | 411 | was not propagated: |
a721e8b2 | 412 | .PP |
98c28960 | 413 | .in +4n |
b8302363 | 414 | .EX |
98c28960 MK |
415 | sh1# \fBcat /proc/self/mountinfo | grep \(aq/mnt\(aq | sed \(aqs/ \- .*//\(aq\fP |
416 | 132 83 8:23 / /mntX rw,relatime shared:1 | |
417 | 133 83 8:22 / /mntY rw,relatime shared:2 | |
d9cdf357 | 418 | 174 132 8:3 / /mntX/a rw,relatime shared:3 |
b8302363 | 419 | .EE |
e646a1ba | 420 | .in |
a721e8b2 | 421 | .PP |
8c9a8274 | 422 | Now we create a new mount under |
1ae6b2c7 | 423 | .I /mntY |
98c28960 | 424 | in the first shell: |
a721e8b2 | 425 | .PP |
98c28960 | 426 | .in +4n |
b8302363 | 427 | .EX |
d9cdf357 MK |
428 | sh1# \fBmkdir /mntY/c\fP |
429 | sh1# \fBmount /dev/sda1 /mntY/c\fP | |
861d36ba | 430 | sh1# \fBcat /proc/self/mountinfo | grep \(aq/mnt\(aq | sed \(aqs/ \- .*//\(aq\fP |
98c28960 MK |
431 | 132 83 8:23 / /mntX rw,relatime shared:1 |
432 | 133 83 8:22 / /mntY rw,relatime shared:2 | |
d9cdf357 MK |
433 | 174 132 8:3 / /mntX/a rw,relatime shared:3 |
434 | 178 133 8:1 / /mntY/c rw,relatime shared:4 | |
b8302363 | 435 | .EE |
e646a1ba | 436 | .in |
a721e8b2 | 437 | .PP |
8c9a8274 | 438 | When we examine the mounts in the second mount namespace, |
98c28960 | 439 | we see that in this case the new mount has been propagated |
8c9a8274 | 440 | to the slave mount, |
98c28960 | 441 | and that the new mount is itself a slave mount (to peer group 4): |
a721e8b2 | 442 | .PP |
98c28960 | 443 | .in +4n |
b8302363 | 444 | .EX |
98c28960 MK |
445 | sh2# \fBcat /proc/self/mountinfo | grep \(aq/mnt\(aq | sed \(aqs/ \- .*//\(aq\fP |
446 | 168 167 8:23 / /mntX rw,relatime shared:1 | |
447 | 169 167 8:22 / /mntY rw,relatime master:2 | |
d9cdf357 MK |
448 | 173 168 8:3 / /mntX/a rw,relatime shared:3 |
449 | 175 169 8:5 / /mntY/b rw,relatime | |
450 | 179 169 8:1 / /mntY/c rw,relatime master:4 | |
b8302363 | 451 | .EE |
e646a1ba | 452 | .in |
98c28960 MK |
453 | .\" |
454 | .SS MS_UNBINDABLE example | |
455 | One of the primary purposes of unbindable mounts is to avoid | |
8c9a8274 MK |
456 | the "mount explosion" problem when repeatedly performing bind mounts |
457 | of a higher-level subtree at a lower-level mount. | |
98c28960 | 458 | The problem is illustrated by the following shell session. |
a721e8b2 | 459 | .PP |
8c9a8274 | 460 | Suppose we have a system with the following mounts: |
a721e8b2 | 461 | .PP |
98c28960 | 462 | .in +4n |
b8302363 | 463 | .EX |
98c28960 MK |
464 | # \fBmount | awk \(aq{print $1, $2, $3}\(aq\fP |
465 | /dev/sda1 on / | |
466 | /dev/sdb6 on /mntX | |
467 | /dev/sdb7 on /mntY | |
b8302363 | 468 | .EE |
e646a1ba | 469 | .in |
a721e8b2 | 470 | .PP |
98c28960 MK |
471 | Suppose furthermore that we wish to recursively bind mount |
472 | the root directory under several users' home directories. | |
8c9a8274 | 473 | We do this for the first user, and inspect the mounts: |
a721e8b2 | 474 | .PP |
98c28960 | 475 | .in +4n |
b8302363 | 476 | .EX |
98c28960 MK |
477 | # \fBmount \-\-rbind / /home/cecilia/\fP |
478 | # \fBmount | awk \(aq{print $1, $2, $3}\(aq\fP | |
479 | /dev/sda1 on / | |
480 | /dev/sdb6 on /mntX | |
481 | /dev/sdb7 on /mntY | |
482 | /dev/sda1 on /home/cecilia | |
483 | /dev/sdb6 on /home/cecilia/mntX | |
484 | /dev/sdb7 on /home/cecilia/mntY | |
b8302363 | 485 | .EE |
e646a1ba | 486 | .in |
a721e8b2 | 487 | .PP |
98c28960 MK |
488 | When we repeat this operation for the second user, |
489 | we start to see the explosion problem: | |
a721e8b2 | 490 | .PP |
98c28960 | 491 | .in +4n |
b8302363 | 492 | .EX |
98c28960 MK |
493 | # \fBmount \-\-rbind / /home/henry\fP |
494 | # \fBmount | awk \(aq{print $1, $2, $3}\(aq\fP | |
495 | /dev/sda1 on / | |
496 | /dev/sdb6 on /mntX | |
497 | /dev/sdb7 on /mntY | |
498 | /dev/sda1 on /home/cecilia | |
499 | /dev/sdb6 on /home/cecilia/mntX | |
500 | /dev/sdb7 on /home/cecilia/mntY | |
501 | /dev/sda1 on /home/henry | |
502 | /dev/sdb6 on /home/henry/mntX | |
503 | /dev/sdb7 on /home/henry/mntY | |
504 | /dev/sda1 on /home/henry/home/cecilia | |
505 | /dev/sdb6 on /home/henry/home/cecilia/mntX | |
506 | /dev/sdb7 on /home/henry/home/cecilia/mntY | |
b8302363 | 507 | .EE |
e646a1ba | 508 | .in |
a721e8b2 | 509 | .PP |
98c28960 MK |
510 | Under |
511 | .IR /home/henry , | |
512 | we have not only recursively added the | |
1ae6b2c7 | 513 | .I /mntX |
98c28960 | 514 | and |
1ae6b2c7 | 515 | .I /mntY |
98c28960 | 516 | mounts, but also the recursive mounts of those directories under |
1ae6b2c7 | 517 | .I /home/cecilia |
98c28960 MK |
518 | that were created in the previous step. |
519 | Upon repeating the step for a third user, | |
520 | it becomes obvious that the explosion is exponential in nature: | |
a721e8b2 | 521 | .PP |
98c28960 | 522 | .in +4n |
b8302363 | 523 | .EX |
98c28960 MK |
524 | # \fBmount \-\-rbind / /home/otto\fP |
525 | # \fBmount | awk \(aq{print $1, $2, $3}\(aq\fP | |
526 | /dev/sda1 on / | |
527 | /dev/sdb6 on /mntX | |
528 | /dev/sdb7 on /mntY | |
529 | /dev/sda1 on /home/cecilia | |
530 | /dev/sdb6 on /home/cecilia/mntX | |
531 | /dev/sdb7 on /home/cecilia/mntY | |
532 | /dev/sda1 on /home/henry | |
533 | /dev/sdb6 on /home/henry/mntX | |
534 | /dev/sdb7 on /home/henry/mntY | |
535 | /dev/sda1 on /home/henry/home/cecilia | |
536 | /dev/sdb6 on /home/henry/home/cecilia/mntX | |
537 | /dev/sdb7 on /home/henry/home/cecilia/mntY | |
538 | /dev/sda1 on /home/otto | |
539 | /dev/sdb6 on /home/otto/mntX | |
540 | /dev/sdb7 on /home/otto/mntY | |
541 | /dev/sda1 on /home/otto/home/cecilia | |
542 | /dev/sdb6 on /home/otto/home/cecilia/mntX | |
543 | /dev/sdb7 on /home/otto/home/cecilia/mntY | |
544 | /dev/sda1 on /home/otto/home/henry | |
545 | /dev/sdb6 on /home/otto/home/henry/mntX | |
546 | /dev/sdb7 on /home/otto/home/henry/mntY | |
547 | /dev/sda1 on /home/otto/home/henry/home/cecilia | |
548 | /dev/sdb6 on /home/otto/home/henry/home/cecilia/mntX | |
549 | /dev/sdb7 on /home/otto/home/henry/home/cecilia/mntY | |
b8302363 | 550 | .EE |
e646a1ba | 551 | .in |
a721e8b2 | 552 | .PP |
98c28960 MK |
553 | The mount explosion problem in the above scenario can be avoided |
554 | by making each of the new mounts unbindable. | |
555 | The effect of doing this is that recursive mounts of the root | |
556 | directory will not replicate the unbindable mounts. | |
557 | We make such a mount for the first user: | |
a721e8b2 | 558 | .PP |
98c28960 | 559 | .in +4n |
b8302363 | 560 | .EX |
98c28960 | 561 | # \fBmount \-\-rbind \-\-make\-unbindable / /home/cecilia\fP |
b8302363 | 562 | .EE |
e646a1ba | 563 | .in |
a721e8b2 | 564 | .PP |
98c28960 | 565 | Before going further, we show that unbindable mounts are indeed unbindable: |
a721e8b2 | 566 | .PP |
98c28960 | 567 | .in +4n |
b8302363 | 568 | .EX |
98c28960 MK |
569 | # \fBmkdir /mntZ\fP |
570 | # \fBmount \-\-bind /home/cecilia /mntZ\fP | |
571 | mount: wrong fs type, bad option, bad superblock on /home/cecilia, | |
572 | missing codepage or helper program, or other error | |
573 | ||
574 | In some cases useful info is found in syslog \- try | |
575 | dmesg | tail or so. | |
b8302363 | 576 | .EE |
e646a1ba | 577 | .in |
a721e8b2 | 578 | .PP |
98c28960 | 579 | Now we create unbindable recursive bind mounts for the other two users: |
a721e8b2 | 580 | .PP |
98c28960 | 581 | .in +4n |
b8302363 | 582 | .EX |
98c28960 MK |
583 | # \fBmount \-\-rbind \-\-make\-unbindable / /home/henry\fP |
584 | # \fBmount \-\-rbind \-\-make\-unbindable / /home/otto\fP | |
b8302363 | 585 | .EE |
e646a1ba | 586 | .in |
a721e8b2 | 587 | .PP |
8c9a8274 MK |
588 | Upon examining the list of mounts, |
589 | we see there has been no explosion of mounts, | |
98c28960 MK |
590 | because the unbindable mounts were not replicated |
591 | under each user's directory: | |
a721e8b2 | 592 | .PP |
98c28960 | 593 | .in +4n |
b8302363 | 594 | .EX |
98c28960 MK |
595 | # \fBmount | awk \(aq{print $1, $2, $3}\(aq\fP |
596 | /dev/sda1 on / | |
597 | /dev/sdb6 on /mntX | |
598 | /dev/sdb7 on /mntY | |
599 | /dev/sda1 on /home/cecilia | |
600 | /dev/sdb6 on /home/cecilia/mntX | |
601 | /dev/sdb7 on /home/cecilia/mntY | |
602 | /dev/sda1 on /home/henry | |
603 | /dev/sdb6 on /home/henry/mntX | |
604 | /dev/sdb7 on /home/henry/mntY | |
605 | /dev/sda1 on /home/otto | |
606 | /dev/sdb6 on /home/otto/mntX | |
607 | /dev/sdb7 on /home/otto/mntY | |
b8302363 | 608 | .EE |
e646a1ba | 609 | .in |
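The growth in the two scenarios above can be compared with a few lines of shell arithmetic. This is only a counting sketch under the assumptions of the example (every recursive bind mount of the root replicates all bindable mounts); the function names are hypothetical.

```shell
# mounts_after_rbinds INITIAL USERS
# Each unrestricted "mount --rbind / DIR" replicates every existing mount,
# so the total number of mounts doubles with each user.
mounts_after_rbinds() {
    total=$1
    i=0
    while [ "$i" -lt "$2" ]; do
        total=$((total * 2))
        i=$((i + 1))
    done
    echo "$total"
}

# mounts_after_unbindable_rbinds INITIAL USERS
# When each new subtree is made unbindable, later recursive bind mounts
# replicate only the INITIAL mounts, so growth is linear.
mounts_after_unbindable_rbinds() {
    echo $(( $1 + $1 * $2 ))
}
```

With three initial mounts and three users, the first function yields 24 and the second yields 12, matching the lengths of the two mount listings shown above.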
98c28960 MK |
610 | .\" |
611 | .SS Propagation type transitions | |
612 | The following table shows the effect that applying a new propagation type | |
613 | (i.e., | |
1ae6b2c7 | 614 | .IR mount\~\-\-make\-xxxx ) |
8c9a8274 | 615 | has on the existing propagation type of a mount. |
98c28960 MK |
616 | The rows correspond to existing propagation types, |
617 | and the columns are the new propagation settings. | |
618 | For reasons of space, "private" is abbreviated as "priv" and | |
619 | "unbindable" as "unbind". | |
620 | .TS | |
621 | lb2 lb2 lb2 lb2 lb1 | |
4cfdaa53 | 622 | lb | l l l l l. |
98c28960 | 623 | 	make-shared	make-slave	make-priv	make-unbind |
4cfdaa53 | 624 | _ |
98c28960 MK |
625 | shared	shared	slave/priv [1]	priv	unbind |
626 | slave	slave+shared	slave [2]	priv	unbind | |
627 | slave+shared	slave+shared	slave	priv	unbind | |
628 | private	shared	priv [2]	priv	unbind | |
629 | unbindable	shared	unbind [2]	priv	unbind | |
630 | .TE | |
a721e8b2 | 631 | .sp 1 |
98c28960 MK |
632 | Note the following details about the table: |
633 | .IP [1] 4 | |
634 | If a shared mount is the only mount in its peer group, | |
635 | making it a slave automatically makes it private. | |
636 | .IP [2] | |
637 | Slaving a nonshared mount has no effect on the mount. | |
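The transition table and its two notes can be restated as a lookup function. The sketch below is illustrative only: the function name new_propagation and its third argument (the number of other mounts in the peer group, needed for note [1]) are assumptions made for this example, not part of any tool.

```shell
# new_propagation CURRENT OPERATION [OTHER_PEERS]
#   CURRENT:     shared, slave, slave+shared, private, or unbindable
#   OPERATION:   make-shared, make-slave, make-private, or make-unbindable
#   OTHER_PEERS: number of other mounts in the peer group (default 1);
#                only consulted when slaving a shared mount (note [1])
new_propagation() {
    cur=$1
    op=$2
    peers=${3:-1}
    case $op in
    make-private)    echo private ;;
    make-unbindable) echo unbindable ;;
    make-shared)
        case $cur in
        slave|slave+shared) echo slave+shared ;;
        *)                  echo shared ;;
        esac ;;
    make-slave)
        case $cur in
        shared)
            # Note [1]: a lone shared mount has no master to follow.
            if [ "$peers" -gt 0 ]; then echo slave; else echo private; fi ;;
        slave|slave+shared) echo slave ;;
        *)  echo "$cur" ;;  # Note [2]: no effect on nonshared mounts
        esac ;;
    *)
        echo "unknown operation: $op" >&2
        return 1 ;;
    esac
}
```

For instance, "new_propagation shared make-slave 0" prints "private", whereas "new_propagation slave make-shared" prints "slave+shared".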
638 | .\" | |
639 | .SS Bind (MS_BIND) semantics | |
640 | Suppose that the following command is performed: | |
a721e8b2 | 641 | .PP |
fd6307c4 MK |
642 | .in +4n |
643 | .EX | |
644 | mount \-\-bind A/a B/b | |
645 | .EE | |
646 | .in | |
a721e8b2 | 647 | .PP |
98c28960 MK |
648 | Here, |
649 | .I A | |
8c9a8274 | 650 | is the source mount, |
98c28960 | 651 | .I B |
8c9a8274 | 652 | is the destination mount, |
98c28960 MK |
653 | .I a |
654 | is a subdirectory path under the mount point | |
655 | .IR A , | |
656 | and | |
657 | .I b | |
658 | is a subdirectory path under the mount point | |
659 | .IR B . | |
660 | The propagation type of the resulting mount, | |
661 | .IR B/b , | |
8c9a8274 | 662 | depends on the propagation types of the mounts |
1ae6b2c7 | 663 | .I A |
98c28960 MK |
664 | and |
665 | .IR B , | |
666 | and is summarized in the following table. | |
a721e8b2 | 667 | .PP |
98c28960 MK |
668 | .TS |
669 | lb2 lb1 lb2 lb2 lb2 lb0 | |
670 | lb2 lb1 lb2 lb2 lb2 lb0 | |
4cfdaa53 | 671 | lb lb | l l l l l. |
98c28960 MK |
672 | 		source(A) |
673 | 		shared	private	slave	unbind | |
674 | _ | |
4cfdaa53 MK |
675 | dest(B)	shared	shared	shared	slave+shared	invalid |
676 | 	nonshared	shared	private	slave	invalid | |
98c28960 | 677 | .TE |
a721e8b2 | 678 | .sp 1 |
98c28960 MK |
679 | Note that a recursive bind of a subtree follows the same semantics |
680 | as for a bind operation on each mount in the subtree. | |
681 | (Unbindable mounts are automatically pruned at the target mount point.) | |
a721e8b2 | 682 | .PP |
98c28960 | 683 | For further details, see |
77a4c232 | 684 | .I Documentation/filesystems/sharedsubtree.rst |
98c28960 MK |
685 | in the kernel source tree. |
686 | .\" | |
687 | .SS Move (MS_MOVE) semantics | |
688 | Suppose that the following command is performed: | |
a721e8b2 | 689 | .PP |
fd6307c4 MK |
690 | .in +4n |
691 | .EX | |
692 | mount \-\-move A B/b | |
693 | .EE | |
694 | .in | |
a721e8b2 | 695 | .PP |
98c28960 MK |
696 | Here, |
697 | .I A | |
8c9a8274 | 698 | is the source mount, |
98c28960 | 699 | .I B |
8c9a8274 | 700 | is the destination mount, and |
98c28960 MK |
701 | .I b |
702 | is a subdirectory path under the mount point | |
703 | .IR B . | |
704 | The propagation type of the resulting mount, | |
705 | .IR B/b , | |
8c9a8274 | 706 | depends on the propagation types of the mounts |
1ae6b2c7 | 707 | .I A |
98c28960 MK |
708 | and |
709 | .IR B , | |
710 | and is summarized in the following table. | |
a721e8b2 | 711 | .PP |
98c28960 MK |
712 | .TS |
713 | lb2 lb1 lb2 lb2 lb2 lb0 | |
714 | lb2 lb1 lb2 lb2 lb2 lb0 | |
4cfdaa53 | 715 | lb lb | l l l l l. |
98c28960 MK |
716 | 		source(A) |
717 | 		shared	private	slave	unbind | |
718 | _ | |
4cfdaa53 MK |
719 | dest(B)	shared	shared	shared	slave+shared	invalid |
720 | 	nonshared	shared	private	slave	unbindable | |
98c28960 | 721 | .TE |
a721e8b2 | 722 | .sp 1 |
98c28960 | 723 | Note: moving a mount that resides under a shared mount is invalid. |
a721e8b2 | 724 | .PP |
98c28960 | 725 | For further details, see |
77a4c232 | 726 | .I Documentation/filesystems/sharedsubtree.rst |
98c28960 MK |
727 | in the kernel source tree. |
728 | .\" | |
729 | .SS Mount semantics | |
8c9a8274 | 730 | Suppose that we use the following command to create a mount: |
a721e8b2 | 731 | .PP |
fd6307c4 MK |
732 | .in +4n |
733 | .EX | |
734 | mount device B/b | |
735 | .EE | |
736 | .in | |
a721e8b2 | 737 | .PP |
a66648bb MK |
738 | Here, |
739 | .I B | |
8c9a8274 | 740 | is the destination mount, and |
a66648bb MK |
741 | .I b |
742 | is a subdirectory path under the mount point | |
743 | .IR B . | |
744 | The propagation type of the resulting mount, | |
745 | .IR B/b , | |
746 | follows the same rules as for a bind mount, | |
747 | where the propagation type of the source mount | |
748 | is considered always to be private. | |
749 | .\" | |
750 | .SS Unmount semantics | |
8c9a8274 | 751 | Suppose that we use the following command to tear down a mount: |
a66648bb MK |
752 | .PP |
753 | .in +4n | |
754 | .EX | |
755 | umount A |
756 | .EE | |
757 | .in | |
758 | .PP | |
759 | Here, | |
760 | .I A | |
8c9a8274 | 761 | is a mount on |
a66648bb MK |
762 | .IR B/b , |
763 | where | |
764 | .I B | |
765 | is the parent mount and | |
766 | .I b | |
767 | is a subdirectory path under the mount point | |
768 | .IR B . | |
769 | If | |
770 | .I B |
771 | is shared, then all most-recently-mounted mounts at | |
772 | .I b | |
773 | on mounts that receive propagation from mount | |
774 | .I B | |
775 | and do not have submounts under them are unmounted. | |
776 | .\" | |
1ae6b2c7 | 777 | .SS The /proc/ pid /mountinfo "propagate_from" tag |
a66648bb MK |
778 | The |
779 | .I propagate_from:X | |
780 | tag is shown in the optional fields of a | |
1ae6b2c7 | 781 | .IR /proc/ pid /mountinfo |
a66648bb MK |
782 | record in cases where a process can't see a slave's immediate master |
783 | (i.e., the pathname of the master is not reachable from | |
784 | the filesystem root directory) | |
785 | and so cannot determine the | |
786 | chain of propagation between the mounts it can see. | |
787 | .PP | |
788 | In the following example, we first create a two-link master-slave chain | |
789 | between the mounts | |
790 | .IR /mnt , | |
791 | .IR /tmp/etc , | |
792 | and | |
793 | .IR /mnt/tmp/etc . | |
794 | Then the | |
795 | .BR chroot (1) | |
796 | command is used to make the | |
1ae6b2c7 | 797 | .I /tmp/etc |
a66648bb MK |
798 | mount point unreachable from the root directory, |
799 | creating a situation where the master of | |
1ae6b2c7 | 800 | .I /mnt/tmp/etc |
a66648bb MK |
801 | is not reachable from the (new) root directory of the process. |
802 | .PP | |
803 | First, we bind mount the root directory onto | |
1ae6b2c7 | 804 | .I /mnt |
a66648bb | 805 | and then bind mount |
1ae6b2c7 | 806 | .I /proc |
a66648bb | 807 | at |
1ae6b2c7 | 808 | .I /mnt/proc |
a66648bb MK |
809 | so that after the later |
810 | .BR chroot (1) | |
811 | the | |
812 | .BR proc (5) | |
813 | filesystem remains visible at the correct location | |
814 | in the chroot-ed environment. | |
815 | .PP | |
816 | .in +4n | |
817 | .EX | |
818 | # \fBmkdir \-p /mnt/proc\fP | |
819 | # \fBmount \-\-bind / /mnt\fP | |
820 | # \fBmount \-\-bind /proc /mnt/proc\fP | |
821 | .EE | |
822 | .in | |
823 | .PP | |
824 | Next, we ensure that the | |
1ae6b2c7 | 825 | .I /mnt |
a66648bb MK |
826 | mount is a shared mount in a new peer group (with no peers): |
827 | .PP | |
828 | .in +4n | |
829 | .EX | |
830 | # \fBmount \-\-make\-private /mnt\fP # Isolate from any previous peer group | |
831 | # \fBmount \-\-make\-shared /mnt\fP | |
832 | # \fBcat /proc/self/mountinfo | grep \(aq/mnt\(aq | sed \(aqs/ \- .*//\(aq\fP | |
833 | 239 61 8:2 / /mnt ... shared:102 | |
834 | 248 239 0:4 / /mnt/proc ... shared:5 | |
835 | .EE | |
836 | .in | |
837 | .PP | |
838 | Next, we bind mount | |
1ae6b2c7 | 839 | .I /mnt/etc |
a66648bb MK |
840 | onto |
841 | .IR /tmp/etc : | |
842 | .PP | |
843 | .in +4n | |
844 | .EX | |
845 | # \fBmkdir \-p /tmp/etc\fP | |
846 | # \fBmount \-\-bind /mnt/etc /tmp/etc\fP | |
847 | # \fBcat /proc/self/mountinfo | egrep \(aq/mnt|/tmp/\(aq | sed \(aqs/ \- .*//\(aq\fP | |
848 | 239 61 8:2 / /mnt ... shared:102 | |
849 | 248 239 0:4 / /mnt/proc ... shared:5 | |
850 | 267 40 8:2 /etc /tmp/etc ... shared:102 | |
851 | .EE | |
852 | .in | |
853 | .PP | |
8c9a8274 | 854 | Initially, these two mounts are in the same peer group, |
a66648bb | 855 | but we then make |
1ae6b2c7 | 856 | .I /tmp/etc |
a66648bb MK |
857 | a slave of |
858 | .IR /mnt/etc , | |
859 | and then make | |
1ae6b2c7 | 860 | .I /tmp/etc |
a66648bb MK |
861 | shared as well, |
862 | so that it can propagate events to the next slave in the chain: | |
863 | .PP | |
864 | .in +4n | |
865 | .EX | |
866 | # \fBmount \-\-make\-slave /tmp/etc\fP | |
867 | # \fBmount \-\-make\-shared /tmp/etc\fP | |
868 | # \fBcat /proc/self/mountinfo | egrep \(aq/mnt|/tmp/\(aq | sed \(aqs/ \- .*//\(aq\fP | |
869 | 239 61 8:2 / /mnt ... shared:102 | |
870 | 248 239 0:4 / /mnt/proc ... shared:5 | |
871 | 267 40 8:2 /etc /tmp/etc ... shared:105 master:102 | |
872 | .EE | |
873 | .in | |
874 | .PP | |
875 | Then we bind mount | |
1ae6b2c7 | 876 | .I /tmp/etc |
a66648bb MK |
877 | onto |
878 | .IR /mnt/tmp/etc . | |
8c9a8274 | 879 | Again, the two mounts are initially in the same peer group, |
a66648bb | 880 | but we then make |
1ae6b2c7 | 881 | .I /mnt/tmp/etc |
a66648bb MK |
882 | a slave of |
883 | .IR /tmp/etc : | |
884 | .PP | |
885 | .in +4n | |
886 | .EX | |
887 | # \fBmkdir \-p /mnt/tmp/etc\fP | |
888 | # \fBmount \-\-bind /tmp/etc /mnt/tmp/etc\fP | |
889 | # \fBmount \-\-make\-slave /mnt/tmp/etc\fP | |
890 | # \fBcat /proc/self/mountinfo | egrep \(aq/mnt|/tmp/\(aq | sed \(aqs/ \- .*//\(aq\fP | |
891 | 239 61 8:2 / /mnt ... shared:102 | |
892 | 248 239 0:4 / /mnt/proc ... shared:5 | |
893 | 267 40 8:2 /etc /tmp/etc ... shared:105 master:102 | |
894 | 273 239 8:2 /etc /mnt/tmp/etc ... master:105 | |
895 | .EE | |
896 | .in | |
897 | .PP | |
898 | From the above, we see that | |
1ae6b2c7 | 899 | .I /mnt |
a66648bb MK |
900 | is the master of the slave |
901 | .IR /tmp/etc , | |
902 | which in turn is the master of the slave | |
903 | .IR /mnt/tmp/etc . | |
904 | .PP | |
905 | We then | |
906 | .BR chroot (1) | |
907 | to the | |
1ae6b2c7 | 908 | .I /mnt |
a66648bb MK |
909 | directory, which renders the mount with ID 267 unreachable |
910 | from the (new) root directory: | |
911 | .PP | |
912 | .in +4n | |
913 | .EX | |
914 | # \fBchroot /mnt\fP | |
915 | .EE | |
916 | .in | |
917 | .PP | |
918 | When we examine the state of the mounts inside the chroot-ed environment, | |
919 | we see the following: | |
920 | .PP | |
921 | .in +4n | |
922 | .EX | |
923 | # \fBcat /proc/self/mountinfo | sed \(aqs/ \- .*//\(aq\fP | |
924 | 239 61 8:2 / / ... shared:102 | |
925 | 248 239 0:4 / /proc ... shared:5 | |
926 | 273 239 8:2 /etc /tmp/etc ... master:105 propagate_from:102 | |
927 | .EE | |
928 | .in | |
929 | .PP | |
930 | Above, we see that the mount with ID 273 | |
931 | is a slave whose master is the peer group 105. | |
932 | The mount point for that master is unreachable, and so a | |
1ae6b2c7 | 933 | .I propagate_from |
a66648bb MK |
934 | tag is displayed, indicating that the closest dominant peer group |
935 | (i.e., the nearest reachable mount in the slave chain) | |
936 | is the peer group with the ID 102 (corresponding to the | |
1ae6b2c7 | 937 | .I /mnt |
a66648bb MK |
938 | mount point before the |
939 | .BR chroot (1) | |
940 | was performed). |
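.PP
The propagation tags shown in the examples above can be pulled out of
mountinfo records mechanically.
The following is a minimal sketch, assuming the record layout described in
proc(5): six fixed fields, then zero or more optional fields,
then a single hyphen as separator.
The function name mount_propagation_tags is a hypothetical helper,
and a record with no optional fields is reported as "private".

```shell
# Hypothetical helper: for each mountinfo record, print the mount ID,
# the mount point, and the optional fields (shared:N, master:N,
# propagate_from:N, unbindable). Reads the files named as arguments,
# or standard input if there are none.
mount_propagation_tags() {
    awk '{
        tags = ""
        # Optional fields start at field 7 and end at the "-" separator.
        for (i = 7; i <= NF && $i != "-"; i++)
            tags = (tags == "" ? $i : tags " " $i)
        # No optional fields at all means the mount is private.
        print $1, $5, (tags == "" ? "private" : tags)
    }' "$@"
}

# Typical use on a live system:
#   mount_propagation_tags /proc/self/mountinfo
```

Applied to the chroot-ed environment shown above, the record for mount ID 273
would be summarized with both its master and propagate_from tags on one line.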
941 | .\" | |
942 | .SH VERSIONS | |
943 | Mount namespaces first appeared in Linux 2.4.19. | |
944 | .SH CONFORMING TO | |
945 | Namespaces are a Linux-specific feature. | |
946 | .\" | |
947 | .SH NOTES | |
8c9a8274 | 948 | The propagation type assigned to a new mount depends |
a66648bb | 949 | on the propagation type of the parent mount. |
8c9a8274 | 950 | If the mount has a parent (i.e., it is a non-root mount |
a66648bb MK |
951 | point) and the propagation type of the parent is |
952 | .BR MS_SHARED , | |
953 | then the propagation type of the new mount is also | |
954 | .BR MS_SHARED . | |
955 | Otherwise, the propagation type of the new mount is | |
956 | .BR MS_PRIVATE . | |
957 | .PP | |
958 | Notwithstanding the fact that the default propagation type | |
8c9a8274 | 959 | for a new mount is in many cases |
a66648bb | 960 | .BR MS_PRIVATE , |
1ae6b2c7 | 961 | .B MS_SHARED |
a66648bb MK |
962 | is typically more useful. |
963 | For this reason, | |
964 | .BR systemd (1) | |
8c9a8274 | 965 | automatically remounts all mounts as |
1ae6b2c7 | 966 | .B MS_SHARED |
a66648bb MK |
967 | on system startup. |
968 | Thus, on most modern systems, the default propagation type is in practice | |
969 | .BR MS_SHARED . | |
970 | .PP | |
971 | Since, when one uses | |
972 | .BR unshare (1) | |
973 | to create a mount namespace, | |
8c9a8274 | 974 | the goal is commonly to provide full isolation of the mounts |
a66648bb MK |
975 | in the new namespace, |
976 | .BR unshare (1) | |
977 | (since | |
1ae6b2c7 | 978 | .I util\-linux |
a66648bb MK |
979 | version 2.27) in turn reverses the step performed by |
980 | .BR systemd (1), | |
8c9a8274 | 981 | by making all mounts private in the new namespace. |
a66648bb MK |
982 | That is, |
983 | .BR unshare (1) | |
984 | performs the equivalent of the following in the new mount namespace: | |
a721e8b2 | 985 | .PP |
fd6307c4 MK |
986 | .in +4n |
987 | .EX | |
a66648bb | 988 | mount \-\-make\-rprivate / |
fd6307c4 MK |
989 | .EE |
990 | .in | |
a721e8b2 | 991 | .PP |
a66648bb | 992 | To prevent this, one can use the |
1ae6b2c7 | 993 | .I \-\-propagation\~unchanged |
a66648bb MK |
994 | option to |
995 | .BR unshare (1). | |
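.PP
One way to observe the effect of the propagation setting is to list the
mounts that still carry a shared tag.
The following is a minimal sketch, again assuming the mountinfo layout from
proc(5); the function name shared_mounts is a hypothetical helper.

```shell
# Hypothetical helper: print the mount point of every mountinfo record
# that has a shared:N optional field. Reads the files named as
# arguments, or standard input if there are none.
shared_mounts() {
    awk '{
        # Scan only the optional fields, which end at the "-" separator.
        for (i = 7; i <= NF && $i != "-"; i++)
            if ($i ~ /^shared:[0-9]+$/) { print $5; break }
    }' "$@"
}

# Typical use on a live system:
#   shared_mounts /proc/self/mountinfo
```

Run inside a namespace created with the default settings of unshare(1),
this should print nothing, since all mounts have been made private;
with the propagation left unchanged, it should list the same shared mounts
as in the parent namespace.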
a721e8b2 | 996 | .PP |
a66648bb MK |
997 | An application that creates a new mount namespace directly using |
998 | .BR clone (2) | |
999 | or | |
1000 | .BR unshare (2) | |
1001 | may desire to prevent propagation of mount events to other mount namespaces | |
1002 | (as is done by | |
1003 | .BR unshare (1)). | |
1004 | This can be done by changing the propagation type of | |
8c9a8274 | 1005 | mounts in the new namespace to either |
a66648bb MK |
1006 | .B MS_SLAVE |
1007 | or | |
1008 | .BR MS_PRIVATE , | |
1009 | using a call such as the following: | |
a721e8b2 | 1010 | .PP |
e2109196 | 1011 | .in +4n |
b8302363 | 1012 | .EX |
a66648bb | 1013 | mount(NULL, "/", NULL, MS_SLAVE | MS_REC, NULL); |
b8302363 | 1014 | .EE |
e646a1ba | 1015 | .in |
a721e8b2 | 1016 | .PP |
a66648bb MK |
1017 | For a discussion of propagation types when moving mounts |
1018 | .RB ( MS_MOVE ) | |
1019 | and creating bind mounts | |
1020 | .RB ( MS_BIND ), | |
1021 | see | |
77a4c232 | 1022 | .IR Documentation/filesystems/sharedsubtree.rst . |
a66648bb MK |
1023 | .\" |
1024 | .\" ============================================================ | |
1025 | .\" | |
1026 | .SS Restrictions on mount namespaces | |
1027 | Note the following points with respect to mount namespaces: | |
ababc346 | 1028 | .IP [1] 4 |
a66648bb MK |
1029 | Each mount namespace has an owner user namespace. |
1030 | As explained above, when a new mount namespace is created, | |
8c9a8274 | 1031 | its mount list is initialized as a copy of the mount list |
a66648bb | 1032 | of another mount namespace. |
8c9a8274 | 1033 | If the new namespace and the namespace from which the mount list |
a66648bb MK |
1034 | was copied are owned by different user namespaces, |
1035 | then the new mount namespace is considered | |
1036 | .IR "less privileged" . | |
ababc346 | 1037 | .IP [2] |
a66648bb MK |
1038 | When creating a less privileged mount namespace, |
1039 | shared mounts are reduced to slave mounts. | |
1040 | This ensures that mappings performed in less | |
1041 | privileged mount namespaces will not propagate to more privileged | |
1042 | mount namespaces. | |
ababc346 | 1043 | .IP [3] |
a66648bb MK |
1044 | Mounts that come as a single unit from a more privileged mount namespace are |
1045 | locked together and may not be separated in a less privileged mount | |
1046 | namespace. | |
1047 | (The | |
1048 | .BR unshare (2) | |
1049 | .B CLONE_NEWNS | |
1050 | operation brings across all of the mounts from the original | |
1051 | mount namespace as a single unit, | |
1052 | and recursive mounts that propagate between | |
1053 | mount namespaces propagate as a single unit.) | |
1054 | .IP | |
1055 | In this context, "may not be separated" means that the mounts | |
1056 | are locked so that they may not be individually unmounted. | |
1057 | Consider the following example: | |
1058 | .IP | |
1059 | .RS | |
e2109196 | 1060 | .in +4n |
b8302363 | 1061 | .EX |
906ab494 MK |
1062 | $ \fBsudo sh\fP |
1063 | # \fBmount \-\-bind /dev/null /etc/shadow\fP | |
1064 | # \fBcat /etc/shadow\fP # Produces no output | |
b8302363 | 1065 | .EE |
e646a1ba | 1066 | .in |
a66648bb MK |
1067 | .RE |
1068 | .IP | |
aa62e72d | 1069 | The above steps, performed in a more privileged mount namespace, |
2433a20c | 1070 | have created a bind mount that |
906ab494 MK |
1071 | obscures the contents of the shadow password file, |
1072 | .IR /etc/shadow . | |
a66648bb | 1073 | For security reasons, it should not be possible to unmount |
aa62e72d | 1074 | that mount in a less privileged mount namespace, |
906ab494 MK |
1075 | since that would reveal the contents of |
1076 | .IR /etc/shadow . | |
a66648bb MK |
1077 | .IP |
1078 | Suppose we now create a new mount namespace | |
2433a20c | 1079 | owned by a new user namespace. |
a66648bb MK |
1080 | The new mount namespace will inherit copies of all of the mounts |
1081 | from the previous mount namespace. | |
1082 | However, those mounts will be locked because the new mount namespace | |
2433a20c MK |
1083 | is less privileged. |
1084 | Consequently, an attempt to unmount the mount fails as shown |
1085 | in the following step: | |
a66648bb MK |
1086 | .IP |
1087 | .RS | |
e2109196 | 1088 | .in +4n |
b8302363 | 1089 | .EX |
906ab494 | 1090 | # \fBunshare \-\-user \-\-map\-root\-user \-\-mount \e\fP |
a66648bb MK |
1091 | \fBstrace \-o /tmp/log \e\fP |
1092 | \fBumount /etc/shadow\fP |
906ab494 MK |
1093 | umount: /etc/shadow: not mounted. |
1094 | # \fBgrep \(aq^umount\(aq /tmp/log\fP | |
1095 | umount2("/etc/shadow", 0) = \-1 EINVAL (Invalid argument) | |
b8302363 | 1096 | .EE |
e646a1ba | 1097 | .in |
a66648bb MK |
1098 | .RE |
1099 | .IP | |
1100 | The error message from | |
1101 | .BR umount (8) |
1102 | is a little confusing, but the | |
1103 | .BR strace (1) | |
1104 | output reveals that the underlying | |
1105 | .BR umount2 (2) | |
1106 | system call failed with the error | |
1107 | .BR EINVAL , | |
1108 | which is the error that the kernel returns to indicate that | |
1109 | the mount is locked. | |
ebc82e00 MK |
1110 | .IP |
1111 | Note, however, that it is possible to stack (and unstack) a | |
1112 | mount on top of one of the inherited locked mounts in a | |
1113 | less privileged mount namespace: | |
1114 | .IP | |
1115 | .in +4n | |
1116 | .EX | |
906ab494 MK |
1117 | # \fBecho \(aqaaaaa\(aq > /tmp/a\fP # File to mount onto /etc/shadow |
1118 | # \fBunshare \-\-user \-\-map\-root\-user \-\-mount \e\fP | |
1119 | \fBsh \-c \(aqmount \-\-bind /tmp/a /etc/shadow; cat /etc/shadow\(aq\fP | |
1120 | aaaaa | |
1121 | # \fBumount /etc/shadow\fP | |
ebc82e00 MK |
1122 | .EE |
1123 | .in | |
906ab494 MK |
1124 | .IP |
1125 | The final | |
1126 | .BR umount (8) | |
1127 | command above, which is performed in the initial mount namespace, | |
1128 | makes the original | |
1129 | .I /etc/shadow | |
1130 | file once more visible in that namespace. | |
ababc346 MK |
1131 | .IP [4] |
1132 | Following on from point [3], | |
f6aaf493 | 1133 | note that it is possible to unmount an entire subtree of mounts that |
aa62e72d | 1134 | propagated as a unit into a less privileged mount namespace, |
a66648bb MK |
1135 | as illustrated in the following example. |
1136 | .IP | |
1137 | First, we create new user and mount namespaces using | |
1138 | .BR unshare (1). | |
1139 | In the new mount namespace, | |
1140 | the propagation type of all mounts is set to private. | |
1141 | We then create a shared bind mount at | |
1142 | .IR /mnt , | |
8c9a8274 | 1143 | and a small hierarchy of mounts underneath that mount. |
a66648bb | 1144 | .IP |
e2109196 | 1145 | .in +4n |
b8302363 | 1146 | .EX |
a66648bb MK |
1147 | $ \fBPS1=\(aqns1# \(aq sudo unshare \-\-user \-\-map\-root\-user \e\fP |
1148 | \fB\-\-mount \-\-propagation private bash\fP | |
1149 | ns1# \fBecho $$\fP # We need the PID of this shell later | |
1150 | 778501 | |
1151 | ns1# \fBmount \-\-make\-shared \-\-bind /mnt /mnt\fP | |
1152 | ns1# \fBmkdir /mnt/x\fP | |
1153 | ns1# \fBmount \-\-make\-private \-t tmpfs none /mnt/x\fP | |
1154 | ns1# \fBmkdir /mnt/x/y\fP | |
1155 | ns1# \fBmount \-\-make\-private \-t tmpfs none /mnt/x/y\fP | |
1156 | ns1# \fBgrep /mnt /proc/self/mountinfo | sed \(aqs/ \- .*//\(aq\fP | |
1157 | 986 83 8:5 /mnt /mnt rw,relatime shared:344 | |
1158 | 989 986 0:56 / /mnt/x rw,relatime | |
1159 | 990 989 0:57 / /mnt/x/y rw,relatime | |
b8302363 | 1160 | .EE |
e646a1ba | 1161 | .in |
a66648bb MK |
1162 | .IP |
1163 | Continuing in the same shell session, | |
aa62e72d MK |
1164 | we then create a second shell in a new user namespace and a new |
1165 | (less privileged) mount namespace and | |
8c9a8274 | 1166 | check the state of the propagated mounts rooted at |
a66648bb MK |
1167 | .IR /mnt . |
1168 | .IP | |
e2109196 | 1169 | .in +4n |
b8302363 | 1170 | .EX |
2433a20c | 1171 | ns1# \fBPS1=\(aqns2# \(aq unshare \-\-user \-\-map\-root\-user \e\fP |
a66648bb MK |
1172 | \fB\-\-mount \-\-propagation unchanged bash\fP |
1173 | ns2# \fBgrep /mnt /proc/self/mountinfo | sed \(aqs/ \- .*//\(aq\fP | |
1174 | 1239 1204 8:5 /mnt /mnt rw,relatime master:344 | |
1175 | 1240 1239 0:56 / /mnt/x rw,relatime | |
1176 | 1241 1240 0:57 / /mnt/x/y rw,relatime | |
e646a1ba | 1177 | .EE |
e2109196 | 1178 | .in |
a66648bb | 1179 | .IP |
8c9a8274 | 1180 | Of note in the above output is that the propagation type of the mount |
a66648bb | 1181 | .I /mnt |
ababc346 | 1182 | has been reduced to slave, as explained in point [2]. |
a66648bb MK |
1183 | This means that submount events will propagate from the master |
1184 | .I /mnt | |
1185 | in "ns1", but propagation will not occur in the opposite direction. | |
1186 | .IP | |
1187 | From a separate terminal window, we then use | |
1188 | .BR nsenter (1) | |
1189 | to enter the mount and user namespaces corresponding to "ns1". | |
1190 | In that terminal window, we then recursively bind mount | |
1ae6b2c7 | 1191 | .I /mnt/x |
a66648bb MK |
1192 | at the location |
1193 | .IR /mnt/ppp . | |
1194 | .IP | |
e2109196 | 1195 | .in +4n |
b8302363 | 1196 | .EX |
a66648bb MK |
1197 | $ \fBPS1=\(aqns3# \(aq sudo nsenter \-t 778501 \-\-user \-\-mount\fP |
1198 | ns3# \fBmount \-\-rbind \-\-make\-private /mnt/x /mnt/ppp\fP | |
1199 | ns3# \fBgrep /mnt /proc/self/mountinfo | sed \(aqs/ \- .*//\(aq\fP | |
1200 | 986 83 8:5 /mnt /mnt rw,relatime shared:344 | |
1201 | 989 986 0:56 / /mnt/x rw,relatime | |
1202 | 990 989 0:57 / /mnt/x/y rw,relatime | |
1203 | 1242 986 0:56 / /mnt/ppp rw,relatime | |
1204 | 1243 1242 0:57 / /mnt/ppp/y rw,relatime shared:518 | |
b8302363 | 1205 | .EE |
e646a1ba | 1206 | .in |
a66648bb MK |
1207 | .IP |
1208 | Because the propagation type of the parent mount, | |
1209 | .IR /mnt , | |
f6aaf493 | 1210 | was shared, the recursive bind mount propagated a small subtree of |
a66648bb MK |
1211 | mounts under the slave mount |
1212 | .I /mnt | |
1213 | into "ns2", | |
1214 | as can be verified by executing the following command in that shell session: | |
1215 | .IP | |
e2109196 | 1216 | .in +4n |
b8302363 | 1217 | .EX |
a66648bb MK |
1218 | ns2# \fBgrep /mnt /proc/self/mountinfo | sed \(aqs/ \- .*//\(aq\fP |
1219 | 1239 1204 8:5 /mnt /mnt rw,relatime master:344 | |
1220 | 1240 1239 0:56 / /mnt/x rw,relatime | |
1221 | 1241 1240 0:57 / /mnt/x/y rw,relatime | |
1222 | 1244 1239 0:56 / /mnt/ppp rw,relatime | |
1223 | 1245 1244 0:57 / /mnt/ppp/y rw,relatime master:518 | |
b8302363 | 1224 | .EE |
e646a1ba | 1225 | .in |
a66648bb | 1226 | .IP |
2433a20c | 1227 | While it is not possible to unmount a part of the propagated subtree |
5aea19ed MK |
1228 | .RI ( /mnt/ppp/y ) |
1229 | in "ns2", | |
f6aaf493 | 1230 | it is possible to unmount the entire subtree, |
a66648bb MK |
1231 | as shown by the following commands: |
1232 | .IP | |
fd6307c4 MK |
1233 | .in +4n |
1234 | .EX | |
a66648bb MK |
1235 | ns2# \fBumount /mnt/ppp/y\fP |
1236 | umount: /mnt/ppp/y: not mounted. | |
1237 | ns2# \fBumount \-l /mnt/ppp\fP # Succeeds... |
1238 | ns2# \fBgrep /mnt /proc/self/mountinfo | sed \(aqs/ \- .*//\(aq\fP |
1239 | 1239 1204 8:5 /mnt /mnt rw,relatime master:344 | |
1240 | 1240 1239 0:56 / /mnt/x rw,relatime | |
1241 | 1241 1240 0:57 / /mnt/x/y rw,relatime | |
fd6307c4 MK |
1242 | .EE |
1243 | .in | |
ababc346 | 1244 | .IP [5] |
a66648bb MK |
1245 | The |
1246 | .BR mount (2) | |
1247 | flags | |
1248 | .BR MS_RDONLY , | |
1249 | .BR MS_NOSUID , | |
1250 | .BR MS_NOEXEC , | |
1251 | and the "atime" flags | |
1252 | .RB ( MS_NOATIME , | |
1253 | .BR MS_NODIRATIME , | |
1254 | .BR MS_RELATIME ) | |
1255 | settings become locked | |
1256 | .\" commit 9566d6742852c527bf5af38af5cbb878dad75705 | |
1257 | .\" Author: Eric W. Biederman <ebiederm@xmission.com> | |
1258 | .\" Date: Mon Jul 28 17:26:07 2014 -0700 | |
1259 | .\" | |
1260 | .\" mnt: Correct permission checks in do_remount | |
1261 | .\" | |
1262 | when propagated from a more privileged to | |
1263 | a less privileged mount namespace, | |
1264 | and may not be changed in the less privileged mount namespace. | |
1265 | .IP | |
2433a20c MK |
1266 | This point is illustrated in the following example where, |
1267 | in a more privileged mount namespace, | |
1268 | we create a bind mount that is marked as read-only. | |
a66648bb MK |
1269 | For security reasons, |
1270 | it should not be possible to make the mount writable in | |
2433a20c | 1271 | a less privileged mount namespace, and indeed the kernel prevents this: |
a66648bb MK |
1272 | .IP |
1273 | .RS | |
a2fc45a9 MK |
1274 | .in +4n |
1275 | .EX | |
a66648bb | 1276 | $ \fBsudo mkdir /mnt/dir\fP |
a66648bb MK |
1277 | $ \fBsudo mount \-\-bind \-o ro /some/path /mnt/dir\fP |
1278 | $ \fBsudo unshare \-\-user \-\-map\-root\-user \-\-mount \e\fP | |
1279 | \fBmount \-o remount,rw /mnt/dir\fP | |
1280 | mount: /mnt/dir: permission denied. | |
a2fc45a9 MK |
1281 | .EE |
1282 | .in | |
a66648bb | 1283 | .RE |
ababc346 | 1284 | .IP [6] |
a66648bb MK |
1285 | .\" (As of 3.18-rc1 (in Al Viro's 2014-08-30 vfs.git#for-next tree)) |
1286 | A file or directory that is a mount point in one namespace but is not |
1287 | a mount point in another namespace may be renamed, unlinked, or removed |
1288 | .RB ( rmdir (2)) | |
1289 | in the mount namespace in which it is not a mount point | |
1290 | (subject to the usual permission checks). | |
1291 | Consequently, the mount point is removed in the mount namespace | |
1292 | where it was a mount point. | |
1293 | .IP | |
1294 | Previously (before Linux 3.18), | |
1295 | .\" mtk: The change was in Linux 3.18, I think, with this commit: | |
1296 | .\" commit 8ed936b5671bfb33d89bc60bdcc7cf0470ba52fe | |
1297 | .\" Author: Eric W. Biederman <ebiederman@twitter.com> | |
1298 | .\" Date: Tue Oct 1 18:33:48 2013 -0700 | |
1299 | .\" | |
1300 | .\" vfs: Lazily remove mounts on unlinked files and directories. | |
1301 | attempting to unlink, rename, or remove a file or directory | |
1302 | that was a mount point in another mount namespace would result in the error | |
1303 | .BR EBUSY . | |
1304 | That behavior had technical problems of enforcement (e.g., for NFS) | |
1305 | and permitted denial-of-service attacks against more privileged users | |
1306 | (i.e., preventing individual files from being updated | |
1307 | by bind mounting on top of them). | |
a14af333 | 1308 | .SH EXAMPLES |
43d438e2 MK |
1309 | See |
1310 | .BR pivot_root (2). | |
98c28960 MK |
1311 | .SH SEE ALSO |
1312 | .BR unshare (1), | |
1313 | .BR clone (2), | |
1314 | .BR mount (2), | |
4d7a6485 | 1315 | .BR mount_setattr (2), |
e70abf48 | 1316 | .BR pivot_root (2), |
98c28960 MK |
1317 | .BR setns (2), |
1318 | .BR umount (2), | |
1319 | .BR unshare (2), | |
1320 | .BR proc (5), | |
466247eb | 1321 | .BR namespaces (7), |
93f5b0f8 | 1322 | .BR user_namespaces (7), |
e70abf48 | 1323 | .BR findmnt (8), |
e9832dc0 | 1324 | .BR mount (8), |
88b0e0e0 | 1325 | .BR pam_namespace (8), |
e9832dc0 MK |
1326 | .BR pivot_root (8), |
1327 | .BR umount (8) | |
a721e8b2 | 1328 | .PP |
1ae6b2c7 | 1329 | .I Documentation/filesystems/sharedsubtree.rst |
98c28960 | 1330 | in the kernel source tree. |