.\" Copyright (C) 2000 by Werner Almesberger
.\" and Copyright (C) 2019 Michael Kerrisk <mtk.manpages@gmail.com>
.\"
.\" %%%LICENSE_START(GPL_NOVERSION_ONELINE)
.\" May be distributed under GPL
.\" %%%LICENSE_END
.\"
.\" Written 2000-02-23 by Werner Almesberger
.\" Modified 2004-06-17 Michael Kerrisk <mtk.manpages@gmail.com>
.\"
.TH PIVOT_ROOT 2 2019-08-02 "Linux" "Linux Programmer's Manual"
.SH NAME
pivot_root \- change the root filesystem
.SH SYNOPSIS
.BI "int pivot_root(const char *" new_root ", const char *" put_old );
.PP
.IR Note :
There is no glibc wrapper for this system call; see NOTES.
.SH DESCRIPTION
.BR pivot_root ()
changes the root filesystem in the mount namespace of the calling process.
More precisely, it moves the root filesystem to the
directory \fIput_old\fP and makes \fInew_root\fP the new root filesystem.
The calling process must have the
.B CAP_SYS_ADMIN
capability in the user namespace that owns the caller's mount namespace.
.PP
.BR pivot_root ()
may or may not change the current root and the current
working directory of any processes or threads that
use the old root directory and which are in
the same mount namespace as the caller of
.BR pivot_root ().
The caller of
.BR pivot_root ()
must ensure that processes with root or current working directory
at the old root operate correctly in either case.
An easy way to ensure this is to change their
root and current working directory to \fInew_root\fP before invoking
.BR pivot_root ().
.PP
The paragraph above is intentionally vague because the implementation of
.BR pivot_root ()
may change in the future
(or so it was thought when this system call was first added).
However,
the behavior on this point has remained consistent since
.BR pivot_root ()
was first implemented:
.BR pivot_root ()
changes the root directory and the current working directory
of each process or thread in the same mount namespace to
.I new_root
if they point to the old root directory.
This is necessary in order to prevent kernel threads from keeping the old
root directory busy with their root and current working directory,
even if they never access
the filesystem in any way.
Perhaps one day there may be a mechanism for
kernel threads to explicitly relinquish any access to the filesystem,
such that this fairly intrusive mechanism can be removed from
.BR pivot_root ().
.PP
Note that this also applies to the calling process:
.BR pivot_root ()
may or may not affect its current working directory.
It is therefore recommended to call
\fBchdir("/")\fP immediately after
.BR pivot_root ().
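.PP
The recommended sequence can be sketched as follows.
This is a minimal illustration rather than a definitive implementation:
the helper name \fIpivot_and_chdir\fP is hypothetical,
and the system call is invoked through
.BR syscall (2)
because glibc provides no wrapper.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper (not part of any library): pivot the root and
   then reset the current working directory.
   Returns 0 on success, or -1 with errno set by the failing call. */
static int
pivot_and_chdir(const char *new_root, const char *put_old)
{
    /* No glibc wrapper exists, so invoke the system call directly */
    if (syscall(SYS_pivot_root, new_root, put_old) == -1)
        return -1;

    /* pivot_root() may or may not have updated the caller's current
       working directory; point it at the new root explicitly */
    return chdir("/");
}
```

A caller without the required capability, or one that passes a pathname that does not exist, gets \-1 back from this helper and can inspect \fIerrno\fP as usual.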
.PP
The following restrictions apply to \fInew_root\fP and \fIput_old\fP:
.IP \- 3
They must be directories.
.IP \-
\fInew_root\fP and \fIput_old\fP must not be on the same filesystem as
the current root.
.IP \-
\fIput_old\fP must be underneath \fInew_root\fP, that is, adding a nonzero
number of \fI/..\fP to the string pointed to by \fIput_old\fP must yield
the same directory as \fInew_root\fP.
.IP \-
.I new_root
must be a mount point.
(If it is not otherwise a mount point, it suffices to bind mount
.I new_root
on top of itself.)
.IP \-
The propagation type of
.I new_root
and its parent mount must not be
.BR MS_SHARED ;
similarly, if
.I put_old
is an existing mount point, its propagation type must not be
.BR MS_SHARED .
.PP
See also
.BR pivot_root (8)
for additional usage examples.
.PP
If the current root is not a mount point (e.g., after an earlier
.BR chroot (2)
or
.BR pivot_root ()),
then the mount point of the filesystem containing the current root directory
(i.e., not the directory itself) is mounted on \fIput_old\fP.
.SH RETURN VALUE
On success, zero is returned.
On error, \-1 is returned, and
\fIerrno\fP is set appropriately.
.SH ERRORS
.BR pivot_root ()
may fail with any of the same errors as
.BR stat (2).
Additionally, it may fail with the following errors:
.TP
.B EBUSY
\fInew_root\fP or \fIput_old\fP is on the current root filesystem,
or a filesystem is already mounted on \fIput_old\fP.
.TP
.B EINVAL
.I new_root
is not a mount point.
.TP
.B EINVAL
\fIput_old\fP is not underneath \fInew_root\fP.
.TP
.B EINVAL
The current root is on the rootfs (initial ramfs) filesystem; see NOTES.
.TP
.B EINVAL
Either the mount point at
.IR new_root ,
or the parent mount of that mount point,
has propagation type
.BR MS_SHARED .
.TP
.B EINVAL
.I put_old
is a mount point and has the propagation type
.BR MS_SHARED .
.TP
.B ENOTDIR
\fInew_root\fP or \fIput_old\fP is not a directory.
.TP
.B EPERM
The calling process does not have the
.B CAP_SYS_ADMIN
capability.
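.PP
Because
.B EINVAL
covers several distinct conditions, a caller may find it useful to turn \fIerrno\fP into a hint for diagnostics.
The following sketch assumes a hypothetical helper named \fIpivot_root_hint\fP;
the message strings simply paraphrase the list above and are not produced by the kernel or by any library.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical helper: map an errno value from a failed pivot_root()
   call to a hint paraphrasing the ERRORS list. */
static const char *
pivot_root_hint(int err)
{
    switch (err) {
    case EBUSY:
        return "new_root or put_old is on the current root filesystem, "
               "or a filesystem is already mounted on put_old";
    case EINVAL:
        return "new_root is not a mount point, put_old is not underneath "
               "new_root, the current root is rootfs, or a mount involved "
               "has MS_SHARED propagation";
    case ENOTDIR:
        return "new_root or put_old is not a directory";
    case EPERM:
        return "caller lacks the CAP_SYS_ADMIN capability";
    default:
        return strerror(err);   /* e.g., errors shared with stat(2) */
    }
}
```

A typical use would be to print \fIpivot_root_hint(errno)\fP alongside the output of
.BR perror (3)
when the call returns \-1.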
.SH VERSIONS
.BR pivot_root ()
was introduced in Linux 2.3.41.
.SH CONFORMING TO
.BR pivot_root ()
is Linux-specific and hence is not portable.
.SH NOTES
Glibc does not provide a wrapper for this system call; call it using
.BR syscall (2).
.PP
.BR pivot_root ()
allows the caller to switch to a new root filesystem while at the same time
placing the old root mount at a location under
.I new_root
from where it can subsequently be unmounted.
(The fact that it moves all processes that have a root directory
or current working directory on the old root filesystem to the
new root filesystem frees the old root filesystem of users,
allowing it to be unmounted more easily.)
A typical use of
.BR pivot_root ()
is during system startup, when the
system mounts a temporary root filesystem (e.g., an \fBinitrd\fP), then
mounts the real root filesystem, and eventually turns the latter into
the current root of all relevant processes or threads.
A modern use is to set up a root filesystem during
the creation of a container.
.PP
The rootfs (initial ramfs) cannot be
.BR pivot_root ()ed.
The recommended method of changing the root filesystem in this case is
to delete everything in rootfs, overmount rootfs with the new root, attach
.IR stdin / stdout / stderr
to the new
.IR /dev/console ,
and exec the new
.BR init (1).
Helper programs for this process exist; see
.BR switch_root (8).
.SH BUGS
.BR pivot_root ()
should not have to change root and current working directory of other
processes in the system.
.PP
Some of the more obscure uses of
.BR pivot_root ()
may quickly lead to
insanity.
.SH EXAMPLE
.PP
The program below demonstrates the use of
.BR pivot_root ()
inside a mount namespace that is created using
.BR clone (2).
After pivoting to the root directory named in the program's
first command-line argument, the child created by
.BR clone (2)
then executes the program named in the remaining command-line arguments.
.PP
We demonstrate the program by creating a directory that will serve as
the new root filesystem and placing a copy of the (statically linked)
.BR busybox (1)
executable in that directory.
.PP
.in +4n
.EX
$ \fBmkdir /tmp/rootfs\fP
$ \fBls \-id /tmp/rootfs\fP    # Show inode number of new root directory
319459 /tmp/rootfs
$ \fBcp $(which busybox) /tmp/rootfs\fP
$ \fBPS1='bbsh$ ' sudo ./pivot_root_demo /tmp/rootfs /busybox sh\fP
bbsh$ \fBPATH=/\fP
bbsh$ \fBbusybox ln busybox ln\fP
bbsh$ \fBln busybox echo\fP
bbsh$ \fBln busybox ls\fP
bbsh$ \fBls\fP
busybox  echo     ln       ls
bbsh$ \fBls \-id /\fP          # Compare with inode number above
319459 /
bbsh$ \fBecho \(aqhello world\(aq\fP
hello world
.EE
.in
.SS Program source
\&
.PP
.EX
/* pivot_root_demo.c */

#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>
#include <sys/syscall.h>
#include <sys/mount.h>
#include <sys/stat.h>
#include <limits.h>

#define errExit(msg)    do { perror(msg); exit(EXIT_FAILURE); \e
                        } while (0)

static int
pivot_root(const char *new_root, const char *put_old)
{
    return syscall(SYS_pivot_root, new_root, put_old);
}

#define STACK_SIZE (1024 * 1024)

static int              /* Startup function for cloned child */
child(void *arg)
{
    char **args = arg;
    char *new_root = args[0];
    const char *put_old = "/oldrootfs";
    char path[PATH_MAX];

    /* Ensure that \(aqnew_root\(aq and its parent mount don\(aqt have
       shared propagation (which would cause pivot_root() to
       return an error), and prevent propagation of mount
       events to the initial mount namespace */

    if (mount(NULL, "/", NULL, MS_REC | MS_PRIVATE, NULL) == \-1)
        errExit("mount\-MS_PRIVATE");

    /* Ensure that \(aqnew_root\(aq is a mount point */

    if (mount(new_root, new_root, NULL, MS_BIND, NULL) == \-1)
        errExit("mount\-MS_BIND");

    /* Create directory to which old root will be pivoted */

    snprintf(path, sizeof(path), "%s/%s", new_root, put_old);
    if (mkdir(path, 0777) == \-1)
        errExit("mkdir");

    /* And pivot the root filesystem */

    if (pivot_root(new_root, path) == \-1)
        errExit("pivot_root");

    /* Switch the current working directory to "/" */

    if (chdir("/") == \-1)
        errExit("chdir");

    /* Unmount old root and remove mount point */

    if (umount2(put_old, MNT_DETACH) == \-1)
        perror("umount2");
    if (rmdir(put_old) == \-1)
        perror("rmdir");

    /* Execute the command specified in args[1]... */

    execv(args[1], &args[1]);
    errExit("execv");
}

int
main(int argc, char *argv[])
{
    /* Require a new root directory and a command to execute */

    if (argc < 3) {
        fprintf(stderr, "Usage: %s new\-root command [arg...]\en", argv[0]);
        exit(EXIT_FAILURE);
    }

    /* Create a child process in a new mount namespace */

    char *stack = malloc(STACK_SIZE);
    if (stack == NULL)
        errExit("malloc");

    if (clone(child, stack + STACK_SIZE,
                CLONE_NEWNS | SIGCHLD, &argv[1]) == \-1)
        errExit("clone");

    /* Parent falls through to here; wait for child */

    if (wait(NULL) == \-1)
        errExit("wait");

    exit(EXIT_SUCCESS);
}
.EE
.SH SEE ALSO
.BR chdir (2),
.BR chroot (2),
.BR mount (2),
.BR stat (2),
.BR initrd (4),
.BR mount_namespaces (7),
.BR pivot_root (8),
.BR switch_root (8)