.TH UNSHARE 1 "February 2016" "util-linux" "User Commands"
.SH NAME
unshare \- run program in new namespaces
.SH SYNOPSIS
.B unshare
[options]
.RI [ program
.RI [ arguments ]]
.SH DESCRIPTION
The
.B unshare
command creates new namespaces
(as specified by the command-line options described below)
and then executes the specified \fIprogram\fR.
If \fIprogram\fR is not given, then ``${SHELL}'' is
run (default: /bin/sh).
.PP
By default, a new namespace persists only as long as it has member processes.
A new namespace can be made persistent even when it has no member processes
by bind mounting
/proc/\fIpid\fR/ns/\fItype\fR files to a filesystem path.
A namespace that has been made persistent in this way can subsequently
be entered with
.BR \%nsenter (1)
even after the \fIprogram\fR terminates (except for PID namespaces, where
a permanently running init process is required).
Once a persistent \%namespace is no longer needed,
it can be unpersisted by using
.BR umount (8)
to remove the bind mount.
See the \fBEXAMPLES\fR section for more details.
.PP
Since util-linux version 2.36,
.B unshare
uses the \fI/proc/[pid]/ns/pid_for_children\fP and \fI/proc/[pid]/ns/time_for_children\fP
files for persistent PID and TIME namespaces. This change requires Linux kernel 4.17 or newer.
.PP
The following types of namespaces can be created with
.BR unshare :
.TP
.B mount namespace
Mounting and unmounting filesystems will not affect the rest of the system,
except for filesystems which are explicitly marked as
shared (with \fBmount \-\-make-shared\fP; see \fI/proc/self/mountinfo\fP or
\fBfindmnt \-o+PROPAGATION\fP for the \fBshared\fP flags).
For further details, see
.BR mount_namespaces (7).
.IP
.B unshare
since util-linux version 2.27 automatically sets propagation to \fBprivate\fP
in a new mount namespace to make sure that the new namespace is really
unshared. It's possible to disable this feature with option
\fB\-\-propagation unchanged\fP.
Note that \fBprivate\fP is the kernel default.
.TP
.B UTS namespace
Setting hostname or domainname will not affect the rest of the system.
For further details, see
.BR uts_namespaces (7).
.TP
.B IPC namespace
The process will have an independent namespace for POSIX message queues
as well as System V \%message queues,
semaphore sets and shared memory segments.
For further details, see
.BR ipc_namespaces (7).
.TP
.B network namespace
The process will have independent IPv4 and IPv6 stacks, IP routing tables,
firewall rules, the \fI/proc/net\fP and \fI/sys/class/net\fP directory trees,
sockets, etc.
For further details, see
.BR network_namespaces (7).
.TP
.B PID namespace
Children will have a distinct set of PID-to-process mappings from their parent.
For further details, see
.BR pid_namespaces (7).
.TP
.B cgroup namespace
The process will have a virtualized view of \fI/proc\:/self\:/cgroup\fP, and new
cgroup mounts will be rooted at the namespace cgroup root.
For further details, see
.BR cgroup_namespaces (7).
.TP
.B user namespace
The process will have a distinct set of UIDs, GIDs and capabilities.
For further details, see
.BR user_namespaces (7).
.TP
.B time namespace
The process can have a distinct view of
.B CLOCK_MONOTONIC
and/or
.B CLOCK_BOOTTIME
which can be changed using \fI/proc/self/timens_offsets\fP.
For further details, see
.BR time_namespaces (7).
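.PP
As a minimal sketch of how to see which namespaces a process currently belongs to,
the following shell function prints the name and identifier of each entry under
\fI/proc/[pid]/ns/\fP.
The function name
.B ns_list
is made up for this illustration and is not part of util-linux;
the loop prints nothing on systems where these files are not available.
.PP
.in +4n
.EX
```shell
# ns_list [pid]: print "<name> <type>:[<inode>]" for each namespace of a process.
# ns_list is a hypothetical helper name, not part of util-linux.
ns_list() {
    pid=${1:-self}
    for f in /proc/"$pid"/ns/*; do
        # Skip the unexpanded pattern when the directory is missing or empty.
        [ -L "$f" ] || continue
        echo "${f##*/} $(readlink "$f")"
    done
}

# Show the namespaces of the current shell.
ns_list
```
.EE
.in
.PP
Two processes are in the same namespace of a given type when the identifiers
printed for that type are equal.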
.SH OPTIONS
.TP
.BR \-i , " \-\-ipc" [ =\fIfile ]
Unshare the IPC namespace. If \fIfile\fP is specified, then a persistent
namespace is created by a bind mount.
.TP
.BR \-m , " \-\-mount" [ =\fIfile ]
Unshare the mount namespace. If \fIfile\fP is specified, then a persistent
namespace is created by a bind mount.
Note that \fIfile\fP has to be located on a filesystem with the propagation
flag set to \fBprivate\fP. Use the command \fBfindmnt \-o+PROPAGATION\fP
when not sure about the current setting. See also the examples below.
.TP
.BR \-n , " \-\-net" [ =\fIfile ]
Unshare the network namespace. If \fIfile\fP is specified, then a persistent
namespace is created by a bind mount.
.TP
.BR \-p , " \-\-pid" [ =\fIfile ]
Unshare the PID namespace. If \fIfile\fP is specified, then a persistent
namespace is created by a bind mount. See also the \fB\-\-fork\fP and
\fB\-\-mount-proc\fP options.
.TP
.BR \-u , " \-\-uts" [ =\fIfile ]
Unshare the UTS namespace. If \fIfile\fP is specified, then a persistent
namespace is created by a bind mount.
.TP
.BR \-U , " \-\-user" [ =\fIfile ]
Unshare the user namespace. If \fIfile\fP is specified, then a persistent
namespace is created by a bind mount.
.TP
.BR \-C , " \-\-cgroup" [ =\fIfile ]
Unshare the cgroup namespace. If \fIfile\fP is specified, then a persistent
namespace is created by a bind mount.
.TP
.BR \-T , " \-\-time" [ =\fIfile ]
Unshare the time namespace. If \fIfile\fP is specified, then a persistent
namespace is created by a bind mount. The \fB\-\-monotonic\fP and
\fB\-\-boottime\fP options can be used to specify the corresponding
offset in the time namespace.
.TP
.BR \-f , " \-\-fork"
Fork the specified \fIprogram\fR as a child process of \fBunshare\fR rather than
running it directly. This is useful when creating a new PID namespace.
.TP
.B \-\-keep\-caps
When the \fB\-\-user\fP option is given, ensure that capabilities granted
in the user namespace are preserved in the child process.
.TP
.BR \-\-kill\-child [ =\fIsigname ]
When \fBunshare\fR terminates, have \fIsigname\fP be sent to the forked child process.
Combined with \fB\-\-pid\fR this allows for an easy and reliable killing of the entire
process tree below \fBunshare\fR.
If not given, \fIsigname\fP defaults to \fBSIGKILL\fR.
This option implies \fB\-\-fork\fR.
.TP
.BR \-\-mount\-proc [ =\fImountpoint ]
Just before running the program, mount the proc filesystem at \fImountpoint\fP
(default is /proc). This is useful when creating a new PID namespace. It also
implies creating a new mount namespace since the /proc mount would otherwise
mess up existing programs on the system. The new proc filesystem is explicitly
mounted as private (with MS_PRIVATE|MS_REC).
.TP
.BI \-\-map\-user= uid|name
Run the program only after the current effective user ID has been mapped to \fIuid\fP.
If this option is specified multiple times, the last occurrence takes precedence.
This option implies \fB\-\-user\fR.
.TP
.BI \-\-map\-group= gid|name
Run the program only after the current effective group ID has been mapped to \fIgid\fP.
If this option is specified multiple times, the last occurrence takes precedence.
This option implies \fB\-\-setgroups=deny\fR and \fB\-\-user\fR.
.TP
.BR \-r , " \-\-map\-root\-user"
Run the program only after the current effective user and group IDs have been mapped to
the superuser UID and GID in the newly created user namespace. This makes it possible to
conveniently gain capabilities needed to manage various aspects of the newly created
namespaces (such as configuring interfaces in the network namespace or mounting filesystems in
the mount namespace) even when run unprivileged. As a mere convenience feature, it does not support
more sophisticated use cases, such as mapping multiple ranges of UIDs and GIDs.
This option implies \fB\-\-setgroups=deny\fR and \fB\-\-user\fR.
This option is equivalent to \fB\-\-map-user=0 \-\-map-group=0\fR.
.TP
.BR \-c , " \-\-map\-current\-user"
Run the program only after the current effective user and group IDs have been mapped to
the same UID and GID in the newly created user namespace. This option implies
\fB\-\-setgroups=deny\fR and \fB\-\-user\fR.
This option is equivalent to \fB\-\-map-user=$(id \-ru) \-\-map-group=$(id \-rg)\fR.
.TP
.BR "\-\-propagation private" | shared | slave | unchanged
Recursively set the mount propagation flag in the new mount namespace. The default
is to set the propagation to \fIprivate\fP. It is possible to disable this feature
with the argument \fBunchanged\fR. The option is silently ignored when the mount
namespace (\fB\-\-mount\fP) is not requested.
.TP
.BR "\-\-setgroups allow" | deny
Allow or deny the
.BR setgroups (2)
system call in a user namespace.
.sp
To be able to call
.BR setgroups (2),
the calling process must at least have CAP_SETGID.
But since Linux 3.19 a further restriction applies:
the kernel gives permission to call
.BR \%setgroups (2)
only after the GID map (\fB/proc/\fIpid\fB/gid_map\fR) has been set.
The GID map is writable by root when
.BR \%setgroups (2)
is enabled (i.e., \fBallow\fR, the default), and
the GID map becomes writable by unprivileged processes when
.BR \%setgroups (2)
is permanently disabled (with \fBdeny\fR).
.TP
.BR \-R , " \-\-root=\fIdir"
Run the command with the root directory set to \fIdir\fP.
.TP
.BR \-w , " \-\-wd=\fIdir"
Change the working directory to \fIdir\fP.
.TP
.BR \-S , " \-\-setuid \fIuid"
Set the user ID which will be used in the entered namespace.
.TP
.BR \-G , " \-\-setgid \fIgid"
Set the group ID which will be used in the entered namespace and drop
supplementary groups.
.TP
.BI \-\-monotonic " offset"
Set the offset of
.B CLOCK_MONOTONIC
which will be used in the entered time namespace. This option requires
unsharing a time namespace with \fB\-\-time\fP.
.TP
.BI \-\-boottime " offset"
Set the offset of
.B CLOCK_BOOTTIME
which will be used in the entered time namespace. This option requires
unsharing a time namespace with \fB\-\-time\fP.
.TP
.BR \-V , " \-\-version"
Display version information and exit.
.TP
.BR \-h , " \-\-help"
Display help text and exit.
.SH NOTES
Mounting the proc and sysfs filesystems as root in a user namespace has to be
restricted so that a less privileged user cannot gain access to sensitive
files that a more privileged user made unavailable. In short, the rule for proc
and sysfs is that such a mount behaves as much like a bind mount as possible.
.SH EXAMPLES
.PP
The following command creates a PID namespace, using
.B \-\-fork
to ensure that the executed command is performed in a child process
that (being the first process in the namespace) has PID 1.
The
.B \-\-mount-proc
option ensures that a new mount namespace is also simultaneously created
and that a new
.BR proc (5)
filesystem is mounted that contains information corresponding to the new
PID namespace.
When the
.B readlink
command terminates, the new namespaces are automatically torn down.
.PP
.in +4n
.EX
.B # unshare \-\-fork \-\-pid \-\-mount-proc readlink /proc/self
1
.EE
.in
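.PP
Creating a PID namespace in this way normally requires privileges,
so a script may want to check for them before relying on it.
The sketch below wraps the command above in a function and reports whether it succeeded.
The function name
.B pidns_probe
and the message texts are assumptions made for this illustration.
.PP
.in +4n
.EX
```shell
# pidns_probe: try to run readlink as PID 1 of a new PID namespace.
# Hypothetical helper; the messages are illustrative, not unshare output.
pidns_probe() {
    if out=$(unshare --fork --pid --mount-proc readlink /proc/self 2>/dev/null); then
        echo "PID inside new namespace: $out"
    else
        # Reached when unshare is missing or the caller lacks privileges.
        echo "PID namespaces unavailable here"
    fi
}

pidns_probe
```
.EE
.in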
.PP
As an unprivileged user, create a new user namespace where the user's
credentials are mapped to the root IDs inside the namespace:
.PP
.in +4n
.EX
.B $ id \-u; id \-g
1000
1000
.B $ unshare \-\-user \-\-map-root-user \e
.B "        sh \-c \(aqwhoami; cat /proc/self/uid_map /proc/self/gid_map\(aq"
root
         0       1000          1
         0       1000          1
.EE
.in
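.PP
Each line of
.I uid_map
and
.I gid_map
has three fields: the first ID inside the namespace, the first ID outside it,
and the length of the mapped range (see
.BR user_namespaces (7)).
As a rough illustration of how to read such a line, the helper below translates
an ID inside the namespace to the corresponding ID outside.
The name
.B map_to_outside
is hypothetical; this is a sketch and not something
.B unshare
provides.
.PP
.in +4n
.EX
```shell
# map_to_outside ID: read uid_map or gid_map lines on stdin and print the
# ID outside the namespace that the given inside ID maps to.
# Hypothetical helper; returns non-zero when no range covers the ID.
map_to_outside() {
    awk -v id="$1" '
        id >= $1 && id < $1 + $3 { print $2 + (id - $1); found = 1; exit }
        END { if (!found) exit 1 }
    '
}

# The mapping from the example above: inside ID 0 is outside ID 1000.
echo "0 1000 1" | map_to_outside 0
```
.EE
.in
.PP
Inside a namespace the same function can be fed directly from
.IR /proc/self/uid_map .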
.PP
The first of the following commands creates a new persistent UTS namespace
and modifies the hostname as seen in that namespace.
The namespace is then entered with
.BR nsenter (1)
in order to display the modified hostname;
this step demonstrates that the UTS namespace continues to exist
even though the namespace had no member processes after the
.B unshare
command terminated.
The namespace is then destroyed by removing the bind mount.
.PP
.in +4n
.EX
.B # touch /root/uts-ns
.B # unshare \-\-uts=/root/uts-ns hostname FOO
.B # nsenter \-\-uts=/root/uts-ns hostname
FOO
.B # umount /root/uts-ns
.EE
.in
.PP
The following commands
establish a persistent mount namespace referenced by the bind mount
.IR /root/namespaces/mnt .
In order to ensure that this bind mount does not get propagated
to other mount namespaces,
the parent directory
.RI ( /root/namespaces )
is first made a bind mount with
.I private
propagation.
.PP
.in +4n
.EX
.B # mount \-\-bind /root/namespaces /root/namespaces
.B # mount \-\-make-private /root/namespaces
.B # touch /root/namespaces/mnt
.B # unshare \-\-mount=/root/namespaces/mnt
.EE
.in
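.PP
The propagation type of a mount can be read from the optional fields of
.IR /proc/self/mountinfo ,
where a
.B shared:
tag marks a shared mount and a
.B master:
tag marks a slave mount.
The following sketch reports the type for one mount point in a form similar to the
PROPAGATION column of
.BR findmnt .
The function name
.B propagation_of
and the sample line are made up for this illustration, and the sketch assumes
that the mount point contains no characters that
.I mountinfo
escapes, such as spaces.
.PP
.in +4n
.EX
```shell
# propagation_of MOUNTPOINT: read mountinfo-formatted lines on stdin and
# print the propagation type of the last mount found at MOUNTPOINT.
# Hypothetical helper; returns non-zero when the mount point is not listed.
propagation_of() {
    awk -v mp="$1" '
        $5 == mp {
            shared = slave = unbind = 0
            # Optional fields start at field 7 and end at the "-" separator.
            for (i = 7; i <= NF && $i != "-"; i++) {
                if ($i ~ /^shared:/) shared = 1
                if ($i ~ /^master:/) slave = 1
                if ($i == "unbindable") unbind = 1
            }
            state = shared ? "shared" : "private"
            if (slave) state = state ",slave"
            if (unbind) state = state ",unbindable"
        }
        END { if (state == "") exit 1; print state }
    '
}

# Sample mountinfo line, invented for the demonstration.
echo "36 35 98:0 / /mnt rw,noatime shared:1 - ext3 /dev/root rw" | propagation_of /mnt
```
.EE
.in
.PP
On a live system the function would be used as
.BR "propagation_of /root/namespaces < /proc/self/mountinfo" .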
.PP
The following commands demonstrate the use of the
.B \-\-kill-child
option when creating a PID namespace, in order to ensure that when
.B unshare
is killed, all of the processes within the PID namespace are killed.
.PP
.in +4n
.EX
.BR "# set +m " "# Don't print job status messages"
.B # unshare \-\-pid \-\-fork \-\-mount\-proc \-\-kill\-child \-\- \e
.B "        bash \-\-norc \-c \(aq(sleep 555 &) && (ps a &) && sleep 999\(aq &"
[1] 53456
#     PID TTY      STAT   TIME COMMAND
      1 pts/3    S+     0:00 sleep 999
      3 pts/3    S+     0:00 sleep 555
      5 pts/3    R+     0:00 ps a

.BR "# ps h \-o 'comm' $! " "# Show that background job is unshare(1)"
unshare
.BR "# kill $! " "# Kill unshare(1)"
.B # pidof sleep
.EE
.in
.PP
The
.B pidof
command prints no output, because the
.B sleep
processes have been killed.
More precisely, when the
.B sleep
process that has PID 1 in the namespace (i.e., the namespace's init process)
was killed, this caused all other processes in the namespace to be killed.
By contrast, a similar series of commands where the
.B \-\-kill\-child
option is not used shows that when
.B unshare
terminates, the processes in the PID namespace are not killed:
.PP
.in +4n
.EX
.B # unshare \-\-pid \-\-fork \-\-mount\-proc \-\- \e
.B "        bash \-\-norc \-c \(aq(sleep 555 &) && (ps a &) && sleep 999\(aq &"
[1] 53479
#     PID TTY      STAT   TIME COMMAND
      1 pts/3    S+     0:00 sleep 999
      3 pts/3    S+     0:00 sleep 555
      5 pts/3    R+     0:00 ps a

.B # kill $!
.B # pidof sleep
53482 53480
.EE
.in
.PP
The following example demonstrates the creation of a time namespace
where the boottime clock is set to a point several years in the past:
.PP
.in +4n
.EX
.BR "# uptime \-p " "# Show uptime in initial time namespace"
up 21 hours, 30 minutes
.B # unshare \-\-time \-\-fork \-\-boottime 300000000 uptime \-p
up 9 years, 28 weeks, 1 day, 2 hours, 50 minutes
.EE
.in
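.PP
The offset given to
.B \-\-boottime
is a number of seconds that is added to the uptime seen inside the namespace.
The arithmetic behind the output above can be sketched as follows.
The function name
.B uptime_breakdown
is hypothetical, and the breakdown assumes years of 365 days and weeks counted
modulo 52, which reproduces the figures shown above but is not taken from the
.B uptime
documentation.
.PP
.in +4n
.EX
```shell
# uptime_breakdown SECONDS: split a duration into years, weeks, days,
# hours and minutes. Hypothetical helper; assumes 365-day years and
# weeks counted modulo 52.
uptime_breakdown() {
    s=$1
    years=$(( s / (365 * 86400) ))
    weeks=$(( s / (7 * 86400) % 52 ))
    days=$(( s / 86400 % 7 ))
    hours=$(( s / 3600 % 24 ))
    minutes=$(( s / 60 % 60 ))
    echo "${years}y ${weeks}w ${days}d ${hours}h ${minutes}m"
}

# 21 hours 30 minutes of real uptime plus the 300000000 second offset.
uptime_breakdown $(( 21 * 3600 + 30 * 60 + 300000000 ))
# prints: 9y 28w 1d 2h 50m
```
.EE
.in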
.SH AUTHORS
.UR dottedmag@dottedmag.net
Mikhail Gusarov
.UE
.br
.UR kzak@redhat.com
Karel Zak
.UE
.SH SEE ALSO
.BR clone (2),
.BR unshare (2),
.BR namespaces (7),
.BR mount (8)
.SH AVAILABILITY
The unshare command is part of the util-linux package and is available from
https://www.kernel.org/pub/linux/utils/util-linux/.