1 <?xml version='1.0'?>
2 <!DOCTYPE refentry PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
3 "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd" [
4 <!ENTITY fedora_latest_version "28">
5 <!ENTITY fedora_cloud_release "1.1">
6 ]>
7
8 <!--
9 SPDX-License-Identifier: LGPL-2.1+
10 -->
11
12 <refentry id="systemd-nspawn"
13 xmlns:xi="http://www.w3.org/2001/XInclude">
14
15 <refentryinfo>
16 <title>systemd-nspawn</title>
17 <productname>systemd</productname>
18 </refentryinfo>
19
20 <refmeta>
21 <refentrytitle>systemd-nspawn</refentrytitle>
22 <manvolnum>1</manvolnum>
23 </refmeta>
24
25 <refnamediv>
26 <refname>systemd-nspawn</refname>
27 <refpurpose>Spawn a command or OS in a light-weight container</refpurpose>
28 </refnamediv>
29
30 <refsynopsisdiv>
31 <cmdsynopsis>
32 <command>systemd-nspawn</command>
33 <arg choice="opt" rep="repeat">OPTIONS</arg>
34 <arg choice="opt"><replaceable>COMMAND</replaceable>
35 <arg choice="opt" rep="repeat">ARGS</arg>
36 </arg>
37 </cmdsynopsis>
38 <cmdsynopsis>
39 <command>systemd-nspawn</command>
40 <arg choice="plain">--boot</arg>
41 <arg choice="opt" rep="repeat">OPTIONS</arg>
42 <arg choice="opt" rep="repeat">ARGS</arg>
43 </cmdsynopsis>
44 </refsynopsisdiv>
45
46 <refsect1>
47 <title>Description</title>
48
49 <para><command>systemd-nspawn</command> may be used to run a command or OS in a light-weight namespace
50 container. In many ways it is similar to <citerefentry
51 project='man-pages'><refentrytitle>chroot</refentrytitle><manvolnum>1</manvolnum></citerefentry>, but more powerful
52 since it fully virtualizes the file system hierarchy, as well as the process tree, the various IPC subsystems and
53 the host and domain name.</para>
54
55 <para><command>systemd-nspawn</command> may be invoked on any directory tree containing an operating system tree,
56 using the <option>--directory=</option> command line option. By using the <option>--machine=</option> option an OS
57 tree is automatically searched for in a couple of locations, most importantly in
58 <filename>/var/lib/machines</filename>, the suggested directory to place OS container images installed on the
59 system.</para>
60
61 <para>In contrast to <citerefentry
62 project='man-pages'><refentrytitle>chroot</refentrytitle><manvolnum>1</manvolnum></citerefentry>, <command>systemd-nspawn</command>
63 may be used to boot full Linux-based operating systems in a container.</para>
64
65 <para><command>systemd-nspawn</command> limits access to various kernel interfaces in the container to read-only,
66 such as <filename>/sys</filename>, <filename>/proc/sys</filename> or <filename>/sys/fs/selinux</filename>. The
67 host's network interfaces and the system clock may not be changed from within the container. Device nodes may not
68 be created. The host system cannot be rebooted and kernel modules may not be loaded from within the
69 container.</para>
70
71 <para>Use a tool like <citerefentry
72 project='mankier'><refentrytitle>dnf</refentrytitle><manvolnum>8</manvolnum></citerefentry>, <citerefentry
73 project='die-net'><refentrytitle>debootstrap</refentrytitle><manvolnum>8</manvolnum></citerefentry>, or
74 <citerefentry project='archlinux'><refentrytitle>pacman</refentrytitle><manvolnum>8</manvolnum></citerefentry> to
75 set up an OS directory tree suitable as file system hierarchy for <command>systemd-nspawn</command> containers. See
76 the Examples section below for details on suitable invocation of these commands.</para>
77
78 <para>As a safety check <command>systemd-nspawn</command> will verify the existence of
79 <filename>/usr/lib/os-release</filename> or <filename>/etc/os-release</filename> in the container tree before
80 starting the container (see
81 <citerefentry><refentrytitle>os-release</refentrytitle><manvolnum>5</manvolnum></citerefentry>). It might be
82 necessary to add this file to the container tree manually if the OS of the container is too old to contain this
83 file out-of-the-box.</para>
84
85 <para><command>systemd-nspawn</command> may be invoked directly from the interactive command line or run as system
86 service in the background. In this mode each container instance runs as its own service instance; a default
87 template unit file <filename>systemd-nspawn@.service</filename> is provided to make this easy, taking the container
88 name as instance identifier. Note that different default options apply when <command>systemd-nspawn</command> is
89 invoked by the template unit file than interactively on the command line. Most importantly the template unit file
90 makes use of the <option>--boot</option> option, which is not the default in case <command>systemd-nspawn</command> is
91 invoked from the interactive command line. Further differences with the defaults are documented along with the
92 various supported options below.</para>
93
94 <para>The <citerefentry><refentrytitle>machinectl</refentrytitle><manvolnum>1</manvolnum></citerefentry> tool may
95 be used to execute a number of operations on containers. In particular it provides easy-to-use commands to run
96 containers as system services using the <filename>systemd-nspawn@.service</filename> template unit
97 file.</para>
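
<para>As an illustration, assuming a container tree installed under the hypothetical name
<filename>/var/lib/machines/mycontainer</filename>, either of the following commands should start it as a
system service through the template unit:</para>

<programlisting>machinectl start mycontainer
systemctl start systemd-nspawn@mycontainer.service</programlisting>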
98
99 <para>Along with each container a settings file with the <filename>.nspawn</filename> suffix may exist, containing
100 additional settings to apply when running the container. See
101 <citerefentry><refentrytitle>systemd.nspawn</refentrytitle><manvolnum>5</manvolnum></citerefentry> for
102 details. Settings files override the default options used by the <filename>systemd-nspawn@.service</filename>
103 template unit file, making it usually unnecessary to alter this template file directly.</para>
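
<para>A minimal sketch of such a settings file is shown below. It assumes a container with the hypothetical
name <literal>mycontainer</literal>, a file placed at
<filename>/etc/systemd/nspawn/mycontainer.nspawn</filename>, and a host directory
<filename>/srv/data</filename> that exists only for the purpose of this example. It disables the virtual
Ethernet link and bind mounts the host directory into the container:</para>

<programlisting>[Network]
VirtualEthernet=no

[Files]
Bind=/srv/data</programlisting>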
104
105 <para>Note that <command>systemd-nspawn</command> will mount file systems private to the container to
106 <filename>/dev</filename>, <filename>/run</filename> and similar. These will not be visible outside of the
107 container, and their contents will be lost when the container exits.</para>
108
109 <para>Note that running two <command>systemd-nspawn</command> containers from the same directory tree will not make
110 processes in them see each other. The PID namespace separation of the two containers is complete and the containers
111 will share very few runtime objects except for the underlying file system. Use
112 <citerefentry><refentrytitle>machinectl</refentrytitle><manvolnum>1</manvolnum></citerefentry>'s
113 <command>login</command> or <command>shell</command> commands to request an additional login session in a running
114 container.</para>
115
116 <para><command>systemd-nspawn</command> implements the <ulink
117 url="https://www.freedesktop.org/wiki/Software/systemd/ContainerInterface">Container Interface</ulink>
118 specification.</para>
119
120 <para>While running, containers invoked with <command>systemd-nspawn</command> are registered with the
121 <citerefentry><refentrytitle>systemd-machined</refentrytitle><manvolnum>8</manvolnum></citerefentry> service that
122 keeps track of running containers, and provides programming interfaces to interact with them.</para>
123 </refsect1>
124
125 <refsect1>
126 <title>Options</title>
127
128 <para>If option <option>-b</option> is specified, the arguments
129 are used as arguments for the init program. Otherwise,
130 <replaceable>COMMAND</replaceable> specifies the program to launch
131 in the container, and the remaining arguments are used as
132 arguments for this program. If <option>--boot</option> is not used and
133 no arguments are specified, a shell is launched in the
134 container.</para>
135
136 <para>The following options are understood:</para>
137
138 <variablelist>
139 <varlistentry>
140 <term><option>-D</option></term>
141 <term><option>--directory=</option></term>
142
143 <listitem><para>Directory to use as file system root for the
144 container.</para>
145
146 <para>If neither <option>--directory=</option>, nor
147 <option>--image=</option> is specified the directory is
148 determined by searching for a directory named the same as the
149 machine name specified with <option>--machine=</option>. See
150 <citerefentry><refentrytitle>machinectl</refentrytitle><manvolnum>1</manvolnum></citerefentry>
151 section "Files and Directories" for the precise search path.</para>
152
153 <para>If neither <option>--directory=</option>,
154 <option>--image=</option>, nor <option>--machine=</option>
155 are specified, the current directory will
156 be used. May not be specified together with
157 <option>--image=</option>.</para></listitem>
158 </varlistentry>
159
160 <varlistentry>
161 <term><option>--template=</option></term>
162
163 <listitem><para>Directory or <literal>btrfs</literal> subvolume to use as template for the container's root
164 directory. If this is specified and the container's root directory (as configured by
165 <option>--directory=</option>) does not yet exist it is created as <literal>btrfs</literal> snapshot (if
166 supported) or plain directory (otherwise) and populated from this template tree. Ideally, the specified
167 template path refers to the root of a <literal>btrfs</literal> subvolume, in which case a simple copy-on-write
168 snapshot is taken, and populating the root directory is instant. If the specified template path does not refer
169 to the root of a <literal>btrfs</literal> subvolume (or not even to a <literal>btrfs</literal> file system at
170 all), the tree is copied (though possibly in a 'reflink' copy-on-write scheme — if the file system supports
171 that), which can be substantially more time-consuming. May not be specified together with
172 <option>--image=</option> or <option>--ephemeral</option>.</para>
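
<para>For example, assuming a template tree at the hypothetical path
<filename>/var/lib/machines/base</filename>, the following invocation populates
<filename>/var/lib/machines/mycontainer</filename> from it if that directory does not exist yet, and then
runs a shell in it:</para>

<programlisting>systemd-nspawn --template=/var/lib/machines/base --directory=/var/lib/machines/mycontainer</programlisting>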
173
174 <para>Note that this switch leaves host name, machine ID and
175 all other settings that could identify the instance
176 unmodified.</para></listitem>
177 </varlistentry>
178
179 <varlistentry>
180 <term><option>-x</option></term>
181 <term><option>--ephemeral</option></term>
182
183 <listitem><para>If specified, the container is run with a temporary snapshot of its file system that is removed
184 immediately when the container terminates. May not be specified together with
185 <option>--template=</option>.</para>
186 <para>Note that this switch leaves host name, machine ID and all other settings that could identify the
187 instance unmodified. Please note that — as with <option>--template=</option> — taking the temporary snapshot is
188 more efficient on file systems that support subvolume snapshots or 'reflinks' natively (<literal>btrfs</literal>
189 or new <literal>xfs</literal>) than on more traditional file systems that do not
190 (<literal>ext4</literal>).</para>
191
192 <para>With this option no modifications of the container image are retained. Use
193 <option>--volatile=</option> (described below) for other mechanisms to restrict persistency of
194 container images during runtime.</para>
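
<para>For example, the following invocation (the directory path is hypothetical) runs a shell in a temporary
snapshot of the tree. Any changes made in the container are discarded when it exits:</para>

<programlisting>systemd-nspawn --ephemeral --directory=/var/lib/machines/mycontainer</programlisting>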
195 </listitem>
196 </varlistentry>
197
198 <varlistentry>
199 <term><option>-i</option></term>
200 <term><option>--image=</option></term>
201
202 <listitem><para>Disk image to mount the root directory for the
203 container from. Takes a path to a regular file or to a block
204 device node. The file or block device must contain
205 either:</para>
206
207 <itemizedlist>
208 <listitem><para>An MBR partition table with a single
209 partition of type 0x83 that is marked
210 bootable.</para></listitem>
211
212 <listitem><para>A GUID partition table (GPT) with a single
213 partition of type
214 0fc63daf-8483-4772-8e79-3d69d8477de4.</para></listitem>
215
216 <listitem><para>A GUID partition table (GPT) with a marked
217 root partition which is mounted as the root directory of the
218 container. Optionally, GPT images may contain a home and/or
219 a server data partition which are mounted to the appropriate
220 places in the container. All these partitions must be
221 identified by the partition types defined by the <ulink
222 url="https://www.freedesktop.org/wiki/Specifications/DiscoverablePartitionsSpec/">Discoverable
223 Partitions Specification</ulink>.</para></listitem>
224
225 <listitem><para>No partition table, and a single file system spanning the whole image.</para></listitem>
226 </itemizedlist>
227
228 <para>On GPT images, if an EFI System Partition (ESP) is discovered, it is automatically mounted to
229 <filename>/efi</filename> (or <filename>/boot</filename> as fallback) in case a directory by this name exists
230 and is empty.</para>
231
232 <para>Partitions encrypted with LUKS are automatically decrypted. Also, on GPT images dm-verity data integrity
233 hash partitions are set up if the root hash for them is specified using the <option>--root-hash=</option>
234 option.</para>
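
<para>As a sketch, assuming a GPT image at the hypothetical path
<filename>/var/lib/machines/mycontainer.raw</filename> that carries dm-verity data, an invocation with an
explicit root hash could look like this, with <replaceable>HASH</replaceable> standing in for the actual
hexadecimal root hash:</para>

<programlisting>systemd-nspawn --image=/var/lib/machines/mycontainer.raw --root-hash=<replaceable>HASH</replaceable></programlisting>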
235
236 <para>Any other partitions, such as foreign partitions or swap partitions are not mounted. May not be specified
237 together with <option>--directory=</option> or <option>--template=</option>.</para></listitem>
238 </varlistentry>
239
240 <varlistentry>
241 <term><option>--root-hash=</option></term>
242
243 <listitem><para>Takes a data integrity (dm-verity) root hash specified in hexadecimal. This option enables data
244 integrity checks using dm-verity, if the used image contains the appropriate integrity data (see above). The
245 specified hash must match the root hash of integrity data, and is usually at least 256 bits (and hence 64
246 formatted hexadecimal characters) long (in case of SHA256 for example). If this option is not specified, but
247 the image file carries the <literal>user.verity.roothash</literal> extended file attribute (see <citerefentry
248 project='man-pages'><refentrytitle>xattr</refentrytitle><manvolnum>7</manvolnum></citerefentry>), then the root
249 hash is read from it, also as formatted hexadecimal characters. If the extended file attribute is not found (or
250 is not supported by the underlying file system), but a file with the <filename>.roothash</filename> suffix is
251 found next to the image file, bearing otherwise the same name, the root hash is read from it and automatically
252 used, also as formatted hexadecimal characters.</para></listitem>
253 </varlistentry>
254
255 <varlistentry>
256 <term><option>-a</option></term>
257 <term><option>--as-pid2</option></term>
258
259 <listitem><para>Invoke the shell or specified program as process ID (PID) 2 instead of PID 1 (init). By
260 default, if neither this option nor <option>--boot</option> is used, the selected program is run as the process
261 with PID 1, a mode only suitable for programs that are aware of the special semantics that the process with
262 PID 1 has on UNIX. For example, it needs to reap all processes reparented to it, and should implement
263 <command>sysvinit</command> compatible signal handling (specifically: it needs to reboot on SIGINT, reexecute
264 on SIGTERM, reload configuration on SIGHUP, and so on). With <option>--as-pid2</option> a minimal stub init
265 process is run as PID 1 and the selected program is executed as PID 2 (and hence does not need to implement any
266 special semantics). The stub init process will reap processes as necessary and react appropriately to
267 signals. It is recommended to use this mode to invoke arbitrary commands in containers, unless they have been
268 modified to run correctly as PID 1. Or in other words: this switch should be used for pretty much all commands,
269 except when the command refers to an init or shell implementation, as these are generally capable of running
270 correctly as PID 1. This option may not be combined with <option>--boot</option>.</para>
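
<para>For example, to run an ordinary command that is not prepared to act as PID 1 (the directory path and
the command are chosen for illustration only):</para>

<programlisting>systemd-nspawn --directory=/var/lib/machines/mycontainer --as-pid2 /bin/ls -l /etc</programlisting>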
271 </listitem>
272 </varlistentry>
273
274 <varlistentry>
275 <term><option>-b</option></term>
276 <term><option>--boot</option></term>
277
278 <listitem><para>Automatically search for an init program and invoke it as PID 1, instead of a shell or a user
279 supplied program. If this option is used, arguments specified on the command line are used as arguments for the
280 init program. This option may not be combined with <option>--as-pid2</option>.</para>
281
282 <para>The following table explains the different modes of invocation and relationship to
283 <option>--as-pid2</option> (see above):</para>
284
285 <table>
286 <title>Invocation Mode</title>
287 <tgroup cols='2' align='left' colsep='1' rowsep='1'>
288 <colspec colname="switch" />
289 <colspec colname="explanation" />
290 <thead>
291 <row>
292 <entry>Switch</entry>
293 <entry>Explanation</entry>
294 </row>
295 </thead>
296 <tbody>
297 <row>
298 <entry>Neither <option>--as-pid2</option> nor <option>--boot</option> specified</entry>
299 <entry>The passed parameters are interpreted as the command line, which is executed as PID 1 in the container.</entry>
300 </row>
301
302 <row>
303 <entry><option>--as-pid2</option> specified</entry>
304 <entry>The passed parameters are interpreted as the command line, which is executed as PID 2 in the container. A stub init process is run as PID 1.</entry>
305 </row>
306
307 <row>
308 <entry><option>--boot</option> specified</entry>
309 <entry>An init program is automatically searched for and run as PID 1 in the container. The passed parameters are used as invocation parameters for this process.</entry>
310 </row>
311
312 </tbody>
313 </tgroup>
314 </table>
315
316 <para>Note that <option>--boot</option> is the default mode of operation if the
317 <filename>systemd-nspawn@.service</filename> template unit file is used.</para>
318 </listitem>
319 </varlistentry>
320
321 <varlistentry>
322 <term><option>--chdir=</option></term>
323
324 <listitem><para>Change to the specified working directory before invoking the process in the container. Expects
325 an absolute path in the container's file system namespace.</para></listitem>
326 </varlistentry>
327
328 <varlistentry>
329 <term><option>--pivot-root=</option></term>
330
331 <listitem><para>Pivot the specified directory to <filename>/</filename> inside the container, and either unmount the
332 container's old root, or pivot it to another specified directory. Takes one of: a path argument — in which case the
333 specified path will be pivoted to <filename>/</filename> and the old root will be unmounted; or a colon-separated pair
334 of new root path and pivot destination for the old root. The new root path will be pivoted to <filename>/</filename>,
335 and the old <filename>/</filename> will be pivoted to the other directory. Both paths must be absolute, and are resolved
336 in the container's file system namespace.</para>
337
338 <para>This is for containers which have several bootable directories in them; for example, several
339 <ulink url="https://ostree.readthedocs.io/en/latest/">OSTree</ulink> deployments. It emulates the behavior of
340 the boot loader and initial RAM disk which normally select which directory to mount as the root and start the
341 container's PID 1 in.</para></listitem>
342 </varlistentry>
343
344 <varlistentry>
345 <term><option>-u</option></term>
346 <term><option>--user=</option></term>
347
348 <listitem><para>After transitioning into the container, change
349 to the specified user defined in the container's user
350 database. Like all other systemd-nspawn features, this is not
351 a security feature and provides protection against accidental
352 destructive operations only.</para></listitem>
353 </varlistentry>
354
355 <varlistentry>
356 <term><option>-M</option></term>
357 <term><option>--machine=</option></term>
358
359 <listitem><para>Sets the machine name for this container. This
360 name may be used to identify this container during its runtime
361 (for example in tools like
362 <citerefentry><refentrytitle>machinectl</refentrytitle><manvolnum>1</manvolnum></citerefentry>
363 and similar), and is used to initialize the container's
364 hostname (which the container can choose to override,
365 however). If not specified, the last component of the root
366 directory path of the container is used, possibly suffixed
367 with a random identifier in case <option>--ephemeral</option>
368 mode is selected. If the root directory selected is the host's
369 root directory the host's hostname is used as default
370 instead.</para></listitem>
371 </varlistentry>
372
373 <varlistentry>
374 <term><option>--hostname=</option></term>
375
376 <listitem><para>Controls the hostname to set within the container, if different from the machine name. Expects
377 a valid hostname as argument. If this option is used, the kernel hostname of the container will be set to this
378 value, otherwise it will be initialized to the machine name as controlled by the <option>--machine=</option>
379 option described above. The machine name is used for various aspects of identification of the container from the
380 outside; the kernel hostname configurable with this option is useful for the container to identify itself from
381 the inside. It is usually a good idea to keep both forms of identification synchronized, in order to avoid
382 confusion. It is hence recommended to avoid usage of this option, and use <option>--machine=</option>
383 exclusively. Note that regardless of whether the container's hostname is initialized from the name set with
384 <option>--hostname=</option> or the one set with <option>--machine=</option>, the container can later override
385 its kernel hostname freely on its own as well.</para>
386 </listitem>
387 </varlistentry>
388
389 <varlistentry>
390 <term><option>--uuid=</option></term>
391
392 <listitem><para>Set the specified UUID for the container. The
393 init system will initialize
394 <filename>/etc/machine-id</filename> from this if this file is
395 not set yet. Note that this option takes effect only if
396 <filename>/etc/machine-id</filename> in the container is
397 unpopulated.</para></listitem>
398 </varlistentry>
399
400 <varlistentry>
401 <term><option>-S</option></term>
402 <term><option>--slice=</option></term>
403
404 <listitem><para>Make the container part of the specified slice, instead of the default
405 <filename>machine.slice</filename>. This applies only if the machine is run in its own scope unit, i.e. if
406 <option>--keep-unit</option> isn't used.</para>
407 </listitem>
408 </varlistentry>
409
410 <varlistentry>
411 <term><option>--property=</option></term>
412
413 <listitem><para>Set a unit property on the scope unit to register for the machine. This applies only if the
414 machine is run in its own scope unit, i.e. if <option>--keep-unit</option> isn't used. Takes unit property
415 assignments in the same format as <command>systemctl set-property</command>. This is useful to set memory
416 limits and similar for the container.</para>
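
<para>For example, assuming the host uses the unified cgroup hierarchy so that the
<varname>MemoryMax=</varname> property takes effect, the following invocation (with a hypothetical
directory path) limits the container to 1G of memory:</para>

<programlisting>systemd-nspawn --directory=/var/lib/machines/mycontainer --property=MemoryMax=1G</programlisting>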
417 </listitem>
418 </varlistentry>
419
420 <varlistentry>
421 <term><option>--private-users=</option></term>
422
423 <listitem><para>Controls user namespacing. If enabled, the container will run with its own private set of UNIX
424 user and group ids (UIDs and GIDs). This involves mapping the private UIDs/GIDs used in the container (starting
425 with the container's root user 0 and up) to a range of UIDs/GIDs on the host that are not used for other
426 purposes (usually in the range beyond the host's UID/GID 65536). The parameter may be specified as follows:</para>
427
428 <orderedlist>
429 <listitem><para>If one or two colon-separated numbers are specified, user namespacing is turned on. The first
430 parameter specifies the first host UID/GID to assign to the container, the second parameter specifies the
431 number of host UIDs/GIDs to assign to the container. If the second parameter is omitted, 65536 UIDs/GIDs are
432 assigned.</para></listitem>
433
434 <listitem><para>If the parameter is omitted, or true, user namespacing is turned on. The UID/GID range to
435 use is determined automatically from the file ownership of the root directory of the container's directory
436 tree. To use this option, make sure to prepare the directory tree in advance, and ensure that all files and
437 directories in it are owned by UIDs/GIDs in the range you'd like to use. Also, make sure that used file ACLs
438 exclusively reference UIDs/GIDs in the appropriate range. If this mode is used the number of UIDs/GIDs
439 assigned to the container for use is 65536, and the UID/GID of the root directory must be a multiple of
440 65536.</para></listitem>
441
442 <listitem><para>If the parameter is false, user namespacing is turned off. This is the default.</para>
443 </listitem>
444
445 <listitem><para>The special value <literal>pick</literal> turns on user namespacing. In this case the UID/GID
446 range is automatically chosen. As first step, the file owner of the root directory of the container's
447 directory tree is read, and it is checked that it is currently not used by the system otherwise (in
448 particular, that no other container is using it). If this check is successful, the UID/GID range determined
449 this way is used, similar to the behavior if "yes" is specified. If the check is not successful (and thus
450 the UID/GID range indicated in the root directory's file owner is already used elsewhere) a new – currently
451 unused – UID/GID range of 65536 UIDs/GIDs is randomly chosen between the host UID/GIDs of 524288 and
452 1878982656, always starting at a multiple of 65536. This setting implies
453 <option>--private-users-chown</option> (see below), which has the effect that the files and directories in
454 the container's directory tree will be owned by the appropriate users of the range picked. Using this option
455 makes user namespace behavior fully automatic. Note that the first invocation of a previously unused
456 container image might result in picking a new UID/GID range for it, and thus in the (possibly expensive) file
457 ownership adjustment operation. However, subsequent invocations of the container will be cheap (unless of
458 course the picked UID/GID range is assigned to a different use by then).</para></listitem>
459 </orderedlist>
460
461 <para>It is recommended to assign at least 65536 UIDs/GIDs to each container, so that the usable UID/GID range in the
462 container covers 16 bits. For best security, do not assign overlapping UID/GID ranges to multiple containers. It is
463 hence a good idea to use the upper 16 bits of the host 32-bit UIDs/GIDs as container identifier, while the lower 16
464 bits encode the container UID/GID used. This is in fact the behavior enforced by the
465 <option>--private-users=pick</option> option.</para>
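
<para>To illustrate the scheme, assume a hypothetical container identifier of 20. The first host UID/GID of
its range is then 20 × 65536 = 1310720, and UID 1000 inside the container corresponds to host UID
1310720 + 1000 = 1311720. Such a range could be assigned explicitly as follows (the directory path is
hypothetical, and the files in the tree are assumed to be owned by UIDs/GIDs in that range already):</para>

<programlisting>systemd-nspawn --directory=/var/lib/machines/mycontainer --private-users=1310720:65536</programlisting>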
466
467 <para>When user namespaces are used, the GID range assigned to each container is always chosen identical to the
468 UID range.</para>
469
470 <para>In most cases, using <option>--private-users=pick</option> is the recommended option, as it substantially
471 improves container security and operates fully automatically.</para>
472
473 <para>Note that the picked UID/GID range is not written to <filename>/etc/passwd</filename> or
474 <filename>/etc/group</filename>. In fact, the allocation of the range is not stored persistently anywhere,
475 except in the file ownership of the files and directories of the container.</para>
476
477 <para>Note that when user namespacing is used file ownership on disk reflects this, and all of the container's
478 files and directories are owned by the container's effective user and group IDs. This means that copying files
479 from and to the container image requires correction of the numeric UID/GID values, according to the UID/GID
480 shift applied.</para></listitem>
481 </varlistentry>
482
483 <varlistentry>
484 <term><option>--private-users-chown</option></term>
485
486 <listitem><para>If specified, all files and directories in the container's directory tree will be adjusted so that
487 they are owned by the appropriate UIDs/GIDs selected for the container (see above). This operation is
488 potentially expensive, as it involves descending and iterating through the full directory tree of the
489 container. Besides actual file ownership, file ACLs are adjusted as well.</para>
490
491 <para>This option is implied if <option>--private-users=pick</option> is used. This option has no effect if
492 user namespacing is not used.</para></listitem>
493 </varlistentry>
494
495 <varlistentry>
496 <term><option>-U</option></term>
497
498 <listitem><para>If the kernel supports the user namespaces feature, equivalent to
499 <option>--private-users=pick --private-users-chown</option>, otherwise equivalent to
500 <option>--private-users=no</option>.</para>
501
502 <para>Note that <option>-U</option> is the default if the
503 <filename>systemd-nspawn@.service</filename> template unit file is used.</para>
504
505 <para>Note: it is possible to undo the effect of <option>--private-users-chown</option> (or
506 <option>-U</option>) on the file system by redoing the operation with the first UID of 0:</para>
507
508 <programlisting>systemd-nspawn … --private-users=0 --private-users-chown</programlisting>
509 </listitem>
510 </varlistentry>
511
512 <varlistentry>
513 <term><option>--private-network</option></term>
514
515 <listitem><para>Disconnect networking of the container from
516 the host. This makes all network interfaces unavailable in the
517 container, with the exception of the loopback device and those
518 specified with <option>--network-interface=</option> and
519 configured with <option>--network-veth</option>. If this
520 option is specified, the CAP_NET_ADMIN capability will be
521 added to the set of capabilities the container retains. The
522 latter may be disabled by using <option>--drop-capability=</option>.
523 If this option is not specified (or implied by one of the options
524 listed below), the container will have full access to the host network.
525 </para></listitem>
526 </varlistentry>
527
528 <varlistentry>
529 <term><option>--network-namespace-path=</option></term>
530
531 <listitem><para>Takes the path to a file representing a kernel
532 network namespace that the container shall run in. The specified path
533 should refer to a (possibly bind-mounted) network namespace file, as
534 exposed by the kernel below <filename>/proc/$PID/ns/net</filename>.
535 This makes the container enter the given network namespace. One of the
536 typical use cases is to give a network namespace under
537 <filename>/run/netns</filename> created by <citerefentry
538 project='man-pages'><refentrytitle>ip-netns</refentrytitle><manvolnum>8</manvolnum></citerefentry>,
539 for example, <option>--network-namespace-path=/run/netns/foo</option>.
540 Note that this option cannot be used together with other
541 network-related options, such as <option>--private-network</option>
542 or <option>--network-interface=</option>.</para></listitem>
543 </varlistentry>
544
545 <varlistentry>
546 <term><option>--network-interface=</option></term>
547
548 <listitem><para>Assign the specified network interface to the
549 container. This will remove the specified interface from the
550 calling namespace and place it in the container. When the
551 container terminates, it is moved back to the host namespace.
552 Note that <option>--network-interface=</option> implies
553 <option>--private-network</option>. This option may be used
554 more than once to add multiple network interfaces to the
555 container.</para></listitem>
556 </varlistentry>
557
558 <varlistentry>
559 <term><option>--network-macvlan=</option></term>
560
561 <listitem><para>Create a <literal>macvlan</literal> interface
562 of the specified Ethernet network interface and add it to the
563 container. A <literal>macvlan</literal> interface is a virtual
564 interface that adds a second MAC address to an existing
565 physical Ethernet link. The interface in the container will be
566 named after the interface on the host, prefixed with
567 <literal>mv-</literal>. Note that
568 <option>--network-macvlan=</option> implies
569 <option>--private-network</option>. This option may be used
570 more than once to add multiple network interfaces to the
571 container.</para></listitem>
572 </varlistentry>
573
574 <varlistentry>
575 <term><option>--network-ipvlan=</option></term>
576
577 <listitem><para>Create an <literal>ipvlan</literal> interface
578 of the specified Ethernet network interface and add it to the
579 container. An <literal>ipvlan</literal> interface is a virtual
580 interface, similar to a <literal>macvlan</literal> interface,
581 which uses the same MAC address as the underlying interface.
582 The interface in the container will be named after the
583 interface on the host, prefixed with <literal>iv-</literal>.
584 Note that <option>--network-ipvlan=</option> implies
585 <option>--private-network</option>. This option may be used
586 more than once to add multiple network interfaces to the
587 container.</para></listitem>
588 </varlistentry>
589
590 <varlistentry>
591 <term><option>-n</option></term>
592 <term><option>--network-veth</option></term>
593
594 <listitem><para>Create a virtual Ethernet link (<literal>veth</literal>) between host and container. The host
595 side of the Ethernet link will be available as a network interface named after the container's name (as
596 specified with <option>--machine=</option>), prefixed with <literal>ve-</literal>. The container side of the
597 Ethernet link will be named <literal>host0</literal>. The <option>--network-veth</option> option implies
598 <option>--private-network</option>.</para>
599
600 <para>Note that
601 <citerefentry><refentrytitle>systemd-networkd.service</refentrytitle><manvolnum>8</manvolnum></citerefentry>
602 includes by default a network file <filename>/usr/lib/systemd/network/80-container-ve.network</filename>
603 matching the host-side interfaces created this way, which contains settings to enable automatic address
604 provisioning on the created virtual link via DHCP, as well as automatic IP routing onto the host's external
605 network interfaces. It also contains <filename>/usr/lib/systemd/network/80-container-host0.network</filename>
606 matching the container-side interface created this way, containing settings to enable client side address
607 assignment via DHCP. In case <filename>systemd-networkd</filename> is running on both the host and inside the
608 container, automatic IP communication from the container to the host is thus available, with further
609 connectivity to the external network.</para>
610
611 <para>Note that <option>--network-veth</option> is the default if the
612 <filename>systemd-nspawn@.service</filename> template unit file is used.</para>
613 </listitem>
614 </varlistentry>
615
616 <varlistentry>
617 <term><option>--network-veth-extra=</option></term>
618
619 <listitem><para>Adds an additional virtual Ethernet link
620 between host and container. Takes a colon-separated pair of
621 host interface name and container interface name. The latter
622 may be omitted in which case the container and host sides will
623 be assigned the same name. This switch is independent of
624 <option>--network-veth</option>, and — in contrast — may be
625 used multiple times, and allows configuration of the network
626 interface names. Note that <option>--network-bridge=</option>
627 has no effect on interfaces created with
628 <option>--network-veth-extra=</option>.</para></listitem>
629 </varlistentry>
630
631 <varlistentry>
632 <term><option>--network-bridge=</option></term>
633
634 <listitem><para>Adds the host side of the Ethernet link created with <option>--network-veth</option> to the
635 specified Ethernet bridge interface. Expects a valid network interface name of a bridge device as
636 argument. Note that <option>--network-bridge=</option> implies <option>--network-veth</option>. If this option
637 is used, the host side of the Ethernet link will use the <literal>vb-</literal> prefix instead of
638 <literal>ve-</literal>.</para></listitem>
639 </varlistentry>
640
641 <varlistentry>
642 <term><option>--network-zone=</option></term>
643
644 <listitem><para>Creates a virtual Ethernet link (<literal>veth</literal>) to the container and adds it to an
645 automatically managed Ethernet bridge interface. The bridge interface is named after the passed argument,
646 prefixed with <literal>vz-</literal>. The bridge interface is automatically created when the first container
647 configured for its name is started, and is automatically removed when the last container configured for its
648 name exits. Hence, each bridge interface configured this way exists only as long as there's at least one
649 container referencing it running. This option is very similar to <option>--network-bridge=</option>, except for
650 this automatic creation/removal of the bridge device.</para>
651
652 <para>This setting makes it easy to place multiple related containers on a common, virtual Ethernet-based
653 broadcast domain, here called a "zone". Each container may only be part of one zone, but each zone may contain
654 any number of containers. Each zone is referenced by its name. Names may be chosen freely (as long as they form
655 valid network interface names when prefixed with <literal>vz-</literal>), and it is sufficient to pass the same
656 name to the <option>--network-zone=</option> switch of the various concurrently running containers to join
657 them in one zone.</para>
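<para>The naming rule above can be sketched as a small check. This is a hypothetical helper, not part of
systemd, and it only approximates what counts as a valid network interface name: it assumes the Linux limit of
15 characters and rejects slashes, colons and whitespace. The checks actually applied by
<command>systemd-nspawn</command> may be stricter.</para>

<programlisting>
```python
# Hypothetical helper, not part of systemd: derives the bridge interface name
# for a zone and applies a simplified validity check to it.
IFNAMSIZ = 16  # Linux limit for interface names, including the trailing NUL


def zone_bridge_name(zone):
    """Return the "vz-" prefixed bridge name for a zone, or raise ValueError."""
    name = "vz-" + zone
    if not zone:
        raise ValueError("zone name must not be empty")
    if len(name) > IFNAMSIZ - 1:
        raise ValueError("bridge name %r exceeds %d characters" % (name, IFNAMSIZ - 1))
    if any(ch in "/:" or ch.isspace() for ch in name):
        raise ValueError("bridge name %r contains an invalid character" % name)
    return name


print(zone_bridge_name("web"))
```
</programlisting>

<para>Under these assumptions a zone name may be at most 12 characters long, since the
<literal>vz-</literal> prefix takes up three of the 15 available characters.</para>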
658
659 <para>Note that
660 <citerefentry><refentrytitle>systemd-networkd.service</refentrytitle><manvolnum>8</manvolnum></citerefentry>
661 includes by default a network file <filename>/usr/lib/systemd/network/80-container-vz.network</filename>
662 matching the bridge interfaces created this way, which contains settings to enable automatic address
663 provisioning on the created virtual network via DHCP, as well as automatic IP routing onto the host's external
664 network interfaces. Using <option>--network-zone=</option> is hence in most cases fully automatic and
665 sufficient to connect multiple local containers in a joined broadcast domain to the host, with further
666 connectivity to the external network.</para>
667 </listitem>
668 </varlistentry>
669
670 <varlistentry>
671 <term><option>-p</option></term>
672 <term><option>--port=</option></term>
673
674 <listitem><para>If private networking is enabled, maps an IP
675 port on the host onto an IP port on the container. Takes a
676 protocol specifier (either <literal>tcp</literal> or
677 <literal>udp</literal>), separated by a colon from a host port
678 number in the range 1 to 65535, separated by a colon from a
679 container port number in the range from 1 to 65535. The
680 protocol specifier and its separating colon may be omitted, in
681 which case <literal>tcp</literal> is assumed. The container
682 port number and its colon may be omitted, in which case the
683 same port as the host port is implied. This option is only
684 supported if private networking is used, such as with
685 <option>--network-veth</option>, <option>--network-zone=</option> or
686 <option>--network-bridge=</option>.</para></listitem>
687 </varlistentry>
688
689 <varlistentry>
690 <term><option>-Z</option></term>
691 <term><option>--selinux-context=</option></term>
692
693 <listitem><para>Sets the SELinux security context to be used
694 to label processes in the container.</para>
695 </listitem>
696 </varlistentry>
697
698 <varlistentry>
699 <term><option>-L</option></term>
700 <term><option>--selinux-apifs-context=</option></term>
701
702 <listitem><para>Sets the SELinux security context to be used
703 to label files in the virtual API file systems in the
704 container.</para>
705 </listitem>
706 </varlistentry>
707
708 <varlistentry>
709 <term><option>--capability=</option></term>
710
711 <listitem><para>List one or more additional capabilities to grant the container.
712 Takes a comma-separated list of capability names, see
713 <citerefentry project='man-pages'><refentrytitle>capabilities</refentrytitle><manvolnum>7</manvolnum></citerefentry>
714 for more information. Note that the following capabilities are granted in any case:
715 CAP_AUDIT_CONTROL, CAP_AUDIT_WRITE, CAP_CHOWN, CAP_DAC_OVERRIDE, CAP_DAC_READ_SEARCH,
716 CAP_FOWNER, CAP_FSETID, CAP_IPC_OWNER, CAP_KILL, CAP_LEASE, CAP_LINUX_IMMUTABLE,
717 CAP_MKNOD, CAP_NET_BIND_SERVICE, CAP_NET_BROADCAST, CAP_NET_RAW, CAP_SETFCAP,
718 CAP_SETGID, CAP_SETPCAP, CAP_SETUID, CAP_SYS_ADMIN, CAP_SYS_BOOT, CAP_SYS_CHROOT,
719 CAP_SYS_NICE, CAP_SYS_PTRACE, CAP_SYS_RESOURCE, CAP_SYS_TTY_CONFIG. Also CAP_NET_ADMIN
720 is retained if <option>--private-network</option> is specified. If the special value
721 <literal>all</literal> is passed, all capabilities are retained.</para></listitem>
722 </varlistentry>
723
724 <varlistentry>
725 <term><option>--drop-capability=</option></term>
726
727 <listitem><para>Specify one or more additional capabilities to
728 drop for the container. This allows running the container with
729 fewer capabilities than the default (see
730 above).</para></listitem>
731 </varlistentry>
732
733 <varlistentry>
734 <term><option>--no-new-privileges=</option></term>
735
736 <listitem><para>Takes a boolean argument. Specifies the value of the <constant>PR_SET_NO_NEW_PRIVS</constant>
737 flag for the container payload. Defaults to off. When turned on the payload code of the container cannot
738 acquire new privileges, i.e. the "setuid" file bit as well as file system capabilities will not have an effect
739 anymore. See <citerefentry
740 project='man-pages'><refentrytitle>prctl</refentrytitle><manvolnum>2</manvolnum></citerefentry> for details
741 about this flag. </para></listitem>
742 </varlistentry>
743
744 <varlistentry>
745 <term><option>--system-call-filter=</option></term>
746
747 <listitem><para>Alter the system call filter applied to containers. Takes a space-separated list of system call
748 names or group names (the latter prefixed with <literal>@</literal>, as listed by the
749 <command>syscall-filter</command> command of
750 <citerefentry><refentrytitle>systemd-analyze</refentrytitle><manvolnum>1</manvolnum></citerefentry>). Passed
751 system calls will be permitted. The list may optionally be prefixed by <literal>~</literal>, in which case all
752 listed system calls are prohibited. If this command line option is used multiple times the configured lists are
753 combined. If both a positive and a negative list (that is one system call list without and one with the
754 <literal>~</literal> prefix) are configured, the negative list takes precedence over the positive list. Note
755 that <command>systemd-nspawn</command> always implements a system call whitelist (as opposed to a blacklist),
756 and this command line option hence adds or removes entries from the default whitelist, depending on the
757 <literal>~</literal> prefix. Note that the applied system call filter is also altered implicitly if additional
758 capabilities are passed using <option>--capability=</option>.</para></listitem>
759 </varlistentry>
760
761 <varlistentry>
762 <term><option>--rlimit=</option></term>
763
764 <listitem><para>Sets the specified POSIX resource limit for the container payload. Expects an assignment of the
765 form
766 <literal><replaceable>LIMIT</replaceable>=<replaceable>SOFT</replaceable>:<replaceable>HARD</replaceable></literal>
767 or <literal><replaceable>LIMIT</replaceable>=<replaceable>VALUE</replaceable></literal>, where
768 <replaceable>LIMIT</replaceable> should refer to a resource limit type, such as
769 <constant>RLIMIT_NOFILE</constant> or <constant>RLIMIT_NICE</constant>. The <replaceable>SOFT</replaceable> and
770 <replaceable>HARD</replaceable> fields should refer to the numeric soft and hard resource limit values. If the
771 second form is used, <replaceable>VALUE</replaceable> may specify a value that is used both as soft and hard
772 limit. In place of a numeric value the special string <literal>infinity</literal> may be used to turn off
773 resource limiting for the specific type of resource. This command line option may be used multiple times to
774 control limits on multiple limit types. If used multiple times for the same limit type, the last use
775 wins. For details about resource limits see <citerefentry
776 project='man-pages'><refentrytitle>setrlimit</refentrytitle><manvolnum>2</manvolnum></citerefentry>. By default
777 resource limits for the container's init process (PID 1) are set to the same values the Linux kernel originally
778 passed to the host init system. Note that some resource limits are enforced on resources counted per user, in
779 particular <constant>RLIMIT_NPROC</constant>. This means that unless user namespacing is deployed
780 (i.e. <option>--private-users=</option> is used, see above), any limits set will be applied to the resource
781 usage of the same user on all local containers as well as the host. This means particular care needs to be
782 taken with these limits as they might be triggered by possibly less trusted code. Example:
783 <literal>--rlimit=RLIMIT_NOFILE=8192:16384</literal>.</para></listitem>
784 </varlistentry>
785
786 <varlistentry>
787 <term><option>--oom-score-adjust=</option></term>
788
789 <listitem><para>Changes the OOM ("Out Of Memory") score adjustment value for the container payload. This controls
790 <filename>/proc/self/oom_score_adj</filename> which influences the preference with which this container is
791 terminated when memory becomes scarce. For details see <citerefentry
792 project='man-pages'><refentrytitle>proc</refentrytitle><manvolnum>5</manvolnum></citerefentry>. Takes an
793 integer in the range -1000…1000.</para></listitem>
794 </varlistentry>
795
796 <varlistentry>
797 <term><option>--cpu-affinity=</option></term>
798
799 <listitem><para>Controls the CPU affinity of the container payload. Takes a comma separated list of CPU numbers
800 or number ranges (the latter's start and end value separated by dashes). See <citerefentry
801 project='man-pages'><refentrytitle>sched_setaffinity</refentrytitle><manvolnum>2</manvolnum></citerefentry> for
802 details.</para></listitem>
803 </varlistentry>
804
805 <varlistentry>
806 <term><option>--kill-signal=</option></term>
807
808 <listitem><para>Specify the process signal to send to the container's PID 1 when nspawn itself receives
809 <constant>SIGTERM</constant>, in order to trigger an orderly shutdown of the container. Defaults to
810 <constant>SIGRTMIN+3</constant> if <option>--boot</option> is used (on systemd-compatible init systems
811 <constant>SIGRTMIN+3</constant> triggers an orderly shutdown). If <option>--boot</option> is not used and this
812 option is not specified the container's processes are terminated abruptly via <constant>SIGKILL</constant>. For
813 a list of valid signals, see <citerefentry
814 project='man-pages'><refentrytitle>signal</refentrytitle><manvolnum>7</manvolnum></citerefentry>.</para></listitem>
815 </varlistentry>
816
817 <varlistentry>
818 <term><option>--link-journal=</option></term>
819
820 <listitem><para>Control whether the container's journal shall
821 be made visible to the host system. If enabled, allows viewing
822 the container's journal files from the host (but not vice
823 versa). Takes one of <literal>no</literal>,
824 <literal>host</literal>, <literal>try-host</literal>,
825 <literal>guest</literal>, <literal>try-guest</literal>,
826 <literal>auto</literal>. If <literal>no</literal>, the journal
827 is not linked. If <literal>host</literal>, the journal files
828 are stored on the host file system (beneath
829 <filename>/var/log/journal/<replaceable>machine-id</replaceable></filename>)
830 and the subdirectory is bind-mounted into the container at the
831 same location. If <literal>guest</literal>, the journal files
832 are stored on the guest file system (beneath
833 <filename>/var/log/journal/<replaceable>machine-id</replaceable></filename>)
834 and the subdirectory is symlinked into the host at the same
835 location. <literal>try-host</literal> and
836 <literal>try-guest</literal> do the same but do not fail if
837 the host does not have persistent journaling enabled. If
838 <literal>auto</literal> (the default), and the right
839 subdirectory of <filename>/var/log/journal</filename> exists,
840 it will be bind mounted into the container. If the
841 subdirectory does not exist, no linking is performed.
842 Effectively, booting a container once with
843 <literal>guest</literal> or <literal>host</literal> will link
844 the journal persistently if further on the default of
845 <literal>auto</literal> is used.</para>
846
847 <para>Note that <option>--link-journal=try-guest</option> is the default if the
848 <filename>systemd-nspawn@.service</filename> template unit file is used.</para></listitem>
849 </varlistentry>
850
851 <varlistentry>
852 <term><option>-j</option></term>
853
854 <listitem><para>Equivalent to
855 <option>--link-journal=try-guest</option>.</para></listitem>
856 </varlistentry>
857
858 <varlistentry>
859 <term><option>--resolv-conf=</option></term>
860
861 <listitem><para>Configures how <filename>/etc/resolv.conf</filename> inside of the container (i.e. DNS
862 configuration synchronization from host to container) shall be handled. Takes one of <literal>off</literal>,
863 <literal>copy-host</literal>, <literal>copy-static</literal>, <literal>bind-host</literal>,
864 <literal>bind-static</literal>, <literal>delete</literal> or <literal>auto</literal>. If set to
865 <literal>off</literal> the <filename>/etc/resolv.conf</filename> file in the container is left as it is
866 included in the image, and neither modified nor bind mounted over. If set to <literal>copy-host</literal>, the
867 <filename>/etc/resolv.conf</filename> file from the host is copied into the container. Similarly, if
868 <literal>bind-host</literal> is used, the file is bind mounted from the host into the container. If set to
869 <literal>copy-static</literal> the static <filename>resolv.conf</filename> file supplied with
870 <citerefentry><refentrytitle>systemd-resolved.service</refentrytitle><manvolnum>8</manvolnum></citerefentry> is
871 copied into the container, and correspondingly <literal>bind-static</literal> bind mounts it there. If set to
872 <literal>delete</literal> the <filename>/etc/resolv.conf</filename> file in the container is deleted if it
873 exists. Finally, if set to <literal>auto</literal> the file is left as it is if private networking is turned on
874 (see <option>--private-network</option>). Otherwise, if <filename>systemd-resolved.service</filename> is
875 connectible its static <filename>resolv.conf</filename> file is used, and if not the host's
876 <filename>/etc/resolv.conf</filename> file is used. In the latter cases the file is copied if the image is
877 writable, and bind mounted otherwise. It's recommended to use <literal>copy</literal> if the container shall be
878 able to make changes to the DNS configuration on its own, deviating from the host's settings. Otherwise
879 <literal>bind</literal> is preferable, as it means direct changes to <filename>/etc/resolv.conf</filename> in
880 the container are not allowed, as it is a read-only bind mount (but note that if the container has enough
881 privileges, it might simply go ahead and unmount the bind mount anyway). Note that regardless of whether the file
882 is bind mounted or copied, no further propagation of configuration is generally done after the one-time early
883 initialization (this is because the file is usually updated through copying and renaming). Defaults to
884 <literal>auto</literal>.</para></listitem>
885 </varlistentry>
886
887 <varlistentry>
888 <term><option>--timezone=</option></term>
889
890 <listitem><para>Configures how <filename>/etc/localtime</filename> inside of the container (i.e. local timezone
891 synchronization from host to container) shall be handled. Takes one of <literal>off</literal>,
892 <literal>copy</literal>, <literal>bind</literal>, <literal>symlink</literal>, <literal>delete</literal> or
893 <literal>auto</literal>. If set to <literal>off</literal> the <filename>/etc/localtime</filename> file in the
894 container is left as it is included in the image, and neither modified nor bind mounted over. If set to
895 <literal>copy</literal> the <filename>/etc/localtime</filename> file of the host is copied into the
896 container. Similarly, if <literal>bind</literal> is used, it is bind mounted from the host into the container. If
897 set to <literal>symlink</literal> a symlink from <filename>/etc/localtime</filename> in the container is
898 created pointing to the timezone file of the container that matches the timezone setting on the
899 host. If set to <literal>delete</literal> the file in the container is deleted, should it exist. If set to
900 <literal>auto</literal> and the <filename>/etc/localtime</filename> file of the host is a symlink, then
901 <literal>symlink</literal> mode is used, and <literal>copy</literal> otherwise, except if the image is
902 read-only in which case <literal>bind</literal> is used instead. Defaults to
903 <literal>auto</literal>.</para></listitem>
904 </varlistentry>
905
906 <varlistentry>
907 <term><option>--read-only</option></term>
908
909 <listitem><para>Mount the container's root file system (and any other file systems contained in the container
910 image) read-only. This has no effect on additional mounts made with <option>--bind=</option>,
911 <option>--tmpfs=</option> and similar options. This mode is implied if the container image file or directory is
912 marked read-only itself. It is also implied if <option>--volatile=</option> is used. In this case the container
913 image on disk is strictly read-only, while changes are permitted but kept non-persistently in memory only. For
914 further details, see below.</para></listitem>
915 </varlistentry>
916
917 <varlistentry>
918 <term><option>--bind=</option></term>
919 <term><option>--bind-ro=</option></term>
920
921 <listitem><para>Bind mount a file or directory from the host into the container. Takes one of: a path
922 argument — in which case the specified path will be mounted from the host to the same path in the container, or
923 a colon-separated pair of paths — in which case the first specified path is the source in the host, and the
924 second path is the destination in the container, or a colon-separated triple of source path, destination path
925 and mount options. The source path may optionally be prefixed with a <literal>+</literal> character. If so, the
926 source path is taken relative to the image's root directory. This permits setting up bind mounts within the
927 container image. The source path may be specified as empty string, in which case a temporary directory below
928 the host's <filename>/var/tmp</filename> directory is used. It is automatically removed when the container is
929 shut down. Mount options are comma-separated and currently, only <option>rbind</option> and
930 <option>norbind</option> are allowed, controlling whether to create a recursive or a regular bind
931 mount. Defaults to "rbind". Backslash escapes are interpreted, so <literal>\:</literal> may be used to embed
932 colons in either path. This option may be specified multiple times for creating multiple independent bind
933 mount points. The <option>--bind-ro=</option> option creates read-only bind mounts.</para>
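<para>The argument syntax described above can be sketched as a small parser. This is an illustrative
reimplementation, not the code used by <command>systemd-nspawn</command>; the function names are hypothetical,
backslash handling is simplified to "the character after a backslash is taken literally", and the destination
used when a <literal>+</literal>-prefixed source is given without a destination is an assumption.</para>

<programlisting>
```python
# Illustrative sketch only; names and edge-case behaviour are assumptions.
def split_escaped(spec, sep=":"):
    """Split spec on unescaped separators, resolving backslash escapes."""
    fields, current, chars = [], [], iter(spec)
    for ch in chars:
        if ch == "\\":
            # Take the next character literally (a trailing backslash is kept).
            current.append(next(chars, "\\"))
        elif ch == sep:
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
    fields.append("".join(current))
    return fields


def parse_bind(spec):
    """Return (source, from_image, destination, options) for a bind argument."""
    fields = split_escaped(spec)
    if len(fields) > 3:
        raise ValueError("expected at most source, destination and options")
    from_image = fields[0].startswith("+")
    source = fields[0][1:] if from_image else fields[0]
    # Assumption: without an explicit destination the source path is reused.
    destination = fields[1] if len(fields) > 1 else source
    options = fields[2].split(",") if len(fields) > 2 else ["rbind"]
    if not destination:
        raise ValueError("a destination is required when the source is empty")
    for option in options:
        if option not in ("rbind", "norbind"):
            raise ValueError("unsupported mount option: %r" % option)
    return source, from_image, destination, options


print(parse_bind("+/usr/share:/opt/share:norbind"))
```
</programlisting>

<para>An empty source, as in <literal>:/scratch</literal>, is returned as an empty string here; creating and
removing the temporary directory is outside the scope of this sketch.</para>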
934
935 <para>Note that when this option is used in combination with <option>--private-users</option>, the resulting
936 mount points will be owned by the <constant>nobody</constant> user. That's because the mount and its files and
937 directories continue to be owned by the relevant host users and groups, which do not exist in the container,
938 and thus show up under the wildcard UID 65534 (nobody). If such bind mounts are created, it is recommended to
939 make them read-only, using <option>--bind-ro=</option>.</para></listitem>
940 </varlistentry>
941
942 <varlistentry>
943 <term><option>--tmpfs=</option></term>
944
945 <listitem><para>Mount a tmpfs file system into the container. Takes a single absolute path argument that
946 specifies where to mount the tmpfs instance to (in which case the directory access mode will be chosen as 0755,
947 owned by root/root), or optionally a colon-separated pair of path and mount option string that is used for
948 mounting (in which case the kernel default for access mode and owner will be chosen, unless otherwise
949 specified). Backslash escapes are interpreted in the path, so <literal>\:</literal> may be used to embed colons
950 in the path.</para>
951
952 <para>Note that this option cannot be used to replace the root file system of the container with a temporary
953 file system. However, the <option>--volatile=</option> option described below provides similar
954 functionality, with a focus on implementing stateless operating system images.</para></listitem>
955 </varlistentry>
956
957 <varlistentry>
958 <term><option>--overlay=</option></term>
959 <term><option>--overlay-ro=</option></term>
960
961 <listitem><para>Combine multiple directory trees into one
962 overlay file system and mount it into the container. Takes a
963 list of colon-separated paths to the directory trees to
964 combine and the destination mount point.</para>
965
966 <para>Backslash escapes are interpreted in the paths, so
967 <literal>\:</literal> may be used to embed colons in the paths.
968 </para>
969
970 <para>If three or more paths are specified, then the last
971 specified path is the destination mount point in the
972 container, all paths specified before refer to directory trees
973 on the host and are combined in the specified order into one
974 overlay file system. The left-most path is hence the lowest
975 directory tree, the second-to-last path the highest directory
976 tree in the stacking order. If <option>--overlay-ro=</option>
977 is used instead of <option>--overlay=</option>, a read-only
978 overlay file system is created. If a writable overlay file
979 system is created, all changes made to it are written to the
980 highest directory tree in the stacking order, i.e. the
981 second-to-last specified.</para>
982
983 <para>If only two paths are specified, then the second
984 specified path is used both as the top-level directory tree in
985 the stacking order as seen from the host, as well as the mount
986 point for the overlay file system in the container. At least
987 two paths have to be specified.</para>
988
989 <para>The source paths may optionally be prefixed with a <literal>+</literal> character. If so they are taken
990 relative to the image's root directory. The uppermost source path may also be specified as empty string, in
991 which case a temporary directory below the host's <filename>/var/tmp</filename> is used. The directory is
992 removed automatically when the container is shut down. This behaviour is useful in order to make read-only
993 container directories writable while the container is running. For example, use the
994 <literal>--overlay=+/var::/var</literal> option in order to automatically overlay a writable temporary
995 directory on a read-only <filename>/var</filename> directory.</para>
996
997 <para>For details about overlay file systems, see <ulink
998 url="https://www.kernel.org/doc/Documentation/filesystems/overlayfs.txt">overlayfs.txt</ulink>. Note
999 that the semantics of overlay file systems are substantially
1000 different from normal file systems, in particular regarding
1001 reported device and inode information. Device and inode
1002 information may change for a file while it is being written
1003 to, and processes might see out-of-date versions of files at
1004 times. Note that this switch automatically derives the
1005 <literal>workdir=</literal> mount option for the overlay file
1006 system from the top-level directory tree, making it a sibling
1007 of it. It is hence essential that the top-level directory tree
1008 is not a mount point itself (since the working directory must
1009 be on the same file system as the top-most directory
1010 tree). Also note that the <literal>lowerdir=</literal> mount
1011 option receives the paths to stack in the opposite order of
1012 this switch.</para>
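<para>The stacking rules above can be sketched as follows for the writable <option>--overlay=</option> case.
This is an illustrative model, not the code used by <command>systemd-nspawn</command>: the function names are
hypothetical, backslash handling is simplified, and neither the <literal>+</literal> prefix nor an empty
uppermost path is treated specially.</para>

<programlisting>
```python
# Illustrative sketch only; names and simplifications are assumptions.
def split_paths(spec):
    """Split spec on unescaped colons, resolving backslash escapes."""
    paths, current, chars = [], [], iter(spec)
    for ch in chars:
        if ch == "\\":
            current.append(next(chars, "\\"))
        elif ch == ":":
            paths.append("".join(current))
            current = []
        else:
            current.append(ch)
    paths.append("".join(current))
    return paths


def plan_overlay(spec):
    """Map an overlay argument to lowerdir/upperdir and the mount point."""
    paths = split_paths(spec)
    if len(paths) == 1:
        raise ValueError("at least two paths have to be specified")
    if len(paths) == 2:
        # The second path is both the top-level tree and the mount point.
        lower, upper, destination = [paths[0]], paths[1], paths[1]
    else:
        *layers, destination = paths
        lower, upper = layers[:-1], layers[-1]
    return {
        # overlayfs lists lower layers top-most first, i.e. in reverse order.
        "lowerdir": ":".join(reversed(lower)),
        "upperdir": upper,
        "destination": destination,
    }


print(plan_overlay("/srv/base:/srv/updates:/srv/changes:/var/lib/app"))
```
</programlisting>

<para>In this sketch, <literal>/srv/base</literal> ends up as the lowest layer,
<literal>/srv/changes</literal> receives all writes, and the result is mounted at
<filename>/var/lib/app</filename> in the container. The example paths are placeholders.</para>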
1013
1014 <para>Note that this option cannot be used to replace the root file system of the container with an overlay
1015 file system. However, the <option>--volatile=</option> option described below provides similar functionality,
1016 with a focus on implementing stateless operating system images.</para></listitem>
1017 </varlistentry>
1018
1019 <varlistentry>
1020 <term><option>-E <replaceable>NAME</replaceable>=<replaceable>VALUE</replaceable></option></term>
1021 <term><option>--setenv=<replaceable>NAME</replaceable>=<replaceable>VALUE</replaceable></option></term>
1022
1023 <listitem><para>Specifies an environment variable assignment
1024 to pass to the init process in the container, in the format
1025 <literal>NAME=VALUE</literal>. This may be used to override
1026 the default variables or to set additional variables. This
1027 parameter may be used more than once.</para></listitem>
1028 </varlistentry>
1029
1030 <varlistentry>
1031 <term><option>--register=</option></term>
1032
1033 <listitem><para>Controls whether the container is registered with
1034 <citerefentry><refentrytitle>systemd-machined</refentrytitle><manvolnum>8</manvolnum></citerefentry>. Takes a
1035 boolean argument, which defaults to <literal>yes</literal>. This option should be enabled when the container
1036 runs a full Operating System (more specifically: a system and service manager as PID 1), and is useful to
1037 ensure that the container is accessible via
1038 <citerefentry><refentrytitle>machinectl</refentrytitle><manvolnum>1</manvolnum></citerefentry> and shown by
1039 tools such as <citerefentry
1040 project='man-pages'><refentrytitle>ps</refentrytitle><manvolnum>1</manvolnum></citerefentry>. If the container
1041 does not run a service manager, it is recommended to set this option to
1042 <literal>no</literal>.</para></listitem>
1043 </varlistentry>
1044
1045 <varlistentry>
1046 <term><option>--keep-unit</option></term>
1047
1048 <listitem><para>Instead of creating a transient scope unit to run the container in, simply use the service or
1049 scope unit <command>systemd-nspawn</command> has been invoked in. If <option>--register=yes</option> is set
1050 this unit is registered with
1051 <citerefentry><refentrytitle>systemd-machined</refentrytitle><manvolnum>8</manvolnum></citerefentry>. This
1052 switch should be used if <command>systemd-nspawn</command> is invoked from within a service unit, and the
1053 service unit's sole purpose is to run a single <command>systemd-nspawn</command> container. This option is not
1054 available if run from a user session.</para>
1055 <para>Note that passing <option>--keep-unit</option> disables the effect of <option>--slice=</option> and
1056 <option>--property=</option>. Use <option>--keep-unit</option> and <option>--register=no</option> in
1057 combination to disable any kind of unit allocation or registration with
1058 <command>systemd-machined</command>.</para></listitem>
1059 </varlistentry>
1060
1061 <varlistentry>
1062 <term><option>--personality=</option></term>
1063
1064 <listitem><para>Control the architecture ("personality")
1065 reported by
1066 <citerefentry project='man-pages'><refentrytitle>uname</refentrytitle><manvolnum>2</manvolnum></citerefentry>
1067 in the container. Currently, only <literal>x86</literal> and
1068 <literal>x86-64</literal> are supported. This is useful when
1069 running a 32-bit container on a 64-bit host. If this setting
1070 is not used, the personality reported in the container is the
1071 same as the one reported on the host.</para></listitem>
1072 </varlistentry>
1073
1074 <varlistentry>
1075 <term><option>-q</option></term>
1076 <term><option>--quiet</option></term>
1077
1078 <listitem><para>Turns off any status output by the tool
1079 itself. When this switch is used, the only output from nspawn
1080 will be the console output of the container OS
1081 itself.</para></listitem>
1082 </varlistentry>
1083
1084 <varlistentry>
1085 <term><option>--volatile</option></term>
1086 <term><option>--volatile=</option><replaceable>MODE</replaceable></term>
1087
1088 <listitem><para>Boots the container in volatile mode. When no mode parameter is passed or when mode is
1089 specified as <option>yes</option>, full volatile mode is enabled. This means the root directory is mounted as a
1090 mostly unpopulated <literal>tmpfs</literal> instance, and <filename>/usr/</filename> from the OS tree is
1091 mounted into it in read-only mode (the system thus starts up with a read-only OS image, but pristine state and
1092 configuration, and any changes are lost on shutdown). When the mode parameter is specified as
1093 <option>state</option>, the OS tree is mounted read-only, but <filename>/var/</filename> is mounted as a
1094 writable <literal>tmpfs</literal> instance into it (the system thus starts up with read-only OS resources and
1095 configuration, but pristine state, and any changes to the latter are lost on shutdown). When the mode parameter
1096 is specified as <option>overlay</option>, the read-only root file system is combined with a writable
1097 <literal>tmpfs</literal> instance through <literal>overlayfs</literal>, so that it appears as it normally
1098 would, but any changes are applied to the temporary file system only and lost when the container is
1099 terminated. When the mode parameter is specified as <option>no</option> (the default), the whole OS tree is
1100 made available writable (unless <option>--read-only</option> is specified, see above).</para>
1101
1102 <para>Note that if one of the volatile modes is chosen, its effect is limited to the root file system (or
1103 <filename>/var/</filename> in case of <option>state</option>), and any other mounts placed in the hierarchy are
1104 unaffected, regardless of whether they are established automatically (e.g. the EFI system partition that might be
1105 mounted to <filename>/efi/</filename> or <filename>/boot/</filename>) or explicitly (e.g. through an additional
1106 command line option such as <option>--bind=</option>, see above). This means that even if
1107 <option>--volatile=overlay</option> is used, changes to <filename>/efi/</filename> or
1108 <filename>/boot/</filename> are prohibited if such a partition exists in the container image operated on,
1109 and even if <option>--volatile=state</option> is used, the hypothetical file <filename>/etc/foobar</filename> is
1110 potentially writable if <option>--bind=/etc/foobar</option> is used to mount it from outside the read-only
1111 container <filename>/etc</filename> directory.</para>
1112
1113 <para>The <option>--ephemeral</option> option is closely related to this setting, and provides similar
1114 behaviour by making a temporary, ephemeral copy of the whole OS image and executing that. For further details,
1115 see above.</para>
1116
1117 <para>The <option>--tmpfs=</option> and <option>--overlay=</option> options provide similar functionality, but
1118 for specific sub-directories of the OS image only. For details, see above.</para>
1119
1120 <para>This option provides similar functionality for containers as the <literal>systemd.volatile=</literal>
1121 kernel command line switch provides for host systems. See
1122 <citerefentry><refentrytitle>kernel-command-line</refentrytitle><manvolnum>7</manvolnum></citerefentry> for
1123 details.</para>
1124
1125 <para>Note that setting this option to <option>yes</option> or <option>state</option> will only work correctly
1126 with operating systems in the container that can boot up with only <filename>/usr</filename> mounted, and are
1127 able to automatically populate <filename>/var</filename>, and also <filename>/etc</filename> in case of
1128 <option>--volatile=yes</option>. The <option>overlay</option> mode does not require any particular
1129 preparations in the OS, but do note that <literal>overlayfs</literal> behaviour differs from regular file
1130 systems in a number of ways, and hence compatibility is limited.</para></listitem>
1131 </varlistentry>
1132
1133 <varlistentry>
1134 <term><option>--settings=</option><replaceable>MODE</replaceable></term>
1135
1136 <listitem><para>Controls whether
1137 <command>systemd-nspawn</command> shall search for and use
1138 additional per-container settings from
1139 <filename>.nspawn</filename> files. Takes a boolean or the
1140 special values <option>override</option> or
1141 <option>trusted</option>.</para>
1142
1143 <para>If enabled (the default), a settings file named after the
1144 machine (as specified with the <option>--machine=</option>
1145 setting, or derived from the directory or image file name)
1146 with the suffix <filename>.nspawn</filename> is searched for in
1147 <filename>/etc/systemd/nspawn/</filename> and
1148 <filename>/run/systemd/nspawn/</filename>. If it is found
1149 there, its settings are read and used. If it is not found
1150 there, it is subsequently searched for in the same directory as the
1151 image file or in the immediate parent of the root directory of
1152 the container. In this case, if the file is found, its settings
1153 will also be read and used, but potentially unsafe settings
1154 are ignored. Note that in both these cases, settings on the
1155 command line take precedence over the corresponding settings
1156 from loaded <filename>.nspawn</filename> files, if both are
1157 specified. All settings that elevate the container's
1158 privileges or grant access to additional resources, such as
1159 files or directories of the host, are considered unsafe. For
1160 details about the format and contents of
1161 <filename>.nspawn</filename> files, consult
1162 <citerefentry><refentrytitle>systemd.nspawn</refentrytitle><manvolnum>5</manvolnum></citerefentry>.</para>
1163
1164 <para>If this option is set to <option>override</option>, the
1165 file is searched for, read and used the same way; however, the order of
1166 precedence is reversed: settings read from the
1167 <filename>.nspawn</filename> file will take precedence over
1168 the corresponding command line options, if both are
1169 specified.</para>
1170
1171 <para>If this option is set to <option>trusted</option>, the
1172 file is searched for, read and used the same way, but regardless
1173 of being found in <filename>/etc/systemd/nspawn/</filename>,
1174 <filename>/run/systemd/nspawn/</filename> or next to the image
1175 file or container root directory, all settings will take
1176 effect; however, command line arguments still take precedence
1177 over corresponding settings.</para>
1178
1179 <para>If disabled, no <filename>.nspawn</filename> file is read
1180 and no settings except the ones on the command line are in
1181 effect.</para></listitem>
1182 </varlistentry>
1183
1184 <varlistentry>
1185 <term><option>--notify-ready=</option></term>
1186
1187 <listitem><para>Configures support for notifications from the container's init process.
1188 <option>--notify-ready=</option> takes a boolean (<option>no</option> and <option>yes</option>).
1189 With option <option>no</option>, <command>systemd-nspawn</command> notifies systemd
1190 with a <literal>READY=1</literal> message when the init process is created.
1191 With option <option>yes</option>, <command>systemd-nspawn</command> waits for the
1192 <literal>READY=1</literal> message from the init process in the container
1193 before sending its own to systemd. For more details about notifications
1194 see <citerefentry><refentrytitle>sd_notify</refentrytitle><manvolnum>3</manvolnum></citerefentry>.</para></listitem>
1195 </varlistentry>
1196
1197 <xi:include href="standard-options.xml" xpointer="help" />
1198 <xi:include href="standard-options.xml" xpointer="version" />
1199 </variablelist>
1200
1201 </refsect1>
1202
1203 <refsect1>
1204 <title>Examples</title>
1205
1206 <example>
1207 <title>Download a
1208 <ulink url="https://getfedora.org">Fedora</ulink> image and start a shell in it</title>
1209
1210 <programlisting># machinectl pull-raw --verify=no \
1211 https://download.fedoraproject.org/pub/fedora/linux/releases/&fedora_latest_version;/Cloud/x86_64/images/Fedora-Cloud-Base-&fedora_latest_version;-&fedora_cloud_release;.x86_64.raw.xz
1212 # systemd-nspawn -M Fedora-Cloud-Base-&fedora_latest_version;-&fedora_cloud_release;.x86_64.raw</programlisting>
1213
1214 <para>This downloads an image using
1215 <citerefentry><refentrytitle>machinectl</refentrytitle><manvolnum>1</manvolnum></citerefentry>
1216 and opens a shell in it.</para>
1217 </example>
1218
1219 <example>
1220 <title>Build and boot a minimal Fedora distribution in a container</title>
1221
1222 <programlisting># dnf -y --releasever=&fedora_latest_version; --installroot=/var/lib/machines/f&fedora_latest_version; \
1223 --disablerepo='*' --enablerepo=fedora --enablerepo=updates install \
1224 systemd passwd dnf fedora-release vim-minimal
1225 # systemd-nspawn -bD /var/lib/machines/f&fedora_latest_version;</programlisting>
1226
1227 <para>This installs a minimal Fedora distribution into the
1228 directory <filename noindex='true'>/var/lib/machines/f&fedora_latest_version;</filename>
1229 and then boots an OS in a namespace container in it. Because the installation
1230 is located underneath the standard <filename>/var/lib/machines/</filename>
1231 directory, it is also possible to start the machine using
1232 <command>systemd-nspawn -M f&fedora_latest_version;</command>.</para>
1233 </example>
1234
1235 <example>
1236 <title>Spawn a shell in a container of a minimal Debian unstable distribution</title>
1237
1238 <programlisting># debootstrap unstable ~/debian-tree/
1239 # systemd-nspawn -D ~/debian-tree/</programlisting>
1240
1241 <para>This installs a minimal Debian unstable distribution into
1242 the directory <filename>~/debian-tree/</filename> and then
1243 spawns a shell in a namespace container in it.</para>
1244
1245 <para><command>debootstrap</command> supports
1246 <ulink url="https://www.debian.org">Debian</ulink>,
1247 <ulink url="https://www.ubuntu.com">Ubuntu</ulink>,
1248 and <ulink url="https://www.tanglu.org">Tanglu</ulink>
1249 out of the box, so the same command can be used to install any of those. For other
1250 distributions from the Debian family, a mirror has to be specified, see
1251 <citerefentry project='die-net'><refentrytitle>debootstrap</refentrytitle><manvolnum>8</manvolnum></citerefentry>.
1252 </para>
1253 </example>
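
<example>
<title>Run a command with a 32-bit personality</title>

<para>A minimal sketch of <option>--personality=</option>, assuming an x86-64 host and reusing the
<filename>~/debian-tree/</filename> directory from the previous example purely for illustration; in practice
the tree would typically contain a 32-bit userland.</para>

<programlisting># systemd-nspawn -D ~/debian-tree/ --personality=x86 uname -m</programlisting>

<para>With <option>--personality=x86</option>, the architecture reported by <command>uname</command> inside
the container should be a 32-bit x86 one instead of the host's. Without the option, the container reports
the same personality as the host.</para>
</example>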
1254
1255 <example>
1256 <title>Boot a minimal
1257 <ulink url="https://www.archlinux.org">Arch Linux</ulink> distribution in a container</title>
1258
1259 <programlisting># pacstrap -c -d ~/arch-tree/ base
1260 # systemd-nspawn -bD ~/arch-tree/</programlisting>
1261
1262 <para>This installs a minimal Arch Linux distribution into the
1263 directory <filename>~/arch-tree/</filename> and then boots an OS
1264 in a namespace container in it.</para>
1265 </example>
1266
1267 <example>
1268 <title>Install the
1269 <ulink url="https://software.opensuse.org/distributions/tumbleweed">openSUSE Tumbleweed</ulink>
1270 rolling distribution</title>
1271
1272 <programlisting># zypper --root=/var/lib/machines/tumbleweed ar -c \
1273 https://download.opensuse.org/tumbleweed/repo/oss tumbleweed
1274 # zypper --root=/var/lib/machines/tumbleweed refresh
1275 # zypper --root=/var/lib/machines/tumbleweed install --no-recommends \
1276 systemd shadow zypper openSUSE-release vim
1277 # systemd-nspawn -M tumbleweed passwd root
1278 # systemd-nspawn -M tumbleweed -b</programlisting>
1279 </example>
1280
1281 <example>
1282 <title>Boot into an ephemeral snapshot of the host system</title>
1283
1284 <programlisting># systemd-nspawn -D / -xb</programlisting>
1285
1286 <para>This runs a copy of the host system in a snapshot which is removed immediately when the container
1287 exits. Hence, all file system changes made during runtime are lost on shutdown.</para>
1288 </example>
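
<example>
<title>Boot a container in volatile overlay mode</title>

<para>A minimal sketch of <option>--volatile=overlay</option>, assuming the Fedora tree from the earlier
example has been installed in
<filename noindex='true'>/var/lib/machines/f&fedora_latest_version;</filename>.</para>

<programlisting># systemd-nspawn -D /var/lib/machines/f&fedora_latest_version; --volatile=overlay -b</programlisting>

<para>Here the OS tree is combined with a writable <literal>tmpfs</literal> instance through
<literal>overlayfs</literal>, so changes made inside the container are not written to the directory and are
lost when the container terminates. In contrast to <option>--ephemeral</option>, no temporary copy of the
OS tree is made.</para>
</example>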
1289
1290 <example>
1291 <title>Run a container with SELinux sandbox security contexts</title>
1292
1293 <programlisting># chcon system_u:object_r:svirt_sandbox_file_t:s0:c0,c1 -R /srv/container
1294 # systemd-nspawn -L system_u:object_r:svirt_sandbox_file_t:s0:c0,c1 \
1295 -Z system_u:system_r:svirt_lxc_net_t:s0:c0,c1 -D /srv/container /bin/sh</programlisting>
1296 </example>
1297
1298 <example>
1299 <title>Run a container with an OSTree deployment</title>
1300
1301 <programlisting># systemd-nspawn -b -i ~/image.raw \
1302 --pivot-root=/ostree/deploy/$OS/deploy/$CHECKSUM:/sysroot \
1303 --bind=+/sysroot/ostree/deploy/$OS/var:/var</programlisting>
1304 </example>
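
<example>
<title>Run a container from a service unit</title>

<para>A minimal sketch of a service unit whose sole purpose is to run a single container, combining
<option>--keep-unit</option> and <option>--notify-ready=yes</option>. The machine name
<literal>mycontainer</literal>, the description and the path to the <command>systemd-nspawn</command> binary
are assumptions made for illustration and need to be adapted to the local setup.</para>

<programlisting>[Unit]
Description=Container mycontainer

[Service]
Type=notify
ExecStart=/usr/bin/systemd-nspawn --quiet --keep-unit --boot --notify-ready=yes --machine=mycontainer</programlisting>

<para>With <option>--keep-unit</option>, the container runs in the service unit itself instead of a
transient scope unit. With <option>--notify-ready=yes</option>, <command>systemd-nspawn</command> sends
<literal>READY=1</literal> to the service manager only after the init process in the container has reported
readiness, so the service is considered started once the container has finished booting.</para>
</example>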
1305 </refsect1>
1306
1307 <refsect1>
1308 <title>Exit status</title>
1309
1310 <para>The exit code of the program executed in the container is
1311 returned.</para>
1312 </refsect1>
1313
1314 <refsect1>
1315 <title>See Also</title>
1316 <para>
1317 <citerefentry><refentrytitle>systemd</refentrytitle><manvolnum>1</manvolnum></citerefentry>,
1318 <citerefentry><refentrytitle>systemd.nspawn</refentrytitle><manvolnum>5</manvolnum></citerefentry>,
1319 <citerefentry project='man-pages'><refentrytitle>chroot</refentrytitle><manvolnum>1</manvolnum></citerefentry>,
1320 <citerefentry project='mankier'><refentrytitle>dnf</refentrytitle><manvolnum>8</manvolnum></citerefentry>,
1321 <citerefentry project='die-net'><refentrytitle>debootstrap</refentrytitle><manvolnum>8</manvolnum></citerefentry>,
1322 <citerefentry project='archlinux'><refentrytitle>pacman</refentrytitle><manvolnum>8</manvolnum></citerefentry>,
1323 <citerefentry project='mankier'><refentrytitle>zypper</refentrytitle><manvolnum>8</manvolnum></citerefentry>,
1324 <citerefentry><refentrytitle>systemd.slice</refentrytitle><manvolnum>5</manvolnum></citerefentry>,
1325 <citerefentry><refentrytitle>machinectl</refentrytitle><manvolnum>1</manvolnum></citerefentry>,
1326 <citerefentry project='man-pages'><refentrytitle>btrfs</refentrytitle><manvolnum>8</manvolnum></citerefentry>
1327 </para>
1328 </refsect1>
1329
1330 </refentry>