---
title: Portable Services Introduction
---

# Portable Services Introduction

This systemd version includes a preview of the "portable service"
concept. "Portable Services" are supposed to be an incremental improvement over
traditional system services, making two specific facets of container management
available to system services more readily. Specifically:

1. The bundling of applications, i.e. packing up multiple services, their
   binaries and all their dependencies in a single image, and running them
   directly from it.

2. Stricter default security policies, i.e. sand-boxing of applications.

The primary tool for interfacing with "portable services" is the new
`portablectl` program. It's currently shipped in `/usr/lib/systemd/portablectl`
(i.e. not in the `$PATH`), since it's not yet considered part of the officially
supported systemd interfaces. It is still a preview, after all.

Portable services don't bring anything inherently new to the table. All they do
is put together known concepts in a slightly nicer way to cover a specific set
of use-cases.

## So, what *is* a "Portable Service"?

A portable service is ultimately just an OS tree, either in a directory
tree or inside a raw disk image containing a Linux file system. This tree is
called the "image". It can be "attached" to or "detached" from the system. When
"attached", specific systemd units from the image are made available on the
host system, and then behave pretty much exactly like locally installed system
services. When "detached", these units are removed again from the host, leaving
no artifacts around (except maybe messages they might have logged).

The OS tree/image can be created with any tool of your choice. For example, you
can use `dnf --installroot=` if you like, or `debootstrap`. The image format is
entirely generic, and doesn't have to carry any specific metadata beyond what
distribution images carry anyway. Or to say this differently: the image format
doesn't define any new metadata, as unit files and OS tree directories or disk
images are already sufficient, and pretty universally available these days. One
particularly nice tool for creating suitable images is
[mkosi](https://github.com/systemd/mkosi), but many other existing tools will
do too.

If you will, "Portable Services" are a nicer way to manage `chroot()`
environments, with better security, tooling and behavior.

## Where's the difference to a "Container"?

"Container" is a very vague term. After all, it is used for
systemd-nspawn/LXC-type OS containers, for Docker/rkt-like micro service
containers, and even certain 'lightweight' VM runtimes.

The "portable service" concept ultimately will not provide a fully isolated
environment to the payload, like containers mostly intend to. Instead, portable
services are from the beginning more like regular system services: they can be
controlled with the same tools, are exposed the same way in all infrastructure
and so on. Their main difference is that they use a different root directory
than the rest of the system. Hence, the intention is not to run code in a
different, isolated world from the host (as most containers would do), but to
run it in the same world, with stricter access controls on what the service can
see and do.

As one point of differentiation: as programs run as "portable services" are
pretty much regular system services, they won't run as PID 1 (like Docker would
do it), but as normal processes. A corollary of that is that they aren't
supposed to manage anything in their own environment (such as the network), as
the execution environment is mostly shared with the rest of the system.

The primary use-case of "portable services" is to extend the host system
with encapsulated extensions, but provide almost full integration with the rest
of the system, though possibly restricted by effective security knobs. This
focus includes system extensions otherwise sometimes called "super-privileged
containers".

Note that portable services are only available for system services, not for
user services, i.e. the functionality cannot be used for the stuff
bubblewrap/flatpak is focusing on.

## Mode of Operation

If you have a portable service image, maybe in a raw disk image called
`foobar_0.7.23.raw`, then attaching the services to the host is as easy as:

```
# /usr/lib/systemd/portablectl attach foobar_0.7.23.raw
```

This command does the following:

1. It dissects the image, checks and validates the `/etc/os-release`
   (or `/usr/lib/os-release`, see below) data of the image, and looks for
   all included unit files.

2. It copies out all unit files with a suffix of `.service`, `.socket`,
   `.target`, `.timer` and `.path` whose name begins with the image's name
   (with the `.raw` removed), truncated at the first underscore (if there is
   one). This prefix name generated from the image name must be followed by a
   ".", "-" or "@" character in the unit name. Or in other words, given the
   image name of `foobar_0.7.23.raw` all unit files matching
   `foobar-*.{service|socket|target|timer|path}`,
   `foobar@.{service|socket|target|timer|path}` as well as
   `foobar.*.{service|socket|target|timer|path}` and
   `foobar.{service|socket|target|timer|path}` are copied out. These unit files
   are placed in `/etc/systemd/system.attached/` (which is part of the normal
   unit file search path of PID 1, and thus loaded exactly like regular unit
   files). Within the image the unit files are looked for at the usual
   locations, i.e. in `/usr/lib/systemd/system/` and `/etc/systemd/system/` and
   so on, relative to the image's root.

3. For each such unit file a drop-in file is created. Let's say
   `foobar-waldo.service` was one of the unit files copied to
   `/etc/systemd/system.attached/`, then a drop-in file
   `/etc/systemd/system.attached/foobar-waldo.service.d/20-portable.conf` is
   created, containing a few lines of additional configuration:

   ```
   [Service]
   RootImage=/path/to/foobar.raw
   Environment=PORTABLE=foobar
   LogExtraFields=PORTABLE=foobar
   ```

4. For each such unit a "profile" drop-in is linked in. This "profile" drop-in
   generally contains security options that lock down the service. By default
   the `default` profile is used, which provides a medium level of
   security. There's also `trusted` which runs the service at the highest
   privileges, i.e. host's root and everything. The `strict` profile comes with
   the toughest security restrictions. Finally, `nonetwork` is like `default`
   but without network access. Users may define their own profiles too (or
   modify the existing ones).

And that's already it.
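
The name-matching rule from step 2 can be sketched in a few lines of Python.
This only illustrates the rule as described above. It is not the actual
`portablectl` implementation, which may apply further checks, and the function
names `image_prefix` and `unit_matches` are made up for this example.

```python
# Unit types that are copied out of an image, per step 2 above.
UNIT_SUFFIXES = (".service", ".socket", ".target", ".timer", ".path")


def image_prefix(image_name: str) -> str:
    """Derive the unit-name prefix from an image name (hypothetical helper)."""
    name = image_name
    if name.endswith(".raw"):
        name = name[: -len(".raw")]
    # Truncate at the first underscore, if there is one.
    return name.split("_", 1)[0]


def unit_matches(image_name: str, unit_name: str) -> bool:
    """Return True if unit_name would be copied out for image_name."""
    prefix = image_prefix(image_name)
    if not unit_name.endswith(UNIT_SUFFIXES):
        return False
    if not unit_name.startswith(prefix):
        return False
    # The prefix must be followed by ".", "-" or "@".
    rest = unit_name[len(prefix):]
    return rest[:1] in (".", "-", "@")
```

With this sketch, `unit_matches("foobar_0.7.23.raw", "foobar-waldo.service")`
is true, while `foobarbaz.service` (no separator after the prefix) and
`foobar-waldo.mount` (a suffix that is not in the list) are rejected.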

Note that the images need to stay around (and in the same location) as long as
the portable service is attached. If an image is moved, the `RootImage=` line
written to the unit drop-in would point to a non-existing place, and break the
logic.
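
As a rough illustration of why the image must stay in place, the following
Python sketch reads the `RootImage=` line from a drop-in and checks whether the
referenced path still exists. The helper names are hypothetical, and the
parsing is deliberately naive (it ignores sections, quoting and line
continuations).

```python
from pathlib import Path
from typing import Optional


def root_image_of(dropin_text: str) -> Optional[str]:
    """Return the value of the first RootImage= line, or None if absent."""
    for line in dropin_text.splitlines():
        line = line.strip()
        if line.startswith("RootImage="):
            return line[len("RootImage="):]
    return None


def image_still_present(dropin_text: str) -> bool:
    """True if the drop-in names a RootImage= path that exists on disk."""
    path = root_image_of(dropin_text)
    return path is not None and Path(path).exists()


# Example drop-in text, mirroring the one shown in step 3 above.
EXAMPLE = """[Service]
RootImage=/path/to/foobar.raw
Environment=PORTABLE=foobar
LogExtraFields=PORTABLE=foobar
"""
```

A check like `image_still_present(EXAMPLE)` turning false after an image was
moved corresponds to the broken state described above.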

The `portablectl detach` command executes the reverse operation: it looks for
the drop-ins and the unit files associated with the image, and removes them
again.

Note that `portablectl attach` won't enable or start any of the units it copies
out. This still has to take place in a second, separate step. (That said, we
might add options to do this automatically later on.)

## Requirements on Images

Note that portable services don't introduce any new image format; most OS
images should just work the way they are. Specifically, the following
requirements are made for an image that can be attached/detached with
`portablectl`.

1. It must contain an executable that shall be invoked, along with all its
   dependencies. If binary code, the code needs to be compiled for an
   architecture compatible with the host.

2. The image must either be a plain sub-directory (or btrfs subvolume)
   containing the binaries and their dependencies in a classic Linux OS tree,
   or must be a raw disk image either containing only one, naked file system,
   or an image with a partition table understood by the Linux kernel with only
   a single partition defined, or alternatively, a GPT partition table with a
   set of properly marked partitions following the [Discoverable Partitions
   Specification](https://www.freedesktop.org/wiki/Specifications/DiscoverablePartitionsSpec/).

3. The image must contain at least one matching unit file, with the right name
   prefix and suffix (see above). The unit file is searched in the usual paths,
   i.e. primarily `/etc/systemd/system/` and `/usr/lib/systemd/system/` within
   the image. (The implementation will check a couple of other paths too, but
   it's recommended to use these two paths.)

4. The image must contain an os-release file, either in `/etc/os-release` or
   `/usr/lib/os-release`. The file should follow the standard format.

5. The image must contain the files `/etc/resolv.conf` and `/etc/machine-id`
   (empty files are OK); they will be bind mounted from the host at runtime.

6. The image must contain directories `/proc/`, `/sys/`, `/dev/`, `/run/`,
   `/tmp/`, `/var/tmp/` that can be mounted over with the corresponding version
   from the host.

7. The OS might require other files or directories to be in place. For example,
   if the image is built based on glibc, the dynamic loader needs to be
   available in `/lib/ld-linux.so.2` or `/lib64/ld-linux-x86-64.so.2` (or
   similar, depending on architecture), and if the distribution implements a
   merged `/usr/` tree, this means `/lib` and/or `/lib64` need to be symlinks
   to their respective counterparts below `/usr/`. For details see your
   distribution's documentation.

Note that images created by tools such as `debootstrap`, `dnf --installroot=`
or `mkosi` generally qualify for all of the above in one way or another. If you
wonder what the most minimal image would be that complies with the requirements
above, it could consist of this:

```
/usr/bin/minimald                            # a statically compiled binary
/usr/lib/systemd/system/minimal-test.service # the unit file for the service, with ExecStart=/usr/bin/minimald
/usr/lib/os-release                          # an os-release file explaining what this is
/etc/resolv.conf                             # empty file to mount over with host's version
/etc/machine-id                              # ditto
/proc/                                       # empty directory to use as mount point for host's API fs
/sys/                                        # ditto
/dev/                                        # ditto
/run/                                        # ditto
/tmp/                                        # ditto
/var/tmp/                                    # ditto
```

And that's it.
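
A minimal sketch of how such a directory-tree image could be assembled with
Python's standard library follows. The function name `build_minimal_tree`, the
os-release values and the unit file contents beyond `ExecStart=` are
placeholders chosen for illustration; the statically compiled binary itself
must be supplied by the caller.

```python
import shutil
from pathlib import Path

# Directories that only serve as mount points for the host's versions.
MOUNT_POINTS = ("proc", "sys", "dev", "run", "tmp", "var/tmp")


def build_minimal_tree(root: Path, binary: Path) -> None:
    """Populate `root` with the minimal layout listed above."""
    # The service binary (assumed to be statically linked).
    bin_dir = root / "usr/bin"
    bin_dir.mkdir(parents=True, exist_ok=True)
    shutil.copy2(binary, bin_dir / "minimald")

    # The unit file; its name must carry the image's name prefix.
    unit_dir = root / "usr/lib/systemd/system"
    unit_dir.mkdir(parents=True, exist_ok=True)
    (unit_dir / "minimal-test.service").write_text(
        "[Service]\nExecStart=/usr/bin/minimald\n"
    )

    # Placeholder os-release contents (hypothetical values).
    (root / "usr/lib/os-release").write_text(
        'ID=minimal\nNAME="Minimal portable service image"\n'
    )

    # Empty files that are bind mounted over from the host at runtime.
    (root / "etc").mkdir(parents=True, exist_ok=True)
    for name in ("resolv.conf", "machine-id"):
        (root / "etc" / name).touch()

    # Empty directories used as mount points.
    for name in MOUNT_POINTS:
        (root / name).mkdir(parents=True, exist_ok=True)
```

Calling `build_minimal_tree(Path("minimal"), Path("./minimald"))` would produce
a directory named `minimal` with the layout above, so that the image name
prefix `minimal` matches the unit name `minimal-test.service`.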

Note that qualifying images do not have to contain an init system of their
own. If they do, it's fine, it will be ignored by the portable service logic,
but they generally don't have to, and it might make sense to avoid any, to keep
images minimal.

If the image is writable, and some of the files or directories that are
overmounted from the host do not exist yet, they are automatically created. On
read-only, immutable images (e.g. squashfs images) all files and directories to
over-mount must exist already.

Note that as no new image format or metadata is defined, it's very
straightforward to define images that can be used in a number of
different ways. For example, by using `mkosi -b` you can trivially build a
single, unified image that:

1. Can be attached as a portable service, to run any container services
   natively on the host.

2. Can be run as an OS container, using `systemd-nspawn`, by booting the image
   with `systemd-nspawn -i -b`.

3. Can be booted directly as a VM image, using a generic VM executor such as
   `virtualbox`/`qemu`/`kvm`.

4. Can be booted directly on bare-metal systems.

Of course, to facilitate 2, 3 and 4 you need to include an init system in the
image. To facilitate 3 and 4 you also need to include a boot loader in the
image. As mentioned, `mkosi -b` takes care of all of that for you, but any
other image generator should work too.

## Execution Environment

Note that the code in portable service images is run exactly like regular
services. Hence there's no new execution environment to consider. Also, unlike
what Docker would do, as these are regular system services they aren't run as
PID 1 either, but with regular PID values.

## Access to host resources

If services shipped with this mechanism shall be able to access host resources
(such as files or AF_UNIX sockets for IPC), use the normal `BindPaths=` and
`BindReadOnlyPaths=` settings in unit files to mount them in. In fact the
`default` profile mentioned above makes use of this to ensure
`/etc/resolv.conf`, the D-Bus system bus socket and write access to the logging
subsystem are available to the service.
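
As an illustration, a drop-in granting access to specific host paths could be
generated with a small Python helper like the one below. `BindPaths=` and
`BindReadOnlyPaths=` are the unit settings named above; the helper name
`render_bind_dropin` and any paths passed to it are hypothetical, and paths
containing whitespace would need quoting that this sketch does not handle.

```python
from typing import Sequence


def render_bind_dropin(
    read_write: Sequence[str] = (),
    read_only: Sequence[str] = (),
) -> str:
    """Render a [Service] drop-in that bind mounts host paths into a service."""
    lines = ["[Service]"]
    if read_write:
        # Paths the service may read and write.
        lines.append("BindPaths=" + " ".join(read_write))
    if read_only:
        # Paths the service may only read.
        lines.append("BindReadOnlyPaths=" + " ".join(read_only))
    return "\n".join(lines) + "\n"
```

For example, `render_bind_dropin(read_only=["/etc/resolv.conf"])` yields a
two-line drop-in consisting of `[Service]` and
`BindReadOnlyPaths=/etc/resolv.conf`.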

## Instantiation

Sometimes it makes sense to instantiate the same set of services multiple
times. The portable service concept does not introduce a new logic for this. It
is recommended to use the regular unit templating of systemd for this, i.e. to
include template units such as `foobar@.service`, so that instantiation is as
simple as:

```
# /usr/lib/systemd/portablectl attach foobar_0.7.23.raw
# systemctl enable --now foobar@instancea.service
# systemctl enable --now foobar@instanceb.service
```

The benefit of this approach is that templating works exactly the same for
units shipped with the OS itself as for attached portable services.

## Immutable images with local data

It's a good idea to keep portable service images read-only during normal
operation. In fact all but the `trusted` profile will default to this kind of
behaviour, by setting the `ProtectSystem=strict` option. In this case writable
service data may be placed on the host file system. Use `StateDirectory=` in
the unit files to enable such behaviour and add a local data directory to the
services copied onto the host.