Commit | Line | Data |
---|---|---|
44d565ed LP |
1 | # Portable Services Introduction |
2 | ||
3 | This systemd version includes a preview of the "portable service" | |
4 | concept. "Portable Services" are supposed to be an incremental improvement over | |
5 | traditional system services, making two specific facets of container management | |
6 | available to system services more readily. Specifically: | |
7 | ||
8 | 1. The bundling of applications, i.e. packing up multiple services, their | |
9 | binaries and all their dependencies in a single image, and running them | |
10 | directly from it. | |
11 | ||
12 | 2. Stricter default security policies, i.e. sand-boxing of applications. | |
13 | ||
14 | The primary tool for interfacing with "portable services" is the new | |
15 | "portablectl" program. It's currently shipped in /usr/lib/systemd/portablectl | |
16 | (i.e. not in the `$PATH`), since it's not yet considered part of the officially | |
17 | supported systemd interfaces; it is still a preview, after all. | |
18 | ||
19 | Portable services don't bring anything inherently new to the table. All they do | |
20 | is put together known concepts to cover a specific set of use-cases in a | |
21 | slightly nicer way. | |
22 | ||
23 | # So, what *is* a "Portable Service"? | |
24 | ||
25 | A portable service is ultimately just an OS tree, either inside of a directory | |
26 | tree, or inside a raw disk image containing a Linux file system. This tree is | |
27 | called the "image". It can be "attached" to or "detached" from the system. When | |
28 | "attached", specific systemd units from the image are made available on the host | |
29 | system, and then behave pretty much exactly like locally installed system | |
30 | services. When "detached", these units are removed again from the host, leaving | |
31 | no artifacts around (except maybe messages they might have logged). | |
32 | ||
33 | The OS tree/image can be created with any tool of your choice. For example, you | |
34 | can use `dnf --installroot=` if you like, or `debootstrap`. The image format is | |
35 | entirely generic, and doesn't have to carry any specific metadata beyond what | |
36 | distribution images carry anyway. Or to say this differently: the image format | |
37 | doesn't define any new metadata as unit files and OS tree directories or disk | |
38 | images are already sufficient, and pretty universally available these days. One | |
39 | particularly nice tool for creating suitable images is | |
40 | [mkosi](https://github.com/systemd/mkosi), but many other existing tools will | |
41 | do too. | |
42 | ||
43 | If you will, "Portable Services" are a nicer way to manage chroot() | |
44 | environments, with better security, tooling and behavior. | |
45 | ||
46 | # Where's the difference to a "Container"? | |
47 | ||
48 | "Container" is a very vague term; after all, it is used for | |
49 | systemd-nspawn/LXC-type OS containers, for Docker/rkt-like micro service | |
50 | containers, and even certain 'lightweight' VM runtimes. | |
51 | ||
52 | The "portable service" concept ultimately will not provide a fully isolated | |
53 | environment to the payload, like containers mostly intend to. Instead they are | |
54 | from the beginning more like regular system services: they can be controlled with | |
55 | the same tools, are exposed the same way in all infrastructure and so on. Their | |
56 | main difference is that they use a different root directory than the rest of the | |
57 | system. Hence, the intention is not to run code in a different, isolated world | |
58 | from the host (as most containers would do), but to run it in the same | |
59 | world, but with stricter access controls on what the service can see and do. | |
60 | ||
61 | As one point of differentiation: as programs run as "portable services" are | |
62 | pretty much regular system services, they won't run as PID 1 (like Docker would | |
63 | do it), but as normal processes. A corollary of that is that they aren't supposed | |
64 | to manage anything in their own environment (such as the network) as the | |
65 | execution environment is mostly shared with the rest of the system. | |
66 | ||
67 | The primary use-case of "portable services" is to extend the host system | |
68 | with encapsulated extensions, but provide almost full integration with the rest | |
69 | of the system, though possibly restricted by effective security knobs. This | |
70 | focus includes system extensions otherwise sometimes called "super-privileged | |
71 | containers". | |
72 | ||
73 | Note that portable services are only available for system services, not for | |
74 | user services, i.e. the functionality cannot be used for the stuff | |
75 | bubblewrap/flatpak is focusing on. | |
76 | ||
77 | # Mode of Operation | |
78 | ||
79 | If you have a portable service image, maybe in a raw disk image called | |
80 | `foobar_0.7.23.raw`, then attaching the services to the host is as easy as: | |
81 | ||
82 | ``` | |
83 | # /usr/lib/systemd/portablectl attach foobar_0.7.23.raw | |
84 | ``` | |
85 | ||
86 | This command does the following: | |
87 | ||
88 | 1. It dissects the image, checks and validates the `/etc/os-release` data of | |
89 | the image, and looks for all included unit files. | |
90 | ||
91 | 2. It copies out all unit files with a suffix of `.service`, `.socket`, | |
92 | `.target`, `.timer` and `.path` whose name begins with the image's name | |
93 | (with the `.raw` suffix removed), truncated at the first underscore (if there is | |
94 | one). This prefix name generated from the image name must be followed by a | |
95 | ".", "-" or "@" character in the unit name. Or in other words, given the | |
96 | image name of `foobar_0.7.23.raw` all unit files matching | |
97 | `foobar-*.{service|socket|target|timer|path}`, | |
98 | `foobar@.{service|socket|target|timer|path}` as well as | |
99 | `foobar.*.{service|socket|target|timer|path}` and | |
100 | `foobar.{service|socket|target|timer|path}` are copied out. These unit files | |
101 | are placed in `/etc/systemd/system/` like regular unit files. Within the | |
102 | images the unit files are looked for at the usual locations, i.e. in | |
103 | `/usr/lib/systemd/system/` and `/etc/systemd/system/` and so on, relative to | |
104 | the image's root. | |
105 | ||
106 | 3. For each such unit file a drop-in file is created. Let's say | |
107 | `foobar-waldo.service` was one of the unit files copied to | |
108 | `/etc/systemd/system/`, then a drop-in file | |
109 | `/etc/systemd/system/foobar-waldo.service.d/20-portable.conf` is created, | |
110 | containing a few lines of additional configuration: | |
111 | ||
112 | ``` | |
113 | [Service] | |
114 | RootImage=/path/to/foobar_0.7.23.raw | |
115 | Environment=PORTABLE=foobar | |
116 | LogExtraFields=PORTABLE=foobar | |
117 | ``` | |
118 | ||
119 | 4. For each such unit a "profile" drop-in is linked in. This "profile" drop-in | |
120 | generally contains security options that lock down the service. By default | |
121 | the `default` profile is used, which provides a medium level of | |
122 | security. There's also `trusted` which runs the service at the highest | |
b99bfb13 | 123 | privileges, i.e. host's root and everything. The `strict` profile comes with |
44d565ed LP |
124 | the toughest security restrictions. Finally, `nonetwork` is like `default` |
125 | but without network access. Users may define their own profiles too (or | |
126 | modify the existing ones). | |
127 | ||
128 | And that's already it. | |
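
The name matching described in step 2 can be sketched in a few lines of POSIX shell. This is only an illustration of the rule as stated above, not the code `portablectl` actually runs; the function names `image_prefix` and `unit_matches` are made up for this example, and directory images given with a trailing slash are not handled.

```shell
#!/bin/sh
# Sketch of the unit-name matching rule from step 2 (hypothetical helpers,
# not portablectl's implementation).

# image_prefix IMAGE: strip the directory and the ".raw" suffix, then
# truncate at the first underscore.
image_prefix() {
    name=${1##*/}                 # drop any leading directory
    name=${name%.raw}             # drop the .raw suffix, if present
    printf '%s\n' "${name%%_*}"   # keep everything before the first "_"
}

# unit_matches IMAGE UNIT: succeed if UNIT would be copied out for IMAGE.
unit_matches() {
    prefix=$(image_prefix "$1")
    # Only these unit types are considered.
    case $2 in
        *.service|*.socket|*.target|*.timer|*.path) ;;
        *) return 1 ;;
    esac
    # The prefix must be followed by ".", "-" or "@".
    case $2 in
        "$prefix".*|"$prefix"-*|"$prefix"@*) return 0 ;;
        *) return 1 ;;
    esac
}
```

With these definitions, `unit_matches foobar_0.7.23.raw foobar-waldo.service` succeeds, while `foobarbaz.service` (no separator after the prefix) and `foobar-waldo.mount` (unsupported suffix) are rejected.
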
129 | ||
130 | Note that the images need to stay around (and in the same location) as long as the | |
131 | portable service is attached. If an image is moved, the `RootImage=` line | |
132 | written to the unit drop-in would point to a non-existent place, and break the | |
133 | logic. | |
134 | ||
135 | The `portablectl detach` command executes the reverse operation: it looks for | |
136 | the drop-ins and the unit files associated with the image, and removes them | |
137 | again. | |
138 | ||
139 | Note that `portablectl attach` won't enable or start any of the units it copies | |
140 | out. This still has to take place in a second, separate step. (That said, we | |
141 | might add options to do this automatically later on.) | |
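
Since attaching and enabling are separate steps, a small wrapper can keep them together. The following is a rough sketch, assuming the preview location of `portablectl` mentioned earlier and that `detach` accepts the same image argument as `attach`; the helper names `run`, `portable_up` and `portable_down` and the `DRY_RUN` variable are inventions of this example.

```shell
#!/bin/sh
# Sketch: attach + enable/start, and the reverse (hypothetical helpers).
PORTABLECTL=${PORTABLECTL:-/usr/lib/systemd/portablectl}

# run CMD...: print the command, and execute it unless DRY_RUN=1 is set.
run() {
    printf '+ %s\n' "$*"
    [ "${DRY_RUN:-0}" = 1 ] || "$@"
}

# portable_up IMAGE UNIT: attach the image, then enable and start one unit.
portable_up() {
    run "$PORTABLECTL" attach "$1" &&
    run systemctl enable --now "$2"
}

# portable_down IMAGE UNIT: stop and disable the unit, then detach the image.
portable_down() {
    run systemctl disable --now "$2" &&
    run "$PORTABLECTL" detach "$1"
}
```

Setting `DRY_RUN=1` before calling `portable_up foobar_0.7.23.raw foobar-waldo.service` only prints the two commands, which is a cheap way to check the sequence before running it as root.
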
142 | ||
143 | # Requirements on Images | |
144 | ||
145 | Note that portable services don't introduce any new image format; most OS | |
146 | images should just work the way they are. Specifically, the following | |
147 | requirements are made for an image that can be attached/detached with | |
148 | `portablectl`. | |
149 | ||
150 | 1. It must contain the binary that shall be invoked, including all its | |
151 | dependencies. If it is binary code, the code needs to be | |
152 | compiled for an architecture compatible with the host. | |
153 | ||
154 | 2. The image must either be a plain sub-directory (or btrfs subvolume) | |
155 | containing the binaries and their dependencies in a classic Linux OS tree, or | |
156 | must be a raw disk image either containing only one, naked file system, or | |
157 | an image with a partition table understood by the Linux kernel with only a | |
158 | single partition defined, or alternatively, a GPT partition table with a set | |
159 | of properly marked partitions following the [Discoverable Partitions | |
160 | Specification](https://www.freedesktop.org/wiki/Specifications/DiscoverablePartitionsSpec/). | |
161 | ||
162 | 3. The image must contain at least one matching unit file, with the right name | |
163 | prefix and suffix (see above). The unit file is searched in the usual paths, | |
164 | i.e. primarily /etc/systemd/system/ and /usr/lib/systemd/system/ within the | |
165 | image. (The implementation will check a couple of other paths too, but it's | |
166 | recommended to use these two paths.) | |
167 | ||
168 | 4. The image must contain an os-release file, either in /etc/os-release or | |
169 | /usr/lib/os-release. The file should follow the standard format. | |
170 | ||
171 | Note that generally images created by tools such as `debootstrap`, `dnf | |
172 | --installroot=` or `mkosi` qualify for all of the above in one way or | |
173 | another. If you wonder what the most minimal image would be that complies with | |
174 | the requirements above, it could consist of this: | |
175 | ||
176 | ``` | |
177 | /usr/bin/minimald # a statically compiled binary | |
178 | /usr/lib/systemd/system/minimal-test.service # the unit file for the service, with ExecStart=/usr/bin/minimald | |
179 | /usr/lib/os-release # an os-release file explaining what this is | |
180 | ``` | |
181 | ||
182 | And that's it. | |
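
As an illustration, such a minimal image can be laid out as a plain directory tree by hand. The sketch below assumes you already have a statically linked binary to copy in; the function name `make_minimal_image`, the unit description and the os-release values are placeholders. The unit file goes into `/usr/lib/systemd/system/`, one of the search paths named in requirement 3. Note that, per the naming rule, the directory would have to be called `minimal` (or `minimal_<version>`) for `minimal-test.service` to match.

```shell
#!/bin/sh
# Sketch: build a minimal directory-tree image (hypothetical helper).

# make_minimal_image ROOT BINARY
#   ROOT   - directory to create, e.g. ./minimal
#   BINARY - path to a statically compiled program to ship as minimald
make_minimal_image() {
    root=$1 binary=$2
    mkdir -p "$root/usr/bin" "$root/usr/lib/systemd/system" || return 1
    cp "$binary" "$root/usr/bin/minimald" || return 1
    chmod 0755 "$root/usr/bin/minimald" || return 1

    # The unit file for the service.
    cat > "$root/usr/lib/systemd/system/minimal-test.service" <<'EOF'
[Unit]
Description=Minimal portable service example

[Service]
ExecStart=/usr/bin/minimald
EOF

    # An os-release file explaining what this is (placeholder values).
    cat > "$root/usr/lib/os-release" <<'EOF'
ID=minimal
NAME="Minimal portable service image"
EOF
}
```
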
183 | ||
184 | Note that qualifying images do not have to contain an init system of their | |
185 | own. If they do, it's fine, it will be ignored by the portable service logic, | |
186 | but they generally don't have to, and it might make sense to avoid any, to keep | |
187 | images minimal. | |
188 | ||
189 | Note that as no new image format or metadata is defined, it's very | |
190 | straight-forward to define images that can be made use of in a number of | |
191 | different ways. For example, by using `mkosi -b` you can trivially build a | |
192 | single, unified image that: | |
193 | ||
194 | 1. Can be attached as a portable service, to run any container services natively | |
195 | on the host. | |
196 | ||
197 | 2. Can be run as an OS container, using `systemd-nspawn`, by booting the image | |
198 | with `systemd-nspawn -i -b`. | |
199 | ||
200 | 3. Can be booted directly as a VM image, using a generic VM executor such as | |
201 | `virtualbox`/`qemu`/`kvm`. | |
202 | ||
203 | 4. Can be booted directly on bare-metal systems. | |
204 | ||
205 | Of course, to facilitate 2, 3 and 4 you need to include an init system in the | |
206 | image. To facilitate 3 and 4 you also need to include a boot loader in the | |
207 | image. As mentioned `mkosi -b` takes care of all of that for you, but any other | |
208 | image generator should work too. | |
209 | ||
210 | # Execution Environment | |
211 | ||
212 | Note that the code in portable service images is run exactly like regular | |
213 | services. Hence there's no new execution environment to consider. And, unlike | |
214 | with Docker, as these are regular system services they aren't run as PID | |
215 | 1 either, but with regular PID values. | |
216 | ||
217 | # Access to host resources | |
218 | ||
219 | If services shipped with this mechanism shall be able to access host resources | |
220 | (such as files or AF_UNIX sockets for IPC), use the normal `BindPaths=` and | |
221 | `BindReadOnlyPaths=` settings in unit files to mount them in. In fact the | |
222 | `default` profile mentioned above makes use of this to ensure that | |
223 | `/etc/resolv.conf`, the D-Bus system bus socket and write access to the logging | |
224 | subsystem are available to the service. | |
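
For instance, extra host paths could be granted to an attached unit through an additional drop-in. The following sketch writes one; the helper name `write_bind_dropin`, the drop-in file name `50-host-access.conf` and the paths in the usage note are assumptions made for this example, not anything `portablectl` creates itself.

```shell
#!/bin/sh
# Sketch: write a drop-in that binds host paths into a service
# (hypothetical helper and file name).

# write_bind_dropin UNIT_DIR UNIT RO_PATH [RW_PATH]
#   UNIT_DIR - unit directory, e.g. /etc/systemd/system
#   UNIT     - unit name, e.g. foobar-waldo.service
#   RO_PATH  - host path to expose read-only
#   RW_PATH  - optional host path to expose read-write
write_bind_dropin() {
    dir=$1/$2.d
    mkdir -p "$dir" || return 1
    {
        printf '[Service]\n'
        printf 'BindReadOnlyPaths=%s\n' "$3"
        # Only emit BindPaths= when a fourth argument was given.
        [ -z "${4:-}" ] || printf 'BindPaths=%s\n' "$4"
    } > "$dir/50-host-access.conf"
}
```

A call such as `write_bind_dropin /etc/systemd/system foobar-waldo.service /etc/ssl/certs /run/foobar`, followed by `systemctl daemon-reload`, would make those two host paths visible inside the service's root.
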
225 | ||
226 | # Instantiation | |
227 | ||
228 | Sometimes it makes sense to instantiate the same set of services multiple | |
229 | times. The portable service concept does not introduce a new logic for this. It | |
230 | is recommended to use the regular unit templating of systemd for this, i.e. to | |
231 | include template units such as `foobar@.service`, so that instantiation is as | |
232 | simple as: | |
233 | ||
234 | ``` | |
235 | # /usr/lib/systemd/portablectl attach foobar_0.7.23.raw | |
236 | # systemctl enable --now foobar@instancea.service | |
237 | # systemctl enable --now foobar@instanceb.service | |
238 | … | |
239 | ``` | |
240 | ||
241 | The benefit of this approach is that templating works exactly the same for | |
242 | units shipped with the OS itself as for attached portable services. | |
243 | ||
244 | # Immutable images with local data | |
245 | ||
246 | It's a good idea to keep portable service images read-only during normal | |
247 | operation. In fact all but the `trusted` profile will default to this kind of | |
248 | behaviour, by setting the `ProtectSystem=strict` option. In this case writable | |
249 | service data may be placed on the host file system. Use `StateDirectory=` in | |
250 | the unit files to enable such behaviour and add a local data directory to the | |
251 | services copied onto the host. |