Commit | Line | Data |
---|---|---|
93f59100 LP |
1 | --- |
2 | title: Random Seeds | |
3 | --- | |
4 | ||
5 | # Random Seeds | |
6 | ||
7 | systemd can help in a number of ways with providing reliable, high quality | |
8 | random numbers from early boot on. | |
9 | ||
10 | ## Linux Kernel Entropy Pool | |
11 | ||
12 | Today's computer systems require random number generators for numerous | |
13 | cryptographic and other purposes. On Linux systems, the kernel's entropy pool | |
14 | is typically used as a high-quality source of random numbers. The kernel's | |
15 | entropy pool combines various entropy inputs together, mixes them and provides | |
16 | an API to userspace as well as to internal kernel subsystems to retrieve random | |
17 | data. This entropy pool needs to be initialized with a minimal level of entropy | |
18 | before it can provide high quality, cryptographic random numbers to | |
19 | applications. Until the entropy pool is fully initialized, application requests | |
20 | for high-quality random numbers cannot be fulfilled. | |
21 | ||
22 | The Linux kernel provides three relevant userspace APIs to request random data | |
23 | from the kernel's entropy pool: | |
24 | ||
25 | * The [`getrandom()`](http://man7.org/linux/man-pages/man2/getrandom.2.html) | |
26 | system call with its `flags` parameter set to 0. If invoked the calling | |
27 | program will synchronously block until the random pool is fully initialized | |
28 | and the requested bytes can be provided. | |
29 | ||
30 | * The `getrandom()` system call with its `flags` parameter set to | |
31 | `GRND_NONBLOCK`. If invoked the request for random bytes will fail if the | |
32 | pool is not initialized yet. | |
33 | ||
34 | * Reading from the | |
35 | [`/dev/urandom`](http://man7.org/linux/man-pages/man4/urandom.4.html) | |
36 | pseudo-device will always return random bytes immediately, even if the pool | |
37 | is not initialized. The provided random bytes will be of low quality in this | |
38 | case, however. Moreover, the kernel will log all programs that use this | |
39 | interface in this state and thus potentially rely on an uninitialized | |
40 | entropy pool. | |
41 | ||
42 | (Strictly speaking there are more APIs, for example `/dev/random`, but almost | |
43 | no application should use these, and hence they aren't mentioned here.) | |
44 | ||
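The three interfaces listed above can be sketched as follows. This is a minimal illustration, assuming a Linux system and Python 3.6 or newer, where `os.getrandom()` and `os.GRND_NONBLOCK` wrap the system call; the helper names (`read_blocking`, `read_nonblocking`, `read_urandom`) are made up for this example and are not part of systemd.

```python
import os


def read_blocking(n: int) -> bytes:
    """getrandom() with flags=0: blocks until the pool is initialized."""
    return os.getrandom(n, 0)


def read_nonblocking(n: int):
    """getrandom() with GRND_NONBLOCK: returns None if the pool is not ready."""
    try:
        return os.getrandom(n, os.GRND_NONBLOCK)
    except BlockingIOError:
        # The kernel reports EAGAIN while the pool is still uninitialized.
        return None


def read_urandom(n: int) -> bytes:
    """/dev/urandom: returns immediately, even if the pool is uninitialized."""
    # buffering=0 avoids pulling more bytes from the device than requested.
    with open("/dev/urandom", "rb", buffering=0) as f:
        return f.read(n)
```

On a system that has been up for a while all three calls return immediately; the differences only show during early boot, before the pool is initialized.
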
45 | Note that the time it takes to initialize the random pool may differ between | |
46 | systems. If local hardware random number generators are available, | |
47 | initialization is likely quick, but particularly in embedded and virtualized | |
48 | environments available entropy is small and thus random pool initialization | |
49 | might take a long time (up to tens of minutes!). | |
50 | ||
51 | Modern hardware tends to come with a number of hardware random number | |
52 | generators (hwrng), that may be used to relatively quickly fill up the entropy | |
53 | pool. Specifically: | |
54 | ||
55 | * All recent Intel and AMD CPUs provide the CPU opcode | |
56 | [RDRAND](https://en.wikipedia.org/wiki/RdRand) to acquire random bytes. Linux | |
57 | includes random bytes generated this way in its entropy pool, but didn't use | |
58 | to credit entropy for it (i.e. data from this source wasn't considered good | |
59 | enough to consider the entropy pool properly filled even though it was | |
60 | used). This has changed recently however, and most big distributions have | |
61 | turned on the `CONFIG_RANDOM_TRUST_CPU=y` kernel compile time option. This | |
62 | means systems with CPUs supporting this opcode will be able to very quickly | |
63 | reach the "pool filled" state. | |
64 | ||
65 | * The TPM security chip that is available on all modern desktop systems has a | |
66 | hwrng. It is also fed into the entropy pool, but generally not credited | |
67 | entropy. You may use `rng_core.default_quality=1000` on the kernel command | |
68 | line to change that, but note that this is a global setting affecting all | |
69 | hwrngs. (Yeah, that's weird.) | |
70 | ||
71 | * Many Intel and AMD chipsets have hwrng chips. Their Linux drivers usually | |
72 | don't credit entropy. (But there's `rng_core.default_quality=1000`, see | |
73 | above.) | |
74 | ||
75 | * Various embedded boards have hwrng chips. Some drivers automatically credit | |
76 | entropy, others do not. Some WiFi chips appear to have hwrng sources too, and | |
77 | they usually do not credit entropy for them. | |
78 | ||
79 | * `virtio-rng` is used in virtualized environments and retrieves random data | |
80 | from the VM host. It credits full entropy. | |
81 | ||
82 | * The EFI firmware typically provides a RNG API. When transitioning from UEFI | |
83 | to kernel mode Linux will query some random data through it, and feed it into | |
84 | the pool, but not credit entropy to it. What kind of random source is behind | |
85 | the EFI RNG API is often not entirely clear, but it hopefully is some kind of | |
86 | hardware source. | |
87 | ||
88 | If none of these are available (in fact, even if they are), Linux generates | |
89 | entropy from various non-hwrng sources in various subsystems, all of which | |
90 | ultimately are rooted in IRQ noise, a very "slow" source of entropy, in | |
91 | particular in virtualized environments. | |
92 | ||
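To see which of the hardware sources listed above the kernel actually found on a given machine, something like the following sketch may help. It is not taken from the text above: it assumes the kernel's hwrng sysfs files (`/sys/class/misc/hw_random/rng_available` and `rng_current`) and `/proc/sys/kernel/random/entropy_avail` are present, and it simply reports nothing for files that are missing. The function names are hypothetical.

```python
from pathlib import Path

# Assumed locations; they may be absent on kernels built without hwrng support.
RNG_AVAILABLE = Path("/sys/class/misc/hw_random/rng_available")
RNG_CURRENT = Path("/sys/class/misc/hw_random/rng_current")
ENTROPY_AVAIL = Path("/proc/sys/kernel/random/entropy_avail")


def read_text_or_none(path: Path):
    """Return the stripped file contents, or None if the file is unreadable."""
    try:
        return path.read_text().strip()
    except OSError:
        return None


def hwrng_report() -> dict:
    """Collect the hwrng drivers the kernel knows about and the pool estimate."""
    available = read_text_or_none(RNG_AVAILABLE)
    return {
        "available": available.split() if available else [],
        "current": read_text_or_none(RNG_CURRENT),
        "entropy_avail": read_text_or_none(ENTROPY_AVAIL),
    }


if __name__ == "__main__":
    for key, value in hwrng_report().items():
        print(f"{key}: {value}")
```

Note that an hwrng being listed here says nothing about whether its driver credits entropy; that depends on the driver and on `rng_core.default_quality`, as described above.
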
93 | ## `systemd`'s Use of Random Numbers | |
94 | ||
95 | systemd is responsible for bringing up the OS. It generally runs as the first | |
96 | userspace process the kernel invokes. Because of that it runs at a time when | |
97 | the entropy pool is typically not yet initialized, and thus requests to acquire | |
98 | random bytes will either be delayed, will fail or result in a noisy kernel log | |
99 | message (see above). | |
100 | ||
101 | Various other components run during early boot that require random bytes. For | |
102 | example, initial RAM disks nowadays communicate with encrypted networks or | |
103 | access encrypted storage which might need random numbers. systemd itself | |
104 | requires random numbers as well, including for the following uses: | |
105 | ||
106 | * systemd assigns 'invocation' UUIDs to all services it invokes that uniquely | |
107 | identify each invocation. This is useful to retain a global handle on a specific | |
108 | service invocation and relate it to other data. For example, log data | |
109 | collected by the journal usually includes the invocation UUID and thus the | |
110 | runtime context the service manager maintains can be neatly matched up with | |
111 | the log data a specific service invocation generated. systemd also | |
112 | initializes `/etc/machine-id` with a randomized UUID. (systemd also makes use | |
113 | of the randomized "boot id" the kernel exposes in | |
114 | `/proc/sys/kernel/random/boot_id`). These UUIDs are exclusively Type 4 UUIDs, | |
115 | i.e. randomly generated ones. | |
116 | ||
117 | * systemd maintains various hash tables internally. In order to harden them | |
118 | against [collision | |
119 | attacks](https://rt.perl.org/Public/Bug/Display.html?CSRF_Token=165691af9ddaa95f653402f1b68de728) | |
120 | they are seeded with random numbers. | |
121 | ||
122 | * At various places systemd needs random bytes for temporary file name | |
123 | generation, UID allocation randomization, and similar. | |
124 | ||
125 | * systemd-resolved and systemd-networkd use random number generators to harden | |
126 | the protocols they implement against packet forgery. | |
127 | ||
128 | * systemd-udevd and systemd-nspawn can generate randomized MAC addresses for | |
129 | network devices. | |
130 | ||
131 | Note that these cases generally do not require a cryptographic-grade random | |
132 | number generator, as most of these utilize random numbers to minimize risk of | |
133 | collision and not to generate secret key material. However, they usually do | |
134 | require "medium-grade" random data. For example: systemd's hash-maps are | |
135 | reseeded if they grow beyond certain thresholds (and thus collisions are more | |
136 | likely). This means they are generally fine with low-quality (even constant) | |
137 | random numbers initially as long as they get better with time, so that | |
138 | collision attacks are eventually thwarted as better, non-guessable seeds are | |
139 | acquired. | |
140 | ||
141 | ## Keeping `systemd`'s Demand on the Kernel Entropy Pool Minimal | |
142 | ||
143 | Since most of systemd's own uses of random numbers do not require | |
144 | cryptographic-grade RNGs, it tries to avoid reading entropy from the kernel | |
145 | entropy pool if possible. If it succeeds this has the benefit that there's no | |
146 | need to delay the early boot process until entropy is available, and noisy | |
147 | kernel log messages about early reading from `/dev/urandom` are avoided | |
148 | too. Specifically: | |
149 | ||
150 | 1. When generating [Type 4 | |
151 | UUIDs](https://en.wikipedia.org/wiki/Universally_unique_identifier#Version_4_\(random\)), | |
152 | systemd tries to use Intel's and AMD's RDRAND CPU opcode directly, if | |
153 | available. While some doubt the quality and trustworthiness of the entropy | |
154 | provided by these opcodes, they should be good enough for generating UUIDs, | |
155 | if not key material (though, as mentioned, today's big distributions opted | |
156 | to trust it for that too, now, see above — but we are not going to make that | |
157 | decision for you, and for anything key material related will only use the | |
158 | kernel's entropy pool). If RDRAND is not available or doesn't work, it will | |
159 | use synchronous `getrandom()` as fallback, and `/dev/urandom` on old kernels | |
160 | where that system call doesn't exist yet. This means on non-Intel/AMD | |
161 | systems UUID generation will block on kernel entropy initialization. | |
162 | ||
163 | 2. For seeding hash tables, and all the other similar purposes systemd first | |
164 | tries RDRAND, and if that's not available will try to use asynchronous | |
165 | `getrandom()` (if the kernel doesn't support this system call, | |
166 | `/dev/urandom` is used). This may fail too in case the pool is not | |
167 | initialized yet, in which case it will fall back to glibc's internal rand() | |
168 | calls, i.e. weak pseudo-random numbers. This should make sure we use good | |
169 | random bytes if we can, but neither delay boot nor trigger noisy kernel log | |
170 | messages during early boot for these use-cases. | |
171 | ||
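The two strategies above can be sketched roughly as follows. This is an illustration under stated assumptions, not systemd's implementation: systemd is written in C and tries RDRAND first, which plain Python cannot access, so that step is left out here. The function names are hypothetical, and Python's `random` module stands in for glibc's `rand()` as the weak fallback.

```python
import os
import random
import uuid


def random_bytes_blocking(n: int) -> bytes:
    """Item 1 (UUIDs): synchronous getrandom(), waits for pool initialization."""
    return os.getrandom(n, 0)


def random_bytes_best_effort(n: int) -> bytes:
    """Item 2 (hash table seeds etc.): never block, accept weak bytes if needed."""
    try:
        return os.getrandom(n, os.GRND_NONBLOCK)
    except (OSError, AttributeError):
        # Pool not initialized yet (BlockingIOError is an OSError), or no
        # getrandom() on this platform: fall back to weak pseudo-random bytes.
        return bytes(random.getrandbits(8) for _ in range(n))


def make_uuid4(raw: bytes) -> uuid.UUID:
    """Turn 16 random bytes into a Type 4 UUID (sets version and variant bits)."""
    return uuid.UUID(bytes=raw, version=4)


if __name__ == "__main__":
    print("invocation-style UUID:", make_uuid4(random_bytes_blocking(16)))
    print("hash table seed:", random_bytes_best_effort(16).hex())
```

The point of the split is that only the first path can ever delay boot, while the second trades quality for availability and relies on reseeding later.
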
172 | ## `systemd`'s Support for Filling the Kernel Entropy Pool | |
173 | ||
174 | systemd has various provisions to ensure that the kernel entropy pool is filled | |
175 | up quickly during boot. | |
176 | ||
177 | 1. When systemd's PID 1 detects it runs in a virtualized environment providing | |
178 | the `virtio-rng` interface it will load the necessary kernel modules to make | |
179 | use of it during earliest boot, if possible — much earlier than regular | |
180 | kernel module loading done by `systemd-udevd.service`. This should ensure | |
181 | that in VM environments the entropy pool is quickly filled, even before | |
182 | systemd invokes the first service process — as long as the VM environment | |
183 | provides virtualized RNG hardware (and VM environments really should!). | |
184 | ||
185 | 2. The | |
186 | [`systemd-random-seed.service`](https://www.freedesktop.org/software/systemd/man/systemd-random-seed.service.html) | |
187 | system service will load a random seed from `/var/lib/systemd/random-seed` | |
188 | into the kernel entropy pool. By default it does not credit entropy for it | |
189 | though, since the seed is — more often than not — not reset when 'golden' | |
190 | master images of an OS are created, and thus replicated into every | |
191 | installation. If OS image builders carefully reset the random seed file | |
192 | before generating the image it should be safe to credit entropy, which can | |
d35c7741 LP |
193 | be enabled by setting the `$SYSTEMD_RANDOM_SEED_CREDIT` environment variable |
194 | for the service to `1` (or even `force`, see man page). Note however, that | |
195 | this service typically runs relatively late during early boot: long after | |
196 | the initial RAM disk (`initrd`) completed, and after the `/var/` file system | |
197 | became writable. This is usually too late for many applications; it is hence | |
198 | not advised to rely exclusively on this functionality to seed the kernel's | |
93f59100 LP |
199 | entropy pool. Also note that this service synchronously waits until the |
200 | kernel's entropy pool is initialized before completing start-up. It may thus | |
201 | be used by other services as synchronization point to order against, if they | |
202 | require an initialized entropy pool to operate correctly. | |
203 | ||
204 | 3. The | |
205 | [`systemd-boot`](https://www.freedesktop.org/software/systemd/man/systemd-boot.html) | |
206 | EFI boot loader included in systemd is able to maintain and provide a random | |
207 | seed stored in the EFI System Partition (ESP) to the booted OS, which allows | |
208 | booting up with a fully initialized entropy pool from earliest boot | |
209 | on. During installation of the boot loader (or when invoking [`bootctl | |
210 | random-seed`](https://www.freedesktop.org/software/systemd/man/bootctl.html#random-seed)) | |
211 | a seed file with an initial seed is placed in a file `/loader/random-seed` | |
212 | in the ESP. In addition, an identically sized randomized EFI variable called | |
213 | the 'system token' is set, which is written to the machine's firmware | |
214 | NVRAM. During boot, when `systemd-boot` finds both the random seed file and | |
215 | the system token they are combined and hashed with SHA256 (in counter mode, | |
216 | to generate sufficient data), to generate a new random seed file to store in | |
217 | the ESP as well as a random seed to pass to the OS kernel. The new random | |
218 | seed file for the ESP is then written to the ESP, ensuring this is completed | |
219 | before the OS is invoked. Very early during initialization PID 1 will read | |
220 | the random seed provided in the EFI variable and credit it fully to the | |
221 | kernel's entropy pool. | |
222 | ||
223 | This mechanism is able to safely provide an initialized entropy pool already | |
224 | in the `initrd` and guarantees that different seeds are passed from the boot | |
225 | loader to the OS on every boot (in a way that does not allow regeneration of | |
226 | an old seed file from a new seed file). Moreover, when an OS image is | |
227 | replicated between multiple images and the random seed is not reset, this | |
228 | will still result in different random seeds being passed to the OS, as the | |
229 | per-machine 'system token' is specific to the physical host, and not | |
230 | included in OS disk images. If the 'system token' is properly initialized | |
231 | and kept sufficiently secret it should not be possible to regenerate the | |
232 | entropy pool of different machines, even if this seed is the only source of | |
233 | entropy. | |
234 | ||
235 | Note that the writes to the ESP needed to maintain the random seed should be | |
236 | minimal. The size of the random seed file is directly derived from the Linux | |
237 | kernel's entropy pool size, which defaults to 512 bytes. This means updating | |
238 | the random seed in the ESP should be doable safely with a single sector | |
239 | write (since hard-disk sectors typically happen to be 512 bytes long, too), | |
240 | which should be safe even with FAT file system drivers built into | |
241 | low-quality EFI firmwares. | |
242 | ||
243 | As a special restriction: in virtualized environments PID 1 will refrain | |
244 | from using this mechanism, for safety reasons. This is because on VM | |
245 | environments the EFI variable space and the disk space are generally not | |
246 | maintained physically separate (for example, `qemu` in EFI mode stores the | |
247 | variables in the ESP itself). The robustness towards sloppy OS image | |
248 | generation is the main purpose of maintaining the 'system token' however, | |
249 | and if the EFI variable storage is not kept physically separate from the OS | |
250 | image there's no point in it. That said, OS builders that know that they are | |
251 | not going to replicate the built image on multiple systems may opt to turn | |
252 | off the 'system token' concept by setting `random-seed-mode always` in the | |
253 | ESP's | |
254 | [`/loader/loader.conf`](https://www.freedesktop.org/software/systemd/man/loader.conf.html) | |
255 | file. If done, `systemd-boot` will use the random seed file even if no | |
256 | system token is found in EFI variables. | |
257 | ||
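The seed update performed by the boot loader, as described in item 3 above, can be sketched like this. The text only states that the seed file and the system token are combined and hashed with SHA256 in counter mode; the concatenation order, the 8-byte counter encoding and the `b"file"`/`b"os"` labels below are assumptions made for illustration and do not reflect the actual byte layout `systemd-boot` uses.

```python
import hashlib


def sha256_counter(key: bytes, label: bytes, n: int) -> bytes:
    """Expand `key` to `n` bytes by hashing it with an incrementing counter."""
    out = b""
    counter = 0
    while len(out) < n:
        block = hashlib.sha256(label + counter.to_bytes(8, "little") + key)
        out += block.digest()
        counter += 1
    return out[:n]


def derive_seeds(old_seed: bytes, system_token: bytes):
    """Return (new seed file contents, seed to hand to the OS)."""
    combined = old_seed + system_token
    size = len(old_seed)
    # Distinct labels keep the two outputs unrelated to each other, so the
    # seed written back to disk does not reveal the seed passed to the OS.
    new_file_seed = sha256_counter(combined, b"file", size)
    os_seed = sha256_counter(combined, b"os", size)
    return new_file_seed, os_seed


if __name__ == "__main__":
    new_file, for_os = derive_seeds(bytes(512), b"example-token")
    print(len(new_file), len(for_os))
```

Because the hash is one-way, an old seed file cannot be regenerated from a new one, and two machines booting the same image with different system tokens end up with different seeds.
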
258 | With the three mechanisms described above it should be possible to provide | |
259 | early-boot entropy in most cases. Specifically: | |
260 | ||
261 | 1. On EFI systems, `systemd-boot`'s random seed logic should make sure good | |
262 | entropy is available during earliest boot — as long as `systemd-boot` is | |
263 | used as boot loader, and outside of virtualized environments. | |
264 | ||
265 | 2. On virtualized systems, the early `virtio-rng` hookup should ensure entropy | |
266 | is available early on — as long as the VM environment provides virtualized | |
267 | RNG devices, which they really should all do in 2019. Complain to your | |
268 | hosting provider if they don't. | |
269 | ||
270 | 3. On Intel/AMD systems systemd's own reliance on the kernel entropy pool is | |
271 | minimal (as RDRAND is used on those for UUID generation). This only works if | |
272 | the CPU has RDRAND of course, which most physical CPUs do (but I hear many | |
273 | virtualized CPUs do not. Pity.) | |
274 | ||
275 | 4. In all other cases, `systemd-random-seed.service` will help a bit, but — as | |
276 | mentioned — is too late to help with early boot. | |
277 | ||
278 | This primarily leaves two kinds of systems in the cold: | |
279 | ||
280 | 1. Some embedded systems. Many embedded chipsets have hwrng functionality these | |
281 | days. Consider using them while crediting | |
282 | entropy (i.e. `rng_core.default_quality=1000` on the kernel command line is | |
283 | your friend). Or accept that the system might take a bit longer to | |
284 | boot. Alternatively, consider implementing a solution similar to | |
285 | systemd-boot's random seed concept in your platform's boot loader. | |
286 | ||
287 | 2. Virtualized environments that lack both virtio-rng and RDRAND. Tough | |
288 | luck. Talk to your hosting provider, and ask them to fix this. | |
289 | ||
290 | 3. Also note: if you deploy an image without any random seed and/or without | |
291 | installing any 'system token' in an EFI variable, as described above, this | |
292 | means that on the first boot no seed can be passed to the OS | |
293 | either. However, as the boot completes (with entropy acquired elsewhere), | |
294 | systemd will automatically install both a random seed in the ESP and a | |
295 | 'system token' in the EFI variable space, so that any future boots will have | |
296 | entropy from earliest boot on — all provided `systemd-boot` is used. | |
297 | ||
298 | ## Frequently Asked Questions | |
299 | ||
300 | 1. *Why don't you just use getrandom()? That's all you need!* | |
301 | ||
302 | Did you read any of the above? getrandom() is hooked to the kernel entropy | |
303 | pool, and during early boot it's not going to be filled yet, very likely. We | |
304 | do use it in many cases, but not in all. Please read the above again! | |
305 | ||
306 | 2. *Why don't you use | |
307 | [getentropy()](http://man7.org/linux/man-pages/man3/getentropy.3.html)? That's | |
308 | all you need!* | |
309 | ||
310 | Same story. That call is just a different name for `getrandom()` with | |
311 | `flags` set to zero, and some additional limitations, and thus it also needs | |
312 | the kernel's entropy pool to be initialized, which is the whole problem we | |
313 | are trying to address here. | |
314 | ||
315 | 3. *Why don't you generate your UUIDs with | |
316 | [`uuidd`](http://man7.org/linux/man-pages/man8/uuidd.8.html)? That's all you | |
317 | need!* | |
318 | ||
319 | First of all, that's a system service, i.e. something that runs as "payload" | |
320 | of systemd, long after systemd is already up and hence can't provide us | |
321 | UUIDs during earliest boot yet. Don't forget: to assign the invocation UUID | |
322 | for the `uuidd.service` start we already need a UUID that the service is | |
323 | supposed to provide us. More importantly though, `uuidd` needs state/a random | |
324 | seed/a MAC address/host ID to operate, all of which are not available during | |
325 | early boot. | |
326 | ||
327 | 4. *Why don't you generate your UUIDs with `/proc/sys/kernel/random/uuid`? | |
328 | That's all you need!* | |
329 | ||
330 | This is just a different, more limited interface to `/dev/urandom`. It gains | |
331 | us nothing. | |
332 | ||
333 | 5. *Why don't you use [`rngd`](https://github.com/nhorman/rng-tools), | |
334 | [`haveged`](http://www.issihosts.com/haveged/), | |
335 | [`egd`](http://egd.sourceforge.net/)? That's all you need!* | |
336 | ||
337 | Like `uuidd` above these are system services, hence come too late for our | |
338 | use-case. In addition much of what `rngd` provides appears to be equivalent | |
339 | to `CONFIG_RANDOM_TRUST_CPU=y` or `rng_core.default_quality=1000`, except | |
340 | being more complex and involving userspace. These services partly measure | |
341 | system behavior (such as scheduling effects) which the kernel either | |
342 | already feeds into its pool anyway (and thus shouldn't be fed into it a | |
343 | second time, crediting entropy for it a second time) or is at least | |
344 | something the kernel could much better do on its own. Hence, if what these | |
345 | daemons do is still desirable today, this would be much better implemented | |
346 | in kernel (which would be very welcome of course, but wouldn't really help | |
347 | us here in our specific problem, see above). | |
348 | ||
349 | 6. *Why don't you use [`arc4random()`](https://man.openbsd.org/arc4random.3)? | |
350 | That's all you need!* | |
351 | ||
352 | This doesn't solve the issue, since it requires a nonce to start from, and | |
353 | it gets that from `getrandom()`, and thus we have to wait for random pool | |
354 | initialization the same way as calling `getrandom()` | |
355 | directly. `arc4random()` is nothing more than an optimization; in fact it | |
356 | implements algorithms similar to those the kernel entropy pool implements | |
357 | anyway, hence besides being able to provide random bytes with higher | |
358 | throughput there's little it gets us over just using `getrandom()`. Also, | |
359 | it's not supported by glibc. And as long as that's the case we are not keen | |
360 | on using it, as we'd have to maintain that on our own, and we don't want to | |
361 | maintain our own cryptographic primitives if we don't have to. Since | |
362 | systemd's uses are not performance relevant (besides the pool initialization | |
363 | delay, which this doesn't solve), there's hence little benefit for us to | |
364 | call these functions. That said, if glibc learns these APIs one day, we'll | |
365 | certainly make use of them where appropriate. | |
366 | ||
367 | 7. *This is boring: NetBSD had [boot loader entropy seed | |
368 | support](https://netbsd.gw.com/cgi-bin/man-cgi?boot+8) since ages!* | |
369 | ||
370 | Yes, NetBSD has that, and the above is inspired by that (note though: this | |
371 | article is about a lot more than that). NetBSD's support is not really safe, | |
372 | since it neither updates the random seed before using it, nor has any | |
373 | safeguards against replicating the same disk image with its random seed on | |
374 | multiple machines (which the 'system token' mentioned above is supposed to | |
375 | address). This means reuse of the same random seed by the boot loader is | |
376 | much more likely. | |
377 | ||
378 | 8. *Why does PID 1 upload the boot loader provided random seed into kernel | |
379 | instead of kernel doing that on its own?* | |
380 | ||
381 | That's a good question. Ideally the kernel would do that on its own, and we | |
382 | wouldn't have to involve userspace in this. | |
383 | ||
384 | 9. *What about non-EFI?* | |
385 | ||
386 | The boot loader random seed logic described above uses EFI variables to pass | |
387 | the seed from the boot loader to the OS. Other systems might have similar | |
388 | functionality though, and it shouldn't be too hard to implement something | |
389 | similar for them. Ideally, we'd have an official way to pass such a seed as | |
390 | part of the `struct boot_params` from the boot loader to the kernel, but | |
391 | this is currently not available. | |
392 | ||
393 | 10. *I use a different boot loader than `systemd-boot`, I'd like to use boot | |
394 | loader random seeds too!* | |
395 | ||
396 | Well, consider just switching to `systemd-boot`, it's worth it. See | |
397 | [systemd-boot(7)](https://www.freedesktop.org/software/systemd/man/systemd-boot.html) | |
398 | for an introduction why. That said, any boot loader can re-implement the | |
399 | logic described above, and can pass a random seed that systemd as PID 1 | |
400 | will then upload into the kernel's entropy pool. For details see the [Boot | |
401 | Loader Interface](https://systemd.io/BOOT_LOADER_INTERFACE) documentation. | |
402 | ||
403 | 11. *Why not pass the boot loader random seed via kernel command line instead | |
404 | of as EFI variable?* | |
405 | ||
406 | The kernel command line is accessible to unprivileged processes via | |
407 | `/proc/cmdline`. It's not desirable if unprivileged processes can use this | |
408 | information to possibly gain too much information about the current state | |
409 | of the kernel's entropy pool. | |
410 | ||
411 | 12. *Why doesn't `systemd-boot` rewrite the 'system token' too each time | |
412 | when updating the random seed file stored in the ESP?* | |
413 | ||
414 | The system token is stored as a persistent EFI variable, i.e. in some form of | |
415 | NVRAM. These memory chips tend to be of low quality in many machines, and | |
416 | hence we shouldn't write them too often. Writing them once during | |
417 | installation should generally be OK, but rewriting them on every single | |
418 | boot would probably wear the chip out too much, and we shouldn't risk that. |