## Keeping `systemd`'s Demand on the Kernel Entropy Pool Minimal
Since most of systemd's own uses of random numbers do not require
-cryptographic-grade RNGs, it tries to avoid reading entropy from the kernel
-entropy pool if possible. If it succeeds this has the benefit that there's no
-need to delay the early boot process until entropy is available, and noisy
-kernel log messages about early reading from `/dev/urandom` are avoided
-too. Specifically:
-
-1. When generating [Type 4
- UUIDs](https://en.wikipedia.org/wiki/Universally_unique_identifier#Version_4_\(random\)),
- systemd tries to use Intel's and AMD's RDRAND CPU opcode directly, if
- available. While some doubt the quality and trustworthiness of the entropy
- provided by these opcodes, they should be good enough for generating UUIDs,
- if not key material (though, as mentioned, today's big distributions opted
- to trust it for that too, now, see above — but we are not going to make that
- decision for you, and for anything key material related will only use the
- kernel's entropy pool). If RDRAND is not available or doesn't work, it will
- use synchronous `getrandom()` as fallback, and `/dev/urandom` on old kernels
- where that system call doesn't exist yet. This means on non-Intel/AMD
- systems UUID generation will block on kernel entropy initialization.
-
-2. For seeding hash tables, and all the other similar purposes systemd first
- tries RDRAND, and if that's not available will try to use asynchronous
- `getrandom()` (if the kernel doesn't support this system call,
- `/dev/urandom` is used). This may fail too in case the pool is not
- initialized yet, in which case it will fall back to glibc's internal rand()
- calls, i.e. weak pseudo-random numbers. This should make sure we use good
- random bytes if we can, but neither delay boot nor trigger noisy kernel log
- messages during early boot for these use-cases.
+cryptographic-grade RNGs, it tries to avoid blocking reads to the kernel's RNG,
+opting instead for using `getrandom(GRND_INSECURE)`. After the pool is
+initialized, this is identical to `getrandom(0)`, returning cryptographically
+secure random numbers, but before it's initialized it has the nice effect of
+not blocking system boot.
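
As an illustration of the behavior described in the added paragraph, here is a minimal sketch (not systemd's actual implementation) of a non-blocking read that prefers `getrandom(GRND_INSECURE)` and falls back to `/dev/urandom` when the kernel rejects the flag or lacks the system call. The helper name `nonblocking_random_bytes()` is hypothetical. The `GRND_INSECURE` fallback definition assumes the Linux UAPI value `0x0004`, for builds against headers older than Linux 5.6.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/random.h>
#include <sys/types.h>
#include <unistd.h>

/* GRND_INSECURE appeared in Linux 5.6; older headers may not define it. */
#ifndef GRND_INSECURE
#define GRND_INSECURE 0x0004
#endif

/* Hypothetical helper (illustration only): fill buf with n random bytes
 * without waiting for the kernel pool to be initialized.
 * Returns 0 on success, a negative errno-style value on failure. */
int nonblocking_random_bytes(void *buf, size_t n) {
        uint8_t *p = buf;
        int fd;

        /* Preferred path: never blocks, even before the pool is initialized. */
        while (n > 0) {
                ssize_t l = getrandom(p, n, GRND_INSECURE);
                if (l > 0) {
                        p += l;
                        n -= (size_t) l;
                        continue;
                }
                if (l < 0 && errno == EINTR)
                        continue;
                break; /* EINVAL (flag unknown), ENOSYS, ...: use the fallback */
        }
        if (n == 0)
                return 0;

        /* Fallback: /dev/urandom does not block either, but may log early reads. */
        fd = open("/dev/urandom", O_RDONLY|O_CLOEXEC|O_NOCTTY);
        if (fd < 0)
                return -errno;

        while (n > 0) {
                ssize_t l = read(fd, p, n);
                if (l < 0 && errno == EINTR)
                        continue;
                if (l <= 0) {
                        int e = l < 0 ? errno : EIO;
                        close(fd);
                        return -e;
                }
                p += l;
                n -= (size_t) l;
        }

        close(fd);
        return 0;
}
```

A caller would pass a buffer and its size, for example `nonblocking_random_bytes(seed, sizeof(seed))`, and treat a negative return value as "no kernel randomness available". Bytes obtained this way before the pool is initialized are not suitable for key material.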
## `systemd`'s Support for Filling the Kernel Entropy Pool
hosting provider if they don't. For VMs used in testing environments,
`systemd.random_seed=` may be used as an alternative to a virtualized RNG.
-3. On Intel/AMD systems systemd's own reliance on the kernel entropy pool is
- minimal (as RDRAND is used on those for UUID generation). This only works if
- the CPU has RDRAND of course, which most physical CPUs do (but I hear many
- virtualized CPUs do not. Pity.)
+3. In general, systemd's own reliance on the kernel entropy pool is minimal
+ (due to the use of `GRND_INSECURE`).
4. In all other cases, `systemd-random-seed.service` will help a bit, but — as
mentioned — is too late to help with early boot.
static bool srand_called = false;
-int rdrand(unsigned long *ret) {
-
- /* So, you are a "security researcher", and you wonder why we bother with using raw RDRAND here,
- * instead of sticking to /dev/urandom or getrandom()?
- *
- * Here's why: early boot. On Linux, during early boot the random pool that backs /dev/urandom and
- * getrandom() is generally not initialized yet. It is very common that initialization of the random
- * pool takes a longer time (up to many minutes), in particular on embedded devices that have no
- * explicit hardware random generator, as well as in virtualized environments such as major cloud
- * installations that do not provide virtio-rng or a similar mechanism.
- *
- * In such an environment using getrandom() synchronously means we'd block the entire system boot-up
- * until the pool is initialized, i.e. *very* long. Using getrandom() asynchronously (GRND_NONBLOCK)
- * would mean acquiring randomness during early boot would simply fail. Using /dev/urandom would mean
- * generating many kmsg log messages about our use of it before the random pool is properly
- * initialized. Neither of these outcomes is desirable.
- *
- * Thus, for very specific purposes we use RDRAND instead of either of these three options. RDRAND
- * provides us quickly and relatively reliably with random values, without having to delay boot,
- * without triggering warning messages in kmsg.
- *
- * Note that we use RDRAND only under very specific circumstances, when the requirements on the
- * quality of the returned entropy permit it. Specifically, here are some cases where we *do* use
- * RDRAND:
- *
- * • UUID generation: UUIDs are supposed to be universally unique but are not cryptographic
- * key material. The quality and trust level of RDRAND should hence be OK: UUIDs should be
- * generated in a way that is reliably unique, but they do not require ultimate trust into
- * the entropy generator. systemd generates a number of UUIDs during early boot, including
- * 'invocation IDs' for every unit spawned that identify the specific invocation of the
- * service globally, and a number of others. Other alternatives for generating these UUIDs
- * have been considered, but don't really work: for example, hashing uuids from a local
- * system identifier combined with a counter falls flat because during early boot disk
- * storage is not yet available (think: initrd) and thus a system-specific ID cannot be
- * stored or retrieved yet.
- *
- * • Hash table seed generation: systemd uses many hash tables internally. Hash tables are
- * generally assumed to have O(1) access complexity, but can deteriorate to prohibitive
- * O(n) access complexity if an attacker manages to trigger a large number of hash
- * collisions. Thus, systemd (as any software employing hash tables should) uses seeded
- * hash functions for its hash tables, with a seed generated randomly. The hash tables
- * systemd employs watch the fill level closely and reseed if necessary. This allows use of
- * a low quality RNG initially, as long as it improves should a hash table be under attack:
- * the attacker after all needs to trigger many collisions to exploit it for the purpose
- * of DoS, but if doing so improves the seed the attack surface is reduced as the attack
- * takes place.
- *
- * Some cases where we do NOT use RDRAND are:
- *
- * • Generation of cryptographic key material 🔑
- *
- * • Generation of cryptographic salt values 🧂
- *
- * This function returns:
- *
- * -EOPNOTSUPP → RDRAND is not available on this system 😔
- * -EAGAIN → The operation failed this time, but is likely to work if you try again a few
- * times ♻
- * -EUCLEAN → We got some random value, but it looked strange, so we refused using it.
- * This failure might or might not be temporary. 😕
- */
-
-#if defined(__i386__) || defined(__x86_64__)
- static int have_rdrand = -1;
- unsigned long v;
- uint8_t success;
-
- if (have_rdrand < 0) {
- uint32_t eax, ebx, ecx, edx;
-
- /* Check if RDRAND is supported by the CPU */
- if (__get_cpuid(1, &eax, &ebx, &ecx, &edx) == 0) {
- have_rdrand = false;
- return -EOPNOTSUPP;
- }
-
-/* Compat with old gcc where bit_RDRND didn't exist yet */
-#ifndef bit_RDRND
-#define bit_RDRND (1U << 30)
-#endif
-
- have_rdrand = !!(ecx & bit_RDRND);
-
- if (have_rdrand > 0) {
- /* Allow disabling use of RDRAND with SYSTEMD_RDRAND=0
- If it is unset getenv_bool_secure will return a negative value. */
- if (getenv_bool_secure("SYSTEMD_RDRAND") == 0) {
- have_rdrand = false;
- return -EOPNOTSUPP;
- }
- }
- }
-
- if (have_rdrand == 0)
- return -EOPNOTSUPP;
-
- asm volatile("rdrand %0;"
- "setc %1"
- : "=r" (v),
- "=qm" (success));
- msan_unpoison(&success, sizeof(success));
- if (!success)
- return -EAGAIN;
-
- /* Apparently on some AMD CPUs RDRAND will sometimes (after a suspend/resume cycle?) report success
- * via the carry flag but nonetheless return the same fixed value -1 in all cases. This appears to be
- * a bad bug in the CPU or firmware. Let's deal with that and work-around this by explicitly checking
- * for this special value (and also 0, just to be sure) and filtering it out. This is a work-around
- * only however and something AMD really should fix properly. The Linux kernel should probably work
- * around this issue by turning off RDRAND altogether on those CPUs. See:
- * https://github.com/systemd/systemd/issues/11810 */
- if (v == 0 || v == ULONG_MAX)
- return log_debug_errno(SYNTHETIC_ERRNO(EUCLEAN),
- "RDRAND returned suspicious value %lx, assuming bad hardware RNG, not using value.", v);
-
- *ret = v;
- return 0;
-#else
- return -EOPNOTSUPP;
-#endif
-}
-
int genuine_random_bytes(void *p, size_t n, RandomFlags flags) {
static int have_syscall = -1;
_cleanup_close_ int fd = -1;
- if (FLAGS_SET(flags, RANDOM_BLOCK | RANDOM_ALLOW_RDRAND))
- return -EINVAL;
-
- /* Gathers some high-quality randomness from the kernel (or potentially mid-quality randomness from
- * the CPU if the RANDOM_ALLOW_RDRAND flag is set). This call won't block, unless the RANDOM_BLOCK
+ /* Gathers some high-quality randomness from the kernel. This call won't block, unless the RANDOM_BLOCK
* flag is set. If it doesn't block, it will still always return some data from the kernel, regardless
* of whether the random pool is fully initialized or not. When creating cryptographic key material you
* should always use RANDOM_BLOCK. */
}
}
- if (FLAGS_SET(flags, RANDOM_ALLOW_RDRAND)) {
- /* Try x86-64' RDRAND intrinsic if we have it. We only use it if high quality randomness is
- * not required, as we don't trust it (who does?). Note that we only do a single iteration of
- * RDRAND here, even though the Intel docs suggest calling this in a tight loop of 10
- * invocations or so. That's because we don't really care about the quality here. We
- * generally prefer using RDRAND if the caller allows us to, since this way we won't upset
- * the kernel's random subsystem by accessing it before the pool is initialized (after all it
- * will kmsg log about every attempt to do so). */
- for (;;) {
- unsigned long u;
- size_t m;
-
- if (rdrand(&u) < 0) {
- /* OK, this didn't work, let's go with /dev/urandom instead */
- break;
- }
-
- m = MIN(sizeof(u), n);
- memcpy(p, &u, m);
-
- p = (uint8_t*) p + m;
- n -= m;
-
- if (n == 0)
- return 0; /* Yay, success! */
- }
- }
-
fd = open("/dev/urandom", O_RDONLY|O_CLOEXEC|O_NOCTTY);
if (fd < 0)
return errno == ENOENT ? -ENOSYS : -errno;
#if HAVE_SYS_AUXV_H
const void *auxv;
#endif
- unsigned long k;
-
if (srand_called)
return;
x ^= (unsigned) now(CLOCK_REALTIME);
x ^= (unsigned) gettid();
- if (rdrand(&k) >= 0)
- x ^= (unsigned) k;
-
srand(x);
srand_called = true;
*
* What this function will do:
*
- * • This function will preferably use the CPU's RDRAND operation, if it is available, in
- * order to return "mid-quality" random values cheaply.
+ * • Use getrandom(GRND_INSECURE) or /dev/urandom, to return high-quality random values if
+ * they are cheaply available, or less high-quality random values if they are not.
*
* • This function will return pseudo-random data, generated via libc rand() if nothing
* better is available.
* This function is hence not useful for generating UUIDs or cryptographic key material.
*/
- if (genuine_random_bytes(p, n, RANDOM_ALLOW_RDRAND) >= 0)
+ if (genuine_random_bytes(p, n, 0) >= 0)
return;
/* If for some reason some user made /dev/urandom unavailable to us, or the kernel has no entropy, use a PRNG instead. */