logind: also cache LoaderEntryOneShot EFI variable
With this we are now caching all EFI variables that we expose as
property in logind. Thus a client invoking GetAllProperties() should
only trgger a single read of each variable, but never repeated ones.
The data from this EFI variable is exposed as dbus property, and gdbus
clients are happy to issue GetAllProperties() as if it was free. Hence
make sure it's actually free and cache LoaderConfigTimeoutOneShot, since
it's easy.
Łukasz Stelmach [Wed, 24 Jun 2020 17:24:13 +0000 (19:24 +0200)]
udev: split attribute assignment for MMC cards
Some cards have names consisting only of whitespace characters which
prevents the original rule from matching and assigning ID_SERIAL
properly. With the split rules ID_SERIAL and ID_NAME are assigned
independently and the symlink is created only if both are available the
same way it has worked for partitions.
Luca Boccassi [Tue, 23 Jun 2020 14:56:33 +0000 (15:56 +0100)]
portabled: create temp file for unit, not directory
open_tmpfile_linkable is used to create a temporary file in the same
directory as the target, but portabled uses the name of the parent
directory instead of the file it intends to create.
In other words, it creats a tmp for /etc/systemd/system.attached instead
of /etc/systemd/system.attached/foo.service.
It still works because it's later moved in the right place.
But as a side effect, it tries the create the file in the parent directory
which is /etc/systemd, and it case of read-only filesystems it fails.
journal-file: when individual hash chains grow too large, rotate
Even with the new keyed hash table journal feature: if an attacker
manages to get access to the journal file id it could synthesize records
that result in hash collisions. Let's rotate automatically when we
notice that, so that a new journal file ID is generated, our performance
is restored and the attacker has to guess a new file ID before being
able to trigger the issue again.
That said, untrusted peers should never get access to journal files in
the first case...
journal: use a different hash function for each journal file
This adds a new (incompatible) feature to journal files: if enabled the
hash function used for the hash tables is no longer jenkins hash with a
zero key, but siphash keyed by the file uuid that is included in the
file header anyway. This should make our hash tables more robust against
collision attacks, as long as the attacker has no read access to the
journal files. We switch from jenkins to siphash simply because it's
more well-known and we standardize for the rest of our codebase onto it.
This is hardening in order to make collision attacks harder for clients
that can forge log messages but have no read access to the logs. It has
no effect on clients that have read access.
Let's prefix this with "jenkins_" since it wraps the jenkins hash. We
want to add support for other hash functions to journald soon, hence
better be clear with what this is. In particular as all other symbols
defined by lookup3.h actually are prefixed "jenkins_".
Let's clean this up a bit, following our usual nomenclature to name
return parameters ret-xyz.
This is mostly a bit of renaming, but there's also some minor other
changes: if we return a pointer to a mmap'ed object plus its offset, in
almost all cases we are happy if either parameter is NULL in case the
caller is not interested in it. Let's fix the remaining case to do this
too, to minimize surprises.
The object flags field is a bitmask, hence don't sloppily define
_OBJECT_COMPRESSED_MAX as one mor than the previous flag. That worked OK
as long as we only had two flags, but will fall apart as soon as we have
three. Let's fix this.
(It's kinda sloppy how the string table is built here, as it will be
quite sparse as soon as we have more enum entries, but let's keep it for
now.)
hostnamed: minimize caching of /etc/hostname, /etc/os-release and /etc/machine-info
Instead of reading these files at startup and never again, let's read
them when we need them. As an optimization (in particular as some of
these files contain the data for many fields at once) let's cache the
results as long as the stat data (i.e. mtime) remains stable.
Also, while we are at it, if we can't read any of these props, let's not
fail everything, but continue without the data.
Luca Boccassi [Tue, 2 Jun 2020 14:35:58 +0000 (15:35 +0100)]
dissect/nspawn: add support for dm-verity root hash signature
Since cryptsetup 2.3.0 a new API to verify dm-verity volumes by a
pkcs7 signature, with the public key in the kernel keyring,
is available. Use it if libcryptsetup supports it.
Luca Boccassi [Thu, 4 Jun 2020 16:41:28 +0000 (17:41 +0100)]
veritysetup: add support for dm-verity root hash signature
Since cryptsetup 2.3.0 a new API to verify dm-verity volumes by a
pkcs7 signature, with the public key in the kernel keyring,
is available. Use it if libcryptsetup supports it in the
veritysetup helper binary.
This gets rid of most but not occasions of these loaded terms:
1. scsi_id and friends are something that is supposed to be removed from
our tree (see #7594)
2. The test suite defines an API used by the ubuntu CI. We can remove
this too later, but this needs to be done in sync with the ubuntu CI.
3. In some cases the terms are part of APIs we call or where we expose
concepts the kernel names the way it names them. (In particular all
remaining uses of the word "slave" in our codebase are like this,
it's used by the POSIX PTY layer, by the network subsystem, the mount
API and the block device subsystem). Getting rid of the term in these
contexts would mean doing some major fixes of the kernel ABI first.
Regarding the replacements: when whitelist/blacklist is used as noun we
replace with with allow list/deny list, and when used as verb with
allow-list/deny-list.
message is only valid until message_len, and we need to make sure we're not
reading pass that. Bug introduced in 2108b56749ebb8d17f06d08b6ada2f79ae4f0.
Michal Koutný [Wed, 24 Jun 2020 18:40:02 +0000 (20:40 +0200)]
cgroup: Parse infinity properly for memory protections
This fixes commit db2b8d2e2895010f3443a589c9c1f1dfb25256a6 that
rectified parsing empty values but broke parsing explicit infinity.
Intended parsing semantics will be captured in a testcase in a follow up
commit.
It's just a follow-up to https://github.com/systemd/systemd/pull/16266.
Currently the Coverity stage is failing with
```
Starting container systemd-fedora-latest 2db425228e1addbce607c7e47e492a0faef2c2c4e85701c6c239a50de95944eb
Error: No such container: bash
The command "$CI_MANAGERS/fedora.sh SETUP" failed and exited with 1 during .
Your build has been stopped.
```
Looks like DOCKER_EXEC got lost somewhere along the way, which, in
turn, caused the "coverity" job to fail with
```
$ $DOCKER_EXEC meson cov-build -Dman=false
Command 'meson' not found, but can be installed with:
apt install meson
Please ask your administrator.
```
log: introduce log_parse_environment_cli() and log_setup_cli()
Presently, CLI utilities such as systemctl will check whether they have a tty
attached or not to decide whether to parse /proc/cmdline or EFI variable
SystemdOptions looking for systemd.log_* entries.
But this check will be misleading if these tools are being launched by a
daemon, such as a monitoring daemon or automation service that runs in
background.
Make log handling of CLI tools uniform by never checking /proc/cmdline or EFI
variables to determine the logging level.
Furthermore, introduce a new log_setup_cli() shortcut to set up common options
used by most command-line utilities.
basic/hashmap,set: propagate allocation location info in _copy()
Also use double space before the tracking args at the end. Without
the comma this looks ugly, but it's a bit better with the double space.
At least it doesn't look like a variable with a type.
networkd: take ref immediately after storing item in set
I'm not sure if I understand the code correctly, but it seems that if
storig in the second set failed, we'd return with the first set having
no reference on the link object, and the link object could be freed in the
future, leaving the set with a dangling reference.
core/bpf-firewall: use the correct cleanup function
On error, we'd just free the object, and not close the fd.
While at it, let's use set_ensure_consume() to make sure we don't leak
the object if it was already in the set. I'm not sure if that condition
can be achieved.
This combines set_ensure_allocated() with set_consume(). The cool thing is that
because we know the hash ops, we can correctly free the item if appropriate.
Similarly to set_consume(), the goal is to simplify handling of the case where
the item needs to be freed on error and if already present in the set.
Luca Boccassi [Tue, 23 Jun 2020 10:45:50 +0000 (11:45 +0100)]
make-autosuspend-rules: restore compatibility with Python3 < 3.6
The f'...' format was introduced in Python 3.6 ( https://www.python.org/dev/peps/pep-0498/ )
and returns an error when systemd is built on a system with an older Python3 version:
<...>
File /home/bluca/git/systemd/tools/make-autosuspend-rules.py, line 15
print(f'pci:v{vendor:08X}d{device:08X}*')
^
SyntaxError: invalid syntax
[2/388] Generating version.h with a custom command.
ninja: build stopped: subcommand failed.
$ python3 --version
Python 3.5.6
Use an older format to keep backward compatibility.
Previously we'd used the existance of a specific AF_UNIX socket in the
abstract namespace as lock for disabling lookup recursions. (for
breaking out of the loop: userdb synthesized from nss → nss synthesized
from userdb → userdb synthesized from nss → …)
I did it like that because it promised to work the same both in static
and in dynmically linked environments and is accessible easily from any
programming language.
However, it has a weakness regarding reuse attacks: the socket is
securely hashed (siphash) from the thread ID in combination with the
AT_RANDOM secret. Thus it should not be guessable from an attacker in
advance. That's only true if a thread takes the lock only once and
keeps it forever. However, if a thread takes and releases it multiple
times an attacker might monitor that and quickly take the lock
after the first iteration for follow-up iterations.
It's not a big issue given that userdb (as the primary user for this)
never released the lock and we never made the concept a public
interface, and it was only included in one release so far, but it's
something that deserves fixing. (moreover it's a local DoS only, only
permitting to disable native userdb lookups)
With this rework the libnss_systemd.so.2 module will now export two
additional symbols. These symbols are not used by glibc, but can be used
by arbitrary programs: one can be used to disable nss-systemd, the other
to check if it is currently disabled.
The lock is per-thread. It's slightly less pretty, since it requires
people to manually link against C code via dlopen()/dlsym(), but it
should work safely without the aforementioned weakness.
This just adds a _cleanup_ helper call encapsulating dlclose().
This also means libsystemd-shared is linked against libdl now. I don't
think this is much of an issue, since libdl is part of glibc anyway, and
anything from exotic. It's not an optional part of the OS (think: NSS
requires dynamic linking), hence this pulls in no deps and is almost
certainly loaded into all process' memory anyway.
Revert "cgroup: Allow empty assignments of Memory{Low,Min}="
This reverts commit 53aa85af24cda4470b6750f88e181b775385e228.
The reason is that that patch changes the dbus api to be different than
the types declared by introspection api.