Susant Sahani [Sun, 21 Jun 2020 11:17:34 +0000 (11:17 +0000)]
network: Introduce SR-IOV
SR-IOV provides the ability to partition a single physical PCI
resource into virtual PCI functions which can then be injected in
to a VM. In the case of network VFs, SR-IOV improves north-south n
etwork performance (that is, traffic with endpoints outside the
host machine) by allowing traffic to bypass the host machine’s network stack.
All devices behind a SPI controller have the same udev ID_PATH property.
This is a problem for predicable network names for CAN controllers.
CAN controllers, in contrast to Ethernet controllers, don't have a MAC
Address, so there's no way to tell two CAN controllers on the same SPI
host controller apart:
Luca Boccassi [Tue, 16 Jun 2020 17:46:55 +0000 (18:46 +0100)]
core: store timestamps of unit load attempts
When the system is under heavy load, it can happen that the unit cache
is refreshed for an unrelated reason (in the test I simulate this by
attempting to start a non-existing unit). The new unit is found and
accounted for in the cache, but it's ignored since we are loading
something else.
When we actually look for it, by attempting to start it, the cache is
up to date so no refresh happens, and starting fails although we have
it loaded in the cache.
When the unit state is set to UNIT_NOT_FOUND, mark the timestamp in
u->fragment_loadtime. Then when attempting to load again we can check
both if the cache itself needs a refresh, OR if it was refreshed AFTER
the last failed attempt that resulted in the state being
UNIT_NOT_FOUND.
Update the test so that this issue reproduces more often.
Frantisek Sumsal [Sun, 28 Jun 2020 16:53:28 +0000 (18:53 +0200)]
test: bump the timeout for systemd-hwdb-update.service under ASan
Since the hwdb update from a79be2f80777eb80e0d8177f6bccd7615de7ec1a
the systemd-hwdb-update service started timing out under ASan when
compiled with gcc, as we started tripping over the 3 minutes timeout.
This affects only gcc runs, since the current gcc on Arch still suffers
from the detect_stack_use_after_return performance penalty[0]. Until
the fixed gcc is present in the respective repositories, let's bump
the timeout to 4 minutes, as we might not be able to upgrade right
away, due to systemd/systemd#16199.
../src/shared/efi-loader.c:738:5: error: redefinition of 'efi_loader_get_config_timeout_one_shot'
int efi_loader_get_config_timeout_one_shot(usec_t *ret) {
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from ../src/shared/efi-loader.c:9:
../src/shared/efi-loader.h:85:19: note: previous definition of 'efi_loader_get_config_timeout_one_shot' was here
static inline int efi_loader_get_config_timeout_one_shot(usec_t *ret) {
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../src/shared/efi-loader.c:776:5: error: redefinition of 'efi_loader_update_entry_one_shot_cache'
int efi_loader_update_entry_one_shot_cache(char **cache, struct stat *cache_stat) {
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from ../src/shared/efi-loader.c:9:
../src/shared/efi-loader.h:89:19: note: previous definition of 'efi_loader_update_entry_one_shot_cache' was here
static inline int efi_loader_update_entry_one_shot_cache(char **cache, struct stat *cache_stat) {
Let's make this easier for readers by grouping common subjects together.
Roughly: pid1 features, unit file changes, general syntax changes, kernel
options, general defaults, udevd features, networkd and .network/.netdev
features, networkctl, resolved, systemctl, systemd-run, journald, journalctl,
various other tools, low-level dbus and library stuff, documentation.
Luca Boccassi [Fri, 26 Jun 2020 11:19:48 +0000 (12:19 +0100)]
core: add device mapper to allow-list with DevicePolicy=closed and RootImage
To set up a verity/cryptsetup RootImage the forked child needs to
ioctl /dev/mapper/control and create a new mapper.
If PrivateDevices=yes and/or DevicePolicy=closed are used, this is
blocked by the cgroup setting, so add an exception like it's done
for loop devices (and also add a dependency on the kernel modules
implementing them).
logind: also cache LoaderEntryOneShot EFI variable
With this we are now caching all EFI variables that we expose as
property in logind. Thus a client invoking GetAllProperties() should
only trgger a single read of each variable, but never repeated ones.
The data from this EFI variable is exposed as dbus property, and gdbus
clients are happy to issue GetAllProperties() as if it was free. Hence
make sure it's actually free and cache LoaderConfigTimeoutOneShot, since
it's easy.
Łukasz Stelmach [Wed, 24 Jun 2020 17:24:13 +0000 (19:24 +0200)]
udev: split attribute assignment for MMC cards
Some cards have names consisting only of whitespace characters which
prevents the original rule from matching and assigning ID_SERIAL
properly. With the split rules ID_SERIAL and ID_NAME are assigned
independently and the symlink is created only if both are available the
same way it has worked for partitions.
Luca Boccassi [Tue, 23 Jun 2020 14:56:33 +0000 (15:56 +0100)]
portabled: create temp file for unit, not directory
open_tmpfile_linkable is used to create a temporary file in the same
directory as the target, but portabled uses the name of the parent
directory instead of the file it intends to create.
In other words, it creats a tmp for /etc/systemd/system.attached instead
of /etc/systemd/system.attached/foo.service.
It still works because it's later moved in the right place.
But as a side effect, it tries the create the file in the parent directory
which is /etc/systemd, and it case of read-only filesystems it fails.
journal-file: when individual hash chains grow too large, rotate
Even with the new keyed hash table journal feature: if an attacker
manages to get access to the journal file id it could synthesize records
that result in hash collisions. Let's rotate automatically when we
notice that, so that a new journal file ID is generated, our performance
is restored and the attacker has to guess a new file ID before being
able to trigger the issue again.
That said, untrusted peers should never get access to journal files in
the first case...
journal: use a different hash function for each journal file
This adds a new (incompatible) feature to journal files: if enabled the
hash function used for the hash tables is no longer jenkins hash with a
zero key, but siphash keyed by the file uuid that is included in the
file header anyway. This should make our hash tables more robust against
collision attacks, as long as the attacker has no read access to the
journal files. We switch from jenkins to siphash simply because it's
more well-known and we standardize for the rest of our codebase onto it.
This is hardening in order to make collision attacks harder for clients
that can forge log messages but have no read access to the logs. It has
no effect on clients that have read access.
Let's prefix this with "jenkins_" since it wraps the jenkins hash. We
want to add support for other hash functions to journald soon, hence
better be clear with what this is. In particular as all other symbols
defined by lookup3.h actually are prefixed "jenkins_".
Let's clean this up a bit, following our usual nomenclature to name
return parameters ret-xyz.
This is mostly a bit of renaming, but there's also some minor other
changes: if we return a pointer to a mmap'ed object plus its offset, in
almost all cases we are happy if either parameter is NULL in case the
caller is not interested in it. Let's fix the remaining case to do this
too, to minimize surprises.
The object flags field is a bitmask, hence don't sloppily define
_OBJECT_COMPRESSED_MAX as one mor than the previous flag. That worked OK
as long as we only had two flags, but will fall apart as soon as we have
three. Let's fix this.
(It's kinda sloppy how the string table is built here, as it will be
quite sparse as soon as we have more enum entries, but let's keep it for
now.)
hostnamed: minimize caching of /etc/hostname, /etc/os-release and /etc/machine-info
Instead of reading these files at startup and never again, let's read
them when we need them. As an optimization (in particular as some of
these files contain the data for many fields at once) let's cache the
results as long as the stat data (i.e. mtime) remains stable.
Also, while we are at it, if we can't read any of these props, let's not
fail everything, but continue without the data.