analyze: Do no require a full d-bus bus for the plot command (#8539)
The plot command requires a full d-bus bus to fetch the host
information, which seems rather optional, and having a running dbus
daemon is not always desirable. So instead, we try to acquire a full
bus, and if that fails we acquire the systemd bus, in which case we
omit the host information from the output.
We refactor acquire_bus() into two new functions which in addition
makes the call sites clearer.
Martin Wilck [Sat, 7 Apr 2018 15:33:48 +0000 (17:33 +0200)]
systemd-udevd: limit children-max by available memory (#8668)
Udev workers consume typically 50-100MiB virtual memory.
On systems with lots of CPUs and relatively low memory, that may
easily cause workers to be OOM-killed.
This patch limits the number of workers to 8 per GiB memory.
But don't let the limit drop below the smallest value we had
without this patch (8 + 1 * 2 = 10); on small systems, udev's
memory footprint is likely lower.
Clarify checker/helper in systemd-fsck@.service manpage (#8674)
Clarify the helper/checker terminology in the systemd-fsck@.service manpage to
make the description more clear about what is responsible for deciding if a filesystem
needs checking.
Philip Sequeira [Thu, 5 Apr 2018 14:04:27 +0000 (14:04 +0000)]
nspawn: wait for network namespace creation before interface setup (#8633)
Otherwise, network interfaces can be "moved" into the container's
namespace while it's still the same as the host namespace, in which case
e.g. host0 for a veth ends up on the host side instead of inside the
container.
dissect: when pulling metadata from an image, don't bother with /home or ESP
When we try to read meta-data from an image, don't bother with mounting
/home or the ESP, as that's not where the metadata is. This not only
speeds things up a bit, but also has the benefit that setups where an
unencrypted root is mixed with an encrypted /home (which I have on one
of my own systems) won't result in errors that the crypto key is needed.
1. We'll now explicitly check that the child devices of a block device
we are interested in (i.e. the partitions) are block devices themselves.
On newer kernels the mmc rpmb stuff is actually exposed as char rather
than block device as before, and they probably should have been that in
the first place. By adding this check we'll hence filter out these weird
devices through a second rule too, that hopefully makes things a bit
more future-proof, should more devices like this be added eventually,
or other subsystems do a similar thing.
2. When counting partitions we'll now also check the devnum of the
device being non-null, which we already do when matching up the devices
in the second iteration. This should make things more robust, and
prevent other kinds of miscounting, which after all was the main
issue #8609 fixed.
If an rfkill device disappears between the time we get notified about
the existance and we fully opened it we might get ENXIO or ENODEV (i.e.
the two kinds of "device not found" errors, which are typically
generated when for example a device node has no actual backing device
behind it). let's handle that the same way as ENOENT, and downgrade the
log message to LOG_DEBUG.
tmpfiles: ignore "operational" errors during setup
We still get the errors logged, but we don't fail the service. This
is better for users because rerunning tmpfiles-setup.service a second
time is dangerous (c.f. cd9f5b68ce08375eb1d68a4ddaa7a24a5092d7ba).
Note that this only touches sd-tmpfiles-setup.service and
sd-tmpfiles-setup-dev.service. sd-tmpfiles-clean.service is as before.
tmpfiles: add a new return code for "operational failure" when processing
Things can fail, and we have no control over it:
- file system issues (immutable bits, file system errors, MAC refusals, etc)
- kernel refusing certain arguments when writing to /proc/sys or /sys
Let's add a new code for the case where we parsed configuration but failed
to execute it because of external errors.
units: use `systemctl exit` to kill the user manager (#8648)
Use `systemctl --user --force exit` to implement the systemd-exit
user service.
This removes our dependence on an external `kill` binary and the
concerns about whether they recognize SIGRTMIN+n by name or what their
interpretation of SIGRTMIN is.
Tested: `systemctl --user start systemd-exit.service` kills the
`systemd --user` instance for my user.
oss-fuzz: Fallback to `ninja-build` when available (#8641)
The ninja binary is deployed as `ninja-build` in older distros such as
RHEL 7/CentOS 7. Detect that and use `ninja-build` instead of `ninja`
when it's available.
dissect: Don't count RPMB and boot partitions (#8609)
Filter-out RPMB partitions and boot partitions from MMC devices when
counting partitions enumerated by the kernel. Also factor out the now
duplicated code into a separate function.
This complement the previous fixes to the problem reported in
https://github.com/systemd/systemd/issues/5806
Accept definitions to other AF_ constants, not just PF_ ones,
such as:
#define AF_LINUX AF_LOCAL
It may not be necessary to impose any restriction on the
definitions of the macros extracted, but for now
keep most of that requirement but match AF_* as well.
Hans de Goede [Fri, 30 Mar 2018 18:00:27 +0000 (20:00 +0200)]
hwdb: Add accelerometer orientation quirk for the Lenovo Ideapad Miix 310
Add an accelerometer orientation quirk for the Lenovo Ideapad Miix 310.
Note this quirk is limited to the production batches which ship with a
portrait panel, rather then with a landscape panel (recognized by the
different BIOS version these 2 variants use).
Hans de Goede [Fri, 9 Mar 2018 13:55:11 +0000 (14:55 +0100)]
hwdb: Add accelerometer orientation quirk for the Yours Y8W81 tablet
Add an accelerometer orientation quirk for the Yours Y8W81 8" tablet.
For future reference: this tablet has the same case and mostly the same
internals as the Chuwi Vi8. Both seem to be from an ODM called inet-tek.
Both are labelled: "INET-I86M-REVxx" on the PCB, with the Chuwi Vi8 being
REV03 (and having a ALC5640 audio codec) and the Yours Y8W81 being REV21
(and having a ALC5651 audio codec).
Hans de Goede [Sun, 18 Feb 2018 20:42:43 +0000 (21:42 +0100)]
hwdb: Add accelerometer orientation entry for the I.T.Works TW701 tablet
Add accelerometer orientation entry for the I.T.Works TW701 7"
windows tablet, note this is the same hardware/PCB as the Trekstor
ST70416-6 for which we already have the same quirk.
sd-bus: allow description to be set for system/user busses (#8594)
sd_bus_open/sd_bus_open_system/sd_bus_open_user are convenient, but
don't allow the description to be set. After they return, the bus is
is already started, and sd_bus_set_description() fails with -EBUSY.
It would be possible to allow sd_bus_set_description() to update the
description "live", but messages are already emitted from sd_bus_open
functions, so it's better to allow the description to be set in
sd_bus_open/sd_bus_open_system/sd_bus_open_user.
Fixes message like:
Bus n/a: changing state UNSET → OPENING
install: don't enforce that .d/ dropin files (and their symlink chain elements) for units must have names that qualify as unit names
The names of drop-in files can be anything as long as they are suffixed
in ".conf", hence don't be stricter than necessary when validating the
names used in symlink chains of such drop-in files.
Also, drop-in files should not be ale to change the type of unit file
itself, i.e. not affect whether it is considered masked or an alias as a
whole.
This adds a flag SEARCH_DROPIN that is passed whenever we load a drop-in
rather the main unit file, and in that case loosen checks and behaviour
we otherwise enforce for the unit file itself. Specifically:
1. If SEARCH_DROPIN is passed we won't change the unit's info->type
field anymore, as that field (which can be REGULAR, MASKED, SYMLINK)
should not be affected by drop-ins, but only by the unit file itself.
2. If SEARCH_DROPIN is passed we will shortcut following of symlink
chains, and not validate the naming of each element in the chain,
since that's irrelevant for drop-ins, and only matters for the unit
file itself.
Or in other words, without this:
1. A symlink /etc/systemd/system/foobar.service.d/20-quux.conf →
/dev/null might have caused the whole of foobar.service to be
considered "masked".
2. A symlink /etc/systemd/system/foobar.service.d/20-quux.conf →
/tmp/miepf might have caused the whole loading of foobar.service to
fail as EINVAL, as "miepf" is not a valid unit name.
fd-util: introduce fd_reopen() helper for reopening an fd
We have the same code for this in place at various locations, let's
unify that. Also, let's repurpose test-fs-util.c as a test for this new
helper cal..
tests: run `udevadm settle` after `sfdisk` (#8610)
This makes the script wait for the newly created partition to
show up before trying to put a filesystem on it, which should
prevent the tests from failing with the following error:
```
New situation:
Disklabel type: dos
Disk identifier: 0x3541a0ec
Device Boot Start End Sectors Size Id Type
/dev/loop6p1 2048 800767 798720 390M 83 Linux
/dev/loop6p2 800768 819199 18432 9M 83 Linux
The partition table has been altered.
Calling ioctl() to re-read partition table.
Syncing disks.
The file /dev/loop6p1 does not exist and no size was specified.
make: *** [setup] Error 1
F: Failed to mkfs -t ext4
Makefile:4: recipe for target 'setup' failed
```
Christian Hesse [Wed, 28 Mar 2018 19:58:10 +0000 (21:58 +0200)]
systemd-inhibit: ignore signal interrupt from keyboard (#8569)
By default both processes, systemd-inhibit and the forked one, receive
the signals. Pressing Ctrl+C on the keyboard results in SIGINT being
sent to the processes, followed by SIGTERM being sent to the forked
process when systemd-inhibit exits. This can cause trouble when the
forked process does not clean up properly but exit immediately.
Instead make systemd-inhibit ignore SIGINT, leaving it to the forked
process to clean up and exit.
Note: in check_triggering_units 'path' will be allocated twice. This is a
conscious choice, this way the implementation is simpler and not worth
optimizing.
Yu Watanabe [Wed, 28 Mar 2018 11:37:27 +0000 (20:37 +0900)]
bus-util: add flags for bus_map_all_properties() (#8546)
This adds flags BUS_MAP_STRDUP and BUS_MAP_BOOLEAN_AS_BOOL.
If BUS_MAP_STRDUP is set, then each "s" message is duplicated.
If BUS_MAP_BOOLEAN_AS_BOOL is set, then each "b" message is
written to a bool pointer.
Follow-up for #8488.
See https://github.com/systemd/systemd/pull/8488#discussion_r175816270.
core: dont't remount /sys/fs/cgroup for relabel if not needed (#8595)
The initial fix for relabelling the cgroup filesystem for
SELinux delivered in commit 8739f23e3 was based on the assumption that
the cgroup filesystem is already populated once mount_setup() is
executed, which was true for my system. What I wasn't aware is that this
is the case only when another instance of systemd was running before
this one, which can happen if systemd is used in the initrd (for ex. by
dracut).
In case of a clean systemd start-up the cgroup filesystem is actually
being populated after mount_setup() and does not need relabelling as at
that moment the SELinux policy is already loaded. Since however the root
cgroup filesystem was remounted read-only in the meantime this operation
will now fail.
To fix this check for the filesystem mount flags before relabelling and
only remount ro->rw->ro if necessary and leave the filesystem read-write
otherwise.
shared/specifier: use realloc to free some memory after specifier expansion
This is a separate commit only because it actually *increases* memory allocations:
==3256== total heap usage: 100,120 allocs, 100,120 frees, 13,097,140 bytes allocated
to
==4690== total heap usage: 100,121 allocs, 100,121 frees, 14,198,329 bytes allocated
Essentially, we do a little more work to reduce the memory footprint a bit. For a
test where we just allocate the memory and drop it soon afterwards, this is not
beneficial, but it should still be useful for a long running program.
core: use setreuid/setregid trick to create session keyring with right ownership (#8447)
Re-use the hacks used to link user keyring, when creating the session
keyring. This way changing ownership of the keyring is not required, and thus
incovation_id can be correctly created in restricted environments.
Creating invocation_id with root permissions works and linking it into session
keyring works, as at that point session keyring is possessed.
Simple way to validate this is with following commands:
journal-file: we can't use a chain cache entry if we don't know where it starts (#8542)
It might happen that we try to bisect through a chain of offset arrays in the
journal whose last element was just allocated but no item yet written
to. In that case that array will be all NUL, but it might still end up
in our array chain cache. If it does, we cannot use it for bisection,
since for bisection we need to know the value of the first entry in that
array, but if it's uninitialized it does not have a first value. Hence,
as a simple fix, in this unlikely case, simply ignore the chain cache.
This is supposed to fix the issue pointed out in #8432, but in a more
permissive way, as this case isn't strictly a badly formatted journal
but actually a valid state (though one within a very short time window),
and we should make the best of it, and handle it gracefully.
Background: in each journal file entries are linked up in large arrays
of offsets. In each array the entries are strictly ordered by the
offsets of the entries, which permits search by bisection. These arrays
are allocated with a fixed size and then filled up as entries are added
to the journal file. If an array is fully filled up, a new array
(double in size as the old one) is appended to the journal file, and
linked up. This means, the journal file will contain a series of chained
up arrays, each time doubling in size, and strictly ordered. When
looking for an entry we maintain a "chain cache", which allows us to
bypass traversing the chain in full if we look for entries close to each
other in a short time. With the fix above we make sure we don't
erroneously use a chain cache item that doesn't carry enough information
for this bisection to work.
This reworks the SELinux and SMACK label fixing calls in a number of
ways:
1. The two separate boolean arguments of these functions are converted
into a flags type LabelFixFlags.
2. The operations are now implemented based on O_PATH. This should
resolve TTOCTTOU races between determining the label for the file
system object and applying it, as it it allows to pin the object
while we are operating on it.
3. When changing a label fails we'll query the label previously set, and
if matches what we want to set anyway we'll suppress the error.
Also, all calls to label_fix() are now (void)ified, when we ignore the
return values.
Stuart Hayes [Thu, 18 Jan 2018 20:14:56 +0000 (15:14 -0500)]
udev: net_id: Improve predictable names for NPAR devices
NPAR is a technology that allows a single network interface to
be divided into number of partitions. The partitions show up
as functions on the same PCI device... when there are more than
8 functions, ARI (alternative routing-ID interpretation) is
used. With ARI is enabled, the 8 bit field that normally has 5
bits for the PCI device and 3 bits for the PCI function is instead
interpreted as (implicit) device 0, with 8 bits for the function
number.
Because the linux kernel exposes the PCI device/function numbers
to userspace the same regardless of whether ARI is enabled,
systemd predictable device naming can generate unpredictable
names in this case, because network names using the PCI slot use
the function number, but not the device number, causing systemd
to generate the same name for mulitple network devices (so some
will revert to the "ethX" names).
With this patch, device naming code checks if ARI is enabled for
a PCI network device, and uses the full 8-bit function number
for naming to avoid this situation. This should improve
readability and predictability of device names.
Here is an example of how this change would affect naming:
Stuart Hayes [Wed, 17 Jan 2018 19:31:55 +0000 (14:31 -0500)]
udev: net_id: Improve predictable names for SR-IOV virtual devices
With PCI SR-IOV, a number of virtual network devices can be enabled,
all of which share the same physical network device. Currently,
udev generates names for SR-IOV virtual functions as if they were
independent network devices.
With this change, the predictable network device naming code will
check if a network device is an SR-IOV virtual device, and will
generate a name based on the physical PCI device plus a "v%u"
suffix. This should improve readability and predictability of
device names.
Here is an example of how this change would affect naming:
Stuart Hayes [Tue, 16 Jan 2018 21:08:10 +0000 (16:08 -0500)]
udev: net_id: search parent devices for PCI slot number
To generate predictable network device names, the code in
udev-builting-net_id.c tries to match the PCI device address
of the network device to the entries in /sys/bus/pci/slots.
However, sometimes the slot number is not associated the
network controller PCI device itself, but rather with one of
its parents.
This change will try to find a match in /sys/bus/pci/slots for
the parents of the PCI network device, if it doesn't find a
match for the device itself.
mourikwa [Mon, 26 Mar 2018 15:50:35 +0000 (17:50 +0200)]
Fix for alphabetical ordering (#8581)
I read the addition of the purism laptop keyboard and noticed
that the 60-keyboard.hwdb file could/should have an alphabetical ordering.
I scratched that itch with this commit.