Luca Boccassi [Thu, 2 Nov 2023 11:01:23 +0000 (11:01 +0000)]
mkosi: explicitly disable KVM in GHA runs
mkosi detects whether /dev/kvm is available and uses it if it is. But
some GHA hosts have it, but it's broken and not supported, so we need
to explicitly disable it.
varlink,json: introduce new varlink_dispatch() helper
varlink_dispatch() is a simple wrapper around json_dispatch() that
returns clean, standards-compliant InvalidParameter error back to
clients, if the specified JSON cannot be parsed properly.
For this json_dispatch() is extended to return the offending field's
name. Because it already has quite a few parameters, I then renamed
json_dispatch() to json_dispatch_full() and made json_dispatch() a
wrapper around it that passes the new argument as NULL. While doing so I
figured we should also get rid of the bad= argument in the short
wrapper, since it's only used in the OCI code.
To simplify the OCI code this adds a second wrapper oci_dispatch()
around json_dispatch_full(), that fills in bad= the way we want.
Net result: instead of one json_dispatch() call there are now:
1. json_dispatch_full() for the fully feature mother of all dispathers.
2. json_dispatch() for the simpler version that you want to use most of
the time.
3. varlink_dispatch() that generates nice Varlink errors
4. oci_dispatch() that does the OCI specific error handling
To avoid timeouts in oss-fuzz. The timeout reported in #29736 happened
with a ~500K test case, so with a conservative 128K limit we should
still be well within a range for any reasonable-ish generated input to
get through, while avoiding timeouts.
Since libcs are more interested in the POSIX `fchmodat(3)`, they are
unlikely to provide a direct wrapper for this syscall. Thus, the headers
we examine to set `HAVE_*` are picked somewhat arbitrarily.
Also, hook up `try_fchmodat2()` in `test-seccomp.c`. (Also, correct that
function's prototype, despite the fact that mistake would not matter in
practice)
chase: fix corner case when using CHASE_PARENT with a path ending in ".."
If we use CHASE_PARENT on a path ending in ".." then things are a bit
weird, because we the last path we look at is actually the *parent* and not
the *child* of the preceeding path. Hence we cannot just return the 2nd
to last fd we look at. We have to correct it, by going *two* levels up,
to get to the actual parent, and make sure CHASE_PARENT does what it
should.
Example: for the path /a/b/c chase() with CHASE_PARENT will return
/a/b/c as path, and the fd returned points to /a/b. All good. But now,
for the path /a/b/c/.. chase() with CHASE_PARENT would previously return
/a/b as path (which is OK) but the fd would point to /a/b/c, which is
*not* the parent of /a/b, after all! To get to the actual parent of
/a/b we have to go *two* levels up to get to /a.
Very confusing. But that's what we here for, no?
@mrc0mmand ran into this in https://github.com/systemd/systemd/pull/28891#issuecomment-1782833722
These commits are workarounds for issues caused by 331aa7aa15ee5dd12b369b276f575d521435eb52.
With the previous commit, these workarounds are not necessary anymore,
as partitions are always processed later than their whole disk, and
a decrypted volume is also processed later than its backing volume.
Yu Watanabe [Mon, 30 Oct 2023 04:31:23 +0000 (13:31 +0900)]
udev: update devlink with the newer device node even when priority is equivalent
Several udev rules depends on the previous behavior, i.e. that udev
replaces the devlink with the newer device node when the priority is
equivalent. Let's relax the optimization done by 331aa7aa15ee5dd12b369b276f575d521435eb52.
Note, the offending commit drops O(N) of file reads per uevent, and this
commit does not change the computational order. So, hopefully the
performance impact of this change is small enough.
André Paiusco [Tue, 31 Oct 2023 14:25:01 +0000 (15:25 +0100)]
man: Improve text for SystemMaxFileSize when not set
If one sets the SystemMaxUse=64G by the current documentation would expect that each files size would be around 1/8 of this value (8G), althought if the SystemMaxFileSize is not explicit set, it has a max of 128M per file.
nspawn: make sure idmapped logic works if DDI contains only /usr/ tree
If we have a DDI that contains only a /usr/ tree (and which is thus
combined with a tmpfs for root on boot) we previously would try to apply
idmapping to the tmpfs, but not the /usr/ mount. That's broken of
course.
nspawn: fix barriers when wiping fully visible procfs/sysfs
Let's wait until the child is fully done with mounting it's own
instances of procfs/sysfs before we destroy our fully visible copies of
it.
This borrows heavily from Christian Brauners fix #29521, but splits the
place + sync into two steps so that the child payload is not started
before the parent has destroyed the procfs instance.
Yu Watanabe [Tue, 24 Oct 2023 17:32:04 +0000 (02:32 +0900)]
dissect: reenable automatic removal before trying again
The device node may be different from we want to activate, and we may
try to activate different device in the subsequent loop. In such case,
we should enable the automatic removal for the unexpected device.
Otherwise, it will not be removed even when not necessary anymore.
Jin Liu [Tue, 31 Oct 2023 04:48:24 +0000 (12:48 +0800)]
New PAM module: pam_systemd_loadkey
This module reads password from kernel keyring and sets it as PAM authtok.
It's inspired by gdm's pam_gdm, which reads the LUKS password stored by
systemd-cryptsetup, so Gnome Keyring can be automatically unlocked if set
to the same password (when autologin is enabled so the user doesn't enter
a password in gdm).
This is name ".network.example" for now, to match the existing
80-ethernet.network file.
I think it would make sense to actually install this by default if told
so via a meson file (and then hopefully this would happen even on
Fedora, though in a split off RPM or so). However, we aren't there yet,
hence for now, just ship the .network files as example, like the others.
secure-boot: print just before cold-resetting to help diagnose hangs
When testing the secureboot enroll feature, it can be hard to distinguish without
using the QMP API of QEMU whether we are in a hang situation of the UEFI firmware.
Making it clear that we reached the `ResetSystem` can be helpful towards that need.
Both sleep_mode_supported and write_mode support this,
but parse_sleep_config currently prohibits this - it always
uses our default value if user specifies HibernateMode=<empty>.