fstab-generator: use log message that matches reality
We *assume* that when /sys is read-only, we're running in a container. But
there can other reasons, for example root is mount ro and nobody has mounted
/sys yet, or somebody forgot to add /sys to the list of filesystem not to
remount ro in a sandbox. So let's actually say what we know instead of assuming.
systemd-fstab-generator was reporting that it's running in a container and I
spent a good few minutes trying to figure out why 'systemd-detect-virt -c'
disagrees, before noticing that it's just checking a different condition.
This is an octal number. We used the 0 prefix in some places inconsistently.
The kernel always interprets in base-8, so this has no effect, but I think
it's nicer to use the 0 to remind the reader that this is not a decimal number.
manager: execute generators in a mount namespace "sandbox"
When generators are executed during early boot, /tmp might not be available
yet. This causes problems with bash, because here-docs don't work. Even
non-shell code can often assume that /tmp is available. This limitation is
known to trip up people, and when the code is tested on a "normal" system,
everything works.
We can solve this nicely, and get another small benefit, by making most of the
file system read-only and "punching holes" for some dirs that should be
writable. The generator code runs with full privileges and can do anything it
wants by writing appropriate systemd units, so it doesn't make much sense to do
any significant sandboxing around generators. But making root read-only is nice
because it can catch stupid mistakes where the generator tries to write to a
wrong path or something like that. We effectively also get a "private /tmp" for
the generators, which protects them against existing files in /tmp.
The path does the following:
when executing generators, we fork, and the child unshares root and makes
it recursively read-only, with the exception of /sys and /run. Error handling
is permissive — if some of this setup fails, we're in the same state as
before the patch.
If the flag is set, we mount /tmp/ in a way that is suitable for generators and
other quick jobs.
Unfortunately I had to move some code from shared/mount-util.c to
basic/mountpoint-util.c. The functions that are moved are very thin wrappers
around mount(2), so this doesn't actually change much in the code split between
libbasic and libshared.
Implications for the host would be weird if a private mount namespace is not
used, so assert on FORK_NEW_MOUNTNS when the flag is used.
RUN_WITH_UMASK was initially conceived for spawning externals progs with the
umask set. But nowadays we use it various syscalls and stuff that doesn't "run"
anything, so the "RUN_" prefix has outlived its usefulness.
Peter Cai [Sat, 29 Oct 2022 23:00:53 +0000 (19:00 -0400)]
cryptsetup-fido2: Try all FIDO2 key slots when opening LUKS volume
After #25268, it is now possible to check whether a credential
is present on a FIDO2 token without actually attempting to retrieve said
credential. However, when cryptsetup plugins are not enabled, the
fallback unlock routines are not able to make multiple attempts with
multiple different FIDO2 key slots.
Instead of looking for one FIDO2 key slot when trying to unlock, we now
attempt to use all key slots applicable.
uerdogan [Mon, 12 Dec 2022 20:46:50 +0000 (21:46 +0100)]
Update 60-evdev.hwdb (#25704)
This solves Debian Bug report 1008760:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1008760.
Solution was inspired by this kernel bug report message:
https://bugzilla.kernel.org/show_bug.cgi?id=204967#c15.
My measured pad dimensions with a ruler were 85x44mm.
But I decided to take the 2x size reported by the current kernel
when invoking the touchpad-edge-detector command from the
libdev-tools package. Because this comment claims that the old
vs new kernel reportings differ by factor 2:
https://bugzilla.kernel.org/show_bug.cgi?id=204967#c3 .
Therefore I have used this command to get the new entry to 60-evdev.hwdb:
"root@pb:~# touchpad-edge-detector 80x34 /dev/input/event2
Touchpad ETPS/2 Elantech Touchpad on /dev/input/event2
Move one finger around the touchpad to detect the actual edges
Kernel says: x [0..1254], y [0..528]
Touchpad sends: x [0..2472], y [-524..528] -^C
Touchpad size as listed by the kernel: 40x17mm
User-specified touchpad size: 80x34mm
Calculated ranges: 2472/1052
This tool was "deprecated" back in 65eb4378c3e1de25383d8cd606909e64c71edc80,
but only by removing documentation. This is somewhat surprising, but udevadm
hwdb --update and systemd-hwdb update generate different databases. udevadm
runs in compat mode and (as far as I have been able to figure out from a quick
look), it omits filename information and does some other changes to the
datastructures. The consuming code (udev) is the same in both cases, so this
"compatibility mode" seems very strange. But I don't think it's worth trying to
figure out why things were done this way. Let's just push people towards the
new code.
Inspired by https://github.com/systemd/systemd/issues/25698#issuecomment-1346298094.
Yu Watanabe [Thu, 8 Dec 2022 06:49:02 +0000 (15:49 +0900)]
man: mention that sd_id128_get_boot() and friend may return -ENOSYS
And drop to mention sd_id128_get_boot_app_specific() may return -ENOENT
or -ENOMEDIUM. The function does not read /etc/machine-id. But reads a
file in the procfs, which is a kind of the kernel API. Hence the
failures are caused only when the system has wrong setup.
We would execute up to four hwdb match patterns (+ the keyboard builtin):
After the first hit, we would skip the other patterns, because of the GOTO="evdev_end"
action.
57bb707d48131f4daad2b1b746eab586eb66b4f3 (rules: Add extended evdev/input match
rules for event nodes with the same name), added an additional match with
":phys:<phys>:ev:<ev>" inserted. This breaks backwards compatibility for user
hwdb patterns, because we quit after the first match.
In general hwdb properties are "additive". We often have a general rule that
matches a wider class and then some specific overrides. E.g. in this particular
case, we have a match for all trackpoints, and then a bunch of model-specific
settings.
So let's change the rules to try all the match patterns and combine the
received properties. We execute builtin-keyboard once at the end, if there was
at least one match.
This also impacts other cases which I think would be very confusing for users.
Since we quit after a first successful match, if we had e.g. a match for
'evdev:input:b*v*p*' in out database, and the user added a match using
'evdev:name:*', which is the approach we document in the .hwdb files and which
users quite often use, it would be silently ignored. What's worse, if we added
our 'evdev:input:b*v*p*' match at a later point, user's match would stop
working. If we combine all the properties, we get more stable behaviour.
Yu Watanabe [Mon, 12 Dec 2022 05:16:09 +0000 (14:16 +0900)]
sd-device: fix double-free
If an attribute is read but the value is not used (i.e. ret_value is NULL),
then sd_device_get_sysattr_value() mistakenly frees the read data even though
it is cached internally.
`fido2_is_cred_in_specific_token()` should simply not return error codes
for non-fatal errors. For example, `-ENODEV` can be safely translated to
a `false` return value. When the pre-flight request is not supported, we
should simply return true to instruct the caller to attempt to use the
device anyway.
All error codes returned by the funtion should now be fatal and logged
at error level. Non-fatal errors should only appear in debug logs.
Peter Cai [Mon, 14 Nov 2022 02:58:43 +0000 (21:58 -0500)]
libfido2-util: Perform pre-flight checks as well when a specific device path is given
This prevents unnecessary user interactions when `fido2-device` is set to
something other than `auto` -- a case overlooked in the original PR #23577
(and later #25268).
We do not move pre-flight checks to `fido2_use_hmac_hash_specific_token`
because the behaviors are different between different cases: when the
device path is NULL, we try to automatically choose the correct device,
in which case pre-flight errors should be "soft" errors, without
spamming the tty with error outputs; but when a specific device path is
given, a pre-flight request that determined the non-existence of the
credential should be treated the same as a failed assertion request.
Peter Cai [Mon, 14 Nov 2022 02:12:45 +0000 (21:12 -0500)]
libfido2-util: Disable pre-flight checks for credentials with UV
According to the FIDO2 spec, tokens may not support pre-flight checks
for credentials requiring UV, at least not without at least
`pinUvAuthParam` or `uv = true`. Originally, in #25268, this was
handled by passing a PIN to satisfy `pinUvAuthParams`, but this is not
ideal, since `pinUvAuthParam` can be obtained from either a PIN
or a UV verification. Forcing the user to enter the PIN here (which is
often just the fallback option on UV devices) is no better than just
trying out each device with the actual assertion request.
As a result, this commit disables pre-flight checks when the credential
requires UV, and instead reverts to the old behavior (trying out each
device and each key slot, requiring multiple user interactions) for this
type of credentials.
So, i think "erofs" is probably the better, more modern alternative to
"squashfs". Many of the benefits don't matter too much to us I guess,
but there's one thing that stands out: erofs has a UUID in the
superblock, squashfs has not. Having an UUID in the superblock matters
if the file systems are used in an overlayfs stack, as overlayfs uses
the UUIDs to robustly and persistently reference inodes on layers in
case of metadata copy-up.
Since we probably want to allow such uses in overlayfs as emplyoed by
sysext (and the future syscfg) we probably should ramp up our erofs game
early on. Hence let's natively support erofs, test it, and in fact
mention it in the docs before squashfs even.
Daan De Meyer [Fri, 9 Dec 2022 11:10:09 +0000 (12:10 +0100)]
ci: Labeler improvements
- Mention "/please-review" in the contributing guide
- Remove "needs-rebase" on push
- Don't add "please-review" if a green label is set
- Don't add please-review label to draft PRs
- Add please-review when a PR moves out of draft
Yu Watanabe [Tue, 6 Dec 2022 21:49:17 +0000 (06:49 +0900)]
hexdecoct: several cleanups for base64_append()
- add missing assertions,
- use size_t for buffser size or memory index,
- handle empty input more gracefully,
- return the length or the result string,
- fix off-by-one issue when the prefix is already long enough.
This drops the special casing for s390 and other archs, which was
cargo-culted from glibc. Given it's not obvious why it exists, and is at
best an optimization let's simply avoid it, in particular as the archs
are relatively non-mainstream.
msizanoen1 [Wed, 7 Dec 2022 16:22:05 +0000 (23:22 +0700)]
core/sleep: set timeout for freeze/thaw operation to 1.5 seconds
A FreezeUnit operation can hang due to the presence of kernel threads
(see last 2 commits). Keeping the default configuration will mean the
system will hang for 25 seconds in suspend waiting for the response. 1.5
seconds should be sufficient for most cases.
msizanoen1 [Wed, 7 Dec 2022 16:09:33 +0000 (23:09 +0700)]
core/cgroup: ignore kernel cgroup.events when thawing
The `frozen` state can be `0` while the processes are indeed frozen (see
last commit). Therefore do not respect cgroup.events when checking
whether thawing is necessary.
dissect: add a mode for operating on an in-memory copy of a DDI, instead of directly on it
This is useful for operating in ephemeral, writable mode on any image,
including read-only ones. It also has the benefit of not keeping the
image file's filesystem busy.
"pcrUpdateCounter – this parameter is updated by TPM2_PolicyPCR(). This value
may only be set once during a policy. Each time TPM2_PolicyPCR() executes, it
checks to see if policySession->pcrUpdateCounter has its default state,
indicating that this is the first TPM2_PolicyPCR(). If it has its default value,
then policySession->pcrUpdateCounter is set to the current value of
pcrUpdateCounter. If policySession->pcrUpdateCounter does not have its default
value and its value is not the same as pcrUpdateCounter, the TPM shall return
TPM_RC_PCR_CHANGED.
If this parameter and pcrUpdateCounter are not the same, it indicates that PCR
have changed since checked by the previous TPM2_PolicyPCR(). Since they have
changed, the previous PCR validation is no longer valid."
The TPM will return TPM_RC_PCR_CHANGED if any PCR value changes (no matter
which) between validating the PCRs binded to the enrollment and unsealing the
HMAC key, so this patch adds a retry mechanism in this case.
Space Meyer [Wed, 7 Dec 2022 13:11:30 +0000 (14:11 +0100)]
journald: prevent segfault on empty attr/current
getpidcon() might set con to NULL, even when it returned a 0 return
code[0]. The subsequent strlen(con) will then cause a segfault.
Alternatively the behaviour could also be changed in getpidcon. I
don't know whether the libselinux folks are comitted to the current
behaviour, but the getpidcon man page doesn't really make it obvious
this case could happen.
msizanoen1 [Wed, 7 Dec 2022 13:46:01 +0000 (20:46 +0700)]
core/unit: allow overriding an ongoing freeze operation
Sometimes a freeze operation can hang due to the presence of kernel
threads inside the unit cgroup (e.g. QEMU-KVM). This ensures that the
ThawUnit operation invoked by systemd-sleep at wakeup always thaws the
unit.
msizanoen1 [Wed, 7 Dec 2022 09:54:13 +0000 (16:54 +0700)]
sleep: always thaw user.slice even if freezing failed
`FreezeUnit` can fail even when some units did got frozen, causing some
user units to be frozen. A possible symptom is `user@.service` being
frozen while still being able to log in over SSH.
If given, multiple initrds are concatenated into a temporary file which then
becomes the .initrd section.
It is also possible to give no initrd. After all, some machines boot without an
initrd, and it should be possible to use the stub without requiring an initrd.
(The stub might not like this, but this is something to fix there.)
ukify: try to find the uname string in the linux image if not specified
The approach is based on mkinicpio's autodetection.
This is hacky as hell. Some cases are actually fairly nice: ppc64el images have
a note that contains 'uname -r'. (The note is not uniquely labeled at all, and
only contains the release part instead of the full version-hostname-release
string, and we don't actually care about ppc, and it's very hard to read the
note from Python, but in general that'd be the approach I'd like.)
I opted to simply read and decompress the full linux binary in some cases.
Python doesn't make it easy to do streaming decompression with regexp matching,
and it doesn't seem to matter much: the image decompresses in a fraction of a
second.
Some gymnastics were needed to import ukify as a module. Before the file
was templated, this was trivial: insert the directory in sys.path, call import.
But it's a real pain to import the unsuffixed file after processing. Instead,
the untemplated file is imported, which works well enough for tests and is
very simple.
The tests can be called via pytest:
PATH=build/:$PATH pytest -v src/ukify/test/test_ukify.py
or directly:
PATH=build/:$PATH src/ukify/test/test_ukify.py
or via the meson test machinery output:
meson test -C build test-ukify -v
or without verbose output:
meson test -C build test-ukify
The option is added because we have a similar one for kernel-install. This
program requires python, and some people might want to skip it because of this.
The tool is installed in /usr/lib/systemd for now, since the interface might
change.
A template file is used, but there is no .in suffix.
The problem is that we'll later want to import the file as a module
for tests, but recent Python versions make it annoyingly hard to import
a module from a file without a .py suffix. imp.load_sources() works, but it
is deprecated and throws warnings.
importlib.machinery.SourceFileLoader().load_module() works, but is also
deprecated. And the documented replacements are a maze of twisted little
callbacks that result in an empty module.
So let's take the easy way out, and skip the suffix which makes it easy
to import the template as a module after adding the directory to sys.path.