Mike Yuan [Sat, 25 Nov 2023 14:10:16 +0000 (22:10 +0800)]
systemctl-whoami: use pidfd to refer to processes
While at it, rephrase the output a bit. Before this commit, if
the pid doesn't exist, we output something hard to interpret -
"Failed to get unit for ourselves".
mime: register confext/sysext images in shared-mime-info
This make them recognized by file managers and stuff. Maybe one day we
should properly register mime types in the "vnd." namespace with IANA,
but I am too lazy to deal with the bureaucracy for that, hence let's
stick with the x. namespace for now.
userdbctl: enable ssh-authorized-keys logic by default
sshd now supports config file drop-ins, hence let's install one to hook
up "userdb ssh-authorized-keys", so that things just work.
We put the drop-in relatively early, so that other drop-ins generally
will override this.
Ideally sshd would support such drop-ins in /usr/ rather than /etc/, but
let's take what we can get. It's not that sshd's upstream was
particularly open to weird ideas from Linux people.
pid1: add ProtectSystem= as system-wide configuration, and default it to true in the initrd
This adds a new ProtectSystem= setting that mirrors the option of the
same of services, but in a more restrictive way. If enabled will remount
/usr/ to read-only, very early at boot. Takes a special value "auto"
(which is the default) which is equivalent to true in the initrd, and
false otherwise.
Unlike the per-service option we don't support full/strict modes, but
the door is open to eventually support that too if it makes sense. It's
not entirely trivial though as we have very little mounted this early,
and hence the mechanism might not apply 1:1. Hence in this PR is a
conservative first step.
My primary goal with this is to lock down initrds a bit, since they
conceptually are mostly immutable, but they are unpacked into a mutable
tmpfs. let's tighten the screws a bit on that, and at least make /usr/
immutable.
This is particularly nice on USIs (i.e. Unified System Images, that pack
a whole OS into a UKI without transitioning out of it), such as
diskomator.
We have two mechanisms that remove old coredumps: systemd-coredump has
parameters based on disk use / remaining disk free, and systemd-tmpfiles does
cleanup based on time. The first mechanism should prevent us from using too much
disk space in case something is crashing continuously or there are very large
core files.
The limit of 3 days makes it likely that the core file will be gone by the time
the admin looks at the issue. E.g. if something crashes on Friday, the coredump
would likely be gone before people are back on Monday to look at it.
We'll typically send signals to all remaining processes in the following
cases:
1. pid1 (in initrd) when transitioning from initrd to sysroot: SIGTERM
2. pid1 (in sysroot) before transitioning back to initrd (exitrd): SIGTERM + SIGKILL
3. systemd-shutdown (in exitrd): SIGTERM + SIGKILL
'warn_rootfs' is set to true only when we're not in initrd and we're
sending SIGKILL, which means the second case. So, we want to emit the
warning when the root of the storage daemon IS the same as that of pid1,
rather than the other way around.
The condition is spuriously reversed in the offending commit.
loginctl: show a nicer error message when no session/seat is available
When calling loginctl {seat,session}-status without arguments, show a nicer
error message in case there's no suitable session/seat attached to the calling
tty.
Before:
~# loginctl seat-status
Could not get properties: Unknown object '/org/freedesktop/login1/seat/auto'.
~# systemd-run -q -t loginctl seat-status
Could not get properties: Unknown object '/org/freedesktop/login1/seat/auto'.
~# systemd-run -q -t loginctl session-status
Could not get properties: Unknown object '/org/freedesktop/login1/session/auto'.
After:
~# build/loginctl seat-status
Failed to get path for seat 'auto': Session '1' has no seat.
~# systemd-run -q -t build/loginctl seat-status
Failed to get path for seat 'auto': Caller does not belong to any known session and doesn't own any suitable session.
~# systemd-run -q -t build/loginctl session-status
Failed to get path for session 'auto': Caller does not belong to any known session and doesn't own any suitable session.
Daan De Meyer [Tue, 5 Dec 2023 13:56:15 +0000 (14:56 +0100)]
repart: Add Minimize=best to --make-ddi= partition definitions
Otherwise, repart won't calculate the minimal size of the partition
automatically and things will fail once the partitions exceed the
minimal partition size (10M).
ukify: raise error if genkey is called with no output arguments
The idea is that genkey is called with either
--secureboot-private-key= + --secureboot-certificate=, and then it
writes those, or with --pcr-private-key + optionally --pcr-public-key
and then it writes those, or both. But when called with no arguments
whatsover, it did nothing.
There is no implicit value for any of those parameters as input (unlike in
mkosi), so we also don't want to have implicit values when used as output.
But we shouldn't return success if no work was done, this is quite confusing.
Roland Singer [Wed, 6 Dec 2023 09:49:47 +0000 (10:49 +0100)]
ukify: fix handling of --secureboot-certificate-validity= (#30315)
Before:
$ python src/ukify/ukify.py genkey --secureboot-private-key=sb2.key --secureboot-certificate=sb2.cert --secureboot-certificate-validity=111
Traceback (most recent call last):
File "/home/zbyszek/src/systemd-work/src/ukify/ukify.py", line 1660, in <module>
main()
File "/home/zbyszek/src/systemd-work/src/ukify/ukify.py", line 1652, in main
generate_keys(opts)
File "/home/zbyszek/src/systemd-work/src/ukify/ukify.py", line 943, in generate_keys
key_pem, cert_pem = generate_key_cert_pair(
^^^^^^^^^^^^^^^^^^^^^^^
File "/home/zbyszek/src/systemd-work/src/ukify/ukify.py", line 891, in generate_key_cert_pair
now + ONE_DAY * valid_days
~~~~~~~~^~~~~~~~~~~~
TypeError: can't multiply sequence by non-int of type 'datetime.timedelta'
Now:
$ python src/ukify/ukify.py genkey --secureboot-private-key=sb2.key --secureboot-certificate=sb2.cert --secureboot-certificate-validity=111
Writing SecureBoot private key to sb2.key
Writing SecureBoot certificate to sb2.cert
Luca Boccassi [Tue, 5 Dec 2023 15:43:12 +0000 (15:43 +0000)]
switch-root: also check that mount IDs are the same, not just inodes
If /run/nextroot/ has been set up, use it, even if the inodes are
the same. It could be a verity device that is reused, but with
different sub-mounts or other differences. Or the same / tmpfs with
different /usr/ mounts. If it was explicitly set up we should use it.
Use the new helper to check that the mount IDs are also the same,
not just the inodes.
The original reason for deny-listing it was that it's flaky there. I'm
not sure if that's still the case, but the Ubuntu CI jobs for i*86 are
gone, so this file shouldn't be needed anymore anyway.
The regexp only worked if the sections were small enough for the size to
start with "0". I have an initrd that is 0x1078ec7e bytes, so the tests
would spuriously fail.
Note, currently, the function is always called with SCMP_ACT_ALLOW as
the default action, except for the test. So, this should not change
anything in the runtime code.
run: fix bad escaping and memory ownership confusion
arg_description was either set to arg_unit (i.e. a const char*), or to
char *description, the result of allocation in run(). But description
was decorated with _cleanup_, so it would be freed when going out of the
function. Nothing bad would happen, because the program would exit after
exiting from run(), but this is just all too messy.
Also, strv_join(" ") + shell_escape() is not a good way to escape command
lines. In particular, one the join has happened, we cannot distinguish
empty arguments, or arguments with whitespace, etc. We have a helper
function to do the escaping properly, so let's use that.
core: when applying syscall filters, use ENOSYS for unknown calls
glibc starting using fchmodat2 to implement fchmod with flags [1], but
current version of libseccomp does not support fchmodat2 [2]. This is
causing problems with programs sandboxed by systemd. libseccomp needs to know
a syscall to be able to set any kind of filter for it, so for syscalls unknown
by libseccomp we would always do the default action, i.e. either return the
errno set by SystemCallErrorNumber or send a fatal signal. For glibc to ignore
the unknown syscall and gracefully fall back to the older implementation,
we need to return ENOSYS. In particular, tar now fails with the default
SystemCallFilter="@system-service" sandbox [3].
This is of course a wider problem: any time the kernel gains new syscalls,
before libseccomp and systemd have caught up, we'd behave incorrectly. Let's
do the same as we already were doing in nspawn since 3573e032f26724949e86626eace058d006b8bf70, and do the "default action" only
for syscalls which are known by us and libseccomp, and return ENOSYS for
anything else. This means that users can start using a sandbox with the new
syscalls only after libseccomp and systemd have been updated, but before that
happens they behaviour that is backwards-compatible.
In seccomp_restrict_sxid() there's a chunk conditionalized with
'#if defined(__SNR_fchmodat2)'. We need to kep that because seccomp_restrict_sxid()
seccomp_restrict_suid_sgid() uses SCMP_ACT_ALLOW as the default action.
DeprecationWarning: datetime.datetime.utcnow() is deprecated and scheduled for
removal in a future version. Use timezone-aware objects to
represent datetimes in UTC: datetime.datetime.now(datetime.UTC).
The difference between the two is that .now(datetime.UTC) returns an object with
a timezone attached, "the numbers" are the same.
This value is fed to cryptography's x509.CertificateBuilder object, so as long
as it can accept a datetime object with tzinfo, the result should be identical.
Takashi Sakamoto [Wed, 29 Nov 2023 13:39:50 +0000 (22:39 +0900)]
hwdb: ieee1394-unit-function: arrangement for Sony DVMC-DA1
A commit 6a42bdb37e39 ("hwdb: ieee1394-unit-function: add Sony
DVMC-DA1") is based on kernel feature unreleased yet (furthermore, not
merged yet). The original intension of new entry is to configure permission
of special file for FireWire character device, so this commit changes the
entry so that it can covers the issued case in existent version of Linux
kernel as out best effort.
When the new version of Linux kernel is released with the new feature,
then following commits would fulfill the hwdb with vendor and model names.
Yu Watanabe [Mon, 27 Nov 2023 02:55:49 +0000 (11:55 +0900)]
sd-journal: fix corrupted journal handling of generic_array_bisect()
Let's consider the following case:
- the direction is down,
- no cached entry,
- the array has 5 entry objects,
- the function test_object() reutns TEST_LEFT for the 1st object,
- the 2nd, 3rd, and 4th objects are broken, so generic_array_bisect_step()
returns TEST_RIGHT for the object.
Then, previously, generic_array_bisect_step() updated the values like the following:
0th: (m = 5, left = 0, right = 4, i = 4) -> (m = 4, left = 0, right = 3, RIGHT)
1st: (m = 4, left = 0, right = 3, i = 1) -> (m = 4, left = 2, right = 3, LEFT)
2nd: (m = 4, left = 2, right = 3, i = 2) -> (m = 2, left = 2, right = 1, RIGHT) <- ouch!!
So, assert(left < right) in generic_array_bisect() was triggered.
See issue #30210.
In such situation, there is no matching entry in the array. By returning
TEST_GOTO_PREVIOUS, generic_array_bisect() handles the result so.
Yu Watanabe [Wed, 29 Nov 2023 21:46:21 +0000 (06:46 +0900)]
sd-journal: ignore failure in testing cached corrupted entry
Let's consider the case that the 1st entry in an array is broken, but
n-th entry is valid. Then, if generic_array_get() is called to read
n-th object, the offset of the broken entry is cached by the function.
If generic_array_bisect() is followed, even if the matching entry is
valid, it always fail with -EBADMSG or friends, as the function test the
cached entry at the beginnning. Let's ignore the failure in testing the
cached entry.