Daan De Meyer [Thu, 7 Dec 2023 13:26:10 +0000 (14:26 +0100)]
repart: Don't look for --make-ddi= definitions inside --root=
It doesn't really make sense to go looking for these inside the
given root directory. While we should resolve specifiers and such
based on the given root directory, let's look up the image definitions
on the host system as there's a good chance they're coupled to the
repart version we're using so there's all kinds of chances for problems
if we use the definitions from the image we're building instead of those
from the host.
Luca Boccassi [Thu, 7 Dec 2023 23:19:36 +0000 (23:19 +0000)]
core: create workdir/upperdir when mounting a Type=overlay mount unit
So far we created the target directory, and the source for bind mounts,
but not workdir/upperdir for overlays, so it has to be done separately
and strictly before the unit is started, which is annoying. Check the
options when creating directories, and if upper/work directories are
specified, create them.
install: don't translate unit instances to paths when reenabling them
For unit instances install_info_discover() returns path to the template,
which then generates confusing errors when passed to
do_unit_file_enable():
~# build/systemctl --root=/tmp/systemctl-test.N9ysbz reenable templ1@two.service
Unit name: templ1@two.service; p: /etc/systemd/system/templ1@.service
Removed "/tmp/systemctl-test.N9ysbz/etc/systemd/system/services.target.wants/templ1@two.service".
Failed to reenable templ1@.service, destination unit services.target is a non-template unit.
This can also be seen with a different reproducer using getty@.service
and a simple bind mount to / - there's no error this time, but it tries
to create a symlink for the default instance (from DefaultInstance=tty1),
which is also incorrect:
Luca Boccassi [Mon, 27 Nov 2023 23:32:31 +0000 (23:32 +0000)]
core: relax dependency on RootImage= storage from Requires= to Wants=
If a unit is running in an image and wants to survive a soft-reboot,
then it can't be deactivated by the storage of the image going away.
Relax the dependency to a Wants=. Access to the image is not needed
when the unit is running anyway, so downgrade to Wants=.
Daan De Meyer [Tue, 5 Dec 2023 13:56:00 +0000 (14:56 +0100)]
repart: Re-open file descriptor to partition target after mkfs
The mkfs binary might unlink the path we give it and replace it with
a new file so let's make sure that our fd points to any new file rather
than the old deleted file.
Specifically this fixes erofs partition generation.
aslepykh [Fri, 8 Dec 2023 01:54:52 +0000 (04:54 +0300)]
test: avoid NO_CAST.INTEGER_OVERFLOW in test-oomd-util (#30365)
The `.mem_total` variable has `uint64_t` type, therefore, when multiplying the number
`20971512` by the number `1024` with the suffix `U`, we will not get the expected result of
`21,474,828,288`, since the number `20971512` without an explicit type indication has
`uint32_t` type.
First, multiplication will occur in accordance with the `uint32_t` type; this operation will
cause a **type overflow**, and only then will this result be assigned to a `uint64_t` type
variable.
It's worth adding the `UL` suffix to the number `20971512` to avoid **overflow**.
Found by Linux Verification Center (portal.linuxtesting.ru) with SVACE.
Author A. Slepykh.
We already highlight sections and "de-highlight" comments, so let's add
the last piece of the puzzle and highlight the configuration directives
to visually distinguish them from the values.
packit: don't take ownership of /etc/ssh/sshd_config.d/
7e3607996a creates a symlink under /etc/ssh/sshd_config.d/ and with
current Rawhide RPM stuff the systemd RPM tries to take ownership of
that directory which conflicts with the openssh-server package. Let's
temporarily tweak the regex in split-files.py until this changes makes
it to Rawhide.
journalctl: don't skip over messages not matching the cursor
When --after-cursor=/--cursor-file= is used together with a journal
filter, we still skipped over the first matching entry even if it wasn't
the entry the cursor points at, thus missing one "valid" entry
completely. Let's fix this by checking if the entry cursor after seeking
matches the user provided cursor, and skip to the next entry only when
the cursors match.
Daan De Meyer [Tue, 5 Dec 2023 09:24:13 +0000 (10:24 +0100)]
nspawn: Check later whether to keep/drop CAP_NET_BIND_SERVICE
Currently the check doesn't take any settings from nspawn settings
files into account, so let's delay the check until after we've
loaded any settings file.
Daan De Meyer [Sun, 3 Dec 2023 19:19:08 +0000 (20:19 +0100)]
gpt-auto-generator: Pass cryptsetup credentials to cryptsetup
cryptsetup reads a bunch of credentials now but we don't pass import
those in any service units yet. Let's pass through all cryptsetup
prefixed credentials to the systemd-cryptsetup@root instance.
Roland Hieber [Mon, 18 Sep 2023 09:52:06 +0000 (11:52 +0200)]
udev: generate system-unique storage symlinks using device path
When the same disk image is written to multiple storage units, for
example an external SD card and an internal eMMC, the symlinks in
/dev/disk/by-{label,uuid,partlabel,partuuid}/ are no longer unique, and
will point to the device that is probed last.
Adressing partitions via labels and UUIDs is nice to work with, and
depending on the use case, it might also be more robust than using the
symlinks in /dev/disk/by-path/ containing the partition number. Combine
the two approaches to create unique symlinks containing both the device
path as well as the respective UUIDs or labels, and throw in a symlink
using the devpath and the partition number for the sake of completeness.
For an exemplary GPT-partitioned disk at "platform-2198000.mmc" with a
partition containing an ext4 file system, this might create symlinks of
the following form:
Mike Yuan [Sat, 25 Nov 2023 14:10:16 +0000 (22:10 +0800)]
systemctl-whoami: use pidfd to refer to processes
While at it, rephrase the output a bit. Before this commit, if
the pid doesn't exist, we output something hard to interpret -
"Failed to get unit for ourselves".
mime: register confext/sysext images in shared-mime-info
This make them recognized by file managers and stuff. Maybe one day we
should properly register mime types in the "vnd." namespace with IANA,
but I am too lazy to deal with the bureaucracy for that, hence let's
stick with the x. namespace for now.
userdbctl: enable ssh-authorized-keys logic by default
sshd now supports config file drop-ins, hence let's install one to hook
up "userdb ssh-authorized-keys", so that things just work.
We put the drop-in relatively early, so that other drop-ins generally
will override this.
Ideally sshd would support such drop-ins in /usr/ rather than /etc/, but
let's take what we can get. It's not that sshd's upstream was
particularly open to weird ideas from Linux people.
pid1: add ProtectSystem= as system-wide configuration, and default it to true in the initrd
This adds a new ProtectSystem= setting that mirrors the option of the
same of services, but in a more restrictive way. If enabled will remount
/usr/ to read-only, very early at boot. Takes a special value "auto"
(which is the default) which is equivalent to true in the initrd, and
false otherwise.
Unlike the per-service option we don't support full/strict modes, but
the door is open to eventually support that too if it makes sense. It's
not entirely trivial though as we have very little mounted this early,
and hence the mechanism might not apply 1:1. Hence in this PR is a
conservative first step.
My primary goal with this is to lock down initrds a bit, since they
conceptually are mostly immutable, but they are unpacked into a mutable
tmpfs. let's tighten the screws a bit on that, and at least make /usr/
immutable.
This is particularly nice on USIs (i.e. Unified System Images, that pack
a whole OS into a UKI without transitioning out of it), such as
diskomator.
We have two mechanisms that remove old coredumps: systemd-coredump has
parameters based on disk use / remaining disk free, and systemd-tmpfiles does
cleanup based on time. The first mechanism should prevent us from using too much
disk space in case something is crashing continuously or there are very large
core files.
The limit of 3 days makes it likely that the core file will be gone by the time
the admin looks at the issue. E.g. if something crashes on Friday, the coredump
would likely be gone before people are back on Monday to look at it.
We'll typically send signals to all remaining processes in the following
cases:
1. pid1 (in initrd) when transitioning from initrd to sysroot: SIGTERM
2. pid1 (in sysroot) before transitioning back to initrd (exitrd): SIGTERM + SIGKILL
3. systemd-shutdown (in exitrd): SIGTERM + SIGKILL
'warn_rootfs' is set to true only when we're not in initrd and we're
sending SIGKILL, which means the second case. So, we want to emit the
warning when the root of the storage daemon IS the same as that of pid1,
rather than the other way around.
The condition is spuriously reversed in the offending commit.
loginctl: show a nicer error message when no session/seat is available
When calling loginctl {seat,session}-status without arguments, show a nicer
error message in case there's no suitable session/seat attached to the calling
tty.
Before:
~# loginctl seat-status
Could not get properties: Unknown object '/org/freedesktop/login1/seat/auto'.
~# systemd-run -q -t loginctl seat-status
Could not get properties: Unknown object '/org/freedesktop/login1/seat/auto'.
~# systemd-run -q -t loginctl session-status
Could not get properties: Unknown object '/org/freedesktop/login1/session/auto'.
After:
~# build/loginctl seat-status
Failed to get path for seat 'auto': Session '1' has no seat.
~# systemd-run -q -t build/loginctl seat-status
Failed to get path for seat 'auto': Caller does not belong to any known session and doesn't own any suitable session.
~# systemd-run -q -t build/loginctl session-status
Failed to get path for session 'auto': Caller does not belong to any known session and doesn't own any suitable session.
Daan De Meyer [Tue, 5 Dec 2023 13:56:15 +0000 (14:56 +0100)]
repart: Add Minimize=best to --make-ddi= partition definitions
Otherwise, repart won't calculate the minimal size of the partition
automatically and things will fail once the partitions exceed the
minimal partition size (10M).
ukify: raise error if genkey is called with no output arguments
The idea is that genkey is called with either
--secureboot-private-key= + --secureboot-certificate=, and then it
writes those, or with --pcr-private-key + optionally --pcr-public-key
and then it writes those, or both. But when called with no arguments
whatsover, it did nothing.
There is no implicit value for any of those parameters as input (unlike in
mkosi), so we also don't want to have implicit values when used as output.
But we shouldn't return success if no work was done, this is quite confusing.
Roland Singer [Wed, 6 Dec 2023 09:49:47 +0000 (10:49 +0100)]
ukify: fix handling of --secureboot-certificate-validity= (#30315)
Before:
$ python src/ukify/ukify.py genkey --secureboot-private-key=sb2.key --secureboot-certificate=sb2.cert --secureboot-certificate-validity=111
Traceback (most recent call last):
File "/home/zbyszek/src/systemd-work/src/ukify/ukify.py", line 1660, in <module>
main()
File "/home/zbyszek/src/systemd-work/src/ukify/ukify.py", line 1652, in main
generate_keys(opts)
File "/home/zbyszek/src/systemd-work/src/ukify/ukify.py", line 943, in generate_keys
key_pem, cert_pem = generate_key_cert_pair(
^^^^^^^^^^^^^^^^^^^^^^^
File "/home/zbyszek/src/systemd-work/src/ukify/ukify.py", line 891, in generate_key_cert_pair
now + ONE_DAY * valid_days
~~~~~~~~^~~~~~~~~~~~
TypeError: can't multiply sequence by non-int of type 'datetime.timedelta'
Now:
$ python src/ukify/ukify.py genkey --secureboot-private-key=sb2.key --secureboot-certificate=sb2.cert --secureboot-certificate-validity=111
Writing SecureBoot private key to sb2.key
Writing SecureBoot certificate to sb2.cert
The man page doesn't even mention errno. It just says that ferror() should
be used to check for errors. Those writes are unlikely to fail, but if they
do, errno might even be 0. Also, we have fflush_and_check() which does
additional paranoia around errno, because we apparently do not trust that
errno will always be set correctly.
This is a fancy wrapper around "cat <<EOF", but:
- the user doesn't need to figure out the file name,
- parent directories are created automatically,
- daemon-reload is implied,
so it's a convenient way to create units or drop-ins.
Luca Boccassi [Tue, 5 Dec 2023 15:43:12 +0000 (15:43 +0000)]
switch-root: also check that mount IDs are the same, not just inodes
If /run/nextroot/ has been set up, use it, even if the inodes are
the same. It could be a verity device that is reused, but with
different sub-mounts or other differences. Or the same / tmpfs with
different /usr/ mounts. If it was explicitly set up we should use it.
Use the new helper to check that the mount IDs are also the same,
not just the inodes.
If the user edits result in an empty file (after stripping), we would silently
not do anything, aborting the edit. This is rather confusing, let's emit a
notice:
$ build/systemctl --user edit asdf.service --full --stdin </dev/null
/home/zbyszek/.config/systemd/user/asdf.service: after editing, new contents are empty, not writing file.
(This also works with an editor, instead of --stdin. The message is printed on
the console and is visible after the editor exits.)
While at it, fix the condition to skip writing file after stripping. We had
"old_contents", but we modified that string. In some code flows, we would
compare the stripped old contents (i.e. a string which by defintion doesn't
have a newline at the end) with a string to which we just appended a newline
(i.e. a string which by defintion has a newline at the end).
shared/edit-util: split out function to populate temp file
In preparation for future changes: I want to add a mode where interactive
editing is not done, and when this preparation is moved to a helper, it's much
easier to skip it.
e->line is initialized to 1 and overwritten even if sync fails. Theoretically
this is against our style, but the alternative is to propagate a temporary
value of line through the layers, which adds a lot of noise. If we fail, this
EditFile object will not be used for anything, to the changed value of .line
has no effect.
HibernateInfo.from_efi is not actually useful. info.efi is only
set if the system identifier stored in EFI variable matches with
that of the running system, and thus the variable should be cleared
no matter whether resume= is set from kernel cmdline or not.