smbios: move validation of SMBIOS table sizes fully into get_smbios_table()
We do half a validation currently ourselves (i.e. check the header fits
into the rest of the data), and leave the other half to the
caller (i.e. check the table fits into the rest of the data).
get_smbios_table() is changed to accept the minimum object size and
validates it before returning a table.
Daan De Meyer [Thu, 10 Oct 2024 20:37:39 +0000 (22:37 +0200)]
rpm/systemd-update-helper: Use systemctl reload to reexec/reload user managers
Let's always use systemctl reload to reexec and reload user managers
now that it always implies a reexec. This moves all the job management
logic to pid 1 instead of bash and reduces the complexity of the logic
as we remove systemd-run, pam and systemd-stdio-bridge from the equation.
Mike Yuan [Thu, 10 Oct 2024 19:32:17 +0000 (21:32 +0200)]
units/{user,capsule}@.service: issue daemon-reexec when notify-reloading
Closes #28367 (but not really in the exact form, see below)
We have the problem of restarting all user manager instances
after upgrade. Current approaches involve systemctl kill
with SIGRTMIN+25, which is async and feels rather ugly [1][2];
or systemctl --machine=user@ --user, which requires entering
each user session. Neither is particularly elegant.
Instead, let's just signal daemon-reexec when user@.service
is reloaded from system manager. Our long goal of dropping
daemon-reload in favor of reexec (see TODO) is unlikely to happen
due to user dbus restrictions, but here the synchronization
is done via READY=1.
#28367 would not really work for us now I come to think about it,
because all processes will be reparented to pid1 as soon as
original user manager process exits. This alternative approach
seems good enough for our use case.
Mike Yuan [Thu, 10 Oct 2024 19:06:35 +0000 (21:06 +0200)]
core/manager-serialize: drop serialization for Manager.ready_sent
This field indicates whether READY=1 has been sent to
the service manager/supervisor. Whenever we reload/reexec/soft-reboot,
manager_send_reloading() always resets it to false first,
so that READY=1 is sent after reloading finishes. Hence
we utterly get "false" at all times. Kill it.
The offending commit wrongly assumed that the second READY=1
notification is for system scope only, but it also serves the purpose
of flushing out previous STATUS= containing user unit job status.
Uday Shankar [Thu, 10 Oct 2024 20:29:10 +0000 (14:29 -0600)]
udev: allow persistent storage rules for ublk devices
Tools such as lsblk which query the udev database instead of probing
devices directly fail when run on ublk devices. For instance, in the
following commands, the partition type is missing, despite the fact that
/dev/ublkb0 was just partitioned with a single Linux filesystem type
partition.
$ lsblk /dev/ublkb0
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
ublkb0 259:0 0 31.3G 0 disk
└─ublkb0p1 259:1 0 31.2G 0 part
$ lsblk -o pkname,parttype /dev/ublkb0
PKNAME PARTTYPE
ublkb0
This happens because ublk devices are missing from a couple of
whitelists in the udev rules which are responsible for populating the
database with the data lsblk is looking for. Add the ublk devices to
these whitelists.
David Rheinsberg [Fri, 11 Oct 2024 07:53:25 +0000 (09:53 +0200)]
docs/DESKTOP_ENVIRONMENTS: fix formatting
The annotation about omittance is meant to be about the `RANDOM` string.
However, the current formatting makes it look like the entire naming
scheme is optional. Fix this.
Yu Watanabe [Thu, 10 Oct 2024 03:30:41 +0000 (12:30 +0900)]
sd-netlink: various cleanups
- use uint8_t, uint16_t, and so on, rather than unsigned char, unsigned
short, and so on, respectively,
- rename output parameters to ret or ret_xyz,
- add several missing assertions.
man: reword comment a bit regarding ExecStartPre= multiple commands
The documentation claimed that ExecStartPre=/ExecStartPost= accepts
multiple command lines, in contrast to ExecStart=. This is half an
untruth, because ExecStart= allows that too – as long as Type=oneshot is
set.
Hence, reword this a bit, and do not emphasize the contrast.
Yu Watanabe [Wed, 9 Oct 2024 01:07:31 +0000 (10:07 +0900)]
login: provide delayed action in ScheduledShutdown property
Even though we can get the existence of delayed action through
PreparingForShutdownWithMetadata property or friends, for consistency
with CancelScheduledShutdown() method, it is better to also provide the
information through ScheduledShutdown property.
Tobias Fleig [Tue, 8 Oct 2024 14:54:43 +0000 (07:54 -0700)]
stub: Add support for .initrd addon files
Teaches systemd-stub how to load additional initrds from addon files.
This is very similar to the support for .ucode sections in addon files,
but with different ordering. Initrds from addons have a chance to
overwrite files from the base initrd in the UKI.
WilliButz [Fri, 4 Oct 2024 17:51:57 +0000 (19:51 +0200)]
repart: derive hash partition size from SizeMaxBytes= of data sibling
This change makes it possible for repart to create dm-verity hash
partitions for a custom amount of protected data. When the property
`SizeMaxBytes=` is specified for a dm-verity data partition, the size
of the corresponding hash partition is set to accommodate hash data
for this maximum size, rather than the actual contents its data
sibling. However, the contained hash data continues to be generated
from said sibling.
hwdb: move key 66/65 handling from specific to generic HP laptop coverage
This takes the idea from #18595 and implements it based on our current
hwdb: the original PR suggested the keys 66/65 are a generic HP thing,
and not limited to specific laptops. The current specific laptop entries
do not contradict that claim.
Hence, let's move them from the specific sections matching some HP
laptops to the generic section matching all.
This uses the correct key names, which have long been fixed (which used
to be a problem our CI was tripped off by).
This is not tested, but I think fairly risk-less, and should allow us to
get rid of a really old PR.
Chen Guanqiao [Wed, 2 Oct 2024 05:10:21 +0000 (13:10 +0800)]
mount: optimize mountinfo traversal by decoupling device discovery
In mount_load_proc_self_mountinfo(), device_found_node() is synchronously called
during the traversal of mountinfo entries. When there are a large number of
mount points, and the device types are not significantly different, this results
in excessive time consumption during device discovery, causing a performance
bottleneck. This issue is particularly prominent on servers with a large number
of cores in IDC.
This patch decouples device discovery from the mountinfo traversal process,
avoiding redundant device operations. As a result, it significantly improves
performance, especially in environments with numerous mount points.
The documentation says the option takes a boolean or one of the "self"
and "identity". But the parser uses private_users_from_string() which
also accepts "off". Let's drop the implicit support of "off".
Yu Watanabe [Mon, 7 Oct 2024 21:19:04 +0000 (06:19 +0900)]
core: suppress one debugging log
Otherwise, the log is shown even when getting properties.
Even though it is in the debug level, that's quite noisy.
[ 338.785847] TEST-55-OOMD.sh[1624]: Oct 07 16:35:15 H systemd[1]: TEST-55-OOMD-testmunch.service: Unit not running in private mount namespace, cannot live mount
[ 338.786985] TEST-55-OOMD.sh[1624]: Oct 07 16:35:17 H systemd[1]: TEST-55-OOMD-testmunch.service: Unit not running in private mount namespace, cannot live mount
[ 338.787412] TEST-55-OOMD.sh[1624]: Oct 07 16:35:20 H systemd[1]: TEST-55-OOMD-testmunch.service: Unit not running in private mount namespace, cannot live mount
[ 338.791776] TEST-55-OOMD.sh[1624]: Oct 07 16:35:22 H systemd[1]: TEST-55-OOMD-testmunch.service: Unit not running in private mount namespace, cannot live mount
[ 338.792938] TEST-55-OOMD.sh[1624]: Oct 07 16:35:24 H systemd[1]: TEST-55-OOMD-testmunch.service: Unit not running in private mount namespace, cannot live mount
[ 338.793225] TEST-55-OOMD.sh[1624]: Oct 07 16:35:26 H systemd[1]: TEST-55-OOMD-testmunch.service: Unit not running in private mount namespace, cannot live mount
[ 338.793424] TEST-55-OOMD.sh[1624]: Oct 07 16:35:28 H systemd[1]: TEST-55-OOMD-testmunch.service: Unit not running in private mount namespace, cannot live mount
[ 338.796448] TEST-55-OOMD.sh[1624]: Oct 07 16:35:31 H systemd[1]: TEST-55-OOMD-testmunch.service: Unit not running in private mount namespace, cannot live mount
[ 338.797997] TEST-55-OOMD.sh[1624]: Oct 07 16:35:33 H systemd[1]: TEST-55-OOMD-testmunch.service: Unit not running in private mount namespace, cannot live mount
[ 338.799206] TEST-55-OOMD.sh[1624]: Oct 07 16:35:35 H systemd[1]: TEST-55-OOMD-testmunch.service: Unit not running in private mount namespace, cannot live mount
The method was added with migration of resources in mind (e.g. process's
allocated memory will follow it to the new scope), however, such a
resource migration is not in cgroup semantics. The method may thus have
the intended users and others could be guided to StartTransientUnit().
Since this API was advertised in a regular release, start the removal
with a deprecation message to callers.
Eventually, the goal is to remove the method to clean up DBus API and
simplify code (removal of cgroup_context_copy()).
Part of DBus docs is retained to satisfy build checks.
Catch up with the nice little toys the kernel fs developers have added
for us. Preferably, let's make use of the new F_DUPFD_QUERY fcntl() call
that checks whether two fds are just duplicates of each other
(duplicates as in dup(), not as in open() of the same inode, i.e.
whether they share a single file offset and so on).
This API is much nicer, since it is a core kernel feature, unlike the
kcmp() call we so far used, which is part of the (optional)
checkpoint/restore stuff.