test: Make sure symlinks in integration-tests are properly installed
meson follows symlinks by default, so make sure we use
follow_symlinks=False if meson is new enough and rsync otherwise like
we already do for other testdata subdirectories.
kmeaw [Sun, 30 Mar 2025 12:08:38 +0000 (13:08 +0100)]
shared/calendarspec: fix normalization when DST is negative
When trying to calculate the next firing of 'hourly', we'd lose the
tm_isdst value on the next iteration.
On most systems set to Europe/Dublin, this would cause a hang at 100% CPU
because the timers kept restarting.
This happens in Europe/Dublin because Ireland defines the Irish Standard Time
as UTC+1, so winter time is encoded in tzdata as negative 1 hour of daylight
saving.
Before this patch:
$ env TZ=IST-1GMT-0,M10.5.0/1,M3.5.0/1 systemd-analyze calendar --base-time='Sat 2025-03-29 22:00:00 UTC' --iterations=5 'hourly'
Original form: hourly
Normalized form: *-*-* *:00:00
Next elapse: Sat 2025-03-29 23:00:00 GMT
(in UTC): Sat 2025-03-29 23:00:00 UTC
From now: 13h ago
Iteration #2: Sun 2025-03-30 00:00:00 GMT
(in UTC): Sun 2025-03-30 00:00:00 UTC
From now: 12h ago
Iteration #3: Sun 2025-03-30 00:00:00 GMT <-- note every next iteration having the same firing time
(in UTC): Sun 2025-03-30 00:00:00 UTC
From now: 12h ago
...
With this patch:
$ env TZ=IST-1GMT-0,M10.5.0/1,M3.5.0/1 systemd-analyze calendar --base-time='Sat 2025-03-29 22:00:00 UTC' --iterations=5 'hourly'
Original form: hourly
Normalized form: *-*-* *:00:00
Next elapse: Sat 2025-03-29 23:00:00 GMT
(in UTC): Sat 2025-03-29 23:00:00 UTC
From now: 13h ago
Iteration #2: Sun 2025-03-30 00:00:00 GMT
(in UTC): Sun 2025-03-30 00:00:00 UTC
From now: 12h ago
Iteration #3: Sun 2025-03-30 02:00:00 IST <-- the expected 1 hour jump
(in UTC): Sun 2025-03-30 01:00:00 UTC
From now: 11h ago
...
This bug does not reproduce on Debian and Ubuntu because they mitigate it by
using the rearguard version of tzdata. Arch Linux and NixOS don't, so it would
cause pid1 to spin during the DST transition.
This is what the affected tzdata looks like:
$ zdump -V -c 2024,2025 Europe/Dublin
Europe/Dublin Sun Mar 31 00:59:59 2024 UT = Sun Mar 31 00:59:59 2024 GMT isdst=1 gmtoff=0
Europe/Dublin Sun Mar 31 01:00:00 2024 UT = Sun Mar 31 02:00:00 2024 IST isdst=0 gmtoff=3600
Europe/Dublin Sun Oct 27 00:59:59 2024 UT = Sun Oct 27 01:59:59 2024 IST isdst=0 gmtoff=3600
Europe/Dublin Sun Oct 27 01:00:00 2024 UT = Sun Oct 27 01:00:00 2024 GMT isdst=1 gmtoff=0
Compare it to Europe/London:
$ zdump -V -c 2024,2025 Europe/London
Europe/London Sun Mar 31 00:59:59 2024 UT = Sun Mar 31 00:59:59 2024 GMT isdst=0 gmtoff=0
Europe/London Sun Mar 31 01:00:00 2024 UT = Sun Mar 31 02:00:00 2024 BST isdst=1 gmtoff=3600
Europe/London Sun Oct 27 00:59:59 2024 UT = Sun Oct 27 01:59:59 2024 BST isdst=1 gmtoff=3600
Europe/London Sun Oct 27 01:00:00 2024 UT = Sun Oct 27 01:00:00 2024 GMT isdst=0 gmtoff=0
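The difference between the two zones can be checked mechanically. The
following Python sketch parses `zdump -V` lines like the ones above and
reports whether a zone encodes negative DST, i.e. whether the rows flagged
isdst=1 carry the smaller UTC offset. The helper names (parse_zdump,
has_negative_dst) are illustrative only and are not part of systemd or of
the tzdata tooling.

```python
import re

# Trailing "isdst=N gmtoff=M" fields of a `zdump -V` line.
_FIELDS = re.compile(r"isdst=(\d)\s+gmtoff=(-?\d+)\s*$")

def parse_zdump(lines):
    """Return a list of (isdst, gmtoff) tuples, one per parseable line."""
    rows = []
    for line in lines:
        match = _FIELDS.search(line)
        if match:
            rows.append((int(match.group(1)), int(match.group(2))))
    return rows

def has_negative_dst(lines):
    """True if the UTC offset while isdst=1 is smaller than while isdst=0."""
    rows = parse_zdump(lines)
    dst = [off for isdst, off in rows if isdst == 1]
    std = [off for isdst, off in rows if isdst == 0]
    if not dst or not std:
        return False  # no transition in the sample, nothing to compare
    return max(dst) < min(std)

# Sample data copied from the zdump output quoted in this commit message.
DUBLIN = [
    "Europe/Dublin Sun Mar 31 00:59:59 2024 UT = Sun Mar 31 00:59:59 2024 GMT isdst=1 gmtoff=0",
    "Europe/Dublin Sun Mar 31 01:00:00 2024 UT = Sun Mar 31 02:00:00 2024 IST isdst=0 gmtoff=3600",
    "Europe/Dublin Sun Oct 27 00:59:59 2024 UT = Sun Oct 27 01:59:59 2024 IST isdst=0 gmtoff=3600",
    "Europe/Dublin Sun Oct 27 01:00:00 2024 UT = Sun Oct 27 01:00:00 2024 GMT isdst=1 gmtoff=0",
]
LONDON = [
    "Europe/London Sun Mar 31 00:59:59 2024 UT = Sun Mar 31 00:59:59 2024 GMT isdst=0 gmtoff=0",
    "Europe/London Sun Mar 31 01:00:00 2024 UT = Sun Mar 31 02:00:00 2024 BST isdst=1 gmtoff=3600",
    "Europe/London Sun Oct 27 00:59:59 2024 UT = Sun Oct 27 01:59:59 2024 BST isdst=1 gmtoff=3600",
    "Europe/London Sun Oct 27 01:00:00 2024 UT = Sun Oct 27 01:00:00 2024 GMT isdst=0 gmtoff=0",
]
```

With this data, has_negative_dst(DUBLIN) is true and has_negative_dst(LONDON)
is false, which matches the description above: in the affected tzdata, winter
time in Dublin is the period marked as DST.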
Turns out makepkg sets $SOURCE_DATE_EPOCH to the current time for
every build if it is not set explicitly, which causes full rebuilds
unless we set time-epoch explicitly ourselves. Let's do that everywhere
to avoid unnecessary rebuilds.
vcs_tag() always reruns even if the vcs-tag option is disabled. Let's
use custom_target() instead so that we only enable build_always_stale
if the vcs-tag option is enabled.
mkosi: drop os-release symlink for minimal-base image
[ 385s] ERROR: link target doesn't exist (neither in build root nor in installed system):
[ 385s] /usr/lib/systemd/tests/mkosi/mkosi.images/minimal-base/mkosi.extra/etc/os-release -> ../usr/lib/os-release
It shouldn't even be needed; everything should look in /usr/lib/os-release too.
udev-config: restore log level set by systemd.log_level on reload
If the log level was previously specified in udev.conf but no longer is,
then let's make 'udevadm control --reload' set the log level
specified by systemd.log_level.
Yu Watanabe [Thu, 27 Mar 2025 03:57:30 +0000 (12:57 +0900)]
udev-watch: add inotify watch by manager process
Previously, inotify watch on a device node was added/removed by a
worker process processing the relevant uevent. However, that could not
avoid races. For example,
1. A device node X is removed by the kernel (e.g. unplug USB memory), and
the kernel removes the inotify watch for the device node and produces
IN_IGNORED event and 'remove' uevent for the device.
2. Before udevd processes the 'remove' uevent of the device, a worker
process may try to add an inotify watch on another device node Y.
As the inotify watch on X has been already removed, the worker may
acquire the same watch handle that was previously assigned to X.
3. Since the 'remove' uevent for X is not processed yet, the symlink
named with the watch handle still exists and points to X. So, the
worker process for Y cannot add the symlink...
To avoid such races, let's sequentially add/remove inotify watch by the
manager process.
Note, this potentially reduces performance on boot when there is a
huge number of disks and/or partitions.
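The race in steps 1-3 can be reproduced with a toy model. The Python sketch
below is a simplification under stated assumptions: FakeInotify is a
hypothetical stand-in for the kernel that always hands out the lowest free
watch handle (the real kernel only *may* reuse a handle), and a plain dict
stands in for the directory of symlinks named after watch handles. None of
these names exist in udev.

```python
class FakeInotify:
    """Toy stand-in for the kernel: hands out the lowest free watch handle."""

    def __init__(self):
        self.watches = {}  # handle -> device node

    def add_watch(self, node):
        handle = 1
        while handle in self.watches:
            handle += 1
        self.watches[handle] = node
        return handle

    def kernel_remove(self, node):
        # Device removal: the kernel drops the watch on its own.
        for handle, watched in list(self.watches.items()):
            if watched == node:
                del self.watches[handle]


def record_watch(symlinks, handle, node):
    """Mimic creating a symlink named after the handle; fail if it exists."""
    if handle in symlinks:
        raise FileExistsError(f"handle {handle} still points to {symlinks[handle]}")
    symlinks[handle] = node


def simulate_race():
    """Worker for Y adds its watch before the 'remove' uevent of X is handled."""
    kernel, symlinks = FakeInotify(), {}
    record_watch(symlinks, kernel.add_watch("X"), "X")  # X is being watched
    kernel.kernel_remove("X")                           # step 1: X is unplugged
    handle_y = kernel.add_watch("Y")                    # step 2: Y reuses X's handle
    try:
        record_watch(symlinks, handle_y, "Y")           # step 3: stale symlink in the way
    except FileExistsError:
        return "conflict"
    return "ok"


def simulate_serialized():
    """A single process handles the removal of X before adding the watch on Y."""
    kernel, symlinks = FakeInotify(), {}
    handle_x = kernel.add_watch("X")
    record_watch(symlinks, handle_x, "X")
    kernel.kernel_remove("X")
    symlinks.pop(handle_x)                              # removal handled first
    record_watch(symlinks, kernel.add_watch("Y"), "Y")
    return symlinks
```

In this model simulate_race() ends in a conflict, while simulate_serialized()
leaves a single entry pointing to Y. The second function only illustrates the
ordering guarantee that a single manager process provides; it does not mirror
the actual implementation.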
With the latest mkosi it's possible for MinimumVersion= to be a git
commit so let's start making use of that. This will make mkosi fail
if it's executed within the systemd repository and the checked out
commit is too old.
Putting the mkosi commit sha in mkosi/mkosi.conf also allows retrieving
it without having the full source tree available.
We also make a bunch of improvements to the fetch-mkosi.py script.
The arguments `(rd.)systemd.mount-extra` take a value that looks like
`WHAT:WHERE[:FSTYPE[:OPTIONS]]`. The `OPTIONS` were parsed into a nulstr
where a comma-separated c-string was expected. This leads to a bug where
only the first option was taken into account by the generator.
For example, if you passed `systemd.mount-extra=/x:/y:baz:ro,defaults`
to the kernel, `systemd-fstab-generator` would translate that into a
nulstr: `ro\0defaults\0`.
Since methods processing options in the generator expected a
comma-separated c-string, they would only see the first option, `ro` in
this case.
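The effect is easy to see with raw bytes. The Python sketch below emulates
what C code sees when a nulstr is handed to a function that expects an
ordinary NUL-terminated, comma-separated string. The helper names are
illustrative and do not correspond to functions in systemd.

```python
def as_c_string(buf):
    """What a C function sees when given buf as a char*: bytes up to the first NUL."""
    return buf.split(b"\0", 1)[0]


def nulstr_to_list(buf):
    """Split a nulstr (NUL-terminated entries stored back to back) into its entries."""
    return [item for item in buf.split(b"\0") if item]


def parse_comma_options(cstr):
    """Split a comma-separated option string, ignoring empty fields."""
    return [opt for opt in cstr.split(b",") if opt]


# The example from the text: systemd.mount-extra=/x:/y:baz:ro,defaults
nulstr = b"ro\0defaults\0"

# A consumer expecting a comma-separated C string stops at the first NUL.
seen_by_generator = parse_comma_options(as_c_string(nulstr))

# What the consumer should have received instead.
intended = parse_comma_options(b"ro,defaults")
```

Here seen_by_generator contains only b"ro", while intended contains both
options, even though nulstr_to_list(nulstr) shows that no data was lost in
the nulstr itself. The options were merely stored in a form the consumers
did not expect.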
base-filesystem: avoid creating /lib64 symlink on existing rootfs
While all distributions agree on where the basic rootfs symlinks
(/bin /sbin /lib) should point to, not all of them agree on the
target of /lib64. Debian and derivatives expect something different
than Fedora et al. This is mostly due to the different ways multiarch
and multilib are designed.
This can lead to the situation where running systemd-nspawn on Fedora
to boot a Debian container creates an incompatible symlink in the guest's
persistent, pre-created and pre-populated root filesystem, causing
issues due to these incompatibilities.
While it would be great if Debian and derivatives had the same
expectations as the rest of the world, this is baked in many places
and not likely to ever be fixable, as the multiarch vs multilib
behaviours are now very entrenched, and changing it would break
compatibilities left and right.
The core purpose of base-filesystem was to allow bringing up a system
with an empty/ephemeral/etc rootfs (and a /usr/ image on top). So as
a workaround, create /lib64 only if we detect that we have created
/bin /lib and /sbin, as that's a sure sign we are booting into an
empty rootfs that needs to be populated.
Conversely, if the filesystem _already_ has /bin /sbin and /lib,
it means it is not ephemeral and it is pre-prepared and persistent,
so it's a good idea to avoid second-guessing the image builder tool
or the package manager and overriding what they do, and just let them
carry on with the system however they configured it.
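The heuristic can be sketched in a few lines of Python. This is a minimal
illustration, not the actual base-filesystem code: the link targets in
BASE_LINKS and LIB64_TARGET are assumptions (a Fedora-style layout), and
populate_root is a hypothetical name.

```python
import os
import tempfile

# Assumed link targets for illustration; real targets depend on the distribution.
BASE_LINKS = {"bin": "usr/bin", "lib": "usr/lib", "sbin": "usr/sbin"}
LIB64_TARGET = "usr/lib64"


def populate_root(root):
    """Create missing base symlinks; add lib64 only if all base links were created here."""
    created = []
    for name, target in BASE_LINKS.items():
        path = os.path.join(root, name)
        if not os.path.lexists(path):
            os.symlink(target, path)
            created.append(name)

    lib64 = os.path.join(root, "lib64")
    # Having created every base link ourselves is taken as the sign of an empty rootfs.
    if len(created) == len(BASE_LINKS) and not os.path.lexists(lib64):
        os.symlink(LIB64_TARGET, lib64)
        created.append("lib64")
    return created


def demo():
    """Return what gets created on an empty root and on a pre-populated one."""
    with tempfile.TemporaryDirectory() as empty:
        on_empty = populate_root(empty)
    with tempfile.TemporaryDirectory() as existing:
        for name, target in BASE_LINKS.items():
            os.symlink(target, os.path.join(existing, name))
        on_existing = populate_root(existing)
    return on_empty, on_existing
```

On an empty directory all four links are created; on a root that already has
/bin, /lib and /sbin nothing is touched, so a /lib64 chosen by the image
builder or package manager (or its absence) is left alone.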
Reworked and reworded, original author: Helmut Grohne <helmut@subdivi.de>
udev: drop unnecessary discardment of queued events
With the previous commit, now on_post_exit() checks only events
currently being processed. Hence, it is not necessary to discard
queued events in manager_exit().
Also, as SIGTERM has already been sent to all workers, the kill workers
timer is no longer necessary after manager_exit(), hence disable it.
This mostly does not change any behavior. It is just refactoring and
preparation for a later change.
udev: do not wait for event queue being empty on exit
When the manager process is requested to terminate, if a worker process
tries to lock a block device and fails, then the worker returns a
TRY_AGAIN notification and the event is requeued. Hence, the event queue
may have pending events even after manager_exit() is called. In such a
situation, sd_event_exit() will never be called, and udevd will get stuck.
With this change, after termination is requested, the manager only checks
whether there are any events currently being processed.
It is not necessary to wait for a worker processing an event before
sending SIGTERM. Workers will handle SIGTERM after they finish events
that they are currently processing. Let's send SIGTERM whenever it is
necessary.
Mike Yuan [Sun, 6 Apr 2025 14:10:43 +0000 (16:10 +0200)]
core: do not use pidref_hash_ops_free for Manager.watch_pids
The PidRefs are in all cases owned by Unit.pids, and get removed
from Manager.watch_pids(_more) when the unit is destroyed, via
unit_unwatch_pidref(). This hasn't caused any issue because
manager_clear_jobs_and_units() is called before destroying
Manager.watch_pids(_more), but let's get this right.
We need to be extremely careful with using the path associated with fd,
since it contains the resolved path if a symlink was opened. In particular,
it's really not desirable to return the resolved executable path in
pin_callout_binary(), which would end up as argv[0] in udev_event_spawn(),
potentially changing the behavior of spawned process.
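Why the resolved path matters can be shown with a symlink to a multi-call
style binary, i.e. a program that picks its behavior from argv[0]. The Python
sketch below only compares the two candidate argv[0] values; the file names
"multicall" and "tool" and the helper spawn_argv0 are hypothetical and have
no counterpart in udev.

```python
import os
import tempfile


def spawn_argv0(path, resolve):
    """Return the argv[0] a spawner would pass: the path as given, or its resolved form."""
    return os.path.realpath(path) if resolve else path


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        # Normalise first, in case the temporary directory itself sits behind a symlink.
        tmp = os.path.realpath(tmp)

        real = os.path.join(tmp, "multicall")
        open(real, "w").close()

        link = os.path.join(tmp, "tool")
        os.symlink("multicall", link)  # tool -> multicall

        as_requested = os.path.basename(spawn_argv0(link, resolve=False))
        as_resolved = os.path.basename(spawn_argv0(link, resolve=True))
        return as_requested, as_resolved
```

demo() yields "tool" for the path as requested and "multicall" for the
resolved one. A program that dispatches on its own name would therefore run a
different code path if the resolved path ended up as argv[0].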
* 7948d79b63 upgpkg: 257.5-1: new upstream release
* d9badad1d4 drop use of deprecated nscd meson option
* af071243cf upgpkg: 257.4-1: new upstream release
shared/cred-util: Ensure TPM code is used with HAVE_TPM2 guards
Building without TPM2 support, we end up with the following error:
In function ‘memcpy’,
inlined from ‘encrypt_credential_and_warn’ at ../git/src/shared/creds-util.c:1091:17:
/usr/include/x86_64-linux-gnu/bits/string_fortified.h:29:10: error: argument 2 null where non-null expected [-Werror=nonnull]
29 | return __builtin___memcpy_chk (__dest, __src, __len,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
30 | __glibc_objsize0 (__dest));
| ~~~~~~~~~~~~~~~~~~~~~~~~~~
/usr/include/x86_64-linux-gnu/bits/string_fortified.h:29:10: note: in a call to built-in function ‘__builtin___memcpy_chk’
cc1: some warnings being treated as errors
This is because code referencing tpm2 data structures is still used while the
initialization of the function has been compiled out, since it's conditional on
HAVE_TPM2. We add the needed guards in places where they are missing to fix this
problem.
When compiling on systems which do not have gcc installed
(like a musl+llvm system), the forced linkage "-lgcc"
prevents the build from succeeding. As clang does not need to
link explicitly to gcc, I've modified meson to link to the
gcc library only when the compiler is gcc.
Add support for creating HSR/PRP interfaces. HSR (High-availability Seamless
Redundancy) and PRP (Parallel Redundancy Protocol) are two protocols that
provide seamless failover against failure of any single network component. They
are both implemented by the "hsr" kernel driver.
exec-invoke: Always go via stdin fd in setup_pam() to get tty
We might have resolved the tty to something else if it was set to
/dev/console, so let's always go via stdin in setup_pam(). This also
means we won't set the pam tty if only stdout or stderr are connected
to a tty, which seems like a sensible thing to do.
Daan De Meyer [Fri, 21 Mar 2025 09:39:46 +0000 (10:39 +0100)]
core: Resolve /dev/console if it's connected to stdin
If /dev/console is connected to stdin there's a possibility that
the unit might try to start a logind session from within the unit.
Let's make sure that any such sessions are started on the tty that
/dev/console points to and not on /dev/console itself.
udev-spawn: search executed command in build directory (#36985)
This makes pin_callout_binary() optionally provide the path of the pinned
binary, and uses it in udev-spawn.c, to allow easy debugging of
program invocations requested by RUN{program} and friends.
Mike Yuan [Mon, 24 Mar 2025 18:46:46 +0000 (19:46 +0100)]
core/cgroup: drop extraneous CGRuntime check in unit_get_memory_available()
Currently, for units whose CGRuntime is not allocated just yet, e.g.
inactive ones, MemoryAvailable fails to account for their MemoryMax/High
settings. Hence, let's remove the CGRuntime check. The call to
unit_get_memory_accounting() would certainly fail, but it doesn't matter,
since 'current' is initially set to 0 anyway.
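Why a failed usage lookup is harmless can be seen in a simplified model of
the calculation. The Python sketch below is an assumption-laden illustration
for a single unit, not a port of unit_get_memory_available(): the name
memory_available, the UNLIMITED sentinel, and the "smaller of MemoryMax and
MemoryHigh minus current usage" rule are all simplifications made here.

```python
UNLIMITED = 2**64 - 1  # stand-in for "no limit configured"


def memory_available(memory_max, memory_high, current=0):
    """Return the effective limit minus current usage, floored at 0.

    current defaults to 0, which models a unit without a cgroup yet:
    the usage lookup fails and the initial value is kept.
    """
    limit = min(memory_max, memory_high)
    if limit == UNLIMITED:
        return UNLIMITED
    return max(limit - current, 0)
```

For an inactive unit with a limit of 1000 bytes and no usage data,
memory_available(1000, UNLIMITED) reports the full 1000 bytes instead of
ignoring the configured limit, which is the behavior this change aims for.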