Chris Down [Wed, 19 Nov 2025 14:06:03 +0000 (22:06 +0800)]
tests: ASSERT_SIGNAL: Do not allow parent to hallucinate it is the child
assert_signal_internal() returns 0 in two distinct cases:
1. In the child process (immediately after fork returns 0).
2. In the parent process, if the child exited normally (no signal).
ASSERT_SIGNAL fails to distinguish these cases. When a child exited
normally (case 2), the parent process receives 0, incorrectly interprets
it as meaning it is the child, and re-executes the test expression
inside the parent process. Goodness gracious!
This causes two severe test integrity issues:
1. False positives. The parent can run the expression, succeed, and call
_exit(EXIT_SUCCESS), causing the test to pass even though no signal
was raised.
2. Silent truncation. The _exit() call in the parent terminates the test
runner prematurely, preventing subsequent tests in the same file from
running.
Example of the bug in action, from #39674:
ASSERT_SIGNAL(fd_is_writable(closed_fd), SIGABRT)
This test should fail (fd_is_writable does not SIGABRT here), but with
the bug, the parent hallucinated being the child, re-ran the expression
successfully, and exited with success.
Fix this by refactoring assert_signal_internal() to be much more strict
about separating control flow from data.
The signal status is now returned via a strictly typed output parameter,
guaranteeing that determining whether we are the child is never
conflated with whether the child exited cleanly.
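
The separation described above can be sketched roughly as follows. This is a minimal illustration, not the actual assert_signal_internal() code: the names fork_for_signal_check(), signal_raised_by(), sketch_abort() and sketch_noop() are hypothetical, and the sketch uses waitpid() for brevity. The return value only answers "am I the child?", while the signal travels through the output parameter.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical sketch, not the systemd implementation.
 *
 * Return value (control flow): true in the child, false in the parent.
 * *ret_signal (data, written in the parent only): the signal that killed
 * the child, 0 if it was not killed by a signal, -1 on error. */
bool fork_for_signal_check(int *ret_signal) {
        int status;
        pid_t pid;

        pid = fork();
        if (pid == 0)
                return true;

        *ret_signal = -1;
        if (pid > 0 && waitpid(pid, &status, 0) == pid)
                *ret_signal = WIFSIGNALED(status) ? WTERMSIG(status) : 0;
        return false;
}

/* Runs fn() in a child process and reports which signal killed it. */
int signal_raised_by(void (*fn)(void)) {
        int sig;

        if (fork_for_signal_check(&sig)) {
                /* Only the child ever gets here, even if it exits cleanly */
                fn();
                _exit(EXIT_SUCCESS);
        }
        return sig;
}

void sketch_abort(void) { abort(); }
void sketch_noop(void) { }
```

With this shape, signal_raised_by(sketch_noop) yields 0 in the parent without the parent ever executing the expression itself.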
Chris Down [Wed, 19 Nov 2025 13:45:40 +0000 (21:45 +0800)]
tests: ASSERT_SIGNAL: Ensure sanitisers do not mask expected signals
ASAN installs signal handlers to catch crashes like SIGSEGV or SIGILL.
When these signals are raised, ASAN traps them, prints an error report,
and then typically terminates the process with a different signal (often
SIGABRT) or a non-zero exit code.
This interferes with ASSERT_SIGNAL when checking for specific crash
signals (for example, checking that a function raises SIGSEGV). In such
a case, the test harness sees the ASAN termination signal rather than
the expected signal, causing the test to fail.
Fix this by resetting the signal handler to SIG_DFL in the child process
immediately before executing the test expression. This ensures the
kernel kills the process directly with the expected signal, bypassing
ASAN's interceptors.
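
A rough sketch of the effect, under stated assumptions: intercepting_handler() is a stand-in for a sanitizer's interceptor (it is not ASAN), child_termination_signal() is a hypothetical name, and SIGUSR1 is used instead of a crash signal so that the example stays harmless.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Stand-in for a sanitizer's interceptor: it swallows the signal and
 * turns it into a plain non-zero exit. */
void intercepting_handler(int sig) {
        (void) sig;
        _exit(EXIT_FAILURE);
}

/* Forks a child that raises `sig`, optionally resetting the disposition to
 * SIG_DFL first. Returns the signal that killed the child, 0 if the child
 * exited normally, or -1 on error. */
int child_termination_signal(int sig, int reset_to_default) {
        int status;
        pid_t pid;

        /* Installed before fork(), so the child inherits the handler */
        if (signal(sig, intercepting_handler) == SIG_ERR)
                return -1;

        pid = fork();
        if (pid == 0) {
                if (reset_to_default)
                        signal(sig, SIG_DFL);
                raise(sig);
                _exit(EXIT_SUCCESS);
        }

        /* Parent: drop the stand-in handler again */
        signal(sig, SIG_DFL);

        if (pid < 0 || waitpid(pid, &status, 0) != pid)
                return -1;
        return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

Without the reset the parent sees a normal exit; with the reset it sees the signal that was actually raised.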
Chris Down [Wed, 19 Nov 2025 08:50:38 +0000 (16:50 +0800)]
tests: ASSERT_SIGNAL: Stop exit codes from masquerading as signals
When a child process exits normally (si_code == CLD_EXITED),
siginfo.si_status contains the exit code. When it is killed by a signal
(si_code == CLD_KILLED or CLD_DUMPED), si_status contains the signal
number. However, assert_signal_internal() returns si_status blindly.
This causes exit codes to be misinterpreted as signal numbers.
This allows failing tests to silently pass if their exit code
numerically coincides with the expected signal. For example, a test
expecting SIGABRT (6) would incorrectly pass if the child simply exited
with status 6 instead of being killed by a signal.
Fix this by checking si_code. Only return si_status as a signal number
if the child was actually killed by a signal (CLD_KILLED or CLD_DUMPED).
If the child exited normally (CLD_EXITED), return 0 to indicate that no
signal occurred.
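
The si_code check can be illustrated with a small pure function. The name siginfo_to_signal() is hypothetical; the sketch only assumes a siginfo_t as filled in by waitid().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>

/* Hypothetical helper: interprets si_status as a signal number only when
 * si_code says the child was killed by a signal. For CLD_EXITED (where
 * si_status holds an exit code) it returns 0. */
int siginfo_to_signal(const siginfo_t *si) {
        if (si->si_code == CLD_KILLED || si->si_code == CLD_DUMPED)
                return si->si_status;
        return 0;
}
```

An exit status of 6 and death by SIGABRT (6) carry the same si_status, so only si_code tells them apart.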
Chris Down [Wed, 19 Nov 2025 08:49:22 +0000 (16:49 +0800)]
tests: Avoid variable shadowing in ASSERT_SIGNAL
The ASSERT_SIGNAL macro uses a fixed variable name, `_r`. This prevents
nesting the macro (like ASSERT_SIGNAL(ASSERT_SIGNAL(...))), as the inner
instance would shadow the outer instance's variable.
Switch to using the UNIQ_T helper to generate unique variable names at
each expansion level. This allows the macro to be used recursively,
which is required for upcoming regression tests regarding signal
handling logic.
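
The general technique looks roughly like this. The SKETCH_* and TWICE macros are hypothetical stand-ins, not the real UNIQ_T or ASSERT_SIGNAL definitions, and the sketch relies on two GNU C extensions supported by GCC and Clang: statement expressions and __COUNTER__.

```c
#include <assert.h>

#define SKETCH_CONCAT_INNER(a, b) a##b
#define SKETCH_CONCAT(a, b) SKETCH_CONCAT_INNER(a, b)

/* Builds a variable name from a base name and a per-expansion id */
#define SKETCH_UNIQ_T(name, id) SKETCH_CONCAT(name, id)

/* The id is passed in once, so every use inside one expansion refers to
 * the same variable, while nested expansions get different names. */
#define TWICE_INTERNAL(id, x)                            \
        ({                                               \
                int SKETCH_UNIQ_T(_r, id) = (x);         \
                SKETCH_UNIQ_T(_r, id) * 2;               \
        })

/* __COUNTER__ yields a fresh number for each expansion of TWICE() */
#define TWICE(x) TWICE_INTERNAL(__COUNTER__, (x))
```

Because each level declares a differently named variable, TWICE(TWICE(3)) nests without the inner variable shadowing the outer one.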
Daan De Meyer [Wed, 19 Nov 2025 09:30:01 +0000 (10:30 +0100)]
tools: Add script to detect unused symbols in libshared
Symbols exported by libshared can't get pruned by the linker, so
every unused exported symbol is effectively dead code we ship to users
for no good reason. Let's add a script to analyze how many such symbols
we have.
We also add a meson test to run the script on all of our binaries.
Since it detects unused symbols and still has a few false positives,
don't enable the test by default, similar to the clang-tidy tests.
The script was 100% vibe coded by GitHub Copilot with Claude Sonnet 4.5
as the model.
Current results are (without the unused symbols list):
Analysis of libsystemd-shared-259.so
======================================================================
Total exported symbols: 4830
(excluding public API symbols starting with 'sd_')
Used symbols: 4672
Unused symbols: 158
Usage rate: 96.7%
man: use prefix number that matches the general suggestion
`systemd.network(5)` recommends “that each filename is prefixed with a number
smaller than "70" (e.g. 10-eth0.network)”.
Reduce the number used by the example accordingly, but stay above the number
(`50`) used in the earlier example for static configuration, so that the static
configuration takes precedence over the dynamic one if both match the same network.
Before:
$ build/systemd-creds --uid=asdf
Failed to resolve user 'asdf': No such process
Now:
$ build/systemd-creds --uid=asdf
Failed to resolve user 'asdf': Unknown user
core: improve messages about unknown users and groups
Before:
$ sudo build/systemd-run --uid=asdf whoami
$ journalctl -e
(whoami)[1007784]: run-p1007782-i5200512.service: Failed to determine user credentials: No such process
(whoami)[1007784]: run-p1007782-i5200512.service: Failed at step USER spawning /usr/sbin/whoami: No such process
systemd[1]: run-p1007782-i5200512.service: Main process exited, code=exited, status=217/USER
systemd[1]: run-p1007782-i5200512.service: Failed with result 'exit-code'.
Now:
(whoami)[1013204]: run-p1013202-i5205932.service: Failed to determine credentials for user 'asdf': Unknown user
(whoami)[1013204]: run-p1013202-i5205932.service: Failed at step USER spawning /usr/sbin/whoami: Invalid argument
systemd[1]: run-p1013202-i5205932.service: Main process exited, code=exited, status=217/USER
systemd[1]: run-p1013202-i5205932.service: Failed with result 'exit-code'.
Before:
$ sudo build/systemd-run --scope --uid=asdf whoami
Failed to resolve user asdf: No such process
Now:
$ sudo build/systemd-run --scope --uid=asdf whoami
Failed to resolve user 'asdf': Unknown user
tmpfiles: improve error message for missing user/group
From a boot with a dracut initrd:
systemd-tmpfiles[242]: /usr/lib/tmpfiles.d/tpm2-tss-fapi.conf:2: Failed to resolve user 'tss': No such process
systemd-tmpfiles[242]: Failed to parse ACL "default:group:tss:rwx", ignoring: Invalid argument
systemd-tmpfiles[242]: /usr/lib/tmpfiles.d/tpm2-tss-fapi.conf:4: Failed to resolve user 'tss': No such process
systemd-tmpfiles[242]: Failed to parse ACL "default:group:tss:rwx", ignoring: Invalid argument
systemd-tmpfiles[242]: /usr/lib/tmpfiles.d/tpm2-tss-fapi.conf:6: Failed to resolve group 'tss': No such process
systemd-tmpfiles[242]: /usr/lib/tmpfiles.d/tpm2-tss-fapi.conf:7: Failed to resolve group 'tss': No such process
udev: define a generic helper to print messages about unknown users and groups
We cannot just use %m, because strerror returns a confusing error message
for ESRCH or ENOEXEC. udev code was doing a good job, but the error handling
was very verbose. Let's encapsulate the customized error messages in a
helper.
No functional change, except that the error messages have a slightly different
form now. The old messages were a bit better, but we don't have as much
flexibility in the new scheme. "Failed to resolve user 'foo': Unknown user"
should be good enough.
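
A helper of this kind might look roughly like the sketch below. The function names are hypothetical and this is not the helper added by the commit. The "Unknown user" text comes from the message quoted above; the ENOEXEC wording is a placeholder, since the source does not give it.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: like strerror(), but with friendlier texts for the
 * error codes that user lookups return. Accepts negative errno-style values. */
const char *user_lookup_strerror(int error) {
        if (error < 0)
                error = -error;

        switch (error) {
        case ESRCH:
                return "Unknown user";
        case ENOEXEC:
                return "Invalid user specification"; /* placeholder wording, not from the source */
        default:
                return strerror(error);
        }
}

/* Formats a message in the "Failed to resolve user 'foo': ..." form. */
int format_resolve_user_error(char *buf, size_t size, const char *name, int error) {
        return snprintf(buf, size, "Failed to resolve user '%s': %s",
                        name, user_lookup_strerror(error));
}
```

Callers then need a single call per error site instead of an open-coded switch on the error value.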
network: gracefully disable resolve hook when socket is disabled
systemd-networkd cannot create the directory /run/systemd/resolve.hook/. Even
if the directory exists, it is not owned by systemd-network user/group, so
systemd-networkd cannot create a socket file in the directory. Hence, if the
systemd-networkd-resolve-hook.socket unit is disabled, networkd fails to open
the varlink socket, and fails to start:
systemd-networkd[1304645]: Failed to bind to systemd-resolved hook Varlink socket: Permission denied
systemd-networkd[1304645]: Could not set up manager: Permission denied
systemd[1]: systemd-networkd.service: Main process exited, code=exited, status=1/FAILURE
systemd[1]: systemd-networkd.service: Failed with result 'exit-code'.
systemd[1]: Failed to start systemd-networkd.service - Network Management.
If the socket unit is disabled, that should mean the system administrator wants
to disable the feature. Let's not try to set up the varlink socket in that case.
Now that the resolve hook feature can be toggled by enabling/disabling the socket
unit, let's drop the $SYSTEMD_NETWORK_RESOLVE_HOOK environment variable.
Simon Barth [Mon, 10 Nov 2025 20:57:24 +0000 (21:57 +0100)]
man: Fix systemd-analyze exit-status example output
The output of `systemd-analyze exit-status` changed in commit e04ed6db6b44681b7a7876b9c4a1e6adaf877670, so that the exit-status class
for EXIT_SUCCESS and EXIT_FAILURE is "libc" instead of "glibc".
This commit makes the example output in the man-page match the actual
output again.
Mike Yuan [Sat, 15 Nov 2025 20:06:39 +0000 (21:06 +0100)]
core/unit: mark running reload job as canceled if the unit deactivated
The semantics of reload is that the service updates its extrinsic state
and continues execution. If it actually deactivated we shouldn't
spuriously notify the caller that reload succeeded.
Mike Yuan [Sun, 16 Nov 2025 14:59:28 +0000 (15:59 +0100)]
core/unit: no need to handle intermediate job types in unit_process_job()
Installed jobs are always collapsed, i.e. can only be of types
accepted by job_run_and_invalidate() modulo JOB_NOP which is
stored in Unit.nop_job (if any). Let's trim the unreachable
branches.
libutmps does not support utmpxname(): the function always fails
with ENOSYS, and the library always uses its own file.
However, our code relies on the function succeeding.
Let's revert the change for now, and revisit it later if musl users
request libutmps support.
Philip Withnall [Sun, 2 Nov 2025 11:34:03 +0000 (11:34 +0000)]
docs: Update MEMORY_PRESSURE to mention recent improvements in GLib
See https://gitlab.gnome.org/GNOME/glib/-/issues/2931 for the changes in
GLib upstream. Using `GMemoryMonitor` is now more compliant with the
systemd recommended approach, but it needs further work to read the
recommended environment variables rather than unconditionally accessing
the per-cgroup PSI kernel file directly.
Signed-off-by: Philip Withnall <pwithnall@gnome.org>
Saying "table" everywhere is not needed. Everybody can see that the table
is a table is a table. Also tweak the grammar in various places to make
reading nicer.
Pressing Fn+F10 on Acer Nitro 5 AN515-58 incorrectly triggers display
brightness down (scancode 0xef) instead of keyboard backlight control,
causing the screen to go completely dark. Similarly, Fn+F9 (scancode
0xf0) has no function explicitly stated in hwdb, causing unknown keycode
debug messages.
Both keys should control the keyboard backlight as labeled on the
keyboard. Map scancodes 0xef and 0xf0 to kbdillumup and kbdillumdown
respectively to enable proper keyboard backlight control.
repart: avoid label string clashes between LUKS superblocks and the filesystems on them
Let's make sure that by default /dev/disk/by-label/ symlinks avoid
ambiguities, and the LUKS volume carries a different one than the file
system inside it.
NEWS: cleanups and rewordings, extend the section about musl
I think we should make it clear that the "incomplete musl support" does not
mean that it'll for certain be completed later. The feedback from users will be
an important consideration.
Yu Watanabe [Sun, 16 Nov 2025 11:14:00 +0000 (20:14 +0900)]
log: make each string generated in log_format_iovec() NUL terminated
Nowadays, we append an extra NUL to each data item where possible, for safety.
We already do the same for example at write_to_kmsg(), log_do_context(),
write_to_journal(), log_struct_iovec_internal(), and so on.
This does not change any behavior, as the iov_len field is unchanged.
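
The pattern can be sketched as follows. The helper name iovec_dup_nul_terminated() is hypothetical and this is not the log_format_iovec() code: the buffer gets one extra byte holding a NUL, while iov_len keeps counting only the payload.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/uio.h>

/* Hypothetical helper: copies len bytes into a freshly allocated buffer
 * with one extra trailing NUL. iov_len excludes that NUL, so consumers
 * that honor iov_len see exactly the same bytes as before.
 * Returns 0 on success, -1 on allocation failure. */
int iovec_dup_nul_terminated(struct iovec *iov, const void *data, size_t len) {
        char *buf;

        buf = malloc(len + 1);
        if (!buf)
                return -1;

        if (len > 0)
                memcpy(buf, data, len);
        buf[len] = '\0';

        iov->iov_base = buf;
        iov->iov_len = len;
        return 0;
}
```

Code that mistakenly treats iov_base as a C string then stops at the terminator instead of reading past the buffer.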
In a typical output from systemd-repart, the output is very wide and any wasted
space is bad because it pushes the interesting information even further to the
right. We usually need at most one or two digits to express the partition
numbers, so let's shorten the title of the column to effectively remove two
columns in the output.
In JSON output, the old field name is retained. This follows the pattern
already used for field "drop-in_files".
Also always right-align the columns with numbers. It doesn't make
sense to align the columns which are only used for JSON output, so stop setting
alignment for those.
Charlie Le [Mon, 17 Nov 2025 13:34:03 +0000 (08:34 -0500)]
hwdb: Add Elecom IST Pro trackball (#39762)
Added entries for the Elecom IST Pro via its three connection methods: a
USB cable, the included G1000 USB receiver, and Bluetooth.
The G1000 USB receiver _may_ have to be removed in the future depending
on the input devices that can connect to it. According to Elecom, the
receiver can have up to three different input devices connected such as
trackballs, mice, keyboards, etc. That said, as far as I can tell, the
IST Pro is the only released Elecom device that uses the receiver. The
non-pro model and the upcoming Elecom Huge Plus might use the same
receiver, but that should not matter as both devices are trackballs.
Yu Watanabe [Sat, 30 Aug 2025 13:25:22 +0000 (22:25 +0900)]
cgroup-util: do not check validity of controller in cg_split_spec()
Now the controller part is always ignored, hence let's skip the check for
the controller part of the spec. This also makes it accept unnormalized
paths. Previously paths were checked by path_is_normalized(), but now they are
checked by path_is_safe(). Also, this now maps an empty path to NULL.
Yu Watanabe [Fri, 29 Aug 2025 21:38:14 +0000 (06:38 +0900)]
tree-wide: replace cg_get_path_and_check() with cg_get_path()
We have dropped cgroup v1 support in v258. When running on cgroup v2,
cg_get_path_and_check() with SYSTEMD_CGROUP_CONTROLLER as controller is
equivalent to checking that we are running on cgroup v2 and then calling
cg_get_path(). Since we can assume we are running on cgroup v2, the
check is no longer necessary, and we can replace
cg_get_path_and_check() with cg_get_path().
Yu Watanabe [Fri, 29 Aug 2025 21:32:56 +0000 (06:32 +0900)]
cgroup-util: drop cgroup v1 support from cg_pid_get_path()
We have dropped cgroup v1 support in v258. Let's drop legacy code.
Then, we can drop 'controller' argument from cg_pid_get_path() and
cg_pidref_get_path().