Yu Watanabe [Wed, 19 Nov 2025 23:19:46 +0000 (08:19 +0900)]
core: Verify inherited FDs are writable for stdout/stderr (#39674)
When inheriting file descriptors for stdout/stderr (either from stdin or
when making stderr inherit from stdout), we previously just assumed they
would be writable and dup'd them. This could lead to broken setups if
the inherited FD was actually opened read-only.
Before dup'ing any inherited FDs to stdout/stderr, verify they are
actually writable using the new fd_is_writable() helper. If not, fall
back to /dev/null (or reopen the terminal in the TTY case) with a
warning, rather than silently creating a broken setup where output
operations would fail.
network: clear existing routes if Gateway= is empty in [Network]
Add support for an empty Gateway= in [Network] to clear the existing
routes. This change will allow users to remove the default route from a
drop-in file.
man: add 'testing' as one of the suggestions for DEPLOYMENT=
Looking at the list, "test" or "testing" seems to be a fairly generic entry
that is missing from the list of suggestions. I went with "testing" because it
fits better with the other item, e.g. "staging".
In https://github.com/systemd/systemd/issues/38743 "laboratory" was also
suggested. I didn't include this because that is more about the location, not
deployment type. Any of the other deployments could be in a "laboratory".
Chris Down [Wed, 19 Nov 2025 19:52:02 +0000 (03:52 +0800)]
tests: ASSERT_SIGNAL: Prevent hallucinating parent as child and confusing exit codes with signals (#39807)
This series fixes two distinct, pretty bad bugs in `ASSERT_SIGNAL`.
These bugs can allow failing tests to pass, and can also cause the test
runner to silently terminate prematurely in a way that looks like
success.
This is not theoretical, see
https://github.com/systemd/systemd/pull/39674#discussion_r2540552699 for
a real case of this happening.
---
Bug 1: Parent process hallucinates it is the child and re-executes the
expression being tested
Previously, assert_signal_internal() returned 0 in two mutually
exclusive states:
1. We are the child process (immediately after fork()).
2. We are the parent process, and the child exited normally (status 0).
The macro failed to distinguish these cases. If a child failed to crash
as expected, the parent received 0, incorrectly interpreted it as it
being the child, and re-executed the test expression inside the parent
process.
This can cause tests to falsely pass. The parent would successfully run
the expression (which wasn't supposed to crash in the parent), succeed,
and call _exit(EXIT_SUCCESS).
The second consequence is silent truncation. When the parent called
_exit(), it terminated the entire test runner immediately. Any
subsequent tests in the same binary were never executed.
---
Bug 2: Conflation of exit codes and signals
The harness returned the raw si_status without checking si_code. This
meant that an exit code was indistinguishable from a signal number. For
example, if a child process failed and called exit(6), the harness
reported it as having been killed by SIGABRT (signal 6).
---
This PR both fixes the bugs and reworks the ASSERT_SIGNAL infrastructure
to ensure this is very unlikely to regress:
- assert_signal_internal now returns an explicit control flow enum
(FORK_CHILD / FORK_PARENT) separate from the status data. This makes it
structurally impossible for the parent to hallucinate that it is the
child.
- The output parameter is only populated with a signal number if si_code
confirms the process was killed by a signal. Normal exits return 0.
Chris Down [Wed, 19 Nov 2025 14:06:03 +0000 (22:06 +0800)]
tests: ASSERT_SIGNAL: Do not allow parent to hallucinate it is the child
assert_signal_internal() returns 0 in two distinct cases:
1. In the child process (immediately after fork returns 0).
2. In the parent process, if the child exited normally (no signal).
ASSERT_SIGNAL fails to distinguish these cases. When a child exited
normally (case 2), the parent process receives 0, incorrectly interprets
it as meaning it is the child, and re-executes the test expression
inside the parent process. Goodness gracious!
This causes two severe test integrity issues:
1. False positives. The parent can run the expression, succeed, and call
_exit(EXIT_SUCCESS), causing the test to pass even though no signal
was raised.
2. Silent truncation. The _exit() call in the parent terminates the test
runner prematurely, preventing subsequent tests in the same file from
running.
Example of the bug in action, from #39674:
ASSERT_SIGNAL(fd_is_writable(closed_fd), SIGABRT)
This test should fail (fd_is_writable does not SIGABRT here), but with
the bug, the parent hallucinated being the child, re-ran the expression
successfully, and exited with success.
Fix this by refactoring assert_signal_internal() to be much more strict
about separating control flow from data.
The signal status is now returned via a strictly typed output parameter,
guaranteeing that determining whether we are the child is never
conflated with whether the child exited cleanly.
Chris Down [Wed, 19 Nov 2025 13:45:40 +0000 (21:45 +0800)]
tests: ASSERT_SIGNAL: Ensure sanitisers do not mask expected signals
ASAN installs signal handlers to catch crashes like SIGSEGV or SIGILL.
When these signals are raised, ASAN traps them, prints an error report,
and then typically terminates the process with a different signal (often
SIGABRT) or a non-zero exit code.
This interferes with ASSERT_SIGNAL when checking for specific crash
signals (for example, checking that a function raises SIGSEGV). In such
a case, the test harness sees the ASAN termination signal rather than
the expected signal, causing the test to fail.
Fix this by resetting the signal handler to SIG_DFL in the child process
immediately before executing the test expression. This ensures the
kernel kills the process directly with the expected signal, bypassing
ASAN's interceptors.
Chris Down [Wed, 19 Nov 2025 08:50:38 +0000 (16:50 +0800)]
tests: ASSERT_SIGNAL: Stop exit codes from masquerading as signals
When a child process exits normally (si_code == CLD_EXITED),
siginfo.si_status contains the exit code. When it is killed by a signal
(si_code == CLD_KILLED or CLD_DUMPED), si_status contains the signal
number. However, assert_signal_internal() returns si_status blindly.
This causes exit codes to be misinterpreted as signal numbers.
This allows failing tests to silently pass if their exit code
numerically coincides with the expected signal. For example, a test
expecting SIGABRT (6) would incorrectly pass if the child simply exited
with status 6 instead of being killed by a signal.
Fix this by checking si_code. Only return si_status as a signal number
if the child was actually killed by a signal (CLD_KILLED or CLD_DUMPED).
If the child exited normally (CLD_EXITED), return 0 to indicate that no
signal occurred.
Chris Down [Mon, 10 Nov 2025 20:26:10 +0000 (04:26 +0800)]
core: Verify inherited FDs are writable for stdout/stderr
When inheriting file descriptors for stdout/stderr (either from stdin
or when making stderr inherit from stdout), we previously just assumed
they would be writable and dup'd them. This could lead to broken setups
if the inherited FD was actually opened read-only.
Before dup'ing any inherited FDs to stdout/stderr, verify they are
actually writable using the new fd_is_writable() helper. If not, fall
back to /dev/null (or reopen the terminal in the TTY case) with a
warning, rather than silently creating a broken setup where output
operations would fail.
Chris Down [Mon, 17 Nov 2025 03:05:09 +0000 (11:05 +0800)]
fd-util: Add fd_is_writable() to check if FD is opened for writing
This checks whether a file descriptor is valid and opened in a mode that
allows writing (O_WRONLY or O_RDWR). This is useful when we want to
verify that inherited FDs can actually be used for output operations
before dup'ing them.
The helper explicitly handles O_PATH file descriptors, which cannot be
used for I/O operations and thus are never writable.
Chris Down [Wed, 19 Nov 2025 08:49:22 +0000 (16:49 +0800)]
tests: Avoid variable shadowing in ASSERT_SIGNAL
The ASSERT_SIGNAL macro uses a fixed variable name, `_r`. This prevents
nesting the macro (like ASSERT_SIGNAL(ASSERT_SIGNAL(...))), as the inner
instance would shadow the outer instance's variable.
Switch to using the UNIQ_T helper to generate unique variable names at
each expansion level. This allows the macro to be used recursively,
which is required for upcoming regression tests regarding signal
handling logic.
Daan De Meyer [Wed, 19 Nov 2025 09:30:01 +0000 (10:30 +0100)]
tools: Add script to detect unused symbols in libshared
Symbols exported by libshared can't get pruned by the linker, so
every unused exported symbol is effectively dead code we ship to users
for no good reason. Let's add a script to analyze how many such symbols
we have.
We also add a meson test to run the script on all of our binaries.
Since it detects unused symbols and still has a few false positives,
don't enable the test by default similar to the clang-tidy tests.
The script was 100% vibe coded by Github Copilot with Claude Sonnet 4.5
as the model.
Current results are (without the unused symbols list):
Analysis of libsystemd-shared-259.so
======================================================================
Total exported symbols: 4830
(excluding public API symbols starting with 'sd_')
Used symbols: 4672
Unused symbols: 158
Usage rate: 96.7%
ssh-generator: suppress error message for vsock EADDRNOTAVAIL
In logs in the Fedora OpenQA CI:
Nov 17 22:20:06 fedora systemd-ssh-generator[4117]: Failed to query local AF_VSOCK CID: Cannot assign requested address
Nov 17 22:20:06 fedora (generato[4088]: /usr/lib/systemd/system-generators/systemd-ssh-generator failed with exit status 1.
Nov 17 22:20:06 fedora systemd[1]: sshd-vsock.socket: Unit configuration changed while unit was running, and no socket file descriptors are open. Unit not functional until restarted.
AF_VSOCK is not configured there and systemd-ssh-generator should just exit
quietly. vsock_get_local_cid() already does some logging at debug level, so we
don't need to.
There is also a second bug, we report modifications to the unit have just
created. I think we have an issue open for this somewhere, but cannot find it.
man: use prefix number that matches the general suggestion
`systemd.network(5)` recommends “that each filename is prefixed with a number
smaller than "70" (e.g. 10-eth0.network)”.
Reduce that used by the example accordingly, but stay above the number (`50`)
used in the earlier example for static configuration, so that would take
precedence over the dynamic one if both match for the same network.
Before:
$ build/systemd-creds --uid=asdf
Failed to resolve user 'asdf': No such process
Now:
$ build/systemd-creds --uid=asdf
Failed to resolve user 'asdf': Unknown user
core: improve messages about unknown users and groups
$ sudo build/systemd-run --uid=asdf whoami
$ journalctl -e
(whoami)[1007784]: run-p1007782-i5200512.service: Failed to determine user credentials: No such process
(whoami)[1007784]: run-p1007782-i5200512.service: Failed at step USER spawning /usr/sbin/whoami: No such process
systemd[1]: run-p1007782-i5200512.service: Main process exited, code=exited, status=217/USER
systemd[1]: run-p1007782-i5200512.service: Failed with result 'exit-code'.
Now:
(whoami)[1013204]: run-p1013202-i5205932.service: Failed to determine credentials for user 'asdf': Unknown user
(whoami)[1013204]: run-p1013202-i5205932.service: Failed at step USER spawning /usr/sbin/whoami: Invalid argument
systemd[1]: run-p1013202-i5205932.service: Main process exited, code=exited, status=217/USER
systemd[1]: run-p1013202-i5205932.service: Failed with result 'exit-code'.
Before:
$ sudo build/systemd-run --scope --uid=asdf whoami
Failed to resolve user asdf: No such process
Now:
$ sudo build/systemd-run --scope --uid=asdf whoami
Failed to resolve user 'asdf': Unknown user
tmpfiles: improve error message for missing user/group
From a boot with a dracut initrd:
systemd-tmpfiles[242]: /usr/lib/tmpfiles.d/tpm2-tss-fapi.conf:2: Failed to resolve user 'tss': No such process
systemd-tmpfiles[242]: Failed to parse ACL "default:group:tss:rwx", ignoring: Invalid argument
systemd-tmpfiles[242]: /usr/lib/tmpfiles.d/tpm2-tss-fapi.conf:4: Failed to resolve user 'tss': No such process
systemd-tmpfiles[242]: Failed to parse ACL "default:group:tss:rwx", ignoring: Invalid argument
systemd-tmpfiles[242]: /usr/lib/tmpfiles.d/tpm2-tss-fapi.conf:6: Failed to resolve group 'tss': No such process
systemd-tmpfiles[242]: /usr/lib/tmpfiles.d/tpm2-tss-fapi.conf:7: Failed to resolve group 'tss': No such process
udev: define a generic helper to print messages about unknown users and groups
We cannot just use %m, because strerror returns a confusing error message
for ESRCH or ENOEXEC. udev code was doing a good job, but the error handling
was very verbose. Let's encapsulate the customized error messages in a
helper.
No functional change, except that the error messages have a slightly different
form now. The old messages were a bit better, but we don't have as much
flexibility in the new scheme. "Failed to resolve user 'foo': Unknown user"
should be good enough.
network: gracefully disable resolve hook when socket is disabled
systemd-networkd cannot create the directory /run/systemd/resolve.hook/. Even
if the directory exists, it is not owned by systemd-network user/group, so
systemd-networkd cannot create socket file in the directory. Hence, if the
systemd-networkd-resolve-hook.socket unit is disabled, networkd fails to open
the varlink socket, and fail to start:
systemd-networkd[1304645]: Failed to bind to systemd-resolved hook Varlink socket: Permission denied
systemd-networkd[1304645]: Could not set up manager: Permission denied
systemd[1]: systemd-networkd.service: Main process exited, code=exited, status=1/FAILURE
systemd[1]: systemd-networkd.service: Failed with result 'exit-code'.
systemd[1]: Failed to start systemd-networkd.service - Network Management.
If the socket unit is disabled, that should mean the system administrator wants
to disable the feature. Let's not try to setup the varlink socket in that case.
Now the resolve hook feature can be toggled by enabling/disabling the socket
unit, let's drop the $SYSTEMD_NETWORK_RESOLVE_HOOK environment variable.
Simon Barth [Mon, 10 Nov 2025 20:57:24 +0000 (21:57 +0100)]
man: Fix systemd-analyze exit-status example output
The output of `systemd-analyze exit-status` changed in commit e04ed6db6b44681b7a7876b9c4a1e6adaf877670, so that the exit-status class
for EXIT_SUCCESS and EXIT_FAILURE is "libc" instead of "glibc".
This commit makes the example output in the man-page match the actual
output again.
Mike Yuan [Sat, 15 Nov 2025 20:06:39 +0000 (21:06 +0100)]
core/unit: mark running reload job as canceled if the unit deactivated
The semantics of reload is that the service updates its extrinsic state
and continues execution. If it actually deactivated we shouldn't
spuriously notify the caller that reload succeeded.
Mike Yuan [Sun, 16 Nov 2025 14:59:28 +0000 (15:59 +0100)]
core/unit: no need to handle intermediate job types in unit_process_job()
Installed jobs are always collapsed, i.e. can only be of types
accepted by job_run_and_invalidate() modulo JOB_NOP which is
stored in Unit.nop_job (if any). Let's trim the unreachable
branches.
libutmps does not support utmpxname(), the function always fails
with ENOSYS, and always uses their own file.
However, our code relies on the funtion needs to succeed.
Let's revert the change now, and revisit later when musl users
request to support libutmps.
Philip Withnall [Sun, 2 Nov 2025 11:34:03 +0000 (11:34 +0000)]
docs: Update MEMORY_PRESSURE to mention recent improvements in GLib
See https://gitlab.gnome.org/GNOME/glib/-/issues/2931 for the changes in
GLib upstream. Using `GMemoryMonitor` is now more compliant with the
systemd recommended approach, but it needs further work to read the
recommended environment variables rather than unconditionally accessing
the per-cgroup PSI kernel file directly.
Signed-off-by: Philip Withnall <pwithnall@gnome.org>
Saying "table" everywhere is not needed. Everybody can see that the table
is a table is a table. Also tweak the grammar in various places to make
reading nicer.
Pressing Fn+F10 on Acer Nitro 5 AN515-58 incorrectly triggers display
brightness down (scancode 0xef) instead of keyboard backlight control,
causing the screen to go completely dark. Similarly, Fn+F9 (scancode
0xf0) has no function explictily stated in hwdb causing unknown keycode
debug messages.
Both keys should control the keyboard backlight as labeled on the
keyboard. Map scancodes 0xef and 0xf0 to kbdillumup and kbdillumdown
respectively to enable proper keyboard backlight control.
repart: avoid label string clashes between LUKS superblocks and the filesystems on them
Let's make sure that by default /dev/disk/by-label/ symlinks avoid
ambiguities, and the LUKS volume carries a different one than the file
system inside it.
NEWS: cleanups and rewordings, extend the section about musl
I think we should make it clear that the "incomplete musl support" does not
mean that it'll for certain be completed later. The feedback from users will be
an important consideration.
Yu Watanabe [Sun, 16 Nov 2025 11:14:00 +0000 (20:14 +0900)]
log: make each string generated in log_format_iovec() NUL terminated
Nowadays, we append an extra NUL for each data if possible for safety.
We already do the same for example at write_to_kmsg(), log_do_context(),
write_to_journal(), log_struct_iovec_internal(), and so on.
This does not change any behavior, as the iov_len field is unchanged.
In a typical output from systemd-repart, the output is very wide any any wasted
space is bad because it pushes the interesting information even further to the
right. We usually need at most one or two digits to express the partition
numbers, so let's shorten the title of the column to effectively remove two
columns in the output.
In JSON output, the old field name is retained. This follows the pattern
already used for field "drop-in_files".
Also right-align the columns with numbers always to the right. I doesn't make
sense to align the columns which are only used for JSON output, so stop setting
alignment for those.
Charlie Le [Mon, 17 Nov 2025 13:34:03 +0000 (08:34 -0500)]
hwdb: Add Elecom IST Pro trackball (#39762)
Added entries for the Elecom IST Pro via its three connection methods- a
USB cable, the included G1000 USB receiver, and Bluetooth.
The G1000 USB receiver _may_ have to be removed in the future depending
on the input devices that can connect to it. According to Elecom, the
receiver can have up to three different input devices connected such as
trackballs, mice, keyboards, etc. That said, as far as I can tell, the
IST Pro is the only released Elecom device that uses the receiver. The
non-pro model and the upcoming Elecom Huge Plus might use the same
receiver, but that should not matter as both devices are trackballs.
Yu Watanabe [Sat, 30 Aug 2025 13:25:22 +0000 (22:25 +0900)]
cgroup-util: do not check validity of controller in cg_split_spec()
Now the controller part is always ignored, hence let's skip check for
the controller part of the spec. This also make it acceppt unnormalized
path. Previously paths were checked by path_is_normalized(), but now
checked by path_is_safe(). Also, now this mapps an empty path to NULL.