This revert is necessary because the change breaks mDNS hostname stability
whenever a DNS-SD service calls UnregisterService. When a service
unregisters (e.g. on process restart), manager_refresh_rrs() clears and
re-adds all RRs in PROBING state, which sends a multicast announcement
(QR=1). The kernel reflects this back to resolved's own socket. Because
the local-address check was moved inside the query-only branch by the
reverted commit, the reply path in on_mdns_packet() is now unguarded.
The looped-back announcement matches the pending probe transaction and
completes it with DNS_TRANSACTION_SUCCESS. Since the zone item is still
in PROBING state (not ESTABLISHED), dns_zone_item_notify() sets
we_lost=true and calls dns_zone_item_conflict(), which invokes
manager_next_hostname() and renames the hostname (e.g. foo.local →
foo4.local). This happens reliably on every restart of any service using
RegisterService/UnregisterService (homebridge, avahi-compat wrappers,
etc.).
The top-level local-address check in on_mdns_packet() suppresses all
looped-back multicast traffic before the reply/query split. Restoring it
there is consistent with the overall design: dns_scope_check_conflicts()
already has its own manager_packet_from_local_address() guard and is
unaffected.
A more targeted long-term fix (e.g. guarding dns_transaction_process_reply()
for mDNS, or avoiding unnecessary re-probing of already-established records
in manager_refresh_rrs()) can be pursued separately.
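The ordering problem described above can be modelled in a few lines. This is only a toy sketch, not resolved's code: `Packet`, `handle_mdns_packet` and the `guard_at_top` switch are hypothetical names used to contrast the restored check placement with the reverted one.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    sender: str      # source address of the packet
    is_reply: bool   # True for QR=1 (replies and announcements)


def handle_mdns_packet(packet, local_addresses, guard_at_top):
    """Return the path a packet takes: 'ignored', 'reply' or 'query'."""
    from_local = packet.sender in local_addresses

    if guard_at_top and from_local:
        # Restored placement: looped-back traffic is dropped before
        # the reply/query split.
        return "ignored"

    if packet.is_reply:
        # With the guard only in the query branch, our own announcement
        # ends up here and can complete a pending probe transaction.
        return "reply"

    if from_local:
        # Reverted placement: the check only covers queries.
        return "ignored"

    return "query"
```

With `guard_at_top=False`, a looped-back announcement from a local address is classified as `"reply"`, which is the unguarded path the revert closes again.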
shared: drop redundant cryptsetup_enable_logging(NULL) calls (#41785)
These were only used to implicitly load libcryptsetup at startup.
dlopen_cryptsetup() now calls cryptsetup_enable_logging(NULL) itself,
and every code path that uses libcryptsetup calls dlopen_cryptsetup()
before doing so, so the upfront calls are no longer needed.
repart: raise log level to LOG_ERR if dlopen_fdisk() fails
libfdisk is required by systemd-repart and it silently exits if dlopen fails
(unless the debug log level is set):
```
$ SYSTEMD_LOG_LEVEL=debug systemd-repart
Shared library 'libfdisk.so.1' is not available: libfdisk.so.1: cannot open shared object file: No such file or directory
$ echo $?
1
```
repart: trim NUL bytes from verity sig split artifact
The verity signature partition content is a bare JSON object. Repart
pads it with zeros to fill the GPT partition. But when splitting out
the content as an individual file, the padding remains, so it's not
a valid text file.
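As an illustration of the trimming step, here is a minimal sketch that strips trailing NUL padding from a blob before parsing it as JSON. The helper name and the sample JSON key are made up for the example; how repart implements the trimming internally is not shown here.

```python
import json


def trim_nul_padding(data: bytes) -> bytes:
    """Drop trailing NUL bytes, leaving the JSON text itself untouched."""
    return data.rstrip(b"\x00")


# Hypothetical partition content: a small JSON object padded with zeros.
padded = b'{"example":1}' + b"\x00" * 32
parsed = json.loads(trim_nul_padding(padded))
```

Only trailing bytes are removed, so a NUL byte in the middle of the content would still be reported as an error by the JSON parser.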
gpt-auto-generator: do not fail on missing libcryptsetup when verity is not used
add_veritysetup() is called unconditionally from add_root_mount() and
add_usr_mount() whenever in_initrd() is true, to generate units that
only activate if verity devices appear. However, when compiled without
libcryptsetup, this function returned a hard error, causing the entire
generator to fail even when no verity protection is in use.
Change the #else fallback to log a debug message and return 0, matching
the pattern already used by add_root_cryptsetup().
shared: drop redundant dlopen_cryptsetup() calls from cryptsetup_* helpers
cryptsetup_set_minimal_pbkdf(), cryptsetup_get_token_as_json() and
cryptsetup_add_token_json() each take a struct crypt_device *cd, which
can only be obtained by first calling sym_crypt_init*() — and that
already requires dlopen_cryptsetup() to have succeeded. The internal
calls here were only implicitly re-loading a library the caller is
guaranteed to have already loaded.
cryptsetup: load libcryptsetup via dlopen in setup binaries
Convert systemd-cryptsetup, systemd-cryptenroll, systemd-veritysetup
and systemd-integritysetup to go through the existing dlopen wrapper
for libcryptsetup instead of linking the library directly. Each binary
calls dlopen_cryptsetup() at the start of its run() and uses the sym_*
variants for every libcryptsetup entry point.
Extend cryptsetup-util.{h,c} to cover the libcryptsetup symbols that
these binaries use and that the wrapper was missing:
crypt_activate_by_token_pin, crypt_deactivate, crypt_init_data_device,
crypt_keyslot_status, crypt_set_keyring_to_link (conditional on
HAVE_CRYPT_SET_KEYRING_TO_LINK), crypt_status and
crypt_token_external_path.
With no direct callers of crypt_free() left, drop the non-sym
crypt_freep cleanup variant and rename sym_crypt_freep back to
crypt_freep via DEFINE_TRIVIAL_CLEANUP_FUNC_FULL_RENAME, matching the
naming convention used by other dlopen wrappers (acl_freep,
xkb_context_unrefp, ...). Update the remaining users in src/shared,
src/repart, src/home and src/growfs to the new name.
The four affected meson targets switch from libcryptsetup to
libcryptsetup_cflags so they no longer record a DT_NEEDED entry for
libcryptsetup.so.12.
shared: load libgnutls and libmicrohttpd via dlopen
Convert the GnuTLS and libmicrohttpd usage in journal-remote to the
dlopen pattern used by other optional shared libraries. A new
src/shared/gnutls-util.{h,c} declares the GnuTLS entry points via
DLSYM_PROTOTYPE and resolves them in dlopen_gnutls(); microhttpd-util
is moved from src/journal-remote to src/shared and gains analogous
DLSYM_PROTOTYPEs plus dlopen_microhttpd(). Callers in journal-gatewayd,
journal-remote-main and microhttpd-util itself call the sym_* wrappers
and invoke dlopen_gnutls()/dlopen_microhttpd() at their entry points.
setup_gnutls_logger() no longer fails when libgnutls is missing at
runtime; it logs a notice and returns 0 so journal-gatewayd starts up
without TLS dependencies installed.
The meson files gain libgnutls_cflags and libmicrohttpd_cflags partial
dependencies that expose include paths and compile flags only. Every
systemd-journal-{gatewayd,remote,upload} target switches to the cflags
variant, dropping the direct libgnutls/libmicrohttpd link. The
gatewayd->remote object-sharing dance for microhttpd-util.o goes away
since the code now lives in libshared.
test-dlopen-so gains assertions for dlopen_gnutls and dlopen_microhttpd.
test: wrap mount/umount when running with sanitizers
On Fedora Rawhide mount/umount is linked against libsystemd, which then
breaks the binaries in sanitizer runs, as we try to run instrumented
code from an uninstrumented binary:
bash-5.3# ldd /usr/bin/mount
linux-vdso.so.1 (0x00007fa757ef9000)
libmount.so.1 => /lib64/libmount.so.1 (0x00007fa757e84000)
libselinux.so.1 => /lib64/libselinux.so.1 (0x00007fa757e51000)
libc.so.6 => /lib64/libc.so.6 (0x00007fa757c56000)
libblkid.so.1 => /lib64/libblkid.so.1 (0x00007fa757c16000)
libsystemd.so.0 => /lib64/libsystemd.so.0 (0x00007fa757400000)
libpcre2-8.so.0 => /lib64/libpcre2-8.so.0 (0x00007fa75734f000)
/lib64/ld-linux-x86-64.so.2 (0x00007fa757efb000)
libclang_rt.asan.so => /usr/lib/clang/22/lib/x86_64-redhat-linux-gnu/libclang_rt.asan.so (0x00007fa756800000)
libm.so.6 => /lib64/libm.so.6 (0x00007fa7566e4000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007fa7566b7000)
libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00007fa756400000)
bash-5.3# mount
==458==ASan runtime does not come first in initial library list; you should either link runtime to your application or manually preload it with LD_PRELOAD.
This then breaks the whole machine, as mount is quite essential during
boot.
Let's just add mount/umount to the list of wrapped binaries to fix this.
Daan De Meyer [Sun, 29 Mar 2026 11:15:35 +0000 (11:15 +0000)]
nspawn: add --forward-journal= and --forward-journal-*= options
Add --forward-journal=FILE|DIR to forward the container's journal
entries to the host via systemd-journal-remote. When specified,
nspawn starts systemd-journal-remote listening on a Unix socket,
bind-mounts it into the container at /run/host/journal/socket, and
passes a journal.forward_to_socket credential pointing to it.
Add --forward-journal-max-use=, --forward-journal-keep-free=,
--forward-journal-max-file-size=, and --forward-journal-max-files=
to configure disk usage limits for the forwarded journal.
Consolidate nspawn's per-machine on-disk state under a single runtime
directory at /run/systemd/nspawn/<machine>/. The container rootdir
mount point moves from /tmp/nspawn-root-XXXXXX to <runtime_dir>/root,
the unix-export directory moves from
/run/systemd/nspawn/unix-export/<machine> to <runtime_dir>/unix-export,
and the journal-remote socket lives at
<runtime_dir>/journal-remote-socket. Update ssh-generator and
ssh-proxy to follow the new unix-export path layout.
Extract fork_journal_remote() into fork-notify.{c,h} as a shared
helper used by both nspawn and vmspawn, replacing vmspawn's
start_systemd_journal_remote().
Extract runtime_directory_make() into path-lookup.{c,h} as a shared
helper used by both nspawn and vmspawn, replacing vmspawn's inline
runtime directory creation logic.
Co-developed-by: Claude Opus 4.6 <noreply@anthropic.com>
Daan De Meyer [Fri, 27 Mar 2026 14:38:09 +0000 (14:38 +0000)]
vmspawn,journal-remote: add journal forwarding disk usage options
Add options to vmspawn to configure journal-remote disk usage limits
when forwarding journal entries from the VM. These are passed through
as --max-use=, --keep-free=, --max-file-size=, and --max-files=
command-line arguments to systemd-journal-remote.
Add --max-use=, --keep-free=, --max-file-size=, and --max-files=
command-line options to systemd-journal-remote to allow overriding the
corresponding settings from the configuration file.
Add $SYSTEMD_JOURNAL_REMOTE_CONFIG_FILE environment variable support
to systemd-journal-remote. When set, the specified file is used
instead of the default configuration file and drop-in directories.
When set to the empty string or /dev/null, configuration file parsing
is skipped entirely. vmspawn sets this to /dev/null in the child
process to avoid inheriting the host's journal-remote configuration.
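The three cases for the environment variable can be summarised in a small decision function. This is a sketch of the semantics as described, not the actual parser; `resolve_config` and the `DEFAULT_CONFIG` placeholder are hypothetical.

```python
# Placeholder for the built-in default; the real path is not given here.
DEFAULT_CONFIG = "<default journal-remote configuration>"


def resolve_config(environ):
    """Map $SYSTEMD_JOURNAL_REMOTE_CONFIG_FILE to a (mode, path) pair."""
    value = environ.get("SYSTEMD_JOURNAL_REMOTE_CONFIG_FILE")
    if value is None:
        # Unset: default configuration file plus drop-in directories.
        return ("default", DEFAULT_CONFIG)
    if value in ("", "/dev/null"):
        # Empty or /dev/null: configuration parsing is skipped entirely.
        return ("skip", None)
    # Anything else: use exactly this file, without drop-ins.
    return ("file", value)
```

vmspawn's use corresponds to the `"skip"` case, which keeps the host's configuration out of the child process.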
Make fork_notify() argv parameter optional. When NULL is passed,
fork_notify() returns 0 in the child (with $NOTIFY_SOCKET set) and
lets the caller run custom code before exec. Returns 1 in the parent.
This allows vmspawn to set environment variables in the child without
polluting the parent process.
Co-developed-by: Claude Opus 4.6 <noreply@anthropic.com>
Replace the getopt_long()-based parser with the FOREACH_OPTION /
OPTION_* macros from src/shared/options.h, mirroring the recent
conversions of nspawn and vmspawn. Each option's metadata (long
name, short name, metavar and help text) now lives next to its
parsing logic, and the --help text is generated from those
definitions via option_parser_get_help_table() instead of being
hard-coded.
Positional file arguments are collected via
option_parser_get_args() rather than strv_skip(argv, optind).
dlopen: take log_level argument and log in fallback stubs
Every dlopen_xxx() helper now takes an int log_level argument. It is
passed through to dlopen_many_sym_or_warn() (which in turn propagates it
to dlopen_verbose() for the library-not-installed case), and is used by
the fallback stub when support for the library is not compiled in to
emit a "<lib> support is not compiled in." message at the caller's level.
Callers pass LOG_DEBUG when gracefully degrading, or a higher level when
the failure should surface, and no longer need to log redundantly at the
call site.
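The idea of letting the caller pick the log level can be sketched with Python's ctypes and logging modules. This is an analogy rather than the C helpers themselves: `dlopen_optional`, the logger name and the choice of EOPNOTSUPP as the error code are assumptions for the example.

```python
import ctypes
import errno
import logging

log = logging.getLogger("dlopen-sketch")


def dlopen_optional(soname, log_level):
    """Try to load a shared library, logging failures at the caller's level.

    Returns (0, handle) on success or (-errno, None) on failure.
    """
    try:
        return 0, ctypes.CDLL(soname)
    except OSError as e:
        # The helper logs once, so call sites need no extra message.
        log.log(log_level, "Shared library '%s' is not available: %s", soname, e)
        return -errno.EOPNOTSUPP, None
```

A caller that can degrade gracefully would pass `logging.DEBUG`, while one that cannot continue without the library would pass `logging.ERROR`.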
As part of this, dlopen_bpf_full() (which already took a log_level) is
merged into dlopen_bpf() rather than keeping both.
The static inline fallbacks used to live in the headers, which required
pulling log.h in from every header that declared a dlopen_xxx(). Move
them into the .c files instead: the declaration is always outside the
#if HAVE_XXX block, the impl sits outside the outer #if HAVE_XXX wrap
with its own internal #if HAVE_XXX/#else/#endif, and apparmor-util.c,
idn-util.c, libmount-util.c and pam-util.c are now always compiled so
they can host their stubs.
mkosi: trim verity.sig json files to remove NUL padding before passing to jq (#41767)
jq started rejecting input that has NUL bytes to fix some security
issues, so we need to trim the verity.sig json files, which are spat
out with the NUL bytes padding from the GPT partition content.
```
‣ Running postinstall script /home/runner/work/systemd/systemd/mkosi/mkosi.postinst.chroot…
jq: parse error: Invalid numeric literal at EOF at line 1, column 16384
‣ "/work/postinst final" returned non-zero exit code 5.
```
It was bumped in a40d93400759c8eb46a2ec8702735bde2333812a but this
is hardly load bearing stuff so let's document the version we actually
require rather than the version that makes a hardly load bearing feature
work properly, especially since v2.41 is extremely new and requiring
distributions to have that is just unrealistic.
This doesn't actually change anything materially except documentation,
but it keeps us honest about depending on stuff from newer util-linux
because we happen to document reliance on an extremely new version.
Coccinelle check(s) failed. For each flagged dereference, either:
- Add assert(param)/ASSERT_PTR(param) at the top of the function (if the parameter must not be NULL)
- Add an if (param) guard before the dereference (if NULL is valid)
- Add POINTER_MAY_BE_NULL(param) if NULL is okay for param
bpf: suppress false-positive clang-tidy/clangd diagnostics under src/bpf
clang-tidy's misc-use-internal-linkage fires on BPF map declarations
(they have the SEC(".maps") attribute and must retain external linkage
so bpftool gen skeleton can resolve them as ELF symbols), and its
misc-include-cleaner flags errno.h as unused even where a /* IWYU
pragma: keep */ is present. clangd's own unused-includes analysis
emits the equivalent diagnostic independently of clang-tidy.
Add src/bpf/.clang-tidy and src/bpf/.clangd that inherit the parent
configs and scope these suppressions to BPF sources only.
bpf: register compile_commands.json entries for bpf programs
compile_commands.json is generated by ninja from c_COMPILER rules, so
BPF programs (built via custom_target and thus emitted as CUSTOM_COMMAND
rules) never show up there. Clangd consequently falls back to
heuristics when opening .bpf.c files, with poor diagnostic fidelity.
Register a meson postconf script per BPF program that upserts an entry
into compile_commands.json using the same argv meson constructed for
the custom_target. The script runs after meson has written the DB,
substitutes @INPUT@/@OUTPUT@, and keys entries by source path so
repeated reconfigures don't accumulate duplicates.
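A minimal sketch of the upsert logic follows, assuming the standard compilation database fields `directory`, `file` and `arguments`. The function names are hypothetical and the real postconf script may be structured differently.

```python
def substitute(argv, source, output):
    """Expand the @INPUT@/@OUTPUT@ placeholders in a command line."""
    return [a.replace("@INPUT@", source).replace("@OUTPUT@", output) for a in argv]


def upsert_entry(db, entry):
    """Replace the entry with the same 'file' key, or append a new one."""
    for i, existing in enumerate(db):
        if existing.get("file") == entry["file"]:
            db[i] = entry
            return db
    db.append(entry)
    return db


# Running the same registration twice leaves a single entry behind.
db = []
argv = substitute(["clang", "-c", "@INPUT@", "-o", "@OUTPUT@"], "a.bpf.c", "a.bpf.o")
upsert_entry(db, {"directory": "/build", "file": "a.bpf.c", "arguments": argv})
upsert_entry(db, {"directory": "/build", "file": "a.bpf.c", "arguments": argv + ["-g"]})
```

Keying on the source path is what keeps repeated reconfigures idempotent: the second call overwrites the first entry instead of adding a duplicate.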
bpf: move all programs into src/bpf/ and consolidate meson logic
The six .bpf.c files and their shared .bpf.h headers now live directly
under src/bpf/, rather than scattered across src/core/bpf/<prog>/,
src/network/bpf/<prog>/ and src/nsresourced/bpf/<prog>/.
All BPF compilation logic — the BPF_FRAMEWORK determination (clang/gcc/
bpftool/llvm-strip lookups), flag and command construction, vmlinux.h
handling, the bpf_programs list and the loop that builds the unstripped
object, the stripped object and the skeleton header for each program —
moves into src/bpf/meson.build. The top-level meson.build only keeps
option handling and the libbpf dependency. subdir('src/bpf') is pulled
up before config.h generation so that BPF_FRAMEWORK, HAVE_VMLINUX_H and
ENABLE_SYSCTL_BPF land in conf in time.
Skeleton wrapper headers (<prog>-skel.h) are now emitted by a
configure_file() template (src/bpf/bpf-skel-wrapper.h.in) at meson
setup, replacing the previously checked-in shims of the same name.
Consumer #include paths are flattened:
"bpf/<prog>/<prog>-skel.h" becomes "<prog>-skel.h",
"bpf/<prog>/<prog>-api.bpf.h" becomes "<prog>-api.bpf.h". src/bpf is
added to includes so the shared BPF headers resolve.
sysctl-write-event.h, now shared between userspace (networkd-sysctl.c)
and BPF (sysctl-monitor.bpf.c) from a single location, gains guarded
includes so pid_t and uint64_t resolve on both sides: vmlinux.h in the
BPF case (selected via __bpf__), stdint.h + sys/types.h otherwise.
Chris Hofer [Mon, 20 Apr 2026 14:55:38 +0000 (16:55 +0200)]
build: Compile fuzz-journald-util.c only if want_fuzz_tests
fuzz-journald-util.c is compiled unconditionally even though fuzzing
tests aren't enabled. Only build it if fuzzing tests are configured.
This also ensures that the functions it uses from src/shared/tests.c are
available.
Both devices have panels mounted at -90 degrees, but neither has the
corresponding quirk in the kernel.
The Pocket 4 has been researched: it has an ACPI accel matrix that
works when the panel orientation is set via boot parameter.
The Pocket 3 hasn't been tested, but given that it also lacks the panel
orientation quirk, the matrix is surely wrong for it.
The kernel quirks for both devices are still pending, but they should
eventually get merged. Until that happens, owners of these devices are
encouraged to set the panel orientation boot parameter to right-up.
* 94af257c72 d/t/control: pull libmicrohttpd-dev for unit-tests suite
* 08263f18a4 d/t/control: pull libfdisk-dev for test suites
* e54175a0a4 Install new files for upstream build
sd-stub: make initrd passing incremental + other EFI prep work for #41543 (#41748)
This is split out of #41543 but I think it makes sense on its own.
It primarily does one thing: ensure that initrds installed via the Linux EFI protocol are incremental in behaviour (i.e. we read the previously set initrd and combine it with ours). So far we'd simply not install any initrds at all in this case, which would break stuff.
This is preparatory for #41543, but is generally the better, safer behaviour.
This also contains three minor changes which are purely prep work for #41543 but shouldn't hurt in the big picture.
test-journal-append: convert to the new option parser
The help string is adjusted/reworded. In particular, [a, b) is used
as notation to show a closed-open range, instead of the unusual <a; b).
--help is shown in --help.
Co-developed-by: Claude Opus 4.6 <noreply@anthropic.com>
Translations update from [Fedora
Weblate](https://translate.fedoraproject.org) for
[systemd/main](https://translate.fedoraproject.org/projects/systemd/main/).
vmspawn: catch unsupported growing of qcow2 images (#41654)
For qcow2 images it's not enough to grow the file. Since we probably
don't want to shell out to qemu-img either let's just error out to make
the user aware that growing needs to be done manually.
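One way to catch this case is to look at the image header before resizing. The sketch below assumes detection via the qcow2 magic bytes (`QFI\xfb` at offset 0) and an EOPNOTSUPP error code; whether vmspawn detects the format this way or from a user-supplied format option is not stated here, and `check_growable` is a hypothetical name.

```python
import errno

QCOW2_MAGIC = b"QFI\xfb"  # first four bytes of every qcow2 image


def check_growable(header: bytes) -> int:
    """Return 0 if the image may be grown in place, or a negative errno."""
    if header[:4] == QCOW2_MAGIC:
        # Extending the file alone does not resize a qcow2 image,
        # so refuse and leave resizing to the user.
        return -errno.EOPNOTSUPP
    return 0
```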
Turns out that the real reason behind this fail is that the machine was
under heavy load due to a busy-loop from the stub init. The cause of
this is a bug in bash, where running commands that fork (i.e. not
built-ins) can cause a permanent busy-loop due to a desync in trap
handling if you send the signals to the bash process _just right_:
ci: Restore severity prefix on claude-review inline comments
Commit a65ebc3ff9 ("claude-review: improve review quality for large
PRs") dropped the `Claude: **<severity>**: ` prefix from posted inline
comments on the theory that Claude was also adding the severity into
`body`, producing duplicates. But nothing in the prompt or schema
actually asks the subagent to include severity in `body` — severity
is a separate structured field. The result is that inline comments
no longer show must-fix/suggestion/nit classification.
Restore the prefix in the posting step, and add an explicit instruction
to the subagent prompt telling it not to repeat severity inside `body`
so the two don't collide.
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
boot: introduce a common structure for cpio target dirs
There are only a few target dirs we place resources in when generating
on-the-fly initrd cpios. These dirs have very specific attributes.
Instead of repeating this everywhere, let's encapsulate them in a new
explicit structure, that we can reuse at various places.
This is preparation for placing extra resources of Type #1 entry also in
them without having to encode access modes at multiple places
redundantly.
stub: load previous initrd that is already configured, too
This changes the initrd combination logic to also include any initrd
already configured via the "LINUX_INITRD_MEDIA_GUID" device in the
initrd we pass to the linux kernel.
Or in other words: with this systemd-stub starts operating purely
incrementally: it will extend any previously installed initrd with its
own stuff, so that both the previous initrd(s) and systemd-stub's are
in effect.
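Conceptually, the incremental behaviour amounts to concatenating the previously configured initrd with the newly generated ones. The sketch below assumes each part is zero-padded to a 4-byte boundary before the next one is appended; the helper names are hypothetical and the real stub code works on EFI memory buffers rather than Python bytes.

```python
def pad4(blob: bytes) -> bytes:
    """Zero-pad a blob to the next multiple of 4 bytes."""
    return blob + b"\x00" * (-len(blob) % 4)


def combine_initrds(previous, ours):
    """Put the previously registered initrd (if any) in front of ours."""
    parts = ([previous] if previous else []) + list(ours)
    return b"".join(pad4(p) for p in parts)
```

If no initrd was registered before, `previous` is `None` and the result is just the stub's own initrds, as before.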
boot: change initrd_register() so that it replaces any previously registered LINUX_INITRD_MEDIA device
So far, if an initrd is already registered we'd silently not register
one again. Let's make this more reliable and systematic, and register
ours, overriding what is previously set.
(Note, in a later commit we'll incorporate any previously set initrd,
which hence makes this all incremental instead of destructive as it
might appear now)
Convert fdisk-util to the dlopen pattern used by other optional shared
libraries in libshared. Declare the libfdisk API entry points with
DLSYM_PROTOTYPE, resolve them in a dlopen_fdisk() helper, and call the
sym_* wrappers from the homework, sysupdate and repart binaries that
use them.
With this in place fdisk-util can live in libshared itself, linked only
against libfdisk's headers (via libfdisk_cflags). The libshared_fdisk
convenience library and the libfdisk link dependency on systemd-homework,
systemd-sysupdate, systemd-repart and systemd-repart.standalone go away.
Also add a dlopen_fdisk() check to test-dlopen-so.
Philip Withnall [Mon, 20 Apr 2026 17:02:42 +0000 (18:02 +0100)]
sysupdate: Emit READY=1 status when installing
`READY=1` is already correctly emitted when acquiring an update, but its
emission was forgotten when subsequently installing that update.
That meant that the state tracking code in `sysupdated` and hence
`updatectl` could not properly report the progress of the install
operation, and hence printed “Already up-to-date” after a successful
update installation, rather than “Done”.
Add a test to catch this in future.
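For reference, the notification itself is a single datagram sent to the socket named in `$NOTIFY_SOCKET`. The following is a minimal sketch of that protocol, not sysupdate's code; the `notify` helper and its `environ` parameter are hypothetical.

```python
import os
import socket


def notify(state: str, environ=os.environ) -> bool:
    """Send a state string such as 'READY=1' to $NOTIFY_SOCKET, if set."""
    path = environ.get("NOTIFY_SOCKET")
    if not path:
        return False  # nobody is listening for notifications
    if path.startswith("@"):
        # A leading '@' denotes a Linux abstract namespace socket.
        path = "\0" + path[1:]
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(state.encode(), path)
    return True
```

A receiver that tracks job state only learns about completion if this message is sent on every code path, which is what the install path was missing.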
Signed-off-by: Philip Withnall <pwithnall@gnome.org>
Fixes: https://github.com/systemd/systemd/issues/41502
core: implement Kill/Automount/Mount Context/Runtime for io.systemd.Unit.List (#39391)
The PR implements the following objects + tests for
`io.systemd.Unit.List`:
- `KillContext`
- `AutomountContext`
- `AutomountRuntime`
- `MountContext`
- `MountRuntime`
It's a continuation of the following PRs:
* https://github.com/systemd/systemd/pull/37432
* https://github.com/systemd/systemd/pull/37646
* https://github.com/systemd/systemd/pull/38032
* https://github.com/systemd/systemd/pull/38212
Convert curl-util to the dlopen pattern used by other optional shared
libraries in libshared (libarchive, pcre2, idn, ...). Declare the curl
API entry points with DLSYM_PROTOTYPE, resolve them in a dlopen_curl()
helper, and call the sym_* wrappers from callers. curl_glue_new() now
loads the library on first use, so consumers going through CurlGlue
pick this up automatically; journal-upload and report-upload call
dlopen_curl() directly since they use curl without the glue layer.
With this in place curl-util can live in libshared itself, linked only
against libcurl's headers (via libcurl_cflags). The libcurlutil_static
convenience library and the libcurl link dependency on systemd-imdsd,
systemd-pull, systemd-journal-upload and systemd-report go away.
Also move the easy_setopt() helper macro next to the DLSYM declarations
so all consumers use a single sym-prefixed definition, and add a
dlopen_curl() check to test-dlopen-so.
json-stream: hide JsonStreamQueueItem as an implementation detail
The json-stream API previously exposed JsonStreamQueueItem and several
functions operating on it (json_stream_make_queue_item(),
json_stream_enqueue_item(), json_stream_queue_item_free(),
json_stream_queue_item_get_data()). These existed solely to support
sd-varlink's "defer-and-modify" pattern for streaming replies, where a
reply is held back so its "continues" field can be set before
transmission. This is a varlink protocol concern that should not leak
into the generic transport layer.
Similarly, the fd pushing API (json_stream_push_fd(),
json_stream_reset_pushed_fds()) and the pushed_fds state lived inside
JsonStream, even though fd-to-message association is a protocol-level
concern managed entirely by sd-varlink.
Rework the API so that:
- JsonStreamQueueItem and all its functions become static to
json-stream.c. The only output API is now json_stream_enqueue_full()
(accepting explicit fds) and the inline json_stream_enqueue() wrapper
for the common no-fds case.
- The pushed_fds state moves from JsonStream into sd_varlink, where
sd_varlink_push_fd() and sd_varlink_reset_fds() manage it directly.
- The deferred reply in sd-varlink changes from a JsonStreamQueueItem*
to a plain sd_json_variant* plus a separate previous_fds/n_previous_fds
pair, keeping the protocol-specific bookkeeping in sd-varlink where it
belongs.
- A new varlink_enqueue() helper wraps json_stream_enqueue_full() with
the varlink connection's pushed fds, transferring fd ownership to the
queue item on success.
This reworks the ESP/XBOOTLDR logic to pin the ESP/XBOOTLDR via an fd,
and return that as an optional return parameter.
So far we only pinned the parent dir of the ESP/XBOOTLDR, which was
useful when verifying that ESP/XBOOTLDR is actually a mount point by
comparing mount ids. This however became obsolete with
a98a6eb95cc980edab4b0f9c59e6573edc7ffe0c. Hence, let's clean this up,
and pin the inode we really care about and return it.
chase: tighten flags checks in chase_and_unlinkat()
Some flags don't reasonably apply to chase_and_unlinkat() (because we
open the parent inode of an inode to delete, which is always a dir),
hence let's catch these flags when misused.
(I ran into this, and it was very confusing to debug, hence let's make
it easier)
Newer tar started using openat2() via open_subdir() to address
CVE-2025-45582 [0]. Now, gnulib, that tar uses, provides the openat2()
syscall in two ways [1]:
1) If glibc doesn't provide openat2(), it provides its own version in
openat2.c, that tries to call openat2() syscall first, and if it
returns ENOSYS, it emulates the function in userspace.
2) If glibc provides openat2(), it uses that directly, without providing
any fallback on ENOSYS.
Quite recently our test suite started calling nspawn with
--suppress-sync=yes. This means that we call seccomp_suppress_sync(),
which eventually calls block_open_flag(), that blocks the openat2()
syscall completely and refuses it with ENOSYS as this syscall can't be
sensibly filtered (see the openat2()-relevant comments in
block_open_flag() and seccomp_restrict_sxid()). And when glibc provides
openat2(), there's no fallback, so the ENOSYS bubbles up to the user as:
TEST-25-IMPORT.sh[163]: + tar xzf /var/tmp/scratch.tar.gz
TEST-25-IMPORT.sh[163]: tar: ./adirectory/athirdfile: Cannot open: Function not implemented
TEST-25-IMPORT.sh[163]: tar: Exiting with failure status due to previous errors
Let's mitigate this by re-enabling sync for TEST-25-IMPORT, at least for
now.
In nspawn.c's run_container() the child_netns_fd = receive_one_fd(...)
failure path logged 'r' instead of the negative errno returned in
child_netns_fd, so the actual error from receive_one_fd was being
overwritten by whatever 'r' happened to hold. The other receive_one_fd
call sites in the same function use the returned fd variable directly
(mntns_fd, etc.), so align this one.
In shared/nsresource.c's nsresource_add_cgroup() the cgroup_fd_idx =
sd_varlink_push_dup_fd(...) failure path logged userns_fd_idx, which
is the previous successful push's index, not the negative errno we
just got from pushing cgroup_fd. Log cgroup_fd_idx instead.
Both were flagged by static analysis (#41709) and match the immediately
preceding userns_fd-path pattern that was presumably copy-pasted.
compress: gracefully handle a truncated ZSTD frame
If a journal file contains a truncated ZSTD frame (i.e. a frame with
Frame_Content_Size > 0, but with not enough data in Data_Block),
ZSTD_decompressStream() would return a non-zero, non-error value. This
would then skip the error path in the ZSTD_isError() branch and we'd hit
the following assert:
Let's handle this situation gracefully and return EBADMSG instead.
Also, add another journalctl invocation to the corrupted-journals test
that goes through the sd_journal_get_data() -> decompress_startswith_zstd()
code path which, among other things, covers the issue when run on the
provided journal file.
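The same class of problem can be shown with a streaming decompressor from Python's standard library. Note that this sketch swaps in zlib for ZSTD, since it only illustrates the pattern: a truncated stream produces no decoder error, so the end-of-stream state must be checked explicitly. `decompress_checked` is a hypothetical name.

```python
import errno
import zlib


def decompress_checked(data: bytes):
    """Decompress a complete zlib stream, rejecting truncated input.

    Returns (0, payload) on success or (-EBADMSG, b"") on bad input.
    """
    d = zlib.decompressobj()
    try:
        out = d.decompress(data)
    except zlib.error:
        return -errno.EBADMSG, b""
    if not d.eof:
        # All input was consumed without reaching the end of the stream:
        # the decoder raised no error, but the data is truncated.
        return -errno.EBADMSG, b""
    return 0, out
```

Treating "no error, but not finished" as a distinct failure is the analogue of handling the non-zero, non-error return value described above.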
strxcpyx: add a paranoia check for vsnprintf()'s return value
vsnprintf() can, under some circumstances, return a negative value, namely
during encoding errors when converting wchars to multi-byte characters.
This would then wreak havoc in the arithmetic we do following the
vsnprintf() call. However, since we never do any wchar shenanigans in
our code it should never happen.
Let's encode this assumption into the code as an assert(), similarly to how
we already do this in other places (like strextendf_with_separator()).
iovec-wrapper: rename iovw_append() to iovw_extend()
The naming is consistent with strv_extend().
This also
- introduces a tiny iovw_extend_iov() wrapper,
- refuses when the source and target point to the same object,
- checks the final count before extending in iovw_extend_iovw().
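The three points above can be sketched with a small list-backed model. This is not the C implementation: the `IovecWrapper` class, the `IOV_MAX` value of 1024 and the EINVAL/E2BIG return codes are assumptions chosen for the example.

```python
import errno

IOV_MAX = 1024  # assumed limit; the real value comes from the platform


class IovecWrapper:
    def __init__(self):
        self.iovec = []

    def extend_iov(self, buffers):
        """Tiny wrapper: append a plain sequence of buffers."""
        self.iovec.extend(bytes(b) for b in buffers)
        return 0

    def extend_iovw(self, source):
        """Append all entries of another wrapper to this one."""
        if source is self:
            # Extending an object with itself is refused.
            return -errno.EINVAL
        if len(self.iovec) + len(source.iovec) > IOV_MAX:
            # Check the final count before touching the target.
            return -errno.E2BIG
        return self.extend_iov(source.iovec)
```

Because the count is checked up front, a failed call leaves the target unchanged rather than partially extended.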