git.ipfire.org Git - thirdparty/systemd.git/log

loop-util: don't reuse partition fd when partscan needed

Some devices (e.g. Android phones running pmOS) cannot have their OEM
partition table altered without breaking the firmware, so the distro's
partitions live inside a nested GPT carved into one of the OEM
partitions. Exposing these subpartitions requires wrapping the outer
partition in a loop device with partscan enabled, since the kernel does
not scan nested partition tables.

systemd already detects this case in udev-builtin-blkid
(ID_PART_GPT_AUTO_ROOT_DISK_NEEDS_LOOP) and acts on it with
systemd-loop@.service, but this fails towards the end.
loop_device_make_internal has an optimization where if the input is
already a block device with a matching sector size, it skips creating
a loop and just hands back the original fd. That's fine for whole disks
but wrong for partitions, which don't support partscan, so this causes
dissect_image to fail with EPROTONOSUPPORT.

This patch changes the behavior to only take the shortcut when the input
is a whole disk, or when partscan was not requested.
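
The condition described above can be sketched as a small predicate. This
is only an illustration: struct loop_request and can_reuse_block_fd()
are hypothetical names, not the ones used in loop_device_make_internal().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical summary of the inputs to the decision; not systemd's types. */
struct loop_request {
        bool is_block_device;       /* input fd refers to a block device */
        bool sector_size_matches;   /* requested sector size equals the device's */
        bool is_whole_disk;         /* false for a partition */
        bool partscan_requested;    /* caller wants partitions to be scanned */
};

/* Returns true if the original fd may be handed back without creating a loop device. */
static bool can_reuse_block_fd(const struct loop_request *req) {
        assert(req);

        if (!req->is_block_device || !req->sector_size_matches)
                return false;

        /* Partitions do not support partscan, so they need a loop device in that case. */
        return req->is_whole_disk || !req->partscan_requested;
}
```

With this rule, a partition with partscan requested always gets a loop
device, while whole disks keep the shortcut.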

Co-Authored-By: Clayton Craft <clayton@craftyguy.net>
(cherry picked from commit 47d408163b0b71e5f8fed6b2e520c053cefc5780)

udev: don't assert on worker cap after killing a broken idle worker

manager_can_process_event() considers an event processable if either
there is room below children_max to spawn, or an idle worker exists.
When only the latter holds, event_run() picks the idle worker and
tries device_monitor_send(). If that send fails, event_run() SIGKILLs
the worker, marks it WORKER_KILLED and continues the loop. With no
other idle worker available, it falls through to worker_spawn(),
guarded by:

    assert(hashmap_size(manager->workers) < manager->config.children_max);

The just-killed worker is still in manager->workers until its SIGCHLD
is reaped by on_worker_exit(), so at the cap this assertion trips and
udevd aborts:

    Assertion 'hashmap_size(manager->workers) < manager->config.children_max'
    failed at src/udev/udev-manager.c:635, function event_run(). Aborting.

Instead of asserting, bail out when we are already at the worker
limit. The event remains in EVENT_QUEUED; once the killed worker's
SIGCHLD arrives and frees it from the hashmap, on_post() re-runs
event_queue_start() and the event is retried.
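
A minimal sketch of the "bail out instead of asserting" idea, assuming
the worker count still includes killed workers that have not been reaped
yet. decide_spawn() and its enum are hypothetical and only model the
check, not event_run() itself.

```c
#include <assert.h>
#include <stddef.h>

enum spawn_decision {
        SPAWN_WORKER,   /* there is room below the limit */
        RETRY_LATER,    /* at the limit: leave the event queued */
};

/* n_workers counts every entry in the worker table, including killed
 * workers whose exit has not been reaped yet. */
static enum spawn_decision decide_spawn(size_t n_workers, size_t children_max) {
        assert(children_max > 0);

        if (n_workers >= children_max)
                return RETRY_LATER;

        return SPAWN_WORKER;
}
```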

(cherry picked from commit 181a9f65a7cf9059da0f2a44e2152d7636446b33)

nsresource: fix buffer overrun reported by ASAN

This came up when running systemd-vmspawn with ASAN to fix another bug,
so I had to fix this overrun first: the dispatch tables were missing
the terminator. Add it.
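
To illustrate why the terminator matters, here is a generic sketch of a
sentinel-terminated table and the loop that walks it. The struct and
function names are made up for the example and are not the sd-json
dispatch API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct dispatch_entry {
        const char *name;   /* NULL marks the end of the table */
        int value;
};

static const struct dispatch_entry example_table[] = {
        { "alpha", 1 },
        { "beta",  2 },
        { NULL,    0 },     /* sentinel: without it the loop below reads past the array */
};

/* Returns the value for name, or -1 if the table has no such entry. */
static int dispatch_lookup(const struct dispatch_entry *table, const char *name) {
        assert(table);
        assert(name);

        for (const struct dispatch_entry *e = table; e->name; e++)
                if (strcmp(e->name, name) == 0)
                        return e->value;

        return -1;
}
```

The loop has no length argument, so the sentinel is the only thing that
stops it. A lookup for a missing name is exactly the case that runs off
the end of an unterminated table.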

(cherry picked from commit 41cf6891213dd78c5af3c2b8ff05194c52efa62f)

format-table: fix potential segfault

In format-table.h, TABLE_IN_ADDR is commented as "Takes a union in_addr_union
(or a struct in_addr)". However, if we pass struct in_addr to table_add_many(),
the function reads more than the size of the struct.
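
The size mismatch can be shown with a stand-in union. union addr_union
below is an assumption modelled on the description above, not systemd's
actual definition, and addr_union_from_in() is a hypothetical helper
that shows one safe way to pass an IPv4 address to code that reads the
whole union.

```c
#include <assert.h>
#include <netinet/in.h>
#include <string.h>

/* Stand-in for a union holding either address family. */
union addr_union {
        struct in_addr in;
        struct in6_addr in6;
};

/* Widen an IPv4 address into a zeroed union, so that a reader of
 * sizeof(union addr_union) bytes stays within valid memory. */
static union addr_union addr_union_from_in(const struct in_addr *a) {
        union addr_union u;

        assert(a);

        memset(&u, 0, sizeof(u));
        u.in = *a;
        return u;
}
```

On common platforms struct in_addr is 4 bytes and the union is 16, so
reading a full union from a bare struct in_addr overruns it by 12 bytes.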

(cherry picked from commit f0483f308a4daed188deee776aa7a7a733293642)

ssh-proxy: add a missing dispatch table sentinel

Which was accidentally dropped in e6be5fb7200fb02e78e4f27f49a4d734b7b850a0.

Follow-up for e6be5fb7200fb02e78e4f27f49a4d734b7b850a0.

(cherry picked from commit bad594f79a11b8896f1a830f86ae3beefac0d22f)

test: don't strip directives from test units

The original find was matching even our test units, which caused issues
when the check was extended with Memory*= directives, as we stripped
them off from test units for TEST-55-OOMD where we certainly need them.
Since the stripping was meant primarily for "production-grade" units,
let's limit it to units under /etc/systemd/system/ and
/usr/lib/systemd/system/.

(cherry picked from commit ef3c07352b5dfe04afefdf66f4693986562ddc2b)

test: slightly reduce the performance/memory overhead for wrapped binaries

Let's drop the quarantine that ASan uses for use-after-free detection,
as it's pointless in wrapped binaries and can consume up to 256 MiB of
memory (with the default configuration). Also, don't keep any stack
traces for allocations & deallocations, which should (slightly) help
with both memory & performance overhead.

(cherry picked from commit 035ba3ea571bad6772cf3731f6b5379ccb08267f)

test: temporarily ignore sanitizer warning about blocked ptrace()

LLVM 22 introduced an additional check [0] for the ptrace() syscall when
invoking sanitizers, which currently produces a false-positive
warning when running some of our units under sanitizers:

[ 47.524680] systemd-timedated[740]: ==740==WARNING: ptrace appears to be blocked (is seccomp enabled?). LeakSanitizer may hang.
[ 47.524680] systemd-timedated[740]: ==740==Child exited with signal 15.
...
[ 1555.734223] systemd-oomd[93]: ==93==WARNING: ptrace appears to be blocked (is seccomp enabled?). LeakSanitizer may hang.
[ 1555.734223] systemd-oomd[93]: ==93==Child exited with signal 15.
...

It is a false positive because we disable the seccomp filters
system-wide for our units in the sanitizer jobs.

Now, from what I've seen so far this happens only in
Type=notify(-reload) units that also utilize bus_event_loop_with_idle().
This, combined with the fact that the ptrace()-check child process from
[0] checks only if the child process was killed by _any_ signal, means
that if the systemd unit exits on its own after becoming idle and then
something sends it SIGTERM (either via explicit `systemctl stop` or
during system shutdown), this SIGTERM might hit the ptrace()-check child
process from the sanitizer handler (as we also send the signal to all
processes in the target cgroup), which the parent process then
mistakenly evaluates as a blocked ptrace() syscall, even though the
check process wasn't killed by SIGSYS.

I filed this as [1] to the LLVM project, but let's also temporarily
ignore the warning in the sanitizer report processing, as it currently
causes annoying test failures.

[0] https://github.com/llvm/llvm-project/commit/a708b4bf21d7c2298224cdacf7d424abc3c8fed4
[1] https://github.com/llvm/llvm-project/issues/193714

(cherry picked from commit 445f9805489a575c9b1bc74daa173c4fdf9b1bf7)

test: drop any memory limits from units when running with sanitizers

As the memory usage under sanitizers is quite unpredictable.

This is currently relevant mainly for Polkit, as it introduced memory
limits for its polkitd.service unit in the latest version [0] which are
very easy to trigger when running under sanitizers (as polkitd depends
on libsystemd which brings ASan into polkitd's address space).

[0] https://github.com/polkit-org/polkit/commit/7d9c06c58a957ee3f2a4383ade6f207b05207e3e

(cherry picked from commit e3aaf3d76eb0990ca015961703e905977df8faf7)

test: wrap even more binaries when running with sanitizers

Turns out that the util-linux dep on libsystemd caused more fun than I
originally anticipated:

$ lddtree /usr/bin/dfuzzer
dfuzzer => /usr/bin/dfuzzer (interpreter => /lib64/ld-linux-x86-64.so.2)
    libgio-2.0.so.0 => /lib64/libgio-2.0.so.0
        libgmodule-2.0.so.0 => /lib64/libgmodule-2.0.so.0
        libz.so.1 => /lib64/libz.so.1
        libmount.so.1 => /lib64/libmount.so.1
            libblkid.so.1 => /lib64/libblkid.so.1
            libsystemd.so.0 => /lib64/libsystemd.so.0
                libm.so.6 => /lib64/libm.so.6
                    ld-linux-x86-64.so.2 => /lib64/ld-linux-x86-64.so.2
        libselinux.so.1 => /lib64/libselinux.so.1
            libpcre2-8.so.0 => /lib64/libpcre2-8.so.0
...

Also, the tpm2 utils now depend on libudev through libcurl -> libssh ->
libfido2 dep chain:

$ lddtree /usr/bin/tpm2_pcrread
tpm2_pcrread => /usr/bin/tpm2_pcrread (interpreter => /lib64/ld-linux-x86-64.so.2)
    ...
    libcurl.so.4 => /lib64/libcurl.so.4
    ...
        libssh.so.4 => /lib64/libssh.so.4
            libfido2.so.1 => /lib64/libfido2.so.1
                libcbor.so.0.13 => /lib64/libcbor.so.0.13
                libudev.so.1 => /lib64/libudev.so.1
                    libgcc_s.so.1 => /lib64/libgcc_s.so.1
...

Follow-up for 8030e0b19ef7c0e823d84dd08ad38a2d88e0a230.

(cherry picked from commit a8400c8f1a61223e4905e2939f8d71be82831c8c)

units: order networkd resolve hook After=network-pre.target

Without this, the socket is available well before systemd-networkd.service
is able to start, because of its own After=network-pre.target ordering.
Then, if resolved handles queries before network-pre.target, it will
hang waiting for networkd to reply to hook queries.

This is currently happening in the wild with cloud-init.

(cherry picked from commit 37adb410a2b62716b666dbf8359edf8a6546ff94)

nss-myhostname: fix maybe-uninitialized warning

In resolute with gcc 15.2.0:

472s ../src/nss-myhostname/nss-myhostname.c: In function ‘_nss_myhostname_gethostbyname4_r’:
472s ../src/nss-myhostname/nss-myhostname.c:132:44: error: ‘local_address_ipv4’ may be used uninitialized [-Werror=maybe-uninitialized]
472s   132 |                 *(uint32_t*) r_tuple->addr = local_address_ipv4;
472s       |                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~
472s ../src/nss-myhostname/nss-myhostname.c:42:18: note: ‘local_address_ipv4’ was declared here
472s    42 |         uint32_t local_address_ipv4;
472s       |                  ^~~~~~~~~~~~~~~~~~

(cherry picked from commit d41555dd2cb8b3bc3876edd4869b3142048393fe)

updatectl: Show a helpful error if an update is partially downloaded

If an update is partially downloaded and the user tries to update again,
`updatectl` can’t currently do anything (it doesn’t yet support resuming
downloads). At the moment, though, it’ll return success as if the system
was up to date, even though it isn’t up to date.

Instead, print a more helpful error message telling the user to try
vacuuming the partial version and trying again.

I decided not to make it automatically vacuum the partial version, as
that seems like a way to get into a nasty retry loop if, for example,
the checksum provided by the server doesn’t match that of the downloaded
file (which is one way to trigger this code path).

Add an integration test which simulates this failure by corrupting the
`SHA256SUMS` file, trying to download an update, and then working
through the recovery steps.

Signed-off-by: Philip Withnall <pwithnall@gnome.org>
Fixes: https://github.com/systemd/systemd/issues/41502
(cherry picked from commit 66b950cd3f47f49087ddd4a2f4812e82b209f2b7)

sysupdate: Allow a partial version to be vacuumed

Previously we prevented partial and pending versions from being
vacuumed. But until we support resuming downloads, there’s nothing else
which can be done with a partial version except to vacuum it and try
again.

Accordingly, allow partial versions (but not pending versions) to be
vacuumed.

This behaviour can be changed again once resuming downloads is supported
— at that point I expect we’ll want to try resuming the partial download
rather than throwing it all away and trying again.

Signed-off-by: Philip Withnall <pwithnall@gnome.org>
Helps: https://github.com/systemd/systemd/issues/41502
(cherry picked from commit e1146fe710d0d1bbe0fdca974f66edd7f6c573cc)

sysupdate: Allow a partial version to be the candidate

Previously we allowed a pending version to be the candidate — but if
there are no better choices, then we might as well allow a partial
version to be the candidate as well.

The alternative is having no update candidate when a new version is
partially installed (i.e. downloaded but not moved into place). This
would mean that an update which is interrupted then needs to be re-run
with an explicit version number to progress, rather than being able to
be re-run without a version number (as it was in the first place).

Signed-off-by: Philip Withnall <pwithnall@gnome.org>
Helps: https://github.com/systemd/systemd/issues/41502
(cherry picked from commit 2babac90137ea4b6a70958133c783131d2619051)

sysupdate: Allow partial+pending flags in a few more places for UpdateSets

While a resource Instance can be either partial or pending, but not
both, an UpdateSet (which potentially comprises several Instances) can
be both partial *and* pending if it contains Instances in both those
states.

Amend a few bits of internal code to allow that in situations which were
previously overlooked.

Signed-off-by: Philip Withnall <pwithnall@gnome.org>
(cherry picked from commit b2d19b4651fb87b8f6dc8a427a96d7a9d3a16961)

systemd-cat does not connect the standard *input* of a process to the journal

The first paragraph of the description of the systemd-cat utility incorrectly referred to stdin when it obviously meant stderr: the other fd that it connects to the journal via a unix(7) domain socket, as clarified in the following paragraphs.

I've also replaced "process" with "command" as in that mode, systemd-cat executes a file and does not spawn a process.

(cherry picked from commit 59e78701ecb4039f42f5e77692af97e498118479)

networkd: allow route table names for VRF.Table=

Allow `[VRF] Table=` to accept route table names in addition to
numeric table identifiers. These may be predefined route table names
or names configured with `networkd.conf` `RouteTable=`.

There was an earlier attempt to make `VRF.Table=` accept names in
f98dd1e707, but it wired the setting to
`config_parse_route_table()`. That parser was a `[Route]` section
parser, not a generic scalar parser for netdevs: it expected
network/route parser state and created a `Route` object. It was
therefore reverted by 40352cf0c1.

This commit replaces the uint32 parser with
`manager_get_route_table_from_string()`, the generic table parser
already used by route/rule, DHCP/RA `RouteTable=`, and WireGuard
`RouteTable=` in `.netdev` files. The VRF semantics stay
unchanged. The commit retains the existing behavior of the
deprecated `TableId=` field.

Co-developed-by: OpenAI Codex <codex@openai.com>
(cherry picked from commit bbadd35596949678009e3f5a7cf4689853998b79)

mkosi: user and group bin needed for a test

* Fix the test TEST-02-UNITTESTS for the openSUSE environment.

(cherry picked from commit 14c7014d7faa21ae8558982acfa2a45500ba3fb7)

man: clarify that /etc/verity.d only parses certificates with the .crt extension

Exposed in the dracut testsuite while adding tests for sysexts:

```
[    2.972948] localhost (sd-merge)[510]: Validation of dm-verity signature failed via the kernel, trying userspace validation instead: Required key not available
[    2.972993] localhost (sd-merge)[510]: Skipping file '/etc/verity.d/dracut.pem', suffix is not '.crt'.
[    2.973045] localhost (sd-merge)[510]: No userspace dm-verity certificates found.
```

(cherry picked from commit dfa5aa07b5637cb9a9f46d7908c964217940a073)

dissect-image: fix typo in log message

(cherry picked from commit 341251b6e3eb38a9908192f0a144ade57606dff9)

Revert "resolve: refuse traffic from the local host only for queries"

This reverts commit 526f1594daec073269c3e70ee7914f6dd8740d5c.

This revert is necessary because the change breaks mDNS hostname stability
whenever a DNS-SD service calls UnregisterService. When a service
unregisters (e.g. on process restart), manager_refresh_rrs() clears and
re-adds all RRs in PROBING state, which sends a multicast announcement
(QR=1). The kernel reflects this back to resolved's own socket. Because
the local-address check was moved inside the query-only branch by the
reverted commit, the reply path in on_mdns_packet() is now unguarded.
The looped-back announcement matches the pending probe transaction and
completes it with DNS_TRANSACTION_SUCCESS. Since the zone item is still
in PROBING state (not ESTABLISHED), dns_zone_item_notify() sets
we_lost=true and calls dns_zone_item_conflict(), which invokes
manager_next_hostname() and renames the hostname (e.g. foo.local →
foo4.local). This happens reliably on every restart of any service using
RegisterService/UnregisterService (homebridge, avahi-compat wrappers,
etc.).

The top-level local-address check in on_mdns_packet() suppresses all
looped-back multicast traffic before the reply/query split. Restoring it
there is consistent with the overall design: dns_scope_check_conflicts()
already has its own manager_packet_from_local_address() guard and is
unaffected.

A more targeted long-term fix (e.g. guarding dns_transaction_process_reply()
for mDNS, or avoiding unnecessary re-probing of already-established records
in manager_refresh_rrs()) can be pursued separately.

(cherry picked from commit 658e5ac06f80ee2078b034f7cc483204d7f91c5e)

repart: trim NUL bytes from verity sig split artifact

The verity signature partition content is a bare JSON object. Repart
pads it with zeros to fill the GPT partition. But when splitting out
the content as an individual file, the padding remains, so it's not
a valid text file.

jq started rejecting files with NUL bytes to fix a security issue:
https://github.com/jqlang/jq/commit/6374ae0bcdfe33a18eb0ae6db28493b1f34a0a5b

Trim the output when writing these files out.
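
A minimal sketch of the trimming step, assuming the padding consists
only of trailing NUL bytes. length_without_nul_padding() is a
hypothetical helper, not the function used in repart.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the length of buf once trailing NUL bytes are dropped. */
static size_t length_without_nul_padding(const char *buf, size_t size) {
        assert(buf || size == 0);

        while (size > 0 && buf[size - 1] == 0)
                size--;

        return size;
}
```

Writing only the first length_without_nul_padding(buf, size) bytes
leaves the JSON object intact and drops the zero padding.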

(cherry picked from commit b54ef83414234ac0a742895dd645632662b5aa77)

gpt-auto-generator: do not fail on missing libcryptsetup when verity is not used

add_veritysetup() is called unconditionally from add_root_mount() and
add_usr_mount() whenever in_initrd() is true, to generate units that
only activate if verity devices appear. However, when compiled without
libcryptsetup, this function returned a hard error, causing the entire
generator to fail even when no verity protection is in use.

Change the #else fallback to log a debug message and return 0, matching
the pattern already used by add_root_cryptsetup().

(cherry picked from commit 1d78c2d327cbd4e738d0f1281a976a771f643517)

userdbctl: drop unused variable

(cherry picked from commit f59b1f1a7dd2677b71a4ea4a47c951da85fff3a5)

measure: fix oom check

Pointed out in review.

(cherry picked from commit 9a3a861ff8c56f85d86b2276917d5bc7a1a331fb)

meson: move fuzz-journald-util.c to fuzz-journal-audit

The .c file is shared between various fuzz-journal-* binaries. It
was added in 32bd43d768a4bdd54481c5e37ce9ea3d1009a824, but that is
somewhat ugly.

Let's add it to the alphabetically first fuzzer and share from there.

Follow-up for 32bd43d768a4bdd54481c5e37ce9ea3d1009a824 and
85b5acde869baa51f5618fa503eafac3dccbf379.

(cherry picked from commit d9506e7df71aab1f0f0ba929db1707dbe3b5f92c)

meson: concatenate donors specified in 'objects'

Previously, we'd only honour the last donor.

(cherry picked from commit 079361e8f0ad1deb711e377472ce2fcaa210bb56)

repart: Fix xopenat_full() error handling

(cherry picked from commit 4114bf7e700fa2c6877230ca1199056cfbafc4e7)

ukify: fix default path for hwids

The documentation and the commit that added this both suggest this
should be under /usr/lib/systemd.

fixes 117ec9db7e71357837190833d7731bc61ae54ecc

(cherry picked from commit 9149c7595305a7c4d105d5d33ba25733af4302eb)

test: wrap mount/umount when running with sanitizers

On Fedora Rawhide mount/umount is linked against libsystemd, which then
breaks the binaries in sanitizer runs, as we try to run instrumented
code from an uninstrumented binary:

bash-5.3# ldd /usr/bin/mount
        linux-vdso.so.1 (0x00007fa757ef9000)
        libmount.so.1 => /lib64/libmount.so.1 (0x00007fa757e84000)
        libselinux.so.1 => /lib64/libselinux.so.1 (0x00007fa757e51000)
        libc.so.6 => /lib64/libc.so.6 (0x00007fa757c56000)
        libblkid.so.1 => /lib64/libblkid.so.1 (0x00007fa757c16000)
        libsystemd.so.0 => /lib64/libsystemd.so.0 (0x00007fa757400000)
        libpcre2-8.so.0 => /lib64/libpcre2-8.so.0 (0x00007fa75734f000)
        /lib64/ld-linux-x86-64.so.2 (0x00007fa757efb000)
        libclang_rt.asan.so => /usr/lib/clang/22/lib/x86_64-redhat-linux-gnu/libclang_rt.asan.so (0x00007fa756800000)
        libm.so.6 => /lib64/libm.so.6 (0x00007fa7566e4000)
        libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007fa7566b7000)
        libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00007fa756400000)
bash-5.3# mount
==458==ASan runtime does not come first in initial library list; you should either link runtime to your application or manually preload it with LD_PRELOAD.

This then breaks the whole machine, as mount is quite essential during
boot.

Let's just add mount/umount to the list of wrapped binaries to fix this.

(cherry picked from commit 8030e0b19ef7c0e823d84dd08ad38a2d88e0a230)

sysupdate: Prevent a possible invalid partial+pending state on an instance

If the resource code is recursing, it’s possible for one iteration to
set a partial flag, and then a recursive iteration to set a pending flag
(or vice-versa). It doesn’t make sense to have both set at the same time
for a specific instance, so make sure to clear the other flag when
setting one of them.

Add some assertions to make this invariant clearer and easier to debug
if it fails.
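
The invariant can be sketched with two setters that each clear the
other flag. struct instance_state and the setter names are hypothetical
and only model the rule described above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-instance state: at most one of the flags may be set. */
struct instance_state {
        bool is_partial;
        bool is_pending;
};

static void instance_set_partial(struct instance_state *i) {
        assert(i);

        i->is_partial = true;
        i->is_pending = false;   /* clear the other flag */

        assert(!(i->is_partial && i->is_pending));
}

static void instance_set_pending(struct instance_state *i) {
        assert(i);

        i->is_pending = true;
        i->is_partial = false;   /* clear the other flag */

        assert(!(i->is_partial && i->is_pending));
}
```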

Signed-off-by: Philip Withnall <pwithnall@gnome.org>
(cherry picked from commit 0d02d0bc51f2fd431f81daa39a970d5dea279f29)

build: Compile fuzz-journald-util.c only if want_fuzz_tests

fuzz-journald-util.c is compiled unconditionally, even when fuzzing
tests aren't enabled. Only build it if fuzzing tests are configured.
This also ensures that the functions it uses from src/shared/tests.c are
available.

Fixes 32bd43d768a4bdd54481c5e37ce9ea3d1009a824
Closes #39984

Signed-off-by: Chris Hofer <christian.hofer@codasip.com>
(cherry picked from commit 85b5acde869baa51f5618fa503eafac3dccbf379)

localed-util: respect env var when writing vconsole.conf

(cherry picked from commit 72e894c5909ab6da226182596c80db8183abadbd)

mkosi: trim verity.sig json files to remove NUL padding before passing to jq

jq started rejecting input that has NUL bytes to fix some security issues,
so we need to trim the verity.sig json files, which are spat out with
the NUL bytes padding from the GPT partition content.

‣ Running postinstall script /home/runner/work/systemd/systemd/mkosi/mkosi.postinst.chroot…
jq: parse error: Invalid numeric literal at EOF at line 1, column 16384
‣ "/work/postinst final" returned non-zero exit code 5.

https://github.com/jqlang/jq/commit/6374ae0bcdfe33a18eb0ae6db28493b1f34a0a5b
(cherry picked from commit 6dccf54cd646fe0621b4f256e7d61ad2fec2cbe6)

test: avoid using external commands in trap handlers

In #39675 the reported failure was as follows:

5580s [  247.559994] TEST-13-NSPAWN.sh[1858]: Exported 93%.
5580s [  247.659002] TEST-13-NSPAWN.sh[1858]: Exported 95%.
5580s [  247.785893] TEST-13-NSPAWN.sh[1858]: Operation completed successfully.
5580s [  247.923727] TEST-13-NSPAWN.sh[1858]: Exiting.
5580s [  258.300406] TEST-13-NSPAWN.sh[1074]: + machinectl import-raw /var/tmp/container-export.raw container-raw-reimport
5580s [  258.323328] TEST-13-NSPAWN.sh[1884]: The 'machinectl import-raw' command has been replaced by 'importctl -m import-raw'. Redirecting invocation.
5580s [  258.659982] TEST-13-NSPAWN.sh[1884]: Failed to transfer image: Remote peer disconnected
5580s [  258.734218] TEST-13-NSPAWN.sh[1074]: + at_exit

Turns out that the real reason behind this failure is that the machine
was under heavy load due to a busy-loop from the stub init. The cause of
this is a bug in bash, where running commands that fork (i.e. not
built-ins) can cause a permanent busy-loop due to a desync in trap
handling if you send the signals to the bash process _just right_:

[   90.855318] TEST-13-NSPAWN.sh[1074]: + machinectl poweroff long-running long-running long-running
[   90.855318] TEST-13-NSPAWN.sh[1074]: + machinectl reboot long-running long-running long-running
[   90.928980] systemd-nspawn[1679]: ++ touch /poweroff
[   90.928980] systemd-nspawn[1679]: +++ touch /reboot
[   90.928980] systemd-nspawn[1679]: + :
[   90.928980] systemd-nspawn[1679]: + :
[   90.928980] systemd-nspawn[1679]: + wait
[   90.928980] systemd-nspawn[1679]: + :
[   90.928980] systemd-nspawn[1679]: + :
[   90.928980] systemd-nspawn[1679]: + wait
[   90.928980] systemd-nspawn[1679]: + :
[   90.928980] systemd-nspawn[1679]: + :
[   90.928980] systemd-nspawn[1679]: + wait
...

$ journalctl --file TEST-13-NSPAWN-1.journal -o short-monotonic --no-hostname --grep "^\+ wait$" | wc -l
349734

So the stub-init was hammering the machine in a tight endless loop,
which then caused systemd-importd to time out when talking to D-Bus:

[  258.300096] TEST-13-NSPAWN.sh[1074]: + machinectl import-raw /var/tmp/container-export.raw container-raw-reimport
...
[  258.415319] systemd-importd[1859]: Unable to request name, failing connection: Method call timed out
[  258.483662] systemd-importd[1859]: Bus n/a: changing state RUNNING → CLOSING
[  258.605442] systemd-importd[1859]: Bus n/a: changing state CLOSING → CLOSED
[  258.659958] TEST-13-NSPAWN.sh[1884]: Failed to transfer image: Remote peer disconnected

Given this is not our issue, let's work around it by using just
built-ins from the trap handlers, which are not susceptible to this bug.

Resolves: #39675
(cherry picked from commit 4d92c72b819bd6d54dfbbb12e4cd25de5053714c)

test: convert sd-journal tests to the new test macros

So we can, hopefully, debug issues like #40551 more easily.

(cherry picked from commit 1b787f20cfb307d1848dc6a479643f6caadde24d)

btrfs-util: make sure btrfs_get_block_device_at() works when called without path

(cherry picked from commit 54ed5c5806fecc5acedb3c7a2d02289501cea0af)

nspawn,shared/nsresource: fix copy-paste errno logging args

In nspawn.c's run_container() the child_netns_fd = receive_one_fd(...)
failure path logged 'r' instead of the negative errno returned in
child_netns_fd, so the actual error from receive_one_fd was being
overwritten by whatever 'r' happened to hold. The other receive_one_fd
call sites in the same function use the returned fd variable directly
(mntns_fd, etc.), so align this one.

In shared/nsresource.c's nsresource_add_cgroup() the cgroup_fd_idx =
sd_varlink_push_dup_fd(...) failure path logged userns_fd_idx, which
is the previous successful push's index, not the negative errno we
just got from pushing cgroup_fd. Log cgroup_fd_idx instead.

Both were flagged by static analysis (#41709) and match the immediately
preceding userns_fd-path pattern that was presumably copy-pasted.

Refs #41709.

(cherry picked from commit 5159388230574da1b9b91137a4b1f2fba9e6a729)

compress: gracefully handle a truncated ZSTD frame

If a journal file contains a truncated ZSTD frame (i.e. a frame with
Frame_Content_Size > 0, but with not enough data in Data_Block),
ZSTD_decompressStream() would return a non-zero, non-error value. This
would then skip the error path in the ZSTD_isError() branch and we'd hit
the following assert:

$ build-local/journalctl -o cat --file zstd-truncated.journal
Assertion 'output.pos >= prefix_len + 1' failed at src/basic/compress.c:1236, function decompress_startswith_zstd(). Aborting.
Aborted (core dumped) build-local/journalctl -o cat --file zstd-truncated.journal

Let's handle this situation gracefully and return EBADMSG instead.

Also, add another journalctl invocation to the corrupted-journals test
that goes through the sd_journal_get_data() -> decompress_startswith_zstd()
code path which, among other things, covers the issue when run on the
provided journal file.

(cherry picked from commit 35eb598af26f66c94d7403f6170d5cae438871fb)

test: append .journal to unpacked corrupted journals

Otherwise `journalctl --directory=` skips over them in the second part of
the test.

(cherry picked from commit e869c83367388a3dc8d5ec8ca6820edc422b58c2)

sd-json: make sure SD_JSON_BUILD_STRING_UNDERSCORIFY() can deal with NULL strings

SD_JSON_BUILD_STRING() and everything else can deal with it, make sure
SD_JSON_BUILD_STRING_UNDERSCORIFY() can too.

(cherry picked from commit a77a95665fba06861125a4a62ffed8ccd75f37f2)

import: fix an always-true assert()

(cherry picked from commit 6182d0c66654594c65c680bc0e486d8bbcb359f5)

strxcpyx: add a paranoia check for vsnprintf()'s return value

vsnprintf() can, under some circumstances, return a negative value,
namely on encoding errors when converting wchars to multi-byte
characters. This would then wreak havoc in the arithmetic we do
following the vsnprintf() call. However, since we never do any wchar
shenanigans in our code it should never happen.

Let's encode this assumption into the code as an assert(), similarly to
how we already do this in other places (like strextendf_with_separator()).
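
A sketch of what such a check can look like in a bounded append helper.
append_formatted() is a hypothetical function written for this example.
It is not the strxcpyx implementation, and its truncation handling is an
assumption.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Appends formatted text at *dest, advances *dest, and returns the space left. */
static size_t append_formatted(char **dest, size_t size, const char *fmt, ...) {
        va_list ap;
        int n;

        assert(dest);
        assert(*dest);
        assert(fmt);

        if (size == 0)
                return 0;

        va_start(ap, fmt);
        n = vsnprintf(*dest, size, fmt, ap);
        va_end(ap);

        /* A negative value would corrupt the pointer arithmetic below. */
        assert(n >= 0);

        if ((size_t) n >= size) {
                /* Truncated: point at the terminating NUL, nothing is left. */
                *dest += size - 1;
                return 0;
        }

        *dest += n;
        return size - (size_t) n;
}
```

Without the assert(), a return value of -1 would move *dest backwards
and make the "space left" result larger than the buffer.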

(cherry picked from commit 774a9f440bebeea960b69bb46109d72b3d7b8667)

dhcp-protocol: Option Overload (52) DHCP option value takes flags

(cherry picked from commit c651876cc9cd660f5c4a27ee0b0197d3db1398f7)

resolve: add missing OOM check

(cherry picked from commit 16f5a2992d3babdcbba319abaf9617db5ce1c924)

sd-dhcp-client: fix memleak of sd_dhcp_client.timeout_ipv6_only_mode

This also drops unnecessary zero assignments.

(cherry picked from commit 366e1d264a6d1c2aa96d85bf6dd80be2bbd65f72)

mailmap: name change

(cherry picked from commit c8f5a564089a497fb5b1232696c188e1e78b7359)

man: Fix NOTES formatting

The NOTES section in os-release(5) contains unusual formatting.
Switch function and ulink tags and remove a newline within ulink text to
keep the entry formatting in sync with others. Also, this preserves the
formatting within the text itself.

Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
(cherry picked from commit 4d2847dcc2083afe8460bec839b0e0940818d8dd)

mountpoint-util: initialize mnt_id for name_to_handle_at(AT_HANDLE_MNT_ID_UNIQUE)

Suppress the following message:
```
$ sudo valgrind --leak-check=full build/networkctl dhcp-lease wlp59s0
==175708== Memcheck, a memory error detector
==175708== Copyright (C) 2002-2024, and GNU GPL'd, by Julian Seward et al.
==175708== Using Valgrind-3.26.0 and LibVEX; rerun with -h for copyright info
==175708== Command: build/networkctl status wlp59s0
==175708==
==175708== Conditional jump or move depends on uninitialised value(s)
==175708==    at 0x4BC33D1: inode_same_at (stat-util.c:610)
==175708==    by 0x4BF1972: inode_same (stat-util.h:86)
==175708==    by 0x4BF48FE: running_in_chroot (virt.c:817)
==175708==    by 0x4B16643: running_in_chroot_or_offline (verbs.c:37)
==175708==    by 0x4B175CE: _dispatch_verb_with_args (verbs.c:136)
==175708==    by 0x4B17868: dispatch_verb (verbs.c:160)
==175708==    by 0x407CBB: networkctl_main (networkctl.c:249)
==175708==    by 0x407D06: run (networkctl.c:263)
==175708==    by 0x407D39: main (networkctl.c:266)
==175708==
```
Not sure if it is an issue in valgrind or glibc, but at least there is
nothing we can do except for working around it.

(cherry picked from commit f46ad8b4ba8a5aef0bb628bd5dcd7ec43283e9a1)

bootctl: drop redundant log message

If unprivileged_mode is false then verify_esp() will treat access errors
like any other and log about them. Here we set it to false, hence
there's no point in logging a second time.

(cherry picked from commit f85415498cf3800bef26b1092096e658a8211e97)

boot: gracefully handle LoadFile() implementations that return EFI_SUCCESS with a NULL buffer

LoadFile() with a NULL buffer is supposed to return the file size
without acquiring the data and return EFI_BUFFER_TOO_SMALL.

However it appears some firmware returns EFI_SUCCESS in case the file is
empty, i.e. the file size returned is zero. And I guess that's even
fine.

Let's handle this gracefully hence.

(cherry picked from commit c40f254cca5e96b876b90e20ced69c33115940c3)

boot: never auto-boot a menu entry with the non-default profile

When figuring out which menu entry to pick by default, let's not
consider any with a profile number > 0. This reflects the fact that
additional profiles are generally used for
debug/recovery/factory-reset/storage target mode boots, and those should
never be auto-selected. Hence do a simple check: if profile != 0, simply
do not consider the entry as a default.

We might eventually want to beef this up, and add a property one can set
in the profile metadata that controls this behaviour, but for now let's
just do this simple fix.

(cherry picked from commit a93fa2c6b45c6b18f0cb3a16a793edb71ae6b444)

man: drop redundant word from varlinkctl man page

(cherry picked from commit c83c21a05450cb17f7a7682e687ba4d3fc794fc3)

namespace: don't log misleading error in the r > 0 path

fd_is_fs_type() returns < 0 for errors, 0 for false, and > 0 for true, so
in the r > 0 branch we'd most likely report EPERM together with the error
message which is misleading.
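
To illustrate the convention, here is a small sketch with a
hypothetical predicate (fd_looks_special() is made up, it is not
fd_is_fs_type()). On Linux EPERM is 1, so feeding a "true" result of 1
into an errno-style log call prints "Operation not permitted":

```c
#include <errno.h>

/* Hypothetical predicate following the convention: < 0 error, 0 false, > 0 true. */
int fd_looks_special(int fd) {
    if (fd < 0)
        return -EBADF;

    return fd % 2 == 1; /* arbitrary stand-in condition */
}

/* Returns the errno worth reporting for r, or 0 if r is not an error at all. */
int errno_to_report(int r) {
    if (r < 0)
        return -r; /* genuine error, r carries a negative errno */

    return 0; /* r == 0 and r > 0 are answers, not errors */
}
```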

(cherry picked from commit c859d41232b3bcff9f9b01f1281c1b202f15d0b2)

nspawn: avoid passing NULL to log_syntax()

If range is NULL (i.e. when PrivateUsers= doesn't contain ':'),
both later error paths will then pass NULL to log_syntax():

~# cat foo.nspawn
[Exec]
PrivateUsers=9999999999999999999

~# SYSTEMD_LOG_LEVEL=debug systemd-nspawn -D foo |& grep foo.nspawn
Found settings file: /root/foo.nspawn
/root/foo.nspawn:2: UID/GID shift invalid, ignoring: (null)

or

~# cat foo.nspawn
[Exec]
PrivateUsers=4294967294

~# SYSTEMD_LOG_LEVEL=debug systemd-nspawn -D foo |& grep foo.nspawn
Found settings file: /root/foo.nspawn
/root/foo.nspawn:2: UID/GID shift and range combination invalid, ignoring: (null)

Let's just use rvalue in both of these cases instead.

(cherry picked from commit d651df8283ce62d2f03f8abe5cd4798bd1b8bf58)

udev-builtin: add a couple of asserts

(cherry picked from commit 4977c00e2a7efda3a7be2136fe2fed4de6777565)

scsi_id: use safe_atoi() instead of plain atoi()

(cherry picked from commit 5853d0a53378ed973d8c006531846717ae55090a)

test-string-util: test empty_to_null on a char array

Unfortunately empty_to_null(t) where t is char[] fails. But it
works with &t[0].

(cherry picked from commit 067aa9b767954d134b6f69a5b97ebbd19bbb9697)

timesync: verify the actual size of the received data

iov.iov_len doesn't change after calling recvmsg() so it remains set to
sizeof(ntpmsg), which makes the check for a short packet always false.
Let's fix that by checking the actual size of the received data instead.
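
A self-contained sketch of the pattern, assuming a hypothetical
fixed-size struct msg in place of the real NTP message type. The
point is that only the return value of recvmsg() says how much data
arrived:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Hypothetical fixed-size message, standing in for the NTP packet struct. */
struct msg {
    unsigned char data[48];
};

/* Receives one datagram; returns true only if a full struct msg arrived. */
bool receive_full_message(int fd, struct msg *ret) {
    assert(ret);

    struct iovec iov = { .iov_base = ret, .iov_len = sizeof(*ret) };
    struct msghdr mh = { .msg_iov = &iov, .msg_iovlen = 1 };

    ssize_t len = recvmsg(fd, &mh, 0);
    if (len < 0)
        return false;

    /* iov.iov_len is still sizeof(*ret) here, so comparing against it
     * would never detect a short packet. Only len reflects what arrived. */
    return (size_t) len == sizeof(*ret);
}
```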

(cherry picked from commit 335cc8f39c75c2b7cdab8c52fe4c378929556f7e)

string-util: check for overflow in strrep()

This simply mirrors the same overflow check we already have in
strrepa(), in case someone passed us a sufficiently long string.

strrep() is currently used only in tests, so this is just hardening.
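
For illustration, a standalone sketch of such a check. repeat_string()
is a hypothetical helper written for this example, it is not the
systemd implementation of strrep():

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated string with s repeated n times,
 * or NULL if the size computation would overflow or on OOM. */
char *repeat_string(const char *s, size_t n) {
    assert(s);

    size_t l = strlen(s);
    if (l == 0)
        n = 0; /* repeating an empty string yields an empty string */

    /* l * n + 1 must fit into size_t */
    if (l != 0 && n > (SIZE_MAX - 1) / l)
        return NULL;

    char *r = malloc(l * n + 1);
    if (!r)
        return NULL;

    char *p = r;
    for (size_t i = 0; i < n; i++) {
        memcpy(p, s, l);
        p += l;
    }
    *p = '\0';

    return r;
}
```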

(cherry picked from commit b22daa97e1608d865ce76ed72fd6b7bd59ccbf70)

machined: gate metadata querying behind inspect-machines/images action

Ensure only privileged users can call the system scope machined's
APIs that get data out of a machine.

Follow-up for 1bd979dddbb6ed3ffe410d78a7ff80cbb1c42a64
Follow-up for 9153b02bb5030e29d6008992fb74b9028d7c392c

(cherry picked from commit 3e716178cc3bcf972d878d3899ad1c977a7d707b)

sysupdated: don't crash when an mstack machine image is found

As soon as machinectl list-images reports an mstack entry, updatectl
fails: systemd-sysupdated hits a failed assertion because the mstack
case was not handled.
For now mstack is not supported as an image for sysupdate to operate
on, so we can skip it.

Fixes https://github.com/systemd/systemd/issues/41649

(cherry picked from commit 4efb5d389c9653e3a61e583c64dc3d094eb8911e)

sd-varlink: Don't log successful sentinel error dispatch as a failure

sd_varlink_error() deliberately returns a negative errno mapped from
the error id on success so callbacks can `return sd_varlink_error(...);`
to enqueue the reply and propagate a matching errno at once. When
varlink_dispatch_method() dispatches a configured error sentinel itself,
it doesn't need that mapping — but it was treating any negative return
as a dispatch failure and logging "Failed to process sentinel" even
though the error reply had been successfully enqueued.

Detect success via the state transition to VARLINK_PROCESSED_METHOD
instead, so only genuine enqueue failures are logged.

(cherry picked from commit 48326af23a1c9d95f9aa2fd66fcecbc7f90ccff5)

various: fix compilation with openssl-4.0.0-beta1

Various types have been made opaque, so we need to use some accessor
functions.

(cherry picked from commit 693ecaac7e12c120d1323478b2433d77367aa0c9)

mkosi: update fedora commit reference to 207e2d004468bf79a8bd78182d9b10956edf45c7

* 207e2d0044 Stop building support for openssl engines
* 36a234147f Upload sources
* 3681163f81 Version 260.1
* 8f4f0f58e3 Version 260
* e3fab23aa0 Version 260~rc4
* e4c1c2100b Version 260~rc3
* 453696813e Fix typo in unit name in %post scriptlet
* 154edb7cdb Silence false positive "HWID match failed, no DT blob" error (rhbz#2444759)
* 03b6637c35 riscv64 port has LTO disabled
* ce1dec6a40 Version 260~rc2
* 809049777c Add patch for symlink creation error
* 6ff27708f7 Enable getty@.service through presets
* ba7807fbce Drop scriptlet for upgrades from versions <253
* 455f277188 Move support for tpm2 to systemd-udev subpackage
* 0183bc784e Version 260~rc1

(cherry picked from commit 5cee6c6a92292acbede4c183c29d3e1aafd7c210)

resolved: check for reset-statistics polkit action via D-Bus too

The varlink method checks for polkit authorization, so also
update the D-Bus method to match it.

Follow-up for cf01bbb7a45fb1eec28cd0a813bd68fde413410f

(cherry picked from commit b1d9127c39f918b2f6eb61ed8e8c97ae07ac11c2)

TEST-75-RESOLVED: Make sure --suppress-sync is not used

(cherry picked from commit 0d57260976b87ff213edfe3cb29e352c67e54d48)

test-seccomp: Handle environment where sync() is already suppressed

We might be running in an nspawn container booted with --suppress-sync,
so make sure we handle that scenario gracefully.

(cherry picked from commit 31441cb782139c19e69ea0037871eff5299bf228)

TEST-64-UDEV-STORAGE: Add missing scsi controllers

(cherry picked from commit 014a4d93e00e32da14bc7f21102bbd628af695d4)

TEST-24-CRYPTSETUP: Use virtio-blk-pci

Doesn't require a controller.

(cherry picked from commit a8416614b015a629e5554a88129264732370edfb)

TEST-13-NSPAWN: Use timeout --foreground in two more places

(cherry picked from commit 518dcfadab8c540cff056a3bd94d5c817e7a17b2)

TEST-07-PID1: Use --foreground with timeout

Otherwise the test fails if a TTY is attached to stdio.

(cherry picked from commit 9c0abfaf15c1494b8ed3c874342979c56f14e282)

TEST-07-PID1: Don't fail in vm without ESP or XBOOTLDR mount

(cherry picked from commit e7d1030d771d46d8004d44f33585261b0e48fc43)

docs: update footer to 2026

(cherry picked from commit db1ca20591610bfaec80ccddddd42ce74ec185d0)

hwdb: update to main@{2026-04-14}

The addition of hwdb.d/40-imds.hwdb is dropped. (Corresponding changes
in hwdb_parse.py are kept, they are not a problem.)

importd: harden curl file protocol handling

With old libcurl versions file:// can get redirects which can be messy, while
the new version rejects them. Set an option to explicitly block them.

(cherry picked from commit 275ec1160b62c90dad3264e484e976213b0ada30)

journal-upload: also disable VERIFYHOST when --trust=all is used

When --trust=all disables CURLOPT_SSL_VERIFYPEER, the residual
CURLOPT_SSL_VERIFYHOST check is ineffective since an attacker can
present a self-signed certificate with the expected hostname. Disable
both for consistency and log that server certificate verification is
disabled.

Follow-up for 8847551bcbfa8265bae04f567bb1aadc7b480325

(cherry picked from commit f125fc6a22167f3d52c97763e555b2d7d654788e)

machined: pass user as positional argument in machine_default_shell_args()

Instead of interpolating the user name directly into the sh -c script
body via asprintf %s, pass it as a positional parameter ($1) in a
separate argv entry. This avoids the user string being parsed as part
of the shell script syntax.

Also validate the user name in bus_machine_method_open_shell() with
valid_user_group_name(), matching the validation already done on the
Varlink path via json_dispatch_const_user_group_name().
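
The general technique, sketched with a hypothetical helper and an
illustrative script body (neither is taken from machined): the script
text is constant and refers to "$1", while the user name travels in its
own argv slot.

```c
#include <assert.h>
#include <stddef.h>

/* Fills argv (6 slots, NULL-terminated) so that user is passed as data
 * and never becomes part of the script text. */
void build_shell_argv(const char *user, const char *argv[6]) {
    assert(user);
    assert(argv);

    argv[0] = "sh";
    argv[1] = "-c";
    argv[2] = "printf '%s\\n' \"$1\""; /* illustrative script, constant text */
    argv[3] = "sh";                     /* becomes $0 inside the script */
    argv[4] = user;                     /* becomes $1, never parsed as shell syntax */
    argv[5] = NULL;
}
```

With this layout a name such as "x; reboot" reaches the script as a
single word in $1 rather than as a second command.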

Follow-up for 49af9e1368571f4e423cde0fd45ee284451434d1

(cherry picked from commit a9e9288288567beae57337ae903dd3b6c774001c)

logind: reject wall messages containing control characters

method_set_wall_message() and the property setter only checked the
message length but not its content. Since wall messages are broadcast
to all TTYs, control characters in the message could interfere with
terminal state. Reject messages containing control characters other
than newline and tab.
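
A sketch of such a content check. It assumes "control characters" means
the ASCII range below 0x20 plus DEL, which may differ from the exact
set the patch rejects, and wall_message_is_safe() is a hypothetical
name:

```c
#include <assert.h>
#include <stdbool.h>

/* Returns false if s contains an ASCII control character other than
 * newline or tab. Assumption: controls are bytes < 0x20 and 0x7f. */
bool wall_message_is_safe(const char *s) {
    assert(s);

    for (const unsigned char *p = (const unsigned char *) s; *p; p++) {
        if (*p == '\n' || *p == '\t')
            continue;
        if (*p < 0x20 || *p == 0x7f)
            return false;
    }

    return true;
}
```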

Follow-up for 9ef15026c0e7e6600372056c43442c99ec53746e
Follow-up for e2fa5721c3ee5ea400b99a6463e8c1c257e20415

(cherry picked from commit 4bf9db731445ba72a9e5097561e1883dfe1183d8)

core: check selinux access on each unit when listing

Units might have different access rules, so check the access on each
unit when querying the full list.

(cherry picked from commit 04f32dddd7221de01c4da70128bd5fb21bc53427)

core: add missing SELinux access checks when listing units

Add mac_selinux_unit_access_check_varlink() to the unit enumeration
loop in vl_method_list_units(), silently skipping units the caller
is not permitted to see, matching the D-Bus ListUnits behavior.
Add mac_selinux_access_check_varlink() to vl_method_describe_manager().

Follow-up for 472abf7bec89caeb1cc413c1de17984ab8ccb5d6
Follow-up for 736349958efe34089131ca88950e2e5bb391d36a

(cherry picked from commit 26fd286210964a76c5e1a52a416626f7dde53936)

docs: fix capability name, it's CAP_MKNOD not CAP_SYS_MKNOD (#41621)

(cherry picked from commit b40ed2067fb669540b1a640e293334fd31403676)

dhcp: fix user class and vendor specific option assignment

The commit 6d7cb9a6b8361d2b327222bc12872a3676358bc3 fixes the assignment
of these options when specified through SendOption=. However, it
breaks when specified through UserClass= or SendVendorOption=.

When UserClass= or SendVendorOption= is specified, the option length is
calculated from the sd_dhcp_client.user_class or .vendor_options. Hence,
we can use 0 for the length in that case.

Follow-up for 6d7cb9a6b8361d2b327222bc12872a3676358bc3.

(cherry picked from commit 55f2fdd508dfe430cc36b9961b09d9eb649c6a83)

boot-entry: add 'auto' keyword to parse_boot_entry_token_type

Add the auto keyword as documented in the help message and man pages of
`kernel-install`, `bootctl` and `systemd-pcrlock`.

(cherry picked from commit 7208a74a2b7eee1b2465793fb3b2642c888fe0ce)

boot: fix loop bound and OOB in devicetree_get_compatible()

The loop used the byte offset end (struct_off + struct_size) as the
iteration limit, but cursor[i] indexes uint32_t words. This reads
past the struct block when end > size_words.

Use size_words (struct_size / sizeof(uint32_t)) which is the correct
number of words to iterate over.

Also fix a pre-existing OOB in the FDT_BEGIN_NODE handler: the guard
i >= size_words is always false inside the loop (since the loop
condition already ensures i < size_words), so cursor[++i] at the
boundary reads one word past the struct block. Use i + 1 >= size_words
to check before incrementing.
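
Both points can be shown on a much simplified word walker. This is not
the devicetree parser: TOKEN_BEGIN and the "one payload word per token"
layout are assumptions made to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOKEN_BEGIN 1u /* hypothetical token, followed by one payload word */

/* Counts TOKEN_BEGIN tokens whose payload word lies inside the buffer. */
size_t count_begin_tokens(const uint32_t *cursor, size_t size_bytes) {
    assert(cursor);

    /* cursor[i] indexes words, so the bound must be in words, not bytes */
    size_t size_words = size_bytes / sizeof(uint32_t);
    size_t n = 0;

    for (size_t i = 0; i < size_words; i++) {
        if (cursor[i] != TOKEN_BEGIN)
            continue;

        /* "i >= size_words" can never be true here; check the next index
         * before stepping onto it */
        if (i + 1 >= size_words)
            break;

        i++; /* skip the payload word */
        n++;
    }

    return n;
}
```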

Fixes: https://github.com/systemd/systemd/issues/41590
(cherry picked from commit 2c664b953163be5e8e18df3fd73ed7bfae229a37)

boot: fix integer overflow and division by zero in BMP splash parser

Bound image dimensions before computing row_size to prevent overflow
in the depth * x multiplication on 32-bit. Without this, crafted
dimensions like depth=32 x=0x10000001 wrap to a small row_size that
passes all subsequent checks.

Reject channel masks where all bits are set (popcount == 32), since
1U << 32 is undefined behavior and causes division by zero on
architectures where it evaluates to zero. Move the validation before
computing derived values for clarity. Use unsigned 1U in shifts to
avoid signed integer overflow UB for popcount == 31.

Also reject zero-width and zero-height images.
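
A rough sketch of these checks in isolation. DIM_MAX is a made-up
bound and the function names are hypothetical, so the limits in the
actual patch may well differ:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DIM_MAX 16384u /* hypothetical upper bound on width and height */

/* Computes the 4-byte-aligned row size after bounding the dimensions. */
bool bmp_row_size(uint32_t depth, uint32_t x, uint32_t y, uint32_t *ret) {
    assert(ret);

    if (x == 0 || y == 0 || x > DIM_MAX || y > DIM_MAX)
        return false;
    if (depth == 0 || depth > 32)
        return false;

    /* depth * x is at most 32 * DIM_MAX here, so it cannot wrap */
    *ret = ((depth * x + 31) / 32) * 4;
    return true;
}

/* Computes the maximum value of a channel, rejecting all-ones masks. */
bool channel_max(uint32_t mask, uint32_t *ret) {
    assert(ret);

    unsigned bits = 0;
    for (uint32_t m = mask; m != 0; m >>= 1)
        bits += m & 1u;

    if (bits >= 32)
        return false; /* 1U << 32 would be undefined behavior */

    *ret = (1U << bits) - 1; /* unsigned shift, well-defined for bits == 31 */
    return true;
}
```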

Fixes: https://github.com/systemd/systemd/issues/41589
(cherry picked from commit c3c9cc7adb7d6eeb68086c8244086e115a788542)

core: use JSON_BUILD_CONST_STRING() where appropriate

(cherry picked from commit 087733e348f060b1c79cf72c9615c706c2c9d851)

udev/scsi-id: check for invalid header from kernel buffer

(cherry picked from commit 06d3f37336ab8dea545521d95ebc6246b29241f0)

udev/scsi-id: check for invalid chars in various fields received from the kernel

Follow-up for 16325b35fa6ecb25f66534a562583ce3b96d52f3

(cherry picked from commit 5f700d148c44063c0f0dbb9fc136866339cd3fa7)

nss-systemd: fix off-by-one in nss_pack_group_record_shadow()

nss_count_strv() counts trailing NULL pointers in n. The pointer area
then used (n + 1), reserving one slot more than the size check
accounted for.

Drop the + 1 since n already includes the trailing NULLs, unlike the
non-shadow nss_pack_group_record() where n does not.

Fixes: https://github.com/systemd/systemd/issues/41591
(cherry picked from commit aa85a742fe5e0816312566a700599496e720246d)

journal: limit decompress_blob() output to DATA_SIZE_MAX

We already have checks in place during compression that limit the data
we compress, so they shouldn't decompress to anything larger than
DATA_SIZE_MAX unless they've been tampered with. Let's make this
explicit and limit all our decompress_blob() calls in journal-handling
code to that limit.

One possible scenario this should prevent is when one tries to open and
verify a journal file that contains a compression bomb in its payload:

$ ls -lh test.journal
-rw-rw-r--+ 1 fsumsal fsumsal 1.2M Apr 12 15:07 test.journal

$ systemd-run --user --wait --pipe -- build-local/journalctl --verify --file=$PWD/test.journal
Running as unit: run-p682422-i4875779.service
000110: Invalid hash (00000000 vs. 11e4948d73bdafdd)
000110: Invalid object contents: Bad message
File corruption detected at /home/fsumsal/repos/@systemd/systemd/test.journal:272 (of 1249896 bytes, 0%).
FAIL: /home/fsumsal/repos/@systemd/systemd/test.journal (Bad message)
          Finished with result: exit-code
Main processes terminated with: code=exited, status=1/FAILURE
               Service runtime: 48.051s
             CPU time consumed: 47.941s
                   Memory peak: 8G (swap: 0B)

The same could, in theory, be possible with just `journalctl --file=`, but
the reproducer would be a bit more complicated (haven't tried it, yet).

Lastly, the change in journal-remote is mostly hardening, as the maximum
input size to decompress_blob() there is mandated by MHD's connection
memory limit (set to JOURNAL_SERVER_MEMORY_MAX which is 128 KiB at the
time of writing), so the possible output size there is already quite
limited (e.g. ~800 - 900 MiB for xz-compressed data).

(cherry picked from commit 31d360fb0b28859aba891aaefb1452f820a5861a)

compress: limit the output to dst_max bytes with LZ4 if set

We already do that with other algorithms, so let's make
decompress_blob_lz4() consistent with the rest.

(cherry picked from commit 2cda5f6169e4a03e9860d315e7b4a7b0d61ca11f)

journal: move the {DATA,ENTRY}_SIZE constants to sd-journal

So we can access them from the code there as well.

(cherry picked from commit d830bb1fc132d31b5e82ba3c676051a4000d1538)

test-json: add iszero_safe guards for float division at index 0 and 1

The existing iszero_safe guards at index 9 and 10 were added to
silence Coverity, but the same division-by-float-zero warning also
applies to the divisions at index 0 (DBL_MIN) and 1 (DBL_MAX).

CID#1587762

Follow-up for 7f133c996c8b1ea9219540ec8f966b64b58d30a6

(cherry picked from commit 44296e41db20b40d0b9a4cbe320d262ffdd8905d)

nss-myhostname: add more INC_SAFE for buffer index accumulation

Use overflow-safe INC_SAFE() instead of raw addition for idx
accumulation, so that Coverity can see the addition is checked.
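
The idea behind a checked increment, sketched as a plain function.
inc_safe() below is a hypothetical stand-in and does not reproduce the
INC_SAFE() macro itself:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Adds v to *a unless that would overflow. On overflow returns false
 * and leaves *a untouched. */
bool inc_safe(size_t *a, size_t v) {
    assert(a);

    if (v > SIZE_MAX - *a)
        return false;

    *a += v;
    return true;
}
```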

CID#1548028

Follow-up for a05483a921a518fd283e7cb32dc8c8e816b2ab2c

(cherry picked from commit 1afc0c6c608e75e6fccba13cc4f36039b0a7ae6e)

sd-varlink: scale down the limit of connections per UID to 128

1024 connections per UID is unnecessarily generous, so let's scale this
down a bit. D-Bus defaults to 256 connections per UID, but let's be even
more conservative and go with 128.

(cherry picked from commit d9da339bf12f6433eaeb624589956f2f8737a6a0)

importctl: fix -N to actually clear keep-download flag

-N was clearing and re-setting the same bit in arg_import_flags_mask,
which is a no-op. It should clear the bit in arg_import_flags instead,
matching what --keep-download=no does via SET_FLAG().
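
The difference between the two variables can be sketched as follows.
The struct, the flag bit and set_flag() are hypothetical stand-ins for
arg_import_flags, arg_import_flags_mask and SET_FLAG():

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FLAG_KEEP_DOWNLOAD (UINT64_C(1) << 0) /* hypothetical bit position */

struct import_opts {
    uint64_t flags; /* the values of the options */
    uint64_t mask;  /* which options were set explicitly */
};

/* Sets or clears flag in v, in the spirit of SET_FLAG(). */
uint64_t set_flag(uint64_t v, uint64_t flag, bool on) {
    return on ? (v | flag) : (v & ~flag);
}

/* Before the fix: clears and re-sets the same mask bit, flags untouched. */
void apply_no_keep_buggy(struct import_opts *o) {
    assert(o);
    o->mask = set_flag(o->mask, FLAG_KEEP_DOWNLOAD, false);
    o->mask = set_flag(o->mask, FLAG_KEEP_DOWNLOAD, true);
}

/* After the fix: clears the value bit and marks it as explicitly set. */
void apply_no_keep_fixed(struct import_opts *o) {
    assert(o);
    o->flags = set_flag(o->flags, FLAG_KEEP_DOWNLOAD, false);
    o->mask = set_flag(o->mask, FLAG_KEEP_DOWNLOAD, true);
}
```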

(cherry picked from commit ee96f934c6efccd4a2a3fe1073f4da961fe4eb25)

core: fix EBUSY on restart and clean of delegated services

When a service is configured with Delegate=yes and DelegateSubgroup=sub,
the delegated container may write domain controllers (e.g. "pids") into the
service cgroup's cgroup.subtree_control via its cgroupns root. On container
exit the stale controllers remain, and on service restart clone3() with
CLONE_INTO_CGROUP fails with EBUSY because placing a process into a cgroup
that has domain controllers in subtree_control violates the no-internal-
processes rule. The same issue affects systemctl clean, where cg_attach()
fails with EBUSY for the same reason.

Add unit_cgroup_disable_all_controllers() helper in cgroup.c that clears
stale controllers via cg_enable(mask=0) and updates cgroup_enabled_mask to
keep internal tracking in sync. Call it from service_start() and
service_clean() right before spawning, so that resource control is preserved
for any lingering processes from the previous invocation as long as possible.

(cherry picked from commit 056bc106e1e344f98cdfa86fdf62e6fed72958c9)

sd-json: limit the stack depth during parsing as well

(cherry picked from commit 1016dd315f94917cd0818a90bc09c99ef76ab556)