loop-util: don't reuse partition fd when partscan needed
Some devices (e.g. Android phones running postmarketOS) cannot have their
OEM partition table altered without breaking the firmware, so the distro's
partitions live inside a nested GPT carved into one of the OEM
partitions. Exposing these subpartitions requires wrapping the outer
partition in a loop device with partscan enabled, since the kernel does
not scan nested partition tables.
systemd already detects this case in udev-builtin-blkid
(ID_PART_GPT_AUTO_ROOT_DISK_NEEDS_LOOP) and acts on it with
systemd-loop@.service, but this fails towards the end.
loop_device_make_internal has an optimization where if the input is
already a block device with a matching sector size, it skips creating
a loop and just hands back the original fd. That's fine for whole disks
but wrong for partitions, which don't support partscan, so this causes
dissect_image to fail with EPROTONOSUPPORT.
This patch changes the behavior to only take the shortcut when the input
is a whole disk, or when partscan was not requested.
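The resulting condition can be sketched roughly as below. This is only an illustrative model: the function and parameter names are hypothetical and do not mirror the actual code in loop_device_make_internal().

```c
#include <stdbool.h>

/* Hypothetical model of the shortcut decision; all names are illustrative. */
bool can_reuse_block_fd(bool is_block_device, bool sector_size_matches,
                        bool is_whole_disk, bool partscan_requested) {
        if (!is_block_device || !sector_size_matches)
                return false;

        /* Partitions do not support partscan, so a real loop device is
         * needed whenever partscan is requested on a non-whole-disk input. */
        return is_whole_disk || !partscan_requested;
}
```

With this model, a partition with partscan requested always gets a real loop device, while whole disks and non-partscan requests keep the shortcut.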
udev: don't assert on worker cap after killing a broken idle worker
manager_can_process_event() considers an event processable if either
there is room below children_max to spawn, or an idle worker exists.
When only the latter holds, event_run() picks the idle worker and
tries device_monitor_send(). If that send fails, event_run() SIGKILLs
the worker, marks it WORKER_KILLED and continues the loop. With no
other idle worker available, it falls through to worker_spawn(),
guarded by:
The just-killed worker is still in manager->workers until its SIGCHLD
is reaped by on_worker_exit(), so at the cap this assertion trips and
udevd aborts:
Assertion 'hashmap_size(manager->workers) < manager->config.children_max'
failed at src/udev/udev-manager.c:635, function event_run(). Aborting.
Instead of asserting, bail out when we are already at the worker
limit. The event remains in EVENT_QUEUED; once the killed worker's
SIGCHLD arrives and frees it from the hashmap, on_post() re-runs
event_queue_start() and the event is retried.
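A minimal sketch of the new behavior, assuming a simplified model in which only the worker count and the cap matter. The enum and function names are hypothetical and are not the real udev-manager types.

```c
#include <stddef.h>

/* Hypothetical outcome of one spawn attempt; not the real udev types. */
typedef enum {
        ATTEMPT_SPAWNED,  /* a new worker may be started for the event */
        ATTEMPT_DEFERRED, /* at the cap: leave the event queued, retry later */
} AttemptResult;

/* n_workers counts every tracked worker, including killed ones that have
 * not been reaped yet, like the size of the workers hashmap. */
AttemptResult try_spawn_worker(size_t n_workers, size_t children_max) {
        if (n_workers >= children_max)
                return ATTEMPT_DEFERRED; /* previously an assertion failure */

        return ATTEMPT_SPAWNED;
}
```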
Kai Lüke [Fri, 24 Apr 2026 16:42:36 +0000 (01:42 +0900)]
nsresource: fix buffer overrun reported by ASAN
This came up when running systemd-vmspawn with ASAN to fix another bug
and thus I had to fix this overrun here first: The dispatch tables were
missing the terminator, add it.
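For illustration, a sentinel-terminated table and its lookup loop might look like the sketch below. The FieldEntry type and the table contents are hypothetical; the real dispatch tables use a different type.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry; a NULL name marks the end of the table. */
typedef struct {
        const char *name;
        int id;
} FieldEntry;

/* Walks the table until the NULL sentinel. Without the terminating entry
 * the loop would keep reading past the end of the array. */
int lookup_field(const FieldEntry *table, const char *name) {
        for (const FieldEntry *e = table; e->name; e++)
                if (strcmp(e->name, name) == 0)
                        return e->id;

        return -1;
}

const FieldEntry example_table[] = {
        { "name", 1 },
        { "size", 2 },
        { NULL,   0 }, /* terminator: the kind of entry that was missing */
};
```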
In format-table.h, TABLE_IN_ADDR is commented as "Takes a union in_addr_union
(or a struct in_addr)". However, if we pass struct in_addr to table_add_many(),
the function reads more than the size of the struct.
The original find was matching even our test units, which caused issues
when the check was extended with Memory*= directives, as we stripped
them off from test units for TEST-55-OOMD where we certainly need them.
Since the stripping was meant primarily for "production-grade" units,
let's limit it to units under /etc/systemd/system/ and
/usr/lib/systemd/system/.
test: slightly reduce the performance/memory overhead for wrapped binaries
Let's drop the quarantine that ASan uses for use-after-free detection,
as it's pointless in wrapped binaries and can consume up to 256 MiB of
memory (with the default configuration). Also, don't keep any stack
traces for allocations & deallocations, which should (slightly) help
with both memory & performance overhead.
test: temporarily ignore sanitizer warning about blocked ptrace()
LLVM 22 introduced an additional check for the ptrace() syscall when
invoking sanitizers [0], which currently produces a false-positive
warning when running some of our units under sanitizers:
[ 47.524680] systemd-timedated[740]: ==740==WARNING: ptrace appears to be blocked (is seccomp enabled?). LeakSanitizer may hang.
[ 47.524680] systemd-timedated[740]: ==740==Child exited with signal 15.
...
[ 1555.734223] systemd-oomd[93]: ==93==WARNING: ptrace appears to be blocked (is seccomp enabled?). LeakSanitizer may hang.
[ 1555.734223] systemd-oomd[93]: ==93==Child exited with signal 15.
...
It is a false positive because we disable the seccomp filters
system-wide for our units in the sanitizer jobs.
Now, from what I've seen so far this happens only in
Type=notify(-reload) units that also utilize bus_event_loop_with_idle().
This, combined with the fact that the ptrace()-check child process from
[0] checks only if the child process was killed by _any_ signal, means
that if the systemd unit exits on its own after becoming idle and then
something sends it SIGTERM (either via explicit `systemctl stop` or
during system shutdown), this SIGTERM might hit the ptrace()-check child
process from the sanitizer handler (as we also send the signal to all
processes in the target cgroup), which the parent process then
mistakenly evaluates as a blocked ptrace() syscall, even though the
check process wasn't killed by SIGSYS.
I filed this as [1] with the LLVM project, but let's also temporarily
ignore the warning in the sanitizer report processing, as it currently
causes annoying test failures.
test: drop any memory limits from units when running with sanitizers
As the memory usage under sanitizers is quite unpredictable.
This is currently relevant mainly for Polkit, as it introduced memory
limits for its polkitd.service unit in the latest version [0] which are
very easy to trigger when running under sanitizers (as polkitd depends
on libsystemd which brings ASan into polkitd's address space).
Nick Rosbrook [Fri, 24 Apr 2026 13:38:42 +0000 (09:38 -0400)]
units: order networkd resolve hook After=network-pre.target
Without this, the socket is available well before systemd-networkd.service
is able to start, because of its own After=network-pre.target ordering.
Then, if resolved handles queries before network-pre.target, it will
hang waiting for networkd to reply to hook queries.
This is currently happening in the wild with cloud-init.
Philip Withnall [Wed, 22 Apr 2026 16:31:27 +0000 (17:31 +0100)]
updatectl: Show a helpful error if an update is partially downloaded
If an update is partially downloaded and the user tries to update again,
`updatectl` can’t currently do anything (it doesn’t yet support resuming
downloads). At the moment, though, it’ll return success as if the system
was up to date, even though it isn’t up to date.
Instead, print a more helpful error message telling the user to try
vacuuming the partial version and trying again.
I decided not to make it automatically vacuum the partial version, as
that seems like a way to get into a nasty retry loop if, for example,
the checksum provided by the server doesn’t match that of the downloaded
file (which is one way to trigger this code path).
Add an integration test which simulates this failure by corrupting the
`SHA256SUMS` file, trying to download an update, and then working
through the recovery steps.
Signed-off-by: Philip Withnall <pwithnall@gnome.org>
Fixes: https://github.com/systemd/systemd/issues/41502
(cherry picked from commit 66b950cd3f47f49087ddd4a2f4812e82b209f2b7)
Philip Withnall [Wed, 22 Apr 2026 16:28:31 +0000 (17:28 +0100)]
sysupdate: Allow a partial version to be vacuumed
Previously we prevented partial and pending versions from being
vacuumed. But until we support resuming downloads, there’s nothing else
which can be done with a partial version except to vacuum it and try
again.
Accordingly, allow partial versions (but not pending versions) to be
vacuumed.
This behaviour can be changed again once resuming downloads is supported
— at that point I expect we’ll want to try resuming the partial download
rather than throwing it all away and trying again.
Signed-off-by: Philip Withnall <pwithnall@gnome.org>
Helps: https://github.com/systemd/systemd/issues/41502
(cherry picked from commit e1146fe710d0d1bbe0fdca974f66edd7f6c573cc)
Philip Withnall [Wed, 22 Apr 2026 16:25:44 +0000 (17:25 +0100)]
sysupdate: Allow a partial version to be the candidate
Previously we allowed a pending version to be the candidate — but if
there are no better choices, then we might as well allow a partial
version to be candidate as well.
The alternative is having no update candidate when a new version is
partially installed (i.e. downloaded but not moved into place). This
would mean that an update which is interrupted then needs to be re-run
with an explicit version number to progress, rather than being able to
be re-run without a version number (as it was in the first place).
Signed-off-by: Philip Withnall <pwithnall@gnome.org>
Helps: https://github.com/systemd/systemd/issues/41502
(cherry picked from commit 2babac90137ea4b6a70958133c783131d2619051)
Philip Withnall [Wed, 22 Apr 2026 16:23:49 +0000 (17:23 +0100)]
sysupdate: Allow partial+pending flags in a few more places for UpdateSets
A resource Instance can be either partial or pending, but not both.
An UpdateSet (which potentially comprises several Instances), however,
can be both partial *and* pending if it contains Instances in both those
states.
Amend a few bits of internal code to allow that in situations which were
previously overlooked.
systemd-cat does not connect the standard *input* of a process to the journal
The first paragraph of the description of the systemd-cat utility
incorrectly referred to stdin where it meant stderr, the other fd that
it connects to the journal via a unix(7) domain socket, as clarified in
the following paragraphs.
I've also replaced "process" with "command", as in that mode systemd-cat
executes a file and does not spawn a process.
Allow `[VRF] Table=` to accept route table names in addition to
numeric table identifiers. These may be predefined route table names
or names configured with `networkd.conf` `RouteTable=`.
There was an earlier attempt to make `VRF.Table=` accept names in f98dd1e707, but it wired the setting to
`config_parse_route_table()`. That parser was a `[Route]` section
parser, not a generic scalar parser for netdevs: it expected
network/route parser state and created a `Route` object. It was
therefore reverted by 40352cf0c1.
This commit replaces the uint32 parser with
`manager_get_route_table_from_string()`, the generic table parser
already used by route/rule, DHCP/RA `RouteTable=`, and WireGuard
`RouteTable=` in `.netdev` files. The VRF semantics stay
unchanged. The commit retains the existing behavior of the
deprecated `TableId=` field.
This revert is necessary because the change breaks mDNS hostname stability
whenever a DNS-SD service calls UnregisterService. When a service
unregisters (e.g. on process restart), manager_refresh_rrs() clears and
re-adds all RRs in PROBING state, which sends a multicast announcement
(QR=1). The kernel reflects this back to resolved's own socket. Because
the local-address check was moved inside the query-only branch by the
reverted commit, the reply path in on_mdns_packet() is now unguarded.
The looped-back announcement matches the pending probe transaction and
completes it with DNS_TRANSACTION_SUCCESS. Since the zone item is still
in PROBING state (not ESTABLISHED), dns_zone_item_notify() sets
we_lost=true and calls dns_zone_item_conflict(), which invokes
manager_next_hostname() and renames the hostname (e.g. foo.local →
foo4.local). This happens reliably on every restart of any service using
RegisterService/UnregisterService (homebridge, avahi-compat wrappers,
etc.).
The top-level local-address check in on_mdns_packet() suppresses all
looped-back multicast traffic before the reply/query split. Restoring it
there is consistent with the overall design: dns_scope_check_conflicts()
already has its own manager_packet_from_local_address() guard and is
unaffected.
A more targeted long-term fix (e.g. guarding dns_transaction_process_reply()
for mDNS, or avoiding unnecessary re-probing of already-established records
in manager_refresh_rrs()) can be pursued separately.
repart: trim NUL bytes from verity sig split artifact
The verity signature partition content is a bare JSON object. Repart
pads it with zeros to fill the GPT partition. But when splitting out
the content as an individual file, the padding remains, so it's not
a valid text file.
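Trimming the padding amounts to dropping trailing NUL bytes from the buffer. A minimal sketch, using a hypothetical helper rather than the function repart actually uses:

```c
#include <stddef.h>

/* Returns the length of buf once trailing NUL padding is removed.
 * Hypothetical helper for illustration only. */
size_t trim_trailing_nul(const char *buf, size_t len) {
        while (len > 0 && buf[len - 1] == '\0')
                len--;

        return len;
}
```

Only trailing bytes are dropped, so the JSON object itself is written out unchanged.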
gpt-auto-generator: do not fail on missing libcryptsetup when verity is not used
add_veritysetup() is called unconditionally from add_root_mount() and
add_usr_mount() whenever in_initrd() is true, to generate units that
only activate if verity devices appear. However, when compiled without
libcryptsetup, this function returned a hard error, causing the entire
generator to fail even when no verity protection is in use.
Change the #else fallback to log a debug message and return 0, matching
the pattern already used by add_root_cryptsetup().
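The shape of such a fallback is sketched below. The function name, the log text and the default value of HAVE_LIBCRYPTSETUP are assumptions made for this example; the real generator logs through systemd's own logging helpers.

```c
#include <stdio.h>

#ifndef HAVE_LIBCRYPTSETUP
#define HAVE_LIBCRYPTSETUP 0 /* assumption for this sketch: built without it */
#endif

/* Hypothetical stand-in for add_veritysetup(). */
int add_veritysetup_sketch(const char *which) {
#if HAVE_LIBCRYPTSETUP
        /* ... the verity units would be generated here ... */
        (void) which;
        return 1;
#else
        /* Not an error: verity simply cannot be set up in this build. */
        fprintf(stderr, "debug: built without libcryptsetup, skipping verity setup for %s\n", which);
        return 0;
#endif
}
```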
test: wrap mount/umount when running with sanitizers
On Fedora Rawhide mount/umount is linked against libsystemd, which then
breaks the binaries in sanitizer runs, as we try to run instrumented
code from an uninstrumented binary:
bash-5.3# ldd /usr/bin/mount
linux-vdso.so.1 (0x00007fa757ef9000)
libmount.so.1 => /lib64/libmount.so.1 (0x00007fa757e84000)
libselinux.so.1 => /lib64/libselinux.so.1 (0x00007fa757e51000)
libc.so.6 => /lib64/libc.so.6 (0x00007fa757c56000)
libblkid.so.1 => /lib64/libblkid.so.1 (0x00007fa757c16000)
libsystemd.so.0 => /lib64/libsystemd.so.0 (0x00007fa757400000)
libpcre2-8.so.0 => /lib64/libpcre2-8.so.0 (0x00007fa75734f000)
/lib64/ld-linux-x86-64.so.2 (0x00007fa757efb000)
libclang_rt.asan.so => /usr/lib/clang/22/lib/x86_64-redhat-linux-gnu/libclang_rt.asan.so (0x00007fa756800000)
libm.so.6 => /lib64/libm.so.6 (0x00007fa7566e4000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007fa7566b7000)
libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00007fa756400000)
bash-5.3# mount
==458==ASan runtime does not come first in initial library list; you should either link runtime to your application or manually preload it with LD_PRELOAD.
This then breaks the whole machine, as mount is quite essential during
boot.
Let's just add mount/umount to the list of wrapped binaries to fix this.
Philip Withnall [Wed, 22 Apr 2026 16:19:21 +0000 (17:19 +0100)]
sysupdate: Prevent a possible invalid partial+pending state on an instance
If the resource code is recursing, it’s possible for one iteration to
set a partial flag, and then a recursive iteration to set a pending flag
(or vice-versa). It doesn’t make sense to have both set at the same time
for a specific instance, so make sure to clear the other flag when
setting one of them.
Add some assertions to make this invariant clearer and easier to debug
if it fails.
Chris Hofer [Mon, 20 Apr 2026 14:55:38 +0000 (16:55 +0200)]
build: Compile fuzz-journald-util.c only if want_fuzz_tests
fuzz-journald-util.c is compiled unconditionally even though fuzzing
tests aren't enabled. Only build it if fuzzing tests are configured.
This also ensures that the functions it uses from src/shared/tests.c are
available.
mkosi: trim verity.sig json files to remove NUL padding before passing to jq
jq started rejecting input that has NUL bytes to fix some security issues,
so we need to trim the verity.sig json files, which are spat out with
the NUL bytes padding from the GPT partition content.
‣ Running postinstall script /home/runner/work/systemd/systemd/mkosi/mkosi.postinst.chroot…
jq: parse error: Invalid numeric literal at EOF at line 1, column 16384
‣ "/work/postinst final" returned non-zero exit code 5.
It turns out that the real reason behind this failure is that the machine was
under heavy load due to a busy-loop from the stub init. The cause of
this is a bug in bash, where running commands that fork (i.e. not
built-ins) can cause a permanent busy-loop due to a desync in trap
handling if you send the signals to the bash process _just right_:
In nspawn.c's run_container() the child_netns_fd = receive_one_fd(...)
failure path logged 'r' instead of the negative errno returned in
child_netns_fd, so the actual error from receive_one_fd was being
overwritten by whatever 'r' happened to hold. The other receive_one_fd
call sites in the same function use the returned fd variable directly
(mntns_fd, etc.), so align this one.
In shared/nsresource.c's nsresource_add_cgroup() the cgroup_fd_idx =
sd_varlink_push_dup_fd(...) failure path logged userns_fd_idx, which
is the previous successful push's index, not the negative errno we
just got from pushing cgroup_fd. Log cgroup_fd_idx instead.
Both were flagged by static analysis (#41709) and match the immediately
preceding userns_fd-path pattern that was presumably copy-pasted.
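The copy-paste pattern and its fix can be illustrated as follows. push_stub() and report_failure() are hypothetical stand-ins for a call that returns an index or a negative errno and for the logging helper.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for a call returning an index on success or a negative errno. */
int push_stub(int fail_with) {
        return fail_with != 0 ? -fail_with : 3;
}

/* Stand-in for the logging helper: prints and passes the error through. */
int report_failure(int error) {
        fprintf(stderr, "Failed to push fd: %s\n", strerror(-error));
        return error;
}

int push_two(int second_error) {
        int first_idx, second_idx;

        first_idx = push_stub(0);
        if (first_idx < 0)
                return report_failure(first_idx);

        second_idx = push_stub(second_error);
        if (second_idx < 0)
                /* Report second_idx, not first_idx, which holds a valid index. */
                return report_failure(second_idx);

        return 0;
}
```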
compress: gracefully handle a truncated ZSTD frame
If a journal file contains a truncated ZSTD frame (i.e. a frame with
Frame_Content_Size > 0, but with not enough data in Data_Block),
ZSTD_decompressStream() would return a non-zero, non-error value. This
would then skip the error path in the ZSTD_isError() branch and we'd hit
the following assert:
Let's handle this situation gracefully and return EBADMSG instead.
Also, add another journalctl invocation to the corrupted-journals test
that goes through the sd_journal_get_data() -> decompress_startswith_zstd()
code path which, among other things, covers the issue when run on the
provided journal file.
strxcpyx: add a paranoia check for vsnprintf()'s return value
vsnprintf() can, under some circumstances, return a negative value, namely
on encoding errors when converting wchars to multi-byte characters.
This would then wreak havoc in the arithmetic we do following the
vsnprintf() call. However, since we never do any wchar shenanigans in
our code it should never happen.
Let's encode this assumption into the code as an assert(), similarly to how
we already do this in other places (like strextendf_with_separator()).
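A simplified sketch of the idea follows. append_format() is a hypothetical helper written for this example, not the actual strxcpyx code.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Writes formatted text at *dest, advances *dest past it and returns the
 * space that is left. Hypothetical helper, not the real strxcpyx code. */
size_t append_format(char **dest, size_t size, const char *fmt, ...) {
        va_list ap;
        int n;

        if (size == 0)
                return 0;

        va_start(ap, fmt);
        n = vsnprintf(*dest, size, fmt, ap);
        va_end(ap);

        /* A negative result would corrupt the arithmetic below. */
        assert(n >= 0);

        if ((size_t) n >= size) {
                /* Output was truncated: the buffer is full. */
                *dest += size - 1;
                return 0;
        }

        *dest += n;
        return size - (size_t) n;
}
```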
The NOTES section in os-release(5) contains unusual formatting.
Switch function and ulink tags and remove a newline within ulink text to
keep the entry formatting in sync with others. Also, this preserves the
formatting within the text itself.
mountpoint-util: initialize mnt_id for name_to_handle_at(AT_HANDLE_MNT_ID_UNIQUE)
Suppress the following message:
```
$ sudo valgrind --leak-check=full build/networkctl dhcp-lease wlp59s0
==175708== Memcheck, a memory error detector
==175708== Copyright (C) 2002-2024, and GNU GPL'd, by Julian Seward et al.
==175708== Using Valgrind-3.26.0 and LibVEX; rerun with -h for copyright info
==175708== Command: build/networkctl status wlp59s0
==175708==
==175708== Conditional jump or move depends on uninitialised value(s)
==175708== at 0x4BC33D1: inode_same_at (stat-util.c:610)
==175708== by 0x4BF1972: inode_same (stat-util.h:86)
==175708== by 0x4BF48FE: running_in_chroot (virt.c:817)
==175708== by 0x4B16643: running_in_chroot_or_offline (verbs.c:37)
==175708== by 0x4B175CE: _dispatch_verb_with_args (verbs.c:136)
==175708== by 0x4B17868: dispatch_verb (verbs.c:160)
==175708== by 0x407CBB: networkctl_main (networkctl.c:249)
==175708== by 0x407D06: run (networkctl.c:263)
==175708== by 0x407D39: main (networkctl.c:266)
==175708==
```
Not sure if it is an issue in valgrind or glibc, but at least there is
nothing we can do except for working around it.
If unprivileged_mode is false then verify_esp() will treat access errors
like any other and log about them. Here we set it to false, hence
there's no point to log a 2nd time.
boot: never auto-boot a menu entry with the non-default profile
When figuring out which menu entry to pick by default, let's not
consider any with a profile number > 0. This reflects the fact that
additional profiles are generally used for
debug/recovery/factory-reset/storage target mode boots, and those should
never be auto-selected. Hence do a simple check: if profile != 0, simply
do not consider the entry as a default.
We might eventually want to beef this up, and add a property one can set
in the profile metadata that controls this behaviour, but for now let's
just do this simple fix.
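The check can be pictured with the sketch below. MenuEntry and pick_default_entry() are hypothetical, and picking the first eligible entry is a simplification made for this example; the real default selection considers more than the profile number.

```c
#include <stddef.h>

/* Hypothetical menu entry; profile 0 is the default profile. */
typedef struct {
        const char *id;
        unsigned profile;
} MenuEntry;

/* Returns the index of the first entry eligible as default, or -1 if
 * every entry belongs to a non-default profile. */
int pick_default_entry(const MenuEntry *entries, size_t n) {
        for (size_t i = 0; i < n; i++) {
                if (entries[i].profile != 0)
                        continue; /* never auto-boot a non-default profile */

                return (int) i;
        }

        return -1;
}
```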
namespace: don't log misleading error in the r > 0 path
fd_is_fs_type() returns < 0 for errors, 0 for false, and > 0 for true, so
in the r > 0 branch we'd most likely report EPERM together with the error
message, which is misleading.
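A sketch of handling such a tri-state return value without treating the "true" case as an errno. The wrapper name, the messages and the choice of -EINVAL are assumptions made for this example.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* r follows the convention above: < 0 error, 0 false, > 0 true.
 * Hypothetical wrapper: returns 0 to proceed or a negative errno. */
int check_fs_type_result(int r) {
        if (r < 0) {
                fprintf(stderr, "Failed to check file system type: %s\n", strerror(-r));
                return r;
        }

        if (r > 0) {
                /* r is a boolean here, not an errno, so it must not be
                 * formatted as an error code (1 would read as EPERM). */
                fprintf(stderr, "Unexpected file system type, refusing.\n");
                return -EINVAL;
        }

        return 0;
}
```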
timesync: verify the actual size of the received data
iov.iov_len doesn't change after calling recvmsg() so it remains set to
sizeof(ntpmsg), which makes the check for a short packet always false.
Let's fix that by checking the actual size of the received data instead.
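The difference is easy to see in a small sketch. receive_full() is a hypothetical helper, and returning -EBADMSG for a short packet is an assumption made for this example.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Receives one datagram and rejects it if fewer than `expected` bytes
 * arrived. Hypothetical helper for illustration. */
int receive_full(int fd, void *buf, size_t expected) {
        struct iovec iov = { .iov_base = buf, .iov_len = expected };
        struct msghdr msg = { .msg_iov = &iov, .msg_iovlen = 1 };
        ssize_t len;

        len = recvmsg(fd, &msg, 0);
        if (len < 0)
                return -errno;

        /* iov.iov_len still equals `expected` at this point; only the
         * return value of recvmsg() says how much data was received. */
        if ((size_t) len < expected)
                return -EBADMSG;

        return 0;
}
```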
Kai Lüke [Thu, 16 Apr 2026 06:24:27 +0000 (15:24 +0900)]
sysupdated: don't crash when an mstack machine image is found
As soon as machinectl list-images reports an mstack entry, updatectl fails
because systemd-sysupdated crashes on a failed assertion: the mstack case
was not handled.
For now mstack is not supported as an image type for sysupdate to operate
on, so we can skip it.
sd-varlink: Don't log successful sentinel error dispatch as a failure
sd_varlink_error() deliberately returns a negative errno mapped from
the error id on success so callbacks can `return sd_varlink_error(...);`
to enqueue the reply and propagate a matching errno at once. When
varlink_dispatch_method() dispatches a configured error sentinel itself,
it doesn't need that mapping — but it was treating any negative return
as a dispatch failure and logging "Failed to process sentinel" even
though the error reply had been successfully enqueued.
Detect success via the state transition to VARLINK_PROCESSED_METHOD
instead, so only genuine enqueue failures are logged.
* 207e2d0044 Stop building support for openssl engines
* 36a234147f Upload sources
* 3681163f81 Version 260.1
* 8f4f0f58e3 Version 260
* e3fab23aa0 Version 260~rc4
* e4c1c2100b Version 260~rc3
* 453696813e Fix typo in unit name in %post scriptlet
* 154edb7cdb Silence false positive "HWID match failed, no DT blob" error (rhbz#2444759)
* 03b6637c35 riscv64 port has LTO disabled
* ce1dec6a40 Version 260~rc2
* 809049777c Add patch for symlink creation error
* 6ff27708f7 Enable getty@.service through presets
* ba7807fbce Drop scriptlet for upgrades from versions <253
* 455f277188 Move support for tpm2 to systemd-udev subpackage
* 0183bc784e Version 260~rc1
journal-upload: also disable VERIFYHOST when --trust=all is used
When --trust=all disables CURLOPT_SSL_VERIFYPEER, the residual
CURLOPT_SSL_VERIFYHOST check is ineffective since an attacker can
present a self-signed certificate with the expected hostname. Disable
both for consistency and log that server certificate verification is
disabled.
machined: pass user as positional argument in machine_default_shell_args()
Instead of interpolating the user name directly into the sh -c script
body via asprintf %s, pass it as a positional parameter ($1) in a
separate argv entry. This avoids the user string being parsed as part
of the shell script syntax.
Also validate the user name in bus_machine_method_open_shell() with
valid_user_group_name(), matching the validation already done on the
Varlink path via json_dispatch_const_user_group_name().
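The positional-parameter approach can be sketched as below. The script body is a placeholder chosen for this example and build_shell_argv() is a hypothetical helper, not the code in machine_default_shell_args().

```c
#include <stddef.h>

/* Fills argv (six slots) so that `user` reaches the script as "$1" instead
 * of being spliced into the script text. The script is a placeholder. */
void build_shell_argv(const char *user, const char *argv[6]) {
        argv[0] = "/bin/sh";
        argv[1] = "-c";
        argv[2] = "printf 'user: %s\\n' \"$1\""; /* fixed script, no interpolation */
        argv[3] = "sh";                          /* becomes $0 */
        argv[4] = user;                          /* becomes $1 */
        argv[5] = NULL;
}
```

Because the user name travels in its own argv entry, shell metacharacters in it are never parsed as script syntax.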
logind: reject wall messages containing control characters
method_set_wall_message() and the property setter only checked the
message length but not its content. Since wall messages are broadcast
to all TTYs, control characters in the message could interfere with
terminal state. Reject messages containing control characters other
than newline and tab.
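A byte-wise version of such a check might look like this. wall_message_is_safe() is a hypothetical helper, and treating bytes below 0x20 plus 0x7f as control characters is an assumption of this sketch; logind may use a different helper.

```c
#include <stdbool.h>

/* Returns true if s contains no control characters other than '\n' and
 * '\t'. Bytes >= 0x80 (multi-byte UTF-8) are passed through untouched. */
bool wall_message_is_safe(const char *s) {
        for (const unsigned char *p = (const unsigned char *) s; *p; p++) {
                if (*p == '\n' || *p == '\t')
                        continue;

                if (*p < 0x20 || *p == 0x7f)
                        return false;
        }

        return true;
}
```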
core: add missing SELinux access checks when listing units
Add mac_selinux_unit_access_check_varlink() to the unit enumeration
loop in vl_method_list_units(), silently skipping units the caller
is not permitted to see, matching the D-Bus ListUnits behavior.
Add mac_selinux_access_check_varlink() to vl_method_describe_manager().
Yu Watanabe [Sun, 22 Mar 2026 14:39:38 +0000 (23:39 +0900)]
dhcp: fix user class and vendor specific option assignment
The commit 6d7cb9a6b8361d2b327222bc12872a3676358bc3 fixes the assignment
of these options when specified through SendOption=. However, it
breaks when they are specified through UserClass= or SendVendorOption=.
When UserClass= or SendVendorOption= is specified, the option length is
calculated from the sd_dhcp_client.user_class or .vendor_options. Hence,
we can use 0 for the length in that case.
Milan Kyselica [Sat, 11 Apr 2026 08:26:13 +0000 (10:26 +0200)]
boot: fix loop bound and OOB in devicetree_get_compatible()
The loop used the byte offset end (struct_off + struct_size) as the
iteration limit, but cursor[i] indexes uint32_t words. This reads
past the struct block when end > size_words.
Use size_words (struct_size / sizeof(uint32_t)) which is the correct
number of words to iterate over.
Also fix a pre-existing OOB in the FDT_BEGIN_NODE handler: the guard
i >= size_words is always false inside the loop (since the loop
condition already ensures i < size_words), so cursor[++i] at the
boundary reads one word past the struct block. Use i + 1 >= size_words
to check before incrementing.
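The boundary check can be illustrated with a reduced example. TOKEN_WITH_OPERAND and sum_operands() are hypothetical; they only model a token that consumes the following word, as the FDT_BEGIN_NODE handler does.

```c
#include <stddef.h>
#include <stdint.h>

#define TOKEN_WITH_OPERAND 1u /* hypothetical token that consumes the next word */

/* Sums the operands of all tokens; returns -1 if a token lacks its operand. */
int64_t sum_operands(const uint32_t *cursor, size_t size_words) {
        int64_t sum = 0;

        for (size_t i = 0; i < size_words; i++) {
                if (cursor[i] != TOKEN_WITH_OPERAND)
                        continue;

                /* `i >= size_words` is always false inside the loop. The
                 * operand lives at i + 1, so that index must be in range. */
                if (i + 1 >= size_words)
                        return -1;

                sum += cursor[++i];
        }

        return sum;
}
```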
Milan Kyselica [Sat, 11 Apr 2026 08:25:19 +0000 (10:25 +0200)]
boot: fix integer overflow and division by zero in BMP splash parser
Bound image dimensions before computing row_size to prevent overflow
in the depth * x multiplication on 32-bit. Without this, crafted
dimensions like depth=32 x=0x10000001 wrap to a small row_size that
passes all subsequent checks.
Reject channel masks where all bits are set (popcount == 32), since
1U << 32 is undefined behavior and causes division by zero on
architectures where it evaluates to zero. Move the validation before
computing derived values for clarity. Use unsigned 1U in shifts to
avoid signed integer overflow UB for popcount == 31.
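The mask validation can be sketched as follows. channel_range() is a hypothetical helper that counts the set bits by hand; the real parser is structured differently.

```c
#include <stdbool.h>
#include <stdint.h>

/* Computes 1U << popcount(mask), the number of levels a channel can take.
 * Returns false when all 32 bits are set, as the shift would be undefined. */
bool channel_range(uint32_t mask, uint32_t *ret) {
        unsigned bits = 0;

        for (uint32_t m = mask; m != 0; m >>= 1)
                bits += m & 1u;

        if (bits == 32)
                return false; /* 1U << 32 is undefined behavior */

        *ret = 1U << bits; /* unsigned shift: well defined for bits == 31 */
        return true;
}
```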
journal: limit decompress_blob() output to DATA_SIZE_MAX
We already have checks in place during compression that limit the data
we compress, so they shouldn't decompress to anything larger than
DATA_SIZE_MAX unless they've been tampered with. Let's make this
explicit and limit all our decompress_blob() calls in journal-handling
code to that limit.
One possible scenario this should prevent is when one tries to open and
verify a journal file that contains a compression bomb in its payload:
$ systemd-run --user --wait --pipe -- build-local/journalctl --verify --file=$PWD/test.journal
Running as unit: run-p682422-i4875779.service
000110: Invalid hash (00000000 vs. 11e4948d73bdafdd)
000110: Invalid object contents: Bad message
File corruption detected at /home/fsumsal/repos/@systemd/systemd/test.journal:272 (of 1249896 bytes, 0%).
FAIL: /home/fsumsal/repos/@systemd/systemd/test.journal (Bad message)
Finished with result: exit-code
Main processes terminated with: code=exited, status=1/FAILURE
Service runtime: 48.051s
CPU time consumed: 47.941s
Memory peak: 8G (swap: 0B)
The same could, in theory, be possible with just `journalctl --file=`, but
the reproducer would be a bit more complicated (I haven't tried it yet).
Lastly, the change in journal-remote is mostly hardening, as the maximum
input size to decompress_blob() there is mandated by MHD's connection
memory limit (set to JOURNAL_SERVER_MEMORY_MAX which is 128 KiB at the
time of writing), so the possible output size there is already quite
limited (e.g. ~800 - 900 MiB for xz-compressed data).
test-json: add iszero_safe guards for float division at index 0 and 1
The existing iszero_safe guards at index 9 and 10 were added to
silence Coverity, but the same division-by-float-zero warning also
applies to the divisions at index 0 (DBL_MIN) and 1 (DBL_MAX).
sd-varlink: scale down the limit of connections per UID to 128
1024 connections per UID is unnecessarily generous, so let's scale this
down a bit. D-Bus defaults to 256 connections per UID, but let's be even
more conservative and go with 128.
importctl: fix -N to actually clear keep-download flag
-N was clearing and re-setting the same bit in arg_import_flags_mask,
which is a no-op. It should clear the bit in arg_import_flags instead,
matching what --keep-download=no does via SET_FLAG().
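A small model of the fixed behavior follows. The struct, the bit value and the reading of the mask as "which bits were set explicitly" are assumptions made for this sketch.

```c
#include <stdint.h>

#define FLAG_KEEP_DOWNLOAD (UINT64_C(1) << 0) /* hypothetical bit value */

/* Hypothetical stand-in for arg_import_flags and arg_import_flags_mask. */
typedef struct {
        uint64_t flags;      /* the requested values */
        uint64_t flags_mask; /* assumed: which bits were set explicitly */
} ImportArgs;

/* Models -N: explicitly request that downloads are not kept. */
void apply_no_keep_download(ImportArgs *a) {
        a->flags &= ~FLAG_KEEP_DOWNLOAD;     /* clear the value itself */
        a->flags_mask |= FLAG_KEEP_DOWNLOAD; /* record the explicit choice */
}
```

Clearing and then setting the same bit in the mask alone, as before the fix, would leave the value in flags untouched.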
core: fix EBUSY on restart and clean of delegated services
When a service is configured with Delegate=yes and DelegateSubgroup=sub,
the delegated container may write domain controllers (e.g. "pids") into the
service cgroup's cgroup.subtree_control via its cgroupns root. On container
exit the stale controllers remain, and on service restart clone3() with
CLONE_INTO_CGROUP fails with EBUSY because placing a process into a cgroup
that has domain controllers in subtree_control violates the no-internal-
processes rule. The same issue affects systemctl clean, where cg_attach()
fails with EBUSY for the same reason.
Add unit_cgroup_disable_all_controllers() helper in cgroup.c that clears
stale controllers via cg_enable(mask=0) and updates cgroup_enabled_mask to
keep internal tracking in sync. Call it from service_start() and
service_clean() right before spawning, so that resource control is preserved
for any lingering processes from the previous invocation as long as possible.