Daan De Meyer [Fri, 27 Mar 2026 18:48:28 +0000 (18:48 +0000)]
stub: auto-detect console device and append console= to kernel command line
The Linux kernel does not reliably auto-detect serial consoles on
headless systems. While the docs claim serial is used as a fallback
when no VGA card is found, in practice CONFIG_VT's dummy console
(dummycon) registers early and satisfies the kernel's console
requirement, preventing the serial fallback from ever triggering. The
ACPI SPCR table can help on ARM/RISC-V where QEMU generates it, but
x86 QEMU does not produce SPCR, and SPCR cannot describe virtio
consoles at all. This means UKIs booted via sd-stub in headless VMs
produce no visible console output unless console= is explicitly
passed on the kernel command line.
Fix this by having sd-stub auto-detect the console type and append an
appropriate console= argument when one isn't already present.
Detection priority:
1. VirtIO console PCI device (vendor 0x1AF4, device 0x1003): if
exactly one is found, append console=hvc0. This takes highest
priority since a VirtIO console is explicitly configured by the
VMM (e.g. systemd-vmspawn's virtconsole device). If multiple
VirtIO console devices exist, we cannot determine which hvc index
is correct, so we skip this path entirely.
2. EFI Graphics Output Protocol (GOP): if present, don't add any
console= argument. The kernel will use the framebuffer console by
default, and adding a serial console= would redirect the primary
console away from the display.
3. Serial console: first, we count the total number of serial devices
via EFI_SERIAL_IO_PROTOCOL. If there are zero or more than one,
we bail out — with multiple UARTs, the kernel assigns ttyS indices
based on its own enumeration order and we cannot determine which
index the console UART will receive. Only when exactly one serial
device exists (guaranteeing it will be ttyS0) do we proceed to
verify it's actually used as a console by checking for UART device
path nodes (MESSAGING_DEVICE_PATH + MSG_UART_DP). The firmware's
ConOut handle is checked first; if it has no device path (common
with OVMF's ConSplitter virtual handle when using -nographic
-nodefaults), we fall back to enumerating all
EFI_SIMPLE_TEXT_OUTPUT_PROTOCOL handles and checking each one's
device path. The architecture-specific console argument is then
appended:
- x86: console=ttyS0
- ARM: console=ttyAMA0
- Others: console=ttyS0 (RISC-V, LoongArch, MIPS all use ttyS0)
Note on OVMF's VirtioSerialDxe: it exposes virtio serial ports with
the same UART device path nodes as real serial ports (ACPI PNP 0x0501
+ MSG_UART_DP), making them indistinguishable from real UARTs via
device path inspection alone. This is why we check for the VirtIO
console PCI device via EFI_PCI_IO_PROTOCOL before falling back to
device path analysis.
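The detection priority above can be summarized as a small decision
function. This is only a sketch: ConsoleProbe and pick_console_arg() are
hypothetical names, the EFI protocol queries that would fill in the
struct are assumed to happen elsewhere, and the caller is assumed to have
already checked that no console= argument is present.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical summary of the firmware probes; not sd-stub's real data structure. */
typedef struct ConsoleProbe {
        size_t n_virtio_consoles; /* PCI devices with vendor 0x1AF4, device 0x1003 */
        bool has_gop;             /* EFI Graphics Output Protocol is present */
        size_t n_serial_devices;  /* handles exposing EFI_SERIAL_IO_PROTOCOL */
        bool console_is_uart;     /* a console handle carries a UART device path node */
        bool is_arm;              /* ARM uses ttyAMA0, everything else ttyS0 */
} ConsoleProbe;

/* Returns the console= argument to append, or NULL to leave the command line alone. */
const char *pick_console_arg(const ConsoleProbe *p) {
        assert(p);

        /* 1. Exactly one VirtIO console: hvc0 is unambiguous. */
        if (p->n_virtio_consoles == 1)
                return "console=hvc0";

        /* 2. A display exists: keep the kernel's framebuffer console default. */
        if (p->has_gop)
                return NULL;

        /* 3. Exactly one serial device, and it is actually used as a console. */
        if (p->n_serial_devices != 1 || !p->console_is_uart)
                return NULL;

        return p->is_arm ? "console=ttyAMA0" : "console=ttyS0";
}
```

Because the VirtIO check runs first, a virtio serial port that OVMF
presents with UART device path nodes never reaches the serial branch
when exactly one VirtIO console device is present.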
Also add a minimal EFI_PCI_IO_PROTOCOL definition (proto/pci-io.h)
with just enough to call Pci.Read for vendor/device ID enumeration,
and add the MSG_UART_DP subtype to the device path header.
Co-developed-by: Claude Opus 4.6 <noreply@anthropic.com>
build(deps): bump the actions group with 3 updates
Bumps the actions group with 3 updates: [actions/upload-artifact](https://github.com/actions/upload-artifact), [redhat-plumbers-in-action/download-artifact](https://github.com/redhat-plumbers-in-action/download-artifact) and [softprops/action-gh-release](https://github.com/softprops/action-gh-release).
Updates `actions/upload-artifact` from 6 to 7
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](https://github.com/actions/upload-artifact/compare/v6...v7)
Updates `redhat-plumbers-in-action/download-artifact` from 1.1.5 to 1.1.6
- [Release notes](https://github.com/redhat-plumbers-in-action/download-artifact/releases)
- [Commits](https://github.com/redhat-plumbers-in-action/download-artifact/compare/103e5f882470b59e9d71c80ecb2d0a0b91a7c43b...03d5b806a9dca9928eb5628833fe81a0558f23bb)
Updates `softprops/action-gh-release` from 2.5.0 to 2.6.1
- [Release notes](https://github.com/softprops/action-gh-release/releases)
- [Changelog](https://github.com/softprops/action-gh-release/blob/master/CHANGELOG.md)
- [Commits](https://github.com/softprops/action-gh-release/compare/a06a81a03ee405af7f2048a818ed3f03bbf83c7b...153bb8e04406b158c6c84fc1615b65b24149a1fe)
Michal Rybecky [Wed, 1 Apr 2026 08:00:14 +0000 (10:00 +0200)]
hmac: erase key-derived stack buffers before returning
hmac_sha256() leaves four stack buffers containing key-derived material
(inner_padding, outer_padding, replacement_key, hash state) on the stack
after returning. The inner_padding and outer_padding arrays contain
key XOR 0x36 and key XOR 0x5c respectively, which are trivially
reversible to recover the original HMAC key.
This function is called with security-sensitive keys including the LUKS
volume key (cryptsetup-util.c), TPM2 PIN (tpm2-util.c), and boot secret
(tpm2-swtpm.c). The key material persists on the stack until overwritten
by later unrelated function calls.
Add CLEANUP_ERASE() to all four local buffers, following the same
pattern applied to tpm2-util.c in commit 6c80ce6 (PR #41394).
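To illustrate why the padding buffers are sensitive and what scope-based
erasing looks like, here is a minimal sketch. EraseCtx, erase_ctx_run(),
ERASE_ON_SCOPE_EXIT(), xor_pad() and inner_padding_first_byte() are
hypothetical stand-ins written for this example; they are not the actual
CLEANUP_ERASE() implementation and rely on the GCC/Clang cleanup
attribute.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HMAC_BLOCK_SIZE 64

/* Hypothetical stand-in for CLEANUP_ERASE(): remembers a buffer, wipes it on scope exit. */
typedef struct EraseCtx {
        void *p;
        size_t size;
} EraseCtx;

void erase_ctx_run(EraseCtx *c) {
        /* Volatile writes, so the compiler cannot drop the "dead" stores. */
        volatile uint8_t *v = c->p;

        for (size_t i = 0; i < c->size; i++)
                v[i] = 0;
}

#define ERASE_ON_SCOPE_EXIT(array) \
        __attribute__((cleanup(erase_ctx_run))) EraseCtx erase_##array = { (array), sizeof(array) }

/* XOR every key byte with a pad constant (0x36 for the inner pad, 0x5c for the outer one). */
void xor_pad(const uint8_t key[HMAC_BLOCK_SIZE], uint8_t pad, uint8_t out[HMAC_BLOCK_SIZE]) {
        for (size_t i = 0; i < HMAC_BLOCK_SIZE; i++)
                out[i] = key[i] ^ pad;
}

/* Uses a key-derived stack buffer that is wiped automatically on every return path. */
uint8_t inner_padding_first_byte(const uint8_t key[HMAC_BLOCK_SIZE]) {
        uint8_t inner_padding[HMAC_BLOCK_SIZE];
        ERASE_ON_SCOPE_EXIT(inner_padding);

        xor_pad(key, 0x36, inner_padding);
        return inner_padding[0]; /* value is read before the cleanup handler runs */
}
```

Applying xor_pad() with the same constant a second time yields the
original key again, which is why a leftover padding buffer is as
sensitive as the key itself.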
Daan De Meyer [Mon, 30 Mar 2026 08:43:38 +0000 (08:43 +0000)]
loop-util: work around kernel loop driver partition scan race
The kernel loop driver has a race condition in LOOP_CONFIGURE when
LO_FLAGS_PARTSCAN is set: it sends a KOBJ_CHANGE uevent (with
GD_NEED_PART_SCAN set) before calling loop_reread_partitions(). If
udev opens the device in response to the uevent before
loop_reread_partitions() runs, the kernel's blkdev_get_whole() sees
GD_NEED_PART_SCAN and triggers a first partition scan. Then
loop_reread_partitions() runs a second scan that drops all partitions
from the first scan (via blk_drop_partitions()) before re-adding them.
This causes partition devices to briefly disappear (plugged -> dead ->
plugged), which breaks systemd units with BindsTo= on the partition
device: systemd observes the dead transition, fails the dependent
units with 'dependency', and does not retry when the device reappears.
Work around this in loop_device_make_internal() by splitting the loop
device setup into two steps: first LOOP_CONFIGURE without
LO_FLAGS_PARTSCAN, then LOOP_SET_STATUS64 to enable partscan. This
avoids the race because:
1. LOOP_CONFIGURE without partscan: disk_force_media_change() sets
GD_NEED_PART_SCAN, but GD_SUPPRESS_PART_SCAN remains set. If udev
opens the device, blkdev_get_whole() calls bdev_disk_changed()
which clears GD_NEED_PART_SCAN, but blk_add_partitions() returns
early because disk_has_partscan() is false — no partitions appear,
the flag is drained harmlessly.
2. Between the two ioctls, we open and close the device to ensure
GD_NEED_PART_SCAN is drained regardless of whether udev processed
the uevent yet.
3. LOOP_SET_STATUS64 with LO_FLAGS_PARTSCAN: clears
GD_SUPPRESS_PART_SCAN and calls loop_reread_partitions() for a
single clean scan. Crucially, loop_set_status() does not call
disk_force_media_change(), so GD_NEED_PART_SCAN is never set again.
A proper kernel fix has been submitted:
https://lore.kernel.org/linux-block/20260330081819.652890-1-daan@amutable.com/T/#u
This workaround should be dropped once the fix is widely available.
Co-developed-by: Claude Opus 4.6 <noreply@anthropic.com>
Daan De Meyer [Tue, 10 Mar 2026 13:10:05 +0000 (14:10 +0100)]
nspawn: keep backing files for boot_id and kmsg bind mounts alive
Both setup_boot_id() and setup_kmsg() previously created temporary files
in /run, bind mounted them over their respective /proc targets, and then
immediately unlinked the backing files. While the bind mount keeps the
inode alive, the kernel marks the dentry as deleted.
This is a problem because bind mounts backed by unlinked files cannot be
replicated: both the old mount API (mount(MS_BIND)) and the new mount
API (open_tree(OPEN_TREE_CLONE) + move_mount()) fail with ENOENT when
the source mount references a deleted dentry. This affects
mount_private_apivfs() in namespace.c, which needs to replicate these
submounts when setting up a fresh /proc instance for services with
ProtectProc= or similar sandboxing options — with an unlinked backing
file, the boot_id submount simply gets lost.
Fix this by using fixed paths (/run/proc-sys-kernel-random-boot-id and
/run/proc-kmsg) instead of randomized tempfiles, and not unlinking them
after the bind mount. The files live in /run which is cleaned up on
shutdown anyway.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Daan De Meyer [Tue, 31 Mar 2026 19:18:12 +0000 (21:18 +0200)]
ci: Rework Claude review workflow to use CLI directly
Replace claude-code-action with a direct claude CLI invocation. This
gives us explicit control over settings, permissions, and output
handling.
Other changes:
- Prepare per-commit git worktrees with pre-generated commit.patch and
commit-message.txt files, replacing the pr-review branch approach.
- Use structured JSON output (--output-format stream-json --json-schema)
instead of having Claude write review-result.json directly.
- Use jq instead of python3 for JSON prettification.
- Add timeout-minutes: 60 to the review job.
- List tool permissions explicitly instead of using a wildcard.
- Fix sandbox filesystem paths to use regular paths instead of the "//"
prefix.
Daan De Meyer [Mon, 30 Mar 2026 13:51:48 +0000 (13:51 +0000)]
loop-util: use auto-detect open mode for loop device setup
When callers do not explicitly request read-only mode, pass open_flags
as -1 (auto-detect) instead of hardcoding O_RDWR. This enables the
existing O_RDWR-to-O_RDONLY retry logic in loop_device_make_by_path_at()
which falls back to O_RDONLY when opening the backing device with O_RDWR
fails with EROFS or similar errors.
Previously, callers passed O_RDWR explicitly when read-only mode was not
requested, which bypassed the retry logic entirely. This meant that
inherently read-only block devices (such as CD-ROMs) would fail to open
instead of gracefully falling back to read-only mode.
Also propagate the unresolved open_flags through
loop_device_make_by_path_at() into loop_device_make_internal() instead
of resolving it to O_RDWR early. For loop_device_make_by_path_memory(),
resolve to O_RDWR immediately since memfds are always writable.
In mstack, switch from loop_device_make() to
loop_device_make_by_path_at() with a NULL path, which reopens the
O_PATH file descriptor with the appropriate access mode. This is
necessary because the backing file descriptor is opened with O_PATH,
which prevents loop_device_make_internal() from auto-detecting the
access mode via fcntl(F_GETFL).
The previous code used strv_join() when it generated the log
message for `varlinkctl --exec`. However, this can lead to
inaccurate logging, so use `quote_command_line()` instead.
Michael Vogt [Tue, 24 Mar 2026 08:48:39 +0000 (09:48 +0100)]
varlinkctl: add support for `--exec` with `--upgrade`
Supporting `--exec` together with `--upgrade` is useful, so this
commit adds it. It does so by extracting a shared helper called
`exec_with_listen_fds()` and using it in both `verb_call()`
and `varlink_call_and_upgrade()`.
Michael Vogt [Tue, 24 Mar 2026 08:48:33 +0000 (09:48 +0100)]
varlinkctl: add protocol upgrade support
The varlink spec supports protocol upgrades and they are very
useful to e.g. transfer binary data directly via varlink. So
far varlinkctl/sd-varlink did not support this. This commit
adds support for it in varlinkctl by using the new code in
sd-varlink and the generalized socket-forward code.
Michael Vogt [Thu, 26 Mar 2026 15:51:39 +0000 (16:51 +0100)]
shared: rename internal variables in SimplexForwarder
The SimplexForwarder was using the naming of the SocketForwarder
for bi-directional sockets. This was to keep the diff small and
to make it easier to follow what changed and what was reused.
However, the name "client/server" no longer makes much sense for
the SimplexForwarder, which is not about client/server but really
just about read/write. So this commit adjusts the naming.
Michael Vogt [Tue, 24 Mar 2026 08:48:26 +0000 (09:48 +0100)]
shared: extend socket-forward to support fd-pairs too
Now that the socket forward code is extracted we can
extend it to not just support bidirectional sockets
but also input/output fd-pairs. This will be needed
for e.g. the varlinkctl protocol upgrade support where
one side of the connection is a fd-pair (stdin/stdout).
This is done by creating two half-duplex forwarders
that operate independently. This also simplifies some
state tracking: because each fd serves only one direction
we don't need to dynamically create the event mask with
EPOLLIN etc., it's enough to set it once. It also handles
non-pollable FDs transparently.
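As a rough illustration of one half-duplex direction, the sketch below
copies whatever is currently readable from one fd to another. The name
simplex_forward_once() is invented for this example, and it uses plain
blocking read()/write() instead of the event loop integration the real
forwarder has.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: forwards one chunk from read_fd to write_fd.
 * Returns the number of bytes forwarded, 0 on EOF, or a negative errno. */
ssize_t simplex_forward_once(int read_fd, int write_fd) {
        char buf[4096];
        ssize_t n;

        do
                n = read(read_fd, buf, sizeof(buf));
        while (n < 0 && errno == EINTR);
        if (n < 0)
                return -errno;

        /* write() may be short, so loop until the whole chunk is out. */
        for (ssize_t off = 0; off < n;) {
                ssize_t w = write(write_fd, buf + off, (size_t) (n - off));
                if (w < 0) {
                        if (errno == EINTR)
                                continue;
                        return -errno;
                }
                off += w;
        }

        return n;
}
```

A stdin/stdout pair talking to a socket would then use two such
forwarders, one from stdin to the socket and one from the socket to
stdout, each only ever waiting for input on its own read side.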
Thanks to Lennart for his excellent suggestions here.
Michael Vogt [Tue, 24 Mar 2026 08:48:20 +0000 (09:48 +0100)]
sd-varlink: add sd_varlink_call_and_upgrade() for protocol upgrades
The varlink spec supports protocol upgrades and they are very
useful to e.g. transfer binary data directly via varlink. So
far sd-varlink did not support this.
This commit adds a new public sd_varlink_call_and_upgrade()
that sends a method call, waits for the reply, then steals
the connection fds for raw I/O. It returns separate input_fd
and output_fd to support both bidirectional sockets and pipe
pairs.
A helper is extracted and shared between sd_varlink_call_full()
and sd_varlink_call_and_upgrade(). A new `protocol_upgrade`
bool in `struct sd_varlink` ensures that on a protocol upgrade
request we read exactly the varlink protocol bytes and no more,
leaving anything beyond that to the caller that speaks the
upgraded protocol.
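Varlink messages are JSON objects terminated by a NUL byte, so "reading
exactly the protocol bytes" means not consuming anything past that
terminator. The sketch below shows the simplest way to guarantee that,
reading one byte at a time. read_one_message() is a hypothetical helper
for illustration and is not claimed to be how sd-varlink implements it.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: reads a single NUL-terminated message from fd into buf and leaves
 * all later bytes unread. Returns the message length or a negative errno. */
ssize_t read_one_message(int fd, char *buf, size_t size) {
        size_t n = 0;

        assert(buf);
        assert(size > 0);

        for (;;) {
                char c;
                ssize_t k = read(fd, &c, 1);

                if (k < 0) {
                        if (errno == EINTR)
                                continue;
                        return -errno;
                }
                if (k == 0)
                        return -ECONNRESET; /* peer closed before the terminator */

                if (c == '\0') {
                        buf[n] = '\0';
                        return (ssize_t) n;
                }

                if (n + 1 >= size)
                        return -ENOBUFS; /* message does not fit */

                buf[n++] = c;
        }
}
```

After the call returns, the next read() on the same fd yields the first
byte of the upgraded protocol.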
Note that this is the client side of the library implementation
only for now. The server side needs work but this is already
useful as it makes it possible to talk to varlink servers that
speak protocol upgrades (like the Rust implementations of varlink).
Daan De Meyer [Sat, 28 Mar 2026 14:10:54 +0000 (14:10 +0000)]
terminal-util: fix boot hang from ANSI terminal size queries
Since v257, terminal_fix_size() is called during early boot via
console_setup() → reset_dev_console_fd() to query terminal dimensions
via ANSI escape sequences. This has caused intermittent boot hangs
where the system gets stuck with a blinking cursor and requires a
keypress to continue (see systemd/systemd#35499).
The function tries CSI 18 first, then falls back to DSR if that fails.
Previously, each method independently opened a non-blocking fd, disabled
echo/icanon, ran its query, restored termios, and closed its fd. This
created two problems:
1. Echo window between CSI 18 and DSR fallback: After CSI 18 times out
and restores termios (re-enabling ECHO and ICANON), there is a brief
window before DSR disables them again. If the terminal's CSI 18
response arrives during this window, it is echoed back to the
terminal — where the terminal interprets \e[8;rows;cols t as a
"resize text area" command — and the response bytes land in the
canonical line buffer as stale input that can confuse the DSR
response parser.
2. Cursor left at bottom-right on DSR timeout: The DSR method worked by
sending two DSR queries — one to save the cursor position, then
moving the cursor to (32766,32766) and sending another to read the
clamped position. If neither response was received (timeout), the
cursor restore was skipped (conditional on saved_row > 0), leaving
the cursor at the bottom-right corner of the terminal. The
subsequent terminal_reset_ansi_seq() then moved it to the beginning
of the last line via \e[1G, making boot output appear at the bottom
of the screen — giving the appearance of a hang even when the system
was still booting.
This commit fixes both issues:
- terminal_fix_size() now opens the non-blocking fd and configures
termios once for both query methods, so echo stays disabled for the
entire CSI 18 → DSR fallback sequence with no gap. tcflush(TCIFLUSH)
is called before each query to drain any stale input from the tty
input queue.
- The DSR method now uses DECSC (\e7) / DECRC (\e8) to save and restore
the cursor position via hardware, instead of querying it with a
separate DSR round-trip. All four sequences (DECSC, CUP to
bottom-right, DSR query, DECRC) are sent in a single write, so the
terminal processes DECRC and restores the cursor regardless of whether
userspace ever reads the DSR response. This eliminates the
cursor-at-bottom-right artifact on timeout and simplifies the read
loop to only need a single DSR response instead of two.
- The repeated setup boilerplate (dumb check, verify_same, fd_reopen,
termios save/disable) is extracted into terminal_prepare_query(),
shared by terminal_get_size_by_csi18(), terminal_get_size_by_dsr(),
and terminal_fix_size().
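For illustration, the combined DSR query and the parsing of its reply
could look like the sketch below. The names dsr_size_query and
parse_cursor_position_report() are invented for this example, and the
real code additionally deals with timeouts and partial reads.

```c
#include <assert.h>
#include <stdio.h>

/* All four sequences in one buffer, so they can be sent with a single write(). */
const char dsr_size_query[] =
        "\x1b" "7"          /* DECSC: save cursor position */
        "\x1b[32766;32766H" /* CUP: the terminal clamps this to its last row/column */
        "\x1b[6n"           /* DSR: ask for the cursor position */
        "\x1b" "8";         /* DECRC: restore cursor position */

/* Parses a cursor position report of the form ESC [ rows ; cols R.
 * Returns 0 on success, -1 if the input is not a complete report. */
int parse_cursor_position_report(const char *s, unsigned *ret_rows, unsigned *ret_cols) {
        unsigned rows, cols;
        int consumed = 0;

        assert(s);
        assert(ret_rows);
        assert(ret_cols);

        /* %n is only reached if the trailing 'R' matched, so consumed stays 0 otherwise. */
        if (sscanf(s, "\x1b[%u;%uR%n", &rows, &cols, &consumed) != 2 || consumed == 0)
                return -1;
        if (rows == 0 || cols == 0)
                return -1;

        *ret_rows = rows;
        *ret_cols = cols;
        return 0;
}
```

Since DECRC is already queued behind the DSR query, the terminal
restores the cursor even if the reply is never read.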
Fixes: systemd/systemd#35499
Co-developed-by: Claude Opus 4.6 <noreply@anthropic.com>
Daan De Meyer [Tue, 31 Mar 2026 07:55:18 +0000 (07:55 +0000)]
terminal-util: add CLEANUP_TERMIOS_RESET() for automatic termios restore
Add TERMIOS_NULL sentinel, TermiosResetContext, and CLEANUP_TERMIOS_RESET()
macro (modeled after CLEANUP_ARRAY()) to automatically restore terminal
settings when leaving scope, replacing manual goto+tcsetattr patterns.
Migrate ask_string_full(), terminal_get_cursor_position(),
get_default_background_color(), terminal_get_terminfo_by_dcs(),
terminal_get_size_by_dsr() and terminal_get_size_by_csi18() to use the new
cleanup macro, removing the goto-based cleanup labels and replacing them
with direct returns.
Co-developed-by: Claude Opus 4.6 <noreply@anthropic.com>
Daan De Meyer [Tue, 31 Mar 2026 09:39:21 +0000 (09:39 +0000)]
test-terminal-util: migrate to new assertion macros
Replace assert_se() calls with the more descriptive ASSERT_OK(),
ASSERT_OK_ZERO(), ASSERT_OK_ERRNO(), ASSERT_OK_POSITIVE(),
ASSERT_OK_EQ_ERRNO(), ASSERT_FAIL(), ASSERT_TRUE(), ASSERT_FALSE(),
ASSERT_EQ(), ASSERT_LE(), and ASSERT_NOT_NULL() macros throughout the
test file.
Co-developed-by: Claude Opus 4.6 <noreply@anthropic.com>
Refactor the option & verb table handling and convert a few more programs (#41335)
After having some more experience with how this works, I think some
changes are in order. So there are a handful of preparatory patches and
then conversion of a few progs that make use of the new functionality.
Valentin David [Mon, 30 Mar 2026 07:54:38 +0000 (09:54 +0200)]
discover-image: Ignore sysupdate temporary files
Sysupdate temporary file names do not match their extension-release names, so
they will always fail. That makes enabling any other sysexts/confexts fail,
which has catastrophic consequences. Unfortunately, since 260, sysupdate
leaves temporary files around for a long time instead of just while
downloading, so this kind of failure now happens much more often.
Luca Boccassi [Sat, 28 Mar 2026 22:46:35 +0000 (22:46 +0000)]
journald: add assert for allocated buffer size
Coverity flags allocated - 1 as a potential underflow when
allocated is 0. After GREEDY_REALLOC succeeds the buffer is
guaranteed non-empty, but Coverity cannot trace through the
conditional. Add an assert to document this.
Luca Boccassi [Sat, 28 Mar 2026 22:15:56 +0000 (22:15 +0000)]
nspawn-oci: add asserts for UID/GID validity after dispatch
Coverity flags UINT32_MAX - data.container_id as an underflow
when container_id could be UID_INVALID (UINT32_MAX). After
successful sd_json_dispatch_uid_gid(), the values are guaranteed
valid, but Coverity cannot trace through the callback. Add
asserts to document this invariant.
Luca Boccassi [Sat, 28 Mar 2026 22:06:51 +0000 (22:06 +0000)]
boot: clamp setup header copy size to sizeof(SetupHeader)
The setup_size field from the kernel image header is used as part
of the memcpy size. Clamp it to sizeof(SetupHeader) to ensure the
copy does not read beyond the struct bounds even if the kernel
image header contains an unexpected value.
Luca Boccassi [Sat, 28 Mar 2026 22:00:25 +0000 (22:00 +0000)]
creds-util: add assert for output buffer size overflow safety
Coverity flags the multi-term output.iov_len accumulation as a
potential overflow. Add an assert after the calculation to verify
the result is at least as large as the input, catching wraparound.
Luca Boccassi [Sat, 28 Mar 2026 21:56:41 +0000 (21:56 +0000)]
calendarspec: use ADD_SAFE for repeat offset calculation
Use overflow-safe ADD_SAFE() instead of raw addition when
computing the next matching calendar component with repeat.
On overflow, skip the component instead of using a bogus value.
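A sketch of the pattern, using the GCC/Clang overflow builtin.
add_safe_int() is a hypothetical stand-in for ADD_SAFE(), and
next_repeat_match() is a simplified illustration rather than the actual
calendarspec code.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical stand-in for ADD_SAFE(): true on success, false if a + b overflows. */
bool add_safe_int(int *ret, int a, int b) {
        return !__builtin_add_overflow(a, b, ret);
}

/* Finds the first value start + k * repeat (k >= 0) that is >= current.
 * Returns false if no such value is representable, so the caller can skip it. */
bool next_repeat_match(int start, int repeat, int current, int *ret) {
        int v = start;

        assert(repeat > 0);
        assert(ret);

        while (v < current)
                if (!add_safe_int(&v, v, repeat))
                        return false; /* overflow: skip this component */

        *ret = v;
        return true;
}
```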
Luca Boccassi [Sat, 28 Mar 2026 21:47:08 +0000 (21:47 +0000)]
test-strv: avoid unsigned wraparound in backwards iteration
Use pre-decrement starting from 3 instead of post-decrement
starting from 2, so that the unsigned counter does not wrap
past zero on the final iteration.
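The difference is easiest to see in a small sketch. check_backwards() is
a made-up helper and not the test code itself: the counter starts one
past the last index and is decremented before each use, so it ends at
exactly zero instead of wrapping around.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Walks items from the end and compares each entry against expected[].
 * *ret_final receives the counter value after the last iteration. */
bool check_backwards(const char *const *items, const char *const *expected,
                     size_t n, size_t *ret_final) {
        size_t i = n; /* one past the last index */

        assert(items);
        assert(expected);
        assert(ret_final);

        for (const char *const *s = items + n; s > items;) {
                s--;
                /* Pre-decrement: i is always within [0, n) when used as an index. */
                if (strcmp(*s, expected[--i]) != 0)
                        return false;
        }

        *ret_final = i; /* 0 after a full pass, never wrapped */
        return true;
}
```

With a post-decrement starting from n - 1, the indices used would be the
same, but the final decrement would wrap the unsigned counter past zero
after the last element.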
Luca Boccassi [Sat, 28 Mar 2026 21:41:02 +0000 (21:41 +0000)]
sd-bus: add assert_cc for message allocation size
Use CONST_ALIGN_TO to express the compile-time overflow check for
the ALIGN(sizeof(sd_bus_message)) + sizeof(BusMessageHeader)
allocation, since ALIGN() is not constexpr.
Luca Boccassi [Sat, 28 Mar 2026 21:29:58 +0000 (21:29 +0000)]
nss-myhostname: add asserts for buffer index accumulation
Coverity flags idx += 2*sizeof(char*) and idx += sizeof(char*)
as potential overflows. The idx is bounded by the ms buffer size
calculation, add asserts to document this.
Luca Boccassi [Sat, 28 Mar 2026 21:28:56 +0000 (21:28 +0000)]
tree-wide: add assert_cc for time constant multiplications
Coverity flags compile-time constant multiplications of
USEC_PER_SEC, USEC_PER_MSEC, and USEC_PER_HOUR as potential
overflows. Add assert_cc() to prove they fit at build time.
Luca Boccassi [Sat, 28 Mar 2026 21:20:39 +0000 (21:20 +0000)]
repart: add assert for offset + current_size overflow safety
Coverity flags a->after->offset + a->after->current_size as a
potential overflow. Both values are validated as not UINT64_MAX
by existing asserts, add an explicit overflow check to document
the invariant for static analyzers.
Luca Boccassi [Sat, 28 Mar 2026 21:19:14 +0000 (21:19 +0000)]
networkd-ndisc: add assert for DNSSL allocation overflow safety
Coverity flags ALIGN(sizeof(NDiscDNSSL)) + strlen(*j) + 1 as a
potential overflow. Domain names are protocol-bounded but add an
assert to make this explicit for static analyzers.
Luca Boccassi [Sat, 28 Mar 2026 21:14:35 +0000 (21:14 +0000)]
dns-packet: add asserts for allocation overflow safety
Coverity flags ALIGN(sizeof(DnsPacket)) + size calculations in
dns_packet_new() and dns_packet_dup() as potential overflows. The
sizes are bounded by DNS_PACKET_SIZE_MAX but add asserts to make
this explicit for static analyzers.
Luca Boccassi [Sat, 28 Mar 2026 21:12:31 +0000 (21:12 +0000)]
user-util: add asserts for buffer allocation overflow safety
Coverity flags ALIGN(sizeof(struct passwd/group)) + bufsize as
potential overflows in the getpw/getgr helpers. Add asserts to
make the bounds explicit for static analyzers.
Luca Boccassi [Sat, 28 Mar 2026 21:11:48 +0000 (21:11 +0000)]
sd-bus: add asserts for message size overflow safety
Coverity flags arithmetic in BUS_MESSAGE_SIZE(),
BUS_MESSAGE_BODY_BEGIN() and message_from_header() as potential
overflows. The values are validated at message creation time, but
add asserts to make the invariants explicit for static analyzers.
Luca Boccassi [Sat, 28 Mar 2026 21:03:14 +0000 (21:03 +0000)]
sd-daemon: add assert before CMSG_SPACE subtraction
Coverity flags the subtraction from msg_controllen as a potential
underflow. The CMSG_SPACE was added when send_ucred was set, and
the subtraction only runs when send_ucred was true, so it is safe.
Add an assert to document this invariant.
Luca Boccassi [Sat, 28 Mar 2026 21:01:34 +0000 (21:01 +0000)]
sd-json: silence false positive in sd_json_variant_filter
Same pattern as the fix for sd_json_variant_unset_field in 9b3715d529e4eba79e19c87e85583f7be5ee2c95: cache the element
count in a local variable and assert it is at least 2 before
subtracting.
Luca Boccassi [Sat, 28 Mar 2026 20:59:35 +0000 (20:59 +0000)]
journal: add assert for max_size overflow safety
Coverity flags max_size*2 as a potential overflow. The value is
bounded by MAX_SIZE_UPPER (128 MiB) or JOURNAL_COMPACT_SIZE_MAX
(4 GiB), so doubling is safe within uint64_t. Add an assert to
document this.
Daan De Meyer [Sun, 29 Mar 2026 21:11:52 +0000 (21:11 +0000)]
repart: allow --el-torito= with any --empty= value
The restriction requiring --empty= to be require, force, or create
when using --el-torito= is unnecessary.
context_verify_eltorito_overlap() already validates that the ISO 9660
blocks don't collide with GPT partition entries or the first usable
LBA, which is sufficient to guarantee safety regardless of the empty
mode.
This is needed for two-stage image builds where the first stage creates
the usr and verity partitions, and the second stage adds --el-torito=
to produce a bootable ISO with a UKI containing usrhash= derived from
the verity hash of the first stage. In the second stage, repart runs
with --empty=allow since the image already exists.
Co-developed-by: Claude Opus 4.6 <noreply@anthropic.com>
Valentin David [Sat, 21 Mar 2026 14:42:13 +0000 (15:42 +0100)]
repart: Optionally write a minimal El Torito boot catalog for EFI
This only points the firmware to the ESP. The ISO9660 is empty.
The initramfs should create a loop device to change block size
and enable GPT partitions.
This was tested using OVMF on qemu, with:
`-drive if=pflash,file=OVMF_CODE.fd,readonly=on,format=raw -drive if=pflash,file=OVMF_VARS.fd,format=raw -drive if=none,id=live-disk,file=disk.iso,media=cdrom,format=raw,readonly=on -device virtio-scsi-pci,id=scsi -device scsi-cd,drive=live-disk`
And a simple definition:
```
[Partition]
Type=esp
Format=vfat
CopyFiles=/usr/lib/systemd/boot/efi/systemd-bootx64.efi:/EFI/BOOT/BOOTX64.EFI
```
I tried to implement a varlink service using sd-varlink, and
not being able to use the sentinel approach is extremely
painful. This is useful internally and likewise externally.
Valentin David [Mon, 16 Mar 2026 21:21:55 +0000 (22:21 +0100)]
repart: Make it possible to set persistent allow-discards activation flag
AllowDiscards= sets allow-discards in the persistent flags, so that the
device is automatically activated with that option. This is useful for
devices discovered through gpt-auto-generator, since no kernel command
line option is needed to set it.
Markdown and HTML don't support mixing ordered and unordered items
within a single list. This means the previous syntax actually produced
three separate lists.
Also, markdown converters don't necessarily respect the first number in
an ordered list, and may just reset it to one. This is the case for
the one that generates the systemd.io page. And even if that wasn't the
case, the numbering of the second ordered list would be off by one.
Luca Boccassi [Sat, 28 Mar 2026 20:24:22 +0000 (20:24 +0000)]
recurse-dir: add assert_cc for DIRENT_SIZE_MAX allocation
Coverity flags offsetof(DirectoryEntries, buffer) + DIRENT_SIZE_MAX * 8
as a potential overflow. All operands are compile-time constants, so add
an assert_cc() to prove this at build time.
Luca Boccassi [Sat, 28 Mar 2026 20:13:03 +0000 (20:13 +0000)]
compress: add assert for space doubling overflow safety
Coverity flags 2 * space as a potential overflow. The space value
is bounded by prior allocation success, but add an explicit assert
to document this for static analyzers.
Luca Boccassi [Sat, 28 Mar 2026 20:10:14 +0000 (20:10 +0000)]
importd: add assert for log_message_size accumulation bounds
Coverity flags log_message_size += l as a potential overflow, but l
is bounded by the read() count parameter which is
sizeof(log_message) - log_message_size. Add an assert to make this
invariant explicit.
Luca Boccassi [Sat, 28 Mar 2026 20:08:55 +0000 (20:08 +0000)]
sd-bus: add asserts for rbuffer_size accumulation bounds
Coverity flags rbuffer_size += k as a potential overflow, but k is
always bounded by the iov size (which is the difference between the
allocated buffer and current rbuffer_size). Add asserts to make this
invariant explicit.
Luca Boccassi [Sat, 28 Mar 2026 19:55:35 +0000 (19:55 +0000)]
uid-range: add asserts to document overflow safety in coalesce
Coverity flags the x->start + x->nr and y->start + y->nr additions
as potential overflows. These are safe because uid_range_add_internal()
validates start + nr <= UINT32_MAX before inserting entries. Add asserts
to document this invariant for static analyzers.
Luca Boccassi [Sat, 28 Mar 2026 19:52:09 +0000 (19:52 +0000)]
sd-event: add assert to help static analysis trace signal bounds
Coverity flags the signal_sources array access as a potential
out-of-bounds read because it cannot trace through the SIGNAL_VALID()
macro to know that ssi_signo < _NSIG. Add an explicit assert after
the runtime check to make the constraint visible to static analyzers.
Luca Boccassi [Sat, 28 Mar 2026 19:49:20 +0000 (19:49 +0000)]
cpu-set-util: add asserts to guide static analysis after realloc
Coverity flags CPU_SET_S() calls as potential out-of-bounds writes
because it cannot trace that cpu_set_realloc() guarantees the
allocated buffer is large enough for the given index. Add asserts
to make the size invariant explicit.
Luca Boccassi [Sat, 28 Mar 2026 19:47:27 +0000 (19:47 +0000)]
debug-generator: use unsigned bit shift for breakpoint flags
Using signed int literal '1' in left shift can lead to undefined
behavior if the shift amount causes overflow of a signed int. Use
UINT32_C(1) since the result is stored in a uint32_t variable.
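A minimal sketch of the idea, with flag_bit() as a hypothetical helper:
with a 32-bit int, 1 << 31 shifts into the sign bit of a signed value,
whereas the same shift on an unsigned 32-bit operand is well defined.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns the flag mask for bit n of a uint32_t. */
uint32_t flag_bit(unsigned n) {
        assert(n < 32);

        /* Unsigned operand: every shift from 0 to 31 is well defined. */
        return UINT32_C(1) << n;
}
```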
Luca Boccassi [Sat, 28 Mar 2026 19:35:36 +0000 (19:35 +0000)]
scsi_id: use strscpy instead of strncpy for wwn fields
strncpy does not null-terminate the destination buffer if the source
string is longer than the count parameter. Since wwn and
wwn_vendor_extension are char[17] and we copy up to 16 bytes, there's
a risk of missing null termination. Use strscpy which always
null-terminates.
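The difference can be shown with a small sketch. copy_terminated() is a
hypothetical helper written for this example; it mimics the "always
terminate, truncate if needed" behaviour described above but is not the
strscpy implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies src into dst (of the given size), truncating if necessary
 * and always NUL-terminating. Returns the number of bytes copied, excluding the NUL. */
size_t copy_terminated(char *dst, size_t size, const char *src) {
        size_t n;

        assert(dst);
        assert(src);
        assert(size > 0);

        n = strlen(src);
        if (n >= size)
                n = size - 1; /* truncate, leaving room for the terminator */

        memcpy(dst, src, n);
        dst[n] = '\0';
        return n;
}
```

With a char[17] destination and a source of 16 or more bytes,
strncpy(dst, src, 16) leaves dst[16] untouched, while the helper above
always writes the terminator there.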
Luca Boccassi [Sat, 28 Mar 2026 18:55:37 +0000 (18:55 +0000)]
stat-util: add assert to silence coverity
Coverity thinks _mntidb can be used uninitialized, but this
is not the case when r == 0. Add a bool variable to make it
clearer instead of reusing 'r' later, and an assert to guide
static analyzers.
Luca Boccassi [Fri, 27 Mar 2026 20:59:46 +0000 (20:59 +0000)]
mkosi: pull in gnu coreutils for Ubuntu 26.04 and newer
The default coreutils in Ubuntu 26.04 moved to uutils, which is broken
in many subtle and annoying ways, breaking various tests. It's also
a giant monolithic megabinary which makes the minimal image size
go up and breaks other tests.
Force the gnu coreutils to be pulled into all images.
Luca Boccassi [Fri, 27 Mar 2026 17:02:41 +0000 (17:02 +0000)]
test: check for bin/bash in dissect --mtree instead of cat
Ubuntu is doing shenanigans with their coreutils so they are now
symlinks instead of binaries, so the grep fails. Check bash instead
to fix test failure on 26.04.
Luca Boccassi [Fri, 27 Mar 2026 19:32:29 +0000 (19:32 +0000)]
shutdown: remove kexec-tools dependency
'kexec -e' is just a small wrapper that does the xen hypercall
on xen, or otherwise just calls reboot(). Drop the dependency,
and reuse the existing xen hypercall helper.
many: more checks for pointer access without NULL check (#41370)
This is a followup for https://github.com/systemd/systemd/pull/41096
that makes more subsystems pass the new `check-pointer-deref` coccinelle
checks. See the individual commits.
My plan is to do a few more of these PRs until we have it all covered. I
could also do it in a single very big PR but I'm worried about a)
conflicts b) that it's just too big/annoying to review. Only 7 dirs left
but some (like src/basic) are quite big (~50k loc) so those PRs will be
a bit bigger.
Daan De Meyer [Fri, 27 Mar 2026 11:40:59 +0000 (11:40 +0000)]
vmspawn: improve firmware selection to match mkosi's implementation
Align find_ovmf_config() with mkosi's find_ovmf_firmware() by adding
checks that were previously missing:
- Filter on interface-types, only selecting UEFI firmware definitions.
Previously non-UEFI (e.g. BIOS-only) firmware could be selected.
- Check machine type compatibility using substring matching against the
target machine patterns in firmware descriptions (e.g. "q35" matches
"pc-q35-*"), following the same approach as mkosi.
- Make nvram-template optional in the firmware JSON mapping. Firmware
definitions without an nvram-template are now parsed successfully
(with vars remaining NULL) rather than failing entirely.
Also rework the firmware target parsing to store both architecture and
machine arrays per target (instead of just a flat architecture list),
and extract the machine matching into firmware_data_matches_machine().
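The substring matching can be sketched as follows. machine_matches() is
a hypothetical illustration, not the actual
firmware_data_matches_machine() code, and treating an empty pattern list
as "no match" is an assumption of this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: true if machine (e.g. "q35") occurs as a substring of any entry
 * in the NULL-terminated patterns list (e.g. "pc-q35-*"). */
bool machine_matches(const char *machine, const char *const *patterns) {
        assert(machine);
        assert(patterns);

        for (const char *const *p = patterns; *p; p++)
                if (strstr(*p, machine))
                        return true;

        /* Assumption for this sketch: no listed pattern means no match. */
        return false;
}
```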
Co-developed-by: Claude Opus 4.6 <noreply@anthropic.com>
Daan De Meyer [Fri, 27 Mar 2026 11:55:32 +0000 (12:55 +0100)]
vmspawn: Add --firmware=describe
It's useful to be able to check what firmware description vmspawn
will select. In particular, this will allow me to figure out the
nvram template file that will be picked up so I can pick it up in
mkosi and operate on it to pass a modified version of it to vmspawn
with --efi-nvram-template=.
Daan De Meyer [Fri, 27 Mar 2026 10:43:26 +0000 (10:43 +0000)]
vmspawn: add --efi-nvram-template= and --firmware-features= options
Add --efi-nvram-template=PATH to specify a custom firmware variables
file to copy and use as the initial EFI NVRAM state instead of the
default template from the firmware definition.
Add --firmware-features=FEATURE[,FEATURE...] to require or exclude
specific firmware features during automatic firmware discovery.
Features prefixed with "!" are excluded. If a feature appears in both
the included and excluded lists, inclusion takes priority. Firmware
with the "enrolled-keys" feature is excluded by default.
Refactor --secure-boot= to operate on the firmware features sets
instead of maintaining a separate tristate. --secure-boot=yes adds
"secure-boot" to the include set, --secure-boot=no adds it to the
exclude set, and --secure-boot=auto removes it from both.
Generalize find_ovmf_config() to accept include/exclude feature sets
instead of a secure boot tristate, removing the special-cased
enrolled-keys and secure-boot filtering logic.
Co-developed-by: Claude Opus 4.6 <noreply@anthropic.com>
tmpfile-util: don't log about lack of O_TMPFILE support
It's a very common case (vfat...), and it's just too much noise. After
all the whole function exists primarily to deal with O_TMPFILE not being
available everywhere...