tree-wide: include asm/sgidefs.h to make _MIPS_SIM_ABI32 and friends defined
The header provides _MIPS_SIM_ABI32 and friends. Glibc indirectly includes
the header through sys/syscall.h or unistd.h, but let's explicitly include
the header where we use _MIPS_SIM_ABI32 and friends.
The header linux/quota.h provides e.g. QIF_DQBLKSIZE or PRJQUOTA, which
is used where the quota-util.h is included.
Let's explicitly include the header with 'IWYU pragma: export' tag.
The same concern as expalined in #37960 exists also in
missing_syscall.h. If we use enough new glibc, a function we want to use
may be already provided by glibc, but our baseline glibc may not. And it
is hard to detect in our daily development.
This moves all prototypes of syscalls to relevant headers, and missing
syscall functions are defined in relevant .c files of libc wrapper. This
way, we can use usual header as is, e.g. when we want to write code with
`move_mount()`, we can simply use sys/mount.h without checking if it is
supported by our baseline glibc.
conf-files: make conf-file enumerators provide more detailed information of enumerated files (#38006)
This introduces `struct ConfFile` that stores detailed information of an
enumerated file, and introduces `conf_files_list_full()` and friends
that provide results in `ConfFile`.
Then make udev, hwdb, catalog, and cat-files use the new function and
struct to make them not read files outside of specified root directory.
tree-wide: several cleanups for generating symbol lists and gperf files
- pass our system include directories to make generators use our libc
wrappers and latest kernel headers,
- include relevant headers in generated gperf file,
- use files() rather than find_program(), as the result of
find_program() cannot be passed to 'input' of custom_target(),
- move generate-bpf-delegate-configs.py to src/core/, as it is only used
by libcore.
selinux-util: downgrade log level to LOG_DEBUG when error code is zero
Previously, the logger is only used in error paths, but since fe3f2ac0734e64dcd729b00992a6261cbf4cc846, the logger is also used in a
success path. Let's not log loudly on success.
Yu Watanabe [Sun, 29 Jun 2025 20:18:32 +0000 (05:18 +0900)]
pretty-print: several cleanups for cat_files()
- drop redundant error messages in cat_files(), as cat_file() internally
logs errors,
- show an empty line and filename before opening file, to make not mix
any error messages with the previous file,
- drop unnecessary fflush(),
- use RET_GATHER() and continue to show files even if some files cannot
be shown.
r = cg_create(SYSTEMD_CGROUP_CONTROLLER, test_a);
- if (IN_SET(r, -EPERM, -EACCES, -EROFS)) {
+ if (IN_SET(r, -EPERM, -EACCES, -EROFS, -ENOENT)) {
log_info_errno(r, "Skipping %s: %m", __func__);
return;
}
```
I confirmed that the `ERRNO_IS_NEG_FS_WRITE_REFUSED` macro is equivalent
to checking the first 3 error codes above, so the addition of the check
for `ENOENT` is still just as relevant as it was in 252, but adding it
into the macro would be inconsistent with its name, description, and
possible other uses. Hence, in this PR I'm adding the extra check into
the `if`.
Plumbing to perform SELinux checks in varlink API (#38146)
This PR does minimal changes to introduce varlink support. Ideally, the
code should switch to using `mac_selinux_get_our_label()` and new
`mac_selinux_get_peer_label()`. But I leave it for now to minimize
breakage. `mac_selinux_get_peer_label()` remains unused.
This is a prep step to merge
https://github.com/systemd/systemd/pull/38032
Add a small paragraph explaining how BPF token works, how it's being
created and its relationship between the BPF filesystem.
Move all the relevant documentation in the PrivateBPF= section and let
point all the BPFDelegate* options to that one.
Introduce ERRNO_IS_FS_WRITE_REFUSED(), and use it in binfmt_mounted() (#38117)
- This introduces ERRNO_IS_FS_WRITE_REFUSED(), and apply it where
usable.
- This makes unexpected errors in access_fd() called by binfmt_mounted()
propagated to the caller.
- Renames binfmt_mounted() to binfmt_mounted_and_writable(), as it also
checks the fs is writable.
- Voidifies one disable_binfmt() call in shutdown.c.
To implement --bind-user in systemd-vmspawn, we need a transient
version of these credentials. These are useful when the home directory
of the user is mounted into the container/vm and every trace of the user
will be (mostly) gone again when the container/vm is shut down.
Li Tian [Tue, 8 Jul 2025 06:44:35 +0000 (14:44 +0800)]
Add --entry-type=type1|type2 option to kernel-install.
Both kernel-core and kernel-uki-virt call kernel-install upon removal. Need an additional argument to avoid complete removal for both traditional kernel and UKI.
Both EPEL 9 and 10 now have the packages we need except for dhcp-server
so let's get rid of the EPEL conditionals and simply skip the tests that
require dhcp-server on CentOS.
While we're at it, make sure we use the new Architecture=uefi match in
mkosi to simplify the uefi checks.
* 184472f0f1 mkosi-tools: make sure p11-kit dir exists when configuring module
* 9fb807884e mkosi-tools: Explicitly install p11-kit
* 9131877d60 Support matching against architectures with uefi support
* f1eab5a783 Rename sandbox verb to box
* d609f55d98 Fix /var/tmp directory cleanup
* 4997b9495c build(deps): bump github/codeql-action from 3.28.18 to 3.29.2
Even if we're not using --accept=, it's very useful to be able to
synchronize on systemd-socket-activate having binded to its listen
socket, so let's always send READY=1. This means the payload can't
send READY=1 anymore but it's doubtful whether that's useful in this
case in the first place.
vmspawn: Use virtio-blk-pci for image instead of virtio-scsi-pci
We don't need a full blown SCSI controller just to present the main
root drive device to the VM. Let's simplify the storage stack by using
virtio-blk-pci instead.
Additionally, virtio-blk-pci is a builtin module in Arch and Fedora
which means we can do qemu direct kernel boot without needing an initrd.
vmspawn: Disable hpet for vmspawn x86 virtual machines
hpet is an emulated clocksource that is generally discouraged in favor
of kvm-clock or tsc for virtual machines. While vmspawn's virtual machines
already use kvm-clock, leaving hpet enabled causes qemu on the host to
consume a non-trivial amount of cpu, so let's disable the hpet feature since
we're not making use of it anyway.
ci: also set TEST_RUNNER environment variable in coverage test
Otherwise, integration-test-wrapper.py will fail.
```
Traceback (most recent call last):
File "/home/runner/work/systemd/systemd/test/integration-tests/integration-test-wrapper.py", line 693, in <module>
main()
~~~~^^
File "/home/runner/work/systemd/systemd/test/integration-tests/integration-test-wrapper.py", line 677, in main
runner = os.environ['TEST_RUNNER']
~~~~~~~~~~^^^^^^^^^^^^^^^
File "<frozen os>", line 717, in __getitem__
KeyError: 'TEST_RUNNER'
```
ukify: fix version detection for aarch64 zboot kernels with gzip or lzma compression
Fixes https://github.com/systemd/systemd/issues/34780. The number in the header
is the size of the *compressed* data, so for gzip we'd read the initial part of
the decompressed data (equal to the size of the compressed data) and not find
the version string. Later on, Fedora switched to zstd compression, and there we
correctly use the number as the size of the compressed data, so we stopped
hitting the issue, but we should still fix it for older kernels.
I verified that the fix works for gzip-compressed kernels. I also made the same
change for the code for lzma compression. I'm pretty sure it is the right thing,
even though I don't have such a kernel at hand to test.
>>> ukify.Uname.scrape('/lib/modules/6.12.0-0.rc2.24.fc42.aarch64/vmlinuz')
Real-Mode Kernel Header magic not found
+ readelf --notes /lib/modules/6.12.0-0.rc2.24.fc42.aarch64/vmlinuz
readelf: Error: Not an ELF file - it has the wrong magic bytes at the start
Found uname version: 6.12.0-0.rc2.24.fc42.aarch64
closes #37602, see there for extra motivation and considered
alternatives.
On typical systems, only few services need to create SUID/SGID files.
This often is limited to the user explicitly setting suid/sgid, the
`systemd-tmpfiles*` services, and the package manager. Allowing a
default to globally restrict creation of suid/sgid files makes it easier
to apply this restriction precisely.
## testing done
- built on aarch64-linux and x86_64-linux
- ran a VM test on x86_64-linux, checking for:
- VM system boots successfully
- defaults apply (both `yes`, `no`, and undefined)
- systemd tmpfiles can set suid/sgid on journal log path
- Other services explicitly defining `RestrictSUIDSGID=no` can create
suid files
The recently added test case TEST-07-PID1.subgroup-kill.sh surfaced a
race: if we enumerate PIDs in a cgroup, and the cgroup is unlinked at
the very same time reading will result in ENODEV. We need to handle that
gracefully. Hence let's do so.