git.ipfire.org Git - thirdparty/systemd.git/log
3 months agocatalog: do not read catalog files outside of specified root directory 38006/head
Yu Watanabe [Tue, 1 Jul 2025 02:33:22 +0000 (11:33 +0900)] 
catalog: do not read catalog files outside of specified root directory

3 months agohwdb-util: do not read hwdb files outside of specified root directory
Yu Watanabe [Tue, 1 Jul 2025 02:21:09 +0000 (11:21 +0900)] 
hwdb-util: do not read hwdb files outside of specified root directory

3 months agohwdb-util: coding style update
Yu Watanabe [Tue, 1 Jul 2025 02:12:59 +0000 (11:12 +0900)] 
hwdb-util: coding style update

- use 'r' for storing results,
- use RET_GATHER().

3 months agoudev-rules: do not read udev rules files outside of specified root directory
Yu Watanabe [Tue, 1 Jul 2025 02:05:18 +0000 (11:05 +0900)] 
udev-rules: do not read udev rules files outside of specified root directory

3 months agoTEST-17-UDEV: conditionalize test cases for testuser
Yu Watanabe [Fri, 4 Jul 2025 07:54:49 +0000 (16:54 +0900)] 
TEST-17-UDEV: conditionalize test cases for testuser

Then, we can also run the test script on our local machine.

3 months agoudevadm: do not read udev rules files outside of the specified root directory
Yu Watanabe [Mon, 30 Jun 2025 19:46:41 +0000 (04:46 +0900)] 
udevadm: do not read udev rules files outside of the specified root directory

With this change, invalid symlinks and empty files are silently
ignored. Hence, the test code is slightly updated.
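
The idea of refusing files that resolve outside the specified root can be sketched as follows. This is only an illustration in Python, not the udevadm code: the helper name `resolve_within_root` is hypothetical, and it relies on `os.path.realpath()` instead of systemd's own path resolution, so the semantics may differ in detail. A dangling symlink or a target outside the root yields `None`, which a caller could skip silently.

```python
import os
import tempfile


def resolve_within_root(root, path):
    """Return the resolved path if it exists and stays below root, else None.

    Hypothetical helper, for illustration only.
    """
    root_real = os.path.realpath(root)
    candidate = os.path.realpath(os.path.join(root_real, path.lstrip("/")))
    # A dangling symlink resolves to a path that does not exist.
    if not os.path.exists(candidate):
        return None
    # commonpath() equals the root only if the candidate lies inside it.
    if os.path.commonpath([root_real, candidate]) != root_real:
        return None
    return candidate


def demo():
    """Build a throwaway root with one regular file and two bad symlinks."""
    with tempfile.TemporaryDirectory() as root, tempfile.TemporaryDirectory() as outside:
        rules_dir = os.path.join(root, "etc/udev/rules.d")
        os.makedirs(rules_dir)
        with open(os.path.join(rules_dir, "10-ok.rules"), "w") as f:
            f.write('ACTION=="add"\n')
        escaped = os.path.join(outside, "99-outside.rules")
        with open(escaped, "w") as f:
            f.write("# outside the root\n")
        os.symlink(escaped, os.path.join(rules_dir, "20-escape.rules"))
        os.symlink("/nonexistent-target", os.path.join(rules_dir, "30-dangling.rules"))
        return {
            name: resolve_within_root(root, "/etc/udev/rules.d/" + name) is not None
            for name in sorted(os.listdir(rules_dir))
        }
```

Calling `demo()` reports, per file name, whether the file would be accepted; only the regular file inside the root passes.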

3 months agopretty-print: make conf_files_cat() not show files outside of the specified root.
Yu Watanabe [Sun, 29 Jun 2025 20:22:53 +0000 (05:22 +0900)] 
pretty-print: make conf_files_cat() not show files outside of the specified root.

Then, make the function show the original and resolved path if they are
different.

With this change, procfs needs to be mounted on /proc/, hence the test
code is slightly updated.

3 months agopretty-print: several cleanups for cat_files()
Yu Watanabe [Sun, 29 Jun 2025 20:18:32 +0000 (05:18 +0900)] 
pretty-print: several cleanups for cat_files()

- drop redundant error messages in cat_files(), as cat_file() internally
  logs errors,
- show an empty line and filename before opening file, so that error
  messages are not mixed up with the previous file,
- drop unnecessary fflush(),
- use RET_GATHER() and continue to show files even if some files cannot
  be shown.
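
The "gather and continue" pattern from the list above can be sketched in Python. This is not the C implementation: `ret_gather()` is a stand-in that assumes RET_GATHER() keeps the first negative return value, and `cat_file()`/`cat_files()` here are simplified look-alikes that collect output in a list instead of printing it.

```python
import errno


def ret_gather(ret, r):
    """Keep the first negative (error) value; assumed gather semantics."""
    if ret >= 0 and r < 0:
        return r
    return ret


def cat_file(path, out):
    """Append a file's text to out; return 0 or a negative errno-style code."""
    try:
        with open(path, encoding="utf-8") as f:
            out.append(f.read())
        return 0
    except OSError as e:
        return -(e.errno or errno.EIO)


def cat_files(paths, out):
    """Show every readable file and return the first error seen, if any."""
    ret = 0
    for path in paths:
        # Emit the name first, so any error is attributed to the right file.
        out.append("# " + path)
        ret = ret_gather(ret, cat_file(path, out))
    return ret
```

With this shape, one unreadable file no longer hides the files that follow it, while the caller still learns that something went wrong.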

3 months agoconf-files: introduce conf_files_list_full() and friends that provides results in...
Yu Watanabe [Sun, 29 Jun 2025 02:01:52 +0000 (11:01 +0900)] 
conf-files: introduce conf_files_list_full() and friends that provides results in ConfFile

3 months agoconf-files: make conf_files_list() and friends internally use struct ConfFile
Yu Watanabe [Tue, 1 Jul 2025 01:33:54 +0000 (10:33 +0900)] 
conf-files: make conf_files_list() and friends internally use struct ConfFile

No functional change, just refactoring.

3 months agoconf-files: introduce struct ConfFile to store information of found conf file
Yu Watanabe [Sun, 29 Jun 2025 01:12:09 +0000 (10:12 +0900)] 
conf-files: introduce struct ConfFile to store information of found conf file

It is currently unused, will be used later. Preparation for later changes.

3 months agochase: allow to request O_PATH fd even with CHASE_NONEXISTENT
Yu Watanabe [Mon, 30 Jun 2025 06:18:47 +0000 (15:18 +0900)] 
chase: allow to request O_PATH fd even with CHASE_NONEXISTENT

3 months agotest-cgroup: Ignore ENOENT from cg_create(); test-cgroup-util: Ignore ENXIO in one...
Yu Watanabe [Fri, 11 Jul 2025 01:38:04 +0000 (10:38 +0900)] 
test-cgroup: Ignore ENOENT from cg_create(); test-cgroup-util: Ignore ENXIO in one more place (#38158)

This was the only test failure when building systemd-252-51.el9 in a
container. It was also previously reported against 252-rc1 under Gentoo
in #25015.

This is a forward-port of the patch we actually started using for CIQ's
builds of the EL9-derived package, which was:

```diff
--- systemd-252/src/test/test-cgroup.c 2022-10-31 18:59:18.000000000 +0000
+++ systemd-252-test/src/test/test-cgroup.c 2025-07-10 00:47:07.541000000 +0000
@@ -62,7 +62,7 @@
         log_info("Paths for test:\n%s\n%s", test_a, test_b);

         r = cg_create(SYSTEMD_CGROUP_CONTROLLER, test_a);
-        if (IN_SET(r, -EPERM, -EACCES, -EROFS)) {
+        if (IN_SET(r, -EPERM, -EACCES, -EROFS, -ENOENT)) {
                 log_info_errno(r, "Skipping %s: %m", __func__);
                 return;
         }
```

I confirmed that the `ERRNO_IS_NEG_FS_WRITE_REFUSED` macro is equivalent
to checking the first 3 error codes above, so the addition of the check
for `ENOENT` is still just as relevant as it was in 252, but adding it
into the macro would be inconsistent with its name, description, and
possible other uses. Hence, in this PR I'm adding the extra check into
the `if`.
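
The resulting condition can be modelled in Python as a rough sketch. The function names below are stand-ins chosen for this illustration; the set of error codes is taken from the description above, not from the macro's source.

```python
import errno

# Error codes that the description above says the macro covers.
FS_WRITE_REFUSED = frozenset({errno.EPERM, errno.EACCES, errno.EROFS})


def errno_is_neg_fs_write_refused(r):
    """Python stand-in for the macro: r is a negative errno-style return value."""
    return r < 0 and -r in FS_WRITE_REFUSED


def should_skip_cgroup_test(r):
    """Skip when writes are refused, or when cg_create() reported -ENOENT."""
    return errno_is_neg_fs_write_refused(r) or r == -errno.ENOENT
```

Keeping the `ENOENT` check outside the helper mirrors the reasoning above: a missing path is not a refused write, so it does not belong in a helper with that name.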

3 months agojournalctl: do not fail on SIGTERM/SIGINT or STDOUT disconnect when running with...
Yu Watanabe [Fri, 11 Jul 2025 01:36:31 +0000 (10:36 +0900)] 
journalctl: do not fail on SIGTERM/SIGINT or STDOUT disconnect when running with --follow (#38116)

Closes #38114.

3 months agoio.systemd.Manager.Describe fix context/runtime split (#38135)
Yu Watanabe [Fri, 11 Jul 2025 01:26:57 +0000 (10:26 +0900)] 
io.systemd.Manager.Describe fix context/runtime split (#38135)

This PR rearranges fields in io.systemd.Manager.Describe according to
the guidance by Lennart:

> If a property can be set in a unit file, ever, then it belongs in context.
> Otherwise, it belongs to runtime.

Closes #38124.

3 months agoPlumbing to perform SELinux checks in varlink API (#38146)
Yu Watanabe [Fri, 11 Jul 2025 01:20:36 +0000 (10:20 +0900)] 
Plumbing to perform SELinux checks in varlink API (#38146)

This PR makes minimal changes to introduce varlink support. Ideally, the
code should switch to using `mac_selinux_get_our_label()` and the new
`mac_selinux_get_peer_label()`. But I leave it for now to minimize
breakage. `mac_selinux_get_peer_label()` remains unused.

This is a prep step to merge
https://github.com/systemd/systemd/pull/38032

3 months agotest-cgroup-util: Ignore ENXIO in one more place 38158/head
Solar Designer [Thu, 10 Jul 2025 23:46:38 +0000 (01:46 +0200)] 
test-cgroup-util: Ignore ENXIO in one more place

3 months agoNEWS: mention about the exit code change in journalctl --follow 38116/head
Yu Watanabe [Tue, 8 Jul 2025 09:42:02 +0000 (18:42 +0900)] 
NEWS: mention about the exit code change in journalctl --follow

3 months agotest: drop unnecessary disablement of pipefail
Yu Watanabe [Tue, 8 Jul 2025 09:36:09 +0000 (18:36 +0900)] 
test: drop unnecessary disablement of pipefail

3 months agomain-func: drop unused DEFINE_MAIN_FUNCTION_WITH_POSITIVE_SIGNAL()
Yu Watanabe [Tue, 8 Jul 2025 09:30:48 +0000 (18:30 +0900)] 
main-func: drop unused DEFINE_MAIN_FUNCTION_WITH_POSITIVE_SIGNAL()

3 months agojournalctl: do not fail on SIGTERM/SIGINT, or when STDOUT is disconnected
Yu Watanabe [Tue, 8 Jul 2025 09:19:05 +0000 (18:19 +0900)] 
journalctl: do not fail on SIGTERM/SIGINT, or when STDOUT is disconnected

The current behavior is not useful when e.g. pipefail is enabled.
Let's exit cleanly in such cases.

Closes #38114.
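
The behavior described here can be illustrated with a small Python model of a follow loop. It is a sketch under assumptions, not journalctl's code: `follow()`, `install_handlers()`, and `StopFollowing` are hypothetical names, and mapping other I/O errors to status 1 is an assumption made for the example.

```python
import signal


class StopFollowing(Exception):
    """Raised by the signal handler to leave the follow loop."""


def _on_signal(signum, frame):
    raise StopFollowing()


def install_handlers():
    """Turn SIGTERM/SIGINT into StopFollowing (call from the main thread)."""
    for s in (signal.SIGTERM, signal.SIGINT):
        signal.signal(s, _on_signal)


def follow(entries, write):
    """Write entries until interrupted; return the process exit status.

    A termination signal or a disconnected output is treated as a normal
    end of the stream (status 0) rather than as a failure.
    """
    try:
        for entry in entries:
            write(entry)
    except (StopFollowing, BrokenPipeError):
        return 0
    except OSError:
        return 1
    return 0
```

A zero status in these cases is what keeps a shell pipeline with `pipefail` enabled from reporting failure when the reader simply goes away.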

3 months agotest-cgroup: Ignore ENOENT from cg_create()
Solar Designer [Thu, 10 Jul 2025 22:28:14 +0000 (00:28 +0200)] 
test-cgroup: Ignore ENOENT from cg_create()

which was the only test failure building systemd-252-51.el9 in a
container, also previously reported against 252-rc1 under Gentoo
in #25015

3 months agoman/systemd.exec: explain how BPF token works
Matteo Croce [Thu, 10 Jul 2025 11:37:27 +0000 (13:37 +0200)] 
man/systemd.exec: explain how BPF token works

Add a small paragraph explaining how a BPF token works, how it is
created, and its relationship with the BPF filesystem.
Move all the relevant documentation into the PrivateBPF= section and
point all the BPFDelegate* options to it.

3 months agojournald: support reloading configuration at runtime
Ubuntu [Wed, 11 Jun 2025 23:32:27 +0000 (23:32 +0000)] 
journald: support reloading configuration at runtime

3 months agoIntroduce ERRNO_IS_FS_WRITE_REFUSED(), and use it in binfmt_mounted() (#38117)
Lennart Poettering [Thu, 10 Jul 2025 19:38:13 +0000 (21:38 +0200)] 
Introduce ERRNO_IS_FS_WRITE_REFUSED(), and use it in binfmt_mounted() (#38117)

- This introduces ERRNO_IS_FS_WRITE_REFUSED() and applies it where
usable.
- This propagates unexpected errors in access_fd(), called by
binfmt_mounted(), to the caller.
- Renames binfmt_mounted() to binfmt_mounted_and_writable(), as it also
checks that the fs is writable.
- Voidifies one disable_binfmt() call in shutdown.c.

3 months agouserdb: Add userdb.transient credentials
DaanDeMeyer [Thu, 3 Jul 2025 19:22:41 +0000 (21:22 +0200)] 
userdb: Add userdb.transient credentials

To implement --bind-user in systemd-vmspawn, we need a transient
version of these credentials. These are useful when the home directory
of the user is mounted into the container/vm and every trace of the user
will be (mostly) gone again when the container/vm is shut down.

3 months agoselinux: mac_selinux_unit_access_check_varlink macros 38146/head
Ivan Kruglov [Thu, 10 Jul 2025 10:47:04 +0000 (03:47 -0700)] 
selinux: mac_selinux_unit_access_check_varlink macros

3 months agoselinux: mac_selinux_access_check_varlink_internal()
Ivan Kruglov [Thu, 10 Jul 2025 10:40:21 +0000 (03:40 -0700)] 
selinux: mac_selinux_access_check_varlink_internal()

3 months agoselinux: check_access()
Ivan Kruglov [Thu, 10 Jul 2025 13:55:18 +0000 (06:55 -0700)] 
selinux: check_access()

3 months agocore: leave a comment about context/runtime split 38135/head
Ivan Kruglov [Wed, 9 Jul 2025 13:15:04 +0000 (06:15 -0700)] 
core: leave a comment about context/runtime split

3 months agocore: move ControlGroup from runtime to context
Ivan Kruglov [Thu, 10 Jul 2025 14:36:13 +0000 (07:36 -0700)] 
core: move ControlGroup from runtime to context

3 months agocore: move ConfirmSpawn from runtime to context
Ivan Kruglov [Thu, 10 Jul 2025 14:33:20 +0000 (07:33 -0700)] 
core: move ConfirmSpawn from runtime to context

3 months agocore: move ShowStatus/Log* from runtime to context
Ivan Kruglov [Wed, 9 Jul 2025 13:00:04 +0000 (06:00 -0700)] 
core: move ShowStatus/Log* from runtime to context

3 months agocore: move WatchdogDevice from runtime to context
Ivan Kruglov [Wed, 9 Jul 2025 12:59:01 +0000 (05:59 -0700)] 
core: move WatchdogDevice from runtime to context

3 months agocore: move RuntimeWatchdog* + RebootWatchdog + KExecWatchdog from runtime to spec
Ivan Kruglov [Wed, 9 Jul 2025 12:57:55 +0000 (05:57 -0700)] 
core: move RuntimeWatchdog* + RebootWatchdog + KExecWatchdog from runtime to spec

3 months agocore: fix double Environment present in both context and runtime
Ivan Kruglov [Wed, 9 Jul 2025 12:54:07 +0000 (05:54 -0700)] 
core: fix double Environment present in both context and runtime

3 months agocore: move Version/Arch/Features/Taints/UnitPath from context to runtime
Ivan Kruglov [Wed, 9 Jul 2025 12:50:32 +0000 (05:50 -0700)] 
core: move Version/Arch/Features/Taints/UnitPath from context to runtime

3 months agoselinux: get_our_contexts()
Ivan Kruglov [Thu, 3 Jul 2025 13:40:14 +0000 (06:40 -0700)] 
selinux: get_our_contexts()

3 months agoselinux: rename mac_selinux_access_check_internal() -> mac_selinux_access_check_bus_i...
Ivan Kruglov [Thu, 3 Jul 2025 11:30:59 +0000 (04:30 -0700)] 
selinux: rename mac_selinux_access_check_internal() -> mac_selinux_access_check_bus_internal()

3 months agoselinux-utils: rename and expose log_selinux_enforcing_errno()
Ivan Kruglov [Thu, 10 Jul 2025 13:47:29 +0000 (06:47 -0700)] 
selinux-utils: rename and expose log_selinux_enforcing_errno()

3 months agoselinux-util: mac_selinux_get_peer_label()
Ivan Kruglov [Thu, 3 Jul 2025 11:53:23 +0000 (04:53 -0700)] 
selinux-util: mac_selinux_get_peer_label()

3 months agoruff: Default to python 3.7 version
DaanDeMeyer [Thu, 10 Jul 2025 12:47:03 +0000 (14:47 +0200)] 
ruff: Default to python 3.7 version

For some use cases we still want python 3.7 compat so let's default
to that and only target python 3.9 in a few specific cases.

3 months agoAdd --entry-type=type1|type2 option to kernel-install.
Li Tian [Tue, 8 Jul 2025 06:44:35 +0000 (14:44 +0800)] 
Add --entry-type=type1|type2 option to kernel-install.

Both kernel-core and kernel-uki-virt call kernel-install upon removal. An additional argument is needed to avoid complete removal of both the traditional kernel and the UKI.

Signed-off-by: Li Tian <litian@redhat.com>

3 months agosocket-activate: Always send NOTIFY=ready
DaanDeMeyer [Thu, 3 Jul 2025 08:41:50 +0000 (10:41 +0200)] 
socket-activate: Always send NOTIFY=ready

Even if we're not using --accept=, it's very useful to be able to
synchronize on systemd-socket-activate having bound to its listen
socket, so let's always send READY=1. This means the payload can't
send READY=1 anymore but it's doubtful whether that's useful in this
case in the first place.
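
The synchronization point can be sketched in Python using the notification protocol, where a datagram is sent to the AF_UNIX socket a supervisor would normally name in `$NOTIFY_SOCKET`. This is an illustration, not systemd-socket-activate's code: `notify_ready()` and `demo()` are hypothetical helpers, and the demo passes a temporary socket path directly instead of reading the environment.

```python
import os
import socket
import tempfile


def notify_ready(notify_socket):
    """Send READY=1 to the given notification socket address."""
    addr = notify_socket
    if addr.startswith("@"):
        # A leading "@" denotes a Linux abstract socket.
        addr = "\0" + addr[1:]
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as s:
        s.sendto(b"READY=1", addr)


def demo():
    """Bind a listening socket, then notify a stand-in supervisor."""
    with tempfile.TemporaryDirectory() as d:
        notify_path = os.path.join(d, "notify")
        with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as supervisor:
            supervisor.bind(notify_path)
            with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as listener:
                listener.bind(os.path.join(d, "listen"))
                listener.listen()
                # Notify only after bind/listen succeeded.
                notify_ready(notify_path)
            return supervisor.recv(64)
```

Anyone waiting on the notification can then assume the listen socket exists, which is the synchronization the commit message is after.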

3 months agoTwo trivial nspawn fixes (#38152)
Daan De Meyer [Thu, 10 Jul 2025 14:19:18 +0000 (16:19 +0200)] 
Two trivial nspawn fixes (#38152)

3 months agovmspawn: Use virtio-blk-pci for image instead of virtio-scsi-pci
DaanDeMeyer [Thu, 3 Jul 2025 12:50:05 +0000 (14:50 +0200)] 
vmspawn: Use virtio-blk-pci for image instead of virtio-scsi-pci

We don't need a full-blown SCSI controller just to present the main
root drive device to the VM. Let's simplify the storage stack by using
virtio-blk-pci instead.

Additionally, virtio-blk-pci is a builtin module in Arch and Fedora
which means we can do qemu direct kernel boot without needing an initrd.

3 months agoescape: Make quote_command_line() argument const
DaanDeMeyer [Thu, 3 Jul 2025 08:47:15 +0000 (10:47 +0200)] 
escape: Make quote_command_line() argument const

3 months agovmspawn: Disable hpet for vmspawn x86 virtual machines
DaanDeMeyer [Thu, 3 Jul 2025 08:37:25 +0000 (10:37 +0200)] 
vmspawn: Disable hpet for vmspawn x86 virtual machines

hpet is an emulated clocksource that is generally discouraged in favor
of kvm-clock or tsc for virtual machines. While vmspawn's virtual machines
already use kvm-clock, leaving hpet enabled causes qemu on the host to
consume a non-trivial amount of cpu, so let's disable the hpet feature since
we're not making use of it anyway.

3 months agoRevert "resolve: query the parent zone for DS records"
Yu Watanabe [Wed, 16 Apr 2025 21:53:02 +0000 (06:53 +0900)] 
Revert "resolve: query the parent zone for DS records"

This reverts commit 49ff90c70debc59f5a52e5cec5a92507d9868b9d.

3 months agonspawn: Use in_child_chown() in one more place 38152/head
DaanDeMeyer [Fri, 4 Jul 2025 19:21:35 +0000 (21:21 +0200)] 
nspawn: Use in_child_chown() in one more place

3 months agonspawn: Improve log message
DaanDeMeyer [Fri, 4 Jul 2025 19:21:25 +0000 (21:21 +0200)] 
nspawn: Improve log message

3 months agozsh-completion: generate completion for systemd-run from systemd-analyze
Eisuke Kawashima [Wed, 28 May 2025 10:25:17 +0000 (19:25 +0900)] 
zsh-completion: generate completion for systemd-run from systemd-analyze

continuation of #37641

3 months agonews: fix typo
Jörg Behrmann [Thu, 10 Jul 2025 07:52:42 +0000 (09:52 +0200)] 
news: fix typo

3 months agoman: clean up list of literals
Christian Hesse [Wed, 9 Jul 2025 10:26:39 +0000 (12:26 +0200)] 
man: clean up list of literals

3 months agoci: also set TEST_RUNNER environment variable in coverage test
Yu Watanabe [Wed, 9 Jul 2025 06:36:05 +0000 (15:36 +0900)] 
ci: also set TEST_RUNNER environment variable in coverage test

Otherwise, integration-test-wrapper.py will fail.
```
Traceback (most recent call last):
  File "/home/runner/work/systemd/systemd/test/integration-tests/integration-test-wrapper.py", line 693, in <module>
    main()
    ~~~~^^
  File "/home/runner/work/systemd/systemd/test/integration-tests/integration-test-wrapper.py", line 677, in main
    runner = os.environ['TEST_RUNNER']
             ~~~~~~~~~~^^^^^^^^^^^^^^^
  File "<frozen os>", line 717, in __getitem__
KeyError: 'TEST_RUNNER'
```

Follow-up for c0a5801f7b034f3473c10f627d54671e1588963b.
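
The traceback above is the ordinary result of indexing `os.environ` with a key that is not set. As a sketch only (this is not the wrapper's code, and `get_test_runner()` is a hypothetical helper), the lookup can be wrapped so that a missing variable produces a descriptive message:

```python
import os


def get_test_runner(environ=os.environ):
    """Return TEST_RUNNER, failing with a descriptive message if it is unset."""
    try:
        return environ["TEST_RUNNER"]
    except KeyError:
        raise SystemExit("TEST_RUNNER is not set in the environment") from None
```

Either way, the variable has to be exported by the CI job, which is what this commit does for the coverage test.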

3 months agoshutdown: voidify disable_binfmt() 38117/head
Yu Watanabe [Tue, 8 Jul 2025 09:58:14 +0000 (18:58 +0900)] 
shutdown: voidify disable_binfmt()

3 months agoman: fix typo
Yu Watanabe [Thu, 10 Jul 2025 05:01:01 +0000 (14:01 +0900)] 
man: fix typo

Follow-up for 7baf4034304e2e658473a48a0ccbe0656da7f2f6.

3 months agobinfmt-util: rename binfmt_mounted() -> binfmt_mounted_and_writable()
Yu Watanabe [Wed, 9 Jul 2025 06:18:24 +0000 (15:18 +0900)] 
binfmt-util: rename binfmt_mounted() -> binfmt_mounted_and_writable()

As it does not only check if binfmt_misc is mounted, but also checks if
it is writable.

3 months agobinfmt-util: propagate failure in access_fd()
Yu Watanabe [Tue, 8 Jul 2025 09:51:55 +0000 (18:51 +0900)] 
binfmt-util: propagate failure in access_fd()

It is not necessary to hide errors in access_fd(), and the only acceptable
errors here are -EROFS, -EACCES, and -EPERM.

3 months agoerrno-util: introduce ERRNO_IS_NEG_FS_WRITE_REFUSED()
Yu Watanabe [Wed, 9 Jul 2025 06:13:23 +0000 (15:13 +0900)] 
errno-util: introduce ERRNO_IS_NEG_FS_WRITE_REFUSED()

3 months agoukify: fix version detection for aarch64 zboot kernels with gzip or lzma compression
Zbigniew Jędrzejewski-Szmek [Wed, 9 Jul 2025 21:02:28 +0000 (23:02 +0200)] 
ukify: fix version detection for aarch64 zboot kernels with gzip or lzma compression

Fixes https://github.com/systemd/systemd/issues/34780. The number in the header
is the size of the *compressed* data, so for gzip we'd read the initial part of
the decompressed data (equal to the size of the compressed data) and not find
the version string. Later on, Fedora switched to zstd compression, and there we
correctly use the number as the size of the compressed data, so we stopped
hitting the issue, but we should still fix it for older kernels.

I verified that the fix works for gzip-compressed kernels. I also made the same
change for the code for lzma compression. I'm pretty sure it is the right thing,
even though I don't have such a kernel at hand to test.

>>> ukify.Uname.scrape('/lib/modules/6.12.0-0.rc2.24.fc42.aarch64/vmlinuz')
Real-Mode Kernel Header magic not found
+ readelf --notes /lib/modules/6.12.0-0.rc2.24.fc42.aarch64/vmlinuz
readelf: Error: Not an ELF file - it has the wrong magic bytes at the start
Found uname version: 6.12.0-0.rc2.24.fc42.aarch64
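
The size mix-up described in this commit can be reproduced with a small, self-contained Python sketch. It does not use ukify's code or a real kernel image: the payload is synthetic, the version marker is made up, and the helper names are hypothetical.

```python
import gzip
import io

VERSION = b"Linux version 6.12.0-example"  # made-up marker for the demo


def make_payload():
    """Build a compressible image whose version string sits far from the start."""
    image = b"\0" * 100_000 + VERSION
    compressed = gzip.compress(image)
    return compressed, len(compressed)


def read_wrong(compressed, size):
    """Treat `size` as a count of decompressed bytes: only a short prefix is seen."""
    with gzip.open(io.BytesIO(compressed)) as f:
        return f.read(size)


def read_right(compressed, size):
    """Treat `size` as the length of the compressed data, then decompress all of it."""
    return gzip.decompress(compressed[:size])
```

Because the compressed data is much shorter than the image, `read_wrong()` stops long before reaching the marker, while `read_right()` sees the whole image.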

3 months agoTEST-04-JOURNAL: drop unexpected whitespace
Zbigniew Jędrzejewski-Szmek [Tue, 8 Jul 2025 14:42:29 +0000 (16:42 +0200)] 
TEST-04-JOURNAL: drop unexpected whitespace

3 months agocore: followups for the recent subgroup killing commits
Lennart Poettering [Wed, 9 Jul 2025 14:27:28 +0000 (16:27 +0200)] 
core: followups for the recent subgroup killing commits

This is a follow-up for 0f23564ad4a191a92bc5544edf800bb2cfbb3513 and
6b02854f508be3f27b45353dd1d12de7d93cab5f, as suggested here:

https://github.com/systemd/systemd/pull/37855#pullrequestreview-2997596953

3 months agogenerate-bpf-delegate-configs: fix compatibility with Python 3.7
Antonio Alvarez Feijoo [Wed, 9 Jul 2025 08:08:34 +0000 (10:08 +0200)] 
generate-bpf-delegate-configs: fix compatibility with Python 3.7

- Operator ":=" requires Python 3.8 or newer.
- list[str] requires Python 3.9 or newer.

Follow-up for ea9826eb946d57aaba7e6bfa2d6b120136c6b20f
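
For illustration, the two constructs listed above have straightforward Python 3.7-compatible spellings. The functions below are made-up examples, not code from generate-bpf-delegate-configs:

```python
import re
from typing import List, Optional


def first_word(line: str) -> Optional[str]:
    # Python 3.8+ could write: if (m := re.match(r"\w+", line)): ...
    m = re.match(r"\w+", line)
    if m:
        return m.group(0)
    return None


def words(lines: List[str]) -> List[str]:
    # Python 3.9+ could annotate with list[str]; typing.List also works on 3.7.
    return [w for w in map(first_word, lines) if w is not None]
```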

3 months agocore: add 'DefaultRestrictSUIDSGID' config option (#38126)
Yu Watanabe [Thu, 10 Jul 2025 04:30:07 +0000 (13:30 +0900)] 
core: add 'DefaultRestrictSUIDSGID' config option (#38126)

closes #37602, see there for extra motivation and considered
alternatives.

On typical systems, only a few services need to create SUID/SGID files.
This often is limited to the user explicitly setting suid/sgid, the
`systemd-tmpfiles*` services, and the package manager. Allowing a
default to globally restrict creation of suid/sgid files makes it easier
to apply this restriction precisely.

## testing done
- built on aarch64-linux and x86_64-linux
- ran a VM test on x86_64-linux, checking for:
    - VM system boots successfully
    - defaults apply (`yes`, `no`, and undefined)
    - systemd tmpfiles can set suid/sgid on journal log path
- Other services explicitly defining `RestrictSUIDSGID=no` can create
suid files

3 months agoman/systemd.exec: update documentation for PrivateBPF= (#38142)
Yu Watanabe [Thu, 10 Jul 2025 04:13:54 +0000 (13:13 +0900)] 
man/systemd.exec: update documentation for PrivateBPF= (#38142)

Follow-up for #36134

Add a short description about what PrivateBPF=yes does and how it can be
useful.

3 months agoman/systemd.exec: update documentation for PrivateBPF= 38142/head
Matteo Croce [Wed, 9 Jul 2025 22:12:36 +0000 (00:12 +0200)] 
man/systemd.exec: update documentation for PrivateBPF=

Add a short description about what PrivateBPF=yes does
and how it can be useful.

3 months agoman/systemd.exec: use constant instead of literal
Matteo Croce [Wed, 9 Jul 2025 23:25:48 +0000 (01:25 +0200)] 
man/systemd.exec: use constant instead of literal

Use <constant> instead of <literal> otherwise every configuration item
is wrapped in double quotes.

3 months agoupdate TODO
Lennart Poettering [Wed, 9 Jul 2025 20:32:18 +0000 (22:32 +0200)] 
update TODO

3 months agocore: document 'DefaultRestrictSUIDSGID' 38126/head
Grimmauld [Tue, 8 Jul 2025 19:39:06 +0000 (21:39 +0200)] 
core: document 'DefaultRestrictSUIDSGID'

3 months agocore/varlink-manager: Support 'DefaultRestrictSUIDSGID' option
Grimmauld [Wed, 9 Jul 2025 09:28:10 +0000 (11:28 +0200)] 
core/varlink-manager: Support 'DefaultRestrictSUIDSGID' option

3 months agocore/dbus-manager: Support 'DefaultRestrictSUIDSGID' option
Grimmauld [Wed, 9 Jul 2025 09:46:01 +0000 (11:46 +0200)] 
core/dbus-manager: Support 'DefaultRestrictSUIDSGID' option

3 months agocgroup: handle ENODEV on cg_read_pid() gracefully
Lennart Poettering [Wed, 9 Jul 2025 12:28:28 +0000 (14:28 +0200)] 
cgroup: handle ENODEV on cg_read_pid() gracefully

The recently added test case TEST-07-PID1.subgroup-kill.sh surfaced a
race: if we enumerate PIDs in a cgroup, and the cgroup is unlinked at
the very same time, reading will result in ENODEV. We need to handle that
gracefully. Hence let's do so.

Noticed while looking at:

https://github.com/systemd/systemd/actions/runs/16143084441/job/45554929264?pr=38120
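
A rough Python model of handling this race follows. It is a sketch, not the C fix: `read_pids()` and its `read_line` argument are hypothetical, and treating ENODEV as "no more PIDs" is an assumption about what handling it gracefully means here.

```python
import errno


def read_pids(read_line):
    """Collect PIDs from a line reader, treating ENODEV as end of list.

    `read_line` is a stand-in for reading cgroup.procs: it returns one
    line per call, "" at EOF, and may raise OSError.
    """
    pids = []
    while True:
        try:
            line = read_line()
        except OSError as e:
            if e.errno == errno.ENODEV:
                break  # cgroup vanished while we were enumerating
            raise
        if not line:
            break
        pids.append(int(line))
    return pids
```

Other errors are still raised, so only the removal race is absorbed.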

3 months agorecurse-dir: coding style cleanups; mount-util: teach open_tree_attr_fallback() our...
Yu Watanabe [Wed, 9 Jul 2025 18:32:33 +0000 (03:32 +0900)] 
recurse-dir: coding style cleanups; mount-util: teach open_tree_attr_fallback() our usual AT_EMPTY_PATH trick (#38130)

3 months agocore: add 'DefaultRestrictSUIDSGID' config option
Grimmauld [Tue, 8 Jul 2025 19:21:25 +0000 (21:21 +0200)] 
core: add 'DefaultRestrictSUIDSGID' config option

closes #37602

On typical systems, only a few services need to create SUID/SGID files.
This often is limited to the user explicitly setting suid/sgid, the
`systemd-tmpfiles*` services, and the package manager. Allowing a default
to globally restrict creation of suid/sgid files makes it easier to apply
this restriction precisely.

3 months agounits/systemd-tmpfiles-setup.service: explicitly set RestrictSUIDSGID=no
Grimmauld [Tue, 8 Jul 2025 20:02:46 +0000 (22:02 +0200)] 
units/systemd-tmpfiles-setup.service: explicitly set RestrictSUIDSGID=no

The tmpfiles service is used to set file permissions, e.g. for setting
suid bit on the journal log directory [1].

[1] https://github.com/systemd/systemd/blob/48e0f7bc2f94e74d15eed5c9e70b1c0269a495ec/tmpfiles.d/systemd.conf.in#L24-L25

3 months agounits/initrd-cleanup.service: Conflict with emergency.target
Fabian Vogt [Tue, 8 Jul 2025 11:02:47 +0000 (13:02 +0200)] 
units/initrd-cleanup.service: Conflict with emergency.target

This is very similar to 327cd2d3db703555f8d572b4cd055fbe55e1068b:

If emergency.target is started while initrd-cleanup.service/start is queued,
the initrd-cleanup job did not get canceled. In parallel to the emergency
units, it eventually runs the service, which in turn isolates and starts
initrd-switch-root.target. This stops the emergency units and effectively
starts the initrd boot process again, which likely fails again like the
initial attempt. The system is thus stuck in a loop, never really reaching
emergency.target.

This can be triggered if a service in between initrd-parse-etc.service
and initrd.target fails.

With this conflict added, starting emergency.target automatically cancels
initrd-cleanup.service/start, avoiding the loop.

3 months agomount-util: teach open_tree_attr_fallback() our usual AT_EMPTY_PATH trick 38130/head
Mike Yuan [Wed, 9 Jul 2025 08:07:07 +0000 (10:07 +0200)] 
mount-util: teach open_tree_attr_fallback() our usual AT_EMPTY_PATH trick

While at it, rename it to _with_fallback following
the naming scheme we use elsewhere.

3 months agomount-util: regroup functions
Mike Yuan [Wed, 9 Jul 2025 07:19:50 +0000 (09:19 +0200)] 
mount-util: regroup functions

3 months agorecurse-dir: switch to FOREACH_ARRAY
Mike Yuan [Wed, 9 Jul 2025 07:55:15 +0000 (09:55 +0200)] 
recurse-dir: switch to FOREACH_ARRAY

3 months agorecurse-dir: use -EBADF as placeholder for invalid fd
Mike Yuan [Wed, 9 Jul 2025 07:35:40 +0000 (09:35 +0200)] 
recurse-dir: use -EBADF as placeholder for invalid fd

As per our coding style.

3 months agoAdd support for BPF tokens (#36134)
Yu Watanabe [Wed, 9 Jul 2025 06:12:22 +0000 (15:12 +0900)] 
Add support for BPF tokens (#36134)

Add a new option `PrivateBPF=` to mount a private instance of bpffs.
Add also four configuration options
`BPFDelegate{Commands,Maps,Programs,Attachments}=` which set the
corresponding bpffs mount options in order to create BPF tokens:
https://lwn.net/Articles/947173/

Closes #35108.

3 months agocore: add options to delegate BPFFS token creation 36134/head
Matteo Croce [Thu, 15 May 2025 14:32:46 +0000 (16:32 +0200)] 
core: add options to delegate BPFFS token creation

Add four new options BPFDelegate{Commands,Maps,Programs,Attachments}=
in order to delegate to a BPFFS instance the permission to create tokens.

The value is a list of options taken from:
https://github.com/torvalds/linux/blob/v6.14/include/uapi/linux/bpf.h#L922-L1121
The special value "any" allows every possible value.

More information about BPF tokens is available here:
https://lwn.net/Articles/947173/

3 months agocore: Introduce PrivateBPF= to mount a private BPFFS
Matteo Croce [Fri, 27 Jun 2025 12:17:00 +0000 (14:17 +0200)] 
core: Introduce PrivateBPF= to mount a private BPFFS

Add a new option PrivateBPF= to mount a new instance of bpffs within a
namespace.
PrivateBPF= can be set to "no" to use the host bpffs in read-only mode
and "yes" to do a new mount.
The mount is done with the new fsopen()/fsmount() API because in the
future we'll hook some commands between the two calls.

3 months agocore: split out setup_private_users_child()
Matteo Croce [Tue, 26 Nov 2024 10:54:29 +0000 (11:54 +0100)] 
core: split out setup_private_users_child()

Drop support for kernels older than 3.19, as this is where
/proc/<pid>/setgroups was added.

https://github.com/torvalds/linux/commit/9cc46516ddf497ea16e8d7cb986ae03a0f6b92f8

3 months agotests: run test with CAP_BPF
Matteo Croce [Wed, 25 Jun 2025 12:42:48 +0000 (14:42 +0200)] 
tests: run test with CAP_BPF

Add CAP_BPF to tests run with nspawn, so we don't have to use a VM
to test BPF calls.

3 months agonspawn: create mountpoint for bpffs
Matteo Croce [Thu, 5 Jun 2025 08:00:05 +0000 (10:00 +0200)] 
nspawn: create mountpoint for bpffs

When we mount a tmpfs as /sys, create a mountpoint for bpf, as we
already do for cgroup.

3 months agocore: fix owner check of PIDFile=, and update document (#38115)
Yu Watanabe [Tue, 8 Jul 2025 14:58:19 +0000 (23:58 +0900)] 
core: fix owner check of PIDFile=, and update document (#38115)

Closes #38108.

3 months agoA few changes related to linking and bitfields (#38118)
Yu Watanabe [Tue, 8 Jul 2025 14:57:44 +0000 (23:57 +0900)] 
A few changes related to linking and bitfields (#38118)

3 months agomeson: drop -ffunction-sections -fdata-sections 38118/head
Zbigniew Jędrzejewski-Szmek [Tue, 8 Jul 2025 11:18:07 +0000 (13:18 +0200)] 
meson: drop -ffunction-sections -fdata-sections

I added them in 41afb5eb7214727301132aedc381831fbfc78e37 without too
much explanation. Most likely the idea was to get rid of unused code
in libsystemd.so [1]. But now that I'm testing this, it doesn't seem
to have an effect. LTO is needed to get rid of unused functions, and
it's enough to have LTO without those options. Those options might have
some downsides [2], so let's disable them since there are doubts and no
particularly good reason to have them.

But keep the -Wl,--gc-sections option. Without this, libsystemd.so
grows a little:
-rwxr-xr-x 1 zbyszek zbyszek 5532424 07-08 13:24 build/libsystemd.so.0.40.0-orig
-rwxr-xr-x 1 zbyszek zbyszek 5614472 07-08 13:26 build/libsystemd.so.0.40.0-no-sections
-rwxr-xr-x 1 zbyszek zbyszek 5532392 07-08 13:27 build/libsystemd.so.0.40.0

Let's apply the --gc-sections option always to make the debug and final
builds more similar.

We need to verify that distro packages don't unexpectedly grow after this.

[1] https://unix.stackexchange.com/a/715901
[2] https://stackoverflow.com/a/36033811

3 months agobasic/stdio-util: use a fixed message in xsprintf
Zbigniew Jędrzejewski-Szmek [Tue, 8 Jul 2025 10:44:06 +0000 (12:44 +0200)] 
basic/stdio-util: use a fixed message in xsprintf

We put the name of the variable in the message, but it is a local variable
and the name does not have global meaning. We end up with pointless copies
of the error string:

$ strings build/libsystemd.so.0.40.0 | grep 'big enough'
xsprintf: p[] must be big enough
xsprintf: error[] must be big enough
xsprintf: prefix[] must be big enough
xsprintf: pty[] must be big enough
xsprintf: mode[] must be big enough
xsprintf: t[] must be big enough
xsprintf: s[] must be big enough
xsprintf: spid[] must be big enough
xsprintf: header_priority[] must be big enough
xsprintf: header_pid[] must be big enough
xsprintf: path[] must be big enough
xsprintf: buf[] must be big enough

The error message already shows the file, line, and function name, which
is enough to identify the problem:

  Assertion 'xsprintf: buffer too small' failed at src/test/test-string-util.c:20, function test_xsprintf(). Aborting.

3 months agotest-string-util: add a small test for xsprintf
Zbigniew Jędrzejewski-Szmek [Tue, 8 Jul 2025 10:55:17 +0000 (12:55 +0200)] 
test-string-util: add a small test for xsprintf

3 months agoMerge shared/exec-directory-util.? into basic/unit-def.?
Zbigniew Jędrzejewski-Szmek [Tue, 8 Jul 2025 10:09:31 +0000 (12:09 +0200)] 
Merge shared/exec-directory-util.? into basic/unit-def.?

Suggested in
https://github.com/systemd/systemd/pull/35892#discussion_r2180322856.

This is a tiny amount of code and does not warrant having a separate file
and spawning a separate instance of the compiler during the build.

Note: it took me a while to confirm that the contents of that table and
function don't end up in libsystemd.so. The issue is that they _are_ present in
it, unless LTO is used. We actually use link_whole[libbasic_static] for
libsystemd, so we end up with all that code there. LTO is needed to clean
that up.

3 months agoman: mention relative PIDFile= in user service is prefixed with $XDG_RUNTIME_DIR 38115/head
Yu Watanabe [Tue, 8 Jul 2025 08:49:52 +0000 (17:49 +0900)] 
man: mention relative PIDFile= in user service is prefixed with $XDG_RUNTIME_DIR

3 months agocore: allow to use PIDFile= in user session services
Yu Watanabe [Tue, 8 Jul 2025 08:37:33 +0000 (17:37 +0900)] 
core: allow to use PIDFile= in user session services

Fixes #38108.

Co-authored-by: 铝箔 <38349409+Sodium-Aluminate@users.noreply.github.com>

3 months agoupdate TODO
Lennart Poettering [Tue, 8 Jul 2025 08:53:51 +0000 (10:53 +0200)] 
update TODO

3 months agoshared/open-file: add line break
Zbigniew Jędrzejewski-Szmek [Mon, 7 Jul 2025 09:13:26 +0000 (11:13 +0200)] 
shared/open-file: add line break

We don't generally parenthesize additions, so drop that too.

3 months agoAdjust bitfields in struct Condition
Zbigniew Jędrzejewski-Szmek [Tue, 1 Jul 2025 11:39:00 +0000 (13:39 +0200)] 
Adjust bitfields in struct Condition

As is usually the case, the bitfields don't create the expected space savings,
because the field that follows needs to be aligned. But we don't want to fully
drop the bitfields here, because then ConditionType and ConditionResult are
each 4 bytes, and the whole struct grows from 32 to 40 bytes (on amd64). We
potentially have lots of little Conditions and that'd waste some memory.

Make each of the four fields one byte. This still allows the compiler to
generate simpler code without changing the struct size:

E.g. in condition_test:
                 c->result = CONDITION_ERROR;
-   78fab:      48 8b 45 e8             mov    -0x18(%rbp),%rax
-   78faf:      0f b6 50 01             movzbl 0x1(%rax),%edx
-   78fb3:      83 e2 03                and    $0x3,%edx
-   78fb6:      83 ca 0c                or     $0xc,%edx
-   78fb9:      88 50 01                mov    %dl,0x1(%rax)
+   78f8b:      48 8b 45 e8             mov    -0x18(%rbp),%rax
+   78f8f:      c6 40 03 03             movb   $0x3,0x3(%rax)

3 months agoupdate TODO
Lennart Poettering [Tue, 8 Jul 2025 07:56:24 +0000 (09:56 +0200)] 
update TODO

4 months agotest: invoke systemd-nspawn properly from a session
Lennart Poettering [Wed, 2 Jul 2025 13:22:35 +0000 (15:22 +0200)] 
test: invoke systemd-nspawn properly from a session

Let's not run user code outside of user context. That's not how things
are deployed, and it means we cannot test the session setup properly.