Note, BPF_PROG_TYPE_CGROUP_DEVICE is supported since kernel v4.15.
As our baseline on the kernel is v5.4, we can assume the bpf type is
always supported.
core/bpf-firewall: replace bpf_firewall_supported() with bpf_program_supported()
Note, BPF_PROG_TYPE_CGROUP_SKB is supported since kernel v4.10, and
BPF_F_ALLOW_MULTI and program name is supported since kernel v4.15.
As our baseline on the kernel is v5.4, we can assume that the type,
flag, and naming is supported when bpf_program_supported() succeeds.
core: replace cgroup_bpf_supported() with dlopen_bpf_full()
After 3988e2489aaf30034e09918890f688780c154af7, the function is a simple
wrapper of bpf_dlopen() with logging. Let's introduce dlopen_bpf_full()
that takes log level, and replace cgroup_bpf_supported() with it.
man: rework the description of $SYSTEMD_PAGER and $PAGER
$PAGER wasn't documented, but actually we treat it same as $SYSTEMD_PAGER,
except for lower priority. And the two variables can be used to disable the
pager, even if $SYSTEMD_PAGERSECURE is not set.
Behaviour is (obviously) not changed by this patch, it intentionally just
updates the docs to match the code.
man: reword the description of "secure pager" handling
The existing description was not *wrong*, but it was a bit muddled. Let's
reorder the text to give a short intro and then describe what the options
actually do and the clear "true" and "false" cases first, and then describe
autodetection.
Related to https://yeswehack.com/vulnerability-center/reports/346802.
Daan De Meyer [Wed, 7 May 2025 09:43:44 +0000 (11:43 +0200)]
compress: Drop lz4 includes from compress.h
The lz4 functions are only used in test-compress.c, so let's just
put the declarations and includes in there instead of having everyone
including compress.h pull in the lz4 headers.
Daan De Meyer [Tue, 6 May 2025 13:39:03 +0000 (15:39 +0200)]
basic: Override glibc's sys/param.h header with an empty file
Instead of unconditionally including sys/param.h in
macro-fundamental.h which itself includes a bunch of other unnecessary
headers, let's override it with an empty file to avoid it from overriding
our MAX() macro. We can't make including it an error as it's included (
for seemingly no good reason) by <resolv.h>.
Yu Watanabe [Fri, 9 May 2025 02:18:02 +0000 (11:18 +0900)]
userdb: introduce USERDB_SYNTHESIZE_NUMERIC flag
When the flag is set, even if the specified UID/GID does not exist,
create a synthetic user record for the UID/GID.
Currently, only system UID/GID are supported.
With the commit, User=/Group=root was refused and warned that the root
is not a system user. Typically it is not necessary to specify such, but
let's not log confusing warning and honor the setting.
Yu Watanabe [Thu, 1 May 2025 03:44:23 +0000 (12:44 +0900)]
user-util,user-record-nss: initialize buffer before calling getpwnam_r() and friends
The buffer will be used by a library outside of our code base,
and may not be initialized even on success. Let's initialize
them for safety.
Hopefully fixes the following fuzzer warning:
```
==2039==WARNING: MemorySanitizer: use-of-uninitialized-value
#0 0x7f9ad8be3ae6 in _nss_files_getsgnam_r (/lib/x86_64-linux-gnu/libnss_files.so.2+0x8ae6) (BuildId: 013bf05b4846ebbdbebdb05585acc9726c2fabce)
#1 0x7f9ad93e5902 in getsgnam_r (/lib/x86_64-linux-gnu/libc.so.6+0x126902) (BuildId: 0323ab4806bee6f846d9ad4bccfc29afdca49a58)
#2 0x7f9ad9b98153 in nss_sgrp_for_group /work/build/../../src/systemd/src/shared/user-record-nss.c:357:21
#3 0x7f9ad9b98926 in nss_group_record_by_gid /work/build/../../src/systemd/src/shared/user-record-nss.c:431:21
#4 0x7f9ad9bcebd7 in groupdb_by_gid_fallbacks /work/build/../../src/systemd/src/shared/userdb.c:1372:29
Uninitialized value was created by a heap allocation
#0 0x556fd5294302 in malloc /src/llvm-project/compiler-rt/lib/msan/msan_interceptors.cpp:1021:3
#1 0x7f9ad9b9811d in nss_sgrp_for_group /work/build/../../src/systemd/src/shared/user-record-nss.c:353:23
#2 0x7f9ad9b98926 in nss_group_record_by_gid /work/build/../../src/systemd/src/shared/user-record-nss.c:431:21
#3 0x7f9ad9bcebd7 in groupdb_by_gid_fallbacks /work/build/../../src/systemd/src/shared/userdb.c:1372:29
```
exec-util: make missing agents a gracefull handled issues
Just downgrade the log message in case of ENOENT of agent binaries to
LOG_DEBUG. Do this in order to support distros which split off some
agent bianries into separate optional binaries.
Todd C. Miller [Tue, 6 May 2025 22:39:14 +0000 (16:39 -0600)]
flush_ports: flush POSIX message queues properly
On Linux, read() on a message queue descriptor returns the message
queue statistics, not the actual message queue data. We need to use
mq_receive() to drain the queues instead.
Fixes a problem where a POSIX message queue socket unit with messages
in the queue at shutdown time could result in a hang on reboot/shutdown.
Mike Yuan [Tue, 6 May 2025 22:36:41 +0000 (00:36 +0200)]
run0: support --chdir='~' for switching to target user's home dir
parse_path_argument() unconditionally makes the passed path absolute,
without handling '~' of any sort. I think this generally makes sense
in most tools, since ~ expansion is typically done by the shell
and we wouldn't be seeing them in the first place and hence special
casing is not worth it. But in run0 let's explicit enable '~'.
Mike Yuan [Wed, 9 Apr 2025 13:22:11 +0000 (15:22 +0200)]
core: accept "|" ExecStart= prefix to spawn target user's shell
When switching to another user it's oftentimes desirable to also spawn
the target user's shell. sudo supports this via -i flag, run0 currently
doesn't. We don't want to proactively query NSS ourselves, since
that would fall short when operating remotely. Let's instead teach
the service manager to spawn the command using the user's default shell.
I opted for "|" instead of "." in the end because the latter seems
a bit obscure. But happy to change it to something else if a better option
comes up.
Yu Watanabe [Wed, 7 May 2025 12:44:22 +0000 (21:44 +0900)]
units: enable IgnoreOnIsolate=yes on systemd-udevd-kernel.socket
Otherwise, initrd-cleanup.service requests isolation thus the socket
is stopped before switching root, and several early events after
switching root may be lost.
test-sd-login: add a "test" that just calls all sd_pid_get_* functions
As a test, it just increases our code coverage in a fake way.
When run manually, it can be used to conveniently print what logind
thinks about various processes:
$ build/test-sd-login
sd_pid_get_session(0) → No data available
sd_pid_get_unit(0) → user@1000.service
sd_pid_get_user_unit(0) → app-ghostty-transient-5088.scope
sd_pid_get_machine_name(0) → No such file or directory
sd_pid_get_slice(0) → user-1000.slice
sd_pid_get_user_slice(0) → app.slice
sd_pid_get_owner_uid(0) → 1000
sd_pid_get_cgroup(0) → /user.slice/user-1000.slice/user@1000.service/app.slice/app-ghostty-transient-5088.scope/surfaces/556FAF50BA40.scope
I initially wrote it this way, but then decided to implement a loop
limit, but forgot to drop the first approach in one place.
Fixup for 74cb65e45fbf3468cf6b522e4b4fa568d95f12c6.
man/systemd.exec: reword description of SystemCallFilter=
The existing text grew organically as features were added and was
not very organized. Reorder it and break into paragraphs grouped
by topic. The description of the :errno syntax is replaced by a short
reference to the SystemCallErrorNumber= setting. This makes the
text shorter and makes it easier to explain how the two settings combine.
damnkiwi6120 [Tue, 6 May 2025 18:53:32 +0000 (02:53 +0800)]
Replace reference URLs with working ones
The linuxfoundation.org entry at L50 goes 404, so I replace it with a working one from kernel.org.
Both links are checked with archive.org.
https://web.archive.org/web/20231114104223/https://lists.linuxfoundation.org/pipermail/virtualization/2015-August/030331.html
https://web.archive.org/web/20230503084037/https://docs.kernel.org/s390/pci.html