core: add quota support for State, Cache, and Log exec directories (#35892)
Based on https://github.com/systemd/systemd/issues/7820, this adds support for
quota enforcement to State, Cache, and Log exec directories.
* Add new directives, StateDirectoryQuota=, CacheDirectoryQuota=, and
LogDirectoryQuota=, to define quotas as percentages (hard limits for
blocks and inodes) or absolute values (hard limits for blocks only).
* Add new directives, StateDirectoryQuotaAccounting=,
CacheDirectoryQuotaAccounting= and LogDirectoryQuotaAccounting= to keep
track of storage quotas but not enforce them (effectively just assigning
a project ID to defined exec directories).
We exposed different names for the entry types in JSON than we used for
our enum values, which is very confusing. Let's unify that. Given that
the JSON fields are externally visible let's stick to that naming, even
though I think "unified" and "conf" would have been more descriptive.
This ensures we follow our usual logic that the enum identifiers and the
strings they map to use the same naming.
bootspec: rename boot_entry_type_to_string() to boot_entry_type_description_to_string()
This helper does not translate BootEntryType to a string matching the
enum's value names, but instead returns a human readable descriptive
string. Let's make it clearer what this does, by including "description" in
the name.
Mike Yuan [Tue, 27 May 2025 23:02:04 +0000 (01:02 +0200)]
core/cgroup: tweak unit_invalidate_cgroup_bpf() a bit
- Rename to unit_invalidate_cgroup_bpf_firewall() to make it clear
that this is about CGROUP_CONTROLLER_BPF_FIREWALL only
- Report whether things changed in unit_invalidate_cgroup()
to avoid duplicate checks
These source files use symbols provided by sys/stat.h, e.g. struct stat,
S_IFREG, S_IFBLK, and so on. Let's explicitly include sys/stat.h where
necessary.
Glibc's fcntl.h includes bits/stat.h, which provides these symbols, so
these symbols can be used without explicitly including sys/stat.h. But,
based on the discussion in #37922, we should explicitly include relevant
headers, and should not rely on the indirect inclusion.
shared/bus-unit-util: fix PrivateTmp=/PrivateUsers=/ProtectControlGroups= and Ex variants
For some fields, we perform careful parsing and verification on the sender
side. For other fields, we accept any string or strv. I think that actually
this is fine: we should optimize for the correct case, i.e. the user runs a
command that is valid. The server must perform parsing in all cases, so doing
the verification on the sender side doesn't add value. When doing parsing
locally, in case of invalid or unsupported input, we would generate the error
message locally, so we would avoid the D-Bus call, but the message itself is
not better and from the user's point of view, the result is the same. And by
doing the parsing only on the server side, we deal better with the case where
the sender has an older version of the software. By not doing verification, we
implicitly "support" new values. And when the sender has a newer version that
supports additional fields, that does not help as long as the server uses an
older version. So in case of version mismatches, parsing on the server side is
as good or better.
shared/bus-unit-util: tweak bus_append_exec_command to use Ex prop only if necessary
This changes little in behaviour, the conceptual part is more important. The
non-Ex variant is the actual name on the command line, and we should use the
non-Ex D-Bus property too, if it works. This increases compatibility with old
versions. But the code was mostly doing the right thing. Even the tests tested
the right thing.
We generally want to have error messages with a fixed structure that convey the
important information, i.e. field name, error value, and the offending text for
options that take short values. (The text is not printed for strings encoded with
base64 and hexmem or for credentials.)
Let's use a helper that prints the message in a fixed format in the majority of
cases. In the few places where a custom message is useful, the helper is not
used. The helper:
- prints the field name, value, and error info,
- quotes the value,
- handles -ENOMEM, so we don't need to handle it separately everywhere.
When this code was originally written, parse functions would return -1
as error. Nowadays they all return a good errno, so it is fine if we print
the corresponding strerror.
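
The fixed message format described above can be illustrated with a minimal sketch. This is not the helper from the commit (which is C code in shared/bus-unit-util); the function name parse_error(), its signature, and the exact wording of the message are assumptions made for illustration. The -ENOMEM handling mentioned above has no direct analogue in Python and is left out.

```python
import errno
import os


def parse_error(field: str, value: str, errnum: int = 0) -> str:
    """Build a parse-failure message with a fixed structure.

    Hypothetical helper: name, signature, and wording are assumptions.
    """
    # Always quote the offending value so that empty strings and stray
    # whitespace are visible in the message.
    message = f"Failed to parse {field}= parameter {value!r}"
    if errnum:
        # Accept either sign convention (-EINVAL or EINVAL) and append
        # the corresponding strerror text.
        message += f": {os.strerror(abs(errnum))}"
    return message


if __name__ == "__main__":
    print(parse_error("CPUWeight", "abc", -errno.EINVAL))
```

Keeping the formatting in one place means every call site reports the field name, the quoted value, and the error in the same order.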
shared/bus-unit-util: tweak error handling in bus_append_exec_command
exec_command_flags_to_strv() should not fail, unless we screwed up, so assert
instead of returning an error. Also, no need to strdup constant _PATH_BSHELL;
drop that so that we can get rid of the oom error handling. Finally, rename
l → cmdline for clarity.
basic/include: replace _Static_assert() with static_assert()
If one of the headers is included in a C++ source file, then using
_Static_assert() triggers a compile error for some reason.
Let's use static_assert(), which can be used by both C and C++ code.
Yu Watanabe [Wed, 25 Jun 2025 16:03:26 +0000 (01:03 +0900)]
namespace-util,nsresource: explicitly include sched.h
These source files use symbols provided by sched.h, e.g.
setns(), unshare(), CLONE_NEWNS, and friends, but they do not explicitly
include sched.h. Currently, it is included indirectly via missing_syscall.h,
which is included by e.g. pidfd-util.h.
Let's explicitly include headers that provide symbols used in the code.
tree-wide: several cleanups for reading/writing /proc/sys/fs/nr_open
- use unsigned for the return value of read_nr_open(), as it does not
fail, and the kernel internally uses unsigned for the value,
- when bumping the value by PID1, let's start from the kernel's maximum
value defined in fs/file.c. The maximum value should be mostly an API
of the kernel, but may change in the future, hence still try several
times if we fail to bump the value.
Co-authored-by: Jared Baur <jaredbaur@fastmail.com>
Co-authored-by: John Rinehart <johnrichardrinehart@gmail.com>
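
The "start from the kernel's maximum and retry" idea for nr_open can be sketched as follows. This is only an illustration, not the PID1 code: the constant ASSUMED_NR_OPEN_MAX (INT_MAX rounded down to a multiple of 64, as expected on a 64-bit kernel), the halving strategy, the attempt count, and the injected write_nr_open callback are all assumptions.

```python
from typing import Callable

# Assumption: upper bound of fs.nr_open on a 64-bit kernel. The commit
# message only says that the maximum is defined in fs/file.c.
ASSUMED_NR_OPEN_MAX = 2147483584


def bump_nr_open(write_nr_open: Callable[[int], None], current: int,
                 start: int = ASSUMED_NR_OPEN_MAX, attempts: int = 8) -> int:
    """Try to raise nr_open, halving the candidate after each rejected write.

    Returns the value that is in effect afterwards.
    """
    candidate = start
    for _ in range(attempts):
        if candidate <= current:
            break  # nothing left to gain
        try:
            write_nr_open(candidate)
            return candidate
        except OSError:
            candidate //= 2  # the kernel refused; retry with a smaller value
    return current
```

Passing the writer in as a callback keeps the sketch testable without touching /proc/sys/fs/nr_open.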
meson: do not reference variable unless feature that defines it is enabled
SYSTEMD_LANGUAGE_FALLBACK_MAP is used by the localed test, and
language_fallback_map is defined by the localed meson.
If the feature is disabled, the test is not built so the env var
is not needed, and the meson variable is not defined so the build
fails.
Enable nspawn job, as there's no nested kvm so VMs are too slow. Fix
some tests that fail in a VM anyway, might add a nightly job later that
runs them.
chase() is arguably a hot path in our code, hence it deserves
some caching of whether open_tree() is available. Moreover,
the manual set of r to -EPERM feels kinda ugly. Let's
instead extract this bit into its own function.
seccomp-util: allowlist open_tree() as part of @file-system
Now that we make use of open_tree() in places we previously used
openat() with O_PATH, it makes sense to move it from @mount to
@file-system. Without the OPEN_TREE_CLONE flag open_tree() is after all
unprivileged.
Note that I left open_tree_attr() in @mount, since its purpose is
really to set mount options when cloning, and that's clearly a mount
related thing, not so much something you could use unpriv.
This addresses an issue tracked down by Antonio Feijoo: since the commit
that started to use open_tree() various apps started to crash because
they used seccomp filters and sd-device started to use open_tree()
internally.
* cc380fbc8a Install new files for upstream build
* 45f81ec53e Install new files for upstream build
* 105837d0ba Update changelog for 257.7-1 release
* bb17074bfd systemd-boot: reduce harmless noise on cleanup
* 363898fe05 systemd-boot: remove fb too on removal
For initrd presets, we can change the default to disable services
by default instead of enabling by default without breaking compat
so let's do that as it makes much more sense as a default than
enabling everything by default.
New workers we got from IBM can be used now. The GHA linter doesn't
recognize them yet, so add a local workaround until the change is
merged in the linter.
shared/bus-unit-util: also send empty array for LogFilterPatterns=
Before, for empty input, we'd send an array with one item with an empty
pattern. Use the helper which sends an empty array instead.
bus_exec_context_set_transient_property() ignores items with an empty
pattern, so the result should be the same.
Request in review:
https://github.com/systemd/systemd/pull/37665#discussion_r2182375988.
test-bus-unit-util: add a test that attempts to serialize all known transient settings
The samples were partially generated using claude.ai. Those examples are
usually fairly boring. I tried to remove obvious repetitions and add some more
interesting examples, but certainly more edge cases could be added.
In some cases, we are quite lenient and do almost no verification on the sender
side.
ssh-generator: generate /etc/issue.d/ with VSOCK ssh info data (#37819)
I find myself trying to log into a fresh ParticleOS VM started via
systemd-vmspawn all the time, but I don't know its CID. Let's show it on
the getty screen, to make it immediately visible.
ukify: when decompressing kernel before signing, call verify on decompressed file
Otherwise it will fail as it's an archive, not a PE file:
Invalid DOS header magic
Can't open image /boot/vmlinuz.old
/boot/vmlinuz.old is compressed and cannot be loaded by UEFI, decompressing
+ sbverify --list /boot/vmlinuz.old
=========================== short test summary info ============================
FAILED ../src/ukify/test/test_ukify.py::test_efi_signing_sbsign[3650] - subprocess.CalledProcessError: Command '['sbverify', '--list', PosixPath('/boot/vmlinuz.old')]' returned non-zero exit status 1.
FAILED ../src/ukify/test/test_ukify.py::test_efi_signing_sbsign[None] - subprocess.CalledProcessError: Command '['sbverify', '--list', PosixPath('/boot/vmlinuz.old')]' returned non-zero exit status 1.
FAILED ../src/ukify/test/test_ukify.py::test_inspect - subprocess.CalledProcessError: Command '['sbverify', '--list', PosixPath('/boot/vmlinuz.old')]' returned non-zero exit status 1.
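
The fix boils down to running the verification on the decompressed file rather than on the original archive. A minimal sketch of that idea, not the ukify code itself: the function name pe_path_for_verification() is hypothetical, and only gzip is handled here, which is an assumption made to keep the example short.

```python
import gzip
import tempfile
from pathlib import Path


def pe_path_for_verification(kernel: Path) -> Path:
    """Return a path holding a PE image: the input itself or a decompressed copy."""
    data = kernel.read_bytes()
    if data[:2] == b"MZ":  # DOS header magic at the start of a PE file
        return kernel
    if data[:2] == b"\x1f\x8b":  # gzip magic; other formats are out of scope here
        out = tempfile.NamedTemporaryFile(suffix=".efi", delete=False)
        with out:
            out.write(gzip.decompress(data))
        return Path(out.name)
    raise ValueError(f"{kernel} is neither a PE image nor gzip-compressed")
```

A caller would then pass the returned path, not the original one, to the verification command (sbverify --list in the log above).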