Mike Yuan [Sun, 4 Feb 2024 15:22:46 +0000 (23:22 +0800)]
core: reuse credential dir across start and start-post if populated,
fresh otherwise
Currently, exec_setup_credential() always rewrite all credentials
upon exec_invoke(), i.e. invocation of each ExecCommand, and within
a single tmpfs instance. This is problematic though:
* When writing each tmp cred file, we essentially double the size
of the credential. Therefore, if one cred is bigger than half
of CREDENTIALS_TOTAL_SIZE_MAX, confusing ENOSPC occurs (see also
https://github.com/systemd/systemd/pull/24734#issuecomment-1925440546)
* Credential is a unit-wide thing and thus should not change
during the whole lifetime of main process. However, if e.g.
a on-disk credential or SetCredential= in unit file
changes between ExecStart= and ExecStartPost=,
the credentials are overwritten when the latter gets to run,
and the already-running main process is suddenly seeing
completely different creds.
So, let's try to reuse final cred dir if the main process has started
and the tmpfs has been populated, so that the creds used is stable
across all ExecStart= and ExecStartPost=-s. We still want to retain
the ability of updating creds through ExecStartPre= though, therefore
we forcibly use a fresh cred dir for those. 'Fresh' means to actually
unmount the old tmpfs first, so the first problem goes away, too.
Felix Riemann [Fri, 2 Feb 2024 17:08:52 +0000 (18:08 +0100)]
cryptenroll: Fix reading keyfile from socket
systemd-cryptenroll uses the READ_FULL_FILE_CONNECT_SOCKET flag when
reading the keyfile to also allow reading it from a socket. But it also
sets the offset to 0, causing an unnecessary seek to the beginning of
the newly opened keyfile and disables socket support again, as these do
not support seeking.
Disable seeking entirely to remove the unneeded seek and restore support
for reading the keyfile from a socket again as with systemd-cryptsetup.
Also= lists units which should be enabled/disabled together with the first unit.
But userdbd is independent of homed, we shouldn't e.g. disable it even if homed
is disabled.
load-fragment: set PATH_CHECK_NON_API_VFS flag at various other places
I tried to be conservative here, and hence in doubt I left the flag off,
but in some cases I really can't see any reason why it would make sense
to specifiy paths into API VFS, hence add it there, to lock things down
a bit.
parse-helpers: add new PATH_CHECK_NON_API_VFS flag
In various contexts it's a bit icky to allow paths below /proc/, /sys/,
/dev/ i.e. file hierarchies where API VFS are placed. Let's add a new
flag for path_simplify_and_warn() to check for this and refuse a path if
in these paths.
Enable this when parsing WorkingDirectory=.
This is inspired by CVE-2024-21626, which uses trickery around the cwd
and /proc/self/fd/.
AFAICS we are not actually vulnerable to the same issue as explained in
the CVE since we execute the WorkingDirectory= setting very late, i.e.
long after we set up the new mount namespace. But let's filter out icky
stuff better earlier than later, as extra safety precaution.
Luca Boccassi [Fri, 12 Jan 2024 21:32:20 +0000 (21:32 +0000)]
core: add support for pidfd_spawn
Added in glibc 2.39, allows cloning into a cgroup and to get
a pid fd back instead of a pid. Removes race conditions for
both changing cgroups and getting a reliable reference for the
child process.
We already use __VA_OPT__ in multiple places, which was introduced in
gcc 8 [0], so let's bump the baseline to reflect that. I chose gcc 8.4,
as that was the lowest 8.x version I could easily get my hands on when I
verified this (on Ubuntu Focal with the gcc-8 package).
Mike Yuan [Sun, 4 Feb 2024 11:36:06 +0000 (19:36 +0800)]
core/service: don't setup credentials for ExecCondition= and ExecReload=
This seems to be a mistake in #27279. I believe credentials should
not be made available to condition or reload tasks. In most cases
they're irrelevant from the actual job of the service. Also, currently
the first ExecCondition= or ExecReload= cannot access creds anyway,
making the incompatibility introduced negligible.
If people actually come up with valid use cases, we can always
revisit this.
Ivan Shapovalov [Sat, 20 Jan 2024 11:52:28 +0000 (12:52 +0100)]
nspawn: permit --ephemeral with --link-journal=try-* (treat as =no)
Common sense says that to "try" something means "to not fail if
something turns out not to be possible", thus do not make this
combination a hard error.
The actual implementation ignores any --link-journal= setting when
--ephemeral is in effect, so the semantics are upheld.
vpick: use prefix_roota() to avoid double slash in log messages
If the toplevel_path is empty we end up with doubled leading slash,
which looks weird:
[ 4737.028985] testsuite-74.sh[102]: Inode '//var/lib/machines/mytree.v/mytree_37.0_arm64+2-3' has wrong type, found 'dir'.
[ 4737.028985] testsuite-74.sh[102]: Failed to pick version for '/var/lib/machines/mytree.v': Is a directory
...
[ 4316.957536] testsuite-74.sh[99]: Failed to open '//var/lib/machines/mytree.v/mytree_37.0': No such file or directory
...
Yu Watanabe [Sun, 21 Jan 2024 04:11:09 +0000 (13:11 +0900)]
pam: do not warn closing bus connection which is opened after the fork
In pam_systemd.so and pam_systemd_home.so, we open a bus connection on
session close, which is called after fork. Closing the connection is
harmless, and should not warn about that.
This suppresses the following log message:
===
(sd-pam)[127]: PAM Attempted to close sd-bus after fork, this should not happen.
===
networkException [Mon, 29 Jan 2024 21:31:59 +0000 (22:31 +0100)]
resolve: include interface name in org.freedesktop.resolve1 polkit checks
this patch adds the interface name of the interface to be modified
to *details* when verifying dbus calls to the `org.freedesktop.resolve1`
D-Bus interface for all `Set*` and the `Revert` method.
when defining a polkit rule, this allows limiting the access to a specific
interface:
```js
// This rule prevents the user "vpn" to disable DNSoverTLS for any
// other interface than "vpn0". The vpn service should be allowed
// to disable DNSoverTLS on its own as it provides a local DNS
// server with search domains on the interface and this server does
// not support DNSoverTLS.
polkit.addRule(function(action, subject) {
if (action.id == "org.freedesktop.resolve1.set-dns-over-tls" &&
action.lookup("interface") == "vpn0" &&
subject.user == "vpn") {
return polkit.Result.YES;
}
});
```
resolvectl: add JSON output support for "resolvectl query"
It's easy to add. Let's do so.
This only covers record lookups, i.e. with the --type= switch.
The higher level lookups are not covered, I opted instead to print a
message there to use --type= instead.
I am a bit reluctant to defining a new JSON format for the high-level
lookups, hence I figured for now a helpful error is good enough, that
points people to the right use.
Daan De Meyer [Tue, 30 Jan 2024 21:36:12 +0000 (22:36 +0100)]
mkosi: Stop using file provides with CentOS/Fedora
dnf5 does not download filelists metadata by default anymore as this
consists of a pretty big chunk of the repository metadata. Let's make
sure the filelists metadata doesn't have to be downloaded by dnf5 by
removing any usage of file provides from our package lists.
Adrian Vovk [Sun, 21 Jan 2024 01:29:40 +0000 (20:29 -0500)]
homed: Add InhibitSuspend() method
This returns an FD that can be used to temporarily inhibit the automatic
locking on system suspend behavior of homed. As long as the FD is open,
LockAllHomes() won't lock that home directory on suspend. This allows
desktop environments to implement custom more complicated behavior
Frantisek Sumsal [Tue, 30 Jan 2024 10:25:19 +0000 (11:25 +0100)]
meson: don't install broken tmpfiles config with sshd?confdir == 'no'
20-systemd-ssh-generator.conf expands SSHCONFDIR, which is bogus when we
build with -Dsshconfdir=no. Similarly, avoid expanding SSHDCONFDIR in
20-systemd-userdb.conf when building with -Dsshconfdir=no.
Frantisek Sumsal [Tue, 30 Jan 2024 15:27:58 +0000 (16:27 +0100)]
test: explicitly set nsec3-iterations to 0
knot v3.2 and later does this by default. knot v3.1 still has the default set to
10, but it also introduced a warning that the default will be changed to 0 in
later versions, so it effectively complains about its own default, which then
fails the config check. Let's just set the value explicitly to zero to avoid
that.
~# knotc --version
knotc (Knot DNS), version 3.1.6
~# grep nsec3-iterations test/knot-data/knot.conf || echo nope
nope
~# knotc -c /build/test/knot-data/knot.conf conf-check
warning: config, policy[auto_rollover_nsec3].nsec3-iterations defaults to 10, since version 3.2 the default becomes 0
Configuration is valid
Adrian Vovk [Wed, 24 Jan 2024 00:50:21 +0000 (19:50 -0500)]
core: Fail to start/stop/reload unit if frozen
Previously, unit_{start,stop,reload} would call the low-level cgroup
unfreeze function whenever a unit was started, stopped, or reloaded. It
did so with no error checking. This call would ultimately recurse up the
cgroup tree, and unfreeze all the parent cgroups of the unit, unless an
error occurred (in which case I have no idea what would happen...)
After the freeze/thaw rework in a previous commit, this can no longer
work. If we recursively thaw the parent cgroups of the unit, there may
be sibling units marked as PARENT_FROZEN which will no longer actually
have frozen parents. Fixing this is a lot more complicated than simply
disallowing start/stop/reload on a frozen unit
Adrian Vovk [Sun, 21 Jan 2024 20:05:20 +0000 (15:05 -0500)]
core: Rework recursive freeze/thaw
This commit overhauls the way freeze/thaw works recursively:
First, it introduces new FreezerActions that are like the existing
FREEZE and THAW but indicate that the action was initiated by a parent
unit. We also refactored the code to pass these FreezerActions through
the whole call stack so that we can make use of them. FreezerState was
extended similarly, to be able to differentiate between a unit that's
frozen manually and a unit that's frozen because a parent is frozen.
Next, slices were changed to check recursively that all their child
units can be frozen before it attempts to freeze them. This is different
from the previous behavior, that would just check if the unit's type
supported freezing at all. This cleans up the code, and also ensures
that the behavior of slices corresponds to the unit's actual ability
to be frozen
Next, we make it so that if you FREEZE a slice, it'll PARENT_FREEZE
all of its children. Similarly, if you THAW a slice it will PARENT_THAW
its children.
Finally, we use the new states available to us to refactor the code
that actually does the cgroup freezing. The code now looks at the unit's
existing freezer state and the action being requested, and decides what
next state is most appropriate. Then it puts the unit in that state.
For instance, a RUNNING unit with a request to PARENT_FREEZE will
put the unit into the PARENT_FREEZING state. As another example, a
FROZEN unit who's parent is also FROZEN will transition to
PARENT_FROZEN in response to a request to THAW.