parse-helpers: add new PATH_CHECK_NON_API_VFS flag
In various contexts it's a bit icky to allow paths below /proc/, /sys/,
/dev/, i.e. the file hierarchies where API VFS are mounted. Let's add a
new flag for path_simplify_and_warn() to check for this and refuse a
path if it is below one of these hierarchies.
Enable this when parsing WorkingDirectory=.
This is inspired by CVE-2024-21626, which uses trickery around the cwd
and /proc/self/fd/.
AFAICS we are not actually vulnerable to the same issue as explained in
the CVE since we execute the WorkingDirectory= setting very late, i.e.
long after we set up the new mount namespace. But let's filter out icky
stuff earlier rather than later, as an extra safety precaution.
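A minimal sketch of the kind of check described above, not the actual
systemd code (which is C). The function name is hypothetical, and the
sketch only normalizes the path lexically; it does not resolve symlinks.

```python
import posixpath

# The three hierarchies named in the commit message.
API_VFS_ROOTS = ("/proc", "/sys", "/dev")


def is_below_api_vfs(path: str) -> bool:
    """Return True if the normalized absolute path is an API VFS root or below one."""
    if not path.startswith("/"):
        raise ValueError("expected an absolute path")
    # normpath() may keep two leading slashes, so collapse them explicitly.
    normalized = "/" + posixpath.normpath(path).lstrip("/")
    return any(
        normalized == root or normalized.startswith(root + "/")
        for root in API_VFS_ROOTS
    )
```

A caller parsing a setting such as WorkingDirectory= would refuse the
value when this returns True. Note that "/process" is not matched, since
the comparison is done per path component.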
Luca Boccassi [Fri, 12 Jan 2024 21:32:20 +0000 (21:32 +0000)]
core: add support for pidfd_spawn
pidfd_spawn was added in glibc 2.39. It allows cloning into a cgroup
and getting a pidfd back instead of a PID. This removes race conditions
both when changing cgroups and when getting a reliable reference to the
child process.
We already use __VA_OPT__ in multiple places, which was introduced in
gcc 8 [0], so let's bump the baseline to reflect that. I chose gcc 8.4,
as that was the lowest 8.x version I could easily get my hands on when I
verified this (on Ubuntu Focal with the gcc-8 package).
Mike Yuan [Sun, 4 Feb 2024 11:36:06 +0000 (19:36 +0800)]
core/service: don't setup credentials for ExecCondition= and ExecReload=
This seems to be a mistake in #27279. I believe credentials should
not be made available to condition or reload tasks. In most cases
they're irrelevant to the actual job of the service. Also, currently
the first ExecCondition= or ExecReload= cannot access creds anyway,
making the incompatibility introduced negligible.
If people actually come up with valid use cases, we can always
revisit this.
Ivan Shapovalov [Sat, 20 Jan 2024 11:52:28 +0000 (12:52 +0100)]
nspawn: permit --ephemeral with --link-journal=try-* (treat as =no)
Common sense says that to "try" something means "to not fail if
something turns out not to be possible", thus do not make this
combination a hard error.
The actual implementation ignores any --link-journal= setting when
--ephemeral is in effect, so the semantics are upheld.
vpick: use prefix_roota() to avoid double slash in log messages
If the toplevel_path is empty we end up with doubled leading slash,
which looks weird:
[ 4737.028985] testsuite-74.sh[102]: Inode '//var/lib/machines/mytree.v/mytree_37.0_arm64+2-3' has wrong type, found 'dir'.
[ 4737.028985] testsuite-74.sh[102]: Failed to pick version for '/var/lib/machines/mytree.v': Is a directory
...
[ 4316.957536] testsuite-74.sh[99]: Failed to open '//var/lib/machines/mytree.v/mytree_37.0': No such file or directory
...
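To illustrate the problem, here is a small sketch of joining a root
directory and a path without producing a doubled slash. This is only an
approximation of what a helper like prefix_roota() does; the Python
function name is hypothetical.

```python
def prefix_root(root: str, path: str) -> str:
    """Prefix path with root, treating an empty root or "/" as no prefix."""
    root = root.rstrip("/")
    path = "/" + path.lstrip("/")
    return root + path if root else path


def naive_join(root: str, path: str) -> str:
    """The naive concatenation that yields '//...' when root is empty."""
    return root + "/" + path
```

With an empty toplevel path the naive variant gives
'//var/lib/machines/mytree.v', while prefix_root() returns the path
unchanged.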
Yu Watanabe [Sun, 21 Jan 2024 04:11:09 +0000 (13:11 +0900)]
pam: do not warn closing bus connection which is opened after the fork
In pam_systemd.so and pam_systemd_home.so, we open a bus connection on
session close, which is called after fork. Closing the connection is
harmless, and we should not warn about it.
This suppresses the following log message:
===
(sd-pam)[127]: PAM Attempted to close sd-bus after fork, this should not happen.
===
networkException [Mon, 29 Jan 2024 21:31:59 +0000 (22:31 +0100)]
resolve: include interface name in org.freedesktop.resolve1 polkit checks
This patch adds the name of the interface to be modified to *details*
when verifying D-Bus calls to the `org.freedesktop.resolve1` D-Bus
interface, for all `Set*` methods and the `Revert` method.
When defining a polkit rule, this allows limiting access to a specific
interface:
```js
// This rule prevents the user "vpn" from disabling DNSoverTLS on any
// interface other than "vpn0". The vpn service should be allowed
// to disable DNSoverTLS on its own, as it provides a local DNS
// server with search domains on the interface and this server does
// not support DNSoverTLS.
polkit.addRule(function(action, subject) {
    if (action.id == "org.freedesktop.resolve1.set-dns-over-tls" &&
        action.lookup("interface") == "vpn0" &&
        subject.user == "vpn") {
        return polkit.Result.YES;
    }
});
```
resolvectl: add JSON output support for "resolvectl query"
It's easy to add. Let's do so.
This only covers record lookups, i.e. with the --type= switch.
The higher-level lookups are not covered; instead I opted to print a
message there that suggests using --type=.
I am a bit reluctant to define a new JSON format for the high-level
lookups, hence I figured that for now a helpful error that points
people to the right usage is good enough.
Daan De Meyer [Tue, 30 Jan 2024 21:36:12 +0000 (22:36 +0100)]
mkosi: Stop using file provides with CentOS/Fedora
dnf5 does not download filelists metadata by default anymore, as it
makes up a pretty big chunk of the repository metadata. Let's make
sure the filelists metadata doesn't have to be downloaded by dnf5 by
removing any usage of file provides from our package lists.
Adrian Vovk [Sun, 21 Jan 2024 01:29:40 +0000 (20:29 -0500)]
homed: Add InhibitSuspend() method
This returns an FD that can be used to temporarily inhibit the automatic
locking on system suspend behavior of homed. As long as the FD is open,
LockAllHomes() won't lock that home directory on suspend. This allows
desktop environments to implement custom, more complicated behavior.
Frantisek Sumsal [Tue, 30 Jan 2024 10:25:19 +0000 (11:25 +0100)]
meson: don't install broken tmpfiles config with sshd?confdir == 'no'
20-systemd-ssh-generator.conf expands SSHCONFDIR, which is bogus when we
build with -Dsshconfdir=no. Similarly, avoid expanding SSHDCONFDIR in
20-systemd-userdb.conf when building with -Dsshdconfdir=no.
Frantisek Sumsal [Tue, 30 Jan 2024 15:27:58 +0000 (16:27 +0100)]
test: explicitly set nsec3-iterations to 0
knot v3.2 and later does this by default. knot v3.1 still has the default set to
10, but it also introduced a warning that the default will be changed to 0 in
later versions, so it effectively complains about its own default, which then
fails the config check. Let's just set the value explicitly to zero to avoid
that.
~# knotc --version
knotc (Knot DNS), version 3.1.6
~# grep nsec3-iterations test/knot-data/knot.conf || echo nope
nope
~# knotc -c /build/test/knot-data/knot.conf conf-check
warning: config, policy[auto_rollover_nsec3].nsec3-iterations defaults to 10, since version 3.2 the default becomes 0
Configuration is valid
Adrian Vovk [Wed, 24 Jan 2024 00:50:21 +0000 (19:50 -0500)]
core: Fail to start/stop/reload unit if frozen
Previously, unit_{start,stop,reload} would call the low-level cgroup
unfreeze function whenever a unit was started, stopped, or reloaded. It
did so with no error checking. This call would ultimately recurse up the
cgroup tree, and unfreeze all the parent cgroups of the unit, unless an
error occurred (in which case I have no idea what would happen...)
After the freeze/thaw rework in a previous commit, this can no longer
work. If we recursively thaw the parent cgroups of the unit, there may
be sibling units marked as PARENT_FROZEN which will no longer actually
have frozen parents. Fixing this is a lot more complicated than simply
disallowing start/stop/reload on a frozen unit.
Adrian Vovk [Sun, 21 Jan 2024 20:05:20 +0000 (15:05 -0500)]
core: Rework recursive freeze/thaw
This commit overhauls the way freeze/thaw works recursively:
First, it introduces new FreezerActions that are like the existing
FREEZE and THAW but indicate that the action was initiated by a parent
unit. We also refactored the code to pass these FreezerActions through
the whole call stack so that we can make use of them. FreezerState was
extended similarly, to be able to differentiate between a unit that's
frozen manually and a unit that's frozen because a parent is frozen.
Next, slices were changed to check recursively that all their child
units can be frozen before they attempt to freeze them. This is
different from the previous behavior, which would just check if the
unit's type supported freezing at all. This cleans up the code, and also
ensures that the behavior of slices corresponds to the unit's actual
ability to be frozen.
Next, we make it so that if you FREEZE a slice, it'll PARENT_FREEZE
all of its children. Similarly, if you THAW a slice it will PARENT_THAW
its children.
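The slice behavior described so far could be sketched roughly as
follows. This is an illustration only: the Unit class, its fields and
the function names are hypothetical and do not mirror the C code.

```python
from dataclasses import dataclass, field


@dataclass
class Unit:
    name: str
    supports_freeze: bool = True
    children: list = field(default_factory=list)  # non-empty only for slices


def can_freeze(unit: Unit) -> bool:
    """A unit can be frozen only if it and all its descendants can be."""
    return unit.supports_freeze and all(can_freeze(c) for c in unit.children)


def _plan(unit: Unit, action: str) -> list:
    plan = [(unit.name, action)]
    for child in unit.children:
        # Children are frozen on behalf of their parent.
        plan.extend(_plan(child, "PARENT_FREEZE"))
    return plan


def plan_freeze(unit: Unit) -> list:
    """Return (unit name, action) pairs, or refuse before touching anything."""
    if not can_freeze(unit):
        raise ValueError(f"{unit.name} or one of its children cannot be frozen")
    return _plan(unit, "FREEZE")
```

The point of checking first is that a slice with one unfreezable
descendant is refused as a whole, rather than being left half-frozen.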
Finally, we use the new states available to us to refactor the code
that actually does the cgroup freezing. The code now looks at the unit's
existing freezer state and the action being requested, and decides what
next state is most appropriate. Then it puts the unit in that state.
For instance, a RUNNING unit with a request to PARENT_FREEZE will
put the unit into the PARENT_FREEZING state. As another example, a
FROZEN unit whose parent is also FROZEN will transition to
PARENT_FROZEN in response to a request to THAW.
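The two examples above can be written down as a tiny transition
function. This is a sketch, not the real state machine: only the two
transitions marked "from the commit message" are taken from the text.
The FREEZING and THAWING state names, the parent_frozen parameter and
the remaining transitions are assumptions made for illustration.

```python
from enum import Enum, auto


class FreezerAction(Enum):
    FREEZE = auto()
    THAW = auto()
    PARENT_FREEZE = auto()
    PARENT_THAW = auto()


class FreezerState(Enum):
    RUNNING = auto()
    FREEZING = auto()          # assumed name
    FROZEN = auto()
    PARENT_FREEZING = auto()
    PARENT_FROZEN = auto()
    THAWING = auto()           # assumed name


def next_state(state: FreezerState, action: FreezerAction, parent_frozen: bool) -> FreezerState:
    """Pick the next state from the current state and the requested action."""
    if state is FreezerState.RUNNING and action is FreezerAction.PARENT_FREEZE:
        return FreezerState.PARENT_FREEZING  # from the commit message
    if state is FreezerState.FROZEN and action is FreezerAction.THAW:
        if parent_frozen:
            return FreezerState.PARENT_FROZEN  # from the commit message
        return FreezerState.THAWING  # assumption
    if state is FreezerState.RUNNING and action is FreezerAction.FREEZE:
        return FreezerState.FREEZING  # assumption
    raise NotImplementedError("transition not covered by this sketch")
```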
creds-util: add a concept of "user-scoped" credentials
So far credentials are a concept for system services only: to encrypt or
decrypt a credential you must be privileged, as only then can you access
the TPM and the host key.
Let's break this up a bit: let's add "user-scoped" credentials, which
are specific to users. Internally this works by adding another step to
the acquisition of the symmetric encryption key for the credential: if a
"user-scoped" credential is used we'll generate a symmetric encryption
key K as usual, but then we'll use it to calculate
K' = HMAC(K, flags || uid || machine-id || username)
and then use the resulting K' as encryption key instead. This basically
includes the (public) user's identity in the encryption key, ensuring
that the correct key can only be acquired if the right user credentials
are specified.
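A rough sketch of the derivation step, to make the formula concrete.
The hash function (SHA-256), the byte encoding of the fields and the
function name are assumptions for illustration; the commit message only
specifies the HMAC construction and the order of the inputs.

```python
import hashlib
import hmac
import struct


def derive_user_scoped_key(k: bytes, flags: int, uid: int,
                           machine_id: bytes, username: str) -> bytes:
    """Compute K' = HMAC(K, flags || uid || machine-id || username)."""
    # Assumed encoding: little-endian 64-bit flags, little-endian 32-bit UID.
    message = struct.pack("<QI", flags, uid) + machine_id + username.encode("utf-8")
    return hmac.new(k, message, hashlib.sha256).digest()
```

Since the UID, machine ID and user name all feed into the HMAC, the same
K yields a different K' for every user, and decryption with the wrong
user identity produces the wrong key.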
Yu Watanabe [Sat, 27 Jan 2024 18:27:41 +0000 (03:27 +0900)]
nspawn: resolve network interface names before moving to container network namespace
To work around a kernel issue fixed by
https://github.com/torvalds/linux/commit/8e15aee621618a3ee3abecaf1fd8c1428098b7ef,
let's resolve the provided interface names earlier, and adjust the
interface name pairs with the result.
Yu Watanabe [Sat, 27 Jan 2024 17:49:22 +0000 (02:49 +0900)]
sd-netlink: unify network interface name getter and resolvers
This makes rtnl_resolve_interface() always check that the resolved
interface exists. Previously it skipped that check when a
decimal-formatted ifindex was provided, e.g. "1" or "42".
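The behavior change can be illustrated with a short sketch built on
Python's socket module instead of sd-netlink. The function name and the
injectable lookup parameters are hypothetical; they only exist to make
the logic easy to follow and to test.

```python
import socket


def resolve_interface(spec: str,
                      index_to_name=socket.if_indextoname,
                      name_to_index=socket.if_nametoindex) -> int:
    """Resolve an interface name or decimal ifindex to an existing ifindex."""
    if spec.isascii() and spec.isdigit():
        ifindex = int(spec)
        if ifindex <= 0:
            raise ValueError("ifindex must be positive")
        # Verify existence for the numeric case too; the lookup is
        # expected to raise OSError if there is no such interface.
        index_to_name(ifindex)
        return ifindex
    # The name lookup fails by itself if the interface does not exist.
    return name_to_index(spec)
```

With the default lookups, resolve_interface("42") fails on a host that
has no interface with index 42 instead of returning 42 unchecked.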