As described in https://github.com/systemd/systemd/issues/31235, the preset
state for systemd-homed-activate.service was unclear. On the one hand, we have
a preset with 'enable systemd-homed.service', and systemd-homed.service has
'Also=systemd-homed-activate.service systemd-homed-firstboot.service', so
'preset systemd-homed.service' would also enable those two services, but
'preset systemd-homed-activate.service' would disable it, because the presets
don't say it is enabled. It seems that this configuration is internally
inconsistent. As described in the issue, maybe systemctl should be smarter
here, or warn about such configs. Either way, let's make our config consistent.
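
The inconsistency described above can be illustrated with a small checker.
This is only a sketch under stated assumptions, not how systemctl evaluates
presets: the helper names are hypothetical, rules are modelled as
first-match-wins (action, glob) pairs, and the trailing catch-all
'disable *' rule is assumed for the example.

```python
from fnmatch import fnmatchcase

# Assumed example data: preset rules in file order, plus the Also= list
# quoted in the commit message. The catch-all rule is an assumption.
PRESET_RULES = [
    ("enable", "systemd-homed.service"),
    ("disable", "*"),
]
ALSO_MAP = {
    "systemd-homed.service": [
        "systemd-homed-activate.service",
        "systemd-homed-firstboot.service",
    ],
}


def preset_action(unit, rules):
    """Return the action of the first rule whose glob matches the unit."""
    for action, pattern in rules:
        if fnmatchcase(unit, pattern):
            return action
    return None  # no rule matched; this sketch takes no stance on defaults


def inconsistent_also(also_map, rules):
    """List (unit, also_unit) pairs where the unit is preset-enabled but a
    unit named in its Also= is not preset-enabled on its own."""
    problems = []
    for unit, also_units in also_map.items():
        if preset_action(unit, rules) != "enable":
            continue
        for other in also_units:
            if preset_action(other, rules) != "enable":
                problems.append((unit, other))
    return problems
```

Adding explicit 'enable' rules for both Also= units ahead of the catch-all
makes inconsistent_also() return an empty list, which is the kind of
consistency this change is after.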
Luca Boccassi [Wed, 7 Feb 2024 00:36:39 +0000 (00:36 +0000)]
portable: add --copy=mixed to copy images and link profiles
This new mode copies resources provided by the client, so that they
remain available for inspect/detach even if the original images are
deleted, but symlinks the profile as that is owned by the OS, so that
updates are automatically applied.
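
The difference between copying and linking can be sketched as follows. This
is a simplified illustration, not portabled's implementation: attach_mixed()
and its parameters are hypothetical, and a single image file and a single
profile file stand in for the real resources.

```python
import os
import shutil


def attach_mixed(image_path, profile_path, dest_dir):
    """Copy the client-provided image, but symlink the OS-owned profile.

    Returns the paths of the copy and of the symlink inside dest_dir.
    """
    os.makedirs(dest_dir, exist_ok=True)
    image_copy = os.path.join(dest_dir, os.path.basename(image_path))
    profile_link = os.path.join(dest_dir, os.path.basename(profile_path))

    # The copy stays usable even if the original image is deleted later.
    shutil.copy2(image_path, image_copy)
    # The symlink keeps following the OS-owned file, so updates show through.
    os.symlink(os.path.abspath(profile_path), profile_link)
    return image_copy, profile_link
```

After such a call, deleting the original image leaves the copy intact, while
rewriting the profile is immediately visible through the link.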
man: mention that preset-all is performed during early boot
The intro of systemd-firstboot is rewritten to make it clearer how it fits into
the big picture. systemd itself sets up the machine ID and applies presets, and
systemd-firstboot.service is used to interactively fill in the blanks.
sd-dhcp6-client: allow setting send-release when client is running
The send-release option only affects the client when it is stopping. There
is no reason to disallow setting this option while the client is
running.
A user might want to delay the decision to send a RELEASE message to
a later stage where the client is already running.
Yu Watanabe [Fri, 2 Feb 2024 04:08:35 +0000 (13:08 +0900)]
network: set 'removing' flag to remembered object
Previously, if address_remove() or friends were called with a temporary
object, the removing flag was set on the temporary object and not on
the remembered object. Hence, e.g. route_is_ready_to_configure() wrongly
judged that an address required for a route was (still) ready, and
networkd failed to configure the route.
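
The bug pattern can be sketched with a toy model. All names here (Address,
Link, remember(), address_is_ready()) are hypothetical stand-ins and do not
mirror networkd's data structures; the point is only that the flag has to
land on the object the link remembers, not on the lookup key passed in.

```python
from dataclasses import dataclass


@dataclass
class Address:
    ip: str
    removing: bool = False


class Link:
    def __init__(self):
        self.addresses = {}  # remembered objects, keyed by IP

    def remember(self, address):
        self.addresses[address.ip] = address

    def address_remove(self, address):
        """Mark the remembered object as being removed.

        'address' may be a temporary object that merely describes which
        address to remove, so look up the remembered one first.
        """
        remembered = self.addresses.get(address.ip)
        if remembered is not None:
            remembered.removing = True
        # A real implementation would send the removal request here.

    def address_is_ready(self, ip):
        remembered = self.addresses.get(ip)
        return remembered is not None and not remembered.removing
```

With this shape, a readiness check for a route that needs the address sees
the removal even when address_remove() was called with a temporary object.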
After the commit, Address objects remembered by Link are always provided by
the kernel. Hence, it is not necessary to set the flag, as it is always
ignored by the kernel, and the kernel sets the flag on notification if it
is necessary.
This is in preparation for https://github.com/systemd/systemd/pull/30360 to be
merged in a future release. As described there:
nscd is known to be racy [1] and it was already deprecated and later dropped
in Fedora a while back [1,2]. We don't need to support obsolete stuff in
systemd, and the cache in systemd-resolved provides a better solution anyway.
Note that our "support" is only the signal to flush the cache that we send at
various points. Nscd itself may still exist; dropping it is a decision to be
made in glibc.
Mike Yuan [Sun, 4 Feb 2024 15:22:46 +0000 (23:22 +0800)]
core: reuse credential dir across start and start-post if populated,
fresh otherwise
Currently, exec_setup_credential() always rewrites all credentials
upon exec_invoke(), i.e. on invocation of each ExecCommand, and within
a single tmpfs instance. This is problematic though:
* When writing each tmp cred file, we essentially double the size
of the credential. Therefore, if one cred is bigger than half
of CREDENTIALS_TOTAL_SIZE_MAX, confusing ENOSPC occurs (see also
https://github.com/systemd/systemd/pull/24734#issuecomment-1925440546)
* Credentials are a unit-wide thing and thus should not change
during the whole lifetime of the main process. However, if e.g.
an on-disk credential or SetCredential= in the unit file
changes between ExecStart= and ExecStartPost=,
the credentials are overwritten when the latter gets to run,
and the already-running main process suddenly sees
completely different creds.
So, let's try to reuse the final cred dir if the main process has started
and the tmpfs has been populated, so that the creds used are stable
across all ExecStart= and ExecStartPost=-s. We still want to retain
the ability of updating creds through ExecStartPre= though, therefore
we forcibly use a fresh cred dir for those. 'Fresh' means to actually
unmount the old tmpfs first, so the first problem goes away, too.
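
The decision described above can be written down as a small function. This
is a sketch of the stated policy only; choose_credential_dir() and its
arguments are hypothetical and do not correspond to functions in the
service manager.

```python
def choose_credential_dir(command_type, main_started, dir_populated):
    """Return "reuse" or "fresh" for the credential directory.

    command_type is the setting the command comes from, e.g. "ExecStartPre",
    "ExecStart" or "ExecStartPost".
    """
    # ExecStartPre= may still update credentials, so it always starts
    # from a fresh directory (the old tmpfs is unmounted first).
    if command_type == "ExecStartPre":
        return "fresh"
    # Once the main process runs on a populated tmpfs, keep what it sees.
    if (command_type in ("ExecStart", "ExecStartPost")
            and main_started and dir_populated):
        return "reuse"
    return "fresh"
```

For example, an ExecStartPost= command that runs after the main process has
started on a populated tmpfs gets "reuse", so the main process never
observes its credentials changing underneath it.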
Felix Riemann [Fri, 2 Feb 2024 17:08:52 +0000 (18:08 +0100)]
cryptenroll: Fix reading keyfile from socket
systemd-cryptenroll uses the READ_FULL_FILE_CONNECT_SOCKET flag when
reading the keyfile to also allow reading it from a socket. But it also
sets the offset to 0, which causes an unnecessary seek to the beginning of
the newly opened keyfile and disables socket support again, as sockets do
not support seeking.
Disable seeking entirely to remove the unneeded seek and restore support
for reading the keyfile from a socket again as with systemd-cryptsetup.
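
Why an explicit offset of 0 breaks socket support can be shown with a short
sketch. read_keyfile() and seek_is_supported() are hypothetical helpers for
illustration, not the read_full_file() API: on Linux, lseek() on a socket
fails with ESPIPE, so a reader that always seeks cannot consume a socket,
whereas one that seeks only when an offset was requested can.

```python
import errno
import os


def seek_is_supported(fd):
    """Return False if the descriptor cannot seek (e.g. a socket or pipe)."""
    try:
        os.lseek(fd, 0, os.SEEK_CUR)
    except OSError as e:
        if e.errno == errno.ESPIPE:
            return False
        raise
    return True


def read_keyfile(fd, offset=None):
    """Read everything from fd, seeking only if an offset was requested."""
    if offset is not None:
        # Raises OSError(ESPIPE) on sockets, even for offset 0.
        os.lseek(fd, offset, os.SEEK_SET)
    chunks = []
    while True:
        chunk = os.read(fd, 4096)
        if not chunk:
            break
        chunks.append(chunk)
    return b"".join(chunks)
```

Calling read_keyfile(fd) on a connected socket returns the key material,
while read_keyfile(fd, offset=0) fails before reading anything.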
Also= lists units which should be enabled/disabled together with the first unit.
But userdbd is independent of homed, we shouldn't e.g. disable it even if homed
is disabled.
load-fragment: set PATH_CHECK_NON_API_VFS flag at various other places
I tried to be conservative here, and hence when in doubt I left the flag off,
but in some cases I really can't see any reason why it would make sense
to specify paths into API VFS, hence add it there, to lock things down
a bit.
parse-helpers: add new PATH_CHECK_NON_API_VFS flag
In various contexts it's a bit icky to allow paths below /proc/, /sys/,
/dev/, i.e. file hierarchies where API VFS are mounted. Let's add a new
flag for path_simplify_and_warn() to check for this and refuse a path if
it lies below one of these hierarchies.
Enable this when parsing WorkingDirectory=.
This is inspired by CVE-2024-21626, which uses trickery around the cwd
and /proc/self/fd/.
AFAICS we are not actually vulnerable to the same issue as explained in
the CVE since we execute the WorkingDirectory= setting very late, i.e.
long after we set up the new mount namespace. But let's rather filter out
icky stuff earlier than later, as an extra safety precaution.
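
The kind of check the new flag performs can be sketched as follows. This is
an illustration, not the code behind path_simplify_and_warn(): the function
name is hypothetical, the check is purely lexical (symlinks are not
resolved), and treating the mount points themselves as refused is an
assumption.

```python
import posixpath

# Hierarchies named in the commit message.
API_VFS_ROOTS = ("/proc", "/sys", "/dev")


def is_below_api_vfs(path):
    """Return True if an absolute path lexically points into API VFS."""
    if not path.startswith("/"):
        raise ValueError("expected an absolute path")
    # normpath() may keep a leading "//", so collapse it explicitly.
    normalized = "/" + posixpath.normpath(path).lstrip("/")
    return any(
        normalized == root or normalized.startswith(root + "/")
        for root in API_VFS_ROOTS
    )
```

A caller parsing a setting such as WorkingDirectory= would refuse the value
when this returns True, e.g. for "/proc/self/fd/7", and accept something
like "/var/lib/foo".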
Luca Boccassi [Fri, 12 Jan 2024 21:32:20 +0000 (21:32 +0000)]
core: add support for pidfd_spawn
pidfd_spawn() was added in glibc 2.39. It allows cloning directly into a
cgroup and getting a pidfd back instead of a PID. This removes race
conditions both when changing cgroups and when getting a reliable
reference to the child process.
We already use __VA_OPT__ in multiple places, which was introduced in
gcc 8 [0], so let's bump the baseline to reflect that. I chose gcc 8.4,
as that was the lowest 8.x version I could easily get my hands on when I
verified this (on Ubuntu Focal with the gcc-8 package).
Mike Yuan [Sun, 4 Feb 2024 11:36:06 +0000 (19:36 +0800)]
core/service: don't setup credentials for ExecCondition= and ExecReload=
This seems to be a mistake in #27279. I believe credentials should
not be made available to condition or reload tasks. In most cases
they're irrelevant to the actual job of the service. Also, currently
the first ExecCondition= or ExecReload= cannot access creds anyway,
making the incompatibility introduced negligible.
If people actually come up with valid use cases, we can always
revisit this.
Ivan Shapovalov [Sat, 20 Jan 2024 11:52:28 +0000 (12:52 +0100)]
nspawn: permit --ephemeral with --link-journal=try-* (treat as =no)
Common sense says that to "try" something means "to not fail if
something turns out not to be possible", thus do not make this
combination a hard error.
The actual implementation ignores any --link-journal= setting when
--ephemeral is in effect, so the semantics are upheld.
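
The resulting option handling can be sketched as below. This is a reading
of the commit message, not nspawn's parser: effective_link_journal() is a
hypothetical helper, and accepting "auto" together with --ephemeral is an
assumption made for the example.

```python
VALID_MODES = ("no", "host", "try-host", "guest", "try-guest", "auto")


def effective_link_journal(mode, ephemeral):
    """Return the --link-journal= mode that takes effect, or raise."""
    if mode not in VALID_MODES:
        raise ValueError("unknown --link-journal= mode: %s" % mode)
    if not ephemeral:
        return mode
    # With --ephemeral, "try" means: don't fail if linking is impossible.
    # Treating "auto" the same way is an assumption of this sketch.
    if mode in ("no", "auto") or mode.startswith("try-"):
        return "no"
    raise ValueError(
        "--ephemeral and --link-journal=%s may not be combined" % mode
    )
```

So try-host and try-guest quietly degrade to "no" under --ephemeral, while
the strict host and guest modes remain a hard error.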