This implements a minimal subset of #24961, but in a lot more
restrictive way: we only allow one level of subcgroup (as that's enough
to address the no-processes in inner cgroups rule), and does not change
anything about threaded cgroup logic or similar, or make any of this new
behaviour mandatory.
All this does is this: all non-control processes we invoke for a unit
we'll invoke in a subgroup by the specified name.
We'll later port all our current services that use cgroup delegation
over to this, i.e. user@.service, systemd-nspawn@.service and
systemd-udevd.service.
Let's clean up validation/escaping of cgroup names. i.e. split out code
that tests if name needs escaping. Return proper error codes, and extend
test a bit.
suid binaries and device nodes should not be placed there, hence forbid
it.
Of all the API VFS we mount from PID 1 or via a unit file this one is
the only one where we didn't add MS_NODEV/MS_NOSUID. Let's address that,
since there's really no reason why device nodes or suid binaries would
be placed in hugetlbfs.
image-policy: split out code that "extends" underspecified partition policy flags
When encoding partition policy flags we allow parts of the flags to be
"unspecified" (i.e. entirely zeros), which when actually checking the
policy we'll automatically consider equivalent to "any" (i.e. entirely
ones). This "extension" of the flags was so far done as part of
partition_policy_normalized_flags(). Let's split this logic out into a
new function partition_policy_flags_extend() that simply sets all bits
in a specific part of the flags field if they were entirely zeroes so
far.
When comparing policy objects for equivalence we so far used
partition_policy_normalized_flags() to compare the per-designator flags,
which thus meant that "underspecified" flags, and fully specified ones
that are set to "any" were considered equivalent. Which is great.
However, we forgot to do that for the fallback policy flags, the flags
that apply to all partitions for which no explicit policy flags are
specified.
Let's use the new partition_policy_flags_extend() call to compare them
in extended form, so that there two we can hide the difference between
"underspecified" and "any" flags.
ukify supports signing with multiple keys, so show an example of this, and just
let ukify print the calls to systemd-measure that will be done.
This also does other small cleanups:
- Use more realistic names in examples
- Use $ as the prompt for commands that don't require root (most don't).
Once we switch to operations that don't require a TPM, we should be able to get
rid of the remaining calls that require root.
- Ellipsize or linebreak various parts
- Use --uname. We warn if it is not specified and we have to do autodetection, so
let's nudge people towards including it rather than not.
core/job: use new job ID when we failed to deserialize job ID
This is for the case when we fail to deserialize job ID.
In job_install_deserialized(), we also check the job type, and that is
for the case when we failed to deserialize the job.
Let's gracefully handle the failure in deserializing the job ID.
This is paranoia, and just for safety. Should not change any behavior.
core/transaction: use hashmap_remove_value() to make not remove job with same ID
When we fail to deserialize job ID, or the current_job_id is overflowed,
we may have jobs with the same ID.
This is paranoia, and just for safety.
Note, we already use hashmap_remove_value() in job_uninstall().
We translate 'all' to UNIT64_MAX, which has a lot more 'f's. Use the
helper macro, since a decimal uint64_t will always be >> than a hex
representation.
root@image:~# systemd-run -t --property CoredumpFilter=all ls /tmp
Running as unit: run-u13.service
Press ^] three times within 1s to disconnect TTY.
*** stack smashing detected ***: terminated
[137256.320511] systemd[1]: run-u13.service: Main process exited, code=dumped, status=6/ABRT
[137256.320850] systemd[1]: run-u13.service: Failed with result 'core-dump'.
As described in systemd/systemd#27204 reexecuting the daemon while
running in a systemd-run "session" causes the session end prematurely.
Let's skip the Reexecute() method in dfuzzer and trigger it manually
until the issue is resolved.
journal: Don't try to write garbage if journal entry is corrupted
If journal_file_data_payload() returns -EBADMSG or -EADDRNOTAVAIL,
we skip the entry and go to the next entry, but we never modify
the number of items that we pass to journal_file_append_entry_internal()
if that happens, which means we could try to append garbage to the
journal file.
Let's keep track of the number of fields we've appended to avoid this
problem.
locale: when no xvariant match select the entry with an empty xvariant
When doing a conversion and the specified 'xc->xvariant' has no match, select
the x11 layout entry with a matching layout and an empty xvariant if such entry
exists. It's still better than no conversion at all.
"credential provided widget" would be better spelled as "credential-provided widget".
But let's adjust the message to name the bad credential explicitly: this
makes it easier to fix for the user.
shared/creds-util: return 0 for missing creds in read_credential_strings_many
Realistically, the only thing that the caller can do is ignore failures related
to missing credentials. If the caller requires some credentials to be present,
they should just check which output variables are not NULL. One of the callers
was already doing that, and the other wanted to, but missed -ENOENT. By
suppressing -ENOENT and -ENXIO, both callers are simplified.
Fixes a warning at boot:
systemd-vconsole-setup[221]: Failed to import credentials, ignoring: No such file or directory
This will make things a bit longer for now, but more powerful as we can
reuse the userns fd between calls to remount_idmap() if we need to
adjust multiple mounts.
No change in behaviour, just some minor refactoring.
I guess it was only a question of time until we need to add the final
frontier of notification functions: one that combines the features of
all the others:
1. specifiying a source PID
2. taking a list of fds to send along
3. accepting a format string for the status string
pam: do not attempt to close sd-bus after fork in pam_end()
When pam_end() is called after a fork, and it cleans up caches, it sets
PAM_DATA_SILENT in error_status. FDs will be shared with the parent, so
we do not want to attempt to close them from a child process, or we'll
hit assertions. Complain loudly and skip.
it's a bit confusing that on 32bit systems we'd risk session IDs
overruns like this. Let's expose the same behaviour everywhere and stick
to 64bit ids.
Since we format the ids as strings anyway this doesn't really change
anything performance-wise, it just pushes out collisions by overrun to
basically never happen.
sd-event: store and compare per-module static origin id
sd-event objects use hashmaps, which use module-global state, so it is not safe
to pass a sd-event object created by a module instance to another module instance
(e.g.: when two libraries static linking sd-event are pulled in a single process).
Initialize a random per-module origin id and store it in the object, and compare
it when entering a public API, and error out if they don't match, together with
the PID.
sd-journal: store and compare per-module static origin id
sd-journal objects use hashmaps, which use module-global state, so it is not safe
to pass a sd-journal object created by a module instance to another module instance
(e.g.: when two libraries static linking sd-journal are pulled in a single process).
Initialize a random per-module origin id and store it in the object, and compare
it when entering a public API, and error out if they don't match, together with
the PID.
sd-bus: store and compare per-module static origin id
sd-bus objects use hashmaps, which use module-global state, so it is not safe
to pass a sd-bus object created by a module instance to another module instance
(e.g.: when two libraries static linking sd-bus are pulled in a single process).
Initialize a random per-module origin id and store it in the object, and compare
it when entering a public API, and error out if they don't match, together with
the PID.
Wolfgang Müller [Mon, 24 Apr 2023 18:00:56 +0000 (20:00 +0200)]
cryptsetup-fido2: Depend on libcryptsetup
crypsetup-fido2 always depended on both libfido2 and libcryptsetup, but 0a8e026e825dda142a8f1552a4b45815cbfd0b48 forgot to make the then
implicit dependency on libcryptsetup explicit when moving it from
cryptsetup/ to shared/. This breaks builds when libfido2 is autodetected
but the system is missing libcryptsetup.
Introduce an explicit check for HAVE_LIBCRYPTSETUP such that
cryptsetup-fido2 is only built when both libraries are available.
homed: rename make_userns() to avoid name conflict with mount-util.[ch]
This doesn't really matter too much as both are static functions. But
it's confusing as hell both when debugging and reading code, given that
homed actually uses mount-util.c
Hence, let's just rename one of the two, to minimize confusion.
No actual change in behaviour.
(and sooner or later we might want to export mount-util.c's version of
the function, since it's generically useful)
locale: convert generated vconsole keymap to x11 layout automatically
When doing x11->console conversions, find_converted_keymap() searches
automatically for a candidate in the converted keymap directory for a given x11
layout.
However doing console->x11 conversions, this automatic search is not done hence
simple conversion in this direction can't be achieved without populating
kbd-model-map with entries for converted keymaps.
For example, let's consider "at" layout which is not part of kbd-model-map. The
"at" x11 layout has a generated keymap
"/usr/share/kbd/keymaps/xkb/at.map.gz". If we configure "at" for the x11
layout, localed is able to automatically find the "at" converted vc layout and
the conversion just works :
$ localectl set-x11-keymap at
$ localectl
System Locale: LANG=en_US.UTF-8
VC Keymap: at
X11 Layout: at
However in the opposite direction, ie when setting the vc keymap to "at", no
conversion is done and the x11 layout is not defined:
$ localectl set-keymap at
$ localectl
System Locale: LANG=en_US.UTF-8
VC Keymap: at
X11 Layout: (unset)
This patch fixes this limitation as the implemenation is relatively simple and
it removes the need to populate kbd-model-map with (many) entries for converted
keymaps. However the patch doesn't remove the existing entries in kbd-model-map
which became unneeded after this change to be on the safe side.
Note: by default the automatically generated x11 keyboard configs use keyboard
model "microsoftpro" which should be equivalent to "pc105" model but with the
internet/media key mapping added.
When we're checking if /etc/resolv.conf exists so we can bind mount
on top of it, we care about whether the symlink itself exists if
/etc/resolv.conf exists and not the file it points to, so add
CHASE_NOFOLLOW to make sure we check existence of the symlink and
not the file it points to.