pam-systemd: disconnect bus connection when leaving session hook, even on error
This adds support for systematically destroying connections in
pam_sm_session_open() even on failure, so that under no circumstances
unserved dbus connection are around while the invoking process waits for
the session to end. Previously we'd only do this on success, now do it
in all cases.
This matters since so far we suggested people hook pam_systemd into
their pam stacks prefixed with "-", so that login proceeds even if
pam_systemd fails. This however means that in an error case our
cached connection doesn't get disconnected even if the session then is
invoked. This fixes that.
nspawn: port over to /supervisor/ subcgroup being delegated to nspawn
Let's make use of the new DelegateSubgroup= feature and delegate the
/supervisor/ subcgroup already to nspawn, so that moving the supervisor
process there is unnecessary.
units: make system service manager create init.scope subcgroup for user service manager
This one is basically for free, since the service manager is already
prepared for being invoked in init.scope. Hence let's start it in the
right cgroup right-away.
core: change ownership of subcgroup we create recursively, it shall be owned by the user delegated to
If we create a subcroup (regardless if the '.control' subgroup we
always created or one configured via DelegateSubgroup=) it's inside of
the delegated territory of the cgroup tree, hence it should be owned
fully by the unit's users. Hence do so.
execute: don't apply journal + oomd xattrs to subcgroup
We don't need to apply the journal/oomd xattrs to the subcgroups we add,
since those daemons already look for the xattrs up the tree anyway.
Hence remove this.
This is in particular relevant as it means later changes to the xattr
don#t need to be replicated on the subcgroup either.
This implements a minimal subset of #24961, but in a lot more
restrictive way: we only allow one level of subcgroup (as that's enough
to address the no-processes in inner cgroups rule), and does not change
anything about threaded cgroup logic or similar, or make any of this new
behaviour mandatory.
All this does is this: all non-control processes we invoke for a unit
we'll invoke in a subgroup by the specified name.
We'll later port all our current services that use cgroup delegation
over to this, i.e. user@.service, systemd-nspawn@.service and
systemd-udevd.service.
Let's clean up validation/escaping of cgroup names. i.e. split out code
that tests if name needs escaping. Return proper error codes, and extend
test a bit.
Mike Yuan [Fri, 16 Dec 2022 16:44:06 +0000 (00:44 +0800)]
tmpfiles: add conditionalized execute bit (X) support
According to setfacl(1), "the character X stands for
the execute permission if the file is a directory
or already has execute permission for some user."
After this commit, parse_acl() would return 3 acl
objects. The newly-added acl_exec object contains
entries that are subject to conditionalized execute
bit mangling. In tmpfiles, we would iterate the acl_exec
object, check the permission of the target files,
and remove the execute bit if necessary.
Here's an example entry:
A /tmp/test - - - - u:test:rwX
suid binaries and device nodes should not be placed there, hence forbid
it.
Of all the API VFS we mount from PID 1 or via a unit file this one is
the only one where we didn't add MS_NODEV/MS_NOSUID. Let's address that,
since there's really no reason why device nodes or suid binaries would
be placed in hugetlbfs.
image-policy: split out code that "extends" underspecified partition policy flags
When encoding partition policy flags we allow parts of the flags to be
"unspecified" (i.e. entirely zeros), which when actually checking the
policy we'll automatically consider equivalent to "any" (i.e. entirely
ones). This "extension" of the flags was so far done as part of
partition_policy_normalized_flags(). Let's split this logic out into a
new function partition_policy_flags_extend() that simply sets all bits
in a specific part of the flags field if they were entirely zeroes so
far.
When comparing policy objects for equivalence we so far used
partition_policy_normalized_flags() to compare the per-designator flags,
which thus meant that "underspecified" flags, and fully specified ones
that are set to "any" were considered equivalent. Which is great.
However, we forgot to do that for the fallback policy flags, the flags
that apply to all partitions for which no explicit policy flags are
specified.
Let's use the new partition_policy_flags_extend() call to compare them
in extended form, so that there two we can hide the difference between
"underspecified" and "any" flags.
ukify supports signing with multiple keys, so show an example of this, and just
let ukify print the calls to systemd-measure that will be done.
This also does other small cleanups:
- Use more realistic names in examples
- Use $ as the prompt for commands that don't require root (most don't).
Once we switch to operations that don't require a TPM, we should be able to get
rid of the remaining calls that require root.
- Ellipsize or linebreak various parts
- Use --uname. We warn if it is not specified and we have to do autodetection, so
let's nudge people towards including it rather than not.
core/job: use new job ID when we failed to deserialize job ID
This is for the case when we fail to deserialize job ID.
In job_install_deserialized(), we also check the job type, and that is
for the case when we failed to deserialize the job.
Let's gracefully handle the failure in deserializing the job ID.
This is paranoia, and just for safety. Should not change any behavior.
core/transaction: use hashmap_remove_value() to make not remove job with same ID
When we fail to deserialize job ID, or the current_job_id is overflowed,
we may have jobs with the same ID.
This is paranoia, and just for safety.
Note, we already use hashmap_remove_value() in job_uninstall().
We translate 'all' to UNIT64_MAX, which has a lot more 'f's. Use the
helper macro, since a decimal uint64_t will always be >> than a hex
representation.
root@image:~# systemd-run -t --property CoredumpFilter=all ls /tmp
Running as unit: run-u13.service
Press ^] three times within 1s to disconnect TTY.
*** stack smashing detected ***: terminated
[137256.320511] systemd[1]: run-u13.service: Main process exited, code=dumped, status=6/ABRT
[137256.320850] systemd[1]: run-u13.service: Failed with result 'core-dump'.
As described in systemd/systemd#27204 reexecuting the daemon while
running in a systemd-run "session" causes the session end prematurely.
Let's skip the Reexecute() method in dfuzzer and trigger it manually
until the issue is resolved.
journal: Don't try to write garbage if journal entry is corrupted
If journal_file_data_payload() returns -EBADMSG or -EADDRNOTAVAIL,
we skip the entry and go to the next entry, but we never modify
the number of items that we pass to journal_file_append_entry_internal()
if that happens, which means we could try to append garbage to the
journal file.
Let's keep track of the number of fields we've appended to avoid this
problem.
locale: when no xvariant match select the entry with an empty xvariant
When doing a conversion and the specified 'xc->xvariant' has no match, select
the x11 layout entry with a matching layout and an empty xvariant if such entry
exists. It's still better than no conversion at all.
"credential provided widget" would be better spelled as "credential-provided widget".
But let's adjust the message to name the bad credential explicitly: this
makes it easier to fix for the user.
shared/creds-util: return 0 for missing creds in read_credential_strings_many
Realistically, the only thing that the caller can do is ignore failures related
to missing credentials. If the caller requires some credentials to be present,
they should just check which output variables are not NULL. One of the callers
was already doing that, and the other wanted to, but missed -ENOENT. By
suppressing -ENOENT and -ENXIO, both callers are simplified.
Fixes a warning at boot:
systemd-vconsole-setup[221]: Failed to import credentials, ignoring: No such file or directory
This will make things a bit longer for now, but more powerful as we can
reuse the userns fd between calls to remount_idmap() if we need to
adjust multiple mounts.
No change in behaviour, just some minor refactoring.
I guess it was only a question of time until we need to add the final
frontier of notification functions: one that combines the features of
all the others:
1. specifiying a source PID
2. taking a list of fds to send along
3. accepting a format string for the status string
pam: do not attempt to close sd-bus after fork in pam_end()
When pam_end() is called after a fork, and it cleans up caches, it sets
PAM_DATA_SILENT in error_status. FDs will be shared with the parent, so
we do not want to attempt to close them from a child process, or we'll
hit assertions. Complain loudly and skip.
it's a bit confusing that on 32bit systems we'd risk session IDs
overruns like this. Let's expose the same behaviour everywhere and stick
to 64bit ids.
Since we format the ids as strings anyway this doesn't really change
anything performance-wise, it just pushes out collisions by overrun to
basically never happen.
sd-event: store and compare per-module static origin id
sd-event objects use hashmaps, which use module-global state, so it is not safe
to pass a sd-event object created by a module instance to another module instance
(e.g.: when two libraries static linking sd-event are pulled in a single process).
Initialize a random per-module origin id and store it in the object, and compare
it when entering a public API, and error out if they don't match, together with
the PID.
sd-journal: store and compare per-module static origin id
sd-journal objects use hashmaps, which use module-global state, so it is not safe
to pass a sd-journal object created by a module instance to another module instance
(e.g.: when two libraries static linking sd-journal are pulled in a single process).
Initialize a random per-module origin id and store it in the object, and compare
it when entering a public API, and error out if they don't match, together with
the PID.
sd-bus: store and compare per-module static origin id
sd-bus objects use hashmaps, which use module-global state, so it is not safe
to pass a sd-bus object created by a module instance to another module instance
(e.g.: when two libraries static linking sd-bus are pulled in a single process).
Initialize a random per-module origin id and store it in the object, and compare
it when entering a public API, and error out if they don't match, together with
the PID.
Wolfgang Müller [Mon, 24 Apr 2023 18:00:56 +0000 (20:00 +0200)]
cryptsetup-fido2: Depend on libcryptsetup
crypsetup-fido2 always depended on both libfido2 and libcryptsetup, but 0a8e026e825dda142a8f1552a4b45815cbfd0b48 forgot to make the then
implicit dependency on libcryptsetup explicit when moving it from
cryptsetup/ to shared/. This breaks builds when libfido2 is autodetected
but the system is missing libcryptsetup.
Introduce an explicit check for HAVE_LIBCRYPTSETUP such that
cryptsetup-fido2 is only built when both libraries are available.
homed: rename make_userns() to avoid name conflict with mount-util.[ch]
This doesn't really matter too much as both are static functions. But
it's confusing as hell both when debugging and reading code, given that
homed actually uses mount-util.c
Hence, let's just rename one of the two, to minimize confusion.
No actual change in behaviour.
(and sooner or later we might want to export mount-util.c's version of
the function, since it's generically useful)