Let's install a simple no-op signal handler without SA_RESTART for
SIGINT/SIGTERM, which allows us to interrupt read_one_char() and follow
it up with a proper cleanup, including restoring the vt to the original
state.
Introduce a new env variable $SYSTEMD_NSPAWN_CHECK_OS_RELEASE, that can
be used to disable the os-release check for bootable OS trees. Useful
when trying to boot a container with empty /etc/ and bind-mounted /usr/.
tpm2-util: add generic helpers for sealing/unsealing data
These helpers tpm2_seal_data()/tpm2_unseal_data() are useful for
sealing/unsealing data without any further semantics around them. This
is different from the existing tpm2_seal()/tpm2_unseal() which seal with
a specific policy and serialize in a specific way, as we use it for disk
encryption.
These new helpers are more generic, they do not serialize in a specific
way or imply policy, they are just the core of the sealing/unsealing.
(We should look into porting tpm2_seal()/tpm2_unseal() onto these new
helpers, but this isn#t trivial, since the classic serialization we use
uses a merged marshalling of private/public key, which we'd have to
change in one way or another)
tpm2-util: add helper for creating/removing/updating NV index with stored policy
This is the primary core of what pcrlock is supposed to do eventually:
maintain a TPM2 policy hash inside an NV index which we then can
reference via a PolicyAuthorizeNV expression to lock other objects
against it.
tpm2-util: add helpers for marshalling public/private keys
Note: we export these new symbols for now. A later commit in this PR
will make them static again. The only reason they are exported here is
to make sure gcc doesn't complain about unused static symbols, and I
really wanted to commit them in a separate commit.
cryptsetup: pass AskPasswordFlags down into pkcs11 module
The pkcs11 cryptsetup token module is a bit different from the tpm2 +
fido2 ones: it asks for the PIN itself, rather than bubbling up a
request to get a PIN. That's because it might need multiple, and because
we don't want to destroy a the pkcs11 session half-way and thus risk
increasing pin counters.
Hence, we sometimes ask for PINs from our code, rather than let the
libcryptsetup caller do that. So far we didn't pass the AskPasswordFlags
field down into the module though. Fix that.
Yu Watanabe [Thu, 2 Nov 2023 05:12:42 +0000 (14:12 +0900)]
network: add meson option to rename .example files on install
Also this renames 80-ethernet.network.example -> 89-ethernet.network.example,
to make it have lower precedence over other default .network files for
Ethernet interfaces.
cryptsetup: disable activation via token plugin if we shall measure the volume key
if we allow cryptsetup to activate a volume via token plugin we never
get access to the volume key, which we'd like to measure. Hence disable
token plugins in that case.
(I tempted to say we probably should disable them entirely, and only use
them if classic cryptsetup is used, but that's a discussion for another
day.)
But no luck, it was suggested I use ELF interposition instead. Hence,
let's do so (but not via ugly LD_PRELOAD, but simply by overriding the
relevant symbol natively in our own code).
This implements a "storage target mode", similar to what MacOS provides
since a long time as "Target Disk Mode":
https://en.wikipedia.org/wiki/Target_Disk_Mode
This implementation is relatively simple:
1. a new generic target "storage-target-mode.target" is added, which
when booted into defines the target mode.
2. a small tool and service "systemd-storagetm.service" is added which
exposes a specific device or all devices as NVMe-TCP devices over the
network. NVMe-TCP appears to be hot shit right now how to expose
block devices over the network. And it's really simple to set up via
configs, hence our code is relatively short and neat.
The idea is that systemd-storagetm.target can be extended sooner or
later, for example to expose block devices also as USB mass storage
devices and similar, in case the system has "dual mode" USB controller
that can also work as device, not just as host. (And people could also
plug in sharing as NBD, iSCSI, whatever they want.)
How to use this? Boot into your system with a kernel cmdline of
"rd.systemd.unit=storage-target-mode.target ip=link-local", and you'll see on
screen the precise "nvme connect" command line to make the relevant
block devices available locally on some other machine. This all requires
that the target mode stuff is included in the initrd of course. And the
system will the stay in the initrd forever.
Why bother? Primarily three use-cases:
1. Debug a broken system: with very few dependencies during boot get
access to the raw block device of a broken machine.
2. Migrate from system to another system, by dd'ing the old to the new
directly.
3. Installing an OS remotely on some device (for example via Thunderbolt
networking)
(And there might be more, for example the ability to boot from a
laptop's disk on another system)
Limitations:
1. There's no authentication/encryption. Hence: use this on local links
only.
2. NVMe target mode on Linux supports r/w operation only. Ideally, we'd
have a read-only mode, for security reasons, and default to it.
Future love:
1. We should have another mode, where we simply expose the homed LUKS
home dirs like that.
2. Some lightweight hookup with plymouth, to display a (shortened)
version of the info we write to the console.
udevadm-lock: switch things over to lock_generic_with_timeout()
This replaces the local implementation of a timeout file lock with our
new generic one.
Note that a comment in the old code claimed we couldn't use alarm()-like timeouts,
but htat's not entirely true: we can if we use SIGKILL, and thus know
for sure that the process will be dead in case the timer is hit before
we actually enter the file lock syscall. But we also know it will be
delivered if we hit after.
lock-util: add a new lock_generic_with_timeout() helper
This is just like lock_generic(), but applies the lock with a timeout.
This requires jumping through some hoops by executing things in a child
process, so that we can abort if necessary via a timer. Linux after all
has no native way to take file locks with a timeout.
process-util: add new FORK_DEATHSIG_SIGKILL flag, rename FORK_DEATHSIG → FORK_DEATHSIG_SIGTERM
Sometimes it makes sense to hard kill a client if we die. Let's hence
add a third FORK_DEATHSIG flag for this purpose: FORK_DEATHSIG_SIGKILL.
To make things less confusing this also renames FORK_DEATHSIG to
FORK_DEATHSIG_SIGTERM to make clear it sends SIGTERM. We already had
FORK_DEATHSIG_SIGINT, hence this makes things nicely symmetric.
A bunch of users are switched over for FORK_DEATHSIG_SIGKILL where we
know it's safe to abort things abruptly. This should make some kernel
cases more robust, since we cannot get confused by signal masks or such.
While we are at it, also fix a bunch of bugs where we didn't take
FORK_DEATHSIG_SIGINT into account in safe_fork()
Luca Boccassi [Thu, 2 Nov 2023 11:01:23 +0000 (11:01 +0000)]
mkosi: explicitly disable KVM in GHA runs
mkosi detects whether /dev/kvm is available and uses it if it is. But
some GHA hosts have it, but it's broken and not supported, so we need
to explicitly disable it.
varlink,json: introduce new varlink_dispatch() helper
varlink_dispatch() is a simple wrapper around json_dispatch() that
returns clean, standards-compliant InvalidParameter error back to
clients, if the specified JSON cannot be parsed properly.
For this json_dispatch() is extended to return the offending field's
name. Because it already has quite a few parameters, I then renamed
json_dispatch() to json_dispatch_full() and made json_dispatch() a
wrapper around it that passes the new argument as NULL. While doing so I
figured we should also get rid of the bad= argument in the short
wrapper, since it's only used in the OCI code.
To simplify the OCI code this adds a second wrapper oci_dispatch()
around json_dispatch_full(), that fills in bad= the way we want.
Net result: instead of one json_dispatch() call there are now:
1. json_dispatch_full() for the fully feature mother of all dispathers.
2. json_dispatch() for the simpler version that you want to use most of
the time.
3. varlink_dispatch() that generates nice Varlink errors
4. oci_dispatch() that does the OCI specific error handling
To avoid timeouts in oss-fuzz. The timeout reported in #29736 happened
with a ~500K test case, so with a conservative 128K limit we should
still be well within a range for any reasonable-ish generated input to
get through, while avoiding timeouts.
resolved: make sure "resolvectl monitor" can properly deal with stub queries
If we receive a query via the two stubs we store the original packet
instead of just the question object. Hence when we send monitor info to
subscribed clients we need to extract its question and also include it
in the returned data.
Since libcs are more interested in the POSIX `fchmodat(3)`, they are
unlikely to provide a direct wrapper for this syscall. Thus, the headers
we examine to set `HAVE_*` are picked somewhat arbitrarily.
Also, hook up `try_fchmodat2()` in `test-seccomp.c`. (Also, correct that
function's prototype, despite the fact that mistake would not matter in
practice)
chase: fix corner case when using CHASE_PARENT with a path ending in ".."
If we use CHASE_PARENT on a path ending in ".." then things are a bit
weird, because we the last path we look at is actually the *parent* and not
the *child* of the preceeding path. Hence we cannot just return the 2nd
to last fd we look at. We have to correct it, by going *two* levels up,
to get to the actual parent, and make sure CHASE_PARENT does what it
should.
Example: for the path /a/b/c chase() with CHASE_PARENT will return
/a/b/c as path, and the fd returned points to /a/b. All good. But now,
for the path /a/b/c/.. chase() with CHASE_PARENT would previously return
/a/b as path (which is OK) but the fd would point to /a/b/c, which is
*not* the parent of /a/b, after all! To get to the actual parent of
/a/b we have to go *two* levels up to get to /a.
Very confusing. But that's what we here for, no?
@mrc0mmand ran into this in https://github.com/systemd/systemd/pull/28891#issuecomment-1782833722