Ondrej Kozina [Tue, 16 Mar 2021 19:13:28 +0000 (20:13 +0100)]
Add support for systemd-tpm2 libcryptsetup plugin.
Add support for systemd-tpm2 based LUKS2 device activation
via libcryptsetup plugin. This make the feature (tpm2 sealed
LUKS2 keyslot passphrase) usable from both systemd utilities
and cryptsetup cli.
The feature is configured via -Dlibcryptsetup-plugins combo
with default value set to 'auto'. It get's enabled automatically
when cryptsetup 2.4.0 or later is installed in build system.
This is not called from the systemd.triggers or systemd.macros files. Instead,
it would be called from the scriptlets in systemd rpm package itself, at the
place where we call systemctl daemon-reexec.
See https://github.com/systemd/systemd/pull/20289#issuecomment-885622200 .
rpm: restart user services at the end of the transaction
This closes an important gap: so far we would reexecute the system manager and
restart system services that were configured to do so, but we wouldn't do the
same for user managers or user services.
The scheme used for user managers is very similar to the system one, except
that there can be multiple user managers running, so we query the system
manager to get a list of them, and then tell each one to do the equivalent
operations: daemon-reload, disable --now, set-property Markers=+needs-restart,
reload-or-restart --marked.
The total time that can be spend on this is bounded: we execute the commands in
parallel over user managers and units, and additionally set SYSTEMD_BUS_TIMEOUT
to a lower value (15 s by default). User managers should not have too many
units running, and they should be able to do all those operations very
quickly (<< 1s). The final restart operation may take longer, but it's done
asynchronously, so we only wait for the queuing to happen.
The advantage of doing this synchronously is that we can wait for each step to
happen, and for example daemon-reloads can finish before we execute the service
restarts, etc. We can also order various steps wrt. to the phases in the rpm
transaction.
When this was initially proposed, we discussed a more relaxed scheme with bus
property notifications. Such an approach would be more complex because a bunch
of infrastructure would have to be added to system manager to propagate
appropriate notifications to the user managers, and then the user managers
would have to wait for them. Instead, now there is no new code in the managers,
all new functionality is contained in src/rpm/. The ability to call 'systemctl
--user user@' makes this approach very easy. Also, it would be very hard to
order the user manager steps and the rpm transaction steps.
Note: 'systemctl --user disable' is only called for a user managers that are
running. I don't see a nice way around this, and it shouldn't matter too much:
we'll just leave a dangling symlink in the case where the user enabled the
service manually.
Some rpms install a bunch of units… It seems nicer to invoke them all in
parallel. In particular, timeouts in systemctl also run in parallel, so if
there's some communication mishap, we will wait less.
rpm: use a helper script to actually invoke systemctl commands
Instead of embedding the commands to invoke directly in the macros,
let's use a helper script as indirection. This has a couple of advantages:
- the macro language is awkward, we need to suffix most commands by "|| :"
and "\", which is easy to get wrong. In the new scheme, the macro becomes
a single simple command.
- in the script we can use normal syntax highlighting, shellcheck, etc.
- it's also easier to test the invoked commands by invoking the helper
manually.
- most importantly, the logic is contained in the helper, i.e. we can
update systemd rpm and everything uses the new helper. Before, we would
have to rebuild all packages to update the macro definition.
This raises the question whether it makes sense to use the lua scriptlets when
the real work is done in a bash script. I think it's OK: we still have the
efficient lua scripts that do the short scripts, and we use a single shared
implementation in bash to do the more complex stuff.
The meson version is raised to 0.47 because that's needed for install_mode.
We were planning to raise the required version anyway…
glibc master uses getrandom in malloc since https://sourceware.org/git/?p=glibc.git;a=commit;h=fc859c304898a5ec72e0ba5269ed136ed0ea10e1 , getrandom should be in the default set so to avoid all non trivial programs to fallback to a PRNG.
Add variant of close_all_fds() that does not allocate and use it in freeze()
Even though it's just a fallback path, let's not be sloppy and allocate in
the crash handler.
> The deadlock happens because systemd crash in malloc() then in signal
> handler, it calls malloc() (close_all_fds()-> opendir()-> __alloc_dir())
> again. malloc() is not a signal-safe function, maybe we should re-think
> the logic here.
Currently it's only used in two places in src/shared/, so the function was
already included just once in compiled code. But it seems appropriate to
move it there anyway, because library code should have no need to fork
agents, so it doesn't belong in basic/.
We need a sorted list of fds to skip over when closing. We would allocate a
copy of the passed array to do the sort. But all callers construct a temporary
array to pass to us, so it is pointless to copy it again.
close_all_fds/safe_fork_full/namespace_fork/fork_agent are changed to pass
a non-const int array. I checked all users, and all callers are fine with
the array being sorted.
The function was returning some number (sometimes 1, sometimes the extent
of the range passed over to close_range(), ???). Anyway, all callers only
check for error, so let's return 0 on success.
man: stop recommending putting myhostname after dns
nss-resolve also looks in /etc/hosts, and has the same local hostname
resolving logic as nss-myhostname. We shouldn't recommend another order
than nss-resolve uses internally.
When nss-resolve is used, there's no possibility to override
nss-myhostname hosts via DNS *anyway*.
On top of that, it's not a good idea to allow DNS to override local
hostnames as all - at least not something we should advertise in the
docs.
Luca BRUNO [Thu, 8 Jul 2021 09:47:32 +0000 (09:47 +0000)]
docs: move /var/log/README to a tmpfiles.d symlink
This moves the /var/log/README content out of /var and into the
docs location, replacing the previous file with a symlink
created through a tmpfiles.d entry.
sd-bus: fix missing initializer in SD_BUS_VTABLE_END (#20253)
When two fields were added to the vtable.x.start struct, no initializers
for these were added to SD_BUS_VTABLE_END which also (ab)used that
struct (albeit sneakily by using non-designated initialization).
While C tolerates this, C++ prohibits these missing initializers, and
both g++ and clang++ will complain when using -Wextra.
This patch gives SD_BUS_VTABLE_END its own case in the union and
clarifies its initialization.
I tested the behaviour of g++ 10.2 and clang 11 in various cases. Both will warn
(-Wmissing-field-initializers, implied by -Wextra) if you provide initializers for some
but not all fields of a struct. Declaring x.end as empty struct or using an empty initializer
{} to initialize the union or one of its members is valid C++ but not C, although both gcc
and clang accept it without warning (even at -Wall -Wextra -std=c90/c++11) unless you
use -pedantic (which requires -std=c99/c++2a to support designated initializers).
Interestingly, .x = { .start = { 0, 0, NULL } } is the only initializer I found for the union
(among candidates for SD_BUS_VTABLE_END) where gcc doesn't zero-fill it entirely
when allocated on stack, it looked like it did in all other cases (I only examined this on
32-bit arm). clang always seems to initialize all bytes of the union.
[zjs: test case:
$ cat vtable-test.cc
#include "sd-bus.h"
rpm: don't specify the full path for systemctl and other commands
We can make things a bit simpler and more readable by not specifying the path.
Since we didn't specify the full path for all commands (including those invoked
recursively by anythign we invoke), this didn't really privide any security or
robustness benefits. I guess that full paths were used because this style of
rpm packagnig was popular in the past, with macros used for everything
possible, with special macros for common commands like %{__ln} and %{__mkdir}.
We would copy "forever" into the buffer. This is a fairly common case, so let's
do a microoptimization and return a static string. (All callers use the return
pointer, so this works just as well.)
The prefix "for " was not displayed, because the pointer to the part of the
buffer after "for " was returned. (Maybe it's just me, but I find strpcpy()
and associated functions really hard to use… I always have to look up what the
do exactly and what the return value is.)
network: configure address with requested lifetime
When assigning the same address provided by a dynamic addressing
protocol, the new lifetime is stored on Request::Address, but not
Address object in Link object, which can be obtained by address_get().
So, we need to configure address with Address object in Request.
manager: print status text of the service when waiting for a job
This does two semi-independent but interleaved things: firstly, the manager now
prints the status text (if available) of a service when we have a job running
for that service and it is slow. Because it's hard to fit enough info on the
line, we only do this if the output mode uses unit names. The format of the
line "… job is running for …" is changed to be shorter. This way we can
somewhat reasonably fit two status messages on one line.
Secondly, the manager now sends more information using sd_notify. This mostly
matters for in case of the user manager. In particular, we notify when starting
one of the special units. Without this, when the system manager would display a
line about waiting for the user manager, it would show status like "Ready.",
which is confusing. Now it'll either show something like "Started special unit
shutdown.target", or the line about waiting for a user job.
Also, the timeouts for the user manager are lowered: the user manager usually
(always?) has status disabled, so we would wait for 25 seconds before showing
job progress. Normally we don't expect to have any jobs that take more than a
second. So let's start the progress output fairly quickly, like we would if
status showing was enabled. This obviously makes the output in the system
manager about the user manager more useful. The timeouts are "desynchronized"
by a fraction so if there are multiple jobs running, we'll cycle through
showing all combinations.
We would send READY=1,STATUS="Startup finished in …" once after finishing
boot. This changes the message to just "Ready.". The time used to reach
readiness is not part of the ongoing status — it's just a bit of debug
information that it useful in some scenarious, but completely uninteresting
most of the time. Also, when we start sending status about other things in
subsequent patches, we can't really go back to showing "Startup finished in …"
later on. So let's just show "Ready." whenever we're in the steady state.
In manager_check_finished(), more steps are skipped if MANAGER_IS_FINISHED().
Those steps are idempotent, but no need to waste cycles trying to do them
more than once.
We'll now also check whether to send the status message whenever the job queue
runs empty. If we already sent the exact same message already, we'll not send
again.
manager: always log when starting a "special unit"
This is the initiatation of the machine shutdown/reboot/etc, so it's
useful to log about this. We log about the steps that we take, but
so far we didn't really log why we started the sequence (except at
debug level).
The function is renamed, because we also use it for dbus.service,
not just targets.
The man page says asprintf() pointer is "undefined" on error, but the
only meaningful interpretation is that it's either NULL or points to
something that should be freed with free().
Format output in a manner that can be copypasted as-is to NEWS.
That is, with 8 spaces indentation and wrapped at 80 columns.
Before:
$ tools/git-contrib.sh
Ben Stockett,
Carl Lei,
Frantisek Sumsal,
Gibeom Gwon,
Hugo Osvaldo Barrera,
James Hilliard,
Jan Palus,
Lennart Poettering,
Luca Boccassi,
Luca BRUNO,
Mike Gilbert,
nassir90,
nl6720,
Raul Tambre,
Yegor Alexeyev,
Yu Watanabe,
Zbigniew Jędrzejewski-Szmek,
After:
Contributions from: Ben Stockett, Carl Lei, Frantisek Sumsal,
Gibeom Gwon, Hugo Osvaldo Barrera, James Hilliard, Jan Palus,
Lennart Poettering, Luca Boccassi, Luca BRUNO, Mike Gilbert,
nassir90, nl6720, Raul Tambre, Yegor Alexeyev, Yu Watanabe,
Zbigniew Jędrzejewski-Szmek
network: dhcp4: also support semi-static routes with Gateway=_dhcp4 when UseGateway=no or UseRoutes=no
This makes the default gateway is read from classless static routes or
router option even if UseGateway=no or UseRoutes=no, and will be used
when configuring semi-static routes such that specified with Gateway=_dhcp4.
This also changes the behavior of RoutesToDNS= or RoutesToNTP=.
Previously, the DNS or NTP servers are not in the same network, then the
routes to the servers were not configured when UseGateway=no or
UseRoutes=no. With this commit, the default gateway in classless static
routes or router option will used to connecting the servers even if
UseGateway=no or UseRoutes=no.
It's already listed along with others (Tunnel, VLAN, etc.) and its description matches those. The duplication was introduced by commit c3006a485c9c35c0ab947479ff1dd7149fda9750.
network: check the received interface name is actually new
For some reasons I do not know, on interface renaming, kernel once send
netlink message with old interface name, and then send with new name.
If eth0 is renamed, and then new interface appears as eth0, then the
message with the old name 'eth0' makes the interface enters failed
state.
To ignore such invalid(?) rename event messages, let's confirm the
received interface name.
systemctl: show error when help for unknown unit is requested
Fixes #20189. We would only log at debug level and return failure, which looks
like a noop for the user.
('help' accepts multiple arguments and will show multiple concatenated man
pages in that case. Actually, it will also show multiple concatenated man pages
if the Documentation= setting lists multiple pages. I don't think it's very
terribly useful, but, meh, I don't think we can do much better. If a user
requests a help for a two services, one known and one unknown, there'll now be
a line in the output. It's not very user friendly, but not exactly wrong too.)
Ben Stockett [Fri, 9 Jul 2021 20:29:36 +0000 (20:29 +0000)]
Updated manpage for sd_bus_set_property
Updated manpage for sd_bus_set_property and sd_bus_set_propertyv. In the old manpage, these functions included the parameter sd_bus_message **reply when the actual function had no such argument.
We still sometimes try to grep an empty strace log because strace is not
yet properly initialized. Let's make the check a bit clever and wait
until strace is attached to PID 1 by checking the `TracerPid` field in
`/proc/1/status`.
cunescape() sets output on success, so initialization is not necessary. There
was no comment, but I think they may have been added because the compiler
wasn't convinced that the return value is non-negative on success. It could
have been confused by the int return type on escape*(), which was changed by
the one of preceeding commits to ssize_t, or by the length calculation, so add
an assert to help the compiler.
For some reason coverity thinks the output can be leaked here (CID #1458111).
I don't see how.
We can't say free_and_replace(exec_split[n++], quoted), because the the
argument is evaluated multiple times. But I think that this form is
still easier to read.
test: bump the test timeout to give ldconfig.service enough time to finish
Sometimes the ldconfig.service might take a bit longer to finish,
causing spurious test timeouts:
```
[ 1025.858923] systemd[24]: ldconfig.service: Executing: /sbin/ldconfig -X
...
[ 1043.883620] systemd[1]: ldconfig.service: Main process exited, code=exited, status=0/SUCCESS (success)
...
Trying to halt container. Send SIGTERM again to trigger immediate
termination.
Container TEST-52-HONORFIRSTSHUTDOWN terminated by signal KILL.
E: Test timed out after 20s
```