Daan De Meyer [Mon, 6 Jun 2022 14:01:20 +0000 (16:01 +0200)]
shared: Rename pcre2-dlopen.h/c to pcre2-util.h/c
We already store the dlopen() stuff for other libraries in util headers
as well so let's do the same for pcre2. We also move the definition of
some trivial cleanup functions from journalctl.c to pcre2-util.h
Daan De Meyer [Fri, 3 Jun 2022 11:29:47 +0000 (13:29 +0200)]
meson: Switch default-locale default to C.UTF-8
We're already using C.UTF-8 as the default locale for nspawn. Let's
make the same change for the default-locale option instead of deciding
what to use based on the locale used by the host system. Users can
still override the locale using the default-locale option if needed.
Franck Bui [Thu, 2 Jun 2022 07:31:55 +0000 (09:31 +0200)]
test: enable virtio-rng device for QEMU guests
If rngd is included in the host initrd, QEMU guests need at least one source of
entropy otherwise rngd will refuse to start. Hence this patch enables the
virtio RNG device in QEMU guests (exposed as a HW RNG device available at
/dev/hwrng).
As a safety measure, the patch limits the data sent to the guest to 1KB per
second in order to not let the guest starve the host entropy.
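As a rough illustration, the QEMU arguments for such a rate-limited device could be assembled as below. The helper name `virtio_rng_args` is hypothetical, and the exact option string used by the test suite may differ; `max-bytes` and `period` (in milliseconds) are the rate-limiting properties of QEMU's `virtio-rng-pci` device, and 1KB is assumed to mean 1024 bytes.

```python
def virtio_rng_args(max_bytes=1024, period_ms=1000):
    """Build QEMU arguments for a rate-limited virtio RNG device.

    Hypothetical helper: the guest may pull at most `max_bytes` of
    entropy from the host every `period_ms` milliseconds.
    """
    return ["-device", f"virtio-rng-pci,max-bytes={max_bytes},period={period_ms}"]
```

The returned list would be appended to the rest of the QEMU command line.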
core: rework variable initialization to avoid gcc warning
In file included from ../src/basic/siphash24.h:11,
from ../src/basic/hash-funcs.h:6,
from ../src/basic/hashmap.h:8,
from ../src/shared/fdset.h:6,
from ../src/shared/bpf-program.h:9,
from ../src/core/unit.h:11,
from ../src/core/all-units.h:4,
from ../src/core/manager.c:23:
../src/basic/time-util.h: In function 'manager_dispatch_jobs_in_progress':
../src/basic/time-util.h:140:38: error: 'x' may be used uninitialized [-Werror=maybe-uninitialized]
140 | #define FORMAT_TIMESPAN(t, accuracy) format_timespan((char[FORMAT_TIMESPAN_MAX]){}, FORMAT_TIMESPAN_MAX, t, accuracy)
| ^~~~~~~~~~~~~~~
In function 'manager_print_jobs_in_progress',
inlined from 'manager_dispatch_jobs_in_progress' at ../src/core/manager.c:3007:9:
../src/core/manager.c:219:18: note: 'x' was declared here
219 | uint64_t x;
| ^
cc1: all warnings being treated as errors
For some reason this (false positive) warning starts appearing after
-ftrivial-auto-var-init is used.
core/bpf: prefix log messages from different bpf subsystems
When something goes awry, we would get identical log messages from all the
bpf subsystems. E.g. "Failed to load BPF object: %m" appeared 5 times in the
sources. But it is very important to know *which* object we failed to load.
This could be guessed, e.g. from surrounding messages or from filename/line
metadata, but when we get log messages in bug reports, this might not be
available. Let's make the messages distinguishable.
While at it, some messages were adjusted a bit. In particular, we shouldn't use
internal names like BPFProgram which have no meaning outside of the codebase.
Testing the error paths is very important. If we are not root, we should
try and get a failure, which we should report nicely and mark the test
as skipped. After those checks are removed, this is what seems to happen.
This way we can see what will happen e.g. in the user manager when we try
to perform some bpf ops.
shared/bpf: install log callback and suppress most messages from libbpf
$ build/test-socket-bind
...
libbpf: load bpf program failed: Operation not permitted
libbpf: failed to load program 'sd_bind4'
libbpf: failed to load object 'socket_bind_bpf'
libbpf: failed to load BPF skeleton 'socket_bind_bpf': -1
Failed to load BPF object: Operation not permitted
Now all lines with "libbpf:" are at debug level and will be hidden by
default.
Partially fixes https://bugzilla.redhat.com/show_bug.cgi?id=2084955#c14
(i.e. the error that was exposed when the initial error was fixed.)
resolved: choose correct file descriptor for proxy stub replies
find_socket_fd() does not expect the sender address, but the
listen-address. This is in fact the destination of the DNS packet.
Matching via sender address caused a fallback to the default stub
listener in manager_dns_stub_fd() as the sender address can never
match the proxy stub listen address.
Note that manager_dns_stub_fd() is only used for the default
listener stub and the proxy stub, that means *extra* listeners
stubs (DNSStubListenerExtra=…) have not been affected as
`struct DnsStubListenerExtra` provides a direct link to the event
source.
By using the correct fd we ensure the correct socket options
(like TTL) are used and prevent issues like #23495 in case ifindex
could not be determined.
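A simplified model of the lookup, not the actual resolved code: the function name, the dictionary and the fd numbers below are all hypothetical. It only shows why a lookup keyed on the sender address always ends up at the default stub.

```python
def pick_reply_fd(listeners, listen_address, default_fd):
    """Return the fd bound to listen_address, else the default stub fd."""
    return listeners.get(listen_address, default_fd)


# Hypothetical setup: the proxy stub listens on 127.0.0.54 (fd 7),
# the default stub uses fd 5.
listeners = {"127.0.0.54": 7}
default_fd = 5

# A local client at 127.0.0.1 queries the proxy stub.
sender, destination = "127.0.0.1", "127.0.0.54"

# Old behaviour: the sender never matches a listen address, so we fall back.
fd_by_sender = pick_reply_fd(listeners, sender, default_fd)
# Fixed behaviour: the packet destination is the listen address.
fd_by_destination = pick_reply_fd(listeners, destination, default_fd)
```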
login: do not issue wall messages on local terminals for suspend and hibernate
Fixes: #23520
[zjs: I added the comment and tweaked the patch a bit.
The call to reset_scheduled_shutdown() is moved down a bit to allow the
callback to have access to information about the operation being cancelled.
This all happens within the same function, so there should be no observable
change in behaviour.]
shared/pager: print the name of the pager we'll try next in debug message
I had a strange failure where the pager was hanging on invocation (gdm crashed
and the kernel got into a strange state where it was hanging on some tasks).
Based on the logs from 'SYSTEMCTL_LOG_LEVEL=debug journalctl', I couldn't even
tell which pager binary we're executing. So let's shorten the function a bit and
provide a bit more detail.
systemctl: drop translation of method names to descriptions in error message
We had yet-another table of descriptive strings to use in error messages.
I started thinking how to synchronize them with the strings in logind, but
ultimately I think it's better to remove those altogether. Those strings
should almost never be used: normally if the call fails, logind will provide
an error message itself, which is probably more detailed than what we can
figure out on the client side. And the most important part that we want to
show here is what exactly we called, in particular RebootWithFlags vs. Reboot,
etc. By using the "descriptive strings" we were obfuscating this. So let's just
simplify our code and print the actual method name, since this is more useful
as an error statement that is googlable and unique.
While at it, let's print the correct method name ;)
logind: rework wall message about pending shutdown/halt/reboot/…
Those messages simply *feel* dated: "The system is going for suspend NOW!".
Let's say "The system will suspend|power off|hibernate|… now!" instead.
The exclamation mark is enough to show the urgency.
Also, the "the" seemed out of place. We're not talking about a specific reboot.
Benjamin Franzke [Tue, 31 May 2022 19:36:55 +0000 (21:36 +0200)]
resolved: define source address for proxy-only stub replies
DnsPacket.ifindex=1 (loopback) is normalized to 0 whenever a message is
received on the loopback iface, so for both listeners, 127.0.0.53 and
127.0.0.54, the ifindex will be set to 0 by manager_recv() for queries
that have a local origin.
Replies to such local messages need to set a proper ifindex in any
case, as the supplied source-address would otherwise be ignored in
manager_ipv4_send() (CMSG generation is skipped due to ifindex > 0 check).
Note that this change only forces `ifindex` to loopback if it was actually
normalized to `0` before (due to a loopback detection) in order to keep the
nat-to-127.0.0.54-from-another-interface usecase that was described in a8d09063447568d87288a8e868fe386c1da7ce09 intact.
Also note that nat is not supported for the main stub 127.0.0.53 which is
why forcing LOOPBACK_IFINDEX was/is fine for that case.
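The rule for proxy-stub replies can be summarised in a small sketch. The function name is hypothetical and this is not the resolved implementation; the value 1 for the loopback interface index is taken from the description above.

```python
LOOPBACK_IFINDEX = 1  # loopback interface index, as stated above


def reply_ifindex(packet_ifindex):
    """Pick the ifindex to use for a proxy-stub reply (hypothetical helper).

    An ifindex of 0 means the packet arrived on loopback and was
    normalized; only then is loopback forced. Any other value is kept,
    so replies to NAT'ed queries from another interface still work.
    """
    return LOOPBACK_IFINDEX if packet_ifindex == 0 else packet_ifindex
```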
logind: do not print wall messages to local pseudoterminals
Fixes #23520. Replaces #23555.
The problem started with cdf370626f08ed509a5dde9d5618eed29d625032 and 90b1ec03b2ce939f589239133a32f4429f2ad6a6 which together started printing the
wall message in more cases. The motivation for those changes was reasonable, but
this clearly causes problems described in #23520: users are getting unexpected
wall messages. Xterm, urxvt, (anything using libutempter?), and tmux (in some
configurations), register local pty sessions in utmp.
So let's try to suppress the message for local pseudo-terminal logins. This
patch is based on #23538, but instead of filtering just on /dev/pts, it uses the
.ut_addr_v6 to only filter out local entries.
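A minimal sketch of this filter, assuming that an all-zero `ut_addr_v6` marks an entry without a remote address and that the utmp line for a pseudo-terminal looks like `pts/N`. Both function names are hypothetical and the real check lives in C.

```python
def is_local_entry(ut_addr_v6):
    """True if no remote address is recorded (assumption: all four words are zero)."""
    return all(word == 0 for word in ut_addr_v6)


def should_send_wall(ut_line, ut_addr_v6):
    """Skip local pseudo-terminals; keep remote ptys and real ttys."""
    if ut_line.startswith("pts/") and is_local_entry(ut_addr_v6):
        return False
    return True
```

With this rule an xterm on `pts/3` with a zeroed address is skipped, while an SSH login on `pts/4` with a recorded peer address still receives the message.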
Yu Watanabe [Tue, 31 May 2022 19:01:10 +0000 (04:01 +0900)]
test-network: call networkctl only when specified interface exists
Otherwise, this easily triggers another exception:
```
======================================================================
ERROR: test_erspan_tunnel_v0 (__main__.NetworkdNetDevTests)
----------------------------------------------------------------------
Traceback (most recent call last):
File "./test/test-network/systemd-networkd-tests.py", line 686, in wait_online
check_output(*args, env=env)
File "./test/test-network/systemd-networkd-tests.py", line 65, in check_output
return subprocess.check_output(command, universal_newlines=True, **kwargs).rstrip()
File "/usr/lib64/python3.6/subprocess.py", line 356, in check_output
**kwargs).stdout
File "/usr/lib64/python3.6/subprocess.py", line 438, in run
output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['/usr/lib/systemd/systemd-networkd-wait-online', '--timeout=20s', '--interface=erspan99:routable', '--interface=erspan98:routable', '--interface=dummy98:degraded']' returned non-zero exit status 1.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "./test/test-network/systemd-networkd-tests.py", line 1808, in test_erspan_tunnel_v0
self.wait_online(['erspan99:routable', 'erspan98:routable', 'dummy98:degraded'])
File "./test/test-network/systemd-networkd-tests.py", line 689, in wait_online
output = check_output(*networkctl_cmd, '-n', '0', 'status', link.split(':')[0], env=env)
File "./test/test-network/systemd-networkd-tests.py", line 65, in check_output
return subprocess.check_output(command, universal_newlines=True, **kwargs).rstrip()
File "/usr/lib64/python3.6/subprocess.py", line 356, in check_output
**kwargs).stdout
File "/usr/lib64/python3.6/subprocess.py", line 438, in run
output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['/usr/bin/networkctl', '-n', '0', 'status', 'erspan99']' returned non-zero exit status 1.
```
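The guard can be sketched as follows, assuming that an interface exists when it has a directory under `/sys/class/net`. The helper names and the `sysfs_net` parameter are illustrative, not necessarily those used in the test suite.

```python
import os


def link_exists(name, sysfs_net="/sys/class/net"):
    """True if the kernel currently exposes an interface with this name."""
    return os.path.exists(os.path.join(sysfs_net, name))


def links_to_query(links_with_operstate, sysfs_net="/sys/class/net"):
    """Strip the ':operstate' suffix and keep only interfaces that exist."""
    names = [link.split(":")[0] for link in links_with_operstate]
    return [name for name in names if link_exists(name, sysfs_net)]
```

Only the names returned by `links_to_query()` would then be passed to `networkctl status`, so a missing `erspan99` no longer raises a second `CalledProcessError` that hides the original failure.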
Jan Janssen [Mon, 23 May 2022 10:32:50 +0000 (12:32 +0200)]
boot: Use strlen8/16
The casts in this and the next few commits are currently necessary
because CHAR8 is defined as uint8_t in gnu-efi, while char is signed.
Once we switch from gnu-efi typedefs to stdint types, the casts
will be dropped.
We currently have a convoluted and complex selection of which random
numbers to use. We can simplify this down to two functions that cover
all of our use cases:
1) Randomness for crypto: this one needs to wait until the RNG is
initialized. So it uses getrandom(0). If that's not available, it
polls on /dev/random, and then reads from /dev/urandom. This function
returns whether or not it was successful, as before.
2) Randomness for other things: this one uses getrandom(GRND_INSECURE).
If it's not available it uses getrandom(GRND_NONBLOCK). And if that
would block, then it falls back to /dev/urandom. And if /dev/urandom
isn't available, it uses the fallback code. It never fails and
doesn't return a value.
These two cases match all the uses of randomness inside of systemd.
I would prefer to make both of these return void, and get rid of the
fallback code, and simply assert in the incredibly unlikely case that
/dev/urandom doesn't exist. But Luca disagrees, so this commit attempts
to instead keep case (1) returning a return value, which all the callers
already check, and fix the fallback code in (2) to be less bad than
before.
For the less bad fallback code for (2), we now use auxval and some
timestamps, together with various counters representing the invocation,
hash it all together and provide the output. Provided that AT_RANDOM is
secure, this construction is probably okay too, though notably it
doesn't have any forward secrecy. Fortunately, it's only used by
random_bytes() and not by crypto_random_bytes().
msizanoen1 [Mon, 30 May 2022 15:08:07 +0000 (22:08 +0700)]
cgroup-util: Properly handle conditions where cgroup.threads is empty after SIGKILL but processes still remain
After sending a SIGKILL to a process, the process might disappear from
`cgroup.threads` but still show up in `cgroup.procs`. It then remains in the
cgroup and causes the migration of new processes into `Delegate=yes` cgroups to
fail with `-EBUSY`. This is especially likely for heavyweight processes that
consume more kernel CPU time to clean up.
Fix this by only returning 0 when both `cgroup.threads` and
`cgroup.procs` are empty.
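The check can be sketched as follows. The function names are hypothetical, and treating a missing file as empty is an assumption made for this sketch only.

```python
import os


def _has_entries(path):
    """True if the file lists at least one ID."""
    try:
        with open(path) as f:
            return any(line.strip() for line in f)
    except FileNotFoundError:
        return False  # assumption: a missing file counts as empty


def cgroup_is_drained(cgroup_dir):
    """True only when both cgroup.threads and cgroup.procs are empty."""
    threads = _has_entries(os.path.join(cgroup_dir, "cgroup.threads"))
    procs = _has_entries(os.path.join(cgroup_dir, "cgroup.procs"))
    return not (threads or procs)
```

A caller that kills everything in a cgroup would keep waiting while `cgroup_is_drained()` returns false, even if `cgroup.threads` already reads as empty.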