journal: when copying journal file to undo NOCOW flag, go via fd
We have the journal file open already, hence reference it via the fd
insted of the file name. After all, some other tool might have
renamed/deleted it already.
Let's not actually reuse the fd though, since we want a separate file
offset for the copying, hence just make it simply and reopen via
/proc/self/fd/.
tests: pass FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION to fuzzers
to let them use reproducible identifiers, which should make it possible
to really use files copied from OSS-Fuzz to reproduce issues on
GHActions and locally. Prompted by https://github.com/systemd/systemd/pull/22365
Rationale: the focus here should be on the fact that these are "unified"
images, whether our stub is used or not, or something else doesn't
really matter. Moreover, these are still Linux entries. Hence, emphasize
that these are *unified* images, and *Linux* images, and deemphesize
that our sd-stub is likely used.
to make sure outgoing packets based on incoming packets are fine.
It's just another follow-up to
https://github.com/systemd/systemd/pull/10200.
Better late than never :-)
dhcp-identifier: always use a fixed machine-id while fuzzing
It's a follow-up to https://github.com/systemd/systemd/pull/10200 where
that fuzzer was introduced. At the time it was run regularly on machines
where machine-id wasn't present so it was kind of reproducible. Now
it's run on CIFuzz and CFLite using GHActions with the public OSS-Fuzz
corpora (based on that particular machine-id) so to fully utilize
those corpora it's necessary to use it always. Other than that
it makes it possible for fuzzers targeting outgoing packets
based on incoming packets like https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1795921
to get past client_parse_message on my machine :-)
Before the commit, when `mkdir_parents_internal()` is called from `mkdir_p()`,
it uses `_mkdir()` as `flag` is zero. But after the commit, `mkdir_safe_internal()`
is always used. Hence, if the path contains a symlink, it fails with -ENOTDIR.
To fix the issue, this makes `mkdir_p()` calls `mkdir_parents_internal()` with
MKDIR_FOLLOW_SYMLINK flag.
Originally, we always used the mmap logic to determine the current end
of the file. ab6e257b3e4e5b95f3750ed019bed6e89989e41b changed this so
that we always used pread().
With this change we'll use pread() from the synchronization thread and
mmap otherwise.
tests: rework test macros to not take code as parameters
C macros are nasty. We use them, but we try to be conservative with
them. In particular passing literal, complex code blocks as argument is
icky, because of "," handling of C, and also because it's quite a
challange for most code highlighters and similar. Hence, let's avoid
that. Using macros for genreating functions is OK but if so, the
parameters should be simple words, not full code blocks.
hence, rework DEFINE_CUSTOM_TEST_MAIN() to take a function name instead
of code block as argument.
As side-effect this also fixes a bunch of cases where we might end up
returning a negative value from main().
Some uses of DEFINE_CUSTOM_TEST_MAIN() inserted local variables into the
main() functions, these are replaced by static variables, and their
destructors by the static destructor logic.
This doesn't fix any bugs or so, it's just supposed to make the code
easier to work with and improve it easthetically.
Or in other words: let's use macros where it really makes sense, but
let's not go overboard with it.
(And yes, FOREACH_DIRENT() is another one of those macros that take
code, and I dislike that too and regret I ever added that.)
units: we need systemd-journald.service from systemd-journal-flush.service
This is a follow-up for d5ee050ffc9d413253932d9340ade8c8fb111092, and
reintroduces a requirement dep from systemd-journal-flush.service onto
systemd-journald.service, but a weaker one than originally: a Wants= one
instead of a Requires= one.
Why? Simply because the service issues an IPC call to the journald,
hence it should pull it in. (Note that socket activation doesn't happen
for the Varlink socket it uses, hence we should pull in the service
itself.)
Joan Bruguera [Sun, 30 Jan 2022 16:56:32 +0000 (17:56 +0100)]
resolved: Allow test-resolved-stream to run concurrently
Since test-resolved-stream brings up a simple DNS server on 127.0.0.1:12345,
only one instance could run at a time, so it would fail when run like
`meson test -C build test-resolved-stream --repeat=1000`.
Similarly, if by chance something is up on port 12345, the test would fail.
To make the test more reliable, run it in an isolated user + network namespace.
If this fails (some distributions disable user namespaces), just run as before.
Joan Bruguera [Sun, 30 Jan 2022 11:51:10 +0000 (12:51 +0100)]
resolved: Read as much as possible per stream EPOLLIN event
In commit 2aaf6bb6e99b0f2bd73e0c49bef9e11a2844bf1a, an issue was fixed where
systemd-resolved could get stuck for multiple seconds waiting for incoming data,
since GnuTLS/OpenSSL can buffer a TLS record, so data could be available, but
no EPOLLIN event would be generated.
To fix this, a somewhat elaborate logic consisting on asking the TLS library
whether it had buffered data, then "faking" an EPOLLIN event was implemented.
However, there is a much simpler solution: Always read as much data as available
(i.e. until we get an event like EAGAIN when trying to read) from the stream
when we get an EPOLLIN event, instead of at most a single packet per event.
This approach does not require asking the TLS library whether it has buffered
data, and the logic is exactly the same for both the TCP and TLS case.
test-resolved-stream is fixed to avoid a latent double free bug.
Joan Bruguera [Mon, 31 Jan 2022 20:28:32 +0000 (21:28 +0100)]
resolved: Avoid multiple SSL writes per DoT packet
In the DoT case, dns_stream_writev decomposed an iovec into multiple
dnstls_stream_write calls, which resulted in multiple SSL writes and multiple
TLS records. This can be checked from a network capture, e.g. using socat:
socat -v -x openssl-listen:853,reuseaddr,fork,cert=my.cert,key=my.key,verify=0 openssl:8.8.8.8:853
Instead, propagate the iovec as-is into the DoT handling code. For GnuTLS, the
library provides support for buffering ('corking') a record. OpenSSL has no
such facility, so we join the iovec into a single buffer then call SSL_write.
socat capture of `resolvectl -4 query --cache=no example.com` before the commit:
Joan Bruguera [Mon, 31 Jan 2022 20:28:21 +0000 (21:28 +0100)]
resolved: Make event flags logic robust for DoT
Since when handling a DNS over TLS stream, the TLS library can override the
requested events through dnstls_events for handshake/shutdown purposes,
obtaining the event flags through sd_event_source_get_io_events and checking
for EPOLLIN or EPOLLOUT does not really tell us whether we want to read/write
a packet. Instead, it could just be OpenSSL/GnuTLS doing something else.
To make the logic more robust (and simpler), save the flags that tell us
whether we want to read/write a packet, and check them instead of the IO flags.
(& use uint32_t for the flags like in sd_event_source_set_io_events prototype)
ASSERT_SE_PTR() is like ASSERT_PTR() but uses assert_se() instead of
assert() internally.
Code should use ASSERT_SE_PTR() where the check should never be
optimized away, even if NDEBUG is set.
Rationale: assert() is the right choice for validating assumptions about
our own code, i.e. checking conditions that are "impossible" to not
hold, because we ourselves hacked things up the "right" way of course.
assert_se() is the right choice for tests that come with a weaker
guarantee, they encode assumptions over other's API behaviour, i.e.
whether something can fail there or not.
When developing tools that are not oom-safe assert_se() is the right
choice: we know that on Linux OOM doesn't really happen, even though
theoretically the API allows it to happen.
Usecase for ASSERT_SE_PTR() is mostly the fatal memory allocation logic
for EFI memory allocations. So far it used regular assert() i.e. OOM
failurs would be totally ignored if NDEBUG is set. We'd rather have our
EFI program to print an assert message and freeze instead though.
Yu Watanabe [Tue, 1 Feb 2022 04:00:51 +0000 (13:00 +0900)]
network: xfrm: refuse zero interface ID
Since kernel 5.17-rc1, 5.16.3, and 5.15.17 (more specifically,
https://github.com/torvalds/linux/commit/8dce43919566f06e865f7e8949f5c10d8c2493f5)
the kernel refuses to create an xfrm interface with zero ID.
Yu Watanabe [Sun, 30 Jan 2022 20:04:52 +0000 (05:04 +0900)]
sd-dhcp-lease: fix reading unaligned memory
The destination address was read twice, one is for prefixlen, and
other is for destination address itself. And for prefixlen, the address
might be read from unaligned buffer.