Yu Watanabe [Fri, 28 Jan 2022 00:29:59 +0000 (09:29 +0900)]
resolve: llmnr: fix never hit condition
Previously, the condition in on_stream_io_impl() never hit, as the
read packet is always taken from the stream in the few lines above.
Instead of the dns_stream_complete() under the condition, the stream
is unref()ed in the on_packet callback for LLMNR stream, unlike the
other on_packet callbacks.
That's quite tricky. Also, potentially, the stream may still have
queued packets to write.
This fix the condition, and drops the unref() in the on_packet callback.
Joan Bruguera [Sun, 23 Jan 2022 16:08:12 +0000 (17:08 +0100)]
resolved: Test for DnsStream (plain TCP DNS and DoT)
Tests DnsStream event handling, both for plain TCP DNS and DNS over TLS.
The DoT test requires the "openssl s_server" command line tool to mock a simple
TLS server. Thus the test's TLS part is skipped if openssl it not available.
The test works for both DNS_OVER_TLS_USE_GNUTLS and DNS_OVER_TLS_USE_OPENSSL.
The DoT case fails due to a bug, which is fixed on the next commit.
Joan Bruguera [Sat, 15 Jan 2022 16:33:25 +0000 (17:33 +0100)]
resolved: Fix DoT timeout on multiple answer records
When sending multiple DNS questions to a DNS-over-TLS server (e.g. a question
for A and AAAA records, as is typical) on the same session, the server may
answer to each question in a separate TLS record, but it may also aggregate
multiple answers in a single TLS record.
(Some servers do this very often (e.g. Cloudflare 1.0.0.1), some do it sometimes
(e.g. Google 8.8.8.8) and some seem to never do it (e.g. Quad9 9.9.9.10)).
Both cases should be handled equivalently, as the byte stream is the same, but
when multiple answers came in a single TLS record, usually the first answer was
processed, but the second answer was entirely ignored, which caused a 10s delay
until the resolution timed out and the missing question was retried.
This can be reproduced by configuring one of the offending server and running
`resolvectl query google.com --cache=no` a few times.
To be notified of incoming data, systemd-resolved listens to `EPOLLIN` events
on the underlying socket. However, when DNS-over-TLS is used, the TLS library
(OpenSSL or GnuTLS) may read and buffer the entire TLS record when reading the
first answer, so usually no further `EPOLLIN` events will be generated, and the
second answer will never be processed.
To avoid this, if there's buffered TLS data, generate a "fake" EPOLLIN event.
This is hacky, but it makes this case transparent to the rest of the IO code.
Daan De Meyer [Mon, 1 Nov 2021 14:33:08 +0000 (14:33 +0000)]
journal: Stop comparing hash values from entry items against data objects
These checks don't achieve anything of value. Assuming they were added to
check for corruption, they don't actually achieve this goal since other parts
of the data object can still get corrupted and we wouldn't notice unless we'd
recalculate the hash every time.
In theory, we could use the entry item hash to avoid a random access lookup
for the data object hash in the journal file in the future to speed up searching,
but for finding all entry objects containing a specific data objects, we already
have entry arrays per data object to get fast access to this information.
This means that duplicating the hashes in the entry item doesn't result in any
added value. In this commit, we remove the checks so that in future commits we
can remove the hashes from the journal file format in the new compact mode.
Daan De Meyer [Tue, 25 Jan 2022 13:26:22 +0000 (13:26 +0000)]
journal: Invert verify entry <=> data consistency checks
Previously, for each entry in a data object's entry array, we'd check
if one of that entry's entry items referred to the data object.
Instead, when verifying the main entry array, let's check if for each
entry item found by iterating the main entry array, the corresponding
data object's entry array refers to that entry.
This enables us to re-use more code from journal-file and turns out to
be roughly 10s faster when verifying my 4G laptop journal.
When verifying data objects, we still check if every entry in the data
object's entry array also exists in the main entry array so that we ensure
we're not missing any entries when iterating the main entry array.
Daan De Meyer [Tue, 25 Jan 2022 13:21:55 +0000 (13:21 +0000)]
journal: Fail gracefully when linking a new entry
Let's always try to link all entry items even if linking one fails
due to not being able to allocate a new entry array. Other entry
items might still be successfully linked if the entry array of the
corresponding data object isn't full yet.
Daan De Meyer [Tue, 25 Jan 2022 11:50:40 +0000 (11:50 +0000)]
journal: Only move to objects when necessary
Let's make sure we only move to objects when it's required. If "ret"
is NULL, the caller isn't interested in the actual object and the
function being called shouldn't move to it unless it has to
inspect/modify the object itself.
Daan De Meyer [Tue, 25 Jan 2022 11:10:26 +0000 (11:10 +0000)]
journal: Pass data objects to journal_file_move_to_entry_..._for_data() functions
This reduces the number of calls to journal_file_move_to_object() which are heavy.
All call sites have easy access to the data object so this change doesn't end up
complicating things.
Daan De Meyer [Mon, 24 Jan 2022 13:40:06 +0000 (13:40 +0000)]
journal: Use offsetof(Object, ...) to retrieve object field offsets
We currently use both offsetof(Object, ...) and offsetof(DataObject, ...).
This makes it harder to grep for usages as we have to make sure we grep for
both usages. Let's unify these all to use offsetof(Object, ...) to make it
easier to grep for usages.
Luca Boccassi [Wed, 26 Jan 2022 19:00:25 +0000 (19:00 +0000)]
core: do not restart a service with Restart=always when ExecCondition fails
When a Condition*= fails, and a service has Restart=always,
the service is not restarted.
Follow the same behaviour for ExecCondition= to avoid inconsistencies.
Daan De Meyer [Wed, 26 Jan 2022 12:08:50 +0000 (12:08 +0000)]
shared: Ensure COPY_HOLES copies trailing holes
Previously, files with a hole at the end would get silently truncated
which breaks reading journal files. This commit makes sure that holes
are punched in existing space and if no more space is available, that
we grow the file and the hole by using ftruncate().
The corresponding test is extended to put a hole at the end of the file
and we make sure that hole is copied correctly.
Yu Watanabe [Wed, 26 Jan 2022 07:48:08 +0000 (16:48 +0900)]
wait-online: make manager_link_is_online() return 0 when in unmanaged state
Previously, even if a link is in unmanaged state, the function may
returns positive value. So, even if all managed links are in the configured
sate but do not satisfy the online criteria, e.g., IPv4 address state,
then wait-online finishes with positive value.
This makes the function always return 0 for unmanaged state. So, at
least one managed link must satisfies the online criteria.
Jan Janssen [Thu, 20 Jan 2022 10:59:49 +0000 (11:59 +0100)]
meson: Remove test-efi-create-disk.sh
The script was probably not used for a very long time. It is currently
passed systemd_boot.so as boot loader, which cannot work. The test
entries it creates are all pointing at non-existant efi/linux binaries,
which means they would not even show up in the menu if the created image
were actually booted. There is also nothing that actually tries to run
the image in the first place.
If we end up creating a proper systemd-boot test suite, it would be
better to start from scratch. In the meantime, mkosi already covers
the bare minimum with a simple bootup test.
user-runtime-dir: error out immediately if mkdir fails
We try to create two directories: /run/user and /run/user/<UID>. For the
first we check the return value and error out if creation fails. But for
the second one we continued based on the assumption that the subsequent
mount will immediately fail anyway. But this has the disadvantage that we
get a somewhat confusing error message:
janv. 23 22:04:31 nsfw systemd-user-runtime-dir[1660]: Failed to mount per-user tmpfs directory /run/user/1000: No such file or directory
Let's instead fail immediately with a precise error message.
For https://bugzilla.redhat.com/show_bug.cgi?id=2044100.
Rename the normalize_mounts() helper to drop_unused_mounts. All the
helpers called in there get rid of mounts that are unused for a variety
of reasons. And whereas the helpers are aptly prefixed with "drop" the
overall helper isn't and instead uses "normalize".
Make it more obvious what the helper actually does by renaming it from
normalize_mounts() to drop_unused_mounts(). Readers of code calling this
helper will immediately see that it will get rid of unused mounts.
core/namespace: allow using ProtectSubset=pid and ProtectHostname=true together
If a service requests both ProtectSubset=pid and ProtectHostname=true
then it will currently fail to start. The ProcSubset=pid option
instructs systemd to mount procfs for the service with subset=pid which
hides all entries other than /proc/<pid>. Consequently trying to
interact with the two files /proc/sys/kernel/{hostname,domainname}
covered by ProtectHostname=true will fail.
Fix this by only performing this check when ProtectSubset=pid is not
requested. Essentially ProtectSubset=pid implies/provides
ProtectHostname=true.
Yu Watanabe [Sat, 22 Jan 2022 17:27:26 +0000 (02:27 +0900)]
sd-dhcp-server: drop unnecessary buffer duplication
The block try to find and remove the existing static lease which matches
the provided client ID, and the provided client ID will not be stored
anywhere. Hence, it is not necessary to duplicate it.
Remove incorrect claim that C escapes (such as \t and \n) are recognized and that control characters are disallowed. Specify the allowed characters and escapes with single quotes, with double quotes, and without quotes.
Thomas Haller [Sat, 22 Jan 2022 14:02:04 +0000 (15:02 +0100)]
sd-event: workaround maybe-uninitalized warning in sd_event_add_inotify()
With LTO, the compiler might think that the variable is uninitialized
(from NetworkManager's fork, with gcc-11.2.1-1.fc35):
src/libnm-systemd-core/src/libsystemd/sd-event/sd-event.c: In function 'sd_event_add_inotify':
src/libnm-systemd-core/src/libsystemd/sd-event/sd-event.c:2120: error: 's' may be used uninitialized in this function [-Werror=maybe-uninitialized]
2120 | *ret = s;
|
src/libnm-systemd-core/src/libsystemd/sd-event/sd-event.c:2102: note: 's' was declared here
2102 | sd_event_source *s;
|
lto1: all warnings being treated as errors
In particular, that would happen for codepaths where event_add_inotify_fd_internal()
returns `-errno`, and the compiler cannot be sure that the returned value will
be negative. Technically, the compiler is right, but we rely on libc functions
to set errno correctly, so this only happens in code paths, where something
bad already happend.
While LTO is prone to such false warnings, we are largely able to build systemd
without warnings. So it is feasible and we should make the effort of working
around warnings as they appear.
Yu Watanabe [Sat, 22 Jan 2022 01:44:50 +0000 (10:44 +0900)]
hostname: allow to override hardware vendor and model
Sometimes hardware vendor does not set DMI info correctly.
Already there is a way that the dbus properties can be overriden by
using hwdb. But that is not user friendly.
Julia Kartseva [Sat, 22 Jan 2022 02:50:26 +0000 (18:50 -0800)]
bpf: name unnamed bpf programs
bpf-firewall and bpf-devices do not have names. This complicates
debugging with bpftool(8).
Assign names starting with 'sd_' prefix:
* firewall program names are 'sd_fw_ingress' for ingress attach
point and 'sd_fw_egress' for egress.
* 'sd_devices' for devices prog
'sd_' prefix is already used in source-compiled programs, e.g.
sd_restrictif_i, sd_restrictif_e, sd_bind6.
The name must not be longer than 15 characters or BPF_OBJ_NAME_LEN - 1.
Assign names only to programs loaded to kernel by systemd since
programs pinned to bpffs are already loaded.
YmrDtnJu [Fri, 21 Jan 2022 17:21:27 +0000 (18:21 +0100)]
Fix journald audit logging with fields > N_IOVEC_AUDIT_FIELDS.
ELEMENTSOF(iovec) is not the correct value for the newly introduced parameter m
to function map_all_fields because it is the maximum number of elements in the
iovec array, including those reserved for N_IOVEC_META_FIELDS. The correct
value is the current number of already used elements in the array plus the
maximum number to use for fields decoded from the kernel audit message.
Jan Janssen [Fri, 21 Jan 2022 17:34:04 +0000 (18:34 +0100)]
boot: Only build with debug symbols in developer mode
The debug symbols are of very limited use in proper deployments
unlike with regular userspace. Unless someone goes through the pain
of setting up an EFI debugger (assuming their firmware even supports
this in the first place) any provided debug symbols will just be
useless.
Debugging under QEMU is possible, but even then it is non-trivial
to set up, so anyone willing to go that far can just build in
developer mode.
Meanwhile, at least x86 firmware tends to refuse binaries that contain
debug symbols. We do strip the files when converted to PE anyway, but
the elf file needs to stay around on other arches as objcopy does not
support PE as input there.
Also, the generated debug symbols seem to be not reproducible when
building with LTO. Whether this is an issue in tooling or our side
is unclear. This works around this issue.
Daan De Meyer [Fri, 21 Jan 2022 14:28:23 +0000 (14:28 +0000)]
meson: Add missing test dependencies
Currently, running "meson build" followed by "meson test -C build"
will result in many failed tests due to missing dependencies. This
commit adds the missing dependencies to make sure no tests fail.