Julia Kartseva [Sat, 22 Jan 2022 02:50:26 +0000 (18:50 -0800)]
bpf: name unnamed bpf programs
bpf-firewall and bpf-devices do not have names. This complicates
debugging with bpftool(8).
Assign names starting with 'sd_' prefix:
* firewall program names are 'sd_fw_ingress' for ingress attach
point and 'sd_fw_egress' for egress.
* 'sd_devices' for devices prog
'sd_' prefix is already used in source-compiled programs, e.g.
sd_restrictif_i, sd_restrictif_e, sd_bind6.
The name must not be longer than 15 characters or BPF_OBJ_NAME_LEN - 1.
Assign names only to programs loaded to kernel by systemd since
programs pinned to bpffs are already loaded.
YmrDtnJu [Fri, 21 Jan 2022 17:21:27 +0000 (18:21 +0100)]
Fix journald audit logging with fields > N_IOVEC_AUDIT_FIELDS.
ELEMENTSOF(iovec) is not the correct value for the newly introduced parameter m
to function map_all_fields because it is the maximum number of elements in the
iovec array, including those reserved for N_IOVEC_META_FIELDS. The correct
value is the current number of already used elements in the array plus the
maximum number to use for fields decoded from the kernel audit message.
Jan Janssen [Fri, 21 Jan 2022 17:34:04 +0000 (18:34 +0100)]
boot: Only build with debug symbols in developer mode
The debug symbols are of very limited use in proper deployments
unlike with regular userspace. Unless someone goes through the pain
of setting up an EFI debugger (assuming their firmware even supports
this in the first place) any provided debug symbols will just be
useless.
Debugging under QEMU is possible, but even then it is non-trivial
to set up, so anyone willing to go that far can just build in
developer mode.
Meanwhile, at least x86 firmware tends to refuse binaries that contain
debug symbols. We do strip the files when converted to PE anyway, but
the elf file needs to stay around on other arches as objcopy does not
support PE as input there.
Also, the generated debug symbols seem to be not reproducible when
building with LTO. Whether this is an issue in tooling or our side
is unclear. This works around this issue.
Daan De Meyer [Fri, 21 Jan 2022 14:28:23 +0000 (14:28 +0000)]
meson: Add missing test dependencies
Currently, running "meson build" followed by "meson test -C build"
will result in many failed tests due to missing dependencies. This
commit adds the missing dependencies to make sure no tests fail.
Luca Boccassi [Mon, 17 Jan 2022 01:14:14 +0000 (01:14 +0000)]
core: add ExtensionDirectories= setting
Add a new setting that follows the same principle and implementation
as ExtensionImages, but using directories as sources.
It will be used to implement support for extending portable images
with directories, since portable services can already use a directory
as root.
Martin Wilck [Thu, 20 Jan 2022 13:31:45 +0000 (14:31 +0100)]
udevadm: cleanup-db: don't delete information for kept db entries
devices with the db_persist property won't be deleted during database
cleanup. This applies to dm and md devices in particular.
For such devices, we should also keep the files under /run/udev/links,
/run/udev/tags, and /run/udev/watch, to make sure that after restart,
udevd has the same information about the devices as it did before
the cleanup.
If we don't do this, a lower-priority device that is discovered in
the coldplug phase may take over symlinks from a device that persisted.
Not removing the watches also enables udevd to resume watching a device
after restart.
core: add %y/%Y specifiers for the fragment path of the unit
Fixes #6308: people want to be able to link a unit file via 'systemctl enable'
from a git checkout or such and refer to other files in the same repo.
The new specifiers make that easy.
%y/%Y is used because other more obvious choices like %d/%D or %p/%P are
not available because at least on of the two letters is already used.
The new specifiers are only available in units. Technically it would be
trivial to add then in [Install] too, but I don't see how they could be
useful, so I didn't do that.
I added both %y and %Y because both were requested in the issue, and because I
think both could be useful, depending on the case. %Y to refer to other files
in the same repo, and %y in the case where a single repo has multiple unit files,
and e.g. each unit has some corresponding asset named after the unit file.
Yu Watanabe [Thu, 20 Jan 2022 18:03:45 +0000 (03:03 +0900)]
resolve: refuse to resolve empty hostname
Previously, varlink or dbus methods return
io.systemd.Resolve.NoNameServers or BUS_ERROR_NO_NAME_SERVERS if an
empty hostname is provided, and thus nss-resolve returns NSS_STATUS_TRYAGAIN.
That causes getaddrinfo() returns 'Temporary failure in name resolution'
instead of 'Name or service not known'.
This makes calling varlink or dbus method with an empty hostname result
-EINVAL, and hence nss-resolve returns NSS_STATUS_NOTFOUND.
Anita Zhang [Wed, 19 Jan 2022 21:26:01 +0000 (13:26 -0800)]
oomd: handle situations when no cgroups are killed
Currently if systemd-oomd doesn't kill anything in a selected cgroup, it
selects a new candidate immediately. But if a selected cgroup wasn't killed,
it is likely due to it disappearing or getting cleaned up between the time
it was selected as a candidate and getting sent SIGKILL(s). We should handle
it as though systemd-oomd did perform a kill so that it will check
swap/pressure again before it tries to select a new candidate.
Anita Zhang [Wed, 19 Jan 2022 18:40:46 +0000 (10:40 -0800)]
oomd: fix race with path unavailability when killing cgroups
There can be a situation where systemd-oomd would kill all of the processes
in a cgroup, pid1 would clean up that cgroup, and systemd-oomd would get
ENODEV trying to iterate the cgroup a final time to ensure it was empty.
systemd-oomd sees this as an error and immediately picks a new candidate even
though pressure may have recovered. To counter this, check and handle
path unavailability errnos specially.
We would busily allocate an empty string to concatenate all of it's
zero characters to the output. Let's make things a bit simpler by letting
the specifier functions return NULL to mean "nothing to append".
Yu Watanabe [Thu, 20 Jan 2022 19:46:14 +0000 (04:46 +0900)]
resolve: drop redundant call of link_allocate_scopes() and link_add_rrs()
In `manager_process_link()`, the function `link_update()` is called just
after `link_process_rtnl()`, and `link_update()` also calls
`link_allocate_scopes()` and `link_add_rrs()`. Hence, the calls in
`link_process_rtnl()` are redundant.
by always calling journal_remote_server_destroy, which resets global
variables like journal_remote_server_global. It should prevent crashes like
```
Assertion 'journal_remote_server_global == NULL' failed at src/journal-remote/journal-remote.c:312, function int journal_remote_server_init(RemoteServer *, const char *, JournalWriteSplitMode, _Bool, _Bool)(). Aborting.
AddressSanitizer:DEADLYSIGNAL
=================================================================
==24769==ERROR: AddressSanitizer: ABRT on unknown address 0x0539000060c1 (pc 0x7f23b4d5818b bp 0x7ffcbc4080c0 sp 0x7ffcbc407e70 T0)
SCARINESS: 10 (signal)
#0 0x7f23b4d5818b in raise /build/glibc-eX1tMB/glibc-2.31/sysdeps/unix/sysv/linux/raise.c:51:1
#1 0x7f23b4d37858 in abort /build/glibc-eX1tMB/glibc-2.31/stdlib/abort.c:79:7
#2 0x7f23b5731809 in log_assert_failed systemd/src/basic/log.c:866:9
```
systemd seems to be failing to compile there with gcc-12 but considering
that gcc-12 hasn't been released yet it doesn't seem to make sense
to add workarounds to get it to compile there. Until gcc-12 is
stabilized it should be enough to build systemd on fedora-35 to
make sure it's buildable on i386.
Luca Boccassi [Wed, 19 Jan 2022 00:27:45 +0000 (00:27 +0000)]
sysext: use LO_FLAGS_PARTSCAN when opening image
Jan 17 12:34:59 myguest1 (sd-sysext)[486]: Device '/var/lib/extensions/myext.raw' is loopback block device with partition scanning turned off, please turn it on.
Yu Watanabe [Fri, 14 Jan 2022 08:24:49 +0000 (17:24 +0900)]
udev/net: also support [SR-IOV] section in .link files
The same section is already supported by .network files. But such
low-level inteerface setting should be done by udevd, instead of
networkd. Let's also support the same semantics by .link files.
Prompted by https://github.com/systemd/systemd/issues/20474#issuecomment-901901360.
Luca Boccassi [Wed, 19 Jan 2022 00:01:48 +0000 (00:01 +0000)]
dissect-image: validate extension-release even if the host has only ID in os-release
A rolling distro won't set VERSION_ID or SYSEXT_LEVEL in os-release,
which means we skip validation of ExtensionImages.
Validate even with just an ID, the lower level helper already
recognizes and accepts this use case.
When bootctl lists the boot entries, considers also the ones
returned by systemd-boot (via the efi LoaderEntries variable),
created at boot time.
Unfortunately this list may became incorrect if (e.g.) the user remove a
kernel package.
This patch changes this behaviour, so bootctl ignores some the
boot entries returned by systemd-boot.
In any case, bootctl still considers the 'auto-xxx' boot entries
listed below:
Boot entrie name Title
----------------------------- ------------------------------
auto-osx macOS boot loader
auto-windows Windows Boot Manager
auto-efi-shell EFI Shell
auto-efi-default EFI Default Loader
auto-reboot-to-firmware-setup Reboot Into Firmware Interface
The other entries that systemd-boot synthetizes (e.g. the ones loaded from
/efi/loader/entries/<uuid>) can be synthetized by bootctl too, so no
information is lost.
Jan Janssen [Tue, 18 Jan 2022 12:01:39 +0000 (13:01 +0100)]
boot: Simplify looking for the xboot hard drive
The device path should not contain multiple hard drive nodes in it,
so looking at them all should not be needed.
If some crazy firmware/driver were to make nested GPT drives
available like that, we should be only looking at the last partition
and its containing GPT drive anyway.
Looks like https://github.com/mesonbuild/meson/issues/957 was
reintroduced in meson-0.57.0 (and looking and https://mesonbuild.com/Release-notes-for-0-57-0.html
I'm not sure whether it was intentional or not) so run_command can no
longer be used to get around
https://github.com/mesonbuild/meson/issues/3589. Let's just force
ctags to always use absolute paths to fix it once and for all.
Daan De Meyer [Fri, 14 Jan 2022 16:49:06 +0000 (16:49 +0000)]
journal: Copy holes when archiving BTRFS journal files
Previously, the holes we punched earlier would get removed when
copying the file. Let's enable the new COPY_HOLES flag to make
sure this doesn't happen.
In my test, this drops a 800MB btrfs journal (without compression)
to 720 MB.
meson: drop unused SYSTEMD_STDIO_BRIDGE_BINARY_PATH
The whole point of systemd-stdio-bridge is to be executed on "foreign" systems
where the path might be different, so we use $PATH to find the binary everywhere.
man: enhance the description of systemd-stdio-bridge
I hope that this fixes the comment
https://github.com/systemd/systemd/pull/22141#issuecomment-1013960371
> As someone who doesn't know what this prog does
The listing in the man page is sorted according to logical
use: all the options setting the address are now together.
ci: get Coverity and CodeQL to analyze the "libxkbcommon" part
By analogy with https://github.com/systemd/systemd/pull/22138, to get
the static analyzers to analyze that part of code that package should
be installed there as well.