Rename the normalize_mounts() helper to drop_unused_mounts. All the
helpers called in there get rid of mounts that are unused for a variety
of reasons. And whereas the helpers are aptly prefixed with "drop" the
overall helper isn't and instead uses "normalize".
Make it more obvious what the helper actually does by renaming it from
normalize_mounts() to drop_unused_mounts(). Readers of code calling this
helper will immediately see that it will get rid of unused mounts.
core/namespace: allow using ProtectSubset=pid and ProtectHostname=true together
If a service requests both ProtectSubset=pid and ProtectHostname=true
then it will currently fail to start. The ProcSubset=pid option
instructs systemd to mount procfs for the service with subset=pid which
hides all entries other than /proc/<pid>. Consequently trying to
interact with the two files /proc/sys/kernel/{hostname,domainname}
covered by ProtectHostname=true will fail.
Fix this by only performing this check when ProtectSubset=pid is not
requested. Essentially ProtectSubset=pid implies/provides
ProtectHostname=true.
core: add %y/%Y specifiers for the fragment path of the unit
Fixes #6308: people want to be able to link a unit file via 'systemctl enable'
from a git checkout or such and refer to other files in the same repo.
The new specifiers make that easy.
%y/%Y is used because other more obvious choices like %d/%D or %p/%P are
not available because at least on of the two letters is already used.
The new specifiers are only available in units. Technically it would be
trivial to add then in [Install] too, but I don't see how they could be
useful, so I didn't do that.
I added both %y and %Y because both were requested in the issue, and because I
think both could be useful, depending on the case. %Y to refer to other files
in the same repo, and %y in the case where a single repo has multiple unit files,
and e.g. each unit has some corresponding asset named after the unit file.
Yu Watanabe [Thu, 20 Jan 2022 18:03:45 +0000 (03:03 +0900)]
resolve: refuse to resolve empty hostname
Previously, varlink or dbus methods return
io.systemd.Resolve.NoNameServers or BUS_ERROR_NO_NAME_SERVERS if an
empty hostname is provided, and thus nss-resolve returns NSS_STATUS_TRYAGAIN.
That causes getaddrinfo() returns 'Temporary failure in name resolution'
instead of 'Name or service not known'.
This makes calling varlink or dbus method with an empty hostname result
-EINVAL, and hence nss-resolve returns NSS_STATUS_NOTFOUND.
Anita Zhang [Wed, 19 Jan 2022 21:26:01 +0000 (13:26 -0800)]
oomd: handle situations when no cgroups are killed
Currently if systemd-oomd doesn't kill anything in a selected cgroup, it
selects a new candidate immediately. But if a selected cgroup wasn't killed,
it is likely due to it disappearing or getting cleaned up between the time
it was selected as a candidate and getting sent SIGKILL(s). We should handle
it as though systemd-oomd did perform a kill so that it will check
swap/pressure again before it tries to select a new candidate.
Anita Zhang [Wed, 19 Jan 2022 18:40:46 +0000 (10:40 -0800)]
oomd: fix race with path unavailability when killing cgroups
There can be a situation where systemd-oomd would kill all of the processes
in a cgroup, pid1 would clean up that cgroup, and systemd-oomd would get
ENODEV trying to iterate the cgroup a final time to ensure it was empty.
systemd-oomd sees this as an error and immediately picks a new candidate even
though pressure may have recovered. To counter this, check and handle
path unavailability errnos specially.
We would busily allocate an empty string to concatenate all of it's
zero characters to the output. Let's make things a bit simpler by letting
the specifier functions return NULL to mean "nothing to append".
Yu Watanabe [Thu, 20 Jan 2022 19:46:14 +0000 (04:46 +0900)]
resolve: drop redundant call of link_allocate_scopes() and link_add_rrs()
In `manager_process_link()`, the function `link_update()` is called just
after `link_process_rtnl()`, and `link_update()` also calls
`link_allocate_scopes()` and `link_add_rrs()`. Hence, the calls in
`link_process_rtnl()` are redundant.
by always calling journal_remote_server_destroy, which resets global
variables like journal_remote_server_global. It should prevent crashes like
```
Assertion 'journal_remote_server_global == NULL' failed at src/journal-remote/journal-remote.c:312, function int journal_remote_server_init(RemoteServer *, const char *, JournalWriteSplitMode, _Bool, _Bool)(). Aborting.
AddressSanitizer:DEADLYSIGNAL
=================================================================
==24769==ERROR: AddressSanitizer: ABRT on unknown address 0x0539000060c1 (pc 0x7f23b4d5818b bp 0x7ffcbc4080c0 sp 0x7ffcbc407e70 T0)
SCARINESS: 10 (signal)
#0 0x7f23b4d5818b in raise /build/glibc-eX1tMB/glibc-2.31/sysdeps/unix/sysv/linux/raise.c:51:1
#1 0x7f23b4d37858 in abort /build/glibc-eX1tMB/glibc-2.31/stdlib/abort.c:79:7
#2 0x7f23b5731809 in log_assert_failed systemd/src/basic/log.c:866:9
```
systemd seems to be failing to compile there with gcc-12 but considering
that gcc-12 hasn't been released yet it doesn't seem to make sense
to add workarounds to get it to compile there. Until gcc-12 is
stabilized it should be enough to build systemd on fedora-35 to
make sure it's buildable on i386.
Luca Boccassi [Wed, 19 Jan 2022 00:27:45 +0000 (00:27 +0000)]
sysext: use LO_FLAGS_PARTSCAN when opening image
Jan 17 12:34:59 myguest1 (sd-sysext)[486]: Device '/var/lib/extensions/myext.raw' is loopback block device with partition scanning turned off, please turn it on.
Yu Watanabe [Fri, 14 Jan 2022 08:24:49 +0000 (17:24 +0900)]
udev/net: also support [SR-IOV] section in .link files
The same section is already supported by .network files. But such
low-level inteerface setting should be done by udevd, instead of
networkd. Let's also support the same semantics by .link files.
Prompted by https://github.com/systemd/systemd/issues/20474#issuecomment-901901360.
Luca Boccassi [Wed, 19 Jan 2022 00:01:48 +0000 (00:01 +0000)]
dissect-image: validate extension-release even if the host has only ID in os-release
A rolling distro won't set VERSION_ID or SYSEXT_LEVEL in os-release,
which means we skip validation of ExtensionImages.
Validate even with just an ID, the lower level helper already
recognizes and accepts this use case.
When bootctl lists the boot entries, considers also the ones
returned by systemd-boot (via the efi LoaderEntries variable),
created at boot time.
Unfortunately this list may became incorrect if (e.g.) the user remove a
kernel package.
This patch changes this behaviour, so bootctl ignores some the
boot entries returned by systemd-boot.
In any case, bootctl still considers the 'auto-xxx' boot entries
listed below:
Boot entrie name Title
----------------------------- ------------------------------
auto-osx macOS boot loader
auto-windows Windows Boot Manager
auto-efi-shell EFI Shell
auto-efi-default EFI Default Loader
auto-reboot-to-firmware-setup Reboot Into Firmware Interface
The other entries that systemd-boot synthetizes (e.g. the ones loaded from
/efi/loader/entries/<uuid>) can be synthetized by bootctl too, so no
information is lost.
Jan Janssen [Tue, 18 Jan 2022 12:01:39 +0000 (13:01 +0100)]
boot: Simplify looking for the xboot hard drive
The device path should not contain multiple hard drive nodes in it,
so looking at them all should not be needed.
If some crazy firmware/driver were to make nested GPT drives
available like that, we should be only looking at the last partition
and its containing GPT drive anyway.
Looks like https://github.com/mesonbuild/meson/issues/957 was
reintroduced in meson-0.57.0 (and looking and https://mesonbuild.com/Release-notes-for-0-57-0.html
I'm not sure whether it was intentional or not) so run_command can no
longer be used to get around
https://github.com/mesonbuild/meson/issues/3589. Let's just force
ctags to always use absolute paths to fix it once and for all.
Daan De Meyer [Fri, 14 Jan 2022 16:49:06 +0000 (16:49 +0000)]
journal: Copy holes when archiving BTRFS journal files
Previously, the holes we punched earlier would get removed when
copying the file. Let's enable the new COPY_HOLES flag to make
sure this doesn't happen.
In my test, this drops a 800MB btrfs journal (without compression)
to 720 MB.
meson: drop unused SYSTEMD_STDIO_BRIDGE_BINARY_PATH
The whole point of systemd-stdio-bridge is to be executed on "foreign" systems
where the path might be different, so we use $PATH to find the binary everywhere.
man: enhance the description of systemd-stdio-bridge
I hope that this fixes the comment
https://github.com/systemd/systemd/pull/22141#issuecomment-1013960371
> As someone who doesn't know what this prog does
The listing in the man page is sorted according to logical
use: all the options setting the address are now together.
ci: get Coverity and CodeQL to analyze the "libxkbcommon" part
By analogy with https://github.com/systemd/systemd/pull/22138, to get
the static analyzers to analyze that part of code that package should
be installed there as well.
Daan De Meyer [Fri, 14 Jan 2022 16:41:28 +0000 (16:41 +0000)]
shared: Copy holes in sparse files in copy_bytes_full()
Previously, all holes in sparse files copied with copy_bytes_full()
would be expanded in the target file. Now, we correctly detect holes
in the input file and we replicate them in the target file.