cgroup: downgrade warning if we can't get ID off cgroup
The cgroupid feature was not available in old cgroupvs2 kernels, hence
try to get it but if we can't because it's not supported, then only
debug log about it and proceed.
(We only needs this for cgroup bpf stuff, but that isn't available on
such old kernels anyway)
basic: do not warn in mkdir_p() when parent directory exists
This effectively disables warnings about type/mode/ownership of existing
directories when recursively creating parent directories. (Or files. If there's
a file in a place we expect a directory, the code will later try to create
a file and fail. This follows the general pattern where we do (void)mkdir()
if the mkdir() is immediately followed by opening of a file.)
I was recently debugging an issue with the fstab-generator [1], and it says:
'Directory "/tmp" already exists, but has mode 0777 that is too permissive (0644 was requested), refusing.'
which is very specific but totally wrong in this context.
This output was added in 37c1d5e97dbc869edd8fc178427714e2d9428d2b, and I still
think it is worth to do it, because if you actually *do* want the directory, if
there's something wrong, the precise error message will make it much easier to
diagnose. And we can't easily pass the information what failed up the call chain
because there are multiple things we check (ownership, permission mask, type)…
So passing a param whether to warn or not down into the library code seems like
the best solution, despite not being very elegant.
when they go down resolved prints
```
Event source mdns-ipv4 (type io) returned error, disabling
```
instead of
```
Event source n/a (type io) returned error, disabling
```
Frantisek Sumsal [Thu, 10 Feb 2022 16:19:27 +0000 (17:19 +0100)]
tree-wide: move `unsigned` to the start of type declaration
Even though ISO C11 doesn't mandate in which order the type specifiers
should appear, having `unsigned` at the beginning of each type
declaration feels more natural and, more importantly, it unbreaks
Coccinelle, which has a hard time parsing `long unsigned` and others:
```
init_defs_builtins: /usr/lib64/coccinelle/standard.h
init_defs: /home/mrc0mmand/repos/systemd/coccinelle/macros.h
HANDLING: src/shared/mount-util.c
: 1: strange type1, maybe because of weird order: long unsigned
```
Most of the codebase already "complies", so let's fix the remaining
"offenders".
Frantisek Sumsal [Thu, 10 Feb 2022 10:59:27 +0000 (11:59 +0100)]
test: document how to manually run Ubuntu CI stuff
Every time I need it I have to first relearn autopkgtest and find where
all the necessary stuff lives, so let's document it somewhere close to
systemd for (at least) future me.
Frantisek Sumsal [Thu, 10 Feb 2022 11:29:53 +0000 (12:29 +0100)]
test: accept GC'ed units in newer LVM
Since lvm 2.03.15 the transient units are started without `-r`, thus
disappearing once they finish and breaking the test (which expects them
to remain loaded after finishing). Let's accept `LoadState=not-found` as
a valid result as well to fix this.
Joerie de Gram [Tue, 8 Feb 2022 13:56:26 +0000 (22:56 +0900)]
network: attempt to trigger kernel IPv6LL address generation
Try to ensure kernel IPv6 link local address generation occurs by
setting the per-if addr_gen_mode sysctl when the link is already up,
instead of the netlink interface (IFLA_INET6_ADDR_GEN_MODE).
The netlink setting is sufficient in cases where the interface is not
yet up when networkd configures an interface - bringing the interface
up will trigger in-kernel address generation.
If the interface is already up, yet the interface has no IPv6LL assigned
setting IFLA_INET6_ADDR_GEN_MODE has no effect.
Writing the addr_gen_mode sysctl is a best effort attempt at triggering
address generation regardless of interface state because it also works
in cases where the interface is already up.
Alvin Šipraga [Thu, 10 Feb 2022 07:19:28 +0000 (08:19 +0100)]
udev/net: support Match.Firmware= in .link files (#22462)
In cbcdcaaa0ec5 ("Add support for conditions on the machines firmware")
a new Firmware= directive was added for .netdev and .network files.
While it was also documented to work on .link files, in actual fact the
support was missing. Add that one extra line to make it work, and also
update the fuzzer directives.
Luca Boccassi [Tue, 8 Feb 2022 13:19:52 +0000 (13:19 +0000)]
meson: disable export-dbus-interfaces target when cross-compiling
ERROR:
Cannot use target systemd as a generator because it is built for the
host machine and no exe wrapper is defined or needs_exe_wrapper is
true. You might want to set `native: true` instead to build it for
the build machine.
Santa Wiryaman [Mon, 3 May 2021 22:48:26 +0000 (18:48 -0400)]
Add support for `isolated` parameter
Add the "Isolated" parameter in the *.network file, e.g.,
[Bridge]
Isolated=true|false
When the Isolated parameter is true, traffic coming out of this port
will only be forward to other ports whose Isolated parameter is false.
When Isolated is not specified, the port uses the kernel default
setting (false).
The "Isolated" parameter was introduced in Linux 4.19.
See man bridge(8) for more details.
But even though the kernel and bridge/iproute2 recognize the "Isolated"
parameter, systemd-networkd did not have a way to set it.
some actions like Coverity and CFLite aren't run on every PR so to make
sure they are more or less fine when they are changed it makes sense to
at least check them with superlinter/actionlint: https://github.com/rhysd/actionlint
The following warnings were fixed along the way:
```
.github/workflows/mkosi.yml:55:7: shellcheck reported issue in this script: SC2086:info:6:14: Double quote to prevent globbing and word splitting [shellcheck]
|
55 | run: |
| ^~~~
.github/workflows/mkosi.yml:55:7: shellcheck reported issue in this script: SC2046:warning:6:40: Quote this to prevent word splitting [shellcheck]
|
55 | run: |
| ^~~~
.github/workflows/mkosi.yml:55:7: shellcheck reported issue in this script: SC2006:style:6:40: Use $(...) notation instead of legacy backticked `...` [shellcheck]
|
55 | run: |
| ^~~~
```
```
.github/workflows/coverity.yml:31:9: shellcheck reported issue in this script: SC2086:info:1:93: Double quote to prevent globbing and word splitting [shellcheck]
|
31 | run: echo "COVERITY_SCAN_NOTIFICATION_EMAIL=$(git log -1 ${{ github.sha }} --pretty=\"%aE\")" >> $GITHUB_ENV
| ^~~~
```
Curtis Klein [Sun, 10 Oct 2021 01:18:54 +0000 (18:18 -0700)]
watchdog: saturate to kernel's max watchdog timeout
Since version 4.5, the max possible timeout is UINT_MAX / 1000 since it
does calculations in milliseconds. A small helper function is added to
make this conversion and saturation and will be used more in the next
commit.
Also document the usage of signed integers by the kernel userspace API.
sd-boot: return 0 (not 1) from ticks_read() in fallback implementation
The single consumer of ticks_read() (i.e. time_usec()) checks for == 0
to detect the "not supported/invalid" case, hence actually return the
right value for that.
We should suppress the TSC data when we generate it if we assume its
invalid, not when we consume it, because at that point we don't even
know if the data stems from TSC or something else.
coredump: raise the coredump save size on 64bit systems to 32G (and lower it to 1G on 32bit systems)
Apparently 2G is too low for various real-life systems. But raising it
universally above 2^32 sounds wrong to me, since that makes no sense on
32bit systems, that we still support.
Hence, let's raise the limit to 32G on 64bit systems, and *lower* it to
1G on 32bit systems.
32G is 4 orders of magnitude higher then the old settings. Let's hope
that's enough for now. Should this not be enough we can raise it
further.
This queries the sector size from libfdisk instead of assuming 512, and
uses that when converting from bytes to the offset/size values libfdisk
expects.
This is an alternative to Tom Yan's #21823, but prefers using libfdisk's
own ideas of the sector size instead of going directly to the backing
device via ioctls. (libfdisk can after all also operate on regular
files, where the sector size concept doesn't necessarily apply the same
way.)
This also makes the "grain" variable, i.e. how we'll align the
partitions. Previously this was hardcoded to 4K, and that still will be
the minimum grain we use, but should the sector size be larger than that
we'll use the next multiple of the sector size instead.
Yu Watanabe [Sat, 5 Feb 2022 13:31:06 +0000 (22:31 +0900)]
resolve: reuse timer event source for DnsQuery
If the query get CNAME or DNAME, then the query will be restarted.
Even in that case, previously, the event source was freed and allocated
again. Let's slightly optimize it.
Yu Watanabe [Sat, 5 Feb 2022 13:03:19 +0000 (22:03 +0900)]
resolve: fix possible memleak
Fortunately, unlike the issue fixed in the previous commit, the memleak
should be superficial and not become apparent, as the queries handled
here are managed by the stub stream, and will be freed when the stream
is closed.
Just for safety, and slightly reducing the runtime memory usage by the
stub stream.
Yu Watanabe [Sat, 5 Feb 2022 12:37:01 +0000 (21:37 +0900)]
resolve: fix potential memleak and use-after-free
When stub stream is closed early, then queries associated to the stream
are freed. Previously, the timer event source for queries may not be
disabled, hence may be triggered with already freed query.
See also dns_stub_stream_complete().
Note that we usually not set NULL or zero when freeing simple objects.
But, here DnsQuery is large and complicated object, and the element may
be referenced in subsequent freeing process in the future. Hence, for
safety, let's set NULL to the pointer.
Benjamin Berg [Mon, 7 Feb 2022 16:34:21 +0000 (17:34 +0100)]
oom: Cleanup of information dump code after kill
This is a follow up to 29f4185a9cdc ("oomd: Dump top offenders after a
kill action") to clean up the code a bit for review comments that
happened after the code had been merged already.
Coverity (and I, initially) get really confused about "fn"'s validity
here. it doesn't grok that free_and_strdup() is actually a NOP in case
the string isn't changed, and assumes it always invalidates the
specified buffer, which it doesn't do though.
Daan De Meyer [Mon, 7 Feb 2022 20:19:29 +0000 (20:19 +0000)]
journal: Improve handling of corruption during upwards entry iteration
If we're going upwards in the journal file during entry iteration and we
can't reach the current entry due to corruption, start iterating upwards
from the last reachable entry array. This is equivalent to skipping
all entries in the array that can't be reached anymore.
Daan De Meyer [Mon, 7 Feb 2022 20:15:07 +0000 (20:15 +0000)]
journal: Fix upwards iteration of entry items in case of corruption
8d801e35cb155faa08235a5af8b4d6ad60715837 didn't take into account
upwards iteration of entry items when we're working on a corrupted
journal file. Instead of moving to the previous entry array, we'd
always move to the next array, regardless of the iteration direction.
To fix this, we introduce bump_entry_array() that moves to the next
or previous entry array depending on the given direction. Since the
entry array chains are singly linked lists, we have to start iterating
from the front to find the previous array. We only reach this logic
if we're working on a corrupted journal file so being slow here shouldn't
matter too much.
tests: also fuzz packets sent in the DHCP6_STATE_SOLICITATION state
With aborts enabled the fuzzer can catch issues like
https://github.com/systemd/systemd/commit/26a63b81322a3bd8b9fbd43f75897c391708de2c
Let's extend it a bit to let it cover issues like
https://github.com/systemd/systemd/pull/22406#discussion_r798932098
Thomas Haller [Thu, 3 Feb 2022 17:55:18 +0000 (18:55 +0100)]
sd-dhcp6-client: fix sending prefix delegation request during rebind
Fixes an assertion failure "pd->type == SD_DHCP6_OPTION_IA_PD" in dhcp6_option_append_pd().
Something similar was done in commit 26a63b81322a ('sd-dhcp6-client: Fix
sending prefix delegation request (#17136)'). The justification is
probably the same.
In D-Bus, clients connect to a bus (the usual case), or use direct
questions to each other (the unusual case). A bus is a program one can
connect to and implemented by dbus-daemon or dbus-broker. HOwever,
busses never connect between each other, that doesn't exist. Hence don't
claim so.
This is probably confusion about the fact that sd-bus calls D-Bus
connection objects just "sd_bus" for simplicity, given they are used in
99% of the cases to connect to a bus — only in exceptional cases they
are used for direct connections between peers without involving a bus.
journal-file: explicitly handle file systems that do not support hole punching
Apparently the error code fallocate() returns if hole punching is not
supported is not too well defined (man page just says "an error is
returned"), hence let's accept the usual set of errors, and the
normalize it to EOPNOTSUPP, and generate a clear error message in this
case.
Michael Olbrich [Wed, 2 Feb 2022 14:33:07 +0000 (15:33 +0100)]
shutdown: don't stop the watchdog
This basically reverts #22079.
Stopping the watchdog is wrong. The reboot watchdog is supposed to cover
the whole time from the point when systemd start systemd-reboot until the
hardware resets.
Otherwise the system may hang in the final shutdown phase.
Add a comment, why keeping the watchdog running is correct here.
Michael Olbrich [Wed, 2 Feb 2022 14:26:53 +0000 (15:26 +0100)]
watchdog: fix watchdog_set_device() when the default watchdog device is used
If watchdog_set_device() is not called before open_watchdog() then
'watchdog_device' remains 'NULL' while the device is open.
As a result, the "same device" check in watchdog_set_device() does not work
correctly: If no device is specified (e.g. from watchdog_free_device())
then the current fd is not closed.
Fix this by setting 'watchdog_device' to the correct device during
open_watchdog()