Fixes the failure found in
https://autopkgtest.ubuntu.com/results/autopkgtest-noble-upstream-systemd-ci-systemd-ci/noble/amd64/s/systemd-upstream/20241115_182040_92382@/log.gz
. Relevant logs:
```
Nov 16 02:48:36 systemd-networkd[2706]: veth99: Reconfiguring with /run/systemd/network/25-dhcp-client-ipv6-only.network.
Nov 16 02:48:36 systemd-networkd[2706]: veth99: NDISC: Started IPv6 Router Solicitation client
Nov 16 02:48:36 systemd-networkd[2706]: veth99: IPv6 Router Discovery is configured and started.
Nov 16 02:48:36 systemd-networkd[2706]: veth99: NDISC: Sent Router Solicitation, next solicitation in 3s
Nov 16 02:48:36 systemd-networkd[2706]: veth99: NDISC: Received Router Advertisement from fe80::1034:56ff:fe78:9abd: flags=0xc0(managed, other), preference=medium, lifetime=30min
Nov 16 02:48:36 systemd-networkd[2706]: veth99: NDISC: Invoking callback for 'router' event.
Nov 16 02:48:36 systemd-networkd[2706]: veth99: link_check_ready(): dynamic addressing protocols are enabled but none of them finished yet.
Nov 16 02:48:36 systemd-networkd[2706]: veth99: DHCPv6 client: Starting in Solicit mode
Nov 16 02:48:36 systemd-networkd[2706]: veth99: DHCPv6 client: State changed: stopped -> solicitation
Nov 16 02:48:36 systemd-networkd[2706]: veth99: Acquiring DHCPv6 lease on NDisc request
Nov 16 02:48:36 systemd-networkd[2706]: veth99: DHCPv6 client: Sent Solicit
Nov 16 02:48:36 systemd-networkd[2706]: veth99: DHCPv6 client: Next retransmission in 1s
Nov 16 02:48:37 systemd-networkd[2706]: veth99: DHCPv6 client: Sent Solicit
Nov 16 02:48:37 systemd-networkd[2706]: veth99: DHCPv6 client: Next retransmission in 1s
Nov 16 02:48:39 systemd-networkd[2706]: veth99: NDISC: Received Neighbor Advertisement from fe80::1034:56ff:fe78:9abd: Router=yes, Solicited=yes, Override=no
Nov 16 02:48:39 systemd-networkd[2706]: veth99: NDISC: Invoking callback for 'neighbor' event.
Nov 16 02:48:39 systemd-networkd[2706]: veth99: DHCPv6 client: Processed Reply message
Nov 16 02:48:39 systemd-networkd[2706]: veth99: DHCPv6 client: T1 expires in 50s
Nov 16 02:48:39 systemd-networkd[2706]: veth99: DHCPv6 client: T2 expires in 55s
Nov 16 02:48:39 systemd-networkd[2706]: veth99: DHCPv6 client: Valid lifetime expires in 2min
Nov 16 02:48:39 systemd-networkd[2706]: veth99: DHCPv6 client: State changed: solicitation -> bound
Nov 16 02:48:39 systemd-networkd[2706]: veth99: DHCPv6 address 2600::15/128 (valid for 1min 59s, preferred for 1min 59s)
Nov 16 02:48:41 systemd-networkd[2706]: veth99: Received updated DHCPv6 address (configured): 2600::15/128 (valid for 1min 58s, preferred for 1min 58s), flags: no-prefixroute, scope: global
Nov 16 02:48:41 systemd-networkd[2706]: veth99: DHCPv6 addresses and routes set.
Nov 16 02:48:41 systemd-networkd[2706]: veth99: link_check_ready(): IPv4LL:no DHCPv4:no DHCPv6:yes DHCP-PD:no NDisc:no
Nov 16 02:48:41 systemd-networkd[2706]: veth99: State changed: configuring -> configured
```
The interface veth99 entered the configured state after 5 seconds, but
at the same time, the `wait_online()` in the test script considered the
test failed.
The function `wait_online()` first invokes
`systemd-networkd-wait-online` with `--timeout=20`, then check setup
states of interfaces with 5 seconds timeout. So, the failure suggests
that `systemd-networkd-wait-online` finishes immediately, as the state
file was not updated when it is invoked, and thus it handles the
interface veth99 already in the configured state.
Yu Watanabe [Wed, 20 Nov 2024 18:43:32 +0000 (03:43 +0900)]
test-network: actually check metric and preference
Otherwise, nexthop ID may contain e.g. 300, then
===
AssertionError: '300' unexpectedly found in
'default nhid 3860882700 via fe80::1034:56ff:fe78:9a99 proto ra metric 512 expires 1798sec pref high\n
default nhid 2639230080 via fe80::1034:56ff:fe78:9a98 proto ra metric 2048 expires 1798sec pref low'
===
killall: gracefully handle processes inserted into containers via nsenter -a
"nsenter -a" doesn't migrate the specified process into the target
cgroup (it really should). Thus the cgroup will remain in a cgroup
that is (due to cgroup ns) outside our visibility. The kernel will
report the cgroup path of such cgroups as starting with "/../". Detect
that and print a reasonably error message instead of trying to resolve
that.
cryptenroll: show better log message if slot to wipe does not exist
```
$ systemd-cryptenroll /dev/vda3
SLOT TYPE
0 password
$ systemd-cryptenroll --wipe-slot 1 /dev/vda3
Failed to wipe slot 1, continuing: No such file or directory
```
test: fix generate-sym-test using the wrong array (#35185)
For my understanding bsearch is searching in the wrong array. Or, if
it's the right one, then the size is wrong. In another commit I made the
arrays different by mistake and that triggered a SIGSEV during tests.
systemctl: grey out tasks limit the same way we grey out the fd store limit in the output
"systemctl status systemd-logind" otherwise looks a bit weird, since the
tasks and the fdstore lines are so close to each other but formatted
quite differently when it comes to coloring.
Mike Yuan [Mon, 18 Nov 2024 18:41:07 +0000 (19:41 +0100)]
core/exec-invoke: suppress placeholder home only in build_environment()
Currently, get_fixed_user() employs USER_CREDS_SUPPRESS_PLACEHOLDER,
meaning home path is set to NULL if it's empty or root. However,
the path is also used for applying WorkingDirectory=~, and we'd
spuriously use the invoking user's home as fallback even if
User= is changed in that case.
Let's instead delegate such suppression to build_environment(),
so that home is proper initialized for usage at other steps.
shell doesn't actually suffer from such problem, but it's changed
too for consistency.
Yu Watanabe [Mon, 18 Nov 2024 05:09:49 +0000 (14:09 +0900)]
network/ndisc: first process options with zero lifetime
Fixes IPv6 Core Conformance test failures reported at #33468.
https://www.ipv6ready.org/docs/Core_Conformance.pdf
Test v6LC.2.2.23 h and j: Processing Router Advertisement with Route
Information Option (Host Only)
When a RA contains route option with ::/0 prefix, then previously that
may contradict with the default route requested with the RA header.
If the route option has zero lifetime, the existing default route should
be removed, and a new route based on the RA header should be configured.
If the route option has non-zero lifetime, the RA header should be
ignored.
So, we first need to process options with zero lifetime (not only
route option, as the similar reasons), then configure the default route
based on the RA, finally process options with non-zero lifetime.
ukify: fix parsing of SignTool configuration option
This partially reverts 02eabaffe98c9a3b5dec1c4837968a4d3e2ff7db.
As noted in https://github.com/systemd/systemd/pull/35211:
> The configuration parsing simply stores the string as-is, rather than
> creating the appropriate object
One way to fix the issue would be to store the "appropriate object", i.e.
actually the class. But that makes the code very verbose, with the conversion
being done in two places. And that still doesn't fix the issue, because we need
to map the class objects back to the original name in error messages.
So instead, store the setting as a string and only map it to the class much
later. This makes the code simpler and fixes the error messages too.
Michał Górny [Sun, 17 Nov 2024 15:34:35 +0000 (16:34 +0100)]
nspawn: Include arm_fadvise64_64 in syscall allow_list
Add the `arm_fadvise64_64` syscall to the allow_list, in addition
to the existing `fadvise64` and `fadvise64_64` syscalls, as this is
the syscall actually defined for `arm` architecture. Adding it fixes
the syscall being rejected in arm32 containers.
Daan De Meyer [Sun, 17 Nov 2024 12:00:59 +0000 (13:00 +0100)]
mkosi: update debian commit reference
* 51cd22f368 Update changelog for 257~rc2-3 release
* 5308c3b905 Backport patch to remove faulty unit test assertion
* b7d805151b Update changelog for 257~rc2-2 release
* 5afc23b288 Backport patch to fix FTBFS due to failing unit test
* 0ca89ce40c Update changelog for 257~rc2-1 release
* f27216d493 Update lintian override to ignore false positive typos
* 2caa74f473 d/rules: adjust blhc override to account for source files being moved
* 6b48328ead systemd-ukify: recommend systemd-repart
* 5e01b67f43 systemd-ukify: downgrade dependency on systemd, not mandatory
* 3a4dd59e41 Install new systemd-keyutil binary in the systemd-repart package
* e64cffab71 Drop all patches, merged upstream
* 0fcef228c7 Update upstream source from tag 'upstream/257_rc2'
* a01322bb29 d/t/control: add more packages to dummy hint-testsuite-triggers
Daan De Meyer [Sun, 17 Nov 2024 12:00:57 +0000 (13:00 +0100)]
mkosi: update fedora commit reference
* 7bd1d09f7f Change sysusers u! lines to u because we don't have support in rpm
* 943bd94cf6 Version 257~rc2
* 6162965002 Disable freezing of user sessions
* 0c236cedb9 Upload sources
* ea947ce068 Version 257~rc1
* 834ba50e79 Use %posttrans instead of %postun to restart services
* 8dafa3810b Disable OpenSSL v3 ENGINE on RHEL
* 8f44e8097d Add forgotten patch
* 86ca699d18 Backport user manager reexec changes
* 009c64d6a2 Use %systemd_preun in systemd-resolved
Daan De Meyer [Fri, 15 Nov 2024 15:40:57 +0000 (16:40 +0100)]
bootctl: Only create loader/keys/auto if required
systemd-boot uses the existance of loader/keys/auto to determine
whether to auto-enroll secure boot or not so only create the directory
if we're actually going to put auto-enroll signature lists in it.
The second check was searching the symbols into the same array, but
using the size of the other. This generated a SIGSEV when they
occassionally mismatched.
Frantisek Sumsal [Fri, 15 Nov 2024 13:31:53 +0000 (14:31 +0100)]
test: ignore inconsistent coverage errors
lcov 2.1 introduced additional consistency checks [0] which make it trip
over our coverage results quite often:
Summary coverage rate:
source files: 915
lines.......: 36.9% (78950 of 214010 lines)
functions...: 53.3% (6906 of 12949 functions)
Message summary:
73 warning messages:
inconsistent: 73
lcov: ERROR: (corrupt) unable to read trace file '/var/tmp/systemd-test-TEST-04-JOURNAL/coverage-info.new': lcov: ERROR: (inconsistent) "/build/src/shutdown/umount.c":298: function 'umount_with_timeout' is not hit but line 317 is.
To skip consistency checks, see the 'check_data_consistency' section in man lcovrc(5).
(use "lcov --ignore-errors inconsistent ..." to bypass this error)
(use "lcov --ignore-errors corrupt ..." to bypass this error)
This is caused by coverage collected during shutdown which is a bit
unreliable, especially towards the final shutdown stage(s). Let's just
ignore the consistency errors for now.
boot: make .hwids PE section more flexible to cover more than DT one day
The proposal in https://github.com/systemd/systemd/pull/35091 suggests
that there are going to be more resources sooner or later that shall be
embeddable in a UKI, but are specific to some machine. The .hwids logic
as it is implemented right now is conceptually flexible enough to cover
that too (as long as the system has SMBIOS and thus CHIDs). Hence, let's
prepare the ground for a future (that might possibly never come, but
let's keep the door open) where the section can be reused for this
purpose.
The patch is really dumb ultimately. it just changes the initial field
in the "Device" struct to carry not just the size of it (as before) but
also a type indicator, that is for now fixed to 1, indicating DT blobs.
This breaks compatibility, hence this should get merged before we do the
v257 release, so that this is done properly before the first release
with .hwids.
pid1: make clear that $WATCHDOG_USEC is set for the shutdown binary, noone else
We use the $WATCHDOG_USEC variable for two very closely uses: as part of
the sd_watchdog_enabled() protocol for implementing service watchdogs.
And as part of the protocol between the service manager and
systemd-shutdown across the PID 1 execve() transition during shutdown.
Apparently some exitrds tools got confused by the latter use. Let's
address that by setting $WATCHDOG_PID to 1, in accordance to the
sd_watchdog_enabled() protocol to make clear this is only intended for
PID 1 and nothing else.
boot: explain the 4G quirks we apply to initrd memory allocations
Given how long it took to come to a conclusion of the discussions around
https://github.com/systemd/systemd/issues/35026, let's add a comment
that makes this easier to grok for the next time this comes up.
Luca Boccassi [Thu, 14 Nov 2024 16:19:25 +0000 (16:19 +0000)]
test: skip TEST-84-STORAGETM if running with bugged libnvme
libnvme 1.11 appears to require a kernel built with NVME TLS
kconfigs, and fails hard if it is not, as the expected
privileged keyring '.nvme' is not present. We cannot just
create it from userspace, as privileged keyrings can only
be created by the kernel itself (those starting with '.').
Skip the test if the library exactly matches this version.
Luca Boccassi [Thu, 14 Nov 2024 16:26:01 +0000 (16:26 +0000)]
ukify: Support building UKIs with .dtbauto and .hwids sections (#34158)
Stub behavior will be as following:
1. If there are no `.dtbauto` sections then is used `.dtb` if present
2. If there are `.dtbauto` sections and there is at least one matching
(either with the firmware-provided DT or via `.hwids`) then it'll be
used instead of the `.dtb`.
Based on #28959 and [dtbloader](https://github.com/TravMurav/dtbloader)
If nspawn is invoked with DevicePolicy= but DeviceAllow= does not
contain /dev/fuse, nspawn will fail to get fuse version with -EPERM.
Let's silence the warning in that case.
andre4ik3 [Thu, 14 Nov 2024 04:20:09 +0000 (08:20 +0400)]
boot/stub: allocate pages for combined initrds below 4GiB only on x86 (#35149)
Outside of x86, some machines (e.g. Apple silicon, AMD Opteron A1100)
have physical memory mapped above 4GiB, meaning this allocation will
fail, causing the entire boot process to fail on these machines.
This commit makes it so that the below-4GB address space allocation
requirement is only set on x86 platforms, and not on other platforms
(that don't have the specific Linux x86 boot protocol), thereby fixing
boot on those that have no memory mapped below 4GiB in their address
space.
Tested on an Apple silicon M1 laptop and an AMD x86_64 desktop tower.
Yu Watanabe [Mon, 11 Nov 2024 17:13:04 +0000 (02:13 +0900)]
network/ndisc: dynamically configure nexthops when routes with gateway are requested
Previously, when multiple routers send RAs with the same preference,
then the kernel merges routes with the same gateway address:
===
default proto ra metric 1024 expires 595sec pref medium
nexthop via fe80::200:10ff:fe10:1060 dev enp0s9 weight 1
nexthop via fe80::200:10ff:fe10:1061 dev enp0s9 weight 1
===
This causes IPv6 Conformance Test v6LC.2.2.11 failure, as reported in #33470.
To avoid the coalescing issue, we can use nexthop, as suggested by Ido Schimmel:
https://lore.kernel.org/netdev/ZytjEINNRmtpadr_@shredder/
> BTW, you can avoid the coalescing problem by using the nexthop API.
> # ip nexthop add id 1 via fe80::200:10ff:fe10:1060 dev enp0s9
> # ip -6 route add default nhid 1 expires 600 proto ra
> # ip nexthop add id 2 via fe80::200:10ff:fe10:1061 dev enp0s9
> # ip -6 route append default nhid 2 expires 600 proto ra
> # ip -6 route
> fe80::/64 dev enp0s9 proto kernel metric 256 pref medium
> default nhid 1 via fe80::200:10ff:fe10:1060 dev enp0s9 proto ra metric 1024 expires 563sec pref medium
> default nhid 2 via fe80::200:10ff:fe10:1061 dev enp0s9 proto ra metric 1024 expires 594sec pref medium
Yu Watanabe [Wed, 6 Nov 2024 18:43:50 +0000 (03:43 +0900)]
network: keep all dynamically acquired configurations when KeepConfiguration=dhcp-on-stop
By the previous commit, configuration source of addresses and routes are
saved on stop and restored on start. Hence, we can keep dynamic
configurations on stop.
Currently, only configuration sources and providers of addresses and
routes are serialized/deserialized.
This should mostly not change behavior, as dynamic (except for DHCPv4)
configurations will be dropped before stopping networkd, and for DHCPv4
protocol, we have already had another logic to handle DHCPv4
configurations.
Preparation for later commits.