Martin Wilck [Fri, 16 Sep 2022 19:36:52 +0000 (21:36 +0200)]
feat(nvmf): add code for parsing the NBFT
Add code to parse the NVMe-oF Boot Firmware Table (NBFT) according
to the NVM Express Boot Specification 1.0 [1]. The implementation in
dracut follows the same general approach as the iBFT support in the
iscsi module.
NBFT support requires two steps:
(1) Setting up the network and routing according to the
HFI ("Host Fabric Interface") records in the NBFT,
(2) Establishing the actual NVMe-oF connection.
(1) is accomplished by reading the NBFT using JSON output from
the "nvme nbft show" command, and transforming it into command
line options ("ip=", "rd.neednet", etc.) understood by dracut's
network module and its backends. The resulting network setup code
is backend-agnostic. It has been tested with the "network-legacy"
and "network-manager" network backend modules. The network setup
code supports IPv4 and IPv6 with static, RA, or DHCP configurations,
802.1q VLANs, and simple routing / gateway setup.
(2) is done using the "nvme connect-all" command [2] in the netroot handler,
which is invoked by networking backends when an interface gets fully
configured. This patch adds support for "netroot=nbft". The "nbftroot"
handler calls nvmf-autoconnect.sh, which contains the actual connect
logic. nvmf-autoconnect.sh itself is preserved, because there are
other NVMe-oF setups like NVMe over FC which don't depend on the
network.
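As a rough illustration of step (1), the sketch below builds an "ip=" option
from values that would come from an HFI record. The function name and its
arguments are hypothetical, only IPv4 static and DHCP are covered, and the
real code reads its input from the JSON output of "nvme nbft show".

```shell
# Hypothetical helper, not the code added by this patch.
# Usage: hfi_ip_arg <interface> dhcp
#        hfi_ip_arg <interface> static <address> <gateway> <netmask>
hfi_ip_arg() {
    iface=$1
    mode=$2
    case $mode in
        dhcp)
            # Short form: ip=<interface>:dhcp
            echo "ip=${iface}:dhcp"
            ;;
        static)
            # Long form: ip=<client-IP>:[<peer>]:<gateway-IP>:<netmask>:<hostname>:<interface>:none
            echo "ip=$3::$4:$5::${iface}:none"
            ;;
        *)
            return 1
            ;;
    esac
}
```

For example, "hfi_ip_arg eth0 dhcp" prints "ip=eth0:dhcp".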
The various ways to configure NVMe-oF are prioritized like this:
1 FC autoconnect from kernel commandline (rd.nvmf.discover=fc,auto)
2 NBFT, if present
3 discovery.conf or config.json, if present, and cmdline.d parameters,
if present (rd.nvmf.discover=...)
4 FC autoconnect (without kernel command line)
The reason for this prioritization is that in the initial RAM fs, we try
to activate only those connections that are necessary to mount the root
file system. This avoids confusion, possibly contradicting or ambiguous
configuration, and timeouts from unavailable targets.
A retry logic is implemented for enabling the NVMe-oF connections,
using the "settled" initqueue, the netroot handler, and eventually, the
"timeout" initqueue. This is similar to the retry logic of the iscsi module.
In the "timeout" case, connection to all possible NVMe-oF subsystems
is attempted.
Two new command line parameters are introduced to make it possible to
change the priorities above:
- "rd.nvmf.nonbft" causes the NBFT to be ignored,
- "rd.nvmf.nostatic" causes any statically configured NVMe-oF targets
(config.json, discovery.conf, and cmdline.d) to be ignored.
These parameters may be helpful to skip attempts to set up broken
configurations.
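The priority order and the effect of the two parameters can be summarized
with a small sketch. The function name, its yes/no arguments, and the
printed labels are hypothetical; the module itself derives these conditions
from the kernel command line and the files present in the initramfs.

```shell
# Hypothetical helper, not the code added by this patch.
# Arguments (each "yes" or "no"):
#   $1 FC autoconnect requested on the kernel command line
#   $2 NBFT present
#   $3 static configuration present (discovery.conf, config.json, cmdline.d)
#   $4 rd.nvmf.nonbft given
#   $5 rd.nvmf.nostatic given
nvmf_pick_source() {
    fc_cmdline=$1
    nbft=$2
    static=$3
    nonbft=$4
    nostatic=$5
    if [ "$fc_cmdline" = yes ]; then
        echo fc-cmdline
    elif [ "$nbft" = yes ] && [ "$nonbft" != yes ]; then
        echo nbft
    elif [ "$static" = yes ] && [ "$nostatic" != yes ]; then
        echo static
    else
        echo fc-auto
    fi
}
```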
At initramfs build time, the nvmf module is now enabled if an NBFT
table is detected in the system.
[1] https://nvmexpress.org/wp-content/uploads/NVM-Express-Boot-Specification-2022.11.15-Ratified.pdf
[2] NBFT support in nvme-cli requires the latest upstream code (> v2.4).
Signed-off-by: Martin Wilck <mwilck@suse.com>
Co-authored-by: John Meneghini <jmeneghi@redhat.com>
Co-authored-by: Charles Rose <charles.rose@dell.com>
Martin Wilck [Thu, 9 Mar 2023 15:55:36 +0000 (16:55 +0100)]
fix(nvmf): support /etc/nvme/config.json
Since nvme-cli 2.0, configuration of subsystems to connect to is
stored under `/etc/nvme` in either `discovery.conf` or `config.json`.
Attempt discovery also if the latter exists, but not the former.
Also, install "config.json" if it's present on the root FS.
As before, "rd.nvmf.discover=fc,auto" will force either file to be ignored,
and NBFT-defined targets take precedence if found.
Martin Wilck [Thu, 12 Jan 2023 10:06:35 +0000 (11:06 +0100)]
fix(nvmf): install 8021q module unconditionally
In NBFT setups, VLAN can be configured in the firmware.
Add the 8021q module in hostonly mode even if VLAN is currently
not used to be prepared for such configuration change.
Andrew Ammerlaan [Sat, 17 Jun 2023 06:18:55 +0000 (08:18 +0200)]
fix(install.d): respect more kernel-install env variables
- If kernel-install has defined a staging area for us
(KERNEL_INSTALL_STAGING_AREA), install the generated initrd/uki.efi there.
The actual install is then handled by 90-loaderentry.install or
90-uki-copy.install.
- Also skip regeneration if a uki.efi already exists.
- Pass --kernel-image to dracut; this is required to generate a UKI (uefi=yes).
- Add the --no-uefi argument to dracut rescue image generation; this ensures that
it at least installs correctly. TODO: Rework 51-dracut-rescue.install to also
work with UKIs.
This fixes installing a kernel with uefi=yes in dracut config and layout=uki
in kernel/install.conf.
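A minimal sketch of the staging-area selection described above. The function
name and the fallback directory argument are hypothetical, and the plugin's
actual logic may differ in detail.

```shell
# Hypothetical helper, not the code from the install.d plugin.
# Prints where the generated image would be written.
# $1: fallback boot directory used when no staging area is defined
initrd_target() {
    boot_dir=$1
    staging=${KERNEL_INSTALL_STAGING_AREA:-}
    if [ -n "$staging" ] && [ -d "$staging" ]; then
        if [ "${KERNEL_INSTALL_LAYOUT:-}" = uki ]; then
            echo "$staging/uki.efi"
        else
            echo "$staging/initrd"
        fi
    else
        echo "$boot_dir/initrd"
    fi
}
```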
Signed-off-by: Andrew Ammerlaan <andrewammerlaan@gentoo.org>
fix(bluetooth): include it if Appearance matches the value assigned for keyboard
Following the Bluetooth spec [1], Assigned Numbers Document, Rev. 2023-05-04,
Section 2.6.3, Appearance Sub-category, the Appearance value defined for
keyboards is 0x03C1.
This value must be checked to include the bluetooth module in hostonly mode,
because some Bluetooth keyboards do not set the Class attribute.
fix(dracut-init.sh): correct check in `is_qemu_virtualized` function
Do not redirect `systemd-detect-virt` to /dev/null, otherwise, the `vm` variable
is always empty. This function was working only thanks to the following /sys
check.
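The effect can be reproduced with any command. In the sketch below,
`echo kvm` stands in for `systemd-detect-virt` (an assumption made purely
for illustration):

```shell
# Redirecting stdout inside the command substitution leaves nothing to capture.
vm_broken=$(echo kvm > /dev/null 2>&1)

# Discarding only stderr keeps the output available to the variable.
vm_fixed=$(echo kvm 2> /dev/null)
```

Here `vm_broken` is empty, while `vm_fixed` holds the command output.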
Henrik Gombos [Wed, 14 Jun 2023 19:17:20 +0000 (19:17 +0000)]
ci: add dependencies to Debian container
- add systemd-boot-efi for test 18
- tgt is needed for test 30 and 35
- nbd-server is needed for test 40
- gawk dependency has been introduced by f32e95bcadbc5158843530407adc1e7b700561b1
- install dracut instead of initramfs-tools to match actual usage
- remove workaround for https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=962300
as it is now fixed on Debian 12.
fix(systemd-pcrphase): only include systemd-pcrphase-initrd.service
The only systemd-pcrphase related unit configured to run in the initrd is
systemd-pcrphase-initrd.service.
Both systemd-pcrphase.service and systemd-pcrphase-sysinit.service contain
`ConditionPathExists=!/etc/initrd-release`.
Henrik Gombos [Fri, 2 Jun 2023 14:05:12 +0000 (14:05 +0000)]
ci: cleanup containers
Remove /etc/profile.d/dracut-test.sh from test containers
No use to override default command
Remove references to docker. These files work just fine
with podman as well.
Frederick Grose [Sun, 21 May 2023 22:05:12 +0000 (18:05 -0400)]
fix(resolve-deps): check the existing file—not the source
Check for dependencies on the file actually installed, otherwise,
such as when the shebang is a link not equal to the initramfs link,
the wrong file may be tested.
If some "finished" initscripts keep failing, dracut will start
printing warnings after a while. But it will warn about all scripts
in the finished initqueue, not only those that have failed. That
makes it difficult to identify the script that has actually
caused the failure.
To avoid this, delete finished initqueue scripts when they succeed.
Also, instead of returning as soon as one of the scripts fails, try
all scripts, deleting those that succeed, and return failure if at
least one script failed.
If a previously deleted script is recreated by some other part of
the code, it will be re-run the next time the check_finished() function
is called, and will be re-deleted if it still succeeds.
The only case where I see that this might cause issues is if some
condition needs to be tested over and over again, because it succeeds
and then fails later (for example, a device showing up and then being
removed again). But I think that this is not the intended logic.
In general, when a device shows up or another "finished" condition
is met, we assume that this condition will hold at least until the
initramfs switches root and exits. If all conditions are met, the
current code will also exit the initqueue without retrying any of
the conditions again.
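The new behavior can be sketched as follows. This is a simplified
illustration under stated assumptions, not the actual check_finished()
code: the function name and the directory argument are hypothetical, and
the scripts are run with `sh` here.

```shell
# Hypothetical sketch: try every script in the given directory, delete the
# ones that succeed, and report failure if at least one script failed.
check_finished_sketch() {
    dir=$1
    rc=0
    for f in "$dir"/*.sh; do
        [ -e "$f" ] || continue   # no scripts left: glob did not match
        if sh "$f"; then
            rm -f "$f"            # condition met, no need to test it again
        else
            rc=1                  # remember the failure, keep going
        fi
    done
    return "$rc"
}
```

After a failing call, only the scripts that have not yet succeeded remain
in the directory, which makes it easier to see what is blocking the boot.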
fix(dracut-systemd): rootfs-generator cannot write outside of generator dir
systemd.generator(7) has long documented that generators must not write to
locations other than those passed as arguments. Since
https://github.com/systemd/systemd/commit/ca6ce62d systemd also executes
generators in a mount namespace "sandbox", so the hooks created by the
rootfs-generator are now lost.
These hooks are created using the root= cmdline argument, so this patch moves
the creation of these hooks to a cmdline hook.
keentux [Wed, 22 Mar 2023 10:40:39 +0000 (10:40 +0000)]
fix(dracut.sh): handle imagebase for uefi
* UEFI creation didn't handle the ImageBase data for the PE file
generation. Creating a UKI from a stub file with a non-zero ImageBase
logs warnings and results in wrong file offsets, so the EFI binary
becomes unloadable.
* This commit parses the PE file header, reads the ImageBase, and applies
it in the objcopy command.
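A minimal sketch of the address arithmetic, assuming the ImageBase has
already been read from the PE header. The helper name and the example
values are hypothetical; the result is the kind of value that could be
passed to objcopy's --change-section-vma option.

```shell
# Hypothetical helper: add the PE ImageBase to a section offset and print
# the resulting address in hexadecimal.
# $1: ImageBase, $2: section offset (decimal or 0x-prefixed hex)
section_vma() {
    printf '0x%x\n' $(( $1 + $2 ))
}
```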
Daniel McIlvaney [Fri, 28 Apr 2023 21:02:05 +0000 (14:02 -0700)]
fix(dracut-functions): avoid calling grep with PCRE (-P)
Invoking grep in Perl mode requires JIT'ing the Perl regex.
This can run into issues with SELinux policy, which will generally try to
limit the use of execmem in general-purpose scripts, because the
JIT'd code lives in executable memory.
The PCRE-only '\K' command in the Perl regex can be replaced by a call
to awk instead.
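As an illustration of the technique only (the "DRIVER=" key and the input
format are assumptions, not necessarily the pattern changed by this
commit), a `grep -oP '^DRIVER=\K.*'` call could be written with awk like
this:

```shell
# Hypothetical example: print the value of a DRIVER=<name> line from stdin
# without relying on PCRE. Values that contain '=' would be cut at the
# second '=' by this simple field split.
extract_driver() {
    awk -F= '$1 == "DRIVER" { print $2 }'
}
```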
Tao Liu [Wed, 12 Apr 2023 15:02:25 +0000 (23:02 +0800)]
fix(dracut-functions.sh): convert mmcblk to the real kernel module name
On some x86_64 platforms such as Intel Elkhartlake, necessary modules can be
missing because the DRIVERS field reported by udevadm does not match the real
kernel module name:
$ udevadm info -a /dev/block/179:1
looking at parent device '/devices/pci0000:00/0000:00:1a.0/mmc_host/mmc0/mmc0:0001':
KERNELS=="mmc0:0001"
SUBSYSTEMS=="mmc"
DRIVERS=="mmcblk"
....
The DRIVERS field, i.e. mmcblk, is passed to instmods to install the
corresponding mmc_block.ko kernel module. However, mmc_block.ko cannot be
selected by the string mmcblk. As a result, mmc_block.ko is not installed
in hostonly-mode strict, and the machine fails to boot, for example in
kdump cases:
`Also=multipathd.socket` is not the correct behavior for a
socket-activated service. This directive has been removed upstream
and dracut should do the same.
This fixes #2289 and #2175, where running the multipath binary in the
cleanup hook triggers activation of multipathd.service after it has been
stopped, as dracut prepares to switch root in initrd-cleanup.service.
Frederick Grose [Tue, 28 Mar 2023 02:16:04 +0000 (22:16 -0400)]
fix(99base): adjust to allow mksh as initrd shell
Use printf instead of echo in str_replace() to preserve escapes.
Find command paths while PATH includes /usr/sbin. (switch_root was
not found after the original environment is restored on Fedora.)
fix(dracut.sh): use dynamically uefi's sections offset
* UEFI sections are created by `objcopy` with hardcoded section
offsets. This commit computes the correct offset between
each part of the EFI file, as needed to create a UKI. Offsets
are simply calculated so that no sections overlap, as recommended
in https://wiki.archlinux.org/title/Unified_kernel_image#Manually
Moreover, the EFI stub file's header is parsed to apply the correct
offsets according to the section alignment factor.
* Remove EFI_SECTION_VMA_INITRD, which is no longer needed because the
initrd section offset is calculated dynamically.
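The offset calculation can be sketched like this, assuming the section
alignment has already been read from the stub's header. The helper names
are hypothetical and the arithmetic is a simplified illustration, not the
exact code in dracut.sh.

```shell
# Hypothetical helper: round $1 up to the next multiple of $2.
align_up() {
    echo $(( ($1 + $2 - 1) / $2 * $2 ))
}

# Hypothetical helper: offset of the section that follows one starting at
# $1 with size $2, using alignment $3, so that the two cannot overlap.
next_offset() {
    align_up $(( $1 + $2 )) "$3"
}
```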
Tao Liu [Fri, 3 Mar 2023 10:27:25 +0000 (18:27 +0800)]
fix(lvmthinpool-monitor): activate lvm thin pool before extend its size
The state of an LVM thin pool may change to inactive when kdump boots into the
2nd kernel. As a result, lvextend will fail to extend its size. For example:
“Masahiro [Tue, 7 Feb 2023 09:30:36 +0000 (18:30 +0900)]
feat(test): nfs_fetch_url test into nfs test
This is to check the behavior of nfs_fetch_url() in nfs-lib.sh.
nfs_fetch_url() calls nfs_already_mounted() internally.
A file /nfs/client/root/fetchfile is on NFS server, which is fetched
from clients for testing with nfs_fetch_url().
“Masahiro [Fri, 3 Feb 2023 03:08:26 +0000 (12:08 +0900)]
fix(url-lib.sh): nfs_already_mounted() with trailing slash in nfs path
nfs_already_mounted() doesn't work when the installation ISO and kickstart file on the same NFS share are specified with the inst.repo and inst.ks boot parameters as below.
NOTE: /home/data is configured as an NFS share on 192.168.1.1
One problem is that a file (not a directory) was passed to nfs_already_mounted().
nfs_already_mounted() is the function that judges whether the given directory is already mounted.
So, filepath should be passed to it in nfs_fetch_url().
The other problem is the trailing slash in the NFS path in /proc/mounts.
/proc/mounts has an entry after the NFS mount of inst.repo.
As LGTM is going to be shut down by EOY[0], let's move the code scanning to
CodeQL as recommended. Thanks to GH integration the results from such
scans will be shown both in the respective PR and in the Security ->
Code Scanning tab[1].
Adrien Thierry [Wed, 15 Feb 2023 19:13:56 +0000 (14:13 -0500)]
fix(dracut-install): prevent possible infinite recursion with suppliers
During search for fw_devlink suppliers, it's possible to encounter a
situation where supplier A depends on supplier B, and supplier B has a
parent node that depends on supplier A. This leads to an infinite
recursion.
To fix this, make sure suppliers are only processed once.
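The idea can be illustrated with a small sketch. The real fix lives in the
C code of dracut-install; the shell functions and the two-node cyclic graph
below are hypothetical and only show the "process each supplier once"
technique.

```shell
# Hypothetical dependency graph with a cycle: A depends on B, B on A.
suppliers_of() {
    case $1 in
        A) echo B ;;
        B) echo A ;;
    esac
}

visited=""

# Process a supplier and, recursively, its own suppliers, skipping any
# supplier that has already been seen.
process_supplier() {
    case " $visited " in
        *" $1 "*) return 0 ;;   # already processed, stop the recursion here
    esac
    visited="$visited $1"
    for s in $(suppliers_of "$1"); do
        process_supplier "$s"
    done
}
```

Without the `visited` check, calling `process_supplier A` on this graph
would never terminate.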
John Meneghini [Tue, 14 Feb 2023 21:28:57 +0000 (16:28 -0500)]
build: remove rpm spec file and build rules
As discussed in issue #2204 this patch removes the dracut.spec file from
the repository. The advantage of this patch is that it creates a
dracut-version.tar.xz file that can be more easily consumed by the
downstream distributions because there's no rpm spec file included in
the distribution.
Tested with a downstream rpm spec file:
```
cd dracut
VERSION=$(git describe --abbrev=0 --tags --always)
make clean
make dist
cp "dracut-${VERSION}.tar.xz" ../
cd ..
```
Adrien Thierry [Mon, 13 Feb 2023 15:43:32 +0000 (10:43 -0500)]
fix(kernel-modules): use modalias info in get_dev_module()
When calling dracut with '--hostonly-mode=strict', get_dev_module() gets
called on the system's block devices to find the required drivers. The
driver name is retrieved using udevadm. However, the driver name
returned by udevadm is not necessarily the same as the module name.
This is the case for the Qualcomm UFS driver: udevadm returns
'ufshcd-qcom' while the module name is 'ufs-qcom', so dracut-install is
not able to find the module afterwards.
To solve this, make get_dev_module() also return the module alias info
from the modalias files contained in the sysfs directories parsed by
udevadm.
On EL8.3 the NetworkManager keeps restarting even if it exits successfully
while waiting for Clevis to unlock. This patch ensures NetworkManager runs
only once in initrd.
Yes; NetworkManager is run multiple times, so that it's able to
configure interfaces that haven't been seen previously (because bus was
slow to scan or device took time to initialize).
It's not clear what problem the original commit was trying to fix.
I suspect there was no problem, just a misunderstanding.
fix(dracut.sh): handle sbsign errors for UEFI builds
`sbsign` does not issue any error if there is not enough disk space to create
the signed file using its `--output` option. So, verify the signed image after
its creation using `sbverify`.
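A minimal sketch of the sign-then-verify pattern. The function name is
hypothetical, and the SBSIGN and SBVERIFY variables exist only so that the
commands can be substituted for illustration; the code in dracut.sh is
structured differently.

```shell
# Hypothetical helper: sign an EFI image and verify the result.
# $1: key file, $2: certificate file, $3: input image, $4: signed output
sign_and_verify() {
    key=$1
    cert=$2
    in=$3
    out=$4
    # sbsign may report success even if the output could not be fully written.
    "${SBSIGN:-sbsign}" --key "$key" --cert "$cert" --output "$out" "$in" || return 1
    # Verifying the signed image catches a truncated or missing output file.
    "${SBVERIFY:-sbverify}" --cert "$cert" "$out" || return 1
}
```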
Martin Wilck [Tue, 7 Feb 2023 21:24:15 +0000 (22:24 +0100)]
fix(network): IPv6: don't wait for RA for static IPv6 assignments
This patch reverts commit c603419 ("wait for IPv6 RA if using none/static IPv6 assignment").
It's not generally correct to wait for a default route to be established
for an interface, or to wait for "proto ra" routes in general.
For example, if the system is a router itself, it will receive no
RAs. In isolated networks, no gateway may be advertised, either.
This is similar in spirit to 76f6566 ("Revert "wait for IPv6 RA
if using none/static IPv6 assignment"")
Whatever c603419 ("wait for IPv6 RA if using none/static IPv6 assignment")
was supposed to achieve, it should be done differently.