git.ipfire.org Git - thirdparty/mdadm.git/log

mdmonitor: use MAILFROM to set sendmail envelope sender address

Modern mail relays may reject emails with unknown envelope sender
address.

Use the MAILFROM address also as envelope sender address to work
around this issue.
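
A minimal sketch of the idea, assuming the mail is handed to a sendmail-compatible binary whose `-f` option sets the envelope sender. The helper name `build_sendmail_cmd` and the character allow-list are hypothetical and not taken from mdadm's sources.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not mdadm's actual code: build a sendmail command
 * line that passes `mailfrom` as the envelope sender via "-f".
 * Returns 0 on success, -1 if the address is empty, contains characters
 * outside a conservative allow-list, or the buffer is too small.
 */
int build_sendmail_cmd(char *buf, size_t len,
                       const char *sendmail, const char *mailfrom)
{
    static const char allowed[] =
        "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789@._+-";
    int n;

    if (mailfrom == NULL || mailfrom[0] == '\0' || mailfrom[0] == '-')
        return -1;
    /* The address ends up in a shell command, so reject anything unusual. */
    if (strspn(mailfrom, allowed) != strlen(mailfrom))
        return -1;

    n = snprintf(buf, len, "%s -t -f %s", sendmail, mailfrom);
    if (n < 0 || (size_t)n >= len)
        return -1;
    return 0;
}
```

The resulting string could be passed to popen(); the allow-list is deliberately strict because the address is interpolated into a shell command.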

Signed-off-by: Martin Wilck <mwilck@suse.com>

mdadm/assemble: Don't stop array after creating it

Assemble stops the array right after creating it. According to the
comment, the intent is to stop the array if it has no content. But no
member disks have been added yet, so the array is clean and stopping it
is pointless.

Signed-off-by: Xiao Ni <xni@redhat.com>

mdadm: remove POSIX check

Neil Brown pointed out in #159 that mdadm should be kept in base utility
style, allowing much more with no strict limitations until absolutely
necessary to prevent crashes.

This view, supported by regression #160 caused by the POSIX portable
character set requirement, leads me to revert it.

Revert the POSIX portable character set verification of name and
devname. Make it IMSM only.

Fixes: e2eb503bd797 ("mdadm: Follow POSIX Portable Character Set")
Signed-off-by: Mariusz Tkaczyk <mtkaczyk@kernel.org>

mdadm: enable sync file for udev rules

Mounting an md device may fail during boot from mdadm's claim
on the device not being released before systemd attempts to mount.

In this case it was found that essentially there is a race condition
occurring in which the mount cannot happen without some kind of delay
being added BEFORE the mount itself triggers, or manual intervention
after a timeout.

The findings:
the inode was for a tmp block node made by mdadm for md0.

crash> detailedsearch ff1b0c398ff28380
ff1b0c398f079720: ff1b0c398ff28380 slab:filp state:alloc
obj:ff1b0c398f079700 size:256
ff1b0c398ff284f8: ff1b0c398ff28380 slab:shmem_inode_cache
state:alloc obj:ff1b0c398ff28308 size:768

crash> struct file.f_inode,f_path ff1b0c398f079700
f_inode = 0xff1b0c398ff28380,
f_path = {
mnt = 0xff1b0c594aecc7a0,
dentry = 0xff1b0c3a8c614f00
},
crash> struct dentry.d_name 0xff1b0c3a8c614f00
d_name = {
{
{ hash = 3714992780, len = 16 },
hash_len = 72434469516
},
name = 0xff1b0c3a8c614f38 ".tmp.md.1454:9:0"
},

For the race condition, mdadm and udev have some infrastructure for making
the device be ignored while under construction. e.g.

$ cat lib/udev/rules.d/01-md-raid-creating.rules

# do not edit this file, it will be overwritten on update
# While mdadm is creating an array, it creates a file
# /run/mdadm/creating-mdXXX. If that file exists, then
# the array is not "ready" and we should make sure the
# content is ignored.
KERNEL=="md*", TEST=="/run/mdadm/creating-$kernel", ENV{SYSTEMD_READY}="0"

However, this feature currently is only used by the mdadm create command.
See calls to udev_block/udev_unblock in the mdadm code as to where and when
this behavior is used. Any md array being started by incremental or
normal assemble commands does not use this udev integration. So assembly
of an existing array does not appear to have any explicit protection from
systemd/udev seeing an array as usable before an mdadm instance
with O_EXCL closes its file handle.
This shows the use case for such an option and why it would be helpful
to delay the mount itself.

While mdadm --incremental, called from
/usr/lib/udev/rules.d/64-md-raid-assembly.rules, is still constructing the
array, there is an attempt to mount the md device. In incremental mode,
however, mdadm does not create the "/run/mdadm/creating-xxx" file that
the rule looks for. Therefore the device is not marked as
SYSTEMD_READY=0 by "/usr/lib/udev/rules.d/01-md-raid-creating.rules",
and the synchronization through the "/run/mdadm/creating-xxx" file is
missing.

As to this change affecting containers or IMSM...
(a container's array state is inactive all the time)

Even if "array_state" reports "inactive" while earlier components are
being added, the mdadm call for the very last array component, the one
that makes the array usable/ready, still needs to be synced properly:
mdadm needs to drop the claim first by calling "close", then delete
"/run/mdadm/creating-xxx". Then it lets udev know it is clear to act (the
"udev_unblock" in mdadm code generates a synthetic udev event so the
rules are reevaluated). It is the processing of this very last array
component that is the issue here. It is not an I/O error: opening the
device returns -EBUSY because of the exclusive claim that mdadm still
holds while udev is already processing the md device in parallel, and
that is exactly what /run/mdadm/creating-xxx should prevent.

The patch to Incremental.c is to enable creating the
"/run/mdadm/creating-xxx" file during incremental mode.

For the change to Create.c: the unlink is currently called right before
dropping the exclusive claim for the device. It should be the other way
round to avoid the race completely. That is, if there is a "close" call
and a "udev_unblock" call, the "close" should go first, followed by
"udev_unblock".
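
The ordering can be sketched as follows. This is an illustration only: `block_udev` and `release_then_unblock` are hypothetical stand-ins, and the real udev_block()/udev_unblock() in mdadm do more than create and remove a marker file.

```c
#include <fcntl.h>
#include <unistd.h>

/* Create the marker file that tells udev to ignore the array for now. */
int block_udev(const char *creating_path)
{
    int fd = open(creating_path, O_CREAT | O_WRONLY, 0600);

    if (fd < 0)
        return -1;
    return close(fd);
}

/*
 * Release sequence: drop the claim on the device first (close), and only
 * then remove the marker file so udev may start processing the array.
 */
int release_then_unblock(int dev_fd, const char *creating_path)
{
    if (close(dev_fd) != 0)
        return -1;
    if (unlink(creating_path) != 0)
        return -1;
    return 0;
}
```

With the calls in this order, udev cannot observe the array as ready while the file descriptor holding the exclusive claim is still open.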

Signed-off-by: Nigel Croxon <ncroxon@redhat.com>

optim[al]ize; write-indent -> write-intent

Former is highly non-standard, latter is wrong

Signed-off-by: наб <nabijaczleweli@nabijaczleweli.xyz>

mdadm/tests: mark 10ddf-fail-readd-readonly broken

10ddf-fail-readd-readonly fails sometimes. Mark this case broken.

Signed-off-by: Xiao Ni <xni@redhat.com>

mdadm/tests: mark 09imsm-assemble broken

09imsm-assemble fails sometimes. So mark it as broken.

Signed-off-by: Xiao Ni <xni@redhat.com>

mdadm/tests: mark 10ddf-fail-two-spares broken

Sometimes 10ddf-fail-two-spares fails because:
++ grep -q 'state\[1\] : Optimal, Consistent' /tmp/mdtest-5k3MzO
++ echo ERROR: /dev/md/vol1 should be optimal in meta data
ERROR: /dev/md/vol1 should be optimal in meta data

Mark this as broken.

Signed-off-by: Xiao Ni <xni@redhat.com>

mdadm: give more time to wait sync thread to reap

01r5fail case reports error sometimes:
++ '[' -n '2248 / 35840' ']'
++ die 'resync or recovery is happening!'
++ echo -e '\n\tERROR: resync or recovery is happening! \n'

ERROR: resync or recovery is happening!

The sync thread is reaped in md_thread, so we need to allow more time
for the sync thread to be reaped.

Signed-off-by: Xiao Ni <xni@redhat.com>

mdadm: add attribute nonstring for signature

A build error is reported on f42:
error: initializer-string for array of ‘unsigned char’ truncates NULL
terminator but destination lacks ‘nonstring’ attribute (5 chars into 4
available) [-Werror=unterminated-string-initialization]
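
A minimal sketch of the attribute in use, assuming a compiler that understands `__has_attribute`. The `NONSTRING` macro, the array name and the "ABCD" value are hypothetical; the commit message does not show the actual declaration that was changed.

```c
#include <string.h>

/* Fall back to no attribute on compilers that do not know `nonstring`. */
#if defined(__has_attribute)
#  if __has_attribute(nonstring)
#    define NONSTRING __attribute__((nonstring))
#  endif
#endif
#ifndef NONSTRING
#  define NONSTRING
#endif

/* Hypothetical 4-byte signature: exactly four bytes, no NUL terminator. */
const unsigned char example_sig[4] NONSTRING = "ABCD";

/* Compare with memcmp(), never strcmp(): the array is not a C string. */
int sig_matches(const unsigned char *buf)
{
    return memcmp(buf, example_sig, sizeof(example_sig)) == 0;
}
```

The attribute documents that the missing terminator is intentional, which is what the warning asks for.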

Signed-off-by: Xiao Ni <xni@redhat.com>

mdadm: fix building errors

Some build errors were found on the ppc64le platform:
format '%llu' expects argument of type 'long long unsigned int', but
argument 3 has type 'long unsigned int' [-Werror=format=]
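
Two common ways to make such a format string portable are sketched below. Which one the patch actually uses is not stated in the commit message; the function names are hypothetical.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Option 1: PRIu64 expands to the right conversion for uint64_t. */
int format_sectors(char *buf, size_t len, uint64_t sectors)
{
    return snprintf(buf, len, "%" PRIu64 " sectors", sectors);
}

/* Option 2: keep %llu and cast the argument explicitly. */
int format_sectors_cast(char *buf, size_t len, uint64_t sectors)
{
    return snprintf(buf, len, "%llu sectors", (unsigned long long)sectors);
}
```

Both variants compile cleanly whether the platform defines the 64-bit type as `unsigned long` or as `unsigned long long`.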

Signed-off-by: Xiao Ni <xni@redhat.com>

mdadm: use standard libc nftw

Commit bd648e3bec3d ("mdadm: Remove klibc and uclibc support") removed
the macros HAVE_NFTW/HAVE_FTW and switched to the libc header ftw.h. But
it left code in lib.c that makes the mdadm command call the nftw defined
in lib.c. This code needs to be removed.

The bug can be reproduced by:
mdadm -CR /dev/md0 --level raid5 --metadata=1.1 --chunk=32 --raid-disks 3
--size 10000 /dev/loop1 /dev/loop2 /dev/loop3
mdadm /dev/md0 --grow --chunk=64
mdadm: /dev/md0: cannot open component -unknown-

Fixes: bd648e3bec3d ("mdadm: Remove klibc and uclibc support")
Signed-off-by: Xiao Ni <xni@redhat.com>

mdadm: allow any valid minor number in md device name

Since 25aa732 ("mdadm: numbered names verification"), it is not possible
any more to create arrays /dev/md${N} with N >= 127. The limit has later
been increased to 1024, which is also artificial. The error message printed
by mdadm is misleading, as the problem is not POSIX compatibility here.

# mdadm -C -v /dev/md9999 --name=foo -l1 -n2 /dev/loop0 /dev/loop1
mdadm: Value "/dev/md9999" cannot be set as devname. Reason: Not POSIX compatible.

Given that mdadm creates an array with minor number ${N} if the argument is
/dev/md${N}, the natural limit for the number is the highest minor number
available, which is (1 << MINORBITS) with MINORBITS=20 on Linux.
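
A sketch of such a check, assuming MINORBITS=20 as quoted above. `parse_md_minor` is a hypothetical helper, not the function changed by the patch; it treats (1 << MINORBITS) as an exclusive upper bound, since a 20-bit minor ranges from 0 to (1 << 20) - 1.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define MINORBITS 20  /* value quoted in the commit message for Linux */

/*
 * Hypothetical validator: accept "/dev/md<N>" only when N fits in a minor
 * number. Returns the minor on success, -1 otherwise.
 */
long parse_md_minor(const char *devname)
{
    static const char prefix[] = "/dev/md";
    const char *digits;
    char *end;
    unsigned long n;

    if (strncmp(devname, prefix, sizeof(prefix) - 1) != 0)
        return -1;
    digits = devname + sizeof(prefix) - 1;
    if (*digits < '0' || *digits > '9')
        return -1;

    errno = 0;
    n = strtoul(digits, &end, 10);
    if (errno != 0 || *end != '\0')
        return -1;
    /* Minors are MINORBITS wide: valid values are 0 .. (1 << MINORBITS) - 1. */
    if (n >= (1UL << MINORBITS))
        return -1;
    return (long)n;
}
```

Under these assumptions /dev/md9999 from the example above is accepted, while /dev/md1048576 is rejected.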

Fixes: 25aa732 ("mdadm: numbered names verification")
Fixes: f786072 ("mdadm: Increase number limit in md device name to 1024.")
Signed-off-by: Martin Wilck <mwilck@suse.com>

tests: support second runner

The second runner has a different VM name. Honor that when copying
and removing logs.

Signed-off-by: Mariusz Tkaczyk <mtkaczyk@kernel.org>

Update run_mdadm_tests.sh

Signed-off-by: Paul Luse <paul.e.luse@intel.com>

Update tests.yml

Signed-off-by: Paul Luse <paul.e.luse@intel.com>

mdadm: use kernel raid headers

For years we redefined these headers in mdadm. We should reuse the
headers exported by the kernel to integrate the driver and mdadm better.
Include them and remove the mdadm-owned headers.

There are 3 defines not available in kernel headers, so define them
directly but put them in ifndef guard to make them transparent later.

Use MD_FEATURE_CLUSTERED instead of MD_FEATURE_BITMAP_VERSIONED. The
value is same, kernel define has different name.

Signed-off-by: Mariusz Tkaczyk <mtkaczyk@kernel.org>

mdadm: include asm/byteorder.h

It will be included by raid/md_p.h anyway. Include it directly and
remove custom functions. It is not a problem now.

Signed-off-by: Mariusz Tkaczyk <mtkaczyk@kernel.org>

mdadm: Remove klibc and uclibc support

Klibc compilation is not working for at least 3 years because of
following error:
mdadm.h:1912:15: error: unknown type name 'sighandler_t'

It will have a conflict with le/be_to_cpu() functions family provided by
asm/byteorder.h which will be included with raid/md_p.h. Therefore we
need to remove support for it. Also, remove uclibc because it is not actively
maintained.

Remove klibc and uclibc targets from Makefile and special klibc code.
Targets can be removed safely because using CC is recommended.

Signed-off-by: Mariusz Tkaczyk <mtkaczyk@kernel.org>

Update README.md

Needed to remove the word "test" as part of testing updated CI workflow.

Signed-off-by: Paul Luse <paul.e.luse@intel.com>

This is a test for CI, do not merge

Signed-off-by: Paul E Luse <paul.e.luse@intel.com>

Update tests.yml

Signed-off-by: Paul Luse <paul.e.luse@intel.com>

Allow RAID0 to be created with v0.90 metadata #161

It is not currently possible to create a RAID0 with 0.90 metadata.
This is because 0.90 cannot specify the layout of RAID0 (it is
traditionally ignored) and different kernels do different things with
RAID0 layouts.

However it should be possible to use --layout=dangerous as that
acknowledges the risk.
It also should be possible to create a RAID0 with all devices the same
size because in that case all layouts are identical.

The metadata handler can only check that all devices are the same size
quite late - in write_init_super(). By that time the default is
currently set - set to a value that super0 cannot handle.

So this patch delays the setting of the default value and leaves it for
the metadata handler (or for the Build handler).

super1 selects ORIG in that case.
intel and ddf don't support non-uniform RAID0 so they don't need any
change.
super0 now checks the sizes of devices if the default RAID0 layout was
requested and rejects the request if they are not the same.

validate_geometry0 now allows "dangerous" layouts for raid0.

Signed-off-by: NeilBrown <neil@brown.name>

imsm: Fix RAID0 to RAID10 migration

Support for RAID10 with +4 disks in IMSM introduced an inconsistency
between the VROC UEFI driver and Linux IMSM. VROC UEFI does not
support RAID10 with +4 disks, therefore appropriate protections were
added to the mdadm IMSM code that results in skipping processing of
such RAID in the UEFI phase. Unfortunately, the case of migrating a
2-disk RAID0 to a 4-disk RAID10 was omitted. This case requires
maintaining compatibility with the VROC UEFI driver because it is
supported.

For RAID10 +4 disk the MPB_ATTRIB_RAID10_EXT attribute is set in the
metadata, thanks to which the UEFI driver does not process such RAID.
In the series adding support, a new metadata raid level value
IMSM_T_RAID10 was also introduced. It is not recognized by VROC UEFI.

The issue is caused by the fact that in the case of the mentioned
migration, IMSM_T_RAID10 is entered into the metadata but attribute
MPB_ATTRIB_RAID10_EXT is not entered, which causes an attempt to
process such RAID in the UEFI phase. This situation results in
a platform hang during boot in the UEFI phase, and in data loss after
the failed and interrupted RAID processing in VROC UEFI.

The above situation is caused by the update_imsm_raid_level() function:
for the mentioned migration it is executed on a map whose number of
disks has not yet been updated.

The fix is to explicitly handle migration in the function mentioned
above to maintain compatibility with VROC UEFI driver.

Steps to reproduce:
mdadm -C /dev/md/imsm0 -e imsm -n 2 /dev/nvme[1,2]n1 -R
mdadm -C /dev/md/vol -l 0 -n 2 /dev/nvme[1,2]n1 --assume-clean -R
mdadm -a /dev/md127 /dev/nvme3n1
mdadm -a /dev/md127 /dev/nvme4n1
mdadm -G /dev/md126 -l 10
reboot

Fixes: 27550b13297a ("imsm: add support for literal RAID 10")
Signed-off-by: Blazej Kucman <blazej.kucman@intel.com>

super1: Clear extra flags when initializing metadata

When adding a disk to a RAID1 array, the metadata is read from the
existing member disks for sync. However, only the bad_blocks flag is
copied; the bad_blocks records are not, so they are all zeros. The
kernel function super_1_load() detects the bad_blocks flag, reads the
bad_blocks records, and then sets the bad blocks using badblocks_set().

After the kernel commit 1726c7746783 (badblocks: improve badblocks_set()
for multiple ranges handling), badblocks_set() returns a failure if the
length of a bad_blocks record is 0. Therefore the device addition fails.

So when adding a new disk, some flags cannot be synced and need to be
cleared.
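
A sketch of clearing such a flag, assuming MD_FEATURE_BAD_BLOCKS has the value 8 as in the kernel's raid/md_p.h. Which other flags the patch clears is not listed in the commit message, and the helper name is hypothetical.

```c
#include <stdint.h>

#define MD_FEATURE_BAD_BLOCKS 8u  /* assumed value from raid/md_p.h */

/*
 * Sketch: clear feature bits that describe per-device state and must not
 * be inherited by a freshly added disk. Host byte order is assumed here;
 * the on-disk field is little-endian and would need conversion.
 */
uint32_t clear_per_device_flags(uint32_t feature_map)
{
    return feature_map & ~MD_FEATURE_BAD_BLOCKS;
}
```

All other bits of the feature map pass through unchanged.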

Signed-off-by: Wu Guanghao <wuguanghao3@huawei.com>

Regression fix (#156)

Signed-off-by: Xiao Ni <xni@redhat.com>

mdmon: imsm: fix metadata corruption when managing new array

When manager thread detects new array, it will invoke manage_new().
For imsm array, it will further invoke imsm_open_new(). Since
commit bbab0940fa75("imsm: write bad block log on metadata sync"),
it preallocates bad block log when opening the array, that requires
increasing the mpb buffer size.
For that, imsm_open_new() invokes function imsm_update_metadata_locally(),
which first uses imsm_prepare_update() to allocate a larger mpb buffer
and store it at "mpb->next_buf", and then invoke imsm_process_update()
to copy the content from current mpb buffer "mpb->buf" to "mpb->next_buf",
and then free the current mpb buffer and set the new buffer as current.

There is a small race window, when monitor thread is syncing metadata,
it gets current buffer pointer in imsm_sync_metadata()->write_super_imsm(),
but before flushing the buffer to disk, manager thread does above switching
buffer which frees current buffer, then monitor thread will run into
use-after-free issue and could cause on-disk metadata corruption.
If system keeps running, further metadata update could fix the corruption,
because after switching buffer, the new buffer will contain good metadata,
but if panic/power cycle happens while disk metadata is corrupted,
the system will run into bootup failure if array is used as root,
otherwise the array can not be assembled after boot if not used as root.

This issue will not happen for an imsm array with only one member array:
because the member array has not been opened yet, the monitor thread
will not do any metadata updates.
It can happen for an imsm array with at least two member arrays, in the
following two scenarios:
1. Restarting the mdmon process with at least two member arrays.
This happens during system boot or when the user restarts mdmon after an
mdadm upgrade.
2. Adding a new member array to an existing imsm array with at least one
member array.

To fix this, delay the switching buffer operation to monitor thread.

Fixes: bbab0940fa75 ("imsm: write bad block log on metadata sync")
Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>

Rework MAINTAINERS file

Remove Mateusz. Integrate it with README.md

Signed-off-by: Mariusz Tkaczyk <mtkaczyk@kernel.org>

Move release steps to documentation/

Make room for the MAINTAINERS file.

Signed-off-by: Mariusz Tkaczyk <mtkaczyk@kernel.org>

bitmap.h: Minor fixes

Move documentation to documentation/bitmap.md. Add Neil's copyrights,
add missing license. Remove unused macros.

Signed-off-by: Mariusz Tkaczyk <mtkaczyk@kernel.org>

bitmap.h - clear __KERNEL__ based headers

It has been unused for years. Clear it.

Signed-off-by: Mariusz Tkaczyk <mtkaczyk@kernel.org>

super-ddf: optimize DDF header search for widely used RAID controllers

Implemented fallback logic to search the last 32MB of the device
for the DDF header (magic). If found, proceeds to load the DDF metadata
from the located position.

When clearing metadata as required by the mdadm --zero (function Kill),
also erase the last 32MB of data; otherwise, it may result in an
infinite loop.

According to the specification, the Anchor Header should be placed at
the end of the disk. However, some widely used RAID hardware, such as
LSI and PERC, do not position it within the last 512 bytes of the disk.
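
The fallback search can be sketched as below. The assumptions are labeled: the tail of the device has already been read into memory, the header starts on a 512-byte sector boundary, and the signature is 0xDE11DE11 stored big-endian. The function name is hypothetical and the patch may scan differently.

```c
#include <stddef.h>
#include <stdint.h>

#define DDF_MAGIC   0xDE11DE11u  /* DDF header signature (assumed here) */
#define SECTOR_SIZE 512u

/*
 * Scan `tail` (the last `len` bytes of a device) backwards, one sector at
 * a time, for a big-endian DDF magic at the start of a sector.
 * Returns the byte offset within `tail`, or -1 if nothing was found.
 */
long find_ddf_anchor(const unsigned char *tail, size_t len)
{
    size_t nsect = len / SECTOR_SIZE;

    while (nsect-- > 0) {
        const unsigned char *p = tail + nsect * SECTOR_SIZE;
        uint32_t magic = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
                         ((uint32_t)p[2] << 8) | (uint32_t)p[3];

        if (magic == DDF_MAGIC)
            return (long)(nsect * SECTOR_SIZE);
    }
    return -1;
}
```

Scanning from the end means a header in the very last sector, the location the specification expects, is found first.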

Signed-off-by: lilinzhe <llz@antiy.cn>

super-ddf: Prevent crash when handling DDF metadata

A dummy function is defined because availability of ss->update_super is
not always verified.
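
The pattern is sketched below with a hypothetical, trimmed-down handler table; mdadm's real struct and the signature of update_super differ.

```c
#include <stddef.h>

/* Hypothetical handler table with a single callback. */
struct handler {
    const char *name;
    int (*update_super)(void *sb, const char *update);
};

/* Dummy implementation: nothing to update, report "no change". */
int update_super_noop(void *sb, const char *update)
{
    (void)sb;
    (void)update;
    return 0;
}

/* Registering the dummy keeps the pointer non-NULL. */
const struct handler ddf_like = { "ddf-like", update_super_noop };

/* A caller that does not check for NULL is only safe with the dummy. */
int call_update(const struct handler *h, void *sb, const char *update)
{
    return h->update_super(sb, update);
}
```

Providing a no-op is less invasive than auditing every call site for a missing NULL check.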

This fix addresses a crash reported when assembling a RAID array using
mdadm with DDF metadata. For more details, see the discussion at:
https://lore.kernel.org/all/CALHdMH30LuxR4tz9jP2ykDaDJtZ3P7L3LrZ+9e4Fq=Q6NwSM=Q@mail.gmail.com/

The discussion centers on an issue with mdadm where attempting to
assemble a RAID array caused a null pointer dereference. The problem
was traced to a missing update_super() function in super-ddf.c, which
led to a crash in Assemble.c.

Signed-off-by: lilinzhe <llz@antiy.cn>

platform-intel: Disable legacy option ROM scan on UEFI machines

The legacy option ROM memory range from 0xc0000-0xeffff is not defined
on UEFI machines so don't attempt to scan it. This avoids lockdown log
spam when Secure Boot is enabled (avoids use of /dev/mem).
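
One way to detect a UEFI boot on Linux is to check for the /sys/firmware/efi directory. The sketch below assumes that approach; the commit message does not say how the patch performs the check, and both function names are hypothetical. The path is a parameter so the logic can be exercised without a UEFI machine.

```c
#include <unistd.h>

/* Returns 1 if `efi_dir` exists (pass "/sys/firmware/efi" on Linux). */
int booted_via_uefi(const char *efi_dir)
{
    return access(efi_dir, F_OK) == 0;
}

/* Hypothetical gate: only scan the legacy option ROM range on non-UEFI. */
int should_scan_legacy_orom(const char *efi_dir)
{
    return !booted_via_uefi(efi_dir);
}
```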

Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>

mdadm: fix --grow with --add for linear

For the case mdadm --grow with --add, the s.btype should not be
initialized yet, hence BitmapUnknown should be checked instead of
BitmapNone.

Note that this behaviour is only supported by md-linear, which has been
removed from the kernel. However, it turns out md-linear is widely used
in home NAS devices and we are planning to reintroduce it soon.

Fixes: 581ba1341017 ("mdadm: remove bitmap file support")
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Mariusz Tkaczyk <mtkaczyk@kernel.org>

udev: persist properties of MD devices after switch_root

dracut installs in the initrd a custom udev rule for MD devices
(59-persistent-storage-md.rules) only to set the db_persist option (see
[1]). The main purpose is that if an MD device is activated in the initrd,
its properties are kept on the udev database after the transition from the
initrd to the rootfs. This was added to fix detection issues when LVM is
on top.

This patch would allow removing the custom udev rule shipped by dracut
(63-md-raid-arrays.rules is already being installed in the initrd), and it
will also benefit other initrd generators that do not want to create
custom udev rules.

[1] https://github.com/dracutdevs/dracut/blob/master/modules.d/90mdraid

Signed-off-by: Antonio Alvarez Feijoo <antonio.feijoo@suse.com>

mdopen: add sbin path to env PATH when call system("modprobe md_mod")

During the boot process, if mdadm is called in udev context, sbin paths
like /sbin, /usr/sbin and /usr/local/sbin are normally not defined in the
PATH env variable, so calling system("modprobe md_mod") in
create_named_array() may fail with a 'sh: modprobe: command not found'
error message.

We don't want to move modprobe binary into udev private directory, so
setting the PATH env is a more proper method to avoid the above issue.

This patch sets PATH env variable with "/sbin:/usr/sbin:/usr/local/sbin"
before calling system("modprobe md_mod"). The change only takes effect
within the udev worker context, not seen by global udev environment.
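
A minimal sketch of the approach, assuming PATH is replaced outright with the string quoted above (the patch may extend the existing value instead). `run_with_sbin_path` is a hypothetical name.

```c
#include <stdlib.h>

#define SBIN_PATH "/sbin:/usr/sbin:/usr/local/sbin"

/*
 * Sketch: set PATH for this process before spawning a command through
 * system(). setenv() only changes the calling process's environment,
 * which child processes then inherit. Returns system()'s result, or -1
 * if PATH could not be set.
 */
int run_with_sbin_path(const char *command)
{
    if (setenv("PATH", SBIN_PATH, 1) != 0)
        return -1;
    return system(command);
}
```

Calling `run_with_sbin_path("modprobe md_mod")` would then resolve modprobe through the sbin directories.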

Signed-off-by: Coly Li <colyli@suse.de>
Signed-off-by: Mariusz Tkaczyk <mtkaczyk@kernel.org>

mdadm/raid6check: add xmalloc.h to raid6check.c

It reports building error:
raid6check.c:324:26: error: implicit declaration of function xmalloc

Add xmalloc.h to raid6check.c file to fix this.

Signed-off-by: Xiao Ni <xni@redhat.com>
Link: https://lore.kernel.org/r/20250117071540.4094-1-xni@redhat.com
Signed-off-by: Song Liu <song@kernel.org>

Refactor continue_via_systemd()

Refactor continue_via_systemd() and its calls to make it more readable.
No functional changes.

Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>

Better error messages for broken reshape

mdadm --grow --continue has no functionality to restore critical sectors
if reshape was stopped during operation. This functionality belongs to
assemble or incremental.

This patch adds hints to error messages, to try to reassemble array in
case of reshape failure to restore critical sector, so assemble can
handle restoration.

Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>

mdadm: Do not start reshape before switchroot

There are numerous issues for --grow --continue in switchroot phase,
they include:
* Events being missed for restarting grow-continue service. This is
  apparent mostly on OS on RAID scenarios. When a checkpoint (next step)
  is committed, we have no reliable way to gracefully stop reshape until
  it reaches that checkpoint. During boot, there's heavy I/O utilisation,
  which causes sync speed drop, and naturally checkpoint takes longer to
  reach. This further causes systemd to forcefully kill grow-continue
  service due to timeouts, which results in udev event being missed for
  grow-continue service restart.
* Grow-continue (seemingly) was not designed to be restarted without
  reassembly, some things like stopping chunksize (to lower) migration
  were straight up not working until recently.
This patch makes grow-continue (actual reshape) start after switchroot
phase. This way we should not encounter issues related to restarting
the service.

Add checks to not start a reshape if in initrd, let it initialise only.
Change grow-continue udev rule to be triggered whenever there's a
reshape happening in metadata, rely on udev event to kick reshape after
switchroot. Add handle_forking helper function for reshapes to avoid
duplicating code.

Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>

Detail: Export reshape status

Display if there's an ongoing reshape happening in mdadm --detail
--export output.

This change is needed for incoming patches that will change "grow
continue" udev rules, to be based on actual array state.

Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>

Remove --freeze-reshape logic

This commit removes --freeze-reshape logic; it basically reverts
commit b76b30e0f950 ("Do not continue reshape during initrd phase").
--freeze-reshape was supposed to be used to restore critical sector in
incremental and assemble operations without starting a reshape process,
but its meaning has been lost through the years and it is not
currently used.

A replacement for this logic will be added in incoming patches, so
reshapes won't be started in initrd phase.

Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>

mdadm.man: Remove external bitmap

Remove external bitmap support from manual.

Signed-off-by: Mariusz Tkaczyk <mtkaczyk@kernel.org>

mdadm: add MAINTAINERS file

Create a maintainers file to keep track of people
to contact when dealing with mdadm questions/issues.

Signed-off-by: Nigel Croxon <ncroxon@redhat.com>

checkpatch.conf: ignore NEW_TYPEDEFS

In mdadm, we have a more flexible approach to typedefs.

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>

Incremental: Simplify remove logic

Incremental_remove() does not execute Manage_subdevs() now.

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>

sysfs: functions for writing md/<memb>/state

Add dedicated enum to reflect possible values for mentioned file.
Not all values are mapped. Add map to present sysfs keywords.

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>

Incremental: Document workaround

Keep it documented in code.

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>

Coverity fixes resources leaks

Handle variable going out of scope leaks the handle.

Signed-off-by: Nigel Croxon <ncroxon@redhat.com>

Release mdadm-4.4

Signed-off-by: Mariusz Tkaczyk <mtkaczyk@kernel.org>

tests: increase sleeps from 1s to 2s

The issue here is that some of the tests sporadically fail due to things
still being processed. The default 1s delays have proven insufficient
for the newly created CI, as tests tend to fail occasionally.

This patch increases the default 1s sleep to 2s, to hopefully get rid of
sporadic failures.

Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>

tests: increase sleeps for 04r5swap and 05r tests

This commit increases sleep times from 4 seconds to 6 as some of the
tests seem to be randomly failing due to this.

Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>

tests: fix 01r5fail

Increase sleep to 2s to give driver more time to stop recovery.

Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>

tests: fix 05r1-re-add-nosuper

Add one second sleep before calling check() for array to process.

Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>

tests: fix 07autoassemble

Block device check in testdev() is not sufficient as it does not account
for symlinks. Fix the check to use lsblk instead. Add mdstat check for
better debugging TC and change md0 for md127 as that will be array name
after assembly.

Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>

tests: fix 07autodetect

Change graceful exit to skip to indicate the test cannot be run.
Add some sleep after creation, let's see if that's enough.

Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>

tests: mark 07changelevels broken

Test 07changelevels can fail in multiple ways:
- R5 -> R6 migration can make driver unresponsive
- R6 -> R5 migration can fail

Mark the test as broken to clear the CI.

Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>

tests: fix 07layouts

Remove redundant backup file creation so mdadm does not complain it
already exists.

Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>

tests: remove redundant new-line from save_log()

Remove redundant new-line character from "echo" call in save_log().

Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>

tests: skip 07testreshape5 if no test_stripe

For test 07testreshape5 to succeed test_stripe binary must be first
compiled. Add check to skip test if no binary.

Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>

tests: fix 09imsm-assemble

Refactor imsm_check_removal() to give mdadm a chance to remove the
device, add retries.

Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>

tests: fix 10ddf-create

There are two issues with 10ddf-create:
- get_rootdev() failed if test was run in VM. Simplify and refactor the
function.
- test fails at assemble due to segfault. Mark test as broken to clear
the CI.

Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>

tests: fix 24raid10deadlock

Skip tests if fault injection is not enabled.
Remove 24raid10deadlock.inject_error empty file.

Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>

tests: add skip option

As for now the test either fails or succeeds. Add third option: skip.
This is to be used for tests that might not be possible to execute for
example due to missing (software) components or kernel not being
compiled with debugging options.

Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>

tests: fix tests 25raid456

This commit fixes tests 25raid456 so CI can pass.
Details:
- Change array size to 10M.
- Change filesystem from xfs to ext4 (more distros should have toolset
out of the box).
- Mark 25raid456-reshape-while-recovery as broken. It's too much effort
to fix it for now.

Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>

mdadm: remove bitmap file support

It has been marked deprecated for a long time now, and it is not worth
supporting for the new bitmap.

Now that we don't need to store filename for bitmap, also declare a new
enum type bitmap_type to simplify code.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>

mdadm: ask user if bitmap is not set

Instead of auto-forcing bitmap only for large arrays, it is more
reasonable to let the user make the choice if bitmap is not set.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>

Manage: forbid re-add to the array without metadata

For the build mode or external metadata, re-add is not supported,
because it will not trigger full disk recovery, user should add a new disk
to the array instead.

Also update test/05r1-re-add-nosuper to reflect this.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>

tests/05r1-re-add-nosuper: remove bitmap file test

Prepare to remove bitmap file support.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>

tests/04update-uuid: remove bitmap file test

Prepare to remove the bitmap file support.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>

test: Fix saving artifacts

Currently, if the test command returns an error, execution of the
remaining steps is aborted. In that case, use continue-on-error to save
artifacts, but return the error later and fail the job.

If execution passed, there are no artifacts to save, therefore do not
save them.

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>

test: return fail if any failed

GH action status should be failed if any test failed.

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>

test: Log execution time

To start optimizing the test suite, we need to know which tests are the
most time consuming. Log execution time after every test.

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>

imsm: fix tpv drives check in add_to_super

Before the mentioned patch, the check that verifies whether IMSM on the
current platform supports the use of TPV (other than Intel) disks was
only performed for non-Intel disks; after it, the check is performed for
all disks. This change makes it impossible to use any disk when the
platform does not support TPV drives; an attempt results in the
following error.

mdadm: Platform configuration does not support non-Intel NVMe drives.
Please refer to Intel(R) RSTe/VROC user guide.

This change restores the check if the disk is non-Intel.

Fixes: 734e7db4dfc5 ("imsm: Remove warning and refactor add_to_super_imsm code")
Signed-off-by: Blazej Kucman <blazej.kucman@intel.com>

tests: fix "foreign" verification for naming tests.

Mdadm supports DEVNODE in multiple forms; we cannot trust it because it
does not always reflect the name in metadata. Tests define clear
expectations, so we must use them.

Do foreign verification against WANTED_NAME instead of passed DEVNODE.

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>

platform-intel: fix buffer overflow

mdadm -C /dev/md/imsm0 -e imsm -n 2 /dev/nvme5n1 /dev/nvme4n1 -R
mdadm -C /dev/md/r0d2 -l 0 -n 2 /dev/nvme5n1 /dev/nvme4n1 -R
*** buffer overflow detected ***: terminated
Aborted (core dumped)

The issue is related to the -D_FORTIFY_SOURCE=3 flag and depends on the
environment, especially the compiler version. In the function
active_arrays_by_format, the length of the path buffer is calculated
dynamically based on parameters, while PATH_MAX is used in snprintf.
This may lead to a buffer overflow.

It is fixed by replacing the dynamic length calculation with the
PATH_MAX define for the path length.
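
The size mismatch can be illustrated with a minimal sketch. The helper
name format_array_path and its parameters are hypothetical and are not
taken from mdadm. The only point is that the size passed to snprintf
must be the real size of the destination buffer, which is what fortified
builds verify at run time.

```c
#include <limits.h>
#include <stdio.h>

#ifndef PATH_MAX
#define PATH_MAX 4096
#endif

/*
 * Hypothetical helper, not mdadm code. The caller passes the true size
 * of 'out' (typically sizeof() of a PATH_MAX array), so snprintf() is
 * never told that the buffer is larger than it really is.
 *
 * Returns 0 on success, -1 on error or if the result was truncated.
 */
static int format_array_path(char *out, size_t out_len,
			     const char *dir, const char *name)
{
	int n = snprintf(out, out_len, "%s/%s", dir, name);

	return (n < 0 || (size_t)n >= out_len) ? -1 : 0;
}
```

A caller would declare `char path[PATH_MAX];` and pass `sizeof(path)`,
so the declared size and the size given to snprintf cannot drift apart.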

Signed-off-by: Blazej Kucman <blazej.kucman@intel.com>

CI: run mdadm tests on test scripts change

Run the mdadm test scope on every change related to test files.

Signed-off-by: Kinga Stefaniuk <kinga.stefaniuk@intel.com>

debug: add timestamps for debug messages

Timestamps on debug messages help establish what takes long to process.
Debug messages are printed only if the -DDEBUG flag is passed.

Add timestamps for debug messages. Remove dead code from dprintf dummies
for non-debug builds. Remove timestamps from current debug messages.
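
A minimal sketch of a timestamped debug macro follows. It is an
illustration only, not the mdadm implementation: the names debug_log and
dbg_printf are hypothetical, and the timestamp format (seconds and
microseconds since the epoch) is an assumption.

```c
#include <stdarg.h>
#include <stdio.h>
#include <time.h>

/*
 * Hypothetical sketch: write one message to 'out', prefixed with a
 * wall-clock timestamp. Returns the number of characters written or a
 * negative value on error.
 */
static int debug_log(FILE *out, const char *fmt, ...)
{
	struct timespec ts = { 0, 0 };
	va_list ap;
	int n, m;

	/* C11 call; on failure the zeroed timestamp is printed. */
	timespec_get(&ts, TIME_UTC);

	n = fprintf(out, "[%lld.%06ld] ", (long long)ts.tv_sec,
		    ts.tv_nsec / 1000);
	if (n < 0)
		return n;

	va_start(ap, fmt);
	m = vfprintf(out, fmt, ap);
	va_end(ap);

	return m < 0 ? m : n + m;
}

/* Compiled in only when built with -DDEBUG; otherwise a no-op. */
#ifdef DEBUG
#define dbg_printf(...) debug_log(stderr, __VA_ARGS__)
#else
#define dbg_printf(...) ((void)0)
#endif
```

Keeping the timestamp in one helper means individual debug messages do
not need to format the time themselves.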

Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>

CI: assign ret to numeric value

Use a variable to store the tests' exit status. Return its value when
the test script has finished.

Signed-off-by: Kinga Stefaniuk <kinga.stefaniuk@intel.com>

README: Rephrase mailing list chapter

As suggested by Dan, make it sound more welcoming.

Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>

CI: use self-hosted runner to run tests

Use a prepared VM in GitHub Actions to run the mdadm tests on it.

Signed-off-by: Kinga Stefaniuk <kinga.stefaniuk@intel.com>

func.sh: do not hang when grow-continue can't finish

While the grow-continue process is ongoing, sync_action indicates that
recovery is in progress. If grow-continue does not finish, the test
should fail even if sync_action is no longer "reshape".

Signed-off-by: Kinga Stefaniuk <kinga.stefaniuk@intel.com>

Fix 07reshape5initr test

This test could hang if the "check" action is not written to
sync_action. If this value did not appear, the test hung in an infinite
while loop. Add a 5 second timeout to the loop.
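
The test itself is a shell script. The same bounded-wait idea is sketched
below in C for illustration only. The helper name wait_for_value, the
polling interval, and the file-based interface are assumptions, not code
from the test suite.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <string.h>
#include <time.h>

/*
 * Hypothetical helper: poll the text file at 'path' until its first
 * line equals 'wanted' or about 'timeout_sec' seconds have elapsed.
 * time() has one-second resolution, so the wait may overshoot by up to
 * a second. Returns 0 when the value appeared, -1 on timeout.
 */
static int wait_for_value(const char *path, const char *wanted,
			  int timeout_sec)
{
	time_t deadline = time(NULL) + timeout_sec;
	struct timespec pause = { 0, 100 * 1000 * 1000 }; /* 100 ms */

	do {
		char buf[64] = "";
		FILE *f = fopen(path, "r");

		if (f) {
			if (fgets(buf, sizeof(buf), f))
				buf[strcspn(buf, "\n")] = '\0';
			fclose(f);
			if (strcmp(buf, wanted) == 0)
				return 0;
		}
		nanosleep(&pause, NULL);
	} while (time(NULL) <= deadline);

	return -1;
}
```

With a deadline in place, a value that never appears turns into a test
failure instead of a hang.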

Signed-off-by: Kinga Stefaniuk <kinga.stefaniuk@intel.com>

imsm: add print license for VMD

Print the IMSM license for VMD controllers in --detail-platform.
The license specifies the scope of RAID support in the platform for
the VMD controller.

Signed-off-by: Blazej Kucman <blazej.kucman@intel.com>

tests: remove --auto

It is deprecated and it is not tested now.

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>

mdopen: remove wrong condition

After the mentioned patch, this condition got the opposite meaning and
it blocks creation in cases where it was supported.

Remove it now.

Fixes: 119cdcad049e ("mdadm: drop auto= support")
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>

mdadm.conf: remove references to old kernels.

Remove them.

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>

md.man: Remove references to not supported kernel

Reader doesn't need it. Remove it.

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>

mdadm.man: Remove references to legacy kernels

We are not supporting kernels older than 3.10.
Update mdadm man.

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>

mdadm: drop auto= support

According to the author (and what was described in man):
"With mdadm 3.0, device creation is normally left up to udev so this
option is unlikely to be needed"
This was a workaround for kernel 2.6 family issues (partitionable and
non-partitionable arrays hell) and I believe we are far away from it now.

I'm not aware of any usage of it, hence it is removed.

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>

ReadMe: Fix stylistic issues

No functional changes, just adapt the style to allow checkpatch to pass.

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>

mdmon: delegate removal to managemon

Starting from [1], the kernel requires a suspend lock on the member drive
remove path. This causes a deadlock with external management because the
monitor thread may be blocked on suspend and unable to switch the array
to active, for example if a badblock is reported at this time.

Removal is a blocking action now, so it must be delegated to the
managemon thread, but we must ensure that the monitor does the metadata
update first, just after detecting faulty.

This patch adds appropriate support. The monitor thread detects "faulty"
and updates the metadata. After that, it asks the manager thread to
remove the device. The manager must be careful because closing
descriptors used by select() may lead to an abort with
-D_FORTIFY_SOURCE=2. First, it must ensure that the device descriptors
are not used by the monitor.

There is an unlimited number of remove retries and recovery is blocked
until all failed drives are removed. It is safe because a "faulty"
device is no longer used by MD.

The issue will also be mitigated by an optimization on the badblock
recording path in the kernel. It will check that the device is not failed
before a badblock is recorded, but relying on this is not ideologically
correct. Userspace must keep compatibility with the kernel and, since it
is a blocking action, we must treat it as a blocking action.

[1] kernel commit cfa078c8b80d ("md: use new apis to suspend array
for adding/removing rdev from state_store()")

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>

monitor: Add DS_EXTERNAL_BB flag

If this is set, then the metadata handler must support external
badblocks. Remove checks for superswitch functions. If mdi->state_fd is
not set, then we should not try to record a badblock because we cannot
trust this device.

No functional changes.

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>

sysfs: add sysfs_open_memb_attr()

The function is added to avoid repeating the "dev-%s", disk_name
definition. Related code branches are updated. The ioctl way of setting
a disk faulty/remove is removed, sysfs is always used now.
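
The idea of keeping the "dev-%s" pattern in one place can be sketched as
follows. This is not the signature or body of sysfs_open_memb_attr(). The
helper name open_memb_attr and its parameters, including the md_dir base
directory argument, are hypothetical.

```c
#include <fcntl.h>
#include <limits.h>
#include <stdio.h>

#ifndef PATH_MAX
#define PATH_MAX 4096
#endif

/*
 * Hypothetical helper: open "<md_dir>/dev-<disk_name>/<attr>" so that
 * callers do not repeat the "dev-%s" formatting themselves.
 * Returns a file descriptor, or -1 on error or if the path is too long.
 */
static int open_memb_attr(const char *md_dir, const char *disk_name,
			  const char *attr, int flags)
{
	char path[PATH_MAX];
	int n = snprintf(path, sizeof(path), "%s/dev-%s/%s",
			 md_dir, disk_name, attr);

	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;

	return open(path, flags);
}
```

Under these assumptions, a call such as
open_memb_attr("/sys/block/md0/md", "sda", "state", O_WRONLY) would
return a descriptor for the member's state attribute, which is where the
md sysfs interface accepts values such as "faulty" and "remove".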

Some non-functional style issues are fixed in Manage_subdevs().

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>

[PATCH] mdadm: Grow.c distinguish takeover vs reshape on grow operation

Correct the terminology in the output when doing a takeover
vs a reshape.

Signed-off-by: Nigel Croxon <ncroxon@redhat.com>
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>