git.ipfire.org Git - thirdparty/mdadm.git/log

super1: Clear extra flags when initializing metadata

When adding a disk to a RAID1 array, the metadata is read from the
existing member disks for sync. However, only the bad_blocks flag is
copied; the bad_blocks records are not, so the bad_blocks records are
all zeros. The kernel function super_1_load() detects the bad_blocks
flag and reads the bad_blocks records, then sets the bad blocks using
badblocks_set().

After the kernel commit 1726c7746783 (badblocks: improve badblocks_set()
for multiple ranges handling), badblocks_set() returns a failure if the
length of a bad_blocks record is 0. Therefore the device addition fails.

So when adding a new disk, some flags cannot be synced and need to be
cleared.

Signed-off-by: Wu Guanghao <wuguanghao3@huawei.com>

Regression fix (#156)

Signed-off-by: Xiao Ni <xni@redhat.com>

mdmon: imsm: fix metadata corruption when managing new array

When manager thread detects new array, it will invoke manage_new().
For imsm array, it will further invoke imsm_open_new(). Since
commit bbab0940fa75("imsm: write bad block log on metadata sync"),
it preallocates bad block log when opening the array, that requires
increasing the mpb buffer size.
For that, imsm_open_new() invokes imsm_update_metadata_locally(),
which first uses imsm_prepare_update() to allocate a larger mpb buffer
and store it at "mpb->next_buf", and then invokes imsm_process_update()
to copy the content from the current mpb buffer "mpb->buf" to
"mpb->next_buf", free the current mpb buffer and set the new buffer as
current.

There is a small race window. When the monitor thread is syncing
metadata, it gets the current buffer pointer in
imsm_sync_metadata()->write_super_imsm(). If the manager thread does the
buffer switch above before the buffer is flushed to disk, the current
buffer is freed, and the monitor thread runs into a use-after-free that
could cause on-disk metadata corruption.
If the system keeps running, a further metadata update could fix the
corruption, because after the switch the new buffer contains good
metadata. But if a panic or power cycle happens while the disk metadata
is corrupted, the system will fail to boot if the array is used as root;
otherwise the array cannot be assembled after boot.

This issue will not happen for an imsm array with only one member array,
because the member array has not been opened yet, so the monitor thread
will not do any metadata updates.
It can happen for an imsm array with at least two member arrays, in the
following two scenarios:
1. Restarting the mdmon process with at least two member arrays.
This will happen during system boot or when the user restarts mdmon
after an mdadm upgrade.
2. Adding a new member array to an existing imsm array with at least one
member array.

To fix this, delay the switching buffer operation to monitor thread.
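
A minimal single-threaded sketch of the deferred switch follows. The
struct and function names are hypothetical, and the hand-off between the
two threads (queueing, locking) is deliberately left out; the point is
only that the manager side never frees the buffer the monitor may still
be writing.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, reduced view of the metadata buffers. */
struct mpb_sketch {
	char *buf;        /* buffer the monitor writes to disk */
	size_t len;
	char *next_buf;   /* larger buffer prepared by the manager */
	size_t next_len;
};

/* Manager side: allocate and publish only; never free the current buffer. */
static int manager_prepare(struct mpb_sketch *m, size_t new_len)
{
	char *p;

	if (new_len < m->len || m->next_buf)
		return -1;
	p = calloc(1, new_len);
	if (!p)
		return -1;
	m->next_buf = p;
	m->next_len = new_len;
	return 0;
}

/* Monitor side: switch buffers at a point where no write is in flight. */
static void monitor_apply_pending(struct mpb_sketch *m)
{
	if (!m->next_buf)
		return;
	memcpy(m->next_buf, m->buf, m->len);
	free(m->buf);
	m->buf = m->next_buf;
	m->len = m->next_len;
	m->next_buf = NULL;
	m->next_len = 0;
}
```

Because the free() happens in the same thread that performs the disk
write, the two can no longer interleave.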

Fixes: bbab0940fa75 ("imsm: write bad block log on metadata sync")
Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>

Rework MAINTAINERS file

Remove Mateusz. Integrate it with README.md

Signed-off-by: Mariusz Tkaczyk <mtkaczyk@kernel.org>

Move release steps to documentation/

Make room for release MAINTAINERS file.

Signed-off-by: Mariusz Tkaczyk <mtkaczyk@kernel.org>

bitmap.h: Minor fixes

Move documentation to documentation/bitmap.md. Add Neil's copyrights,
add missing license. Remove unused macros.

Signed-off-by: Mariusz Tkaczyk <mtkaczyk@kernel.org>

bitmap.h - clear __KERNEL__ based headers

It has been unused for years. Clear it.

Signed-off-by: Mariusz Tkaczyk <mtkaczyk@kernel.org>

super-ddf: optimize DDF header search for widely used RAID controllers

Implemented fallback logic to search the last 32MB of the device
for the DDF header (magic). If found, proceeds to load the DDF metadata
from the located position.

When clearing metadata as required by the mdadm --zero (function Kill),
also erase the last 32MB of data; otherwise, it may result in an
infinite loop.

According to the specification, the Anchor Header should be placed at
the end of the disk. However, some widely used RAID hardware, such as
LSI and PERC, do not position it within the last 512 bytes of the disk.
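
The fallback search can be pictured with the sketch below, which scans
an in-memory copy of the device tail backwards, one 512-byte sector at a
time. The function name, the sector-granular scan and the byte pattern
used for the magic are assumptions made for illustration; the actual
patch may search differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SECTOR 512u

/* Assumed header signature bytes as they would appear on disk. */
static const uint8_t ddf_magic[4] = { 0xDE, 0x11, 0xDE, 0x11 };

/*
 * Scan `tail` (the last `len` bytes of a device, a multiple of 512)
 * from the final sector backwards. Returns the byte offset within
 * `tail` of the first sector that starts with the magic, or -1.
 */
static long find_anchor(const uint8_t *tail, size_t len)
{
	size_t off;

	assert(tail != NULL);
	if (len < SECTOR || len % SECTOR)
		return -1;
	for (off = len - SECTOR; ; off -= SECTOR) {
		if (memcmp(tail + off, ddf_magic, sizeof(ddf_magic)) == 0)
			return (long)off;
		if (off == 0)
			break;
	}
	return -1;
}
```

Scanning from the end means a header in the expected last sector is
still found first, and controllers that place it earlier are handled by
the same loop.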

Signed-off-by: lilinzhe <llz@antiy.cn>

super-ddf: Prevent crash when handling DDF metadata

A dummy function is defined because availability of ss->update_super is
not always verified.

This fix addresses a crash reported when assembling a RAID array using
mdadm with DDF metadata. For more details, see the discussion at:
https://lore.kernel.org/all/CALHdMH30LuxR4tz9jP2ykDaDJtZ3P7L3LrZ+9e4Fq=Q6NwSM=Q@mail.gmail.com/

The discussion centers on an issue with mdadm where attempting to
assemble a RAID array caused a null pointer dereference. The problem
was traced to a missing update_super() function in super-ddf.c, which
led to a crash in Assemble.c.
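
The pattern of the fix can be sketched as follows. The handler struct,
the stub name and the caller are hypothetical and far smaller than the
real ones; the sketch only shows why filling the hook with a no-op
avoids the null pointer dereference in a caller that does not check it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced handler table with a single hook. */
struct handler_sketch {
	const char *name;
	int (*update_super)(void *sb, const char *update);
};

/* No-op stub: reports "nothing changed" instead of leaving the hook NULL. */
static int update_super_noop(void *sb, const char *update)
{
	(void)sb;
	(void)update;
	return 0;
}

static const struct handler_sketch ddf_like = {
	.name = "ddf-like",
	.update_super = update_super_noop,
};

/* A caller that, like the one described above, does not check for NULL. */
static int assemble_step(const struct handler_sketch *h, void *sb)
{
	return h->update_super(sb, "assemble");
}
```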

Signed-off-by: lilinzhe <llz@antiy.cn>

platform-intel: Disable legacy option ROM scan on UEFI machines

The legacy option ROM memory range from 0xc0000-0xeffff is not defined
on UEFI machines so don't attempt to scan it. This avoids lockdown log
spam when Secure Boot is enabled (avoids use of /dev/mem).
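
One common way to tell whether a Linux system was booted through UEFI is
to check for the /sys/firmware/efi directory. The sketch below uses that
check; whether the patch detects UEFI exactly this way is an assumption,
and the function names are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <sys/stat.h>

/* True if `path` exists and is a directory. */
static bool dir_exists(const char *path)
{
	struct stat st;

	return stat(path, &st) == 0 && S_ISDIR(st.st_mode);
}

/* Assumption: a system booted via UEFI exposes this sysfs directory. */
static bool booted_via_uefi(void)
{
	return dir_exists("/sys/firmware/efi");
}

/* Skip the legacy option ROM range entirely on UEFI machines. */
static bool should_scan_legacy_orom(void)
{
	return !booted_via_uefi();
}
```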

Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>

mdadm: fix --grow with --add for linear

For the case mdadm --grow with --add, the s.btype should not be
initialized yet, hence BitmapUnknown should be checked instead of
BitmapNone.

Note that this behaviour is only supported by md-linear, which was
removed from the kernel. However, it turns out md-linear is used widely
in home NAS and we're planning to reintroduce it soon.

Fixes: 581ba1341017 ("mdadm: remove bitmap file support")
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Mariusz Tkaczyk <mtkaczyk@kernel.org>

udev: persist properties of MD devices after switch_root

dracut installs in the initrd a custom udev rule for MD devices
(59-persistent-storage-md.rules) only to set the db_persist option (see
[1]). The main purpose is that if an MD device is activated in the initrd,
its properties are kept on the udev database after the transition from the
initrd to the rootfs. This was added to fix detection issues when LVM is
on top.

This patch would allow removing the custom udev rule shipped by dracut
(63-md-raid-arrays.rules is already being installed in the initrd), and it
will also benefit other initrd generators that do not want to create
custom udev rules.

[1] https://github.com/dracutdevs/dracut/blob/master/modules.d/90mdraid

Signed-off-by: Antonio Alvarez Feijoo <antonio.feijoo@suse.com>

mdopen: add sbin path to env PATH when call system("modprobe md_mod")

During the boot process, if mdadm is called in udev context, sbin paths
like /sbin, /usr/sbin, /usr/local/sbin are normally not defined in the
PATH env variable, so calling system("modprobe md_mod") in
create_named_array() may fail with a 'sh: modprobe: command not found'
error message.

We don't want to move modprobe binary into udev private directory, so
setting the PATH env is a more proper method to avoid the above issue.

This patch sets PATH env variable with "/sbin:/usr/sbin:/usr/local/sbin"
before calling system("modprobe md_mod"). The change only takes effect
within the udev worker context, not seen by global udev environment.
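
A minimal sketch of the approach, assuming a hypothetical helper name:
setenv() changes the environment of the calling process only, so the
shell later spawned by system() inherits the new PATH while the rest of
the system is unaffected.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SBIN_PATH "/sbin:/usr/sbin:/usr/local/sbin"

/*
 * Set PATH for this process (and the children it spawns afterwards).
 * Returns 0 on success, -1 on failure, as setenv() does.
 */
static int set_sbin_path(void)
{
	return setenv("PATH", SBIN_PATH, 1);
}
```

A caller would invoke the helper immediately before
system("modprobe md_mod") and treat a non-zero return as a reason to
skip the call.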

Signed-off-by: Coly Li <colyli@suse.de>
Signed-off-by: Mariusz Tkaczyk <mtkaczyk@kernel.org>

mdadm/raid6check: add xmalloc.h to raid6check.c

The build reports an error:
raid6check.c:324:26: error: implicit declaration of function xmalloc

Add xmalloc.h to raid6check.c file to fix this.

Signed-off-by: Xiao Ni <xni@redhat.com>
Link: https://lore.kernel.org/r/20250117071540.4094-1-xni@redhat.com
Signed-off-by: Song Liu <song@kernel.org>

Refactor continue_via_systemd()

Refactor continue_via_systemd() and its calls to make it more readable.
No functional changes.

Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>

Better error messages for broken reshape

mdadm --grow --continue has no functionality to restore critical sectors
if reshape was stopped during operation. This functionality belongs to
assemble or incremental.

This patch adds hints to error messages, to try to reassemble array in
case of reshape failure to restore critical sector, so assemble can
handle restoration.

Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>

mdadm: Do not start reshape before switchroot

There are numerous issues for --grow --continue in switchroot phase,
they include:
* Events being missed for restarting grow-continue service. This is
  apparent mostly on OS on RAID scenarios. When a checkpoint (next step)
  is committed, we have no reliable way to gracefully stop reshape until
  it reaches that checkpoint. During boot, there's heavy I/O utilisation,
  which causes sync speed drop, and naturally checkpoint takes longer to
  reach. This further causes systemd to forcefully kill grow-continue
  service due to timeouts, which results in udev event being missed for
  grow-continue service restart.
* Grow-continue (seemingly) was not designed to be restarted without
  reassembly, some things like stopping chunksize (to lower) migration
  were straight up not working until recently.
This patch makes grow-continue (actual reshape) start after switchroot
phase. This way we should not encounter issues related to restarting
the service.

Add checks to not start a reshape if in initrd; let it initialise only.
Change grow-continue udev rule to be triggered whenever there's a
reshape happening in metadata, rely on udev event to kick reshape after
switchroot. Add handle_forking helper function for reshapes to avoid
duplicating code.

Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>

Detail: Export reshape status

Display if there's an ongoing reshape happening in mdadm --detail
--export output.

This change is needed for incoming patches that will change "grow
continue" udev rules, to be based on actual array state.

Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>

Remove --freeze-reshape logic

This commit removes --freeze-reshape logic, it basically reverts
commit b76b30e0f950 ("Do not continue reshape during initrd phase").
--freeze-reshape was supposed to be used to restore critical sector in
incremental and assemble operations without starting a reshape process,
but its meaning has been lost through the years and it is not
currently used.

A replacement for this logic will be added in incoming patches, so
reshapes won't be started in initrd phase.

Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>

mdadm.man: Remove external bitmap

Remove external bitmap support from manual.

Signed-off-by: Mariusz Tkaczyk <mtkaczyk@kernel.org>

mdadm: add MAINTAINERS file

Create a maintainers file to keep track of people
to contact when dealing with mdadm questions/issues.

Signed-off-by: Nigel Croxon <ncroxon@redhat.com>

checkpatch.conf: ignore NEW_TYPEDEFS

In mdadm, we have a more flexible approach to typedefs.

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>

Incremental: Simplify remove logic

Incremental_remove() does not execute Manage_subdevs() now.

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>

sysfs: functions for writing md/<memb>/state

Add dedicated enum to reflect possible values for mentioned file.
Not all values are mapped. Add map to present sysfs keywords.
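
The enum-plus-map idea can be sketched as below. The enum and function
names are hypothetical, and only the two keywords that appear elsewhere
in this log ("faulty" and "remove") are mapped.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical enum; only two of the possible values are mapped here. */
enum memb_state_sketch {
	MEMB_FAULTY,
	MEMB_REMOVE,
	MEMB_STATE_COUNT,
};

/* Keyword written to the member's sysfs state file for each value. */
static const char *const memb_state_keywords[MEMB_STATE_COUNT] = {
	[MEMB_FAULTY] = "faulty",
	[MEMB_REMOVE] = "remove",
};

/* Return the sysfs keyword for a state, or NULL if it is not mapped. */
static const char *memb_state_keyword(enum memb_state_sketch s)
{
	if ((unsigned int)s >= MEMB_STATE_COUNT)
		return NULL;
	return memb_state_keywords[s];
}
```

Keeping the strings in one table means a caller passes an enum value and
cannot mistype a keyword at the call site.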

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>

Incremental: Document workaround

Keep it documented in code.

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>

Coverity fixes resources leaks

Handle variable going out of scope leaks the handle.

Signed-off-by: Nigel Croxon <ncroxon@redhat.com>

Release mdadm-4.4

Signed-off-by: Mariusz Tkaczyk <mtkaczyk@kernel.org>

tests: increase sleeps from 1s to 2s

The issue here is that some of the tests sporadically fail due to things
being still processed. Default 1s delays proved not to be sufficient for
newly created CI, as tests tend to occasionally fail.

This patch increases default 1s sleep to 2s, to hopefully get rid of
sporadic failures.

Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>

tests: increase sleeps for 04r5swap and 05r tests

This commit increases sleep times from 4 seconds to 6 as some of the
tests seem to be randomly failing due to this.

Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>

tests: fix 01r5fail

Increase sleep to 2s to give driver more time to stop recovery.

Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>

tests: fix 05r1-re-add-nosuper

Add one second sleep before calling check() for array to process.

Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>

tests: fix 07autoassemble

Block device check in testdev() is not sufficient as it does not account
for symlinks. Fix the check to use lsblk instead. Add mdstat check for
better debugging TC and change md0 for md127 as that will be array name
after assembly.

Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>

tests: fix 07autodetect

Change graceful exit to skip to indicate the test cannot be run.
Add some sleep after creation, let's see if that's enough.

Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>

tests: mark 07changelevels broken

Test 07changelevels can fail in multiple ways:
- R5 -> R6 migration can make driver unresponsive
- R6 -> R5 migration can fail

Mark the test as broken to clear the CI.

Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>

tests: fix 07layouts

Remove redundant backup file creation so mdadm does not complain it
already exists.

Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>

tests: remove redundant new-line from save_log()

Remove redundant new-line character from "echo" call in save_log().

Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>

tests: skip 07testreshape5 if no test_stripe

For test 07testreshape5 to succeed test_stripe binary must be first
compiled. Add check to skip test if no binary.

Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>

tests: fix 09imsm-assemble

Refactor imsm_check_removal() to give mdadm a chance to remove the
device, add retries.

Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>

tests: fix 10ddf-create

There are two issues with 10ddf-create:
- get_rootdev() failed if test was run in VM. Simplify and refactor the
function.
- test fails at assemble due to segfault. Mark test as broken to clear
the CI.

Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>

tests: fix 24raid10deadlock

Skip tests if fault injection is not enabled.
Remove 24raid10deadlock.inject_error empty file.

Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>

tests: add skip option

As for now the test either fails or succeeds. Add third option: skip.
This is to be used for tests that might not be possible to execute for
example due to missing (software) components or kernel not being
compiled with debugging options.

Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>

tests: fix tests 25raid246

This commit fixes tests 25raid246 so CI can pass.
Details:
- Change array size to 10M.
- Change filesystem from xfs to ext4 (more distros should have toolset
out of the box).
- Mark 25raid456-reshape-while-recovery as broken. It's too much effort
to fix it for now.

Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>

mdadm: remove bitmap file support

Because it has been marked deprecated for a long time now, and it is not
worth supporting it for the new bitmap.

Now that we don't need to store filename for bitmap, also declare a new
enum type bitmap_type to simplify code.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>

mdadm: ask user if bitmap is not set

Instead of auto-forcing bitmap only for large arrays, it is more
reasonable to let the user make the choice if bitmap is not set.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>

Manage: forbid re-add to the array without metadata

For the build mode or external metadata, re-add is not supported,
because it will not trigger full disk recovery. The user should add a
new disk to the array instead.

Also update test/05r1-re-add-nosuper to reflect this.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>

tests/05r1-re-add-nosuper: remove bitmap file test

Prepare to remove bitmap file support.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>

tests/04update-uuid: remove bitmap file test

Prepare to remove the bitmap file support.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>

test: Fix saving artifacts

Currently, if an error is returned by the test command, execution of
other steps is aborted. In that case, use continue-on-error to save
artifacts, but return the error later and fail the job.

If execution passed, there are no artifacts to save, therefore do not
save them.

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>

test: return fail if any failed

GH action status should be failed if any test failed.

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>

test: Log execution time

To start optimizing test suite, we need to know which tests are the
most time consuming. Log execution time after every test.

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>

imsm: fix tpv drvies check in add_to_super

Before the mentioned patch, the check verifying whether IMSM on the
current platform supports TPV (other than Intel) disks was performed
only for non-Intel disks. After it, the check is performed for all
disks. As a result, no disk can be used when the platform does not
support TPV drives; an attempt results in the following error:

mdadm: Platform configuration does not support non-Intel NVMe drives.
Please refer to Intel(R) RSTe/VROC user guide.

This change restores the check if the disk is non-Intel.

Fixes: 734e7db4dfc5 ("imsm: Remove warning and refactor add_to_super_imsm code")
Signed-off-by: Blazej Kucman <blazej.kucman@intel.com>

tests: fix "foreign" verification for nameing tests.

Mdadm supports DEVNODE in multiple forms. We cannot trust it because it
does not always reflect the name in metadata. Tests define clear
expectations, so we must use them.

Do foreign verification against WANTED_NAME instead of passed DEVNODE.

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>

platform-intel: fix buffer overflow

mdadm -C /dev/md/imsm0 -e imsm -n 2 /dev/nvme5n1 /dev/nvme4n1 -R
mdadm -C /dev/md/r0d2 -l 0 -n 2 /dev/nvme5n1 /dev/nvme4n1 -R
*** buffer overflow detected ***: terminated
Aborted (core dumped)

The issue is related to the D_FORTIFY_SOURCE=3 flag and depends on the
environment, especially the compiler version. In function
active_arrays_by_format, the length of the path buffer is calculated
dynamically based on parameters, while PATH_MAX is used in snprintf.
This may lead to a buffer overflow.

It is fixed by replacing the dynamic length calculation with PATH_MAX
for the path length.
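
The rule behind the fix is that the size passed to snprintf() must be
the real size of the destination buffer. A minimal sketch, with a
hypothetical helper name and path layout:

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>

#ifndef PATH_MAX
#define PATH_MAX 4096
#endif

/*
 * Build "<dir>/<name>" into a caller-provided PATH_MAX buffer.
 * The bound given to snprintf() equals the size of the destination,
 * so a fortified libc sees a consistent limit.
 * Returns 0 on success, -1 if the result would not fit.
 */
static int build_path(char out[PATH_MAX], const char *dir, const char *name)
{
	int n = snprintf(out, PATH_MAX, "%s/%s", dir, name);

	return (n < 0 || n >= PATH_MAX) ? -1 : 0;
}
```

Checking the return value also catches silent truncation, which
snprintf() reports by returning the length it would have needed.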

Signed-off-by: Blazej Kucman <blazej.kucman@intel.com>

CI: run mdadm tests on test scripts change

Run mdadm tests scope on every change related to test files.

Signed-off-by: Kinga Stefaniuk <kinga.stefaniuk@intel.com>

debug: add timestamps for debug messages

Timestamps on debug messages help establish what takes long to process.
Debug messages are printed only if DDEBUG flag is passed.

Add timestamps for debug messages. Remove dead code from dprintf dummies
for non-debug builds. Remove timestamps from current debug messages.
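
How a timestamp prefix could be produced is sketched below. The helper
name and the HH:MM:SS.mmm format are assumptions for illustration, not
the format chosen by the patch.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/*
 * Format `ts` as "HH:MM:SS.mmm" (UTC) into `out`.
 * Returns the number of characters snprintf() produced, or -1 on error.
 */
static int format_stamp(char *out, size_t size, const struct timespec *ts)
{
	struct tm tm;
	char hms[16];

	if (!gmtime_r(&ts->tv_sec, &tm))
		return -1;
	if (strftime(hms, sizeof(hms), "%H:%M:%S", &tm) == 0)
		return -1;
	return snprintf(out, size, "%s.%03ld", hms, ts->tv_nsec / 1000000L);
}
```

A debug macro would fill a struct timespec with clock_gettime() and
print this prefix before the message, so the gap between two lines shows
how long a step took.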

Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>

CI: assign ret to numeric value

Use variable to store tests exit status. Return its value when test
script finished.

Signed-off-by: Kinga Stefaniuk <kinga.stefaniuk@intel.com>

README: Rephrase mailing list chapter

As suggested by Dan, make it sound more welcoming.

Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>

CI: use self-hosted runner to run tests

Use prepared VM machine in GitHub actions to run mdadm tests on it.

Signed-off-by: Kinga Stefaniuk <kinga.stefaniuk@intel.com>

func.sh: do not hang when grow-continue can't finish

When grow-continue process is ongoing, sync_action indicates that
recovery is in progress. If grow-continue does not finish,
even if sync_action is not "reshape" anymore, the test should fail.

Signed-off-by: Kinga Stefaniuk <kinga.stefaniuk@intel.com>

Fix 07reshape5initr test

This test could hang if the "check" action is not written to
sync_action. If this value didn't appear, the test hung in an infinite
while loop. Add a 5 second timeout to the loop.

Signed-off-by: Kinga Stefaniuk <kinga.stefaniuk@intel.com>

imsm: add print license for VMD

Print the IMSM license for VMD controllers in --detail-platform.
The license specifies the scope of RAID support in the platform for
the VMD controller.

Signed-off-by: Blazej Kucman <blazej.kucman@intel.com>

tests: remove --auto

It is deprecated and it is not tested now.

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>

mdopen: remove wrong condition

After the mentioned patch, this condition got the opposite meaning and
it is blocking creation in cases where it was supported.

Remove it now.

Fixes: 119cdcad049e ("mdadm: drop auto= support")
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>

mdadm.conf: remove refferences to old kernels.

Remove them.

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>

md.man: Remove refferences to not supported kernel

Reader doesn't need it. Remove it.

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>

mdadm.man: Remove refferences to legacy kernels

We are not supporting kernels older than 3.10.
Update mdadm man.

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>

mdadm: drop auto= support

According to author (and what was described in man):
"With mdadm 3.0, device creation is normally left up to udev so this is
option is unlikely to be needed"
This was a workaround for kernel 2.6 family issues (partitionable and
non-partitionable arrays hell) and I believe we are far away from it now.

I'm not aware of any usage of it, hence it is removed.

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>

ReadMe: Fix stylistic issues

No functional changes, just adopt style to allow checkpatch to pass.

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>

mdmon: delegate removal to managemon

Starting from [1], kernel requires suspend lock on member drive remove
path. It causes deadlock with external management because monitor
thread may be locked on suspend and is unable to switch array to active,
for example if badblock is reported in this time.

Removal is a blocking action now, so it must be delegated to the
managemon thread, but we must ensure that monitor does the metadata
update first, just after detecting faulty.

This patch adds appropriate support. The monitor thread detects "faulty"
and updates the metadata. After that, it asks the manager thread to
remove the device. The manager must be careful because closing
descriptors used by select() may lead to abort with D_FORTIFY_SOURCE=2.
First, it must ensure that device descriptors are not used by monitor.

There is an unlimited number of remove retries and recovery is blocked
until all failed drives are removed. It is safe because a "faulty"
device is no longer used by MD.

The issue will also be mitigated by an optimization on the badblock
recording path in the kernel. It will check that the device is not
failed before a badblock is recorded, but relying on this is not
ideologically correct. Userspace must keep compatibility with the kernel
and since it is a blocking action, we must treat it as a blocking action.

[1] kernel commit cfa078c8b80d ("md: use new apis to suspend array
for adding/removing rdev from state_store()")

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>

monitor: Add DS_EXTERNAL_BB flag

If this is set, then metadata handler must support external badblocks.
Remove checks for superswitch functions. If mdi->state_fd is not set
then we should not try to record badblock, we cannot trust this device.

No functional changes.

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>

sysfs: add sysfs_open_memb_attr()

Function is added to not repeat defining "dev-%s", disk_name.
Related code branches are updated. Ioctl way for setting disk
faulty/remove is removed, sysfs is always used now.

Some non functional style issues are fixed in Manage_subdevs().

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>

[PATCH] mdadm: Grow.c distinguish takeover vs reshape on grow operation

Correcting the terminology on the output when doing a takeover
vs a reshape.

Signed-off-by: Nigel Croxon <ncroxon@redhat.com>
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>

mdadm/Grow: Check new_level interface rather than kernel version

Different OS distributions ship different kernel versions.
Check new_level sysfs interface rather than kernel version.

Signed-off-by: Xiao Ni <xni@redhat.com>
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>

mdadm/Manage: Clear superblock if adding new device fails

The superblock is kept if adding new device fails. It should clear the
superblock if it fails to add a new disk.

Signed-off-by: Xiao Ni <xni@redhat.com>
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>

util: use only /dev directory in open_dev()

Previously, open_dev() tried to open the device in two ways: using the
/dev and the /tmp directory. It can be called by users which have no
access to the /tmp directory (e.g. udev), in which case dev_open()
fails, which may affect many processes. Stop trying to open in the /tmp
directory.

Signed-off-by: Kinga Stefaniuk <kinga.stefaniuk@intel.com>

mdadm.man: Add udev-rules flag

--udev-rules flag is added and points to mdadm.conf man page
for further explanations about POLICY.

Signed-off-by: Andre Paiusco <github@paiusco.org>

mdadm.conf.man: Explain udev rule

Clarify a filename is accepted and the need of reloading the
udev rules.

Small correction on example order.

Signed-off-by: Andre Paiusco <github@paiusco.org>

mdadm: Add mdadm_status.h

Move mdadm_status_t to mdadm_status.h file. Add status for memory
allocation failure.

Signed-off-by: Anna Sztukowska <anna.sztukowska@intel.com>

mdadm.man: elaborate more about mdmonitor.service

Describe how it behaves and how it can be configured to work.

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>

mdmonitor: Abandon custom configuration files

Operating system vendors are customizing the mdmonitor service because
the default form is not satisfying for them (except SUSE). As a result,
support is complicated (maintainers have to check the system) and the
man page is not detailed.

I propose to abandon custom configuration files via sysconfig and keep
it inside mdadm.conf only.

Detailed comment in service for OSV maintainers is added to help with
transition.

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>

super-intel: move scsi_get_serial from sg_io

scsi_get_serial() function is used only by super-intel.c. Move function
to this file and remove sg_io.c file.

Signed-off-by: Kinga Stefaniuk <kinga.stefaniuk@intel.com>

Rename Monitor.c to mdmonitor.c

Rename Monitor.c to mdmonitor.c to avoid errors during compilation on
case-insensitive filesystems.

Signed-off-by: Kinga Stefaniuk <kinga.stefaniuk@intel.com>

util: fix sys_hot_remove_disk()

Instead of "remove", "faulty" was called.

Fixes: d95edceb362a ("sysfs: add function for writing to sysfs fd")
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>

md.man: update refference to raid5-ppl.rst

Documentation/md has moved to Documentation/driver-api/md.
Update and rework sentence.

Remove reference to not supported kernel close to updated text.

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>

mdadm: add xmalloc.h

Move memory declaration helpers outside mdadm.h. They seem to be
useful so keep them but include them separately. Rework them to not
refer to Name[] declared internally in mdadm/mdmon.

This is the first step to start simplifying mdadm.h.

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>

Mdmonitor: Fix startup with missing directory

Commit 0a07dea8d3b78 ("Mdmonitor: Refactor check_one_sharer() for
better error handling") introduced an issue, if directory /run/mdadm
is missing, monitor fails to start. Move the directory creation
earlier to ensure it is always created.

Signed-off-by: Anna Sztukowska <anna.sztukowska@intel.com>

sysfs: add function for writing to sysfs fd

Proposed function sysfs_write_descriptor() unifies error handling for
write() done to sysfs files. Main purpose is to use it with MD sysfs
file but it can be used elsewhere.
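
What "unified error handling for write()" can look like is sketched
below. The function name, signature and the choice of EIO for a short
write are assumptions for illustration and do not reproduce the real
helper.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>

/*
 * Write the whole string `value` to `fd` in one write() call.
 * Returns 0 on success. On failure returns -1 and stores the cause in
 * *err; a short write is reported as EIO (a choice made for this sketch).
 */
static int write_attr(int fd, const char *value, int *err)
{
	size_t len = strlen(value);
	ssize_t n = write(fd, value, len);

	if (n < 0) {
		*err = errno;
		return -1;
	}
	if ((size_t)n != len) {
		*err = EIO;
		return -1;
	}
	*err = 0;
	return 0;
}
```

With one helper, every caller gets the same treatment of failed and
short writes instead of repeating the checks at each call site.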

No functional changes.

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>

Incremental: Rename IncrementalRemove

Rename it to Incremental_remove for better readability.
No functional changes.

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>

CI: do not install unnecessary packages

Updating all of the packages every time is not needed and costs a lot of
resources. Install only necessary packages and their dependencies.

Signed-off-by: Kinga Stefaniuk <kinga.stefaniuk@intel.com>

Remove INSTALL and dev/null

INSTALL is not needed because it was added to README.md
dev/null was created accidentally.

Remove them.

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>

mdadm/Manage: record errno

Sometimes it reports:
mdadm: failed to stop array /dev/md0: Success
The reason is that errno is reset. So record errno during the loop.
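
The pattern can be sketched as follows. Both helpers are hypothetical
stand-ins: one always fails with EBUSY, the other resets errno the way
an intervening library call might.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical operation that fails with EBUSY on every attempt. */
static int try_stop(void)
{
	errno = EBUSY;
	return -1;
}

/* Hypothetical step between retries that clobbers errno. */
static void wait_a_bit(void)
{
	errno = 0;
}

/* Retry loop that remembers why the last attempt failed. */
static int stop_with_retries(int attempts, int *err_out)
{
	int saved = 0;
	int rv = -1;

	while (attempts-- > 0) {
		rv = try_stop();
		if (rv == 0)
			break;
		saved = errno; /* record before anything else runs */
		wait_a_bit();
	}
	*err_out = rv ? saved : 0;
	return rv;
}
```

Reporting the saved value instead of the live errno is what turns
"Success" into the real cause in the error message.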

Signed-off-by: Xiao Ni <xni@redhat.com>
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>

mdadm/tests: remove 09imsm-assemble.broken

09imsm-assemble can run successfully.

Signed-off-by: Xiao Ni <xni@redhat.com>
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>

mdadm/tests: 07testreshape5 fix

Init dir to avoid test failure.

Signed-off-by: Xiao Ni <xni@redhat.com>
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>

mdadm/tests: Remove 07reshape5intr.broken

07reshape5intr can run successfully now.

Signed-off-by: Xiao Ni <xni@redhat.com>
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>

mdadm/tests: 07changelevels fix

There are five changes to this case.

1. Remove the testdev check. It can't work anymore; check if it's a
block device directly.

2. It can't change level and chunk size at the same time.

3. Sleep more than 10s before check wait.
The test devices are small. Sometimes the reshape finishes very quickly,
right after it starts, and mdadm will be stuck waiting for the reshape
to start. So the sync speed is limited, and it is restored while waiting
for the reshape to finish. This is good for the case without a backup
file.

When a backup file is specified, the systemd service
mdadm-grow-continue monitors reshape progress. If the reshape finishes
before the service starts monitoring reshape progress, the daemon will
be stuck too, because reshape_progress is 0, which means the reshape
hasn't been started. So give the service more time to get the right
information from kernel space.

But before getting this information, it needs to suspend the array. At
the same time the reshape is running. The kernel reshape daemon will
update metadata 10s. So it needs to limit the sync speed more than 10s
before restoring sync speed. Then the systemd service can suspend the
array and start monitoring reshape progress.

4. Wait until the mdadm-grow-continue service exits.
mdadm --wait doesn't wait for the systemd service. For the case that
needs a backup file, the systemd service deletes the backup file after
the reshape finishes. In this test case, the next case runs as soon as
the reshape finishes, and it fails because it can't create the backup
file: the backup file still exists.

5. Don't reshape from raid5 to raid1. It can't work now.

Signed-off-by: Xiao Ni <xni@redhat.com>
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>

mdadm/tests: wait until level changes

check wait waits for the reshape to finish, but it doesn't wait for the
level change. The level change happens in a forked child process. So we
need to search for the child process and monitor it.

Signed-off-by: Xiao Ni <xni@redhat.com>
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>

mdadm/Grow: sleep a while after removing disk in impose_level

Disks need to be removed when reshaping from raid456 to raid0. Kernel
space then sets MD_RECOVERY_RUNNING, and the level change fails. So wait
some time to let the md thread clear this flag.

This is found by test case 05r6tor0.

Signed-off-by: Xiao Ni <xni@redhat.com>
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>

mdadm/Grow: Can't open raid when running --grow --continue

It passes 'array' as devname in Grow_continue. So it fails to
open raid device. Use mdinfo to open raid device.

Signed-off-by: Xiao Ni <xni@redhat.com>
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>

mdadm/Grow: Update reshape_progress to need_back after reshape finishes

It tries to update data offset when kicking off reshape. If it can't
change data offset, it needs to use child_monitor to monitor reshape
progress and do back up job. And it needs to update reshape_progress
to need_back when reshape finishes. If not, it will be in an infinite
loop.

Signed-off-by: Xiao Ni <xni@redhat.com>
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>

mdadm/Grow: Update new level when starting reshape

Reshape needs to specify a backup file when it can't update data offset
of member disks. For this situation, first, it starts reshape and then
it kicks off mdadm-grow-continue service which does backup job and
monitors the reshape process. The service is a new process, so it needs
to read superblock from member disks to get information.

But in the first step, it doesn't update the new level in the
superblock. So it can't change the level after the reshape finishes,
because the new level is not right. So record the new level in the first
step.

Signed-off-by: Xiao Ni <xni@redhat.com>
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>