git.ipfire.org Git - thirdparty/mdadm.git/log
13 months ago
Xiao Ni [Fri, 7 Apr 2023 00:45:28 +0000 (08:45 +0800)] 
Remove the config files in mdcheck_start|continue service

We set MDADM_CHECK_DURATION in the mdcheck_start|continue.service files.
And mdcheck doesn't use any configs from the config file. So we can remove
the dependencies.

Signed-off-by: Xiao Ni <xni@redhat.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
13 months ago
Jes Sorensen [Mon, 10 Apr 2023 15:45:34 +0000 (11:45 -0400)] 
Bump minimum kernel version to 2.6.32

Summary: At this point it probably is reasonable to drop support for
anything prior to 3.10.

Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
13 months ago
Jes Sorensen [Mon, 10 Apr 2023 15:40:42 +0000 (11:40 -0400)] 
Fix some cases of eyesore formatting

Summary: No functional change; this just makes the code more readable.

Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
13 months ago
Hristo Venev [Sat, 1 Apr 2023 20:01:35 +0000 (23:01 +0300)] 
super1: fix truncation check for journal device

The journal device can be smaller than the component devices.

Fixes: 171e9743881e ("super1: report truncated device")
Signed-off-by: Hristo Venev <hristo@venev.name>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
13 months ago
miaoguanqin [Tue, 4 Apr 2023 11:31:24 +0000 (19:31 +0800)] 
Fix null pointer for incremental in mdadm

When we execute mdadm --assemble, udev-md-raid-assembly.rules is triggered.
When we then stop the array, mdadm --incremental dumps core. The call
stack is as follows:

#0  enough (level=10, raid_disks=4, layout=258, clean=1,
    avail=avail@entry=0x0) at util.c:555
#1  0x0000562170c26965 in Incremental (devlist=<optimized out>,
    c=<optimized out>, st=0x5621729b6dc0) at Incremental.c:514
#2  0x0000562170bfb6ff in main (argc=<optimized out>,
    argv=<optimized out>) at mdadm.c:1762

enough() uses the avail array, which is allocated in count_active().
count_active() may not allocate it, so avail can be NULL and enough()
dereferences a NULL pointer. Fix this core dump.

Signed-off-by: Guanqin Miao <miaoguanqin@huawei.com>
Signed-off-by: lixiaokeng <lixiaokeng@huawei.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
13 months ago
Mateusz Grzonka [Thu, 23 Mar 2023 11:50:00 +0000 (12:50 +0100)] 
Create: Fix checking for container in update_metadata

The commit 8a4ce2c05386 ("Create: Factor out add_disks() helpers")
introduced a regression that caused timeouts and udev failing to create
links.

Steps to reproduce the issue:
$ mdadm -CR imsm -e imsm -n4 /dev/nvme[0-3]n1
$ mdadm -CR vol -l5 -n4 /dev/nvme[0-3]n1 --assume-clean

I found the check for container was wrong because negation was missing.

Fixes: 8a4ce2c05386 ("Create: Factor out add_disks() helpers")
Signed-off-by: Mateusz Grzonka <mateusz.grzonka@intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
13 months ago
Mariusz Tkaczyk [Thu, 23 Mar 2023 16:13:18 +0000 (17:13 +0100)] 
Revert "Revert "mdadm/systemd: remove KillMode=none from service file""

This reverts commit 28a083955c6f58f8e582734c8c82aff909a7d461.

Resolved by commit 723d1df4946e ("mdmon: Improve switchroot
interactions.") We are ready to drop it.

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
13 months ago
NeilBrown [Mon, 20 Mar 2023 03:43:54 +0000 (14:43 +1100)] 
Improvements for IMSM_NO_PLATFORM testing.

Factor out IMSM_NO_PLATFORM testing into a single function that caches
the result.

Allow mdmon to explicitly set the result to "1" so that we don't need
the ENV var in the unit file.

Check if the kernel command line contains "mdadm.imsm.test=1" and in
that case assert NO_PLATFORM.  This simplifies testing in a virtual
machine.
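
The caching and the kernel command line check described above could look
roughly like the following sketch. Only the IMSM_NO_PLATFORM variable and
the "mdadm.imsm.test=1" token come from this commit message; the function
names (cmdline_has_token, imsm_set_no_platform_sketch,
imsm_no_platform_sketch) are hypothetical and are not taken from the mdadm
source, and the whole-word matching rule is an assumption.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Return 1 if 'token' appears in 'cmdline' as a whole
 * whitespace-separated word, 0 otherwise. */
int cmdline_has_token(const char *cmdline, const char *token)
{
    size_t len;
    const char *p = cmdline;

    assert(cmdline != NULL && token != NULL);
    len = strlen(token);
    if (len == 0)
        return 0;

    while ((p = strstr(p, token)) != NULL) {
        int starts_word = (p == cmdline || p[-1] == ' ' || p[-1] == '\t');
        int ends_word = (p[len] == '\0' || p[len] == ' ' ||
                         p[len] == '\t' || p[len] == '\n');

        if (starts_word && ends_word)
            return 1;
        p++;
    }
    return 0;
}

/* -1: not evaluated yet, 0: platform checks apply, 1: NO_PLATFORM asserted */
static int no_platform_cached = -1;

/* Hypothetical setter, so a daemon can force the result without an
 * environment variable in its unit file. */
void imsm_set_no_platform_sketch(int value)
{
    no_platform_cached = value ? 1 : 0;
}

/* Evaluate once, then return the cached answer on every later call. */
int imsm_no_platform_sketch(void)
{
    char buf[4096];
    const char *env;
    FILE *f;

    if (no_platform_cached >= 0)
        return no_platform_cached;

    no_platform_cached = 0;

    env = getenv("IMSM_NO_PLATFORM");
    if (env && strcmp(env, "1") == 0) {
        no_platform_cached = 1;
        return no_platform_cached;
    }

    f = fopen("/proc/cmdline", "r");
    if (f) {
        if (fgets(buf, sizeof(buf), f) &&
            cmdline_has_token(buf, "mdadm.imsm.test=1"))
            no_platform_cached = 1;
        fclose(f);
    }
    return no_platform_cached;
}
```

Keeping the token matcher separate from the file access makes the parsing
rule easy to exercise without booting a virtual machine.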

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
13 months ago
NeilBrown [Tue, 14 Mar 2023 00:06:25 +0000 (11:06 +1100)] 
mdopen: always try create_named_array()

mdopen() will use create_named_array() to ask the kernel to create the
given md array, but only if it is given a number or name.
If it is NOT given a name and is required to choose one itself using
find_free_devnm() it does NOT use create_named_array().

On kernels with CONFIG_BLOCK_LEGACY_AUTOLOAD not set, this can result in
failure to assemble an array.  This can particularly be seen when the
"name" of the array begins with a host name different to the name of the
host running the command.

So add the missing call to create_named_array().

Link: https://bugzilla.kernel.org/show_bug.cgi?id=217074
Signed-off-by: NeilBrown <neilb@suse.de>
Acked-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
13 months ago
NeilBrown [Mon, 13 Mar 2023 03:42:58 +0000 (14:42 +1100)] 
mdmon: Improve switchroot interactions.

We need a new mdmon@mdfoo instance to run in the root filesystem after
switch root, as /sys and /dev are removed from the initrd.

systemd will not start a new unit with the same name while the
old unit is still active, and we want the two mdmon processes to overlap
in time to avoid any risk of deadlock, which can happen when a write is
attempted with no mdmon running.

So we need a different unit name in the initrd than in the root.  Apart
from the name, everything else should be the same.

This is easily achieved using a different instance name as the
mdmon@.service unit file already supports multiple instances (for
different arrays).

So start "mdmon@mdfoo.service" from root, but
"mdmon@initrd-mdfoo.service" from the initrd.  udev can tell which
circumstance is the case by looking for /etc/initrd-release.
continue_from_systemd() is enhanced so that the "initrd-" prefix can be
requested.

Teach mdmon that a container name like "initrd/foo" should be treated
just like "foo".  Note that systemd passes the instance name
"initrd-foo" as "initrd/foo".

We don't need a similar mechanism at shutdown because dracut runs
"mdmon --takeover --all" when appropriate.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
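
For the mdmon switchroot commit above, the unit naming and the handling of
the "initrd/" prefix can be illustrated with a small sketch. The unit names
and the "initrd/foo" form come from the commit message; the function names
mdmon_unit_name_sketch and strip_initrd_prefix_sketch are hypothetical and
do not claim to match the mdadm source.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build "mdmon@<devnm>.service", or "mdmon@initrd-<devnm>.service"
 * when running from the initrd. Returns 0 on success, -1 if the
 * name does not fit into buf. */
int mdmon_unit_name_sketch(char *buf, size_t len, const char *devnm,
                           int in_initrd)
{
    int n;

    assert(buf != NULL && devnm != NULL);
    n = snprintf(buf, len, "mdmon@%s%s.service",
                 in_initrd ? "initrd-" : "", devnm);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Treat a container name such as "initrd/foo" exactly like "foo". */
const char *strip_initrd_prefix_sketch(const char *name)
{
    static const char prefix[] = "initrd/";
    size_t plen = strlen(prefix);

    assert(name != NULL);
    if (strncmp(name, prefix, plen) == 0)
        return name + plen;
    return name;
}
```

With devnm "mdfoo" this yields the two unit names quoted in the commit
message, so both instances can be active at the same time.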
13 months ago
NeilBrown [Mon, 13 Mar 2023 03:42:58 +0000 (14:42 +1100)] 
mdmon: Remove need for KillMode=none

mdmon needs to keep running during the switchroot out of (at boot) and
then back into (at shutdown) the initrd.  It runs until a new mdmon
takes over.

KillMode=none is used to achieve this, with the help of --offroot which
sets argv[0][0] to '@' which systemd understands.

This is needed because mdmon is currently run in system-mdmon.slice
which conflicts with shutdown.target so without KillMode=none mdmon
would get killed early in shutdown when system-mdmon.slice is removed.

As described in systemd.service(5), this conflict with shutdown can be
resolved by explicitly requesting system.slice, which is a natural
counterpart to DefaultDependencies=no.

So add that, and also add IgnoreOnIsolate=true to avoid another possible
source of an early death.  With these we no longer need KillMode=none
which the systemd developers have marked as "deprecated".

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
13 months ago
NeilBrown [Mon, 13 Mar 2023 03:42:58 +0000 (14:42 +1100)] 
mdmon: change systemd unit file to use --foreground

There is no value in mdmon forking when it is running under systemd -
systemd can still track it anyway.

So add --foreground option, and remove "Type=forking".

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
13 months ago
NeilBrown [Mon, 13 Mar 2023 03:42:58 +0000 (14:42 +1100)] 
mdmon: don't test both 'all' and 'container_name'.

If 'all' is not set, then container_name must be NULL, as nothing else
can set it.  So simplify the test to ignore container_name.
This makes the purpose of the code more obvious.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
13 months ago
NeilBrown [Mon, 13 Mar 2023 03:42:58 +0000 (14:42 +1100)] 
Use existence of /etc/initrd-release to detect initrd.

Since v183, systemd has used the existence of /etc/initrd-release to
detect if it is running in an initrd, rather than looking at the magic
number of the root filesystem's device.  It is time for mdadm to do the
same.
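
A minimal sketch of such a check, assuming a plain existence test on the
path is sufficient. The helper names path_exists_sketch and
in_initrd_sketch are hypothetical; only the /etc/initrd-release path comes
from the commit message.

```c
#include <assert.h>
#include <unistd.h>

/* Return 1 if 'path' exists (of any file type), 0 otherwise. */
int path_exists_sketch(const char *path)
{
    assert(path != NULL);
    return access(path, F_OK) == 0;
}

/* Return 1 when running inside an initrd, judged only by the
 * presence of /etc/initrd-release. */
int in_initrd_sketch(void)
{
    return path_exists_sketch("/etc/initrd-release");
}
```

Compared with inspecting the root filesystem's magic number, this needs no
knowledge of which filesystem types an initrd may use.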

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
13 months ago
Khem Raj [Wed, 18 Jan 2023 08:32:36 +0000 (00:32 -0800)] 
Define alignof using _Alignof when using C11 or newer

WG14 N2350 made it very clear that having type definitions within
"offsetof" is undefined behavior [1]. This patch changes the implementation
of the alignof_slot macro to use the builtin "_Alignof", which avoids the
undefined behavior when building with -std=c11 or newer.

clang 16+ has started to flag this [2]

Fixes build when using -std >= gnu11 and using clang16+

Older compilers (gcc < 4.9 or clang < 8) have a buggy _Alignof even though
they may support C11, so exclude those compilers too.

[1] https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2350.htm
[2] https://reviews.llvm.org/D133574
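
The selection logic described above might be written along the following
lines. This is a sketch under stated assumptions, not the patch itself: the
exact preprocessor condition is reconstructed from the version numbers in
the commit message, and is_unaligned_u32_sketch is a hypothetical helper
added only to show the macro in use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L && \
    ((defined(__clang__) && __clang_major__ >= 8) || \
     (!defined(__clang__) && defined(__GNUC__) && \
      (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 9))))
/* C11 or newer with a compiler whose _Alignof is trusted. */
#define alignof_slot(type) _Alignof(type)
#else
/* Traditional fallback: a type definition inside offsetof(). */
#define alignof_slot(type) offsetof(struct { char c; type x; }, x)
#endif

/* Return 1 if p is not suitably aligned for a uint32_t access. */
int is_unaligned_u32_sketch(const void *p)
{
    assert(p != NULL);
    return ((uintptr_t)p % alignof_slot(uint32_t)) != 0;
}
```

Both branches produce the same value on common platforms; the difference
is only whether the compiler has to accept a struct definition inside
offsetof().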

Upstream-Status: Pending
Signed-off-by: Khem Raj <raj.khem@gmail.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
14 months ago
Logan Gunthorpe [Wed, 1 Mar 2023 20:41:35 +0000 (13:41 -0700)] 
manpage: Add --write-zeroes option to manpage

Document the new --write-zeroes option in the manpage.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Kinga Tanska <kinga.tanska@linux.intel.com>
Reviewed-by: Xiao Ni <xni@redhat.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
14 months ago
Logan Gunthorpe [Wed, 1 Mar 2023 20:41:34 +0000 (13:41 -0700)] 
tests/00raid5-zero: Introduce test to exercise --write-zeroes.

Attempt to create a raid5 array with --write-zeroes. If it is successful,
check the array to ensure it is in sync.

If it is unsuccessful and an unsupported error is printed, skip the
test.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Kinga Tanska <kinga.tanska@linux.intel.com>
Reviewed-by: Xiao Ni <xni@redhat.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
14 months ago
Logan Gunthorpe [Wed, 1 Mar 2023 20:41:33 +0000 (13:41 -0700)] 
mdadm: Add --write-zeroes option for Create

Add the --write-zeroes option for Create which will send a write zeroes
request to all the disks before assembling the array. After zeroing
the array, the disks will be in a known clean state and the initial
sync may be skipped.

Writing zeroes is best used when there is a hardware offload method
to zero the data. But even still, zeroing can take several minutes on
a large device. Because of this, all disks are zeroed in parallel using
their own forked process and a message is printed to the user. The main
process will proceed only after all the zeroing processes have completed
successfully.
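
The "one forked process per disk, then wait for all of them" structure
could be sketched as below. The zeroing request itself is replaced by a
placeholder callback, and all names (disk_job_fn, run_on_all_disks_sketch,
MAX_DISKS_SKETCH and the two demo jobs) are hypothetical; this does not
claim to reproduce the code in Create.c.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_DISKS_SKETCH 64

/* A per-disk job returns 0 on success and non-zero on failure. */
typedef int (*disk_job_fn)(int disk_index);

/* Run 'job' for every disk in its own child process and wait for all
 * children. Returns 0 only if every child exited with status 0. */
int run_on_all_disks_sketch(int ndisks, disk_job_fn job)
{
    pid_t pids[MAX_DISKS_SKETCH];
    int started = 0;
    int failed = 0;
    int i;

    assert(job != NULL);
    if (ndisks < 0 || ndisks > MAX_DISKS_SKETCH)
        return -1;

    for (i = 0; i < ndisks; i++) {
        pid_t pid = fork();

        if (pid < 0) {
            failed = 1;     /* stop forking, still reap the others */
            break;
        }
        if (pid == 0)
            _exit(job(i) == 0 ? 0 : 1);
        pids[started++] = pid;
    }

    for (i = 0; i < started; i++) {
        int status;

        if (waitpid(pids[i], &status, 0) < 0 ||
            !WIFEXITED(status) || WEXITSTATUS(status) != 0)
            failed = 1;
    }
    return failed ? -1 : 0;
}

/* Demo jobs standing in for the real zeroing request. */
int demo_job_ok_sketch(int disk_index)
{
    (void)disk_index;
    return 0;
}

int demo_job_fail_second_sketch(int disk_index)
{
    return disk_index == 1 ? -1 : 0;
}
```

The parent reaps every child it started even after a failure, so no
zeroing process is left behind when the overall operation is reported as
failed.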

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Kinga Tanska <kinga.tanska@linux.intel.com>
Reviewed-by: Xiao Ni <xni@redhat.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
14 months ago
Logan Gunthorpe [Wed, 1 Mar 2023 20:41:32 +0000 (13:41 -0700)] 
mdadm: Introduce pr_info()

Feedback was given to avoid informational pr_err() calls that print
to stderr, even though that's done throughout the code.

Using printf() directly doesn't maintain the same format (an "mdadm"
prefix on every line).

So introduce pr_info() which prints to stdout with the same format
and use it for a couple of informational pr_err() calls in Create().

Future work can use this call in more places.
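
As an illustration of the idea, a prefixed informational printer could be
sketched as follows. The names format_info_sketch, pr_info_sketch and
prog_name_sketch are hypothetical, and the sketch formats into a buffer
first only so that the prefix handling is easy to check; it is not the
macro added by this commit.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

static const char prog_name_sketch[] = "mdadm";

/* Format "<prog>: <message>" into buf. Returns the number of
 * characters written, or -1 if the result does not fit. */
int format_info_sketch(char *buf, size_t len, const char *fmt, ...)
{
    va_list ap;
    int prefix, body;

    assert(buf != NULL && fmt != NULL);
    prefix = snprintf(buf, len, "%s: ", prog_name_sketch);
    if (prefix < 0 || (size_t)prefix >= len)
        return -1;

    va_start(ap, fmt);
    body = vsnprintf(buf + prefix, len - (size_t)prefix, fmt, ap);
    va_end(ap);
    if (body < 0 || (size_t)body >= len - (size_t)prefix)
        return -1;
    return prefix + body;
}

/* Informational messages go to stdout, errors stay on stderr. */
void pr_info_sketch(const char *msg)
{
    char line[256];

    if (format_info_sketch(line, sizeof(line), "%s", msg) >= 0)
        fputs(line, stdout);
}
```

Routing informational output to stdout keeps stderr free for real errors,
which matters for scripts that treat any stderr output as a failure.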

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Kinga Tanska <kinga.tanska@linux.intel.com>
Reviewed-by: Xiao Ni <xni@redhat.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Acked-by: Coly Li <colyli@suse.de>
Acked-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
14 months ago
Logan Gunthorpe [Wed, 1 Mar 2023 20:41:31 +0000 (13:41 -0700)] 
Create: Factor out add_disks() helpers

The Create function is massive with a very large number of variables.
Reading and understanding the function is almost impossible. To help
with this, factor out the two pass loop that adds the disks to the array.

This moves about 160 lines into three new helper functions and removes
a bunch of local variables from the main Create function. The main new
helper function add_disks() does the two pass loop and calls into
add_disk_to_super() and update_metadata(). Factoring out the
latter two helpers also helps to reduce a ton of indentation.

No functional changes intended.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Kinga Tanska <kinga.tanska@linux.intel.com>
Reviewed-by: Xiao Ni <xni@redhat.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
14 months ago
Logan Gunthorpe [Wed, 1 Mar 2023 20:41:30 +0000 (13:41 -0700)] 
Create: remove safe_mode_delay local variable

Every .getinfo_super() call sets the info.safe_mode_delay variable
to a constant value, so no matter what the current state is
that function will always set it to the same value.

Create() calls .getinfo_super() multiple times while creating the array.
The value is stored in a local variable for every disk in the loop
to add disks (so the call for the last disk takes precedence). The local
variable is then used in the call to sysfs_set_safemode().

This can be simplified by using info.safe_mode_delay directly. The info
variable had .getinfo_super() called on it early in the function so, by the
reasoning above, it will have the same value as the local variable which
can thus be removed.

Doing this allows for factoring out code from Create() in a subsequent
patch.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Kinga Tanska <kinga.tanska@linux.intel.com>
Reviewed-by: Xiao Ni <xni@redhat.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
14 months ago
Logan Gunthorpe [Wed, 1 Mar 2023 20:41:29 +0000 (13:41 -0700)] 
Create: goto abort_locked instead of return 1 in error path

The return 1 after the fstat_is_blkdev() check should be replaced
with an error return that goes through the error path to unlock
resources locked by this function.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Kinga Tanska <kinga.tanska@linux.intel.com>
Reviewed-by: Xiao Ni <xni@redhat.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
14 months ago
Wu Guanghao [Fri, 3 Mar 2023 16:21:35 +0000 (00:21 +0800)] 
super-ddf.c: fix memleak in get_vd_num_of_subarray()

sra, returned by sysfs_read(), should be freed before returning from
get_vd_num_of_subarray().

Signed-off-by: Wu Guanghao <wuguanghao3@huawei.com>
Acked-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
14 months ago
Wu Guanghao [Fri, 3 Mar 2023 16:21:34 +0000 (00:21 +0800)] 
super-intel.c: fix memleak in find_disk_attached_hba()

disk_path is allocated by diskfd_to_devpath(), so we need to
free(disk_path) before returning; otherwise there will be a memory leak.

Signed-off-by: Wu Guanghao <wuguanghao3@huawei.com>
Reviewed-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
14 months ago
Wu Guanghao [Fri, 3 Mar 2023 16:21:33 +0000 (00:21 +0800)] 
super-intel.c: fix double free in load_imsm_mpb()

In load_imsm_mpb() there is a potential double free issue on super->buf.

The first location to free super->buf is from get_super_block() <==
load_and_parse_mpb() <== load_imsm_mpb():
 4514         if (posix_memalign(&super->migr_rec_buf, MAX_SECTOR_SIZE,
 4515             MIGR_REC_BUF_SECTORS*MAX_SECTOR_SIZE) != 0) {
 4516                 pr_err("could not allocate migr_rec buffer\n");
 4517                 free(super->buf);
 4518                 return 2;
 4519         }

If the above error condition happens, super->buf is freed and value 2
is returned to get_super_block() eventually. Then in the following code
block inside load_imsm_mpb(),
 5289  error:
 5290         if (!err) {
 5291                 s->next = *super_list;
 5292                 *super_list = s;
 5293         } else {
 5294                 if (s)
 5295                         free_imsm(s);
 5296                 close_fd(&dfd);
 5297         }
at line 5295 when free_imsm() is called, super->buf is freed again from
the call chain free_imsm() <== __free_imsm(), in following code block,
 4651         if (super->buf) {
 4652                 free(super->buf);
 4653                 super->buf = NULL;
 4654         }

This patch sets super->buf as NULL after line 4517 in load_imsm_mpb()
to avoid the potential double free().
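
The pattern behind this fix, clearing a pointer right after freeing it so
that a later cleanup routine skips it, can be shown in isolation. The
struct and function names below (super_sketch, load_sketch, free_sketch)
are hypothetical stand-ins and deliberately much simpler than the
super-intel.c code quoted above.

```c
#include <assert.h>
#include <stdlib.h>

struct super_sketch {
    void *buf;
    void *migr_rec_buf;
};

/* Allocate both buffers. On failure of the second allocation the
 * first one is released and its pointer is cleared. */
int load_sketch(struct super_sketch *super, int simulate_alloc_failure)
{
    assert(super != NULL);

    super->buf = malloc(512);
    if (!super->buf)
        return 1;

    super->migr_rec_buf = simulate_alloc_failure ? NULL : malloc(512);
    if (!super->migr_rec_buf) {
        free(super->buf);
        /* Without this line free_sketch() would free buf again. */
        super->buf = NULL;
        return 2;
    }
    return 0;
}

/* Common cleanup, safe to call after a failed load_sketch(). */
void free_sketch(struct super_sketch *super)
{
    assert(super != NULL);

    if (super->buf) {
        free(super->buf);
        super->buf = NULL;
    }
    if (super->migr_rec_buf) {
        free(super->migr_rec_buf);
        super->migr_rec_buf = NULL;
    }
}
```

Because the cleanup routine already tests the pointer before freeing, the
one-line assignment in the error path is enough to make the two code paths
compatible.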

(Coly Li helped to re-compose the commit log)

Signed-off-by: Wu Guanghao <wuguanghao3@huawei.com>
Reviewed-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
14 months ago
Wu Guanghao [Fri, 3 Mar 2023 16:21:32 +0000 (00:21 +0800)] 
Detail.c: fix memleak in Detail()

char *sysdev is allocated by xstrdup() but never freed inside the for
loop, which causes a memory leak.

Signed-off-by: Wu Guanghao <wuguanghao3@huawei.com>
Acked-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
14 months ago
Wu Guanghao [Fri, 3 Mar 2023 16:21:31 +0000 (00:21 +0800)] 
util.c: fix memleak in parse_layout_faulty()

char *m is allocated by xstrdup() but not freed before returning, which
causes a memory leak.

Signed-off-by: Wu Guanghao <wuguanghao3@huawei.com>
Acked-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
14 months ago
Coly Li [Fri, 3 Mar 2023 16:21:30 +0000 (00:21 +0800)] 
util.c: reorder code lines in parse_layout_faulty()

Reorder the code lines in parse_layout_faulty() to make it more
readable; no logic change.

Signed-off-by: Coly Li <colyli@suse.de>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
14 months ago
Mateusz Grzonka [Thu, 2 Feb 2023 11:27:04 +0000 (12:27 +0100)] 
Mdmonitor: Refactor check_one_sharer() for better error handling

Also check if autorebuild.pid is a symlink, which we shouldn't accept.

Signed-off-by: Mateusz Grzonka <mateusz.grzonka@intel.com>
Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
14 months ago
Mateusz Grzonka [Thu, 2 Feb 2023 11:27:03 +0000 (12:27 +0100)] 
Mdmonitor: Refactor write_autorebuild_pid()

Add better error handling and check for symlinks when opening MDMON_DIR.

Signed-off-by: Mateusz Grzonka <mateusz.grzonka@intel.com>
Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
14 months ago
Mateusz Grzonka [Thu, 2 Feb 2023 11:27:02 +0000 (12:27 +0100)] 
Add helpers to determine whether directories or files are soft links

Signed-off-by: Mateusz Grzonka <mateusz.grzonka@intel.com>
Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
14 months ago
Mateusz Grzonka [Thu, 2 Feb 2023 11:27:01 +0000 (12:27 +0100)] 
Mdmonitor: Add helper functions

Add functions:
- is_email_event(),
- get_syslog_event_priority(),
- sprint_event_message(),
with kernel style comments containing more detailed descriptions.

Also update event syslog priorities to be consistent with the man page.
The MoveSpare event was described in the man page as priority info, while
it was implemented as warning. Move event data into a struct, so that it
can be passed between different functions if needed.
Sort function declarations alphabetically and remove the redundant alert()
declaration.

Signed-off-by: Mateusz Grzonka <mateusz.grzonka@intel.com>
Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
14 months ago
Mateusz Grzonka [Thu, 2 Feb 2023 11:27:00 +0000 (12:27 +0100)] 
Mdmonitor: Pass events to alert() using enums instead of strings

Add events enum, and mapping_t struct, that maps them to strings, so
that enums are passed around instead of strings.
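
An enum plus a name table of this shape might look like the sketch below.
The event names are a small illustrative selection, and every identifier
(event_sketch, mapping_sketch_t, events_map_sketch and the lookup
functions) is hypothetical; the real enum and table in Monitor.c may be
laid out differently.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum event_sketch {
    EVENT_SKETCH_UNKNOWN = -1,
    EVENT_SKETCH_FAIL,
    EVENT_SKETCH_MOVE_SPARE,
    EVENT_SKETCH_REBUILD_STARTED,
};

typedef struct mapping_sketch {
    const char *name;
    int num;
} mapping_sketch_t;

/* Table terminated by an entry with a NULL name. */
static const mapping_sketch_t events_map_sketch[] = {
    { "Fail", EVENT_SKETCH_FAIL },
    { "MoveSpare", EVENT_SKETCH_MOVE_SPARE },
    { "RebuildStarted", EVENT_SKETCH_REBUILD_STARTED },
    { NULL, EVENT_SKETCH_UNKNOWN },
};

/* Enum to string, used only where text is finally needed. */
const char *event_name_sketch(enum event_sketch ev)
{
    const mapping_sketch_t *m;

    for (m = events_map_sketch; m->name; m++)
        if (m->num == (int)ev)
            return m->name;
    return NULL;
}

/* String to enum, for input that still arrives as text. */
enum event_sketch event_from_name_sketch(const char *name)
{
    const mapping_sketch_t *m;

    assert(name != NULL);
    for (m = events_map_sketch; m->name; m++)
        if (strcmp(m->name, name) == 0)
            return (enum event_sketch)m->num;
    return EVENT_SKETCH_UNKNOWN;
}
```

Callers then compare enum values and convert to a string only at the
point where a message is written.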

Signed-off-by: Mateusz Grzonka <mateusz.grzonka@intel.com>
Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
14 months ago
Mateusz Grzonka [Thu, 2 Feb 2023 11:26:59 +0000 (12:26 +0100)] 
Mdmonitor: Make alert_info global

Move information about --test flag and hostname into alert_info.

Signed-off-by: Mateusz Grzonka <mateusz.grzonka@intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
14 months ago
Li Xiao Keng [Mon, 27 Feb 2023 03:12:07 +0000 (11:12 +0800)] 
Fix NULL dereference in super_by_fd

When we create 100 partitions (major is 259 not 254) in a raid device,
mdadm may coredump:

Core was generated by `/usr/sbin/mdadm --detail --export /dev/md1p7'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  __strlen_avx2_rtm () at ../sysdeps/x86_64/multiarch/strlen-avx2.S:74
74 VPCMPEQ (%rdi), %ymm0, %ymm1
(gdb) bt
#0  __strlen_avx2_rtm () at ../sysdeps/x86_64/multiarch/strlen-avx2.S:74
#1  0x00007fbb9a7e4139 in __strcpy_chk (dest=dest@entry=0x55d55d6a13ac "", src=0x0, destlen=destlen@entry=32) at strcpy_chk.c:28
#2  0x000055d55ba1766d in strcpy (__src=<optimized out>, __dest=0x55d55d6a13ac "") at /usr/include/bits/string_fortified.h:79
#3  super_by_fd (fd=fd@entry=3, subarrayp=subarrayp@entry=0x7fff44dfcc48) at util.c:1289
#4  0x000055d55ba273a6 in Detail (dev=0x7fff44dfef0b "/dev/md1p7", c=0x7fff44dfe440) at Detail.c:101
#5  0x000055d55ba0de61 in misc_list (c=<optimized out>, ss=<optimized out>, dump_directory=<optimized out>, ident=<optimized out>, devlist=<optimized out>) at mdadm.c:1959
#6  main (argc=<optimized out>, argv=<optimized out>) at mdadm.c:1629

The direct cause is fd2devnm returning NULL, so add a check.
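
The kind of check that prevents this crash can be sketched as a guarded
copy. copy_devnm_sketch is a hypothetical helper written for illustration;
the actual fix in super_by_fd() may simply test the fd2devnm() result
before the existing strcpy().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy a device name into dest. Returns 0 on success, or -1 if the
 * lookup result is NULL or does not fit, leaving dest empty. */
int copy_devnm_sketch(char *dest, size_t destlen, const char *devnm)
{
    assert(dest != NULL && destlen > 0);

    if (devnm == NULL || strlen(devnm) >= destlen) {
        dest[0] = '\0';
        return -1;
    }
    strcpy(dest, devnm);
    return 0;
}
```

The caller can then report an error for the device instead of passing a
NULL source to strcpy(), which is where the backtrace above ends.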

Signed-off-by: Li Xiao Keng <lixiaokeng@huawei.com>
Signed-off-by: Wu Guang Hao <wuguanghao3@huawei.com>
Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
14 months ago
Heming Zhao [Thu, 23 Feb 2023 14:39:39 +0000 (22:39 +0800)] 
Grow: fix can't change bitmap type from none to clustered.

Commit a042210648ed ("disallow create or grow clustered bitmap with
writemostly set") introduced this bug. We should use 'true' logic not
'== 0' to deny setting up clustered array under WRITEMOSTLY condition.

How to trigger

```
~/mdadm # ./mdadm -Ss && ./mdadm --zero-superblock /dev/sd{a,b}
~/mdadm # ./mdadm -C /dev/md0 -l mirror -b clustered -e 1.2 -n 2 \
/dev/sda /dev/sdb --assume-clean
mdadm: array /dev/md0 started.
~/mdadm # ./mdadm --grow /dev/md0 --bitmap=none
~/mdadm # ./mdadm --grow /dev/md0 --bitmap=clustered
mdadm: /dev/md0 disks marked write-mostly are not supported with clustered bitmap
```

Signed-off-by: Heming Zhao <heming.zhao@suse.com>
Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
15 months ago
Mariusz Tkaczyk [Thu, 2 Feb 2023 07:56:31 +0000 (08:56 +0100)] 
Revert "mdadm/systemd: remove KillMode=none from service file"

This reverts commit 52c67fcdd6dadc4138ecad73e65599551804d445.

The functionality is marked as deprecated but we don't have an alternative
solution yet. Shutdown hangs if the OS is installed on an external array:

task:umount state:D stack: 0 pid: 6285 ppid: flags:0x00004084
Call Trace:
__schedule+0x2d1/0x830
? finish_wait+0x80/0x80
schedule+0x35/0xa0
md_write_start+0x14b/0x220
? finish_wait+0x80/0x80
raid1_make_request+0x3c/0x90 [raid1]
md_handle_request+0x128/0x1b0
md_make_request+0x5b/0xb0
generic_make_request_no_check+0x202/0x330
submit_bio+0x3c/0x160

Use it until new solution is implemented.

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
16 months ago
Kinga Tanska [Thu, 5 Jan 2023 05:31:25 +0000 (06:31 +0100)] 
manage: move comment with function description

Move the function description from the function body to outside
to obey kernel coding style.

Signed-off-by: Kinga Tanska <kinga.tanska@intel.com>
Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
16 months ago
Kinga Tanska [Fri, 28 Oct 2022 02:51:17 +0000 (04:51 +0200)] 
super-intel: make freesize not required for chunk size migration

Freesize needs to be set for migrations where the size of the RAID could
be changed (expand). It tells how much free space is determined for the
members. In chunk size migration freesize does not need to be set, so the
pointer shouldn't be checked for existence. This commit moves the check to
the condition which contains size calculations, instead of always checking
it at the first step.
Fix return value when superblock is not set.

Signed-off-by: Kinga Tanska <kinga.tanska@intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
16 months ago
Kinga Tanska [Tue, 27 Dec 2022 05:50:43 +0000 (06:50 +0100)] 
incremental, manage: do not verify if remove is safe

Function is_remove_safe() was introduced to verify that removing a
member device won't cause a failed state of the array. This
verification should be used only with the set-faulty command. Add a
special mode indicating that Incremental removal was executed.
If this mode is used, do not execute the is_remove_safe() routine.

Signed-off-by: Kinga Tanska <kinga.tanska@intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
16 months ago
Kinga Tanska [Tue, 27 Dec 2022 05:50:42 +0000 (06:50 +0100)] 
Manage: do not check array state when drive is removed

Array state doesn't need to be checked when a drive is
removed, but until now a clean state was required. The result
of the is_remove_safe() function will be independent
of the array state.

Signed-off-by: Kinga Tanska <kinga.tanska@intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
16 months ago
Xiao Ni [Wed, 4 Jan 2023 16:29:20 +0000 (00:29 +0800)] 
mdadm/udev: Don't handle change event on raw devices

The raw devices are ready when the add event happens and the raid
can be assembled. So there is no need to handle change events,
which can cause some inconvenient problems.

For example, the OS is installed on md0(/root) and md1(/home).
md0 and md1 are created on partitions. When re-installing the
OS, anaconda can't clear the storage configuration. It deletes one
partition and does some jobs. The change event happens. Now
the raid device is assembled again. It can't delete the other
partitions.

So in this patch, we don't handle change event on raw devices
anymore.

Signed-off-by: Xiao Ni <xni@redhat.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
16 months ago
Mateusz Kusiak [Mon, 2 Jan 2023 08:46:22 +0000 (09:46 +0100)] 
util: remove obsolete code from get_md_name

get_md_name() is used only with mdstat entries.
Remove dead code and simplify the function.

Remove redundant checks from mdmon.c

Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
16 months ago
Mateusz Kusiak [Mon, 2 Jan 2023 08:46:21 +0000 (09:46 +0100)] 
mdmon: fix segfault

Mdmon crashes if stat2devnm returns null.
Use open_mddev to check if device is mddevice and get name using
fd2devnm.
Refactor container name handling.

Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
16 months ago
Mateusz Kusiak [Mon, 2 Jan 2023 08:35:24 +0000 (09:35 +0100)] 
Change char* to enum in context->update & refactor code

Storing the update option as a string is bad for frequent comparisons and
is error prone.
Replace char array with enum so already existing enum is passed around
instead of string.
Adapt code to changes.

Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
16 months ago
Mateusz Kusiak [Mon, 2 Jan 2023 08:35:23 +0000 (09:35 +0100)] 
Manage&Incremental: code refactor, string to enum

Prepare Manage and Incremental for later changing context->update to enum.
Change update from string to enum in multiple functions and pass enum
where already possible.

Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
16 months ago
Mateusz Kusiak [Mon, 2 Jan 2023 08:35:22 +0000 (09:35 +0100)] 
Change update to enum in update_super and update_subarray

Use already existing enum, change update_super and update_subarray
update to enum globally.
Refactor function references also.
Remove code specific options from update_options.

Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
16 months ago
Mateusz Kusiak [Mon, 2 Jan 2023 08:35:21 +0000 (09:35 +0100)] 
super-intel: refactor the code for enum

Prepare super-intel for changing context->update to an enum.

Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
16 months ago
Mateusz Kusiak [Mon, 2 Jan 2023 08:35:20 +0000 (09:35 +0100)] 
super1: refactor the code for enum

Prepare update_super1 for changing context->update to an enum.
Change if else statements into switch.

Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
16 months ago
Mateusz Kusiak [Mon, 2 Jan 2023 08:35:19 +0000 (09:35 +0100)] 
super0: refactor the code for enum

This prepares update_super0 for changing context->update to an enum.
Change if-else statements into a switch.

Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
16 months agosuper-ddf: Remove update_super_ddf.
Mateusz Kusiak [Mon, 2 Jan 2023 08:35:18 +0000 (09:35 +0100)] 
super-ddf: Remove update_super_ddf.

This is not supported by ddf.
It hides errors by returning success status for some updates.
Remove update_super_ddf().

Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
16 months agoAdd code specific update options to enum.
Mateusz Kusiak [Mon, 2 Jan 2023 08:35:17 +0000 (09:35 +0100)] 
Add code specific update options to enum.

Some update options aren't taken from user input, but are hard-coded
as strings.
Include those options in the enum.

Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
16 months agoFix --update-subarray on active volume
Mateusz Kusiak [Mon, 2 Jan 2023 08:35:16 +0000 (09:35 +0100)] 
Fix --update-subarray on active volume

Options: bitmap, ppl and name should not be updated when array is active.
Those features are mutually exclusive and share the same data area in IMSM (danger of overwriting by kernel).
Remove check for active subarrays from super-intel.
Since ddf is not supported, apply it globally for all options.

Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
16 months agomdadm: Add option validation for --update-subarray
Mateusz Kusiak [Mon, 2 Jan 2023 08:35:15 +0000 (09:35 +0100)] 
mdadm: Add option validation for --update-subarray

The subset of options available for "--update" is not the same as for "--update-subarray".
Define maps and enum for update options and use them instead of direct comparisons.
Add proper error message.

Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
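
The map-plus-enum approach described in the update-option commits above can be sketched roughly as follows. This is only an illustration: the enum members, the `update_map` table and both function names are hypothetical and are not taken from the mdadm sources. The one detail drawn from the log is that bitmap, ppl and name must not be updated on an active array.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical option set; mdadm's real enum has different members. */
enum update_kind {
	UPDATE_UNKNOWN = 0,
	UPDATE_NAME,
	UPDATE_PPL,
	UPDATE_BITMAP,
};

/* Hypothetical name-to-enum map, consulted once at parse time. */
struct update_map {
	const char *name;
	enum update_kind kind;
};

static const struct update_map update_names[] = {
	{ "name", UPDATE_NAME },
	{ "ppl", UPDATE_PPL },
	{ "bitmap", UPDATE_BITMAP },
};

/* Translate the user-supplied string into an enum value. */
enum update_kind parse_update(const char *arg)
{
	size_t i;

	if (arg == NULL)
		return UPDATE_UNKNOWN;
	for (i = 0; i < sizeof(update_names) / sizeof(update_names[0]); i++)
		if (strcmp(arg, update_names[i].name) == 0)
			return update_names[i].kind;
	return UPDATE_UNKNOWN;
}

/* Later code switches on the enum instead of repeating strcmp(). */
int update_needs_inactive_array(enum update_kind kind)
{
	switch (kind) {
	case UPDATE_NAME:
	case UPDATE_PPL:
	case UPDATE_BITMAP:
		return 1;
	default:
		return 0;
	}
}
```

An unrecognised string maps to `UPDATE_UNKNOWN`, which gives the caller a single place to print a proper error message.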
16 months agomdadm: create ident_init()
Mariusz Tkaczyk [Wed, 21 Dec 2022 11:50:17 +0000 (12:50 +0100)] 
mdadm: create ident_init()

Add a wrapper for repeated initializations in mdadm.c and config.c.
Move includes up.

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
16 months agoGrow: fix possible memory leak.
Blazej Kucman [Tue, 20 Dec 2022 11:07:51 +0000 (12:07 +0100)] 
Grow: fix possible memory leak.

Signed-off-by: Blazej Kucman <blazej.kucman@intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
16 months agoUpdate mdadm Monitor manual.
Blazej Kucman [Mon, 19 Dec 2022 10:21:58 +0000 (11:21 +0100)] 
Update mdadm Monitor manual.

- describe monitor work modes,
- clarify the turning off condition,
- describe the mdmonitor.service as the preferred management method.

Signed-off-by: Blazej Kucman <blazej.kucman@intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
16 months agoMonitor: block if monitor modes are combined.
Blazej Kucman [Mon, 19 Dec 2022 10:21:57 +0000 (11:21 +0100)] 
Monitor: block if monitor modes are combined.

Block monitoring start if --scan mode and MD devices list are combined.

Signed-off-by: Blazej Kucman <blazej.kucman@intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
16 months agoMdmonitor: Split alert() into separate functions
Mateusz Grzonka [Wed, 7 Sep 2022 12:56:49 +0000 (14:56 +0200)] 
Mdmonitor: Split alert() into separate functions

Signed-off-by: Mateusz Grzonka <mateusz.grzonka@intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
19 months agoMdmonitor: Omit non-md devices
Lukasz Florczak [Thu, 22 Sep 2022 06:29:50 +0000 (08:29 +0200)] 
Mdmonitor: Omit non-md devices

Segfault fix commit [1] introduced a check for whether a given device is
an md device, but it terminated Mdmonitor if at least one of the given
devices didn't fulfill that condition. As a result, the Mdmonitor service
was no longer started on boot (with the --scan option) when the config
contained a non-existent array entry.

This commit introduces omitting non-md devices so the scan option can still
be used when the config is wrong, allowing the Mdmonitor service to run on boot.

Giving a list of devices to monitor containing non-existing or
non-md devices will result in monitoring only confirmed mddevices.

[1] https://git.kernel.org/pub/scm/utils/mdadm/mdadm.git/commit/?id=e702f392959d1c2ad2089e595b52235ed97b4e18

Signed-off-by: Lukasz Florczak <lukasz.florczak@linux.intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
19 months agomdadm: replace container level checking with inline
Kinga Tanska [Fri, 2 Sep 2022 06:49:23 +0000 (08:49 +0200)] 
mdadm: replace container level checking with inline

To unify all containers checks in code, is_container() function is
added and propagated.

Signed-off-by: Kinga Tanska <kinga.tanska@intel.com>
Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
20 months agoReadMe: fix command-line help
Mariusz Tkaczyk [Fri, 9 Sep 2022 13:50:34 +0000 (15:50 +0200)] 
ReadMe: fix command-line help

Make command-line help consistent with manual page.
Copied from Debian.

Cc: Felix Lechner <felix.lechner@lease-up.com>
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
20 months agomdadm: Add Documentation entries to systemd services
Mariusz Tkaczyk [Fri, 9 Sep 2022 13:50:33 +0000 (15:50 +0200)] 
mdadm: Add Documentation entries to systemd services

Add documentation section.
Copied from Debian.

Cc: Felix Lechner <felix.lechner@lease-up.com>
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
20 months agomdadm: added support for Intel Alderlake RST on VMD platform
Oldřich Jedlička [Wed, 31 Aug 2022 17:57:29 +0000 (19:57 +0200)] 
mdadm: added support for Intel Alderlake RST on VMD platform

Alderlake RST on VMD uses RstVmdV UEFI variable name, so detect it.

Signed-off-by: Oldřich Jedlička <oldium.pro@gmail.com>
Reviewed-by: Kinga Tanska <kinga.tanska@linux.intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
20 months agoMonitor: Fix statelist memory leaks
Pawel Baldysiak [Thu, 1 Sep 2022 09:20:31 +0000 (11:20 +0200)] 
Monitor: Fix statelist memory leaks

Free statelist in error path in Monitor initialization.

Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
20 months agoManage: Block unsafe member failing
Mateusz Kusiak [Thu, 18 Aug 2022 09:47:21 +0000 (11:47 +0200)] 
Manage: Block unsafe member failing

The kernel may or may not block mdadm from removing a member device if
doing so would put the array into a failed state. It depends on the raid
personality implementation in the kernel.
Add verification on the requested removal path (#mdadm --set-faulty
command).

Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
20 months agomdadm: Correct typos, punctuation and grammar in man
Mateusz Grzonka [Fri, 12 Aug 2022 14:52:12 +0000 (16:52 +0200)] 
mdadm: Correct typos, punctuation and grammar in man

Signed-off-by: Mateusz Grzonka <mateusz.grzonka@intel.com>
Reviewed-by: Wol <anthony@youngman.org.uk>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
20 months agosuper1: report truncated device
NeilBrown [Thu, 25 Aug 2022 22:55:56 +0000 (08:55 +1000)] 
super1: report truncated device

When the metadata is at the start of the device, it is possible that it
describes a device larger than the one it is actually stored on.  When
this happens, report it loudly in --examine.

....
   Unused Space : before=1968 sectors, after=-2047 sectors DEVICE TOO SMALL
          State : clean TRUNCATED DEVICE
....

Also report in --assemble so that the failure which the kernel will
report will be explained.

mdadm: Device /dev/sdb is not large enough for data described in superblock
mdadm: no RAID superblock on /dev/sdb
mdadm: /dev/sdb has no superblock - assembly aborted

Scenario can be demonstrated as follows:

mdadm: Note: this array has metadata at the start and
    may not be suitable as a boot device.  If you plan to
    store '/boot' on this device please ensure that
    your boot-loader understands md/v1.x metadata, or use
    --metadata=0.90
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md/test started.
mdadm: stopped /dev/md/test
   Unused Space : before=1968 sectors, after=-2047 sectors DEVICE TOO SMALL
          State : clean TRUNCATED DEVICE
   Unused Space : before=1968 sectors, after=-2047 sectors DEVICE TOO SMALL
          State : clean TRUNCATED DEVICE

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
20 months agoAssemble: check if device is container before scheduling force-clean update
Kinga Tanska [Fri, 19 Aug 2022 00:55:46 +0000 (02:55 +0200)] 
Assemble: check if device is container before scheduling force-clean update

Until now, using assemble with the force flag marked each array as clean.
Force-clean should not be done for the container. This commit adds a
check that the device is not a container before cleaning.

Signed-off-by: Kinga Tanska <kinga.tanska@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
20 months agoGrow: Split Grow_reshape into helper function
Mateusz Kusiak [Thu, 28 Jul 2022 12:20:53 +0000 (20:20 +0800)] 
Grow: Split Grow_reshape into helper function

Grow_reshape should be split into helper functions given its size.
- Add helper function for preparing reshape on external metadata.
- Close cfd file descriptor.

Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>
Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
20 months agomdadm: Don't open md device for CREATE and ASSEMBLE
Logan Gunthorpe [Wed, 27 Jul 2022 21:52:46 +0000 (15:52 -0600)] 
mdadm: Don't open md device for CREATE and ASSEMBLE

The mdadm command tries to open the md device for most modes, first
thing, no matter what. When running to create or assemble an array,
in most cases, the md device will not exist, the open call will fail
and everything will proceed correctly.

However, when running tests, a create or assembly command may be run
shortly after stopping an array and the old md device file may still
be around. Then, if create_on_open is set in the kernel, a new md
device will be created when mdadm does its initial open.

When mdadm gets around to creating the new device with the new_array
parameter it issues this error:

   mdadm: Fail to create md0 when using
   /sys/module/md_mod/parameters/new_array, fallback to creation via node

This is because an mddev was already created by the kernel with the
earlier open() call and thus the new one being created will fail with
EEXIST. The array will still be created successfully because mdadm
falls back to the node creation method. However, the error message
itself will fail any test that's running it.

This issue is a race condition that is very rare, but a recent change
in the kernel caused this to happen more frequently: about 1 in 50
times.

To fix this, don't bother trying to open the md device for CREATE,
ASSEMBLE and BUILD commands, as the file descriptor will never be used
anyway even if it is successfully opened. The mdfd has not been used
for these commands since:

   7f91af49ad09 ("Delay creation of array devices for assemble/build/create")

The checks that were done on the open device can be changed to being
done with stat.

Side note: it would be nice to disable create_on_open as well to help
solve this, but it seems the work for this was never finished. By default,
mdadm will create using the old node interface when a name is specified
unless the user specifically puts names=yes in a config file, which
doesn't seem to be common, nor desirable to require.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
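
The idea of checking the node with stat rather than opening it, as the commit above describes, might look like the sketch below. The helper name `path_is_block_device` is hypothetical; the commit text does not say which checks mdadm actually performs. The point is only that stat() inspects the node without the open() side effect that lets create_on_open instantiate a new md device.

```c
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Report whether `path` names a block device, without opening it.
 * Returns 1 if it is a block device, 0 if it exists but is something
 * else, and -1 if stat() fails (for example, the node does not exist).
 */
int path_is_block_device(const char *path)
{
	struct stat st;

	if (stat(path, &st) != 0)
		return -1;
	return S_ISBLK(st.st_mode) ? 1 : 0;
}
```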
20 months agomdadm: move data_offset to struct shape
Mariusz Tkaczyk [Tue, 19 Jul 2022 12:48:23 +0000 (14:48 +0200)] 
mdadm: move data_offset to struct shape

Data offset is a shape property so move it there to remove additional
parameter from some functions.

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
20 months agomdadm: remove symlink option
Mariusz Tkaczyk [Tue, 19 Jul 2022 12:48:22 +0000 (14:48 +0200)] 
mdadm: remove symlink option

The option is not used. Remove it from code.

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
20 months agotests: add test for names
Mariusz Tkaczyk [Tue, 19 Jul 2022 12:48:21 +0000 (14:48 +0200)] 
tests: add test for names

Current behavior is not documented and tested. This test is a base for
future improvements. It is enough to test it only with native metadata,
because it is generic code. Generated properties are passed to metadata
handler.

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
20 months agotests/00readonly: Run udevadm settle before setting ro
Logan Gunthorpe [Wed, 27 Jul 2022 21:52:45 +0000 (15:52 -0600)] 
tests/00readonly: Run udevadm settle before setting ro

In some recent kernel versions, 00readonly fails with:

  mdadm: failed to set readonly for /dev/md0: Device or resource busy
  ERROR: array is not read-only!

This was traced down to a race condition with udev holding a reference
to the block device at the same time as trying to set it read only.

To fix this, call udevadm settle before setting the array read only.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
20 months agomdadm: Replace obsolete usleep with nanosleep
Mateusz Grzonka [Fri, 12 Aug 2022 14:36:02 +0000 (16:36 +0200)] 
mdadm: Replace obsolete usleep with nanosleep

According to POSIX.1-2001, usleep is considered obsolete.
Replace it with a wrapper that uses nanosleep, as recommended in man.
Add handy macros for conversions between msec, usec and nsec.

Signed-off-by: Mateusz Grzonka <mateusz.grzonka@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
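
A wrapper of the kind this commit describes could be written along these lines. It is a minimal sketch and not the wrapper added to mdadm: the function name `sleep_usec`, the macro names and the choice to resume after EINTR are all assumptions made for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <time.h>

/* Hypothetical conversion macros; the names are illustrative only. */
#define USEC_PER_SEC  1000000L
#define NSEC_PER_USEC 1000L

/*
 * Sleep for at least `usec` microseconds using nanosleep().
 * If a signal interrupts the sleep, resume with the remaining time.
 * Returns 0 on success, -1 on invalid input or any error other than EINTR.
 */
int sleep_usec(long usec)
{
	struct timespec req, rem;

	if (usec < 0) {
		errno = EINVAL;
		return -1;
	}
	req.tv_sec = usec / USEC_PER_SEC;
	req.tv_nsec = (usec % USEC_PER_SEC) * NSEC_PER_USEC;

	while (nanosleep(&req, &rem) != 0) {
		if (errno != EINTR)
			return -1;
		req = rem;	/* continue with whatever time is left */
	}
	return 0;
}
```

Splitting the value into seconds and nanoseconds matters because nanosleep() rejects a tv_nsec of 1000000000 or more.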
21 months agotests: Add broken files for all broken tests
Logan Gunthorpe [Wed, 22 Jun 2022 20:25:19 +0000 (14:25 -0600)] 
tests: Add broken files for all broken tests

Each broken file contains the rough frequency of brokenness as well
as a brief explanation of what happens when it breaks. Estimates
of failure rates are not statistically significant and can vary
run to run.

This is really just a view from my window. Tests were done on a
small VM with the default loop devices, not real hardware. We've
seen different kernel configurations can cause bugs to appear as well
(ie. different block schedulers). It may also be that different race
conditions will be seen on machines with different performance
characteristics.

These annotations were done with the kernel currently in md/md-next:

 facef3b96c5b ("md: Notify sysfs sync_completed in md_reap_sync_thread()")

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
21 months agomdadm/test: Mark and ignore broken test failures
Logan Gunthorpe [Wed, 22 Jun 2022 20:25:18 +0000 (14:25 -0600)] 
mdadm/test: Mark and ignore broken test failures

Add functionality to continue if a test marked as broken fails.

To mark a test as broken, a file with the same name but with the suffix
'.broken' should exist. The first line in the file will be printed with
a KNOWN BROKEN message; the rest of the file can describe how the
test is broken.

Also adds --skip-broken and --skip-always-broken to skip all the tests
that have a .broken file or to skip all tests whose .broken file's first
line contains the keyword always.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
21 months agomdadm/test: Add a mode to repeat specified tests
Logan Gunthorpe [Wed, 22 Jun 2022 20:25:17 +0000 (14:25 -0600)] 
mdadm/test: Add a mode to repeat specified tests

Many tests fail infrequently or rarely. To help find these, add
an option to run the tests multiple times by specifying --loop=N.

If --loop=0 is specified, the test will be looped forever.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
21 months agotests/02lineargrow: clear the superblock at every iteration
Sudhakar Panneerselvam [Wed, 22 Jun 2022 20:25:16 +0000 (14:25 -0600)] 
tests/02lineargrow: clear the superblock at every iteration

This fixes 02lineargrow test as prior metadata causes --add operation
to misbehave.

Signed-off-by: Sudhakar Panneerselvam <sudhakar.panneerselvam@oracle.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
21 months agotests/04update-metadata: avoid passing chunk size to raid1
Sudhakar Panneerselvam [Wed, 22 Jun 2022 20:25:15 +0000 (14:25 -0600)] 
tests/04update-metadata: avoid passing chunk size to raid1

'04update-metadata' test fails with error, "specifying chunk size is
forbidden for this level" added by commit, 5b30a34aa4b5e. Hence,
correcting the test to ignore passing chunk size to raid1.

Signed-off-by: Sudhakar Panneerselvam <sudhakar.panneerselvam@oracle.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@oracle.com>
[logang@deltatee.com: fix if/then style and dropped unrelated hunk]
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
21 months agotests: fix raid0 tests for 0.90 metadata
Sudhakar Panneerselvam [Wed, 22 Jun 2022 20:25:14 +0000 (14:25 -0600)] 
tests: fix raid0 tests for 0.90 metadata

Some of the test cases fail because raid0 creation fails with the error,
"0.90 metadata does not support layouts for RAID0" added by commit,
329dfc28debb. Fix some of the test cases by switching from raid0 to
linear level for 0.9 metadata where possible.

Signed-off-by: Sudhakar Panneerselvam <sudhakar.panneerselvam@oracle.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
21 months agotests/00raid0: add a test that validates raid0 with layout fails for 0.9
Sudhakar Panneerselvam [Wed, 22 Jun 2022 20:25:13 +0000 (14:25 -0600)] 
tests/00raid0: add a test that validates raid0 with layout fails for 0.9

329dfc28debb disallows the creation of raid0 with layouts for 0.9
metadata. This test confirms the new behavior.

Signed-off-by: Sudhakar Panneerselvam <sudhakar.panneerselvam@oracle.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
21 months agomdadm: Fix optional --write-behind parameter
Logan Gunthorpe [Wed, 22 Jun 2022 20:25:12 +0000 (14:25 -0600)] 
mdadm: Fix optional --write-behind parameter

The commit noted below changed the behaviour of --write-behind to
require an argument. This broke the 06wrmostly test with the error:

  mdadm: Invalid value for maximum outstanding write-behind writes: (null).
         Must be between 0 and 16383.

To fix this, check if optarg is NULL before parsing it, as the original
code did.

Fixes: 60815698c0ac ("Refactor parse_num and use it to parse optarg.")
Cc: Mateusz Grzonka <mateusz.grzonka@intel.com>
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
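
The optarg check mentioned in the commit above can be illustrated with a small getopt_long() sketch. The function `parse_write_behind` and the default of 256 are assumptions for this example; only the 0 to 16383 range comes from the error message quoted in the commit.

```c
#include <getopt.h>
#include <stdio.h>
#include <stdlib.h>

/* Assumed default for this sketch; not taken from the commit text. */
#define DEFAULT_WRITE_BEHIND 256
#define MAX_WRITE_BEHIND 16383

/*
 * Parse "--write-behind[=N]". Returns the chosen value, 0 when the
 * option is absent, or -1 on error.
 * With optional_argument, getopt_long() leaves optarg NULL when no
 * "=N" is supplied, so optarg must be checked before it is parsed.
 */
int parse_write_behind(int argc, char **argv)
{
	static const struct option opts[] = {
		{ "write-behind", optional_argument, NULL, 'w' },
		{ NULL, 0, NULL, 0 },
	};
	int value = 0;
	int c;

	optind = 1;	/* restart scanning so the function can be reused */
	while ((c = getopt_long(argc, argv, "", opts, NULL)) != -1) {
		if (c != 'w')
			return -1;
		if (optarg == NULL) {
			value = DEFAULT_WRITE_BEHIND;
		} else {
			char *end;
			long n = strtol(optarg, &end, 10);

			if (end == optarg || *end != '\0' ||
			    n < 0 || n > MAX_WRITE_BEHIND)
				return -1;
			value = (int)n;
		}
	}
	return value;
}
```

Passing a NULL optarg straight to a number parser is what produced the "(null)" in the quoted error message.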
21 months agomdadm: Fix mdadm -r remove option regression
Logan Gunthorpe [Wed, 22 Jun 2022 20:25:11 +0000 (14:25 -0600)] 
mdadm: Fix mdadm -r remove option regression

The commit noted below globally adds a parameter to the -r option but missed
the fact that -r is used for another purpose: --remove.

After that commit, a command such as:

  mdadm /dev/md0 -r /dev/loop0

will do nothing, since the device parameter will be consumed as an
argument to the -r option; thus, there will only be one device
seen on the command line, devs_found will only be 1 and nothing will
happen.

This caused the 01r5integ and 01raid6integ tests to hang indefinitely
as mdadm did not remove the failed device. With the device not removed,
it would not be readded. Then the loop waiting for the array status to
change would loop forever.

This commit was recently reverted, but the legitimate fix for the
monitor operations was still not fixed. So add specific monitor
short ops to re-fix the --monitor -r option.

Fixes: 546047688e1c ("mdadm: fix coredump of mdadm --monitor -r")
Fixes: 190dc029b141 ("Revert "mdadm: fix coredump of mdadm --monitor -r"")
Cc: Wu Guanghao <wuguanghao3@huawei.com>
Cc: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
21 months agomonitor: Avoid segfault when calling NULL get_bad_blocks
Logan Gunthorpe [Wed, 22 Jun 2022 20:25:10 +0000 (14:25 -0600)] 
monitor: Avoid segfault when calling NULL get_bad_blocks

Not all struct superswitch implement a get_bad_blocks() function,
yet mdmon seems to call it without checking for NULL and thus
occasionally segfaults in the test 10ddf-geometry.

Fix this by checking for NULL before calling it.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
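
The NULL check described in the commit above follows a common pattern for optional callbacks in an operations table. The sketch below uses a hypothetical, heavily reduced `handler_ops` structure and an invented callback signature; it is not the layout of mdadm's struct superswitch.

```c
#include <stddef.h>

/* Hypothetical, heavily reduced stand-in for a metadata handler table. */
struct handler_ops {
	const char *name;
	/* Optional hook: may be NULL for formats without bad-block support. */
	int (*get_bad_blocks)(int slot);
};

/* Returns the hook's result, or 0 (no bad blocks known) when it is absent. */
int query_bad_blocks(const struct handler_ops *ops, int slot)
{
	if (ops == NULL || ops->get_bad_blocks == NULL)
		return 0;
	return ops->get_bad_blocks(slot);
}

/* Toy hook used only to exercise the sketch. */
static int fake_get_bad_blocks(int slot)
{
	return slot + 1;
}

const struct handler_ops with_hook = { "with-hook", fake_get_bad_blocks };
const struct handler_ops without_hook = { "without-hook", NULL };
```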
21 months agomdadm/Grow: Fix use after close bug by closing after fork
Logan Gunthorpe [Wed, 22 Jun 2022 20:25:09 +0000 (14:25 -0600)] 
mdadm/Grow: Fix use after close bug by closing after fork

The test 07reshape-grow fails most of the time. But it succeeds around
1 in 5 times. When it does succeed, it causes the tests to die because
mdadm has segfaulted.

The segfault was caused by mdadm attempting to reopen a file
descriptor that was already closed. The backtrace of the segfault
was:

  #0  __strncmp_avx2 () at ../sysdeps/x86_64/multiarch/strcmp-avx2.S:101
  #1  0x000056146e31d44b in devnm2devid (devnm=0x0) at util.c:956
  #2  0x000056146e31dab4 in open_dev_flags (devnm=0x0, flags=0)
                         at util.c:1072
  #3  0x000056146e31db22 in open_dev (devnm=0x0) at util.c:1079
  #4  0x000056146e3202e8 in reopen_mddev (mdfd=4) at util.c:2244
  #5  0x000056146e329f36 in start_array (mdfd=4,
              mddev=0x7ffc55342450 "/dev/md0", content=0x7ffc55342860,
              st=0x56146fc78660, ident=0x7ffc55342f70, best=0x56146fc6f5d0,
              bestcnt=10, chosen_drive=0, devices=0x56146fc706b0, okcnt=5,
      sparecnt=0,  rebuilding_cnt=0, journalcnt=0, c=0x7ffc55342e90,
      clean=1,  avail=0x56146fc78720 "\001\001\001\001\001",
      start_partial_ok=0, err_ok=0, was_forced=0)
                  at Assemble.c:1206
  #6  0x000056146e32c36e in Assemble (st=0x56146fc78660,
               mddev=0x7ffc55342450 "/dev/md0", ident=0x7ffc55342f70,
       devlist=0x56146fc6e2d0, c=0x7ffc55342e90)
                 at Assemble.c:1914
  #7  0x000056146e312ac9 in main (argc=11, argv=0x7ffc55343238)
                         at mdadm.c:1510

The file descriptor was closed early in Grow_continue(). The noted commit
moved the close() call to close the fd above the fork which caused the
parent process to return with a closed fd.

This meant reshape_array() and Grow_continue() would return in the parent
with the fd forked. The fd would eventually be passed to reopen_mddev()
which returned an unhandled NULL from fd2devnm() which would then be
dereferenced in devnm2devid.

Fix this by moving the close() call below the fork. This appears to
fix the 07revert-grow test. While we're at it, switch to using
close_fd() to invalidate the file descriptor.

Fixes: 77b72fa82813 ("mdadm/Grow: prevent md's fd from being occupied during delayed time")
Cc: Alex Wu <alexwu@synology.com>
Cc: BingJing Chang <bingjingc@synology.com>
Cc: Danny Shih <dannyshih@synology.com>
Cc: ChangSyun Peng <allenpeng@synology.com>
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
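
One way to read the fix above is that the descriptor should be closed only in the forked child, so the parent returns with it still usable. The sketch below illustrates that ordering with a hypothetical `start_worker_keep_fd` function; it does not reproduce Grow_continue() or close_fd().

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a background worker. The descriptor is closed only in the
 * child, after fork(), so the parent returns with `fd` still open.
 * Returns the child's pid in the parent, or -1 if fork() fails.
 */
pid_t start_worker_keep_fd(int fd)
{
	pid_t pid = fork();

	if (pid == 0) {
		close(fd);	/* child: drop its copy; parent is unaffected */
		/* ... delayed or long-running work would go here ... */
		_exit(0);
	}
	return pid;
}
```

Had close(fd) been placed before fork(), both processes would have continued with an invalid descriptor, which matches the failure mode the commit describes.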
21 months agoDDF: Fix NULL pointer dereference in validate_geometry_ddf()
Logan Gunthorpe [Wed, 22 Jun 2022 20:25:08 +0000 (14:25 -0600)] 
DDF: Fix NULL pointer dereference in validate_geometry_ddf()

A relatively recent patch added a call to validate_geometry() in
Manage_add() that has level=LEVEL_CONTAINER and chunk=NULL.

This causes some ddf tests to segfault which aborts the test suite.

To fix this, avoid dereferencing chunk when the level is
LEVEL_CONTAINER or LEVEL_NONE.

Fixes: 1f5d54a06df0 ("Manage: Call validate_geometry when adding drive to external container")
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
21 months agoDDF: Cleanup validate_geometry_ddf_container()
Logan Gunthorpe [Wed, 22 Jun 2022 20:25:07 +0000 (14:25 -0600)] 
DDF: Cleanup validate_geometry_ddf_container()

Move the function up so that the function declaration is not necessary
and remove the unused arguments to the function.

No functional changes are intended but will help with a bug fix in the
next patch.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
21 months agoMakefile: Don't build static build with everything and everything-test
Logan Gunthorpe [Wed, 22 Jun 2022 20:25:06 +0000 (14:25 -0600)] 
Makefile: Don't build static build with everything and everything-test

Running the test suite requires building everything, but it seems to be
difficult to build the static version of mdadm now seeing there
is no readily available static udev library.

The test suite doesn't need the static binary so just don't build it
with the everything or everything-test targets.

Leave the mdadm.static and install-static targets in place in case
someone still has a use case for the static binary.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
21 months agoMonitor: use snprintf to fill device name
Kinga Tanska [Thu, 14 Jul 2022 07:02:11 +0000 (09:02 +0200)] 
Monitor: use snprintf to fill device name

Safe string functions are propagated in Monitor.c.

Signed-off-by: Kinga Tanska <kinga.tanska@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
21 months agoMonitor: use devname as char array instead of pointer
Kinga Tanska [Thu, 14 Jul 2022 07:02:10 +0000 (09:02 +0200)] 
Monitor: use devname as char array instead of pointer

Device name wasn't filled properly due to incorrect use of strcpy.
Strcpy was used twice. Firstly to fill devname with "/dev/md/"
and then to add chosen name. First strcpy result was overwritten by
second one (as a result <device_name> instead of "/dev/md/<device_name>"
was assigned). This commit changes this implementation to use snprintf
and devname with fixed size.

Signed-off-by: Kinga Tanska <kinga.tanska@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
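
The strcpy problem described above, and the snprintf replacement, can be reproduced in a few lines. The buffer size and both function names below are hypothetical; the sketch only shows why the second strcpy() discards the "/dev/md/" prefix and how a single bounded snprintf() avoids that.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical buffer size for this sketch. */
#define DEVNAME_MAX 64

/*
 * Build "/dev/md/<name>" into dst. Returns 0 on success, or -1 if the
 * result would not fit (the output is then truncated).
 */
int build_md_devname(char *dst, size_t dst_len, const char *name)
{
	int n = snprintf(dst, dst_len, "/dev/md/%s", name);

	if (n < 0 || (size_t)n >= dst_len)
		return -1;
	return 0;
}

/* The buggy pattern: the second strcpy() overwrites the prefix. */
void build_md_devname_buggy(char *dst, const char *name)
{
	strcpy(dst, "/dev/md/");
	strcpy(dst, name);	/* should have been strcat() or snprintf() */
}
```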
21 months agomdadm: Remove dead code in imsm_fix_size_mismatch
Lukasz Florczak [Fri, 22 Jul 2022 06:43:48 +0000 (08:43 +0200)] 
mdadm: Remove dead code in imsm_fix_size_mismatch

imsm_create_metadata_update_for_size_change() that returns u_size value
could return 0 in the past. As its behavior changed, and returned value
is always the size of imsm_update_size_change structure, check for
u_size is no longer needed.

Signed-off-by: Lukasz Florczak <lukasz.florczak@linux.intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
21 months agomdadm: Fix array size mismatch after grow
Lukasz Florczak [Fri, 22 Jul 2022 06:43:47 +0000 (08:43 +0200)] 
mdadm: Fix array size mismatch after grow

imsm_fix_size_mismatch() is invoked to fix the problem, but it couldn't
proceed due to migration check. This patch allows for intended behavior.

Signed-off-by: Lukasz Florczak <lukasz.florczak@linux.intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
22 months agomdadm: block update=ppl for non raid456 levels
Lukasz Florczak [Wed, 15 Jun 2022 12:28:39 +0000 (14:28 +0200)] 
mdadm: block update=ppl for non raid456 levels

Option ppl should be used only for raid levels 4, 5 and 6. Cancel update
for other levels.

Applied globally for imsm and ddf format.

Additionally introduce is_level456() helper function.

Signed-off-by: Lukasz Florczak <lukasz.florczak@linux.intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
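
A helper such as the is_level456() mentioned above plausibly amounts to a one-line predicate. The body below is a guess at its shape, and `check_ppl_update` is a hypothetical caller added only to show how an update could be refused for other levels.

```c
#include <stdbool.h>

/* Sketch of the predicate: true only for RAID levels 4, 5 and 6. */
static inline bool is_level456(int level)
{
	return level == 4 || level == 5 || level == 6;
}

/* Hypothetical caller: 0 if a ppl update is allowed, -1 otherwise. */
int check_ppl_update(int level)
{
	return is_level456(level) ? 0 : -1;
}
```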
22 months agoimsm: block changing slots during creation
Mariusz Tkaczyk [Mon, 20 Jun 2022 16:10:43 +0000 (00:10 +0800)] 
imsm: block changing slots during creation

If user specifies drives for array creation, then slot order across
volumes is not preserved.
Ideally, it should be checked in validate_geometry() but it is not
possible in current implementation (order is determined later).
Add verification in add_to_super_imsm_volume() and throw error if
mismatch is detected.
IMSM allows only the same members to be used within a container.
This is not hardware dependency but metadata limitation.
Therefore, 09-imsm-overlap test is removed. Testing it is pointless.
After this patch, creation in this scenario is blocked. Offset
verification is covered in other tests.

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
22 months agoimsm: use same slot across container
Mariusz Tkaczyk [Mon, 20 Jun 2022 16:10:42 +0000 (00:10 +0800)] 
imsm: use same slot across container

Autolayout relies on drive order in the super->disks list, but
it is not guaranteed by readdir() in sysfs_read(). As a result, a
drive could be put in a different slot in the second volume.

Make it consistent by referring to the first volume, if it exists.

Use enum imsm_status to unify error handling.

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
22 months agoimsm: introduce get_disk_slot_in_dev()
Mariusz Tkaczyk [Mon, 20 Jun 2022 16:10:41 +0000 (00:10 +0800)] 
imsm: introduce get_disk_slot_in_dev()

The routine was added to remove unnecessary get_imsm_dev() and
get_imsm_map() calls, used only to determine disk slot.

Additionally, an enum for IMSM return statuses was added for further usage.

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
22 months agomdadm/super1: restore commit 45a87c2f31335 to fix clustered slot issue
Heming Zhao [Mon, 20 Jun 2022 16:10:40 +0000 (00:10 +0800)] 
mdadm/super1: restore commit 45a87c2f31335 to fix clustered slot issue

Commit 9d67f6496c71 ("mdadm:check the nodes when operate clustered
array") modified assignment logic for st->nodes in write_bitmap1(),
which introduced bitmap slot issue:

load_super1 didn't set up supertype.nodes, which made spare disk only
have one slot info. Then it triggered kernel md_bitmap_load_sb to get
wrong bitmap slot data.

For fixing this issue, there are two methods:

1> revert the related code of commit 9d67f6496c71 and restore the code
   from former commit 45a87c2f31335 ("super1: add more checks for
   NodeNumUpdate option").
   st->nodes value would be 0 & 1 under current code logic. i.e.
   When adding a spare disk, there is no place to init st->nodes, and
   the value is ZERO.

2> keep 9d67f6496c71, add additional ->nodes handling in load_super1(),
   let load_super1 to set st->nodes when bitmap is BITMAP_MAJOR_CLUSTERED.
   Under current mdadm code logic, load_super1 will be called many
   times, any new code in load_super1 will cost mdadm running more time.
   And more reason is I prefer as much as possible to limit clustered
   code spreading in every corner.

So I used method <1> to fix this issue.

How to trigger:

dd if=/dev/zero bs=1M count=1 oflag=direct of=/dev/sda
dd if=/dev/zero bs=1M count=1 oflag=direct of=/dev/sdb
dd if=/dev/zero bs=1M count=1 oflag=direct of=/dev/sdc
mdadm -C /dev/md0 -b clustered -e 1.2 -n 2 -l mirror /dev/sda /dev/sdb
mdadm -a /dev/md0 /dev/sdc
mdadm /dev/md0 --fail /dev/sda
mdadm /dev/md0 --remove /dev/sda
mdadm -Ss
mdadm -A /dev/md0 /dev/sdb /dev/sdc

The current output of "mdadm -X /dev/sdc":
(correct output should show info for 4 slots, the default)
```
        Filename : /dev/sdc
           Magic : 6d746962
         Version : 5
            UUID : a74642f8:a6b1fba8:58e1f8db:cfe7b082
          Events : 29
  Events Cleared : 0
           State : OK
       Chunksize : 64 MB
          Daemon : 5s flush period
      Write Mode : Normal
       Sync Size : 306176 (299.00 MiB 313.52 MB)
          Bitmap : 5 bits (chunks), 5 dirty (100.0%)
```

Later mdadm operations then make the kernel print error messages
(here triggered by "mdadm -A /dev/md0 /dev/sdb /dev/sdc"):
```
kernel: md0: invalid bitmap file superblock: bad magic
kernel: md_bitmap_copy_from_slot can't get bitmap from slot 1
kernel: md-cluster: Could not gather bitmaps from slot 1
kernel: md0: invalid bitmap file superblock: bad magic
kernel: md_bitmap_copy_from_slot can't get bitmap from slot 2
kernel: md-cluster: Could not gather bitmaps from slot 2
kernel: md0: invalid bitmap file superblock: bad magic
kernel: md_bitmap_copy_from_slot can't get bitmap from slot 3
kernel: md-cluster: Could not gather bitmaps from slot 3
kernel: md-cluster: failed to gather all resyn infos
kernel: md0: detected capacity change from 0 to 612352
```
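
When checking whether a system is affected, it can help to reduce a log like
the one above to just the slot numbers that failed. The following is a minimal
sketch, not part of this commit: failed_bitmap_slots is a hypothetical helper
name, and it assumes the kernel message wording is exactly as in the excerpt
above.

```shell
#!/bin/sh
# Hypothetical helper (not part of mdadm): read kernel log text on stdin and
# print each slot number named in an md_bitmap_copy_from_slot failure line.
# Assumes the message wording matches the excerpt in this commit message.
failed_bitmap_slots() {
    sed -n "s/.*md_bitmap_copy_from_slot can't get bitmap from slot \([0-9][0-9]*\).*/\1/p" |
        sort -n -u
}

# Sample input copied from the log excerpt above.
sample_log="kernel: md0: invalid bitmap file superblock: bad magic
kernel: md_bitmap_copy_from_slot can't get bitmap from slot 1
kernel: md-cluster: Could not gather bitmaps from slot 1
kernel: md0: invalid bitmap file superblock: bad magic
kernel: md_bitmap_copy_from_slot can't get bitmap from slot 2
kernel: md-cluster: Could not gather bitmaps from slot 2
kernel: md0: invalid bitmap file superblock: bad magic
kernel: md_bitmap_copy_from_slot can't get bitmap from slot 3
kernel: md-cluster: Could not gather bitmaps from slot 3"

# Print one failed slot number per line.
printf '%s\n' "$sample_log" | failed_bitmap_slots
```

On a live system the same function could be fed from the kernel log, for
example with "dmesg | failed_bitmap_slots".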

Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Heming Zhao <heming.zhao@suse.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
22 months ago util: replace ioctl use with function
Kinga Tanska [Mon, 20 Jun 2022 16:10:39 +0000 (00:10 +0800)] 
util: replace ioctl use with function

Replace the direct ioctl call used to get md array info with
the dedicated function provided for that purpose.

Signed-off-by: Kinga Tanska <kinga.tanska@intel.com>
Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>