thirdparty/mdadm.git
5 months ago Mdmonitor: Improve udev event handling
Mateusz Grzonka [Tue, 21 Nov 2023 00:58:23 +0000 (01:58 +0100)] 
Mdmonitor: Improve udev event handling

Mdmonitor waits for the udev queue to become empty.
Even when the queue is empty, udev might still be processing the last event.
However, we want to wait and wake up mdmonitor only once udev has finished
processing events.

Also, the udev queue interface is considered legacy and should not be
used outside of udev.

Use udev monitor instead, and wake up mdmonitor on every event triggered
by udev for md block device.

We need to generate more change events from the kernel, because they are
missing in some situations, for example when a rebuild starts.
This will be addressed in a separate patch.

Move udev-specific code into separate functions, and place them in the udev.c file.
Also move the use_udev() logic from lib.c into the newly created file.

Signed-off-by: Mateusz Grzonka <mateusz.grzonka@intel.com>
Signed-off-by: Kinga Tanska <kinga.tanska@intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
6 months ago Fix assembling RAID volume by using incremental
Pawel Piatkowski [Thu, 19 Oct 2023 14:35:25 +0000 (16:35 +0200)] 
Fix assembling RAID volume by using incremental

After the change "mdadm: remove container_enough logic",
IMSM volumes are started immediately. If a volume is under
reshape, it is blocked by block_subarray() during the
first mdadm -I <devname>. Assemble_container_content() for the
next disk sees a change because the metadata version from
sysfs and the one from metadata don't match, and it executes
sysfs_set_array again. It then fails to set the same
component_size, which is prohibited by the kernel.

If the array is frozen, the first character of the metadata version
is different ("/" vs "-"), so exclude it from the comparison.
All we want is to double-check that the base properties are set;
we don't need to call sysfs_set_array again.
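
The comparison described above could look roughly like the following. This is only a sketch: the helper name and the sample strings are assumptions made for illustration, not the code from the patch.

```c
#include <string.h>

/* Hypothetical helper (not mdadm's actual code): returns 1 when two
 * metadata version strings match once a leading '/' or '-' marker
 * is skipped, 0 otherwise. */
int same_version_ignoring_prefix(const char *a, const char *b)
{
    if (a[0] == '/' || a[0] == '-')
        a++;
    if (b[0] == '/' || b[0] == '-')
        b++;
    return strcmp(a, b) == 0;
}
```

With this, an assumed pair such as "/md127/0" and "-md127/0" compares equal, while "/md127/0" and "/md127/1" does not.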

Signed-off-by: Pawel Piatkowski <pawel.piatkowski@intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
6 months ago mdadm: remove container_enough logic
Pawel Piatkowski [Thu, 19 Oct 2023 14:35:24 +0000 (16:35 +0200)] 
mdadm: remove container_enough logic

Arrays without enough disks will be assembled but not
started.
RAIDs will now always be assembled (even if they are failed).
RAID devices in all states will be assembled and exposed
to mdstat.
This change affects only IMSM (for ddf it wasn't used;
container_enough was always set to true).
This logic is also removed from incremental_container, along with the
runstop check, because the runstop condition is verified
in the assemble_container_content function.

Signed-off-by: Pawel Piatkowski <pawel.piatkowski@intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
6 months ago mdadm/super1: Add MD_FEATURE_RAID0_LAYOUT if kernel>=5.4
Xiao Ni [Tue, 17 Oct 2023 12:35:46 +0000 (20:35 +0800)] 
mdadm/super1: Add MD_FEATURE_RAID0_LAYOUT if kernel>=5.4

Kernel v5.4 and later add the feature bit MD_FEATURE_RAID0_LAYOUT.
A layout must be specified for raid0 with more than one zone. But
raid0 with one zone in fact also has a default layout.

Currently, for raid0 with one zone, an *unknown* layout is shown by the mdadm -D
command. The reason is that mdadm doesn't set MD_FEATURE_RAID0_LAYOUT for
raid0 with one zone. Then, in kernel space, super_1_validate sets mddev->layout
to -1 because MD_FEATURE_RAID0_LAYOUT is missing. In fact, the raid0 io path
uses the default layout. Set raid0_need_layout to true if kernel_version>=v5.4.
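
A minimal sketch of such a version gate follows. Both helper names and the major * 1000000 + minor * 1000 + patch encoding are assumptions made for illustration; the patch itself may compute the version differently.

```c
#include <stdio.h>

/* Encode a release string such as "5.4.0-generic" as
 * major * 1000000 + minor * 1000 + patch (encoding assumed for this
 * sketch). Returns -1 if major and minor cannot be parsed. */
int encode_kernel_version(const char *release)
{
    int major = 0, minor = 0, patch = 0;

    if (sscanf(release, "%d.%d.%d", &major, &minor, &patch) < 2)
        return -1;
    return major * 1000000 + minor * 1000 + patch;
}

/* Hypothetical predicate: the layout feature bit is wanted on
 * kernels v5.4 and newer. */
int raid0_needs_layout(const char *release)
{
    return encode_kernel_version(release) >= 5004000;
}
```

An unparsable release string yields -1 and therefore never enables the feature bit in this sketch.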

Fixes: 329dfc28debb ('Create: add support for RAID0 layouts.')
Signed-off-by: Xiao Ni <xni@redhat.com>
Reviewed-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
6 months ago mdadm/ddf: Abort when raid disk is smaller in getinfo_super_ddf
Xiao Ni [Wed, 11 Oct 2023 13:03:32 +0000 (21:03 +0800)] 
mdadm/ddf: Abort when raid disk is smaller in getinfo_super_ddf

The metadata is corrupted when raid_disk<0, so abort directly.
This also avoids a build error:
super-ddf.c:1988:58: error: array subscript -1 is below array bounds of ‘struct phys_disk_entry[0]’

Suggested-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Acked-by: Xiao Ni <xni@redhat.com>
Acked-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
6 months ago mdadm/tests: Don't run mknod before losetup
Xiao Ni [Fri, 8 Sep 2023 08:44:35 +0000 (16:44 +0800)] 
mdadm/tests: Don't run mknod before losetup

Sometimes it can fail:
losetup: /var/tmp/mdtest0: failed to set up loop device: No such device or address
/dev/loop0 and /var/tmp/mdtest0 are already created before losetup.

losetup can create the device node by itself, so remove mknod.

Signed-off-by: Xiao Ni <xni@redhat.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
6 months ago Fix race of "mdadm --add" and "mdadm --incremental"
Li Xiao Keng [Thu, 7 Sep 2023 11:37:44 +0000 (19:37 +0800)] 
Fix race of "mdadm --add" and "mdadm --incremental"

Consider a raid1 with sda and sdb. When we add sdc to this raid,
it may return -EBUSY.

The main process of --add:
1. dev_open(sdc) in Manage_add
2. store_super1(st, di->fd) in write_init_super1
3. fsync(fd) in store_super1
4. close(di->fd) in write_init_super1
5. ioctl(ADD_NEW_DISK)

Steps 2 and 3 add sdc to the metadata of raid1. A udev event
(change of sdc) is generated after step 4. Then "/usr/sbin/mdadm
--incremental --export $devnode --offroot $env{DEVLINKS}"
is run, and sdc is added to the raid1. Step 5 then
returns -EBUSY because it checks that the device isn't
claimed in md_import_device()->lock_rdev()->blkdev_get_by_dev()
->blkdev_get().

This is confusing for users because sdc is being added for the first time.
The "incremental" path takes map_lock before adding sdc to raid1.
So we take map_lock before write_init_super in "mdadm --add"
to fix the race between "add" and "incremental".
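
The serialization idea can be sketched with an exclusive file lock. The helpers below are hypothetical and use flock() on a caller-supplied path; they are not mdadm's map_lock() implementation, whose details are not shown here.

```c
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* Hypothetical helper: take an exclusive lock on lock_path.
 * Returns an open fd that holds the lock, or -1 on failure.
 * With nonblock set, fail immediately if someone else holds it. */
int lock_acquire(const char *lock_path, int nonblock)
{
    int fd = open(lock_path, O_RDWR | O_CREAT, 0600);

    if (fd < 0)
        return -1;
    if (flock(fd, LOCK_EX | (nonblock ? LOCK_NB : 0)) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Drop the lock and close the descriptor returned by lock_acquire(). */
void lock_release(int fd)
{
    flock(fd, LOCK_UN);
    close(fd);
}
```

If both the "add" path and the "incremental" path wrap their metadata write and device import in such a lock, whichever comes second waits (or fails fast with nonblock) instead of racing.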

Signed-off-by: Li Xiao Keng <lixiaokeng@huawei.com>
Signed-off-by: Guanqin Miao <miaoguanqin@huawei.com>
Reviewed-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
6 months ago mdadm/tests: Fix regular expression failure
Xiao Ni [Thu, 7 Sep 2023 08:57:44 +0000 (16:57 +0800)] 
mdadm/tests: Fix regular expression failure

The test fails because of the regular expression.

Signed-off-by: Xiao Ni <xni@redhat.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
6 months ago Incremental: remove obsoleted calls to udisks
Coly Li [Sun, 13 Aug 2023 16:46:13 +0000 (00:46 +0800)] 
Incremental: remove obsoleted calls to udisks

The udisks utility has been removed from udev upstream, so calling this
obsolete command in run_udisks() no longer makes sense.

This patch removes the udisks call chain, which includes the routines
run_udisks() and force_remove(), and the 2 locations where force_remove() is
called. Since force_remove() is removed together with the udisks util, it is
fair to remove the Manage_stop() call inside force_remove() as well.

In the two places where the call to force_remove() is removed,
a failure from Manage_subdevs() can be safely ignored, because:
1) udisks doesn't exist, so there is no need to check the return value in
   order to umount the file system via udisks and remove the component disk again.
2) After the 'I' incremental remove, there is another 'r' hot remove
   following up. The first incremental remove is a best-effort attempt.

Therefore, where force_remove() is removed in this patch, the return
value of Manage_subdevs() is no longer checked either.

Signed-off-by: Coly Li <colyli@suse.de>
Reviewed-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Cc: Jes Sorensen <jes@trained-monkey.org>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
6 months ago mdadm: Follow POSIX Portable Character Set
Mariusz Tkaczyk [Thu, 1 Jun 2023 07:27:50 +0000 (09:27 +0200)] 
mdadm: Follow POSIX Portable Character Set

When the user creates a device with a name that contains whitespace,
mdadm times out and reports an error. This issue is caused by udev, which
truncates the /dev/md link at the first whitespace.

This patch prohibits characters other than A-Za-z0-9.-_
in the device name. It also prohibits a leading "-" in the device name,
so the name won't be confused with a cli parameter.
The set of allowed characters is taken from POSIX 3.280 Portable Character
Set. Also, the device name length is now limited to NAME_MAX.
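
The rules above can be sketched as a small validator. The function name is hypothetical, and rejecting an empty name is an assumption of this sketch rather than something stated in the commit message.

```c
#include <limits.h>
#include <string.h>

#ifndef NAME_MAX
#define NAME_MAX 255 /* fallback for this sketch only */
#endif

/* Hypothetical validator: returns 1 if name uses only A-Za-z0-9.-_,
 * has no leading '-', and is at most NAME_MAX bytes long.
 * Empty names are rejected (an assumption of this sketch). */
int is_posix_name(const char *name)
{
    static const char allowed[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        "abcdefghijklmnopqrstuvwxyz"
        "0123456789.-_";
    size_t len;

    if (name == NULL || name[0] == '\0' || name[0] == '-')
        return 0;
    len = strlen(name);
    if (len > NAME_MAX)
        return 0;
    /* strspn() counts the leading run of allowed characters. */
    return strspn(name, allowed) == len;
}
```

A name such as "my_array.0" passes, while "my array" (whitespace) and "-name" (leading dash) are rejected.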

In some places, there are other requirements for string length (e.g. size
up to MD_NAME_MAX for device name). This routine is made to follow POSIX
and other, more strict limitations should be checked separately.
We are aware of the risk of regression in exceptional cases (as
escape_devname function is removed) that should be fixed by updating
the array name.

The POSIX validation is added for:
- 'name' parameter in every mode.
- first devlist entry, for Build, Create, Assemble, Manage, Grow.
- config entries, both devname and "name=".

Additionally, some manual cleanups are made.

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
6 months ago mdadm: define ident_set_devname()
Mariusz Tkaczyk [Thu, 1 Jun 2023 07:27:49 +0000 (09:27 +0200)] 
mdadm: define ident_set_devname()

Use a dedicated set method for ident->devname. Devname validation
is now done early for modes where a device is created (Build, Create and
Assemble). The rules used for devname validation are derived from the
config file.

This could cause a regression in exceptional cases where an existing device
has a name which doesn't match the criteria for Manage and Grow modes. The risk
is low and those modes are not omitted from early devname validation.
Users can use the main numbered devnode to avoid this problem.
Messages exposed to the user are changed, so this might cause a regression
in negative scenarios. Error codes are not changed.

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
6 months ago mdadm: refactor ident->name handling
Mariusz Tkaczyk [Thu, 1 Jun 2023 07:27:48 +0000 (09:27 +0200)] 
mdadm: refactor ident->name handling

Create dedicated setter for name in mddev_ident and propagate it.
Following changes are made:
- move duplicated code from config.c and mdadm.c into new function.
- Add error enum in mdadm.h.
- Use MD_NAME_MAX instead of hardcoded value in mddev_ident.
- Use secure functions.
- Add more detailed verification of the name.
- make error messages reusable for cmdline and config:
    - for cmdline, these are errors so use pr_err().
    - for config, these are just warnings, so use pr_info().

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
6 months ago mdadm: set ident.devname if applicable
Mariusz Tkaczyk [Thu, 1 Jun 2023 07:27:47 +0000 (09:27 +0200)] 
mdadm: set ident.devname if applicable

This patch tries to propagate the usage of struct mddev_ident for cmdline
where it is applicable. To avoid regression, this value is derived
from devlist->devname for applicable modes only.
As a result, the whole structure is passed to some functions. It produces
some changes for Build, Create and Assemble.
No functional changes intended.

The goal of the change is to unify devname validation which is done in
next patches.

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
6 months ago tests: create 00confnames
Mariusz Tkaczyk [Thu, 1 Jun 2023 07:27:46 +0000 (09:27 +0200)] 
tests: create 00confnames

The test is an attempt to document the current implementation of devnode
and name handling for config entries. It is focused on incremental, the
default way of assembling arrays on boot.
The expectations are aligned to the current implementation for native
metadata because it is the most complicated scenario: both variables
can be set.

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
6 months ago tests: create names_template
Mariusz Tkaczyk [Thu, 1 Jun 2023 07:27:45 +0000 (09:27 +0200)] 
tests: create names_template

Create templates directory and names_template. Move code from
00createnames. This code will be reused for 00confnames in next patch.

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
6 months ago tests: add a regression test for raid456 deadlock again
Yu Kuai [Mon, 29 May 2023 13:28:26 +0000 (21:28 +0800)] 
tests: add a regression test for raid456 deadlock again

This is a regression test for commit ("md/raid5: fix a deadlock in the
case that reshape is interrupted").

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
6 months ago tests: add a regression test that reshape can corrupt data
Yu Kuai [Mon, 29 May 2023 13:28:25 +0000 (21:28 +0800)] 
tests: add a regression test that reshape can corrupt data

This is a regression test for commit 1544e95c6dd8 ("md: fix data
corruption for raid456 when reshape restart while grow up").

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
6 months ago tests: add a regression test that raid456 can't assemble again
Yu Kuai [Mon, 29 May 2023 13:28:24 +0000 (21:28 +0800)] 
tests: add a regression test that raid456 can't assemble again

This is a regression test for commit 0aecb06e2249 ("md/raid5: don't allow
replacement while reshape is in progress").

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
6 months ago tests: add a regression test that raid456 can't assemble
Yu Kuai [Mon, 29 May 2023 13:28:23 +0000 (21:28 +0800)] 
tests: add a regression test that raid456 can't assemble

If recovery is interrupted and reshape is started, then this array can't
assemble anymore. The problem is supposed to be fixed by [1].

[1] https://lore.kernel.org/linux-raid/20230529031045.1760883-1-yukuai1@huaweicloud.com/

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
6 months ago tests: add a regression test for raid456 deadlock
Yu Kuai [Mon, 29 May 2023 13:28:22 +0000 (21:28 +0800)] 
tests: add a regression test for raid456 deadlock

The deadlock is described in [1]. As the last patch described, it was
first fixed by [2]; however, this fix will be reverted and the deadlock
is supposed to be fixed by [3].

[1] https://lore.kernel.org/linux-raid/5ed54ffc-ce82-bf66-4eff-390cb23bc1ac@molgen.mpg.de/T/#t
[2] https://lore.kernel.org/linux-raid/20220621031129.24778-1-guoqing.jiang@linux.dev/
[3] https://lore.kernel.org/linux-raid/20230322064122.2384589-5-yukuai1@huaweicloud.com/

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
6 months ago tests: add a regression test for raid10 deadlock
Yu Kuai [Mon, 29 May 2023 13:28:21 +0000 (21:28 +0800)] 
tests: add a regression test for raid10 deadlock

The deadlock is described in [1]. It was first fixed by [2]; however,
it turns out that this commit triggers other problems [3], hence this
commit will be reverted and the deadlock is supposed to be fixed by [1].

[1] https://lore.kernel.org/linux-raid/20230322064122.2384589-5-yukuai1@huaweicloud.com/
[2] https://lore.kernel.org/linux-raid/20220621031129.24778-1-guoqing.jiang@linux.dev/
[3] https://lore.kernel.org/linux-raid/20230322064122.2384589-2-yukuai1@huaweicloud.com/

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
6 months ago tests: support to skip checking dmesg
Yu Kuai [Mon, 29 May 2023 13:28:20 +0000 (21:28 +0800)] 
tests: support to skip checking dmesg

Prepare to add a regression test for raid10 that requires error injection
to trigger the error path. The kernel will complain about io errors, so checking
dmesg for error logs would make it impossible to pass this test.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Acked-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
6 months ago tests: add a new test for rdev lifetime
Yu Kuai [Mon, 29 May 2023 13:28:19 +0000 (21:28 +0800)] 
tests: add a new test for rdev lifetime

This test adds and removes an underlying disk of a raid concurrently, to verify
that the following problem is fixed:

run mdadm test 23rdev-lifetime at Fri Apr 28 03:25:30 UTC 2023
md: could not open device unknown-block(1,0).
sysfs: cannot create duplicate filename '/devices/virtual/block/md0/md/dev-ram0'
CPU: 26 PID: 10521 Comm: test Not tainted 6.3.0-rc2-00134-g7b3a8828043c #115
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.1-2.fc37 04/014
Call Trace:
 <TASK>
 dump_stack_lvl+0xe7/0x180
 dump_stack+0x18/0x30
 sysfs_warn_dup+0xa2/0xd0
 sysfs_create_dir_ns+0x119/0x140
 kobject_add_internal+0x143/0x4d0
 kobject_add_varg+0x35/0x70
 kobject_add+0x64/0xd0
 bind_rdev_to_array+0x254/0x840 [md_mod]
 new_dev_store+0x14d/0x350 [md_mod]
 md_attr_store+0xc1/0x1a0 [md_mod]
 sysfs_kf_write+0x51/0x70
 kernfs_fop_write_iter+0x188/0x270
 vfs_write+0x27e/0x460
 ksys_write+0x85/0x180
 __x64_sys_write+0x21/0x30
 do_syscall_64+0x6c/0xe0
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7f26bacf5387
Code: 0d 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 84
RSP: 002b:00007ffe98d79e68 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 00007f26bacf5387
RDX: 0000000000000004 RSI: 000055bd10282bf0 RDI: 0000000000000001
RBP: 000055bd10282bf0 R08: 000000000000000a R09: 00007f26bad8b4e0
R10: 00007f26bad8b3e0 R11: 0000000000000246 R12: 0000000000000004
R13: 00007f26badc8520 R14: 0000000000000004 R15: 00007f26badc8700
 </TASK>

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
6 months ago Assemble: fix redundant memory free
Kinga Tanska [Tue, 12 Sep 2023 02:27:01 +0000 (04:27 +0200)] 
Assemble: fix redundant memory free

Commit e9fb93af0f76 ("Fix memory leak in file Assemble")
fixes a few memory leaks in Assemble, but it introduces a
problem with assembling RAID volumes. It was caused by
clearing metadata too early, not only on failure in the
select_devices() function.
This commit removes the redundant memory free.

Signed-off-by: Kinga Tanska <kinga.tanska@intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
8 months ago Add compiler defenses flags
Mateusz Grzonka [Mon, 17 Jul 2023 13:19:10 +0000 (15:19 +0200)] 
Add compiler defenses flags

It is essential to avoid buffer overflows and similar bugs as much as
possible.

According to Intel rules we are obligated to verify certain
compiler flags, so it will be much easier if they are added to the
Makefile.

Add gcc flags for prevention of buffer overflows, format string vulnerabilities,
stack protection to prevent stack overwrites and aslr enablement through -fPIE.
Also make the flags configurable.

The changes were verified on gcc versions 7.5, 8.3, 9.2, 10 and 12.2.

Signed-off-by: Mateusz Grzonka <mateusz.grzonka@intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
8 months ago imsm: Add reading vmd register for finding imsm capability
Mateusz Grzonka [Wed, 5 Jul 2023 14:23:17 +0000 (16:23 +0200)] 
imsm: Add reading vmd register for finding imsm capability

Currently mdadm does not find imsm capability when running inside a VM.
This patch adds the possibility to read from the vmd register and check for
capability, effectively allowing mdadm to be used with imsm inside virtual machines.

Additionally refactor find_imsm_capability() to make assignments in new
lines.

Signed-off-by: Mateusz Grzonka <mateusz.grzonka@intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
8 months ago platform-intel: limit guid length
Kinga Tanska [Thu, 11 May 2023 02:55:13 +0000 (04:55 +0200)] 
platform-intel: limit guid length

Move GUID_STR_MAX to the header to use it as
a length limit for the snprintf function.

Signed-off-by: Kinga Tanska <kinga.tanska@intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
8 months ago Fix unsafe string functions
Kinga Tanska [Thu, 11 May 2023 02:55:12 +0000 (04:55 +0200)] 
Fix unsafe string functions

Add string length limitations where necessary to
avoid buffer overflows.
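
As an illustration of the kind of length limit meant here, a bounded copy built on snprintf() might look like this. The helper name is hypothetical and not taken from the patch.

```c
#include <stdio.h>

/* Hypothetical helper: copy src into dst (capacity dst_size bytes),
 * always NUL-terminating. Returns 0 on success, -1 if the text had to
 * be truncated or dst_size is 0. */
int copy_bounded(char *dst, size_t dst_size, const char *src)
{
    int n;

    if (dst_size == 0)
        return -1;
    /* snprintf() never writes more than dst_size bytes and reports
     * the length the full string would have needed. */
    n = snprintf(dst, dst_size, "%s", src);
    if (n < 0 || (size_t)n >= dst_size)
        return -1;
    return 0;
}
```

Unlike strcpy(), this cannot write past the end of dst, and the return value lets the caller detect truncation.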

Signed-off-by: Kinga Tanska <kinga.tanska@intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
8 months ago Fix memory leak in file mdadm
Guanqin Miao [Mon, 24 Apr 2023 08:06:37 +0000 (16:06 +0800)] 
Fix memory leak in file mdadm

When testing mdadm with asan, we found some memory leaks in mdadm.c.
We fix these memory leaks based on code logic.

Signed-off-by: Guanqin Miao <miaoguanqin@huawei.com>
Signed-off-by: Li Xiao Keng <lixiaokeng@huawei.com>
Acked-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
8 months ago Fix memory leak in file Manage
Guanqin Miao [Mon, 24 Apr 2023 08:06:36 +0000 (16:06 +0800)] 
Fix memory leak in file Manage

When testing mdadm with asan, we found some memory leaks in Manage.c.
We fix these memory leaks based on code logic.

v2: Fix free() of uninitialized 'tst' in abort path.

Signed-off-by: Guanqin Miao <miaoguanqin@huawei.com>
Signed-off-by: Li Xiao Keng <lixiaokeng@huawei.com>
Acked-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
8 months ago Fix memory leak in file Kill
Guanqin Miao [Mon, 24 Apr 2023 08:06:35 +0000 (16:06 +0800)] 
Fix memory leak in file Kill

When testing mdadm with asan, we found some memory leaks in Kill.c.
We fix these memory leaks based on code logic.

Signed-off-by: Guanqin Miao <miaoguanqin@huawei.com>
Signed-off-by: Li Xiao Keng <lixiaokeng@huawei.com>
Acked-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
8 months ago Fix memory leak in file Assemble
Guanqin Miao [Mon, 24 Apr 2023 08:06:34 +0000 (16:06 +0800)] 
Fix memory leak in file Assemble

When testing mdadm with asan, we found some memory leaks in Assemble.c.
We fix these memory leaks based on code logic.

v2: Set st = NULL before jumping to loop

Signed-off-by: Guanqin Miao <miaoguanqin@huawei.com>
Signed-off-by: Li Xiao Keng <lixiaokeng@huawei.com>
Acked-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
8 months ago mdadm: Stop mdcheck_continue timer when mdcheck_start service can finish check
Xiao Ni [Fri, 25 Aug 2023 12:55:41 +0000 (20:55 +0800)] 
mdadm: Stop mdcheck_continue timer when mdcheck_start service can finish check

mdcheck_continue is triggered by the mdcheck_start timer. It's used to
continue the check action if the raid is too big and the mdcheck_start
service can't finish the check action. If mdcheck_start can finish the check
action, the mdcheck_continue service is no longer needed. So stop
it when the mdcheck_start service can finish the check action.

Signed-off-by: Xiao Ni <xni@redhat.com>
Acked-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
8 months ago Add secure gethostname() wrapper
Blazej Kucman [Fri, 16 Jun 2023 19:45:55 +0000 (21:45 +0200)] 
Add secure gethostname() wrapper

gethostname() does not guarantee a null-terminated string
if the hostname is longer than the buffer length.
For security, the function s_gethostname() has been added
to ensure that "\0" is added to the end of the buffer.
Previously this had to be handled at each
gethostname() call site.
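
A wrapper in this spirit might look like the sketch below. The name safe_gethostname() and its signature are assumptions chosen for illustration; the real s_gethostname() may differ.

```c
#include <stddef.h>
#include <unistd.h>

/* Hypothetical wrapper: call gethostname() and force a terminating
 * NUL in the last byte of the buffer, whatever gethostname() did.
 * Returns gethostname()'s result, or -1 for a zero-length buffer. */
int safe_gethostname(char *buf, size_t len)
{
    int rv;

    if (len == 0)
        return -1;
    rv = gethostname(buf, len);
    buf[len - 1] = '\0';
    return rv;
}
```

Callers can then treat buf as a C string without repeating the termination step at every call site.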

Signed-off-by: Blazej Kucman <blazej.kucman@intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
8 months ago imsm: fix free space calculations
Mariusz Tkaczyk [Mon, 29 May 2023 13:52:38 +0000 (15:52 +0200)] 
imsm: fix free space calculations

Between two volumes or between last volume and metadata at least
IMSM_RESERVED_SECTORS gap must exist. Currently the gap can be doubled
because metadata reservation contains IMSM_RESERVED_SECTORS too.

Divide reserve variable into pre_reservation and post_reservation to be
more flexible and decide separately if each reservation is needed.

Pre_reservation is needed only when a volume is created and it is not a
real first volume in a container (we can check that by extent_idx).
This type of reservation is not needed for expand.

Post_reservation is not needed only if real last volume is created or
expanded because reservation is done with the metadata.

The volume index in metadata cannot be trusted, because the real volume
order can be reversed. It is safer to use extent table, it is sorted by
start position.

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
8 months ago imsm: return free space after volume for expand
Mariusz Tkaczyk [Mon, 29 May 2023 13:52:37 +0000 (15:52 +0200)] 
imsm: return free space after volume for expand

The merge_extents() routine searches for the biggest free space. For expand,
it works only in standard cases where the last volume is expanded and
the free space is determined after the last volume.
Add a volume index to the extent struct and use it to determine the size after
super->current_vol during expand.

Limitation to last volume is no longer needed. It unblocks scenarios
where kill-subarray is used to remove first volume and later it is
recreated (now it is the second volume, even if it is placed before
existing one).
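
The idea of looking up free space by volume index rather than by position can be sketched as follows. The struct, the function name and the disk_end parameter are hypothetical, and the sketch deliberately ignores reserved gaps such as IMSM_RESERVED_SECTORS.

```c
/* Hypothetical extent record for this sketch only. */
struct extent_sketch {
    unsigned long long start; /* first sector of the extent */
    unsigned long long size;  /* length in sectors */
    int vol;                  /* index of the owning volume */
};

/* Return the free sectors directly after the extent owned by volume
 * `vol`. `e` must be sorted by start; `disk_end` is the first sector
 * that is no longer usable. Returns 0 if the volume is not found. */
unsigned long long free_after_volume(const struct extent_sketch *e, int n,
                                     int vol, unsigned long long disk_end)
{
    for (int i = 0; i < n; i++) {
        unsigned long long end, next;

        if (e[i].vol != vol)
            continue;
        end = e[i].start + e[i].size;
        next = (i + 1 < n) ? e[i + 1].start : disk_end;
        return next > end ? next - end : 0;
    }
    return 0;
}
```

Because the lookup keys on the vol field, a recreated volume with index 1 that sits physically before volume 0 is still handled: its free space is measured up to the start of the next extent, not assumed to be at the end of the disk.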

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
8 months ago imsm: move expand verification code into new function
Mariusz Tkaczyk [Mon, 29 May 2023 13:52:36 +0000 (15:52 +0200)] 
imsm: move expand verification code into new function

The code here is too complex. Move it to separate function and
simplify it. Add more error messages.

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
8 months ago imsm: introduce round_member_size_to_mb()
Mariusz Tkaczyk [Mon, 29 May 2023 13:52:35 +0000 (15:52 +0200)] 
imsm: introduce round_member_size_to_mb()

Extract rounding logic to separate function.

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
8 months ago imsm: imsm_get_free_size() refactor.
Mariusz Tkaczyk [Mon, 29 May 2023 13:52:34 +0000 (15:52 +0200)] 
imsm: imsm_get_free_size() refactor.

Move minsize calculations up. Add error message if free size is too small.

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
8 months ago imsm: move sum_extents calculations to merge_extents()
Mariusz Tkaczyk [Mon, 29 May 2023 13:52:33 +0000 (15:52 +0200)] 
imsm: move sum_extents calculations to merge_extents()

This logic is only used by merge_extents() code, there is no need to pass
it as parameter. Move it up. Add proper description.

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
9 months ago imsm: Fix possible segfault in check_no_platform()
Mateusz Grzonka [Wed, 5 Jul 2023 14:34:56 +0000 (16:34 +0200)] 
imsm: Fix possible segfault in check_no_platform()

conf_line() may return NULL, which is not handled and might cause
segfault.

Signed-off-by: Mateusz Grzonka <mateusz.grzonka@intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
12 months ago enable RAID for SATA under VMD
Kevin Friedberg [Thu, 16 Feb 2023 04:41:34 +0000 (23:41 -0500)] 
enable RAID for SATA under VMD

Detect when a SATA controller has been mapped under Intel Alderlake RST
VMD, so that it can use the VMD controller's RAID capabilities. Create
new device type SYS_DEV_SATA_VMD and list separate controller to prevent
mixing with the NVMe SYS_DEV_VMD devices on the same VMD domain.

Signed-off-by: Kevin Friedberg <kev.friedberg@gmail.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
12 months ago mdadm: numbered names verification
Mariusz Tkaczyk [Thu, 23 Mar 2023 16:50:17 +0000 (17:50 +0100)] 
mdadm: numbered names verification

New functions added to remove literals and make the code reusable.
Use parse_num() instead of is_numer().

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
12 months ago mdadm: define is_devname_ignore()
Mariusz Tkaczyk [Thu, 23 Mar 2023 16:50:16 +0000 (17:50 +0100)] 
mdadm: define is_devname_ignore()

Use function instead of direct checks across code.

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
12 months ago mdadm: define DEV_NUM_PREF
Mariusz Tkaczyk [Thu, 23 Mar 2023 16:50:15 +0000 (17:50 +0100)] 
mdadm: define DEV_NUM_PREF

Use define instead of inlines. Add _LEN define.

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
12 months ago mdadm: define DEV_MD_DIR
Mariusz Tkaczyk [Thu, 23 Mar 2023 16:50:14 +0000 (17:50 +0100)] 
mdadm: define DEV_MD_DIR

It is used many times. Additionally define _LEN to avoid repeated
strlen() calls when length is needed.
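
A sketch of the pattern: pairing a string define with a compile-time length. The value "/dev/md/" and the helper function are assumptions made for illustration; only the DEV_MD_DIR name and the _LEN suffix come from the commit message.

```c
#include <string.h>

/* Value assumed for this sketch. */
#define DEV_MD_DIR "/dev/md/"
/* sizeof on a string literal includes the trailing NUL, hence the -1.
 * The length is computed at compile time, so no strlen() is needed. */
#define DEV_MD_DIR_LEN (sizeof(DEV_MD_DIR) - 1)

/* Hypothetical helper: does path start with the /dev/md/ directory? */
int is_in_dev_md_dir(const char *path)
{
    return strncmp(path, DEV_MD_DIR, DEV_MD_DIR_LEN) == 0;
}
```

Keeping the literal and its length side by side means a change to the string cannot leave a stale hardcoded length behind.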

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
13 months ago Remove the config files in mdcheck_start|continue service
Xiao Ni [Fri, 7 Apr 2023 00:45:28 +0000 (08:45 +0800)] 
Remove the config files in mdcheck_start|continue service

We set MDADM_CHECK_DURATION in the mdcheck_start|continue.service files.
And mdcheck doesn't use any configs from the config file. So we can remove
the dependencies.

Signed-off-by: Xiao Ni <xni@redhat.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
13 months ago Bump minimum kernel version to 2.6.32
Jes Sorensen [Mon, 10 Apr 2023 15:45:34 +0000 (11:45 -0400)] 
Bump minimum kernel version to 2.6.32

Summary: At this point it probably is reasonable to drop support for
anything prior to 3.10.

Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
13 months ago Fix some cases eyesore formatting
Jes Sorensen [Mon, 10 Apr 2023 15:40:42 +0000 (11:40 -0400)] 
Fix some cases eyesore formatting

Summary: No functional change .... just make it more readable.

Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
13 months ago super1: fix truncation check for journal device
Hristo Venev [Sat, 1 Apr 2023 20:01:35 +0000 (23:01 +0300)] 
super1: fix truncation check for journal device

The journal device can be smaller than the component devices.

Fixes: 171e9743881e ("super1: report truncated device")
Signed-off-by: Hristo Venev <hristo@venev.name>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
13 months ago Fix null pointer for incremental in mdadm
miaoguanqin [Tue, 4 Apr 2023 11:31:24 +0000 (19:31 +0800)] 
Fix null pointer for incremental in mdadm

When we execute mdadm --assemble, udev-md-raid-assembly.rules is triggered.
Then, when we stop the array, we find a coredump from mdadm --incremental. The
call stack is as follows:

#0  enough (level=10, raid_disks=4, layout=258, clean=1,
    avail=avail@entry=0x0) at util.c:555
#1  0x0000562170c26965 in Incremental (devlist=<optimized out>,
    c=<optimized out>, st=0x5621729b6dc0) at Incremental.c:514
#2  0x0000562170bfb6ff in main (argc=<optimized out>,
    argv=<optimized out>) at mdadm.c:1762

The function enough() uses the array avail. Space for avail is allocated in
count_active(), which may not allocate it, causing a coredump. We fix this coredump.

Signed-off-by: Guanqin Miao <miaoguanqin@huawei.com>
Signed-off-by: lixiaokeng <lixiaokeng@huawei.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
13 months ago Create: Fix checking for container in update_metadata
Mateusz Grzonka [Thu, 23 Mar 2023 11:50:00 +0000 (12:50 +0100)] 
Create: Fix checking for container in update_metadata

The commit 8a4ce2c05386 ("Create: Factor out add_disks() helpers")
introduced a regression that caused timeouts and udev failing to create
links.

Steps to reproduce the issue were as follows:
$ mdadm -CR imsm -e imsm -n4 /dev/nvme[0-3]n1
$ mdadm -CR vol -l5 -n4 /dev/nvme[0-3]n1 --assume-clean

I found the check for container was wrong because negation was missing.

Fixes: 8a4ce2c05386 ("Create: Factor out add_disks() helpers")
Signed-off-by: Mateusz Grzonka <mateusz.grzonka@intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
13 months agoRevert "Revert "mdadm/systemd: remove KillMode=none from service file""
Mariusz Tkaczyk [Thu, 23 Mar 2023 16:13:18 +0000 (17:13 +0100)] 
Revert "Revert "mdadm/systemd: remove KillMode=none from service file""

This reverts commit 28a083955c6f58f8e582734c8c82aff909a7d461.

Resolved by commit 723d1df4946e ("mdmon: Improve switchroot
interactions.") We are ready to drop it.

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
13 months agoImprovements for IMSM_NO_PLATFORM testing.
NeilBrown [Mon, 20 Mar 2023 03:43:54 +0000 (14:43 +1100)] 
Improvements for IMSM_NO_PLATFORM testing.

Factor out IMSM_NO_PLATFORM testing into a single function that caches
the result.

Allow mdmon to explicitly set the result to "1" so that we don't need
the ENV var in the unit file.

Check if the kernel command line contains "mdadm.imsm.test=1" and in
that case assert NO_PLATFORM.  This simplifies testing in a virtual
machine.
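
A rough sketch of a cached check along these lines, assuming the kernel
command line has already been read into a string. The names
cmdline_has_token() and check_no_platform_sketch() are hypothetical, and
treating the environment variable as enabled only when it equals "1" is an
assumption:

```c
#include <stdlib.h>
#include <string.h>

/* Returns 1 if the whitespace-separated token appears in cmdline. */
static int cmdline_has_token(const char *cmdline, const char *token)
{
	size_t len = strlen(token);
	const char *p = cmdline;

	while ((p = strstr(p, token)) != NULL) {
		int starts = (p == cmdline || p[-1] == ' ');
		int ends = (p[len] == '\0' || p[len] == ' ' || p[len] == '\n');

		if (starts && ends)
			return 1;
		p += len;
	}
	return 0;
}

/* Cached result: -1 means "not evaluated yet". */
static int no_platform_cache = -1;

static int check_no_platform_sketch(const char *cmdline)
{
	const char *env;

	if (no_platform_cache >= 0)
		return no_platform_cache;

	env = getenv("IMSM_NO_PLATFORM");	/* assumed to mean "on" when "1" */
	if (env && strcmp(env, "1") == 0)
		no_platform_cache = 1;
	else
		no_platform_cache = cmdline_has_token(cmdline,
						      "mdadm.imsm.test=1");
	return no_platform_cache;
}
```

Once the first call has stored a result, later calls return it without
looking at the environment or the command line again.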

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
13 months agomdopen: always try create_named_array()
NeilBrown [Tue, 14 Mar 2023 00:06:25 +0000 (11:06 +1100)] 
mdopen: always try create_named_array()

mdopen() will use create_named_array() to ask the kernel to create the
given md array, but only if it is given a number or name.
If it is NOT given a name and is required to choose one itself using
find_free_devnm() it does NOT use create_named_array().

On kernels with CONFIG_BLOCK_LEGACY_AUTOLOAD not set, this can result in
failure to assemble an array.  This can particularly be seen when the
"name" of the array begins with a host name different to the name of the
host running the command.

So add the missing call to create_named_array().

Link: https://bugzilla.kernel.org/show_bug.cgi?id=217074
Signed-off-by: NeilBrown <neilb@suse.de>
Acked-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
13 months agomdmon: Improve switchroot interactions.
NeilBrown [Mon, 13 Mar 2023 03:42:58 +0000 (14:42 +1100)] 
mdmon: Improve switchroot interactions.

We need a new mdmon@mdfoo instance to run in the root filesystem after
switch root, as /sys and /dev are removed from the initrd.

systemd will not start a new unit while an old unit with the same name is
still active, and we want the two mdmon processes to overlap in time to
avoid any risk of deadlock, which can happen when a write is attempted
with no mdmon running.

So we need a different unit name in the initrd than in the root.  Apart
from the name, everything else should be the same.

This is easily achieved using a different instance name as the
mdmon@.service unit file already supports multiple instances (for
different arrays).

So start "mdmon@mdfoo.service" from root, but
"mdmon@initrd-mdfoo.service" from the initrd.  udev can tell which
circumstance is the case by looking for /etc/initrd-release.
continue_from_systemd() is enhanced so that the "initrd-" prefix can be
requested.
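
The naming scheme can be sketched as follows. The helper
mdmon_unit_name_sketch() and its parameters are hypothetical and are not
the actual continue_from_systemd() interface:

```c
#include <stdio.h>

/* Build the systemd unit name for a container's mdmon instance.
 * Returns 0 on success, -1 if the name does not fit in buf. */
static int mdmon_unit_name_sketch(char *buf, size_t buflen,
				  const char *devnm, int in_initrd)
{
	int n = snprintf(buf, buflen, "mdmon@%s%s.service",
			 in_initrd ? "initrd-" : "", devnm);

	return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```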

Teach mdmon that a container name like "initrd/foo" should be treated
just like "foo".  Note that systemd passes the instance name
"initrd-foo" as "initrd/foo".

We don't need a similar mechanism at shutdown because dracut runs
"mdmon --takeover --all" when appropriate.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
13 months agomdmon: Remove need for KillMode=none
NeilBrown [Mon, 13 Mar 2023 03:42:58 +0000 (14:42 +1100)] 
mdmon: Remove need for KillMode=none

mdmon needs to keep running during the switchroot out of (at boot) and
then back into (at shutdown) the initrd.  It runs until a new mdmon
takes over.

KillMode=none is used to achieve this, with the help of --offroot which
sets argv[0][0] to '@' which systemd understands.

This is needed because mdmon is currently run in system-mdmon.slice
which conflicts with shutdown.target so without KillMode=none mdmon
would get killed early in shutdown when system-mdmon.slice is removed.

As described in systemd.service(5), this conflict with shutdown can be
resolved by explicitly requesting system.slice, which is a natural
counterpart to DefaultDependencies=no.

So add that, and also add IgnoreOnIsolate=true to avoid another possible
source of an early death.  With these we no longer need KillMode=none
which the systemd developers have marked as "deprecated".

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
13 months agomdmon: change systemd unit file to use --foreground
NeilBrown [Mon, 13 Mar 2023 03:42:58 +0000 (14:42 +1100)] 
mdmon: change systemd unit file to use --foreground

There is no value in mdmon forking when it is running under systemd -
systemd can still track it anyway.

So add --foreground option, and remove "Type=forking".

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
13 months agomdmon: don't test both 'all' and 'container_name'.
NeilBrown [Mon, 13 Mar 2023 03:42:58 +0000 (14:42 +1100)] 
mdmon: don't test both 'all' and 'container_name'.

If 'all' is not set, then container_name must be NULL, as nothing else
can set it.  So simplify the test to ignore container_name.
This makes the purpose of the code more obvious.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
13 months agoUse existence of /etc/initrd-release to detect initrd.
NeilBrown [Mon, 13 Mar 2023 03:42:58 +0000 (14:42 +1100)] 
Use existence of /etc/initrd-release to detect initrd.

Since v183, systemd has used the existence of /etc/initrd-release to
detect if it is running in an initrd, rather than looking at the magic
number of the root filesystem's device.  It is time for mdadm to do the
same.
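
A minimal sketch of such a check, assuming a plain existence test is
sufficient. The function names are hypothetical, and the path is a
parameter of marker_exists() only so the sketch can be exercised against
other paths:

```c
#include <unistd.h>

/* Returns 1 if something exists at the given path, else 0. */
static int marker_exists(const char *path)
{
	return access(path, F_OK) == 0;
}

/* Returns 1 if the initrd marker file is present. */
static int in_initrd_sketch(void)
{
	return marker_exists("/etc/initrd-release");
}
```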

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
13 months agoDefine alignof using _Alignof when using C11 or newer
Khem Raj [Wed, 18 Jan 2023 08:32:36 +0000 (00:32 -0800)] 
Define alignof using _Alignof when using C11 or newer

WG14 N2350 made very clear that it is undefined behavior to have type
definitions within "offsetof" [1]. This patch enhances the implementation
of the macro alignof_slot to use the builtin "_Alignof" to avoid undefined
behavior when using -std=c11 or newer.

clang 16+ has started to flag this [2]

Fixes build when using -std >= gnu11 and using clang16+

Older compilers (gcc < 4.9 or clang < 8) have a buggy _Alignof even though
they may support C11, so exclude those compilers too.
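
The idea can be sketched like this. ALIGNOF_SKETCH is a hypothetical name,
and the compiler-version exclusions mentioned above are left out for
brevity:

```c
#include <stddef.h>

/* Prefer the C11 operator; fall back to the offsetof idiom on older
 * standards. Compiler-version exclusions are omitted from this sketch. */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L
#define ALIGNOF_SKETCH(type) _Alignof(type)
#else
#define ALIGNOF_SKETCH(type) offsetof(struct { char c; type x; }, x)
#endif
```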

[1] https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2350.htm
[2] https://reviews.llvm.org/D133574

Upstream-Status: Pending
Signed-off-by: Khem Raj <raj.khem@gmail.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
13 months agomanpage: Add --write-zeroes option to manpage
Logan Gunthorpe [Wed, 1 Mar 2023 20:41:35 +0000 (13:41 -0700)] 
manpage: Add --write-zeroes option to manpage

Document the new --write-zeroes option in the manpage.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Kinga Tanska <kinga.tanska@linux.intel.com>
Reviewed-by: Xiao Ni <xni@redhat.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
13 months agotests/00raid5-zero: Introduce test to exercise --write-zeros.
Logan Gunthorpe [Wed, 1 Mar 2023 20:41:34 +0000 (13:41 -0700)] 
tests/00raid5-zero: Introduce test to exercise --write-zeros.

Attempt to create a raid5 array with --write-zeroes. If it is successful
check the array to ensure it is in sync.

If it is unsuccessful and an unsupported error is printed, skip the
test.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Kinga Tanska <kinga.tanska@linux.intel.com>
Reviewed-by: Xiao Ni <xni@redhat.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
13 months agomdadm: Add --write-zeros option for Create
Logan Gunthorpe [Wed, 1 Mar 2023 20:41:33 +0000 (13:41 -0700)] 
mdadm: Add --write-zeros option for Create

Add the --write-zeroes option for Create which will send a write zeroes
request to all the disks before assembling the array. After zeroing
the array, the disks will be in a known clean state and the initial
sync may be skipped.

Writing zeroes is best used when there is a hardware offload method
to zero the data. But even still, zeroing can take several minutes on
a large device. Because of this, all disks are zeroed in parallel using
their own forked process and a message is printed to the user. The main
process will proceed only after all the zeroing processes have completed
successfully.
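
The fork-and-wait pattern described above might look roughly like this.
All names are hypothetical, and the two demo jobs stand in for the real
per-disk zeroing work:

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical per-disk job: returns 0 on success. */
typedef int (*disk_job_fn)(int disk_index);

/* Demo jobs standing in for the real zeroing work. */
static int job_always_ok(int i)     { (void)i; return 0; }
static int job_fail_on_disk1(int i) { return i == 1; }

/* Run one child per disk, then wait for all of them.
 * Returns 0 only if every child exited with status 0. */
static int run_jobs_in_parallel(int ndisks, disk_job_fn job)
{
	int i, started = 0, failed = 0;

	for (i = 0; i < ndisks; i++) {
		pid_t pid = fork();

		if (pid < 0) {		/* could not fork: remember the failure */
			failed = 1;
			break;
		}
		if (pid == 0)		/* child: do the work and exit */
			_exit(job(i) == 0 ? 0 : 1);
		started++;
	}

	/* Parent: reap every child that was started. */
	for (i = 0; i < started; i++) {
		int status;

		if (wait(&status) < 0 ||
		    !WIFEXITED(status) || WEXITSTATUS(status) != 0)
			failed = 1;
	}
	return failed;
}
```

The parent only returns after every started child has been reaped, so a
single failing disk is enough to report failure for the whole operation.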

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Kinga Tanska <kinga.tanska@linux.intel.com>
Reviewed-by: Xiao Ni <xni@redhat.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
13 months agomdadm: Introduce pr_info()
Logan Gunthorpe [Wed, 1 Mar 2023 20:41:32 +0000 (13:41 -0700)] 
mdadm: Introduce pr_info()

Feedback was given to avoid informational pr_err() calls that print
to stderr, even though that's done throughout the code.

Using printf() directly doesn't maintain the same format (an "mdadm"
prefix on every line).

So introduce pr_info() which prints to stdout with the same format
and use it for a couple informational pr_err() calls in Create().
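
A minimal sketch of such a helper, written as a function rather than a
macro. pr_info_sketch() is a hypothetical name and the real definition may
differ:

```c
#include <stdarg.h>
#include <stdio.h>

/* Print an "mdadm: "-prefixed message on stdout.
 * Returns the number of characters written, or a negative value on error. */
static int pr_info_sketch(const char *fmt, ...)
{
	va_list ap;
	int n, m;

	n = printf("mdadm: ");
	if (n < 0)
		return n;
	va_start(ap, fmt);
	m = vprintf(fmt, ap);
	va_end(ap);
	return m < 0 ? m : n + m;
}
```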

Future work can make this call used in more cases.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Kinga Tanska <kinga.tanska@linux.intel.com>
Reviewed-by: Xiao Ni <xni@redhat.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Acked-by: Coly Li <colyli@suse.de>
Acked-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
13 months agoCreate: Factor out add_disks() helpers
Logan Gunthorpe [Wed, 1 Mar 2023 20:41:31 +0000 (13:41 -0700)] 
Create: Factor out add_disks() helpers

The Create function is massive with a very large number of variables.
Reading and understanding the function is almost impossible. To help
with this, factor out the two pass loop that adds the disks to the array.

This moves about 160 lines into three new helper functions and removes
a bunch of local variables from the main Create function. The main new
helper function add_disks() does the two pass loop and calls into
add_disk_to_super() and update_metadata(). Factoring out the
latter two helpers also helps to reduce a ton of indentation.

No functional changes intended.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Kinga Tanska <kinga.tanska@linux.intel.com>
Reviewed-by: Xiao Ni <xni@redhat.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
13 months agoCreate: remove safe_mode_delay local variable
Logan Gunthorpe [Wed, 1 Mar 2023 20:41:30 +0000 (13:41 -0700)] 
Create: remove safe_mode_delay local variable

Every .getinfo_super() call sets the info.safe_mode_delay variable
to a constant value, so no matter what the current state is
that function will always set it to the same value.

Create() calls .getinfo_super() multiple times while creating the array.
The value is stored in a local variable for every disk in the loop
to add disks (so the last disk's call takes precedence). The local
variable is then used in the call to sysfs_set_safemode().

This can be simplified by using info.safe_mode_delay directly. The info
variable had .getinfo_super() called on it early in the function so, by the
reasoning above, it will have the same value as the local variable which
can thus be removed.

Doing this allows for factoring out code from Create() in a subsequent
patch.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Kinga Tanska <kinga.tanska@linux.intel.com>
Reviewed-by: Xiao Ni <xni@redhat.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
13 months agoCreate: goto abort_locked instead of return 1 in error path
Logan Gunthorpe [Wed, 1 Mar 2023 20:41:29 +0000 (13:41 -0700)] 
Create: goto abort_locked instead of return 1 in error path

The return 1 after the fstat_is_blkdev() check should be replaced
with an error return that goes through the error path to unlock
resources locked by this function.
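
The pattern can be sketched as follows. The lock bookkeeping and
create_sketch() are hypothetical and only illustrate why the early return
is a problem:

```c
/* Hypothetical lock bookkeeping so the sketch is self-contained. */
static int locks_held;

static void lock_sketch(void)   { locks_held++; }
static void unlock_sketch(void) { locks_held--; }

/* Returns 0 on success, 1 on failure; the lock is released either way. */
static int create_sketch(int device_is_blkdev)
{
	int rv = 1;

	lock_sketch();
	if (!device_is_blkdev)
		goto abort_locked;	/* not "return 1": that would leak the lock */

	rv = 0;				/* ... real work would happen here ... */

abort_locked:
	unlock_sketch();
	return rv;
}
```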

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Kinga Tanska <kinga.tanska@linux.intel.com>
Reviewed-by: Xiao Ni <xni@redhat.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
14 months agosuper-ddf.c: fix memleak in get_vd_num_of_subarray()
Wu Guanghao [Fri, 3 Mar 2023 16:21:35 +0000 (00:21 +0800)] 
super-ddf.c: fix memleak in get_vd_num_of_subarray()

The sra returned by sysfs_read() should be freed before returning from
get_vd_num_of_subarray().

Signed-off-by: Wu Guanghao <wuguanghao3@huawei.com>
Acked-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
14 months agosuper-intel.c: fix memleak in find_disk_attached_hba()
Wu Guanghao [Fri, 3 Mar 2023 16:21:34 +0000 (00:21 +0800)] 
super-intel.c: fix memleak in find_disk_attached_hba()

disk_path is allocated by diskfd_to_devpath(), so we need to
free(disk_path) before returning; otherwise there will be a memory leak.

Signed-off-by: Wu Guanghao <wuguanghao3@huawei.com>
Reviewed-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
14 months agoisuper-intel.c: fix double free in load_imsm_mpb()
Wu Guanghao [Fri, 3 Mar 2023 16:21:33 +0000 (00:21 +0800)] 
isuper-intel.c: fix double free in load_imsm_mpb()

In load_imsm_mpb() there is a potential double free issue on super->buf.

The first location to free super->buf is from get_super_block() <==
load_and_parse_mpb() <== load_imsm_mpb():
 4514         if (posix_memalign(&super->migr_rec_buf, MAX_SECTOR_SIZE,
 4515             MIGR_REC_BUF_SECTORS*MAX_SECTOR_SIZE) != 0) {
 4516                 pr_err("could not allocate migr_rec buffer\n");
 4517                 free(super->buf);
 4518                 return 2;
 4519         }

If the above error condition happens, super->buf is freed and value 2
is returned to get_super_block() eventually. Then in the following code
block inside load_imsm_mpb(),
 5289  error:
 5290         if (!err) {
 5291                 s->next = *super_list;
 5292                 *super_list = s;
 5293         } else {
 5294                 if (s)
 5295                         free_imsm(s);
 5296                 close_fd(&dfd);
 5297         }
at line 5295 when free_imsm() is called, super->buf is freed again from
the call chain free_imsm() <== __free_imsm(), in the following code block,
 4651         if (super->buf) {
 4652                 free(super->buf);
 4653                 super->buf = NULL;
 4654         }

This patch sets super->buf as NULL after line 4517 in load_imsm_mpb()
to avoid the potential double free().
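
In isolation, the free-then-clear idiom looks like this. The struct and
function names are hypothetical stand-ins for the IMSM code:

```c
#include <stdlib.h>

struct super_sketch {
	void *buf;
};

/* Error path: release the buffer and clear the pointer so that a later
 * cleanup pass cannot free it a second time. */
static int load_fail_sketch(struct super_sketch *super)
{
	free(super->buf);
	super->buf = NULL;
	return 2;
}

/* General cleanup: safe to call after load_fail_sketch(). */
static void free_super_sketch(struct super_sketch *super)
{
	if (super->buf) {
		free(super->buf);
		super->buf = NULL;
	}
}
```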

(Coly Li helps to re-compose the commit log)

Signed-off-by: Wu Guanghao <wuguanghao3@huawei.com>
Reviewed-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
14 months agoDetail.c: fix memleak in Detail()
Wu Guanghao [Fri, 3 Mar 2023 16:21:32 +0000 (00:21 +0800)] 
Detail.c: fix memleak in Detail()

char *sysdev is allocated by xstrdup() but never freed inside the for
loop, which causes a memory leak.

Signed-off-by: Wu Guanghao <wuguanghao3@huawei.com>
Acked-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
14 months agoutil.c: fix memleak in parse_layout_faulty()
Wu Guanghao [Fri, 3 Mar 2023 16:21:31 +0000 (00:21 +0800)] 
util.c: fix memleak in parse_layout_faulty()

char *m is allocated by xstrdup() but not freed before return, which
causes a memory leak.

Signed-off-by: Wu Guanghao <wuguanghao3@huawei.com>
Acked-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
14 months agoutil.c: reorder code lines in parse_layout_faulty()
Coly Li [Fri, 3 Mar 2023 16:21:30 +0000 (00:21 +0800)] 
util.c: reorder code lines in parse_layout_faulty()

Reorder the code lines in parse_layout_faulty() to make it more
readable; no logic change.

Signed-off-by: Coly Li <colyli@suse.de>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
14 months agoMdmonitor: Refactor check_one_sharer() for better error handling
Mateusz Grzonka [Thu, 2 Feb 2023 11:27:04 +0000 (12:27 +0100)] 
Mdmonitor: Refactor check_one_sharer() for better error handling

Also check if autorebuild.pid is a symlink, which we shouldn't accept.

Signed-off-by: Mateusz Grzonka <mateusz.grzonka@intel.com>
Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
14 months agoMdmonitor: Refactor write_autorebuild_pid()
Mateusz Grzonka [Thu, 2 Feb 2023 11:27:03 +0000 (12:27 +0100)] 
Mdmonitor: Refactor write_autorebuild_pid()

Add better error handling and check for symlinks when opening MDMON_DIR.
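
A symlink check of this kind is commonly written with lstat(). The sketch
below uses a hypothetical helper name and is not the helper added to
mdadm:

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/stat.h>

/* Returns 1 if path is a symbolic link, 0 otherwise (including on error).
 * lstat() is used because stat() would follow the link. */
static int is_symlink_sketch(const char *path)
{
	struct stat st;

	if (lstat(path, &st) != 0)
		return 0;
	return S_ISLNK(st.st_mode) ? 1 : 0;
}
```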

Signed-off-by: Mateusz Grzonka <mateusz.grzonka@intel.com>
Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
14 months agoAdd helpers to determine whether directories or files are soft links
Mateusz Grzonka [Thu, 2 Feb 2023 11:27:02 +0000 (12:27 +0100)] 
Add helpers to determine whether directories or files are soft links

Signed-off-by: Mateusz Grzonka <mateusz.grzonka@intel.com>
Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
14 months agoMdmonitor: Add helper functions
Mateusz Grzonka [Thu, 2 Feb 2023 11:27:01 +0000 (12:27 +0100)] 
Mdmonitor: Add helper functions

Add functions:
- is_email_event(),
- get_syslog_event_priority(),
- sprint_event_message(),
with kernel style comments containing more detailed descriptions.

Also update event syslog priorities to be consistent with the man page.
The MoveSpare event was described in the man page as priority info, while
implemented as warning. Move event data into a struct, so that it is
passed between different functions if needed.
Sort function declarations alphabetically and remove redundant alert()
declaration.

Signed-off-by: Mateusz Grzonka <mateusz.grzonka@intel.com>
Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
14 months agoMdmonitor: Pass events to alert() using enums instead of strings
Mateusz Grzonka [Thu, 2 Feb 2023 11:27:00 +0000 (12:27 +0100)] 
Mdmonitor: Pass events to alert() using enums instead of strings

Add events enum, and mapping_t struct, that maps them to strings, so
that enums are passed around instead of strings.
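
A reduced sketch of the enum-plus-map approach. The type and function
names are hypothetical and only two events are listed:

```c
#include <stddef.h>

/* Hypothetical subset of monitor events. */
enum event_sketch {
	EVENT_SKETCH_FAIL,
	EVENT_SKETCH_REBUILD_STARTED,
	EVENT_SKETCH_UNKNOWN
};

struct event_map_sketch {
	const char *name;
	enum event_sketch num;
};

static const struct event_map_sketch event_names[] = {
	{ "Fail", EVENT_SKETCH_FAIL },
	{ "RebuildStarted", EVENT_SKETCH_REBUILD_STARTED },
	{ NULL, EVENT_SKETCH_UNKNOWN }	/* sentinel */
};

/* Translate an enum value to its string only when text is needed. */
static const char *event_name_sketch(enum event_sketch ev)
{
	const struct event_map_sketch *m;

	for (m = event_names; m->name; m++)
		if (m->num == ev)
			return m->name;
	return NULL;
}
```

Callers compare and switch on the enum value, and the string is looked up
only at the point where a message is actually produced.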

Signed-off-by: Mateusz Grzonka <mateusz.grzonka@intel.com>
Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
14 months agoMdmonitor: Make alert_info global
Mateusz Grzonka [Thu, 2 Feb 2023 11:26:59 +0000 (12:26 +0100)] 
Mdmonitor: Make alert_info global

Move information about --test flag and hostname into alert_info.

Signed-off-by: Mateusz Grzonka <mateusz.grzonka@intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
14 months agoFix NULL dereference in super_by_fd
Li Xiao Keng [Mon, 27 Feb 2023 03:12:07 +0000 (11:12 +0800)] 
Fix NULL dereference in super_by_fd

When we create 100 partitions (major is 259 not 254) in a raid device,
mdadm may coredump:

Core was generated by `/usr/sbin/mdadm --detail --export /dev/md1p7'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  __strlen_avx2_rtm () at ../sysdeps/x86_64/multiarch/strlen-avx2.S:74
74 VPCMPEQ (%rdi), %ymm0, %ymm1
(gdb) bt
#0  __strlen_avx2_rtm () at ../sysdeps/x86_64/multiarch/strlen-avx2.S:74
#1  0x00007fbb9a7e4139 in __strcpy_chk (dest=dest@entry=0x55d55d6a13ac "", src=0x0, destlen=destlen@entry=32) at strcpy_chk.c:28
#2  0x000055d55ba1766d in strcpy (__src=<optimized out>, __dest=0x55d55d6a13ac "") at /usr/include/bits/string_fortified.h:79
#3  super_by_fd (fd=fd@entry=3, subarrayp=subarrayp@entry=0x7fff44dfcc48) at util.c:1289
#4  0x000055d55ba273a6 in Detail (dev=0x7fff44dfef0b "/dev/md1p7", c=0x7fff44dfe440) at Detail.c:101
#5  0x000055d55ba0de61 in misc_list (c=<optimized out>, ss=<optimized out>, dump_directory=<optimized out>, ident=<optimized out>, devlist=<optimized out>) at mdadm.c:1959
#6  main (argc=<optimized out>, argv=<optimized out>) at mdadm.c:1629

The direct cause is fd2devnm returning NULL, so add a check.
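
The shape of such a check, as a sketch. copy_devnm_sketch() is a
hypothetical helper, and the length test is an extra precaution that the
commit does not mention:

```c
#include <stddef.h>
#include <string.h>

/* Copy a device name into a fixed buffer, refusing a NULL source (as a
 * failed lookup would produce) and names that do not fit.
 * Returns 0 on success, -1 otherwise. */
static int copy_devnm_sketch(char *dest, size_t destlen, const char *devnm)
{
	if (devnm == NULL || strlen(devnm) >= destlen)
		return -1;
	strcpy(dest, devnm);
	return 0;
}
```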

Signed-off-by: Li Xiao Keng <lixiaokeng@huawei.com>
Signed-off-by: Wu Guang Hao <wuguanghao3@huawei.com>
Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
14 months agoGrow: fix can't change bitmap type from none to clustered.
Heming Zhao [Thu, 23 Feb 2023 14:39:39 +0000 (22:39 +0800)] 
Grow: fix can't change bitmap type from none to clustered.

Commit a042210648ed ("disallow create or grow clustered bitmap with
writemostly set") introduced this bug. We should use 'true' logic not
'== 0' to deny setting up clustered array under WRITEMOSTLY condition.

How to trigger:

```
~/mdadm # ./mdadm -Ss && ./mdadm --zero-superblock /dev/sd{a,b}
~/mdadm # ./mdadm -C /dev/md0 -l mirror -b clustered -e 1.2 -n 2 \
/dev/sda /dev/sdb --assume-clean
mdadm: array /dev/md0 started.
~/mdadm # ./mdadm --grow /dev/md0 --bitmap=none
~/mdadm # ./mdadm --grow /dev/md0 --bitmap=clustered
mdadm: /dev/md0 disks marked write-mostly are not supported with clustered bitmap
```

Signed-off-by: Heming Zhao <heming.zhao@suse.com>
Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
15 months agoRevert "mdadm/systemd: remove KillMode=none from service file"
Mariusz Tkaczyk [Thu, 2 Feb 2023 07:56:31 +0000 (08:56 +0100)] 
Revert "mdadm/systemd: remove KillMode=none from service file"

This reverts commit 52c67fcdd6dadc4138ecad73e65599551804d445.

The functionality is marked as deprecated but we don't have an alternative
solution yet. Shutdown hangs if the OS is installed on an external array:

task:umount state:D stack: 0 pid: 6285 ppid: flags:0x00004084
Call Trace:
__schedule+0x2d1/0x830
? finish_wait+0x80/0x80
schedule+0x35/0xa0
md_write_start+0x14b/0x220
? finish_wait+0x80/0x80
raid1_make_request+0x3c/0x90 [raid1]
md_handle_request+0x128/0x1b0
md_make_request+0x5b/0xb0
generic_make_request_no_check+0x202/0x330
submit_bio+0x3c/0x160

Keep using it until a new solution is implemented.

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
16 months agomanage: move comment with function description
Kinga Tanska [Thu, 5 Jan 2023 05:31:25 +0000 (06:31 +0100)] 
manage: move comment with function description

Move the function description from the function body to outside
to obey kernel coding style.

Signed-off-by: Kinga Tanska <kinga.tanska@intel.com>
Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
16 months agosuper-intel: make freesize not required for chunk size migration
Kinga Tanska [Fri, 28 Oct 2022 02:51:17 +0000 (04:51 +0200)] 
super-intel: make freesize not required for chunk size migration

Freesize needs to be set for migrations where the size of the RAID could
be changed (expand). It tells how much free space is determined for
members. In chunk size migration freesize does not need to be set, so the
pointer shouldn't be checked for existence. This commit moves the check
into the condition which contains size calculations, instead of always
performing it at the first step.
Fix return value when superblock is not set.

Signed-off-by: Kinga Tanska <kinga.tanska@intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
16 months agoincremental, manage: do not verify if remove is safe
Kinga Tanska [Tue, 27 Dec 2022 05:50:43 +0000 (06:50 +0100)] 
incremental, manage: do not verify if remove is safe

Function is_remove_safe() was introduced to verify that removing a
member device won't cause a failed state of the array. This
verification should be used only with the set-faulty command. Add a
special mode indicating that Incremental removal was executed.
If this mode is used, do not execute the is_remove_safe() routine.

Signed-off-by: Kinga Tanska <kinga.tanska@intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
16 months agoManage: do not check array state when drive is removed
Kinga Tanska [Tue, 27 Dec 2022 05:50:42 +0000 (06:50 +0100)] 
Manage: do not check array state when drive is removed

Array state doesn't need to be checked when a drive is
removed, but until now a clean state was required. The result
of the is_remove_safe() function will now be independent
of the array state.

Signed-off-by: Kinga Tanska <kinga.tanska@intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
16 months agomdadm/udev: Don't handle change event on raw devices
Xiao Ni [Wed, 4 Jan 2023 16:29:20 +0000 (00:29 +0800)] 
mdadm/udev: Don't handle change event on raw devices

The raw devices are ready when the add event happens and the raid
can be assembled. So there is no need to handle change events,
and handling them can cause some inconvenient problems.

For example, the OS is installed on md0 (/root) and md1 (/home).
md0 and md1 are created on partitions. When the user wants to re-install
the OS, anaconda can't clear the storage configuration. It deletes one
partition and does some jobs. The change event happens. Now
the raid device is assembled again, so anaconda can't delete the other
partitions.

So in this patch, we don't handle change event on raw devices
anymore.

Signed-off-by: Xiao Ni <xni@redhat.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
16 months agoutil: remove obsolete code from get_md_name
Mateusz Kusiak [Mon, 2 Jan 2023 08:46:22 +0000 (09:46 +0100)] 
util: remove obsolete code from get_md_name

get_md_name() is used only with mdstat entries.
Remove dead code and simplify the function.

Remove redundant checks from mdmon.c.

Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
16 months agomdmon: fix segfault
Mateusz Kusiak [Mon, 2 Jan 2023 08:46:21 +0000 (09:46 +0100)] 
mdmon: fix segfault

Mdmon crashes if stat2devnm returns null.
Use open_mddev to check if device is mddevice and get name using
fd2devnm.
Refactor container name handling.

Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
16 months agoChange char* to enum in context->update & refactor code
Mateusz Kusiak [Mon, 2 Jan 2023 08:35:24 +0000 (09:35 +0100)] 
Change char* to enum in context->update & refactor code

Storing the update option in a string is bad for frequent comparisons and
error prone.
Replace the char array with an enum so that the already existing enum is
passed around instead of a string.
Adapt code to changes.

Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
16 months agoManage&Incremental: code refactor, string to enum
Mateusz Kusiak [Mon, 2 Jan 2023 08:35:23 +0000 (09:35 +0100)] 
Manage&Incremental: code refactor, string to enum

Prepare Manage and Incremental for later changing context->update to enum.
Change update from string to enum in multiple functions and pass enum
where already possible.

Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
16 months agoChange update to enum in update_super and update_subarray
Mateusz Kusiak [Mon, 2 Jan 2023 08:35:22 +0000 (09:35 +0100)] 
Change update to enum in update_super and update_subarray

Use already existing enum, change update_super and update_subarray
update to enum globally.
Refactor function references also.
Remove code specific options from update_options.

Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
16 months agosuper-intel: refactor the code for enum
Mateusz Kusiak [Mon, 2 Jan 2023 08:35:21 +0000 (09:35 +0100)] 
super-intel: refactor the code for enum

It prepares super-intel for changing context->update to enum.

Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
16 months agosuper1: refactor the code for enum
Mateusz Kusiak [Mon, 2 Jan 2023 08:35:20 +0000 (09:35 +0100)] 
super1: refactor the code for enum

It prepares update_super1 for changing context->update to enum.
Change if else statements into switch.

Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
16 months agosuper0: refactor the code for enum
Mateusz Kusiak [Mon, 2 Jan 2023 08:35:19 +0000 (09:35 +0100)] 
super0: refactor the code for enum

It prepares update_super0 for changing context->update to enum.
Change if else statements to switch.

Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
16 months agosuper-ddf: Remove update_super_ddf.
Mateusz Kusiak [Mon, 2 Jan 2023 08:35:18 +0000 (09:35 +0100)] 
super-ddf: Remove update_super_ddf.

This is not supported by ddf.
It hides errors by returning success status for some updates.
Remove update_super_ddf().

Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
16 months agoAdd code specific update options to enum.
Mateusz Kusiak [Mon, 2 Jan 2023 08:35:17 +0000 (09:35 +0100)] 
Add code specific update options to enum.

Some update options aren't taken from user input, but are hard-coded
as strings.
Include those options in enum.

Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
16 months agoFix --update-subarray on active volume
Mateusz Kusiak [Mon, 2 Jan 2023 08:35:16 +0000 (09:35 +0100)] 
Fix --update-subarray on active volume

Options: bitmap, ppl and name should not be updated when array is active.
Those features are mutually exclusive and share the same data area in IMSM
(danger of overwriting by kernel).
Remove check for active subarrays from super-intel.
Since ddf is not supported, apply it globally for all options.

Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
16 months agomdadm: Add option validation for --update-subarray
Mateusz Kusiak [Mon, 2 Jan 2023 08:35:15 +0000 (09:35 +0100)] 
mdadm: Add option validation for --update-subarray

The subset of options available for "--update" is not the same as for
"--update-subarray".
Define maps and enum for update options and use them instead of direct
comparisons.
Add proper error message.

Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>