Mariusz Tkaczyk [Tue, 25 Jun 2024 10:53:46 +0000 (12:53 +0200)]
Incremental: support devnode in IncrementalRemove.
There are no reasons to keep this interface different than others.
Allow to use devnode but keep old way for backward compatibility.
Method is added to verify that only devnode or kernel name is used.
Anna Sztukowska [Thu, 8 Aug 2024 15:02:38 +0000 (17:02 +0200)]
Examine.c: Fix memory leaks in Examine()
Fix memory leaks in Examine() reported by SAST analysis. Implement a
method to traverse and free all the nodes of the doubly linked list.
Replace for loop with while loop in order to improve readability of the
code and free allocated memory correctly.
Signed-off-by: Anna Sztukowska <anna.sztukowska@intel.com>
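A minimal sketch of the traversal described above, assuming a generic doubly linked list. The node type and the helper names (struct node, node_append, list_free_all) are hypothetical and are not taken from Examine.c; the point is only that the next pointer is saved before each free() inside a while loop.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical node type; the real list used in Examine() is not shown here. */
struct node {
	struct node *prev;
	struct node *next;
	int value;
};

/* Append a new node after tail (tail may be NULL). Returns the new node. */
static struct node *node_append(struct node *tail, int value)
{
	struct node *n = calloc(1, sizeof(*n));

	if (!n)
		return NULL;
	n->value = value;
	n->prev = tail;
	if (tail)
		tail->next = n;
	return n;
}

/* Free every node from head onwards. Returns the number of nodes freed. */
static size_t list_free_all(struct node *head)
{
	size_t freed = 0;

	while (head) {
		/* Save the successor before the current node is released. */
		struct node *next = head->next;

		free(head);
		head = next;
		freed++;
	}
	return freed;
}
```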
If reshape (eg. chunksize migration) is gracefully stopped via SIGTERM
the checkpoint is not saved and reshape cannot be resumed due to "data
being present in copy area". This is because UNIT_SRC_NORMAL isn't set
if SIGTERM occurred.
Move SIGTERM handling at the end of the loop to allow saving checkpoint
(and state) so reshapes can be properly resumed.
mdadm: Increase number limit in md device name to 1024.
Updated the maximum device number in md device names from 127 to 1024.
The previous limit was causing issues in the automation framework.
This change ensures backward compatibility and allows for future
scalability.
Mariusz Tkaczyk [Thu, 22 Aug 2024 10:18:06 +0000 (12:18 +0200)]
imsm: add IMSM_OROM_CAPABILITIES_TPV to nvme orom
Add it to avoid excluding. It has some value for users even if it is
always true for nvme virtual orom.
Rework detail-platform printing code, move printing 3rd party nvmes
to print_imsm_capability (as it should be), but keep it meaningful
only for nvme controllers (NVME and VMD hba types). Pass whole
orom_entry instead of orom there.
Squash code responsible for printing NVME and VMD hbas.
Mariusz Tkaczyk [Thu, 8 Aug 2024 11:07:50 +0000 (13:07 +0200)]
imsm: get bus from VMD driver directory
Enumeration of VMD child devices is started early, kernel is not waiting
for VMD enumeration to finish. As a result, the link
/sys/bus/pci/drivers/vmd/{dev}/domain/device might not be ready yet.
With PCI gen5 devices we can observe that mdadm is failing to start IMSM
raid arrays because of that. In that case, it needs to find bus path
manually.
Look for bus device in VMD driver directory if realpath() failed with
ENOENT.
EFI vars depend on userspace; they need to be mounted to be accessible.
Sporadic problems have been observed with availability at an early
assemble stage. It is not possible to fully synchronize EFI vars mounts
with udev rules processing.
For the reason above, reading the IMSM OROM from ACPI tables is added as
a secondary option. This method will be used for SATA and VMD family
controllers.
ACPI tables are generated by sysfs, earlier in the boot process, before
the stage of RAID assembly. The way of loading OROM via EFI vars is
retained, ACPI tables will be a backup way.
Two paths will be maintained, because IMSM hardware capabilities are
necessary for RAID assembly during booting, so access to them must be
provided.
Nigel Croxon [Wed, 7 Aug 2024 15:33:23 +0000 (11:33 -0400)]
mdadm: util.c fix coverity issues
Fixing the following coding errors the coverity tools found:
* Event check_return: Calling "open" without checking return value
* Event check_return: Calling "lseek(fd, sector_size, 0)" without
checking return value.
* Event leaked_handle: Handle variable "fd" going out of scope leaks
the handle.
* Event leaked_storage: Variable "dir" going out of scope leaks the
storage it points to.
* Event fixed_size_dest: You might overrun the 32-character fixed-size
string "st->devnm" by copying "_devnm" without checking the length.
* Event fixed_size_dest: You might overrun the 32-character fixed-size
string "container" by copying "dev" without checking the length.
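One common way to address fixed_size_dest findings like the two above is a bounded copy that reports truncation. The sketch below is an assumption about the general technique, not the actual change made in util.c; DEVNM_LEN and copy_devnm are hypothetical names, with the size chosen to match the 32-character buffers named in the report.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical constant matching the 32-character buffers in the report. */
#define DEVNM_LEN 32

/*
 * Copy src into a fixed-size buffer without overrunning it.
 * Returns 0 on success, -1 if src did not fit (dest is then truncated
 * but still NUL-terminated).
 */
static int copy_devnm(char dest[DEVNM_LEN], const char *src)
{
	int n;

	assert(dest != NULL && src != NULL);
	n = snprintf(dest, DEVNM_LEN, "%s", src);
	return (n < 0 || n >= DEVNM_LEN) ? -1 : 0;
}
```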
Mariusz Tkaczyk [Tue, 6 Aug 2024 14:11:18 +0000 (16:11 +0200)]
mdstat: fix list detach issues
Move ent = ent->next; into the while loop. It was outside the loop, so if
there are more than 2 elements and we are looking for the 3rd element, it
causes an infinite loop.
Fix el->next zeroing. It causes a segfault in mdstat_free(). These
issues were not visible in my testing because I had only 2 MD devices.
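A sketch of a list detach that avoids both problems, assuming a singly linked list of named entries. struct ent and list_detach are hypothetical and simplified; the real mdstat code differs. The cursor advances inside the loop, and only the detached entry has its next pointer cleared, so the remaining list stays intact for a later free.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified entry; not the real struct mdstat_ent. */
struct ent {
	struct ent *next;
	const char *name;
};

/* Detach the entry called name from *list. Returns it, or NULL if absent. */
static struct ent *list_detach(struct ent **list, const char *name)
{
	struct ent **link = list;

	assert(list != NULL && name != NULL);
	while (*link) {
		struct ent *e = *link;

		if (strcmp(e->name, name) == 0) {
			/* Unlink e, then clear only the detached entry's pointer. */
			*link = e->next;
			e->next = NULL;
			return e;
		}
		/* Advance inside the loop so every element is visited once. */
		link = &e->next;
	}
	return NULL;
}
```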
Grow_reshape: set only component_size for size grow
Component_size couldn't be set using ioctl when new drive size is big
(e.g. 5TB). Command value is bigger than 32 bits and error is reported
- it is known ioctl limitation. Remove updating array properties using
ioctl, use sysfs instead. Sysfs was introduced in 3.10, so now it is old
enough to be safely used. Array_size in sysfs should be set for every
size change for external metadata, when grow is performed without
errors.
Add new released gcc to compilation test during GH action.
Change runner to Ubuntu 24.04 which supports gcc versions up to 14.
Previously ubuntu-latest was used (22.04) which didn't support gcc 13
and 14. Add verification if correct gcc was installed during test.
Kinga Stefaniuk [Tue, 6 Aug 2024 09:14:02 +0000 (11:14 +0200)]
super-intel: fix compilation error
Fix compilation error:
super-intel.c: In function ‘end_migration’:
super-intel.c:4360:29: error: writing 2 bytes into a region
of size 0 [-Werror=stringop-overflow=]
4360 | dev->vol.migr_state = 0;
| ~~~~~~~~~~~~~~~~~~~~^~~
cc1: note: destination object is likely at address zero
cc1: all warnings being treated as errors
make: *** [Makefile:232: super-intel.o] Error 1
reported, when GCC 14 is used. Return when dev is NULL, to avoid it.
Anna Sztukowska [Thu, 11 Jul 2024 12:31:57 +0000 (14:31 +0200)]
policy.c: Fix check_return issue in Write_rules()
Refactor Write_rules() in policy.c to eliminate check_return issue found
by SAST analysis. Create udev rules file directly using rule_name
instead of creating temporary file and renaming it.
Signed-off-by: Anna Sztukowska <anna.sztukowska@intel.com>
- add imsm_chunk_ops struct for better code readability,
- move chunk size mapping to string into array,
- add function to print supported chunk sizes by IMSM controller.
To avoid repeating mdstat_read() in IncrementalRemove(), new function
mdstat_find_by_member_name() has been proposed. With that,
IncrementalRemove() handles own copy of mdstat content and there is no
need to repeat reading for external stop.
Additionally, a few helpers are proposed to avoid repeating
mdstat_ent->metadata_version checks across the code.
It is essential to avoid vulnerabilities in code as much
as possible using safe compilation flags. It is easier if
they are added to the Makefile and applied during compilation.
Add new gcc flags and make them configurable, because they
may not be supported for some compilers.
Set FORTIFY_SOURCE with the highest supported value for platform.
super0: use define for char array in examine_super0
Using nb with 11 length may cause format-truncation errors,
because it was possible to use snprintf with 12 length input
and write it to 11 length output. Added new define and use it
to avoid this error.
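A sketch of the idea, under the assumption that the buffer holds a formatted 32-bit int: the longest such value, "-2147483648", needs 11 characters plus the terminating NUL, so 12 bytes. NB_LEN and format_nb are hypothetical names, not the define added to super0.c.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical define: 11 characters for "-2147483648" plus the NUL. */
#define NB_LEN 12

/* Format value into out using one shared size for buffer and snprintf. */
static int format_nb(char out[NB_LEN], int value)
{
	int n = snprintf(out, NB_LEN, "%d", value);

	/* With a 32-bit int the result always fits in NB_LEN bytes. */
	assert(n > 0 && n < NB_LEN);
	return n;
}
```

Using one define for both the array declaration and the snprintf() size keeps the two from drifting apart.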
Based on documentation SCSI Primary Commands - 4 (SPC-4) only first 7 bits
of first byte in sense data are used to store response code. The current
verification uses all 8 bits for comparison of response code.
Incorrect verification may make it impossible to use SATA disks with IMSM,
because IMSM requires verification of the encryption state before use.
There was an issue in kernel libata [1]. That issue hid the bug in mdadm
because the last bit was not set.
Example output with affected mdadm:
Port3 : /dev/sde (BTPR212503EK120LGN)
mdadm: Failed ata passthrough12 ioctl. Device: /dev/sde.
mdadm: Failed to get drive encryption information
The fix is to use the first 7 bits of Byte 0 to compare with the expected
values.
The mentioned commit (see Fixes) caused devices with UUID
equal to uuid_zero to not be recognized properly. With a few devices,
the first one was always taken and the same information was
printed. This caused a regression: when a few containers were created,
symlinks were generated only for the first one.
Add checking if uuid is uuid_zero and, if yes, use devname to
differentiate devices.
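A sketch of the comparison, assuming a UUID stored as four ints. struct dev_id, same_dev and uuid_zero_example are hypothetical; the real lookup in mdadm is not shown in this log. When either UUID is all zeros it carries no identity, so the device name decides.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical all-zero UUID used for comparison in this sketch. */
static const int uuid_zero_example[4] = { 0, 0, 0, 0 };

/* Hypothetical identity record for one MD device. */
struct dev_id {
	int uuid[4];
	const char *devname;
};

/* Compare by UUID, falling back to devname when a UUID is all zeros. */
static bool same_dev(const struct dev_id *a, const struct dev_id *b)
{
	assert(a != NULL && b != NULL);
	if (memcmp(a->uuid, uuid_zero_example, sizeof(a->uuid)) == 0 ||
	    memcmp(b->uuid, uuid_zero_example, sizeof(b->uuid)) == 0)
		return strcmp(a->devname, b->devname) == 0;
	return memcmp(a->uuid, b->uuid, sizeof(a->uuid)) == 0;
}
```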
GH action is using the checkout plugin, which takes fetch-depth
as a parameter to specify the number of commits to fetch. Set it
to 0 to fetch the full history of all branches and tags.
Do not allow '.' as the first character of a named MD device.
Having leading dot might be confusing, MD device cannot be hidden.
It also removes possibility to create md device with name '.'.
Fixing the following coding errors the coverity tools found:
* Event parameter_hidden: declaration hides parameter "dv".
* Event leaked_storage: Variable "mdi" going out of scope leaks the storage
it points to.
* Event overwrite_var: Overwriting "mdi" in "mdi = mdi->devs" leaks the
storage that "mdi" points to.
* Event leaked_handle: Handle variable "lfd" going out of scope leaks
the handle.
* Event leaked_handle: Returning without closing handle "fd" leaks it.
* Event fixed_size_dest: You might overrun the 32-character fixed-size
string "devnm" by copying the return value of "fd2devnm" without
checking the length.
* Event fixed_size_dest: You might overrun the 32-character fixed-size
string "nm" by copying "nmp" without checking the length.
* Event fixed_size_dest: You might overrun the 32-character fixed-size
string "devnm" by copying the return value of "fd2devnm" without
checking the length.
* Event assigned_value: Assigning value "-1" to "tfd" here, but that
stored value is overwritten before it can be used.
mdadm/clustermd_tests: adjust test cases to support md module changes
Since kernel commit db5e653d7c9f ("md: delay choosing sync action to
md_start_sync()") delays the start of the sync action, clustermd
array sync/resync jobs can happen on any leg of the array. This
commit adjusts the test cases to follow the new kernel layer behavior.
Fixing the following coding errors the coverity tools found:
* Calling "lseek64" without checking return value. This library function may
fail and return an error code.
* Overrunning array "anchor->pad2" of 3 bytes by passing it to a function
which accesses it at byte offset 398 using argument "399UL".
* Event leaked_storage: Variable "sra" going out of scope leaks the storage
it points to.
* Event leaked_storage: Variable "super" going out of scope leaks the storage
it points to.
* Event leaked_handle: Handle variable "dfd" going out of scope leaks the
handle.
* Event leaked_storage: Variable "dl1" going out of scope leaks the storage
it points to
* Event leaked_handle: Handle variable "cfd" going out of scope leaks the
handle.
* Variable "avail" going out of scope leaks the storage it points to.
* Passing unterminated string "super->anchor.revision" to "fprintf", which
expects a null-terminated string.
* You might overrun the 32-character fixed-size string "st->container_devnm"
by copying the return value of "fd2devnm" without checking the length.
* Event fixed_size_dest: You might overrun the 33-character fixed-size string
"dev->name" by copying "(*d).devname" without checking the length.
* Event uninit_use_in_call: Using uninitialized value "info.array.raid_disks"
when calling "getinfo_super_ddf"
V2: clean up validate_geometry_ddf() routine with Mariusz Tkaczyk recommendations.
V3: clean up spaces with Blazej Kucman recommendations.
V4: clean up recommended by Mariusz Tkaczyk.
V5: clean up recommended by Mariusz Tkaczyk.
* Event negative_returns: "fd" is passed to a parameter that cannot be negative. Which
is set to -1 to start.
* Event open_fn: Returning handle opened by "open_dev_excl".
* Event var_assign: Assigning: "container_fd" = handle returned from
"open_dev_excl(st->container_devnm)"
* Event leaked_handle: Handle variable "container_fd" going out of scope leaks the handle
CI: use prepared checkpatch.conf file only for GH actions
Configuration file .checkpatch.conf is working properly only with
GH actions, because flags from GH plugin are used there. This file
shall not be placed in main repo directory, because it causes errors
while using checkpatch from Linux. Add step to review.yml to copy
this file before checkpatch action is started.
mdadm: Fix socket connection failure when mdmon runs in foreground mode.
While creating an IMSM RAID, mdadm will wait for the mdmon main process
to finish if mdmon runs in forking mode. This is because with
"Type=forking" in the mdmon service unit file, "systemctl start service"
will block until the main process of mdmon exits. At that moment, mdmon
has already created the socket, so the subsequent socket connect from
mdadm will succeed.
However, when mdmon runs in foreground mode (without "Type=forking" in
the service unit file), "systemctl start service" will return once the
mdmon process starts. This causes mdadm and mdmon to run in parallel,
which may lead to a socket connection failure since mdmon has not yet
initialized the socket when mdadm tries to connect. If the next
instruction/command is to access this device and try to write to it, a
permission error will occur since mdmon has not yet set the array to RW
mode.
Kinga Stefaniuk [Tue, 25 Jun 2024 08:48:33 +0000 (10:48 +0200)]
CI: fix excluded files in checkpatch.conf
--exclude flag in checkpatch.conf is configured to work on directories
only. When checkpatch.conf contains files, checkpatch scan is not started.
Remove file names and keep only directories which should be excluded.
Nigel Croxon [Tue, 25 Jun 2024 11:57:28 +0000 (07:57 -0400)]
mdadm: Assemble.c fix coverity issues
Fixing the following coding errors the coverity tools found:
* Event dereference: Dereferencing "pre_exist", which is known to be "NULL".
* Event parameter_hidden: Declaration hides parameter "c".
* Event leaked_storage: Variable "pre_exist" going out of scope leaks the
storage it points to.
* Event leaked_storage: Variable "avail" going out of scope leaks the
storage it points to.
connect_monitor() is called from ping_monitor() but this function is often
used as advice, without verification that mdmon is really working. This
produces hangs in many scenarios.
Gwendal Grignou [Wed, 15 May 2024 21:30:59 +0000 (14:30 -0700)]
Makefile: Do not call gcc directly
When mdadm is compiled with clang, calling gcc directly will fail.
Make sure to use $(CC) variable instead.
Note that Clang does not support --help=warnings,
--print-diagnostic-options should be used instead.
So with Clang, the compilation will go through, but the
extra warning flags will never be added.
Mateusz Kusiak [Fri, 15 Mar 2024 20:03:09 +0000 (16:03 -0400)]
test: pass flags to services
Commit 4c12714d1ca0 ("test: run tests on system level mdadm") removed
MDADM_NO_SYSTEMCTL flag from test suite. This causes imsm tests to fail
as mdadm no longer triggers mdmon and flags exists only within session.
Use systemd set/unset-environment to pass necessary flags.
Introduce colors to grab users attention to warnings and key messages.
Make test suite setup systemd environment.
Add setup/clean_systemd_env() functions.
Warn user about altering systemd environment.
Logan Gunthorpe [Tue, 4 Jun 2024 16:38:36 +0000 (10:38 -0600)]
mdadm: Fix hang race condition in wait_for_zero_forks()
Running a create operation with --write-zeros can randomly hang
forever waiting for child processes. This happens roughly one in
ten runs when running with small (20MB) loop devices.
The bug is caused by the fact that signals can be coalesced into
one if they are not read by signalfd quickly enough. So if two children
finish at exactly the same time, only one SIGCHLD will be received
by the parent.
To fix this, wait on all processes with WNOHANG every time a SIGCHLD
is received and exit when all processes have been waited on.
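The reaping step can be sketched as below. This is a simplified illustration, not the code in wait_for_zero_forks(): the signalfd read loop is omitted, and reap_exited_children and spawn_children are hypothetical helpers. The key point is that one notification may stand for several exited children, so waitpid() is called with WNOHANG until it has nothing more to report.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Reap every child that has already exited, without blocking.
 * Intended to run once per SIGCHLD notification. Returns the number
 * of children reaped by this call.
 */
static int reap_exited_children(void)
{
	int reaped = 0;

	for (;;) {
		pid_t pid = waitpid(-1, NULL, WNOHANG);

		if (pid > 0) {
			reaped++;
			continue;
		}
		if (pid < 0 && errno == EINTR)
			continue;
		/* 0: children still running; -1 with ECHILD: none left. */
		break;
	}
	return reaped;
}

/* Test helper: fork n children that exit at once. Returns how many started. */
static int spawn_children(int n)
{
	int started = 0;

	assert(n >= 0);
	for (int i = 0; i < n; i++) {
		pid_t pid = fork();

		if (pid == 0)
			_exit(0);
		if (pid > 0)
			started++;
	}
	return started;
}
```

A caller would keep a count of outstanding children, subtract the return value after each notification, and stop waiting when the count reaches zero.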
Kinga Stefaniuk [Tue, 11 Jun 2024 05:58:49 +0000 (07:58 +0200)]
imsm: make freesize required to volume autolayout
Autolayout_imsm() shall be executed when IMSM_NO_PLATFORM=1 is set.
It was fixed by listed commit, checking super->orom was removed, but
also checking freesize. Freesize is not set for operations on RAID
volume with no size update, that's why it is not required to have
this value and always run autolayout_imsm().
Fix it by making autolayout_imsm() dependent on freesize.
Fixes: 46f192 ("imsm: fix first volume autolayout with IMSM_NO_PLATFORM")
Signed-off-by: Kinga Stefaniuk <kinga.stefaniuk@intel.com>
Mariusz Tkaczyk [Thu, 23 May 2024 10:06:36 +0000 (12:06 +0200)]
imsm: fix first volume autolayout with IMSM_NO_PLATFORM
Autolayout_imsm() is not executed if IMSM_NO_PLATFORM=1 is set.
As a result, the first volume cannot be created: disks for the new volume
are never configured.
Fix it by making autolayout_imsm() independent from super->orom because
NULL there means that IMSM_NO_PLATFORM=1 is set. There are no platform
restrictions to create volume, we just analyze drives. It is safe.
Fixes: 6d4d9ab295de ("imsm: use same slot across container")
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Xiao Ni [Tue, 28 May 2024 13:51:47 +0000 (21:51 +0800)]
mdadm/tests: bitmap cases enhance
It fails because bitmap dirty number is smaller than 400 sometimes. It's not
good to compare bitmap dirty bits with a number. It depends on the test
machine, it can flush soon before checking the number. So remove the related code.
Signed-off-by: Xiao Ni <xni@redhat.com>
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Xiao Ni [Wed, 22 May 2024 08:50:55 +0000 (16:50 +0800)]
mdadm/tests: 07changelevelintr
A power-of-2 array size must be specified when updating the array size.
Otherwise, the chunk size can't be changed.
Also, the test sometimes reports an error that reshape doesn't happen when
in fact the reshape has already finished. There is no need to wait before
checking the reshape action, because the check function waits itself.
Signed-off-by: Xiao Ni <xni@redhat.com>
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Xiao Ni [Wed, 22 May 2024 08:50:53 +0000 (16:50 +0800)]
mdadm/tests: 07autoassemble
This test is used to test stacked array auto assemble.
There are two different cases, depending on whether the array is foreign or not.
If the array is foreign, the stacked array (md0 is on md1 and md2)
can't be assembled with name md0. Because udev rule will run when md1
and md2 are assembled and mdadm -I doesn't specify homehost. So it
will treat stacked array (md0) as foreign array and choose md127 as
the device node name (/dev/md127)
Add the case that stacked array is local.
Signed-off-by: Xiao Ni <xni@redhat.com>
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Xiao Ni [Wed, 22 May 2024 08:50:44 +0000 (16:50 +0800)]
mdadm/tests: 03assem-incr enhance
It fails when hostname length > 32. Because the super1 metadata name
doesn't include hostname when hostname length > 32. Then mdadm thinks
the array is a foreign array if no device link is specified when
assembling the array. It chooses a minor number from 127.
Signed-off-by: Xiao Ni <xni@redhat.com>
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Xiao Ni [Wed, 22 May 2024 08:50:43 +0000 (16:50 +0800)]
mdadm/tests: names_template enhance
For super1, if the length of hostname is >= 32, it doesn't add hostname
in metadata name. Fix this problem by checking the length of hostname.
Because other cases may need to check this, do the check in
do_setup.
And this patch adds a check if link /dev/md/name exists.
Signed-off-by: Xiao Ni <xni@redhat.com>
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Xiao Ni [Wed, 22 May 2024 08:50:41 +0000 (16:50 +0800)]
mdadm/tests: test enhance
There are two changes.
First, if md module is not loaded, it gives error when reading
speed_limit_max. So read the value after loading md module which
is done in do_setup
Second, sometimes the test reports error sync action doesn't
happen. But dmesg shows sync action is done. So limit the sync
speed before test. It doesn't affect the test run time, because
check wait sets the max speed before waiting for the sync action. Also
record speed_limit_max/min in do_setup.
Fixes: 4c12714d1ca0 ('test: run tests on system level mdadm')
Signed-off-by: Xiao Ni <xni@redhat.com>
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Xiao Ni [Wed, 22 May 2024 08:50:39 +0000 (16:50 +0800)]
mdadm: Start update_opt from 0
Before f2e8393bd722 ('Manage&Incremental: code refactor, string to enum'), it uses
NULL to represent it doesn't need to update. So init UOPT_UNDEFINED to 0. This
problem is found by test case 05r6tor0.
Signed-off-by: Xiao Ni <xni@redhat.com>
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Valery Ushakov [Wed, 22 May 2024 14:07:38 +0000 (17:07 +0300)]
Makefile: fix make -s detection
Only check the first word of MAKEFLAGS for 's', that's where all the
single letter options are collected.
MAKEFLAGS contains _all_ make flags, so if any command line argument
contains a letter 's', the silent test will be false positive. Think
e.g. make 'DESTDIR=.../aports/main/mdadm/pkg/mdadm' install
Mariusz Tkaczyk [Fri, 29 Mar 2024 14:21:54 +0000 (15:21 +0100)]
mdadm: deprecate bitmap custom file
This option has been deprecated in kernel by Christoph in commit 0ae1c9d38426 ("md: deprecate bitmap file support"). Do the same in
mdadm.
With this change, user must acknowledge it, it is not
skippable. The implementation of custom bitmap file looks like it's
abandoned. It cannot be done by Incremental so it is not respected by
any udev based system and it seems to not be recorded by metadata.
User must assemble such volume manually.
Tests for bitmap custom file are removed because now they will not
pass because interaction with user is mandatory.
Nigel Croxon [Wed, 22 May 2024 20:53:22 +0000 (16:53 -0400)]
mdadm: super-intel fix bad shift
In the expression "1 << i", left shifting by more than 31 bits has undefined behavior.
The shift amount, "i", is as much as 63. The operand has type "int" (32 bits) and will
be shifted as an "int". The fix is to change to a 64 bit int.
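A minimal sketch of the corrected expression, assuming the goal is a single-bit mask for bit positions 0 through 63. bit64 is a hypothetical helper, not a function from super-intel.c; widening the operand before the shift is what removes the undefined behavior.

```c
#include <assert.h>
#include <stdint.h>

/* Build a mask with only bit i set; valid for i in the range 0..63. */
static uint64_t bit64(unsigned int i)
{
	assert(i < 64);
	/* The operand is 64 bits wide, so shifts up to 63 are well defined. */
	return (uint64_t)1 << i;
}
```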