git.ipfire.org Git - thirdparty/mdadm.git/log
10 years ago Make sure "mdmon" doesn't get called "@dmon".
NeilBrown [Mon, 2 Sep 2013 01:02:09 +0000 (11:02 +1000)] 
Make sure "mdmon" doesn't get called "@dmon".

The Anaconda installer (via its "loader" program) will try to kill
many processes at shutdown, but not "mdmon".

However when mdadm runs mdmon in the Anaconda environment, mdmon
sets argv[0][0] to '@' resulting in "@dmon" which confuses
"loader".

So change mdadm to set argv[0] to a path so that mdmon becomes e.g.
  "@usr/sbin/mdmon"
which "loader" will recognise as being "mdmon".
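A minimal sketch of the renaming described above. The helper names mark_argv0 and last_component are hypothetical and are not taken from mdadm or from Anaconda's "loader"; the sketch only illustrates why a path in argv[0] keeps the final "mdmon" component intact once the first byte becomes '@'.

```c
#include <string.h>

/* Hypothetical helper: copy `path` into `out` and overwrite the first
 * character with '@', mimicking what mdmon does to argv[0][0]. */
static int mark_argv0(const char *path, char *out, size_t outlen)
{
    size_t n = strlen(path);

    if (n == 0 || n >= outlen)
        return -1;
    memcpy(out, path, n + 1);   /* include the terminating NUL */
    out[0] = '@';
    return 0;
}

/* Hypothetical helper: return the last path component of argv[0],
 * which is what a process-name match would typically look at. */
static const char *last_component(const char *argv0)
{
    const char *slash = strrchr(argv0, '/');

    return slash ? slash + 1 : argv0;
}
```

With "/usr/sbin/mdmon" the marked name is "@usr/sbin/mdmon" and its last component is still "mdmon"; with a bare "mdmon" the marked name would be "@dmon".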

Reported-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago Grow: fix hang when growing a RAID5.
NeilBrown [Wed, 28 Aug 2013 07:00:53 +0000 (17:00 +1000)] 
Grow: fix hang when growing a RAID5.

Since:

commit 84d11e6c6a3b827b2daa32e16303235ce33d49f5
Author: NeilBrown <neilb@suse.de>
Date:   Thu Aug 1 11:16:14 2013 +1000

    Grow: exit background thread cleanly on SIGTERM.

removed the setting of "sync_max" from abort_reshape() we need
to do it explicitly here.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago in_initrd: fix gcc compiler error
mwilck@arcor.de [Fri, 16 Aug 2013 18:21:59 +0000 (20:21 +0200)] 
in_initrd: fix gcc compiler error

On some systems, this code caused a "comparison between signed
and unsigned" error.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago DDF: increase default value for safe_mode_delay to 4000ms
mwilck@arcor.de [Fri, 16 Aug 2013 18:21:58 +0000 (20:21 +0200)] 
DDF: increase default value for safe_mode_delay to 4000ms

That is the same value that IMSM uses. The current default of 200ms
seems to have been copied from the native MD meta data. That value
appears to be much too low for DDF, given that writing the DDF meta
data means that easily several MB worth of data need to be written to
disk.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago DDF: container_content_ddf: set safe_mode_delay > 0
mwilck@arcor.de [Fri, 16 Aug 2013 18:21:57 +0000 (20:21 +0200)] 
DDF: container_content_ddf: set safe_mode_delay > 0

Set safe_mode_delay to something >0, otherwise all container subarrays
assembled will have safe_mode_delay=0. That will break the assumption that
meta data becomes clean after running mdadm --wait-clean.

Use the same value as in getinfo_super_ddf_bvd. It would be cleaner
to call that directly from container_content_ddf, but I need to check
possible side effects first.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago DDF: export_examine_super_ddf: print MD_DEVICES
mwilck@arcor.de [Fri, 16 Aug 2013 18:21:56 +0000 (20:21 +0200)] 
DDF: export_examine_super_ddf: print MD_DEVICES

Have mdadm -E --export print the number of RAID devices,
like other meta data formats do. Anaconda (RHEL/CentOS installer)
depends on it.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago DDF: ddf_activate_spare: fix gcc -O2 uninitialized warning
NeilBrown [Fri, 16 Aug 2013 18:21:55 +0000 (20:21 +0200)] 
DDF: ddf_activate_spare: fix gcc -O2 uninitialized warning

At this point 'di' and 'rv' both have the same value.  gcc doesn't
realise that and a human reader might not either.
'rv' makes more sense too, so use that.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago Add ANNOUNCE-3.2.6 from different branch
NeilBrown [Mon, 26 Aug 2013 05:28:43 +0000 (15:28 +1000)] 
Add ANNOUNCE-3.2.6 from different branch

just for completeness...

Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago Add raid6check to .gitignore
NeilBrown [Mon, 26 Aug 2013 05:26:54 +0000 (15:26 +1000)] 
Add raid6check to .gitignore

Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago Change "mdadm --run" to use the same code as "mdadm -IRs".
NeilBrown [Mon, 26 Aug 2013 05:24:53 +0000 (15:24 +1000)] 
Change "mdadm --run" to use the same code as "mdadm -IRs".

Currently "mdadm --run /dev/mdX" will not handle external metadata
properly: mdmon won't be started, etc.

So use the code from "mdadm -IRs" instead - that already does all
the right things.

Reported-by: Francis Moreau <francis.moro@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago super1: fix setting of data_offset for 1.0 metadata.
NeilBrown [Wed, 14 Aug 2013 07:06:22 +0000 (17:06 +1000)] 
super1: fix setting of data_offset for 1.0 metadata.

commit 23bf42cc79d46de019d4b27c16354a191a98ed41
    super1: simplify setting of array size.

removed the setting for sb->data_offset for 1.0 metadata for some reason,
and messed up the size calculation for 1.0 metadata too.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago Fix bug with adding to 0.90 array
NeilBrown [Wed, 14 Aug 2013 05:20:02 +0000 (15:20 +1000)] 
Fix bug with adding to 0.90 array

commit 7ccc4cc4fc6889680bbe4ec673cab3f6aa49aad3
    Manage: remove call to validate_geometry.

used entirely the wrong number for "4TB" !!
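For reference, a small sketch of the arithmetic involved. It assumes "4TB" means 4 TiB and 512-byte sectors; the macro and function names are hypothetical and the actual test in Manage.c may be expressed in different units.

```c
#include <stdint.h>

/* 4 TiB expressed in bytes and in 512-byte sectors (assumed units). */
#define FOUR_TB_BYTES   (4ULL << 40)
#define FOUR_TB_SECTORS (FOUR_TB_BYTES / 512)

/* Hypothetical explicit test: returns 1 when a device of `bytes`
 * bytes is larger than the 4TB that 0.90 metadata can use. */
static int exceeds_4tb(uint64_t bytes)
{
    return bytes > FOUR_TB_BYTES;
}
```

Writing the constant as a shift or a product with an unsigned long long operand avoids both a mistyped literal and 32-bit overflow.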

Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago DDF: ddf_open_new: check device status for new subarray
mwilck@arcor.de [Tue, 6 Aug 2013 21:38:02 +0000 (23:38 +0200)] 
DDF: ddf_open_new: check device status for new subarray

It is possible that mdadm creates a new subarray containing failed
devices. This may happen if a device has failed, but the meta data
containing that information hasn't been written out yet.

This code tests for this situation, and handles it in the monitor.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago tests/10ddf-fail-create-race: test handling of fail/create race
mwilck@arcor.de [Tue, 6 Aug 2013 21:38:01 +0000 (23:38 +0200)] 
tests/10ddf-fail-create-race: test handling of fail/create race

If a disk fails and simultaneously a new array is created, a race
condition may arise because the meta data on disk doesn't reflect
the disk failure yet. This is a test for that case.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago tests/10ddf-fail-spare: more sophisticated result checks
mwilck@arcor.de [Tue, 6 Aug 2013 21:38:00 +0000 (23:38 +0200)] 
tests/10ddf-fail-spare: more sophisticated result checks

This test can succeed two ways, depending on timing.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago tests/10ddf-fail-two-spares: new unit test
mwilck@arcor.de [Wed, 7 Aug 2013 20:38:03 +0000 (22:38 +0200)] 
tests/10ddf-fail-two-spares: new unit test

This is one more unit test for failure/recovery, this time with
double redundancy, which isn't covered by the other tests.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago Create: fix warning about pre-existing filesystems.
NeilBrown [Wed, 7 Aug 2013 23:16:43 +0000 (09:16 +1000)] 
Create: fix warning about pre-existing filesystems.

An ext[234] filesystem larger than 2TB was being reported with
a negative size - which looks odd.

So fix it to use suitably large and unsigned values.
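A sketch of how such a negative size can arise, under the assumption that the size is computed as block count times block size in KiB. Both function names are hypothetical and the code is not taken from mdadm.

```c
#include <stdint.h>

/* Problematic variant: the product is truncated to 32 bits and then
 * read as a signed value. The unsigned-to-signed conversion is
 * implementation-defined; on common two's-complement platforms large
 * values come out negative. */
static int32_t size_kib_32bit(uint32_t blocks, uint32_t block_kib)
{
    return (int32_t)(blocks * block_kib);
}

/* Fixed variant: widen before multiplying and keep the result unsigned. */
static uint64_t size_kib_64bit(uint32_t blocks, uint32_t block_kib)
{
    return (uint64_t)blocks * block_kib;
}
```

For a 3 TiB filesystem with 4 KiB blocks (805306368 blocks), the 64-bit variant yields 3221225472 KiB, while the 32-bit variant typically yields a negative number.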

Reported-by: Jan Engelhardt <jengelh@inai.de>
Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago DDF: Write new conf entries with a single write.
NeilBrown [Wed, 7 Aug 2013 06:59:26 +0000 (16:59 +1000)] 
DDF: Write new conf entries with a single write.

The recent change to skip over invalid conf entries was bad because
it could leave garbage on the disk.
But we don't want to write each entry separately, as the writes are
O_DIRECT and so synchronous, so it takes way too long.

So allocate a large buffer (probably the one used to read the config records)
and fill that then write it all at once.

Reported-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago test: allow LVM volumes or RAM disks as test devices
mwilck@arcor.de [Mon, 5 Aug 2013 20:37:51 +0000 (22:37 +0200)] 
test: allow LVM volumes or RAM disks as test devices

Allow other device types for testing; this allows testing on
a larger variety of devices.

Option --dev=[loop|lvm|ram] selects loop device (default), lvm,
and ram disk, respectively. To use RAM disks with DDF,
the kernel parameter ramdisk_size=65536 must be used.
For LVM, use --volgroup=<vg> to specify the name of the volume
group in which the test LVs will be created.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago DDF: get_extents: don't allocate space on failed disks
mwilck@arcor.de [Mon, 5 Aug 2013 20:37:50 +0000 (22:37 +0200)] 
DDF: get_extents: don't allocate space on failed disks

We should skip known failed disks when allocating space for
new arrays. This fixes the problem with 10ddf-fail-spare.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago tests/10ddf-fail-spare: new unit test
mwilck@arcor.de [Mon, 5 Aug 2013 20:37:49 +0000 (22:37 +0200)] 
tests/10ddf-fail-spare: new unit test

This is Albert Pauw's latest test. Note that this FAILS.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago tests/10ddf-fail-twice: remove hard-coded assumptions
mwilck@arcor.de [Mon, 5 Aug 2013 20:37:48 +0000 (22:37 +0200)] 
tests/10ddf-fail-twice: remove hard-coded assumptions

This test has some randomness because it is not always deterministic
which of the two arrays gets the spare and which remains degraded.
Handle it.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago tests/env-ddf-template: some helper functions
mwilck@arcor.de [Mon, 5 Aug 2013 20:37:47 +0000 (22:37 +0200)] 
tests/env-ddf-template: some helper functions

helper functions to determine the list of devices in an array,
etc.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago Makefile: check that 'run' directory exists.
NeilBrown [Mon, 5 Aug 2013 06:39:45 +0000 (16:39 +1000)] 
Makefile: check that 'run' directory exists.

mdadm defaults to using /run/mdadm.  However not all distros
provide /run yet.  This can confuse people who build their own
mdadm.
So have "make" complain if the given directory doesn't exist.
This will make it harder to build an mdadm which doesn't work.

Reported-by: Albert Pauw <albert.pauw@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago mdmon: don't use 'ghost' values from an inactive array.
NeilBrown [Mon, 5 Aug 2013 05:40:16 +0000 (15:40 +1000)] 
mdmon: don't use 'ghost' values from an inactive array.

It is possible for mdmon to see (in /proc/mdstat) an array
in 'inactive' state, when "mdadm -S" has written "inactive" to
"array_state".

In this state values such as "raid_disk" are not meaningful
and so should be ignored by manage_member().

Reported-by: "Dorau, Lukasz" <lukasz.dorau@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago DDF: fix removal of failed devices.
NeilBrown [Mon, 5 Aug 2013 04:56:23 +0000 (14:56 +1000)] 
DDF: fix removal of failed devices.

Commit c7079c84 arranged for DDF to forget about any device
that is failed and not still marked as part of any array.

However such devices could still be part of the container and this
removal and updating of 'pdnum' can result in multiple devices having
the same pdnum.  This in turn easily leads to confusion and
corruption.

So only discard pd entries for devices which are failed, not listed in
any virtual device, and for which we don't have a handle on the
device.

pd entries will not get removed until a new device is added after
the device has been removed from the container, either by
"mdadm --remove" or by assembling without the failed devices.

Reported-by: Albert Pauw <albert.pauw@gmail.com>
Analysed-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago test: ensure testing uses correct mdmon
NeilBrown [Mon, 5 Aug 2013 04:55:13 +0000 (14:55 +1000)] 
test: ensure testing uses correct mdmon

When testing we want to run mdmon directly, not use
systemctl to get systemd to run it.

So allow an environment variable to make that choice.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago managemon: fix typo affecting incremental assembly.
NeilBrown [Mon, 5 Aug 2013 04:25:15 +0000 (14:25 +1000)] 
managemon: fix typo affecting incremental assembly.

This clearly should be 'st2'.
As it is the 'raid_disk' value being tested is completely
meaningless in the context of the new device.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago DDF: fix writing metadata updates.
NeilBrown [Mon, 5 Aug 2013 04:21:10 +0000 (14:21 +1000)] 
DDF: fix writing metadata updates.

Recent commit 273989b93a3185c0e4d54f0d1bc404248a92d157
skipped writing some large blocks of 0xFF, but didn't seek
over the space, so subsequent data was written wrongly.

When we don't write, we need to seek.
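The rule can be sketched as follows. The helper name write_or_skip and its signature are hypothetical; the real code in super-ddf.c is structured differently, but the idea is the same: a skipped block must still advance the file offset.

```c
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: either write `len` bytes from `buf`, or, when
 * `skip` is non-zero, advance the file offset by `len` without
 * writing, so that later blocks land at the right place. */
static int write_or_skip(int fd, const void *buf, size_t len, int skip)
{
    if (skip)
        return lseek(fd, (off_t)len, SEEK_CUR) == (off_t)-1 ? -1 : 0;
    return write(fd, buf, len) == (ssize_t)len ? 0 : -1;
}
```

Without the lseek() in the skip branch, the data following a skipped block would be written one block too early.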

Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago tests/10ddf-fail-twice: New unit test
mwilck@arcor.de [Thu, 1 Aug 2013 22:35:17 +0000 (00:35 +0200)] 
tests/10ddf-fail-twice: New unit test

This is the test by Albert Pauw. Fail 2 disks, and add one.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago DDF: no need for GET_LAYOUT any more
mwilck@arcor.de [Thu, 1 Aug 2013 22:35:15 +0000 (00:35 +0200)] 
DDF: no need for GET_LAYOUT any more

With the previous patch, mdmon will provide the layout property for us.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago mdmon: always get layout from sysfs
mwilck@arcor.de [Thu, 1 Aug 2013 22:35:14 +0000 (00:35 +0200)] 
mdmon: always get layout from sysfs

commit 71d68ff62 uses the array layout. It needs to be initialized.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago mdmon: don't lie to systemd.
NeilBrown [Thu, 1 Aug 2013 05:59:24 +0000 (15:59 +1000)] 
mdmon: don't lie to systemd.

Now that mdmon responds fairly well to SIGTERM, stop lying to
systemd about being started on the initrd.

Note that if mdmon is rerun (--takeover) for some reason, and systemd
chooses to kill processes before remounting / readonly, then the
unmount will hang.

If systemd ever lets us tell it that we don't want to be killed until
root is readonly, then we should do that.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago mdmon: clear safe_mode_delay on shutdown
NeilBrown [Thu, 1 Aug 2013 05:45:17 +0000 (15:45 +1000)] 
mdmon: clear safe_mode_delay on shutdown

When we receive a signal, set the safemode delay to v.small
so that we can get clean arrays and exit quickly.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago DDF: differentiate between new metadata and metadata updates.
NeilBrown [Thu, 1 Aug 2013 05:21:57 +0000 (15:21 +1000)] 
DDF: differentiate between new metadata and metadata updates.

When writing an update, we don't need to overwrite lots of
empty fields.  This makes updates somewhat faster.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago DDF: use some #defines instead of bare constants.
NeilBrown [Thu, 1 Aug 2013 05:21:24 +0000 (15:21 +1000)] 
DDF: use some #defines instead of bare constants.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago Introduce devid2kname - slightly different to devid2devnm.
NeilBrown [Thu, 1 Aug 2013 04:32:04 +0000 (14:32 +1000)] 
Introduce devid2kname - slightly different to devid2devnm.

The purpose of devid2devnm is to return a kernel name of an
md device, whether that device is a whole device or a partition,
we want the whole device.  md4, never md4p2.

In one place I was using devid2devnm where I really wanted the
partition if there was one ... and wasn't really interested in it
being an md device.
So introduce a new 'devid2kname' for that case.
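The distinction can be illustrated on kernel names alone. This is only a sketch: the function md_whole_device is hypothetical, it works on strings rather than device ids, and it is not how devid2devnm or devid2kname are implemented.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical illustration: given a kernel name such as "md4p2",
 * produce the whole-device name "md4". Names that are not of the
 * form md<N>p<M> are copied unchanged. */
static int md_whole_device(const char *kname, char *out, size_t outlen)
{
    char *p;

    if (strlen(kname) >= outlen)
        return -1;
    strcpy(out, kname);
    if (strncmp(out, "md", 2) != 0)
        return 0;
    p = out + 2;
    while (isdigit((unsigned char)*p))
        p++;
    /* Truncate at the 'p' that introduces a partition number. */
    if (p > out + 2 && *p == 'p' && isdigit((unsigned char)p[1]))
        *p = '\0';
    return 0;
}
```

In these terms, a devid2devnm-style lookup answers "md4" for a partition of md4, whereas a devid2kname-style lookup would keep "md4p2".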

Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago Don't lie to systemd about mdadm's status.
NeilBrown [Thu, 1 Aug 2013 04:04:07 +0000 (14:04 +1000)] 
Don't lie to systemd about mdadm's status.

Telling systemd that mdadm was started from the initrd
is often a lie and never necessary.  Now that the reshape monitoring
thread handles SIGTERM gracefully it is OK for systemd to kill
any mdadm that it finds running.

mdmon still has a bit of a question mark over it so I won't remove
the '@' from there just yet.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago Grow: exit background thread cleanly on SIGTERM.
NeilBrown [Thu, 1 Aug 2013 01:16:14 +0000 (11:16 +1000)] 
Grow: exit background thread cleanly on SIGTERM.

If the mdadm thread that monitors a reshape gets SIGTERM it should
exit cleanly and clear the 'suspended' region of the array.
However it mustn't clear 'sync_max' as that would allow the
reshape to continue unmonitored.

If the thread ever does get killed, the array should really be
shutdown soon after if possible.
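A common way to get this kind of clean exit is a handler that only sets a flag, which the monitoring loop then checks before doing its cleanup. The sketch below shows that pattern; the names sigterm_seen, on_sigterm and install_sigterm_handler are hypothetical and the code is not copied from Grow.c.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <signal.h>
#include <string.h>

/* Set from the handler, polled by the monitoring loop. */
static volatile sig_atomic_t sigterm_seen;

static void on_sigterm(int sig)
{
    (void)sig;
    sigterm_seen = 1;   /* only async-signal-safe work in the handler */
}

/* Install the handler; returns 0 on success, -1 on failure. */
static int install_sigterm_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigterm;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGTERM, &sa, NULL);
}
```

The loop would then leave when sigterm_seen becomes non-zero, clear the suspended region, and deliberately leave 'sync_max' alone.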

Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago tests/env-ddf-template: helper for new unit test
Martin Wilck [Wed, 31 Jul 2013 05:36:32 +0000 (07:36 +0200)] 
tests/env-ddf-template: helper for new unit test

I forgot to check in this helper script, similar to the one for IMSM.
It is needed by tests/10ddf-create-fail-rebuild.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago tests/10ddf-create-fail-rebuild: new unit test for DDF
Martin Wilck [Tue, 30 Jul 2013 21:18:34 +0000 (23:18 +0200)] 
tests/10ddf-create-fail-rebuild: new unit test for DDF

This test adds a new unit test similar to 009imsm-create-fail-rebuild.
With the previous patches, it actually succeeds on my system.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago mdmon: manage_member: fix race condition during slow meta data writes
Martin Wilck [Tue, 30 Jul 2013 21:18:33 +0000 (23:18 +0200)] 
mdmon: manage_member: fix race condition during slow meta data writes

In order to track kernel state changes, the monitor needs to
notice changes in sysfs. If the changes are transient, and the
monitor is busy writing meta data, it can happen that the changes
are missed. This will cause the meta data to be inconsistent with
the real state of the array.

I can reproduce this in a test scenario with a DDF container and
two subarrays, where I set a disk to "failed" and then add a global
hot-spare. On a typical MD test setup with loop devices, I can
reliably reproduce a failure where the metadata show degraded members
although the kernel finished the recovery successfully.

This patch fixes this problem by applying two changes. First, when
a metadata update is queued, wait until it is certain that the monitor
actually applied these meta data (the for loop is actually needed to
avoid failures completely in my test case). Second, after triggering the
recovery, set prev_state of the changed array to "recover", in case
the monitor misses the transient "recover" state.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago mdmon: manage_member: debug messages for array state
Martin Wilck [Tue, 30 Jul 2013 21:18:32 +0000 (23:18 +0200)] 
mdmon: manage_member: debug messages for array state

Add debug messages to watch the manager's steps.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago mdmon: wait_and_act: fix debug message for SIGUSR1
Martin Wilck [Tue, 30 Jul 2013 21:18:31 +0000 (23:18 +0200)] 
mdmon: wait_and_act: fix debug message for SIGUSR1

Correctly print out wake reason if it was a signal. Previous code
would print misleading select events (pselect(2) man page says the
fdsets become undefined in case of error).
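In other words, the return value has to be classified before the fd sets are looked at. A sketch of that ordering, with a hypothetical helper wake_reason that is not part of mdmon:

```c
#include <errno.h>

/* Hypothetical helper: classify why pselect() returned, given its
 * return value and the errno saved right after the call. Only in the
 * last case is it safe to inspect the fd sets. */
static const char *wake_reason(int rv, int saved_errno)
{
    if (rv < 0)
        return saved_errno == EINTR ? "signal" : "error";
    if (rv == 0)
        return "timeout";
    return "fd event";
}
```

A caller would save errno immediately after pselect() and test the fd sets with FD_ISSET() only when the result is "fd event".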

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago monitor: read_and_act: log status when called
Martin Wilck [Tue, 30 Jul 2013 21:18:30 +0000 (23:18 +0200)] 
monitor: read_and_act: log status when called

read_and_act() currently prints a debug message only very late.
Print the status seen by mdmon right away, to track mdmon's
actions more closely. Add a time stamp to observe long delays
between read_and_act calls, e.g. caused by meta data writes.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago DDF: ddf_set_disk: add some debug messages
Martin Wilck [Tue, 30 Jul 2013 21:18:29 +0000 (23:18 +0200)] 
DDF: ddf_set_disk: add some debug messages

Adds more verbose debugging in ddf_set_disk, to understand failures
better.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago DDF: load_ddf_header: more error logging
Martin Wilck [Tue, 30 Jul 2013 21:18:28 +0000 (23:18 +0200)] 
DDF: load_ddf_header: more error logging

Try to determine problem if load_ddf_header fails. May be useful
for determining compatibility problems with Fake RAID BIOSes.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago DDF: ddf_process_update: log offsets for conf changes
Martin Wilck [Tue, 30 Jul 2013 21:18:27 +0000 (23:18 +0200)] 
DDF: ddf_process_update: log offsets for conf changes

I needed this for tracking a bug with wrong offsets after array
creation.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago DDF: log disk status changes more nicely
Martin Wilck [Tue, 30 Jul 2013 21:18:26 +0000 (23:18 +0200)] 
DDF: log disk status changes more nicely

In particular, include refnum for better tracking. This makes
it a little easier for humans to track what happened to which disk.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago DDF: ddf_activate_spare: bugfix for 62ff3c40
Martin Wilck [Tue, 30 Jul 2013 21:18:25 +0000 (23:18 +0200)] 
DDF: ddf_activate_spare: bugfix for 62ff3c40

Move the check for good drives in the dl loop - otherwise dl
may be NULL and mdmon may crash.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago Fix is_resync_complete for RAID10
NeilBrown [Tue, 30 Jul 2013 23:18:57 +0000 (09:18 +1000)] 
Fix is_resync_complete for RAID10

For RAID10, 'sync' numbers go up to the array size rather than the
component size.  is_resync_complete() needs to allow for this.
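A sketch of the comparison this implies. The struct and function names are hypothetical stand-ins for mdadm's own types; the RAID10 layout decoding (near copies in the low byte, far copies in the next byte) follows the md layout encoding, and the array size is taken as component_size * raid_disks / copies.

```c
/* Hypothetical, reduced view of the array description. */
struct array_info {
    int level;
    int layout;
    int raid_disks;
    unsigned long long component_size;
    unsigned long long resync_start;
};

/* Size that the 'sync' position counts up to. */
static unsigned long long sync_target(const struct array_info *a)
{
    if (a->level == 10) {
        int near_copies = a->layout & 0xff;
        int far_copies = (a->layout >> 8) & 0xff;
        int copies = near_copies * far_copies;

        if (copies <= 0)
            copies = 1;     /* guard against a bogus layout */
        return a->component_size * a->raid_disks / copies;
    }
    return a->component_size;
}

static int resync_complete(const struct array_info *a)
{
    return a->resync_start >= sync_target(a);
}
```

With 4 devices of 1000 sectors and a near-2 layout (0x102), resync is complete only at position 2000, not at 1000 as it would be for RAID5.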

Reported-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago Avoid double close()
Jes Sorensen [Tue, 30 Jul 2013 16:30:03 +0000 (18:30 +0200)] 
Avoid double close()

Coverity discovered a possible double close(fd2) in Grow.c. Avoided by
invalidating fd2 after the first close.
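The pattern can be sketched with a small helper. The name close_once is hypothetical; Grow.c simply resets the descriptor variable after closing it.

```c
#include <unistd.h>

/* Hypothetical helper: close *fd if it is still valid and mark it as
 * invalid, so that a second call becomes a harmless no-op. */
static int close_once(int *fd)
{
    int rv = 0;

    if (*fd >= 0) {
        rv = close(*fd);
        *fd = -1;
    }
    return rv;
}
```

Invalidating the variable matters because a second close() on the same number could otherwise hit an unrelated descriptor opened in the meantime.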

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago super1: simplify setting of array size.
NeilBrown [Tue, 30 Jul 2013 06:51:38 +0000 (16:51 +1000)] 
super1: simplify setting of array size.

Currently the extra space to leave before the data in the array
is calculated in two separate places, and they can be inconsistent.

Instead, do it all in validate_geometry.  This records the
'data_offset' chosen which all other devices then use.

'write_init_super' now just uses the value rather than doing all the
calculations again.

This results in more consistent numbers.

Also, load_super sets st->data_offset so that it is used by "--add",
so the new device has a data offset matching a pre-existing device.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago super1: separate two versions of _avail_space1().
NeilBrown [Tue, 30 Jul 2013 05:17:22 +0000 (15:17 +1000)] 
super1: separate two versions of _avail_space1().

_avail_space1() is called from both avail_space1() and validate_geometry1()
and does slightly different things.

The partial code sharing doesn't really help.  In particular, the
responsibility for setting the size of the array is currently
confused.

So duplicate the code into the two locations - one where 'super' is
always NULL (validate_geometry1) and one where it is never NULL
(avail_space1), and simplify.

No behaviour change - just code re-organisation.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago Manage: remove call to validate_geometry.
NeilBrown [Tue, 30 Jul 2013 03:45:22 +0000 (13:45 +1000)] 
Manage: remove call to validate_geometry.

This call to validate_geometry is really rather gratuitous.
It is purely about the fact that super0 cannot use more than 4TB.
So just make it an explicit test - less confusing that way.

With this, validate_geometry is only called from Create, which
makes it easier to reason about.

Also validate_geometry is now never passed NULL for the 'chunk'
parameter, so we can remove those annoying tests for NULL.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago DDF: ddf_activate_spare: fix metadata update for SVDs
mwilck@arcor.de [Thu, 25 Jul 2013 18:59:13 +0000 (20:59 +0200)] 
DDF: ddf_activate_spare: fix metadata update for SVDs

Metadata updates for secondary RAID (RAID10) need to cover
all BVDs. Compare with code in write_init_super_ddf().

Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago DDF: ddf_activate_spare: only activate good drives
mwilck@arcor.de [Thu, 25 Jul 2013 18:59:12 +0000 (20:59 +0200)] 
DDF: ddf_activate_spare: only activate good drives

Do not try to activate drives marked missing or failed.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago DDF: ddf_activate_spare: Add RAID10 code
mwilck@arcor.de [Thu, 25 Jul 2013 18:59:11 +0000 (20:59 +0200)] 
DDF: ddf_activate_spare: Add RAID10 code

The check for degraded array is a bit more complex for RAID10.
Fixing it.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago DDF: find_vdcr: fix minor bug in debug message
mwilck@arcor.de [Thu, 25 Jul 2013 18:59:10 +0000 (20:59 +0200)] 
DDF: find_vdcr: fix minor bug in debug message

This code could find disk -1. Fixed.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago Change version to 3.3-rc2 (tag: mdadm-3.3-rc2)
NeilBrown [Thu, 25 Jul 2013 07:54:54 +0000 (17:54 +1000)] 
Change version to 3.3-rc2

Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago Add test for --replace handling.
NeilBrown [Wed, 24 Jul 2013 00:40:26 +0000 (10:40 +1000)] 
Add test for --replace handling.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago Manage: fix typo in error for "--with" handling
NeilBrown [Wed, 24 Jul 2013 05:32:26 +0000 (15:32 +1000)] 
Manage: fix typo in error for "--with" handling

Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago Improve revert tests
NeilBrown [Wed, 24 Jul 2013 00:40:26 +0000 (10:40 +1000)] 
Improve revert tests

1/ perform revert-grow on more metadata versions
2/ add revert-inplace.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago super0/1: fix typo in error messages.
NeilBrown [Wed, 24 Jul 2013 02:22:58 +0000 (12:22 +1000)] 
super0/1: fix typo in error messages.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago Grow: don't hold array open while waiting for reshape.
NeilBrown [Wed, 24 Jul 2013 02:21:10 +0000 (12:21 +1000)] 
Grow: don't hold array open while waiting for reshape.

If we will need to change array level when a reshape completes, a copy
of mdadm waits in the background.
Currently this copy holds the device (/dev/mdX) open.  This prevents
the array from being stopped.

So close the file descriptor and re-open after the reshape completes.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago super1: update data_size when performing "revert-reshape".
NeilBrown [Wed, 24 Jul 2013 00:21:27 +0000 (10:21 +1000)] 
super1: update data_size when performing "revert-reshape".

The "data_size" is with respect to "data_offset".  When the kernel
changes "data_offset" it modifies "data_size" to match - see
md_finish_reshape() in the kernel.

So when mdadm switches the data_offset for the new data_offset, it
must update data_size correspondingly.
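A sketch of the arithmetic, under the assumption that the end of the data area (data_offset + data_size) stays fixed while the offset moves. The function name is hypothetical and the actual update in super1.c works on the little-endian superblock fields.

```c
#include <stdint.h>

/* Hypothetical helper: given the current data_size and data_offset,
 * return the data_size that matches `new_offset`, assuming the end
 * of the data area (offset + size) does not move. The caller must
 * ensure new_offset does not exceed old_offset + data_size. */
static uint64_t adjust_data_size(uint64_t data_size,
                                 uint64_t old_offset,
                                 uint64_t new_offset)
{
    return old_offset + data_size - new_offset;
}
```

Moving the offset down by 1024 sectors therefore grows data_size by 1024, and moving it up shrinks data_size by the same amount.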

Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago super-ddf: allow mdassemble to compile.
NeilBrown [Tue, 23 Jul 2013 04:00:56 +0000 (14:00 +1000)] 
super-ddf: allow mdassemble to compile.

Just add/move some #ifdefs and move some code.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago DDF: convert big-endian __u16 to be16 type
mwilck@arcor.de [Sun, 21 Jul 2013 17:28:22 +0000 (19:28 +0200)] 
DDF: convert big-endian __u16 to be16 type

Last step of endian-safe recoding. This requires also bit
operations.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago DDF: convert big-endian __u64 to be64 type
mwilck@arcor.de [Sun, 21 Jul 2013 17:28:21 +0000 (19:28 +0200)] 
DDF: convert big-endian __u64 to be64 type

Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago DDF: convert big endian to be32 type
mwilck@arcor.de [Sun, 21 Jul 2013 17:28:20 +0000 (19:28 +0200)] 
DDF: convert big endian to be32 type

Part 2 of endianness-safe conversion

Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago DDF: add endian-safe typedefs
mwilck@arcor.de [Sun, 21 Jul 2013 17:28:19 +0000 (19:28 +0200)] 
DDF: add endian-safe typedefs

This adds typedefs for big-endian numbers. This will hopefully
reduce the number of endianness bugs I make.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago tests/10ddf-geometry: new unit test
mwilck@arcor.de [Fri, 19 Jul 2013 19:04:15 +0000 (21:04 +0200)] 
tests/10ddf-geometry: new unit test

Test various RAID geometries, creation and deletion of subarrays

Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago test: increase number of devices to 13
mwilck@arcor.de [Fri, 19 Jul 2013 19:04:14 +0000 (21:04 +0200)] 
test: increase number of devices to 13

extended DDF/RAID10 tests need 6 disks for DDF.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago tests/10ddf-create: create RAID5 first
mwilck@arcor.de [Fri, 19 Jul 2013 19:04:13 +0000 (21:04 +0200)] 
tests/10ddf-create: create RAID5 first

Let the first created array be RAID5 rather than RAID0. This makes
the test harder than before, because everything after the first
Create has to be done indirectly through mdmon.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago DDF: getinfo_super_ddf_bvd: fix offset calculation for SVDs
mwilck@arcor.de [Fri, 19 Jul 2013 19:04:12 +0000 (21:04 +0200)] 
DDF: getinfo_super_ddf_bvd: fix offset calculation for SVDs

Fix a bug that caused the wrong conf record to be used to derive
data offset and size on secondary RAID (RAID10).

Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago DDF: kill_subarray_ddf: fix case without mdmon running
mwilck@arcor.de [Fri, 19 Jul 2013 19:04:11 +0000 (21:04 +0200)] 
DDF: kill_subarray_ddf: fix case without mdmon running

When mdmon wasn't running, meta data wasn't committed to disk.
Fixed.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago DDF: err_bad_md_layout: fix return value
mwilck@arcor.de [Fri, 19 Jul 2013 19:04:10 +0000 (21:04 +0200)] 
DDF: err_bad_md_layout: fix return value

This function must use -1 to indicate failure. Fix it.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago DDF: factor out writing super block to single disk
mwilck@arcor.de [Thu, 18 Jul 2013 18:49:01 +0000 (20:49 +0200)] 
DDF: factor out writing super block to single disk

Factor out single disk from __write_init_super_ddf to a new function
_write_super_to_disk. Use this function in store_super_ddf.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago DDF: make "null_aligned" a static buffer
mwilck@arcor.de [Thu, 18 Jul 2013 18:49:00 +0000 (20:49 +0200)] 
DDF: make "null_aligned" a static buffer

Use a static buffer for this "zero page". This makes it easier
to factor out the header writing code.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago DDF: increase seq number in ddf_set_updates_pending
mwilck@arcor.de [Thu, 18 Jul 2013 18:48:59 +0000 (20:48 +0200)] 
DDF: increase seq number in ddf_set_updates_pending

Increase seq number only when there's actually a metadata change.
This is better than increasing it at every write.

This also fixes another endianness bug.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago Merge commit '956a13fb850321bed8568dfa8692c0c323538d7c'
NeilBrown [Mon, 15 Jul 2013 01:39:50 +0000 (11:39 +1000)] 
Merge commit '956a13fb850321bed8568dfa8692c0c323538d7c'

10 years agotest: allow resync/reshape etc to go faster.
NeilBrown [Thu, 11 Jul 2013 03:16:40 +0000 (13:16 +1000)] 
test: allow resync/reshape etc to go faster.

Whenever we "check wait" - make the resync process go at full speed.

Also allow script to adjust it manually.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoGrow: pass INVALID_SECTORS to reshape_array, not 0.
NeilBrown [Thu, 11 Jul 2013 02:42:12 +0000 (12:42 +1000)] 
Grow: pass INVALID_SECTORS to reshape_array, not 0.

'0' means 'make it 0', which isn't what we want here.
We want 'leave it unchanged'.

Signed-off-by: NeilBrown <neilb@suse.de>
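
Note on "Grow: pass INVALID_SECTORS to reshape_array, not 0": a minimal sketch of why a sentinel is needed when 0 is itself a legal value. The macro SKETCH_INVALID_SECTORS, its value, and the function apply_sectors are assumptions made for illustration. The real INVALID_SECTORS definition is not shown in this log.

```c
#include <stdint.h>

/* Assumed sentinel meaning "leave unchanged"; value chosen for this sketch only. */
#define SKETCH_INVALID_SECTORS UINT64_MAX

/*
 * Hypothetical helper: return the sector value to use.
 * The sentinel keeps the current value; anything else, including 0,
 * is taken as an explicit request.
 */
static uint64_t apply_sectors(uint64_t current, uint64_t requested)
{
    if (requested == SKETCH_INVALID_SECTORS)
        return current;
    return requested;
}
```

With this shape, apply_sectors(2048, 0) yields 0 ("make it 0"), while apply_sectors(2048, SKETCH_INVALID_SECTORS) yields 2048 ("leave it unchanged").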
10 years agoIMSM: fix wait_for_reshape_imsm
NeilBrown [Wed, 10 Jul 2013 23:48:25 +0000 (09:48 +1000)] 
IMSM: fix wait_for_reshape_imsm

This was waiting on "reshape_position" which doesn't
get update events.
Before sysfs_wait was introduced, the code to wait didn't
wait at all, so it spun.
With sysfs_wait, it would wait forever.

Change to wait in sync_completed which does get events.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoalign spelling of “RAID” and RAID levels
Christoph Anton Mitterer [Wed, 10 Jul 2013 20:42:46 +0000 (22:42 +0200)] 
align spelling of “RAID” and RAID levels

* Aligned the spelling of “RAID” to use capital letters in all places.
* Aligned the spelling of the RAID level names (LINEAR, RAID1, …) to use capital
  letters in all places, except for the string “faulty” in places where not the
  RAID level was meant.

Signed-off-by: Christoph Anton Mitterer <mail@christoph.anton.mitterer.name>
10 years agoStop: fix up synchronising end of reshape to good boundary.
NeilBrown [Tue, 9 Jul 2013 01:46:54 +0000 (11:46 +1000)] 
Stop: fix up synchronising end of reshape to good boundary.

If we stop too soon after reshape starts (probably only during
testing), we can get confused by the status of the reshape.
If that might be happening - sleep a bit longer.

Also allow for reshape going unusually slowly (again, probably only
during testing).

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoGrow: use mdstat_wait to wait for delayed reshape.
NeilBrown [Wed, 10 Jul 2013 01:10:54 +0000 (11:10 +1000)] 
Grow: use mdstat_wait to wait for delayed reshape.

Having a fixed time for a wait is clumsy and can make us
wait much too long.
So use mdstat_wait and keep the mdstat_fd open.
This requires an 'mdstat_close' so it doesn't stay open
forever.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoDon't set 'hold' option for mdstat_read if not needed.
NeilBrown [Wed, 10 Jul 2013 01:02:10 +0000 (11:02 +1000)] 
Don't set 'hold' option for mdstat_read if not needed.

We only need 'hold' if we want to mdstat_wait for a change.
These two callers don't care about a change, so they shouldn't
use the 'hold' flag.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoDDF load headers: if primary is invalid, don't check fields.
NeilBrown [Wed, 10 Jul 2013 00:47:22 +0000 (10:47 +1000)] 
DDF load headers: if primary is invalid, don't check fields.

Currently we compare fields between primary and secondary
superblocks, before we check if the primary is even valid.
This is a bit backwards, so reverse it.

Signed-off-by: NeilBrown <neilb@suse.de>
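
Note on "DDF load headers: if primary is invalid, don't check fields": a minimal sketch of the reordered logic, assuming a simplified header. The struct hdr, the magic value, and the selection policy in choose_header are hypothetical and only illustrate checking validity before comparing fields.

```c
#include <stdint.h>
#include <string.h>

#define SKETCH_MAGIC 0x11223344u  /* made-up magic number for this sketch */

/* Hypothetical, simplified header; not the DDF on-disk layout. */
struct hdr {
    uint32_t magic;
    uint32_t seq;
    char guid[8];
};

static int hdr_valid(const struct hdr *h)
{
    return h != NULL && h->magic == SKETCH_MAGIC;
}

/* Returns 0 to use the primary, 1 to use the secondary, -1 if neither is usable. */
static int choose_header(const struct hdr *pri, const struct hdr *sec)
{
    int pri_ok = hdr_valid(pri);
    int sec_ok = hdr_valid(sec);

    /* Validity is settled first; an invalid header is never compared. */
    if (!pri_ok && !sec_ok)
        return -1;
    if (!pri_ok)
        return 1;
    if (!sec_ok)
        return 0;

    /* Both are valid, so comparing their fields is now meaningful. */
    if (memcmp(pri->guid, sec->guid, sizeof(pri->guid)) != 0)
        return 0;  /* assumed policy: prefer the primary on mismatch */
    return sec->seq > pri->seq ? 1 : 0;
}
```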
10 years agoDDF: ddf_process_update: Fix updates for SVDs
mwilck@arcor.de [Mon, 8 Jul 2013 21:50:48 +0000 (23:50 +0200)] 
DDF: ddf_process_update: Fix updates for SVDs

The "indirect" code path for adding VDs was not working correctly
for secondary RAID level. The "other BVDs" were not transmitted
to mdmon. Thus mdmon wouldn't build up correct information, and
RAID creation would fail when mdmon was already running on the container.

This patch fixes this.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoDDF: ddf_process_update: some more debug messages
mwilck@arcor.de [Mon, 8 Jul 2013 21:50:47 +0000 (23:50 +0200)] 
DDF: ddf_process_update: some more debug messages

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoDDF: guid_str: more readable output
mwilck@arcor.de [Mon, 8 Jul 2013 21:50:46 +0000 (23:50 +0200)] 
DDF: guid_str: more readable output

Print ASCII characters as ASCII

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoDDF: ddf_process_update: add debug messages for adding VDs
mwilck@arcor.de [Mon, 8 Jul 2013 21:50:45 +0000 (23:50 +0200)] 
DDF: ddf_process_update: add debug messages for adding VDs

Add some debug messages for the DDF_VIRTR_RECORDS_MAGIC case.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoDDF: add debug message in add_super_ddf_bvd
mwilck@arcor.de [Mon, 8 Jul 2013 21:50:44 +0000 (23:50 +0200)] 
DDF: add debug message in add_super_ddf_bvd

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoDDF: fix endianness of refnum in debug messages
mwilck@arcor.de [Mon, 8 Jul 2013 21:50:43 +0000 (23:50 +0200)] 
DDF: fix endianness of refnum in debug messages

This makes it easier to match the debug output to existing
structures.

Signed-off-by: NeilBrown <neilb@suse.de>
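
Note on "DDF: fix endianness of refnum in debug messages": a minimal sketch of printing a stored 32-bit reference number in a byte-order-independent way. It assumes the field is big-endian on disk. The helpers be32_to_host and format_refnum are hypothetical, not mdadm functions.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Decode a big-endian 32-bit field regardless of host byte order. */
static uint32_t be32_to_host(const uint8_t raw[4])
{
    return ((uint32_t)raw[0] << 24) | ((uint32_t)raw[1] << 16) |
           ((uint32_t)raw[2] << 8) | (uint32_t)raw[3];
}

/* Format the decoded value as 8 hex digits, matching the byte order on disk. */
static int format_refnum(char *buf, size_t len, const uint8_t raw[4])
{
    return snprintf(buf, len, "%08" PRIx32, be32_to_host(raw));
}
```

Printing the raw field without conversion on a little-endian host would show the bytes reversed, which makes the output harder to match against a hex dump of the metadata.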
10 years agoDDF: getinfo_super_ddf_bvd: fix raid_disk calculation
mwilck@arcor.de [Mon, 8 Jul 2013 21:50:42 +0000 (23:50 +0200)] 
DDF: getinfo_super_ddf_bvd: fix raid_disk calculation

The return value of disk.raid_disk may be wrong.
The old code was using raiddisk, which is only valid with auto
layout. This leads to errors when arrays are created with
specified disks and mdmon is already running, like this:

mdadm -CR /dev/md/container -n5 $d1 $d2 $d3 $d4 $d5
mdadm -CR /dev/md/r5 -n5 -l5 /dev/md/container -z 5000
mdadm -CR /dev/md/r1 -n2 -l1 $d1 $d2
  => resulting array will use wrong disks

This patch fixes that.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoDDF: getinfo_super_ddf_bvd: identify disk by refnum
mwilck@arcor.de [Mon, 8 Jul 2013 21:50:41 +0000 (23:50 +0200)] 
DDF: getinfo_super_ddf_bvd: identify disk by refnum

Use refnum rather than raiddisk for identifying the physical disk.
raiddisk should only be used for auto-layout.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoDDF: implement kill_subarray
mwilck@arcor.de [Mon, 8 Jul 2013 21:50:40 +0000 (23:50 +0200)] 
DDF: implement kill_subarray

Implement kill_subarray, for mdmon running and not running.

The way Kill_subarray() is implemented, this requires that the
DDF layer uses "currentconf" to remember the last subarray
queried with container_content(), and use it as the one to kill.
I don't like this much but IMSM does it the same way.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoDDF: write_init_super_ddf: don't zero superblocks for subarrays
mwilck@arcor.de [Mon, 8 Jul 2013 21:50:39 +0000 (23:50 +0200)] 
DDF: write_init_super_ddf: don't zero superblocks for subarrays

commit d682f344 inserted this call to "Kill" in write_init_super_ddf:

    "Matching the functionality already in super0 and super1, when
    we first create a container, remove any other recognisable metadata to
    ensure it doesn't cause confusion."

But we should do this only at first container creation, not when
subarrays are created later.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoMonitor: Don't write metadata in inactive array state
mwilck@arcor.de [Mon, 8 Jul 2013 21:50:38 +0000 (23:50 +0200)] 
Monitor: Don't write metadata in inactive array state

The kernel docs state that meta data is never written in states
clear, inactive, suspended, readonly, and read_auto.
Why should this be different for containers?

We need to write metadata when the array is disabled, though.
Tested with the DDF (10*) and IMSM (9*) tests, works.

Signed-off-by: NeilBrown <neilb@suse.de>