11 years agoimsm: add support for checkpointing via 'curr_migr_unit'
Dan Williams [Tue, 22 Dec 2009 00:54:32 +0000 (17:54 -0700)] 
imsm: add support for checkpointing via 'curr_migr_unit'

Unlike native md checkpointing some data about the geometry and type of
the migration process is coded into curr_migr_unit.  Provide logic to
convert between md/{resync_start|recovery_start} and imsm/curr_migr_unit.

Signed-off-by: Dan Williams <>
11 years agoSupport external metadata recovery-resume
Dan Williams [Mon, 21 Dec 2009 19:51:57 +0000 (12:51 -0700)] 
Support external metadata recovery-resume

Minimal changes needed to permit reassembling partially recovered
external metadata arrays.  The biggest logical change is that
->container_content() can now surface partially rebuilt members rather
than omitting them from the disk list.

Signed-off-by: Dan Williams <>
11 years agoTeach sysfs_add_disk() callers to use ->recovery_start versus 'insync' parameter
Dan Williams [Mon, 21 Dec 2009 18:26:21 +0000 (11:26 -0700)] 
Teach sysfs_add_disk() callers to use ->recovery_start versus 'insync' parameter

Also fixup 'in_sync' versus 'insync' typo.

Signed-off-by: Dan Williams <>
11 years agoIntroduce MaxSector
Dan Williams [Mon, 21 Dec 2009 17:23:26 +0000 (10:23 -0700)] 
Introduce MaxSector

Replace occurrences of ~0ULL to make it clear we are talking about maximal
resync/recovery position.

Signed-off-by: Dan Williams <>
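
The substitution described above can be sketched as follows. This is a minimal illustration, assuming the constant is simply a name for the all-ones value it replaces; the helper `recovery_complete` is hypothetical and not taken from mdadm.

```c
#include <stdbool.h>

/* Assumed definition: the commit replaces the literal ~0ULL with this name. */
#define MaxSector (~0ULL)

/* Hypothetical helper: a position equal to MaxSector means the
 * resync/recovery has reached the end of the device. */
static bool recovery_complete(unsigned long long recovery_start)
{
	return recovery_start == MaxSector;
}
```

A named constant makes the intent of a comparison visible at the call site, which a bare `~0ULL` does not.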
11 years agoAdd scaffolding for handling md/dev-XXX/recovery_start
Dan Williams [Mon, 21 Dec 2009 17:06:14 +0000 (10:06 -0700)] 
Add scaffolding for handling md/dev-XXX/recovery_start

Prepare the code to handle saving a recovery checkpoint.

Signed-off-by: Dan Williams <>
11 years agomdmon: cleanup resync_start
Dan Williams [Mon, 14 Dec 2009 19:57:55 +0000 (12:57 -0700)] 
mdmon: cleanup resync_start

We don't need to sprinkle reads of this attribute all over the place,
just once at the entry of read_and_act().  Also, the mdinfo structure
for the array already has a 'resync_start' member, so just reuse that.
Finally, rename get_resync_start() to read_resync_start to make it
consistent with the other sysfs accessors in monitor.c.

Signed-off-by: Dan Williams <>
11 years agomdmon: cleanup manage_member() leak
Dan Williams [Sat, 12 Dec 2009 21:10:01 +0000 (14:10 -0700)] 
mdmon: cleanup manage_member() leak

free() the results of activate_spare().

Signed-off-by: Dan Williams <>
11 years agoimsm: cleanup print_imsm_dev()
Dan Williams [Sat, 12 Dec 2009 20:57:28 +0000 (13:57 -0700)] 
imsm: cleanup print_imsm_dev()

When printing the migration state there is no need to print "migrating".
The fact that the state is non-idle should be enough indication.

Signed-off-by: Dan Williams <>
11 years agoutil: fix devnum2devname for devnum == 0
Dan Williams [Sat, 12 Dec 2009 20:57:28 +0000 (13:57 -0700)] 
util: fix devnum2devname for devnum == 0

devnum 0 is md0, not md_d-1.

Signed-off-by: Dan Williams <>
11 years agoimsm: fix thunderdome segfault
Dan Williams [Sat, 12 Dec 2009 20:57:25 +0000 (13:57 -0700)] 
imsm: fix thunderdome segfault

disk_list_get() can return NULL if:
1/ A formerly missing disk is re-added
2/ The original array has not been rebuilt, so the family number of the
   missing disk still matches
3/ The metadata record of the in-sync disks are read before the missing

This will result in the missing disk not adding its own serial number to
the disk_list, only its truncated value will be present.

Signed-off-by: Dan Williams <>
11 years agoimsm: fix spare promotion
Dan Williams [Thu, 10 Dec 2009 22:03:34 +0000 (15:03 -0700)] 
imsm: fix spare promotion

When associating a spare take on the target's metadata version number to
satisfy future compare_super checks.

Signed-off-by: Dan Williams <>
11 years agoimsm: honor orom constraints for auto-layout
Dan Williams [Thu, 10 Dec 2009 22:03:31 +0000 (15:03 -0700)] 
imsm: honor orom constraints for auto-layout

Factor out the orom checking bits to validate_geometry_imsm_orom() and
share it between validate_geometry_imsm_volume() and the entry path to

Signed-off-by: Dan Williams <>
11 years agoimsm: catch attempt to auto-layout zero-length arrays
Dan Williams [Tue, 1 Dec 2009 23:04:06 +0000 (16:04 -0700)] 
imsm: catch attempt to auto-layout zero-length arrays

When -z is omitted reserve_space() looks to satisfy a zero length
allocation which lo and behold is equal to the amount of free space on a
full disk.  So, catch maxsize == 0 and simplify the return value from
merge_extents() to always equal amount of free space (no benefit to
having a special case ~0ULL == error).

Signed-off-by: Dan Williams <>
11 years agoGrow: avoid truncation error when checking size of array.
NeilBrown [Thu, 26 Nov 2009 03:19:26 +0000 (14:19 +1100)] 
Grow: avoid truncation error when checking size of array.

array.size is only 32bit so it is not safe to multiply it
up before casting to (long long).
Actually, we shouldn't be using array.size here at all, but that
will get fixed in a subsequent patch.

Reported-by: Andrew Burgess <>
Signed-off-by: NeilBrown <>
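
The truncation described above can be reproduced in isolation. The sketch below uses unsigned 32-bit arithmetic so that the wraparound is well defined; the factor of 1024 and both function names are assumptions for illustration, not the code touched by the patch.

```c
#include <stdint.h>

/* Buggy pattern: the multiplication is done in 32 bits and wraps
 * before the result is widened to 64 bits. */
static uint64_t size_bytes_truncated(uint32_t size_kib)
{
	return (uint64_t)(size_kib * UINT32_C(1024));
}

/* Fixed pattern: widen first, then multiply in 64 bits. */
static uint64_t size_bytes(uint32_t size_kib)
{
	return (uint64_t)size_kib * 1024;
}
```

The only difference is where the cast sits relative to the multiplication, which is why this class of bug is easy to miss in review.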
11 years agoVarious fixes for --kill
NeilBrown [Tue, 24 Nov 2009 05:32:01 +0000 (16:32 +1100)] 
Various fixes for --kill

- When --kill-superblock is used with --metadata, find every
  different superblock if there are several and kill them all.
- When creating a new array, kill off any old metadata.  The code
  to do this was already present but has become broken over time.

Signed-off-by: NeilBrown <>
11 years agoRelease mdadm-3.1.1 devel-3.1 mdadm-3.1.1
NeilBrown [Thu, 19 Nov 2009 05:10:58 +0000 (16:10 +1100)] 
Release mdadm-3.1.1

bugfix over 3.1, but changes to some significant defaults.

11 years agoMerge branch 'master' into devel-3.1
NeilBrown [Thu, 19 Nov 2009 05:10:07 +0000 (16:10 +1100)] 
Merge branch 'master' into devel-3.1

11 years agoAssemble: fix testing of 'verbose' flag.
NeilBrown [Thu, 19 Nov 2009 04:55:59 +0000 (15:55 +1100)] 
Assemble: fix testing of 'verbose' flag.

The 'verbose' flag can be negative, meaning 'quiet'.
So never check for != 0.

Signed-off-by: NeilBrown <>
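
A minimal sketch of the rule stated above, assuming the three-way convention the message describes (positive is verbose, zero is normal, negative is quiet). The helper name `want_verbose` is hypothetical.

```c
#include <stdbool.h>

/* verbose > 0: extra messages; verbose == 0: normal; verbose < 0: quiet. */
static bool want_verbose(int verbose)
{
	/* Not `verbose != 0`, which would also be true in quiet mode. */
	return verbose > 0;
}
```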
11 years agoCreate: warn when creating a raid1 using default metadata.
NeilBrown [Thu, 19 Nov 2009 04:54:49 +0000 (15:54 +1100)] 
Create: warn when creating a raid1 using default metadata.

As some/most bootloaders don't understand md metadata, it might
be difficult to boot off an array with the default 1.0 metadata.
So if this is used for a RAID1, ask for confirmation.

Signed-off-by: NeilBrown <>
11 years agoDon't silently map --re-add to --add
NeilBrown [Tue, 17 Nov 2009 02:15:34 +0000 (13:15 +1100)] 
Don't silently map --re-add to --add

As --add can destroy important data on a disk, and
--re-add is not supposed to, it is wrong to silently
try --add if --re-add fails.
So print a message and abort instead.

Signed-off-by: NeilBrown <>
11 years agoImprove error messages when metadata handler does not support request.
NeilBrown [Tue, 17 Nov 2009 02:15:34 +0000 (13:15 +1100)] 
Improve error messages when metadata handler does not support request.

->validate_geometry is called to validate overall parameters,
and to validate each individual device.
If it ever fails, it needs to report the reason, as common code
cannot possibly know.

Signed-off-by: NeilBrown <>
11 years agoSet default bitmap-chunksize for internal bitmaps to at least 64Meg
NeilBrown [Tue, 17 Nov 2009 02:15:34 +0000 (13:15 +1100)] 
Set default bitmap-chunksize for internal bitmaps to at least 64Meg

A small bitmap-chunksize hurts performance without helping
resync speed much - particularly on internal bitmaps.

So set the default to at least 64Meg.

Signed-off-by: NeilBrown <>
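
One way to express an "at least 64Meg" default is a simple floor on whatever value would otherwise be chosen. The sketch below assumes sizes in bytes; the macro and function names are hypothetical and do not come from mdadm.

```c
#include <stdint.h>

/* 64 MiB expressed in bytes. */
#define MIN_INTERNAL_BITMAP_CHUNK (UINT64_C(64) * 1024 * 1024)

/* Hypothetical helper: raise a computed default up to the 64 MiB floor,
 * leaving larger values untouched. */
static uint64_t default_bitmap_chunk(uint64_t computed_chunk)
{
	return computed_chunk < MIN_INTERNAL_BITMAP_CHUNK
		? MIN_INTERNAL_BITMAP_CHUNK
		: computed_chunk;
}
```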
11 years agoGrow: various fixes to recent breakages.
NeilBrown [Tue, 17 Nov 2009 02:15:33 +0000 (13:15 +1100)] 
Grow: various fixes to recent breakages.

- I forgot to write the second backup-super-block on spares.
- I wasn't adding the data_offset to an offset.

Signed-off-by: NeilBrown <>
11 years agoChange default metadata from 0.90 to 1.1
NeilBrown [Tue, 17 Nov 2009 02:15:32 +0000 (13:15 +1100)] 
Change default metadata from 0.90 to 1.1

1.1 is more flexible in a number of ways and is safer.
0.90 is still fully supported.
1.0 should possibly be used for RAID1 arrays that you
want to boot off, depending on your boot loader.

Signed-off-by: NeilBrown <>
11 years agoIncrease default chunk size to 512K
NeilBrown [Tue, 17 Nov 2009 02:08:55 +0000 (13:08 +1100)] 
Increase default chunk size to 512K

This seems more appropriate for current (and recent) model drives than
64K is still the default for '--build' as changing that could corrupt
64K is also the default rounding for 'linear' on kernels older than

Signed-off-by: NeilBrown <>
11 years agoReplace all relevant occurrences of -4 with LEVEL_MULTIPATH
NeilBrown [Tue, 17 Nov 2009 01:31:12 +0000 (12:31 +1100)] 
Replace all relevant occurrences of -4 with LEVEL_MULTIPATH

Also -1 -> LEVEL_LINEAR.

Signed-off-by: NeilBrown <>
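
The replacement amounts to naming two magic numbers. In the sketch below the two values are the ones quoted in the message; the helper `is_multipath` is hypothetical and only shows how a comparison reads afterwards.

```c
/* Values taken from the commit text: -1 is linear, -4 is multipath. */
#define LEVEL_LINEAR    (-1)
#define LEVEL_MULTIPATH (-4)

/* Hypothetical helper showing a comparison that reads better with names. */
static int is_multipath(int level)
{
	return level == LEVEL_MULTIPATH;   /* previously: level == -4 */
}
```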
11 years agoAssemble/super0: allow non-in-sync devices to be assembled without complaint.
NeilBrown [Tue, 17 Nov 2009 01:31:10 +0000 (12:31 +1100)] 
Assemble/super0: allow non-in-sync devices to be assembled without complaint.

Other metadata formats already did not worry about whether 'sync' was
missing or not.  super0 needs that now, but only for 0.91 metadata
that is undergoing reshape.

Signed-off-by: NeilBrown <>
11 years agoAssemble: include ACTIVE but not in-sync devices as non-spares.
NeilBrown [Tue, 17 Nov 2009 01:30:54 +0000 (12:30 +1100)] 
Assemble: include ACTIVE but not in-sync devices as non-spares.

Previously such things did not exist: ACTIVE and SYNC were either both
set or both clear.   Recent changes with reshape means that a device
can be ACTIVE but not yet fully in-sync, so they need to be handled
and included in the array as active devices.

Signed-off-by: NeilBrown <>
11 years agoGrow: data_offset is in sectors, offsets[] is in bytes - convert
NeilBrown [Mon, 16 Nov 2009 00:06:44 +0000 (11:06 +1100)] 
Grow: data_offset is in sectors, offsets[] is in bytes - convert

Another missed sectors->bytes conversion.

Signed-off-by: NeilBrown <>
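
The unit mismatch named in the subject line can be sketched as below, using the 512-byte sector that md uses for its sector counts. Both function names are hypothetical.

```c
#include <stdint.h>

#define SECTOR_SIZE 512ULL   /* md counts in 512-byte sectors */

static uint64_t sectors_to_bytes(uint64_t sectors)
{
	return sectors * SECTOR_SIZE;
}

/* Add a sector-based data offset to an offset that is kept in bytes. */
static uint64_t add_data_offset(uint64_t offset_bytes,
				uint64_t data_offset_sectors)
{
	return offset_bytes + sectors_to_bytes(data_offset_sectors);
}
```

Routing every conversion through one helper makes a missed conversion easier to spot than a scattered `* 512`.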
11 years agoGrow: do not allow size changes with other changes.
NeilBrown [Fri, 6 Nov 2009 06:26:47 +0000 (17:26 +1100)] 
Grow: do not allow size changes with other changes.

A change that reduces the size of an array always happens
before any other change.  So it can cause data to be lost.
By themselves these changes are reversible.  But once another
change has started, the data would be permanently lost.
So recommend data integrity be checked between a size change
and any other change.

Signed-off-by: NeilBrown <>
11 years agoGrow: goto release rather than just return
NeilBrown [Fri, 6 Nov 2009 04:22:14 +0000 (15:22 +1100)] 
Grow: goto release rather than just return

otherwise we exit with the array frozen.

Signed-off-by: NeilBrown <>
11 years agoGrow: restrict to 2.6.32
NeilBrown [Fri, 6 Nov 2009 04:19:39 +0000 (15:19 +1100)] 
Grow: restrict to 2.6.32

2.6.31 has a bug which can lead to unsafe reshaping.
So only allow a reshape with 2.6.32.
When the required fixes get into 2.6.31.y, this can be relaxed.

Signed-off-by: NeilBrown <>
11 years agoGrow: use large block count and make sure stripe cache can hold it.
NeilBrown [Fri, 6 Nov 2009 03:48:10 +0000 (14:48 +1100)] 
Grow: use large block count and make sure stripe cache can hold it.

The bigger the backup is, the faster it goes, to some extent.

16Meg is fairly arbitrary

Signed-off-by: NeilBrown <>
11 years agoGrow: get component_size before using it.
NeilBrown [Fri, 6 Nov 2009 03:18:49 +0000 (14:18 +1100)] 
Grow: get component_size before using it.

We were using ->component_size while it hadn't been set.
This effectively meant that 'blocks' wasn't multiplied by
16 and reshape was even slower than it should have been.

Signed-off-by: NeilBrown <>
11 years agoGrow: handle array going degraded during reshape.
NeilBrown [Fri, 6 Nov 2009 02:56:05 +0000 (13:56 +1100)] 
Grow: handle array going degraded during reshape.

If an array goes degraded during reshape, we need to
adjust the devices we read from so as not to back up
stale data.

Signed-off-by: NeilBrown <>
11 years agoGrow: restore backup to proper location.
NeilBrown [Fri, 6 Nov 2009 02:38:43 +0000 (13:38 +1100)] 
Grow: restore backup to proper location.

The 'arraystart' is in sectors while restore_stripes requires
bytes, so we need a conversion.

Without this, backups get restored to the wrong offset.

Reported-by: "KueiHuan Chen" <>
Signed-off-by: NeilBrown <>
11 years agovol_id was removed by the udev upstream maintainer in May 2009.
Marco d'Itri [Wed, 28 Oct 2009 23:14:43 +0000 (10:14 +1100)] 
vol_id was removed by the udev upstream maintainer in May 2009.

One should use
  /sbin/blkid -o udev -p ...
(from util-linux >> 2.16) instead of
  vol_id --export ...

Author: Marco d'Itri <>
Reviewed-by: martin f. krafft <>
Signed-off-by: NeilBrown <>
11 years agoRemove bogus warnings from man page.
NeilBrown [Wed, 28 Oct 2009 23:11:01 +0000 (10:11 +1100)] 
Remove bogus warnings from man page.

LANG=C man --warnings -l mdadm.8 > /dev/null

complains that '.XX' is an invalid macro.
This is not correct.  The sequence

   .ig XX
   anything can go here

is correct and is ignored (see 'info groff' and the 'ig' index

However the same can be achieved with
   anything can go there

and this produces no warnings, so use that instead.

Signed-off-by: NeilBrown <>
11 years agoDetail: report new-layout for RAID6 arrays
NeilBrown [Wed, 28 Oct 2009 23:02:24 +0000 (10:02 +1100)] 
Detail: report new-layout for RAID6 arrays

We were only reporting it for RAID5 and RAID10.

Signed-off-by: NeilBrown <>
11 years agoRelease 3.1 mdadm-3.1
NeilBrown [Thu, 22 Oct 2009 03:07:05 +0000 (14:07 +1100)] 
Release 3.1

New functionality in --grow.

Signed-off-by: NeilBrown <>
11 years agoMerge branch 'master' into devel-3.1
NeilBrown [Thu, 22 Oct 2009 02:57:54 +0000 (13:57 +1100)] 
Merge branch 'master' into devel-3.1

11 years agoRelease 3.0.3 mdadm-3.0.3
NeilBrown [Thu, 22 Oct 2009 01:05:22 +0000 (12:05 +1100)] 
Release 3.0.3

Signed-off-by: NeilBrown <>
11 years agoMerge branch 'master' into devel-3.1
NeilBrown [Thu, 22 Oct 2009 00:13:13 +0000 (11:13 +1100)] 
Merge branch 'master' into devel-3.1

11 years agoFree some malloced memory that wasn't being freed.
NeilBrown [Thu, 22 Oct 2009 00:00:56 +0000 (11:00 +1100)] 
Free some malloced memory that wasn't being freed.

As mdadm is normally a short-lived program it isn't always necessary
to free memory that was allocated, as the 'exit()' call will
automatically free everything.  But it is more obviously correct if
the 'free' is there.
So this patch adds a few calls to 'free'.

Signed-off-by: NeilBrown <>
11 years agoGrow: update backup-metadata mtime every time we write it.
NeilBrown [Wed, 21 Oct 2009 23:42:06 +0000 (10:42 +1100)] 
Grow: update backup-metadata mtime every time we write it.

Originally the backup-metadata was only written once at the
start of a raid5 reshape that made the array bigger.  So we only
set the mtime once.

Now that we can be writing metadata continually during an in-place
reshape, we need to update the mtime more often.

Also, allow the metadata mtime to be slightly in advance of the
array mtime.  Normally the difference will be less than a second,
so 10 minutes should be plenty.  This guards against an old backup
file being used to restart an array.  But starting two reshapes in the
10 minutes is sufficiently unlikely, and the possibility of an
accident is already sufficiently small, that 10 minutes is probably

Thanks to Guy Martin <> for discovering and
reporting that .mtime wasn't being updated properly.

Signed-off-by: NeilBrown <>
11 years agoCompile fixes for mdassemble
NeilBrown [Tue, 20 Oct 2009 05:53:43 +0000 (16:53 +1100)] 
Compile fixes for mdassemble

Signed-off-by: NeilBrown <>
11 years agoGrow: reject raid-disks reduction in RAID5 etc before 2.6.32
NeilBrown [Tue, 20 Oct 2009 05:36:03 +0000 (16:36 +1100)] 
Grow: reject raid-disks reduction in RAID5 etc before 2.6.32

2.6.31 has some bugs with restarting a RAID5 reduction, so
refuse to try unless at least 2.6.32.

Signed-off-by: NeilBrown <>
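
A version gate of this kind can be sketched by packing the three version components into one comparable integer. The encoding and both function names below are assumptions chosen for illustration; they are not claimed to match the check in Grow.

```c
/* Hypothetical encoding: pack a three-part version into one integer
 * so that versions compare in the natural order. */
static int version_code(int major, int minor, int patch)
{
	return major * 1000000 + minor * 1000 + patch;
}

/* Refuse the raid-disks reduction unless the kernel is at least 2.6.32. */
static int kernel_can_reduce_raid_disks(int running_version)
{
	return running_version >= version_code(2, 6, 32);
}
```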
11 years agoAssemble: print more verbose messages about restarting a reshape
NeilBrown [Tue, 20 Oct 2009 05:23:45 +0000 (16:23 +1100)] 
Assemble: print more verbose messages about restarting a reshape

Signed-off-by: NeilBrown <>
11 years agoAdd missing 'continue' in Grow_restart.
NeilBrown [Tue, 20 Oct 2009 04:36:49 +0000 (15:36 +1100)] 
Add missing 'continue' in Grow_restart.

Thus we weren't checking the uuid properly.

Signed-off-by: NeilBrown <>
11 years agosuper-intel: Fix compilation of mdassemble.
NeilBrown [Tue, 20 Oct 2009 02:50:23 +0000 (13:50 +1100)] 
super-intel:  Fix compilation of mdassemble.

Signed-off-by: NeilBrown <>
11 years agotestreshape5 fixes.
NeilBrown [Mon, 19 Oct 2009 21:02:53 +0000 (08:02 +1100)] 
testreshape5 fixes.

We seem to need a 'udevadm settle', and possibly the 'sync'.

Signed-off-by: NeilBrown <>
11 years agotests/imsm: allow for rounding of array size.
NeilBrown [Fri, 16 Oct 2009 06:57:28 +0000 (17:57 +1100)] 
tests/imsm:  allow for rounding of array size.

IMSM rounds array size to a multiple of 1024K, so our tests must
assume this.

Signed-off-by: NeilBrown <>
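
A size check that tolerates this rounding could look like the sketch below. It assumes sizes are expressed in KiB and that the rounding is downward; the message only says "rounds", so the direction is an assumption, and both function names are hypothetical.

```c
#include <stdint.h>

#define KIB_PER_MIB 1024ULL

/* Assumption: rounding is downward to a multiple of 1024K. */
static uint64_t round_down_to_mib(uint64_t size_kib)
{
	return size_kib - (size_kib % KIB_PER_MIB);
}

/* Accept any actual size that is within one rounding step below
 * the size that was requested. */
static int size_matches(uint64_t expected_kib, uint64_t actual_kib)
{
	return actual_kib <= expected_kib &&
	       expected_kib - actual_kib < KIB_PER_MIB;
}
```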
11 years agomdopen: only use 'dev' as chosen name if it is a full path.
NeilBrown [Mon, 19 Oct 2009 06:11:15 +0000 (17:11 +1100)] 
mdopen: only use 'dev' as chosen name if it is a full path.

Otherwise using names like "r0" causes problems.  They are
handled sufficiently by other paths in the code.

Signed-off-by: NeilBrown <>
11 years agoAssemble: handle container members better
NeilBrown [Mon, 19 Oct 2009 06:08:04 +0000 (17:08 +1100)] 
Assemble: handle container members better

When looking for a specific member, don't accept a
different member, but step on to the next one.

Signed-off-by: NeilBrown <>
11 years agoAssemble: print verbose messages when finding members in containers
NeilBrown [Mon, 19 Oct 2009 06:04:12 +0000 (17:04 +1100)] 
Assemble: print verbose messages when finding members in containers

.. so that "-Av" gives more hints at what is going on.

Signed-off-by: NeilBrown <>
11 years agoDetail: list containers before members.
NeilBrown [Mon, 19 Oct 2009 06:00:52 +0000 (17:00 +1100)] 
Detail: list containers before members.

To allow "--assemble --scan" to have a chance, list
containers before members in --detail --scan output.

Signed-off-by: NeilBrown <>
11 years agotest/ddf: don't insist that mdadm.conf is always in the same order.
NeilBrown [Mon, 19 Oct 2009 05:58:38 +0000 (16:58 +1100)] 
test/ddf:  don't insist that mdadm.conf is always in the same order.

When created by different processes, the order could reasonably
be different.  So sort before comparing.

Signed-off-by: NeilBrown <>
11 years agotest/raid6integ: correct type
NeilBrown [Mon, 19 Oct 2009 05:57:16 +0000 (16:57 +1100)] 
test/raid6integ: correct type

ddf-zero-restart was misspelled.

Signed-off-by: NeilBrown <>
11 years agotest: udev-settle before testing device.
NeilBrown [Mon, 19 Oct 2009 05:56:13 +0000 (16:56 +1100)] 
test: udev-settle before testing device.

I think we sometimes get way ahead of udev and devices disappear
and appear almost at random.  So add some settling.

Signed-off-by: NeilBrown <>
11 years agomdadm(8): fix spurious space after -e header
Mike Frysinger [Sun, 4 Oct 2009 00:34:55 +0000 (20:34 -0400)] 
mdadm(8): fix spurious space after -e header

Signed-off-by: Mike Frysinger <>
Signed-off-by: NeilBrown <>
11 years agoMonitor: add option to specify rebuild increments
Zdenek Behan [Mon, 19 Oct 2009 02:13:58 +0000 (13:13 +1100)] 
Monitor: add option to specify rebuild increments

i.e. the percentage increments after which a RebuildNN event is generated.

This is particularly useful when using the --program option, rather than
(only) syslog for alerts.

Signed-off-by: Zdenek Behan <>
Signed-off-by: NeilBrown <>
11 years agomdmon: lock current memory as well as future memory.
NeilBrown [Mon, 19 Oct 2009 02:04:16 +0000 (13:04 +1100)] 
mdmon: lock current memory as well as future memory.

mlockall(MCL_FUTURE) only locks mappings that have not yet
been created.  To lock all memory used by the process, we need

Signed-off-by: NeilBrown <>
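
The two `mlockall(2)` flags are complementary: `MCL_CURRENT` covers pages that are already mapped and `MCL_FUTURE` covers mappings created later. A minimal sketch of requesting both follows; the function names are hypothetical and this is not claimed to be the exact call made in mdmon.

```c
#include <sys/mman.h>

/* MCL_CURRENT covers pages already mapped; MCL_FUTURE covers later mappings. */
static int lock_flags(void)
{
	return MCL_CURRENT | MCL_FUTURE;
}

/* Returns 0 on success, -1 with errno set on failure (for example when
 * RLIMIT_MEMLOCK is too low for an unprivileged process). */
static int lock_all_memory(void)
{
	return mlockall(lock_flags());
}
```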
11 years agoMerge git://
NeilBrown [Mon, 19 Oct 2009 01:52:58 +0000 (12:52 +1100)] 
Merge git://

11 years agotests/imsm: allow for rounding of array size.
NeilBrown [Fri, 16 Oct 2009 06:57:28 +0000 (17:57 +1100)] 
tests/imsm:  allow for rounding of array size.

IMSM rounds array size to a multiple of 1024K, so our tests must
assume this.

Signed-off-by: NeilBrown <>
11 years agoTest different r5/r6 layouts.
NeilBrown [Fri, 16 Oct 2009 06:50:07 +0000 (17:50 +1100)] 
Test different r5/r6 layouts.

Make sure kernel and restripe agree on all different layouts.

Signed-off-by: NeilBrown <>
11 years agorestripe: fix assignment of raid6 blocks for syndrome calculation.
NeilBrown [Fri, 16 Oct 2009 06:50:06 +0000 (17:50 +1100)] 
restripe: fix assignment of raid6 blocks for syndrome calculation.

Particularly for the _6 style.

Signed-off-by: NeilBrown <>
11 years agoHandle negative delta_disks in super0 and super1.
NeilBrown [Fri, 16 Oct 2009 06:43:54 +0000 (17:43 +1100)] 
Handle negative delta_disks in super0 and super1.

Signed-off-by: NeilBrown <>
11 years agoGrow_restart to handle reducing number of devices in an array.
NeilBrown [Fri, 16 Oct 2009 06:43:51 +0000 (17:43 +1100)] 
Grow_restart to handle reducing number of devices in an array.

FIXME this is wrong . what direction does reshape_position move?

If the device count in an array is shrinking, the critical
region is different so the tests need to be different when

Signed-off-by: NeilBrown <>
11 years agoGrow: don't make 'blocks' too large during in-place reshape.
NeilBrown [Fri, 16 Oct 2009 06:02:34 +0000 (17:02 +1100)] 
Grow: don't make 'blocks' too large during in-place reshape.

On small (test) arrays, multiplying by 16 can make the 'chunk' size
larger than half the array, which is a problem.

Signed-off-by: NeilBrown <>
11 years agomdmon: preserve socket over chroot
Dan Williams [Wed, 14 Oct 2009 00:37:02 +0000 (17:37 -0700)] 
mdmon: preserve socket over chroot

Connect to the monitor in the old namespace and use that connection for
WaitClean requests when stopping the victim mdmon instance.  This allows
ping_monitor() to work post chroot().

Cc: Hans de Goede <>
Signed-off-by: Dan Williams <>
11 years agomdmon: exec(2) when the switchroot argument is not "/"
Dan Williams [Wed, 14 Oct 2009 00:08:33 +0000 (17:08 -0700)] 
mdmon: exec(2) when the switchroot argument is not "/"

Try to execute mdmon from the target namespace.  When used for initramfs
handovers we need to drop all references to the initramfs filesystem for
that memory to be freed.

Cc: Hans de Goede <>
Signed-off-by: Dan Williams <>
11 years agomdmon: avoid writes in the startup path for mdmon on root arrays
Dan Williams [Wed, 14 Oct 2009 00:41:57 +0000 (17:41 -0700)] 
mdmon: avoid writes in the startup path for mdmon on root arrays

When killing a previous monitor be careful not to cause writes to the
filesystem until the reads necessary to get the monitor operational have

The code is already prepared for errors creating the pid and socket
files, so simply defer creation of these files until after the first
call to manage().

Cc: Hans de Goede <>
Signed-off-by: Dan Williams <>
11 years agoDetail: export MD_UUID from mapfile
Dan Williams [Wed, 14 Oct 2009 00:41:57 +0000 (17:41 -0700)] 
Detail: export MD_UUID from mapfile

The load_super() from an mdadm --detail call may race against an mdmon
update.  When this happens the load_super sees an inconsistent metadata
block and returns an error.  The fallback path to use the map file
contents lacks uuid reporting, so provide __fname_from_uuid for
generically printing a uuid.

Reported-by: Hans de Goede <>
Signed-off-by: Dan Williams <>
11 years agoimsm: regression test for prodigal array member scenario
Dan Williams [Wed, 14 Oct 2009 00:41:53 +0000 (17:41 -0700)] 
imsm: regression test for prodigal array member scenario

Provide a test to sanity check assembly and reassembly in the presence
of conflicting family number information.

Signed-off-by: Dan Williams <>
11 years agoimsm: add --update=uuid support
Dan Williams [Wed, 14 Oct 2009 00:41:53 +0000 (17:41 -0700)] 
imsm: add --update=uuid support

When disks have conflicting container memberships (same container ids
but incompatible member arrays) --update=uuid can be used to move
offenders to a new container id by changing 'orig_family_num'.

Note that this only supports random updates of the uuid as the actual
uuid is synthesized.  We also need to communicate the new
'orig_family_num' value to all disks involved in the update.  A new
field 'update_private' is added to struct mdinfo to allow this
information to be transmitted.

Signed-off-by: Dan Williams <>
11 years agoddf: prevent superblock being zeroed on --update
Dan Williams [Wed, 14 Oct 2009 00:41:53 +0000 (17:41 -0700)] 
ddf: prevent superblock being zeroed on --update

The full fix would be to support updating ddf metadata, but this minimal
fix just prevents the superblock from being zeroed when someone
inadvertently passes an unsupported --update option during assembly.

Reported-by: Hans de Goede <>
Signed-off-by: Dan Williams <>
11 years agoimsm: fix/support --update
Dan Williams [Wed, 14 Oct 2009 00:41:53 +0000 (17:41 -0700)] 
imsm: fix/support --update

Fix init_super_imsm() to return an empty mpb when info == NULL, and
teach store_super_imsm() to simply write out the passed in mpb.


Reported-by: Hans de Goede <>
Signed-off-by: Dan Williams <>
11 years agoimsm: fix spare record writeout race
Dan Williams [Wed, 14 Oct 2009 00:41:53 +0000 (17:41 -0700)] 
imsm: fix spare record writeout race

imsm_activate_spare() in the manager thread may race against
write_super_imsm_spares() in the monitor thread.  Give
write_super_imsm_spares() its own private mpb buffer to prevent
confusing the manager.

This change uncovered cases where spares were not being assembled due to
a failed metadata version number check.  Spares can freely associate
across metadata version number, so reduce the scope of the version check
in the spare assembly case.

Signed-off-by: Dan Williams <>
11 years agorestripe: fix compile warning.
NeilBrown [Mon, 12 Oct 2009 06:00:23 +0000 (17:00 +1100)] 
restripe: fix compile warning.

Just a type cast...

Signed-off-by: NeilBrown <>
11 years agotest changelevel: add tests for changing degraded arrays.
NeilBrown [Mon, 12 Oct 2009 05:57:55 +0000 (16:57 +1100)] 
test changelevel: add tests for changing degraded arrays.

Signed-off-by: NeilBrown <>
11 years agorestripe : various fixed for RAID6 2-failure recovery.
NeilBrown [Mon, 12 Oct 2009 05:57:22 +0000 (16:57 +1100)] 
restripe : various fixed for RAID6 2-failure recovery.

Signed-off-by: NeilBrown <>
11 years agoTest level changes and related reshaping.
NeilBrown [Mon, 12 Oct 2009 05:57:18 +0000 (16:57 +1100)] 
Test level changes and related reshaping.

Signed-off-by: NeilBrown <>
11 years agoGrow: ignore error from final wait_backup
NeilBrown [Mon, 12 Oct 2009 05:55:19 +0000 (16:55 +1100)] 
Grow: ignore error from final wait_backup

The last time wait_backup is called, it might see reshape
finish and so return an error indicator.
But this is not an error, and we must go ahead and prepare
the array for full access.

Signed-off-by: NeilBrown <>
11 years agoGrow: make sure bsb2 is properly aligned
NeilBrown [Mon, 12 Oct 2009 05:55:12 +0000 (16:55 +1100)] 
Grow: make sure bsb2 is properly aligned

We do O_DIRECT io in bsb2, so it must be aligned
properly.  Easiest if it is static.

Signed-off-by: NeilBrown <>
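
A statically allocated buffer can carry an explicit alignment, which is one reason a static object is convenient for O_DIRECT I/O. The sketch below uses C11 `_Alignas`; the 4096-byte figure and all names are assumptions, since the alignment actually required depends on the device and filesystem.

```c
#include <stdint.h>

#define BUF_ALIGN 4096   /* assumed alignment, not taken from the patch */

/* A static object with explicit alignment needs no runtime
 * allocation or pointer adjustment. */
static _Alignas(BUF_ALIGN) unsigned char io_buf[BUF_ALIGN];

/* Check whether a pointer is a multiple of the given alignment. */
static int is_aligned(const void *p, uintptr_t align)
{
	return ((uintptr_t)p % align) == 0;
}
```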
11 years agotestreshape5 - add tests for RAID6
NeilBrown [Mon, 12 Oct 2009 05:55:05 +0000 (16:55 +1100)] 
testreshape5 - add tests for RAID6

.. to make sure our raid6 calculations are working.

Signed-off-by: NeilBrown <>
11 years agoMerge branch 'master' into devel-3.1
NeilBrown [Thu, 1 Oct 2009 06:58:40 +0000 (16:58 +1000)] 
Merge branch 'master' into devel-3.1


11 years agoFix null-dereference in set_member_info
NeilBrown [Thu, 1 Oct 2009 02:51:04 +0000 (12:51 +1000)] 
Fix null-dereference in set_member_info

set_member_info would try to dereference ->metadata_version, without
checking that it isn't NULL.

Signed-off-by: NeilBrown <>
11 years agoAdd missing space in "--detail --brief" output.
NeilBrown [Thu, 1 Oct 2009 02:38:31 +0000 (12:38 +1000)] 
Add missing space in "--detail --brief" output.

We need a space between the device name and the word "level".

Signed-off-by: NeilBrown <>
11 years agoimsm: disambiguate family_num
Dan Williams [Wed, 30 Sep 2009 18:45:41 +0000 (11:45 -0700)] 
imsm: disambiguate family_num

This is a result of trawling through the Windows implementation to learn
the mechanism of how it disambiguates family_num.  It is a continuation
of commit 148acb7b "imsm: fix family number handling" which introduced a
regression when reassembling a container with stale disks and rebuilt

When rebuilding, a new family number is assigned to protect against the
"prodigal array member" problem.  It prevents a former family member
from returning to the system and causing a rebuild to go the wrong
direction.  However, this invalidates looking at the generation number to
determine the most up-to-date disk when comparing across family numbers.
Instead the assembly logic looks for agreement between a disk's local
family membership compared against a global list of all families in the
system.  Whenever a disk's local metadata does not match a family number
on the global list that family number is marked offline.

It is possible that this logic results in multiple incompatible but
valid family numbers existing in a container.  In this case mdadm.conf
cannot be consulted because it only records the uuid which is generated
from static fields in the metadata.  The metadata lacks the data needed
to disambiguate "local" versus "foreign".  The "foreign" array in this
case requires updating to change its container-id information
(orig_family_num), and possibly the member array names.

Signed-off-by: Dan Williams <>
11 years agoimsm: kill close() of component device
Dan Williams [Wed, 30 Sep 2009 18:44:38 +0000 (11:44 -0700)] 
imsm: kill close() of component device

None of the other formats close the passed in fd at load, and this
becomes a problem when trying to support --update where we need O_EXCL
protection across the entire operation.

Signed-off-by: Dan Williams <>
11 years agoimsm: cleanup disk status tests
Dan Williams [Mon, 28 Sep 2009 21:40:59 +0000 (14:40 -0700)] 
imsm: cleanup disk status tests

Add is_failed(), is_configured(), and is_spare() helpers to clean up
disk status flag testing.

Signed-off-by: Dan Williams <>
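
Helpers of this shape wrap a bit test on the disk's status word. In the sketch below the flag values and the `struct disk` type are illustrative assumptions standing in for the real metadata definitions; only the three helper names come from the message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative flag values; treat them as assumptions. */
#define SPARE_DISK      0x01u
#define CONFIGURED_DISK 0x02u
#define FAILED_DISK     0x04u

/* Simplified stand-in for the metadata disk record. */
struct disk {
	uint32_t status;
};

static bool is_spare(const struct disk *d)
{
	return (d->status & SPARE_DISK) != 0;
}

static bool is_configured(const struct disk *d)
{
	return (d->status & CONFIGURED_DISK) != 0;
}

static bool is_failed(const struct disk *d)
{
	return (d->status & FAILED_DISK) != 0;
}
```

Call sites then read as `if (is_failed(disk))` rather than repeating the mask expression.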
11 years agoRelease mdadm-3.0.2 mdadm-3.0.2
NeilBrown [Fri, 25 Sep 2009 08:19:07 +0000 (18:19 +1000)] 
Release mdadm-3.0.2
Just one bugfix.

11 years agosuper0: fix crash on assemble if homehost is not set.
NeilBrown [Fri, 25 Sep 2009 07:56:22 +0000 (17:56 +1000)] 
super0: fix crash on assemble if homehost is not set.

If homehost is not set, typically during early boot,
an assemble of v0.90 metadata arrays will crash.

Reported-by: Paweł Sikora <>
Signed-off-by: NeilBrown <>
11 years agoFix raid6 error recovery in 'restripe' code.
NeilBrown [Fri, 25 Sep 2009 07:23:33 +0000 (17:23 +1000)] 
Fix raid6 error recovery in 'restripe' code.

Thanks to Matthias Urlichs for discovering and reporting this.

Signed-off-by: NeilBrown <>
11 years agoRelease mdadm-3.0.1 mdadm-3.0.1
NeilBrown [Fri, 25 Sep 2009 07:08:19 +0000 (17:08 +1000)] 
Release mdadm-3.0.1

Just bugfixes.

Signed-off-by: NeilBrown <>
11 years agotestreshape5 - flush devices between tests.
NeilBrown [Fri, 25 Sep 2009 06:57:01 +0000 (16:57 +1000)] 
testreshape5 - flush devices between tests.

We need to flush the block devices before reading different data.

Signed-off-by: NeilBrown <>
11 years agoMerge branch 'master' of git://
NeilBrown [Fri, 25 Sep 2009 04:11:11 +0000 (14:11 +1000)] 
Merge branch 'master' of git://

11 years agomdmon: fix freeing unallocated memory
Hans de Goede [Thu, 24 Sep 2009 13:52:06 +0000 (06:52 -0700)] 
mdmon: fix freeing unallocated memory

mdmon was creating a supertype struct with malloc, and thus not
necessarily getting zero-d memory.

This was causing it to segfault when called like this from the initrd:
/sbin/mdmon /proc/mdstat /sysroot

The problem was that  load_super_imsm would get called on the non-zero'd
super struct, which in turn calls free_super_imsm, which checks st->sb,
which should be zero but isn't and then starts freeing bogus memory.

Signed-off-by: Dan Williams <>
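
One common remedy for this class of bug is to allocate with `calloc` so that pointer members start out as NULL. The sketch below uses a hypothetical, cut-down structure; it illustrates the pattern and is not claimed to be the change made in mdmon.

```c
#include <stdlib.h>

/* Hypothetical, cut-down stand-in for the supertype structure. */
struct supertype_sketch {
	void *sb;   /* superblock pointer, expected to be NULL until loaded */
};

/* calloc zero-fills the allocation, so ->sb reads as NULL on the
 * platforms where all-bits-zero is the null pointer (all common ones). */
static struct supertype_sketch *alloc_supertype(void)
{
	return calloc(1, sizeof(struct supertype_sketch));
}
```

With `malloc`, the same structure would need an explicit `memset` or field-by-field initialisation before any code that tests `->sb`.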
11 years agoimsm: clear CONFIGURED_DISK for failed drives
Dan Williams [Tue, 15 Sep 2009 18:35:28 +0000 (11:35 -0700)] 
imsm: clear CONFIGURED_DISK for failed drives

Synchronizing with what the Windows driver does.

Signed-off-by: Dan Williams <>
11 years agoimsm: kill USABLE_DISK flag
Dan Williams [Tue, 15 Sep 2009 18:35:28 +0000 (11:35 -0700)] 
imsm: kill USABLE_DISK flag

'USABLE_DISK' is not a 'persistent' status flag; it is an internal status
flag used for the in memory representation of the disk in the Windows

Signed-off-by: Dan Williams <>