git.ipfire.org Git - thirdparty/mdadm.git/log
9 years agoRelease mdadm-3.3.1 mdadm-3.3.1
NeilBrown [Thu, 5 Jun 2014 06:45:56 +0000 (16:45 +1000)] 
Release mdadm-3.3.1

Signed-off-by: NeilBrown <neilb@suse.de>
9 years agoMake sure "make everything" builds again.
NeilBrown [Thu, 5 Jun 2014 06:38:29 +0000 (16:38 +1000)] 
Make sure "make everything" builds again.

Signed-off-by: NeilBrown <neilb@suse.de>
9 years agoIncremental: remove old devices when assembling in container.
NeilBrown [Thu, 5 Jun 2014 05:58:31 +0000 (15:58 +1000)] 
Incremental: remove old devices when assembling in container.

When assembling a native array we just give all devices to the kernel
and leave it to discard the 'old' ones (based on sequence/event
number).

For external/container arrays, mdadm needs to do that.

So in assemble_container_content, get list of current devices in
array and discard any that aren't in the 'content' given.
They must have been rejected by metadata manager.

If we cannot discard old devices the array must already be active, so
just leave it alone, but with a message.

Signed-off-by: NeilBrown <neilb@suse.de>
9 years agoimsm: retry load_and_parse_mpb if we suspect mdmon has made modifications
Artur Paszkiewicz [Mon, 2 Jun 2014 13:02:59 +0000 (15:02 +0200)] 
imsm: retry load_and_parse_mpb if we suspect mdmon has made modifications

If the checksum verification fails in mdadm and mdmon is running, retry
the load to get a consistent snapshot of the mpb.

Based on db575f3b

Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
9 years agoGrow: Do not fork via systemd if freeze_reshape is set
Baldysiak, Pawel [Fri, 30 May 2014 14:40:11 +0000 (14:40 +0000)] 
Grow: Do not fork via systemd if freeze_reshape is set

Mdadm should not run 'grow-continue' unit file for container if
'--freeze-reshape' argument is passed. Otherwise it will be ignored,
and reshape will start anyway.

Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Reviewed-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
9 years agoGrow: Use 'forked' also for reshape_container in Grow_continue
Baldysiak, Pawel [Fri, 30 May 2014 14:39:28 +0000 (14:39 +0000)] 
Grow: Use 'forked' also for reshape_container in Grow_continue

Similar to commit 06e293d0970e36b1ed049b9d3ccb21a870e9d2eb
same thing should be done for reshape_container in Grow_continue

Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Reviewed-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
9 years agoDo not set default 'before.layout' when reshaping from RAID4 to RAID4
Baldysiak, Pawel [Fri, 30 May 2014 14:38:09 +0000 (14:38 +0000)] 
Do not set default 'before.layout' when reshaping from RAID4 to RAID4

Commit fdcad551e9a54c4aa8c4b63160b76e2c539a0441
brings some changes to reshape process.
Setting 'before.layout' when reshaping from RAID4 to another RAID4 is
not really necessary.
If reshape is restarted 'before.layout' will be compared with
'info->array.layout' in reshape_array(). Changes brought by mentioned
commit will cause this comparison to return false, because 'array.layout'
is always set to 'ALGORITHM_PARITY_N' in analyse_change() for RAID4, so
reshape will not be continued after reboot/stop.
This patch reverts unnecessary changes.

Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Reviewed-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
9 years agomdcheck: don't pass the '+' to "date".
NeilBrown [Sun, 25 May 2014 23:37:05 +0000 (09:37 +1000)] 
mdcheck: don't pass the '+' to "date".

It isn't needed, and makes it harder to describe what --duration does.

Signed-off-by: NeilBrown <neilb@suse.de>
9 years agoDDF: remove "BVD xx is missing".
NeilBrown [Thu, 22 May 2014 07:22:47 +0000 (17:22 +1000)] 
DDF: remove "BVD xx is missing".

This can happen in normal cases during incremental assembly so
printing an error message is confusing.

Signed-off-by: NeilBrown <neilb@suse.de>
9 years agoinstall: use BINDIR consistently to locate mdadm and mdmon
NeilBrown [Thu, 22 May 2014 07:13:02 +0000 (17:13 +1000)] 
install: use BINDIR consistently to locate mdadm and mdmon

Every place where the path for mdadm or mdmon is explicit,
it should use the BINDIR setting, not "/sbin/".

Reported-by: member graysky <graysky@archlinux.us> (https://bugs.archlinux.org/task/37330)
Signed-off-by: NeilBrown <neilb@suse.de>
9 years agomdcheck: new script to help with regular checks of md arrays.
NeilBrown [Thu, 22 May 2014 06:00:39 +0000 (16:00 +1000)] 
mdcheck: new script to help with regular checks of md arrays.

This script allows arrays to be 'checked' for a limited amount
of time on a regular basis.

For example, running

 mdcheck --duration 6hours

early every Sunday morning and

 mdcheck --continue 6hours

every other morning will check all arrays every week, but if that takes
more than 6 hours, it won't run into the day, but will be continued
the next morning, and the next ... etc.

Signed-off-by: NeilBrown <neilb@suse.de>
9 years agoMISC: add --action option to set or abort check/repair.
NeilBrown [Thu, 22 May 2014 05:55:31 +0000 (15:55 +1000)] 
MISC: add --action option to set or abort check/repair.

Signed-off-by: NeilBrown <neilb@suse.de>
9 years ago--examine-bitmap: give useful message if no bitmap found on md array.
NeilBrown [Thu, 22 May 2014 05:22:39 +0000 (15:22 +1000)] 
--examine-bitmap: give useful message if no bitmap found on md array.

The bitmap is stored on member devices, not on the array, so
--examine-bitmap should be given the member device.
If --examine-bitmap is given an array, and it doesn't have a bitmap
on it (i.e. it isn't a member of some other array), then that
is probably a usage error, so print a helpful message.

Signed-off-by: NeilBrown <neilb@suse.de>
9 years agomdadm: Do not reimplement offsetof
Cristian Rodríguez [Wed, 21 May 2014 16:45:19 +0000 (12:45 -0400)] 
mdadm: Do not reimplement offsetof

Proper implementations have offsetof in stddef.h
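
As a minimal sketch of what relying on the standard macro looks like, the following uses `offsetof` from `<stddef.h>` on a made-up structure. The structure and function names are assumptions for illustration only and are not taken from mdadm.

```c
#include <assert.h>
#include <stddef.h>	/* offsetof */
#include <stdint.h>

/* Hypothetical record layout, used only to demonstrate offsetof. */
struct example_record {
	uint32_t magic;
	uint32_t seq;
	uint8_t  payload[8];
};

/* Byte offset of the 'seq' field, taken from the standard macro. */
size_t example_seq_offset(void)
{
	return offsetof(struct example_record, seq);
}

/* Byte offset of the 'payload' array. */
size_t example_payload_offset(void)
{
	return offsetof(struct example_record, payload);
}

/* Sanity check: 'seq' directly follows one 32-bit field. */
void example_selfcheck(void)
{
	assert(example_seq_offset() == sizeof(uint32_t));
}
```

Using the library macro avoids the null-pointer arithmetic that hand-written versions usually depend on.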

Signed-off-by: NeilBrown <neilb@suse.de>
9 years agoGrow: fix recent grow_continue breakage.
NeilBrown [Thu, 22 May 2014 04:22:58 +0000 (14:22 +1000)] 
Grow: fix recent grow_continue breakage.

Commit 5e76dce1acd906e8fc8af04973c3a129cdc77fd6 changed
Grow_continue to assume a fork had already happened, so that
   mdadm --grow --continue

didn't fork.  This is good, but it means that if Grow_continue
is run from Assemble, then
  mdadm --assemble ....

can misbehave if the array was in the middle of a reshape.

So introduce finer control.  Grow_continue only assumes it has
already forked if run from "mdadm --grow --continue".

Signed-off-by: NeilBrown <neilb@suse.de>
9 years agoDDF: remove some pointless code in validate_geometry
NeilBrown [Wed, 21 May 2014 04:03:48 +0000 (14:03 +1000)] 
DDF: remove some pointless code in validate_geometry

I'm not sure what this was supposed to do, but it isn't needed
as creating on a container and on individual devices (in a container)
work fine already.

Signed-off-by: NeilBrown <neilb@suse.de>
9 years agoDDF: remove a FIXME comment that doesn't seem to mean anything.
NeilBrown [Wed, 21 May 2014 03:51:33 +0000 (13:51 +1000)] 
DDF: remove a FIXME comment that doesn't seem to mean anything.

Signed-off-by: NeilBrown <neilb@suse.de>
9 years agoDDF: remove 'FIXME' comment that doesn't need fixing.
NeilBrown [Wed, 21 May 2014 03:50:52 +0000 (13:50 +1000)] 
DDF: remove 'FIXME' comment that doesn't need fixing.

It appears this is correct, though for consistency with elsewhere
we check that pdnum is not negative.

Signed-off-by: NeilBrown <neilb@suse.de>
9 years agoDDF: ensure dl->devname is freed when processing a 'delete device' update.
NeilBrown [Wed, 21 May 2014 03:27:54 +0000 (13:27 +1000)] 
DDF: ensure dl->devname is freed when processing a 'delete device' update.

As this code runs in 'monitor' it cannot just free memory,
it must add it to a list for 'manager' to free.
Fortunately update->space_list exists for just this purpose.
dl->devname might be small, so put it in update->space and
put dl in update->space_list.

Signed-off-by: NeilBrown <neilb@suse.de>
9 years agoDDF: remove old comment about looking for spares.
NeilBrown [Wed, 21 May 2014 03:10:03 +0000 (13:10 +1000)] 
DDF: remove old comment about looking for spares.

As handle_missing() sets ->check_degraded, nothing else needs to be
done here.

Signed-off-by: NeilBrown <neilb@suse.de>
9 years agoDDF: remove an old incorrect FIXME comment.
NeilBrown [Wed, 21 May 2014 03:00:08 +0000 (13:00 +1000)] 
DDF: remove an old incorrect FIXME comment.

We mustn't close fds in write_init_super if ->update_tail
was set.

Signed-off-by: NeilBrown <neilb@suse.de>
9 years agoDDF: add data-offset information to --examine output.
NeilBrown [Wed, 21 May 2014 02:43:40 +0000 (12:43 +1000)] 
DDF: add data-offset information to --examine output.

 Raid Devices[1] : 5 (4@20000K 3@20000K 2@0K 1@0K 0@0K)

The data offsets are 20000K and 0K.

Signed-off-by: NeilBrown <neilb@suse.de>
9 years agoDDF: split up ddf_process_update
NeilBrown [Wed, 21 May 2014 02:20:56 +0000 (12:20 +1000)] 
DDF: split up ddf_process_update

Function was way too big, make several smaller functions.

Signed-off-by: NeilBrown <neilb@suse.de>
9 years agotests: handle change to DDF assembly.
NeilBrown [Tue, 13 May 2014 02:22:03 +0000 (12:22 +1000)] 
tests: handle change to DDF assembly.

When a DDF array is assembled with missing devices, those devices
are now always marked as 'missing' and cannot just re-appear in the array
and be working again.

The test must be changed to acknowledge this.

Signed-off-by: NeilBrown <neilb@suse.de>
9 years agotests: handle new raid10/ddf geometries.
NeilBrown [Tue, 13 May 2014 02:19:40 +0000 (12:19 +1000)] 
tests: handle new raid10/ddf geometries.

Recent changes to support more ddf geometries using raid1e
require updates to tests.

Signed-off-by: NeilBrown <neilb@suse.de>
9 years agoDDF: add support of --data-offset when creating array.
NeilBrown [Tue, 6 May 2014 04:52:24 +0000 (14:52 +1000)] 
DDF: add support of --data-offset when creating array.

Infrastructure is there, so use it.

This requires making sure that ->data_offset is correctly set, even
for containers.

Signed-off-by: NeilBrown <neilb@suse.de>
9 years agoDDF: factor out common code for search through extents.
NeilBrown [Tue, 6 May 2014 04:47:03 +0000 (14:47 +1000)] 
DDF: factor out common code for search through extents.

Each place that uses "get_extents" has slightly different search code
to look through the result.

Factor this out into a single find_space() function.

This will make it easier to add --data-offset support.

Signed-off-by: NeilBrown <neilb@suse.de>
9 years agoDDF: allow for unused slots when creating map list for getinfo_super_ddf.
NeilBrown [Tue, 6 May 2014 01:42:12 +0000 (11:42 +1000)] 
DDF: allow for unused slots when creating map list for getinfo_super_ddf.

Signed-off-by: NeilBrown <neilb@suse.de>
9 years agoDDF: DDF_Missing devices should not be reported as 'working' by getinfo_super_ddf
NeilBrown [Tue, 6 May 2014 01:29:49 +0000 (11:29 +1000)] 
DDF: DDF_Missing devices should not be reported as 'working' by getinfo_super_ddf

Signed-off-by: NeilBrown <neilb@suse.de>
9 years agoDDF: remove old and wrong comment about setting raid_disk.
NeilBrown [Mon, 28 Apr 2014 07:01:04 +0000 (17:01 +1000)] 
DDF: remove old and wrong comment about setting raid_disk.

Signed-off-by: NeilBrown <neilb@suse.de>
9 years agoDDF: provide simple detail_super() implementation.
NeilBrown [Mon, 28 Apr 2014 06:50:57 +0000 (16:50 +1000)] 
DDF: provide simple detail_super() implementation.

Just print the GUID, Seq and number of VDs in the container.

Signed-off-by: NeilBrown <neilb@suse.de>
9 years agoDDF: support more RAID10 levels.
NeilBrown [Mon, 28 Apr 2014 05:31:50 +0000 (15:31 +1000)] 
DDF: support more RAID10 levels.

The DDF "RAID1E" level is similar to md "raid10".

So use raid10 to support RAID1E, and create RAID1E for raid10
configs not already supported.

Signed-off-by: NeilBrown <neilb@suse.de>
9 years agoDDF: explain why spare_refs are ignored.
NeilBrown [Thu, 10 Apr 2014 02:57:25 +0000 (12:57 +1000)] 
DDF: explain why spare_refs are ignored.

Signed-off-by: NeilBrown <neilb@suse.de>
9 years agoDDF: use array_size from metadata.
NeilBrown [Thu, 10 Apr 2014 02:54:13 +0000 (12:54 +1000)] 
DDF: use array_size from metadata.

If some other controller sets a number smaller than a calculation
would give us, we really should honour it.

Signed-off-by: NeilBrown <neilb@suse.de>
9 years agoDDF: set utime for container from timestamp in superblock.
NeilBrown [Thu, 10 Apr 2014 01:44:50 +0000 (11:44 +1000)] 
DDF: set utime for container from timestamp in superblock.

Also be more consistent about setting events from seq in superblock.

Signed-off-by: NeilBrown <neilb@suse.de>
9 years agoDDF: don't assume the anchor is fully up-to-date.
NeilBrown [Thu, 10 Apr 2014 01:41:18 +0000 (11:41 +1000)] 
DDF: don't assume the anchor is fully up-to-date.

We currently copy the anchor to both primary and secondary
blocks.
This assumes that the anchor is uptodate, but it might not be.
We should trust the 'active' block and copy from there.

Signed-off-by: NeilBrown <neilb@suse.de>
9 years agoDDF: update timestamp/seqnum for virtual disks when config changes.
NeilBrown [Thu, 10 Apr 2014 01:34:56 +0000 (11:34 +1000)] 
DDF: update timestamp/seqnum for virtual disks when config changes.

- we weren't updating this timestamp at all
- the 'vd_config' seqnum was updated on every write of the metadata,
  which is excessive.  Just update it when there is a change.

Signed-off-by: NeilBrown <neilb@suse.de>
9 years agoDDF: update timestamp in DDF header.
NeilBrown [Wed, 9 Apr 2014 07:11:57 +0000 (17:11 +1000)] 
DDF: update timestamp in DDF header.

Doco says:
  Header update timestamp. MUST be set when the DDF
  header is updated.

So I guess we should.

Signed-off-by: NeilBrown <neilb@suse.de>
9 years agoDDF: avoid ref outside array in getinfo_super_ddf_bvd
NeilBrown [Wed, 9 Apr 2014 06:59:49 +0000 (16:59 +1000)] 
DDF: avoid ref outside array in getinfo_super_ddf_bvd

As we are range-checking 'cd', there is a chance that it is not
in-range.  In that case we should include all array indexes with 'cd'
inside the range-tested branch.
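
The idea can be sketched as follows: every use of the index stays inside the branch that validated it. The table, its size, and the function name are hypothetical and do not come from super-ddf.c; only the variable name `cd` follows the commit message.

```c
#include <assert.h>

#define EXAMPLE_SLOTS 4	/* hypothetical table size */

/* Hypothetical table standing in for a per-slot metadata array. */
static const int example_table[EXAMPLE_SLOTS] = { 10, 20, 30, 40 };

/*
 * Return the entry for slot 'cd', or 'fallback' if 'cd' is out of range.
 * Every use of 'cd' as an index sits inside the range-tested branch.
 */
int example_lookup(int cd, int fallback)
{
	int value = fallback;

	/* Sanity check on the sketch itself: table and bound must agree. */
	assert(sizeof(example_table) / sizeof(example_table[0]) == EXAMPLE_SLOTS);

	if (cd >= 0 && cd < EXAMPLE_SLOTS) {
		/* Safe: 'cd' was validated before indexing. */
		value = example_table[cd];
	}
	return value;
}
```

An out-of-range `cd` then yields the fallback instead of a read outside the array.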

Signed-off-by: NeilBrown <neilb@suse.de>
9 years agoDDF: examine_pds to also list devices that aren't in the metadata.
NeilBrown [Wed, 9 Apr 2014 06:56:45 +0000 (16:56 +1000)] 
DDF: examine_pds to also list devices that aren't in the metadata.

The phys disks table should list all disks, but if the metadata
is corrupt, it might not even list the disk it was read from.
So check for and report any known disks that aren't listed.

Signed-off-by: NeilBrown <neilb@suse.de>
9 years agoDDF: fix usage of ->used_pdes
NeilBrown [Wed, 9 Apr 2014 06:45:27 +0000 (16:45 +1000)] 
DDF: fix usage of ->used_pdes

The "used_pdes" value counts the number of physdisk entries that
are in use.
It may not be the last one in use as there may be unused slots in
the middle.

So when we are iterating over phys disks, we need to use max_pdes
and skip unused entries.
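
A minimal sketch of that iteration pattern, assuming a simplified slot structure: the `in_use` flag, the `example_` names, and the sample table are all hypothetical and only illustrate why a loop bounded by the count of used entries can miss a disk that sits after a hole.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical slot record; in_use == 0 marks an unused entry. */
struct example_pd {
	int in_use;
	unsigned int refnum;
};

/* Sample table: 2 entries in use, 4 slots in total, holes in the middle. */
const struct example_pd example_table[4] = {
	{ 1, 0x11 }, { 0, 0 }, { 0, 0 }, { 1, 0x22 },
};

/*
 * Find the slot holding 'refnum' by walking 'limit' slots and skipping
 * unused ones. Returns the slot index, or -1 if it is not found.
 */
int example_find_refnum(const struct example_pd *pds, unsigned int limit,
			unsigned int refnum)
{
	unsigned int i;

	assert(pds != NULL);
	for (i = 0; i < limit; i++) {
		if (!pds[i].in_use)
			continue;	/* hole in the table, keep scanning */
		if (pds[i].refnum == refnum)
			return (int)i;
	}
	return -1;
}
```

With the sample table, scanning all 4 slots finds 0x22 in slot 3, while scanning only 2 slots (the number in use) does not find it.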

Signed-off-by: NeilBrown <neilb@suse.de>
9 years agoDDF: more guards against pdnum being negative.
NeilBrown [Wed, 9 Apr 2014 06:35:18 +0000 (16:35 +1000)] 
DDF: more guards against pdnum being negative.

With consistent metadata, pdnum should never be negative,
but it is better to be safe than sorry.

Signed-off-by: NeilBrown <neilb@suse.de>
9 years agoReshape: use systemd to continue containers as well as native arrays.
NeilBrown [Tue, 20 May 2014 06:59:58 +0000 (16:59 +1000)] 
Reshape: use systemd to continue containers as well as native arrays.

Signed-off-by: NeilBrown <neilb@suse.de>
9 years agoGrow: split continue_via_systemd into a separate function.
NeilBrown [Tue, 20 May 2014 06:56:51 +0000 (16:56 +1000)] 
Grow: split continue_via_systemd into a separate function.

This allows it to be used for containers too.

Signed-off-by: NeilBrown <neilb@suse.de>
9 years agoGrow: add 'forked' option to reshape_container.
NeilBrown [Tue, 20 May 2014 06:51:56 +0000 (16:51 +1000)] 
Grow: add 'forked' option to reshape_container.

This is a better match for reshape_array() and means that
"mdadm --grow --continue" will run in the foreground, which
makes more sense.

Signed-off-by: NeilBrown <neilb@suse.de>
9 years agoGrow: try to let "--grow --continue" from systemd complete a reshape.
NeilBrown [Wed, 14 May 2014 06:34:06 +0000 (16:34 +1000)] 
Grow: try to let "--grow --continue" from systemd complete a reshape.

If "--assemble" or "--incremental" is started by udev, then
monitoring the reshape in the background won't work.

So try asking systemd to start a grow-continue.

If that fails, just do it the old way.

Signed-off-by: NeilBrown <neilb@suse.de>
9 years agoGrow: store a link to current backup file in /run/mdadm or similar.
NeilBrown [Thu, 15 May 2014 04:23:16 +0000 (14:23 +1000)] 
Grow: store a link to current backup file in /run/mdadm or similar.

Subsequent patch will allow the background part of "mdadm --grow" to
be run from systemd.  This can require the passing of a backup file
name.
To do this, store that name as a symlink in /run/mdadm (or MAP_DIR)
and look for it when appropriate.

It might be useful to also store the name across reboot, but that
would be a different patch.  We would need to use the uuid to identify
it, and store it in stable storage.

Signed-off-by: NeilBrown <neilb@suse.de>
9 years agoCreate: don't default to bitmap=internal when it is not supported
Artur Paszkiewicz [Tue, 15 Apr 2014 08:01:44 +0000 (10:01 +0200)] 
Create: don't default to bitmap=internal when it is not supported

For large arrays (component size > 100GB) if write-intent bitmap is not
enabled, then it is set by default to "internal", even if the metadata
format does not support internal bitmaps, which causes Create to fail.

This patch adds checking if add_internal_bitmap is set in the
superswitch before setting bitmap_file to "internal".
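
The check described above might look roughly like the sketch below. The cut-down handler structure, the function names, and the use of gigabytes as the size unit are assumptions for illustration; only the `add_internal_bitmap` hook name and the 100GB threshold come from the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, cut-down handler table: a NULL hook means "not supported". */
struct example_superswitch {
	int (*add_internal_bitmap)(void);
};

/* Placeholder hook so one sample handler can claim bitmap support. */
static int example_hook(void)
{
	return 0;
}

const struct example_superswitch example_with_bitmap = { example_hook };
const struct example_superswitch example_without_bitmap = { NULL };

/*
 * Decide whether Create should default to an internal bitmap.
 * Returns 1 only if no bitmap was requested, the component is larger
 * than 100GB, and the metadata handler provides the hook.
 */
int example_should_default_internal(const struct example_superswitch *ss,
				    int bitmap_requested,
				    unsigned long long component_gb)
{
	assert(ss != NULL);
	if (bitmap_requested)
		return 0;	/* an explicit choice is left alone */
	if (component_gb <= 100)
		return 0;	/* small arrays get no default bitmap */
	return ss->add_internal_bitmap != NULL;
}
```

A handler without the hook then never receives the "internal" default, regardless of array size.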

Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
9 years agoFix race between --create and --incremental
Artur Paszkiewicz [Wed, 9 Apr 2014 15:14:59 +0000 (17:14 +0200)] 
Fix race between --create and --incremental

This modifies locking in Create to eliminate a situation where
--incremental can assemble a device between write_init_super() and
add_disk(), which causes Create to fail.

It sporadically occurs e.g. when metadata is written on a device,
causing a udev change event which triggers mdadm --incremental.

Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
10 years agosystemd: various fixes for boot with container-arrays.
NeilBrown [Tue, 8 Apr 2014 07:22:18 +0000 (17:22 +1000)] 
systemd: various fixes for boot with container-arrays.

1/ Add systemd shutdown script to ensure DDF and IMSM are
   clean before we actually shutdown

2/ Get udev to tell systemd to run the mdmon@mdXXX.service
   units when a member array appears.

   If we boot off a member array (with dracut at least),
   the mdmon started in the initramfs will lose track of
   /sys etc, so we need to restart it.
   systemd will try to forget about it too (but not actually
   kill it because we said not to do this).
   Having udev tell it to start it will allow a new mdmon to
   run which can see /sys, and systemd will know about it.

3/ Always use --offroot and --takeover when starting mdmon with
   systemd
   --offroot is needed else shutdown will hang.
   --takeover is needed in case an mdmon was started earlier
   (e.g. in initramfs).
   Neither hurts if they aren't actually needed.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoDDF: Don't fail compare_super_ddf due to re-configure changes.
NeilBrown [Wed, 2 Apr 2014 04:26:35 +0000 (15:26 +1100)] 
DDF: Don't fail compare_super_ddf due to re-configure changes.

It is possible that one device has seen some reconfig but the other
hasn't.  In that case  they are still the "same" DDF, even though
one might be older.  Such age will be detected by 'seq' differences.

If A is new and B is old, then it is important that
  mdadm -I B
  mdadm -I A

doesn't get confused because A has the same uuid as B, but compare_super fails.

So: if the seq numbers are different, then just accept as two
different superblocks.
If they are the same, then look to copy data from new to old.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoDDF: fix possible mdmon crash when updating metadata.
NeilBrown [Wed, 2 Apr 2014 04:14:43 +0000 (15:14 +1100)] 
DDF: fix possible mdmon crash when updating metadata.

Testing 'c' and then using 'vdc' assumes that the two are in sync,
but sometimes they aren't.
Testing 'vdc' is safer.
This avoids a crash in some cases when failing/removing/adding devices
to a DDF.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoDDF: guard against ->pdnum being negative.
NeilBrown [Wed, 2 Apr 2014 02:34:10 +0000 (13:34 +1100)] 
DDF: guard against ->pdnum being negative.

It is conceivable that ->pdnum could be -1, though only if
the metadata is corrupt.
We should be careful not to use it if it is.

Also remove an assignment for pdnum to ->container_member.
This is never used and cannot possibly mean anything.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoDDF: mark missing-on-assembly device properly.
NeilBrown [Tue, 1 Apr 2014 05:15:06 +0000 (16:15 +1100)] 
DDF: mark missing-on-assembly device properly.

As well as removing from the array we really should mark
it as 'failed', and mark the array as degraded.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoDDF: Fix assorted typos and do some reformatting.
NeilBrown [Tue, 1 Apr 2014 04:57:09 +0000 (15:57 +1100)] 
DDF: Fix assorted typos and do some reformatting.

..because it is more fun when new patches are harder to apply to old version :-)

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoraid6check.c: move manual repair code to separate function
Piergiorgio Sartor [Sat, 15 Mar 2014 17:33:13 +0000 (18:33 +0100)] 
raid6check.c: move manual repair code to separate function

This patch cleans up the code a bit by moving
the second repair mode, that is the manual
repair, to a separate function.

Signed off: piergiorgio.sartor@nexgo.de

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoraid6check.c: move autorepair code to separate function
Piergiorgio Sartor [Sat, 15 Mar 2014 16:56:22 +0000 (17:56 +0100)] 
raid6check.c: move autorepair code to separate function

This patch cleans up the code a bit by moving
the autorepair part into a separate function.

Signed off: piergiorgio.sartor@nexgo.de

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoraid6check.c: lock the stripe until necessary
Piergiorgio Sartor [Sat, 15 Mar 2014 15:37:52 +0000 (16:37 +0100)] 
raid6check.c: lock the stripe until necessary

The stripe locking mechanism must be atomic between
the check and the, potential, autorepair.
For this reason, the autorepair code needs to be just
after the check and both parts (check and autorepair)
must be executed under stripe lock.
Of course, the manual repair can operate as before.
This patch reorganizes the code and provides the single,
atomic, stripe lock.
It should be confirmed that this new locking is not
too demanding.
In case it is, some other solutions will be required
(suggestions welcome).

Signed off: piergiorgio.sartor@nexgo.de

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoddf-sudden-degraded test fix.
NeilBrown [Wed, 26 Mar 2014 03:30:21 +0000 (14:30 +1100)] 
ddf-sudden-degraded test fix.

Change how sudden-degraded devices should appear.
We don't record failure, we record that the device isn't there.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoDDF: when first activating an array, record any missing devices.
NeilBrown [Wed, 26 Mar 2014 03:26:53 +0000 (14:26 +1100)] 
DDF: when first activating an array, record any missing devices.

We must remember they are missing so that if they re-appear we
don't get confused.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoDDF: report seq counter as events.
NeilBrown [Wed, 26 Mar 2014 03:19:43 +0000 (14:19 +1100)] 
DDF: report seq counter as events.

Also don't treat two devices with different seq numbers as completely
unrelated.

This allows split-brain detection to work properly for ddf.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoWork around architectures having statfs.f_type defined as long
Jes Sorensen [Wed, 19 Mar 2014 13:26:02 +0000 (14:26 +0100)] 
Work around architectures having statfs.f_type defined as long

Having RAMFS_MAGIC defined as 0x858458f6 causes problems when trying
to compare it directly against statfs.f_type being cast from long to
unsigned long.

This hack is extremely ugly, but it should at least do the right thing
for every situation.
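
One way to make such a comparison independent of the width and signedness of the field is to reduce both sides to their low 32 bits. The sketch below is an assumption about the general technique, not a copy of the actual patch; the function and macro names are hypothetical, and only the magic value is taken from the commit message.

```c
#include <assert.h>

#define EXAMPLE_RAMFS_MAGIC 0x858458f6UL	/* value quoted in the commit message */

/*
 * Compare a filesystem type held in a signed long against a 32-bit magic.
 * Both sides are masked to their low 32 bits, so a value that was
 * sign-extended on its way into a 64-bit long still matches.
 */
int example_magic_matches(long f_type, unsigned long magic)
{
	return ((unsigned long)f_type & 0xffffffffUL) == (magic & 0xffffffffUL);
}

/* Sanity check with the magic as it appears in a signed 32-bit field. */
void example_selfcheck(void)
{
	/* -2054924042 is 0x858458f6 reinterpreted as a signed 32-bit value. */
	assert(example_magic_matches(-2054924042L, EXAMPLE_RAMFS_MAGIC));
}
```

The mask is a no-op where `long` is 32 bits wide and strips the sign-extension bits where it is 64 bits wide.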

Thanks to Arnd Bergmann for suggesting the fix.

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
10 years agotests: add test that DDF marks missing devices as failed on assembly.
NeilBrown [Tue, 11 Mar 2014 06:11:08 +0000 (17:11 +1100)] 
tests: add test that DDF marks missing devices as failed on assembly.

If we assemble a newly-degraded array, the missing devices must be marked
as 'failed' so we don't expect them in future.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agomdmon@.service: Change type of process start-up to 'forking'.
Pawel Baldysiak [Thu, 6 Mar 2014 14:51:44 +0000 (15:51 +0100)] 
mdmon@.service: Change type of process start-up to 'forking'.

Mdadm does not wait enough time when mdmon is started by systemd.
It causes various problems with behaviour of a RAID volume with external metadata.
For example: mdmon does not update a value of checkpoint during migration
and second RAID5 volume is read-only after reboot done during
container reshape (both problems occur with IMSM metadata).
If a type of process start-up is changed to 'forking', systemctl will
wait until mdmon (parent) process exits after calling fork.
This way mdmon will always be fully initialized after start_mdmon
and these problems will not occur.
In this case it is recommended to add a path to PIDFile, so that systemd
does not have to guess a PID of the mdmon process.

Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Reviewed-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Reviewed-by: Lukasz Dorau <lukasz.dorau@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoAssemble: change load_devices to return most_recent 'st' value.
NeilBrown [Tue, 25 Feb 2014 04:04:16 +0000 (15:04 +1100)] 
Assemble: change load_devices to return most_recent 'st' value.

This means that

st->ss->getinfo_super(st, content, NULL);
clean = content->array.state & 1;

will get an up-to-date value for 'clean'.  This fix allows
  tests/03r5assem-failed
to work.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoAssemble: re-arrange freeing of 'tst' in load_devices().
NeilBrown [Tue, 25 Feb 2014 03:59:12 +0000 (14:59 +1100)] 
Assemble: re-arrange freeing of 'tst' in load_devices().

When we return in error, we need to free(tst), and ->free_super(tst);
Sometimes we didn't.

Also the final ->free_super(tst) should be followed by free(tst)
but wasn't.

Move that final free forward in the code a bit as we will want to use
the tst there in the next patch.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoAssemble: allow load_devices to change the 'st' which is passed in.
NeilBrown [Tue, 25 Feb 2014 03:54:34 +0000 (14:54 +1100)] 
Assemble: allow load_devices to change the 'st' which is passed in.

The given 'st' might not be best.  Making this interface change
will allow load_devices to return a better 'st'.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoNew test: 03r5assem-failed
NeilBrown [Tue, 25 Feb 2014 03:52:14 +0000 (14:52 +1100)] 
New test: 03r5assem-failed

This test currently fails, confirming a bug which was recently
reported.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoraid6check.c: reduce verbosity
Piergiorgio Sartor [Wed, 5 Feb 2014 19:18:45 +0000 (20:18 +0100)] 
raid6check.c: reduce verbosity

This patch will remove some legacy code.
It is part of the verbosity "cleanup".
In any case, if information about the P
and Q parity mismatches is required, it
should go inside the code handling page
size blocks, not full stripe size.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoraid6check.c: add O_SYNC to open
Piergiorgio Sartor [Sat, 1 Feb 2014 21:27:58 +0000 (22:27 +0100)] 
raid6check.c: add O_SYNC to open

It could be better to make sure the
data reaches the disks, so open the
drives with O_SYNC flag.

Signed off: piergiorgio.sartor@nexgo.de

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoraid6check.c: fix Q parity generation
Piergiorgio Sartor [Sat, 1 Feb 2014 21:16:52 +0000 (22:16 +0100)] 
raid6check.c: fix Q parity generation

In the transition to 4K page processing,
the Q parity generation had a wrong offset
in the buffer.
This patch fixes this.

Signed off: piergiorgio.sartor@nexgo.de

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoraid6check.c: fix position printout
Piergiorgio Sartor [Sat, 1 Feb 2014 16:39:27 +0000 (17:39 +0100)] 
raid6check.c: fix position printout

This patch makes a bit more clear
the position, in the disk, where
an error is found.

Signed off: piergiorgio.sartor@nexgo.de

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoraid6check.c: reduce verbosity
Piergiorgio Sartor [Sat, 1 Feb 2014 16:03:34 +0000 (17:03 +0100)] 
raid6check.c: reduce verbosity

This patch removes some printouts, which
are not really useful here.
These could be re-added later, in case a
verbosity parameter will be provided.

Signed off: piergiorgio.sartor@nexgo.de

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoraid6check.c add page size check and repair
Piergiorgio Sartor [Mon, 20 Jan 2014 19:10:22 +0000 (20:10 +0100)] 
raid6check.c add page size check and repair

raid6check currently performs checks and repair on a whole chunk at a
time.  This is often not ideal as corruption can happen with smaller
granularity.

This patch changes raid6check to use a page-size (4K) granularity.

We still process a chunk at a time, but within each chunk we process a
page at a time.
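
The chunk-then-page loop can be sketched as below. This only illustrates the granularity idea with a plain byte comparison; it does not compute P or Q parity, and the function names and buffer sizes are assumptions rather than code from raid6check.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define EXAMPLE_PAGE_SIZE 4096u	/* page size named in the commit message */

/*
 * Compare two chunk-sized buffers one page at a time and return the
 * number of pages that differ. 'chunk_size' must be a multiple of the
 * page size.
 */
size_t example_count_bad_pages(const unsigned char *expected,
			       const unsigned char *actual,
			       size_t chunk_size)
{
	size_t off, bad = 0;

	assert(expected != NULL && actual != NULL);
	assert(chunk_size % EXAMPLE_PAGE_SIZE == 0);

	for (off = 0; off < chunk_size; off += EXAMPLE_PAGE_SIZE) {
		if (memcmp(expected + off, actual + off, EXAMPLE_PAGE_SIZE) != 0)
			bad++;	/* only this page is flagged, not the whole chunk */
	}
	return bad;
}

/* Demo: a two-page chunk with one corrupted byte in the second page. */
size_t example_demo(void)
{
	static unsigned char good[2 * EXAMPLE_PAGE_SIZE];
	static unsigned char damaged[2 * EXAMPLE_PAGE_SIZE];

	memset(good, 0, sizeof(good));
	memset(damaged, 0, sizeof(damaged));
	damaged[5000] = 0xff;	/* byte 5000 lies in the second page */
	return example_count_bad_pages(good, damaged, sizeof(good));
}
```

In the demo a single damaged byte marks one page as bad and leaves the other page of the same chunk clean.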

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agomdmon@.service: remove over-ride of Standard IO.
NeilBrown [Wed, 22 Jan 2014 01:53:31 +0000 (12:53 +1100)] 
mdmon@.service: remove over-ride of Standard IO.

Redirecting output to /dev/null is unnecessary and hides any error
messages there might be.  So leave as defaults which are none,
journal, inherit.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agosystemd/mdmon: set IMSM_NO_PLATFORM=1
NeilBrown [Mon, 20 Jan 2014 22:46:07 +0000 (09:46 +1100)] 
systemd/mdmon: set IMSM_NO_PLATFORM=1

As mdmon doesn't inherit environment from mdadm when it is started
by systemd, it cannot inherit IMSM_NO_PLATFORM.
But if an imsm array is assembled then mdmon really should handle it
whether there is a platform present or not.
So always set this var.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agomdmon: don't complain about notifying parent when there is no need
NeilBrown [Mon, 20 Jan 2014 22:43:31 +0000 (09:43 +1100)] 
mdmon: don't complain about notifying parent when there is no need

When run with --foreground mdmon has no need to notify any
parent, so it shouldn't even try, let alone complain when it fails.

Also close an end of a pipe which is no longer used.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoIMSM: don't crash when creating an array with missing devices.
NeilBrown [Mon, 20 Jan 2014 22:40:02 +0000 (09:40 +1100)] 
IMSM: don't crash when creating an array with missing devices.

'missing' devices are in a different list so when collecting the
serial numbers of all devices we need to check both lists.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoGrow: fix problems with prematurely aborting of reshapes.
NeilBrown [Mon, 20 Jan 2014 04:31:45 +0000 (15:31 +1100)] 
Grow: fix problems with prematurely aborting of reshapes.

1/ when unfreezing, make sure the array is frozen first.
   If it isn't we might end up interrupting a reshape.
2/ When the child finishes, don't call abort_reshape() as that
   will interrupt the reshape.  Just set suspend_* etc
   explicitly.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoDDF: fix detection of failed devices during assembly.
NeilBrown [Mon, 20 Jan 2014 04:27:29 +0000 (15:27 +1100)] 
DDF: fix detection of failed devices during assembly.

When we call "getinfo_super", we report the working/failed status
of the particular device, and also (via the 'map') the working/failed
status of every other device that this metadata is aware of.

It is important that the way we calculate "working or failed" is
consistent.
As it is, getinfo_super_ddf() will report a spare as "working", but
every other device will see it as "failed", which leads to failure to
assemble arrays with spares.

For getinfo_super_ddf (i.e. for the container), a device is assumed
"working" unless flagged as DDF_Failed.
For getinfo_super_ddf_bvd (for a member array), a device is assumed
"failed" unless DDF_Online is set, and DDF_Failed is not set.

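The two rules above can be written as a pair of predicates.  This is a
sketch only: the bit values are assumptions chosen for illustration, and
only the flag names come from the commit message.

```python
# Bit values are assumed for illustration; only the names are from the text.
DDF_ONLINE = 0x1
DDF_FAILED = 0x2


def container_sees_working(state):
    """Container view: working unless DDF_Failed is set."""
    return not (state & DDF_FAILED)


def member_sees_working(state):
    """Member-array view: working only if DDF_Online is set and DDF_Failed is not."""
    return bool(state & DDF_ONLINE) and not (state & DDF_FAILED)
```

A device with neither flag set counts as working in the container view
but as failed in the member-array view, which is why each view has to
apply its own rule consistently to every device it reports on.
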
Reported-by: "David F." <df7729@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoAssemble: avoid infinite loop when auto-assembling partial container.
NeilBrown [Mon, 20 Jan 2014 04:23:31 +0000 (15:23 +1100)] 
Assemble: avoid infinite loop when auto-assembling partial container.

When auto-assembling we loop until we get no successes.

If a device is found that looks like it is part of an already-existing
container, but we subsequently fail to add that device, then the fact
that the container is running looks like a success.  This can result
in infinite looping.
So if a container was already partially assembled, and is still only
partially assembled after we try to add devices, then don't treat that
as success.
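
A toy model of the loop, assuming hypothetical dictionaries for the
container and candidate devices (none of these names come from mdadm).
It only illustrates the rule stated above: a pass does not count as a
success when the container was partial before and is still partial after.

```python
def assemble_pass(container, candidates):
    """Run one pass; return True only if it should count as a success."""
    found = [d for d in candidates if d["name"] not in container["members"]]
    if not found:
        return False  # nothing left to try
    partial_before = len(container["members"]) < container["expected"]
    for dev in found:
        if dev["addable"]:
            container["members"].append(dev["name"])
    partial_after = len(container["members"]) < container["expected"]
    # Partial before and still partial after: do not call that a success.
    return not (partial_before and partial_after)


def auto_assemble(container, candidates, max_passes=10):
    """Loop until a pass reports no success; return the number of passes run."""
    passes = 0
    while passes < max_passes:
        passes += 1
        if not assemble_pass(container, candidates):
            break
    return passes
```

In this model a device that can never be added ends the loop after one
pass, whereas treating "the container is running" as success would keep
it going until `max_passes`.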

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoDDF - really ignore DDF metadata on partitions.
NeilBrown [Mon, 20 Jan 2014 01:25:23 +0000 (12:25 +1100)] 
DDF - really ignore DDF metadata on partitions.

See commit 357ac1067835d1cdd5f80acc28501db0ffc64957
which made a similar change for super-intel, and really should have
fixed DDF at the same time.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agopolicy: NULL path isn't really acceptable - use the devname
Lukasz Dorau [Thu, 19 Dec 2013 12:02:12 +0000 (13:02 +0100)] 
policy: NULL path isn't really acceptable - use the devname

According to:
commit b451aa4846c5ccca5447a6b6d45e5623b8c8e961
Fix handling for "auto" line in mdadm.conf

a NULL path isn't really acceptable and the devname should be used instead.

Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoClarify scope of Rebuild events in mdadm manpage
Jan Ceuleers [Wed, 11 Dec 2013 07:45:55 +0000 (08:45 +0100)] 
Clarify scope of Rebuild events in mdadm manpage

To date, the manpage did not make it clear under which circumstances
Rebuild events are generated, leading to a question on the mailing
list as to whether it is normal for these events to be generated
while checking an array.
So clarify that all operations that act on the entire array are in
scope. The list is given as "e.g.", because it might grow in the
future as other full-array operations are added.

Reported-by: Mark Knecht <markknecht@gmail.com>
Signed-off-by: Jan Ceuleers <jan.ceuleers@computer.org>
Signed-off-by: NeilBrown <neilb@suse.de>
10 years agomdamd-last-resort: add a Conflicts line to stop the timer.
NeilBrown [Thu, 12 Dec 2013 02:20:32 +0000 (13:20 +1100)] 
mdamd-last-resort: add a Conflicts line to stop the timer.

When the md device actually appears we want to stop the timer and not
bother with the mdadm-last-resort@.service.  In particular, running
that causes confusing messages and is in general best avoided.

Fortunately this can simply be achieved with a Conflicts= line.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoudev rules: try "mdadm -I" on "change" events.
NeilBrown [Wed, 11 Dec 2013 01:29:22 +0000 (12:29 +1100)] 
udev rules: try "mdadm -I" on "change" events.

We need to attempt "mdadm -I" on "change" events as well as "add" events,
as the "change" may make a device ready to be part of an array.
This is particularly important for stacked md devices. When the
member devices are "add"ed they don't have any content visible yet.
That doesn't happen until a "change".

Idea taken from Fedora udev file.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoudev rules: add some by-pass rules from Fedora
NeilBrown [Wed, 11 Dec 2013 01:25:02 +0000 (12:25 +1100)] 
udev rules: add some by-pass rules from Fedora

1/ If ANACONDA is running, don't -I assemble any arrays, ANACONDA
   needs to be in control
2/ honour "noiswmd" and "nodmraid" kernel command line options.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoAdd mdmonitor.service systemd unit file.
NeilBrown [Tue, 10 Dec 2013 23:47:54 +0000 (10:47 +1100)] 
Add mdmonitor.service systemd unit file.

This systemd unit file runs mdadm in --monitor mode.
It is started by a SYSTEMD_WANTS signal from udev whenever
an md array is started that would benefit from mdadm --monitor.

Commandline arguments can be provided by a script
  /usr/lib/systemd/scripts/mdadm_env.sh
which should write an
  MDADM_MONITOR_ARGS=....
line to /run/sysconfig/mdadm

A script to extract args from SUSE's /etc/sysconfig/mdadm file
is provided.
If no mdadm_env.sh is provided, then args are "--scan" which
requires "mail" or "program" to be set in /etc/mdadm.conf.
I believe this is suitable for Fedora.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoAssemble/Incremental: don't hold O_EXCL on mddev after assembly.
NeilBrown [Wed, 4 Dec 2013 23:35:16 +0000 (10:35 +1100)] 
Assemble/Incremental: don't hold O_EXCL on mddev after assembly.

As soon as the array is assembled, udev or systemd might run
fsck and mount it.  So we need to drop O_EXCL promptly.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoTwo small fixes related to enough()
NeilBrown [Wed, 4 Dec 2013 21:58:21 +0000 (08:58 +1100)] 
Two small fixes related to enough()

1/ enough_fd doesn't use avail_disks any more, so discard it.

2/ Manage_Add increments 'found' at the wrong place, so it can
   waste time before calling enough().

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoIncremental: improve support for "DEVICE" based restriction in mdadm.conf
NeilBrown [Tue, 3 Dec 2013 03:01:24 +0000 (14:01 +1100)] 
Incremental: improve support for "DEVICE" based restriction in mdadm.conf

--incremental currently fails if the device name passed does not
textually match the names permitted by the DEVICE line in mdadm.conf.
This is problematic when "mdadm -I" is run by udev as the name given
can be a temp name.

This patch makes two improvements:
1/ We generate a list of all existing devices that match the names
  in mdadm.conf, and allow rdev based matching
2/ We allow extra aliases to be provided on the command line, and
  perform textual matching on those.  This is particularly suitable
  for udev usages as ${DEVLINKS} can be provided even though the links
  may not yet be created.
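
A rough Python sketch of the two kinds of matching described above.  The
function names, the temporary device name in the usage note and the use
of `fnmatch` for wildcard matching are all assumptions for illustration;
this is not how mdadm itself is written.

```python
import fnmatch
import glob
import os
import stat


def rdevs_matching(patterns):
    """Device numbers of existing block devices whose names match the patterns."""
    found = set()
    for pattern in patterns:
        for path in glob.glob(pattern):
            try:
                st = os.stat(path)
            except OSError:
                continue  # e.g. a dangling symlink
            if stat.S_ISBLK(st.st_mode):
                found.add(st.st_rdev)
    return found


def device_permitted(devname, patterns, aliases=(), rdev=None, allowed_rdevs=()):
    """Decide whether a device passes a DEVICE-style restriction."""
    # Textual match on the given name or on any alias passed alongside it.
    for name in (devname, *aliases):
        if any(fnmatch.fnmatchcase(name, pattern) for pattern in patterns):
            return True
    # Otherwise fall back to comparing device numbers (rdev based matching).
    return rdev is not None and rdev in allowed_rdevs
```

For example, `device_permitted("/dev/.tmp-block-8:16", ["/dev/sd*"])` is
rejected on the name alone, but passing `aliases=["/dev/sdb"]` or a
matching `rdev` lets it through.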

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoSystemd integration for starting newly-degraded arrays.
NeilBrown [Mon, 2 Dec 2013 05:08:04 +0000 (16:08 +1100)] 
Systemd integration for starting newly-degraded arrays.

Normally "mdadm -I" will not start an array if it has reason to
expect further devices.
This means that if a device is removed while the host is shut down,
"mdadm -I" will never start the device.

If the array is known to the host, it makes sense to start the array
anyway after a reasonable timeout.

This patch adds systemd/udev infrastructure so that 30 seconds after
a known array first becomes able to be assembled as a degraded array,
the array will be assembled even if more devices are still expected.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoIncremental: add --export handling.
NeilBrown [Thu, 28 Nov 2013 04:15:30 +0000 (15:15 +1100)] 
Incremental: add --export handling.

If --export is given with --incremental, then
  MD_DEVNAME
is output which gives the name of the device (in /dev/md) that
is the array (or container) that the device would be added to.
Also
  MD_STARTED
is set to one of
  no
  unsafe
  yes
  nothing

to indicate if the array was started.  If MD_STARTED=unsafe
then it may be appropriate to run
  mdadm -R /dev/md/$MD_DEVNAME
after a timeout to ensure newly degraded arrays are started.

If
  MD_FOREIGN=yes
it might be appropriate to suppress this as the array is
probably not critical.
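
A sketch of how a caller might consume this output, assuming it arrives
as plain KEY=value lines.  The function names and the sample values in
the usage note are hypothetical; the variable names and the `mdadm -R`
command are the ones given above.

```python
def parse_export(text):
    """Turn KEY=value lines into a dict, ignoring anything else."""
    env = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            env[key.strip()] = value.strip()
    return env


def last_resort_command(env):
    """Return the command to run after a timeout, or None if none is wanted."""
    if env.get("MD_STARTED") != "unsafe":
        return None  # started already, or nothing to start
    if env.get("MD_FOREIGN") == "yes":
        return None  # probably not critical, so suppress it
    return ["mdadm", "-R", "/dev/md/" + env["MD_DEVNAME"]]
```

Given `MD_DEVNAME=home` and `MD_STARTED=unsafe`, this yields
`["mdadm", "-R", "/dev/md/home"]`; the timeout itself is left to the
caller.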

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoRestructure assemble_container_content and improve messages.
NeilBrown [Thu, 28 Nov 2013 03:47:41 +0000 (14:47 +1100)] 
Restructure assemble_container_content and  improve messages.

We lose one level of indent, and now get told the difference between
'not assembled because not safe' and 'not assembled because not enough
devices'.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoIncremental: don't abort container if one member explicitly disabled.
NeilBrown [Thu, 28 Nov 2013 02:33:56 +0000 (13:33 +1100)] 
Incremental: don't abort container if one member explicitly disabled.

If a member of a container is explicitly disabled, others may not
be so we should continue.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoIncremental: remove test that can never succeed.
NeilBrown [Thu, 28 Nov 2013 02:30:23 +0000 (13:30 +1100)] 
Incremental: remove test that can never succeed.

Incremental_container never returns 1, so this test is pointless.
It is a holdover from when we called "Incremental()" rather than
"Incremental_container()" at this point.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoIMSM metadata really should be ignored when found on partitions.
NeilBrown [Tue, 19 Nov 2013 23:49:14 +0000 (10:49 +1100)] 
IMSM metadata really should be ignored when found on partitions.

commit b31df43682216d1c65813eae49ebdd8253db8907
changed load_super_imsm to not insist on finding a partition if
ignore_hw_compat was set.
Unfortunately this is set for '--assemble' so arrays could get
assembled badly.

The comment says this was to allow e.g. --examine of image files.
A better fix for this is to change test_partitions to not report
a regular file as being a partition.
The errors from the BLKPG ioctl are:

 ENOTTY : not a block device.
 EINVAL : not a whole device (probably a partition)
 ENXIO  : partition doesn't exist (so not a partition)
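
The table above can be read as a small classifier.  This is only a
sketch with hypothetical names; it maps an errno value to the meaning
listed and treats EINVAL alone as a sign of a partition.

```python
import errno

# Meanings copied from the list in the commit message.
BLKPG_ERRORS = {
    errno.ENOTTY: "not a block device",
    errno.EINVAL: "not a whole device (probably a partition)",
    errno.ENXIO: "partition doesn't exist (so not a partition)",
}


def looks_like_partition(err):
    """Only EINVAL suggests a partition; ENOTTY (e.g. a regular file) does not."""
    return err == errno.EINVAL
```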

Reported-by: "David F." <df7729@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoddf tests: fix get_rootdev
NeilBrown [Tue, 19 Nov 2013 05:40:09 +0000 (16:40 +1100)] 
ddf tests: fix get_rootdev

Getting the major number from the hex device number should take
all-but-the-last-two digits, rather than just the first two digits.
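
As a small illustration in Python (the test script itself is not
Python, and the function name is hypothetical), assuming the minor
number occupies the last two hex digits as the message implies:

```python
def split_hex_dev(hexdev):
    """Split a hex device number string into (major, minor)."""
    major = int(hexdev[:-2] or "0", 16)  # all but the last two digits
    minor = int(hexdev[-2:], 16)         # the last two digits
    return major, minor
```

For "803" this gives major 8 and minor 3, whereas taking just the first
two digits ("80") would give 128.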

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoAdd support for --add-spare
NeilBrown [Wed, 30 Oct 2013 23:41:50 +0000 (10:41 +1100)] 
Add support for --add-spare

--add-spare is like --add, but a --re-add is never attempted.
So it is equivalent to two separate commands:

 --zero-superblock
 --add

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoFix typos in mdadm.8.in
John Schmidt [Tue, 29 Oct 2013 17:16:18 +0000 (10:16 -0700)] 
Fix typos in mdadm.8.in

I found a small bug in the documentation of mdadm.  I fixed it in my
local git clone of git://neil.brown.name/mdadm  Here is the change:

Signed-off-by: NeilBrown <neilb@suse.de>