git.ipfire.org Git - thirdparty/mdadm.git/log
10 years agoGrow: allow a reshape which only changes --data-offset
NeilBrown [Tue, 21 May 2013 06:50:55 +0000 (16:50 +1000)] 
Grow: allow a reshape which only changes --data-offset

Sometimes, that is all we want to do.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoGrow: E2BIG should be reported differently if --data-offset was requested.
NeilBrown [Tue, 21 May 2013 06:50:05 +0000 (16:50 +1000)] 
Grow: E2BIG should be reported differently if --data-offset was requested.

In that case the problem is almost certainly that --data-offset is too big.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoGrow: --backup-file and --data-offset are incompatible.
NeilBrown [Tue, 21 May 2013 06:40:23 +0000 (16:40 +1000)] 
Grow: --backup-file and --data-offset are incompatible.

So report if both are given, and if --backup-file is given,
don't try to update data-offset.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoGrow: handle E2BIG from new_offset changes more gracefully.
NeilBrown [Tue, 21 May 2013 06:35:29 +0000 (16:35 +1000)] 
Grow: handle E2BIG from new_offset changes more gracefully.

If new_offset change is too big, just do the reshape the old way.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoGrow: allow --data-offset to be specified for raid4/5/6
NeilBrown [Tue, 21 May 2013 06:33:56 +0000 (16:33 +1000)] 
Grow: allow --data-offset to be specified for raid4/5/6

Previously it was rejected for non-RAID10.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoGrow: allow metadata to indicate that changing data_offset not supported.
NeilBrown [Tue, 21 May 2013 06:32:00 +0000 (16:32 +1000)] 
Grow: allow metadata to indicate that changing data_offset not supported.

If space_after and space_before are zero (the default) then assume that
metadata doesn't support changing data_offset.

Signed-off-by: NeilBrown <neilb@suse.de>
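
A minimal sketch of the rule in this commit, assuming a cut-down structure that carries only the two fields named in the message. `struct offset_room` and `can_change_data_offset` are hypothetical names, not mdadm's.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the metadata's view of one device. */
struct offset_room {
    unsigned long long space_before; /* free sectors before the data area */
    unsigned long long space_after;  /* free sectors after the data area */
};

/* Both fields zero (the default) is taken to mean that the metadata
 * handler cannot move data_offset at all. */
bool can_change_data_offset(const struct offset_room *room)
{
    assert(room != NULL);
    return room->space_before != 0 || room->space_after != 0;
}
```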
10 years agoGrow: use new_data_offset instead of backups for raid4/5/6 reshape.
NeilBrown [Tue, 21 May 2013 06:28:23 +0000 (16:28 +1000)] 
Grow: use new_data_offset instead of backups for raid4/5/6 reshape.

If we can modify the data_offset, we can avoid doing any backups at all.
If we can't, fall back on the old approach - but not if --data-offset
was requested.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoGrow: introduce min_offset_change to struct reshape.
NeilBrown [Wed, 22 May 2013 02:17:32 +0000 (12:17 +1000)] 
Grow: introduce min_offset_change to struct reshape.

raid10 currently uses the 'backup_blocks' field to store something
else: a minimum offset change.
This is bad practice, and we will shortly need both for RAID5/6,
so make a separate field.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoGrow: have analyse_change zero the reshape structure first.
NeilBrown [Wed, 22 May 2013 01:51:43 +0000 (11:51 +1000)] 
Grow: have analyse_change zero the reshape structure first.

This is generally safer and means we can remove lots of zero
assignments.

Signed-off-by: NeilBrown <neilb@suse.de>
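
The zero-first pattern looks roughly like this. The structure is a hypothetical, cut-down descriptor: only `backup_blocks` and `min_offset_change` are named in this log, the other fields and the function name are assumptions.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, cut-down reshape descriptor. */
struct reshape_sketch {
    int level;
    int parity;
    unsigned long long backup_blocks;
    unsigned long long min_offset_change;
};

/* Clear the whole structure up front, then set only what differs from
 * zero.  No field can be left holding stale data from the caller. */
void analyse_change_sketch(struct reshape_sketch *re, int new_level)
{
    assert(re != NULL);
    memset(re, 0, sizeof(*re));
    re->level = new_level;
}
```

Any field added to the structure later is zeroed as well, without touching this function.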
10 years agoGrow.c: split impose_reshape out as a function.
NeilBrown [Tue, 21 May 2013 06:11:08 +0000 (16:11 +1000)] 
Grow.c: split impose_reshape out as a function.

It will be useful soon.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoGrow.c: split out update_cache_size() function.
NeilBrown [Tue, 21 May 2013 05:59:11 +0000 (15:59 +1000)] 
Grow.c: split out update_cache_size() function.

Make this a separate function as I might want to call it from another
location.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoGrow.c remove some pointless casts on 'data_offset'.
NeilBrown [Tue, 21 May 2013 05:41:25 +0000 (15:41 +1000)] 
Grow.c remove some pointless casts on 'data_offset'.

'data_offset' is 'unsigned long long' so the cast is pointless.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agosuper1: improve calculation of space_before/space_after
NeilBrown [Tue, 21 May 2013 05:38:49 +0000 (15:38 +1000)] 
super1: improve calculation of space_before/space_after

1/ these must allow for bad-block-list
2/ they must match the kernel, which has a 32k buffer after the
   superblock.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoExamine/super1: don't report "New Offset" when feature not set.
NeilBrown [Tue, 21 May 2013 05:37:20 +0000 (15:37 +1000)] 
Examine/super1: don't report "New Offset" when feature not set.

The "new_offset" field may be non-zero, but if the feature flag is not
set, it should be ignored.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agopr_err for mdmon.
NeilBrown [Tue, 21 May 2013 02:58:02 +0000 (12:58 +1000)] 
pr_err for mdmon.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoMore conversion to pr_err
NeilBrown [Tue, 21 May 2013 02:54:52 +0000 (12:54 +1000)] 
More conversion to pr_err

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoChange some fprintf(stderrs to cont_err()
NeilBrown [Tue, 21 May 2013 02:51:33 +0000 (12:51 +1000)] 
Change some fprintf(stderrs to cont_err()

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoChange some "fprintf(stderr,"s to pr_err.
NeilBrown [Tue, 21 May 2013 02:40:09 +0000 (12:40 +1000)] 
Change some "fprintf(stderr,"s to pr_err.

They just keep slipping in..

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoGrow: set_new_data_offset should report if kernel is too old.
NeilBrown [Tue, 21 May 2013 02:34:24 +0000 (12:34 +1000)] 
Grow: set_new_data_offset should report if kernel is too old.

For RAID5, not being able to set new_data_offset because of
an old kernel is not a problem.  So make this fatal only for RAID10.

Also remove an unused assignment to 'rv'.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agocomment typo
NeilBrown [Tue, 21 May 2013 02:25:21 +0000 (12:25 +1000)] 
comment typo

10 years agoGrow: just pass delta_disks instead of all of 'info'.
NeilBrown [Tue, 21 May 2013 01:55:44 +0000 (11:55 +1000)] 
Grow: just pass delta_disks instead of all of 'info'.

That is all we need, so make purpose of code more obvious
by only passing delta_disks.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoGrow: split out code for setting new_data_offset
NeilBrown [Tue, 21 May 2013 01:53:43 +0000 (11:53 +1000)] 
Grow: split out code for setting new_data_offset

This will soon be used for more than just RAID10, so
it deserves independent existence.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoGrow: replace '1' with 'INVALID_SECTORS' where appropriate.
NeilBrown [Tue, 21 May 2013 01:32:57 +0000 (11:32 +1000)] 
Grow: replace '1' with 'INVALID_SECTORS' where appropriate.

Here are some '1's which missed the introduction of INVALID_SECTORS
as a useful #define.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoAdd --dump / --restore functionality.
NeilBrown [Thu, 16 May 2013 05:07:16 +0000 (15:07 +1000)] 
Add --dump / --restore functionality.

This allows the metadata on a device to be saved and later restored.
This can be useful before experimenting on an array that is misbehaving.

Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
10 years agointel,ddf: don't require partitions when ignore_hw_compat is set.
NeilBrown [Thu, 16 May 2013 03:24:07 +0000 (13:24 +1000)] 
intel,ddf: don't require partitions when ignore_hw_compat is set.

Partitions are a hw-compat issue.

This allows e.g "--examine" to be used on image files.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoCreate: over-ride "start_ro" setting when creating an array.
NeilBrown [Wed, 15 May 2013 01:40:27 +0000 (11:40 +1000)] 
Create: over-ride "start_ro" setting when creating an array.

If module parameter start_ro is set, arrays start readonly.
This is OK when assembling, but is very surprising when creating
an array as the resync won't start.
So over-ride the setting (unless --read-only was given) and make
arrays RW when created.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoSuppress error messages from systemctl.
NeilBrown [Wed, 15 May 2013 01:10:54 +0000 (11:10 +1000)] 
Suppress error messages from systemctl.

We call systemctl to see if systemd will run mdmon for us.
If it cannot, we run mdmon directly, so we aren't interested
in the error message.
So redirect stderr to /dev/null.

Signed-off-by: NeilBrown <neilb@suse.de>
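
One way to run a helper with stderr discarded, as this commit describes, is sketched below. It is not the code mdadm uses: `run_quietly` is a hypothetical helper, and returning 127 when the exec fails simply follows the shell convention.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run argv[0] with stderr pointed at /dev/null.
 * Returns the child's exit status, or -1 if it could not be obtained. */
int run_quietly(char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: replace stderr, then exec the command. */
        int fd = open("/dev/null", O_WRONLY);
        if (fd >= 0) {
            dup2(fd, STDERR_FILENO);
            if (fd != STDERR_FILENO)
                close(fd);
        }
        execvp(argv[0], argv);
        _exit(127);             /* exec failed */
    }

    /* Parent: wait for the child, retrying if interrupted by a signal. */
    int status;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The caller only looks at the exit status to decide whether to start mdmon directly, so nothing is lost by discarding the message.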
10 years agoman pages: remove references to raidtools.
NeilBrown [Wed, 15 May 2013 01:07:17 +0000 (11:07 +1000)] 
man pages: remove references to raidtools.

raidtools is so ancient now that it is uninteresting.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agocreate_mddev: add support for /dev/md_XXX non-numeric names.
NeilBrown [Wed, 15 May 2013 01:03:25 +0000 (11:03 +1000)] 
create_mddev: add support for /dev/md_XXX non-numeric names.

With the 'devnm' infrastructure fixed, it is quite easy to support
names like "md_home" for md arrays.
This currently defaults to "off" and can be enabled in mdadm.conf with
  CREATE names=yes
This is in case other tools get confused by the new names.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoIncremental: remove partitions when assembling.
NeilBrown [Tue, 14 May 2013 02:06:27 +0000 (12:06 +1000)] 
Incremental: remove partitions when assembling.

We remove partitions for --create and --assemble, but not for
--incremental.
So fix that omission.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoCreate: fix bug with --data-offset.
NeilBrown [Mon, 13 May 2013 07:26:37 +0000 (17:26 +1000)] 
Create: fix bug with --data-offset.

Test for VARIABLE_OFFSET was wrong.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoAdd some built files to .gitignore.
NeilBrown [Mon, 13 May 2013 07:11:42 +0000 (17:11 +1000)] 
Add some built files to .gitignore.

Now everything made by "make everything" is suitably ignored.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoAlways test return value of posix_memalign.
NeilBrown [Mon, 13 May 2013 07:09:55 +0000 (17:09 +1000)] 
Always test return value of posix_memalign.

FORTIFY_SOURCE likes this, and it is good practice.

Signed-off-by: NeilBrown <neilb@suse.de>
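
A minimal sketch of the checked call. `alloc_aligned` is a hypothetical wrapper, not an mdadm function. Note that posix_memalign() reports failure through its return value (0 on success, an error number otherwise) rather than through errno.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* ask for the POSIX.1-2008 declarations */
#endif
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns a zero-filled buffer aligned to 'alignment' bytes, or NULL if
 * the allocation failed.  The caller releases it with free(). */
void *alloc_aligned(size_t alignment, size_t len)
{
    void *buf = NULL;

    assert(len > 0);
    /* The buffer must not be used unless the call returned 0. */
    if (posix_memalign(&buf, alignment, len) != 0)
        return NULL;
    memset(buf, 0, len);
    return buf;
}
```

The alignment has to be a power of two and a multiple of `sizeof(void *)`, so a request such as `alloc_aligned(3, 64)` fails and returns NULL.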
10 years agomdassemble - fix new compile-time problems.
NeilBrown [Mon, 13 May 2013 07:05:16 +0000 (17:05 +1000)] 
mdassemble - fix new compile-time problems.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoDetail: report on inactive arrays.
NeilBrown [Mon, 13 May 2013 06:57:10 +0000 (16:57 +1000)] 
Detail: report on inactive arrays.

Arrays can be inactive when e.g. -I is in the process of assembling them.
This change allows --detail to report limited information about
these arrays.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoDetail: fix --brief --verbose
NeilBrown [Mon, 13 May 2013 04:57:41 +0000 (14:57 +1000)] 
Detail: fix --brief --verbose

This pair of options should give a --brief listing including devices=
information.  But recent changes to flag passing broke this.
So fix it.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoRemove open-coded use_udev().
NeilBrown [Mon, 13 May 2013 03:03:25 +0000 (13:03 +1000)] 
Remove open-coded use_udev().

Manage_runstop has an open-coded version of use_udev() which is no
longer correct.  So make it use use_udev() explicitly.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agomisc_scan: don't trust the mapping file too much for device names.
NeilBrown [Mon, 13 May 2013 02:56:38 +0000 (12:56 +1000)] 
misc_scan: don't trust the mapping file too much for device names.

misc_scan assumes that any device name found in the 'mapping' file
is usable.  Usually it is but sometimes not, such as for inactive
devices.
Depending on it isn't really robust, so when a name is found, check that
it exists. If not, fall back on map_dev.

This will allow "--detail --scan" to notice inactive devices.

Signed-off-by: NeilBrown <neilb@suse.de>
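
The check-then-fall-back step can be sketched as below. `usable_name` is a hypothetical helper; the fallback is passed in as a ready-made string so that no signature is assumed for map_dev.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/stat.h>

/* Prefer the name from the mapping file, but only if it exists in the
 * filesystem; otherwise use the fallback name. */
const char *usable_name(const char *mapped, const char *fallback)
{
    struct stat st;

    assert(fallback != NULL);
    if (mapped != NULL && stat(mapped, &st) == 0)
        return mapped;
    return fallback;
}
```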
10 years agoIncremental: tell udisks to unmount when array looks to have disappeared.
NeilBrown [Mon, 13 May 2013 02:07:40 +0000 (12:07 +1000)] 
Incremental: tell udisks to unmount when array looks to have disappeared.

If a device is removed which appears to be busy in an md array, then
it is very likely the array cannot be used.
We currently try to stop it, but that could fail if udisks had
automatically mounted it.
So tell udisks to unmount it, but ignore any error.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agomdadm.conf.5: document the use of quotation characters in mdadm.conf
NeilBrown [Mon, 13 May 2013 01:28:15 +0000 (11:28 +1000)] 
mdadm.conf.5: document the use of quotation characters in mdadm.conf

Single or double quotes protect spaces, and each kind of quote protects the other.

Signed-off-by: NeilBrown <neilb@suse.de>
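
The quoting rule can be illustrated with a small word splitter. This is a sketch written from the one-line description above, not the parser in mdadm. `next_word` is a hypothetical name, and dropping the quote characters from the resulting word is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static int is_blank(char c)
{
    /* Guard against '\0': strchr() would match the terminator. */
    return c != '\0' && strchr(" \t", c) != NULL;
}

/* Copy the next word of *src into out (capacity cap) and advance *src.
 * Returns 1 if a word was produced, 0 at end of line. */
int next_word(const char **src, char *out, size_t cap)
{
    const char *s = *src;
    size_t n = 0;
    char quote = '\0';          /* the quote character currently open */

    assert(cap > 0);
    while (is_blank(*s))
        s++;
    if (*s == '\0' || *s == '\n') {
        *src = s;
        return 0;
    }

    for (; *s != '\0' && *s != '\n'; s++) {
        if (quote != '\0') {
            if (*s == quote) {  /* closing quote */
                quote = '\0';
                continue;
            }
        } else if (*s == '"' || *s == '\'') {
            quote = *s;         /* opening quote */
            continue;
        } else if (is_blank(*s)) {
            break;              /* unquoted blank ends the word */
        }
        if (n + 1 < cap)
            out[n++] = *s;      /* blanks and the other quote are kept */
    }
    out[n] = '\0';
    *src = s;
    return 1;
}
```

For the line `ARRAY name="my array" 'it"s'` this yields three words: `ARRAY`, `name=my array` and `it"s`.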
10 years agoManage: support --fail set-X and --remove set-X
NeilBrown [Tue, 5 Mar 2013 01:08:43 +0000 (12:08 +1100)] 
Manage: support --fail set-X and --remove set-X

A RAID10 array can have 'sets' of devices which are reported by
--detail.
They can now be collectively failed or removed.

Signed-off-by: NeilBrown <neilb@suse.de>
11 years agoWait: also wait if an action is about to start.
NeilBrown [Wed, 1 May 2013 00:23:40 +0000 (10:23 +1000)] 
Wait: also wait if an action is about to start.

If a sync/recover action is about to start but hasn't actually begun
yet, /proc/mdstat won't show it, but md/sync_action will (it checks
MD_RECOVERY_NEEDED).
So when /proc/mdstat seems to say nothing is happening, double check
with md/sync_action.

Signed-off-by: NeilBrown <neilb@suse.de>
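
The double check can be sketched as follows, assuming the caller has already read the contents of md/sync_action into a string and that the file reads "idle" when no action is running or pending. Both function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* text: contents of md/sync_action (may end in a newline).
 * Anything other than "idle" counts as an action running or pending. */
bool sync_action_busy(const char *text)
{
    assert(text != NULL);
    size_t len = strcspn(text, "\n");

    if (len == 0)
        return false;           /* nothing readable: treat as not busy */
    return !(len == 4 && strncmp(text, "idle", 4) == 0);
}

/* /proc/mdstat is consulted first; sync_action is the second opinion. */
bool keep_waiting(bool mdstat_shows_activity, const char *sync_action)
{
    return mdstat_shows_activity || sync_action_busy(sync_action);
}
```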
11 years agotests: zero devices before --adding them.
NeilBrown [Tue, 30 Apr 2013 23:24:11 +0000 (09:24 +1000)] 
tests: zero devices before --adding them.

Linux 3.10 will allow more "--add" to be handled as "--re-add".
To be sure the tests work correctly we sometimes need to zero
the device to ensure it really is an --add that happens.

Signed-off-by: NeilBrown <neilb@suse.de>
11 years agomdmon: Add missing option documentation to --help output
Jes Sorensen [Thu, 25 Apr 2013 15:24:36 +0000 (17:24 +0200)] 
mdmon: Add missing option documentation to --help output

Document that -a is equivalent to --all, as well as --foreground / -F

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
11 years agoDDF: fix bug in compare_super_ddf
mwilck@arcor.de [Tue, 23 Apr 2013 18:10:16 +0000 (20:10 +0200)] 
DDF: fix bug in compare_super_ddf

Fix bug in previous patch
"DDF: compare_super_ddf: merge local info of other superblock"

Just discovered this bug in my last patch set - unfortunately, just after
you committed it.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
11 years agotests/10ddf-create: omit log output check
mwilck@arcor.de [Fri, 25 Oct 2013 10:07:39 +0000 (12:07 +0200)] 
tests/10ddf-create: omit log output check

The test script was counting output lines - its expectations
don't match the current code any more. Remove this pointless
test.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
11 years agomonitor: treat unreadable array_state as clean
mwilck@arcor.de [Fri, 25 Oct 2013 10:07:38 +0000 (12:07 +0200)] 
monitor: treat unreadable array_state as clean

Failure to read array_state can only mean the array has been
deleted by the kernel; it is not an indication that the array
is dirty.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
11 years agomonitor: read_and_act: handle race conditions for resync_start
mwilck@arcor.de [Fri, 25 Oct 2013 10:07:37 +0000 (12:07 +0200)] 
monitor: read_and_act: handle race conditions for resync_start

When arrays are stopped, sysfs attributes may be deleted by
the kernel, and attempts to read these attributes will fail.

Setting resync_start to 0 is wrong in this case, because it
may make is_resync_complete() erroneously return
FALSE for a clean array. It is better to leave resync_start
untouched (the previously read value for this array).

Otherwise set_array_state() will pass the wrong state information
to the metadata handler, which will write it to disk, and at
the next restart an unnecessary recovery is started for the
array.

It is also possible that resync_start is actually *not* deleted
yet when read_and_act is running, and an apparently valid
value of "0" is read from it, with the same effect as described
above. This happens if the kernel has already called md_clean()
on the array (setting recovery_cp = 0), but the delayed removal
of "resync_start" hasn't happened yet. Therefore, in "clear"
state, "resync_start" shouldn't be read at all.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
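
The two rules in this message (keep the old value when the read fails, and do not read at all in "clear" state) might look like this in outline. All names are hypothetical, and treating the text "none" as the maximum sector value is an assumption about the sysfs format.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_SECTOR_SKETCH (~0ULL)   /* assumed meaning of "none" */

enum array_state_sketch { STATE_CLEAR, STATE_OTHER };

/* text is the content read from resync_start, or NULL if the read failed. */
void refresh_resync_start(enum array_state_sketch state, const char *text,
                          unsigned long long *resync_start)
{
    assert(resync_start != NULL);

    if (state == STATE_CLEAR)
        return;                 /* value may be a stale "0": ignore it */
    if (text == NULL)
        return;                 /* read failed: keep the previous value */
    if (strncmp(text, "none", 4) == 0)
        *resync_start = MAX_SECTOR_SKETCH;
    else
        *resync_start = strtoull(text, NULL, 10);
}
```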
11 years agomonitor: don't call pselect() on deleted sysfs files
mwilck@arcor.de [Fri, 25 Oct 2013 10:07:36 +0000 (12:07 +0200)] 
monitor: don't call pselect() on deleted sysfs files

It makes no sense to listen for events on files that have
been deleted. This happens when arrays are stopped and the
kernel removes the associated sysfs structures.

Calling pselect() on the deleted attributes may cause a storm
of wake events.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
11 years agoDDF: add code to debug state changes
mwilck@arcor.de [Fri, 25 Oct 2013 10:07:35 +0000 (12:07 +0200)] 
DDF: add code to debug state changes

The 10ddf-create test case fails sporadically because wrong meta
data is written, making the array appear inconsistent when it's
restarted. Added code to aid debugging this.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
11 years agoDDF: brief_detail_super_ddf: print correct UUID for subarrays
mwilck@arcor.de [Fri, 25 Oct 2013 10:07:34 +0000 (12:07 +0200)] 
DDF: brief_detail_super_ddf: print correct UUID for subarrays

Commit c1ea5a98 caused brief_detail_super_ddf() to be called
for subarrays. But the UUID printed was always the one of the
container. This is wrong and actually worse than printing no UUID
at all, and causes the DDF test case (10ddf-create) to fail.

This patch adds code to determine the MD UUID of a subarray correctly.
The hard part is to figure out for which subarray the function is
called. Moved that to an extra function.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
11 years agoDDF: __write_init_super_ddf: just use seq number of active header
mwilck@arcor.de [Fri, 25 Oct 2013 10:07:33 +0000 (12:07 +0200)] 
DDF: __write_init_super_ddf: just use seq number of active header

It's not necessary to check for 0xffffffff, which is a valid
sequential number.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
11 years agoDDF: __write_ddf_structure: Fix wrong reference to ddf->primary
mwilck@arcor.de [Fri, 25 Oct 2013 10:07:32 +0000 (12:07 +0200)] 
DDF: __write_ddf_structure: Fix wrong reference to ddf->primary

Should reference "header" instead here.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
11 years agoManage_runstop: call flush_mdmon if O_EXCL fails on stopping mdmon array.
NeilBrown [Mon, 22 Apr 2013 07:05:33 +0000 (17:05 +1000)] 
Manage_runstop: call flush_mdmon if O_EXCL fails on stopping mdmon array.

When stopping an mdmon array, a reshape might be being aborted,
which inhibits O_EXCL.  So if that is possible, call flush_mdmon
to make sure mdmon isn't still busy.

Reported-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
11 years agoimsm: monitor: do not finish migration if there are no failed disks
Przemyslaw Czarnowski [Thu, 18 Apr 2013 08:51:37 +0000 (10:51 +0200)] 
imsm: monitor: do not finish migration if there are no failed disks

Transition from "degraded" to "recovery" made in OROM is slightly different
than the same transition in mdadm. The missing disk is not removed from the list of
raid devices, but just from the map. Therefore mdadm should not end migration
based on the existence of a list of missing disks but should rely on the count of
failed disks.

Signed-off-by: Przemyslaw Czarnowski <przemyslaw.hawrylewicz.czarnowski@intel.com>
Tested-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
11 years agoAdd updating component_size to manager thread of mdmon
Pawel Baldysiak [Wed, 3 Apr 2013 01:43:42 +0000 (12:43 +1100)] 
Add updating component_size to manager thread of mdmon

Mdmon does not currently update component_size. This is wrong because when
the size is expanded, component_size is changed by mdadm but mdmon does not
reread the new value and uses the wrong, old one. As a result the metadata
is incorrect during the expansion. It contains no information that a
resync is in progress (and no checkpoint either). The metadata looks
as if the resync has already finished, but it has not.

Component_size will be set to match information in sysfs. This value
will be updated by manager thread in manage_member() function.
Now mdmon uses the correct, current value of component_size and the
correct metadata (containing information about resync and checkpoint)
is written.

Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
11 years agoEnsure mddev_dev struct always zeroed on allocation.
NeilBrown [Tue, 5 Mar 2013 00:53:51 +0000 (11:53 +1100)] 
Ensure mddev_dev struct always zeroed on allocation.

There are a number of fields which should not
be left uninitialised.  e.g. attempt_re_add can get
confused if ->writemostly is not set correctly.

Signed-off-by: NeilBrown <neilb@suse.de>
11 years agoCreate: default to bitmap=internal for large arrays.
NeilBrown [Mon, 4 Mar 2013 23:36:21 +0000 (10:36 +1100)] 
Create: default to bitmap=internal for large arrays.

Here, "large" means components are 100G or more.  It is
usually beneficial to have write-intent bitmaps on such arrays.
They can be suppressed with --bitmap=none

Signed-off-by: NeilBrown <neilb@suse.de>
11 years agoEnhance incremental removal.
NeilBrown [Mon, 4 Mar 2013 22:46:34 +0000 (09:46 +1100)] 
Enhance incremental removal.

When asked to incrementally-remove a device, try marking the array
read-auto first.  That will delay recording the failure in the
metadata until it is really relevant.
This way, if the devices are just unplugged when the array is not
really in use, the metadata will remain clean.

If marking the device as faulty fails because it is EBUSY, that
implies that the array would be failed without the device.  As the
device has (presumably) gone - that means the array is dead.  So try
to stop it.  If that fails because it is in use, send a uevent to
report that it is gone.  Hopefully whoever mounted it will now let go.

This means that if you plug in some devices and they are
auto-assembled, then unplugging them will auto-deassemble relatively
cleanly.

To be complete, we really need the kernel to disassemble the array
after the last close somehow.  Maybe if a REMOVE has failed and a STOP
has failed and nothing else much has happened, it could safely stop
the array on last close.

Signed-off-by: NeilBrown <neilb@suse.de>
11 years agomdadm.8: Detail use for IMSM_NO_PLATFORM environment variable.
NeilBrown [Mon, 4 Mar 2013 06:25:36 +0000 (17:25 +1100)] 
mdadm.8: Detail use for IMSM_NO_PLATFORM environment variable.

Suggested-by: Marcin Tomczak <marcin.tomczak@intel.com>
11 years agoDetail.c: call load_container for container subarrays
mwilck@arcor.de [Fri, 1 Mar 2013 22:28:33 +0000 (23:28 +0100)] 
Detail.c: call load_container for container subarrays

Without calling load_container at this point, the
info structure may be missing some important information.
In particular, information about secondary DDF RAID levels
may be wrong if information is only read from a single disk.

If this fails, fall back to the previous code.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
11 years agoDDF: compare_super_ddf: merge local info of other superblock
mwilck@arcor.de [Fri, 1 Mar 2013 22:28:32 +0000 (23:28 +0100)] 
DDF: compare_super_ddf: merge local info of other superblock

If a match is found in compare_super_ddf, check the other SB
for local DDF information (VD config records, physical disk data)
which is not available in the current superblock, and add it
if needed.

This is important for mdmon - when disks are added to an
auto read-only array, they must be present in the DDF structure
in order to guarantee consistent writeback of metadata to all
disks.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
11 years agoDDF: add sanity checks in compare_super_ddf
mwilck@arcor.de [Fri, 1 Mar 2013 22:28:31 +0000 (23:28 +0100)] 
DDF: add sanity checks in compare_super_ddf

Besides container GUID, also check seqnum, physical and virtual
disk numbers, and check match between local and global sections.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
11 years agoDDF: __write_init_super_ddf: use correct VD conf
mwilck@arcor.de [Fri, 1 Mar 2013 22:28:30 +0000 (23:28 +0100)] 
DDF: __write_init_super_ddf: use correct VD conf

When writing back the DDF structure, make sure that on each disk
we write the configs that include this disk even if a secondary
RAID level is present. Otherwise the secondary RAID will not be
read correctly any more when we open the device next time.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
11 years agoDDF: container_content_ddf: handle RAID layout for RAID10
mwilck@arcor.de [Fri, 1 Mar 2013 22:28:29 +0000 (23:28 +0100)] 
DDF: container_content_ddf: handle RAID layout for RAID10

This patch adds basic handling for the special case of RAID10.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
11 years agoDDF: container_content_ddf: check for secondary RAID
mwilck@arcor.de [Fri, 1 Mar 2013 22:28:28 +0000 (23:28 +0100)] 
DDF: container_content_ddf: check for secondary RAID

Check for supportable secondary RAID configurations.
There is currently only one: RAID 10, if the stripe
sizes and Basic volume sizes are all equal.

With this patch, mdadm will not try to start unsupported
secondary RAID level configurations any more.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
11 years agoDDF: container_content_ddf: change array disk search loop
mwilck@arcor.de [Fri, 1 Mar 2013 22:28:27 +0000 (23:28 +0100)] 
DDF: container_content_ddf: change array disk search loop

When searching for container elements, loop over the known phys
disks rather than the elements of the current configuration.

This patch changes nothing in the logic or return value of the code.
It just prepares extended logic for handling RAID10.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
11 years agoDDF: load_ddf_local: store VD conf for other BVDs
mwilck@arcor.de [Fri, 1 Mar 2013 22:28:26 +0000 (23:28 +0100)] 
DDF: load_ddf_local: store VD conf for other BVDs

Store VD config for other BVDs in the other_bvds array.
This allows handling secondary RAID levels in container_content_ddf.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
11 years agoDDF: added other_bvd to struct vcl
mwilck@arcor.de [Fri, 1 Mar 2013 22:28:25 +0000 (23:28 +0100)] 
DDF: added other_bvd to struct vcl

The VD config structures of different BVDs in the same SVD may be
different. This pointer stores the other BVDs.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
11 years agoDDF: increase seq number when writing meta data
mwilck@arcor.de [Fri, 1 Mar 2013 22:28:24 +0000 (23:28 +0100)] 
DDF: increase seq number when writing meta data

Cleanly increase the seq number when the DDF structures are
written, instead of always setting it back to 1.

Also, make sure that the sequential number of all headers and
VD conf records is the same.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
11 years agoDDF: use existing locations for primary and secondary DDF structure
mwilck@arcor.de [Fri, 1 Mar 2013 22:28:23 +0000 (23:28 +0100)] 
DDF: use existing locations for primary and secondary DDF structure

Some RAID BIOSes apparently use hard-coded LBA offsets (presumably
from the end of the disk) for the primary and secondary DDF
structure, ignoring the values given in the DDF anchor. This is
broken BIOS behavior, but it will cause any changes made by MD
(e.g. setting the init_state flag after a full initialization)
to be "forgotten" after the next reboot.

This patch fixes this by using the existing LBA locations if
available. Verified that this fixes MD+LSI Mega Software RAID
BIOS.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
11 years agoDDF: cleanly save the secondary DDF structure
mwilck@arcor.de [Fri, 1 Mar 2013 22:28:22 +0000 (23:28 +0100)] 
DDF: cleanly save the secondary DDF structure

So far, mdadm only saved the header of the secondary structure.
With this patch, the full secondary DDF structure is saved
consistently, too. Some vendor DDF implementations need it.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
11 years agoDiscard devnum in favour of devnm
NeilBrown [Thu, 1 Nov 2012 05:14:01 +0000 (16:14 +1100)] 
Discard devnum in favour of devnm

We widely use a "devnum" which is 0 or +ve for md%d devices
and -ve for md_d%d devices.
But I want to be able to use md_%s device names.

So get rid of devnum (a number) and use devnm (a 32char string).
eg.
  md0
  md_d2
  md_home

Signed-off-by: NeilBrown <neilb@suse.de>
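
To show how the old numeric convention relates to the new names, here is a hypothetical conversion helper. The exact encoding of negative numbers (-1 for md_d0, -2 for md_d1, and so on) is an assumption, as are the function name and the DEVNM_LEN constant. Names such as md_home have no numeric form at all, which is the point of the change.

```c
#include <assert.h>
#include <stdio.h>

#define DEVNM_LEN 32    /* "a 32char string", as described above */

/* Map an old-style devnum to a devnm string.
 * Assumed encoding: devnum >= 0 is md<N>, devnum < 0 is md_d<-1-N>. */
void devnum_to_devnm(int devnum, char devnm[DEVNM_LEN])
{
    assert(devnm != NULL);
    if (devnum >= 0)
        snprintf(devnm, DEVNM_LEN, "md%d", devnum);
    else
        snprintf(devnm, DEVNM_LEN, "md_d%d", -1 - devnum);
}
```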
11 years agoGrow: fix problem with reshaping RAID4 to RAID0.
NeilBrown [Thu, 21 Feb 2013 06:02:21 +0000 (17:02 +1100)] 
Grow: fix problem with reshaping RAID4 to RAID0.

As 'layout' doesn't map neatly from RAID4 to RAID5, we need to
set it correctly for RAID4.
Also, when no reshape is needed we should set re->level to the final
desired level.

Signed-off-by: NeilBrown <neilb@suse.de>
11 years agoGrow: disallow --size changes on RAID0 and Linear.
NeilBrown [Thu, 21 Feb 2013 03:51:11 +0000 (14:51 +1100)] 
Grow: disallow --size changes on RAID0 and Linear.

These aren't meaningful and must be disabled.

Signed-off-by: NeilBrown <neilb@suse.de>
11 years agoudev: Fix order of execution of the md rules
Thomas Bächler [Sat, 9 Feb 2013 20:49:47 +0000 (21:49 +0100)] 
udev: Fix order of execution of the md rules

Right now, the rules that run blkid on raid arrays are executed after
the assembly rules. This means incremental assembly will always fail
when raid arrays are again physical components of raid arrays.

Instead of simply reversing the order, split the rules up into two files,
one dealing with array properties and one dealing with assembly.

Signed-off-by: NeilBrown <neilb@suse.de>
11 years agoModernize udev rules
Thomas Bächler [Sat, 9 Feb 2013 17:48:38 +0000 (18:48 +0100)] 
Modernize udev rules

* $tempnode is deprecated, use $devnode
* blkid -o udev output is deprecated, use IMPORT{builtin}="blkid" instead

Signed-off-by: NeilBrown <neilb@suse.de>
11 years agomdadm.h: fix ugly glibc specific ifdeffery
John Spencer [Sat, 2 Feb 2013 16:37:55 +0000 (17:37 +0100)] 
mdadm.h: fix ugly glibc specific ifdeffery

The code that was exposed on anything other than dietlibc and klibc
is entirely glibc specific and broke the build on musl libc.

Signed-off-by: John Spencer <maillist-mdadm@barfooze.de>
Signed-off-by: NeilBrown <neilb@suse.de>
11 years agoplatform-intel: canonicalize_file_name() is not portable
John Spencer [Sat, 2 Feb 2013 16:26:45 +0000 (17:26 +0100)] 
platform-intel: canonicalize_file_name() is not portable

this is a GLIBC specific feature and should not be used.

according to its manpage:
"The call canonicalize_file_name(path) is equivalent
to the call realpath(path, NULL)."

thus, we use realpath so it works everywhere.

Signed-off-by: John Spencer <maillist-mdadm@barfooze.de>
Signed-off-by: NeilBrown <neilb@suse.de>
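
The portable replacement amounts to the following. `canonical_path` is a hypothetical wrapper name, and the sketch assumes a libc that implements the POSIX.1-2008 behaviour of realpath() with a NULL buffer.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* ask for the POSIX.1-2008 declarations */
#endif
#include <assert.h>
#include <stdlib.h>

/* Portable stand-in for glibc's canonicalize_file_name().
 * Returns a malloc()ed absolute path that the caller must free(),
 * or NULL on failure (realpath() sets errno). */
char *canonical_path(const char *path)
{
    assert(path != NULL);
    return realpath(path, NULL);
}
```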
11 years agomake --update=homehost work again
NeilBrown [Thu, 7 Feb 2013 00:51:21 +0000 (11:51 +1100)] 
make --update=homehost work again

Commit 1e2b276535cea41c348292a019bdda8a58cb1679 (Report error in --update
string is not recognised) broke homehost updating functionality because it
depended on each string comparison being done even after we already found
a match.  Make it work again by restructuring code.

Reported-by: (and original version by) Justin Maggard <jmaggard10@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
11 years agoAvoid using BLKFLSBUF.
NeilBrown [Tue, 5 Feb 2013 04:34:17 +0000 (15:34 +1100)] 
Avoid using BLKFLSBUF.

Now that we use O_DIRECT for all device IO, BLKFLSBUF is not needed to
ensure we get current data, and it can impose a cost if any flush-out
is needed.  So remove it.

To be safe, add O_DIRECT to one place where it isn't currently used:
when reading a bitmap.

Signed-off-by: NeilBrown <neilb@suse.de>
11 years agoDetail: print correct size for large external-metadata arrays.
NeilBrown [Tue, 5 Feb 2013 04:32:49 +0000 (15:32 +1100)] 
Detail: print correct size for large external-metadata arrays.

If externally managed metadata is in use, array.major_version will
be zero, so the test here to consider using get_component_size()
is wrong.  So if sra is present, use the major_version from there.

Signed-off-by: NeilBrown <neilb@suse.de>
11 years agomdmon: add --foreground option
NeilBrown [Tue, 5 Feb 2013 04:57:09 +0000 (15:57 +1100)] 
mdmon: add --foreground option

While not strictly necessary for systemd, it is cleaner to avoid
forking when running from a management daemon.  So add a --foreground
option to mdmon.

Signed-off-by: NeilBrown <neilb@suse.de>
11 years agoIn case launching mdmon fails, print an error message before exiting
Jes Sorensen [Fri, 1 Feb 2013 15:15:19 +0000 (16:15 +0100)] 
In case launching mdmon fails, print an error message before exiting

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
11 years agoAdd support for launching mdmon via systemctl instead of fork/exec
Jes Sorensen [Fri, 1 Feb 2013 15:15:18 +0000 (16:15 +0100)] 
Add support for launching mdmon via systemctl instead of fork/exec

If launching mdmon via systemctl fails, we fall back to the old method
of fork/exec. This allows for having mdmon launched via systemctl
which avoids problems with it getting killed by systemd due to it
ending up in the parent's cgroup (udev).

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
11 years agoRemove --offroot argument and default to always setting argv[0] to @
Jes Sorensen [Fri, 1 Feb 2013 15:15:17 +0000 (16:15 +0100)] 
Remove --offroot argument and default to always setting argv[0] to @

We still allow --offroot to be given - for compatibility with scripts
- but ignore it.

The whole point of --offroot is to get systemd to not auto-kill mdmon,
and we always want that.

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
11 years agomdadm.conf.5: clarify connection between action=re-add and bitmaps.
NeilBrown [Sun, 20 Jan 2013 23:12:53 +0000 (10:12 +1100)] 
mdadm.conf.5: clarify connection between action=re-add and bitmaps.

action=re-add will only re-add a recently removed device if a
bitmap is present.
Otherwise a force-spare is needed.

Signed-off-by: NeilBrown <neilb@suse.de>
11 years agodev_open - don't bother trying map_dev
NeilBrown [Sun, 6 Jan 2013 23:38:46 +0000 (10:38 +1100)] 
dev_open - don't bother trying map_dev

map_dev can be slow, and doesn't really provide a better result
than just creating a temporary device.
So discard it and use mknod/open/unlink to open a major:minor device.

Signed-off-by: NeilBrown <neilb@suse.de>
11 years agoplatform-intel - cache 'intel_devices' for a few seconds.
NeilBrown [Sun, 6 Jan 2013 23:34:43 +0000 (10:34 +1100)] 
platform-intel - cache 'intel_devices' for a few seconds.

find_intel_devices() can take a little while to run as it scans
some directory tree, and the result isn't likely to change
often.
So cache the value and only discard it after 10 seconds.

Signed-off-by: NeilBrown <neilb@suse.de>
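
The caching idea in the commit above can be sketched as follows. This is an illustration under stated assumptions, not the code from platform-intel.c: struct scan_result, slow_scan() and find_devices_cached() are hypothetical names, and the current time is passed in as a parameter only so the behaviour is easy to exercise. The 10-second lifetime is the one figure taken from the commit message.

```c
#include <stddef.h>
#include <time.h>

#define CACHE_LIFETIME 10	/* seconds, as stated in the commit message */

/* Stand-in for the real device list; hypothetical. */
struct scan_result {
	int ndevices;
};

static struct scan_result *cached;	/* last scan result, or NULL if none yet */
static time_t cached_at;		/* time at which 'cached' was produced */
static int scan_calls;			/* number of slow scans performed so far */

/* Placeholder for the slow directory-tree scan. */
static struct scan_result *slow_scan(void)
{
	static struct scan_result result;

	scan_calls++;
	result.ndevices = scan_calls;
	return &result;
}

/*
 * Return the cached result while it is younger than CACHE_LIFETIME
 * seconds; otherwise discard it and scan again.
 */
static struct scan_result *find_devices_cached(time_t now)
{
	if (cached && now - cached_at < CACHE_LIFETIME)
		return cached;

	cached = slow_scan();
	cached_at = now;
	return cached;
}
```

In real use the caller would pass time(NULL); repeated calls within the lifetime then cost one comparison instead of a directory scan.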
11 years agoconditionally remove map_dev from find_free_devnum
NeilBrown [Sun, 6 Jan 2013 23:17:04 +0000 (10:17 +1100)] 
conditionally remove map_dev from find_free_devnum

map_dev can be slow so it is best to not call it when
not necessary.
The final test in "find_free_devnum" is not relevant when
udev is being used, so remove the test in that case.

Signed-off-by: NeilBrown <neilb@suse.de>
11 years agoMISC: Add --examine-badblocks option
NeilBrown [Wed, 5 Dec 2012 01:56:31 +0000 (12:56 +1100)] 
MISC: Add --examine-badblocks option

This will list the contents of the bad-blocks log, if one is present.

Signed-off-by: NeilBrown <neilb@suse.de>
11 years agoAssemble: fix spelling: report_missmatch -> report_mismatch
NeilBrown [Wed, 5 Dec 2012 00:40:28 +0000 (11:40 +1100)] 
Assemble: fix spelling: report_missmatch -> report_mismatch

Signed-off-by: NeilBrown <neilb@suse.de>
11 years agoAssemble: Don't auto-assemble arrays which conflict with mdadm.conf
NeilBrown [Wed, 5 Dec 2012 00:06:55 +0000 (11:06 +1100)] 
Assemble: Don't auto-assemble arrays which conflict with mdadm.conf

When auto-assembling we might find an array which appears in
mdadm.conf.
This can happen if the array (based on UUID) doesn't match what is
in mdadm.conf.
For consistency we should avoid auto-assembling such an array just as
we avoid regular-assembling of the array.

Reported-by: Ross Boylan <ross@biostat.ucsf.edu>
Signed-off-by: NeilBrown <neilb@suse.de>
11 years agoManage: Add support for --re-add faulty
NeilBrown [Tue, 27 Nov 2012 23:19:52 +0000 (10:19 +1100)] 
Manage: Add support for --re-add faulty

mdadm /dev/mdXX --re-add faulty

will identify any faulty devices in the array, remove them, and
--re-add them.

Signed-off-by: NeilBrown <neilb@suse.de>
11 years agoFix "--remove faulty" and similar commands.
NeilBrown [Tue, 27 Nov 2012 23:12:09 +0000 (10:12 +1100)] 
Fix "--remove faulty" and similar commands.

A recent change to improve error messages for subdev management broke
all use cases where device names like %d:%d were used.
Re-arrange the code again so we use dev_open first - which understands
those names - and then only try 'stat' if that failed.
The important thing is to base the 'Cannot find' message on the result
of 'stat', not on the result of 'open'.

Signed-off-by: NeilBrown <neilb@suse.de>
11 years agoAssemble: ensure that <ignore>d arrays are not auto-assembled.
NeilBrown [Thu, 22 Nov 2012 06:04:20 +0000 (17:04 +1100)] 
Assemble: ensure that <ignore>d arrays are not auto-assembled.

It isn't enough to simply not assemble arrays found to be called
<ignore>, as the final stage of auto-assemble doesn't check for names
in mdadm.conf.

So add a check to Assemble, similar to the check in Incremental()

Reported-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: NeilBrown <neilb@suse.de>
11 years agoconf: allow multiple arrays to be <ignore>d
NeilBrown [Thu, 22 Nov 2012 05:28:00 +0000 (16:28 +1100)] 
conf: allow multiple arrays to be <ignore>d

We currently complain if mdadm.conf contains multiple
definitions for the same name.  Unfortunately this stops
multiple arrays from being <ignore>d.

So exclude "<ignore>" from the duplicate-names test.

Reported-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: NeilBrown <neilb@suse.de>
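
The rule in the commit above, a duplicate-name check that skips "<ignore>", can be sketched as follows. This is a simplified illustration, not the mdadm.conf parser: has_duplicate_name() is a hypothetical function working on a plain array of names.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: return 1 if any name occurs more than once in
 * names[0..n-1], and 0 otherwise.  Entries equal to "<ignore>" are
 * skipped, so any number of arrays may be ignored without tripping
 * the check.
 */
static int has_duplicate_name(const char *const names[], size_t n)
{
	size_t i, j;

	for (i = 0; i < n; i++) {
		if (strcmp(names[i], "<ignore>") == 0)
			continue;
		for (j = i + 1; j < n; j++)
			if (strcmp(names[i], names[j]) == 0)
				return 1;
	}
	return 0;
}
```

With this rule, two "<ignore>" entries are accepted, while two entries that both name /dev/md0 are still reported as duplicates.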
11 years agoAllow --wait to wait for delayed resync.
NeilBrown [Wed, 21 Nov 2012 21:58:54 +0000 (08:58 +1100)] 
Allow --wait to wait for delayed resync.

If a resync is delayed, then e->percent will be negative but not
RESYNC_NONE.  In that case we still want to wait.

Reported-by: Ross Boylan <ross@biostat.ucsf.edu>
Signed-off-by: NeilBrown <neilb@suse.de>
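
One plausible reading of the commit above is sketched below. It is an assumption-laden illustration, not the actual change: the numeric values of RESYNC_NONE and RESYNC_DELAYED are chosen only for this example, and both function names are hypothetical. The point is the difference between treating any negative percentage as "nothing to wait for" and treating only RESYNC_NONE that way.

```c
/* Sentinel values assumed for illustration; any distinct negatives would do. */
enum {
	RESYNC_NONE = -1,	/* no resync is happening or pending */
	RESYNC_DELAYED = -2	/* a resync is queued but has not started */
};

/* Old behaviour as described: any negative value ends the wait. */
static int wait_is_over_old(int percent)
{
	return percent < 0;
}

/* Behaviour after the fix as described: only RESYNC_NONE ends the wait. */
static int wait_is_over(int percent)
{
	return percent == RESYNC_NONE;
}
```

Under the old test a delayed resync made --wait return immediately; under the new one it keeps waiting until the resync has actually run.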
11 years agoGrow: fix bug when multiple arrays present.
NeilBrown [Wed, 21 Nov 2012 21:57:25 +0000 (08:57 +1100)] 
Grow: fix bug when multiple arrays present.

commit 1f9b0e2845e1ec22dc24dcef275a733c09ff2edd
    Grow - be careful about 'delayed' reshapes.

Introduced a bug where a list of devices longer than 1
would cause an infinite loop.  Oops.

Signed-off-by: NeilBrown <neilb@suse.de>
11 years agoMakefile: remove "sh" from instructions for running 'test'.
NeilBrown [Tue, 20 Nov 2012 01:15:11 +0000 (12:15 +1100)] 
Makefile: remove "sh" from instructions for running 'test'.

'test' is really a bash script more than an 'sh' script, so
don't say "run 'sh ./test'", just say "run './test'".

Reported-by: Gilles Espinasse <g.esp@free.fr>
Signed-off-by: NeilBrown <neilb@suse.de>