git.ipfire.org Git - thirdparty/mdadm.git/log
10 years agoDDF: layout_md2ddf: new md->DDF layout conversion
mwilck@arcor.de [Wed, 3 Jul 2013 20:27:52 +0000 (22:27 +0200)] 
DDF: layout_md2ddf: new md->DDF layout conversion

Support for RAID 10 makes it necessary to rewrite the algorithm
for deriving DDF layout from MD layout. The functions level_to_prl
and layout_to_rlq are combined in a single function that takes
md layout parameters and converts them to DDF.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoDDF: layout_ddf2md: new DDF->md RAID layout conversion
mwilck@arcor.de [Wed, 3 Jul 2013 20:27:51 +0000 (22:27 +0200)] 
DDF: layout_ddf2md: new DDF->md RAID layout conversion

layout_ddf2md() is a new RAID layout conversion routine.
It obsoletes the previous separate routines for obtaining
md level and layout (map_num1, rlq_to_layout).

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoDDF: allow empty slots in virt disk table
mwilck@arcor.de [Wed, 3 Jul 2013 20:27:50 +0000 (22:27 +0200)] 
DDF: allow empty slots in virt disk table

The DDF code was assuming that the VD slots 0..populated_vdes
were used and the rest was unused. Remove this assumption and
deal with empty slots instead.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoDDF: get_svd_state: Status logic for secondary RAID level
mwilck@arcor.de [Wed, 3 Jul 2013 20:27:49 +0000 (22:27 +0200)] 
DDF: get_svd_state: Status logic for secondary RAID level

Implement logic to derive the status of a secondary RAID
from its members. Use it in ddf_set_disk.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoDDF: ddf_set_disk: move status logic to separate function
mwilck@arcor.de [Wed, 3 Jul 2013 20:27:48 +0000 (22:27 +0200)] 
DDF: ddf_set_disk: move status logic to separate function

Moved code to determine RAID status to a separate function
get_bvd_status(). I need this to account for secondary RAID level.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoDDF: find_vdcr: account for secondary RAID level
mwilck@arcor.de [Wed, 3 Jul 2013 20:27:47 +0000 (22:27 +0200)] 
DDF: find_vdcr: account for secondary RAID level

If secondary RAID level is taken into account, translation between
the md RAID member (raid_disk) and the index of a physical disk
in a BVD becomes more complex.

Also, take into account that the member list can have unused entries
(this is independent of secondary RAID level).

Adapt usage of find_vdcr() accordingly

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoDDF: ddf_open_new: implement minimal consistency check
mwilck@arcor.de [Wed, 3 Jul 2013 20:27:46 +0000 (22:27 +0200)] 
DDF: ddf_open_new: implement minimal consistency check

Added a minimal consistency check as in imsm_open_new().

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoDDF: Implement store_super_ddf
mwilck@arcor.de [Wed, 3 Jul 2013 20:27:45 +0000 (22:27 +0200)] 
DDF: Implement store_super_ddf

This patch implements the previously unsupported case where
store_super_ddf is called with a non-empty superblock.

For DDF, writing meta data to just one disk makes no sense.
We would run the risk of writing inconsistent meta data
to the devices. So just call __write_init_super_ddf and
write to all devices, including the one passed by the caller.

This patch assumes that the device to store the superblock on
has already been added to the DDF structure. Otherwise, an
error message will be emitted.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoDDF: handle "open flag" according to spec
mwilck@arcor.de [Wed, 3 Jul 2013 20:27:44 +0000 (22:27 +0200)] 
DDF: handle "open flag" according to spec

The DDF spec mandates that the "open flag" be set to non-0 before
writing a configuration, and reset to 0 when finished to indicate
success.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoDDF: load_ddf_headers: use secondary header as fallback
mwilck@arcor.de [Wed, 3 Jul 2013 20:27:43 +0000 (22:27 +0200)] 
DDF: load_ddf_headers: use secondary header as fallback

When the primary header can't be read, use the secondary header
as fallback.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoDDF: check_secondary: fix treatment of missing BVDs
mwilck@arcor.de [Wed, 3 Jul 2013 20:27:42 +0000 (22:27 +0200)] 
DDF: check_secondary: fix treatment of missing BVDs

Unused BVDs should just be skipped instead of bailing out.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoDDF (cleanup): use a common macro for failed searches
mwilck@arcor.de [Wed, 3 Jul 2013 20:27:41 +0000 (22:27 +0200)] 
DDF (cleanup): use a common macro for failed searches

Use DDF_NOTFOUND instead of NO_SUCH_REFNUM.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
10 years agomdadm.8: growing RAID10 chunk size is possible
Christoph Anton Mitterer [Fri, 5 Jul 2013 02:14:44 +0000 (04:14 +0200)] 
mdadm.8: growing RAID10 chunk size is possible

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoAssemble: avoid a consistency check when --force is given.
NeilBrown [Mon, 8 Jul 2013 02:02:23 +0000 (12:02 +1000)] 
Assemble: avoid a consistency check when --force is given.

mdadm will normally not include a device into an array if that device
reports that the "best" device has failed, as this normally implies
some sort of inconsistency.
However when --force is given it means that the given drives really
should be assembled if at all possible so in that case the test should
be avoided.

The particular case where this was a problem was a RAID5 where all
devices had the same event count but three of them reported that the
first two had failed.
As they all had the same event count the first was taken as the 'best'
and that caused the later ones to be excluded.  Listing one of the
later ones first allowed the array to be assembled.  So in this case
the test clearly just got in the way and did nothing useful.

Reported-by: "Marek Jaros" <mjaros1@nbox.cz>
Signed-off-by: NeilBrown <neilb@suse.de>
10 years agotests: add test to revert shrinking reshapes.
NeilBrown [Thu, 4 Jul 2013 07:16:58 +0000 (17:16 +1000)] 
tests: add test to revert shrinking reshapes.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoGrow: notice when --stop is synchronising a reshape and don't mess it up.
NeilBrown [Thu, 4 Jul 2013 07:16:20 +0000 (17:16 +1000)] 
Grow: notice when --stop is synchronising a reshape and don't mess it up.

--stop now tries to wait for a reshape to be at just the right spot.
However for a reducing reshape, mdadm will be running in the
background watching, and might adjust sync_max and mess things up.

So teach "progress_reshape" to notice when "sync_max" is modified, and
leave it alone.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoGrow: fix small bug when reshape interrupted.
NeilBrown [Wed, 3 Jul 2013 01:39:28 +0000 (11:39 +1000)] 
Grow: fix small bug when reshape interrupted.

progress_reshape() may not set reshape_completed if the reshape is
interrupted, so we need to initialize it to the current value before
hand, so the value used afterwards is credible.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agotests: add a test for reverting reshapes
NeilBrown [Tue, 2 Jul 2013 06:19:52 +0000 (16:19 +1000)] 
tests: add a test for reverting reshapes

Only reverting reshapes that grow the array so far.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoStop: improve synchronising of reshape with whole stripes.
NeilBrown [Tue, 2 Jul 2013 06:18:21 +0000 (16:18 +1000)] 
Stop: improve synchronising of reshape with whole stripes.

It is possible for 'sync_completed' to be further ahead than
we deduced from 'reshape_position'.  However we cannot read it while
the array is frozen, so it is hard to know.

Once the array is unfrozen, check, and if 'sync_completed' is ahead of
'sync_max', push 'sync_max' well ahead of 'sync_completed' so it
will all synchronise up properly.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agorevert-reshape: only impose reshape_position tests on raid[456]
NeilBrown [Tue, 2 Jul 2013 06:10:27 +0000 (16:10 +1000)] 
revert-reshape: only impose reshape_position tests on raid[456]

This test is irrelevant for RAID10, so restrict it to those
levels in which it is meaningful.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agosysfs: fix bugs in new sysfs_wait function.
NeilBrown [Tue, 2 Jul 2013 06:08:34 +0000 (16:08 +1000)] 
sysfs: fix bugs in new sysfs_wait function.

- 'tv' isn't initialised properly.
- 100?  I'm sure I fixed that already! Seems not.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoraid6check - fix compile
NeilBrown [Tue, 2 Jul 2013 06:06:55 +0000 (16:06 +1000)] 
raid6check - fix compile

Recent rearrangement of library code broke 'raid6check' and this
wasn't noticed because 'make everything' doesn't build it.

So fix the breakage and have 'make everything' build it.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agotest: include any 'stderr' output in the log file.
NeilBrown [Tue, 2 Jul 2013 03:12:07 +0000 (13:12 +1000)] 
test: include any 'stderr' output in the log file.

Errors from mdadm go to 'stderr', so if there are any,
copy those to the log file.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoAssemble: improve messages when restarting a reshape.
NeilBrown [Tue, 2 Jul 2013 03:09:07 +0000 (13:09 +1000)] 
Assemble: improve messages when restarting a reshape.

If the restarted reshape needs a backup file and we don't have one,
that should be reported before we try to start the array.
Also, the message should say "cannot complete" rather than "Cannot grow".

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoAssemble: ignore devices= if container= is present.
NeilBrown [Tue, 2 Jul 2013 01:07:38 +0000 (11:07 +1000)] 
Assemble: ignore devices= if container= is present.

If "container=" is present, then we are going to assemble from the
given container whether that container is made of those devices or not.
So in this case the "devices=" is purely documentation and is best
ignored.

As part of this, move the test on the "container=" value when that
starts with "/" up before the device is opened.  The sooner we test
things, the better.

Reported-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoConfig: use better device names for "DEVICES container"
NeilBrown [Tue, 2 Jul 2013 00:46:43 +0000 (10:46 +1000)] 
Config: use better device names for "DEVICES container"

When "containers" appears on the "DEVICES" line (which it does by
default), use names from the mdadm map file instead of kernel names,
when possible.
This means that the name will be more likely to appear in mdadm.conf
and so more likely to match "container=" tags.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoAssemble: write raid-disks should be less fatal.
NeilBrown [Tue, 2 Jul 2013 00:31:31 +0000 (10:31 +1000)] 
Assemble: write raid-disks should be less fatal.

If the container metadata doesn't know how many devices to expect (as
is the case with IMSM), don't fail an --assemble which over-specifies
the number of devices.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoMove find_free_devnum to mdopen.c
NeilBrown [Tue, 2 Jul 2013 00:24:50 +0000 (10:24 +1000)] 
Move find_free_devnum to mdopen.c

There is only one call to find_free_devnum and it is in mdopen.c

This removes a dependency between util.c and config.c which allows
us to now drop config.o from mdmon.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoMove conf_line and free_line from conf.c to lib.c
NeilBrown [Tue, 2 Jul 2013 00:17:51 +0000 (10:17 +1000)] 
Move conf_line and free_line from conf.c to lib.c

As they are used for mdstat as well as mdadm.conf, they don't really
belong in conf.c

This removes a dependency between mdmon and conf.c

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoDetail: Factor out add_device()
Martin Wilck [Thu, 27 Jun 2013 19:39:27 +0000 (21:39 +0200)] 
Detail: Factor out add_device()

Makes the code a little more readable.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
10 years agomdadm.8.in: Fix typo: previous -> previously
NeilBrown [Mon, 1 Jul 2013 22:30:28 +0000 (08:30 +1000)] 
mdadm.8.in: Fix typo: previous -> previously

Signed-off-by: Wieland Hoffmann <themineo@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoManage: check alignment when stopping an array undergoing reshape.
NeilBrown [Mon, 1 Jul 2013 05:10:05 +0000 (15:10 +1000)] 
Manage: check alignment when stopping an array undergoing reshape.

To be able to revert a reshape of a raid4/5/6 array which is changing
the number of devices, the reshape must have been stopped on a multiple
of both the old and new stripe sizes.

The kernel only enforces the new stripe size multiple.

So we enforce the old-stripe-size multiple by careful use of
"sync_max" and monitoring "reshape_position".
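
The alignment rule can be illustrated with a small sketch. It is not
mdadm's code: the helper names are hypothetical, and positions and stripe
sizes are assumed to be in the same unit (sectors, say). A position is
acceptable only if it is a multiple of both stripe sizes, which is the
same as being a multiple of their least common multiple.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Greatest common divisor (Euclid's algorithm). */
uint64_t gcd_u64(uint64_t a, uint64_t b)
{
	while (b != 0) {
		uint64_t t = a % b;
		a = b;
		b = t;
	}
	return a;
}

/* Hypothetical helper: true if 'pos' is a multiple of both the old and
 * the new stripe size, i.e. a point from which a revert is possible. */
bool aligned_for_revert(uint64_t pos, uint64_t old_stripe, uint64_t new_stripe)
{
	assert(old_stripe > 0 && new_stripe > 0);
	return pos % old_stripe == 0 && pos % new_stripe == 0;
}

/* Hypothetical helper: largest position <= 'pos' that is a multiple of
 * both stripe sizes (a multiple of their least common multiple). */
uint64_t round_down_to_both(uint64_t pos, uint64_t old_stripe, uint64_t new_stripe)
{
	assert(old_stripe > 0 && new_stripe > 0);
	uint64_t lcm = old_stripe / gcd_u64(old_stripe, new_stripe) * new_stripe;
	return pos - pos % lcm;
}
```

For example, with a 128-sector chunk, going from 3 to 4 data devices gives
stripes of 384 and 512 sectors, so usable stopping points are 1536 sectors
apart.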

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoNew function: sysfs_wait
NeilBrown [Mon, 1 Jul 2013 03:28:13 +0000 (13:28 +1000)] 
New function: sysfs_wait

We have several places that wait for activity on a sysfs
file.  Combine most of these into a single 'sysfs_wait' function.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agorevert-reshape: make sure reshape_position is acceptable.
NeilBrown [Thu, 27 Jun 2013 06:38:53 +0000 (16:38 +1000)] 
revert-reshape: make sure reshape_position is acceptable.

We can only revert a reshape if the reshape_position aligns
properly for the old geometry.
If it doesn't we just fail for now.

Also fix a +/- error with updating raid_disks for super1.c

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agotests/raid6repair: default data offset has changed.
NeilBrown [Thu, 27 Jun 2013 04:29:18 +0000 (14:29 +1000)] 
tests/raid6repair: default data offset has changed.

So the test scripts must change too.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years ago"make test" should build "raid6check"
NeilBrown [Thu, 27 Jun 2013 04:09:48 +0000 (14:09 +1000)] 
"make test" should build "raid6check"

As there are selftests for raid6check.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoAssemble: remove some stray tracing.
NeilBrown [Thu, 27 Jun 2013 04:07:38 +0000 (14:07 +1000)] 
Assemble: remove some stray tracing.

Was introduced in:
  Assemble: when forcing a single-degraded RAID6 array, trigger a 'repair'.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoGrow: fix crash when restarting an array.
NeilBrown [Thu, 27 Jun 2013 03:10:44 +0000 (13:10 +1000)] 
Grow: fix crash when restarting an array.

After the 'started' label it is assumed that 'sra' is set, so better
set it when jumping there.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoGrow: lack of head/tail space not fatal for RAID5 etc.
NeilBrown [Thu, 27 Jun 2013 00:20:34 +0000 (10:20 +1000)] 
Grow: lack of head/tail space not fatal for RAID5 etc.

For RAID10, we must have head/tail space for reshape.
For RAID4/5/6 we can use a spare or a backup file.

So make that distinction.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoGrow: report better message when --grow --chunk cannot work.
NeilBrown [Thu, 27 Jun 2013 00:12:31 +0000 (10:12 +1000)] 
Grow: report better message when --grow --chunk cannot work.

When changing the chunksize of an array, the new chunksize must
divide the device size.
If it doesn't we report a very brief message.
Make this message a bit longer and suggest a way forward by reducing
the size of the array.
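
A minimal sketch of the divisibility test and of the "reduce the size"
suggestion follows. The function names are hypothetical and the units are
assumed to match (KiB, say); the wording of mdadm's actual message is not
reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if the new chunk size divides the device size. */
bool chunk_divides(uint64_t dev_size, uint64_t new_chunk)
{
	assert(new_chunk > 0);
	return dev_size % new_chunk == 0;
}

/* Hypothetical helper: largest size <= dev_size that the new chunk size
 * does divide; this is the reduced size a message could suggest. */
uint64_t largest_fitting_size(uint64_t dev_size, uint64_t new_chunk)
{
	assert(new_chunk > 0);
	return dev_size - dev_size % new_chunk;
}
```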

Reported-by: Mark Knecht <markknecht@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoMakefile/version: use version/date from .git if possible.
NeilBrown [Tue, 25 Jun 2013 06:27:05 +0000 (16:27 +1000)] 
Makefile/version: use version/date from .git if possible.

If being built from a git tree, use the version and date
information from the top commit rather than the hard-coded
values.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoSubject: Make wait_for and open_dev_excl faster
NeilBrown [Tue, 25 Jun 2013 05:56:22 +0000 (15:56 +1000)] 
Subject: Make wait_for and open_dev_excl faster

When we create or assemble an array, we wait for udev to create the
device file in /dev so that as soon as mdadm completes, the device can
be used.

This waiting is performed in multiples of 200ms, which can sometimes
be too long to wait.

So change to an exponential backoff.  Wait 1, then 2, then 4 msec etc.
Once we get to 256msec, stop backing off and continue waiting 256ms at
a time until we reach the limit which is now 4.608sec rather than 5sec
which it was before.

Ditto for open_dev_excl.
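
The delay schedule described above can be sketched as follows. This is an
illustration only, not the code from this commit: the function names are
hypothetical, and the number of attempts is an assumption, since the
message gives only the overall limit.

```c
#include <assert.h>

#define MAX_DELAY_MS 256u

/* Next delay in the schedule: double until the cap, then stay at the cap. */
unsigned next_delay_ms(unsigned delay_ms)
{
	assert(delay_ms > 0);
	return delay_ms * 2 < MAX_DELAY_MS ? delay_ms * 2 : MAX_DELAY_MS;
}

/* Total time slept after 'attempts' waits, starting from 1 msec.
 * A real caller would re-check for the device before each sleep
 * and return as soon as it appears. */
unsigned total_wait_ms(unsigned attempts)
{
	unsigned total = 0, delay = 1;

	for (unsigned i = 0; i < attempts; i++) {
		total += delay;
		delay = next_delay_ms(delay);
	}
	return total;
}
```

The benefit is in the common case: a device that appears after a few
milliseconds is noticed almost immediately instead of after a full 200ms
tick.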

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agotests: add device size tests when changing raid level to/from 0
NeilBrown [Tue, 25 Jun 2013 05:54:44 +0000 (15:54 +1000)] 
tests: add device size tests when changing raid level to/from 0

There was a kernel bug that got this wrong, so better check for it.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoGrow: fix bug in raid0 -> raid5 conversion.
NeilBrown [Tue, 25 Jun 2013 05:52:58 +0000 (15:52 +1000)] 
Grow: fix bug in raid0 -> raid5 conversion.

The moment we change a RAID0 to a RAID5 it will try to recover.  This
will abort quite quickly as there are no spare devices, but it could
confuse the attempt to freeze the array.

So allow 'freeze' to work even on a recovering array.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoMake: CXFLAGS should be conditionally assigned. mdadm-3.3-rc1
NeilBrown [Mon, 24 Jun 2013 06:59:37 +0000 (16:59 +1000)] 
Make: CXFLAGS should be conditionally assigned.

As the Makefile encourages users to set CXFLAGS for extra flags,
we should only conditionally set it.
That way it can be over-ridden in the environment as well as on
the command line.

Suggested-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de>
Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoDetail: deterministic ordering in --brief --verbose
mwilck@arcor.de [Thu, 20 Jun 2013 20:21:05 +0000 (22:21 +0200)] 
Detail: deterministic ordering in --brief --verbose

Have mdadm --detail --brief --verbose print the list of devices in
alphabetical order.

This is useful for debugging purposes. E.g. the test script
10ddf-create compares the output of two mdadm -Dbv calls which
may be different if the order is not deterministic.

(I confess: I use a modified "test" script that always runs
"mdadm --verbose" rather than "mdadm --quiet", otherwise this
wouldn't happen in 10ddf-create).

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
10 years agosuper1: fix space_{before,after} for RAID0
NeilBrown [Mon, 24 Jun 2013 06:24:08 +0000 (16:24 +1000)] 
super1: fix space_{before,after} for RAID0

For RAID0 we need to use 'data_size', not 'size', as the latter is 0.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoGrow: allow "--add" with "--grow --level=??"
NeilBrown [Mon, 24 Jun 2013 06:13:00 +0000 (16:13 +1000)] 
Grow: allow "--add" with "--grow --level=??"

This is useful for reshaping a RAID0 to a higher level.
The recovery will happen at the same time as the reshape.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoGrow: choose default layout when converting from RAID0.
NeilBrown [Mon, 24 Jun 2013 06:06:21 +0000 (16:06 +1000)] 
Grow: choose default layout when converting from RAID0.

If we don't do this explicitly, we end up keeping the "current"
layout, which is meaningless for RAID0.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agotests: add test for converting levels to raid0 and back.
NeilBrown [Mon, 24 Jun 2013 05:57:58 +0000 (15:57 +1000)] 
tests: add test for converting levels to raid0 and back.

Now that I have this mostly working, I should make sure
it doesn't break...

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agotest/00names: use appropriate mdadm.conf
NeilBrown [Mon, 24 Jun 2013 05:44:36 +0000 (15:44 +1000)] 
test/00names: use appropriate mdadm.conf

Using non-numeric names needs an mdadm.conf setting,
so make sure we have one.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoGrow: centralise level-change code.
NeilBrown [Mon, 24 Jun 2013 05:27:07 +0000 (15:27 +1000)] 
Grow: centralise level-change code.

There are now 3 places which change level.
And they all do it slightly differently with different
messages etc.

Make a single function for this and use it.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoGrow: remove excess drives when converting to RAID0.
NeilBrown [Mon, 24 Jun 2013 04:08:41 +0000 (14:08 +1000)] 
Grow: remove excess drives when converting to RAID0.

When converting to RAID0, all spares and non-data drives
need to be removed first.
It is possible that the first HOT_REMOVE_DISK will fail because the
personality hasn't let go of it yet, so retry a few times.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoGrow: clear new_layout when we change the level.
NeilBrown [Mon, 24 Jun 2013 03:08:13 +0000 (13:08 +1000)] 
Grow: clear new_layout when we change the level.

After changing the level, the meaning of layout numbers changes,
so keeping a new_layout value around can cause later confusion.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoGrow: analyse_change needs to set new_size even if nothing much is happening.
NeilBrown [Mon, 24 Jun 2013 03:06:32 +0000 (13:06 +1000)] 
Grow: analyse_change needs to set new_size even if nothing much is happening.

This means it will be set for a "--data-offset" only reshape so that
case doesn't complain that the array is getting smaller.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoGrow: fix two problems with new_data_offset
NeilBrown [Mon, 24 Jun 2013 03:04:38 +0000 (13:04 +1000)] 
Grow: fix two problems with new_data_offset

1/ ignore failed devices - obviously
2/ We need to tell the kernel which direction the reshape should
   progress even if we didn't choose the particular data_offset
   to use.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoGrow: Try hard to set new_offset.
NeilBrown [Mon, 24 Jun 2013 03:02:35 +0000 (13:02 +1000)] 
Grow: Try hard to set new_offset.

Setting new_offset can fail if the v1.x "data_size" is too small.
So if that happens, try increasing it first by writing "0".
That can fail on spare devices due to a kernel bug, so if it doesn't
work, try writing the correct number of sectors.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoGrow: Make sure new data-offset is well-aligned
NeilBrown [Mon, 24 Jun 2013 02:55:41 +0000 (12:55 +1000)] 
Grow: Make sure new data-offset is well-aligned

If we choose a new data-offset, make sure it is rounded to the largest
power of two possible, up to 1Meg.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoGrow: a data_offset should not be tested against 0.
NeilBrown [Wed, 19 Jun 2013 06:55:35 +0000 (16:55 +1000)] 
Grow: a data_offset should not be tested against 0.

It should always be tested against INVALID_SECTORS!!!

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agotests: add test for non-numeric device names
NeilBrown [Wed, 19 Jun 2013 06:44:18 +0000 (16:44 +1000)] 
tests: add test for non-numeric device names

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoAdd test for interaction of --assemble with --incr
NeilBrown [Wed, 19 Jun 2013 06:33:55 +0000 (16:33 +1000)] 
Add test for interaction of --assemble with --incr

and fix the bug that it found.  The refactor of start_array()
missed a test.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoAdd test for --update=metadata and fix bug it found.
NeilBrown [Wed, 19 Jun 2013 06:28:05 +0000 (16:28 +1000)] 
Add test for --update=metadata and fix bug it found.

We were not setting device size correctly for raid0.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agotests: rearrange some test groupings.
NeilBrown [Wed, 19 Jun 2013 03:46:53 +0000 (13:46 +1000)] 
tests: rearrange some test groupings.

All 'update' tests in 04
More imsm tests in 09

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoRemove lots of unnecessary white space.
NeilBrown [Wed, 19 Jun 2013 02:31:45 +0000 (12:31 +1000)] 
Remove lots of unnecessary white space.

Now that I am using white-space mode in Emacs I can see all of this,
and I don't like it :-)

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoManage: allow "--stop" on kernel names.
NeilBrown [Wed, 19 Jun 2013 01:39:14 +0000 (11:39 +1000)] 
Manage: allow "--stop" on kernel names.

e.g.
   mdadm --stop md4

This works even if udev has become confused or killed.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoManage: split Manage_runstop into Manage_run and Manage_stop
NeilBrown [Wed, 19 Jun 2013 01:23:44 +0000 (11:23 +1000)] 
Manage: split Manage_runstop into Manage_run and Manage_stop

The two branches have virtually nothing in common, so it is simpler if
they are separate.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoAssemble: when forcing a single-degraded RAID6 array, trigger a 'repair'.
NeilBrown [Wed, 19 Jun 2013 01:09:33 +0000 (11:09 +1000)] 
Assemble: when forcing a single-degraded RAID6 array, trigger a 'repair'.

When an active/degraded RAID6 array is force-started we clear the
'active' flag, but it is still possible that some parity is
not in sync.  This is because there are two parity blocks.
It would be nice to be able to tell the kernel "P is OK, Q maybe not".
But that is not possible.

So when we force-assemble such an array, trigger a 'repair' to fix up
any errant Q blocks.

This is not ideal, as a repair interrupted by a restart will not be
continued afterwards, but it is the best we can do without kernel help.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoDetail: add device information to --detail --export
NeilBrown [Wed, 19 Jun 2013 00:35:23 +0000 (10:35 +1000)] 
Detail: add device information to --detail --export

We may well want more per-device information here, but this
is a start.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agosysfs_read: return devices in same order as in filesystem.
NeilBrown [Wed, 19 Jun 2013 00:33:47 +0000 (10:33 +1000)] 
sysfs_read: return devices in same order as in filesystem.

When we read devices from sysfs (../md/dev-*), store them in the same
order that they appear.  That makes more sense when exposed to a
human (as the next patch will).

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoraid6check: Check return value of lseek64()
Bernd Schubert [Tue, 18 Jun 2013 09:09:41 +0000 (11:09 +0200)] 
raid6check: Check return value of lseek64()

If lseek64() failed it was still writing to the disks, which would introduce
data corruption.
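
The pattern being fixed can be sketched like this. It is a hedged
illustration, not the raid6check code: write_at() is a hypothetical
helper, and it uses plain lseek() with off_t rather than lseek64().

```c
#include <assert.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write 'len' bytes at 'offset', refusing to write
 * at all if the seek did not land where requested.
 * Returns 0 on success, -1 on any failure. */
int write_at(int fd, off_t offset, const void *buf, size_t len)
{
	assert(buf != NULL);
	if (lseek(fd, offset, SEEK_SET) != offset)
		return -1;	/* never write at an unknown position */
	if (write(fd, buf, len) != (ssize_t)len)
		return -1;	/* error or short write */
	return 0;
}
```

The important property is that a failed seek stops the operation before
any data reaches the disk.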

Signed-off-by: Bernd Schubert <bernd.schubert@fastmail.fm>
Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoraid6check: Fix compiler warnings.
Bernd Schubert [Tue, 18 Jun 2013 09:09:36 +0000 (11:09 +0200)] 
raid6check: Fix compiler warnings.

Fix some compiler warnings appearing with optimization levels.

Signed-off-by: Bernd Schubert <bernd.schubert@fastmail.fm>
Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoraid6check: Use enums for repair type
Bernd Schubert [Tue, 18 Jun 2013 09:09:31 +0000 (11:09 +0200)] 
raid6check: Use enums for repair type

Using hard coded numbers is error prone and hard to read by humans.

Signed-off-by: Bernd Schubert <bernd.schubert@fastmail.fm>
Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoraid6check: Fix memory leaks detected by valgrind
Bernd Schubert [Tue, 18 Jun 2013 09:09:26 +0000 (11:09 +0200)] 
raid6check: Fix memory leaks detected by valgrind

==2389947== 24 bytes in 1 blocks are definitely lost in loss record 1 of 10
==2389947==    at 0x4C2B3F8: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==2389947==    by 0x408067: xmalloc (xmalloc.c:36)
==2389947==    by 0x401B19: check_stripes (raid6check.c:151)
==2389947==    by 0x4030C6: main (raid6check.c:521)
==2389947==
==2389947== 24 bytes in 1 blocks are definitely lost in loss record 2 of 10
==2389947==    at 0x4C2B3F8: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==2389947==    by 0x408067: xmalloc (xmalloc.c:36)
==2389947==    by 0x401B67: check_stripes (raid6check.c:155)
==2389947==    by 0x4030C6: main (raid6check.c:521)
==2389947==

Signed-off-by: Bernd Schubert <bernd.schubert@fastmail.fm>
Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoraid6check: Fix build of raid6check
Bernd Schubert [Tue, 18 Jun 2013 09:09:16 +0000 (11:09 +0200)] 
raid6check: Fix build of raid6check

After recent git pull 'make raid6check' did not work anymore, as
sysfs_read() was called with a wrong argument and as check_env()
was used by use_udev(), but not defined.

Replace sysfs_read(..., -1, ...) by sysfs_read(..., NULL, ...)

Move check_env() from util.c to lib.c

Signed-off-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de>
Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoMakefile: add "-O3" to WARN_UNUSED options.
NeilBrown [Wed, 19 Jun 2013 00:02:17 +0000 (10:02 +1000)] 
Makefile: add "-O3" to WARN_UNUSED options.

This finds more errors

Also remove some trailing spaces.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoGrow: fix up recent changes to set_new_data_offset.
NeilBrown [Tue, 18 Jun 2013 23:58:02 +0000 (09:58 +1000)] 
Grow: fix up recent changes to set_new_data_offset.

The second 'info2' wasn't being initialised.  So don't use it.

Reported by -O3

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agosuper0: set uninitialized variable.
NeilBrown [Tue, 18 Jun 2013 23:51:01 +0000 (09:51 +1000)] 
super0: set uninitialized variable.

Reported by -O3

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoAssemble/Incr: Don't include spares with too-high event count.
NeilBrown [Mon, 17 Jun 2013 06:55:31 +0000 (16:55 +1000)] 
Assemble/Incr: Don't include spares with too-high event count.

Some failure scenarios can leave a spare with a higher event count
than an in-sync device.  Assembling an array like this will confuse
the kernel.
So detect spares with event counts higher than the best non-spare
event count and exclude them from the array.
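
A small sketch of the exclusion rule, under stated assumptions: the
struct and function below are hypothetical and much simpler than mdadm's
real data structures. It finds the best event count among non-spare
devices, then marks every spare whose count is higher.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-device record, for illustration only. */
struct member {
	uint64_t events;	/* event count from the device's metadata */
	bool is_spare;
	bool excluded;		/* set when the device must be left out */
};

/* Mark spares whose event count exceeds the best non-spare event count.
 * Returns the number of devices excluded. */
size_t exclude_future_spares(struct member *devs, size_t n)
{
	uint64_t best = 0;
	bool have_best = false;
	size_t excluded = 0;

	assert(devs != NULL || n == 0);

	/* Pass 1: best event count among non-spare devices. */
	for (size_t i = 0; i < n; i++) {
		if (!devs[i].is_spare && (!have_best || devs[i].events > best)) {
			best = devs[i].events;
			have_best = true;
		}
	}
	if (!have_best)
		return 0;	/* nothing to compare against */

	/* Pass 2: drop spares that claim to be newer than that. */
	for (size_t i = 0; i < n; i++) {
		if (devs[i].is_spare && devs[i].events > best) {
			devs[i].excluded = true;
			excluded++;
		}
	}
	return excluded;
}
```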

Reported-by: Alexander Lyakas <alex.bolshoy@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
10 years agomdadm.h: add little bits of doco for 'struct superswitch'.
NeilBrown [Mon, 17 Jun 2013 06:04:59 +0000 (16:04 +1000)] 
mdadm.h: add little bits of doco for 'struct superswitch'.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoMake sure NOFILE resource limit is big enough.
NeilBrown [Thu, 30 May 2013 04:31:09 +0000 (14:31 +1000)] 
Make sure NOFILE resource limit is big enough.

Some people want to create truly enormous arrays.
As we sometimes need to hold one file descriptor for each
device, this can hit the NOFILE limit.

So raise the limit if it ever looks like it might be a problem.
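
Raising the limit could look roughly like the sketch below, which uses
the POSIX getrlimit()/setrlimit() calls. The helper name and the way
'needed' is chosen are assumptions; the commit message does not say how
mdadm computes the number of descriptors it wants.

```c
#include <assert.h>
#include <sys/resource.h>

/* Hypothetical helper: make sure at least 'needed' file descriptors may
 * be open at once. Returns 0 on success, -1 on failure. */
int ensure_nofile(rlim_t needed)
{
	struct rlimit lim;

	assert(needed > 0);
	if (getrlimit(RLIMIT_NOFILE, &lim) != 0)
		return -1;
	if (lim.rlim_cur >= needed)
		return 0;		/* soft limit is already big enough */
	if (lim.rlim_max < needed)
		lim.rlim_max = needed;	/* raising the hard limit needs privilege */
	lim.rlim_cur = needed;
	return setrlimit(RLIMIT_NOFILE, &lim);
}
```

A caller would pass something like the device count plus a margin for
its other open files.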

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoIncremental: allow --quiet to silence errors from "-If"
NeilBrown [Tue, 28 May 2013 23:13:25 +0000 (09:13 +1000)] 
Incremental: allow --quiet to silence errors from "-If"

-q is currently ineffective on "mdadm -If".   Messages that are not
usage errors should be suppressed.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoGrow_continue: handle RESHAPE_NO_BACKUP correctly.
NeilBrown [Mon, 27 May 2013 05:37:30 +0000 (15:37 +1000)] 
Grow_continue: handle RESHAPE_NO_BACKUP correctly.

If the reshape does not require a backup, Grow_continue can
abort early.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agosuper1: set RESHAPE_NO_BACKUP based on new_offset.
NeilBrown [Mon, 27 May 2013 05:18:07 +0000 (15:18 +1000)] 
super1: set RESHAPE_NO_BACKUP based on new_offset.

We need to check for a backup iff the data_offset has not changed.
Testing against level==10 was an effective but short-sighted approach.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoGrow: allow for different sized devices when updating data_offset.
NeilBrown [Mon, 27 May 2013 05:09:38 +0000 (15:09 +1000)] 
Grow: allow for different sized devices when updating data_offset.

It is possible that the devices in an array have different sizes, and
different data_offsets.  So the 'before_space' and 'after_space' may
be different from drive to drive.
Any decision about how much to change the data_offset must work on
all devices, so it must be based on the minimum space available on
any device.

So find this minimum first, then do the calculation.

Signed-off-by: NeilBrown <neilb@suse.de>
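
The "find the minimum first" step might look like the sketch below. `struct dev_space` and `min_free_space()` are hypothetical; only the `before_space` and `after_space` names come from the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-device view of the unused space around the data. */
struct dev_space {
    unsigned long long before_space;  /* free sectors before the data */
    unsigned long long after_space;   /* free sectors after the data */
};

/* Compute the smallest before/after space over all devices, so that
 * any data_offset change derived from it fits on every device. */
static void min_free_space(const struct dev_space *devs, size_t n,
                           unsigned long long *min_before,
                           unsigned long long *min_after)
{
    size_t i;

    assert(devs != NULL && n > 0);

    *min_before = devs[0].before_space;
    *min_after = devs[0].after_space;
    for (i = 1; i < n; i++) {
        if (devs[i].before_space < *min_before)
            *min_before = devs[i].before_space;
        if (devs[i].after_space < *min_after)
            *min_after = devs[i].after_space;
    }
}
```

Note that the two minima may come from different devices; each direction is limited by its own tightest device.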
10 years agoAssemble: allow --update=revert-reshape
NeilBrown [Thu, 23 May 2013 05:48:48 +0000 (15:48 +1000)] 
Assemble: allow --update=revert-reshape

This will cause a reshape to start going backwards.

10 years agoAssemble: --update=metadata converts v0.90 to v1.0
NeilBrown [Thu, 23 May 2013 04:41:29 +0000 (14:41 +1000)] 
Assemble: --update=metadata converts v0.90 to v1.0

This allows the smooth conversion of legacy 0.90 arrays
to 1.0 metadata.
Old metadata is likely to remain but will be ignored.
It can be removed with
  mdadm --zero-superblock --metadata=0.90 /dev/whatever

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agosuper1: fix some casts of signed superblock fields.
NeilBrown [Tue, 28 May 2013 06:43:03 +0000 (16:43 +1000)] 
super1: fix some casts of signed superblock fields.

These need to be cast to int32_t before being cast to 'long', else
sign extension doesn't happen on 64bit hosts.

And bitmap_offset is le32, not le64 !!

Signed-off-by: NeilBrown <neilb@suse.de>
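
The sign-extension issue can be illustrated in isolation. In this sketch `int64_t` stands in for a 64-bit `long`, and the function names are made up for the example. The value is assumed to be already converted to host byte order.

```c
#include <assert.h>
#include <stdint.h>

/* Widen a raw 32-bit field that actually holds a signed value.
 * Going through int32_t first makes the widening sign-extend.
 * (Converting an out-of-range value to int32_t is implementation-
 * defined before C23; common compilers wrap in two's complement.) */
static int64_t widen_as_signed(uint32_t raw)
{
    return (int64_t)(int32_t)raw;
}

/* Widening the unsigned value directly zero-extends instead, so a
 * negative offset turns into a large positive number. */
static int64_t widen_as_unsigned(uint32_t raw)
{
    return (int64_t)raw;
}
```

For a stored offset of -8 (raw bits 0xFFFFFFF8) the first function yields -8, while the second yields 4294967288.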
10 years agoExamine/super1 - report Unused space, before and after.
NeilBrown [Wed, 22 May 2013 06:37:19 +0000 (16:37 +1000)] 
Examine/super1 - report Unused space, before and after.

Might be confusing, or might be useful when reshaping.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agosuper1: don't put the bblog at the end of the free space.
NeilBrown [Wed, 22 May 2013 06:00:21 +0000 (16:00 +1000)] 
super1: don't put the bblog at the end of the free space.

It seems like a nice location, but it means that we cannot
decrease the data_offset during a reshape.

So put it just after the bitmap, leaving 32K.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoGrow: allow a reshape which only changes --data-offset
NeilBrown [Tue, 21 May 2013 06:50:55 +0000 (16:50 +1000)] 
Grow: allow a reshape which only changes --data-offset

Sometimes, that is all we want to do.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoGrow: E2BIG should be reported differently if --data-offset was requested.
NeilBrown [Tue, 21 May 2013 06:50:05 +0000 (16:50 +1000)] 
Grow: E2BIG should be reported differently if --data-offset was requested.

In that case the problem is almost certainly that --data-offset is too big.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoGrow: --backup-file and --data-offset are incompatible.
NeilBrown [Tue, 21 May 2013 06:40:23 +0000 (16:40 +1000)] 
Grow: --backup-file and --data-offset are incompatible.

So report if both are given, and if --backup-file is given,
don't try to update data-offset.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoGrow: handle E2BIG from new_offset changes more gracefully.
NeilBrown [Tue, 21 May 2013 06:35:29 +0000 (16:35 +1000)] 
Grow: handle E2BIG from new_offset changes more gracefully.

If new_offset change is too big, just do the reshape the old way.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoGrow: allow --data-offset to be specified for raid4/5/6
NeilBrown [Tue, 21 May 2013 06:33:56 +0000 (16:33 +1000)] 
Grow: allow --data-offset to be specified for raid4/5/6

Previously it was rejected for non-RAID10.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoGrow: allow metadata to indicate that changing data_offset not supported.
NeilBrown [Tue, 21 May 2013 06:32:00 +0000 (16:32 +1000)] 
Grow: allow metadata to indicate that changing data_offset not supported.

If space_after and space_before are zero (the default) then assume that
metadata doesn't support changing data_offset.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoGrow: use new_data_offset instead of backups for raid4/5/6 reshape.
NeilBrown [Tue, 21 May 2013 06:28:23 +0000 (16:28 +1000)] 
Grow: use new_data_offset instead of backups for raid4/5/6 reshape.

If we can modify the data_offset, we can avoid doing any backups at all.
If we can't, fall back on the old approach, but not if --data-offset
was requested.

Signed-off-by: NeilBrown <neilb@suse.de>
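
The decision described in this commit message can be summarised as a small function. The enum, its values, and `choose_reshape_plan()` are hypothetical names for illustration only; mdadm's Grow.c is structured differently.

```c
#include <assert.h>

/* Hypothetical outcome of the planning step. */
enum reshape_plan {
    PLAN_MOVE_DATA_OFFSET,  /* change data_offset, no backup needed */
    PLAN_USE_BACKUP,        /* old approach: back up critical sections */
    PLAN_FAIL               /* --data-offset requested but not possible */
};

/* offset_change_possible: non-zero if data_offset can be modified.
 * data_offset_requested:  non-zero if the user gave --data-offset. */
static enum reshape_plan choose_reshape_plan(int offset_change_possible,
                                             int data_offset_requested)
{
    if (offset_change_possible)
        return PLAN_MOVE_DATA_OFFSET;
    /* An explicit request must not silently fall back to backups. */
    if (data_offset_requested)
        return PLAN_FAIL;
    return PLAN_USE_BACKUP;
}
```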
10 years agoGrow: introduce min_offset_change to struct reshape.
NeilBrown [Wed, 22 May 2013 02:17:32 +0000 (12:17 +1000)] 
Grow: introduce min_offset_change to struct reshape.

raid10 currently uses the 'backup_blocks' field to store something
else: a minimum offset change.
This is bad practice, and we will shortly need both for RAID5/6,
so make a separate field.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoGrow: have analyse_change zero the reshape structure first.
NeilBrown [Wed, 22 May 2013 01:51:43 +0000 (11:51 +1000)] 
Grow: have analyse_change zero the reshape structure first.

This is generally safer and means we can remove lots of zero
assignments.

Signed-off-by: NeilBrown <neilb@suse.de>
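
A sketch of the zero-first pattern. `struct reshape_sketch` is a cut-down, hypothetical stand-in for the real `struct reshape`; only the `backup_blocks` and `min_offset_change` field names are taken from this log.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, reduced version of the reshape description. */
struct reshape_sketch {
    int level;
    unsigned long long backup_blocks;
    unsigned long long min_offset_change;
};

/* Clear the whole structure up front, then fill in only the fields
 * that differ from zero.  Fields added later start out as zero
 * without every code path having to assign them. */
static void analyse_change_sketch(struct reshape_sketch *re, int level)
{
    assert(re != NULL);
    memset(re, 0, sizeof(*re));
    re->level = level;
}
```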
10 years agoGrow.c: split impose_reshape out as a function.
NeilBrown [Tue, 21 May 2013 06:11:08 +0000 (16:11 +1000)] 
Grow.c: split impose_reshape out as a function.

It will be useful soon.

Signed-off-by: NeilBrown <neilb@suse.de>
10 years agoGrow.c: split out update_cache_size() function.
NeilBrown [Tue, 21 May 2013 05:59:11 +0000 (15:59 +1000)] 
Grow.c: split out update_cache_size() function.

Make this a separate function as I might want to call it from another
location.

Signed-off-by: NeilBrown <neilb@suse.de>