git.ipfire.org Git - thirdparty/mdadm.git/log
Adam Kwolek [Wed, 8 Jun 2011 07:11:49 +0000 (17:11 +1000)] 
imsm: Disable checkpoint updating by mdmon for general migration

imsm contains two check-pointing mechanisms. One (per array) is used for
initialization and rebuild; the second (per container) is used for general
migration (reshape). The first is controlled by mdmon, the second by mdadm.
To avoid conflicts, disable mdmon checkpoint updating for general
migration.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Adam Kwolek [Wed, 8 Jun 2011 07:11:23 +0000 (17:11 +1000)] 
imsm: Implement recover_backup_imsm() for imsm metadata

Add the ability to restore data backed up in the General Migration Copy Area
after an unexpected reshape interruption.
The function restores data during array assembly, and the reshape
then continues from the next checkpoint.

Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Adam Kwolek [Wed, 8 Jun 2011 07:11:11 +0000 (17:11 +1000)] 
Add reshape restart support for external metadata

This patch introduces support for restarting a reshape for external metadata
using metadata-specific data handling methods.
It introduces the recover_backup() function, which restores the array to a stable state.
It is equivalent to the Grow_restart() functionality for native metadata.

Signed-off-by: Maciej Trela <maciej.trela@intel.com>
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Adam Kwolek [Wed, 8 Jun 2011 07:09:50 +0000 (17:09 +1000)] 
imsm: update blocks_per_migr_unit() to support migration record

blocks_per_migr_unit() has to use information from the migration record
in the general migration case. This requires passing an intel_super pointer
to the function, along with some other interface changes.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Adam Kwolek [Wed, 8 Jun 2011 07:09:29 +0000 (17:09 +1000)] 
imsm: Add information about migration record to mdadm '-E' option

Add the ability to display information from the migration record in the
examine ('-E') output.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Adam Kwolek [Wed, 8 Jun 2011 07:09:16 +0000 (17:09 +1000)] 
imsm: Clear migration record when no migration in progress

When metadata is saved and no general migration is in progress
in the container, clear the migration record in the container.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Adam Kwolek [Wed, 8 Jun 2011 07:09:09 +0000 (17:09 +1000)] 
imsm: Check if array degradation has been changed

Before reshaping each "migration unit", check whether the array is still usable.
If the number of failed disks is greater than the allowed degradation level,
the reshape has to be aborted.
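
As a rough illustration of the check described above, the following C sketch compares the failed-disk count with an allowed degradation level. The helper names and the per-level limits are assumptions made for this example (the usual redundancy of each RAID level); they are not taken from the mdadm sources.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: how many member disks may fail before the array
 * becomes unusable. The limits are generic RAID redundancy figures. */
static int allowed_degradation(int level, int raid_disks)
{
	switch (level) {
	case 1:  return raid_disks - 1;  /* mirror: one surviving copy is enough */
	case 5:  return 1;               /* single parity */
	case 6:  return 2;               /* double parity */
	default: return 0;               /* raid0 and unknown levels: no redundancy */
	}
}

/* Returns true when the reshape must be aborted before the next unit. */
static bool must_abort_reshape(int level, int raid_disks, int failed_disks)
{
	assert(raid_disks > 0 && failed_disks >= 0);
	return failed_disks > allowed_degradation(level, raid_disks);
}
```

A caller would evaluate must_abort_reshape() once per migration unit, so that a disk failure in the middle of a long reshape is noticed before the next unit is touched.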

Signed-off-by: Maciej Trela <maciej.trela@intel.com>
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Adam Kwolek [Wed, 8 Jun 2011 07:09:08 +0000 (17:09 +1000)] 
imsm: Implement imsm_manage_reshape(), reshape workhorse

Before the reshape is started, mdadm should check again that only one
array in the container is under reshape. The function then "divides" the array into
"migration units" that fit in the migration copy area and enters the main loop.
It checks whether the current "migration unit" needs to be backed up.
If necessary, mdadm saves it to the copy area and updates the migration record.
Then the MD driver is directed to perform a reshape step (of "migration unit" size)
and the checkpoint is moved forward. The reshape proceeds in this way until
the end of the array is reached.
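
The main loop described above can be sketched roughly as follows. This is a simplified model, not the code of imsm_manage_reshape(): the function names are hypothetical, sizes are plain block counts, and the backup, reshape, and checkpoint steps are shown only as comments.

```c
#include <assert.h>
#include <stdint.h>

/* Length of the migration unit that starts at block `pos`: at most one
 * copy area, and never past the end of the array. */
static uint64_t unit_len(uint64_t pos, uint64_t array_blocks,
			 uint64_t copy_area_blocks)
{
	uint64_t left = array_blocks - pos;

	return left < copy_area_blocks ? left : copy_area_blocks;
}

/* Walk the whole array one migration unit at a time and return how many
 * units (and therefore checkpoints) were needed. */
static uint64_t reshape_units(uint64_t array_blocks, uint64_t copy_area_blocks)
{
	uint64_t pos = 0, units = 0;

	assert(copy_area_blocks > 0);
	while (pos < array_blocks) {
		uint64_t len = unit_len(pos, array_blocks, copy_area_blocks);

		/* 1. back up [pos, pos + len) to the copy area if required */
		/* 2. ask md to reshape this range and wait for completion  */
		/* 3. move the checkpoint forward to pos + len              */
		pos += len;
		units++;
	}
	return units;
}
```

Bounding each unit by the copy area size is what makes a backup possible at every step: a unit that did not fit could not be saved before being overwritten.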

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Adam Kwolek [Wed, 8 Jun 2011 07:07:10 +0000 (17:07 +1000)] 
imsm: Add wait_for_reshape_imsm() implementation

After each checkpoint, mdadm should set the new reshape area and wait
until md finishes the reshape. The function wait_for_reshape_imsm() sets
the new reshape range and waits for job completion.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Wed, 8 Jun 2011 06:56:41 +0000 (16:56 +1000)] 
Grow: Add paranoid level checking to analyse_change.

Just in case array.level is ever something that we don't expect, make
sure we report an error clearly rather than get confused.

Signed-off-by: NeilBrown <neilb@suse.de>
Adam Kwolek [Wed, 8 Jun 2011 06:46:37 +0000 (16:46 +1000)] 
imsm: check migration compatibility

Under Windows, IMSM can reshape arrays in two directions
(ascending and descending).
Under Linux, only one direction (ascending) is supported at the moment.
Block loading of metadata when a descending reshape is detected.

Windows also uses an optimization area while reshaping an array.
Linux does not support it.
This patch blocks that operation as well.

Signed-off-by: Maciej Trela <maciej.trela@intel.com>
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Adam Kwolek [Wed, 8 Jun 2011 06:46:35 +0000 (16:46 +1000)] 
imsm: Add support for copy area and backup operations

This patch adds methods for manipulating the migration record:
init_migr_record_imsm() - initializes the migration record at the beginning of
     the reshape process.
write_imsm_migr_rec() - saves the migration record to the array.
     The migration record is stored on the first two disks in the array only.
save_backup_imsm() - saves critical data stripes to the Migration Copy Area
     and updates the current migration unit status.
     Uses restore_stripes() to format a destination stripe and to write it
     to the Migration Copy Area.
save_checkpoint_imsm() - updates the current unit status in the
     migration record.

The migration record is written to the first two array disks only (as with the
read operation).

Signed-off-by: Maciej Trela <maciej.trela@intel.com>
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Adam Kwolek [Wed, 8 Jun 2011 06:28:40 +0000 (16:28 +1000)] 
Define dummy functions to mdmon.c

The definitions are necessary to compile mdmon.
Metadata-specific source code is compiled into mdmon.
The functions used for reshape check-pointing:
- restore_stripes()
- save_stripes()
- abort_reshape()
are not used in mdmon, but they are compiled into it.
To enable mdmon compilation, dummy functions are used.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Adam Kwolek [Wed, 8 Jun 2011 06:24:48 +0000 (16:24 +1000)] 
Support restore_stripes() from the given buffer

For external metadata, the backup location and saving methods depend
on metadata-specific implementation details. Currently the restore_stripes()
function can restore data only from the given backup file handles,
and it is used only for assembling partially reshaped arrays.
As this function will be very helpful for the external metadata backup
mechanism, add support for restoring data from a given source buffer.
Add the possibility for save_stripes() to work without destination targets.
save_stripes() can now prepare data for restore_stripes() only.

Signed-off-by: Maciej Trela <maciej.trela@intel.com>
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Adam Kwolek [Wed, 8 Jun 2011 06:19:06 +0000 (16:19 +1000)] 
imsm: Add migration record to intel_super

To secure the reshape process, IMSM uses a special disk area outside the metadata
to back up the area being reshaped. If the array area currently being reshaped
requires a backup, the group of array stripes prepared for reshape is stored in the
Migration Copy Area. If the reshape is interrupted, the Option ROM (during
restart) or mdadm (during reshape restart, when no reboot occurs) will
restore the Migration Copy Area to the destination array. The reshape can then be
continued from a stable array state.

This patch adds support for IMSM migration record structure.
IMSM migration record is stored on the first two disks of IMSM volume
during the migration.

Add a function for reading the migration record, so mdadm can read the
migration record if it is present. The migration record has to be cleared every time
MIGR_GEN_MIGR is started.

Signed-off-by: Maciej Trela <maciej.trela@intel.com>
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Wed, 8 Jun 2011 05:54:13 +0000 (15:54 +1000)] 
getinfo_super now clears the 'info' structure before filling it in.

Some code currently clears 'info' before calling getinfo_super,
some code doesn't.

To be consistent, change it so no caller ever clears 'info',
but every getinfo_super function must clear it.

Note that ->raid_disk may be meaningful if that 'map' is passed
non-NULL.  In that case it is copied out before the structure
is zeroed.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Mon, 23 May 2011 07:21:37 +0000 (17:21 +1000)] 
Restore ability to create imsm array from specific devices.

A recent change to improve error messages made it impossible to
create an array from devices that are 'busy'.  However, if they are
made busy by a container, then the create should be allowed.

So move one of the error messages later.

Reported-by: "Wojcik, Krzysztof" <krzysztof.wojcik@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Mon, 23 May 2011 07:21:36 +0000 (17:21 +1000)] 
Remove unused variable 'superrno' in Query.c

This variable hasn't been used for 5 years!

Reported-by: Mathias Burén <mathias.buren@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Mon, 23 May 2011 07:21:35 +0000 (17:21 +1000)] 
Check all member devices in enough_fd

The loop over all member devices in enough_fd could easily stop
before it had found all devices.  This would cause --re-add to
fail incorrectly.

So change the loop to be based on the reported number of devices
in the array - with a safe-guard limit of 1024.
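
The loop shape described above might look roughly like this. It is a sketch over a plain in-memory slot table, not the ioctl-based loop in enough_fd(); the names count_members and SLOT_LIMIT are hypothetical, and only the limit of 1024 comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>

#define SLOT_LIMIT 1024  /* safe-guard limit mentioned in the commit message */

/* slots[i] != 0 means slot i holds a member device; the table may have gaps.
 * Keep scanning until `reported` members were seen, instead of stopping at
 * the first empty slot. */
static int count_members(const int *slots, size_t nslots, int reported)
{
	int found = 0;

	assert(reported >= 0);
	for (size_t i = 0; i < nslots && i < SLOT_LIMIT && found < reported; i++)
		if (slots[i])
			found++;
	return found;
}
```

With a table such as {1, 0, 0, 1, 1} and three reported devices, a loop that stopped at the first gap would see only one member, while this version sees all three.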

Change some other loops to be more careful too.

Reported-by: "Schmidt, Annemarie" <Annemarie.Schmidt@stratus.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Michal Marek [Tue, 17 May 2011 01:08:16 +0000 (11:08 +1000)] 
mdmon: Fix crash if /proc/mdstat lists 0.9 superblocks

Signed-off-by: Michal Marek <mmarek@suse.cz>
Piergiorgio Sartor [Sun, 15 May 2011 19:15:15 +0000 (21:15 +0200)] 
RAID-6 check standalone suspend array

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Mon, 16 May 2011 07:28:27 +0000 (17:28 +1000)] 
Grow: accept --assume-clean with --grow --size

When an array is resized to have larger members, --assume-clean will
disable any resync if the kernel supports it (2.6.40 and later).

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Wed, 11 May 2011 03:43:27 +0000 (13:43 +1000)] 
Create: add error checking for 'write_init_super'.

If this fails, we really must fail the whole 'create'.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Tue, 10 May 2011 07:58:41 +0000 (17:58 +1000)] 
Create: give better error message if member device unusable.

Rather than just saying "unusable", report if the device is busy
or is not a block device.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Tue, 10 May 2011 07:35:41 +0000 (17:35 +1000)] 
Create: allow chunksize to be non-power-of-2.

RAID0 has accepted chunksizes that are not a power of 2 since 2.6.30.
So it is time mdadm allowed that to be used.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Tue, 10 May 2011 06:30:40 +0000 (16:30 +1000)] 
Give suitable error for mdadm /dev/md0 --stop

Options like --stop must come before the device that is being
stopped.  If (in --misc mode) a  device does not have an option,
nothing will be done to it, which can be confusing.
So report an error in this case.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Tue, 10 May 2011 06:20:25 +0000 (16:20 +1000)] 
Manage: minor fix to add/re-add handling.

If using an old kernel we should still check if a re-add might be
intended, so we can refuse and require a '--zero' first if it is not
possible.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Tue, 10 May 2011 06:17:12 +0000 (16:17 +1000)] 
config: restore the possibility of a NULL homehost

As homehost defaults to the system name it is not possible to specify
a NULL homehost.

This patch restores this ability with either --homehost="" or
--homehost="<none>".

This allows the creation of v1.x arrays without a "hostname:"
prefix in the name.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Tue, 10 May 2011 03:09:37 +0000 (13:09 +1000)] 
Grow: allow auto-readonly arrays to be reshaped.

If an array is auto-readonly then a reshape will not start.
But auto-readonly is only wanted until something is explicitly
done to acknowledge that the array is really wanted.
So it is perfectly correct to switch an auto-readonly array to
'clean' if a reshape has been requested.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Tue, 10 May 2011 02:53:51 +0000 (12:53 +1000)] 
Grow: handle abort/restart of grow while being monitored.

If a device fails while the grow is being monitored but the array is
still functional, the Grow will appear to abort and then almost
instantly restart from where it was up to.

So if it appears to abort, wait up to 10 seconds for a restart (it
should be much, much less than this).

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Tue, 10 May 2011 02:09:02 +0000 (12:09 +1000)] 
Grow: restore ability to configure 'faulty' arrays via mdadm.

The big 'grow' refactor lost us the ability to configure 'faulty'
arrays through --grow.
So put that back as a special case.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Tue, 10 May 2011 01:56:38 +0000 (11:56 +1000)] 
Grow: report if a --size change has no effect.

e.g. if "--grow --size=max" doesn't actually change anything, it is
useful to report that.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Tue, 10 May 2011 01:49:57 +0000 (11:49 +1000)] 
Grow: check if any changes needed before proceeding to analyse_change.

Analyse_change can give unhelpful error messages if nothing was
changed.  This is particularly awkward when only changing --size.

So check and re-introduce a message that was lost in commit
5da9ab9874cb

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Tue, 10 May 2011 00:44:00 +0000 (10:44 +1000)] 
Grow: When setting component size make sure components are ready.

If you change the size of a member of an array (e.g. it might be a
dm device that can be resized, or on a smart storage device), md
doesn't notice and so the space cannot be used without explicitly
telling md that the device is bigger.

This change causes "mdadm --grow --size=...." to make sure each
component device is making at least that much space available if it
can.

Normally, using "--size=max" will cause all devices to make the maximum
space available; md will then use as much of that as it can.

Signed-off-by: NeilBrown <neilb@suse.de>
Przemyslaw Czarnowski [Wed, 4 May 2011 15:13:22 +0000 (17:13 +0200)] 
imsm: add new chunk size to metadata update

Put information about new chunk size change in to migration metadata
update allowing simultaneous level change and re-striping.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: Przemyslaw Czarnowski <przemyslaw.hawrylewicz.czarnowski@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Przemyslaw Czarnowski [Wed, 4 May 2011 15:12:48 +0000 (17:12 +0200)] 
imsm: process update for raid level migrations

The received update and prepared memory are processed to update imsm metadata.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: Przemyslaw Czarnowski <przemyslaw.hawrylewicz.czarnowski@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Przemyslaw Czarnowski [Wed, 4 May 2011 15:12:14 +0000 (17:12 +0200)] 
imsm: prepare memory for level migration update

When the level is changed from raid0 to raid5, memory is required to replace
the smaller device/array object.
This memory is allocated in manager context in prepare_update().

prepare_update() is called in manager context, so memory allocations are
allowed here. This allows us to look for spare devices for the metadata update.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: Przemyslaw Czarnowski <przemyslaw.hawrylewicz.czarnowski@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Przemyslaw Czarnowski [Wed, 4 May 2011 15:11:41 +0000 (17:11 +0200)] 
imsm: fix: disable migration from raid5->raid0

It is not supported yet, so starting such a transition is improper.

Signed-off-by: Przemyslaw Czarnowski <przemyslaw.hawrylewicz.czarnowski@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Przemyslaw Czarnowski [Wed, 4 May 2011 15:11:07 +0000 (17:11 +0200)] 
imsm: prepare update for level migrations reshape

Introduce the raid0->raid5 level migration metadata update structure,
prepared for future use.

Adding a spare device is required to hold the additional raid5 parity.
Mdadm just checks for spares, but the spare is not included in the update.
If there are no spares available, abort. Otherwise we would create
a degraded array, which should not be allowed.
Mdmon will decide which spare device is used for parity.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: Przemyslaw Czarnowski <przemyslaw.hawrylewicz.czarnowski@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Adam Kwolek [Mon, 2 May 2011 06:12:03 +0000 (16:12 +1000)] 
imsm: FIX: Do not write check-point '0'

When 2 arrays are configured in container and arrays are reassembled during
rebuild or initialization, checkpoint for one array can be reset. It depends
on arrays assembly order.

Scenario:
1. Create 2 arrays (e.g. raid5)
2. Add spare to container
3. Degrade arrays /rebuild starts on array #1 and continues to n%/
4. Reassemble arrays
5. Rebuild starts on array #2 /because of assembly order/ from 0%
6. On first checkpoint stored for array #2 (non 0 value), checkpoint
   for array #1 is cleared /it is delayed rebuild in md, so progress is 0/
7. Rebuild on #1 starts from n% /it was configured before checkpoint
   was cleared/.

Any subsequent reassembly during the rebuild of array #2 (after step 6) causes
the checkpoint information for array #1 to be lost.

The solution is not to store a checkpoint for progress == 0.
The checkpoint is set to 0 when rebuild/initialization starts.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Tue, 26 Apr 2011 23:58:49 +0000 (09:58 +1000)] 
Finally remove auto-home-host

This was #if-ed out for 3.0, but it really should go.
Gcc 4.6.0 complains that auto_update_home is set but not used
(which is true).

Reported-by: Tobias Powalowski <t.powa@gmx.de>
Adam Kwolek [Tue, 19 Apr 2011 07:25:43 +0000 (17:25 +1000)] 
FIX: Check correctly raid disks during reshape restart

During reshape restart, info->array.raid_disks contains the new raid_disks number.
It cannot be compared against the old number of disks; such a check will always fail.

Check the raid disks array field against the final number of disks for restart.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Adam Kwolek [Mon, 18 Apr 2011 00:31:43 +0000 (10:31 +1000)] 
FIX: Count correctly added devices

When array is in reshape state raid_disks field contains final disks number.
To know how many disks were added, disk.raid_disk index has to be compared
against old disk number computed using delta_disks.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Adam Kwolek [Mon, 18 Apr 2011 00:31:15 +0000 (10:31 +1000)] 
FIX: Set proper raid disks during migration

During migration, the raid_disks field now contains the new number of disks.
It should be set to the old number of disks first and then to the new number,
to allow md to calculate e.g. the delta_disks parameter.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Adam Kwolek [Mon, 18 Apr 2011 00:31:06 +0000 (10:31 +1000)] 
FIX: Fiddle raid_disks number when restarting reshape

When restarting a reshape, the value of 'raid_disks' is the *new*
value.  The old value is found by subtracting delta_disks.
So before calling analyse_change we must set raid_disks to be the
old value, and then reset it afterwards.
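
The adjustment described above amounts to a small save-and-restore around the analysis step. The sketch below assumes a cut-down, hypothetical struct array_info with only the two fields involved; the real structures and the call to analyse_change are not reproduced here.

```c
#include <assert.h>

/* Hypothetical subset of the array description used by this sketch. */
struct array_info {
	int raid_disks;   /* during a reshape restart: the *new* disk count */
	int delta_disks;  /* new disk count minus old disk count */
};

/* Switch raid_disks to the old value; returns the new value so the
 * caller can put it back after the analysis step. */
static int set_old_raid_disks(struct array_info *a)
{
	int new_disks = a->raid_disks;

	a->raid_disks = new_disks - a->delta_disks;
	assert(a->raid_disks > 0);
	return new_disks;
}
```

A caller would keep the returned value, run the analysis with the old disk count in place, and then assign the saved value back to raid_disks.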

All other fields are cleanly separated with the main field being
the 'old' value and a new_* field available.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Adam Kwolek [Fri, 15 Apr 2011 10:30:31 +0000 (12:30 +0200)] 
FIX: Always report new raid_disks during migration

To behave in the similar way as native metadata during migration,
new raid disks number has to be reported by metadata handler.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Adam Kwolek [Thu, 14 Apr 2011 07:50:17 +0000 (17:50 +1000)] 
FIX: Use successfully loaded metadata only

A return value greater than 0 means an error. On error we exit from the loop
with an empty super-block pointer while the sd pointer is still valid.
The existing check condition does not detect this as an error.
We certainly should not go forward in an error condition:
it leads to a crash with a core file when the metadata handler
tries to access a non-existent super-block.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Piergiorgio Sartor [Thu, 14 Apr 2011 07:28:31 +0000 (17:28 +1000)] 
RAID-6 check standalone fix component list parsing

Fix the parsing of the component list, i.e. skipping the "spare" one.

I also added a check in case the array is degraded.

Signed-off-by: NeilBrown <neilb@suse.de>
Jonathan Liu [Tue, 12 Apr 2011 08:28:01 +0000 (18:28 +1000)] 
Monitor: avoid NULL dereference with 0.90 metadata

0.90 arrays do not report the metadata type in /proc/mdstat, so
we cannot assume that mse->metadata_version is non-NULL.

So add an appropriate check.

This adds an additional check missed by commit
eb28e119b03fd5149886ed516fa4bb006ad3602e.

Signed-off-by: NeilBrown <neilb@suse.de>
Adam Kwolek [Mon, 11 Apr 2011 05:00:13 +0000 (15:00 +1000)] 
FIX: Raid0 expansion cannot be restarted

When raid0 expansion is restarted, mdadm refuses to correctly assemble
array because critical section cannot be restored from backup file.
mdadm exits with information:
mdadm: Failed to restore critical section for reshape - sorry.

For raid0 new level is 0, current array level is 4.
Function Grow_restart() doesn't allow for level change.

Grow_restart really shouldn't be checking for level changes.
As they are always instantaneous they should never appear
in the metadata so it doesn't mean anything to check for them.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Mike Frysinger [Mon, 11 Apr 2011 04:54:42 +0000 (14:54 +1000)] 
mdadm/mdmon: use CFLAGS when linking

People often put flags that control ABI options into CFLAGS (like -mcpu)
and don't duplicate them in LDFLAGS because most build systems nowadays
(like autotools) use both when linking.  So make that work with mdadm's
custom build system too.

Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: NeilBrown <neilb@suse.de>
Mike Frysinger [Mon, 11 Apr 2011 04:54:27 +0000 (14:54 +1000)] 
mdadm: respect --syslog in monitor mode

A few places don't accept syslog as a monitor mode, so fix that.

Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: NeilBrown <neilb@suse.de>
Mike Frysinger [Mon, 11 Apr 2011 04:54:18 +0000 (14:54 +1000)] 
mdadm: add missing --syslog option to monitor help

Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: NeilBrown <neilb@suse.de>
Mike Frysinger [Mon, 11 Apr 2011 04:54:16 +0000 (14:54 +1000)] 
move .man targets from "all" to "man" - and "everything"

These .man files are never installed, nor generally used, so don't force
people who generally want to build/install mdadm to build them up.

Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: NeilBrown <neilb@suse.de>
Adam Kwolek [Wed, 6 Apr 2011 02:40:31 +0000 (12:40 +1000)] 
imsm: fix: report aligned component size value

OROM can create an array that is not aligned to the chunk size.
To resolve this problem in mdadm, the metadata handler has to report
an aligned component size value for mdadm operations
while the metadata value stays unchanged.

Do not correct alignment for raid1 or in the error case.

The correction allows the check in analyse_change() (Grow.c:905) to pass.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Adam Kwolek [Wed, 6 Apr 2011 02:40:04 +0000 (12:40 +1000)] 
imsm: FIX: Check array alignment before expansion

The OROM may create an array that is not properly aligned.
Expansion cannot be run in such cases. This is detected in analyse_change(),
which is too late: the metadata is already in the migration state
when it turns out that expansion cannot be started.
This problem has to be detected before the metadata is updated,
for all arrays in the reshaped container.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Adam Kwolek [Wed, 6 Apr 2011 02:38:50 +0000 (12:38 +1000)] 
imsm: Warn user about reboot risk

The current check-pointing implementation doesn't allow interrupting the reshape of boot arrays,
because the checkpoint restore has to be done before the system starts.
There is a problem with passing the backup file name to an array that is automatically mounted at boot time,
especially when scan mode is used.

Until the IMSM check-pointing implementation is introduced, a warning about the reboot risk
should be placed in mdadm.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Tue, 5 Apr 2011 11:43:52 +0000 (21:43 +1000)] 
restripe: make sure zero buffer is always large enough.

If restripe is called to restore stripes of one size and then
save stripes with a larger chunk size, the 'zero' buffer will not
be large enough and a double-degraded RAID6 will over-run the buffer.

So record the current size of the zero buffer and use it when deciding
if we need to allocate a new buffer.
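
A minimal sketch of the fix described above: remember how large the zero buffer currently is, and reallocate whenever a larger chunk size is requested. The names zero_buf, zero_size, and get_zero are chosen for this example and are not claimed to match restripe.c.

```c
#include <assert.h>
#include <stdlib.h>

static char *zero_buf;     /* shared buffer of zero bytes */
static size_t zero_size;   /* size of the current allocation */

/* Return a zero-filled buffer of at least chunk_size bytes, or NULL if
 * the allocation fails. The buffer only ever grows. */
static const char *get_zero(size_t chunk_size)
{
	assert(chunk_size > 0);
	if (zero_buf == NULL || chunk_size > zero_size) {
		free(zero_buf);
		zero_buf = calloc(1, chunk_size);
		zero_size = zero_buf ? chunk_size : 0;
	}
	return zero_buf;
}
```

Checking only whether the buffer exists, without comparing sizes, is the pattern that leads to the over-run: a buffer allocated for a small chunk is silently reused for a larger one.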

Reported-by: Brad Campbell <lists2009@fnarfbargle.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Czarnowska, Anna [Mon, 4 Apr 2011 23:29:45 +0000 (09:29 +1000)] 
Create: fix size after setting default chunk

When -e option is given then the first validate_geometry
sets default chunk. Size must be rounded there and do_default_chunk
needs to be set to 0 so that we don't repeat the message below.

If we start without st then what we find on the first disk determines
the st and sets chunk. So after running
validate_geometry on the first disk we need to fix the size too.
At this point chunk should always be set but it is safer to keep the check.

Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Czarnowska, Anna [Wed, 30 Mar 2011 10:28:11 +0000 (11:28 +0100)] 
Create: check for UnSet when looking at chunk

A default chunk size of 0 gets modified to UnSet, so any location that
checks for !chunk really needs to check for (!chunk || chunk == UnSet).
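
The intent of the check above, that a chunk counts as missing when it is either 0 or the UnSet sentinel, can be written as a small predicate. CHUNK_UNSET is a stand-in value assumed for this example; the real UnSet constant is defined in mdadm's own headers.

```c
#include <assert.h>
#include <stdbool.h>

#define CHUNK_UNSET 0xfffe  /* stand-in sentinel, assumed for this sketch */

/* True when no usable chunk size has been given yet. */
static bool chunk_missing(int chunk)
{
	return chunk == 0 || chunk == CHUNK_UNSET;
}
```

Note that the negated form !(chunk || chunk == CHUNK_UNSET) is not equivalent: it is true only for chunk == 0 and so would miss the sentinel.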

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Adam Kwolek [Mon, 28 Mar 2011 11:56:49 +0000 (13:56 +0200)] 
FIX: After discarding array give chance monitor to remove it

When raid0 expansion occurs, the takeover operation is used.
After a backward takeover, the monitor remains in memory.

This happens because the just-removed active array remains in mdmon structures.
If there are no other monitored arrays, mdmon has to finish its work.

The problem was introduced in the patch (2011.03.22):
    mdmon: Stop keeping track of RAID0 (and LINEAR) arrays.
Prior to that patch, mdmon was kicked via replace_array(), where
wakeup_monitor() was called.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
13 years agoMonitor: avoid NULL dereference with 0.90 metadata
NeilBrown [Mon, 4 Apr 2011 23:16:57 +0000 (09:16 +1000)] 
Monitor: avoid NULL dereference with 0.90 metadata

0.90 arrays do not report the metadata type in /proc/mdstat, so
we cannot assume that mse->metadata_version is non-NULL.

So add an appropriate check.
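
The kind of guard described here might look like the sketch below. The struct is a hypothetical stand-in that carries only the field named in the message, and the "external:" prefix test is an assumed example of a use that would otherwise dereference NULL.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an mdstat entry; only the relevant field. */
struct mdstat_ent_sketch {
    const char *metadata_version;   /* NULL for 0.90 arrays */
};

/* Check for NULL before looking at the string contents. */
static bool is_external_metadata(const struct mdstat_ent_sketch *mse)
{
    return mse->metadata_version != NULL &&
           strncmp(mse->metadata_version, "external:", 9) == 0;
}
```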

Reported-by: Eugene <hdejin@yahoo.com>
Signed-off-by: NeilBrown <neilb@suse.de>
13 years agoRAID-6 check standalone code cleanup
Piergiorgio Sartor [Mon, 4 Apr 2011 23:16:55 +0000 (09:16 +1000)] 
RAID-6 check standalone code cleanup

Major change is code cleanup and simplification.
It also brings better error handling and a couple
of bug fixes.
Last but not least, the command line parameters are
changed from "bytes" to "stripes", which is more
convenient, I guess.

Signed-off-by: NeilBrown <neilb@suse.de>
13 years agoRAID-6 check standalone md device
Piergiorgio Sartor [Mon, 4 Apr 2011 22:56:41 +0000 (08:56 +1000)] 
RAID-6 check standalone md device

Allow RAID-6 check to be passed only the
MD device, start and length.
The three parameters are mandatory.

All necessary information is collected using
the "sysfs_read()" call.
Furthermore, if "length" is "0", then the check
is performed until the end of the array.

Some checks are done, for example if the md device
is really a RAID-6. Nevertheless I guess it is not
bullet proof...

Next patch will include the "suspend" action.
My idea is to do it "per stripe", please let me
know if you've some better options.

Signed-off-by: NeilBrown <neilb@suse.de>
13 years agoSplit some of util.c into a new lib.c
NeilBrown [Mon, 4 Apr 2011 22:44:54 +0000 (08:44 +1000)] 
Split some of util.c into a new lib.c

Some of util.c is dependent on lots of other code, some of it
is stand-alone.
Move some of the stand-alone stuff into a new lib.c so it can be used
by smaller utilities.

Signed-off-by: NeilBrown <neilb@suse.de>
13 years agosplit name/number maps into separate file.
NeilBrown [Mon, 4 Apr 2011 22:40:49 +0000 (08:40 +1000)] 
split name/number maps into separate file.

This reduced some interdependencies between files.

Signed-off-by: NeilBrown <neilb@suse.de>
13 years agoMove WaitClean from sysfs to Monitor.c
NeilBrown [Mon, 4 Apr 2011 22:21:03 +0000 (08:21 +1000)] 
Move WaitClean from sysfs to Monitor.c

It might not really belong in Monitor, but it really doesn't
belong in sysfs.c, and fits well with Wait()

Signed-off-by: NeilBrown <neilb@suse.de>
13 years agoRelease 3.2.1 mdadm-3.2.1
NeilBrown [Mon, 28 Mar 2011 02:30:29 +0000 (13:30 +1100)] 
Release 3.2.1

Signed-off-by: NeilBrown <neilb@suse.de>
13 years agotest: Don't use dev6 and dev7 together in a non-multipath test
NeilBrown [Mon, 28 Mar 2011 02:24:04 +0000 (13:24 +1100)] 
test: Don't use dev6 and dev7 together in a non-multipath test

dev6 and dev7 refer to the same storage and are used for
multipath testing.  So using them both in any other test will
be confusing.  So change 11spare-migration test 5 to use
dev10 rather than dev7

Signed-off-by: NeilBrown <neilb@suse.de>
13 years agoimsm: reading of UEFI variables needs an update
Hawrylewicz Czarnowski, Przemyslaw [Sun, 27 Mar 2011 23:42:07 +0000 (10:42 +1100)] 
imsm: reading of UEFI variables needs an update

The content of an EFI variable is stored in its "data" file. Moreover, the
size of the data provided by a given variable can be initially validated by
reading the value of its "size" file.
Function read_efi_variable() has been introduced to simplify the code.

Signed-off-by: Przemyslaw Czarnowski <przemyslaw.hawrylewicz.czarnowski@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
13 years agoimsm: remove OEM table from detection of OROM and EFI.
Hawrylewicz Czarnowski, Przemyslaw [Sun, 27 Mar 2011 23:41:35 +0000 (10:41 +1100)] 
imsm: remove OEM table from detection of OROM and EFI.

OEM table does not suit our needs so it cannot be used.
This patch removes feature added in commit 8a0bf4f378c8b.

Signed-off-by: Przemyslaw Czarnowski <przemyslaw.hawrylewicz.czarnowski@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
13 years agotests: Make sure config file is empty when required.
NeilBrown [Sun, 27 Mar 2011 23:41:09 +0000 (10:41 +1100)] 
tests: Make sure config file is empty when required.

We need to have no config at all for this test so
make sure it is empty.

Reported-by: Anna Czarnowska <anna.czarnowska@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
13 years agotests: use $config to store test config path
Czarnowska, Anna [Thu, 24 Mar 2011 21:43:44 +0000 (21:43 +0000)] 
tests: use $config to store test config path

We also need to tell Monitor where to look for Policy in 11spare-migration tests

Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
13 years agoopen_dev_excl: allow device to be read-only. devel-3.2
NeilBrown [Thu, 24 Mar 2011 03:21:58 +0000 (14:21 +1100)] 
open_dev_excl: allow device to be read-only.

For many operations we don't need a writable device.  So if
opening O_RDWR fails in open_dev_excl, then try again O_RDONLY.

If we really needed write, a subsequent operation will fail.  But
if we didn't, we succeed when otherwise we wouldn't have.
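
The fallback can be sketched as below. This is an illustration only: open_rw_or_ro() is a hypothetical path-based helper, not the real open_dev_excl(), and the caller is assumed to pass O_EXCL in extra_flags when exclusive access is wanted.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: try read-write first and fall back to read-only.
 * extra_flags is OR-ed into both attempts (mdadm would pass O_EXCL).
 * Returns a file descriptor, or -1 if both attempts fail.
 */
static int open_rw_or_ro(const char *path, int extra_flags)
{
    int fd = open(path, O_RDWR | extra_flags);

    if (fd < 0)
        fd = open(path, O_RDONLY | extra_flags);
    return fd;
}
```

With this shape a caller that only reads never notices the fallback, while a caller that writes sees the failure at its first write.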

Signed-off-by: NeilBrown <neilb@suse.de>
13 years agotests: use /tmp/mdadm.conf rather than /etc/mdadm.conf.
NeilBrown [Thu, 24 Mar 2011 01:45:23 +0000 (12:45 +1100)] 
tests: use /tmp/mdadm.conf rather than /etc/mdadm.conf.

Modifying /etc/mdadm.conf for testing is just wrong.

Signed-off-by: NeilBrown <neilb@suse.de>
13 years agoMerge branch 'master' into devel-3.2
NeilBrown [Thu, 24 Mar 2011 01:00:55 +0000 (12:00 +1100)] 
Merge branch 'master' into devel-3.2

Conflicts:
Incremental.c
Manage.c
ReadMe.c
inventory
mdadm.8.in
mdadm.spec
mdassemble.8
mdmon.8

13 years agoFIX: imsm: Do not change serial if disk failed
Krzysztof Wojcik [Wed, 23 Mar 2011 23:15:01 +0000 (10:15 +1100)] 
FIX: imsm: Do not change serial if disk failed

This patch rolls back one change connected with mdadm-OROM
compatibility:
adding ':0' at the end of the disk serial number if the disk is
detected as failed.
The current mdadm implementation does not distinguish between two
cases in which a disk is marked as failed:
1. The disk has really failed (disconnected, broken)
2. The disk was just marked as failed by mdadm using the "-f" option

The second case is not yet fully handled and compatible with
the IMSM standard.
Changing the serial number of an existing, operational disk causes
problems in the "thunderdome" and "load_super" functions, which use
serial numbers to compare and search for disks.
The change must be reverted until full support is
developed.

Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
13 years agoFIX: Tests: raid0->raid10 without degradation
Krzysztof Wojcik [Wed, 23 Mar 2011 23:11:58 +0000 (10:11 +1100)] 
FIX: Tests: raid0->raid10 without degradation

raid0->raid10 transition needs at least 2 spare devices.
After level changing to raid10 recovery is triggered on
failed (missing) disks. At the end of recovery process
we have fully operational (not degraded) raid10 array.

Initially it was possible to migrate raid0->raid10
without triggering recovery (resulting in a degraded raid10).
Now it is not possible.
This patch adapts the tests to mdadm's new behavior.

Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
13 years agoFIX: imsm: Rebuild does not start on second failed disk
Krzysztof Wojcik [Wed, 23 Mar 2011 15:04:20 +0000 (16:04 +0100)] 
FIX: imsm: Rebuild does not start on second failed disk

Problem:
If we have an array with two failed disks and the array is in degraded
state (now it is possible only for raid10 with 2 degraded mirrors) and
we have two spare devices in the container, recovery process should be
triggered on both failed disks. It is not.
Recovery is triggered only for the first failed disk.
The second failed disk remains unchanged although a spare drive exists
in the container and is ready for recovery.

Root cause:
mdmon does not check if the array is degraded after recovery of first
drive is completed.

Resolution:
Check if current number of disks in the array equals target number of disks.
If not, trigger degradation check and then recovery process.

Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
13 years agoRelease mdadm-3.1.5 mdadm-3.1.5
NeilBrown [Wed, 23 Mar 2011 04:43:19 +0000 (15:43 +1100)] 
Release mdadm-3.1.5

Signed-off-by: NeilBrown <neilb@suse.de>
13 years agoIncr: don't exclude 'active' devices from auto inclusion in a container.
NeilBrown [Wed, 23 Mar 2011 04:42:35 +0000 (15:42 +1100)] 
Incr: don't exclude 'active' devices from auto inclusion in a container.

For containers, it is always appropriate to include a device in the
container.
Whether it should then be included in an array is a separate question.

Signed-off-by: NeilBrown <neilb@suse.de>
13 years ago--stop: separate 'is busy' test for 'did it stop properly'.
NeilBrown [Wed, 23 Mar 2011 04:42:24 +0000 (15:42 +1100)] 
--stop: separate 'is busy' test for 'did it stop properly'.

Stopping an md array requires that there is no other user of it.
However with udev and udisks and such there can be transient other
users of md devices which can interfere with stopping the array.

If there are transient users, we really want "mdadm --stop" to wait a
little while and retry.
However if the array is genuinely in-use (e.g. mounted), then we
don't want to wait at all - we want to fail immediately.

So before trying to stop, re-open device with O_EXCL.  If this fails
then the device is probably in use, so give up.

If it succeeds, but a subsequent STOP_ARRAY fails, then it is possibly
a transient failure, so try again for a few seconds.
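
The two-stage logic above can be sketched as follows. Everything here is illustrative: the callbacks stand in for the O_EXCL re-open and the STOP_ARRAY ioctl, the fake_array simulation exists only to exercise the control flow, and a real implementation would sleep briefly between attempts.

```c
#include <stdbool.h>

/* Hypothetical callback type standing in for a device operation. */
typedef bool (*probe_fn)(void *ctx);

enum stop_result { STOP_OK, STOP_BUSY, STOP_FAILED };

static enum stop_result stop_with_retry(probe_fn reopen_excl, probe_fn stop_array,
                                        void *ctx, int max_attempts)
{
    if (!reopen_excl(ctx))
        return STOP_BUSY;       /* genuinely in use: fail immediately */

    for (int i = 0; i < max_attempts; i++) {
        if (stop_array(ctx))
            return STOP_OK;
        /* possibly a transient user; real code would sleep here */
    }
    return STOP_FAILED;
}

/* Simulated array used only to demonstrate the control flow. */
struct fake_array {
    bool in_use;                /* e.g. mounted */
    int transient_failures;     /* STOP_ARRAY attempts that will fail */
};

static bool fake_reopen_excl(void *ctx)
{
    return !((struct fake_array *)ctx)->in_use;
}

static bool fake_stop(void *ctx)
{
    struct fake_array *a = ctx;

    if (a->transient_failures > 0) {
        a->transient_failures--;
        return false;
    }
    return true;
}
```

A mounted array is rejected before any retry, whereas an array with a short-lived extra opener is stopped after a few attempts.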

Signed-off-by: NeilBrown <neilb@suse.de>
13 years agoAssemble: improve efficacy of -Af in assembling degraded dirty arrays.
NeilBrown [Wed, 23 Mar 2011 00:07:27 +0000 (11:07 +1100)] 
Assemble: improve efficacy of -Af in assembling degraded dirty arrays.

If a degraded dirty array has some superblocks which are clean and
others that are dirty, and the dirty ones are newer by precisely '1'
in the event count, then the current code to force the array to be
clean will not work.
We need to make sure to find a superblock with most recent event count
and force that one to be 'clean'.
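
As a rough sketch of the selection step, the code below picks the superblock with the highest event count and marks that one clean. The struct and function names are hypothetical and far simpler than the real superblock handling in Assemble.c.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily simplified view of a superblock. */
struct sb_sketch {
    uint64_t events;    /* event count */
    int clean;          /* non-zero if marked clean */
};

/* Index of the superblock with the highest event count, or -1 if n == 0. */
static int most_recent_superblock(const struct sb_sketch *sb, size_t n)
{
    int best = -1;

    for (size_t i = 0; i < n; i++)
        if (best < 0 || sb[i].events > sb[best].events)
            best = (int)i;
    return best;
}

/* Force the newest superblock to 'clean'; returns its index or -1. */
static int force_newest_clean(struct sb_sketch *sb, size_t n)
{
    int best = most_recent_superblock(sb, n);

    if (best >= 0)
        sb[best].clean = 1;
    return best;
}
```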

Reported-by: A J Wyborny <ajwyborny@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
13 years agosuper-intel: enable loading metadata from non-IMSM compliant disks
Labun, Marcin [Wed, 23 Mar 2011 01:05:53 +0000 (12:05 +1100)] 
super-intel: enable loading metadata from non-IMSM compliant disks

Honor ignore_hw_compat to load metadata from disk attached to non-IMSM
controller or when there are no IMSM OROM/EFI capabilities.
Used only for guessing and examining metadata format.

Signed-off-by: Marcin Labun <marcin.labun@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
13 years agoexamine: allows to examine a disk metadata on non-metadata compliant systems
Labun, Marcin [Wed, 23 Mar 2011 01:04:46 +0000 (12:04 +1100)] 
examine: allows to examine a disk metadata on non-metadata compliant systems

Allow for loading metadata from disk attached to non-metadata compliant
system. Affects mdadm --examine and guess_super.

Added ignore_hw_compat in supertype to pass information to load_super
handler. If ignore_hw_compat is set the handler should load metadata
also from disks that do not comply with metadata requirements (i.e. disk is not
attached to native controller, etc).

Signed-off-by: Marcin Labun <marcin.labun@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
13 years agoman mdadm: Add note about auto-assembly during array reshape
Adam Kwolek [Wed, 23 Mar 2011 01:02:28 +0000 (12:02 +1100)] 
man mdadm: Add note about auto-assembly during array reshape

Add note to man that auto-assembly cannot be used for reshaped arrays.

Revisions: NeilBrown

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
13 years agoman mdadm: add information for MDADM_EXPERIMENTAL flag
Adam Kwolek [Wed, 23 Mar 2011 00:45:03 +0000 (11:45 +1100)] 
man mdadm: add information for MDADM_EXPERIMENTAL flag

Update man for MDADM_EXPERIMENTAL flag.

Minor revisions by Mathias Burén <mathias.buren@gmail.com> and Neil Brown.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
13 years agoMonitor: handle v.quick removal of devices better.
NeilBrown [Tue, 22 Mar 2011 03:47:55 +0000 (14:47 +1100)] 
Monitor: handle v.quick removal of devices better.

If a device fails and then is removed before Monitor sees
the failure, GET_DISK_INFO returns nothing so Monitor relies
on mdstat info where '_' is incorrectly interpreted as 'a spare'.

We should treat '_' as 'removed' - that is safer.

Without this, a v.quick fail+remove gets reported as 'Failed' then
'SpareActive'.
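
A minimal sketch of the safer interpretation, mapping one character of the /proc/mdstat status pattern (such as "[U_]") to a slot state. The enum and function names are hypothetical and are not taken from Monitor.c.

```c
/* Hypothetical slot states for illustration. */
enum slot_state { SLOT_ACTIVE, SLOT_REMOVED, SLOT_UNKNOWN };

/* Interpret one character of the mdstat status pattern. */
static enum slot_state slot_from_mdstat(char c)
{
    switch (c) {
    case 'U':
        return SLOT_ACTIVE;
    case '_':
        return SLOT_REMOVED;    /* treated as removed, not as a spare */
    default:
        return SLOT_UNKNOWN;
    }
}
```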

Signed-off-by: NeilBrown <neilb@suse.de>
13 years agoddf: fix up detection of failed/missing devices.
NeilBrown [Mon, 21 Mar 2011 23:32:09 +0000 (10:32 +1100)] 
ddf: fix up detection of failed/missing devices.

If a device hasn't been found yet we can still tell if it is
expected to be working, and we must do so to make sure
'working_disks' is correct.

Signed-off-by: NeilBrown <neilb@suse.de>
13 years agorestripe: allow test code to have an offset on each device.
Piergiorgio Sartor [Mon, 21 Mar 2011 23:09:38 +0000 (10:09 +1100)] 
restripe: allow test code to have an offset on each device.

If the device name ends with :number, e.g.
   /dev/sda0:1234

then assume the RAID data starts that many sectors from start of
device.
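
Parsing such a suffix could be sketched like this. split_device_offset() is a hypothetical helper, not the code in restripe.c; it modifies the string in place and treats anything that is not a plain decimal number as "no offset".

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: split "name:number" in place.
 * Returns the offset in sectors, or 0 if there is no numeric suffix,
 * in which case the string is left untouched.
 */
static unsigned long long split_device_offset(char *spec)
{
    char *colon = strrchr(spec, ':');
    char *end;
    unsigned long long offset;

    if (colon == NULL || !isdigit((unsigned char)colon[1]))
        return 0;

    offset = strtoull(colon + 1, &end, 10);
    if (*end != '\0')
        return 0;       /* trailing junk: not a pure number */

    *colon = '\0';      /* cut the suffix off the device name */
    return offset;
}
```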

Signed-off-by: NeilBrown <neilb@suse.de>
13 years agomdmon: Stop keeping track of RAID0 (and LINEAR) arrays.
NeilBrown [Tue, 22 Mar 2011 06:23:17 +0000 (17:23 +1100)] 
mdmon: Stop keeping track of RAID0 (and LINEAR) arrays.

Tracking RAID0 arrays doesn't really work.  There is no need,
and there are some sysfs files which won't exist when the array
appears and then won't be opened when the level is changed.

So simply ignore RAID0 and LINEAR arrays - don't add them when they
appear and if an array we are monitoring turns into one of these,
discard it promptly.

Signed-off-by: NeilBrown <neilb@suse.de>
13 years agomdmon: don't wait for O_EXCL when shutting down.
NeilBrown [Tue, 22 Mar 2011 05:10:22 +0000 (16:10 +1100)] 
mdmon: don't wait for O_EXCL when shutting down.

If mdmon is shutting down because there are no devices
left to look at, then don't wait 5 seconds for an O_EXCL open,
as that can block progress of --grow.

Only wait for O_EXCL if we received a signal.

Signed-off-by: NeilBrown <neilb@suse.de>
13 years agomdmon: allow manage_member to cope with ->container becoming NULL.
NeilBrown [Tue, 22 Mar 2011 03:52:37 +0000 (14:52 +1100)] 
mdmon: allow manage_member to cope with ->container becoming NULL.

As monitor() can set ->container to NULL, we need to be careful
about dereferencing it.
So take a copy in manage_member, return if it is NULL, and only
use the copy.
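
The pattern can be sketched as below with hypothetical, simplified types. In real multi-threaded code the single read of the pointer would also have to be protected against the compiler re-reading it; that detail is left out here.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the mdmon structures. */
struct container_sketch {
    int devcnt;
};

struct active_array_sketch {
    struct container_sketch *container;     /* may be set to NULL by monitor() */
};

/* Returns -1 if the container is already gone, else a value read via the copy. */
static int manage_member_sketch(struct active_array_sketch *a)
{
    struct container_sketch *container = a->container;     /* take a copy once */

    if (container == NULL)
        return -1;

    /* From here on only the local copy is used, never a->container. */
    return container->devcnt;
}
```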

Signed-off-by: NeilBrown <neilb@suse.de>
13 years agoGrow: increase raid_disks before adding specific spares.
NeilBrown [Tue, 22 Mar 2011 03:52:36 +0000 (14:52 +1100)] 
Grow: increase raid_disks before adding specific spares.

When we add spares that have been targeted at a specific slot,
we need raid_disks to be bigger than the slot number.
But currently we don't increase raid_disks until after we add
these spares.

So introduce an early increase of raid_disks to allow the spares
to be added.

Signed-off-by: NeilBrown <neilb@suse.de>
13 years agotest: call "udevadm settle" after stopping array.
NeilBrown [Mon, 21 Mar 2011 23:09:30 +0000 (10:09 +1100)] 
test: call "udevadm settle" after stopping array.

If we don't do this, then the unlink from /dev might happen
after the next step in the test creates something in /dev,
and device names seem to go missing.

Signed-off-by: NeilBrown <neilb@suse.de>
13 years agoRAID-6 check standalone
Piergiorgio Sartor [Mon, 21 Mar 2011 02:52:44 +0000 (13:52 +1100)] 
RAID-6 check standalone

Hi Neil,

please find attached a patch, to mdadm-3.2 base, including
a standalone version of the raid-6 check.

This is basically a re-working (and hopefully improvement)
of the already implemented check in "restripe.c".

I split the check function into "collect" and "stats",
so that the second one could be easily replaced.
The API is also simplified.

The command line options are reduced, since the only level
is raid-6, but the ":offset" option is included.

The output reports the block/stripe rotation, P/Q errors
and the possible HDD (or unknown).

BTW, the patch applies also to the already patched "restripe.c",
including the last ":offset" patch (which is not yet in git).

Another item is that due to "sysfs.c" linking (see below) the
"Makefile" needed some changes, I hope this is not a problem.

Next steps (a TODO list, if you like) would be:

1) Add the "sysfs.c" code in order to retrieve the HDDs info
from the MD device. It is already linked, together with the
whole (mdadm) universe, since it seems it cannot live alone.
I'll need some advice or hint on how to use it. I checked
"sysfs.c", but before I dig deep into it maybe better to
have some advice (maybe just one function call will do it).

2) Add the suspend lo/hi control. Fellow John Robinson was
suggesting to look into "Grow.c", which I did, but I guess
the same story as 1) is valid: better to have some hint on
where to look before wasting time.

3) Add a repair option (future). This should have different
levels, like "all", "disk", "stripe". That is, fix everything
(more or less like "repair"), fix only if a disk is clearly
having problems, fix each stripe which has clearly a problem
(but maybe different stripes may belong to different HDDs).

So, for the point 1) and 2) would be nice to have some more
detail on where to look what. Point 3) we will discuss later.

Thanks, please consider for inclusion,

bye,

pg

Signed-off-by: NeilBrown <neilb@suse.de>