git.ipfire.org Git - thirdparty/mdadm.git/log
13 years ago  examine: allow examining disk metadata on non-metadata-compliant systems
Labun, Marcin [Wed, 23 Mar 2011 01:04:46 +0000 (12:04 +1100)]
examine: allow examining disk metadata on non-metadata-compliant systems

Allow loading metadata from a disk attached to a non-metadata-compliant
system. Affects mdadm --examine and guess_super.

Added ignore_hw_compat in supertype to pass information to the load_super
handler. If ignore_hw_compat is set, the handler should also load metadata
from disks that do not comply with the metadata requirements (e.g. the disk
is not attached to a native controller).

Signed-off-by: Marcin Labun <marcin.labun@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
13 years ago  man mdadm: Add note about auto-assembly during array reshape
Adam Kwolek [Wed, 23 Mar 2011 01:02:28 +0000 (12:02 +1100)] 
man mdadm: Add note about auto-assembly during array reshape

Add note to man that auto-assembly cannot be used for reshaped arrays.

Revisions: NeilBrown

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
13 years ago  man mdadm: add information for MDADM_EXPERIMENTAL flag
Adam Kwolek [Wed, 23 Mar 2011 00:45:03 +0000 (11:45 +1100)] 
man mdadm: add information for MDADM_EXPERIMENTAL flag

Update man for MDADM_EXPERIMENTAL flag.

Minor revisions by Mathias Burén <mathias.buren@gmail.com> and Neil Brown.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
13 years ago  mdmon: Stop keeping track of RAID0 (and LINEAR) arrays.
NeilBrown [Tue, 22 Mar 2011 06:23:17 +0000 (17:23 +1100)] 
mdmon: Stop keeping track of RAID0 (and LINEAR) arrays.

Tracking RAID0 arrays doesn't really work.  There is no need,
and there are some sysfs files which won't exist when the array
appears and then won't be opened when the level is changed.

So simply ignore RAID0 and LINEAR arrays - don't add them when they
appear and if an array we are monitoring turns into one of these,
discard it promptly.

Signed-off-by: NeilBrown <neilb@suse.de>
13 years ago  mdmon: don't wait for O_EXCL when shutting down.
NeilBrown [Tue, 22 Mar 2011 05:10:22 +0000 (16:10 +1100)] 
mdmon: don't wait for O_EXCL when shutting down.

If mdmon is shutting down because there are no devices
left to look at, then don't wait 5 seconds for an O_EXCL open,
as that can block progress of --grow.

Only wait for O_EXCL if we received a signal.

Signed-off-by: NeilBrown <neilb@suse.de>
13 years ago  mdmon: allow manage_member to cope with ->container becoming NULL.
NeilBrown [Tue, 22 Mar 2011 03:52:37 +0000 (14:52 +1100)] 
mdmon: allow manage_member to cope with ->container becoming NULL.

As monitor() can set ->container to NULL, we need to be careful
about dereferencing it.
So take a copy in manage_member, return if it is NULL, and only
use the copy.

Signed-off-by: NeilBrown <neilb@suse.de>
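
A minimal C sketch of the copy-then-check pattern described in this commit. The struct layouts and the name member_container_devnum are hypothetical stand-ins, not mdmon's real definitions, and the C11 atomic load is an assumption of this sketch, used only to make the single read explicit.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-ins for mdmon's structures. */
struct container {
    int devnum;
};

struct active_array {
    /* Another thread may set this to NULL at any time. */
    _Atomic(struct container *) container;
};

/* Read ->container exactly once, return early if it is NULL, and use
 * only the local copy afterwards. Returns -1 when there is no container. */
int member_container_devnum(struct active_array *a)
{
    struct container *c = atomic_load(&a->container);

    if (c == NULL)
        return -1;
    return c->devnum;
}
```

The point of the local copy is that the NULL test and the later dereference are guaranteed to see the same pointer value. Keeping the pointed-to object alive is a separate concern that this sketch does not address.
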
13 years ago  Grow: increase raid_disks before adding specific spares.
NeilBrown [Tue, 22 Mar 2011 03:52:36 +0000 (14:52 +1100)] 
Grow: increase raid_disks before adding specific spares.

When we add spares that have been targeted at a specific slot,
we need raid_disks to be bigger than the slot number.
But currently we don't increase raid_disks until after we add
these spares.

So introduce an early increase of raid_disks to allow the spares
to be added.

Signed-off-by: NeilBrown <neilb@suse.de>
13 years ago  Monitor: handle v.quick removal of devices better.
NeilBrown [Tue, 22 Mar 2011 03:47:55 +0000 (14:47 +1100)] 
Monitor: handle v.quick removal of devices better.

If a device fails and then is removed before Monitor sees
the failure, GET_DISK_INFO returns nothing so Monitor relies
on mdstat info where '_' is incorrectly interpreted as 'a spare'.

We should treat '_' as 'removed' - that is safer.

Without this, a v.quick fail+remove gets reported as 'Failed' then
'SpareActive'.

Signed-off-by: NeilBrown <neilb@suse.de>
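
The '_' handling described in this commit can be sketched in C as below. The enum, the function name, and the choice to pass the mdstat pattern without its surrounding brackets (for example "U_U") are assumptions made for illustration; Monitor's real code is organised differently.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical state names for this sketch only. */
enum slot_state { SLOT_ACTIVE, SLOT_REMOVED, SLOT_UNKNOWN };

/* Interpret one character of an mdstat status pattern such as "U_U".
 * 'U' means the slot is up; '_' is treated as "removed", which is the
 * safer reading than "a spare". */
enum slot_state slot_state_from_mdstat(const char *pattern, size_t slot)
{
    if (pattern == NULL || slot >= strlen(pattern))
        return SLOT_UNKNOWN;

    switch (pattern[slot]) {
    case 'U':
        return SLOT_ACTIVE;
    case '_':
        return SLOT_REMOVED;
    default:
        return SLOT_UNKNOWN;
    }
}
```

With this reading, a device that failed and was removed between two polls is reported as removed and never as a newly active spare.
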
13 years ago  ddf: fix up detection of failed/missing devices.
NeilBrown [Mon, 21 Mar 2011 23:32:09 +0000 (10:32 +1100)] 
ddf: fix up detection of failed/missing devices.

If a device hasn't been found yet we can still tell if it is
expected to be working, and we must do so to make sure
'working_disks' is correct.

Signed-off-by: NeilBrown <neilb@suse.de>
13 years ago  restripe: allow test code to have an offset on each device.
Piergiorgio Sartor [Mon, 21 Mar 2011 23:09:38 +0000 (10:09 +1100)] 
restripe: allow test code to have an offset on each device.

If a device name ends with :number, e.g.
   /dev/sda0:1234

then assume the RAID data starts that many sectors from start of
device.

Signed-off-by: NeilBrown <neilb@suse.de>
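
A small C sketch of how a "name:number" argument like the one in this commit could be split. The function name split_device_offset and the decision to ignore a suffix that is not purely numeric are assumptions of this sketch, not the code that restripe.c uses.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Split "path:sectors" in place. On success the ":sectors" suffix is cut
 * off the name and the sector offset is returned. If there is no numeric
 * suffix the name is left untouched and 0 is returned. */
unsigned long long split_device_offset(char *name)
{
    char *colon = strrchr(name, ':');
    char *end;
    unsigned long long sectors;

    if (colon == NULL || !isdigit((unsigned char)colon[1]))
        return 0;

    sectors = strtoull(colon + 1, &end, 10);
    if (*end != '\0')
        return 0;       /* trailing junk: treat it as a plain name */

    *colon = '\0';
    return sectors;
}
```

For "/dev/sda0:1234" this yields the name "/dev/sda0" and an offset of 1234 sectors; a name without a suffix gets offset 0.
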
13 years ago  test: call "udevadm settle" after stopping array.
NeilBrown [Mon, 21 Mar 2011 23:09:30 +0000 (10:09 +1100)] 
test: call "udevadm settle" after stopping array.

If we don't do this, then the unlink from /dev might happen
after the next step in the test creates something in /dev,
and device names seem to go missing.

Signed-off-by: NeilBrown <neilb@suse.de>
13 years ago  RAID-6 check standalone
Piergiorgio Sartor [Mon, 21 Mar 2011 02:52:44 +0000 (13:52 +1100)] 
RAID-6 check standalone

Hi Neil,

please find attached a patch, against the mdadm-3.2 base, including
a standalone version of the raid-6 check.

This is basically a re-working (and hopefully improvement)
of the already implemented check in "restripe.c".

I split the check function into "collect" and "stats",
so that the second one could be easily replaced.
The API is also simplified.

The command line options are reduced, since the only level
is raid-6, but the ":offset" option is included.

The output reports the block/stripe rotation, P/Q errors
and the possible HDD (or unknown).

BTW, the patch applies also to the already patched "restripe.c",
including the last ":offset" patch (which is not yet in git).

Another item is that, due to "sysfs.c" linking (see below), the
"Makefile" needed some changes; I hope this is not a problem.

Next steps (a TODO list, if you like) would be:

1) Add the "sysfs.c" code in order to retrieve the HDDs info
from the MD device. It is already linked, together with the
whole (mdadm) universe, since it seems it cannot live alone.
I'll need some advice or hint on how to use it. I checked
"sysfs.c", but before I dig deep into it maybe better to
have some advice (maybe just one function call will do it).

2) Add the suspend lo/hi control. Fellow John Robinson was
suggesting to look into "Grow.c", which I did, but I guess
the same story as 1) is valid: better to have some hint on
where to look before wasting time.

3) Add a repair option (future). This should have different
levels, like "all", "disk", "stripe". That is, fix everything
(more or less like "repair"), fix only if a disk is clearly
having problems, fix each stripe which has clearly a problem
(but maybe different stripes may belong to different HDDs).

So, for points 1) and 2) it would be nice to have some more
detail on where to look. Point 3) we will discuss later.

Thanks, please consider for inclusion,

bye,

pg

Signed-off-by: NeilBrown <neilb@suse.de>
13 years ago  platform_intel: support EFI SCU OEM variable
Labun, Marcin [Sun, 20 Mar 2011 04:47:33 +0000 (15:47 +1100)] 
platform_intel: support EFI SCU OEM variable

RstScuV and RstScuO variable names are supported.
First try reading from RstScuV, when it fails try RstScuO.

Signed-off-by: Marcin Labun <marcin.labun@intel.com>
Tested-by: Przemyslaw Czarnowski <przemyslaw.hawrylewicz.czarnowski@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
13 years ago  imsm: FIX: indicate that metadata has to be written
Adam Kwolek [Sun, 20 Mar 2011 04:47:31 +0000 (15:47 +1100)]
imsm: FIX: indicate that metadata has to be written

While adding spare disks to raid0, spare metadata is not written.
This is due to an early exit from sync_metadata() when the updates_pending flag is empty.

When mdmon is absent, tell sync_metadata() to flush changes to the disks.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
13 years ago  FIX: Add spare throws exception (v2)
Adam Kwolek [Sun, 20 Mar 2011 04:47:17 +0000 (15:47 +1100)] 
FIX: Add spare throws exception (v2)

sync_metadata() requires st->sb to be loaded, otherwise an exception is
generated.  This makes expansion fail, because spares cannot be added.

The metadata update now uses the tst pointer instead of st; this is better than
loading the anchor for st as I proposed previously.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
13 years ago  Retry writing 'inactive' state during stopping array
Krzysztof Wojcik [Fri, 18 Mar 2011 01:42:17 +0000 (12:42 +1100)] 
Retry writing 'inactive' state during stopping array

Issue observed:
Sporadically, stopping arrays using the "mdadm -Ss" command does not succeed.
Cause:
Writing "inactive" to the array state does not succeed because the array is busy
(accessed by udev, blkid etc.)
Resolution:
If writing 'inactive' fails, wait and retry (because it is possibly
a transient failure)

Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
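
The wait-and-retry resolution in this commit can be sketched as a generic bounded retry loop in C. Everything named here (retry_transient, attempt_fn, busy_n_times, the try count and delay) is hypothetical; the simulated writer stands in for the real sysfs write so the sketch can run without an md device.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <time.h>

/* One attempt at the operation; returns 0 on success, non-zero on failure. */
typedef int (*attempt_fn)(void *ctx);

/* Call attempt() up to max_tries times, sleeping delay_ms between failed
 * attempts. Returns 0 as soon as one attempt succeeds, otherwise the
 * result of the last failed attempt. */
int retry_transient(attempt_fn attempt, void *ctx, int max_tries, long delay_ms)
{
    int rv = -1;
    int i;

    for (i = 0; i < max_tries; i++) {
        rv = attempt(ctx);
        if (rv == 0)
            return 0;
        if (i + 1 < max_tries) {
            struct timespec ts;

            ts.tv_sec = delay_ms / 1000;
            ts.tv_nsec = (delay_ms % 1000) * 1000000L;
            nanosleep(&ts, NULL);
        }
    }
    return rv;
}

/* Simulated writer: reports "busy" until the counter in ctx reaches 0. */
int busy_n_times(void *ctx)
{
    int *remaining = ctx;

    if (*remaining > 0) {
        (*remaining)--;
        return -1;
    }
    return 0;
}
```

Bounding the number of attempts matters: a failure that is not transient still has to be reported to the caller eventually.
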
13 years ago  FIX: ping_monitor() usage causes memory leaks
Adam Kwolek [Fri, 18 Mar 2011 01:32:16 +0000 (12:32 +1100)] 
FIX: ping_monitor() usage causes memory leaks

When devnum2devname() is used to produce the input for ping_monitor(),
the returned string pointer should be passed to free() to release the memory.
This is not done in several places. This use case should have its own function
to avoid the memory leak.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
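
A hedged C sketch of the kind of wrapper this commit argues for: one function that obtains the name, pings, and always frees. The two *_stub functions are stand-ins written for this example (the real devnum2devname() and ping_monitor() live in mdadm and are not reproduced here), and names_outstanding exists only so the sketch can show that nothing is leaked.

```c
#include <stdio.h>
#include <stdlib.h>

/* Counts names handed out by the stub and not yet freed (sketch only). */
int names_outstanding = 0;

/* Stand-in for devnum2devname(): returns a malloc'd string that the
 * caller owns and must free. */
char *devnum_to_name_stub(int devnum)
{
    char *name = malloc(32);

    if (name == NULL)
        return NULL;
    snprintf(name, 32, "md%d", devnum);
    names_outstanding++;
    return name;
}

/* Stand-in for ping_monitor(): succeeds for names that start with "md". */
int ping_monitor_stub(const char *devname)
{
    return (devname[0] == 'm' && devname[1] == 'd') ? 0 : -1;
}

/* Wrapper that keeps allocation and release in one place, so callers
 * cannot forget the free(). */
int ping_monitor_by_devnum(int devnum)
{
    int err = -1;
    char *name = devnum_to_name_stub(devnum);

    if (name != NULL) {
        err = ping_monitor_stub(name);
        free(name);
        names_outstanding--;
    }
    return err;
}
```
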
13 years ago  Manage: fix the mess I made in earlier patch.
NeilBrown [Fri, 18 Mar 2011 01:31:45 +0000 (12:31 +1100)] 
Manage: fix the mess I made in earlier patch.

When I separated the 'native metadata' case more cleanly from the
"external metadata" case for adding a drive, I left some 'external'
code in the 'native' case, and didn't copy it to the 'external' case.

When - in the external case - we add to super, we must check for
mdmon first, so we know whether to do the metadata update ourselves
or not, then afterwards call either flush_metadata_updates (to send
to mdmon) or sync_metadata (to do it directly).

Reported-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
13 years ago  --stop: separate 'is busy' test from 'did it stop properly'.
NeilBrown [Thu, 17 Mar 2011 02:35:10 +0000 (13:35 +1100)]
--stop: separate 'is busy' test from 'did it stop properly'.

Stopping an md array requires that there is no other user of it.
However with udev and udisks and such there can be transient other
users of md devices which can interfere with stopping the array.

If there are transient users, we really want "mdadm --stop" to wait a
little while and retry.
However if the array is genuinely in-use (e.g. mounted), then we
don't want to wait at all - we want to fail immediately.

So before trying to stop, re-open device with O_EXCL.  If this fails
then the device is probably in use, so give up.

If it succeeds, but a subsequent STOP_ARRAY fails, then it is possibly
a transient failure, so try again for a few seconds.

Signed-off-by: NeilBrown <neilb@suse.de>
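
The two-phase logic of this commit (fail at once if the exclusive open fails, retry only the stop itself) can be sketched in C as follows. struct fake_array and both fake_* functions are hypothetical stand-ins for a real md device, the O_EXCL re-open, and the STOP_ARRAY ioctl, so that the sketch runs anywhere; the retry count and delay are likewise assumptions.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <time.h>

enum stop_result { STOP_OK, STOP_IN_USE, STOP_STILL_BUSY };

/* Hypothetical model of an array and its users. */
struct fake_array {
    int mounted;            /* a genuine, long-lived user */
    int transient_users;    /* e.g. udev probes that go away by themselves */
};

/* Stand-in for re-opening the device with O_EXCL. */
int fake_open_excl(struct fake_array *a)
{
    return a->mounted ? -1 : 0;
}

/* Stand-in for STOP_ARRAY: fails while transient users remain. */
int fake_stop_array(struct fake_array *a)
{
    if (a->transient_users > 0) {
        a->transient_users--;
        return -1;
    }
    return 0;
}

enum stop_result stop_array_sketch(struct fake_array *a, int max_tries,
                                   long delay_ms)
{
    int i;

    /* Phase 1: a failed exclusive open means real use, so give up now. */
    if (fake_open_excl(a) != 0)
        return STOP_IN_USE;

    /* Phase 2: a failed stop is possibly transient, so wait and retry. */
    for (i = 0; i < max_tries; i++) {
        struct timespec ts;

        if (fake_stop_array(a) == 0)
            return STOP_OK;
        ts.tv_sec = delay_ms / 1000;
        ts.tv_nsec = (delay_ms % 1000) * 1000000L;
        nanosleep(&ts, NULL);
    }
    return STOP_STILL_BUSY;
}
```

Separating the phases means a mounted array is rejected without any delay, while a brief probe by another process only costs a short wait.
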
13 years ago  Fix regression when using 'grow' to add a bitmap.
NeilBrown [Tue, 15 Mar 2011 05:31:20 +0000 (16:31 +1100)] 
Fix regression when using 'grow' to add a bitmap.

When we allowed a devlist to accompany some --grow modes - but not
--bitmap - we made --bitmap always fail, instead of failing only if a device
was given to add.
As 'devs_found' includes the md device, we need to compare against
'1'.

Signed-off-by: NeilBrown <neilb@suse.de>
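
The off-by-one in this commit comes down to what the counter includes. A tiny C sketch, with a hypothetical function name, assuming only what the message states (devs_found counts the md device itself):

```c
/* devs_found counts the md device itself plus any extra devices named on
 * the command line, so "extra devices were given" means more than 1. */
int grow_has_extra_devices(int devs_found)
{
    return devs_found > 1;
}
```

Testing against 0 instead would flag every invocation, because the md device is always counted.
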
13 years ago  Merge branch 'master' into devel-3.2
NeilBrown [Tue, 15 Mar 2011 04:35:04 +0000 (15:35 +1100)] 
Merge branch 'master' into devel-3.2

Conflicts:
Manage.c
managemon.c
super-ddf.c
super-intel.c

13 years ago  mdadm.man: added encouragement to shrink filesystem before array.
NeilBrown [Tue, 15 Mar 2011 04:24:03 +0000 (15:24 +1100)] 
mdadm.man: added encouragement to shrink filesystem before array.

Suggested by Rory Jaffe <rsjaffe@gmail.com> to make the danger
of shrinking, and the recommended avoidance technique, more explicit.

Signed-off-by: NeilBrown <neilb@suse.de>
13 years ago  ddf: implement remove_from_super
NeilBrown [Mon, 14 Mar 2011 07:56:16 +0000 (18:56 +1100)] 
ddf: implement remove_from_super

This is needed to remove devices from mdmon's knowledge when the
device is removed from the md container.

Now that ddf has a remove_from_super we don't need the code
that allows some personalities not to implement this.

Signed-off-by: NeilBrown <neilb@suse.de>
13 years ago  IMSM: Fix problem in mdmon monitor of using removed disk in imsm container.
Labun, Marcin [Tue, 15 Mar 2011 04:09:31 +0000 (15:09 +1100)] 
IMSM: Fix problem in mdmon monitor of using removed disk in imsm container.

Manager thread shall pass the information to monitor thread (mdmon)
that some devices are removed from container.  Otherwise, monitor
(mdmon) might use such devices (spares) to rebuild the array that has
gone degraded.

This problem happens for imsm containers, since a list of the
container disks is maintained in the intel_super structure. When an array
goes degraded, the list is searched to find a spare disk to start the
rebuild.  Without this fix the rebuild could be started on a spare
device that was a member of the container, but has been removed from
it.

New super type function handler has been introduced to prepare
metadata format specific information about removed devices.

int (*remove_from_super)(struct supertype *st, mdu_disk_info_t *dinfo)

The message prepared in remove_from_super is later processed by
process_update handler in monitor thread.

Signed-off-by: Marcin Labun <marcin.labun@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
13 years ago  DDF Allow a RAID1 to be 'partially optimal'.
NeilBrown [Tue, 15 Mar 2011 04:09:24 +0000 (15:09 +1100)] 
DDF Allow a RAID1 to be 'partially optimal'.

If a RAID1 is meant to have more than 2 devices and, while it doesn't
have that many, it still has more than 1, then according to the
DDF spec it is "partially optimal" rather than "degraded".
So make that so.

Signed-off-by: NeilBrown <neilb@suse.de>
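
The rule stated in this commit can be written as a small decision function. This is a C sketch under the assumptions in the message only; the enum names and the function raid1_state are hypothetical and do not use the numeric state values from the DDF specification.

```c
/* Hypothetical state names for this sketch. */
enum vd_state { VD_OPTIMAL, VD_PARTIALLY_OPTIMAL, VD_DEGRADED, VD_FAILED };

/* Classify a RAID1 with raid_disks intended members, of which
 * working are currently usable. */
enum vd_state raid1_state(int raid_disks, int working)
{
    if (working >= raid_disks)
        return VD_OPTIMAL;
    if (working <= 0)
        return VD_FAILED;
    /* Meant to have more than 2 devices, is short of that, but still has
     * more than 1: redundancy remains, so not merely "degraded". */
    if (raid_disks > 2 && working > 1)
        return VD_PARTIALLY_OPTIMAL;
    return VD_DEGRADED;
}
```

For example, a three-way mirror with two working devices is partially optimal, while the same mirror with one working device is degraded.
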
13 years ago  ddf: remove failed devices that are no longer in use.
NeilBrown [Tue, 15 Mar 2011 04:02:49 +0000 (15:02 +1100)] 
ddf: remove failed devices that are no longer in use.

The DDF spec requires we have a phys disk record for every physically
attached device.  But it isn't clear what that means in the case
of soft raid in a general purpose Linux computer.
So remove phys disk records for any failed device that is not
active in any array.

Signed-off-by: NeilBrown <neilb@suse.de>
13 years ago  ddf: set Rebuilding flag when adding devices to a degraded array
NeilBrown [Tue, 15 Mar 2011 03:57:46 +0000 (14:57 +1100)] 
ddf: set Rebuilding flag when adding devices to a degraded array

This is a bit fragile, but DDF has weird rules that we aren't really
set up to handle properly.

When we add a device to a degraded array it must be a spare, so
mark it as Rebuilding.

Signed-off-by: NeilBrown <neilb@suse.de>
13 years ago  ddf: use correct loop variable in activate_spare
NeilBrown [Tue, 15 Mar 2011 03:54:46 +0000 (14:54 +1100)] 
ddf: use correct loop variable in activate_spare

Using 'i' when you mean 'j' just shows how silly it is to use
variables named 'i' and 'j'.

Signed-off-by: NeilBrown <neilb@suse.de>
13 years ago  ddf: Don't consider 'dl' entries with state_fd < 0
NeilBrown [Tue, 15 Mar 2011 03:53:00 +0000 (14:53 +1100)] 
ddf: Don't consider 'dl' entries with state_fd < 0

These have been marked as invalid (recently failed) so
don't trust the major/minor associated with them.

Signed-off-by: NeilBrown <neilb@suse.de>
13 years ago  managemon: Don't do spare assignment while any updates are pending.
NeilBrown [Tue, 15 Mar 2011 03:51:12 +0000 (14:51 +1100)] 
managemon: Don't do spare assignment while any updates are pending.

Spare assignment requires full knowledge of array state.  A pending
update might modify that state (such as a pending spare assignment)
so don't try while there are updates pending.

Signed-off-by: NeilBrown <neilb@suse.de>
13 years ago  Manage/external: for external metadata, add_to_super needs lock on container.
NeilBrown [Tue, 15 Mar 2011 03:48:20 +0000 (14:48 +1100)] 
Manage/external: for external metadata, add_to_super needs lock on container.

add_to_super could use information from the current superblock (ddf
does), so add_to_super for external metadata should be called with
the O_EXCL lock held on the container to ensure the update is complete
before any other process tries to make any changes (like adding
another device to array).

Signed-off-by: NeilBrown <neilb@suse.de>
13 years ago  imsm: FIX: existing backup file fails unit tests
Adam Kwolek [Mon, 14 Mar 2011 14:09:29 +0000 (15:09 +0100)] 
imsm: FIX: existing backup file fails unit tests

During normal test execution, the backup file is deleted after the test runs.
If a test is interrupted/broken, the backup file can remain for the next run.
When the backup file exists before a unit test run, suites 12 and 13 fail.

To avoid this, remove the backup file before grow is executed.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
13 years ago  ddf: implement remove_from_super
NeilBrown [Mon, 14 Mar 2011 07:56:16 +0000 (18:56 +1100)] 
ddf: implement remove_from_super

This is needed to remove devices from mdmon's knowledge when the
device is removed from the md container.

Now that ddf has a remove_from_super we don't need the code
that allows some personalities not to implement this.

Signed-off-by: NeilBrown <neilb@suse.de>
13 years ago  ddf: zero space_list in ddf_activate_spare.
NeilBrown [Mon, 14 Mar 2011 07:54:21 +0000 (18:54 +1100)] 
ddf: zero space_list in ddf_activate_spare.

Currently ->space_list is uninitialised here, which is obviously bad.

Signed-off-by: NeilBrown <neilb@suse.de>
13 years ago  Merge branch 'master' into devel-3.2
NeilBrown [Mon, 14 Mar 2011 07:49:57 +0000 (18:49 +1100)] 
Merge branch 'master' into devel-3.2

13 years ago  ddf: set vcnum correctly when creating a new virtual device in conflist
NeilBrown [Mon, 14 Mar 2011 07:47:47 +0000 (18:47 +1100)] 
ddf: set vcnum correctly when creating a new virtual device in conflist

We weren't setting ->vcnum at all when an array was added.  This
meant that a subsequent device failure could be assigned to the
wrong array.

Reported-by: Albert Pauw <albert.pauw@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
13 years ago  ddf: teach set_disk to cope with new or changed devices.
NeilBrown [Mon, 14 Mar 2011 07:45:26 +0000 (18:45 +1100)] 
ddf: teach set_disk to cope with new or changed devices.

When set_disk is called, we need to check if the disk has changed or
recently appeared, and update everything properly if it has.

Signed-off-by: NeilBrown <neilb@suse.de>
13 years ago  ddf: free_super should free add_list as well.
NeilBrown [Mon, 14 Mar 2011 07:32:38 +0000 (18:32 +1100)]
ddf: free_super should free add_list as well.

It is possible there is data and even an open file descriptor
on 'add_list' - so it must be freed too.

Signed-off-by: NeilBrown <neilb@suse.de>
13 years ago  ddf: minor activate_super fixes.
NeilBrown [Mon, 14 Mar 2011 07:30:34 +0000 (18:30 +1100)] 
ddf: minor activate_super fixes.

1/ ignore devices with "state_fd < 0" as these have been removed.
2/ Set update 'length' properly and clear 'space'.

Signed-off-by: NeilBrown <neilb@suse.de>
13 years ago  monitor: close recovery_fd when closing state_fd
NeilBrown [Mon, 14 Mar 2011 07:24:01 +0000 (18:24 +1100)]
monitor: close recovery_fd when closing state_fd

These should be open or closed together.

Signed-off-by: NeilBrown <neilb@suse.de>
13 years ago  Warn the user about too small array size
Krzysztof Wojcik [Mon, 14 Mar 2011 07:21:21 +0000 (18:21 +1100)] 
Warn the user about too small array size

If a single-disk RAID0 or RAID1 array is created, the user may preserve data on
the disk. If the given array size covers all partitions on the disk, all data will be
available on the created array. If the array size is too small (does not cover
all partitions), data will not be accessible.
This patch introduces a warning message during array creation if the given size
is too small. The user may interrupt the creation process to avoid data loss.

Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
13 years ago  platform_intel: find OROM based on Intel AHCI and SAS driver device id
Labun, Marcin [Mon, 14 Mar 2011 07:18:46 +0000 (18:18 +1100)]
platform_intel: find OROM based on Intel AHCI and SAS driver device id

We use the PCI device id exposed by the AHCI and ISCU drivers (SAS controller)
to find the OROM version table.
In this way there is no need to maintain an AHCI and ISCU device id list
in mdadm. The consequence is that the OROM properties can be found by mdadm
when the AHCI or SAS drivers are loaded in the system.

Signed-off-by: Marcin Labun <marcin.labun@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
13 years ago  imsm: FIX: Store checkpoint in per disk units
Adam Kwolek [Mon, 14 Mar 2011 07:17:53 +0000 (18:17 +1100)] 
imsm: FIX: Store checkpoint in per disk units

As last_checkpoint is a counter in per-disk units, checkpoints
should be stored in the same manner.
Restoring from a checkpoint should recalculate the checkpoint into
an array position (reshape_progress).

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
13 years ago  FIX: Last_checkpoint has to be initialized in per disk units
Adam Kwolek [Mon, 14 Mar 2011 07:17:52 +0000 (18:17 +1100)] 
FIX: Last_checkpoint has to be initialized in per disk units

last_checkpoint is a variable that tracks the sync_complete sysfs entry.
sync_complete is a per-disk counter, so initialization when starting from a checkpoint
has to take this into account and convert the reshape position properly.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
13 years ago  FIX: Last checkpoint is not initialized on reshape restart
Adam Kwolek [Thu, 10 Mar 2011 14:05:54 +0000 (15:05 +0100)] 
FIX: Last checkpoint is not initialized on reshape restart

When reshape is restarted and the active array in mdmon is being initialized,
mdmon has to know the last checkpoint, otherwise reshape will be restarted
from position '0'.
When a reshaped array is assembled, mdadm stores reshape_position in sysfs
and runs mdmon. Initialize last_checkpoint in the active array structure
to the value present in sysfs when the reshaped array starts.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
13 years ago  FIX: Unfreeze array on success only
Adam Kwolek [Thu, 10 Mar 2011 07:30:42 +0000 (08:30 +0100)] 
FIX: Unfreeze array on success only

Unfreeze array on success only.
rv is initialized by restart variable so we have 2 cases.
1. regular reshape start
rv == restart == 0
   this means that real error (returned by reshape) can cause leaving container frozen
   If array is not touched by reshape it can be unfrozen
2. During reshape restart even untouched array under reshape is left unfrozen,
   If reshape is started do not unfreeze array on error also.

This allows the user to take array repair action
(mdmon will not change array state).

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
13 years ago  ddf: Failed should suppress Online and others.
NeilBrown [Thu, 10 Mar 2011 07:14:43 +0000 (18:14 +1100)] 
ddf: Failed should suppress Online and others.

so the notes say, so make it so.

Signed-off-by: NeilBrown <neilb@suse.de>
13 years ago  Merge branch 'master' into devel-3.2
NeilBrown [Thu, 10 Mar 2011 06:37:04 +0000 (17:37 +1100)] 
Merge branch 'master' into devel-3.2

Conflicts:
Grow.c
Manage.c
managemon.c
mdadm.8.in
util.c

13 years ago  Manage: be more careful about --add attempts.
NeilBrown [Mon, 22 Nov 2010 08:35:25 +0000 (19:35 +1100)]
Manage: be more careful about --add attempts.

If an --add is requested and a re-add looks promising but fails or
cannot possibly succeed, then don't try the add.  This avoids
inadvertently turning devices into spares when an array is failed but
the devices seem to actually work.

Signed-off-by: NeilBrown <neilb@suse.de>
13 years ago  ddf: remove duplicate container_member setting.
NeilBrown [Mon, 22 Nov 2010 08:35:25 +0000 (19:35 +1100)] 
ddf: remove duplicate container_member setting.

We were setting ->container_member twice in ddf get_info.
Once to currentconf->vcnum,
once to atoi(st->subarray).

Both should be the same.
For consistency with super-intel, use the first.

Signed-off-by: NeilBrown <neilb@suse.de>
13 years ago  Fix warning about host-endian bitmaps.
NeilBrown [Tue, 30 Nov 2010 05:25:26 +0000 (16:25 +1100)] 
Fix warning about host-endian bitmaps.

Hostendian bitmaps should be warned about on all arch's.
And fix a speeling mistake.

Signed-off-by: NeilBrown <neilb@suse.de>
13 years ago  Grow: give useful message when adding bitmap gives EBUSY.
NeilBrown [Tue, 30 Nov 2010 05:34:25 +0000 (16:34 +1100)] 
Grow: give useful message when adding bitmap gives EBUSY.

If adding a bitmap fails with EBUSY, then it is because the array is
currently resyncing/recovering/reshaping.
As this is non-obvious, give a message explaining the fact.

Signed-off-by: NeilBrown <neilb@suse.de>
13 years ago  Assemble: add --update=no-bitmap
NeilBrown [Tue, 30 Nov 2010 05:46:01 +0000 (16:46 +1100)] 
Assemble: add --update=no-bitmap

This allows an array with a corrupt internal bitmap to be assembled
without the bitmap.

Signed-off-by: NeilBrown <neilb@suse.de>
13 years ago  Assemble: call remove_partitions later.
NeilBrown [Tue, 30 Nov 2010 05:56:01 +0000 (16:56 +1100)] 
Assemble: call remove_partitions later.

We shouldn't call remove_partitions until we have made a really firm
decision to include the device into the array.

Signed-off-by: NeilBrown <neilb@suse.de>
13 years ago  mdmon: don't copy an invalid chunk_size
NeilBrown [Tue, 30 Nov 2010 07:35:36 +0000 (18:35 +1100)] 
mdmon: don't copy an invalid chunk_size

As chunk_size in mdstat_ent is never set, we shouldn't copy
it into a->info.array.
In fact, it is safest to get rid of the field altogether.

Reported-by: "Kwolek, Adam" <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
13 years ago  ddf: fail creation of new subarray with same name as old.
NeilBrown [Tue, 30 Nov 2010 22:55:35 +0000 (09:55 +1100)] 
ddf: fail creation of new subarray with same name as old.

Signed-off-by: NeilBrown <neilb@suse.de>
13 years ago  Create: report failure if array cannot be started.
NeilBrown [Wed, 1 Dec 2010 00:03:28 +0000 (11:03 +1100)] 
Create: report failure if array cannot be started.

We weren't checking the result of writing 'active' to array_state

Signed-off-by: NeilBrown <neilb@suse.de>
13 years ago  Grow: disallow placing backup file on array being reshaped.
NeilBrown [Wed, 1 Dec 2010 00:58:32 +0000 (11:58 +1100)] 
Grow: disallow placing backup file on array being reshaped.

the tests here aren't perfect, but they could catch some cases.

Signed-off-by: NeilBrown <neilb@suse.de>
13 years ago  Create/grow: improve checks on number of devices.
NeilBrown [Wed, 1 Dec 2010 03:51:27 +0000 (14:51 +1100)] 
Create/grow: improve checks on number of devices.

The check on the upper limit of the number of devices was in the wrong place.
As a result, an array with more than 27 devices could not be created without
explicitly setting the metadata, even though the default metadata allows more.

Fixed, and also perform check when growing an array.

Signed-off-by: NeilBrown <neilb@suse.de>
13 years ago  error check reading of 'degraded' from sysfs.
NeilBrown [Thu, 20 Jan 2011 21:59:00 +0000 (08:59 +1100)] 
error check reading of 'degraded' from sysfs.

I've seen mdadm spinning while failing to read 'degraded'.
This doesn't really fix it, but is a reminder that it needs to be
fixed.

Signed-off-by: NeilBrown <neilb@suse.de>
13 years ago  FIX: Reset disk state if disk is missing
Krzysztof Wojcik [Thu, 10 Mar 2011 06:07:04 +0000 (17:07 +1100)] 
FIX: Reset disk state if disk is missing

If we can't read the actual disk state, it should be initialized
to 0.
Otherwise it may hold an out-of-date value, resulting in a false action
later in the code (e.g. setting the disk to an improper state).

Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
13 years ago  open_mddev: open RDONLY if RDWR doesn't work.
NeilBrown [Thu, 10 Mar 2011 06:07:04 +0000 (17:07 +1100)] 
open_mddev: open RDONLY if RDWR doesn't work.

If an array is read-only then "mdadm -S"
cannot open it to stop it without this fix.

Signed-off-by: NeilBrown <neilb@suse.de>
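
A minimal C sketch of the fallback this commit describes, using only the POSIX open() call. The function name open_rw_or_ro and the optional writable out-parameter are assumptions of this sketch; open_mddev in mdadm has its own signature and extra checks.

```c
#include <fcntl.h>
#include <stddef.h>

/* Try to open path read-write; if that fails for any reason, fall back
 * to read-only. Returns the file descriptor, or -1 if both opens fail.
 * If writable is not NULL it is set to 1 or 0 to say which open worked. */
int open_rw_or_ro(const char *path, int *writable)
{
    int fd = open(path, O_RDWR);

    if (fd >= 0) {
        if (writable != NULL)
            *writable = 1;
        return fd;
    }

    fd = open(path, O_RDONLY);
    if (fd >= 0 && writable != NULL)
        *writable = 0;
    return fd;
}
```

A read-only descriptor is enough for control operations such as stopping an array, which is why the fallback is acceptable here.
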
13 years ago  Initialise all of file when opening backup file for reshape.
NeilBrown [Thu, 10 Mar 2011 06:06:59 +0000 (17:06 +1100)] 
Initialise all of file when opening backup file for reshape.

Due to a miscalculation we didn't initialise the whole file.
There is 4K (8 sectors) for the metadata, then the data.

Signed-off-by: NeilBrown <neilb@suse.de>
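
The size arithmetic behind this fix can be sketched in C as below. It relies only on the figures in the message (4K, i.e. 8 sectors of 512 bytes, of metadata followed by the data); the macro and function names are hypothetical.

```c
#define SECTOR_BYTES            512ULL
#define BACKUP_METADATA_SECTORS 8ULL    /* 4K of metadata, per the commit */

/* Number of bytes that must be initialised in the backup file:
 * the metadata area first, then data_sectors of backed-up data. */
unsigned long long backup_init_bytes(unsigned long long data_sectors)
{
    return (BACKUP_METADATA_SECTORS + data_sectors) * SECTOR_BYTES;
}
```

For 2048 data sectors this gives (8 + 2048) * 512 = 1052672 bytes; counting only the data sectors would leave the last 4096 bytes uninitialised.
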
13 years ago  mdadm.man add encouragement to shrink filesystem before shrinking array.
NeilBrown [Tue, 15 Feb 2011 01:40:21 +0000 (12:40 +1100)] 
mdadm.man add encouragement to shrink filesystem before shrinking array.

Before resizing an array with --size or --array-size, the filesystem
should be resized.  mdadm cannot do this so the user should.

Reported-by: Gavin Flower <gavinflower@yahoo.com>
Signed-off-by: NeilBrown <neilb@suse.de>
13 years ago  Detail: report subarrays of a container properly.
NeilBrown [Wed, 9 Mar 2011 07:22:27 +0000 (18:22 +1100)] 
Detail: report subarrays of a container properly.

Due to the wrong variable being used, this part of --detail
wasn't working at all.

Signed-off-by: NeilBrown <neilb@suse.de>
13 years ago  dev_open should always open read-only.
NeilBrown [Thu, 10 Mar 2011 00:41:21 +0000 (11:41 +1100)] 
dev_open should always open read-only.

When opening an array to manipulate it we never need to write to the
array, and sometimes it might be read-only, so the open for write will
fail.
So always open read-only.

Reported-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
13 years ago  Man page updates for new --grow options.
NeilBrown [Thu, 10 Mar 2011 05:41:54 +0000 (16:41 +1100)] 
Man page updates for new --grow options.

Describe all the new ways that mdadm can reshape arrays.

Signed-off-by: NeilBrown <neilb@suse.de>
13 years ago  Grow: allow monitor thread to exit when there is nothing more to do.
NeilBrown [Thu, 10 Mar 2011 04:59:24 +0000 (15:59 +1100)] 
Grow: allow monitor thread to exit when there is nothing more to do.

When an array using native metadata is increasing in size, we don't
need to keep monitoring it after the initial 'critical section'.
So detect that case.
If a final level-change is still needed mdadm will wait for that,
otherwise it will simply exit.

Signed-off-by: NeilBrown <neilb@suse.de>
13 years ago  Grow: don't forget_backup when length of backup is zero.
NeilBrown [Thu, 10 Mar 2011 04:43:04 +0000 (15:43 +1100)] 
Grow: don't forget_backup when length of backup is zero.

This is just a waste of IO

Signed-off-by: NeilBrown <neilb@suse.de>
13 years ago  Grow: make sure 'info' doesn't have confusing data.
NeilBrown [Thu, 10 Mar 2011 04:36:07 +0000 (15:36 +1100)] 
Grow: make sure 'info' doesn't have confusing data.

We now test ->reshape_active, but don't set it in a common case.

So just zero out the whole structure to be on the safe side.

Signed-off-by: NeilBrown <neilb@suse.de>
13 years ago  Grow: support reshape of RAID0 arrays.
NeilBrown [Thu, 10 Mar 2011 04:05:23 +0000 (15:05 +1100)] 
Grow: support reshape of RAID0 arrays.

This is done via conversion to RAID4 and back.

To grow the array, extra devices will be needed which cannot
already be present as spares - so allow a list of new devices
to be included in a grow request which changes the number of devices.

Signed-off-by: NeilBrown <neilb@suse.de>
13 years agoGrow: Allow for component_size not being set for RAID0 arrays.
NeilBrown [Thu, 10 Mar 2011 04:00:38 +0000 (15:00 +1100)] 
Grow: Allow for component_size not being set for RAID0 arrays.

When a RAID0 is started using the SET_ARRAY_INFO ioctl, the
component_size will be zero.
This confused the code for reshaping a RAID0 via RAID4.

So if that seems to be the case, fake a believable component_size.

Signed-off-by: NeilBrown <neilb@suse.de>
13 years agoMake find_intel_hba_capability less verbose.
NeilBrown [Thu, 10 Mar 2011 03:53:30 +0000 (14:53 +1100)] 
Make find_intel_hba_capability less verbose.

mdadm has a convention in some areas of passing a device name
if error messages about it are interesting, or NULL if not.

Follow this convention with find_intel_hba_capability so that it
doesn't complain when not appropriate - and so that it doesn't
have to go and find a device name that it wasn't given.

Signed-off-by: NeilBrown <neilb@suse.de>
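
A minimal sketch of the "device name or NULL" convention described in
the commit above. The function report_failure() is hypothetical and is
not taken from mdadm; it only shows the shape of the convention.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: print an error only when the caller passed a
 * device name. Returns 1 if a message was printed, 0 if it stayed
 * silent because devname was NULL. */
int report_failure(const char *devname, const char *what)
{
    assert(what != NULL);
    if (devname == NULL)
        return 0;   /* caller is not interested in messages */
    fprintf(stderr, "mdadm: %s: %s\n", devname, what);
    return 1;
}
```

With this shape the callee never has to look up a device name it was
not given, which is the second benefit the commit message mentions.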
13 years agoplatform_intel: support for OROM OEM capabilities
Labun, Marcin [Thu, 10 Mar 2011 00:52:22 +0000 (11:52 +1100)] 
platform_intel: support for OROM OEM capabilities

Scan memory to match $VER and $OEM.

Signed-off-by: Marcin Labun <marcin.labun@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
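
Matching a signature such as $VER or $OEM in a memory region can be
sketched as a plain byte search. This is an assumption-laden
illustration: find_signature() is a hypothetical name, and the real
scan may use alignment or range restrictions that this log does not
describe.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return the offset of the first occurrence of the NUL-terminated
 * signature sig inside buf[0..len), or -1 if it is not present. */
long find_signature(const unsigned char *buf, size_t len, const char *sig)
{
    size_t siglen;

    assert(buf != NULL && sig != NULL);
    siglen = strlen(sig);
    if (siglen == 0 || siglen > len)
        return -1;
    for (size_t i = 0; i + siglen <= len; i++)
        if (memcmp(buf + i, sig, siglen) == 0)
            return (long)i;
    return -1;
}
```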
13 years agoimsm: introduce SAS controller support in imsm metadata handler
Labun, Marcin [Thu, 10 Mar 2011 00:52:15 +0000 (11:52 +1100)] 
imsm: introduce SAS controller support in imsm metadata handler

OROM/EFI capabilities are retrieved based on disk's controller type.
1/ alloc_super no longer retrieves OROM capabilities
2/ find_imsm_capability replaces find_imsm_orom
3/ new function find_intel_hba_capability gets disk's HBA and relevant
capability

Signed-off-by: Marcin Labun <marcin.labun@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
13 years agoimsm: move code for retrieving HBA to a function
Labun, Marcin [Thu, 10 Mar 2011 00:50:58 +0000 (11:50 +1100)] 
imsm: move code for retrieving HBA to a function

Function find_intel_hba_capability attaches HBA information
to intel_super structure based on fd of the component disk.

Signed-off-by: Marcin Labun <marcin.labun@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
13 years agoimsm: verify that component disks are attached to the same type of HBA
Labun, Marcin [Thu, 10 Mar 2011 00:50:57 +0000 (11:50 +1100)] 
imsm: verify that component disks are attached to the same type of HBA

compare_super_imsm verifies that the component disks use the same type of
HBA in a platform-dependent environment. Otherwise it prints an error
message and blocks the action.

Signed-off-by: Marcin Labun <marcin.labun@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
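
The "same type of HBA" rule from the commit above can be illustrated
with a small check over a list of controller types. The enum and the
function name are hypothetical; compare_super_imsm itself works on
superblocks and is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical controller types for the sketch. */
enum hba_type_sketch { HBA_SKETCH_AHCI, HBA_SKETCH_SAS };

/* Return 1 if every component disk sits on the same type of HBA,
 * 0 as soon as one disk differs from the first. */
int all_same_hba_type(const enum hba_type_sketch *types, size_t count)
{
    assert(types != NULL || count == 0);
    for (size_t i = 1; i < count; i++)
        if (types[i] != types[0])
            return 0;
    return 1;
}
```

A caller would print its error message and refuse the operation when
this returns 0.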
13 years agoimsm: add maximum number of disk validation in RAID array
Labun, Marcin [Thu, 10 Mar 2011 00:50:54 +0000 (11:50 +1100)] 
imsm: add maximum number of disk validation in RAID array

Arrays exceeding the OROM/EFI maximum number of supported disks are
blocked in the validate_geometry_imsm_orom function.

Signed-off-by: Marcin Labun <marcin.labun@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
13 years agoimsm: print-out error message when volume validation fails
Labun, Marcin [Thu, 10 Mar 2011 00:50:52 +0000 (11:50 +1100)] 
imsm: print-out error message when volume validation fails

Print an error message when the volume geometry fails to comply with
the OROM/EFI controller's capabilities.

Signed-off-by: Marcin Labun <marcin.labun@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
13 years agoimsm: do not publish OROM/EFI unsupported arrays
Labun, Marcin [Thu, 10 Mar 2011 00:50:49 +0000 (11:50 +1100)] 
imsm: do not publish OROM/EFI unsupported arrays

Container_content_imsm calls validate_geometry_imsm_orom to verify that
the array parameters are supported by the controller's OROM/EFI.

Signed-off-by: Marcin Labun <marcin.labun@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
13 years agoimsm: detail_platform_imsm displays AHCI and SAS controller information
Labun, Marcin [Thu, 10 Mar 2011 00:46:11 +0000 (11:46 +1100)] 
imsm: detail_platform_imsm displays AHCI and SAS controller information

The function uses find_intel_device and find_imsm_capability to present
AHCI and SAS controller capabilities taken from OROM or EFI.

Signed-off-by: Marcin Labun <marcin.labun@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
13 years agoimsm: remove unused parameters in function attach_hba_to_super
Labun, Marcin [Thu, 10 Mar 2011 00:45:49 +0000 (11:45 +1100)] 
imsm: remove unused parameters in function attach_hba_to_super

Signed-off-by: Marcin Labun <marcin.labun@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
13 years agoread platform capabilities from EFI
Labun, Marcin [Thu, 10 Mar 2011 00:45:35 +0000 (11:45 +1100)] 
read platform capabilities from EFI

If the operating system is installed using EFI, IMSM platform
capabilities are not available via the option ROM, but are stored as EFI
variables. A new mechanism has been introduced to obtain the
capabilities from these variables.

Signed-off-by: Przemyslaw Czarnowski <przemyslaw.hawrylewicz.czarnowski@intel.com>
Signed-off-by: Marcin Labun <marcin.labun@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
13 years agoSome guid manipulation utilities has been added.
Labun, Marcin [Thu, 10 Mar 2011 00:45:15 +0000 (11:45 +1100)] 
Some guid manipulation utilities has been added.

They will be used for reading EFI variables with capabilities.

Signed-off-by: Przemyslaw Czarnowski <przemyslaw.hawrylewicz.czarnowski@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
13 years agoupdate of imsm_orom structure
Labun, Marcin [Thu, 10 Mar 2011 00:45:00 +0000 (11:45 +1100)] 
update of imsm_orom structure

The structure is updated according to the current specification. These
values are not used right now, but they are not "reserved" anymore.

Signed-off-by: Przemyslaw Czarnowski <przemyslaw.hawrylewicz.czarnowski@intel.com>
Signed-off-by: Marcin Labun <marcin.labun@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
13 years agoPlatform-intel: support for OROM SAS and AHCI controller
Labun, Marcin [Thu, 10 Mar 2011 00:44:21 +0000 (11:44 +1100)] 
Platform-intel: support for OROM SAS and AHCI controller

Signed-off-by: Marcin Labun <marcin.labun@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
13 years agoprobe_roms: allow to probe expansion ROMs using vendor and device id.
Labun, Marcin [Thu, 10 Mar 2011 00:41:46 +0000 (11:41 +1100)] 
probe_roms: allow to probe expansion ROMs using vendor and device id.

Adds data offset to PCI expansion ROM Data Structure in resource
describing Expansion ROMs. This allows AHCI OROM scanning function
to identify AHCI OROM by device id 0x2822 and vendor id 0x8086.

Signed-off-by: Marcin Labun <marcin.labun@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
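
Identifying an expansion ROM by vendor and device id, as in the commit
above, can be sketched from the standard PCI expansion ROM layout: the
image starts with 0x55 0xAA, the 16-bit little-endian value at offset
0x18 points to the PCI Data Structure, and that structure begins with
"PCIR" followed by the vendor id and the device id. The function name
rom_has_ids() is hypothetical and this is not the probe_roms code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a 16-bit little-endian value. */
static uint16_t le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Return 1 if the ROM image carries the given vendor/device ids. */
int rom_has_ids(const uint8_t *rom, size_t len,
                uint16_t vendor, uint16_t device)
{
    size_t pcir;

    assert(rom != NULL);
    if (len < 0x1a || rom[0] != 0x55 || rom[1] != 0xaa)
        return 0;                       /* no expansion ROM header */
    pcir = le16(rom + 0x18);            /* offset of the data structure */
    if (pcir + 8 > len || memcmp(rom + pcir, "PCIR", 4) != 0)
        return 0;
    return le16(rom + pcir + 4) == vendor &&
           le16(rom + pcir + 6) == device;
}
```

For the case named in the commit message the call would be
rom_has_ids(rom, len, 0x8086, 0x2822).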
13 years agoimsm : FIX: Assemble dirty array when reshape is in progress
Adam Kwolek [Thu, 10 Mar 2011 00:41:33 +0000 (11:41 +1100)] 
imsm : FIX: Assemble dirty array when reshape is in progress

During reshape of dirty volumes, reshape_progress also has to be
calculated.  To keep the same logic as for array creation (not setting
info->resync_start = MaxSector when the first condition is true),
resync_start is initialized to MaxSector to allow proper array
initialization.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
13 years agoFIX: Set 'active' array state before array configuration
Adam Kwolek [Thu, 10 Mar 2011 00:41:28 +0000 (11:41 +1100)] 
FIX: Set 'active' array state before array configuration

When an array in a container that is not being reshaped is assembled, it
is in the auto-read-only state.  It is not possible to set a disk slot
for such an array, and a reshape cannot be started later either.  To
move the array from the 'auto-read-only' to the 'active' state, storing
'active' to sysfs is added. This allows disk configuration and reshape.

When a reshaped array is restarted, this is disabled by a condition on
the restart variable.

When a reshape is starting, storing the 'active' state to an already
active array should not matter.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
13 years agodev_open should always open read-only.
NeilBrown [Thu, 10 Mar 2011 00:41:21 +0000 (11:41 +1100)] 
dev_open should always open read-only.

When opening an array to manipulate it, we never need to write to the
array, and sometimes it might be read-only, in which case the open for
write will fail.
So always open read-only.

Reported-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
13 years agoRemove incorrect use of open_dev
NeilBrown [Thu, 10 Mar 2011 00:36:47 +0000 (11:36 +1100)] 
Remove incorrect use of open_dev

open_dev can only be used for md array.  To open an
arbitrary device, dev_open must be used.

Signed-off-by: NeilBrown <neilb@suse.de>
13 years agoGrow: make sure mdmon is running for Grow_continue arrays.
NeilBrown [Thu, 10 Mar 2011 00:36:47 +0000 (11:36 +1100)] 
Grow: make sure mdmon is running for Grow_continue arrays.

When starting an array that is in the middle of a migration,
we need to start mdmon, just as we do for arrays which are not
in the middle of a migration.

Reported-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
13 years agoFIX: Make expansion counter usable
Adam Kwolek [Wed, 9 Mar 2011 22:58:35 +0000 (09:58 +1100)] 
FIX: Make expansion counter usable

Currently the whole array geometry is set in sysfs_set_array(),
so none of the disks (even those for expansion) should fail during
sysfs_add_disk().
Because of this, the expansion counter should be used for a reshaped
array when the disk slot is bigger than the number of disks in the array.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
13 years agoFIX: Block reshaped array monitoring
Adam Kwolek [Wed, 9 Mar 2011 22:57:39 +0000 (09:57 +1100)] 
FIX: Block reshaped array monitoring

When an array under reshape is assembled, monitoring of it has to be
blocked as soon as possible. It can happen that this is, e.g., the
second array in a container and mdmon is already loaded.
Without blocking, monitoring can change the array state to active,
and reshape continuation will not be possible.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
13 years agoFIX: Load container content for container reshape continuation
Adam Kwolek [Wed, 9 Mar 2011 22:54:56 +0000 (09:54 +1100)] 
FIX: Load container content for container reshape continuation

st->sb is NULL, and this causes an exception.
The reshape_container() function expects the superblock to be loaded.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
13 years agoGrow: don't let analyse_change see new level from metadata.
NeilBrown [Wed, 9 Mar 2011 07:53:09 +0000 (18:53 +1100)] 
Grow: don't let analyse_change see new level from metadata.

This is a bit of a hack - probably analyse_change needs to be
re-written a bit to handle this properly.

However when the metadata deduced the intermediate state for a
reshaping array, the 'new_level' it sets should not be used to
interpret the 'delta_disks' number.
So in that case, hide the new_level while calling analyse_change.

Signed-off-by: NeilBrown <neilb@suse.de>
13 years agoGrow: don't try to use 'raid_disks' value for a container.
NeilBrown [Wed, 9 Mar 2011 07:50:59 +0000 (18:50 +1100)] 
Grow: don't try to use 'raid_disks' value for a container.

The 'raid_disks' for a container is zero, so subtracting it
from the given raid_disks to get delta_disks doesn't make sense.

Rather set delta_disks to UnSet and set raid_disks to the requested
number of disks.   This then gets passed to reshape_super() which
can use it as required.

Signed-off-by: NeilBrown <neilb@suse.de>
13 years agoGrow: only check 'native format' when really needed.
NeilBrown [Wed, 9 Mar 2011 07:47:24 +0000 (18:47 +1100)] 
Grow: only check 'native format' when really needed.

The check that the array info is already in 'native format' is
only relevant when restarting a growth, so only perform it then.

Signed-off-by: NeilBrown <neilb@suse.de>
13 years agoFIX: Check correct 'old' level to see if reshape is needed.
Adam Kwolek [Tue, 8 Mar 2011 12:24:55 +0000 (13:24 +0100)] 
FIX: Check correct 'old' level to see if reshape is needed.

Normally when reshape_array is called with restart == 0,
info->array is the same as the 'array' read from the kernel
(via ioctl) so both have the same level.

However when called from reshape_container, info->array was
generated by the metadata so it will have 'level' set to the
intermediate (or final) level already.

So to test if we need to change the level, we need to compare the
desired level with that which was loaded from the kernel (array.level)
rather than that which was read from metadata (info->array.level).

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
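
The comparison described in the commit above can be illustrated with a
small function. The name level_change_needed() and its parameters are
hypothetical; the point is only which of the two levels is compared.

```c
#include <assert.h>

/* Decide whether the array level has to be changed.
 * kernel_level:   level read from the kernel (array.level)
 * metadata_level: level from metadata (info->array.level), which may
 *                 already hold the intermediate or final level
 * desired_level:  level the reshape needs */
int level_change_needed(int kernel_level, int metadata_level,
                        int desired_level)
{
    (void)metadata_level;   /* deliberately not used for the test */
    assert(desired_level >= 0);
    return kernel_level != desired_level;
}
```

For example, with a kernel level of 0, a metadata level of 4 and a
desired level of 4, comparing against the metadata value would wrongly
conclude that nothing needs to change.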
13 years agoGrow: add check that there are enough devices.
NeilBrown [Wed, 9 Mar 2011 07:37:00 +0000 (18:37 +1100)] 
Grow: add check that there are enough devices.

The check for 'enough spares' doesn't apply to RAID0 as we don't
mind it going degraded.  But add a test that there are enough spares
to actually produce a working array.

Signed-off-by: NeilBrown <neilb@suse.de>