Dan Williams [Wed, 30 Jul 2008 02:01:06 +0000 (19:01 -0700)]
mdmon: ignore inactive arrays and other manage_new() cleanups
While mdadm is constructing an array mdmon may see an intermediate state
(some disks not yet added / redundancy attributes like sync_action not
available). Waiting for mdstat->active == true ensures that the array
is ready to be handled. This fixes a bug when creating an array via an mdmon
update, whereby failures were not detected in the new array.
Introduce aa_ready() to catch cases where the active_array is not
correctly initialized. Barring a kernel bug this should never trigger,
nonetheless it precludes a class of bugs like the one mentioned above
from triggering.
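A minimal sketch of the kind of readiness check described above. The struct and its fields are hypothetical stand-ins, not mdmon's real active_array layout; the only idea taken from the message is "refuse to monitor an array whose sysfs handles are not all open yet".

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for mdmon's per-array state. */
struct active_array_sketch {
    int level;      /* RAID level; > 0 is assumed to mean "has redundancy" */
    int state_fd;   /* fd for array_state, -1 if not opened yet */
    int action_fd;  /* fd for sync_action, -1 if not opened yet */
};

/* Return true only when every handle the monitor will need is open. */
static bool aa_ready_sketch(const struct active_array_sketch *aa)
{
    if (aa == NULL)
        return false;
    if (aa->state_fd < 0)
        return false;
    /* sync_action only exists for redundant levels (assumption). */
    if (aa->level > 0 && aa->action_fd < 0)
        return false;
    return true;
}
```

A caller would run such a check just before inserting the new array into container->arrays and discard the array if it fails.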
Clean up the exit paths and only call replace_array when the new array is
ready to be inserted into container->arrays.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Dan Williams [Fri, 25 Jul 2008 23:59:47 +0000 (16:59 -0700)]
imsm: refactor mpb handling into parse and coalesce
Maintaining a single global buffer is unwieldy when extending/rewriting
sections of the metadata. Parse the metadata into component data
structures upon reading and coalesce to a coherent buffer before
writing.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Dan Williams [Fri, 25 Jul 2008 00:26:24 +0000 (17:26 -0700)]
sysfs: deprecate sysfs_disk_to_sg
The cmd_filter patch merged for 2.6.27 broke retrieving the serial
number via an ioctl to /dev/sgN. In debugging this I found that other
utilities like sdparm simply run the ioctl on /dev/sdX. So just convert
to that for safety in numbers, but scream on the mailing list for
the inconvenience grr...
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Neil Brown [Fri, 18 Jul 2008 06:37:09 +0000 (16:37 +1000)]
Stop managed arrays more carefully.
If an array is being managed by mdmon, then just
write "inactive" to stop it, and let mdmon do the
final "clear". This makes sure mdmon has a chance
to read the final state and update the metadata properly.
After writing "inactive" we use "ping_monitor" to synchronise
with mdmon, then STOP the array just in case it is still running,
else we will get into an infinite loop in "mdadm -Ss".
When a 'ping' (empty message) is sent to mdmon, we wait for
'monitor' to do a full loop to make sure it has caught up
with anything that needs doing.
This allows synchronisation between mdadm and mdmon.
Maybe monitor should signal managemon rather than managemon polling...
Dan Williams [Tue, 24 Jun 2008 13:16:44 +0000 (06:16 -0700)]
imsm: remove extra superswitches
Following the lead of 75ede16d. This incidentally fixes creation of a second
array by gating call to getinfo_super_imsm_volume with a valid ->current_vol.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Neil Brown [Sat, 12 Jul 2008 10:28:38 +0000 (20:28 +1000)]
Print useful message in place of "default metadata" message.
When creating an array in a container, print e.g.
Creating array inside ddf container /dev/whatever
rather than
Defaulting to version /md127/1 metadata
Neil Brown [Sat, 12 Jul 2008 10:28:33 +0000 (20:28 +1000)]
Use O_DIRECT for all IO to devices.
Using buffered IO risks non-atomic updates to parts of the
device that we don't actually want to write to. This isn't in
general safe.
So switch to O_DIRECT for all that IO and make sure we have
properly aligned buffers.
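A rough sketch of what "O_DIRECT with properly aligned buffers" involves on Linux. The helper names and the fixed 512-byte alignment are assumptions for illustration only; real code should use the device's logical sector size, and this is not the code from the patch.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE          /* exposes O_DIRECT with glibc */
#endif
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define SKETCH_ALIGN 512     /* assumed alignment, not queried from the device */

/* Round a length up to the next multiple of SKETCH_ALIGN. */
static size_t round_up_sketch(size_t len)
{
    return (len + SKETCH_ALIGN - 1) / SKETCH_ALIGN * SKETCH_ALIGN;
}

/* Allocate a zeroed buffer whose address and size suit O_DIRECT. */
static void *alloc_aligned_sketch(size_t len)
{
    void *buf = NULL;
    size_t sz = round_up_sketch(len ? len : 1);

    if (posix_memalign(&buf, SKETCH_ALIGN, sz) != 0)
        return NULL;
    memset(buf, 0, sz);
    return buf;
}

/* Read len bytes at offset, bypassing the page cache.
 * buf, len and offset must all be multiples of SKETCH_ALIGN. */
static ssize_t read_direct_sketch(const char *path, off_t offset,
                                  void *buf, size_t len)
{
    ssize_t n;
    int fd;

    assert((uintptr_t)buf % SKETCH_ALIGN == 0);
    fd = open(path, O_RDONLY | O_DIRECT);
    if (fd < 0)
        return -1;
    n = pread(fd, buf, len, offset);
    close(fd);
    return n;
}
```

Because the transfer size is rounded up to whole sectors, a write through such a path touches exactly the sectors requested rather than whatever page-sized region the buffer cache would have written back.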
Neil Brown [Sat, 12 Jul 2008 10:27:42 +0000 (20:27 +1000)]
Improve shutdown for container-based arrays.
1/ close a race where multiple arrays disappear at once
and monitor isn't woken up to find out that the last one
has gone.
2/ "mdadm -Ss" needs to pause briefly for mdmon to exit.
Neil Brown [Sat, 12 Jul 2008 10:27:40 +0000 (20:27 +1000)]
Remove mon_pipe for communicating from monitor to manager
The returned value was never used, and we don't really want
this return path anyway as writing to a pipe could conceivably
block, and the monitor must not block.
Neil Brown [Sat, 12 Jul 2008 10:27:40 +0000 (20:27 +1000)]
Handle device removal from container
This really should be done in mdadm, not mdmon.
We ensure the device won't be suddenly committed as a hot-spare
using O_EXCL, then check the 'holders' sysfs directory
to make sure it is only in use once.
Neil Brown [Sat, 12 Jul 2008 10:27:38 +0000 (20:27 +1000)]
Add subarray field to supertype.
When loading the metadata for a subarray (super_by_fd), we set
->subarray to be the name read from md/metadata_version so that
getinfo_super can return info about the correct array.
With this we can differentiate between a container and
an array within the container by looking at ->subarray[0].
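To illustrate the ->subarray[0] test, here is a small sketch. The struct is a hypothetical cut-down stand-in for the supertype, and the buffer size is an assumption.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical, reduced version of a supertype. */
struct supertype_sketch {
    int  minor_version;
    char subarray[32];   /* "" means the container itself */
};

/* Record which subarray this handle refers to (NULL or "" for none). */
static void set_subarray_sketch(struct supertype_sketch *st, const char *name)
{
    snprintf(st->subarray, sizeof(st->subarray), "%s", name ? name : "");
}

/* An empty subarray name identifies the container. */
static bool is_container_sketch(const struct supertype_sketch *st)
{
    return st->subarray[0] == '\0';
}
```

With this in place a getinfo-style routine can branch on is_container_sketch(st) and report either container-wide information or the details of the named member array.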
Neil Brown [Sat, 12 Jul 2008 10:27:38 +0000 (20:27 +1000)]
Hide subordinate superswitch structures.
Only one superswitch should be externally visible for each
general type. Others which handle different flavours
(e.g. container/data-array) should be internal only.
Neil Brown [Sat, 12 Jul 2008 10:27:36 +0000 (20:27 +1000)]
Fix write_init_super usage when hot-adding a spare
Using write_init_super to add a spare to an active array is quite
different to how it is used when creating an array.
It mostly works, but if we are adding two devices to an array,
then when we add the second, there are still traces of the first
which confuse write_init_super.
So get write_init_super to ignore those traces. Longer term, we
probably want to do this differently, since for DDF hot-adding to
an active array will have to be quite different - it will want to
write to all metadata, possibly via mdmon.
Neil Brown [Thu, 10 Jul 2008 22:50:06 +0000 (08:50 +1000)]
Always assume_clean for raid0, linear, multipath, faulty
For arrays that don't have redundancy (raid0, linear etc), the
clean/dirty distinction doesn't mean anything. So always
'assume clean' for these arrays.
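The rule can be sketched as below. The enum is hypothetical and does not use mdadm's real level constants; it only captures "levels without redundancy are always treated as clean".

```c
#include <stdbool.h>

/* Hypothetical level names; values are NOT mdadm's real constants. */
enum level_sketch {
    LVL_FAULTY, LVL_MULTIPATH, LVL_LINEAR, LVL_RAID0,
    LVL_RAID1, LVL_RAID4, LVL_RAID5, LVL_RAID6, LVL_RAID10
};

/* Levels with no redundancy have nothing to resync. */
static bool has_redundancy_sketch(enum level_sketch level)
{
    switch (level) {
    case LVL_FAULTY:
    case LVL_MULTIPATH:
    case LVL_LINEAR:
    case LVL_RAID0:
        return false;
    default:
        return true;
    }
}

/* Honour the user's request, but force "clean" when there is no redundancy. */
static bool effective_assume_clean_sketch(enum level_sketch level,
                                          bool user_assume_clean)
{
    return user_assume_clean || !has_redundancy_sketch(level);
}
```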
Neil Brown [Thu, 19 Jun 2008 06:30:36 +0000 (16:30 +1000)]
Fix an error when assembling arrays that are in the middle of a reshape.
It is important that dup_super always returns an 'st' with the same
->ss and ->minor_version as the st that was passed.
This wasn't happening for 0.91 metadata (i.e. in the middle of a reshape).
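A sketch of the invariant stated above, using hypothetical reduced structs rather than mdadm's real ones: the duplicate carries over both ->ss and ->minor_version and starts with no metadata loaded.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for the superswitch and supertype. */
struct superswitch_dup_sketch {
    const char *name;
};

struct supertype_dup_sketch {
    const struct superswitch_dup_sketch *ss;
    int   minor_version;   /* e.g. 90 normally, 91 while reshaping */
    void *sb;              /* loaded metadata, not copied */
};

/* Duplicate the handle, preserving handler and minor version.
 * The caller frees the result with free(). */
static struct supertype_dup_sketch *
dup_super_sketch(const struct supertype_dup_sketch *orig)
{
    struct supertype_dup_sketch *st;

    if (orig == NULL)
        return NULL;
    st = calloc(1, sizeof(*st));
    if (st == NULL)
        return NULL;
    st->ss = orig->ss;
    st->minor_version = orig->minor_version;  /* must not be reset to a default */
    st->sb = NULL;
    return st;
}
```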