Neil Brown [Thu, 19 Jun 2008 06:30:36 +0000 (16:30 +1000)]
Fix an error when assembling arrays that are in the middle of a reshape.
It is important that dup_super always returns an 'st' with the same
->ss and ->minor_version as the st that was passed.
This wasn't happening for 0.91 metadata (i.e. in the middle of a reshape).
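The invariant described above can be sketched as follows. This is a minimal illustration, not the actual mdadm code: the struct layouts are cut-down stand-ins, dup_super_sketch is a hypothetical name, and storing the 0.91 minor version as the integer 91 is an assumption.

```c
#include <stdlib.h>

/* Hypothetical, cut-down stand-ins for the real types. */
struct superswitch {
    const char *name;
};

struct supertype {
    struct superswitch *ss;   /* metadata handler */
    int minor_version;        /* assumed: 91 while a reshape is in progress */
    void *sb;                 /* loaded superblock; not shared with the copy */
};

/* Return a fresh supertype with the same ->ss and ->minor_version as
 * 'orig', or NULL on NULL input or allocation failure. */
struct supertype *dup_super_sketch(const struct supertype *orig)
{
    struct supertype *st;

    if (orig == NULL)
        return NULL;
    st = calloc(1, sizeof(*st));
    if (st == NULL)
        return NULL;
    st->ss = orig->ss;
    /* Copy the version verbatim rather than resetting it to a default. */
    st->minor_version = orig->minor_version;
    return st;
}
```

Copying the version verbatim, instead of deriving it from the handler's default, is what keeps a mid-reshape array from being treated as a different version.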
Neil Brown [Thu, 12 Jun 2008 00:13:23 +0000 (10:13 +1000)]
Allow passing metadata update to the monitor.
Code in manager can now just call queue_metadata_update with a
(freeable) buf holding the update, and it will get passed to the
monitor and written out.
Neil Brown [Tue, 27 May 2008 07:23:16 +0000 (17:23 +1000)]
Avoid NULL reference calling free_super and elsewhere.
Since we made free_super a superswitch call, we need to be careful
that st is non-NULL before calling st->ss->free_super(st).
Also when updating byteorder there is a chance of a similar NULL
deref.
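A guard of the kind described above might look like this. It is only a sketch: the struct definitions are simplified stand-ins, and free_super_checked and demo_free_super are hypothetical names introduced for illustration.

```c
#include <stddef.h>

struct supertype;

/* Simplified stand-in for the handler table. */
struct superswitch {
    void (*free_super)(struct supertype *st);
};

struct supertype {
    struct superswitch *ss;
    void *sb;
};

/* Call the handler's free_super only when there is something to call.
 * Returns 1 if the handler ran, 0 if the call was skipped. */
int free_super_checked(struct supertype *st)
{
    if (st == NULL || st->ss == NULL || st->ss->free_super == NULL)
        return 0;
    st->ss->free_super(st);
    return 1;
}

/* Example handler used only to exercise the guard. */
int demo_free_calls;

void demo_free_super(struct supertype *st)
{
    st->sb = NULL;
    demo_free_calls++;
}
```

Call sites that may hold a NULL st can then use free_super_checked(st) instead of dereferencing st->ss directly.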
Neil Brown [Mon, 26 May 2008 23:18:56 +0000 (09:18 +1000)]
Discard st->container_member
'container_member' isn't really a well defined concept.
Each metadata might enumerate members differently, so just
let each metadata handler format /mdX/YYYY as appropriate.
Neil Brown [Mon, 26 May 2008 23:18:55 +0000 (09:18 +1000)]
Remove st->text_version in favour of info->text_version
I want the metadata handler to have more control over the 'version',
particularly for arrays which are members of containers.
So discard st->text_version and instead use info->text_version
which getinfo_super can initialise.
Neil Brown [Mon, 26 May 2008 23:18:53 +0000 (09:18 +1000)]
Discard get_sync_pos. We should be using get_resync_start.
"sync_complete" just tracks the current resync/recover/check/whatever pass.
"resync_start" tracks which parts of the array are known to be in-sync
(modulo active writes). So it is what we need to use to update the metadata.
Also we cannot call it when the array has stopped, as the value is no longer
available then. We must call it when the resync completes.
Possibly also call it periodically if the array is quiescent.
Neil Brown [Mon, 26 May 2008 23:18:38 +0000 (09:18 +1000)]
Implement mark_clean for ddf and remove mark_dirty and mark_sync
mark_dirty is just a special case of mark_clean - with sync_pos == 0.
mark_sync is not required. We don't modify the metadata when sync
finishes, only when the array becomes non-writeable, at which point we
use mark_clean to record how far the resync progressed.
Dan Williams [Thu, 15 May 2008 06:48:54 +0000 (16:48 +1000)]
add infrastructure to receive higher order commands, like remove_device
From: Dan Williams <dan.j.williams@intel.com>
Each md_message encapsulates a single command. A command includes an 'action'
member which describes what data, if any, comes after the action. Communication
with the monitor involves updating the active_cmd pointer and then writing to
mgr_pipe. Pass/fail status is returned via mon_pipe.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Dan Williams [Thu, 15 May 2008 06:48:49 +0000 (16:48 +1000)]
handle disk failures
From: Dan Williams <dan.j.williams@intel.com>
Added curr_state as a parameter to set_disk. Handlers look at this to
record component failures, and set global 'degraded' or 'failed'
status.
When reading the state as faulty:
1/ mark the disk failed in the metadata
2/ write '-blocked' to the rdev state to allow the kernel's failure
mechanism to advance
3/ the kernel will take away the drive's role in remove_and_add_spares()
4/ once the disk no longer has a role, writing 'remove' to the rdev state
will get the disk out of the array.
There is a window after writing '-blocked' where the kernel will return
-EBUSY to remove requests. We rely on the fact that the disk will
continue to show faulty so we lazily wait until the kernel is ready to
remove the disk. If the manager thread needs to get the disk out of the
way it can ping the monitor and wait, just like the replace_array()
case.
[buglet fix: swap the parameters of attr_match in read_dev_state]
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
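Steps 2 and 4 of the removal sequence in the commit above, together with the lazy wait on -EBUSY, can be sketched roughly as below. This assumes the caller already holds a descriptor open on the rdev 'state' attribute; write_state, unblock_disk, try_remove_disk and the result enum are hypothetical names, not mdadm functions.

```c
#include <errno.h>
#include <string.h>
#include <unistd.h>

enum remove_result { REMOVED, RETRY_LATER, FAILED };

/* Write one keyword to an already-open 'state' descriptor.
 * Returns 0 on success or a negative errno value. */
static int write_state(int fd, const char *val)
{
    size_t len = strlen(val);
    ssize_t n = write(fd, val, len);

    if (n < 0)
        return -errno;
    return (size_t)n == len ? 0 : -EIO;
}

/* Step 2: let the kernel's failure handling advance. */
int unblock_disk(int state_fd)
{
    return write_state(state_fd, "-blocked");
}

/* Step 4: ask the kernel to drop the disk. -EBUSY means the disk still
 * has a role, so the caller simply tries again on a later pass. */
enum remove_result try_remove_disk(int state_fd)
{
    int rv = write_state(state_fd, "remove");

    if (rv == 0)
        return REMOVED;
    if (rv == -EBUSY)
        return RETRY_LATER;
    return FAILED;
}
```

Because the disk keeps reporting faulty, a monitor loop can call try_remove_disk each time it sees that state and treat RETRY_LATER as a non-event.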
Neil Brown [Thu, 15 May 2008 06:48:12 +0000 (16:48 +1000)]
Change write_init_super to be called only once.
The current model for creating arrays involves writing
a superblock to each device in the array.
With containers (as with DDF), that model doesn't work.
Every device in the container may need to be updated
for an array made from just some of the devices in a container.
So instead of calling write_init_super for each device,
we call it once for the array and have it iterate over
all the devices in the array.
To help with this, ->add_to_super now passes in an 'fd' and name for
the device. These get saved for use by write_init_super. So
add_to_super takes ownership of the fd, and write_init_super will
close it.
This information is stored in the new 'info' field of supertype.
As part of this, write_init_super now removes any old traces of raid
metadata rather than doing this in common code.
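The ownership hand-off described above could look something like this. It is a sketch under assumptions: struct devinfo, the list layout, and the _sketch function names are invented for illustration, and the actual metadata write is left as a comment.

```c
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical per-device record kept in the supertype's 'info' field. */
struct devinfo {
    int fd;
    char *name;
    struct devinfo *next;
};

struct supertype {
    struct devinfo *info;
};

/* Remember the device; from here on the supertype owns 'fd'. */
int add_to_super_sketch(struct supertype *st, int fd, const char *name)
{
    size_t len = strlen(name) + 1;
    struct devinfo *di = malloc(sizeof(*di));

    if (di == NULL)
        return -1;
    di->name = malloc(len);
    if (di->name == NULL) {
        free(di);
        return -1;
    }
    memcpy(di->name, name, len);
    di->fd = fd;
    di->next = st->info;
    st->info = di;
    return 0;
}

/* Called once per array: visit every saved device, then release it.
 * Returns the number of devices processed. */
int write_init_super_sketch(struct supertype *st)
{
    int count = 0;

    while (st->info != NULL) {
        struct devinfo *di = st->info;

        st->info = di->next;
        /* Real code would clear old metadata and write the new
         * superblock to di->fd here. */
        close(di->fd);
        free(di->name);
        free(di);
        count++;
    }
    return count;
}
```

Callers must not close a descriptor after passing it to add_to_super_sketch, since write_init_super_sketch closes it.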
Neil Brown [Thu, 15 May 2008 06:48:08 +0000 (16:48 +1000)]
Reduce opening of dev in create.
Now that validate_geometry opens and checks the device,
we don't need to do it as much in top level Create.
We only need it to check for old array or filesystem info.
So only open the device at that place.