NeilBrown [Thu, 4 Jun 2009 02:44:32 +0000 (12:44 +1000)]
Examine: fix --examine --brief --verbose on containers.
With --verbose, --examine --brief prints dev= information after
the personality has done its bit.
But with containers, the member arrays are printed in between.
So in super-ddf and super-intel, move printing of the member
arrays to before printing of the container. This avoids
confusion.
NeilBrown [Thu, 4 Jun 2009 02:29:21 +0000 (12:29 +1000)]
super-intel: fix test on failed_disk_num.
We sometimes set failed_disk_num to ~0.
However we cannot test for equality with that as failed_disk_num
is 8bit and ~0 is probably 32bit with lots of 1's.
So test if ~failed_disk_num is 0 instead.
Reported-By: "Mr. James W. Laferriere" <babydr@baby-dragons.com>
Signed-off-by: NeilBrown <neilb@suse.de>
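The width mismatch described above can be reproduced in isolation. This is a minimal sketch, not the code from super-intel.c: the helper names are hypothetical and uint8_t stands in for the 8-bit field. Note that ~ also promotes an 8-bit operand to int, so the sketch converts the complement back to 8 bits before testing for 0; that cast is part of the sketch, not something stated above.

```c
#include <stdint.h>

/* Hypothetical helper: naive test that compares an 8-bit value with ~0.
 * The field is promoted to int (at most 255) while ~0 is the int -1,
 * so this comparison can never be true. */
int is_unset_naive(uint8_t failed_disk_num)
{
    return failed_disk_num == ~0;
}

/* Hypothetical helper: complement, then narrow back to 8 bits. */
int is_unset(uint8_t failed_disk_num)
{
    return (uint8_t)~failed_disk_num == 0;
}
```

A compiler may warn that the naive comparison is always false, which is exactly the problem being illustrated.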
NeilBrown [Tue, 2 Jun 2009 04:35:44 +0000 (14:35 +1000)]
Monitor: reduce default poll frequency if mdstat is pollable.
Since 2.6.16, mdstat responds to select/poll.
So in that case, increase the default poll interval to about 15
minutes.
This ensures that the background load is insignificant.
NeilBrown [Tue, 2 Jun 2009 04:24:58 +0000 (14:24 +1000)]
Monitor: don't get confused if utime is never set.
Externally managed arrays do not (currently) cause utime in
GET_ARRAY_INFO to be updated. So if it is zero, just assume the
current time.
This will cause GET_DISK_INFO to be called more often, but as we do
the scan only every 60 seconds normally, a few extra syscalls aren't
going to make a big difference.
Dan Williams [Mon, 18 May 2009 17:02:58 +0000 (10:02 -0700)]
imsm: kill "auto=" in brief_examine_super_imsm
The auto parameter is obsolete after kernel version 2.6.28 as all arrays
are partitionable via block device extended minor support. Environments
that require the mdp style of array can always edit the configuration
file to specify auto=mdp.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Dan Williams [Mon, 18 May 2009 16:58:55 +0000 (09:58 -0700)]
imsm: fix num_domains
The 'num_domains' field simply identifies the number of mirrors. So it
is 2 for a 2-disk raid1 or a 4-disk raid10. The orom does not currently
support more than 2 mirrors, but a three disk raid1 for example would
increase num_domains to 3.
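The rule described above can be written as a small function. This is a sketch only: num_domains_for is a hypothetical name, the raid10 value assumes the two-mirror layout mentioned above, and returning 1 for non-mirrored levels is an assumption.

```c
/* Hypothetical helper: number of mirror copies for a given md level. */
int num_domains_for(int level, int raid_disks)
{
    if (level == 1)
        return raid_disks;   /* every disk of a raid1 is a mirror */
    if (level == 10)
        return 2;            /* the only raid10 layout considered here */
    return 1;                /* assumption: non-mirrored levels report 1 */
}
```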
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
NeilBrown [Mon, 11 May 2009 05:58:44 +0000 (15:58 +1000)]
create_mddev: don't replace /dev/mdX with /dev/md/X
If someone creates/assembles an array called "/dev/md0", don't force
it to be "/dev/md/0". Doing so isn't really necessary and is
likely to confuse people.
NeilBrown [Mon, 11 May 2009 05:58:42 +0000 (15:58 +1000)]
mapfile - when rebuilding, choose an appropriate name if none is found.
When rebuilding the mapfile (mdadm -Ir), if no appropriate name is
found in /dev/md/, try to find an appropriate name, either by looking
in mdadm.conf or by using the name in the metadata.
NeilBrown [Mon, 11 May 2009 05:47:10 +0000 (15:47 +1000)]
Fix printf compile warning.
It always pays to cast big things to (unsigned long long) before
printing as %llu - it seems there will always be one arch which
has something to complain about ....
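A minimal sketch of the cast in question. The helper name format_sectors is hypothetical; the point is only that a 64-bit typedef may be 'unsigned long' on one arch and 'unsigned long long' on another, so the cast makes the argument match %llu everywhere.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: format a 64-bit sector count.
 * uint64_t may be 'unsigned long' or 'unsigned long long' depending on
 * the architecture, so cast explicitly to match the %llu conversion. */
int format_sectors(char *buf, size_t len, uint64_t sectors)
{
    return snprintf(buf, len, "%llu", (unsigned long long)sectors);
}
```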
NeilBrown [Mon, 11 May 2009 05:47:10 +0000 (15:47 +1000)]
map_dev: prefer names in /dev/md/
Rather than preferring non-standard names (of which there are
many, like /dev/block/9:1), prefer names in /dev/md/ when finding
the name of an md device.
NeilBrown [Mon, 11 May 2009 05:47:10 +0000 (15:47 +1000)]
Be more consistent about keeping the host: prefix on array names.
If an array name contains a "hostname:" prefix, then
--assemble will tend to leave it there, while --incremental
will strip it off (when choosing a device name during auto-assembly).
Make this more consistent: strip the name off if we decide that
the name will be treated as 'local'. Leave it on if it will be
treated as 'foreign'.
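The strip-or-keep rule can be sketched as below. This is not the mdadm implementation: display_name is a hypothetical helper, and it assumes a name counts as 'local' simply when its prefix equals the home host, which is a simplification of the real decision.

```c
#include <string.h>

/* Hypothetical helper: return the array name to use on this host.
 * Assumption: a name is "local" when its prefix equals homehost. */
const char *display_name(const char *name, const char *homehost)
{
    const char *colon = strchr(name, ':');
    size_t hlen;

    if (colon == NULL)
        return name;                 /* no "host:" prefix at all */
    hlen = (size_t)(colon - name);
    if (strlen(homehost) == hlen && strncmp(name, homehost, hlen) == 0)
        return colon + 1;            /* local: strip the prefix */
    return name;                     /* foreign: keep the prefix */
}
```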
NeilBrown [Mon, 11 May 2009 05:46:46 +0000 (15:46 +1000)]
Allow homehost to be largely ignored when assembling arrays.
If mdadm.conf contains
HOMEHOST <ignore>
or commandline contains
--homehost=<ignore>
then the check that array metadata mentions the given homehost is
replaced by a check that the name recorded in the metadata is not
already used by some other array mentioned in mdadm.conf.
This allows more arrays to use their native name rather than having
an _NN suffix added.
This should only be used during boot time if all arrays required for
normal boot are listed in mdadm.conf.
If auto-assembly is used to find all arrays during boot, then the
HOMEHOST feature should be used to ensure there is no room for
confusion in choosing array names, and so it should not be set
to <ignore>.
NeilBrown [Mon, 11 May 2009 05:18:25 +0000 (15:18 +1000)]
Fix tests on ->container and ->member
For container= and member= to be effective in an mdadm.conf line
they must both be present. So when checking for their absence we
need container == NULL || member == NULL.
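Since the pair is only effective when both parts are present, its absence is the negation of "both set", i.e. "either one missing". A sketch with a hypothetical, cut-down struct (the real identity structure has many more fields):

```c
#include <stddef.h>

/* Hypothetical, cut-down stand-in for the identity parsed from an
 * ARRAY line in mdadm.conf. */
struct ident_sketch {
    const char *container;   /* value of container=, or NULL */
    const char *member;      /* value of member=, or NULL */
};

/* True when the container/member pair cannot identify an array,
 * i.e. when at least one of the two is missing. */
int container_ident_absent(const struct ident_sketch *id)
{
    return id->container == NULL || id->member == NULL;
}
```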
NeilBrown [Mon, 11 May 2009 05:18:20 +0000 (15:18 +1000)]
Make --brief even briefer.
Because --examine --brief, or --detail --brief are
often used to create mdadm.conf, and because people don't want to
have to update their mdadm.conf unnecessarily, we don't want to
include information that might change.
And now that level changing is supported, that is almost everything
but UUID.
So move some more fields into the "Only print with --verbose" class.
NeilBrown [Mon, 11 May 2009 05:17:05 +0000 (15:17 +1000)]
config: support "ARRAY <ignore> ..." lines in mdadm.conf
Sometimes we want to ensure particular arrays are never
assembled automatically. This might include an array made of
devices that are shared between hosts.
To support this, allow ARRAY lines in mdadm.conf to use the word
"<ignore>" rather than a device name. Arrays which match such lines
are never automatically assembled (though they can still be assembled
by explicitly giving identification information on the mdadm command
line).
NeilBrown [Mon, 11 May 2009 05:16:49 +0000 (15:16 +1000)]
assemble: support arrays created with --homehost=any
If an array is created with --homehost=any, then --assemble and
--incremental will treat it as being local to 'this' host, no matter
what the name of this host is.
This is useful for arrays that will be given unique names and be
moved between machines.
NeilBrown [Mon, 11 May 2009 05:16:47 +0000 (15:16 +1000)]
create_dev - allow array names like mdX and /dev/mdX to appear 'numeric'
When choosing the minor number to use with an array, we currently base
the number on the 'name' stored in the metadata if that name is
numeric.
Extend that so that if it looks like a numbered md device name (/dev/md0
or just md0 or even /dev/md/0), then we use the number at the end to
suggest a minor number.
This means that if someone creates an array with "--name md0" or even
"--name /dev/md0" it will continue to do what they expect.
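The name forms listed above can be recognised with a short parser. This is a sketch under stated assumptions, not the create_dev code: suggest_minor is a hypothetical name, and the upper bound relies on Linux minor numbers being 20 bits wide.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

#define MAX_MINOR ((1L << 20) - 1)   /* Linux minor numbers are 20 bits wide */

/* Hypothetical helper: return the minor number suggested by an array
 * name, or -1 if the name does not look like a numbered md device. */
int suggest_minor(const char *name)
{
    const char *p = name;
    const char *q;
    long v;

    if (strncmp(p, "/dev/", 5) == 0) {
        p += 5;
        if (strncmp(p, "md/", 3) == 0)
            p += 3;                  /* /dev/md/0 */
        else if (strncmp(p, "md", 2) == 0)
            p += 2;                  /* /dev/md0 */
        else
            return -1;               /* some other device node */
    } else if (strncmp(p, "md", 2) == 0) {
        p += 2;                      /* md0 */
    }

    if (*p == '\0')
        return -1;                   /* nothing left to parse */
    for (q = p; *q != '\0'; q++)
        if (!isdigit((unsigned char)*q))
            return -1;               /* a real name such as "home" */

    v = strtol(p, NULL, 10);         /* saturates at LONG_MAX on overflow */
    return v > MAX_MINOR ? -1 : (int)v;
}
```

Names that are not numeric in any of these forms return -1, leaving the caller free to pick any unused minor.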
Apparently the dereferencing of a type-punned pointer breaks strict
aliasing rules. And we wouldn't want to do that.
So just make a different array of the appropriate type and use memcpy.
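The memcpy approach mentioned above looks roughly like this. The function and its 16-byte input are hypothetical; the sketch only shows copying bytes into an array of the wanted type instead of casting the byte pointer to (uint32_t *).

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: XOR the four 32-bit words stored in a 16-byte
 * buffer without casting the byte pointer to (uint32_t *). */
uint32_t fold_words(const unsigned char bytes[16])
{
    uint32_t words[4];               /* separate array of the wanted type */

    memcpy(words, bytes, sizeof(words));
    return words[0] ^ words[1] ^ words[2] ^ words[3];
}
```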
Paul Clements [Wed, 11 Feb 2009 18:49:26 +0000 (13:49 -0500)]
mdadm: allow build to use --size
This patch enables the --size parameter for build operations.
Without this, if you have a raid1, for instance, where the 2 disks are
not the exact same size, and you need to build the array but one of the
disks is not available right at the moment (maybe it's USB and it's
unplugged, or maybe it's a network disk and it's unavailable), then you
have to play some weird games to get the array to size correctly (that
is, to the size of the smaller of the two components or less).
From 2.6.30, /proc/mounts and various /sys files will
probably always return 'readable' to select, so we will need
to wait on POLLPRI to get the 'new data is available' signal.
When using select, this corresponds to an 'exception', so
adjust calls to select accordingly.
In one case we sometimes wait on a socket and sometimes on
/proc/mounts, so we need to test which.
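Waiting for the 'exception' condition with select can be sketched as follows. wait_for_new_data is a hypothetical helper, not an mdadm function; it reopens the file on every call only to keep the example short, whereas a real monitor would presumably keep the descriptor open.

```c
#include <fcntl.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper: wait up to timeout_sec seconds for 'path' to
 * signal new data. Returns 1 on an event, 0 on timeout, -1 on error. */
int wait_for_new_data(const char *path, int timeout_sec)
{
    fd_set except_fds;
    struct timeval tv;
    int fd, rv;

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    if (fd >= FD_SETSIZE) {          /* select cannot handle this fd */
        close(fd);
        return -1;
    }

    FD_ZERO(&except_fds);
    FD_SET(fd, &except_fds);
    tv.tv_sec = timeout_sec;
    tv.tv_usec = 0;

    /* Pass the set as the exception set, not the read set. */
    rv = select(fd + 1, NULL, NULL, &except_fds, &tv);
    close(fd);

    if (rv < 0)
        return -1;
    return rv > 0 ? 1 : 0;
}
```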
During early boot, /var/run may not exist or be writable.
If that happens, store the mapfile (which is very important for
incremental assembly) in /dev (which should exist for udev).
Thanks to Doug Ledford <dledford@redhat.com> for identifying this
problem and suggesting a solution.
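The fallback can be sketched as "use the first candidate directory that is writable". pick_map_dir is a hypothetical helper and the candidate list in the usage note is an illustration, not the list mdadm actually uses.

```c
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: return the first path in 'dirs' that exists and
 * is writable and searchable by the caller, or NULL if none qualifies. */
const char *pick_map_dir(const char *const dirs[], size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (access(dirs[i], W_OK | X_OK) == 0)
            return dirs[i];
    return NULL;
}
```

Called with { "/var/run", "/dev" }, this would return "/dev" during early boot when /var/run is missing or read-only.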
incremental_container: preserve 'in_sync' flag when adding to existing array.
When building container members with -IR, we need to ensure that
devices added to an active array preserve the 'in_sync' status so they
don't needlessly get rebuilt.
So allow sysfs_add_disk to do this (only works in kernels since
2.6.30) and pass the relevant flag down.
Dan Williams [Sun, 12 Apr 2009 07:58:28 +0000 (00:58 -0700)]
imsm: set array size at Create/Assemble
imsm arrays round down the effective array size to the closest 1
megabyte boundary so teach get_info_super_imsm and sysfs_set_array to
set 'md/array_size' if available (and make sure ddf uses the default
size).
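The rounding itself is a single mask. A minimal sketch, assuming sizes are counted in 512-byte sectors; round_down_to_mib is a hypothetical name.

```c
#include <stdint.h>

#define SECTORS_PER_MIB 2048ULL   /* assumption: 512-byte sectors */

/* Hypothetical helper: round a size in sectors down to a 1 MiB boundary. */
uint64_t round_down_to_mib(uint64_t sectors)
{
    return sectors & ~(SECTORS_PER_MIB - 1);
}
```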
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Dan Williams [Sun, 12 Apr 2009 07:58:28 +0000 (00:58 -0700)]
imsm: defend against unsupported migrations (temporary)
Until support for higher order migrations (online capacity expansion,
raid level migration, chunk size migration...) is implemented, do not
allow arrays in these states to be assembled.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Dan Williams [Sun, 12 Apr 2009 07:58:27 +0000 (00:58 -0700)]
imsm: add 'verify', 'verify with fixup', and 'general' migration types
imsm distinguishes parity initialization from parity checking in the
metadata. Older option roms marked the repair operation with the
'verify' type and a 'with fixup' flag in the raid device 'status' field.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Dan Williams [Sun, 12 Apr 2009 07:58:27 +0000 (00:58 -0700)]
imsm: fix imsm_map.num_domains
'num_domains' is the number of parity domains. I.e. 2 in the raid10
case (2-mirrors), while raid0 through raid5 have 1 parity domain (even
though raid0 does not have parity).
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Dan Williams [Wed, 8 Apr 2009 18:41:51 +0000 (11:41 -0700)]
imsm: extract right-most whitespace stripped serial number
According to new documentation the metadata expects that all whitespace
(characters <= 0x20) are stripped from the incoming serial number. If
the length remains longer than MAX_RAID_SERIAL_LEN then only the
right-most characters are preserved.
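The two steps (strip, then keep the right-most characters) can be sketched as below. normalize_serial is a hypothetical helper; the maximum length is passed in as a parameter because the value of MAX_RAID_SERIAL_LEN is not given here.

```c
#include <stddef.h>

/* Hypothetical helper: copy 'in' to 'out' with every character <= 0x20
 * removed, keeping only the right-most max_len characters if more
 * remain. 'out' must hold max_len + 1 bytes. Returns the length. */
size_t normalize_serial(const char *in, char *out, size_t max_len)
{
    size_t kept = 0, skip, w = 0, i;

    /* First pass: count the characters that survive stripping. */
    for (i = 0; in[i] != '\0'; i++)
        if ((unsigned char)in[i] > 0x20)
            kept++;

    /* Number of leading survivors to drop so only the tail is kept. */
    skip = kept > max_len ? kept - max_len : 0;

    /* Second pass: copy the survivors after the dropped prefix. */
    for (i = 0; in[i] != '\0'; i++) {
        if ((unsigned char)in[i] <= 0x20)
            continue;
        if (skip > 0) {
            skip--;
            continue;
        }
        out[w++] = in[i];
    }
    out[w] = '\0';
    return w;
}
```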
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
I'm not attaching a patch for this because it's so simple. Long story
short, watching both add and change events in udev rules is bad for md
devices. Specifically, the kernel will generate a change event on
things like array stop, and on things like fdisk close. In the case
of array stop, it can result in the array being assembled again
immediately. In the case of fdisk close, the situation is worse.
Let's say you stop all the md devices on some block device in order to
repartition. You run fdisk, change the partition table, then issue a
write of the table. The write of the table triggers the change event
*before* the kernel updates the partition table in memory for the
block device, causing udev to rerun the incremental rules on the old
partition table and restart all the arrays you just stopped with the
old partition table layout, at which point the kernel is unable to
reread the partition table. So, once you've enabled incremental
assembly, it becomes apparent that what we really want is to only
start devices on add, not on add|change.
ddf: fixed 'working_disks' reported by container_content.
The 'working_disks' number should be the number that is expected, not the
number found so far. This is needed for Incremental assembly to
start the array at the right time.
When reporting "--detail --scan", use names like /dev/md/foo where
available rather than /dev/md/127
This is particularly needed for containers where the member arrays
will report "container=/dev/md/foo" and we want the container to have
the same name.
grow: don't wait forever for critical section to pass.
If an array reshape completed within 1 second, then --grow will not
notice that it has finished and will keep waiting for the critical
section to pass.
NeilBrown [Tue, 10 Mar 2009 05:28:22 +0000 (16:28 +1100)]
mdmon: allow incremental assembly of containers.
If mdmon sees a device added to a container, it should not assume it is
a new spare. It could be a part of the array that just hadn't been
assembled yet. So check first.
NeilBrown [Tue, 10 Mar 2009 05:28:22 +0000 (16:28 +1100)]
Assemble/container: catch errors when starting a partial container.
If we are assembling an array in a container and it isn't complete
enough to start yet, then
- don't start mdmon
- don't say the array is started
- don't wait for the device to appear in /dev
NeilBrown [Tue, 10 Mar 2009 05:28:22 +0000 (16:28 +1100)]
mdopen: be more careful when adding digits to names.
If we need to add digits to a name to make it unique, but don't have
to add '_', we need to avoid adding a digit immediately after a digit.
So if the last character of the name is a digit, add the '_' anyway.
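The digit-after-digit rule can be sketched like this. append_unique_suffix and its want_underscore flag are hypothetical; the sketch only shows forcing the '_' when the name already ends in a digit.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build "name" + optional '_' + n in 'out'.
 * The '_' is forced when the name already ends in a digit, so that
 * "md1" plus 0 becomes "md1_0" rather than the ambiguous "md10". */
int append_unique_suffix(char *out, size_t len, const char *name,
                         unsigned n, int want_underscore)
{
    size_t l = strlen(name);

    if (l > 0 && isdigit((unsigned char)name[l - 1]))
        want_underscore = 1;
    return snprintf(out, len, "%s%s%u", name,
                    want_underscore ? "_" : "", n);
}
```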
NeilBrown [Tue, 10 Mar 2009 05:28:22 +0000 (16:28 +1100)]
Incremental: fix some handling of trustworthy.
1/ if homehost matches, then we need to set trustworthy to 'LOCAL'
2/ if we decide to set trustworthy to 'METADATA' because we have to
use the metadata version name, do that *after* we have checked if
we are going to assemble within a container, as inside the
container there could be different sources of names to use.