mdmon: fix, close spare activation race
author Dan Williams <dan.j.williams@intel.com>
Fri, 26 Aug 2011 02:14:29 +0000 (19:14 -0700)
committer NeilBrown <neilb@suse.de>
Tue, 30 Aug 2011 00:49:42 +0000 (10:49 +1000)
commit 1d446d52a79b8afcaf604a9a70f906e5605db1f6
tree 326d1cdbc523c0aa98828e7f72c3e3011743ad04
parent b276dd33c74a51598e37fc72e6fb8f5ebd6620f2

The following test fails when the md_check_recovery() event triggered by
the ro->rw transition causes remove_and_add_spares() to run while mdmon
is attempting spare activation.

The result is that the kernel races to set the slot immediately after
sysfs_add_disk() writes new_dev.  mdmon thinks the spare activation
failed and declines to send the monitor a new active_array.  The array
still shows as degraded after the wait because the monitor cannot notify
the metadata that all disks are in_sync.

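#!/bin/bash
# Loop until /dev/loop2 still reports "degraded" after recovery completes.
i=0
# Prime $? with 1 so the loop body runs at least once.
false
while [ $? == 1 ]
do
	i=$((i+1))
	mdadm -Ss
	mdadm -CR /dev/md0 /dev/loop[0-2] -n 3 -e imsm
	mdadm -CR /dev/md1 /dev/loop[01] missing -n 3 -l 5
	mdadm --wait /dev/md1
	# grep exits 1 when "degraded" is absent (keep looping), 0 when the bug hits.
	mdadm -E /dev/loop2 | grep -i degraded
done
echo "failed: $i"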

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
managemon.c