md/raid5: fix another livelock caused by non-aligned writes.
author NeilBrown <neilb@suse.de>
Sun, 1 Feb 2015 23:44:29 +0000 (10:44 +1100)
committer Luis Henriques <luis.henriques@canonical.com>
Tue, 10 Feb 2015 13:38:45 +0000 (13:38 +0000)
commit b1b02fe97f75b12ab34b2303bfd4e3526d903a58 upstream.

If a non-page-aligned write is destined for a device which
is missing/faulty, we can deadlock.

As the target device is missing, a read-modify-write cycle
is not possible.
As the write is not for a full page, a reconstruct-write cycle
is not possible.

This should be handled by logic in fetch_block() which notices
there is a non-R5_OVERWRITE write to a missing device, and so
loads all blocks.

However since commit 67f455486d2ea2, that code requires
STRIPE_PREREAD_ACTIVE before it will activate, and those circumstances
never set STRIPE_PREREAD_ACTIVE.

So: in handle_stripe_dirtying, if neither rmw nor rcw was possible,
set STRIPE_DELAYED, which will cause STRIPE_PREREAD_ACTIVE to be set
after a suitable delay.
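
The check described above can be sketched in user space roughly as follows.
This is an illustration only, not the kernel code: toy_stripe, delay_if_stuck
and the plain bitmask are hypothetical stand-ins for struct stripe_head, the
relevant part of handle_stripe_dirtying and the kernel's test_bit/set_bit
helpers. It assumes, as the condition in the patch implies, that an rmw or rcw
count larger than the number of disks marks that strategy as impossible.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's stripe state bits. */
enum {
	STRIPE_DELAYED        = 1u << 0,
	STRIPE_PREREAD_ACTIVE = 1u << 1,
};

/* Toy replacement for struct stripe_head: only the state word. */
struct toy_stripe {
	unsigned int state;
};

/*
 * Mirror of the added check. rmw and rcw are the costs computed for the
 * two write strategies; a cost above 'disks' is assumed to mean "cannot
 * be done". If neither strategy is possible and preread is not already
 * active, mark the stripe delayed so that preread gets activated later.
 * Returns true if the stripe was marked delayed by this call.
 */
static bool delay_if_stuck(struct toy_stripe *sh, int rmw, int rcw, int disks)
{
	assert(disks > 0);

	if (rcw > disks && rmw > disks &&
	    !(sh->state & STRIPE_PREREAD_ACTIVE)) {
		sh->state |= STRIPE_DELAYED;
		return true;
	}
	return false;
}
```

With disks = 4, calling delay_if_stuck() with rmw = 9 and rcw = 10 on a stripe
whose state is 0 sets STRIPE_DELAYED; with rmw = 2 the stripe is left alone,
because a read-modify-write is still possible.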

Fixes: 67f455486d2ea20b2d94d6adf5b9b783d079e321
Reported-by: Mikulas Patocka <mpatocka@redhat.com>
Tested-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
drivers/md/raid5.c

index 222aa75218773fb873b68833cfab6192046d408e..68a03d7f25eef00203620aab98c0cc8cf4193eca 100644
@@ -3204,6 +3204,11 @@ static void handle_stripe_dirtying(struct r5conf *conf,
                                          (unsigned long long)sh->sector,
                                          rcw, qread, test_bit(STRIPE_DELAYED, &sh->state));
        }
+
+       if (rcw > disks && rmw > disks &&
+           !test_bit(STRIPE_PREREAD_ACTIVE, &sh->state))
+               set_bit(STRIPE_DELAYED, &sh->state);
+
        /* now if nothing is locked, and if we have enough data,
         * we can start a write request
         */