git.ipfire.org Git - thirdparty/kernel/stable-queue.git/blob - queue-4.19/scsi-hisi_sas-fix-a-timeout-race-of-driver-internal-.patch
1 From 9abdfa5551fb10a0e5e43ba8692b638e9cef109f Mon Sep 17 00:00:00 2001
2 From: Xiang Chen <chenxiang66@hisilicon.com>
3 Date: Thu, 28 Feb 2019 22:50:58 +0800
4 Subject: scsi: hisi_sas: Fix a timeout race of driver internal and SMP IO
5
6 [ Upstream commit 4790595723d4b833b18c994973d39f9efb842887 ]
7
8 Internal IO and SMP IO each have a timeout timer. The timer handler
9 checks whether the IO is done by testing SAS_TASK_STATE_DONE in
10 task->task_state_flags, under task->task_state_lock.
11
12 This may hang the system: an internal IO or SMP IO is sent, but a
13 hardware exception (such as an injected 2-bit ECC error) prevents it from
14 completing, and it has not yet timed out. A SAS controller reset then
15 occurs to recover the system. The reset releases the IO's resources and
16 sets its state to SAS_TASK_STATE_DONE, so when the timer later fires, the
17 handler never signals the IO's completion and the waiter blocks forever.
18
19 [ 729.123632] Call trace:
20 [ 729.126791] [<ffff00000808655c>] __switch_to+0x94/0xa8
21 [ 729.133106] [<ffff000008d96e98>] __schedule+0x1e8/0x7fc
22 [ 729.138975] [<ffff000008d974e0>] schedule+0x34/0x8c
23 [ 729.144401] [<ffff000008d9b000>] schedule_timeout+0x1d8/0x3cc
24 [ 729.150690] [<ffff000008d98218>] wait_for_common+0xdc/0x1a0
25 [ 729.157101] [<ffff000008d98304>] wait_for_completion+0x28/0x34
26 [ 729.165973] [<ffff000000dcefb4>] hisi_sas_internal_task_abort+0x2a0/0x424 [hisi_sas_test_main]
27 [ 729.176447] [<ffff000000dd18f4>] hisi_sas_abort_task+0x244/0x2d8 [hisi_sas_test_main]
28 [ 729.185258] [<ffff000008971714>] sas_eh_handle_sas_errors+0x1c8/0x7b8
29 [ 729.192391] [<ffff000008972774>] sas_scsi_recover_host+0x130/0x398
30 [ 729.199237] [<ffff00000894d8a8>] scsi_error_handler+0x148/0x5c0
31 [ 729.206009] [<ffff0000080f4118>] kthread+0x10c/0x138
32 [ 729.211563] [<ffff0000080855dc>] ret_from_fork+0x10/0x18
33
34 To solve the issue, the task_done callback of those IOs needs to be
35 called on SAS controller reset.
36
37 Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
38 Signed-off-by: John Garry <john.garry@huawei.com>
39 Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
40 Signed-off-by: Sasha Levin <sashal@kernel.org>
41 ---
42 drivers/scsi/hisi_sas/hisi_sas_main.c | 3 ++-
43 1 file changed, 2 insertions(+), 1 deletion(-)
44
45 diff --git a/drivers/scsi/hisi_sas/hisi_sas_main.c b/drivers/scsi/hisi_sas/hisi_sas_main.c
46 index c25f3a9b0b9f..fd9d82c9033d 100644
47 --- a/drivers/scsi/hisi_sas/hisi_sas_main.c
48 +++ b/drivers/scsi/hisi_sas/hisi_sas_main.c
49 @@ -810,7 +810,8 @@ static void hisi_sas_do_release_task(struct hisi_hba *hisi_hba, struct sas_task
50 spin_lock_irqsave(&task->task_state_lock, flags);
51 task->task_state_flags &=
52 ~(SAS_TASK_STATE_PENDING | SAS_TASK_AT_INITIATOR);
53 - task->task_state_flags |= SAS_TASK_STATE_DONE;
54 + if (!slot->is_internal && task->task_proto != SAS_PROTOCOL_SMP)
55 + task->task_state_flags |= SAS_TASK_STATE_DONE;
56 spin_unlock_irqrestore(&task->task_state_lock, flags);
57 }
58
59 --
60 2.19.1
61
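The race described in the commit message can be sketched as a small userspace model. This is only an illustration under stated assumptions, not the driver's code: the flag values, struct model_task, model_release_task(), model_timeout() and waiter_woken() are hypothetical names, and the timeout handler is assumed to wake the waiter only when SAS_TASK_STATE_DONE is not set, as the commit message implies.

```c
#include <stdbool.h>

/* Illustrative flag values; NOT copied from the kernel headers. */
#define TASK_STATE_PENDING  (1u << 0)
#define TASK_STATE_DONE     (1u << 1)
#define TASK_STATE_ABORTED  (1u << 2)
#define TASK_AT_INITIATOR   (1u << 3)

/* Hypothetical stand-in for the state the patch touches. */
struct model_task {
	unsigned int state_flags;
	bool is_internal;	/* models slot->is_internal */
	bool is_smp;		/* models task_proto == SAS_PROTOCOL_SMP */
	bool completed;		/* models the completion the waiter blocks on */
};

/* Release path run on controller reset; `patched` selects the new check. */
static void model_release_task(struct model_task *t, bool patched)
{
	t->state_flags &= ~(TASK_STATE_PENDING | TASK_AT_INITIATOR);
	if (!patched || (!t->is_internal && !t->is_smp))
		t->state_flags |= TASK_STATE_DONE;
}

/* Assumed timeout handler: wakes the waiter only if the task is not DONE. */
static void model_timeout(struct model_task *t)
{
	if (!(t->state_flags & TASK_STATE_DONE)) {
		t->state_flags |= TASK_STATE_ABORTED;
		t->completed = true;
	}
}

/* Reset first, timer second: returns whether the waiter gets woken. */
static bool waiter_woken(bool is_internal, bool is_smp, bool patched)
{
	struct model_task t = {
		.state_flags = TASK_STATE_PENDING | TASK_AT_INITIATOR,
		.is_internal = is_internal,
		.is_smp = is_smp,
		.completed = false,
	};

	model_release_task(&t, patched);
	model_timeout(&t);
	return t.completed;
}
```

In this model, waiter_woken(true, false, false) returns false, which corresponds to the hang in the call trace, while waiter_woken(true, false, true) and waiter_woken(false, true, true) return true because the release path no longer marks internal or SMP tasks as done.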