From 8890bf450e0b6b283f48ac619fca5ac2f14ddd62 Mon Sep 17 00:00:00 2001 From: Anil Gurumurthy Date: Wed, 10 Dec 2025 15:45:59 +0530 Subject: [PATCH] scsi: qla2xxx: Delay module unload while fabric scan in progress System crash seen during load/unload test in a loop. [105954.384919] RBP: ffff914589838dc0 R08: 0000000000000000 R09: 0000000000000086 [105954.384920] R10: 000000000000000f R11: ffffa31240904be5 R12: ffff914605f868e0 [105954.384921] R13: ffff914605f86910 R14: 0000000000008010 R15: 00000000ddb7c000 [105954.384923] FS: 0000000000000000(0000) GS:ffff9163fec40000(0000) knlGS:0000000000000000 [105954.384925] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [105954.384926] CR2: 000055d31ce1d6a0 CR3: 0000000119f5e001 CR4: 0000000000770ee0 [105954.384928] PKRU: 55555554 [105954.384929] Call Trace: [105954.384931] [105954.384934] qla24xx_sp_unmap+0x1f3/0x2a0 [qla2xxx] [105954.384962] ? qla_async_scan_sp_done+0x114/0x1f0 [qla2xxx] [105954.384980] ? qla24xx_els_ct_entry+0x4de/0x760 [qla2xxx] [105954.384999] ? __wake_up_common+0x80/0x190 [105954.385004] ? qla24xx_process_response_queue+0xc2/0xaa0 [qla2xxx] [105954.385023] ? qla24xx_msix_rsp_q+0x44/0xb0 [qla2xxx] [105954.385040] ? __handle_irq_event_percpu+0x3d/0x190 [105954.385044] ? handle_irq_event+0x58/0xb0 [105954.385046] ? handle_edge_irq+0x93/0x240 [105954.385050] ? __common_interrupt+0x41/0xa0 [105954.385055] ? common_interrupt+0x3e/0xa0 [105954.385060] ? asm_common_interrupt+0x22/0x40 The root cause of this was that there was a free (dma_free_attrs) in the interrupt context. There was a device discovery/fabric scan in progress. A module unload was issued which set the UNLOADING flag. As part of the discovery, after receiving an interrupt a work queue was scheduled (which involved a work to be queued). Since the UNLOADING flag is set, the work item was not allocated and the mapped memory had to be freed. The free occurred in interrupt context leading to system crash. Delay the driver unload until the fabric scan is complete to avoid the crash. Reported-by: kernel test robot Reported-by: Dan Carpenter Closes: https://lore.kernel.org/all/202512090414.07Waorz0-lkp@intel.com/ Fixes: 783e0dc4f66a ("qla2xxx: Check for device state before unloading the driver.") Cc: stable@vger.kernel.org Signed-off-by: Anil Gurumurthy Signed-off-by: Nilesh Javali Reviewed-by: Himanshu Madhani Link: https://patch.msgid.link/20251210101604.431868-8-njavali@marvell.com Signed-off-by: Martin K. Petersen --- drivers/scsi/qla2xxx/qla_os.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c index 16a44c0917e18..c9aff70e7357b 100644 --- a/drivers/scsi/qla2xxx/qla_os.c +++ b/drivers/scsi/qla2xxx/qla_os.c @@ -1183,7 +1183,8 @@ qla2x00_wait_for_hba_ready(scsi_qla_host_t *vha) while ((qla2x00_reset_active(vha) || ha->dpc_active || ha->flags.mbox_busy) || test_bit(FX00_RESET_RECOVERY, &vha->dpc_flags) || - test_bit(FX00_TARGET_SCAN, &vha->dpc_flags)) { + test_bit(FX00_TARGET_SCAN, &vha->dpc_flags) || + (vha->scan.scan_flags & SF_SCANNING)) { if (test_bit(UNLOADING, &base_vha->dpc_flags)) break; msleep(1000); -- 2.47.3