Add NULL check for scmd_local in the MPI2_FUNCTION_SCSI_IO_REQUEST case
to handle firmware duplicate/stale completions.
When firmware sends a duplicate completion for a command that was
already processed and returned to the pool, the driver dereferences a
NULL scmd pointer, causing a crash.
Timeline of the bug:
1. Command completes normally and megasas_return_cmd_fusion() is called
2. This sets cmd->scmd = NULL and clears io_request with memset(..., 0,
...)
3. Firmware sends duplicate/stale completion for same SMID (firmware
bug)
4. Driver processes reply descriptor again
5. Cleared io_request has Function = 0 (MPI2_FUNCTION_SCSI_IO_REQUEST)
6. Switch statement matches SCSI_IO_REQUEST case by accident
7. Accesses megasas_priv(NULL scmd)->status -> crash at offset 0x228
The offset 0x228 = sizeof(struct scsi_cmnd) (0x220) +
offsetof(status) (0x8) within the driver's per-command private data.
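
The offset arithmetic above can be reproduced with a minimal sketch.
The mock_* types are hypothetical stand-ins, not the kernel structures:
the 0x220 size is copied from this message rather than measured, and
the private-data layout (an 8-byte field followed by status) is an
assumption chosen to match the stated 0x8 offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct scsi_cmnd; 0x220 is the size quoted
 * in the commit message, not a value measured from a kernel build. */
struct mock_scsi_cmnd {
	unsigned char opaque[0x220];
};

/* Hypothetical stand-in for the driver's per-command private data,
 * assumed to sit directly after the command structure. */
struct mock_cmd_priv {
	uint64_t cmd_priv;	/* assumed 8-byte field ahead of status */
	uint8_t status;
};

/* Address touched when reading status for a command located at
 * scmd_addr. Integer math is used so that no NULL pointer is formed. */
static uintptr_t status_fault_address(uintptr_t scmd_addr)
{
	return scmd_addr + sizeof(struct mock_scsi_cmnd) +
	       offsetof(struct mock_cmd_priv, status);
}

/* Checks the arithmetic from the commit message for a NULL command. */
static void check_commit_arithmetic(void)
{
	assert(status_fault_address(0) == 0x228);
}
```

With scmd_addr = 0 the helper yields 0x228, which is the faulting
address reported in the crash signature.
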
This issue was observed on PERC H330 Mini running firmware 25.5.9.0001
after 3+ days of heavy I/O load.
Crash signature:
BUG: unable to handle kernel NULL pointer dereference at 0x228
RIP: complete_cmd_fusion+0x428
Function: megasas_priv(cmd_fusion->scmd)->status
Add a defensive check to skip processing when scmd_local is NULL. This
handles duplicate completions from firmware and prevents the driver from
touching a command that was already returned to the pool. The check
protects all scmd_local uses in both the SCSI_IO path and the
fallthrough LDIO path.
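
The effect of the check can be illustrated with a small user-space
model of the timeline above. Everything prefixed mock_ or MOCK_ is
hypothetical and only mirrors the behavior described in this message
(scmd set to NULL and io_request zeroed on return, Function value 0
selecting the SCSI IO case). It is a sketch, not driver code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed to mirror "Function = 0" from the timeline above. */
#define MOCK_FUNCTION_SCSI_IO_REQUEST 0x00

struct mock_scmd {
	int status;
};

struct mock_io_request {
	unsigned char function;
};

struct mock_cmd {
	struct mock_scmd *scmd;
	struct mock_io_request io_request;
};

/* Models returning a command to the pool: scmd is cleared and the
 * request frame is zeroed, so function reads back as 0. */
static void mock_return_cmd(struct mock_cmd *cmd)
{
	cmd->scmd = NULL;
	memset(&cmd->io_request, 0, sizeof(cmd->io_request));
}

/* Models the completion path. Returns 1 if the command was completed,
 * 0 if the completion was skipped as stale or not handled here. */
static int mock_complete_cmd(struct mock_cmd *cmd)
{
	struct mock_scmd *scmd_local;

	assert(cmd != NULL);
	scmd_local = cmd->scmd;

	switch (cmd->io_request.function) {
	case MOCK_FUNCTION_SCSI_IO_REQUEST:
		/* Stale or duplicate completion: nothing left to complete. */
		if (!scmd_local)
			return 0;
		scmd_local->status = 0;
		mock_return_cmd(cmd);
		return 1;
	default:
		return 0;
	}
}
```

In this model the first completion is processed and returns the command
to the pool; a second completion for the same command then finds
scmd_local NULL and is skipped instead of dereferencing it.
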
Signed-off-by: Milan P. Gandhi <mgandhi@redhat.com>
Link: https://patch.msgid.link/agWAgtk6rtHqNWb5@machine1
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
complete(&cmd_fusion->done);
break;
case MPI2_FUNCTION_SCSI_IO_REQUEST: /*Fast Path IO.*/
+ /*
+ * Firmware can send stale/duplicate completions for
+ * commands already returned to the pool. scmd_local
+ * would be NULL for such cases. Skip processing to
+ * avoid NULL pointer access.
+ */
+ if (!scmd_local)
+ break;
+
/* Update load balancing info */
if (fusion->load_balance_info &&
(megasas_priv(cmd_fusion->scmd)->status &