A deferred qc may timeout while waiting for the device queue to drain
to be submitted. In such case, since the qc is not active,
ata_scsi_cmd_error_handler() ends up calling scsi_eh_finish_cmd(),
which frees the qc. But as the port deferred_qc field still references
this finished/freed qc, the deferred qc work may eventually attempt to
call ata_qc_issue() against this invalid qc, leading to errors such as
reported by UBSAN (syzbot run):
UBSAN: shift-out-of-bounds in drivers/ata/libata-core.c:5166:24
shift exponent
4210818301 is too large for 64-bit type 'long long unsigned int'
...
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:94 [inline]
dump_stack_lvl+0x100/0x190 lib/dump_stack.c:120
ubsan_epilogue+0xa/0x30 lib/ubsan.c:233
__ubsan_handle_shift_out_of_bounds+0x279/0x2a0 lib/ubsan.c:494
ata_qc_issue.cold+0x38/0x9f drivers/ata/libata-core.c:5166
ata_scsi_deferred_qc_work+0x154/0x1f0 drivers/ata/libata-scsi.c:1679
process_one_work+0x9d7/0x1920 kernel/workqueue.c:3275
process_scheduled_works kernel/workqueue.c:3358 [inline]
worker_thread+0x5da/0xe40 kernel/workqueue.c:3439
kthread+0x370/0x450 kernel/kthread.c:467
ret_from_fork+0x754/0xd80 arch/x86/kernel/process.c:158
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
</TASK>
Fix this by checking if the qc of a timed out SCSI command is a deferred
one, and in such case, clear the port deferred_qc field and finish the
SCSI command with DID_TIME_OUT.
Reported-by: syzbot+1f77b8ca15336fff21ff@syzkaller.appspotmail.com
Fixes: 0ea84089dbf6 ("ata: libata-scsi: avoid Non-NCQ command starvation")
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Igor Pylypiv <ipylypiv@google.com>
set_host_byte(scmd, DID_OK);
ata_qc_for_each_raw(ap, qc, i) {
- if (qc->flags & ATA_QCFLAG_ACTIVE &&
- qc->scsicmd == scmd)
+ if (qc->scsicmd != scmd)
+ continue;
+ if ((qc->flags & ATA_QCFLAG_ACTIVE) ||
+ qc == ap->deferred_qc)
break;
}
- if (i < ATA_MAX_QUEUE) {
+ if (qc == ap->deferred_qc) {
+ /*
+ * This is a deferred command that timed out while
+ * waiting for the command queue to drain. Since the qc
+ * is not active yet (deferred_qc is still set, so the
+ * deferred qc work has not issued the command yet),
+ * simply signal the timeout by finishing the SCSI
+ * command and clear the deferred qc to prevent the
+ * deferred qc work from issuing this qc.
+ */
+ WARN_ON_ONCE(qc->flags & ATA_QCFLAG_ACTIVE);
+ ap->deferred_qc = NULL;
+ set_host_byte(scmd, DID_TIME_OUT);
+ scsi_eh_finish_cmd(scmd, &ap->eh_done_q);
+ } else if (i < ATA_MAX_QUEUE) {
/* the scmd has an associated qc */
if (!(qc->flags & ATA_QCFLAG_EH)) {
/* which hasn't failed yet, timeout */