ublk_start_cancel() previously bailed out early when ublk_get_disk()
returned NULL, treating it as "our disk has been dead". That is correct
for the post-teardown case, but it also wrongly covers the pre-start
case: ublk_ctrl_start_dev() has not assigned ub->ub_disk yet, while
io_uring is already tearing down the daemon's uring_cmds via
ublk_uring_cmd_cancel_fn().

In that window, the cancel path skips ublk_set_canceling(), so
ubq->canceling stays false, even though ublk_cancel_cmd() goes on to
NULL out every io->cmd. ublk_ctrl_start_dev() then proceeds to set
ub->ub_disk, call add_disk(), and schedule partition_scan_work. When
ublk_partition_scan_work() runs bdev_disk_changed() and the resulting
read reaches ublk_queue_rq() -> ublk_queue_cmd(), the ubq->canceling
check passes and the code dereferences the NULL io->cmd:

  BUG: kernel NULL pointer dereference, address: 0000000000000018
  RIP: ublk_queue_cmd drivers/block/ublk_drv.c [inline]
  RIP: ublk_queue_rq+0x73/0x100
  Call Trace:
   blk_mq_dispatch_rq_list+0x1c5/0xca0
   ...
   bdev_disk_changed+0x3d4/0x5e0
   ublk_partition_scan_work+0x89/0xe0
   process_one_work+0x344/0x8a0

Fix it by always setting ub->canceling / ubq->canceling under
cancel_mutex. When the disk is allocated, keep the existing
quiesce/unquiesce dance so the flag is observed across the
ublk_queue_rq() barrier. When the disk is not yet allocated, there is
no request_queue and ublk_queue_rq() cannot be running concurrently, so
simply flipping the flag is sufficient: any subsequent I/O - including
the partition scan started by ublk_ctrl_start_dev() - will see
canceling set and be aborted via __ublk_queue_rq_common().
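
The ordering argument above can be illustrated with a small userspace
model. This is only a sketch, not driver code: every model_* name is
hypothetical, the struct collapses ub->canceling / ubq->canceling into
one flag, and cancel_mutex and the quiesce/unquiesce step are left out
because the model is single-threaded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model; none of these names exist in ublk_drv.c. */
struct model_dev {
	bool canceling;	/* stands in for ub->canceling / ubq->canceling */
	bool has_disk;	/* stands in for ub->ub_disk != NULL */
	void *cmd;	/* stands in for io->cmd */
};

enum model_status { MODEL_DISPATCHED, MODEL_ABORTED };

static void model_dev_init(struct model_dev *d, void *cmd)
{
	d->canceling = false;
	d->has_disk = false;
	d->cmd = cmd;
}

/* Models the fixed cancel path: the flag is set with or without a disk. */
static void model_start_cancel(struct model_dev *d)
{
	if (d->canceling)
		return;
	if (d->has_disk) {
		/* The real code quiesces the queue around this store. */
		d->canceling = true;
	} else {
		/* No request queue yet, so a plain store is enough. */
		d->canceling = true;
	}
}

/* Models ublk_cancel_cmd() clearing io->cmd. */
static void model_cancel_cmd(struct model_dev *d)
{
	d->cmd = NULL;
}

/* Models the dispatch-side check that guards the io->cmd access. */
static enum model_status model_queue_rq(struct model_dev *d)
{
	if (d->canceling)
		return MODEL_ABORTED;
	/* Models the dereference that crashed before the fix. */
	assert(d->cmd != NULL);
	return MODEL_DISPATCHED;
}
```

Calling model_start_cancel() and model_cancel_cmd() before setting
has_disk reproduces the pre-start window: a later model_queue_rq()
returns MODEL_ABORTED instead of reaching the NULL cmd.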

Fixes: 7fc4da6a304b ("ublk: scan partition in async way")
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Link: https://patch.msgid.link/20260527144042.2095194-1-tom.leiming@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

 {
 	struct gendisk *disk = ublk_get_disk(ub);
-	/* Our disk has been dead */
-	if (!disk)
-		return;
-
 	mutex_lock(&ub->cancel_mutex);
 	if (ub->canceling)
 		goto out;
-	/*
-	 * Now we are serialized with ublk_queue_rq()
-	 *
-	 * Make sure that ubq->canceling is set when queue is frozen,
-	 * because ublk_queue_rq() has to rely on this flag for avoiding to
-	 * touch completed uring_cmd
-	 */
-	blk_mq_quiesce_queue(disk->queue);
-	ublk_set_canceling(ub, true);
-	blk_mq_unquiesce_queue(disk->queue);
+
+	if (disk) {
+		/*
+		 * Quiesce to serialize with ublk_queue_rq(), ensuring
+		 * ubq->canceling is visible when the queue resumes.
+		 */
+		blk_mq_quiesce_queue(disk->queue);
+		ublk_set_canceling(ub, true);
+		blk_mq_unquiesce_queue(disk->queue);
+	} else {
+		/*
+		 * Disk not yet allocated by ublk_ctrl_start_dev(), so
+		 * there is no request queue and ublk_queue_rq() cannot
+		 * be running. Just set the flag; if start_dev proceeds
+		 * later, new I/O will see canceling and be aborted.
+		 */
+		ublk_set_canceling(ub, true);
+	}
 out:
 	mutex_unlock(&ub->cancel_mutex);
 	ublk_put_disk(disk);