git.ipfire.org Git - thirdparty/kernel/linux.git/commit

author	Ming Lei <ming.lei@redhat.com>
	Fri, 16 Jan 2026 07:46:38 +0000 (15:46 +0800)
committer	Jens Axboe <axboe@kernel.dk>
	Tue, 20 Jan 2026 17:18:01 +0000 (10:18 -0700)
commit	f7bc22ca0d55bdcb59e3a4a028fb811d23e53959
tree	cc2e8ed9bad2d693a32f28b7aaf73754758188d2
parent	5e2fde1a9433efc484a5feec36f748aa3ea58c85

nvme/io_uring: optimize IOPOLL completions for local ring context

When multiple io_uring rings poll on the same NVMe queue, one ring can
find completions belonging to another ring. The current code always
completes via task_work so that each completion reaches its owning ring,
which adds overhead for the common single-ring case.

This patch passes the polling io_ring_ctx through io_comp_batch's new
poll_ctx field. In io_do_iopoll(), the polling ring's context is stored
in iob.poll_ctx before calling the iopoll callbacks.

In nvme_uring_cmd_end_io(), we now compare iob->poll_ctx with the
request's owning io_ring_ctx (via io_uring_cmd_ctx_handle()). If they
match (local context), we complete inline with io_uring_cmd_done32().
If they differ (remote context) or iob is NULL (non-iopoll path), we
use task_work as before.
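
The decision described above can be modeled in user space. The following is
a minimal sketch, not the kernel change itself: struct ring_ctx,
struct comp_batch, struct request, end_io() and do_iopoll() are hypothetical
stand-ins for io_ring_ctx, io_comp_batch, the NVMe request,
nvme_uring_cmd_end_io() and io_do_iopoll(), and the two counters only record
which completion path would have been taken.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's io_ring_ctx. */
struct ring_ctx {
	int inline_completions;    /* completed directly by the poller */
	int task_work_completions; /* deferred to the owning ring's task */
};

/* Hypothetical stand-in for io_comp_batch, reduced to the new field. */
struct comp_batch {
	void *poll_ctx; /* ring that is currently polling, if any */
};

/* Hypothetical request: only remembers which ring submitted it. */
struct request {
	struct ring_ctx *owner;
};

/* True when the polling ring is also the ring that owns the request. */
static bool is_local_completion(const struct comp_batch *iob,
				const struct request *req)
{
	return iob != NULL && iob->poll_ctx == (void *)req->owner;
}

/* Models the end_io decision: inline if local, task_work otherwise. */
static void end_io(struct request *req, const struct comp_batch *iob)
{
	assert(req != NULL && req->owner != NULL);
	if (is_local_completion(iob, req))
		req->owner->inline_completions++;
	else
		req->owner->task_work_completions++;
}

/* Models the poll loop: record the polling ring, then reap completions. */
static void do_iopoll(struct ring_ctx *poller, struct request **done, size_t n)
{
	struct comp_batch iob = { .poll_ctx = poller };

	for (size_t i = 0; i < n; i++)
		end_io(done[i], &iob);
}
```

In this model, a ring that polls and reaps one of its own requests bumps
inline_completions, a request owned by a different ring bumps that ring's
task_work_completions, and calling end_io() with a NULL batch (the
non-iopoll path) always takes the task_work branch.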

This optimization eliminates task_work scheduling overhead for the
common case where a ring polls and finds its own completions.

An IOPS improvement of about 10% is observed with the following benchmark:

fio/t/io_uring -b512 -d128 -c32 -s32 -p1 -F1 -O0 -P1 -u1 -n1 /dev/ng0n1

Signed-off-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Kanchan Joshi <joshi.k@samsung.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

drivers/nvme/host/ioctl.c
include/linux/blkdev.h
io_uring/rw.c