svcrdma: Release write chunk resources without re-queuing
author Chuck Lever <chuck.lever@oracle.com>
Wed, 6 May 2026 15:26:50 +0000 (11:26 -0400)
committer Chuck Lever <cel@kernel.org>
Tue, 9 Jun 2026 20:32:59 +0000 (16:32 -0400)
commit 9545262f7e58d67de413d5a47ea2a3f2e59ba9f6
tree 11b8f80ee664ce4d9bd9c222bd36c0558bc8a1e1
parent 2804a16b9e51f803a32ac2e9a014f5851a7cc3f4

Each RDMA Send completion triggers a cascade of work items on the
svcrdma_wq unbound workqueue:

  ib_cq_poll_work (on ib_comp_wq, per-CPU)
    -> svc_rdma_send_ctxt_put -> queue_work    [work item 1]
      -> svc_rdma_write_info_free -> queue_work [work item 2]

Every transition through queue_work contends on the unbound
pool's spinlock. Profiling an 8KB NFSv3 read/write workload
over RDMA shows about 4% of total CPU cycles spent on this
lock, with the cascading re-queue of write_info release
contributing roughly 1%.

The initial queue_work in svc_rdma_send_ctxt_put is needed to
move release work off the CQ completion context (which runs on
a per-CPU bound workqueue). However, once executing on
svcrdma_wq, there is no need to re-queue for each write_info
structure. svc_rdma_reply_chunk_release already calls
svc_rdma_cc_release inline from the same svcrdma_wq context,
and svc_rdma_recv_ctxt_put does the same from nfsd thread
context.

Release write chunk resources inline in
svc_rdma_write_info_free, removing the intermediate
svc_rdma_write_info_free_async work item and the wi_work
field from struct svc_rdma_write_info.
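
The effect of dropping the second queue_work can be illustrated with a
small userspace model. This is only a sketch under stated assumptions,
not the kernel code: every model_* name is hypothetical, the queue is a
plain array drained synchronously, and a counter stands in for
acquisitions of the unbound pool's spinlock.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model; none of the model_* names are kernel APIs. */
#define MODEL_QUEUE_DEPTH 16

struct model_work {
	void (*func)(struct model_work *work);
};

struct model_wq {
	struct model_work *pending[MODEL_QUEUE_DEPTH];
	size_t head, tail;
	unsigned long lock_acquisitions;	/* stand-in for pool spinlock traffic */
};

static struct model_wq model_wq;
static unsigned long chunks_released;

/* Each enqueue "takes the pool lock" once, as queue_work would. */
static void model_queue_work(struct model_work *work)
{
	assert(model_wq.tail - model_wq.head < MODEL_QUEUE_DEPTH);
	model_wq.lock_acquisitions++;
	model_wq.pending[model_wq.tail++ % MODEL_QUEUE_DEPTH] = work;
}

/* Run queued items until the queue is empty, including items queued meanwhile. */
static void model_drain(void)
{
	while (model_wq.head != model_wq.tail) {
		struct model_work *work =
			model_wq.pending[model_wq.head++ % MODEL_QUEUE_DEPTH];
		work->func(work);
	}
}

struct model_write_info {
	struct model_work work;		/* plays the role of the removed wi_work */
};

struct model_send_ctxt {
	struct model_work work;		/* must stay first: cast below relies on it */
	struct model_write_info *info;
	int requeue;			/* 1 = old cascade, 0 = inline release */
};

static void model_write_info_release(struct model_work *work)
{
	(void)work;
	chunks_released++;
}

/* Work item 1: already running on the workqueue when it is called. */
static void model_send_ctxt_release(struct model_work *work)
{
	struct model_send_ctxt *ctxt = (struct model_send_ctxt *)work;

	if (ctxt->requeue) {
		/* Old behavior: hand the release to work item 2. */
		ctxt->info->work.func = model_write_info_release;
		model_queue_work(&ctxt->info->work);
	} else {
		/* New behavior: release inline, no second enqueue. */
		model_write_info_release(&ctxt->info->work);
	}
}

/* Simulate one Send completion; return how often the "lock" was taken. */
static unsigned long model_send_completion(int requeue)
{
	struct model_write_info info = { .work = { NULL } };
	struct model_send_ctxt ctxt = {
		.work = { model_send_ctxt_release },
		.info = &info,
		.requeue = requeue,
	};

	model_wq.head = model_wq.tail = 0;
	model_wq.lock_acquisitions = 0;
	chunks_released = 0;

	model_queue_work(&ctxt.work);	/* still needed to leave CQ context */
	model_drain();
	return model_wq.lock_acquisitions;
}
```

In this model, model_send_completion(1) takes the lock twice per
completion and model_send_completion(0) takes it once, while the write
chunk is released exactly once in both cases. The model says nothing
about the measured CPU percentages quoted above.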

Reviewed-by: Mike Snitzer <snitzer@kernel.org>
Tested-by: Jonathan Flynn <jonathan.flynn@hammerspace.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
include/linux/sunrpc/svc_rdma.h
net/sunrpc/xprtrdma/svc_rdma_rw.c