From: Chuck Lever
Date: Tue, 26 May 2026 14:14:03 +0000 (-0400)
Subject: xprtrdma: Add request-pool slack for delayed recycling
X-Git-Tag: v7.2-rc1~46^2~27
X-Git-Url: http://git.ipfire.org/gitweb/?a=commitdiff_plain;h=64bf6892057b746c55bcc045b9492741b72d8d27;p=thirdparty%2Fkernel%2Flinux.git

xprtrdma: Add request-pool slack for delayed recycling

After the previous patch gates req recycling on Send completion, a
completed RPC's rpcrdma_req can remain pinned by the sendctx ring
until the next signaled Send completion releases it.

The transmitted-RPC ceiling is unchanged: xprt_request_get_cong()
gates Sends against xprt->cwnd, the RPC/RDMA credit window fed by
server-granted credits and capped at re_max_requests. The req pool,
however, must exceed max_reqs by enough that this recycle delay does
not stall a slot allocation that the credit window would admit.

The headroom is bounded. frwr_open() sets re_send_batch to
re_max_requests >> 3 -- one in every eight Sends is signaled -- so at
most re_send_batch unsignaled Sends can be outstanding before the
next signaled completion releases them. That equals max_reqs / 8 reqs
in the worst case, with a one-slot floor for small max_reqs values
where the right-shift rounds to zero.

The sendctx ring and the hardware Send Queue are not enlarged to
match. Both are sized in rpcrdma_sendctxs_create() and
frwr_query_device() for re_max_requests in-flight Sends, which is the
ceiling the credit window enforces. The pool slack does not raise
that ceiling -- it only lets allocation keep pace with the credit
window during the brief interval in which earlier reqs are pinned
waiting for the next signaled completion. At any moment, at most
re_send_batch sendctxes are held by unswept unsignaled Sends, leaving
the rest of the ring available for newly admitted Sends.

Allocate max_reqs + DIV_ROUND_UP(max_reqs, 8) request objects and
name the slack calculation at the allocation site so the 1/8 bound
stays tied to the Send-signaling batch size.

Signed-off-by: Chuck Lever
Signed-off-by: Anna Schumaker
---

diff --git a/net/sunrpc/xprtrdma/verbs.c b/net/sunrpc/xprtrdma/verbs.c
index 97b8b2376602c..98bd965787e68 100644
--- a/net/sunrpc/xprtrdma/verbs.c
+++ b/net/sunrpc/xprtrdma/verbs.c
@@ -1080,6 +1080,22 @@ static void rpcrdma_reps_destroy(struct rpcrdma_buffer *buf)
 	spin_unlock(&buf->rb_lock);
 }
 
+static unsigned int rpcrdma_req_pool_slack(unsigned int max_reqs)
+{
+	/* The sendctx ring can hold up to one Send-signaling batch
+	 * (re_send_batch, set by frwr_open() to re_max_requests >> 3)
+	 * of unfinished Sends. Each pins its req until a signaled Send
+	 * completion releases the sendctx. Size the pool above max_reqs
+	 * by that batch so the recycle delay does not stall a slot
+	 * allocation that the RPC/RDMA credit window would admit.
+	 *
+	 * Round up: re_max_requests >> 3 is zero when max_reqs < 8, but
+	 * a single unsignaled Send is still enough to pin one req. One
+	 * slack slot covers that case.
+	 */
+	return DIV_ROUND_UP(max_reqs, 8);
+}
+
 /**
  * rpcrdma_buffer_create - Create initial set of req/rep objects
  * @r_xprt: transport instance to (re)initialize
@@ -1089,6 +1105,7 @@ static void rpcrdma_reps_destroy(struct rpcrdma_buffer *buf)
 int rpcrdma_buffer_create(struct rpcrdma_xprt *r_xprt)
 {
 	struct rpcrdma_buffer *buf = &r_xprt->rx_buf;
+	unsigned int max_reqs;
 	int i, rc;
 
 	buf->rb_bc_srv_max_requests = 0;
@@ -1102,7 +1119,9 @@ int rpcrdma_buffer_create(struct rpcrdma_xprt *r_xprt)
 	INIT_LIST_HEAD(&buf->rb_all_reps);
 
 	rc = -ENOMEM;
-	for (i = 0; i < r_xprt->rx_xprt.max_reqs; i++) {
+	max_reqs = r_xprt->rx_xprt.max_reqs;
+	max_reqs += rpcrdma_req_pool_slack(max_reqs);
+	for (i = 0; i < max_reqs; i++) {
 		struct rpcrdma_req *req;
 
 		req = rpcrdma_req_create(r_xprt,
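
The rounding rule in rpcrdma_req_pool_slack() can be checked outside the
kernel. The following is a minimal user-space sketch, not kernel code: the
DIV_ROUND_UP() definition is a local stand-in for the kernel macro, the
helper names send_batch(), req_pool_slack(), req_pool_size(), and
check_examples() are hypothetical, and the sample max_reqs values are chosen
only for illustration. It compares the right-shift used for re_send_batch
with the rounded-up slack, which differ whenever max_reqs is not a multiple
of 8.

```c
#include <assert.h>

/* Local stand-in for the kernel's DIV_ROUND_UP() macro (assumption:
 * same arithmetic, defined here so the sketch builds in user space).
 */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Mirrors the re_send_batch value the commit message describes:
 * re_max_requests >> 3, which truncates toward zero.
 */
static unsigned int send_batch(unsigned int max_reqs)
{
	return max_reqs >> 3;
}

/* Mirrors the arithmetic of rpcrdma_req_pool_slack() in the patch. */
static unsigned int req_pool_slack(unsigned int max_reqs)
{
	return DIV_ROUND_UP(max_reqs, 8);
}

/* Number of req objects the patched allocation loop would create. */
static unsigned int req_pool_size(unsigned int max_reqs)
{
	return max_reqs + req_pool_slack(max_reqs);
}

/* Hypothetical self-check over a few illustrative values. */
static void check_examples(void)
{
	/* Multiple of 8: shift and round-up agree. */
	assert(send_batch(128) == 16);
	assert(req_pool_slack(128) == 16);
	assert(req_pool_size(128) == 144);

	/* Below 8: the shift yields 0, the slack keeps a one-slot floor. */
	assert(send_batch(4) == 0);
	assert(req_pool_slack(4) == 1);
	assert(req_pool_size(4) == 5);

	/* Not a multiple of 8: the slack is one more than the shift. */
	assert(send_batch(100) == 12);
	assert(req_pool_slack(100) == 13);
	assert(req_pool_size(100) == 113);
}
```

Calling check_examples() from a test harness or a small main() exercises all
three cases; the slack is never smaller than the shift-derived batch size for
these values, which is the property the commit message relies on.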