The fixed RPCRDMA_MAX_RECV_BATCH of 7 results in frequent
small ib_post_recv batches during high-rate workloads. With
a 128-slot credit window, receives are reposted every 7th
completion, each batch incurring atomic serialization and a
doorbell write.
Replace the fixed batch constant with a per-endpoint value
scaled to 25% of the negotiated credit window. For a typical
128-credit connection this raises the batch from 7 to 32,
reducing doorbell frequency by roughly 4x and amortizing the
per-batch atomic and MMIO costs over a larger group of
receive WRs.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
}
ep->re_attr.cap.max_send_wr += RPCRDMA_BACKWARD_WRS;
ep->re_attr.cap.max_send_wr += 1; /* for ib_drain_sq */
+ ep->re_recv_batch = ep->re_max_requests >> 2;
ep->re_attr.cap.max_recv_wr = ep->re_max_requests;
ep->re_attr.cap.max_recv_wr += RPCRDMA_BACKWARD_WRS;
- ep->re_attr.cap.max_recv_wr += RPCRDMA_MAX_RECV_BATCH;
+ ep->re_attr.cap.max_recv_wr += ep->re_recv_batch;
ep->re_attr.cap.max_recv_wr += 1; /* for ib_drain_rq */
ep->re_max_rdma_segs =
if (likely(ep->re_receive_count > needed))
goto out;
needed -= ep->re_receive_count;
- needed += RPCRDMA_MAX_RECV_BATCH;
+ needed += ep->re_recv_batch;
if (atomic_inc_return(&ep->re_receiving) > 1)
goto out_dec;
struct rpcrdma_notification re_rn;
int re_receive_count;
unsigned int re_max_requests; /* depends on device */
+ unsigned int re_recv_batch;
unsigned int re_inline_send; /* negotiated */
unsigned int re_inline_recv; /* negotiated */