rpcrdma_buffer_get() and rpcrdma_buffer_put() both take rb_lock to
pop/push from the rb_send_bufs free list. Under high I/O concurrency
(e.g., nconnect=N with small random writes), this spinlock is contended
between the request submission path and the transport completion path.
Replace the list_head with an llist_head. The put side uses
lockless llist_add(), which is safe for concurrent producers. The
get side retains the spinlock to satisfy the llist single-consumer
contract portably; submitters continue to serialize there. Completion
handlers returning buffers no longer contend on rb_lock, eliminating
contention on the return path.
rb_lock remains for the MR free list and the tracking lists used
during setup and teardown. rb_free_reps already uses llist_head, so
the llist idiom is established in this structure. The precedent is the
data structure, not the locking: rb_free_reps serializes its single
consumer through the re_receiving gate in rpcrdma_post_recvs, whereas
rb_send_bufs serializes its consumer with rb_lock. Both satisfy the
llist single-consumer contract.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <anna.schumaker@hammerspace.com>