In a benchmark scenario that issues a lot of requests, and therefore a
lot of DLM messages on the network, the entries in "node->send_queue"
can end up not being ordered by mh->seq, oldest sequence number first.
dlm_receive_ack requires this ordering: it stops walking the queue as
soon as "before(mh->seq, seq)" is false, so older sequence numbers that
sit behind that entry, towards the tail of "node->send_queue", are
never checked. The side effects of a misordered queue are refcounting
issues and use-after-free.
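As an illustration of the ordering requirement described above, here is
a minimal user space sketch. It is not the kernel code: seq_before(),
count_acked() and demo_ordering() are hypothetical names, the send queue
is modeled as a plain array, and the walk only mimics the "stop at the
first entry that is not older" behavior of dlm_receive_ack.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Wraparound-safe "a is older than b", in the style of before(). */
static int seq_before(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) < 0;
}

/*
 * Hypothetical model of the ack walk: count the leading entries that
 * are older than the acked seq and stop at the first one that is not.
 * Entries behind that point are never looked at.
 */
static size_t count_acked(const uint32_t *queue, size_t len, uint32_t seq)
{
	size_t n = 0;

	while (n < len && seq_before(queue[n], seq))
		n++;

	return n;
}

/* Compare an ordered queue with one where seq 6 and 7 were swapped. */
static void demo_ordering(void)
{
	const uint32_t ordered[] = { 5, 6, 7, 8 };
	const uint32_t unordered[] = { 5, 7, 6, 8 };

	/* Ordered: both 5 and 6 are seen as acked by seq 7. */
	assert(count_acked(ordered, 4, 7) == 2);
	/* Unordered: the walk stops at 7, so 6 is never acked. */
	assert(count_acked(unordered, 4, 7) == 1);
}
```

In the unordered case the entry with seq 6 is skipped although it is
older than the acked seq 7. In this simplified model, that is the kind
of missed entry the ordering requirement is meant to rule out.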
I was only able to reproduce this issue with an experimental DLM branch
and a user space DLM benchmark that uses io_uring. After moving the
mh->seq assignment under send_queue_lock, next to list_add_tail_rcu(),
I no longer saw any refcounting issues with the sending buffer.
Fixes: 489d8e559c659 ("fs: dlm: add reliable connection if reconnect")
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
atomic_inc(&mh->node->send_queue_cnt);
spin_lock_bh(&mh->node->send_queue_lock);
+ /* need to be locked with list_add_tail_rcu() because list is ordered */
+ mh->seq = atomic_fetch_inc(&mh->node->seq_send);
list_add_tail_rcu(&mh->list, &mh->node->send_queue);
spin_unlock_bh(&mh->node->send_queue_lock);
-
- mh->seq = atomic_fetch_inc(&mh->node->seq_send);
}
static struct dlm_msg *dlm_midcomms_get_msg_3_2(struct dlm_mhandle *mh, int nodeid,