From: Willy Tarreau Date: Thu, 14 Nov 2019 13:58:39 +0000 (+0100) Subject: BUG/MINOR: queue/threads: make the queue unlinking atomic X-Git-Tag: v2.1-dev5~25 X-Git-Url: http://git.ipfire.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=9ada030697c945d0e4bcbc85870d6d25f33b76b0;p=thirdparty%2Fhaproxy.git BUG/MINOR: queue/threads: make the queue unlinking atomic There is a very short race in the queues which happens in the following situation: - stream A on thread 1 is being processed by a server - stream B on thread 2 waits in the backend queue for a server - stream B on thread 2 is fed up with waiting and expires, calls stream_free() which calls pendconn_free(), which sees the stream attached - at the exact same instant, stream A finishes on thread 1, sees one stream is waiting (B), detaches it and wakes it up - stream B continues pendconn_free() and calls pendconn_unlink() - pendconn_unlink() now detaches the node again and performs a second deletion (harmless since idempotent), and decrements srv/px->nbpend again => the number of connections on the proxy or server may reach -1 if/when this race occurs. It is extremely tight as it can only occur during the test on p->leaf_p though it has been witnessed at least once. The solution consists in testing leaf_p again once the lock is held to make sure the element was not removed in the mean time. This should be backported to 2.0 and 1.9, probably even 1.8. --- diff --git a/src/queue.c b/src/queue.c index 30b7ef0566..1bb08a5131 100644 --- a/src/queue.c +++ b/src/queue.c @@ -171,16 +171,17 @@ static inline void pendconn_queue_unlock(struct pendconn *p) /* Removes the pendconn from the server/proxy queue. At this stage, the * connection is not really dequeued. It will be done during process_stream(). - * This function takes all the required locks for the operation. The caller is - * responsible for ensuring that

is valid and still in the queue. Use - * pendconn_cond_unlink() if unsure. When the locks are already held, please - * use __pendconn_unlink() instead. + * This function takes all the required locks for the operation. The pendconn + * must be valid, though it doesn't matter if it was already unlinked. Prefer + * pendconn_cond_unlink() to first check

. When the locks are already held, + * please use __pendconn_unlink() instead. */ void pendconn_unlink(struct pendconn *p) { pendconn_queue_lock(p); - __pendconn_unlink(p); + if (p->node.node.leaf_p) + __pendconn_unlink(p); pendconn_queue_unlock(p); }