From fac0f645dff9fb1854db32da9fa7c317eaaeede1 Mon Sep 17 00:00:00 2001
From: Willy Tarreau
Date: Fri, 2 Oct 2020 17:52:49 +0200
Subject: [PATCH] BUG/MEDIUM: queue: make pendconn_cond_unlink() really thread-safe

A crash reported in github issue #880 looks impossible unless
pendconn_cond_unlink() occasionally sees a null leaf_p when attempting
to remove an entry, which seems to be confirmed by the reporter. What
seems to be happening is that depending on compiler optimizations,
this pointer can appear as null while pointers are moved if one of
the node's parents is removed from or inserted into the tree. There's
no explicit null of the pointer during these operations but those
pointers are rewritten in multiple steps and nothing prevents this
situation from happening, and there are no particular barriers or
atomic ops around this.

This test was used to avoid unnecessary locking, for already deleted
entries, but looking at the code it appears that pendconn_free() already
resets s->pend_pos that's used as <p> there, and that the other call
reasons are after an error where the connection will be dropped as
well. So we don't save anything by doing this test, and it makes the
code unsafe.

The older code used to check for list emptiness there and not inside
pendconn_unlink(), which explains why the code has stayed there. Let's
just remove this now.

Thanks to @jaroslawr for reporting this issue in great detail and for
testing the proposed fix.

This should be backported to 1.8, where the test on LIST_ISEMPTY should
be moved to pendconn_unlink() instead (inside the lock, just like 2.0+).
---
 include/haproxy/queue.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/haproxy/queue.h b/include/haproxy/queue.h
index f803bdd9db..5cc571d61a 100644
--- a/include/haproxy/queue.h
+++ b/include/haproxy/queue.h
@@ -47,7 +47,7 @@ void pendconn_unlink(struct pendconn *p);
  */
 static inline void pendconn_cond_unlink(struct pendconn *p)
 {
-	if (p && p->node.node.leaf_p)
+	if (p)
 		pendconn_unlink(p);
 }
 
-- 
2.39.5