From: Willy Tarreau Date: Thu, 5 Dec 2019 06:40:32 +0000 (+0100) Subject: BUG/MEDIUM: listener/thread: fix a race when pausing a listener X-Git-Tag: v2.2-dev1~215 X-Git-Url: http://git.ipfire.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=4c044e274c16fde42863c476449895b0fd603818;p=thirdparty%2Fhaproxy.git BUG/MEDIUM: listener/thread: fix a race when pausing a listener There exists a race in the listener code where a thread might disable receipt on a listener's FD then turn it to LI_PAUSED while at the same time another one faces EAGAIN on accept() and enables it again via fd_cant_recv(). The result is that the FD is in LI_PAUSED state with its polling still enabled. listener_accept() does not do anything then and doesn't disable the FD either, resulting in a thread eating all the CPU as reported in issue #358. A solution would be to take the listener's lock to perform the fd_cant_recv() call and do it only if the FD is still in LI_READY state, but this would be totally overkill while in practice the issue only happens during shutdown. Instead what is done here is that when leaving we recheck the state and disable polling if the listener is not in LI_READY state, which never happens except when being limited. In the worst case there could be one extra check per thread for the time required to converge, which is absolutely nothing. This fix was successfully tested, and should be backported to all versions using the lock-free listeners, which means all those containing commit 3f0d02bb ("MAJOR: listener: do not hold the listener lock in listener_accept()"), hence 2.1, 2.0, 1.9.7+, 1.8.20+. --- diff --git a/src/listener.c b/src/listener.c index 1ce8c596d7..d68a1c3335 100644 --- a/src/listener.c +++ b/src/listener.c @@ -1058,6 +1058,9 @@ void listener_accept(int fd) (!p->fe_sps_lim || freq_ctr_remain(&p->fe_sess_per_sec, p->fe_sps_lim, 0) > 0)) dequeue_all_listeners(&p->listener_queue); } + + if (l->state != LI_READY) + fd_stop_recv(l->fd); } /* Notify the listener that a connection initiated from it was released. This