From: Willy Tarreau Date: Sat, 9 Jul 2022 16:36:49 +0000 (+0200) Subject: MEDIUM: epoll: don't synchronously delete migrated FDs X-Git-Tag: v2.7-dev2~79 X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=0d023774bff9cfc49ab5665f73c6258f0f60f9b3;p=thirdparty%2Fhaproxy.git MEDIUM: epoll: don't synchronously delete migrated FDs Between 1.8 and 1.9 commit d9e7e36c6 ("BUG/MEDIUM: epoll/threads: use one epoll_fd per thread") split the epoll poller to use one poller per thread (and this was backported to 1.8). This patch added a call to epoll_ctl(DEL) on return from the I/O handler as a safe way to deal with a detected thread migration when that code was still quite fragile. One aspect of this choice was that by then we wanted to maintain support for the rare old bogus epoll implementations that failed to remove events on close(), so risking to lose the event was not an option. Later in 2.5, commit 200bd50b7 ("MEDIUM: fd: rely more on fd_update_events() to detect changes") changed the code to perform most of the operations inside fd_update_events(), but it maintained that oddity, to the point that strictly all pollers except epoll now just add an update to be dealt with at the next round. This approach is much more efficient, because under load and server-side connection reuse, it's perfectly possible for a thread to see the same FD several times in a poll loop, the first time to relinquish it after a migration, then the other thread makes a request, gets its response, and still during the same loop for the first one, grabbing an idle connection to send a request and wait for a response will program a new update on this FD. By using a synchronous epoll_ctl(DEL), we effectively lose the opportunity to aggregate certain changes in the same update. Some tests performed locally with 8 threads and one server show that on average, by using an update instead of a synchronous call, we reduce the number of epoll_ctl() calls by 25-30% (under low loads it will probably not change anything). So this patch implements the same method for all pollers and replaces the synchronous epoll_ctl() with an update. --- diff --git a/src/ev_epoll.c b/src/ev_epoll.c index fd49d92c56..1917ed1aca 100644 --- a/src/ev_epoll.c +++ b/src/ev_epoll.c @@ -213,7 +213,6 @@ static void _do_poll(struct poller *p, int exp, int wake) /* process polled events */ for (count = 0; count < status; count++) { - struct epoll_event ev; unsigned int n, e; int ret; @@ -236,9 +235,8 @@ static void _do_poll(struct poller *p, int exp, int wake) if (ret == FD_UPDT_MIGRATED) { /* FD has been migrated */ - epoll_ctl(epoll_fd[tid], EPOLL_CTL_DEL, fd, &ev); - _HA_ATOMIC_AND(&polled_mask[fd].poll_recv, ~tid_bit); - _HA_ATOMIC_AND(&polled_mask[fd].poll_send, ~tid_bit); + if (!HA_ATOMIC_BTS(&fdtab[fd].update_mask, tid)) + fd_updt[fd_nbupdt++] = fd; } } /* the caller will take care of cached events */