From: Willy Tarreau
Date: Thu, 2 Jul 2020 17:05:30 +0000 (+0200)
Subject: BUG/MEDIUM: server: don't kill all idle conns when there are not enough
X-Git-Tag: v2.2-dev12~8
X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=18ed789ae262382e3fe89288132d709c114575aa;p=thirdparty%2Fhaproxy.git

BUG/MEDIUM: server: don't kill all idle conns when there are not enough

In srv_cleanup_idle_connections(), we compute how many idle connections
are in excess compared to the average need. But we may actually be
missing some, for example if a certain number were recently closed and
the average of used connections didn't change much since the previous
period. In this case exceed_conns can become negative.

There was no special case for this in the code, and calculating the
per-thread share of connections to kill based on this value resulted in
the special value -1 being passed to srv_migrate_conns_to_remove(), which
for this function means "kill all of them", as used in
srv_cleanup_connections() for example.

This causes large variations in the idle connection counts on servers,
and CPU spikes at the moment the cleanup task runs. These were much more
visible with SSL, as it costs CPU to close and re-establish these
connections, and it also takes time, which reduces the reuse ratio and
hence increases the number of connections during reconnection.

In this patch we simply skip the killing loop when this condition is met.

No backport is needed, this is purely 2.2.
---

diff --git a/src/server.c b/src/server.c
index a016b7bd90..b0c061e88f 100644
--- a/src/server.c
+++ b/src/server.c
@@ -5288,6 +5288,9 @@ struct task *srv_cleanup_idle_connections(struct task *task, void *context, unsi
 
 		srv->max_used_conns = srv->curr_used_conns;
 
+		if (exceed_conns <= 0)
+			goto remove;
+
 		/* check all threads starting with ours */
 		for (i = tid;;) {
 			int max_conn;
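
The failure mode described in the commit message can be reproduced in isolation with the minimal sketch below. It is not the HAProxy code: the share formula in per_thread_share() is an assumption chosen for illustration, and migrate_count() and conns_killed() are hypothetical stand-ins that only model the "-1 means kill all of them" convention attributed above to srv_migrate_conns_to_remove().

```c
#include <stdio.h>

/* Hypothetical per-thread share of connections to kill.
 * ASSUMPTION: the formula is illustrative, not taken from the patch.
 * C integer division truncates toward zero, so a negative
 * exceed_conns can make this return exactly -1.
 */
static int per_thread_share(int exceed_conns, int idle_on_thread, int curr_idle)
{
	return (exceed_conns * idle_on_thread) / curr_idle + 1;
}

/* Hypothetical model of a "remove up to N" helper in which
 * toremove_nb == -1 means "no limit": returns how many of the
 * <available> connections would be removed.
 */
static int migrate_count(int available, int toremove_nb)
{
	int i = 0;

	while ((toremove_nb == -1 || i < toremove_nb) && i < available)
		i++;
	return i;
}

/* Connections removed on one thread, with or without the
 * "exceed_conns <= 0" guard that the patch introduces.
 */
static int conns_killed(int exceed_conns, int idle_on_thread, int curr_idle, int guarded)
{
	if (guarded && exceed_conns <= 0)
		return 0; /* nothing in excess: skip the killing loop */

	return migrate_count(idle_on_thread,
	                     per_thread_share(exceed_conns, idle_on_thread, curr_idle));
}
```

Under these assumptions, conns_killed(-4, 5, 10, 0) removes all 5 idle connections of the thread, because the share evaluates to -1, while the guarded call conns_killed(-4, 5, 10, 1) removes none. A positive excess is unaffected by the guard.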