From: Willy Tarreau
Date: Thu, 15 May 2025 13:41:50 +0000 (+0200)
Subject: BUG/MEDIUM: peers: also limit the number of incoming updates
X-Git-Tag: v3.2-dev17~43
X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=f2d7aa8406912ee20dab4deba83c184bcdde5056;p=thirdparty%2Fhaproxy.git

BUG/MEDIUM: peers: also limit the number of incoming updates

There's a configurable limit to the number of messages sent to a
peer (tune.peers.max-updates-at-once), but this one is not applied to
the receive side. While it can usually be OK with default settings,
setups involving a large tune.bufsize (1MB and above) regularly
experience high latencies and even watchdogs during reloads because
the full learning process sends a lot of data that manages to fill
the entire buffer, and due to the compactness of the protocol, 1MB of
buffer can contain more than 100k updates, meaning taking locks etc
during this time, which is not workable.

Let's make sure the receiving side also respects the max-updates-at-once
setting. For this it counts incoming updates, and refrains from
continuing once the limit is reached. It's a bit tricky to do because
after receiving updates we still have to send ours (and possibly some
ACKs) so we cannot just leave the loop.

This issue was reported on 3.1 but it should progressively be backported
to all versions having the max-updates-at-once option available.
---

diff --git a/src/peers.c b/src/peers.c
index 6f22e1751..e79d9d9d3 100644
--- a/src/peers.c
+++ b/src/peers.c
@@ -2905,6 +2905,7 @@ static void peer_io_handler(struct appctx *appctx)
 	int repl = 0;
 	unsigned int maj_ver, min_ver;
 	int prev_state;
+	int msg_done = 0;
 
 	if (unlikely(se_fl_test(appctx->sedesc, (SE_FL_EOS|SE_FL_ERROR)))) {
 		co_skip(sc_oc(sc), co_data(sc_oc(sc)));
@@ -3107,6 +3108,19 @@ switchstate:
 					applet_wont_consume(appctx);
 					goto out;
 				}
+
+				/* check if we've already hit the rx limit (i.e. we've
+				 * already gone through send_msgs and we don't want to
+				 * process input messages again). We must absolutely
+				 * leave via send_msgs otherwise we can leave the
+				 * connection in a stuck state if acks are missing for
+				 * example.
+				 */
+				if (msg_done >= peers_max_updates_at_once) {
+					applet_have_more_data(appctx); // make sure to come back here
+					goto send_msgs;
+				}
+
 				applet_will_consume(appctx);
 
 				/* local peer is assigned of a lesson, start it */
@@ -3128,6 +3142,12 @@ switchstate:
 
 				/* skip consumed message */
 				co_skip(sc_oc(sc), totl);
+
+				/* make sure we don't process too many at once */
+				if (msg_done >= peers_max_updates_at_once)
+					goto send_msgs;
+				msg_done++;
+
 				/* loop on that state to peek next message */
 				goto switchstate;
 
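
For readers who want to see the control flow of the patch in isolation, here
is a minimal standalone sketch. It is not HAProxy code: struct rx_queue,
handle_io() and drain() are hypothetical names, the limit of 200 is only an
illustrative value, and the whole buffer is reduced to a counter. It only
shows the idea described in the commit message: count incoming updates, stop
reading once the limit is hit, always go through the send phase, and report
that more data is pending so the handler gets called again.

```c
#include <assert.h>

/* Illustrative stand-in for the max-updates-at-once setting (assumed value). */
static int max_updates_at_once = 200;

/* Hypothetical model of one peer connection. */
struct rx_queue {
	int pending; /* updates still waiting in the input buffer */
	int sends;   /* number of times the send phase was run */
};

/* One wakeup of the I/O handler.
 * Returns 1 if input is left and the handler must be called again.
 */
static int handle_io(struct rx_queue *q)
{
	int msg_done = 0;

	while (q->pending > 0) {
		/* rx limit reached: stop reading, but do not return here,
		 * the send phase below must still run.
		 */
		if (msg_done >= max_updates_at_once)
			break;
		q->pending--; /* "process" one incoming update */
		msg_done++;
	}
	assert(msg_done <= max_updates_at_once);

	/* send phase (our own updates, ACKs): runs on every wakeup */
	q->sends++;

	/* equivalent of asking to be woken up again when input remains */
	return q->pending > 0;
}

/* Calls the handler until the input is drained, returns the wakeup count. */
static int drain(struct rx_queue *q)
{
	int wakeups = 0;

	do {
		wakeups++;
	} while (handle_io(q));
	return wakeups;
}
```

With these assumed values, draining a queue of 100000 pending updates takes
500 wakeups of at most 200 updates each instead of a single pass over the
whole buffer, and the send phase runs once per wakeup.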