From: Willy Tarreau Date: Tue, 16 Jul 2024 14:38:31 +0000 (+0200) Subject: BUG/MAJOR: mux-h2: force a hard error upon short read with pending error X-Git-Tag: v3.1-dev4~44 X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=4de03e42cdc6361cf9d26e9c599469617e88eab5;p=thirdparty%2Fhaproxy.git BUG/MAJOR: mux-h2: force a hard error upon short read with pending error A risk of truncated packet was addressed in 2.9 by commit 19fb19976f ("BUG/MEDIUM: mux-h2: Only Report H2C error on read error if demux buffer is empty") by ignoring CO_FL_ERROR after a recv() call as long as some data remained present in the buffer. However it has a side effect due to the fact that some frame processors only deal with full frames, for example, HEADERS. The side effect is that an incomplete frame will not be processed and will remain in the buffer, preventing the error from being taken into account, so the I/O handler wakes up the H2 parser to handle the error, and that one just subscribes for more data, and this loops forever wasting CPU cycles. Note that this only happens with errors at the SSL layer exclusively, otherwise we'd have a read0 pending that would properly be detected: conn->flags = CO_FL_XPRT_TRACKED | CO_FL_ERROR | CO_FL_XPRT_READY | CO_FL_CTRL_READY conn->err_code = CO_ERR_SSL_FATAL h2c->flags = H2_CF_ERR_PENDING | H2_CF_WINDOW_OPENED | H2_CF_MBUF_HAS_DATA | H2_CF_DEM_IN_PROGRESS | H2_CF_DEM_SHORT_READ The condition to report the error in h2_recv() needs to be refined, so that connection errors are taken into account either when the buffer is empty, or when there's an incomplete frame, since we're certain it will never be completed. We're certain to enter that function because H2_CF_DEM_SHORT_READ implies too short a frame, and earlier there's a protocol check to validate that no frame size is larger than bufsize, hence a H2_CF_DEM_SHORT_READ implies there's some room left in the buffer and we're allowed to try to receive. The condition to reproduce the bug seems super hard to meet but was observed once by Patrick Hemmer who had the reflex to capture lots of information that allowed to explain the problem. In order to reproduce it, the SSL code had to be significantly modified to alter received contents at very empiric places, but that was sufficient to reproduce it and confirm that the current patch works as expected. The bug was tagged MAJOR because when it triggers there's no other solution to get rid of it but to restart the process. However given how hard it is to trigger on a lab, it does not seem very likely to occur in field. This needs to be backported to 2.9. --- diff --git a/src/mux_h2.c b/src/mux_h2.c index b2b1d47181..5d9bb20e92 100644 --- a/src/mux_h2.c +++ b/src/mux_h2.c @@ -4156,7 +4156,8 @@ static int h2_recv(struct h2c *h2c) TRACE_DATA("received read0", H2_EV_H2C_RECV, h2c->conn); h2c->flags |= H2_CF_RCVD_SHUT; } - if (h2c->conn->flags & CO_FL_ERROR && !b_data(&h2c->dbuf)) { + if (h2c->conn->flags & CO_FL_ERROR && + (!b_data(&h2c->dbuf) || (h2c->flags & H2_CF_DEM_SHORT_READ))) { TRACE_DATA("connection error", H2_EV_H2C_RECV, h2c->conn); h2c->flags |= H2_CF_ERROR; }