From: Willy Tarreau Date: Mon, 29 Aug 2022 08:22:56 +0000 (+0200) Subject: BUG/MEDIUM: mux-h1: do not refrain from signaling errors after end of input X-Git-Tag: v2.7-dev5~43 X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=cad42a78b846fab542d6172814526abe369a3ab6;p=thirdparty%2Fhaproxy.git BUG/MEDIUM: mux-h1: do not refrain from signaling errors after end of input In 2.6-dev4, a fix for truncated response was brought with commit 99bbdbcc2 ("BUG/MEDIUM: mux-h1: only turn CO_FL_ERROR to CS_FL_ERROR with empty ibuf"), trying to address the situation where an error is present at the connection level but some data are still pending to be read by the stream. However, this patch did not consider the case where the stream was no longer willing to read the pending data, resulting in a situation where some aborted transfers could lead to excessive CPU usage by causing constant stream wakeups for which no error was reported. This perfectly matches what was observed and reported in github issue #1842. It's not trivial to reproduce, but aborting HTTP/1 pipelining in the middle of transfers seems to give good results (using h2load and Ctrl-C in the middle). The fix was incorrct as the error should be held only if there were data that the stream was able to read. This is the approach taken by this patch, which also checks via SE_FL_EOI | SE_FL_EOS that the stream will be able to consume the pending data. Note that the loop was provoked by the attempt by sc_conn_io_cb() itself to call sc_conn_send() which resulted in a write subscription in h1_subscribe() which immediately calls a tasklet_wakeup() since the event is ready, and that it is now stopped by the presence of SE_FL_ERROR that is checked in sc_conn_io_cb(). It seems that an extra check down the send() path to refrain from subscribing when the connection is in error could speed up error detection or at least avoid a risk of loops in this case, but this is tricky. In addition, there's already SE_FL_ERR_PENDING that seems more suitable for reporting when there are pending data, but similarly, it probably isn't checked well enough to be suitable for backports. FWIW the issue may (unreliably) be reproduced by chaining haproxy to httpterm and issuing: (printf "GET /?s=10g HTTP/1.1\r\n\r\n"; sleep 0.1; printf "\r\n") | \ nc6 --half-close 0 8001 | head -c1000000000 >/dev/null It's necessary to play with the size of the head command that's supposed to trigger the error at some point. A variant involving h2load in h1 mode and multiple pipelined streams, that is stopped with Ctrl-C also tends to work. As the fix above was backported as far as 2.0, it would be tempting to backport this one as far. However tests have shown that the oldest version that can trigger this issue is 2.5, maybe due to subtle differences in older ones, so it's probably not worth going further until an issue is reported. Note that in 2.5 and older, the SE_FL_* flags are applied on the conn_stream instead, as CS_FL_*. Special thanks go to Felipe W Damasio for providing lots of detailed data allowing to quickly spot the root cause of the problem. --- diff --git a/src/mux_h1.c b/src/mux_h1.c index fe3f20be99..c72e24d62d 100644 --- a/src/mux_h1.c +++ b/src/mux_h1.c @@ -3058,7 +3058,8 @@ static int h1_process(struct h1c * h1c) h1s->flags |= H1S_F_REOS; TRACE_STATE("read0 on connection", H1_EV_H1C_RECV, conn, h1s); } - if ((h1c->flags & H1C_F_ST_ERROR) || ((conn->flags & CO_FL_ERROR) && !b_data(&h1c->ibuf))) + if ((h1c->flags & H1C_F_ST_ERROR) || ((conn->flags & CO_FL_ERROR) && + (se_fl_test(h1s->sd, SE_FL_EOI | SE_FL_EOS) || !b_data(&h1c->ibuf)))) se_fl_set(h1s->sd, SE_FL_ERROR); TRACE_POINT(H1_EV_STRM_WAKE, h1c->conn, h1s); h1_alert(h1s); @@ -3743,7 +3744,8 @@ static size_t h1_snd_buf(struct stconn *sc, struct buffer *buf, size_t count, in break; } - if (h1c->flags & H1C_F_ST_ERROR || ((h1c->conn->flags & CO_FL_ERROR) && !b_data(&h1c->ibuf))) { + if ((h1c->flags & H1C_F_ST_ERROR) || ((h1c->conn->flags & CO_FL_ERROR) && + (se_fl_test(h1s->sd, SE_FL_EOI | SE_FL_EOS) || !b_data(&h1c->ibuf)))) { se_fl_set(h1s->sd, SE_FL_ERROR); TRACE_ERROR("reporting error to the app-layer stream", H1_EV_STRM_SEND|H1_EV_H1S_ERR|H1_EV_STRM_ERR, h1c->conn, h1s); }