From: Willy Tarreau Date: Thu, 18 Dec 2025 14:57:48 +0000 (+0100) Subject: BUG/MEDIUM: mux-h2: synchronize all conditions to create a new backend stream X-Git-Url: http://git.ipfire.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=9a046fc3ad600ea0cdf3b884831f90ae96f368a5;p=thirdparty%2Fhaproxy.git BUG/MEDIUM: mux-h2: synchronize all conditions to create a new backend stream In H2 the conditions to create a new stream differ for a client and a server when a GOAWAY was exchanged. While on the server, any stream whose ID is lower than or equal to the one advertised in GOAWAY is valid, for a client it's forbidden to create any stream after receipt of a GOAWAY, even if its ID is lower than or equal to the last one, despite the server not being able to tell the difference from the number of streams in flight. Unfortunately, the logic in the code did not always reflect this specificity of the client (the backend code in our case), and most often considered that it was still permitted to create a new stream until the max_id was greater than or equal to the advertised last_id. This is for example what h2c_is_dead() and h2c_streams_left() do. In other places, such as h2_avail_streams(), the rule is properly taken into account. Very often the advertised last_id is the same, and this is also what haproxy does (which explains why it's impossible to reproduce the issue by chaining two haproxy layers), but a server may wish to advertise any ID including 2^31-1 as mentioned in the spec, and in this case the functions would behave differently. This discrepancy results in a corner case where a GOAWAY received on an idle connection will cause the next stream creation to be initially accepted but then rejected via h2_avail_streams(), and the connection left in a bad state, still attached to the session due to http-reuse safe, but not reinserted into idle list, since the backend code currently is not able to properly recover from this situation. Worse, the idle flags are no longer on it but TASK_F_USR1 still is, and this makes the recently added BUG_ON() rightfully trigger since this case is not supposed to happen. Admittedly more of the backend recovery code needs to be reworked, however the mux must consistently decide whether or not a connection may be reused or needs to be released. This commit fixes the affected logic by introducing a new function "h2c_reached_last_stream()" which says if a connection has reached its last stream, regardless of the side, and using this one everywhere max_id was compared to last_id. This is sufficient to address the corner case that be_reuse_connection() currently cannot recover from. This is in relation to GH issue #3215 and it should be sufficient to fix the issue there. Thanks to Chris Staite for reporting the issue and kudos to Amaury for spotting the events sequence that can lead to this situation. This patch must be backported to 3.3 first, then to older versions later. It's worth noting that it's much more difficult to observe the issue before 3.3 because the BUG_ON() is not there, and the possibly non-released connection might end up being killed for other reasons (timeouts etc). But one possible visible effect might be the impossibility to delete a server (which Chris observed in 3.3). --- diff --git a/src/mux_h2.c b/src/mux_h2.c index 61ee41848..857d1dfa0 100644 --- a/src/mux_h2.c +++ b/src/mux_h2.c @@ -979,6 +979,33 @@ static void h2c_update_timeout(struct h2c *h2c) TRACE_LEAVE(H2_EV_H2C_WAKE); } +/* returns non-zero if the connection reached its last possible stream, i.e. + * on a frontend, if last_sid is set, max_id reached it, and on a backend, + * last_sid is set (since it's forbidden by the spec to initiate new streams + * on a connection that received a GOAWAY regardless of the advertised last + * stream id). + */ +static inline int h2c_reached_last_stream(const struct h2c *h2c) +{ + /* highest stream ID already reached ? */ + if (h2c->max_id >= 0x7fffffff) + return 1; + + /* GOAWAY sent or received ? */ + if (h2c->last_sid < 0) + return 0; + + if (h2c->flags & H2_CF_IS_BACK) + return 1; + + /* front: reached advertised limit ? */ + if (h2c->max_id >= h2c->last_sid) + return 1; + + /* ok not yet */ + return 0; +} + static __inline int h2c_is_dead(const struct h2c *h2c) { @@ -989,7 +1016,7 @@ h2c_is_dead(const struct h2c *h2c) (!(h2c->conn->owner) && !conn_is_reverse(h2c->conn)) || /* Nobody's left to take care of the connection, drop it now */ (!br_data(h2c->mbuf) && /* mux buffer empty, also process clean events below */ ((h2c->flags & H2_CF_RCVD_SHUT) || - (h2c->last_sid >= 0 && h2c->max_id >= h2c->last_sid))))) + h2c_reached_last_stream(h2c))))) return 1; return 0; @@ -1146,12 +1173,14 @@ static inline int h2_streams_left(const struct h2c *h2c) { int ret; + if (h2c_reached_last_stream(h2c)) + return 0; + /* consider the number of outgoing streams we're allowed to create before - * reaching the last GOAWAY frame seen. max_id is the last assigned id, + * reaching the highest stream number. max_id is the last assigned id, * nb_reserved is the number of streams which don't yet have an ID. */ - ret = (h2c->last_sid >= 0) ? h2c->last_sid : 0x7FFFFFFF; - ret = (unsigned int)(ret - h2c->max_id) / 2 - h2c->nb_reserved - 1; + ret = (unsigned int)(0x7FFFFFFF - h2c->max_id) / 2 - h2c->nb_reserved - 1; if (ret < 0) ret = 0; return ret; @@ -5161,8 +5190,7 @@ static int h2_process(struct h2c *h2c) if ((h2c->flags & H2_CF_ERROR) || h2c_read0_pending(h2c) || h2c->st0 == H2_CS_ERROR2 || h2c->flags & H2_CF_GOAWAY_FAILED || - (eb_is_empty(&h2c->streams_by_id) && h2c->last_sid >= 0 && - h2c->max_id >= h2c->last_sid)) { + (eb_is_empty(&h2c->streams_by_id) && h2c_reached_last_stream(h2c))) { h2_wake_some_streams(h2c, 0); if (eb_is_empty(&h2c->streams_by_id)) {