BUG/MEDIUM: hlua: streams don't support mixing lua-load with lua-load-per-thread
Michel Mayen reported that mixing lua actions loaded from 'lua-load'
and 'lua-load-per-thread' directives within a single http/tcp session
yields unexpected results.
When executing an action defined in a different running context from the
one of the previously executed action (from lua-load then from
lua-load-per-thread or the opposite, the order doesn't matter), it would
yield this kind of error:
"Lua function 'name': [state-id x] runtime error: attempt to call a nil value from ."
He also noted that when loading all actions using the same loading
directive, the issue is gone.
This is due to the fact that for lua actions, fetches and converters, lua
code is being executed from the stream lua context. However, the stream
lua context, which is created on the fly when first executing some lua
code related to the stream, is reused between multiple lua executions.
But the thing is, despite successive executions referring to the same
parent "stream" (which is also assigned to a given thread id), they don't
necessarily depend on the same running context from the lua point of view.
Indeed, since the function which is about to be executed could have been
loaded from either 'lua-load' or 'lua-load-per-thread', the function
declaration and related dependencies are defined in a specific stack ID
which is known by calling fcn_ref_to_stack_id() on the given function.
Thus, in order to make streams capable of chaining lua actions, fetches and
converters loaded in different lua stacks, we add a new detection logic
in hlua_stream_ctx_prepare() to be able to recreate the lua context in the
proper stack space when the existing one conflicts with the expected stack
id.
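The failure mode can be illustrated with the plain Lua C API (standalone
sketch, not haproxy code; the two states below only mimic the shared and
per-thread lua stacks):

  #include <stdio.h>
  #include <lua.h>
  #include <lauxlib.h>

  int main(void)
  {
      lua_State *shared = luaL_newstate();      /* akin to the 'lua-load' stack */
      lua_State *per_thread = luaL_newstate();  /* akin to a 'lua-load-per-thread' stack */

      /* the action only exists in the state it was loaded into */
      luaL_dostring(shared, "function act() return 1 end");

      /* looking it up from the other state yields nil, hence the
       * "attempt to call a nil value" runtime error */
      lua_getglobal(per_thread, "act");
      printf("act is %s in the other state\n",
             lua_isnil(per_thread, -1) ? "nil" : "defined");

      lua_close(shared);
      lua_close(per_thread);
      return 0;
  }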
This must be backported to every stable version.
It depends on:
- "MINOR: hlua: add hlua_stream_prepare helper function"
[for < 2.5, skip the filter part since they didn't exist]
[wt: warning, wait a little bit before backporting too far, we
need to be certain the added BUG_ON() will never trigger]
BUG/MINOR: hlua: fix invalid use of lua_pop on error paths
Multiple error paths made invalid use of lua_pop():
When the stack is emptied using lua_settop(0), lua_pop() (which is
implemented as a lua_settop() macro) should not be used right after,
because it could lead to invalid reads since the stack is already empty.
Unfortunately, some remnants from the initial lua stack implementation kept
doing so, resulting in haproxy crashes on some lua runtime error paths
from time to time (i.e. ERRRUN, ERRMEM).
Moreover, the extra lua_pop() instruction, even if it were safe, would be
totally pointless in such a case.
Remove such unsafe lua_pop() statements when we know that the stack is
already empty.
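For illustration, this is the class of pattern that was removed (standalone
sketch, not the actual hlua code):

  #include <lua.h>

  /* lua_pop(L, n) is defined as lua_settop(L, -(n)-1), so popping right
   * after lua_settop(L, 0) asks for index -2 of an already empty stack,
   * which is invalid. */
  void on_runtime_error(lua_State *L)
  {
      lua_settop(L, 0);    /* flush the stack: nothing is left on it */
      /* lua_pop(L, 1); */ /* unsafe and pointless here, hence removed */
  }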
Amaury Denoyelle [Fri, 11 Aug 2023 14:10:34 +0000 (16:10 +0200)]
BUG/MEDIUM: quic: fix tasklet_wakeup loop on connection closing
It is possible to trigger a loop of tasklets calls if a QUIC connection
is interrupted abruptly by the client. This is caused by the following
interaction :
* FD iocb is woken up for read. This causes a wakeup on quic_conn
tasklet.
* quic_conn_io_cb is run and tries to read but fails as the connection
  socket is closed (typically with an ECONNREFUSED). FD read is
  subscribed to the poller via qc_rcv_buf(), which will cause the loop.
The looping will stop automatically once the idle-timeout is expired and
the connection instance is finally released.
To fix this, ensure FD read is subscribed only for transient error cases
(EAGAIN or similar). All other cases are considered as fatal and thus
all future read operations will fail. Note that for the moment, nothing
is reported on the quic_conn, which thus may not skip future reception
attempts. This should be improved in a future commit to accelerate
connection closing.
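The distinction can be summarized by the following standalone sketch
(illustrative code, not the actual qc_rcv_buf() path):

  #include <errno.h>
  #include <sys/types.h>
  #include <sys/socket.h>

  int quic_read_once(int fd, void *buf, size_t len)
  {
      ssize_t ret = recv(fd, buf, len, 0);

      if (ret < 0) {
          if (errno == EAGAIN || errno == EWOULDBLOCK)
              return 0;   /* transient: subscribe for read, wait for the poller */
          return -1;      /* fatal (e.g. ECONNREFUSED): never re-arm reception */
      }
      return (int)ret;    /* bytes received */
  }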
This bug can be reproduced fairly frequently by interrupting the
following command. QUIC traces should be activated on the haproxy side to
detect the loop:
$ ngtcp2-client --tp-file=/tmp/ngtcp2-tp.txt \
--session-file=/tmp/ngtcp2-session.txt \
-r 0.3 -t 0.3 --exit-on-all-streams-close 127.0.0.1 20443 \
"http://127.0.0.1:20443/?s=1024"
The tasklet responsible for handling the remaining QUIC connection object
and its traffic was not released, leading to a memory leak. Furthermore its
callback, quic_cc_conn_io_cb(), should return NULL after this tasklet is
released.
BUG/MINOR: quic: Possible crash when issuing "show fd/sess" CLI commands
->xprt_ctx (struct ssl_sock_ctx) and ->conn (struct connection) must be kept
by the remaining QUIC connection object (struct quic_cc_conn) after having
released the previous one (struct quic_conn) to allow "show fd/sess" commands
to be functional without causing haproxy crashes.
BUG/MINOR: quic: Possible crash in quic_cc_conn_io_cb() traces.
Reset the local cc_qc and qc after having released cc_qc. Note that
cc_qc == qc. This is required to prevent haproxy from crashing
when TRACE_LEAVE() is called.
Willy Tarreau [Wed, 9 Aug 2023 14:52:28 +0000 (16:52 +0200)]
DEV: makefile: add a new "range" target to iteratively build all commits
This will iterate over all commits in the range passed in RANGE, or all
those from master to RANGE if no ".." exists in RANGE, and run "make all"
with the exact same variables. This aims to ease the verification that
no build failure exists inside a series. In case of error, it prints the
faulty commit and stops there with the tree checked out. Example:
$ make range RANGE=HEAD
Found 14 commit(s) in range master..HEAD.
Current branch is 20230809-plock+tbl+peers-4
Starting to building now...
[ 1/14 ] 392922bc5 #############################
(...)
Done! 14 commit(s) built successfully for RANGE master..HEAD
Maybe in the future it will automatically use HEAD as a default for RANGE
depending on the feedback.
It's not listed in the help target so as not to encourage users to try it
as it can very quickly become confusing due to the checkouts.
Willy Tarreau [Wed, 9 Aug 2023 14:02:44 +0000 (16:02 +0200)]
BUILD: mux-h1: shut a build warning on clang from previous commit
Commit 5201b4abd ("BUG/MEDIUM: mux-h1: do not forget EOH even when no
header is sent") introduced a build warning on clang due to the remaining
two parentheses in the expression. Let's fix this. No backport needed.
Willy Tarreau [Wed, 9 Aug 2023 09:58:15 +0000 (11:58 +0200)]
BUG/MEDIUM: mux-h1: do not forget EOH even when no header is sent
Since commit 723c73f8a ("MEDIUM: mux-h1: Split h1_process_mux() to make
code more readable"), outgoing H1 requests with no header at all (i.e.
essentially HTTP/1.0 requests) get delayed by 200ms. Christopher found
that it's due to the fact that we end processing too early and we don't
have the opportunity to send the EOH in this case.
This fix addresses it by verifying if it's required to emit EOH when
returning from h1_make_headers(). But maybe that block could be moved
after the while loop in fact, or the stop condition in the loop could be
revisited so as not to stop on !htx_is_empty(). The current solution gets
the job done at least.
Willy Tarreau [Wed, 9 Aug 2023 09:51:58 +0000 (11:51 +0200)]
BUG/MEDIUM: mux-h1: fix incorrect state checking in h1_process_mux()
That's a regression introduced in 2.9-dev by commit 723c73f8a ("MEDIUM:
mux-h1: Split h1_process_mux() to make code more readable") and found
by Christopher. The consequence is uncertain but the test definitely was
not right in that it would catch most existing states (H1_MSG_DONE=30).
At least it would emit too many "H1 request fully xferred".
Willy Tarreau [Wed, 9 Aug 2023 09:02:34 +0000 (11:02 +0200)]
BUG/MINOR: http: skip leading zeroes in content-length values
Ben Kallus also noticed that we preserve leading zeroes on content-length
values. While this is totally valid, it would be safer to at least trim
them before passing the value, because a bogus server written to parse
using "strtol(value, NULL, 0)" could inadvertently take a leading zero
as a prefix for an octal value. While there is not much that can be done
to protect such servers in general (e.g. lack of check for overflows etc),
at least it's quite cheap to make sure the transmitted value is normalized
and not taken for an octal one.
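The hazard is easy to demonstrate with a trivial standalone program (not
haproxy code):

  #include <stdio.h>
  #include <stdlib.h>

  int main(void)
  {
      /* base 0 lets strtol() guess the base from the prefix, so a
       * preserved leading zero silently turns the value octal */
      printf("%ld\n", strtol("010", NULL, 0));  /* prints 8, not 10 */
      printf("%ld\n", strtol("10", NULL, 0));   /* prints 10 */
      return 0;
  }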
This is not really a bug, rather a missed opportunity to sanitize the
input, but is marked as a bug so that we don't forget to backport it to
stable branches.
A combined regtest was added to h1or2_to_h1c which already validates
end-to-end syntax consistency on aggregate headers.
Willy Tarreau [Wed, 9 Aug 2023 06:32:48 +0000 (08:32 +0200)]
BUG/MAJOR: http: reject any empty content-length header value
The content-length header parser has its dedicated function, in order
to take extreme care about invalid, unparsable, or conflicting values.
But there's a corner case in it, by which it stops comparing values
when reaching the end of the header. This has the side effect that
an empty value or a value that ends with a comma does not deserve
further analysis, and it acts as if the header was absent.
While this is not necessarily a problem for the value ending with a
comma, as it will cause header folding and will disappear, it is a
problem for the first isolated empty header because this one will not
be reconstructed when the next ones are seen, and will be passed as-is to the
backend server. A vulnerable HTTP/1 server hosted behind haproxy that
would just use this first value as "0" and ignore the valid one would
then not be protected by haproxy and could be attacked this way, taking
the payload for an extra request.
In the field, the risk depends on the server. Most commonly used servers
already have safe content-length parsers, but users relying on haproxy
to protect a known-vulnerable server might be at risk (and the risk of
a bug even in a reputable server should never be dismissed).
A configuration-based work-around consists in adding the following rule
in the frontend, to explicitly reject requests featuring an empty
content-length header that would not have been folded into an existing
one:
http-request deny if { hdr_len(content-length) 0 }
The real fix consists in adjusting the parser so that it always expects a
value at the beginning of the header or after a comma. It will now reject
requests and responses having empty values anywhere in the C-L header.
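The new rule can be summarized by the following standalone sketch (assumed
semantics only, not the actual parser):

  /* a value is required at the start of the header and after each comma,
   * so "" , "," and "0," are rejected while "0, 0" is still accepted */
  int cl_values_non_empty(const char *v)
  {
      int expect_value = 1;

      for (; *v; v++) {
          if (*v == ',') {
              if (expect_value)
                  return 0;         /* empty element */
              expect_value = 1;
          }
          else if (*v != ' ' && *v != '\t')
              expect_value = 0;     /* got at least one non-blank char */
      }
      return !expect_value;         /* must not end while expecting a value */
  }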
This needs to be backported to all supported versions. Note that the
modification was made to functions h1_parse_cont_len_header() and
http_parse_cont_len_header(). Prior to 2.8 the latter was in
h2_parse_cont_len_header(). One day the two should be re-fused (merged
back together), but the former is also used by Lua.
The HTTP messaging reg-tests were completed to test these cases.
Thanks to Ben Kallus of Dartmouth College and Narf Industries for
reporting this! (this is in GH #2237).
Willy Tarreau [Tue, 8 Aug 2023 17:35:25 +0000 (19:35 +0200)]
DOC: clarify the handling of URL fragments in requests
We indicate in path/pathq/url that they may contain '#' if the frontend
is configured with "option accept-invalid-http-request", and that option
mentions the fragment as well.
Willy Tarreau [Tue, 8 Aug 2023 15:54:26 +0000 (17:54 +0200)]
BUG/MINOR: h3: reject more chars from the :path pseudo header
This is the h3 version of this previous fix:
BUG/MINOR: h2: reject more chars from the :path pseudo header
In addition to the current NUL/CR/LF, this will also reject all other
control chars, the space and '#' from the :path pseudo-header, to avoid
taking the '#' for a part of the path. It's still possible to fall back
to the previous behavior using "option accept-invalid-http-request".
Here the :path header value is scanned a second time to look for
forbidden chars because we don't know upfront if we're dealing with a
path header field or another one. This is no big deal anyway for now.
This should be progressively backported to 2.6, along with the
following commits it relies on (the same as for h2):
REGTESTS: http-rules: add accept-invalid-http-request for normalize-uri tests
REORG: http: move has_forbidden_char() from h2.c to http.h
MINOR: ist: add new function ist_find_range() to find a character range
MINOR: http: add new function http_path_has_forbidden_char()
Willy Tarreau [Tue, 8 Aug 2023 13:40:49 +0000 (15:40 +0200)]
BUG/MINOR: h2: reject more chars from the :path pseudo header
This is the h2 version of this previous fix:
BUG/MINOR: h1: do not accept '#' as part of the URI component
In addition to the current NUL/CR/LF, this will also reject all other
control chars, the space and '#' from the :path pseudo-header, to avoid
taking the '#' for a part of the path. It's still possible to fall back
to the previous behavior using "option accept-invalid-http-request".
This patch modifies the request parser to change the ":path" pseudo header
validation function with a new one that rejects 0x00-0x1F (control chars),
space and '#'. This way such chars will be dropped early in the chain, and
the search for '#' doesn't incur a second pass over the header's value.
This should be progressively backported to stable versions, along with the
following commits it relies on:
REGTESTS: http-rules: add accept-invalid-http-request for normalize-uri tests
REORG: http: move has_forbidden_char() from h2.c to http.h
MINOR: ist: add new function ist_find_range() to find a character range
MINOR: http: add new function http_path_has_forbidden_char()
MINOR: h2: pass accept-invalid-http-request down the request parser
Willy Tarreau [Tue, 8 Aug 2023 14:17:22 +0000 (16:17 +0200)]
BUG/MINOR: h1: do not accept '#' as part of the URI component
Seth Manesse and Paul Plasil reported that the "path" sample fetch
function incorrectly accepts '#' as part of the path component. This
can in some cases lead to misrouted requests for rules that would apply
on the suffix:
Note that this behavior can be selectively configured using
"normalize-uri fragment-encode" and "normalize-uri fragment-strip".
The problem is that while the RFC says that this '#' must never be
emitted, as often it doesn't suggest how servers should handle it. A
diminishing number of servers still do accept it and trim it silently,
while others are rejecting it, as indicated in the conversation below
with other implementers:
Looking at logs from publicly exposed servers, such requests appear at
a rate of roughly 1 per million and only come from attacks or poorly
written web crawlers incorrectly following links found on various pages.
Thus it looks like the best solution to this problem is to simply reject
such ambiguous requests by default, and include this in the list of
controls that can be disabled using "option accept-invalid-http-request".
We're already rejecting URIs containing any control char anyway, so we
should also reject '#'.
In the H1 parser for the H1_MSG_RQURI state, there is an accelerated
parser for bytes 0x21..0x7e that has been tightened to 0x24..0x7e (it
should not impact perf since 0x21..0x23 are not supposed to appear in
a URI anyway). This way '#' falls through the fine-grained filter and
we can add the special case for it, also conditioned by a check on the
proxy's option "accept-invalid-http-request", with no overhead for the
vast majority of valid URIs. Here this information is available through
h1m->err_pos which is set to -2 when the option is set (so we don't need
to change the API to expose the proxy). Example with a trivial GET
through netcat:
[08/Aug/2023:16:16:52.651] frontend layer1 (#2): invalid request
backend <NONE> (#-1), server <NONE> (#-1), event #0, src 127.0.0.1:50812
buffer starts at 0 (including 0 out), 16361 free,
len 23, wraps at 16336, error at position 7
H1 connection flags 0x00000000, H1 stream flags 0x00000810
H1 msg state MSG_RQURI(4), H1 msg flags 0x00001400
H1 chunk len 0 bytes, H1 body len 0 bytes :
00000 GET /aa#bb HTTP/1.0\r\n
00021 \r\n
This should be progressively backported to all stable versions along with
the following patch:
REGTESTS: http-rules: add accept-invalid-http-request for normalize-uri tests
Similar fixes for h2 and h3 will come in followup patches.
Thanks to Seth Manesse and Paul Plasil for reporting this problem with
detailed explanations.
Willy Tarreau [Tue, 8 Aug 2023 13:38:28 +0000 (15:38 +0200)]
MINOR: h2: pass accept-invalid-http-request down the request parser
We're adding a new argument "relaxed" to h2_make_htx_request() so that
we can control its level of acceptance of certain invalid requests at
the proxy level with "option accept-invalid-http-request". The goal
will be to add deactivable checks that are still desirable to have by
default. For now no test is subject to it.
Willy Tarreau [Tue, 8 Aug 2023 13:24:54 +0000 (15:24 +0200)]
MINOR: http: add new function http_path_has_forbidden_char()
As its name implies, this function checks if a path component has any
forbidden characters starting at the designated location. The goal is to
seek, from the result of a successful ist_find_range(), for more precise
chars. Here we're focusing on 0x00-0x1F, 0x20 and 0x23 to make sure
we're not too strict at this point.
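A hedged sketch of such a check (the actual helper may differ):

  /* returns non-zero if a forbidden char (0x00-0x1F, 0x20 or 0x23) is
   * found between <start> and <end> (excluded) */
  int path_has_forbidden_char(const char *start, const char *end)
  {
      const char *p;

      for (p = start; p < end; p++) {
          unsigned char c = (unsigned char)*p;

          if (c <= 0x1F || c == 0x20 || c == 0x23)
              return 1;
      }
      return 0;
  }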
Willy Tarreau [Tue, 8 Aug 2023 13:23:19 +0000 (15:23 +0200)]
MINOR: ist: add new function ist_find_range() to find a character range
This looks up the character range <min>..<max> in the input string and
returns a pointer to the first one found. It's essentially the equivalent
of ist_find_ctl() in that it searches by 32 or 64 bits at once, but deals
with a range.
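A plain byte-wise equivalent would look like this (illustration only; the
real function works on ist strings and scans a word at a time):

  /* returns a pointer to the first byte in [min..max], or <end> if none */
  const char *find_range(const char *p, const char *end,
                         unsigned char min, unsigned char max)
  {
      for (; p < end; p++) {
          unsigned char c = (unsigned char)*p;

          if (c >= min && c <= max)
              return p;
      }
      return end;
  }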
Willy Tarreau [Tue, 8 Aug 2023 13:27:02 +0000 (15:27 +0200)]
MINOR: mux-h2/traces: also suggest invalid header upon parsing error
Historically the parsing error used to apply only to too large headers,
so this is what has been reported in traces. But nowadays we can also
reject invalid characters, and when this happens the trace is a bit
misleading, so let's mention "or invalid".
In practice it's exactly the same for h3 as 54f53ef7c ("BUG/MAJOR: h2:
reject header values containing invalid chars") was for h2: we must
make sure never to accept NUL/CR/LF in any header value because this
may be used to construct split headers on the backend. Hence we
apply the same solution. Here pseudo-headers, headers and trailers are
checked separately, which explains why we have 3 locations instead of
2 for h2 (+1 for response which we don't have here).
This is marked major for consistency and due to the impact if abused,
but the reality is that at the time of writing, this problem is limited
by the scarcity of the tools which would permit building such a request
in the first place. But this may change over time.
This must be backported to 2.6. This depends on the following commit
that exposes the filtering function:
REORG: http: move has_forbidden_char() from h2.c to http.h
Willy Tarreau [Tue, 8 Aug 2023 15:00:50 +0000 (17:00 +0200)]
REORG: http: move has_forbidden_char() from h2.c to http.h
This function is not H2 specific but rather generic to HTTP. We'll
need it in H3 soon, so let's move it to HTTP and rename it to
http_header_has_forbidden_char().
MINOR: quic: Warning for OpenSSL wrapper QUIC bindings without "limited-quic"
If the "limited-quic" globale option wa not set, the QUIC listener bindings were
not bound, this is ok, but silently ignored.
Add a warning in these cases to ask the user to explicitely enable the QUIC
bindings when building QUIC support against a TLS/SSL library without QUIC
support (OpenSSL).
Add a check to the QUIC packet handler running at application level (after the
handshake is complete) to release the quic_conn memory by calling quic_conn_release().
This is done only if the mux is not started.
MEDIUM: quic: Allow the quic_conn memory to be asap released.
When the connection enters the "connection closing" state after having sent
a datagram with CONNECTION_CLOSE frames inside its packets, a lot of memory
may be freed from quic_conn objects (QUIC connection). This is done by
allocating a reduced-size object which keeps enough information to handle the
remaining incoming packets for the connection in "connection closing" state,
and to keep resending the previously sent datagram with CONNECTION_CLOSE
frames inside.
Define a new quic_cc_conn struct which represents the connection object after
entering the "connection close" state and after having released the quic_conn
connection object.
Define a new pool, <pool_head_quic_cc_conn>, for these quic_cc_conn struct objects.
Define QUIC_CONN_COMMON structure which is shared between quic_conn struct object
(the connection before entering "connection close" state), and new quic_cc_conn
struct object (the connection after entering "connection close"). So, all the
members inside QUIC_CONN_COMMON may be indifferently dereferenced from a
quic_conn struct or a quic_cc_conn struct pointer.
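The resulting layout may be pictured with the following sketch (member names
below are purely illustrative, not the actual quic_conn fields):

  /* C11 anonymous struct shared as the leading part of both objects */
  #define QUIC_CONN_COMMON           \
      struct {                       \
          unsigned int flags;        \
          unsigned int err_code;     \
      }

  struct quic_conn {
      QUIC_CONN_COMMON;
      /* ... full connection state ... */
  };

  struct quic_cc_conn {
      QUIC_CONN_COMMON;
      /* ... only what is needed while closing ... */
  };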
Implement the qc_new_cc_conn() function to allocate such connections in
"connection close" state. This function is responsible for copying the
required information from the original connection (quic_conn) to the remaining
connection (quic_cc_conn). Among other initializations, it redefines the
QUIC packet handler task to quic_cc_conn_io_cb() and the idle timer task
to qc_cc_idle_timer_task(). quic_cc_conn_io_cb() drains the received packets
and resends the datagram with the CONNECTION_CLOSE frame which was already
sent when entering the "connection close" state. qc_cc_idle_timer_task() only
releases the remaining quic_cc_conn struct object.
Modify quic_conn_release() to allocate quic_cc_conn struct objects from the
original connection passed as argument. It does nothing if this original
connection is not in closing state, or if the idle timer has already expired.
Implement quic_release_cc_conn() to release a "connection close" connection.
It is called when its timer expires or if an error occurred while sending
a packet from this connection when the peer is no longer reachable.
MINOR: quic: Use a pool for the connection ID tree.
Add "quic_cids" new pool to allocate the ->cids trees of quic_conn objects.
Replace ->cids member of quic_conn objects by pointer to "quic_cids" and
adapt the code consequently. Nothing special.
MEDIUM: quic: Send CONNECTION_CLOSE packets from a dedicated buffer.
Add a new pool <pool_head_quic_cc_buf> for the buffers used when building
datagrams with CONNECTION_CLOSE frames inside, with QUIC_MIN_CC_PKTSIZE (128)
as the minimum size.
Add ->cc_buf_area to quic_conn struct to store such buffers.
Add ->cc_dgram_len to store the size of the "connection close" datagrams
and ->cc_buf a buffer struct to be used with ->cc_buf_area as ->area member
value.
Implement qc_get_txb(), to be called in place of qc_txb_alloc(), to allocate
a struct "quic_cc_buf" buffer when the connection needs an immediate close,
or a regular buffer struct if not.
Modify qc_prep_hptks() and qc_prep_app_pkts() to allow them to use such
"quic_cc_buf" buffer when an immediate close is required.
MINOR: quic: Move some counters from [rt]x quic_conn anonymous struct
Move the rx.bytes, tx.bytes and tx.prep_bytes quic_conn struct members to a
"bytes" anonymous struct (bytes.rx, bytes.tx and bytes.prep members respectively).
They are moved before being defined into a "bytes" anonymous struct common to
a future struct to be defined.
Add a BUG_ON() to quic_peer_validated_addr() to check the amplification limit
is respected when it returns false (0), i.e. when the connection is not validated.
Implement quic_may_send_bytes() which returns the number of bytes which may be
sent when the connection has not yet been validated, and call this function
in several places when this is the case (after having called
quic_peer_validated_addr()).
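For reference, the anti-amplification rule itself comes from RFC 9000: before
address validation, at most 3 times the number of received bytes may be sent.
A purely illustrative computation (not necessarily the exact haproxy
accounting) would be:

  /* returns how many bytes are still allowed before validation */
  unsigned long long may_send_bytes(unsigned long long rx_bytes,
                                    unsigned long long tx_prep_bytes)
  {
      unsigned long long limit = 3ULL * rx_bytes;

      return tx_prep_bytes < limit ? limit - tx_prep_bytes : 0;
  }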
Furthermore, this patch improves the code maintainability. Some patches to
come will have to rename ->[rt]x.bytes quic_conn struct members.
BUG/MAJOR: http-ana: Get a fresh trash buffer for each header value replacement
When a "replace-header" action is used, we loop on all headers in the
message to change the value of all headers matching a name. The new value is
placed in a trash buffer. However, there is a race here because if the message
must be defragmented, another trash buffer is used. If several defragmentations
are performed because several headers must be updated at the same time, the
first trash buffer, used to store the new value, may be crushed. Indeed, there
are only 2 pre-allocated trash buffers used in rotation, and the trash buffer
storing the new value is never renewed. As a consequence, random data may be
inserted into the header value.
Here, to fix the issue, we must take care to refresh the trash buffer when
we evaluate a new header. This way, one trash buffer is used for the new value,
and possibly another one for the HTX defragmentation. But that's all.
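The rotation hazard can be reproduced with the following standalone toy
program (this is not the trash API, only the same 2-buffer rotation scheme):

  #include <stdio.h>

  static char bufs[2][16];

  static char *get_rotating_buf(void)
  {
      static int idx;
      return bufs[idx++ & 1];
  }

  int main(void)
  {
      char *value = get_rotating_buf();

      snprintf(value, sizeof(bufs[0]), "new-value");
      get_rotating_buf();                                       /* e.g. a defragmentation */
      snprintf(get_rotating_buf(), sizeof(bufs[0]), "garbage"); /* rotates back onto <value> */
      printf("%s\n", value);                                    /* prints "garbage" */
      return 0;
  }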
Thanks to Christian Ruppert for his detailed report.
This patch must be backported to all stable versions. On the 2.0, the patch must be
applied on src/proto_htx.c and the function is named
htx_transform_header_str().
MINOR: h3: abort request if not completed before full response
An HTTP server may provide a complete response even prior to receiving the
full request. In this case, RFC 9114 allows the server to abort reading
with a STOP_SENDING carrying error code H3_NO_ERROR.
This scenario was notably reproduced with haproxy and an inactive
server. If the client sends a POST request, haproxy may provide a full
HTTP 503 response before the end of the full request.
GCC warns about a possible NULL dereference when requeuing a datagram on
the connection socket. This happens due to a MT_LIST_POP to retrieve a
rxbuf instance.
In fact, this can never be NULL: there are enough rxbuf instances allocated,
one for each thread. Once a thread has finished working with it, it must
always reappend it.
This issue was introduced with the following patch :
commit b34d353968db7f646e83871cb6b21a246af84ddc
BUG/MEDIUM: quic: consume contig space on requeue datagram
As such, it must be backported to every version containing the above commit.
BUG/MINOR: quic: reappend rxbuf buffer on fake dgram alloc error
A thread must always reappend the rxbuf instance after finishing
datagram reception treatment. This was not the case on one error code
path: when fake datagram allocation fails on datagram requeuing.
This issue was introduced with the following patch :
commit b34d353968db7f646e83871cb6b21a246af84ddc
BUG/MEDIUM: quic: consume contig space on requeue datagram
As such, it must be backported to every version containing the above commit.
REGTESTS: http: Create a dedicated script to test spliced bodyless responses
Splicing is not available on all platforms. Thus a dedicated script is used
to check that we properly skip the payload for bodyless responses when
splicing is used. This way, we are still able to test the feature with the
original script on all platforms.
This patch fixes an issue on the CI introduced by commit ef2b15998
("BUG/MINOR: htx/mux-h1: Properly handle bodyless responses when splicing is
used"). It must be backported with the above commit.
CLEANUP: stconn: Move comment about sedesc fields on the field line
Fields of the sedesc structure were documented in the comment about the
structure itself. It was not really convenient: hard to read, hard to
update. So comments about the fields are moved to the corresponding field
lines, as usual.
BUG/MINOR: http-client: Don't forget to commit changes on HTX message
In the http-client I/O handler, HTX request and response are loaded from the
channels' buffers. Some changes are performed on these messages. So, we must
take care to commit changes into the underlying buffer by calling
htx_to_buf().
It is especially important when the HTX message becomes empty, to be able to
quickly release the buffer.
BUG/MEDIUM: quic: consume contig space on requeue datagram
When handling UDP datagram reception, it is possible to receive a QUIC
packet for one connection to the socket attached to another connection.
To protect against this, an explicit comparison is done against the
packet DCID and the quic-conn CID. On no match, the datagram is requeued
and dispatched via rxbuf and will be treated as if it arrived on the
listener socket.
One reason for this wrong reception is explained by the small race
condition that exists between bind() and connect() syscalls during
connection socket initialization. However, another reason which was not
initially considered is when clients reuse the same IP:PORT for
different connections. In this case the current FD attribution is not
optimal and this can cause a substantial amount of requeuing.
This situation has revealed a bug during requeuing. If the rxbuf contiguous
space was not big enough for the datagram, the incoming datagram was
dropped, even if there was space at the buffer origin. This could cause
several datagrams to be dropped in a row until the buffer head was eventually
moved when passing through the listener FD.
To fix this, allocate a fake datagram to consume contig space. This is
similar to the handling of datagrams on the listener FD. This then allows
storing the datagram to requeue at the buffer head and continuing.
This can be reproduced by starting a lot of connections. To amplify the
phenomenon, POST requests are used to increase the number of dropped
datagrams:
$ while true; do curl -F "a=@~/50k" -k --http3-only -o /dev/null https://127.0.0.1:20443/; done
BUG/MINOR: htx/mux-h1: Properly handle bodyless responses when splicing is used
There is a mechanism in the H1 and H2 multiplexers to skip the payload when
a response returned to the client must not contain any payload (response to
a HEAD request or a 204/304 response). However, this does not work when
splicing is used. The H2 multiplexer does not support splicing, so there is
no issue. But with the mux-h1, when data are sent using kernel splicing, the
mux on the server side is not aware that the client side should skip the
payload. And once the data are put in a pipe, there is no way to stop the
sending.
It is a defect of the current design. It will be easier to deal with this
case when mux-to-mux forwarding is implemented. But for now, to fix the
issue, we add an HTX flag on the start-line to pass the info from the client
side to the server side and disable splicing when necessary.
The associated reg-test was improved to be sure it does not fail when
splicing is configured.
MEDIUM: stream: Reset response analyse expiration date if there is no analyzer
When the stream expiration date is computed at the end of process_stream(),
if there is no longer any analyzer on the request channel, its analyse
expiration date is reset. The same is now performed on the response
channel. This way, we are sure not to inherit an orphan expired date.
This should prevent spinning loops on process_stream().
BUG/MEDIUM: bwlim: Reset analyse expiration date when then channel analyse ends
The bandwidth limitation filter sets the analyse expiration date on the
channel to restart the data forwarding and thus limit the bandwidth.
However, this expiration date is not reset on abort. So it is possible to
reuse the same expiration date to set the stream one. If it expired before
the end of the stream, this will lead to a spinning loop on process_stream()
because the task expiration date is always set in the past.
To fix the issue, when the analyse ends on a channel, the bandwidth
limitation filter now resets the corresponding analyse expiration date.
This patch should fix the issue #2230. It must be backported as far as 2.7.
Willy Tarreau [Tue, 1 Aug 2023 09:18:00 +0000 (11:18 +0200)]
BUILD: cfgparse: keep a single "curproxy"
Surprisingly, commit 00e00fb42 ("REORG: cfgparse: extract curproxy as a
global variable") caused a build breakage on the CI but not on two
developers' machines. It looks like it's dependent on the linker version
used. What happens is that flt_spoe.c already has a curproxy struct which
is a copy of the one passed by the parser because it also needed
it to be exported, so they now conflict. Let's just drop this unused copy.
BUG/MINOR: quic+openssl_compat: Non initialized TLS encryption levels
The ->openssl_compat struct member of the QUIC connection object was not fully
initialized. This was done on purpose, believing that the ->write_level and
->read_level members were initialized by quic_tls_compat_keylog_callback() (the
keylog callback) before entering quic_tls_compat_msg_callback(), which
has to parse the TLS messages. In fact this is not the case at all.
quic_tls_compat_msg_callback() is called before quic_tls_compat_keylog_callback()
when receiving the first TLS ClientHello message.
->write_level and ->read_level were not initialized to <ssl_encryption_initial> (= 0)
as this is implicitly done by the original nginx wrapper which calloc()s the openssl
compatibility structure. This could lead to a crash after ssl_to_qel_addr() returns
NULL when called by ha_quic_add_handshake_data().
This patch explicitly initializes ->write_level and ->read_level to
<ssl_encryption_initial> (=0).
BUG/MEDIUM: h3: Be sure to handle fin bit on the last DATA frame
When DATA frames are decoded for a QUIC stream, we take care not to exceed
the announced content-length, if any. To do so, we check that we don't receive
more data than expected, but also no less than announced. For the last check,
we rely on the fin bit.
However, it is possible to have several DATA frames to decode at a time
while the end of the stream was received. In this case, we must take care to
handle the fin bit only on the last frame. But because of a bug, the fin bit
was handled too early, erroneously triggering an internal error.
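A minimal sketch of the intended rule (illustrative names and prototype, not
the actual h3 code):

  /* <fin> reports that the end of the stream was received for this batch
   * of frames; it must only be taken into account on the last one */
  int h3_check_body_len(unsigned long long received, unsigned long long clen,
                        int fin, int is_last_frame)
  {
      if (received > clen)
          return -1;                 /* more data than announced */
      if (fin && is_last_frame && received < clen)
          return -1;                 /* stream ended short of the announced length */
      return 0;
  }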
MINOR: quic: Split QUIC connection code into three parts
Move the TX part of the code to quic_tx.c.
Add quic_tx-t.h and quic_tx.h headers for this TX part code.
The definition of the quic_tx_packet struct has been moved from quic_conn-t.h
to quic_tx-t.h.
Same thing for the RX part:
Move the RX part of the code to quic_rx.c.
Add quic_rx-t.h and quic_rx.h headers for this RX part code.
The definition of the quic_rx_packet struct has been moved from quic_conn-t.h
to quic_rx-t.h.
Move the code which directly calls the functions of the OpenSSL QUIC API into
a new quic_ssl.c C file.
Some code has been extracted from qc_conn_finalize() to implement only
the QUIC TLS part (see quic_tls_finalize()) in quic_tls.c.
qc_conn_finalize() has also been exported to be used from this new quic_ssl.c
C module.
MINOR: quic: Export QUIC traces code from quic_conn.c
To accelerate the compilation of the quic_conn.c file, move the code related
to traces from quic_conn.c to quic_trace.c.
Also add some headers (quic_trace-t.h and quic_trace.h).
BUG/MINOR: quic: Possible crash when acknowledging Initial v2 packets
The memory allocated for the TLS cipher context used to encrypt/decrypt QUIC v2
packets should not be released as soon as possible. Indeed, even if,
after having received a client Handshake packet, one may drop the Initial
TLS cipher context, it is often still needed to acknowledge Initial packets.
BUG/MEDIUM: h3: Properly report a C-L header was found to the HTX start-line
When H3 HEADERS frames are converted to HTX, if a Content-Length header was
found, the HTX start-line must be notified by setting HTX_SL_F_CLEN flag.
Some components may rely on this flag to know there is a content-length
without looping on headers to get the info.
Among other things, it is mandatory for the FCGI multiplexer because it must
announce the message body length.
BUG/MINOR: ssl: OCSP callback only registered for first SSL_CTX
If multiple SSL_CTXs use the same certificate that has an OCSP response
file on the filesystem, only the first one will have the OCSP callback
set. This bug was introduced by "cc346678d MEDIUM: ssl: Add ocsp_certid
in ckch structure and discard ocsp buffer early" which cleared the
ocsp_response from the ckch_data after it was inserted in the tree,
which prevented subsequent contexts from having the callback registered.
Released version 2.9-dev2 with the following main changes :
- BUG/MINOR: quic: Possible leak when allocating an encryption level
- BUG/MINOR: quic: Missing QUIC connection path member initialization
- BUILD: quic: Compilation fixes for some gcc warnings with -O1
- DOC: ssl: Fix typo in 'ocsp-update' option
- DOC: ssl: Add ocsp-update troubleshooting clues and emphasize on crt-list only aspect
- BUG/MINOR: tcp_sample: bc_{dst,src} return IP not INT
- MEDIUM: acl/sample: unify sample conv parsing in a single function
- MINOR: sample: introduce c_pseudo() conv function
- MEDIUM: sample: add missing ADDR=>? compatibility matrix entries
- MINOR: sample: fix ipmask sample definition
- MEDIUM: tree-wide: fetches that may return IPV4+IPV6 now return ADDR
- MEDIUM: sample: introduce 'same' output type
- BUG/MINOR: quic: Possible crash in "show quic" dumping packet number spaces
- BUG/MINOR: cache: A 'max-age=0' cache-control directive can be overriden by a s-maxage
- BUG/MEDIUM: sink: invalid server list in sink_new_from_logsrv()
- BUG/MINOR: http_ext: unhandled ERR_ABORT in proxy_http_parse_7239()
- BUG/MINOR: sink: missing sft free in sink_deinit()
- BUG/MINOR: ring: size warning incorrectly reported as fatal error
- BUG/MINOR: ring: maxlen warning reported as alert
- BUG/MINOR: log: LF upsets maxlen for UDP targets
- MINOR: sink/api: pass explicit maxlen parameter to sink_write()
- BUG/MEDIUM: log: improper use of logsrv->maxlen for buffer targets
- BUG/MINOR: log: fix missing name error message in cfg_parse_log_forward()
- BUG/MINOR: log: fix multiple error paths in cfg_parse_log_forward()
- BUG/MINOR: log: free errmsg on error in cfg_parse_log_forward()
- BUG/MINOR: sink: invalid sft free in sink_deinit()
- BUG/MINOR: sink: fix errors handling in cfg_post_parse_ring()
- BUG/MINOR: server: set rid default value in new_server()
- MINOR: hlua_fcn/mailers: handle timeout mail from mailers section
- BUG/MINOR: sink/log: properly deinit srv in sink_new_from_logsrv()
- EXAMPLES: maintain haproxy 2.8 retrocompatibility for lua mailers script
- BUG/MINOR: hlua_fcn/queue: use atomic load to fetch queue size
- BUG/MINOR: config: Remove final '\n' in error messages
- BUG/MINOR: config: Lenient port configuration parsing
- BUG/MEDIUM: quic: token IV was not computed using a strong secret
- BUG/MINOR: quic: retry token remove one useless intermediate expand
- BUG/MEDIUM: quic: missing check of dcid for init pkt including a token
- BUG/MEDIUM: quic: timestamp shared in token was using internal time clock
- CLEANUP: quic: remove useless parameter 'key' from quic_packet_encrypt
- BUG/MINOR: hlua: hlua_yieldk ctx argument should support pointers
- BUG/MEDIUM: hlua_fcn/queue: bad pop_wait sequencing
- DOC: config: Fix fc_src description to state the source address is returned
- BUG/MINOR: sample: Fix wrong overflow detection in add/sub conveters
- BUG/MINOR: http: Return the right reason for 302
- MEDIUM: ssl: new sample fetch method to get curve name
- CI: add naming convention documentation
- CI: explicitely highlight VTest result section if there's something
- BUG/MINOR: quic: Unckecked encryption levels availability
- BUILD: quic: fix warning during compilation using gcc-6.5
- BUG/MINOR: hlua: add check for lua_newstate
- BUG/MINOR: h1-htx: Return the right reason for 302 FCGI responses
- MINOR: lua: Allow reading "proc." scoped vars from LUA core.
- MINOR: cpuset: add cpu_map_configured() to know if a cpu-map was found
- BUG/MINOR: config: do not detect NUMA topology when cpu-map is configured
- BUG/MINOR: cpuset: remove the bogus "proc" from the cpu_map struct
- BUG/MINOR: init: set process' affinity even in foreground
- CLEANUP: cpuset: remove the unused proc_t1 field in cpu_map
- CLEANUP: config: make parse_cpu_set() return documented values
- BUG/MINOR: server: Don't warn on server resolution failure with init-addr none
- MINOR: peers: add peers keyword registration
- MINOR: quic: Stop storing the TX encoded transport parameters
- MINOR: quic: Dynamic allocation for negotiated Initial TLS cipher context.
- MINOR: quic: Release asap the negotiated Initial TLS context.
- MINOR: quic: Add traces to qc_may_build_pkt()
- MEDIUM: quic: Packet building rework.
- CLEANUP: quic: Remove a useless TLS related variable from quic_conn_io_cb().
- MEDIUM: quic: Handshake I/O handler rework.
- MINOR: quic: Add traces for qc_frm_free()
- MINOR: quic: add trace about pktns packet/frames releasing
- BUG/MINOR: quic: Missing parentheses around PTO probe variable.
- MINOR: quic: Ping from Initial pktns before reaching anti-amplification limit
- BUG/MINOR: server-state: Ignore empty files
- BUG/MINOR: server-state: Avoid warning on 'file not found'
- BUG/MEDIUM: listener: Acquire proxy's lock in relax_listener() if necessary
- MINOR: quic: QUIC openssl wrapper implementation
- MINOR: quic: Include QUIC opensssl wrapper header from TLS stacks compatibility header
- MINOR: quic: Do not enable O-RTT with USE_QUIC_OPENSSL_COMPAT
- MINOR: quic: Set the QUIC connection as extra data before calling SSL_set_quic_method()
- MINOR: quic: Do not enable 0RTT with SSL_set_quic_early_data_enabled()
- MINOR: quic: Add a compilation option for the QUIC OpenSSL wrapper
- MINOR: quic: Export some KDF functions (QUIC-TLS)
- MINOR: quic: Make ->set_encryption_secrets() be callable two times
- MINOR: quic: Initialize TLS contexts for QUIC openssl wrapper
- MINOR: quic: Call the keylog callback for QUIC openssl wrapper from SSL_CTX_keylog()
- MINOR: quic: Add a quic_openssl_compat struct to quic_conn struct
- MINOR: quic: Useless call to SSL_CTX_set_quic_method()
- MINOR: quic: SSL context initialization with QUIC OpenSSL wrapper.
- MINOR: quic: Missing encoded transport parameters for QUIC OpenSSL wrapper
- MINOR: quic: Add "limited-quic" new tuning setting
- DOC: quic: Add "limited-quic" new tuning setting
- DOC: install: Document how to build a limited support for QUIC
Document "limited-quic" new tuning setting which must be used to
enable the QUIC listener bindings when haproxy is compiled against
a TLS/SSL stack without QUIC support.
MINOR: quic: Add "limited-quic" new tuning setting
This setting, which may be used in a "global" section, enables the QUIC listener
bindings when haproxy is compiled with the OpenSSL wrapper. It has no effect
when haproxy is compiled against a TLS stack with QUIC support, typically quictls.
MINOR: quic: Missing encoded transport parameters for QUIC OpenSSL wrapper
This wrapper needs to have access to an encoded version of the local transport
parameters (to be sent to the peer). They are provided to the TLS stack thanks to
the qc_ssl_compat_add_tps_cb() callback.
These encoded transport parameters were attached to the QUIC connection but
removed by this commit to save memory:
MINOR: quic: Stop storing the TX encoded transport parameters
This patch restores these transport parameters and attaches them again
to the QUIC connection (quic_conn struct), but only when the QUIC OpenSSL wrapper
is compiled.
Implement qc_set_quic_transport_params() to encode the transport parameters
for a connection and to set them into the stack, and make this function work
for both the OpenSSL wrapper and any other TLS stack with QUIC support. It uses
the encoded version of the transport parameters attached to the connection
when compiled for the OpenSSL wrapper, or the local parameters when compiled
with a TLS stack with QUIC support. These parameters are passed to
quic_transport_params_encode() and SSL_set_quic_transport_params() as before
this patch.
MINOR: quic: SSL context initialization with QUIC OpenSSL wrapper.
When the QUIC OpenSSL wrapper is used, the keylog has to be set and a QUIC
specific TLS 1.3 extension must be added to the EncryptedExtensions message.
This is done by quic_tls_compat_init().
MINOR: quic: Useless call to SSL_CTX_set_quic_method()
SSL_set_quic_method() is already called at the SSL session level, which makes
the SSL_CTX_set_quic_method() call useless. Furthermore, it is not implemented by
the QUIC OpenSSL wrapper to come.
Should be backported as far as 2.6 to ease further backports to come.
MINOR: quic: Call the keylog callback for QUIC openssl wrapper from SSL_CTX_keylog()
SSL_CTX_keylog() is the callback used when the TLS keylog feature is enabled with
tune.ssl.keylog configuration setting. But the QUIC openssl wrapper also needs
to use such a callback to receive the QUIC TLS secrets from the TLS stack.
Add a call to the keylog callback for the QUIC openssl wrapper to SSL_CTX_keylog()
to ensure that it will be called when the TLS keylog feature is enabled.
MINOR: quic: Initialize TLS contexts for QUIC openssl wrapper
When use of the QUIC OpenSSL wrapper is enabled, all the TLS contexts (SSL_CTX)
must be configured to support it. This is done by calling quic_tls_compat_init()
from ssl_sock_prepare_ctx(). Note that quic_tls_compat_init() ignores the TLS
contexts which are linked to non-QUIC TLS sessions/connections.
MINOR: quic: Make ->set_encryption_secrets() be callable two times
With this patch, ha_set_encryption_secrets() may be called two times,
once to derive the RX secrets and a second time to derive the TX secrets.
There was a missing step to do so when the RX secret was received from the stack.
In this case the secret was not stored for the keyupdate, leading the keyupdate
RX part to be uninitialized.
Add a label to initialize the keyupdate RX part and a "goto" statement to run
the concerned code after having derived the RX secrets.
This patch is required to make the keyupdate feature work with the OpenSSL wrapper.
MINOR: quic: Do not enable 0RTT with SSL_set_quic_early_data_enabled()
SSL_set_quic_early_data_enabled() is not implemented by the QUIC OpenSSL wrapper.
Furthermore, 0-RTT is not supported by this wrapper. We do not know why at
this time.
MINOR: quic: Set the QUIC connection as extra data before calling SSL_set_quic_method()
This patch is required for the QUIC OpenSSL wrapper, and does not break anything
for the other TLS stacks with their own QUIC support (quictls for instance).
The implementation of SSL_set_quic_method() needs to access the quic_conn object
to store data within. But SSL_set_quic_method() is only aware of the SSL session
object. This is the reason why it is required to set the quic_conn object
as extra data to the SSL session object before calling SSL_set_quic_method()
so that it can be retrieved by SSL_set_quic_method().
MINOR: quic: Include QUIC opensssl wrapper header from TLS stacks compatibility header
Include haproxy/quic_openssl_compat.h from haproxy/openssl-compat.h when the
compilation of the QUIC openssl wrapper for TLS stacks is enabled with
USE_QUIC_OPENSSL_COMPAT.
and SSL_QUIC_METHOD QUIC specific bio method which are also implemented by quictls
to support QUIC from OpenSSL. So, its aim is to support QUIC from a standard OpenSSL
stack without QUIC support. It relies on the OpenSSL keylog feature to retrieve
the secrets derived by the OpenSSL stack during a handshake and to pass them to
the ->set_encryption_secrets() callback as this is done by quictls. It makes
usage of a callback (quic_tls_compat_msg_callback()) to handle some TLS messages
only on the receipt path. Some of them must be passed to the ->add_handshake_data()
callback as this is done with quictls to be sent to the peer as CRYPTO data.
The quic_tls_compat_msg_callback() callback also sends the received TLS alerts
with the ->send_alert() callback.
AES 128-bit with CCM mode is not supported at this time. It is often disabled by
the OpenSSL stack, but as it can be enabled by "ssl-default-bind-ciphersuites",
the wrapper will send a TLS alert (Handshake failure) if this algorithm is
negotiated between the client and the server.
BUG/MEDIUM: listener: Acquire proxy's lock in relax_listener() if necessary
Listener functions must follow a common locking pattern:
1. Get the proxy's lock if necessary
2. Get the protocol's lock if necessary
3. Get the listener's lock if necessary
We must take care to respect this order to avoid any ABBA issue. However, an
issue was introduced in the commit bcad7e631 ("MINOR: listener: add
relax_listener() function"). relax_listener() gets the lisener's lock and if
resume_listener() is called, the proxy's lock is then acquired.
So to fix the issue, the proxy's lock is first acquired in relax_listener(),
if necessary.
This patch should fix the issue #2222. It must be backported as far as 2.4
because the above commit is marked to be backported there.
BUG/MINOR: server-state: Avoid warning on 'file not found'
On a clean installation, users might want to use server-state-file and
the recommended zero-warning option. This caused a problem if
server-state-file was not found, as a warning was emitted, causing
startup to fail.
This will allow users to specify nonexistent server-state-file at first,
and dump states to the file later.
Fixes #2190
CF: Technically speaking, this patch can be backported to all stable
versions. But it is better to do so to 2.8 only for now.
Users might want to pre-create an empty file for later dumping of
server-states. This commit allows for that by emitting a notice in case the
file is empty, and a warning if the file is not empty but its version is unknown.
Fix partially: #2190
CF: Technically speaking, this patch can be backported to all stable
versions. But it is better to do so to 2.8 only for now.
MINOR: quic: Ping from Initial pktns before reaching anti-amplification limit
There are cases where there is enough room on the network to send 1200 bytes
in PING-only Initial packets. This may be considered as the last chance
for the connection to complete the handshake. Indeed, the client should
reply with at least a 1200-byte datagram with an Initial packet inside.
This would give the haproxy endpoint a credit of 3600 bytes to complete
the handshake before reaching the anti-amplification limit again, and so on.