Emeric Brun [Wed, 25 May 2022 08:12:07 +0000 (10:12 +0200)]
BUG/MEDIUM: peers: fix segfault using multiple bind on peers sections
If multiple "bind" lines were present on the "peers" section, multiple
listeners were added to a list but the code mistakenly initialized
the first member, so this first listener was re-configured instead of
the newly created one. The last one remained uninitialized, causing a null
dereference as soon as a connection was received.
In addition, the 'peers' sections and protocol are not currently designed to
handle multiple listeners.
This patch checks whether a listener is already configured on the 'peers'
section when we want to create a new one. If so, an error is raised,
showing the file and line of the existing listener in the error
message.
To keep the file and line number of the previous listener available
for the error message, the 'bind_conf_uniq_alloc' function was modified
to keep the file/line data from when the struct 'bind_conf' was first
allocated (previously it was updated each time the 'bind_conf' was
reused).
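As a rough illustration of the check described above, here is a minimal sketch. The structures and function names (everything suffixed with _sketch) are hypothetical simplifications and not the actual HAProxy definitions; only the idea of allocating once, keeping the first file/line, and rejecting a second listener comes from the commit message.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified stand-ins for the real structures. */
struct bind_conf_sketch {
    const char *file;   /* file where the bind_conf was first allocated */
    int line;           /* line where the bind_conf was first allocated */
    int nb_listeners;   /* listeners already attached */
};

struct peers_sketch {
    struct bind_conf_sketch *bind_conf; /* NULL until first allocation */
    struct bind_conf_sketch storage;    /* avoids malloc() in this sketch */
};

/* Allocate on first use only; on reuse the original file/line are kept. */
struct bind_conf_sketch *uniq_alloc_sketch(struct peers_sketch *p,
                                           const char *file, int line)
{
    if (!p->bind_conf) {
        p->storage.file = file;
        p->storage.line = line;
        p->storage.nb_listeners = 0;
        p->bind_conf = &p->storage;
    }
    return p->bind_conf;
}

/* Returns 0 on success, -1 if a listener already exists (message in <err>). */
int add_peers_listener_sketch(struct peers_sketch *p, const char *file,
                              int line, char *err, size_t errlen)
{
    struct bind_conf_sketch *bc = uniq_alloc_sketch(p, file, line);

    if (bc->nb_listeners > 0) {
        snprintf(err, errlen, "listener already configured at %s:%d",
                 bc->file, bc->line);
        return -1;
    }
    bc->nb_listeners++;
    return 0;
}
```

With this shape, a second "bind" line is rejected and the error points at the location of the first one rather than at the line currently being parsed.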
BUG/MEDIUM: resolvers: Don't defer resolutions release in deinit function
resolvers_deinit() function is called on error, during post-parsing stage,
or on deinit, when HAProxy is stopped. It releases all entities: resolvers,
resolutions and SRV requests. There is no reason to defer the resolutions
release by moving them in the death_row list because this function is
terminal. And it is in fact a bug. Resolutions must not be released at the
end of the function because resolvers were already freed. However some
resolutions may still be attached to a resolver. Thus, when we try to remove
it from the resolver's tree, in resolv_reset_resolution(), this resolver was
already released.
So now, resolutions are immediately released. It means there is no more
reason to track this function. Calls to
enter_resolver_code()/leave_resolver_code() have been removed.
This patch should fix the issue #1680 and may be related to #1485. It must
be backported as far as 2.2.
Willy Tarreau [Tue, 24 May 2022 13:34:26 +0000 (15:34 +0200)]
MEDIUM: h1: enlarge the scope of accepted version chars with accept-invalid-http-request
We used to support both RTSP and HTTP protocol version names with and
without accept-invalid-http-request, but since this is based on the
characters themselves, any protocol made of chars {0-9/.HPRST} was
possible and not others. Now that such non-standard protocols are
restricted to accept-invalid-http-request, there's no reason for not
allowing other letters. With this patch, characters {0-9./A-Z} are
permitted when the option is set.
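The relaxed character set can be illustrated with a small sketch. The function name is hypothetical and this is not the parser's actual code; it only checks that a version token is made of characters from {0-9./A-Z}, the set stated above for when the option is set.

```c
#include <stdbool.h>

/* Hypothetical helper: returns true if every character of the non-empty
 * string <ver> belongs to {0-9./A-Z}, the set accepted when
 * accept-invalid-http-request is set according to the text above.
 */
bool version_chars_relaxed_ok_sketch(const char *ver)
{
    if (!*ver)
        return false;

    for (; *ver; ver++) {
        char c = *ver;

        if ((c >= '0' && c <= '9') || (c >= 'A' && c <= 'Z') ||
            c == '.' || c == '/')
            continue;
        return false;
    }
    return true;
}
```

Under this rule "RTSP/1.0" and "ICAP/1.0" both pass, while a lowercase "http/1.1" does not.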
This patch hardens the verification of the HTTP/1.x version line
(i.e. the first line within an HTTP/1.x request) to verify that
the protocol name within the version actually reads "HTTP".
Previously protocols that superficially resembled the wire-format
of HTTP/1.x and having a 4-letter acronym as the protocol name, such
as RTSP would pass this check.
This patch fixes GitHub issue #540, it must be backported to all
supported versions. The legacy, non-HTX parser is affected as well,
a fix must be created for it as well.
Note that such protocols can still be used when option
accept-invalid-http-request is set.
Willy Tarreau [Tue, 24 May 2022 05:43:57 +0000 (07:43 +0200)]
CLEANUP: init: address a coverity warning about possible multiply overflow
In issue #1585 Coverity suspects a risk of multiply overflow when
calculating the SSL cache size, though in practice the cache is
limited to 2^32 anyway thus it cannot really happen. Nevertheless,
casting the operation should be sufficient to avoid marking it as a
false positive.
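For readers unfamiliar with this class of warning, the sketch below shows the general pattern of widening one operand before multiplying. The function names and parameters are hypothetical and do not reproduce the actual SSL cache size computation.

```c
#include <stdint.h>

/* Without a cast, the product is computed on 32 bits and may wrap. */
uint32_t size_wrapped_sketch(uint32_t nb_blocks, uint32_t block_size)
{
    return nb_blocks * block_size;
}

/* Casting one operand first makes the whole multiplication 64-bit. */
uint64_t size_widened_sketch(uint32_t nb_blocks, uint32_t block_size)
{
    return (uint64_t)nb_blocks * block_size;
}
```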
This patch was useful mainly for the docker image of QUIC interop to
have traces on stdout.
A better solution has been found by integrating this patch directly in
the qns repository which is used to build the docker image. Thus, this
hack is not required anymore in the main repository.
Amaury Denoyelle [Mon, 23 May 2022 06:52:58 +0000 (08:52 +0200)]
BUG/MEDIUM: mux-quic: adjust buggy proxy closing support
The wake handler detects if the frontend is closed. This can happen if
the proxy has been disabled individually or even on process soft-stop.
Before this patch, in this condition QCS instances were freed before
being detached from the cs_endpoint. This clearly violates the haproxy
connection architecture and causes a BUG_ON statement crash in cs_free().
To handle this properly, cs_endpoint is notified by setting RD_SH|WR_SH
on connection flags. The cs_endpoint will thus use the detach operation
which allows the QCS instance to be freed.
This code allows the soft-stop process to complete as soon as possible.
However, the client is not notified about the connection closing. It
should be done by emitting an H3 GOAWAY + CONNECTION_CLOSE. Sadly, this
is impossible at this stage because the listener sockets are closed so
the quic-conn cannot use it to emit new frames. At this stage the client
will most probably detect connection closing on its idle timeout
expiration.
Thus, to completely support proxy closing/soft-stop, important
architecture changes are required in QUIC socket management. This is
also linked with the reload feature.
Tim Duesterhus [Sun, 22 May 2022 10:40:58 +0000 (12:40 +0200)]
CLEANUP: tools: Clean up non-QUIC error message handling in str2sa_range()
If QUIC support is enabled both branches of the ternary conditional are
identical, upsetting Coverity. Move the full conditional into the non-QUIC
preprocessor branch to make the code more clear.
Willy Tarreau [Fri, 20 May 2022 21:31:51 +0000 (23:31 +0200)]
[RELEASE] Released version 2.6-dev11
Released version 2.6-dev11 with the following main changes :
- CI: determine actual LibreSSL version dynamically
- BUG/MEDIUM: ncbuf: fix null buffer usage
- MINOR: ncbuf: fix warnings for testing build
- MEDIUM: http-ana: Add a proxy option to restrict chars in request header names
- MEDIUM: ssl: Delay random generator initialization after config parsing
- MINOR: ssl: Add 'ssl-propquery' global option
- MINOR: ssl: Add 'ssl-provider' global option
- CLEANUP: Add missing header to ssl_utils.c
- CLEANUP: Add missing header to hlua_fcn.c
- CLEANUP: Remove unused function hlua_get_top_error_string
- BUILD: fix build warning on solaris based systems with __maybe_unused.
- MINOR: tools: add get_exec_path implementation for solaris based systems.
- BUG/MINOR: ssl: Fix crash when no private key is found in pem
- CLEANUP: conn-stream: Remove cs_applet_shut declaration from header file
- MINOR: applet: Prepare appctx to own the session on frontend side
- MINOR: applet: Let the frontend appctx release the session
- MINOR: applet: Change return value for .init callback function
- MINOR: stream: Export stream_free()
- MINOR: applet: Add appctx_init() helper fnuction
- MINOR: applet: Add a function to finalize frontend appctx startup
- MINOR: applet: Add function to release appctx on error during init stage
- MEDIUM: dns: Refactor dns appctx creation
- MEDIUM: spoe: Refactor SPOE appctx creation
- MEDIUM: lua: Refactor cosocket appctx creation
- MEDIUM: httpclient: Refactor http-client appctx creation
- MINOR: sink: Add a ref to sink in the sink_forward_target structure
- MEDIUM: sink: Refactor sink forwarder appctx creation
- MINOR: peers: Add a ref to peers section in the peer structure
- MEDIUM: peers: Refactor peer appctx creation
- MINOR: applet: Add API to start applet on a thread subset
- MEDIUM: applet: Add support for async appctx startup on a thread subset
- MINOR: peers: Track number of applets run by thread
- MEDIUM: peers: Balance applets across threads
- MINOR: conn-stream/applet: Stop setting appctx as the endpoint context
- CLEANUP: proxy: Remove dead code when parsing "http-restrict-req-hdr-names" option
- REGTESTS: abortonclose: Fix some race conditions
- MINOR: ssl: Add 'ssl-provider-path' global option
- CLEANUP: http_ana: Make use of the return value of stream_generate_unique_id()
- BUG/MINOR: spoe: Fix error handling in spoe_init_appctx()
- CLEANUP: peers: Remove unreachable code in peer_session_create()
- CLEANUP: httpclient: Remove useless test on ss_dst in httpclient_applet_init()
- BUG/MEDIUM: quic: fix Rx buffering
- OPTIM: quic: realign empty Rx buffer
- BUG/MINOR: ncbuf: fix ncb_is_empty()
- MINOR: ncbuf: refactor ncb_advance()
- BUG/MINOR: mux-quic: update session's idle delay before stream creation
- MINOR: h3: do not wait a complete frame for demuxing
- MINOR: h3: flag demux as full on HTX full
- MEDIUM: mux-quic: implement recv on io-cb
- MINOR: mux-quic: remove qcc_decode_qcs() call in XPRT
- MINOR: mux-quic: reorganize flow-control frames emission
- MINOR: mux-quic: implement MAX_STREAM_DATA emission
- MINOR: mux-quic: implement MAX_DATA emission
- BUG/MINOR: mux-quic: support nul buffer with qc_free_ncbuf()
- MINOR: mux-quic: free RX buf if empty
- BUG/MEDIUM: config: Reset outline buffer size on realloc error in readcfgfile()
- BUG/MINOR: check: Reinit the buffer wait list at the end of a check
- MEDIUM: check: No longer shutdown the connection in .wake callback function
- REORG: check: Rename and export I/O callback function
- MEDIUM: check: Use the CS to handle subscriptions for read/write events
- BUG/MINOR: quic: break for error on sendto
- MINOR: quic: abort on unlisted errno on sendto()
- MINOR: quic: detect EBADF on sendto()
- BUG/MEDIUM: quic: fix initialization for local/remote TPs
- CLEANUP: quic: adjust comment/coding style for TPs init
- BUG/MINOR: cfgparse: abort earlier in case of allocation error
- MINOR: quic: Dump initial derived secrets
- MINOR: quic_tls: Add quic_tls_derive_retry_token_secret()
- MINOR: quic_tls: Add quic_tls_decrypt2() implementation
- MINOR: quic: Retry implementation
- MINOR: cfgparse: Update for "cluster-secret" keyword for QUIC Retry
- MINOR: quic: Move quic_lstnr_dgram_dispatch() out of xprt_quic.c
- BUILD: stats: Missing headers inclusions from stats.h
- MINOR: quic_stats: Add a new stats module for QUIC
- MINOR: quic: Attach proxy QUIC stats counters to the QUIC connection
- BUG/MINOR: quic: Fix potential memory leak during QUIC connection allocations
- MINOR: quic: QUIC stats counters handling
- MINOR: quic: Add tune.quic.retry-threshold keyword
- MINOR: quic: Dynamic Retry implementation
- MINOR: quic/mux-quic: define CONNECTION_CLOSE send API
- MINOR: mux-quic: emit FLOW_CONTROL_ERROR
- MINOR: mux-quic: emit STREAM_LIMIT_ERROR
- MINOR: mux-quic: close connection on error if different data at offset
- BUG/MINOR: peers: fix error reporting of "bind" lines
- CLEANUP: config: improve address parser error report for unmatched protocols
- CLEANUP: config: provide cleare hints about unsupported QUIC addresses
- MINOR: protocol: replace ctrl_type with xprt_type and clarify it
- MINOR: listener: provide a function to process all of a bind_conf's arguments
- MINOR: config: use the new bind_parse_args_list() to parse a "bind" line
- CLEANUP: listener: add a comment about what the BC_SSL_O_* flags are for
- MINOR: listener: add a new "options" entry in bind_conf
- CLEANUP: listener: replace all uses of bind_conf->is_ssl with BC_O_USE_SSL
- CLEANUP: listener: replace bind_conf->generate_cers with BC_O_GENERATE_CERTS
- CLEANUP: listener: replace bind_conf->quic_force_retry with BC_O_QUIC_FORCE_RETRY
- CLEANUP: listener: store stream vs dgram at the bind_conf level
- MINOR: listener: detect stream vs dgram conflict during parsing
- MINOR: listener: set the QUIC xprt layer immediately after parsing the args
- MINOR: listener/ssl: set the SSL xprt layer only once the whole config is known
- MINOR: connection: add flag MX_FL_FRAMED to mark muxes relying on framed xprt
- MINOR: config: detect and report mux and transport incompatibilities
- MINOR: listener: automatically select a QUIC mux with a QUIC transport
- MINOR: listener: automatically enable SSL if a QUIC transport is found
- BUG/MINOR: quic: Fixe a typo in qc_idle_timer_task()
- BUG/MINOR: quic: Missing <conn_opening> stats counter decrementation
- BUILD/MINOR: cpuset fix build for FreeBSD 13.1
- CI: determine actual OpenSSL version dynamically
David CARLIER [Wed, 18 May 2022 14:45:40 +0000 (15:45 +0100)]
BUILD/MINOR: cpuset fix build for FreeBSD 13.1
The cpuset API changes made for the future 14 release have been
backported to the 13.1 release, so the condition selecting which cpuset
API to use changes accordingly.
When we receive a CONNECTION_CLOSE frame, we should decrement this counter
if the handshake state was not successful and if we have not received
a TLS alert from the TLS stack.
[WARNING] (17867) : config : Proxy 'decrypt': A certificate was specified but SSL was not enabled on bind 'quic4@:4449' at [quic-mini.cfg:24] (use 'ssl').
Let's automatically turn SSL on when QUIC is detected, as it doesn't
exist without SSL anyway. It solves the runtime issue, and also makes
sure it is not possible to accidentally configure a quic listener with
no certificate since the error is detected via the SSL checks.
A warning is emitted in this case, to encourage the user to fix the
configuration so that it remains reviewable.
Willy Tarreau [Fri, 20 May 2022 16:07:06 +0000 (18:07 +0200)]
MINOR: listener: automatically select a QUIC mux with a QUIC transport
When no mux protocol is configured on a bind line with "proto", and the
transport layer is QUIC, right now mux_h1 is being used, leading to a
crash.
Now when the transport layer of the bind line is already known as being
QUIC, let's automatically try to configure the QUIC mux, so that users
do not have to enter "proto quic" all the time while it's the only
supported option. This means that the following line now works:
Willy Tarreau [Fri, 20 May 2022 15:53:32 +0000 (17:53 +0200)]
MINOR: config: detect and report mux and transport incompatibilities
Till now, placing "proto h1" or "proto h2" on a "quic" bind or placing
"proto quic" on a TCP line would parse fine but would crash when traffic
arrived. The reason is that there's a strong binding between the QUIC
mux and QUIC transport and that they're not expected to be called with
other types at all.
Now that we have the mux's type and we know the type of the protocol used
on the bind conf, we can perform such checks. This now returns:
[ALERT] (16978) : config : frontend 'decrypt' : stream-based MUX protocol 'h2' is incompatible with framed transport of 'bind quic4@:4448' at [quic-mini.cfg:27].
[ALERT] (16978) : config : frontend 'decrypt' : frame-based MUX protocol 'quic' is incompatible with stream transport of 'bind :4448' at [quic-mini.cfg:29].
This config tightening is only tagged MINOR since such a config, despite
not reporting an error, cannot work at all; so even if it breaks
experimental configs, they were just waiting for a single connection
to crash.
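The compatibility rule behind these two alerts can be sketched as follows. The flag value, enum names and function are assumptions made for illustration; only the notion of a "framed" mux flag (MX_FL_FRAMED) and of stream versus framed transports comes from the commit messages.

```c
/* Hypothetical value; the real MX_FL_FRAMED is defined by HAProxy. */
#define MX_FL_FRAMED_SKETCH 0x1u

enum xprt_type_sketch {
    XPRT_TYPE_STREAM_SKETCH,   /* e.g. a TCP bind */
    XPRT_TYPE_FRAMED_SKETCH    /* e.g. a QUIC bind */
};

enum compat_sketch {
    COMPAT_OK_SKETCH = 0,
    COMPAT_STREAM_MUX_ON_FRAMED_XPRT_SKETCH,  /* e.g. "proto h2" on quic4@ */
    COMPAT_FRAMED_MUX_ON_STREAM_XPRT_SKETCH   /* e.g. "proto quic" on TCP */
};

/* A mux and a transport are compatible only if both are framed or both
 * are stream-based.
 */
enum compat_sketch check_mux_xprt_sketch(unsigned int mux_flags,
                                         enum xprt_type_sketch xprt)
{
    int mux_framed  = !!(mux_flags & MX_FL_FRAMED_SKETCH);
    int xprt_framed = (xprt == XPRT_TYPE_FRAMED_SKETCH);

    if (mux_framed == xprt_framed)
        return COMPAT_OK_SKETCH;
    return mux_framed ? COMPAT_FRAMED_MUX_ON_STREAM_XPRT_SKETCH
                      : COMPAT_STREAM_MUX_ON_FRAMED_XPRT_SKETCH;
}
```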
MINOR: connection: add flag MX_FL_FRAMED to mark muxes relying on framed xprt
In order to be able to check compatibility between muxes and transport
layers, we'll need a new flag to tag muxes that work on framed transport
layers like QUIC. Only QUIC has this flag now.
Willy Tarreau [Fri, 20 May 2022 15:14:31 +0000 (17:14 +0200)]
MINOR: listener/ssl: set the SSL xprt layer only once the whole config is known
We used to preset XPRT_SSL on bind_conf->xprt when parsing the "ssl"
keyword, which required to be careful about what QUIC could have set
before, and which makes it impossible to consider the whole line to
set all options.
Now that we have the BC_O_USE_SSL option on the bind_conf, it becomes
easier to set XPRT_SSL only once the bind_conf's args are parsed.
Willy Tarreau [Fri, 20 May 2022 15:10:00 +0000 (17:10 +0200)]
MINOR: listener: set the QUIC xprt layer immediately after parsing the args
It used to be set when parsing the listeners' addresses but this comes
with some difficulties in that other places have to be careful not to
replace it (e.g. the "ssl" keyword parser).
Now we know what protocols a bind_conf line relies on, we can set it
after having parsed the whole line.
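The idea shared by this commit and the SSL one above (record what was seen while parsing, choose the transport layer only once the whole line is known) might look like the sketch below. All names and values are hypothetical, including the quic_addr_seen field; the real code works on bind_conf->xprt and BC_O_USE_SSL.

```c
/* Hypothetical value standing in for BC_O_USE_SSL. */
#define BC_O_USE_SSL_SKETCH 0x1u

enum xprt_sketch { XPRT_RAW_SKETCH, XPRT_SSL_SKETCH, XPRT_QUIC_SKETCH };

struct bind_line_sketch {
    unsigned int options;   /* flags collected while parsing keywords */
    int quic_addr_seen;     /* set by the address parser (assumption) */
    enum xprt_sketch xprt;  /* decided only after the whole line is parsed */
};

/* The "ssl" keyword only records an option; it no longer touches xprt. */
void parse_ssl_keyword_sketch(struct bind_line_sketch *bc)
{
    bc->options |= BC_O_USE_SSL_SKETCH;
}

/* Called once all arguments are parsed: keyword order no longer matters. */
void finalize_xprt_sketch(struct bind_line_sketch *bc)
{
    if (bc->quic_addr_seen)
        bc->xprt = XPRT_QUIC_SKETCH;
    else if (bc->options & BC_O_USE_SSL_SKETCH)
        bc->xprt = XPRT_SSL_SKETCH;
    else
        bc->xprt = XPRT_RAW_SKETCH;
}
```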
Willy Tarreau [Fri, 20 May 2022 14:20:52 +0000 (16:20 +0200)]
MINOR: listener: detect stream vs dgram conflict during parsing
Now that we have a function to parse all bind keywords, and that we
know what types of sock-level and xprt-level protocols a bind_conf
is using, it's easier to centralize the check for stream vs dgram
conflict by putting it directly at the end of the args parser. This
way it also works for peers, provides better precision in the report,
and will also allow to validate transport layers. The check was even
extended to detect inconsistencies between xprt layer (which were not
covered before). It can even detect that there are two incompatible
"bind" lines in a single peers section.
Willy Tarreau [Fri, 20 May 2022 14:15:01 +0000 (16:15 +0200)]
CLEANUP: listener: store stream vs dgram at the bind_conf level
Let's collect the set of xprt-level and sock-level dgram/stream protocols
seen on a bind line and store that in the bind_conf itself while they're
being parsed. This will make it much easier to detect incompatibilities
later than the current approach which consists in scanning all listeners
in post-parsing.
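A possible way to picture this is a pair of bitmasks accumulated while the addresses of a bind line are parsed. The names and values below are hypothetical and only illustrate the principle; they are not the fields actually added to bind_conf.

```c
/* Hypothetical bit values for the kinds of protocols seen on one line. */
#define SEEN_STREAM_SKETCH 0x1u
#define SEEN_DGRAM_SKETCH  0x2u
#define SEEN_BOTH_SKETCH   (SEEN_STREAM_SKETCH | SEEN_DGRAM_SKETCH)

struct bind_types_sketch {
    unsigned int sock_types;  /* socket-level types seen so far */
    unsigned int xprt_types;  /* transport-level types seen so far */
};

/* Called for each address parsed on the line. */
void bind_note_addr_sketch(struct bind_types_sketch *bc,
                           unsigned int sock_type, unsigned int xprt_type)
{
    bc->sock_types |= sock_type;
    bc->xprt_types |= xprt_type;
}

/* A conflict exists as soon as both kinds were seen at either level. */
int bind_has_conflict_sketch(const struct bind_types_sketch *bc)
{
    return bc->sock_types == SEEN_BOTH_SKETCH ||
           bc->xprt_types == SEEN_BOTH_SKETCH;
}
```

For instance, mixing a TCP address (stream socket) and a QUIC address (datagram socket) on the same line sets both bits at the socket level, so the conflict is visible without scanning the listeners afterwards.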
Willy Tarreau [Fri, 20 May 2022 13:52:31 +0000 (15:52 +0200)]
MINOR: listener: add a new "options" entry in bind_conf
There is no way to store useful info there, yet there's about one entry
per boolean. Let's add an "options" attribute which will collect various
options.
In practice, even the BC_O_SSL_* flags and a few info such as strict_sni
could move there.
Willy Tarreau [Fri, 20 May 2022 13:44:17 +0000 (15:44 +0200)]
MINOR: config: use the new bind_parse_args_list() to parse a "bind" line
This now makes sure that both the peers' "bind" line and the regular one
will use the exact same parser with the exact same behavior. Note that
the parser applies after the address and that it could be factored
further, since the peers one still does quite a bit of duplicated work.
Willy Tarreau [Fri, 20 May 2022 13:41:45 +0000 (15:41 +0200)]
MINOR: listener: provide a function to process all of a bind_conf's arguments
The "bind" parsing code was duplicated for the peers section and as a
result it wasn't kept updated, resulting in slightly different error
behavior (e.g. errors were not freed, warnings were emitted as alerts).
Let's first unify it into a new dedicated function that properly reports
and frees the error.
Willy Tarreau [Fri, 20 May 2022 14:36:46 +0000 (16:36 +0200)]
MINOR: protocol: replace ctrl_type with xprt_type and clarify it
There's been some great confusion between proto_type, ctrl_type and
sock_type. It turns out that ctrl_type was improperly chosen because
it's not the control layer that is of this or that type, but the
transport layer, and it turns out that the transport layer doesn't
(normally) denature the underlying control layer, except for QUIC
which turns dgrams to streams. The fact that the SOCK_{DGRAM|STREAM}
set of values was used added to the confusion.
Let's replace it with xprt_type which reuses the later introduced
PROTO_TYPE_* values, and update the comments to explain which one
works at what level.
Willy Tarreau [Fri, 20 May 2022 13:19:48 +0000 (15:19 +0200)]
BUG/MINOR: peers: fix error reporting of "bind" lines
In case the str2listener() parser reports a generic error with no message
when parsing the argument of a "bind" statement in a "peers" section, the
reported error indicates an invalid address on the empty arg. This has
existed since 2.0 with commit 355b2033e ("MINOR: cfgparse: SSL/TLS binding
in "peers" sections."), so this must be backported till 2.0.
Amaury Denoyelle [Fri, 20 May 2022 13:14:57 +0000 (15:14 +0200)]
MINOR: mux-quic: close connection on error if different data at offset
As specified by the RFC, reception of different STREAM data for the same
offset should be treated with a CONNECTION_CLOSE with error
PROTOCOL_VIOLATION.
Use the ncbuf API to detect this case: the add operation fails with
NCB_RET_DATA_REJ when add mode NCB_ADD_COMPARE is used.
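To show what a "compare" add mode means in principle, here is a toy reassembly buffer. It is not the ncbuf implementation and its names are hypothetical; it only demonstrates rejecting data that differs from bytes already received at the same offset.

```c
#include <stddef.h>

enum add_ret_sketch {
    ADD_OK_SKETCH = 0,
    ADD_NO_SPACE_SKETCH,   /* data does not fit in the buffer */
    ADD_DATA_REJ_SKETCH    /* data differs from what was already stored */
};

/* Toy buffer: one "filled" marker per byte instead of real gap tracking. */
struct rxbuf_sketch {
    unsigned char data[64];
    unsigned char filled[64];
};

enum add_ret_sketch rx_add_compare_sketch(struct rxbuf_sketch *b, size_t off,
                                          const unsigned char *src, size_t len)
{
    size_t i;

    if (off > sizeof(b->data) || len > sizeof(b->data) - off)
        return ADD_NO_SPACE_SKETCH;

    /* First pass: compare overlapping bytes without modifying anything. */
    for (i = 0; i < len; i++)
        if (b->filled[off + i] && b->data[off + i] != src[i])
            return ADD_DATA_REJ_SKETCH;

    /* Second pass: store the data now that it is known to be consistent. */
    for (i = 0; i < len; i++) {
        b->data[off + i] = src[i];
        b->filled[off + i] = 1;
    }
    return ADD_OK_SKETCH;
}
```

A caller receiving the rejection would then close the connection with PROTOCOL_VIOLATION, as the commit describes.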
Amaury Denoyelle [Fri, 20 May 2022 14:45:32 +0000 (16:45 +0200)]
MINOR: mux-quic: emit STREAM_LIMIT_ERROR
Send a CONNECTION_CLOSE on reception of a STREAM frame for a STREAM id
exceeding the maximum value enforced. Only implemented for bidirectional
streams for the moment.
Amaury Denoyelle [Fri, 20 May 2022 13:05:07 +0000 (15:05 +0200)]
MINOR: mux-quic: emit FLOW_CONTROL_ERROR
Send a CONNECTION_CLOSE if the peer emits more data than authorized by
our flow-control. This is implemented for both stream and connection
level.
Fields have been added in qcc/qcs structures to differentiate received
offsets for limit enforcing with consumed offsets for sending of
MAX_DATA/MAX_STREAM_DATA frames.
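The limit enforcement can be pictured with the sketch below, which tracks the highest received offset against the advertised limit. The structure and function are hypothetical; the real fields live in the qcc/qcs structures and are not reproduced here.

```c
#include <stdint.h>

/* Hypothetical receive-side accounting for one stream or one connection. */
struct rx_limit_sketch {
    uint64_t max_data;     /* limit advertised to the peer */
    uint64_t highest_rcv;  /* highest offset received so far */
};

/* Returns 0 if the data fits within the limit, -1 if the peer exceeded it
 * (the caller would then emit CONNECTION_CLOSE with FLOW_CONTROL_ERROR).
 * Offsets are assumed small enough that <offset> + <len> cannot overflow.
 */
int rx_account_sketch(struct rx_limit_sketch *l, uint64_t offset, uint64_t len)
{
    uint64_t end = offset + len;

    if (end > l->max_data)
        return -1;
    if (end > l->highest_rcv)
        l->highest_rcv = end;
    return 0;
}
```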
Amaury Denoyelle [Fri, 20 May 2022 13:04:38 +0000 (15:04 +0200)]
MINOR: quic/mux-quic: define CONNECTION_CLOSE send API
Define an API to easily set a CONNECTION_CLOSE. This will mainly be
useful for the MUX when an error is detected which requires closing the
whole connection.
On the MUX side, a new flag is added when a CONNECTION_CLOSE has been
prepared. This will disable all future send operations.
We rely on the <conn_opening> stats counter and the tune.quic.retry-threshold
setting to dynamically start sending Retry packets. We continue to send such packets
when the "quic-force-retry" setting is set. The difference is when we receive tokens:
we check them regardless of this setting because the Retry could have been
dynamically started. We must also send Retry packets when we receive Initial
packets without token if the dynamic Retry threshold was reached, but only for connections
which are not currently opening or, in other words, for Initial packets without
a connection already instantiated. Indeed, we must not send Retry packets for all
Initial packets without token. For instance, a client may have already sent an
Initial packet without receiving a Retry packet because the Retry feature was not
started; then the Retry starts on exceeding the threshold value due to other
connections; then finally our client decides to send another Initial packet
(to ACK Initial CRYPTO data for instance). It does this without token. So, for
this already existing connection we must not send a Retry packet.
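The decision described above can be summarized by the sketch below. It is a simplified reading of the text, not the actual implementation: the function and parameter names are hypothetical, the strict comparison with the threshold is an assumption based on the word "exceeding", and the choice not to send a Retry for an already instantiated connection even when Retry is forced is also an assumption.

```c
#include <stdbool.h>

/* Hypothetical helper deciding whether an Initial packet should be
 * answered with a Retry packet.
 */
bool must_send_retry_sketch(bool has_token, bool conn_exists,
                            bool force_retry, unsigned int conn_opening,
                            unsigned int retry_threshold)
{
    /* A packet carrying a token is validated instead; tokens are checked
     * regardless of the setting since Retry may have started dynamically.
     */
    if (has_token)
        return false;

    /* Never send a Retry for a connection which is already instantiated. */
    if (conn_exists)
        return false;

    /* Retry forced by configuration ("quic-force-retry"). */
    if (force_retry)
        return true;

    /* Dynamic Retry: too many connections are currently opening. */
    return conn_opening > retry_threshold;
}
```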
This QUIC specific keyword may be used to set the threshold, in number of
connection openings, beyond which QUIC Retry feature will be automatically
enabled. Its default value is 100.
First commit to handle the QUIC stats counters. There is nothing special to say
except perhaps for ->conn_openings which is a gauge to count the number of
connection openings. It is incremented after having instantiated a quic_conn
struct, then decremented when the handshake was successful (handshake completed
state) or failed or when the connection timed out without reaching the handshake
completed state.
BUG/MINOR: quic: Fix potential memory leak during QUIC connection allocations
Move the code which finalizes the QUIC connections initialisations after
having called qc_new_conn() into this function to benefit from its
error handling to release the memory allocated for QUIC connections
the initialization of which could not be finalized.
BUILD: stats: Missing headers inclusions from stats.h
If we add a new stats module to C source files including only
stats.h we get these errors:
include/haproxy/stats.h:39:31: error: array type has incomplete element type
‘struct name_desc’
39 | extern const struct name_desc stat_fields[];
include/haproxy/stats.h:55:50: warning: ‘struct listener’ declared inside
parameter list will not be visible outside of this definition or declaration
55 | int stats_fill_li_stats(struct proxy *px, struct listener *l, int flags,
name_desc struct is defined in tools-t.h and listener struct in listener-t.h.
Here is the format of a token:
- format (1 byte)
- ODCID (from 9 up to 21 bytes)
- creation timestamp (4 bytes)
- salt (16 bytes)
A format byte is required to distinguish the Retry token from others sent in
NEW_TOKEN frames.
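As a quick check of the sizes listed above, the sketch below adds up the four fields. The constant and function names are hypothetical, and the result only covers the listed fields; anything added by the cipher (such as an authentication tag) is deliberately not counted.

```c
#include <stddef.h>

/* Field sizes taken from the token format listed above. */
#define TOKEN_FORMAT_LEN_SKETCH    1
#define TOKEN_ODCID_MIN_SKETCH     9
#define TOKEN_ODCID_MAX_SKETCH     21
#define TOKEN_TIMESTAMP_LEN_SKETCH 4
#define TOKEN_SALT_LEN_SKETCH      16

/* Returns the sum of the listed field sizes for an ODCID field of
 * <odcid_field_len> bytes, or 0 if that length is out of range.
 */
size_t retry_token_len_sketch(size_t odcid_field_len)
{
    if (odcid_field_len < TOKEN_ODCID_MIN_SKETCH ||
        odcid_field_len > TOKEN_ODCID_MAX_SKETCH)
        return 0;

    return TOKEN_FORMAT_LEN_SKETCH + odcid_field_len +
           TOKEN_TIMESTAMP_LEN_SKETCH + TOKEN_SALT_LEN_SKETCH;
}
```

This gives between 30 and 42 bytes for the listed fields.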
The Retry token is ciphered after having derived a strong secret from the cluster secret
and generated the AEAD AAD, as well as a 16-byte salt. This salt is
added to the token. Obviously it is not ciphered. The format byte is not
ciphered either.
The AAD is built by quic_generate_retry_token_aad() which concatenates the version,
the client SCID and the IP address and port. We had to implement quic_saddr_cpy()
to copy the IP address and port to the AAD buffer. Only the Retry SCID is generated
on our side to build a Retry packet; the other fields come from the first packet
received from the client. It must reuse this Retry SCID in response to our Retry packet.
So, we do not have to store it on our side. Everything is offloaded to the client (stateless).
quic_generate_retry_token() must be used to generate a Retry packet. It calls
quic_pkt_encrypt() to cipher the token.
quic_generate_retry_check() must be used to check the validity of a Retry token.
It is able to decipher a token which arrives into an Initial packet in response
to a Retry packet. It calls parse_retry_token() after having deciphered the token
to store the ODCID into a local quic_cid struct variable. Finally this ODCID may
be stored into the transport parameter thanks to qc_lstnr_params_init().
The Retry token lifetime is 10 seconds. This lifetime is also checked by
quic_generate_retry_check(). If quic_generate_retry_check() fails, the received
packet is dropped without any further packet processing at this time.
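The lifetime check could be sketched as below. This is an illustration only: the function name is hypothetical, the timestamp is assumed to be expressed in seconds, and treating a token of exactly 10 seconds as still valid is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Lifetime stated above. */
#define RETRY_TOKEN_LIFETIME_SKETCH 10u

/* Returns true if a token created at <created> is still valid at <now>
 * (both assumed to be in seconds). The unsigned subtraction makes a
 * creation time in the future produce a huge age, hence a rejection.
 */
bool retry_token_fresh_sketch(uint32_t created, uint32_t now)
{
    return (uint32_t)(now - created) <= RETRY_TOKEN_LIFETIME_SKETCH;
}
```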
This function does exactly the same thing as quic_tls_decrypt(), except that
it does not reuse its input buffer as output buffer. This is needed
to decrypt the Retry token without modifying the packet buffer which
contains this token. Indeed, this would prevent us from decrypting
the packet itself as the token belongs to the AEAD AAD for the packet.
This function must be used to derive strong secrets from a non pseudo-random
secret (cluster-secret setting in our case) and an IV. First it calls
quic_hkdf_extract_and_expand() to do that for a temporary strong secret (tmpkey)
then two calls to quic_hkdf_expand() reusing this strong temporary secret
to derive the final strong secret and IV.
Willy Tarreau [Fri, 20 May 2022 07:13:38 +0000 (09:13 +0200)]
BUG/MINOR: cfgparse: abort earlier in case of allocation error
In issue #1563, Coverity reported a very interesting issue about a
possible UAF in the config parser if the config file ends with a
very large line followed by an empty one and the large one causes an
allocation failure.
The issue essentially is that we try to go on with the next line in case
of allocation error, while there's no point doing so. If we failed to
allocate memory to read one config line, the same may happen on the next
one, and we would be blatantly dropping it while trying to parse what
follows it. In the best case, subsequent errors will be incorrect due to
this prior error (e.g. a large ACL definition with many patterns, followed
by a reference to this ACL).
Let's just immediately abort in such a condition where there's no recovery
possible.
This may be backported to all versions once the issue is confirmed to be
addressed.
Amaury Denoyelle [Thu, 19 May 2022 14:45:37 +0000 (16:45 +0200)]
BUG/MEDIUM: quic: fix initialization for local/remote TPs
The local and remote TPs were both processed through the same function
quic_transport_params_init(). This caused the remote TPs to be
overwritten with values configured for our local usage.
Change this by reserving quic_transport_params_init() only for our local
TPs. Remote TPs are simply initialized via
quic_dflt_transport_params_cpy().
This bug could result in a connection closed in error by the client due
to a violation of its TPs. For example, curl client closed the
connection after receiving too many CONNECTION_ID due to an invalid
active_connection_id value used.
Amaury Denoyelle [Wed, 18 May 2022 16:26:13 +0000 (18:26 +0200)]
MINOR: quic: abort on unlisted errno on sendto()
If an unlisted errno is reported, abort the process. If a crash is
reported on this condition, we must determine if the error code is a
bug, should interrupt emission on the fd or if we can retry the syscall.
Amaury Denoyelle [Wed, 18 May 2022 16:14:12 +0000 (18:14 +0200)]
BUG/MINOR: quic: break for error on sendto
If sendto returns an error, we should not retry the call and should break from
the sending loop. An exception is made for EINTR, which allows the syscall
to be retried immediately.
This bug caused an infinite loop reproduced when the process is in the
closing state by SIGUSR1 but there is still QUIC data emission left.
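The loop behaviour described above (retry on EINTR, stop on any other error) can be sketched with an injectable send function. Everything here is hypothetical and independent of the QUIC code; the scripted sender only exists to simulate errno values without a real socket.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical send callback standing in for the sendto() call. */
typedef ssize_t (*send_fn_sketch)(const void *buf, size_t len);

/* Sends <count> datagrams in order and returns how many were sent. */
size_t send_all_sketch(send_fn_sketch send_one, const void *bufs[],
                       const size_t lens[], size_t count)
{
    size_t sent = 0;

    while (sent < count) {
        ssize_t ret = send_one(bufs[sent], lens[sent]);

        if (ret < 0) {
            if (errno == EINTR)
                continue;   /* interrupted: retry the same datagram */
            break;          /* any other error: leave the sending loop */
        }
        sent++;
    }
    return sent;
}

/* Scripted fake sender: each call consumes one entry of the script,
 * 0 meaning success and any other value being the errno to report.
 */
static const int *script_sketch;
static size_t script_pos_sketch;

ssize_t scripted_send_sketch(const void *buf, size_t len)
{
    int err = script_sketch[script_pos_sketch++];

    (void)buf;
    if (err) {
        errno = err;
        return -1;
    }
    return (ssize_t)len;
}
```

Without the break, a persistent error would make such a loop spin forever on the same datagram, which matches the infinite loop reported here.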
MEDIUM: check: Use the CS to handle subscriptions for read/write events
Instead of using the health-check to subscribe to read/write events, we now
rely on the conn-stream. Indeed, on the server side, the conn-stream's
endpoint is a multiplexer. Thus it seems appropriate to handle subscriptions
for read/write events the same way as for the streams. Of course, the I/O
callback function is not the same. We use srv_chk_io_cb() instead of
cs_conn_io_cb().
REORG: check: Rename and export I/O callback function
event_srv_chk_io() function is renamed srv_chk_io_cb() to be consistent with
the I/O callback function of connections. In addition, this function is
exported. It will be required to use the conn-stream's subscriptions.
MEDIUM: check: No longer shutdown the connection in .wake callback function
The connection is already closed by the health-check itself. Thus there is
no reason to duplicate this part in the .wake callback function. It is
enough to wake the health-check and wait.
BUG/MINOR: check: Reinit the buffer wait list at the end of a check
The buffer wait list is used to deal with buffer allocation failure. But at
the end of health-check, it must be reinitialized. There is no reason to
get a buffer between two health-check runs. And in fact, the
associated flags, CHK_ST_IN_ALLOC and CHK_ST_OUT_ALLOC, are already cleared
at the end of a health-check.
This patch must be backported as far as 2.2. On the 2.2, MT_LIST_ADDED and
MT_LIST_DEL must be used instead of LIST_INLIST and LIST_DEL_INIT.
BUG/MEDIUM: config: Reset outline buffer size on realloc error in readcfgfile()
When the line parsing failed because outline buffer must be reallocated, if
my_realloc2() call fails, the buffer size must be reset. Indeed, in this case
the current line is skipped, a fatal error is reported and we jump to the next
line. At this stage the outline buffer is NULL. If the buffer size is not reset,
the next call to parse_line() crashes because we try to write in the buffer. We
fail to detect the outline buffer is too small to copy any character.
To fix the issue, outlinesize variable must be set to 0 when outline allocation
failed.
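The invariant behind this fix (a NULL buffer must always come with a size of 0) can be sketched as follows. The structure and function names are hypothetical and the reallocation function is injected only so that a failure can be simulated; this is not the code of readcfgfile().

```c
#include <stdlib.h>

/* Hypothetical stand-in for the outline buffer and its size variable. */
struct outline_sketch {
    char *area;
    size_t size;
};

typedef void *(*realloc_fn_sketch)(void *, size_t);

/* Always-failing allocator used to simulate memory exhaustion. */
void *failing_realloc_sketch(void *ptr, size_t size)
{
    (void)ptr;
    (void)size;
    return NULL;
}

/* Grows the buffer. On failure the old area is released and, as in the
 * fix described above, the size is reset so that it matches the NULL area.
 */
int outline_grow_sketch(struct outline_sketch *o, size_t newsize,
                        realloc_fn_sketch do_realloc)
{
    char *area = do_realloc(o->area, newsize);

    if (!area) {
        free(o->area);
        o->area = NULL;
        o->size = 0;
        return -1;
    }
    o->area = area;
    o->size = newsize;
    return 0;
}

/* Writes one character only if there is room: with a size of 0, nothing
 * is ever written into the NULL buffer.
 */
int outline_put_sketch(struct outline_sketch *o, size_t pos, char c)
{
    if (pos >= o->size)
        return -1;
    o->area[pos] = c;
    return 0;
}
```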
This patch should fix the issue #1563. It must be backported as far as 2.2.
Amaury Denoyelle [Wed, 18 May 2022 14:19:47 +0000 (16:19 +0200)]
MINOR: mux-quic: free RX buf if empty
Release the QCS RX buffer if emptied after qcs_consume(). This improves
memory usage and prevents a QCS from keeping an allocated buffer, particularly
when no data is received anymore. Buffer is automatically reallocated if
needed via qc_get_ncbuf().
Amaury Denoyelle [Tue, 17 May 2022 16:53:21 +0000 (18:53 +0200)]
BUG/MINOR: mux-quic: support nul buffer with qc_free_ncbuf()
qc_free_ncbuf() may now be used with a NCBUF_NULL buffer as parameter.
This is useful when using this function on a QCS with no allocated
buffer. This case was not reproduced for the moment, but it will soon
become more present as buffers will be released if emptied.
Also a call to offer_buffers() is added to conform with the dynamic
buffer management of haproxy.
Amaury Denoyelle [Mon, 16 May 2022 14:19:59 +0000 (16:19 +0200)]
MINOR: mux-quic: implement MAX_DATA emission
This commit is similar to the previous one but deals with MAX_DATA for
connection-level data flow control. It uses the same function
qcc_consume_qcs() to update flow control level and generate a MAX_DATA
frame if needed.
Send MAX_STREAM_DATA frames when at least half of the allocated
flow-control has been demuxed, frame and cleared. This is necessary to
support QUIC STREAM with received data greater than a buffer.
Transcoders must use the new function qcc_consume_qcs() to empty the QCS
buffer. This makes it possible to monitor the current flow-control level and
to generate a MAX_STREAM_DATA frame if required. This frame will be emitted
via qc_io_cb().
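The "at least half of the window consumed" rule can be sketched as follows.
This is only an illustration under stated assumptions: struct rx_flow and
rx_flow_consume() are hypothetical names and do not reflect the fields or
functions used by the mux-quic code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical receive-side flow-control state for one stream. */
struct rx_flow {
	uint64_t window;   /* size of the window granted to the peer */
	uint64_t limit;    /* highest offset currently advertised */
	uint64_t consumed; /* total bytes demuxed and cleared so far */
};

/* Account for <bytes> consumed by the upper layer. Returns 1 and raises
 * the advertised limit when at least half of the window has been used,
 * meaning a MAX_STREAM_DATA frame carrying fc->limit should be emitted.
 * Returns 0 otherwise.
 */
int rx_flow_consume(struct rx_flow *fc, uint64_t bytes)
{
	uint64_t used;

	/* the peer is expected to respect the advertised limit */
	assert(bytes <= fc->limit - fc->consumed);
	fc->consumed += bytes;

	/* credit already used within the current window */
	used = fc->window - (fc->limit - fc->consumed);
	if (used * 2 < fc->window)
		return 0;

	/* re-open a full window starting from the consumed offset */
	fc->limit = fc->consumed + fc->window;
	return 1;
}
```

With a window of 100 bytes, consuming 30 bytes triggers nothing, while
consuming 20 more reaches the halfway point and raises the limit to 150.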
Adjust the mechanism for MAX_STREAMS_BIDI emission. When a bidirectional
stream is removed, current flow-control level is checked. If needed, a
MAX_STREAMS_BIDI frame is generated and inserted in a new list in the
QCS instance. The new frames will be emitted at the start of qc_send().
This has no impact on the current MAX_STREAMS_BIDI behavior. However,
this mechanism is more flexible and will make it quick to implement
MAX_STREAM_DATA/MAX_DATA emission.
Amaury Denoyelle [Wed, 18 May 2022 09:38:22 +0000 (11:38 +0200)]
MINOR: mux-quic: remove qcc_decode_qcs() call in XPRT
Slightly change the interface for qcc_recv() between MUX and XPRT. The
MUX is now responsible for calling qcc_decode_qcs(). This is cleaner as
the XPRT no longer has to deal with an extra QCS parameter and the MUX
will call qcc_decode_qcs() only if really needed.
This change is possible since there is no extra buffering for
out-of-order STREAM frames and the XPRT does not have to handle buffered
frames.
Amaury Denoyelle [Mon, 16 May 2022 11:54:59 +0000 (13:54 +0200)]
MEDIUM: mux-quic: implement recv on io-cb
Previously, qc_io_cb() of mux-quic only dealt with TX. Add support for
RX in it. This is done through a new function qc_recv(qcc). It loops
over all QCS instances and calls qcc_decode_qcs(qcs).
This has no impact on the quic-conn layer as qcc_decode_qcs(qcs) is
still called directly. However, this provides a resume point when demux
is blocked because the upper layer HTX buffer is full.
Note that for the moment, only RX for bidirectional streams is managed
in qc_io_cb(). Unidirectional streams use their own mechanism for both
TX/RX. It should be unified in the near future in a refactoring.
Amaury Denoyelle [Mon, 16 May 2022 11:54:31 +0000 (13:54 +0200)]
MINOR: h3: flag demux as full on HTX full
Flag QCS if HTX buffer is full on demux. This will block all future
operations on QCS demux and should limit unnecessary decode_qcs() calls.
The flag is cleared on rcv_buf operation called by conn-stream.
Amaury Denoyelle [Thu, 12 May 2022 14:56:16 +0000 (16:56 +0200)]
MINOR: h3: do not wait a complete frame for demuxing
Previously, the H3 demuxer refused to process the payload if the frame was
not entirely received and the QCS buffer was not full. This code was
duplicated from the H2 demuxer.
In H2, this is a justified optimization as only one frame at a time can
be demuxed. However, this is not the case in H3 with interleaved frames
in the lower layer QUIC STREAM frames.
This condition is now removed. The H3 demuxer will process the payload as
soon as possible. An exception is kept for HEADERS frames as the code is not
able to deal with partial HEADERS.
With this change, the H3 demuxer should consume less memory. To ensure that
we never receive a HEADERS frame bigger than the RX buffer, we should use the
H3 SETTINGS_MAX_FIELD_SECTION_SIZE.
Amaury Denoyelle [Tue, 17 May 2022 16:03:37 +0000 (18:03 +0200)]
BUG/MINOR: mux-quic: update session's idle delay before stream creation
This commit is an adaptation from the following patch :
commit d0de6776826ee18da74e6949752e2f44cba8fdf2
Author: Willy Tarreau <w@1wt.eu>
Date: Fri Feb 4 09:05:37 2022 +0100
BUG/MINOR: mux-h2: update the session's idle delay before creating the stream
This should fix the incorrect timeouts present in httplog format for
QUIC requests.
Amaury Denoyelle [Tue, 17 May 2022 16:52:39 +0000 (18:52 +0200)]
MINOR: ncbuf: refactor ncb_advance()
First, fix some typos in comments inside the function. Second,
rename some variables to reduce confusion.
A special case has been inserted when advance is done inside a GAP block
and this block is the last of the buffer. In this case, the whole buffer
will be emptied, equivalent to a ncb_init() operation.
Amaury Denoyelle [Tue, 17 May 2022 16:52:22 +0000 (18:52 +0200)]
BUG/MINOR: ncbuf: fix ncb_is_empty()
ncb_is_empty() was plainly incorrect as it directly dereferenced the
memory to read offset blocks instead of using ncb_read_off(). The result
is undefined.
Also, the BUG_ON() statement is wrong when the buffer starts with a data
block. In this case, ncb_head() is not the first gap offset but instead
just random data. The calculated sum in the BUG_ON() statement thus has no
meaning and may cause an abort. Adjust this by reorganizing the whole
function. Only the first data block size is read. If and only if it is
zero, the first gap size is then checked.
ncb_is_full() has been rewritten to share the same model as
ncb_is_empty().
Amaury Denoyelle [Tue, 17 May 2022 13:01:25 +0000 (15:01 +0200)]
OPTIM: quic: realign empty Rx buffer
quic_rx_pkts_del() function removes packets from QUIC RX buffer. In most
cases, the buffer will be emptied after it. In this case, it's useful to
realign it. This will avoid future data wrapping and the use of
unnecessary junk to fill a contiguous space that is too small.
Amaury Denoyelle [Mon, 16 May 2022 16:13:56 +0000 (18:13 +0200)]
BUG/MEDIUM: quic: fix Rx buffering
The quic-conn manages a buffer to store received QUIC packets. When the
buffer wraps, the gap is filled until the end with junk and packets can
be inserted at the start of the buffer.
On the other side, deletion is implemented via quic_rx_pkts_del().
Packets are removed one by one if their refcount is zero. If junk is
found, the buffer is emptied until its wrap.
This seems to work in most cases but a bug was found in a particular
case: on insertion if the buffer gap is not at the end of the buffer. In
this case, the gap was filled, which is useless as the buffer is then
full and the packet cannot be inserted. Worse, on deletion, when junk is
removed there is a risk of removing new packets. This can happen in the
following case :
1. buffer contig space is too small, junk is inserted in the middle of
it
2. on quic_rx_pkts_del() invocation, a packet is removed, but not the
next one because its refcount is still positive. When a new packet is
received, it will be stored after the junk.
3. on next quic_rx_pkts_del(), when junk is removed, all contig data is
cleared, including the data of newer packets.
This will cause a transfer between a client and haproxy to be stalled.
This can be reproduced with big enough POST requests. I triggered it
with ngtcp2 and 10M of posted data.
Fortunately, the solution to this bug is simple. If contig space is not
big enough to store a packet, but the space is not at the end of the
buffer, no junk is inserted and the packet is dropped as we cannot
buffer it. This ensures that junk is only present at the end of the
buffer and that no packet data is purged with it when it is removed.
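The insertion rule described above can be sketched as a small decision
function. This is an illustration only, not the quic-conn code: struct
rx_ring, enum rx_action and rx_ring_decide() are hypothetical names, and
dropping the packet when the start of the buffer cannot hold it either is
an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>

enum rx_action {
	RX_STORE,        /* packet fits in the contiguous free area */
	RX_PAD_AND_WRAP, /* fill the tail with junk, store at the start */
	RX_DROP,         /* packet cannot be buffered */
};

/* Hypothetical ring buffer state. */
struct rx_ring {
	size_t size; /* total storage size */
	size_t head; /* offset of the oldest stored byte */
	size_t data; /* bytes in use, junk included */
};

/* Decide what to do with a packet of <pktlen> bytes. Junk is only ever
 * inserted when the contiguous free area reaches the end of the storage.
 */
enum rx_action rx_ring_decide(const struct rx_ring *r, size_t pktlen)
{
	size_t end = r->head + r->data;

	assert(pktlen > 0 && r->head < r->size && r->data <= r->size);

	if (end < r->size) {
		/* free area runs up to the end of the storage */
		if (pktlen <= r->size - end)
			return RX_STORE;
		/* padding up to the end is safe if the start has room */
		if (pktlen <= r->head)
			return RX_PAD_AND_WRAP;
		return RX_DROP;
	}

	/* data already wraps: the free area lies in the middle */
	if (pktlen <= r->head - (end - r->size))
		return RX_STORE;
	return RX_DROP; /* never insert junk in the middle */
}
```

Since junk can then only sit at the end of the storage, removing it on
deletion cannot purge packets stored after it.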
Tim Duesterhus [Tue, 17 May 2022 22:22:15 +0000 (00:22 +0200)]
CLEANUP: http_ana: Make use of the return value of stream_generate_unique_id()
Even if `unique_id` and `s->unique_id` are identical it is a bit odd to
`isttest()` `unique_id` and then use `s->unique_id` in the call to `http_add_header()`.
This "issue" was introduced in a17e66289c08a5bfadc1bb5b5f2c618c9299fe1b,
because before that commit the function returned the length of the ID, as it
was not an ist.
When loading providers with 'ssl-provider' global options, this
ssl-provider-path option can be used to set the search path that is to
be used by openssl. It behaves the same way as the OPENSSL_MODULES
environment variable.
Depending on the timing, the second client that should be reported as a
client abort during connection attempt ("CC--" termination state) is
sometimes logged as a server close ("SC--" termination state) instead. It
happens because the connection failure to the server s1 is sometimes detected
by haproxy before the client c2 aborts. There are no retries and the
connection timeout is set to 100ms. So, to work, the client abort must be
performed and detected by haproxy in less than 100ms.
To fix the issue, the c2 client is now routed to a backend with a connection
timeout set to 1 second and 10 retries. It should be large enough to detect
the client aborts (~10s).
In addition, there is another race condition when the script is
started. Sometimes, server s1 is not stopped when the first client sends its
request. So a barrier was added to be sure it is stopped before starting to
send requests. And we wait to be sure the server is detected as DOWN to
unblock the barrier. It is performed by a dedicated backend with a
healthcheck on the server s1.
CLEANUP: proxy: Remove dead code when parsing "http-restrict-req-hdr-names" option
Negation or default modifiers are not supported for this option. However,
this was already tested earlier in the cfg_parse_listen() function. Thus, when
the "http-restrict-req-hdr-names" option is parsed, the keyword modifier is
always equal to KWM_STD. It is useless to test it again at this place.
MINOR: conn-stream/applet: Stop setting appctx as the endpoint context
The appctx is already the endpoint target. It is confusing to also use it to
set the endpoint context. So, never set the endpoint ctx when an appctx is
created or attached to an existing conn-stream.
MEDIUM: applet: Add support for async appctx startup on a thread subset
It is now possible to start an appctx on a thread subset. Some controls were
added here and there. It is forbidden to start a backend appctx on a thread
other than the local one. If a frontend appctx is started on another thread
or a thread subset, the applet .init callback function must be defined. This
callback function is responsible for finalizing the appctx startup. It can be
performed synchronously. In this case, the appctx is started on the local
thread. It is not really useful but it is valid. Or it can be performed
asynchronously. In this case, the .init callback function is called when the
appctx is woken up for the first time. When this happens, the appctx
affinity is set to the current thread to be able to start the session and
the stream.
MINOR: applet: Add API to start applet on a thread subset
In the same way as for tasks, the applet API was changed to be able
to start a new appctx on a thread subset. For now the feature is
disabled. Only appctx_new_here() is working. But it will be possible to
start an appctx on a specific thread or a subset via a mask.
A .init callback function is defined for the peer_applet applet. This
function finishes the appctx startup by calling appctx_finalize_startup()
and it handles the stream customization.
A .init callback function is defined for the sink_forward_applet applet.
This function finishes the appctx startup by calling
appctx_finalize_startup() and it handles the stream customization.
A .init callback function is defined for the httpclient_applet applet. This
function finishes the appctx startup by calling appctx_finalize_startup()
and it handles the stream customization.
A .init callback function is defined for the update_applet applet. This
function finishes the appctx startup by calling appctx_finalize_startup()
and it handles the stream customization.
A .init callback function is defined for the spoe_applet applet. This
function finishes the spoe_appctx initialization. It also finishes the
appctx startup by calling appctx_finalize_startup() and it handles the
stream customization.
A .init callback function is defined for the dns_session_applet applet. This
function finishes the appctx startup by calling appctx_finalize_startup()
and it handles the stream customization.
MINOR: applet: Add function to release appctx on error during init stage
appctx_free_on_early_error() must be used to release a freshly created
frontend appctx if an error occurred during the init stage. It takes care to
release the stream instead of the appctx if it exists. For a backend appctx,
it just calls appctx_free().
MINOR: applet: Add a function to finalize frontend appctx startup
appctx_finalize_startup() may be used to finalize the frontend appctx
startup. It is responsible for creating the appctx's session and the frontend
conn-stream. On error, it is the caller's responsibility to release the
appctx. However, the session is released if it was created. On success, if
an error is encountered in the caller function, the stream must be released
instead of the appctx.
This function should ease the init stage when a new appctx is created.
It is just a helper function that calls the .init applet callback function,
if it exists. This will simplify the init stage a bit when a new applet is
started. For now, this callback function is only used when a new service is
started.