Takes a string representing 7239 forwarded header node (extracted from
either 'for' or 'by' 7239 header fields) as input and translates it
to either unsigned integer or ('_' prefixed obfuscated identifier),
according to 7239RFC.
Example:
# extract 'by' field from forwarded header, extract node port from
# resulting node identifier and store the result in req.fnp
http-request set-var(req.fnp) req.hdr(forwarded),rfc7239_field(by),rfc7239_n2np
#input: "by=\"127.0.0.1:9999\""
# output: 9999
#input: "by=\"_name:_port\""
# output: "_port"
Depends on:
- "MINOR: http_ext: introduce http ext converters"
Takes a string representing 7239 forwarded header node (extracted from
either 'for' or 'by' 7239 header fields) as input and translates it
to either ipv4 address, ipv6 address or str ('_' prefixed if obfuscated
or "unknown" if unknown), according to 7239RFC.
Example:
# extract 'for' field from forwarded header, extract nodename from
# resulting node identifier and store the result in req.fnn
http-request set-var(req.fnn) req.hdr(forwarded),rfc7239_field(for),rfc7239_n2nn
#input: "for=\"127.0.0.1:9999\""
# output: 127.0.0.1
#input: "for=\"_name:_port\""
# output: "_name"
Depends on:
- "MINOR: http_ext: introduce http ext converters"
Takes a string representing 7239 forwarded header single value as
input and extracts a single field/parameter from the header according
to user selection.
Example:
# extract host field from forwarded header and store it in req.fhost var
http-request set-var(req.fhost) req.hdr(forwarded),rfc7239_field(host)
#input: "proto=https;host=\"haproxy.org:80\""
# output: "haproxy.org:80"
# extract for field from forwarded header and store it in req.ffor var
http-request set-var(req.ffor) req.hdr(forwarded),rfc7239_field(for)
#input: "proto=https;host=\"haproxy.org:80\";for=\"127.0.0.1:9999\""
# output: "127.0.0.1:9999"
Depends on:
- "MINOR: http_ext: introduce http ext converters"
This commit is really simple, it adds the required skeleton code to allow
new http_ext converter to be easily registered through STG_REGISTER facility.
Add a new test to prevent any regression for the if-none parameter in
the "forwardfor" proxy option.
This will ensure upcoming refactors don't break reference behavior.
Introducing http_ext class for http extension related work that
doesn't fit into existing http classes.
HTTP extension "forwarded", introduced with 7239 RFC is now supported
by haproxy.
The option supports various modes from simple to complex usages involving
custom sample expressions.
Examples :
# Those servers want the ip address and protocol of the client request
# Resulting header would look like this:
# forwarded: proto=http;for=127.0.0.1
backend www_default
mode http
option forwarded
#equivalent to: option forwarded proto for
# Those servers want the requested host and hashed client ip address
# as well as client source port (you should use seed for xxh32 if ensuring
# ip privacy is a concern)
# Resulting header would look like this:
# forwarded: host="haproxy.org";for="_000000007F2F367E:60138"
backend www_host
mode http
option forwarded host for-expr src,xxh32,hex for_port
# Those servers want custom data in host, for and by parameters
# Resulting header would look like this:
# forwarded: host="host.com";by=_haproxy;for="[::1]:10"
backend www_custom
mode http
option forwarded host-expr str(host.com) by-expr str(_haproxy) for for_port-expr int(10)
# Those servers want random 'for' obfuscated identifiers for request
# tracing purposes while protecting sensitive IP information
# Resulting header would look like this:
# forwarded: for=_000000002B1F4D63
backend www_for_hide
mode http
option forwarded for-expr rand,hex
By default (no argument provided), forwarded option will try to mimic
x-forward-for common setups (source client ip address + source protocol)
The option is not available for frontends.
no option forwarded is supported.
More info about 7239 RFC here: https://www.rfc-editor.org/rfc/rfc7239.html
More info about the feature in doc/configuration.txt
This should address feature request GH #575
Depends on:
- "MINOR: http_htx: add http_append_header() to append value to header"
- "MINOR: sample: add ARGC_OPT"
- "MINOR: proxy: introduce http only options"
This commit is innoffensive but will allow to do some code refactors in
existing proxy http options. Newly created http related proxy options
will also benefit from this.
MINOR: http_htx: add http_append_header() to append value to header
Calling this function as an alternative to http_replace_header_value()
to append a new value to existing header instead of replacing the whole
header content.
If the header already contains one or multiple values: a ',' is automatically
appended before the new value.
This function is not meant for prepending (providing empty ctx value),
in which case we should consider implementing dedicated prepend
alternative function.
Since 7d84439 ("BUILD: hpack: include global.h for the trash that is needed
in debug mode"), hpack decode tool fails to compile on targets that enable
USE_THREAD. (ie: linux-glibc target as reported by Christian Ruppert)
When building hpack devtool, we are including src/hpack-dec.c as a dependency.
src/hpack-dec.c relies on the global trash whe debug mode is enabled.
But as we're building hpack tool with a limited scope of haproxy
sources, global trash (which is declared in src/chunk.c) is not available.
Thus, src/hpack-dec.c relies on a local 'trash' variable declared within
dev/hpack/decode.c
This used to work fine until 7d84439.
But now that global.h is explicitely included in src/hpack-dec.c,
trash variable definition from decode.c conflicts with the one from global.h:
In file included from include/../src/hpack-dec.c:35,
from dev/hpack/decode.c:87:
include/haproxy/global.h:52:35: error: thread-local declaration of 'trash' follows non-thread-local declaration
52 | extern THREAD_LOCAL struct buffer trash;
Adding THREAD_LOCAL attribute to 'decode.c' local trash variable definition
makes the compiler happy again.
This should fix GH issue #2009 and should be backported to 2.7.
Willy Tarreau [Thu, 26 Jan 2023 15:02:01 +0000 (16:02 +0100)]
CLEANUP: mux-h2/trace: shorten the name of the header enc/dec functions
The functions in charge of processing headers have their names in the
traces and they're among the longest of the mux_h2.c file, while even
containing some redundancy. These names are not used outside, let's
shorten them:
Willy Tarreau [Tue, 24 Jan 2023 18:43:11 +0000 (19:43 +0100)]
MEDIUM: mux-h2/trace: add tracing support for headers
Now we can make use of TRACE_PRINTF() to iterate over headers as they
are received or dumped. It's worth noting that the dumps may occasionally
be interrupted due to a buffer full or a realign, but in this case it
will be visible because the trace will restart from the first one. All
these headers (and trailers) may be interleaved with other connections'
so they're all preceeded by the pointer to the connection and optionally
the stream (or alternately the stream ID) to help discriminating them.
Since it's not easy to read the header directions, sent headers are
prefixed with "sndh" and received headers are prefixed with "rcvh", both
of which are rare enough in the traces to conveniently support a quick
grep.
In order to avoid code duplication, h2_encode_headers() was implemented
as a wrapper on top of hpack_encode_header(), which optionally emits the
header to the trace if the trace is active. In addition, for headers that
are encoded using a different method, h2_trace_header() was added as well.
Header names are truncated to 256 bytes and values to 1024 bytes. If
the lengths are larger, they will be truncated and suffixed with
"(... +xxx)" where "xxx" is the number of extra bytes.
Willy Tarreau [Thu, 26 Jan 2023 08:38:53 +0000 (09:38 +0100)]
MINOR: h2: add h2_phdr_to_ist() to make ISTs from pseudo headers
Till now pseudo headers were passed as const strings, but having them as
ISTs will be more convenient for traces. This doesn't change anything for
strings which are derived from them (and being constants they're still
zero-terminated).
Willy Tarreau [Tue, 24 Jan 2023 17:23:59 +0000 (18:23 +0100)]
MINOR: trace: add the long awaited TRACE_PRINTF()
TRACE_PRINTF() can be used to produce arbitrary trace contents at any
trace level. It uses the exact same arguments as other TRACE_* macros,
but here they are mandatory since they are followed by the format-string,
though they may be filled with zeroes. The reason for the arguments is to
match tracking or filtering and not pollute other non-inspected objects.
It will probably be used inside loops, in which case there are two points
to be careful about:
- output atomicity is only per-message, so competing threads may see
their messages interleaved. As such, it is recommended that the
caller places a recognizable unique context at the beginning of the
message such as a connection pointer.
- iterating over arrays or lists for all requests could be very
expensive. In order to avoid this it is best to condition the call
via TRACE_ENABLED() with the same arguments, which will return the
same decision.
- messages longer than TRACE_MAX_MSG-1 (1023 by default) will be
truncated.
For example, in order to dump the list of HTTP headers between hpack
and h2:
if (outlen > 0 &&
TRACE_ENABLED(TRACE_LEVEL_DEVELOPER,
H2_EV_RX_FRAME|H2_EV_RX_HDR, h2c->conn, 0, 0, 0)) {
int i;
for (i = 0; list[i].n.len; i++)
TRACE_PRINTF(TRACE_LEVEL_DEVELOPER, H2_EV_RX_FRAME|H2_EV_RX_HDR,
h2c->conn, 0, 0, 0, "h2c=%p hdr[%d]=%s:%s", h2c, i,
list[i].n.ptr, list[i].v.ptr);
}
In addition, a lower-level TRACE_PRINTF_LOC() macro is provided, that takes
two extra arguments, the caller's location and the caller's function name.
This will allow to emit composite traces from central functions on the
behalf of another one.
Willy Tarreau [Tue, 24 Jan 2023 17:03:07 +0000 (18:03 +0100)]
MINOR: trace: add a trace_no_cb() dummy callback for when to use no callback
By default, passing a NULL cb to the trace functions will result in the
source's default one to be used. For some cases we won't want to use any
callback at all, not event the default one. Let's define a trace_no_cb()
function for this, that does absolutely nothing.
Willy Tarreau [Tue, 24 Jan 2023 16:48:53 +0000 (17:48 +0100)]
MINOR: trace: add a TRACE_ENABLED() macro to determine if a trace is active
Sometimes it would be necessary to prepare some messages, pre-process some
blocks or maybe duplicate some contents before they vanish for the purpose
of tracing them. However we don't want to do that for everything that is
submitted to the traces, it's important to do it only for what will really
be traced.
The __trace() function has all the knowledge for this, to the point of even
checking the lockon pointers. This commit splits the function in two, one
with the trace decision logic, and the other one for the trace production.
The first one is now usable through wrappers such as _trace_enabled() and
TRACE_ENABLED() which will indicate whether traces are going to be produced
for the current source, level, event mask, parameters and tracking.
Willy Tarreau [Tue, 24 Jan 2023 15:02:27 +0000 (16:02 +0100)]
CLEANUP: trace: remove the QUIC-specific ifdefs
There are ifdefs at several places to only define TRC_ARGS_QCON when
QUIC is defined, but nothing prevents this code from building without.
Let's just remove those ifdefs, the single "if" they avoid is not worth
the extra maintenance burden.
Willy Tarreau [Thu, 26 Jan 2023 14:32:12 +0000 (15:32 +0100)]
BUG/MINOR: log: release global log servers on exit
Since 2.6 we have a free_logsrv() function that is used to release log
servers. It must be called from deinit() instead of manually iterating
over the log servers, otherwise some parts of the structure are not
freed (namely the ring name), as reported by ASAN.
Willy Tarreau [Thu, 26 Jan 2023 10:12:11 +0000 (11:12 +0100)]
BUG/MEDIUM: hpack: fix incorrect huffman decoding of some control chars
Commit 9f4f6b038 ("OPTIM: hpack-huff: reduce the cache footprint of the
huffman decoder") replaced the large tables with more space efficient
byte arrays, but one table, rht_bit15_11_11_4, has a 64 bytes hole in it
that wasn't materialized by filling it with zeroes to make the offsets
match, nor by adjusting the offset from the caller. This resulted in
some control chars not properly being decoded and being seen as byte 0,
and the associated messages to be rejected, as can be seen in issue #1971.
This commit fixes it by adjusting the offset used for the higher part of
the table so that we don't need to store 64 zeroes that will never be
accessed.
This needs to be backported to 2.7.
Thanks to Christopher for spotting the bug, and to Juanga Covas for
providing precious traces showing the problem.
Amaury Denoyelle [Wed, 25 Jan 2023 16:44:36 +0000 (17:44 +0100)]
BUG/MEDIUM: mux-quic: fix crash on H3 SETTINGS emission
A major regression was introduced by following patch
commit 71fd03632fff43f11cebc6ff4974723c9dc81c67
MINOR: mux-quic/h3: send SETTINGS as soon as transport is ready
H3 finalize operation is now called at an early stage in the middle of
qc_init(). However, some qcc members are not yet initialized. In
particular the stream tree which will cause a crash when H3 control
stream will be accessed.
To fix this, qcc_install_app_ops() has been delayed at the end of
qc_init(). This ensures that qcc is properly initialized when app_ops
operation are used.
This must be backported wherever above patch is. For the record, it has
been tagged up to 2.7.
Amaury Denoyelle [Wed, 25 Jan 2023 09:50:03 +0000 (10:50 +0100)]
BUG/MINOR: h3: fix GOAWAY emission
Since the rework of QUIC streams send scheduling, each stream has to be
inserted in QUIC-mux send-list to be able to emit content. This was not
the case for GOAWAY which prevent it to be sent. This regression has
been introduced by the following patch :
This new patch fixes the issue by inserting H3 control stream in mux
send-list. The impact is deemed minor as for the moment GOAWAY is only
sent just before connection/mux cleanup with a CONNECTION_CLOSE.
However, it might cause some connections to hang up indefinitely.
Amaury Denoyelle [Tue, 24 Jan 2023 16:35:37 +0000 (17:35 +0100)]
MINOR: mux-quic/h3: send SETTINGS as soon as transport is ready
As specified by HTTP3 RFC, SETTINGS frame should be sent as soon as
possible. Before this patch, this was only done on the first qc_send()
invocation. This delay significantly SETTINGS emission until the first
H3 response is ready to be transferred.
This patch fixes this by ensuring SETTINGS is emitted when MUX-QUIC is
being setup.
As a side point, return value of finalize operation is checked. This
means that an error during SETTINGS emission will cause the connection
init to fail.
Olivier Houchard [Tue, 24 Jan 2023 22:59:32 +0000 (23:59 +0100)]
MINOR: connection: add a BUG_ON() to detect destroying connection in idle list
Add a BUG_ON() in conn_free(), to check that when we're freeing a
connection, it is not still in the idle connections tree, otherwise the
next thread that will try to use it will probably crash.
BUG/MINOR: ssl: Fix leaks in 'update ssl ocsp-response' CLI command
This patch fixes two leaks in the 'update ssl ocsp-response' cli
command. One rather significant one since a whole trash buffer was
allocated for every call of the command, and another more marginal one
in an error path.
Willy Tarreau [Tue, 24 Jan 2023 11:13:14 +0000 (12:13 +0100)]
DEV: haring: add a new option "-r" to automatically repair broken files
In case a file-backed ring was not properly synced before being dumped,
the output can look bogus due to the head pointer not being perfectly
up to date. In this case, passing "-r" will make haring automatically
skip entries not starting with a zero, and resynchronize with the rest
of the messages.
Willy Tarreau [Tue, 24 Jan 2023 11:11:41 +0000 (12:11 +0100)]
BUG/MINOR: sink: make sure to always properly unmap a file-backed ring
The munmap() call performed on exit was incorrect since it used to apply
to the buffer instead of the area, so neither the pointer nor the size
were page-aligned. This patches corrects this and also adds a call to
msync() since munmap() alone doesn't guarantee that data will be dumped.
Willy Tarreau [Sun, 22 Jan 2023 13:20:57 +0000 (14:20 +0100)]
[RELEASE] Released version 2.8-dev2
Released version 2.8-dev2 with the following main changes :
- CLEANUP: htx: fix a typo in an error message of http_str_to_htx
- DOC: config: added optional rst-ttl argument to silent-drop in action lists
- BUG/MINOR: ssl: Fix crash in 'update ssl ocsp-response' CLI command
- BUG/MINOR: ssl: Crash during cleanup because of ocsp structure pointer UAF
- MINOR: ssl: Create temp X509_STORE filled with cert chain when checking ocsp response
- MINOR: ssl: Only set ocsp->issuer if issuer not in cert chain
- MINOR: ssl: Release ssl_ocsp_task_ctx.cur_ocsp when destroying task
- MINOR: ssl: Detect more OCSP update inconsistencies
- BUG/MINOR: ssl: Fix OCSP_CERTID leak when same certificate is used multiple times
- MINOR: ssl: Limit ocsp_uri buffer size to minimum
- MINOR: ssl: Remove mention of ckch_store in error message of cli command
- MINOR: channel: Don't test CF_READ_NULL while CF_SHUTR is enough
- REORG: channel: Rename CF_READ_NULL to CF_READ_EVENT
- REORG: channel: Rename CF_WRITE_NULL to CF_WRITE_EVENT
- MEDIUM: channel: Use CF_READ_EVENT instead of CF_READ_PARTIAL
- MEDIUM: channel: Use CF_WRITE_EVENT instead of CF_WRITE_PARTIAL
- MINOR: channel: Remove CF_READ_ACTIVITY
- MINOR: channel: Remove CF_WRITE_ACTIVITY
- MINOR: channel: Remove CF_ANA_TIMEOUT and report CF_READ_EVENT instead
- MEDIUM: channel: Remove CF_READ_ATTACHED and report CF_READ_EVENT instead
- MINOR: channel: Stop to test CF_READ_ERROR flag if CF_SHUTR is enough
- MINOR: channel/applets: Stop to test CF_WRITE_ERROR flag if CF_SHUTW is enough
- DOC: management: add details on "Used" status
- DOC: management: add details about @system-ca in "show ssl ca-file"
- BUG/MINOR: mux-quic: fix transfer of empty HTTP response
- MINOR: mux-quic: add traces for flow-control limit reach
- MAJOR: mux-quic: rework stream sending priorization
- MEDIUM: h3: send SETTINGS before STREAM frames
- MINOR: mux-quic: use send-list for STOP_SENDING/RESET_STREAM emission
- MINOR: mux-quic: use send-list for immediate sending retry
- BUG/MINOR: h1-htx: Remove flags about protocol upgrade on non-101 responses
- BUG/MINOR: hlua: Fix Channel.line and Channel.data behavior regarding the doc
- BUG/MINOR: resolvers: Wait the resolution execution for a do_resolv action
- BUG/MINOR: ssl: Remove unneeded pointer check in ocsp cli release function
- BUG/MINOR: ssl: Missing ssl_conf pointer check when checking ocsp update inconsistencies
- DEV: tcploop: add minimal support for unix sockets
- BUG/MEDIUM: listener: duplicate inherited FDs if needed
- BUG/MINOR: ssl: OCSP minimum update threshold not properly set
- MINOR: ssl: Treat ocsp-update inconsistencies as fatal errors
- MINOR: ssl: Do not wake ocsp update task if update tree empty
- MINOR: ssl: Reinsert updated ocsp response later in tree in case of http error
- REGTEST: ssl: Add test for 'update ssl ocsp-response' CLI command
- OPTIM: global: move byte counts out of global and per-thread
- BUG/MEDIUM: peers: make "show peers" more careful about partial initialization
- BUG/MINOR: promex: Don't forget to consume the request on error
- MINOR: http-ana: Add a function to set HTTP termination flags
- MINOR: http-ana: Use http_set_term_flags() in most of HTTP analyzers
- BUG/MINOR: http-ana: Report SF_FINST_R flag on error waiting the request body
- MINOR: http-ana: Use http_set_term_flags() when waiting the request body
- BUG/MINOR: http-fetch: Don't block HTTP sample fetch eval in HTTP_MSG_ERROR state
- MAJOR: http-ana: Review error handling during HTTP payload forwarding
- CLEANUP: http-ana: Remove HTTP_MSG_ERROR state
- BUG/MEDIUM: mux-h2: Don't send CANCEL on shutw when response length is unkown
- MINOR: htx: Add an HTX value for the extra field is payload length is unknown
- BUG/MINOR: http-ana: make set-status also update txn->status
- BUG/MINOR: listeners: fix suspend/resume of inherited FDs
- DOC: config: fix wrong section number for "protocol prefixes"
- DOC: config: fix aliases for protocol prefixes "udp4@" and "udp6@"
- DOC: config: mention the missing "quic4@" and "quic6@" in protocol prefixes
- MINOR: listener: also support "quic+" as an address prefix
- CLEANUP: stconn: always use se_fl_set_error() to set the pending error
- BUG/MEDIUM: stconn: also consider SE_FL_EOI to switch to SE_FL_ERROR
- MINOR: quic: Useless test about datagram destination addresses
- MINOR: quic: Disable the active connection migrations
- MINOR: quic: Add "no-quic" global option
- MINOR: sample: Add "quic_enabled" sample fetch
- MINOR: quic: Replace v2 draft definitions by those of the final 2 version
- BUG/MINOR: mux-fcgi: Correctly set pathinfo
- DOC: config: fix "Address formats" chapter syntax
- BUG/MEDIUM: jwt: Properly process ecdsa signatures (concatenated R and S params)
- BUILD: ssl: add ECDSA_SIG_set0() for openssl < 1.1 or libressl < 2.7
- Revert "BUILD: ssl: add ECDSA_SIG_set0() for openssl < 1.1 or libressl < 2.7"
- BUG/MINOR: ssl: Fix compilation with OpenSSL 1.0.2 (missing ECDSA_SIG_set0)
- BUG/MINOR: listener: close tiny race between resume_listener() and stopping
- BUG/MINOR: h3: properly handle connection headers
- MINOR: h3: extend function for QUIC varint encoding
- MINOR: h3: implement TRAILERS encoding
- BUG/MINOR: bwlim: Check scope for period expr for set-bandwitdh-limit actions
- MEDIUM: bwlim: Support constants limit or period on set-bandwidth-limit actions
- BUG/MINOR: bwlim: Fix parameters check for set-bandwidth-limit actions
- MINOR: h3: implement TRAILERS decoding
- BUG/MEDIUM: fd/threads: fix again incorrect thread selection in wakeup broadcast
- BUG/MINOR: thread: always reload threads_enabled in loops
- MINOR: threads: add a thread_harmless_end() version that doesn't wait
- BUG/MEDIUM: debug/thread: make the debug handler not wait for !rdv_requests
- BUG/MINOR: mux-h2: make sure to produce a log on invalid requests
- BUG/MINOR: mux-h2: add missing traces on failed headers decoding
- BUILD: hpack: include global.h for the trash that is needed in debug mode
- BUG/MINOR: jwt: Wrong return value checked
- BUG/MINOR: quic: Do not request h3 clients to close its unidirection streams
- MEDIUM: quic-sock: fix udp source address for send on listener socket
Amaury Denoyelle [Thu, 19 Jan 2023 17:05:54 +0000 (18:05 +0100)]
MEDIUM: quic-sock: fix udp source address for send on listener socket
When receiving a QUIC datagram, destination address is retrieved via
recvmsg() and stored in quic-conn as qc.local_addr. This address is then
reused when using the quic-conn owned socket.
When listener socket mode is preferred, send operation did not specify
the source address of the emitted datagram. If listener socket is bound
on a wildcard address, the kernel is free to choose any address assigned
to the local machine. This may be different from the address selected by
the client on its first datagram which will prevent the client to emit
next replies.
To address this, this patch fixes the UDP source address via sendmsg().
This process is similar to the reception and relies on ancillary
message, so the socket is left untouched after the operation. This is
heavily platform specific and may not be supported by some kernels.
This change has only an impact if listener socket only is used for QUIC
communications. This is the default behavior for 2.7 branch but not
anymore on 2.8. Use tune.quic.socket-owner set to listener to ensure set
it.
BUG/MINOR: quic: Do not request h3 clients to close its unidirection streams
It is forbidden to request h3 clients to close its Control and QPACK unidirection
streams. If not, the client closes the connection with H3_CLOSED_CRITICAL_STREAM(0x104).
Perhaps this could prevent some clients as Chrome to come back for a while.
But at quic_conn level there is no mean to identify the streams for which we cannot
send STOP_SENDING frame. Such a possibility is even not mentionned in RFC 9000.
At this time there is no choice than stopping sending STOP_SENDING frames for
all the h3 unidirectional streams inspecting the ->app_opps quic_conn value.
Willy Tarreau [Thu, 19 Jan 2023 22:22:03 +0000 (23:22 +0100)]
BUG/MINOR: mux-h2: make sure to produce a log on invalid requests
As reported by Dominik Froehlich in github issue #1968, some H2 request
parsing errors do not result in a log being emitted. This is annoying
for debugging because while an RST_STREAM is correctly emitted to the
client, there's no way without enabling traces to find it on the
haproxy side.
After some testing with various abnormal requests, a few places were
found where logs were missing and could be added. In this case, we
simply use sess_log() so some sample fetch functions might not be
available since the stream is not created. But at least there will
be a BADREQ in the logs. A good eaxmple of this consists in sending
forbidden headers or header syntax (e.g. presence of LF in value).
Some quick tests can be done this way:
Willy Tarreau [Thu, 19 Jan 2023 17:52:31 +0000 (18:52 +0100)]
BUG/MEDIUM: debug/thread: make the debug handler not wait for !rdv_requests
The debug handler may deadlock with some threads waiting for isolation.
This may happend during a "show threads" command or even during a panic.
The reason is the call to thread_harmless_end() which waits for rdv_requests
to turn to zero before releasing its position in thread_dump_state,
while that one may not progress if another thread was interrupted in
thread_isolate() and is waiting for that thread to drop thread_dump_state.
In order to address this, we now use thread_harmless_end_sig() introduced
by previous commit:
MINOR: threads: add a thread_harmless_end() version that doesn't wait
However there's a catch: since commit f7afdd910 ("MINOR: debug: mark
oneself harmless while waiting for threads to finish"), there's a second
pair of thread_harmless_now()/thread_harmless_end() that surround the
loop around thread_dump_state. Marking a thread harmless before this
loop and dropping that without checking rdv_requests there could break
the harmless promise made to the other thread if it returns first and
proceeds with its isolated work. Hence we just drop this pair which was
only preventive for other signal handlers, while as indicated in that
patch's commit message, other signals are handled asynchronously and do
not require that extra protection.
This fix must be backported to 2.7.
The problem can be seen by running "show threads" in fast loops (100/s)
while reloading haproxy very quickly (10/s) and sending lots of traffic
to it (100krps, 15 Gbps). In this case the soft stop calls pool_gc()
which isolates a lot and manages to race with the dumps after a few
tens of seconds, leaving the process with all threads at 100%.
Willy Tarreau [Thu, 19 Jan 2023 17:43:48 +0000 (18:43 +0100)]
MINOR: threads: add a thread_harmless_end() version that doesn't wait
thread_harmless_end() needs to wait for rdv_requests to disappear so
that we're certain to respect a harmless promise that possibly allowed
another thread to proceed under isolation. But this doesn't work in a
signal handler because a thread could be interrupted by the debug
handler while already waiting for isolation and with rdv_request>0.
As such this function could cause a deadlock in such a signal handler.
Let's implement a specific variant for this, thread_harmless_end_sig(),
that just resets the thread's bit and doesn't wait. It must of course
not be used past a check point that would allow the isolation requester
to return and see the thread as temporarily harmless then turning back
on its promise.
This will be needed to fix a race in the debug handler.
Willy Tarreau [Thu, 19 Jan 2023 18:14:18 +0000 (19:14 +0100)]
BUG/MINOR: thread: always reload threads_enabled in loops
A few loops waiting for threads to synchronize such as thread_isolate()
rightfully filter the thread masks via the threads_enabled field that
contains the list of enabled threads. However, it doesn't use an atomic
load on it. Before 2.7, the equivalent variables were marked as volatile
and were always reloaded. In 2.7 they're fields in ha_tgroup_ctx[], and
the risk that the compiler keeps them in a register inside a loop is not
null at all. In practice when ha_thread_relax() calls sched_yield() or
an x86 PAUSE instruction, it could be verified that the variable is
always reloaded. If these are avoided (e.g. architecture providing
neither solution), it's visible in asm code that the variables are not
reloaded. In this case, if a thread exists just between the moment the
two values are read, the loop could spin forever.
This patch adds the required _HA_ATOMIC_LOAD() on the relevant
threads_enabled fields. It must be backported to 2.7.
Willy Tarreau [Thu, 19 Jan 2023 16:10:10 +0000 (17:10 +0100)]
BUG/MEDIUM: fd/threads: fix again incorrect thread selection in wakeup broadcast
Commit c1640f79f ("BUG/MEDIUM: fd/threads: fix incorrect thread selection
in wakeup broadcast") fixed an incorrect range being used to pick a thread
when broadcasting a wakeup for a foreign thread, but the selection was still
wrong as the number of threads and their mask was taken from the current
thread instead of the target thread. In addition, the code dealing with the
wakeup of a thread from the same group was still relying on MAX_THREADS
instead of tg->count.
This could theoretically cause random crashes with more than one thread
group though this was never encountered.
Amaury Denoyelle [Fri, 13 Jan 2023 15:40:31 +0000 (16:40 +0100)]
MINOR: h3: implement TRAILERS decoding
Implement the conversion of H3 request trailers as HTX blocks. This is
done through a new function h3_trailers_to_htx(). If the request
contains forbidden trailers it is rejected with a stream error.
BUG/MINOR: bwlim: Fix parameters check for set-bandwidth-limit actions
First, the inspect-delay is now tested if the action is used on a
tcp-response content rule. Then, when an expressions scope is checked, we
now take care to detect the right scope depending on the ruleset used
(tcp-request, tcp-response, http-request or http-response).
MEDIUM: bwlim: Support constants limit or period on set-bandwidth-limit actions
It is now possible to set a constant for the limit or period parameters on a
set-bandwidth-limit actions. The limit must follow the HAProxy size format
and is expressed in bytes. The period must follow the HAProxy time format
and is expressed in milliseconds. Of course, it is still possible to use
sample expressions instead.
The documentation was updated accordingly.
It is not really a bug. Only exemples were written this way in the
documentation. But it could be good to backport this change in 2.7.
Amaury Denoyelle [Thu, 12 Jan 2023 13:53:43 +0000 (14:53 +0100)]
MINOR: h3: implement TRAILERS encoding
This patch implement the conversion of an HTX response containing
trailer into a H3 HEADERS frame. This is done through a new function
named h3_resp_trailers_send().
This was tested with a nginx configuration using <add_trailer>
statement.
It may be possible that HTX buffer only contains a EOT block without
preceeding trailer. In this case, the conversion will produce nothing
but fin will be reported. This causes QUIC mux to generate an empty
STREAM frame with FIN bit set.
Amaury Denoyelle [Tue, 17 Jan 2023 14:21:16 +0000 (15:21 +0100)]
MINOR: h3: extend function for QUIC varint encoding
Slighty adjust b_quic_enc_int(). This function is used to encode an
integer as a QUIC varint in a struct buffer.
A new parameter is added to the function API to specify the width of the
encoded integer. By default, 0 should be use to ensure that the minimum
space is used. Other valid values are 1, 2, 4 or 8. An error is reported
if the width is not large enough.
This new parameter will be useful when buffer space is reserved prior to
encode an unknown integer value. The maximum size of 8 bytes will be
reserved and some data can be put after. When finally encoding the
integer, the width can be requested to be 8 bytes.
With this new parameter, a small refactoring of the function has been
conducted to remove some useless internal variables.
This should be backported up to 2.7. It will be mostly useful to
implement H3 trailers encoding.
Amaury Denoyelle [Tue, 17 Jan 2023 16:47:06 +0000 (17:47 +0100)]
BUG/MINOR: h3: properly handle connection headers
Connection headers are not used in HTTP/3. As specified by RFC 9114, a
received message containing one of those is considered as malformed and
rejected. When converting an HTX message to HTTP/3, these headers are
silently skipped.
This must be backported up to 2.6. Note that assignment to <h3s.err>
must be removed on 2.6 as stream level error has been introduced in 2.7
so this field does not exist in 2.6 A connection error will be used
instead automatically.
Willy Tarreau [Thu, 19 Jan 2023 10:34:21 +0000 (11:34 +0100)]
BUG/MINOR: listener: close tiny race between resume_listener() and stopping
Pierre Cheynier reported a very rare race condition on soft-stop in the
listeners. What happens is that if a previously limited listener is
being resumed by another thread finishing an accept loop, and at the
same time a soft-stop is performed, the soft-stop will turn the
listener's state to LI_INIT, and once the listener's lock is released,
resume_listener() in the second thread will try to resume this listener
which has an fd==-1, yielding a crash in listener_set_state():
FATAL: bug condition "l->rx.fd == -1" matched at src/listener.c:288
The reason is that resume_listener() only checks for LI_READY, but doesn't
consider being called with a non-initialized or a stopped listener. Let's
also make sure we don't try to ressuscitate such a listener there.
BUG/MINOR: ssl: Fix compilation with OpenSSL 1.0.2 (missing ECDSA_SIG_set0)
This function was introduced in OpenSSL 1.1.0. Prior to that, the
ECDSA_SIG structure was public.
This function was used in commit 5a8f02ae "BUG/MEDIUM: jwt: Properly
process ecdsa signatures (concatenated R and S params)".
This patch needs to be backported up to branch 2.5 alongside commit 5a8f02ae.
Willy Tarreau [Thu, 19 Jan 2023 09:50:13 +0000 (10:50 +0100)]
BUILD: ssl: add ECDSA_SIG_set0() for openssl < 1.1 or libressl < 2.7
Commit 5a8f02ae6 ("BUG/MEDIUM: jwt: Properly process ecdsa signatures
(concatenated R and S params)") makes use of ECDSA_SIG_set0() which only
appeared in openssl-1.1.0 and libressl 2.7, and breaks the build before.
Let's just do what it minimally does (only assigns the two fields to the
destination).
This will need to be backported where the commit above is, likely 2.5.
BUG/MEDIUM: jwt: Properly process ecdsa signatures (concatenated R and S params)
When the JWT token signature is using ECDSA algorithm (ES256 for
instance), the signature is a direct concatenation of the R and S
parameters instead of OpenSSL's DER format (see section
3.4 of RFC7518).
The code that verified the signatures wrongly assumed that they came in
OpenSSL's format and it did not actually work.
We now have the extra step of converting the signature into a complete
ECDSA_SIG that can be fed into OpenSSL's digest verification functions.
The ECDSA signatures in the regtest had to be recalculated and it was
made via the PyJWT python library so that we don't end up checking
signatures that we built ourselves anymore.
This patch should fix GitHub issue #2001.
It should be backported up to branch 2.5.
Daniel Corbett [Tue, 17 Jan 2023 23:32:31 +0000 (18:32 -0500)]
DOC: config: fix "Address formats" chapter syntax
The section on "Address formats" doesn't provide the dot (.) after the
chapter numbers, which breaks parsing within the HTML converter.
This commit adds the dot (.) after each chapter within Section 11.
This should be backported to versions 2.4 and above.
Paul Barnetta [Mon, 16 Jan 2023 22:44:11 +0000 (09:44 +1100)]
BUG/MINOR: mux-fcgi: Correctly set pathinfo
Existing logic for checking whether a regex subexpression for pathinfo
is matched results in valid matches being ignored and non-matches having
a new zero length string stored in params->pathinfo. This patch reverses
the logic so params->pathinfo is set when the subexpression is matched.
Without this patch the example configuration in the documentation:
path-info ^(/.+\.php)(/.*)?$
does not result in PATH_INFO being sent to the FastCGI application, as
expected, when the second subexpression is matched (in which case both
pmatch[2].rm_so and pmatch[2].rm_eo will be non-negative integers).
This patch may be backported as far as 2.2, the first release that made
the capture of this second subexpression optional.
This sample fetch returns a boolean. True if the support for QUIC transport
protocol was built and if this protocol was not disabled by "no-quic"
global option.
Add "no-quic" to "global" section to disable the use of QUIC transport protocol
by all configured QUIC listeners. This is listeners with QUIC addresses on their
"bind" lines. Internally, the socket addresses binding is skipped by
protocol_bind_all() for receivers with <proto_quic4> or <proto_quic6> as
protocol (see protocol struct).
Add information about "no-quic" global option to the documentation.
MINOR: quic: Disable the active connection migrations
Set "disable_active_migration" transport parameter to inform the peer
haproxy listeners does not the connection migration feature.
Also drop all received datagrams with a modified source address.
Willy Tarreau [Tue, 17 Jan 2023 15:27:35 +0000 (16:27 +0100)]
BUG/MEDIUM: stconn: also consider SE_FL_EOI to switch to SE_FL_ERROR
In se_fl_set_error() we used to switch to SE_FL_ERROR only when there
is already SE_FL_EOS indicating that the read side is closed. But that
is not sufficient, we need to consider all cases where no more reads
will be performed on the connection, and as such also include SE_FL_EOI.
Without this, some aborted connections during a transfer sometimes only
stop after the timeout, because the ERR_PENDING is never promoted to
ERROR.
This must be backported to 2.7 and requires previous patch "CLEANUP:
stconn: always use se_fl_set_error() to set the pending error".
Willy Tarreau [Tue, 17 Jan 2023 15:25:29 +0000 (16:25 +0100)]
CLEANUP: stconn: always use se_fl_set_error() to set the pending error
In mux-h2 and mux-quic we still had two places manually setting
SE_FL_ERR_PENDING or SE_FL_ERROR depending on the EOS state, instead
of using se_fl_set_error() which takes care of the condition. Better
use the specialized function for this, it will allow to centralize
the conditions. Note that this will be needed to fix a bug.
Willy Tarreau [Mon, 16 Jan 2023 12:55:27 +0000 (13:55 +0100)]
MINOR: listener: also support "quic+" as an address prefix
While we do support quic4@ and quic6@ for listening addresses, it was
not possible to specify that we want to use an FD inherited from the
parent with QUIC. It's just a matter of making it possible to enable
a dgram-type socket and a stream-type transport, so let's add this.
Now it becomes possible to write "quic+fd@12", "quic+ipv4@addr" etc.
Willy Tarreau [Mon, 16 Jan 2023 10:47:01 +0000 (11:47 +0100)]
BUG/MINOR: listeners: fix suspend/resume of inherited FDs
FDs inherited from a parent process do not deal well with suspend/resume
since commit 59b5da487 ("BUG/MEDIUM: listener: never suspend inherited
sockets") introduced in 2.3. The problem is that we now report that they
cannot be suspended at all, and they return a failure. As such, if a new
process fails to bind and sends SIGTTOU to the previous process, that
one will notice the failure and instantly switch to soft-stop, leaving
no chance to the new process to give up later and signal its failure.
What we need to do, however, is to stop receiving new connections from
such inherited FDs, which just means that the FD must be unsubscribed
from the poller (and resubscribed later if finally it has to stay).
With this a new process can start on the already bound FD without
problem thanks to the absence of polling, and when the old process
stops the new process will be alone on it.
Willy Tarreau [Tue, 10 Jan 2023 13:50:44 +0000 (14:50 +0100)]
BUG/MINOR: http-ana: make set-status also update txn->status
Patrick Hemmer reported an interesting case where the status present in
the logs doesn't reflect what was reported to the user. During analysis
we could figure that it was in fact solely caused by the code dealing
with the set-status action. Indeed, set-status does update the status
in the HTX message itself but not in the HTTP transaction. However, at
most places where the status is needed to take a decision, it is
retrieved from the transaction, and the logs proceed like this as well,
though the "status" sample fetch function does retrieve it from the HTX
data. This particularly means that once a set-status has been used to
modify the status returned to the user, logs do not match that status,
and the response code distribution doesn't match either. However a
subsequent rule using the status as a condition will still match because
the "status" sample fetch function does also extract the status from the
HTX stream. Here's an example that fails:
This will return a 400 to the client but log a 503 and increment
http_rsp_5xx.
In the end the root cause is that we need to make txn->status the only
authoritative place to get the status, and as such it must be updated
by the set-status rule. Ideally "status" should just use txn->status
but with the two synchronized this way it's not needed.
This should be backported since it addresses some consistency issues
between logs and what's observed. The set-status action appeared in
1.9 so all stable versions are eligible.
MINOR: htx: Add an HTX value for the extra field is payload length is unknown
When the payload length cannot be determined, the htx extra field is set to
the magical vlaue ULLONG_MAX. It is not obvious. This a dedicated HTX value
is now used. Now, HTX_UNKOWN_PAYLOAD_LENGTH must be used in this case,
instead of ULLONG_MAX.
BUG/MEDIUM: mux-h2: Don't send CANCEL on shutw when response length is unkown
Since commit 473e0e54 ("BUG/MINOR: mux-h2: send a CANCEL instead of ES on
truncated writes"), a CANCEL may be reported when the response length is
unkown. It happens for H1 reponses without "Content-lenght" or
"Transfer-encoding" header.
Indeed, in this case, the end of the reponse is detected when the server
connection is closed. On the fontend side, the H2 multiplexer handles this
event as an abort and sensd a RST_STREAM frame with CANCEL error code.
The issue is not with the above commit but with the commit 4877045f1
("MINOR: mux-h2: make streams know if they need to send more data"). The
H2_SF_MORE_HTX_DATA flag must only be set if the payload length can be
determined.
This patch should fix the issue #1992. It must be backported to 2.7.
MAJOR: http-ana: Review error handling during HTTP payload forwarding
The error handling in the HTTP payload forwarding is far to be ideal because
both sides (request and response) are tested each time. It is espcially ugly
on the request side. To report a server error instead of a client error,
there are some workarounds to delay the error handling. The reason is that
the request analyzer is evaluated before the response one. In addition,
errors are tested before the data analysis. It means it is possible to
truncate data because errors may be handled to early.
So the error handling at this stages was totally reviewed. Aborts are now
handled after the data analysis. We also stop to finish the response on
request error or the opposite. As a side effect, the HTTP_MSG_ERROR state is
now useless. As another side effect, the termination flags are now set by
the HTTP analysers and not process_stream().
BUG/MINOR: http-fetch: Don't block HTTP sample fetch eval in HTTP_MSG_ERROR state
It was inherited from the legacy HTTP mode, but the message parsing is
handled by the underlying mux now. Thus, if a message is in HTTP_MSG_ERROR
state, it is just an analysis error and not a parsing error. So there is no
reason to block the HTTP sample fetch evaluation in this case.
This patch could be backported in all stable versions (For the 2.0, only the
htx part must be updated).
MINOR: http-ana: Use http_set_term_flags() when waiting the request body
When HAProxy is waiting for the request body and an abort or an error is
detected, we can now use http_set_term_flags() function to set the termination
flags of the stream instead of handling it by hand.
BUG/MINOR: http-ana: Report SF_FINST_R flag on error waiting the request body
When we wait for the request body, we are still in the request analysis. So
a SF_FINST_R flag must be reported in logs. Even if some data are already
received, at this staged, nothing is sent to the server.
This patch could be backported in all stable versions.
MINOR: http-ana: Use http_set_term_flags() in most of HTTP analyzers
We use the new function to set the HTTP termination flags in the most
obvious places. The other places are a bit specific and will be handled one
by one in dedicated patched.
MINOR: http-ana: Add a function to set HTTP termination flags
There is already a function to set termination flags but it is not well
suited for HTTP streams. So a function, dedicated to the HTTP analysis, was
added. This way, this new function will be called for HTTP analysers on
error. And if the error is not caugth at this stage, the generic function
will still be called from process_stream().
Here, by default a PRXCOND error is reported and depending on the stream
state, the reson will be set accordingly:
* If the backend SC is in INI state, SF_FINST_T is reported on tarpit and
SF_FINST_R otherwise.
* SF_FINST_Q is the server connection is queued
* SF_FINST_C in any connection attempt state (REQ/TAR/ASS/CONN/CER/RDY).
Except for applets, a SF_FINST_R is reported.
* Once the server connection is established, SF_FINST_H is reported while
HTTP_MSG_DATA state on the response side.
* SF_FINST_L is reported if the response is in HTTP_MSG_DONE state or
higher and a client error/timeout was reported.
BUG/MINOR: promex: Don't forget to consume the request on error
When the promex applet triggers an error, for instance because the URI is
invalid, we must still take care to consume the request. Otherwise, the
error will be handled by HTTP analyzers as a server abort.
Willy Tarreau [Thu, 12 Jan 2023 16:09:34 +0000 (17:09 +0100)]
BUG/MEDIUM: peers: make "show peers" more careful about partial initialization
Since 2.6 with commit 34e4085f8 ("MEDIUM: peers: Balance applets across
threads") the initialization of a peers appctx may be postponed with a
wakeup, causing some partially initialized appctx to be visible. The
"show peers" command used to only care about peers without appctx, but
now it must also take care of those with no stconn, otherwise it can
occasionally crash while dumping them.
This fix must be backported to 2.6. Thanks to Patrick Hemmer for
reporting the problem.
Willy Tarreau [Thu, 12 Jan 2023 15:08:41 +0000 (16:08 +0100)]
OPTIM: global: move byte counts out of global and per-thread
During multiple tests we've already noticed that shared stats counters
have become a real bottleneck under large thread counts. With QUIC it's
pretty visible, with qc_snd_buf() taking 2.5% of the CPU on a 48-thread
machine at only 25 Gbps, and this CPU is entirely spent in the atomic
increment of the byte count and byte rate. It's also visible in H1/H2
but slightly less since we're working with larger buffers, hence less
frequent updates. These counters are exclusively used to report the
byte count in "show info" and the byte rate in the stats.
Let's move them to the thread_ctx struct and make the stats reader
just collect each thread's stats when requested. That's way more
efficient than competing on a single cache line.
After this, qc_snd_buf has totally disappeared from the perf profile
and tests made in h1 show roughly 1% performance increase on small
objects.
MINOR: ssl: Reinsert updated ocsp response later in tree in case of http error
When updating an OCSP response, in case of HTTP error (host unreachable
for instance) we do not want to reinsert the entry at the same place in
the update tree otherwise we might retry immediately the update of the
same response. This patch adds an arbitrary 1min time to the next_update
of a response in such a case.
After an HTTP error, instead of waking the update task up after an
arbitrary 10s time, we look for the first entry of the update tree and
sleep for the apropriate time.
MINOR: ssl: Do not wake ocsp update task if update tree empty
In the unlikely event that the ocsp update task is started but the
update tree is empty, put the update task to sleep indefinitely.
The only way this can happen is if the same certificate is loaded under
two different names while the second one has the 'ocsp-update on'
option. Since the certificate names are distinct we will have two
ckch_stores but a single certificate_ocsp because they are identified by
the OCSP_CERTID which is built out of the issuer certificate and the
certificate id (which are the same regardless of the .pem file name).
MINOR: ssl: Treat ocsp-update inconsistencies as fatal errors
If incompatibilities are found in a certificate's ocsp-update mode we
raised a single alert that will be considered fatal from here on. This
is changed because in case of incompatibilities we will end up with an
undefined behaviour. The ocsp response might or might not be updated
depending on the order in which the multiple ocsp-update options are
taken into account.
BUG/MINOR: ssl: OCSP minimum update threshold not properly set
An arbitrary 5 minutes minimum interval between two updates of the same
OCSP response is defined but it was not properly used when inserting
entries in the update tree.
Willy Tarreau [Wed, 11 Jan 2023 09:59:52 +0000 (10:59 +0100)]
BUG/MEDIUM: listener: duplicate inherited FDs if needed
Since commit 36d9097cf ("MINOR: fd: Add BUG_ON checks on fd_insert()"),
there is currently a test in fd_insert() to detect that we're not trying
to reinsert an FD that had already been inserted. This test catches the
following anomalies:
frontend fail1
bind fd@0
bind fd@0
and:
frontend fail2
bind fd@0 shards 2
What happens is that clone_listener() is called on a listener already
having an FD, and when sock_{inet,unix}_bind_receiver() are called, the
same FD will be registered multiple times and rightfully crash in the
sanity check.
It wouldn't be correct to block shards though (e.g. they could be used
in a default-bind line). What looks like a safer and more future-proof
approach simply is to dup() the FD so that each listener has one copy.
This is also the only solution that might allow later to support more
than 64 threads on an inherited FD.
This needs to be backported as far as 2.4. Better wait for at least
one extra -dev version before backporting though, as the bug should
not be triggered often anyway.
Willy Tarreau [Wed, 11 Jan 2023 09:54:59 +0000 (10:54 +0100)]
DEV: tcploop: add minimal support for unix sockets
Since the tool permits to pass an FD bound for listening, it's convenient
to test haproxy's "bind fd@". Let's add support for UNIX sockets the same
way. -U needs to be passed to change the default address family, and the
address must contain a "/".
E.g.
$ dev/tcploop/tcploop -U /tmp/ux L Xi ./haproxy -f fd1.cfg
BUG/MINOR: resolvers: Wait the resolution execution for a do_resolv action
The do_resolv action triggers a resolution and must wait for the
result. Concretely, if no cache entry is available, it creates a resolution
and wakes up the resolvers task. Then it yields. When the action is
recalled, if the resolution is still running, it yields again.
However, if the resolution is not running, it does not check it was
running. Thus, it is possible to ignore the resolution because the action
was recalled before the resolvers task had a chance to be executed. If there
is result, the action must yield.
This patch should fix the issue #1993. It must be backported as far as 2.0.
BUG/MINOR: hlua: Fix Channel.line and Channel.data behavior regarding the doc
These both functions are buggy and don't respect the documentation. They
must wait for more data, if possible.
For Channel.data(), it must happen if not enough data was received orf if no
length was specified and no data was received. The first case is properly
handled but not the second one. An empty string is return instead. In
addition, if there is no data and the channel can't receive more data, 'nil'
value must be returned.
In the same spirit, for Channel.line(), we must try to wait for more data
when no line is found if not enough data was received or if no length was
specified. Here again, only the first case is properly handled. And for this
function too, 'nil' value must be returned if there is no data and the
channel can't receive more data.
This patch is related to the issue #1993. It must be backported as far as
2.5.
BUG/MINOR: h1-htx: Remove flags about protocol upgrade on non-101 responses
It is possible to have an "upgrade:" header and the corresponding value in
the "connection:" header for a non-101 response. It happens for
426-Upgrade-Required messages. However, on HAProxy side, a parsing error is
reported for this kind of message because no websocket key header
("sec-websocket-accept:") is found in the response.
So a possible fix could be to not perform this test for non-101
responses. However, having flags about protocol upgrade on this kind of
response could lead to other bugs. Instead, corresponding flags are
removed. Thus, during the H1 response post-parsing, H1_MF_CONN_UPG and
H1_MF_UPG_WEBSOCKET flags are removed from any non-101 response.
This patch should fix the issue #1997. It must be backported as far as 2.4.
MINOR: mux-quic: use send-list for immediate sending retry
Sending is done with several iterations over qcs streams in qc_send().
The first loop is conducted over streams in <qcc.send_list>. After this
first iteration, some streams may still have data in their Tx buffer but
were blocked by a full qc_stream_desc buffer. In this case, they have
release their qc_stream_desc buffer in qcc_streams_sent_done().
New iterations can be done for these streams which can allocate new
qc_stream_desc buffer if available. Before this patch, this was done
through another stream list <qcc.send_retry_list>. Now, we can reuse the
new <qcc.send_list> for this usage.
This is safe to use as after first iteration, we have guarantee that
either one of the following is true if there is still streams in
<qcc.send_list> :
* transport layer has rejected data due to congestion
* stream is left because it is blocked on stream flow control
* stream still has data and has released a fulfilled qc_stream_desc
buffer. Immediate retry is useful for these streams : they will
allocate a new qc_stream_desc buffer if possible to continue sending.
MINOR: mux-quic: use send-list for STOP_SENDING/RESET_STREAM emission
When a STOP_SENDING or RESET_STREAM must be send, its corresponding qcs
is inserted into <qcc.send_list> via qcc_reset_stream() or
qcc_abort_stream_read().
This allows to remove the iteration on full qcs tree in qc_send().
Instead, STOP_SENDING and RESET_STREAM is done in the loop over
<qcc.send_list> as with STREAM frames. This should improve slightly the
performance, most notably when large number of streams are opened.
Complete qcc_send_stream() function to allow to specify if the stream
should be handled in priority. Internally this will insert the qcs
instance in front of <qcc.send_list> to be able to treat it before other
streams.
This functionality is useful when some QUIC streams should be sent
before others. Most notably, this is used to guarantee that H3 SETTINGS
is done first via the control stream.
Implement a mechanism to register streams ready to send data in new
STREAM frames. Internally, this is implemented with a new list
<qcc.send_list> which contains qcs instances.
A qcs can be registered safely using the new function qcc_send_stream().
This is done automatically in qc_send_buf() which covers most cases.
Also, application layer is free to use it for internal usage streams.
This is currently the case for H3 control stream with SETTINGS sending.
The main point of this patch is to handle stream sending fairly. This is
in stark contrast with previous code where streams with lower ID were
always prioritized. This could cause other streams to be indefinitely
blocked behind a stream which has a lot of data to transfer. Now,
streams are handled in an order scheduled by se_desc layer.
This commit is the first one of a serie which will bring other
improvments which also relied on the send_list implementation.
This must be backported up to 2.7 when deemed sufficiently stable.