git.ipfire.org Git - thirdparty/haproxy.git/log

BUG/MAJOR: ssl/ocsp: lock the OCSP response around reads in the stapling callback

ssl_sock_ocsp_stapling_cbk() reads ocsp->response.area and
ocsp->response.data without any lock, while
ssl_sock_load_ocsp_response() -- called from the CLI "set ssl
ocsp-response" handler, the ocsp-update task and the reload path --
frees and replaces that buffer via chunk_dup(), also without any
synchronization:

    ssl_buf = OPENSSL_malloc(ocsp->response.data);
    ...
    memcpy(ssl_buf, ocsp->response.area, ocsp->response.data);
    SSL_set_tlsext_status_ocsp_resp(ssl, ssl_buf, ocsp->response.data);

A concurrent update frees the old area between the allocation and the
copy (heap-use-after-free read, confirmed under ASan as a READ of size
12548 in the callback's memcpy), or yields a torn read pairing the new
larger length with the old smaller area (linear over-read). Because the
copied bytes are handed to SSL_set_tlsext_status_ocsp_resp() and sent
to the TLS client in the status_request extension, freed or reused heap
contents can be disclosed to any remote client asking for a stapled
response during an update window; updates are periodic by default when
ocsp-update is enabled. A crash is the more likely practical outcome,
but the disclosure variant makes this heartbleed-class.

Let's take ocsp_tree_lock on both sides, which is the file's existing
idiom for accessing OCSP response contents (see
ssl_get_ocspresponse_detail()). The lock declaration is moved to the
top of the !OPENSSL_NO_OCSP section so that the callback can use it.
The critical section on the handshake path stays short (a validity
check plus a copy of a few KB), and allocating memory under this
spinlock is consistent with the existing "show ssl ocsp-response"
handler, which base64 encodes the response into a growable chunk while
holding the same lock.

This must be backported to all supported versions.

BUG/MEDIUM: sample: reject the deprecated protobuf group wire types

protobuf_field_lookup() dispatches on the field's wire type through
protobuf_parser_defs[], but the entries for wire types 3 and 4
(START_GROUP/STOP_GROUP, deprecated) are zero-initialized:

    [PBUF_TYPE_START_GROUP     ] = {
            /* XXX Deprecated XXX */
    },
    [PBUF_TYPE_STOP_GROUP      ] = {
            /* XXX Deprecated XXX */
    },

while the only validation is the upper bound on the table:

    if (wire_type >= sizeof(protobuf_parser_defs) / sizeof(protobuf_parser_defs[0]))
            return 0;

A field using wire type 3 or 4 therefore passes the check and reaches a
call through the NULL .skip/.smp_store pointer: a request body
consisting of the single byte 0x0b (field 1, wire type 3) crashes the
process at pc=0. Any configuration routing a client body through the
"protobuf" converter (e.g. "http-request set-var(txn.f)
req.body,protobuf(1)") makes this remotely and unauthenticatedly
triggerable.

Let's reject wire types whose parser definition is missing and return 0
("field not found"), in line with the function's existing failure
contract.
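
As a minimal sketch of that check, assuming a simplified table with a single callback per wire type (skip_fn, parser_defs and field_skip are illustrative names, not the real ones), the lookup can refuse empty slots before calling through them:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative handler type; the real table holds more callbacks. */
typedef int (*skip_fn)(const uint8_t *pos, size_t len);

/* Returns the number of bytes used by a varint, 0 if truncated. */
int skip_varint(const uint8_t *pos, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++)
        if (!(pos[i] & 0x80))
            return (int)(i + 1);
    return 0;
}

/* Slots 3 and 4 (deprecated groups) are left NULL, and so are the
 * wire types that this sketch does not implement.
 */
const skip_fn parser_defs[6] = {
    [0] = skip_varint,
};

int field_skip(uint8_t tag, const uint8_t *pos, size_t len)
{
    unsigned int wire_type = tag & 0x7;

    assert(pos != NULL);
    if (wire_type >= sizeof(parser_defs) / sizeof(parser_defs[0]))
        return 0;       /* outside of the table */
    if (!parser_defs[wire_type])
        return 0;       /* inside the table, but no parser behind it */
    return parser_defs[wire_type](pos, len);
}
```

With this, the tag 0x0b (field 1, wire type 3) is reported as "field not found" instead of being dispatched.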

The protobuf converter first appeared in 2.2; this should be backported
to all supported versions.

BUG/MEDIUM: peers: check the available room before encoding dict values

peer_prepare_updatemsg() encodes stick-table entries into an update
message of <size> bytes but never checks that the encoded data actually
fits (the function still carries its "TODO: check size" comment). For
entries holding a server_key, the dictionary value, up to ~16 KB, is
copied unconditionally:

    /* Encode the length of the dictionary entry data */
    value_len = de->len;
    intencode(value_len, &end);
    /* Copy the data */
    memcpy(end, de->value.key, value_len);

The peers protocol is plaintext and unauthenticated, so a remote peer
can plant an entry whose server_key is larger than the room left in the
buffer. When the victim later teaches that entry, the memcpy() above
writes past the end of the 16384-byte trash buffer; this was confirmed
under ASan as a heap-buffer-overflow WRITE of size 16360, and it fires
again on every teach retry at the same address.

Let's verify that the value fits in the buffer before copying it, and
return 0 ("unable to encode the message") when it does not, which is
how the function already reports its other encoding failures.
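
The idea can be sketched as follows. This is not the peers code: encode_value() is a hypothetical helper, and a fixed 2-byte length prefix stands in for intencode() to keep the example short:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Appends <len> as 2 big-endian bytes followed by <val> at *pos, never
 * writing at or beyond <limit>. Returns 1 on success, or 0 without
 * touching *pos when the value does not fit.
 */
int encode_value(char **pos, const char *limit, const char *val, size_t len)
{
    char *end = *pos;

    assert(end <= limit);
    if (len > 0xffff || (size_t)(limit - end) < len + 2)
        return 0;       /* unable to encode the message */

    *end++ = (char)(len >> 8);
    *end++ = (char)(len & 0xff);
    memcpy(end, val, len);
    *pos = end + len;
    return 1;
}
```

The room is computed from the current position rather than from the start of the buffer, so whatever was already encoded in front of the value is accounted for.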

This must be backported to all supported versions (server_key in
stick-tables dates back to 2.4).

CLEANUP: slz: clarify that the size promise applies to the stream, not to a call

The output size guarantee of slz_rfc1951_encode() reads as if it applied
to every call, but up to 31 bits are retained in the queue from one call
to the next (on 64-bit systems), so a call may emit a few bytes that
belong to the data of the previous ones, and a single call may emit up to
5 bytes more than expected. Let's just clarify this to avoid future
surprises.

It should be backported to 1.2 for API clarity.

This is libslz upstream commit 5fa0c8da22b7d0a6d67f287a5a2af6af8e6d2b85.

Note: no impact on running code, it's just a comment that may be backported
if it helps with another patch's context.

BUG/MINOR: slz: avoid undefined shifts when building the word byte by byte

On the architectures that do not define UNALIGNED_FASTER, the 32-bit word
compared against the reference table is assembled one byte at a time:

    word = ((unsigned char)in[pos] << 8) +
           ((unsigned char)in[pos + 1] << 16) +
           ((unsigned char)in[pos + 2] << 24);

Unexpectedly, an unsigned char is promoted to a *signed* int when
shifting, so shifting a value >= 128 by 24 places can overflow it, which
is undefined behaviour depending on build options. It happens to produce
the expected result with the usual compilers, but ubsan on an i386 build
reports it for about half of input bytes:

  src/slz.c:482:107: runtime error: left shift of 220 by 24 places cannot
                     be represented in type 'int'

Let's just properly cast the uchars to u32 before shifting (this does
not change the produced code at all).
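
For illustration, a standalone version of the byte-by-byte load with the casts in place could look like this (load_word() is a hypothetical name, and uint32_t stands in for the u32 type):

```c
#include <assert.h>
#include <stdint.h>

/* Assembles the 3 input bytes into bits 8..31 of the word. Each byte is
 * converted to a 32-bit unsigned type first, so the shift by 24 is
 * performed on an unsigned value and cannot overflow a signed int.
 */
uint32_t load_word(const unsigned char *in)
{
    assert(in != 0);
    return ((uint32_t)in[0] << 8) +
           ((uint32_t)in[1] << 16) +
           ((uint32_t)in[2] << 24);
}
```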

This must be backported to 1.2.

This is libslz upstream commit fbbb46aa54a4d330c6037f72693e0f79c7853354.

Note: in practice it doesn't happen with the compilers and build options
      we support in haproxy but the fix is trivial so better fix this to
      clean up the code base and make ubsan happy.

BUG/MINOR: slz: fix the adler32 accumulators signedness on 32-bit

slz_adler32_block() unfortunately uses a signed long as the crc accumulator
instead of an unsigned one, meaning that for CRC values where the 32nd bit
is set on 32-bit machines, the right shift will drag sign bits and corrupt
it. This only affects zlib streams on 32-bit systems (rfc1950) and has
been there for a very long time, showing that the zlib format is really
not much used in target environments.

The fix is trivial, just change the accumulators to unsigned long.
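
The sketch below is not the slz code. It assumes a shift-based reduction modulo 65521 of the kind the description alludes to, with hypothetical function names, and shows that it stays correct with unsigned accumulators even when bit 31 is set:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reduces <x> modulo 65521 using 65536 == 15 (mod 65521). The right
 * shifts only behave as a division because <x> is unsigned.
 */
uint32_t adler_reduce(uint32_t x)
{
    x = (x & 0xffff) + 15 * (x >> 16);  /* at most 16 * 65535 */
    x = (x & 0xffff) + 15 * (x >> 16);  /* at most 65535 + 225 */
    if (x >= 65521)
        x -= 65521;
    return x;
}

/* Plain Adler-32 over <len> bytes, reducing after every byte. */
uint32_t adler32_sketch(const unsigned char *buf, size_t len)
{
    uint32_t a = 1, b = 0;
    size_t i;

    assert(buf != NULL || len == 0);
    for (i = 0; i < len; i++) {
        a = adler_reduce(a + buf[i]);
        b = adler_reduce(b + a);
    }
    return (b << 16) | a;
}
```

With a signed accumulator, the same shift on a value that has its top bit set would be an arithmetic shift on most compilers and would feed sign bits into the sum.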

This must be backported to 1.2. It was in slz.c prior to 1.3.0.

This is libslz upstream commit 912a707525fd2d6a63c9884ef02f69e7379c304c.

Note: the impact in haproxy is the "zlib" compression algorithm often
      causing CRC errors on clients when haproxy runs on a 32-bit
      system. Since "zlib" has long been avoided due to incompatibilities
      with certain clients in the past, the impact should be almost
      inexistent (this issue was never reported).

BUG/MINOR: slz: do not append a block to an already finished stream

slz_rfc1951_flush() terminates the pending block then emits an empty
stored block to byte-align the output. But encode() called with <more>
cleared may leave the stream in SLZ_ST_LAST, that is, inside a block whose
BFINAL bit has already been sent. Flushing there terminated that block,
which completes the deflate stream, and then emitted one more block with
BFINAL set again. Those 5 bytes sit past the end of the stream, so the
gzip or zlib trailer that finish() appends right after is no longer where
the peer expects it: it reads the empty block as the checksum.

The data itself decodes correctly, only the check fails, which makes it
particularly difficult to diagnose. Raw deflate is not affected as it has
no trailer to shift, and a stream that ends in EOB state is not affected
either since its queue is empty and flush() returns early. Fuzzing random
sequences of encode/flush/finish showed ~2% corrupt streams for gzip and
zlib.

When the terminated block was the final one there is nothing left to align
to, and nothing may be appended, so let's just flush the pending bits and
return. This also makes the flush 4 bytes shorter in that case, and the
BFINAL bit of the empty block is now always zero, which is what the state
guarantees at that point.

This must be backported to 1.2.

This is libslz upstream commit 12e726390c96a21bd910d02a2fc594bcd743c131.

Note: no practical impact in haproxy since we never send streams
      back-to-back, but some deflate stream decoders might occasionally
      spot an error.

BUG/MEDIUM: slz: bound the bits wasted by the 9-bit literals

Commit 002e838 ("bug: always make sure to limit fixed output to less than
worst case literals") made sure that switching from EOB to FIXED to emit
a reference is only done when the reference pays for the way back, based
on the fact that once in FIXED state a reference is always smaller than
the bytes it replaces.

This was unfortunately not enough: the literals interleaved with those
references can still fail. Indeed, octets 144 to 255 cost 9 bits instead
of the 8 they would cost in a stored block. The <bit9> counter measures
exactly that, but it was reset after every reference, so a stream
alternating just under 52 such literals with a cheap reference never
reached the threshold that sends them as a stored block, and kept
inflating for as long as the pattern lasted. 51 random literals >= 144
followed by 4 bytes copied from 8 bytes earlier (i.e. almost the exact
same pattern as previously tested) produce 1073161 bytes out of 1048576
(+2.34%) where the API promises at most 1048663, i.e. 24 kB more than a
caller sizing its output buffer from that promise would expect for a
single call.

Bit9 isn't sufficient to track the debt across references, so let's add a
second <debt> counter to the stream's unused space. It accumulates the
bits actually wasted by the literals emitted in huffman mode, and each
reference records what it saved over the same bytes sent as literals,
bounded to zero. Above SLZ_MAX_DEBT (200 bits, i.e. 25 bytes) the encoder
stops trusting bit9: pending literals are stored and references have to
compensate for the full round trip, which lets the literal runs merge into
a full 65535-byte block and stop the growth.

The crafted stream now produces 1048677 bytes (+14 bytes over the
documented maximum instead of +24509). Other inputs such as text or
silesia corpus do not show any change since it's quite hard to fall
into this case.

Note that the threshold is deliberately much larger than the 52 bits of
a switch to amortize oscillations without needlessly sending literals.

This needs to be backported to 1.2.

This is libslz upstream commit 039fdf8aac3acfdcaa27ae30b387c942bc58ef84.

Note: the impact in haproxy will start with tune.bufsize above 43kB
      for the default 1kB reserve. A simple workaround consists in
      always keeping the reserve (tune.maxrewrite) at least 1/32 of
      bufsize.

BUG/MINOR: slz: use the exact switch cost for the last literals of a block

The decision to send the pending literals as a stored block rather than in
fixed huffman mode is taken when the 9-bit literals wasted more than the
52 bits it costs to leave the fixed huffman encoding and to come back to
it. But for the last literals of a block, nothing comes after the stored
block, so there is no need to pay for the block type of a next block nor
for the EOB, while the huffman variant still has to send an EOB. The
switch is thus 10 bits cheaper, and 10 more when the stream is still in
EOB state, since then the block type is needed in both cases and no EOB
has to be terminated.

Using 52 there made the encoder prefer huffman for data that was cheaper
to store, and the output could exceed the documented maximum. The smallest
case found by fuzzing is a 47-byte input entirely made of bytes >= 144
which produced 55 bytes (3 bits of block type + 47*9 bits + 7 bits of EOB)
where the stored block only needs 52, for a documented maximum of 54.
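
The arithmetic of that example can be checked with the two hypothetical helpers below. They assume a block that starts on a byte boundary in EOB state and contains only 9-bit literals, and they use the 5-byte stored block header mentioned further down:

```c
#include <assert.h>
#include <stddef.h>

/* Bytes used by <n> 9-bit literals sent as one fixed huffman block:
 * 3 bits of block type, 9 bits per literal and 7 bits of EOB, rounded
 * up to the next byte.
 */
size_t fixed_block_bytes(size_t n)
{
    return (3 + 9 * n + 7 + 7) / 8;
}

/* Bytes used by the same <n> bytes sent as one stored block: a 5-byte
 * header (block type padded to a byte, then LEN and NLEN) and the data.
 */
size_t stored_block_bytes(size_t n)
{
    assert(n <= 65535);
    return 5 + n;
}
```

For n = 47 this gives 55 bytes in fixed huffman mode against 52 bytes stored, which are the figures quoted above.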

With these correct costs, we no longer see outputs exceed the documented
maximum, whether it's with small inputs (tested with ~3 million random
small inputs as small as 47 bytes), or usual files found in tests/ and
bash, gcc, libc, and silesia. No performance change was observed either.

Note that a stream can still exceed the documented maximum by a few bytes
(17 bytes were observed on a 390000-byte crafted input) because each
reference emitted between two stored blocks forces them out and adds a
5-byte block header that the accounting attributes to the reference. This
is for a future fix.

This should be backported to 1.2.

This is libslz upstream commit 97757536178f24aeb2cb41278706a88c1242f414.

Note: the impact in haproxy is practically inexistent since we have a 1kB
      reserve and build by default with the tag at the end of pools, even
      if two extra bytes were to be emitted, it would have no effect.
      Better backport it to avoid triggering ASAN though.

CLEANUP: slz: fix the documented worst case size of flush() and finish()

The documented output buffer requirements of the flush() and finish()
functions date back to the 32-bit queue, where at most 7 bits could be
pending. Since the 64-bit queue was introduced (used on x86_64 and armv8),
up to 31 bits may be pending, and the accounting also forgot the EOB that
may have to be emitted before the empty block. As a result a caller
strictly sizing its output buffer from the documentation could be short
by one to two bytes and see the encoder write past the end of its buffer.

Let's update the documented worst cases for these functions depending on
what they still have to emit: 31 pending bits + 7 for EOB + 3 for
BFINAL/BTYPE + 7 for EOB or 32 for LEN+NLEN, rounded up to the next
byte (easily forgotten):

  function           claimed   real
  rfc1951_flush()        9      10
  rfc1951_finish()       4       6
  rfc1950_flush()       11      12
  rfc1950_finish()       8      10
  rfc1952_flush()       19      20
  rfc1952_finish()      12      14

Note that all values are at least as large as the previously claimed ones
and that 32-bit systems never consume more than what was claimed, so the
new documented values are valid both for 32 and 64 bits.
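
As a worked check of the two raw deflate rows, under the assumption that flush() is the one ending with the 32 bits of LEN+NLEN and finish() the one ending with a 7-bit EOB (the helper names are made up for the example):

```c
#include <assert.h>
#include <stddef.h>

/* Rounds a number of bits up to a whole number of bytes. */
size_t bits_to_bytes(size_t bits)
{
    return (bits + 7) / 8;
}

/* 31 pending bits + EOB + BFINAL/BTYPE + LEN/NLEN of the empty block */
size_t rfc1951_flush_worst(void)
{
    size_t bits = 31 + 7 + 3 + 32;

    assert(bits == 73);
    return bits_to_bytes(bits);
}

/* 31 pending bits + EOB + BFINAL/BTYPE + EOB of the last block */
size_t rfc1951_finish_worst(void)
{
    size_t bits = 31 + 7 + 3 + 7;

    assert(bits == 48);
    return bits_to_bytes(bits);
}
```

73 bits round up to 10 bytes and 48 bits to 6 bytes, matching the "real" column for rfc1951_flush() and rfc1951_finish().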

Even though this patch only touches comments, it's marked as a bug so
that it is backported where it matters and users have a chance to spot
the new values.

This should be backported to 1.2.

This is libslz upstream commit 1d774851bc0fe2788ef75d9d74057ecd0c54f868.

Note: this only updates comments, no code was changed. It may be
      backported if it helps with context for other backports.

BUG/MINOR: slz: do not read past the end of the input around the match loop

slz_rfc1951_encode() pre-loads the first 3 bytes of the input into <word>
before entering the main loop on architectures which do not define
UNALIGNED_FASTER (e.g. i386, or big endian ones). This was done
unconditionally, thus inputs shorter than 3 bytes (including empty ones)
caused up to 3 bytes to be read past the end of the input buffer, which
may segfault if the buffer ends on the last page of a mapping. This is
easily reproduced on i386 by placing a zero-length input right before an
unmapped page.

The pre-loaded word is only ever used if the main loop is entered, which
requires at least 4 remaining bytes, so let's simply condition the load
on this.
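
A minimal sketch of such a conditional pre-load, assuming a hypothetical preload_word() helper rather than the inline code of the encoder:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pre-loads the first 3 bytes of <in> into *word only when the main
 * loop can be entered, i.e. when at least 4 bytes remain. Returns 1
 * when the word was loaded, or 0 without reading the input at all.
 */
int preload_word(const unsigned char *in, size_t len, uint32_t *word)
{
    assert(word != NULL);
    if (len < 4)
        return 0;

    *word = ((uint32_t)in[0] << 8) +
            ((uint32_t)in[1] << 16) +
            ((uint32_t)in[2] << 24);
    return 1;
}
```

An empty input is never dereferenced, so it may safely sit right before an unmapped page.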

The exact same case exists at the end of the loop where we can go beyond
end-3 and try to read 3 or 4 bytes before getting back to the beginning
of the loop, so we're using the same condition here, which helps the
compiler perform the test only once and use unconditional branches from
there.

The code is unchanged on x86_64 and armv8 (out of ifdef) and no
measurable change is observed on other archs.

This should be backported to 1.2. The patch is easier to read with
git show -b.

This is libslz upstream commit 4ff4b66c804629089c0bb16f141a6320f92eba10.

Note: the impact in haproxy is practically inexistent since we build by
      default with the tag at the end of pools, even if an extra byte
      were to be accessed on slower architectures, it would have no
      effect. Better backport it to avoid triggering ASAN though.

BUG/MINOR: http-rules: fix release of a failed "set-cookie-fmt" redirect rule

<redirect_rule.cookie> is a union holding either an ist (set-cookie,
clear-cookie) or a log-format expression (set-cookie-fmt), and
http_free_redirect_rule() picks the member to release from the
REDIRECT_FLAG_COOKIE_FMT flag:

    if ((rdr->flags & REDIRECT_FLAG_COOKIE_FMT))
            lf_expr_deinit(&rdr->cookie.fmt);
    else
            istfree(&rdr->cookie.str);

But http_parse_redirect_rule() accumulates the flags in a local variable and
only assigns them with "rule->flags = flags" at the very end, once
everything succeeded. So when the log-format string of a "set-cookie-fmt"
fails to parse, the "goto err" runs http_free_redirect_rule() on a rule
whose ->flags is still 0, and istfree() is called on the <str> member while
<fmt> is the live one, leading the process to crash.

Let's init the rule members (code, type and flags) as soon as possible, that
is, right after the rule allocation. The REDIRECT_FLAG_COOKIE_FMT flag is now
set directly on the rule's flags, before the log-format string is parsed.

This only happens while parsing an invalid configuration, hence no security
impact, but a configuration checker must report errors rather than abort.

This should be backported to all versions having "set-cookie-fmt", so 2.9 and
above.

BUG/MINOR: htx: Transfer HTX_FL_EOM flag on success in htx_append_msg()

The htx_append_msg() function copies all blocks from a source message to a
destination one. But it never takes care of also transferring the HTX_FL_EOM
flag, when needed, on success. This matters because this function is used to
copy error messages during HTTP analysis.

It seems to be harmless because when an error is triggered the stream is
also closed and most of the time a raw copy is performed instead of a
block-per-block copy. But this could lead to the connection being closed
prematurely at the end of the response.

This patch should be backported to all supported versions.

BUG/MINOR: htx: Perform raw copy for messages of same size in htx_copy_msg()

htx_copy_msg() takes a shortcut when the destination message is empty and copies
the whole source area, HTX header included, in one memcpy():

    if (htx_is_empty(htx) && htx_free_space(htx)) {
            memcpy(htx, msg->area, msg->size);
            return 1;
    }

but it never verifies that the destination buffer is at least as large as
<msg->size>. The only caller, http_reply_to_htx(), copies an error message
buffer, which http_str_to_htx() always allocates with a size of
global.tune.bufsize, into the response channel buffer. Two things follow:

  - if the destination were smaller, this would be a plain heap overflow. It is
    not reachable today because the response channel buffer is never a small
    one: only the request channel may be moved to a small buffer, by the
    PR_O2_USE_SBUF_QUEUE code in stream.c, and the L7-retry buffer is not a
    channel. But nothing states nor checks that invariant, and small buffers
    are recent, so this is a landmine.

  - if the destination is larger, which does happen since "http-response
    wait-for-body <time> use-large-buffer" moves the response channel to a large
    buffer, the copy also installs the source's ->size, so the destination HTX
    message ends up believing it is only bufsize-sized. That is harmless (it
    only under-uses the buffer and heals on the next reset) but wrong.

Let's restrict the raw copy to the case where both underlying buffers have
exactly the same size, and fall back to the existing block-per-block append
otherwise.

This should be backported to all supported versions.

CLEANUP: http-conf: rename local trash variable

In sample_conv_url_enc(), the local trash variable is renamed to chk to
avoid any mix-up with the global variable.

CLEANUP: http-conv: Remove useless enc_type init to ENC_QUERY

This commit drops a dead "enc_type = ENC_QUERY;" store that was
immediately overwritten.

BUG/MEDIUM: tools: make string encoding possible to fail instead of truncating

The encode_string() and encode_chunk() functions were silently truncating the
output string when the output buffer was too small. encode_string() is not
used but encode_chunk() is used to encode URLs (url_enc converter and ocsp).
In this context a partially encoded URL is not expected.

So let's slightly change the API and add an extra argument to these functions
so that they can fail when the output buffer is too small. <truncate> must be
set to 0 to return an error (NULL) instead of truncating the output string.

The url_enc converter and ocsp were also updated to trigger an error in that
case.

The issue is pretty minor but it remains an API change, so it is tagged as
MEDIUM. It could be backported to 3.4 and probably as far as 3.2 or 3.0.

BUG/MINOR: http-htx: check the strdup() of the "lf-string" http reply argument

In http_parse_http_reply(), the "lf-string" argument copies its value with
strdup() without checking the result:

    obj = strdup(args[cur_arg]);
    objlen = strlen(args[cur_arg]);
    reply->type = HTTP_REPLY_LOGFMT;

while every sibling argument does check it ("string", and "lf-file" through its
combined "!obj || read(...)" test). <obj> is later handed over to
parse_logformat_string(), which starts with "lf_expr->str = strdup(fmt)", so a
NULL would be passed to strdup() and dereferenced.

This only happens if an allocation fails while parsing the configuration, so the
impact is limited, but the check is missing where all the others are present.

This should be backported to all supported versions.

CLEANUP: http-conv: index the captures array with hdr->index in the converters

smp_conv_req_capture() and smp_conv_res_capture() look the capture slot up by
walking the proxy's capture list backwards until the decreasing counter matches
the requested id, then allocate the storage using <hdr->index> but write it
using <idx>:

    if (smp->strm->req_cap[hdr->index] == NULL)
            smp->strm->req_cap[hdr->index] = pool_alloc(hdr->pool);
    ...
    memcpy(smp->strm->req_cap[idx], smp->data.u.str.area, len);

Both are in fact always equal, because every place that appends to the list
(cfgparse-listen.c, proxy.c, tcp_rules.c and http_act.c) does "hdr->next = px->
req_cap; hdr->index = px->nb_req_cap++; px->req_cap = hdr;", so the head always
carries the highest index and the walk keeps hdr->index == i. But relying on
that invariant to index an array while the neighbouring lines use the field that
actually describes the slot is confusing, and it silently ties these two
functions to the way the list happens to be built.

Let's use hdr->index everywhere. No functional change.

BUG/MINOR: http-act: reject a negative capture id in the capture actions

"http-request capture <expr> id <idx>" and "http-response capture <expr> id
<idx>" parse their identifier with strtol() and only reject trailing garbage:

    id = strtol(args[cur_arg], &error, 10);
    if (*error != '\0') {
            memprintf(err, "cannot parse id '%s'", args[cur_arg]);

A negative identifier is therefore accepted at boot. The check functions only
verify the upper bound ("idx >= px->nb_req_cap"), and even that only for a proxy
with the frontend capability, so nothing complains. At run time,
http_action_req_capture_by_id() looks the slot up by walking the capture list
backwards:

    for (h = fe->req_cap, i = fe->nb_req_cap - 1;
         h != NULL && i != rule->arg.capid.idx ;
         i--, h = h->next);
    if (!h)
            return ACT_RET_CONT;

<i> never matches a negative <idx> so the walk ends on a NULL <h> and the action
silently does nothing. There is no out-of-bounds access here, unlike in the
"capture.req.hdr" sample fetch, but a configuration that can only ever be a
no-op must be rejected rather than silently ignored.

Let's reject negative ids where they are parsed, which also covers the proxies
that have no frontend capability and are thus not checked at all.
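
A minimal sketch of such a parse-time check, using a hypothetical parse_capture_id() helper and also rejecting empty and out-of-range values (which goes slightly beyond what is described here):

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Parses a capture identifier. Returns the id, or -1 when <arg> is
 * empty, has trailing garbage, is negative or does not fit in an int.
 */
int parse_capture_id(const char *arg)
{
    char *error;
    long id;

    assert(arg != NULL);
    errno = 0;
    id = strtol(arg, &error, 10);
    if (error == arg || *error != '\0')
        return -1;      /* nothing parsed, or trailing garbage */
    if (errno == ERANGE || id < 0 || id > INT_MAX)
        return -1;      /* negative or out of range */
    return (int)id;
}
```

Since the test is done where the argument is parsed, it applies regardless of the capabilities of the proxy the rule belongs to.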

This should be backported to all supported versions.

BUG/MINOR: http-act: work on a copy of the sample in del-headers-bin

http_action_del_headers_bin() walks the varint-encoded list of header names
directly in the sample expression result:

    p = b_orig(&hdrs_bin->data.u.str);
    end = b_tail(&hdrs_bin->data.u.str);

and uses the names it decodes there while calling http_remove_header() on the
HTX message in between. But the sample may perfectly well point into that very
message: a string sample is cast to a binary one in place, so an expression such
as "req.hdr(x-list)" hands out a pointer inside the header block it describes.
http_remove_header() memmoves the payload of the block it shortens and marks the
blocks it removes as unused, so <p>, <end> and <n> may then designate stale or
recycled bytes and the loop goes on deleting names decoded from garbage. The
header value must be a valid varint-encoded list for this to happen, which is
possible since only NUL, CR and LF are rejected in an H1 header value.

This is the exact same problem as the one fixed for the sibling actions by
commit 43932db85 ("BUG/MEDIUM: http-act: Make a copy of the sample expr in
(set/add)-headers-bin"), which this action was left out of. Let's copy the
sample into a private chunk the same way before decoding it.

This should be backported to 3.4, like the commit above.

BUG/MINOR: http-act: restore the response buffer state in the early-hint action

http_action_early_hint() starts with:

    struct htx *htx = htx_from_buf(&res->buf);

htx_from_buf() marks the underlying buffer as full (b_data = b_size) and, as
documented, it is the caller's responsibility to call htx_to_buf() to update it
back. The function never does it. On the success path this is harmless because
the HTX message is not empty, which is exactly what a full buffer represents,
and on the error path channel_htx_truncate() takes care of it. But the very
first test of the function is:

    if (!(s->txn.http->req.flags & HTTP_MSGF_VER_11))
            goto leave;

so for an HTTP/1.0 client, where no 103 response may be emitted, the response
buffer is left flagged as containing b_size bytes of data while the HTX message
is empty, until some other code path happens to call htx_to_buf() on it. Any
code looking at b_data(&res->buf) in between sees a full response buffer.

No misbehaviour could be observed (the HTX-aware helpers all work on the HTX
message, not on b_data), but leaving the buffer in a state that contradicts its
contents is an accident waiting to happen.

Let's simply call htx_to_buf() on the leave path. It is a no-op for the other
two paths since the message is either non-empty or was already truncated.

This should be backported to all supported versions.

BUG/MINOR: http-act: fix a double free of the map reference on a parsing error

parse_http_set_map() extracts the map/acl file name into <rule->arg.map.ref>,
then parses one or two log-format strings. On a parsing failure it releases the
reference before reporting the error:

    if (!parse_logformat_string(args[cur_arg], px, &rule->arg.map.key, ...)) {
            free(rule->arg.map.ref);
            return ACT_RET_PRS_ERR;
    }

but <rule->release_ptr> has already been set to release_http_map(), which starts
with "free(rule->arg.map.ref)". Since the caller calls free_act_rule() upon
ACT_RET_PRS_ERR, the same pointer is freed twice. Both error paths of the
function (the key and the value patterns) are affected, and they cover the
"add-acl", "del-acl", "set-map" and "del-map" actions.

Reproduced with:

    http-request set-map(/tmp/m.map) %[nosuchfetch] somevalue

which reports "free(): double free detected in tcache 2" and aborts, so
"haproxy -c" dies instead of reporting the configuration error.

This only happens while parsing an invalid configuration, so there is no
security impact, but a configuration checker must not crash.

Let's use ha_free() so the pointer is reset and release_http_map() becomes a
no-op for it.
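
The pattern can be sketched with a simplified rule structure. All names below are made up for the example, and free_and_reset() only plays the role that ha_free() has in the fix:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct rule {
    char *ref;                          /* map file name */
    void (*release)(struct rule *);     /* called when the rule is freed */
};

/* Frees *p and resets it so that a later free() is a no-op. */
void free_and_reset(char **p)
{
    free(*p);
    *p = NULL;
}

void release_map(struct rule *r)
{
    free(r->ref);
    r->ref = NULL;
}

/* Returns 0 on success, -1 on error. <fmt_ok> simulates the result of
 * the log-format parsing that follows the file name extraction.
 */
int parse_rule(struct rule *r, const char *file, int fmt_ok)
{
    size_t len;

    assert(r != NULL && file != NULL);
    len = strlen(file) + 1;
    r->ref = malloc(len);
    if (!r->ref)
        return -1;
    memcpy(r->ref, file, len);
    r->release = release_map;

    if (!fmt_ok) {
        free_and_reset(&r->ref);        /* the release callback now sees NULL */
        return -1;
    }
    return 0;
}
```

Calling the release callback after a failed parse then only reaches free(NULL), which is harmless.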

This should be backported to all supported versions.

BUG/MINOR: http-act: fix a double free of the regex on a rule parsing error

parse_replace_uri() and parse_http_replace_header() compile their regex into
<rule->arg.http.re>, and when the log-format argument that follows fails to
parse they release it before reporting the error:

    if (!parse_logformat_string(args[cur_arg + 1], px, &rule->arg.http.fmt, ...)) {
            regex_free(rule->arg.http.re);
            return ACT_RET_PRS_ERR;
    }

The pointer is left dangling in the rule while <rule->release_ptr> has already
been set to release_http_action(), which does exactly the same:

    if (rule->arg.http.re)
            regex_free(rule->arg.http.re);

On ACT_RET_PRS_ERR the caller (parse_http_req_cond() & friends) calls
free_act_rule(), which invokes release_ptr, so regex_free() runs twice on the
same object. It ends up calling regfree()/pcre*_free() on freed memory and
free() on an already freed pointer.

It is easily reproduced with:

    http-request replace-uri ^/foo /bar%[nosuchfetch]
    http-request replace-header X-Foo ^a b%[nosuchfetch]

Both abort under MALLOC_CHECK_=3, and the second one even segfaults with the
libc regex backend, so "haproxy -c" dies instead of reporting the configuration
error (and the remaining errors of the file are never reported).

This only happens on an invalid configuration during parsing, so it has no
security impact, but a configuration checker must not crash.

Let's reset the pointer after releasing it, as done for <arg.http_reply> in
release_act_http_reply().

This should be backported to all supported versions.

CLEANUP: flt-comp: remove a no-op http_remove_header() call

In select_compression_request_header(), the "compression offload" block starts
with:

    http_remove_header(htx, &ctx);
    ctx.blk = NULL;
    while (http_find_header(htx, ist("Accept-Encoding"), &ctx, 1))
            http_remove_header(htx, &ctx);

The first call can never do anything: st->comp_algo is only set from the
"Accept-Encoding" loop above, whose exit condition is http_find_header()
returning 0, which resets ctx.blk to NULL, and http_remove_header() returns
immediately for a NULL ctx.blk. The loop that follows removes all the
occurrences of the header anyway, so the call is dead code, and it would remove
the wrong header if ctx were ever to carry another context.

Let's drop it. No functional change.

CLEANUP: htx: remove the unreachable "append_data" label in htx_reserve_max_data()

htx_reserve_max_data() carries an "append_data:" label which is never the target
of a goto: the function simply falls through it. It looks like a leftover from
htx_add_data(), which does use such a label. Let's drop it, the code is reached
by fallthrough anyway and an unused label is only confusing.

BUG/MINOR: http-ana: fix a one-byte over-read in the client-side cookie parser

In http_manage_client_side_cookies(), each iteration of the cookie loop skips
the blanks in front of the attribute name, then tests it against '$':

    while (att_beg < hdr_end && HTTP_IS_SPHT(*att_beg))
            att_beg++;
    ...
    if (*att_beg == '$')
            continue;

<att_beg> may legitimately have reached <hdr_end>, in which case the test reads
one byte past the end of the Cookie header value. This happens for any value
ending with a delimiter, optionally followed by blanks, e.g. "Cookie: a=b; ":
the last iteration starts on the ';', skips it and the trailing space, and lands
exactly on <hdr_end>.

The extra byte always belongs to the HTX buffer (the payload area is followed by
other blocks, the free space or the block table), so there is no out-of-bounds
access at the allocation level, and the only functional consequence is that a
neighbour byte holding a '$' makes the parser skip the end of the header as if
it were an attribute instead of marking the header as to be preserved. But the
value must not be read past its end, and the check is free since <hdr_end> is
already at hand.

Note that http_manage_server_side_cookies() does not have this issue, it uses
"equal == val_end" to detect the empty trailing element.

This has been there since the HTX rewrite of the analysers by commit f4eb75d17
("MINOR: htx: Add proto_htx.c file") in 1.9-dev7, and the pre-HTX code had the
same construct, so it should be backported to all supported versions.
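
As an illustration only, the bounded test can be sketched as below. The
helper name attr_starts_with_dollar() and the is_spht() stand-in for
HTTP_IS_SPHT() are made up for this example and are not part of the
patch; the point is merely that <hdr_end> is compared before the byte is
read.

```c
#include <stddef.h>

/* Blank test standing in for HTTP_IS_SPHT(): space or horizontal tab. */
static int is_spht(char c)
{
    return c == ' ' || c == '\t';
}

/* Hypothetical helper: skips the leading blanks from <att_beg> and reports
 * whether the attribute starts with '$'. It returns 0 when the end of the
 * value is reached, so that <hdr_end> itself is never dereferenced.
 */
static int attr_starts_with_dollar(const char *att_beg, const char *hdr_end)
{
    while (att_beg < hdr_end && is_spht(*att_beg))
        att_beg++;

    /* the bound is tested first: *att_beg is only read inside the value */
    return att_beg < hdr_end && *att_beg == '$';
}
```

With a value ending in "; " and a neighbour byte holding a '$', the helper
reports no attribute instead of looking at the neighbour.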

BUG/MINOR: h3: don't use a block pointer to roll back a partial HTX conversion

h3_resp_headers_to_htx() and h3_trailers_to_htx() save a pointer on the tail
HTX block of the destination message and use it to remove whatever they added
when the conversion fails:

        tailblk = htx_get_tail_blk(htx);
        ...
    out:
        if (appbuf) {
            if ((ssize_t)len < 0)
                htx_truncate_blk(htx, tailblk);

An HTX block pointer encodes a block *position* (the table is indexed backwards
from the end of the storage area), so it stops designating the same block as
soon as the table is compacted. htx_add_trailer() may reach
htx_reserve_nxblk(), which calls htx_defrag_blks() when the block table has
grown down to the payload while htx->head > 0, i.e. for a nearly full buffer
whose head was already consumed. All blocks then move down by the old value of
htx->head while the saved pointer does not follow, and htx_truncate_blk()
truncates at the wrong place, leaving partially converted trailers in the
message or removing valid blocks.

Only the trailers are really concerned: h3_resp_headers_to_htx() refuses to
work on a non-empty message, so no defragmentation can happen there, but it is
fixed the same way for consistency.

Let's save the amount of data present before the conversion and use
htx_truncate(), which works on a byte offset and is thus insensitive to any
block move, as htx_append_msg() already does.

This should be backported to all versions where the H3 trailers are supported,
so 2.8 and above.
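
The difference between a saved position and a saved byte offset can be
shown on a toy model. This is a deliberately simplified sketch: struct
toy_msg and the toy_*() functions are invented for the example and do not
reflect the real HTX layout or API. It only shows that a position saved
before a compaction no longer designates the same block, while the amount
of data present before the conversion still does.

```c
#include <stddef.h>
#include <string.h>

#define MAX_BLKS 8

/* Toy message: blocks live at positions head..tail of <len>, and <data> is
 * the total number of payload bytes. This is NOT the real HTX layout.
 */
struct toy_msg {
    size_t len[MAX_BLKS];
    int head, tail;
    size_t data;
};

/* Compacts the table so that the first block sits at position 0. */
static void toy_defrag(struct toy_msg *m)
{
    int n = m->tail - m->head + 1;

    memmove(&m->len[0], &m->len[m->head], n * sizeof(m->len[0]));
    m->head = 0;
    m->tail = n - 1;
}

/* Appends a block of <len> bytes, compacting first when the table is full
 * on the tail side while some room was released on the head side.
 */
static int toy_add(struct toy_msg *m, size_t len)
{
    if (m->tail + 1 == MAX_BLKS) {
        if (m->head == 0)
            return 0;
        toy_defrag(m);
    }
    m->len[++m->tail] = len;
    m->data += len;
    return 1;
}

/* Rolls back to <offset> bytes of data by dropping the trailing blocks.
 * <offset> is assumed to fall on a block boundary in this toy model.
 */
static void toy_truncate(struct toy_msg *m, size_t offset)
{
    while (m->data > offset) {
        m->data -= m->len[m->tail];
        m->tail--;
    }
}
```

Starting with head = 5 and tail = 6, the second toy_add() compacts the
table: the saved position 6 then lies beyond the new tail, whereas
toy_truncate() with the saved data size removes exactly the two blocks
that were added.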

BUG/MINOR: h2: don't use a block pointer to roll back a partial HTX conversion

h2_make_htx_request(), h2_make_htx_response() and h2_make_htx_trailers() save a
pointer on the tail HTX block on entry and use it on their error path to remove
everything they added:

        struct htx_blk *tailblk = htx_get_tail_blk(htx);
        ...
    fail:
        htx_truncate_blk(htx, tailblk);

An HTX block pointer is not a stable reference: it is computed from a block
*position* ("htx->blocks + htx->size - (pos + 1) * sizeof(struct htx_blk)"), so
it only designates the same block as long as the block table is not compacted.
htx_add_header()/htx_add_trailer() end up in htx_reserve_nxblk(), which calls
htx_defrag_blks() when the table has grown down to the payload while
htx->head > 0. After such a compaction all the blocks move down by the old
value of htx->head, but <tailblk> still points at the same address, hence at a
block located <head> positions further in the message. htx_truncate_blk() would
then truncate at the wrong place, either leaving partially converted headers or
trailers in the message, or dropping valid blocks that were there before.

This only concerns the trailers, and possibly the response on the backend side,
because the destination message may already hold payload with a partially
consumed head there, while a request is always converted into an empty buffer
which cannot defragment.

Let's save the amount of data present on entry and use htx_truncate() instead,
which relies on a byte offset and is therefore immune to any block move. This
is the same pattern as the one already used by htx_append_msg().

This has been there since the H2 to HTX conversion was introduced, so it should
be backported to all supported versions.

BUG/MINOR: h1: report the right error position on authority/host mismatch

When an absolute-form request target does not match the Host header value,
h1_headers_to_hdr_list() reports the error at two places depending on whether
the message must be blocked or only captured:

    if (h1m->err_pos < -1) {
        state = H1_MSG_LAST_LF;
        ptr = host.ptr; /* Set ptr on the error */
        goto http_msg_invalid;
    }
    if (h1m->err_pos == -1) /* capture the error pointer */
        h1m->err_pos = v.ptr - start + skip; /* >= 0 now */

The strict path correctly points at the Host header value while the tolerant
path uses <v>, which at this point still holds the value of the *last* parsed
header field, whatever it was. So with "option
accept-unsafe-violations-in-http-request" enabled, the offset stored in
h1m->err_pos, later used by h1_capture_bad_message() and reported by "show
errors", designates an unrelated part of the message.

Let's use host.ptr in both paths.

This was introduced by commit 25bcdb1d9 ("BUG/MAJOR: h1: Be stricter on request
target validation during message parsing") in 3.0-dev12, so it should be
backported to 3.0 and above.

BUG/MINOR: http-htx: check the trash allocation in http_scheme_based_normalize()

http_scheme_based_normalize() uses a trash chunk to rebuild the target URI when
the default port must be dropped or when an empty path must be replaced by "/",
but it does not test the allocation:

    struct buffer *temp = alloc_trash_chunk();
    struct ist meth, vsn;

    /* meth */
    chunk_memcat(temp, HTX_SL_REQ_MPTR(sl), HTX_SL_REQ_MLEN(sl));

alloc_trash_chunk() takes its memory from pool_head_trash and returns NULL when
the pool is exhausted, in which case chunk_memcat() dereferences NULL and the
worker crashes. All the other users of alloc_trash_chunk() in the HTTP code
(http_act.c, http_ana.c) do test the result.

Normalization is performed on every absolute-form request URI (from H1, H2 and
H3), so this is reachable under memory pressure. Let's simply report a failure,
which the callers already handle as a rewrite error.

This was introduced by commit 4c0882b1b ("MEDIUM: http: implement scheme-based
normalization") in 2.5-dev2, so it should be backported to all supported
versions.

BUG/MINOR: http: fix an out-of-bounds read in http_get_host_port() on empty host

http_get_host_port() walks backwards from the end of the host looking for the
first non-digit, then checks whether it is a colon:

    start = istptr(host);
    end = istend(host);
    for (ptr = end; ptr > start && isdigit((unsigned char)*--ptr););

    /* no port found */
    if (likely(*ptr != ':'))
        return IST_NULL;

When <host> is empty, the loop condition fails immediately, the pre-decrement
is never evaluated and <ptr> is left equal to <end>, so *ptr reads one byte
past the end of the string. With an IST_NULL argument this is a NULL
dereference.

Both cases are reachable with an empty host: h1_validate_mismatch_authority()
calls it on the Host header value, which may be empty ("Host:\r\n") while an
absolute-form request URI is used, and http_scheme_based_normalize() calls it
on the authority extracted from the URI, which is empty for a request like
"GET http:///x HTTP/1.1". In practice the extra byte always lies inside the
request buffer, so the observable effect is limited to possibly mistaking a
neighbour byte for a colon and returning a bogus port, but it remains an
out-of-bounds read and the helper must be usable with an unset ist.

Let's return IST_NULL right away for an empty host.

This was introduced by commit 658f97162 ("MINOR: http: Add function to get port
part of a host") in 2.7-dev2, so it should be backported to 2.8 and above.
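
For reference, here is a standalone sketch of the same backwards walk with
the early return in place. It does not use the ist API: struct span and
host_port() are stand-ins invented for this example, and the exact form of
the real fix may differ.

```c
#include <ctype.h>
#include <stddef.h>

/* Minimal stand-in for an indirect string (pointer + length). */
struct span {
    const char *ptr;
    size_t len;
};

/* Returns the port part of <host> (the digits following the last ':'), or
 * a NULL span when there is none. An empty or unset host is rejected
 * first, so that the backwards walk never dereferences the end pointer.
 */
static struct span host_port(struct span host)
{
    const struct span none = { NULL, 0 };
    const char *start, *end, *ptr;

    if (!host.ptr || !host.len)
        return none;

    start = host.ptr;
    end = host.ptr + host.len;
    for (ptr = end; ptr > start && isdigit((unsigned char)*--ptr);)
        ;

    /* no port found */
    if (*ptr != ':')
        return none;

    return (struct span){ ptr + 1, (size_t)(end - (ptr + 1)) };
}
```

Since the host is known to hold at least one byte once the first test
passed, the pre-decrement is always evaluated at least once and <ptr>
always designates a byte of the string when it is finally read.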

BUG/MINOR: http-fetch: fix a NULL channel dereference in smp_fetch_body()

smp_fetch_body() is also used from the health-check context, where the HTX
message comes from <check->bi> and where there is no channel at all: for
"res.body", SMP_RES_CHN(smp) evaluates to NULL because smp->strm is NULL. The
function is aware of this and guards the channel a few lines below:

    if (!finished && (check || (chn && !channel_full(chn, global.tune.maxrewrite) &&

but the trash chunk retrieval isn't:

    chk = get_best_trash_chunk(&chn->buf, htx->data);

so as soon as a check response is made of more than one HTX DATA block, a
bogus address close to NULL is dereferenced and the worker crashes. Let's use
the check input buffer when there is no channel, as done elsewhere.

Note that no reproducer could be built: with the h1 mux the successive DATA
blocks of a check response always end up appended to the same HTX block, so
the multi-block case was not reachable in practice. The dereference is
nevertheless plainly invalid.

This is a regression introduced by commit ac37158a6 ("BUG/MEDIUM: chunk:
Review chunks usage to not retrieve a large buffer by error") in 3.5-dev2,
which replaced get_trash_chunk_sz(htx->data), that did not look at the
channel, with get_best_trash_chunk(&chn->buf, htx->data). smp_fetch_body_param()
received the same change but is only reachable through "req.body_param", for
which <check> is always NULL and thus <chn> always set, so it is not affected.

It must be backported wherever the commit above was backported, so 3.4 and
above.
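
The buffer selection can be pictured with the following sketch. The toy_*
structures and pick_buf() are hypothetical and only mimic the relevant
members (<buf> for a channel, <bi> for a check); this is not the code of
the fix.

```c
#include <stddef.h>

/* Toy stand-ins for the buffer, channel and check structures. */
struct toy_buffer { size_t size; };
struct toy_channel { struct toy_buffer buf; };
struct toy_check { struct toy_buffer bi; };

/* Picks the buffer to size the trash chunk from: the channel's buffer when
 * there is a channel, the check's input buffer otherwise. Returns NULL
 * when neither is available so that the caller never dereferences <chn>
 * blindly.
 */
static const struct toy_buffer *pick_buf(const struct toy_channel *chn,
                                         const struct toy_check *check)
{
    if (chn)
        return &chn->buf;
    if (check)
        return &check->bi;
    return NULL;
}
```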

BUG/MEDIUM: http-fetch: reject a negative capture id in capture.{req,res}.hdr

The "capture.req.hdr" and "capture.res.hdr" sample fetches use their integer
argument directly as an index in the stream's captures array:

    idx = args->data.sint;

    if (idx > (fe->nb_req_cap - 1) || smp->strm->req_cap == NULL ||
        smp->strm->req_cap[idx] == NULL)
        return 0;

Only the upper bound is verified, and unlike "req.hdr" and friends, which rely
on val_hdr() to enforce the lower bound of the occurrence number, these two
keywords are declared with no argument checker at all. A negative identifier is
therefore accepted at boot, and at runtime req_cap[-1] is read; if the pointer
found there is not NULL it is then passed to strlen() and returned as a string.

This is trivially reproduced with a frontend containing:

    capture request header Host len 32
    http-request return status 200 hdr X-Cap "%[capture.req.hdr(-1)]"

which segfaults the worker on the very first request.

Since a negative capture identifier is meaningless, the cleanest fix is to
reject it at configuration parsing time, as is done for the header occurrence.
Let's add a val_cap_id() checker and reference it from both keywords.

Note that the "capture-req"/"capture-res" converters in http_conv.c index the
same array with an unchecked value too, but they are saved by the list walk
that precedes the access and which stops on a NULL <hdr>, so they only fail to
capture. They are left untouched.

This bug has been there since the keywords were introduced, so this should be
backported to all supported versions.
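
The fix itself rejects the value at configuration parsing time. Purely as
an illustration of what checking both bounds looks like, here is a
standalone sketch of an equivalent runtime guard; cap_get() and its
arguments are invented for the example and do not exist in the code base.

```c
#include <stddef.h>

/* Hypothetical guard: returns the string captured at index <idx> of the
 * <cap> array holding <nb_cap> entries, or NULL when the index is out of
 * range on either side or when nothing was captured.
 */
static const char *cap_get(char **cap, int nb_cap, long long idx)
{
    if (idx < 0 || idx >= nb_cap || cap == NULL)
        return NULL;
    return cap[idx];
}
```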

BUG/MEDIUM: http-fetch: don't parse a non-HTTP check buffer as an HTX message

In the health-check context, smp_prefetch_htx() unconditionally treats
<check->bi> as an HTX message:

    if (!s || !chn) {
        if (check) {
            htx = htxbuf(&check->bi);

This is only true for an HTTP check, where the h1 mux fills the buffer with
HTX blocks. For any other check ruleset (plain "tcp-check", or the default TCP
connect check), the buffer holds the raw bytes received from the server.

All the response-side HTTP sample fetches (res.body, res.hdr, res.hdrs,
res.ver, status, res.cook...) are declared with SMP_SRC_HRSHV/HRSHP/HRSBO and
the validity table in sample.c makes them usable at the SMP_CKP_BE_CHK_RUL
checkpoint, so referencing one of them from a "tcp-check" rule (typically in an
"on-error" log-format string, or in a "status-code" expression) is accepted at
boot without any warning. When the check then runs, the first bytes of the
server's answer are used as "struct htx" fields: ->size, ->head, ->tail and
->first are entirely provided by the peer. htx_get_first_blk() computes
"htx->blocks + htx->size - (first + 1) * sizeof(struct htx_blk)" and the
resulting block is dereferenced, which is a wild read that crashes the worker,
and which may otherwise return arbitrary process memory as the fetched value.

This can be reproduced with a backend using:

    option tcp-check
    tcp-check connect
    tcp-check expect string ZZZZ on-error "body=%[res.body]"

and a server answering with 24 bytes crafted as a "struct htx" header with a
large ->size (e.g. 0x40000000), ->head = 0, ->tail = 1 and ->first = 0,
followed by any padding: the worker segfaults at the first check.

Let's simply refuse to look at the buffer when the check is not relying on
HTX. HTTP checks are unaffected, the fetches now simply return no sample on
other checks.

Note that this requires a hostile server, which is a trusted component for a
reverse proxy, so this is not a security issue, but the parser must not be fed
a buffer whose format it cannot assume.

This should be backported to all supported versions.

BUG/MINOR: http-htx: fix the length moved when removing a header value

http_remove_header() removes <len> bytes at the offset <off> of the header
value, <off> being "start - v.ptr" once <start> has been adjusted to also eat
the comma surrounding the removed value. The number of bytes that remain to be
moved down is therefore "v.len - off - len", but the function passes
"v.len - len" to memmove(), i.e. <off> bytes too many.

As a result, as soon as the removed value is not the first one of the
comma-delimited list (<off> != 0), memmove() reads <off> bytes past the end of
the header value and, when <off> is larger than <len>, writes (<off> - <len>)
bytes past the new end of the HTX block payload, silently corrupting whatever
follows it in the HTX buffer (typically the payload of the next blocks). For
instance with "x-test: aaaaaaaaaa,b,c", removing the second value gives
off = 10 and len = 2, hence 10 bytes read and 8 bytes written past the 20-byte
block payload. The resulting header value itself is not affected, which is
probably why this went unnoticed.

All the current callers happen to be safe: they either use <full> = 1 when
looking the header up, in which case the whole value matches and the block is
removed as a whole by the early return, or they only ever remove the first
value of the list ("Expect: 100-continue" in http_ana.c and "Age" in cache.c).
So this is only a latent out-of-bounds write for now, but the function is a
generic helper and the next caller removing a non-first value would corrupt
the message.

Let's simply compute the moved length from the end of the value. This can be
verified with the "http_htx" unit test added in the previous commit
(DEBUG_UNIT=1, then "haproxy -U http_htx").

This bug was introduced with the HTX conversion by commit 47596d378 ("MINOR:
http_htx: Add functions to manipulate HTX messages in http_htx.c") in 1.9-dev7
so it should be backported to all supported versions.
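
The arithmetic can be checked on a plain character array. The sketch
below is not the patched function: value_remove() is a made-up helper
which only reproduces the memmove() with the corrected length, applied to
the "aaaaaaaaaa,b,c" example above.

```c
#include <stddef.h>
#include <string.h>

/* Removes <len> bytes at offset <off> from the <vlen>-byte value at <v>
 * and returns the new length. Only the bytes located after the removed
 * span are moved down, i.e. vlen - off - len of them.
 */
static size_t value_remove(char *v, size_t vlen, size_t off, size_t len)
{
    memmove(v + off, v + off + len, vlen - off - len);
    return vlen - len;
}
```

With vlen = 14, off = 10 and len = 2, exactly 2 bytes (",c") are moved,
against 12 with the former "v.len - len" computation.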

CLEANUP: haload: use <arg_thrd> instead of <global.nbthread> where applicable

Replace references to <global.nbthread> with <arg_thrd> across loops,
memory allocations, and thread ID distributions. Having two variables
sharing the same value and meaning is confusing, so it is cleaner to
rely exclusively on <arg_thrd>.

BUG/MINOR: haload: fix CPU topology detection by omitting forced "nbthread"

Always writing "nbthread" in the generated global configuration forced a
static thread count, preventing HAProxy from using its automatic CPU
topology detection and correct core binding. This caused severe performance
degradation on large multi-core machines. Fix this by omitting the "nbthread"
directive when -t is not explicitly specified, while ensuring <arg_thrd>
is properly initialized to global.nbthread for internal calculations such as
connection and request rates.

Also ensure <arg_thrd> <= <arg_usr> when -R is specified for internal
calculations.

MINOR: mux-h1: Use htx version to send default low-level errors

When an error was returned by the H1 multiplexer, the raw version was used
for default low-level errors. It is not really an issue, but it is more
consistent to use the HTX version. This way, by default, header names for
such errors are now sent in lower case. And the case of header names can
still be adjusted if necessary, thanks to the previous fix.

It is not really a bug, so there is no reason to backport it unless someone
asks for it.

MINOR: mux-h1: Lower the case for Sec-Websocket-* headers when manually added

During the message formatting, if the websocket handshake key must be added
by the mux itself, the corresponding headers were not inserted in lowercase
as expected. It is not really an issue but it is not consistent with the
processing of other headers. So let's fix it.

BUG/MEDIUM: mux-h1: Always adjust case for all outgoing headers as expected

In HTX, all header names are converted to lowercase. However, some legacy H1
applications are still sensitive to the header names case. For this
purpose, it is possible to provide a map to automatically adjust the case of
header names. While it is performed for most responses, it is not for
the low-level errors triggered during request parsing. In the same way,
the case of the "Sec-Websocket-Key" and "Sec-Websocket-Accept" headers was
not adjusted as expected.

To fix this issue the h1-htx API was slightly changed. Now the map used to
adjust the case of header names, if any, is passed to the function
responsible to format the headers. In the H1 multiplexer, the map is first
retrieved then passed as argument to h1_format_htx_hdr() and
h1_format_htx_msg() functions. A NULL pointer is passed if no rewrite must
be performed.

In the H1 multiplexer, h1_adjust_case_outgoing_hdr() was replaced by
h1_get_hdrs_map().

All other calls to h1_format_htx_hdr() were adapted to use a NULL pointer
(httpclient, http-fetch).

This patch relies on "REORG: h1-htx: Move h1 headers map in h1-htx". Both
commits should be backported to all supported versions.

This should fix the issue #3448.

REORG: h1-htx: Move h1 headers map in h1-htx

The map used to adjust the H1 headers case was moved in h1-htx part. For
this purpose h1_htx-t.h file was added and h1_hdrs_map and h1_hdr_entry
structures were moved into this file.

This commit is mandatory for the next fix.

MINOR: halog: Add support filtering on header capture values using -hdr-match

This patch extends the existing support for printing captured header fields
(`-hdr`) by a new filter (`-hdr-match`) that only processes lines where the
given capture has a specific value. It works together with all existing filters
and output formats.

The full syntax is `-hdr-match <block>:<field>=<value>`, where <block> and <field>
work just like `-hdr` and `<value>` is an exact string match:

Example:

    capture request  header a len 50
    capture request  header b len 50
    capture request  header c len 50
    capture response header d len 50
    capture response header e len 50
    capture response header f len 50

- `-hdr-match 1:1=foo` will filter for requests where `a` is equal to "foo".
- `-hdr-match "2:3=foo bar"` will filter for requests where `f` is equal to
  "foo bar".

The chosen syntax leaves future scope for allowing `<block>:<field>*<value>`
for substring matches and `<block>:<field>^<value>` for prefix matches without
introducing a breaking change.
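
To make the syntax concrete, here is a minimal parsing sketch for the
`<block>:<field>=<value>` form. It is not halog's code: struct hdr_match
and parse_hdr_match() are hypothetical names, and the assumption that
both numbers start at 1 is only inferred from the examples above.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical parsed form of "<block>:<field>=<value>". */
struct hdr_match {
    int block;
    int field;
    const char *value; /* points into the argument, exact-match string */
};

/* Parses <arg> into <m>. Returns 1 on success, 0 on a malformed filter.
 * Blocks and fields are assumed to be numbered from 1.
 */
static int parse_hdr_match(const char *arg, struct hdr_match *m)
{
    char *end;
    long block, field;

    block = strtol(arg, &end, 10);
    if (end == arg || *end != ':' || block < 1)
        return 0;
    arg = end + 1;

    field = strtol(arg, &end, 10);
    if (end == arg || *end != '=' || field < 1)
        return 0;

    m->block = (int)block;
    m->field = (int)field;
    m->value = end + 1;
    return 1;
}
```

Dispatching on the character found after the field number is what would
leave room for the `*` and `^` operators mentioned above.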

CLEANUP: halog: Clean up naming for variables related to `-hdr` processing

This is in preparation of allowing to filter based on the values of a header
capture by having a common "capture" prefix for variables related to extracting
header captures.

MINOR: halog: Add reusable function to extract the value of header captures

This is in preparation of allowing to filter based on the values of a header
capture. No functional change is expected.

DEBUG: fd: catch access attempts to closed FDs

Certain rare bugs may cause an fd_want_recv() or any of the other
operations to be performed on a closed FD. This triggers a BUG_ON() on the
next call trying to insert the FD but it's too late to figure out when
this happened. Let's just add some BUG_ON_HOT() to detect an attempt
to modify the state of a closed FD so that the culprit is detected.

It will only be enabled with DEBUG_STRICT=2, since by design this may
never happen so it's not needed to enable it in default builds. It was
verified not to trigger on various tests.

OPTIM: tools: keep a cache of recent localtime() and gmtime()

The log subsystem already keeps a cache of the latest generated time
header to avoid paying the price of snprintf() notably. However, as
reported in GH issue #3444 by @zino7825, localtime() and gmtime() are
affected (at least in glibc) by a tzset_lock held during the call to
__tz_convert(), which ruins performance when logging time such as the
accept date. Even just "option httplog" sees its performance divided
by two on a 64-core machine, from 1.8M to 910k req/s.

The following config even goes down to 388k req/s:

  defaults
    mode http

  frontend fe
    bind :8001
    log 127.0.0.1:5514 len 8192 local0
    log-format '{"t":"%t","tr":"%tr","ci":"%ci"}'
    http-request set-var(txn.now) date(),ltime("%Y-%m-%dT%H:%M:%S")
    option httplog
    default_backend be

  backend be
    server s1 198.18.0.31:8000

With native_queued_spin_lock_slowpath() taking 80% of the CPU, showing
that it's also involved in futexes.

This patch, suggested by @zino7825, implements a very simple thread-
local cache for the 4 previously seen values for both localtime() and
gmtime(). The cache is visited in reverse order so that most recently
updated values are visited first (the most likely ones to be used). A
test with the config above and 12.8M requests showed 38.4M lookups with
only 12k total misses. A more naive scan from 0 to N caused 59M misses.
With this patch, all variants of the tests above remain at native speed
without native_queued_spin_lock_slowpath() being noticeable at all. If
for any reason a log format was so complex that it needed more stored
local times, it would be easy to change the cache size by redefining
TIME_CACHE_BITS.
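
The principle can be sketched as follows for gmtime(). This is only an
approximation written for this description, not the code added to
tools.c: the structure, the variable names and the replacement policy
(a ring where the newest entry is scanned first) are assumptions, and
only TIME_CACHE_BITS is a name taken from the patch.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L /* for gmtime_r() */
#endif
#include <time.h>

#define TIME_CACHE_BITS 2
#define TIME_CACHE_SIZE (1U << TIME_CACHE_BITS)

struct tm_cache_entry {
    time_t when;
    struct tm tm;
    int valid;
};

/* One cache per thread, hence no locking at all on the lookup path. */
static _Thread_local struct tm_cache_entry gm_cache[TIME_CACHE_SIZE];
static _Thread_local unsigned int gm_last;    /* index of the newest entry */
static _Thread_local unsigned long gm_misses; /* only for observation */

/* Cached gmtime(): scans from the newest to the oldest entry and only
 * calls gmtime_r() on a miss, replacing the oldest entry.
 */
static void cached_gmtime(time_t now, struct tm *out)
{
    unsigned int i, idx;

    for (i = 0; i < TIME_CACHE_SIZE; i++) {
        idx = (gm_last - i) & (TIME_CACHE_SIZE - 1);
        if (gm_cache[idx].valid && gm_cache[idx].when == now) {
            *out = gm_cache[idx].tm;
            return;
        }
    }

    gm_misses++;
    gm_last = (gm_last + 1) & (TIME_CACHE_SIZE - 1);
    gm_cache[gm_last].when = now;
    gmtime_r(&now, &gm_cache[gm_last].tm);
    gm_cache[gm_last].valid = 1;
    *out = gm_cache[gm_last].tm;
}
```

A second lookup of the same second is served from the cache and never
reaches the libc, hence never takes its internal lock.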

The functions are no longer inlined, they were moved to tools.c since
we'd rather avoid loops and complex constructs in inlined functions.

Co-authored-by: Jinho Kong <zino7825@users.noreply.github.com>

OPTIM/MEDIUM: proxy/server: avoid server list reordering on startup

Prior to this patch, each server parsed from the configuration was added
at the front of the proxy list. The list was then reversed once parsing
is finished to reflect the configuration order.

Now that the proxy servers list has been converted to a doubly linked list,
there is no longer any reason for this. Thus, this patch changes the server
insertion order on configuration parsing: this is now performed
directly at the end of the proxy list. Reversal is unnecessary and has
been removed, so post-config performance may be slightly improved.

Peers parsing is the only module which relies on the insertion order.
Thus it has been adapted to now use the last server in its proxy.

MAJOR: proxy: convert server list to a doubly linked struct list

Servers are stored in a list in their parent proxy. Prior to this patch,
this list was singly linked.

This patch converts the proxy server list to a doubly linked struct
list. Server <next> pointer is replaced by a struct list <el_px> attach
point.

The main benefit from this patch is that it removes the performance
bottleneck for add and delete server operations at runtime. As with the
main proxies list conversion, this is labelled as major as it is an API
change.

Most of the changes are straightforward : for/while statements are
replaced by list_for_each_entry() macros. LIST_ISEMPTY() is now used to
detect if a proxy does not contain any server.

Server insertion at the front position during config parsing is kept at
the moment, with reordering on post parsing. With the current patch,
this is not strictly necessary so this will be removed in a next change.
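
The reason why a doubly linked list removes the bottleneck is shown by
the sketch below. It is a self-contained approximation of a circular
list with an embedded attach point; the list_*() helpers and struct
toy_server are written for this example and are not the project's LIST
macros.

```c
#include <stddef.h>

/* Minimal circular doubly linked list, in the spirit of a "struct list". */
struct list {
    struct list *n; /* next */
    struct list *p; /* prev */
};

static void list_init(struct list *l)
{
    l->n = l->p = l;
}

/* Appends <el> at the end of the list <head> in constant time. */
static void list_append(struct list *head, struct list *el)
{
    el->p = head->p;
    el->n = head;
    head->p->n = el;
    head->p = el;
}

/* Unlinks <el> in constant time: no walk is needed to find its
 * predecessor, contrary to a singly linked list.
 */
static void list_delete(struct list *el)
{
    el->n->p = el->p;
    el->p->n = el->n;
    list_init(el);
}

/* Toy server carrying its attach point, loosely modelled on <el_px>. */
struct toy_server {
    int id;
    struct list el_px;
};

/* Counts the elements by walking the list from <head>. */
static int list_count(const struct list *head)
{
    const struct list *l;
    int n = 0;

    for (l = head->n; l != head; l = l->n)
        n++;
    return n;
}
```

Appending at the tail preserves the insertion order without any later
reversal, and a deletion only touches the two neighbours of the element.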

MINOR: proxy: define server list iteration functions

Define wrapper functions to iterate over a proxy's server list. These
functions are used when a standard for loop over the whole list is not
desirable.

This patch will ease the conversion of the proxy servers list into a
doubly linked struct list type, as the final conversion patch changes
will be smaller.

MINOR: server: do not return next server on srv_drop()

Previously, srv_drop() returned the next server entry in the parent
proxy. This was used as convenience during server iteration. However,
the code has evolved several times to better deal with the server
deletion risk. Currently, srv_drop() return value is only used in a
single place.

Thus, this patch simplifies srv_drop() by removing its return value. As
a side effect, this will also reduce the modification required to
convert the proxy server list to a doubly linked struct list.

MINOR: server: rename global servers_list to all_servers

Rename global <servers_list> to <all_servers>. This name better reflects
that it contains all the servers and serves a similar purpose to the
<all_proxies> list.

BUILD: ssl: Do not use SSL3_MT_KEY_UPDATE, hardcode 24 instead

Not every SSL lib provides SSL3_MT_KEY_UPDATE, so just hardcode 24
instead.
This should be backported up to 2.8, when
91004114fe8816f848025fe71def4ea23e72a5f6 will be backported.

BUG/MEDIUM: ssl: Put CO_ER_SSL_KEYUPDATE at the right place

For some reason, CO_ER_SSL_KEYUPDATE was put in the middle of the
enum containing all the different possible connection errors, but
reg-tests depend on their numeric value, so it broke them. Put it at the
end of the enum instead.
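
The effect is easy to reproduce with any enum. The three enums below are
invented for the demonstration and are unrelated to the real connection
error codes.

```c
/* Before: the values that external tooling may rely on. */
enum err_v1 { V1_ER_A, V1_ER_B, V1_ER_C };

/* Inserting in the middle shifts every following value by one. */
enum err_mid { MID_ER_A, MID_ER_NEW, MID_ER_B, MID_ER_C };

/* Appending at the end leaves the existing values untouched. */
enum err_end { END_ER_A, END_ER_B, END_ER_C, END_ER_NEW };
```

Here the "B" entry is worth 1 in the first and last enums but 2 in the
second one, which is what breaks anything matching on numeric values.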

This should be backported up to 2.8 with commit
91004114fe8816f848025fe71def4ea23e72a5f6.

DOC: ssl: Document tune.ssl.keyupdate-rate-limit

Add documentation for tune.ssl.keyupdate-rate-limit

This should be backported up to 2.8, when commit
91004114fe8816f848025fe71def4ea23e72a5f6 will be backported.

MEDIUM: ssl: Add a way to rate-limit TLSv1.3 KeyUpdate

Processing TLSv1.3 KeyUpdate is expensive in terms of CPU, and in normal
usage there is very little reason to get a lot of them. So add a new
keyword, tune.ssl.keyupdate-rate-limit, that gives the maximum number of
KeyUpdate we're okay with receiving per second. The default is 100,
which should be enough. 0 means no rate-limiting at all.
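
For readers unfamiliar with the idea, a per-second limit with "0 means
unlimited" semantics can be sketched as below. This is a naive
fixed-window counter written for this description only: struct rate_limit
and rate_limit_allow() are hypothetical, and the way the patch actually
counts events is not shown here.

```c
#include <time.h>

struct rate_limit {
    time_t sec;         /* second the counter belongs to */
    unsigned int count; /* events accepted during <sec> */
    unsigned int limit; /* max events per second, 0 = unlimited */
};

/* Accounts one event at time <now>. Returns 1 when the event is allowed,
 * 0 when the limit for the current second is already reached.
 */
static int rate_limit_allow(struct rate_limit *rl, time_t now)
{
    if (!rl->limit)
        return 1;

    if (rl->sec != now) {
        /* new second: restart counting */
        rl->sec = now;
        rl->count = 0;
    }

    if (rl->count >= rl->limit)
        return 0;

    rl->count++;
    return 1;
}
```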

This should mitigate the problem reported in Github issue #3450.

This should be backported as far back as 2.8.

BUG/MEDIUM: ssl: Handle non-application data record while splicing

When using splicing with kTLS, if we receive a record that is not an
application data record, such as a KeyUpdate, then splicing will fail.
If that happens, temporarily disable splicing and go the regular way so that
recvmsg() is used, we get the record, and we can resume splicing.
Please note that KeyUpdate is still not handled with AWS-LC. Only recent
Linux kernels support it, and the code hasn't been written for that yet.

This should be backported up to 3.3.

This should help with github issue #3450.

BUG/MEDIUM: ssl: Spell HAVE_VANILLA_OPENSSL correctly

Use HAVE_VANILLA_OPENSSL, which is what we actually define when we're
linked against OpenSSL, and not USE_VANILLA_OPENSSL.
Not using the right macro means we may start splicing with kTLS while
there are still data available in OpenSSL's internal buffers, leading to
data not being properly read.

This should be backported up to 3.3.

MINOR: proxy: rename proxies list to all_proxies

Rename global list <proxies> to <all_proxies>. This better highlights
the difference with the other list <main_proxies>. The first one contains
every proxy instance, whereas the second list is only a subset of user
visible proxies.

In the future, it should be sufficient to only keep <all_proxies> list.
However, this requires careful code analysis, in particular to ensure
proxies iteration are always executed with the proper capabilities
filter.

CLEANUP: proxy/config: clean up after proxies list conversion

All proxy lists (main proxies, log forward and sink) have been
converted to the doubly linked standard list type. Proxy <next> member
is now unneeded, it is thus removed from the structure.

This patch also removes obsolete _get_next_proxy() wrapper used during
check_config_validity().

OPTIM/MEDIUM: proxy: avoid main proxies list reordering on startup

Prior to this patch, proxies were added in front of the visible proxies
list, resulting in the reverse order of their parsing. The list was then
reversed once parsing was completed to reflect the configuration order.
This was performed for performance reasons, as the visible proxies list was
singly linked.

This list has recently been converted to a doubly linked struct list.
Thus, it is now possible to append a newly parsed proxy at the end of
the list in constant time.

Thus, with this patch, a newly parsed instance is now directly inserted
at the end of the main proxies list. This is implemented by updating the
main_proxies_register() wrapper. Post-config performance is slightly
improved as reversal is now unneeded.

MAJOR: proxy: convert proxies_list to a doubly linked struct list

Frontend, backend and listen proxies are all stored in <proxies_list>
which contains all user visible entries. This list is read in several
places, for example during stats dump.

Dynamic backends have just been introduced in the previous 3.4 release,
with creation and removal of new instances at runtime. The main
bottleneck of these operations is the manipulation of <proxies_list> due
to its singly linked nature.

This patch converts <proxies_list> into a doubly linked standard list
type. Proxy member <el> is reused as attach point. It is already used
for internal proxies in other list, not intersecting with this usage.
The main benefit of this change is to improve performance for add and
delete backend operations. It is labelled as major as it may break
external components which manipulate this list.

Also, the old name <proxies_list> has been replaced by <main_proxies>.
This helps to differentiate it from the already existing superset <proxies>
list which contains all proxy instances, even internal ones. In the
future, it should be sufficient to only keep the latter list. However,
this requires careful code analysis, in particular to ensure proxy
iterations are always executed with the proper capabilities filter.

Proxies reversal on post parsing has been updated to use the new list
format. This will be removed in a later patch as it is now possible to
insert proxies in the order of their parsing in constant time. Most of
the other changes are straightforward : for/while statements are
replaced by list_for_each_entry(). For incomplete iteration, recently
introduced iteration wrappers are updated to use LIST macros.

MINOR: proxy: define proxies_list iteration functions

Define wrapper functions for <proxies_list> iteration. These functions
are used when a standard for loop over the whole list is not desirable.

The objective of this patch is to ease the conversion of <proxies_list>
from a singly linked list to a standard doubly linked list type. This
will help to keep the final conversion patch smaller.

Also, iteration over struct list is slightly more complicated as it
requires more operations, for example for list end detection. Thus, the
wrappers will come in handy to avoid tedious code repetition.

MINOR: proxy: centralize proxies_list insert during config parsing

Define a new function wrapper mainpx_register() which is used to insert
a proxy instance in <proxies_list> which contains all user visible
entries. Insertion is performed at the front of the list. Thus it must
only be used during configuration parsing, before proxies list reversal.

The main objective of this patch is to simplify the list conversion to a
doubly linked one by centralizing the required changes in a single
function. A nice benefit is that it is now easier to determine the exact
proxies list composition.

MINOR: sink: convert list to standard doubly linked one

This patch is similar to the previous one, dealing this time with sink
proxies stored in <sink_proxies_list>.

This list is converted to a doubly linked standard list type, in
preparation to the conversion of the visible proxies list. The objective
is similar : code simplification for check_config_validity() when
looping over several proxies list.

MINOR: log: convert list to standard doubly linked one

Log-forward proxies are not stored in <proxies_list> alongside other
visible user proxies. Instead, they are stored in a dedicated list
<cfg_log_forward>.

As with the visible proxies list, log forwarders list was defined as a
simple singly linked one. This patch converts it to the doubly linked
standard list type, frequently used in haproxy source code. Proxy <el>
member is reused as list attach point. It was only used by defaults
instances prior to this patch.

This patch does not bring any noticeable change. However, it will simplify
the similar conversion of the visible proxies list, as both lists are
looped over together during check_config_validity().

MINOR: config: define wrapper for proxies loop during check config

In check_config_validity(), proxy post init code is reexecuted over
several proxy lists : the main proxies list, log sections and sinks.

This patch defines an internal wrapper to help perform this
iteration. It will become necessary once each list is converted from a
singly linked format to a doubly linked struct list. Once all lists are
converted, the wrapper should be removed as it will be once again
unnecessary.

MINOR: log: use curproxy during config parsing

Adjust configuration parsing for log-forward sections : replace
<cfg_log_forward> head pointer list usage in individual keyword parsers
with the <curproxy> global variable. This parsing pattern is already
implemented in the cfgparse-listen module.

This is only a refactoring change without any visible effect. It is a
preparatory step before converting <cfg_log_forward> to a full doubly
linked struct list.

MINOR: haload: add rate limiting support using -R option

Implement request/connection rate limiting per second using the new
-R command line option <arg_rate>.

Instead of reinventing the wheel, this implementation aligns with the
h1load approach by relying on existing frequency counters. Dedicated
tasks <hld_rate_task> are spawned on each thread solely to manage and
throttle connection initializations based on <arg_rate>.

The counter <running_usrs> is renamed to <running_tasks> to track both
users and connection rate-limiting tasks.

Modify the doc/haload.txt file.

BUG/MINOR: haload: fix display glitches by flushing stdout in summary

Fix a bug in hld_summary() where the output stream was not explicitly
flushed after printing the periodic statistics line. This caused text
buffering issues and visual overlapping in the terminal output.

BUG/MINOR: haload: set default thread count to 1

By default <arg_thrd> was unset (-1) and fell back to global.nbthread,
causing inconsistencies since <arg_usr> defaults to 1. This patch sets
the default thread count to 1, enforces that <arg_thrd> does not exceed
<arg_usr>, updates usage help, and writes nbthread unconditionally.

Modify the doc/haload.txt file.

BUG/MINOR: haload: fix use-after-free upon updating task expiration

Fix a bug in hld_strm_task() where hs was accessed after it had already been
freed via hldstream_free(&hs). The task expiration update and queueing
now safely rely on <usr> rather than <hs->usr>.

MINOR: proxy: stress "show errors" handler

Implement stress function for "show errors" output. This forces the
command to yield on every newly dumped instance, which is useful for
debugging.

BUG/MEDIUM: proxy: protect "show errors" against backend deletion

Command "show errors" is not safe as it loops over the list of proxies.
A crash may occur if the task yields on a proxy which is deleted just
before the dump is restarted.

As with previous fixes dealing with similar cases, this is fixed via the
watcher mechanism. A new <px_watcher> member is defined in
<show_errors_ctx> and used to loop over the proxies list.

This must be backported up to 3.4.

CLEANUP: mux_quic: remove unused prototype

Remove qcc_update_shut_id() declaration. This function was never defined
in the code.

[RELEASE] Released version 3.5-dev3

Released version 3.5-dev3 with the following main changes :
    - BUILD: haload: Increase a buffer size so that gcc will stop complaining
    - BUILD: task: Fix build when no 8B CAS is available at all
    - BUG/MINOR: hlua: Apply socket timeout on server side only
    - BUG/MEDIUM: applet: Reenable reads in applet context if requesting a connection
    - MINOR: stats: factor the proxy vs scope check into its own function
    - BUG/MEDIUM: stats: subject "stats admin" accesses to "stats scope" filtering
    - BUG/MINOR: shctx: fix shctx_row_data_get() when offset exceeds a block
    - MINOR: shctx: clamp shctx_row_data_get() reads against the offset
    - DOC: stats: document the per-proxy byte count fields in the CSV list
    - BUG/MEDIUM: cache: reattach the row when a secondary entry is incomplete
    - MINOR: http: add two header parsing functions
    - MINOR: cache: minor changes ahead of support for sending early hints
    - MINOR: cache: add config options for early hints support
    - MINOR: cache: add helper functions for early hints support
    - MINOR: shctx: allow consumers to customize eviction strategy
    - MINOR: cache: track full and hints entries in per-pool LRU lists
    - MEDIUM: cache: early hints-aware eviction code
    - MINOR: cache: indicate whether entries are stripped or not
    - MINOR: http: factor 103 emission into start/end helpers
    - MEDIUM: cache: emit early hints if configured to do so
    - MINOR: cache: add a counter for cache hits serving early hints
    - MINOR: cache: allow opting out of early hints at the rule level
    - MINOR: cache: allow customizing ratio for early hints
    - REGTESTS: cache: validate the emission of 103s
    - MINOR: cache: factor cache_extract_link_hints out of cache_extract_hints
    - MEDIUM: cache: add support for hints-only HTTP caches
    - MEDIUM: ssl: introduce src/fips.c with TLS version check
    - MEDIUM: ssl: add FIPS TLS 1.2 cipher check for AWS-LC
    - MEDIUM: ssl: set FIPS-approved cipher defaults for AWS-LC FIPS builds
    - MEDIUM: ssl: add FIPS TLS 1.3 ciphersuite check for AWS-LC
    - MEDIUM: ssl: set FIPS-approved curve defaults for AWS-LC FIPS builds
    - MEDIUM: ssl: add FIPS elliptic curve check for AWS-LC
    - MEDIUM: ssl: set FIPS-approved sigalgs defaults for AWS-LC FIPS builds
    - MEDIUM: ssl: add FIPS signature algorithm check for AWS-LC
    - CLEANUP: cache: align the cache_hint_hits increment with its siblings
    - BUG/MEDIUM: stats: Ensure that Origin is valid on POSTs
    - DOC: stats: Document that stats admin is vulnerable to a CSRF attack
    - BUG/MEDIUM: ssl-gencert: Don't forget to free memory when done
    - BUG/MEDIUM: protobuf: adjust sample size capacity after pointer shift
    - BUG/MEDIUM: protobuf: fix nested path bypass in field lookup
    - BUG/MINOR: ssl: fix proxy lookup for show ssl sni
    - REGTESTS: protobuf: add regression test for nested vs flat paths
    - DEBUG: add BUG_ON_STATIC(): a compile-time BUG_ON()
    - CLEANUP: event_hdl: Use BUG_ON_STATIC()
    - DOC: internals: update core-principles with initializations
    - BUG/MINOR: mux-h1: Don't delay send if message with c-l was fully sent
    - BUG/MEDIUM: net-helper: Adjust sample size capacity after pointer shift
    - BUG/MEDIUM: sample: Adjust sample size capacity after pointer shift for bytes()
    - BUG/MEDIUM: sample: Adjust sample size capacity after pointer shift for ltrim()
    - BUG/MINOR: sample: Fix bytes() when length it greater than remaining data
    - BUILD: Makefile: error when trying to build with aws-lc with the wrong flags
    - DOC: fix typo in "del ssl ech" command
    - DOC: remove outdated experimental mention on dynamic backends
    - BUG/MEDIUM: proxy: protect "show servers ..." against server deletion
    - BUG/MEDIUM: proxy: protect "show servers ..." against backend deletion
    - BUG/MEDIUM: proxy: protect show backend against be deletion
    - MINOR: proxy: stress CLI commands with backends/servers loop
    - BUG/MEDIUM: server: Properly check for streams before deletion
    - BUG/MINOR: resolvers: do not index resolvers names in the proxies

BUG/MINOR: resolvers: do not index resolvers names in the proxies

Resolvers are self-sustaining sections that have their own list, but for
the purpose of supporting TCP connections, a backend proxy is created
with them, and it holds the same name as the section. Since we started
to create a "default" resolvers section, a similarly named backend in
TCP mode automatically appeared, resulting in issues such as the one
described in GH #3445:

   defaults
       mode http

   frontend public
       bind :8001
       default_backend default

It yells: "... tries to use incompatible tcp proxy 'default' (<internal>:0)
in a 'use_backend' rule (see 'mode')" because it in fact finds the hidden
resolvers proxy. Depending on versions, adding such a default backend makes
the issue go away or not.

Worse, the following config:

   frontend public
       bind :8001
       default_backend default

actually routes the traffic to the local resolvers :-(

Resolvers do not need to appear in proxies index tree, as they're looked
up by find_resolvers_by_id() which iterates over all resolvers sections
(i.e. tree not used), so we can safely avoid indexing them.

Some other proxies are now protected against this via the use of PR_CAP_INT
which also prevents indexing, but that flag changed multiple times along
versions and is extremely sensitive in older ones, to the point of not
being suitable for a backport.

Instead, commit 116983ad94 ("MEDIUM: cfgparse: do not store unnamed
defaults in name tree") added the following test to refrain from
indexing an unnamed proxy in setup_new_proxy():

  @@ -1718,7 +1718,8 @@ int setup_new_proxy(struct proxy *px, const char *name, unsigned int cap, char *
          px->cap = cap;
          px->last_change = ns_to_sec(now_ns);

  -       if (name && !(cap & PR_CAP_INT))
  +       /* Internal proxies or with empty name are not stored in the named tree. */
  +       if (name && name[0] != '\0' && !(cap & PR_CAP_INT))
                  proxy_store_name(px);

          if (!(cap & PR_CAP_DEF))

This part is totally suitable for backporting and allows us to simply
change the resolvers' proxy creation to pass NULL or an empty string
instead of the resolvers section's name so that we don't index that
proxy. That's what this patch does. For versions before 3.4, the
commented patch above must be explicitly backported. This patch must
be backported to 3.0. Thanks to Lukasz Nowak for reporting this bug.

It's worth noting that this area is really falling apart and urgently
needs attention. The internal API is currently too limited to permit
reliable lookups, which is the reason why so many proxies are hidden
that way. In an ideal world, all names would be indexed and caps and
modes properly set to unambiguous values, so that lookup functions
only look up specific caps and modes, and support being passed a mask
to avoid matching undesired caps. At the very least a mask should be
given in lookups, and a few extra caps should allow distinguishing
regular proxies (those declared with the "frontend" and "backend"
keyword, supporting use_backend) from other ones (rings, resolvers,
log-forward etc).

BUG/MEDIUM: server: Properly check for streams before deletion

When a server delete command happens, before we actually delete the
server, we check if there are any streams still attached to the server,
and will refuse the deletion if it is so. Unfortunately, the check was
not exhaustive: we would check if there is any stream in the queue, if
there are any connections established to the server, and if served is
non-zero.

But there are at least two cases where a stream will have a reference to
the server, but none of those will be true : the first one is the case
where we tried a connection to the server, and it failed. At this point
we decremented served, but we will still hold a reference to the server
in target until a new server has been assigned. The second one happens
with cookie persistence, in which case target will be set to the server
address way before any connection is attempted, and so way before served
is incremented.

To fix that, introduce a new per-thread-group counter in the server
structure, nb_strm, that counts the streams whose target points to that
server. It is incremented when a stream's target is set to a server, and
decremented when the target changes to another server, to a non-server,
or the stream ends. To make this reliable, all assignments to a stream's
target now go through the new stream_set_target() helper, which keeps
nb_strm up to date and also clears sv_tgcounters when the target is no
longer a server, so we don't keep a dangling reference to the server's
counters (its failure counters have already been accounted for by that
point). To check if there are still any streams attached to the server,
the code just looks at all the per-thread-group counters, and if one of
them is non-zero, there are still streams on that server. This will
always be accurate because the server deletion command runs with thread
isolation.
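
The accounting described above can be sketched as follows. This is only an
illustration under simplifying assumptions: the sk_* types and functions are
hypothetical, the number of thread groups is arbitrary, and plain integers
are used where a real multi-threaded implementation would need atomic
operations:

```c
#include <stddef.h>

#define SK_MAX_TGROUPS 4 /* hypothetical number of thread groups */

/* Hypothetical server: one counter per thread group of the streams whose
 * target points to this server.
 */
struct sk_server {
	unsigned int nb_strm[SK_MAX_TGROUPS];
};

/* Hypothetical stream: <target> is NULL when the target is not a server,
 * <tgid> is the (0-based) thread group the stream runs on.
 */
struct sk_stream {
	struct sk_server *target;
	unsigned int tgid;
};

/* Single entry point for target changes, so that the counters can never
 * get out of sync with the references actually held by streams.
 */
static void sk_stream_set_target(struct sk_stream *s, struct sk_server *srv)
{
	if (s->target == srv)
		return;
	if (s->target)
		s->target->nb_strm[s->tgid]--;
	if (srv)
		srv->nb_strm[s->tgid]++;
	s->target = srv;
}

/* Deletion check: any non-zero per-group counter means that at least one
 * stream still references the server.
 */
static int sk_server_has_streams(const struct sk_server *srv)
{
	size_t i;

	for (i = 0; i < SK_MAX_TGROUPS; i++)
		if (srv->nb_strm[i])
			return 1;
	return 0;
}
```

A stream that ends would call sk_stream_set_target(s, NULL) in this sketch,
which releases its reference like any other target change.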

This should be backported up to 3.0.

MINOR: proxy: stress CLI commands with backends/servers loop

Update handlers for "show backend" and "show servers state/conn" so that
stress mode can be used to force the task yielding each time a new
instance is dumped.

This is useful for debugging commands safety with backend/server
deletion in parallel.

BUG/MEDIUM: proxy: protect show backend against be deletion

Command "show backend" loops over the list of visible proxies to display
information about all the backend instances. This command may yield in
case of a long output, but this is not safe with introduction of dynamic
backend deletion at runtime.

Fix this by using the watcher mechanism, similarly to what is
implemented for stats dump. To support this, a new context dedicated to
"show backend" has been defined.

This must be backported up to 3.4.

BUG/MEDIUM: proxy: protect "show servers ..." against backend deletion

This patch is a direct follow up of the previous one which fixes
commands "show servers conn/state". The current patch ensures that these
commands are safe even with task yielding and backend deletion in
parallel. This is also based on the watcher mechanism.

This must be backported up to 3.4.

BUG/MEDIUM: proxy: protect "show servers ..." against server deletion

Command show servers conn/state loops over a server list to display
various information. This command may yield in case of a large output.
This can cause a crash if the server on which the command was paused is
deleted prior to restarting the command.

Fix this by using the watcher mechanism, similar to what is already
implemented for stats dump.

This must be backported at least up to 3.0. For older releases, watcher
mechanism is not available, server refcount + SRV_F_DELETED flag must be
used instead.

DOC: remove outdated experimental mention on dynamic backends

Dynamic backends commands were first implemented with experimental
status guard. However, this status was removed before the final release
as it is considered stable enough.

This patch cleans up outdated references to experimental status for
add/del backend commands in management documentation.

This must be backported up to 3.4.

DOC: fix typo in "del ssl ech" command

Correct "det" to "del" keyword.

This must be backported up to 3.3.

BUILD: Makefile: error when trying to build with aws-lc with the wrong flags

USE_OPENSSL_AWSLC=1 is required when trying to build with aws-lc. This
was already required before, but kind of worked without it since
f76e8e50f460 ("BUILD: ssl: replace USE_OPENSSL_AWSLC by
OPENSSL_IS_AWSLC").

However since 9ac590b5 ("MEDIUM: ssl: introduce src/fips.c with TLS
version check"), this can't work by accident anymore.

This patch emits an error when we find OPENSSL_IS_AWSLC but did not
find USE_OPENSSL_AWSLC=1.

BUG/MINOR: sample: Fix bytes() when length it greater than remaining data

For the bytes() converter, when the length parameter is greater than the
remaining data, no truncation on length must be performed. However, in that
case, nothing was performed at all: the shift required by the offset
parameter was simply ignored.

Now, when the length value is too large, the sample data are moved
according to the offset value as expected.

The corresponding reg-test was updated to test this case with a non-zero
offset. In addition the documentation was fixed to properly match what the
converter does. It was wrongly updated when the support of variables was
introduced in 2.9.

This patch must be backported as far as 3.0.

BUG/MEDIUM: sample: Adjust sample size capacity after pointer shift for ltrim()

For the ltrim() converter, on success, the area pointer of the sample buffer
is moved forward without updating the buffer size accordingly. If this
converter is followed by another one relying on the buffer size to do some
operations on the buffer area, this could lead to a buffer overflow.
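
The invariant at stake can be sketched as follows, assuming a hypothetical
buffer descriptor (str_buf) and helper (str_buf_skip) that only stand in for
the real sample buffer: whenever the area pointer moves forward by <n>
bytes, the capacity counted from that pointer must shrink by <n> as well.

```c
#include <stddef.h>

/* Hypothetical stand-in for a sample's string buffer: <area> points to the
 * storage, <size> is the capacity available starting at <area>, and <data>
 * is the number of bytes currently in use (data <= size).
 */
struct str_buf {
	char *area;
	size_t size;
	size_t data;
};

/* Skips up to <n> leading bytes. The capacity is reduced by the same
 * amount as the pointer shift, so that area + size still designates the
 * end of the underlying storage.
 */
static void str_buf_skip(struct str_buf *b, size_t n)
{
	if (n > b->data)
		n = b->data;
	b->area += n;
	b->data -= n;
	b->size -= n; /* the step that was missing in the bug described above */
}
```

Without the last assignment, a later writer trusting <size> would believe
that the full original capacity is still available from the shifted
pointer.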

Thanks to Charles Vosburgh <theminershive@gmail.com> for reporting this and
providing the fix.

This patch must be backported to all supported versions.

BUG/MEDIUM: sample: Adjust sample size capacity after pointer shift for bytes()

For the bytes() converter, on success, the area pointer of the sample buffer
is moved forward without updating the buffer size accordingly. If this
converter is followed by another one relying on the buffer size to do some
operations on the buffer area, this could lead to a buffer overflow.

Thanks to Charles Vosburgh <theminershive@gmail.com> for reporting this and
providing the fix.

This patch must be backported to 3.0.

BUG/MEDIUM: net-helper: Adjust sample size capacity after pointer shift

eth.data, eth.src and ip.data converters are concerned. On success, the area
pointer of the sample buffer is moved forward without updating the buffer
size accordingly. If these converters are followed by another one relying on
the buffer size to do some operations on the buffer area, this could lead to
a buffer overflow.

Thanks to Charles Vosburgh <theminershive@gmail.com> for reporting this and
providing the fix.

This patch must be backported to 3.4.

BUG/MINOR: mux-h1: Don't delay send if message with c-l was fully sent

In the H1 multiplexer, if a message was fully sent considering the specified
content-length, the MSG_MORE flag must be unset, regardless of what the upper
layer said. This should avoid an extra 200ms latency when this occurs while the
HTX EOM flag is not set on the message. This may happen for H2/H3 messages
with trailers or if the END_OF_STREAM was not reported yet on H2/H3 side. In
both cases, on H1 side, no more data will be sent (trailers will be
dropped). So we must never delay the send.
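
The rule can be summarized by the sketch below. The function name and its
parameters are hypothetical and do not reflect the actual mux-h1 code; they
only model "the upper layer asks to delay" versus "the announced body has
been fully sent":

```c
/* Returns non-zero if the send may be delayed (i.e. MSG_MORE may be kept).
 * <upper_wants_more> is the hint coming from the upper layer, <has_clen>
 * is non-zero when a content-length was announced, and <body_left> is the
 * number of body bytes still to be sent according to that content-length.
 */
static int sk_h1_may_delay_send(int upper_wants_more, int has_clen,
                                unsigned long long body_left)
{
	/* Nothing more will be emitted on the H1 side once the announced
	 * length is reached, so delaying the send would only add latency.
	 */
	if (has_clen && body_left == 0)
		return 0;
	return upper_wants_more;
}
```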

This patch should fix the issue #3447. It must be backported to all
supported versions.

DOC: internals: update core-principles with initializations

Code reviews tend to remain confused about initialized values depending
on struct types. Let's clarify when pool_alloc() and calloc() are used.

CLEANUP: event_hdl: Use BUG_ON_STATIC()

Now that we have a BUG_ON_STATIC() macro, use it instead of rolling our
own.

DEBUG: add BUG_ON_STATIC(): a compile-time BUG_ON()

Fails the build when <cond>, a constant expression, is true. Maps to
_Static_assert() on C11+; older dialects declare a negatively-sized
extern array instead, which emits no storage nor symbol. Usable at
file and block scope.
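
A minimal sketch of the two code paths described above, using a
hypothetical STATIC_CHECK() macro (the actual BUG_ON_STATIC() definition
may differ in its details):

```c
#include <stddef.h>

/* Hypothetical helpers used to build a unique identifier per line. */
#define SC_CONCAT_(a, b) a##b
#define SC_CONCAT(a, b)  SC_CONCAT_(a, b)

#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L
/* C11 and later: the build fails when <cond> is true. */
#define STATIC_CHECK(cond) \
	_Static_assert(!(cond), "STATIC_CHECK failed: " #cond)
#else
/* Older dialects: declare an extern array whose size is negative when
 * <cond> is true. It is only a declaration, so no storage is emitted.
 */
#define STATIC_CHECK(cond) \
	extern char SC_CONCAT(static_check_line_, __LINE__)[(cond) ? -1 : 1]
#endif

struct hdr {
	unsigned char type;
	unsigned char flags;
	unsigned short len;
};

/* File scope usage: fail the build if the header ever grows beyond 8 bytes. */
STATIC_CHECK(sizeof(struct hdr) > 8);

static size_t hdr_size(void)
{
	/* Block scope usage: <type> must remain the first member. */
	STATIC_CHECK(offsetof(struct hdr, type) != 0);
	return sizeof(struct hdr);
}
```

Both forms are declarations, which is what makes them usable at file scope
as well as inside a function body.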

REGTESTS: protobuf: add regression test for nested vs flat paths

This patch adds a new VTC regression test to validate the protobuf()
sample converter and prevent future regressions.

It tests four distinct scenarios to ensure correct behavior:
  - validating legitimate nested path extraction (e.g., path "1.2") on a
    nested payload.
  - ensuring flat path extraction (path "2") is blocked when the field is
    actually inside a nested payload.
  - validating legitimate flat sibling extraction (path "2") on a flat
    payload.
  - ensuring nested path extraction (path "1.2") is blocked when run on
    a flat sibling payload, which was fixed by this commit:

    BUG/MEDIUM: protobuf: fix nested path bypass in field lookup

BUG/MINOR: ssl: fix proxy lookup for show ssl sni

Command "show ssl sni" accepts an extra option -f to restrict output to
a single frontend. If the specified proxy is not found, an error message
should be displayed. However, this does not behave as expected as no
error is reported and the first proxy in the list is used as a sort of
fallback.

This patch fixes the command parsing function when -f is used. The loop
is now interrupted as soon as a matching entry is found. After the loop,
if local variable <px> is still NULL, it indicates that no matching
entry was found. The error message is displayed as intended and the
command is not executed.
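
The corrected lookup pattern can be sketched as follows. The sk_proxy type
and sk_find_proxy() function are hypothetical and only illustrate the
"break on match, then test for NULL" logic:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical proxy entry in a singly linked list. */
struct sk_proxy {
	const char *id;
	struct sk_proxy *next;
};

/* Returns the proxy named <name>, or NULL if there is none. <px> is only
 * assigned on a match, so it can never be left pointing to an unrelated
 * entry once the loop ends.
 */
static struct sk_proxy *sk_find_proxy(struct sk_proxy *head, const char *name)
{
	struct sk_proxy *px = NULL;
	struct sk_proxy *cur;

	for (cur = head; cur; cur = cur->next) {
		if (strcmp(cur->id, name) == 0) {
			px = cur;
			break; /* stop at the first matching entry */
		}
	}
	return px; /* NULL: the caller must report an error and stop */
}
```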

This must be backported up to 3.2.

BUG/MEDIUM: protobuf: fix nested path bypass in field lookup

The previous implementation of protobuf_field_lookup() was susceptible
to a security bypass where flat sibling fields could be incorrectly matched
as nested children. This occurred because the parser did not strictly
enforce hierarchical boundaries and depth matching during sequential
iteration, allowing flat structures to satisfy nested path requirements (e.g.
treating a root-level sibling field as if it were nested under a parent).

This patch refactors the lookup logic into a strict, iterative stream
parser that progresses through the defined configuration path (depth).
The new logic guarantees that:

- intermediate matching parent fields must be of type LENGTH_DELIMITED.
   If an intermediate match is found but is a primitive type, the parsing
   aborts immediately.
- when descending into an intermediate parent, the scanning boundary
   (*len) is strictly narrowed down to the nested sub-message size (vlen)
   with an upfront validation against buffer overflow (vlen > *len).
- sibling fields at the current level are skipped cleanly without polluting
   higher or lower hierarchy levels.

This implementation achieves absolute stack safety (O(1) memory complexity,
no recursion) and guarantees strict validation of nested Protobuf paths.
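
The guarantees listed above can be illustrated with the following sketch.
It is not the code of protobuf_field_lookup(): the pb_varint() and
pb_lookup() names, their signatures and their return conventions are
assumptions made for this example only. In this simplified version the
lookup descends into the first matching parent and never leaves it.

```c
#include <stddef.h>
#include <stdint.h>

/* Decodes a varint at <*pos>, consuming at most <*len> bytes. Returns 0 on
 * success and advances <*pos>/<*len>, or -1 if truncated or too long.
 */
static int pb_varint(const unsigned char **pos, size_t *len, uint64_t *out)
{
	uint64_t v = 0;
	int shift;

	for (shift = 0; shift < 64 && *len; shift += 7) {
		unsigned char c = *(*pos)++;

		(*len)--;
		v |= (uint64_t)(c & 0x7f) << shift;
		if (!(c & 0x80)) {
			*out = v;
			return 0;
		}
	}
	return -1;
}

/* Follows <path> (field numbers, <depth> entries) without recursion.
 * Returns 1 with <*pos>/<*len> delimiting the encoded value of the last
 * path element, otherwise 0.
 */
static int pb_lookup(const unsigned char **pos, size_t *len,
                     const uint32_t *path, size_t depth)
{
	size_t level = 0;

	while (level < depth && *len) {
		const unsigned char *p;
		uint64_t key, vlen, tmp;
		size_t l;

		if (pb_varint(pos, len, &key) < 0)
			return 0;

		switch (key & 7) {
		case 0: /* VARINT: measure the value without consuming it */
			p = *pos;
			l = *len;
			if (pb_varint(&p, &l, &tmp) < 0)
				return 0;
			vlen = (uint64_t)(p - *pos);
			break;
		case 1: vlen = 8; break; /* 64-bit */
		case 2: /* LENGTH_DELIMITED: consume the length prefix */
			if (pb_varint(pos, len, &vlen) < 0)
				return 0;
			break;
		case 5: vlen = 4; break; /* 32-bit */
		default:
			return 0; /* unsupported wire type */
		}
		if (vlen > *len)
			return 0; /* the value would exceed the current boundary */

		if ((key >> 3) != path[level]) {
			/* sibling at the current level: skip it entirely */
			*pos += vlen;
			*len -= (size_t)vlen;
			continue;
		}
		if (level == depth - 1) {
			*len = (size_t)vlen;
			return 1;
		}
		if ((key & 7) != 2)
			return 0; /* an intermediate parent must be a sub-message */
		*len = (size_t)vlen; /* narrow the boundary to the sub-message */
		level++;
	}
	return 0;
}
```

For example, with the payload 0a 03 10 96 01 (field 1 holding a
sub-message whose field 2 is a varint), the path "1.2" matches while the
flat path "2" does not, because field 2 is skipped together with its
parent.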

Many thanks to Red Hat and AISLE Research for reporting this.

This must be backported as far as 2.6.

BUG/MEDIUM: protobuf: adjust sample size capacity after pointer shift

This bug impacts "protobuf" and "ungrpc" sample fetches.

When extracting fields from a payload, the data pointer
"smp->data.u.str.area" is forwarded to point directly onto the decoded
sub-field data. However, the sample's maximum capacity size tracking property
"smp->data.u.str.size" was left untouched, remaining set to the total
original buffer size (e.g., tune.bufsize).

This creates an architectural size mismatch. Subsequent processing layers or
converters (such as stick-tables executing padding mechanisms via memset())
may assume that the original total capacity is still fully available starting
from the new forwarded offset. In specific configurations involving writable
cloned buffers (e.g., combining "lower" and "protobuf" converters before
tracking key storage), this logical flaw can easily lead to out-of-bounds
heap memory corruption or crashes.

Under rare conditions, depending on the specific combination of frontend
configurations (e.g., use of wait-for-body rules) and the alignment of
incoming HTTP requests containing specific chunked or large payload patterns,
this out-of-bounds write could lead to an immediate haproxy process crash
or data corruption.

Fix this by introducing "protobuf_adjust_smp_size()". This inline function
safely computes the consumed byte offset after each pointer shift and
decrements the sample's residual capacity size accordingly before updating
the payload reference. All "protobuf_smp_store_*" callbacks are updated.

Many thanks to Red Hat and AISLE Research for reporting this.

This must be backported as far as 2.6.

BUG/MEDIUM: ssl-gencert: Don't forget to free memory when done

In ssl_sock_do_create_cert(), don't forget to free ctmp and tmp_ssl once
we're done creating the certificate, otherwise we will get a memory leak
each time we have to create a new certificate, which will happen each
time a new SNI is used.

This should be backported as far as 2.6 (though on older releases,
ssl_gencert.c doesn't exist, and that function is found in ssl_sock.c).

This patch was submitted by Red Hat and AISLE Research.

DOC: stats: Document that stats admin is vulnerable to a CSRF attack

Document that stats admin is vulnerable to a CSRF attack that can't be
totally mitigated, so if that feature really has to be used, precautions
must be taken.

This should be backported up to 2.6.

Many thanks to Red Hat and AISLE Research for reporting this.

BUG/MEDIUM: stats: Ensure that Origin is valid on POSTs

When receiving a POST on the stats interface, make sure that Origin (or
Referer if no Origin is found) matches the Host, to attempt to mitigate
a CSRF attack.
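
The kind of comparison involved can be sketched as follows. This is a
simplified illustration and not the code from the patch: the function name
is hypothetical, the comparison is an exact byte match on the authority
part, and default ports or case differences are deliberately not handled:

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if the authority of <origin> (an Origin or Referer value such
 * as "https://host:port" or "https://host:port/path") is exactly <host>,
 * otherwise 0.
 */
static int sk_origin_matches_host(const char *origin, const char *host)
{
	const char *auth, *end;
	size_t alen;

	if (!origin || !host || !*host)
		return 0;

	/* skip the scheme */
	auth = strstr(origin, "://");
	if (!auth)
		return 0;
	auth += 3;

	/* the authority ends at the first '/' (Referer) or at the end (Origin) */
	end = strchr(auth, '/');
	alen = end ? (size_t)(end - auth) : strlen(auth);

	return alen == strlen(host) && memcmp(auth, host, alen) == 0;
}
```

A POST whose Origin does not match would then be rejected before any admin
action is processed.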

This should be backported up to 2.6.

Many thanks to Red Hat and AISLE Research for reporting this.