git.ipfire.org Git - thirdparty/haproxy.git/log

MEDIUM: systemd: implement directory loading

Red Hat-based systems already use a CFGDIR variable to load configuration
files from a directory; this patch implements the same feature.

It now requires that /etc/haproxy/conf.d exists or the service won't be
able to start.

REORG/MINOR: cfgparse: eliminate code duplication by lshift_args()

There were similar parts of the code in "no" and "default" prefix
keywords handling. This duplication caused the bug once.

No backport needed.

BUG/MINOR: cfgparse: fix "default" prefix parsing

Fix the left shift of args when "default" prefix matches. The cause of the
bug was the absence of zeroing of the right element during the shift. The
same bug for "no" prefix was fixed by commit 0f99e3497, but missed for
"default".

The shift of ("default", "option", "dontlog-normal")
    produced ("option", "dontlog-normal", "dontlog-normal")
  instead of ("option", "dontlog-normal", "")

As an example, a valid config line:
    default option dontlog-normal

caused a parse error:
[ALERT]    (32914) : config : parsing [bug-default-prefix.cfg:22] : 'option dontlog-normal' cannot handle unexpected argument 'dontlog-normal'.

The patch should be backported to all stable versions, since the absence of
zeroing was introduced with "default" keyword.
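
The shift described above can be sketched as follows. This is only a
minimal illustration, not the cfgparse code itself: the function name
shift_args_left() and its signature are assumptions made for the example.

```c
#include <stddef.h>

/* Hypothetical helper, for illustration only: shifts the first <count>
 * entries of <args> one slot to the left, dropping args[0]. The slot that
 * becomes free at the right end is pointed at an empty string, which is
 * the step that was missing for the "default" prefix.
 */
void shift_args_left(const char **args, size_t count)
{
	size_t i;

	if (count == 0)
		return;

	for (i = 0; i + 1 < count; i++)
		args[i] = args[i + 1];

	/* without this, the last word would remain duplicated */
	args[count - 1] = "";
}
```

Applied to ("default", "option", "dontlog-normal"), this sketch yields
("option", "dontlog-normal", "").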

REGTESTS: jwe: Fix tests of algorithms not supported by AWS-LC

Many tests use the A128KW algorithm which is not supported by AWS-LC but
instead of removing those tests we will just have a hardcoded value set
by default in this case.

MINOR: jwe: Some algorithms not supported by AWS-LC

AWS-LC does not have EVP_aes_128_wrap or EVP_aes_192_wrap so the A128KW
and A192KW algorithms will not be supported for JWE token decryption.

DOC: jwe: Add doc for jwt_decrypt converters

Add doc for jwt_decrypt_secret and jwt_decrypt_cert converters.

REGTESTS: jwe: Add jwt_decrypt_secret and jwt_decrypt_cert tests

Test the new jwt_decrypt converters.

MINOR: jwe: Add new jwt_decrypt_cert converter

This converter checks the validity and decrypts the content of a JWE
token that has an asymmetric "alg" algorithm (RSA). In such a case, we
must provide a path to an already loaded certificate and private key
that has the "jwt" option set to "on".

MINOR: jwe: Add new jwt_decrypt_secret converter

This converter checks the validity and decrypts the content of a JWE
token that has a symmetric "alg" algorithm. In such a case, we only
require a secret as parameter in order to decrypt the token.

REGTESTS: ssl: Add tests for new aes cbc converters

This test mimics what was already done for the aes_gcm converters. Some
data is encrypted and directly decrypted and we ensure that the output
was not changed.

MINOR: ssl: Add new aes_cbc_enc/_dec converters

These converters allow encrypting or decrypting data with AES in Cipher
Block Chaining mode. They work the same way as the existing
aes_gcm_enc/_dec ones, apart from the AEAD tag notion, which is not
supported in CBC mode.

MINOR: ssl: Factorize AES GCM data processing

The parameter parsing and processing and the actual crypto part of the
aes_gcm converter are interleaved. This patch puts the crypto parts in a
dedicated function for better reuse in the upcoming JWE processing.

MEDIUM: proxy: force traffic on unpublished/disabled backends

A recent patch has introduced a new state for proxies : unpublished
backends. Such backends won't be eligible for traffic, thus
use_backend/default_backend rules which target them won't match and
content switching rules processing will continue.

This patch defines a new frontend keyword 'force-be-switch'. This
keyword allows ignoring the unpublished or disabled state. Thus,
use_backend/default_backend will match even if the target backend is
unpublished or disabled. This is useful to be able to test a backend
instance before exposing it outside.

This new keyword is converted into a persist rule of new type
PERSIST_TYPE_BE_SWITCH, stored in persist_rules list proxy member. This
is the only persist rule applicable to frontend side. Prior to this
commit, pure frontend proxies persist_rules list were always empty.

This new feature requires an adjustment in process_switching_rules(). Now,
when a use_backend/default_backend rule matches with a non-eligible
backend, frontend persist_rules are inspected to detect if a
force-be-switch is present so that the backend may be selected.

MINOR: cfgparse: adapt warnif_cond_conflicts() error output

Utility function warnif_cond_conflicts() is used when parsing an ACL.
Previously, the function directly called ha_warning() to report an error.
Change the function so that it now takes the error message as argument.
Caller can then output it as wanted.

This change is necessary to use the function when parsing a keyword
registered as cfg_kw_list. The next patch will reuse it.

MINOR: stats: report BE unpublished status

A previous patch defines a new proxy status : unpublished backends. This
patch extends this by changing proxy status reported in stats. If
unpublished is set, an extra "(UNPUB)" is added to the field.

HTML stats are also slightly updated. If a backend is up but
unpublished, its status will be reported in orange color.

MEDIUM: proxy: implement publish/unpublish backend CLI

Define a new set of CLI commands publish/unpublish backend <be>. The
objective is to be able to change the status of a backend to
unpublished. Such a backend is considered ineligible for traffic: this
allows skipping use_backend rules which target it.

Note that contrary to disabled/stopped proxies, an unpublished backend
still has server checks running on it.

Internally, a new proxy flag PR_FL_BE_UNPUBLISHED is defined. CLI
command handlers "publish backend" and "unpublish backend" are executed
under thread isolation. This guarantees that the flag can safely be set
or removed in the CLI handlers, and read during content-switching
processing.

MEDIUM: proxy: do not select a backend if disabled

A proxy can be marked as disabled using the keyword with the same name.
The doc mentions that it won't process any traffic. However, this is not
really the case for backends as they may still be selected via switching
rules during stream processing.

In fact, currently access to disabled backends will be conducted up to
assign_server(). However, no eligible server is found at this stage,
resulting in a connection closure or an HTTP 503, which is expected. So
in the end, servers in disabled backends won't receive any traffic. But
this is only because post-parsing steps are not performed on such
backends. Thus, this can be considered as functional but only via
side-effects.

This patch clarifies the handling of disabled backends, so that they are
never selected via switching rules. Now, process_switching_rules() will
ignore disabled backends and continue rules evaluation.

As this is a behavior change, this patch is labelled as medium. The
documentation manual for use_backend is updated accordingly.

REGTESTS: add test on backend switching rules selection

Create a new test to ensure that switching rules selection is fine.
Currently, this checks that dynamic backend switching works as expected.
If a matching rule is resolved to a nonexistent backend, the default
backend is used instead.

This regtest should be useful as switching-rules will be extended in a
future set of patches to add new abilities on backends, linked to
dynamic backend support.

MEDIUM: stream: refactor switching-rules processing

This commit rewrites process_switching_rules() function. The objective
is to simplify backend selection so that a single unified
stream_set_backend() call is kept, both for regular and default backends
case.

This patch will be useful to add new capabilities on backends, in the
context of dynamic backend support implementation.

BUG/MINOR: proxy: free persist_rules

force-persist proxy keyword is converted into a persist_rule, stored in
proxy persist_rules list member. Each new rule is dynamically allocated
during parsing.

This commit fixes the memory leak on deinit due to a missing free on
persist_rules list entries. This is done via deinit_proxy()
modification. Each rule in the list is freed, along with its associated
ACL condition type.

This can be backported to every stable version.

MEDIUM: thread: Turn the group mask in thread set into a group counter

If we want to be able to have more than 64 thread groups, we can no
longer store thread group masks in a long.
One remaining place where it is done is in struct thread_set. However,
it is not really used as a mask anywhere, all we want is a thread group
counter, so convert that mask to a counter.

BUG/MEDIUM: queues: Fix arithmetic when filling non_empty_tgids

Fix the arithmetic when pre-filling non_empty_tgids when we still have
more than 32/64 thread groups left, to get the right index, we of course
have to divide the number of thread groups by the number of bits in a
long.
This bug was introduced by commit
7e1fed4b7a8b862bf7722117f002ee91a836beb5, but hopefully was not hit
because it requires having at least as many thread groups as there are
bits in a long, which is impossible on 64-bit machines, as MAX_TGROUPS
is still 32.
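
The index arithmetic involved when pre-filling such a bitmap can be
sketched as below. This is a simplified illustration under assumptions:
the names bitmap_fill_low() and BITS_PER_LONG are made up for the example
and this is not the queue code nor the ha_bit_* API.

```c
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical helper: sets the <nbits> lowest bits of the bitmap <map>,
 * made of <nlongs> unsigned longs, and clears the others. Returns 0 on
 * success or -1 if the bitmap is too small.
 */
int bitmap_fill_low(unsigned long *map, size_t nlongs, size_t nbits)
{
	/* the word index is the bit count divided by the bits per long */
	size_t full = nbits / BITS_PER_LONG;
	size_t rem  = nbits % BITS_PER_LONG;
	size_t i;

	if (full + (rem ? 1 : 0) > nlongs)
		return -1;

	for (i = 0; i < nlongs; i++)
		map[i] = 0;

	/* words that are entirely covered */
	for (i = 0; i < full; i++)
		map[i] = ~0UL;

	/* remaining bits land in the next word */
	if (rem)
		map[full] = (1UL << rem) - 1;

	return 0;
}
```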

MINOR: threads: Eliminate all_tgroups_mask.

Now that it is unused, eliminate all_tgroups_mask, as we can't use 64-bit
masks to represent thread groups if we want to be able to have more
than 64 thread groups.

MINOR: queues: Turn non_empty_tgids into a long array.

In order to be able to have more than 64 thread groups, turn
non_empty_tgids into a long array, so that we have enough bits to
represent every thread group, and manipulate it with the ha_bit_*
functions.

BUG/MINOR: http_act: fix deinit performed on uninitialized lf_expr in release_http_map()

As reported by GH user @Lzq-001 on issue #3245, the config below would
cause haproxy to SEGFAULT after having reported an error:

frontend 0000000
http-request set-map %[hdr(0000)0_

Root cause is simple, in parse_http_set_map(), we define the release
function (which is responsible to clear lf_expr expressions used by the
action), prior to initializing the expressions, while the release
function assumes the expressions are always initialized.

For all similar actions, we already perform the init prior to setting
the related release function, but this was not the case for
parse_http_set_map(). We fix the bug by initializing the expressions
earlier.

Thanks to @Lzq-001 for having reported the issue and provided a simple
reproducer.

It should be backported to all stable versions. Note that for versions
prior to 3.0, lf_expr_init() should be replaced by LIST_INIT(), see
6810c41 ("MEDIUM: tree-wide: add logformat expressions wrapper")

MEDIUM: counters: mostly revert da813ae4d7cb77137ed

Contrary to what was previously believed, there are corner cases where
the counters may not be allocated, and we may want to make them optional
at a later date, so we have to check if those counters are there.
However, just checking that shared.tg is non-NULL is enough, we can then
assume that shared.tg[tgid - 1] has properly been allocated too.
Also modify the various COUNTER_SHARED_* macros to make sure they check
for that too.

BUG/MEDIUM: quic: fix ACK ECN frame parsing

ACK frames are either of type 0x02 or 0x03. The latter is an indication
that it contains extra ECN related fields. In haproxy QUIC stack, this
is considered as a different frame type, set to QUIC_FT_ACK_ECN, with
its own set of builder/parser functions.

This patch fixes ACK ECN parsing function. Indeed, the latter suffered
from two issues. First, 'first ACK range' and 'ACK ranges' were
inverted. Then, the three remaining ECN fields were simply ignored by
the parsing function.

This issue can cause desynchronization in the frames parsing code, which
may have various results. Most of the time, the connection will be
aborted by haproxy due to an invalid frame content read.

Note that this issue was not detected earlier as most clients do not
enable ECN support if the peer is not able to emit ACK ECN frame first,
which haproxy currently never sends. Nevertheless, this is not the case
for every client implementation, thus proper ACK ECN parsing is
mandatory for a proper QUIC stack support.

Fix this by adjusting quic_parse_ack_ecn_frame() function. The remaining
ECN fields are parsed to ensure correct packet parsing. Currently, they
are not used by the congestion controller.
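
For reference, the field order of an ACK frame of type 0x03 as laid out in
RFC 9000 (section 19.3) can be sketched as follows. This is a standalone
illustration, not quic_parse_ack_ecn_frame(): the names dec_varint(),
parse_ack_ecn() and struct ack_ecn are assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Decodes one QUIC variable-length integer (RFC 9000, section 16) at
 * <*pos>. Advances <*pos> and returns 1 on success, 0 on short input.
 */
int dec_varint(const uint8_t **pos, const uint8_t *end, uint64_t *val)
{
	size_t len, i;
	uint64_t v;

	if (*pos >= end)
		return 0;
	len = (size_t)1 << (**pos >> 6);	/* 1, 2, 4 or 8 bytes */
	if ((size_t)(end - *pos) < len)
		return 0;
	v = **pos & 0x3f;
	for (i = 1; i < len; i++)
		v = (v << 8) | (*pos)[i];
	*pos += len;
	*val = v;
	return 1;
}

struct ack_ecn {
	uint64_t largest, delay, range_count, first_range;
	uint64_t ect0, ect1, ecn_ce;
};

/* Parses the body of an ACK frame with ECN counts, the frame type having
 * already been consumed. Returns 1 on success, 0 on truncated input.
 */
int parse_ack_ecn(const uint8_t **pos, const uint8_t *end, struct ack_ecn *f)
{
	uint64_t i, gap, len;

	/* ACK Range Count comes before First ACK Range */
	if (!dec_varint(pos, end, &f->largest) ||
	    !dec_varint(pos, end, &f->delay) ||
	    !dec_varint(pos, end, &f->range_count) ||
	    !dec_varint(pos, end, &f->first_range))
		return 0;

	/* each additional range is a (Gap, ACK Range Length) pair */
	for (i = 0; i < f->range_count; i++) {
		if (!dec_varint(pos, end, &gap) || !dec_varint(pos, end, &len))
			return 0;
	}

	/* the three ECN counts must be consumed even if unused, otherwise
	 * the next frame would be read at the wrong offset
	 */
	return dec_varint(pos, end, &f->ect0) &&
	       dec_varint(pos, end, &f->ect1) &&
	       dec_varint(pos, end, &f->ecn_ce);
}
```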

This must be backported up to 2.6.

BUG/MEDIUM: threads: Fix binding thread on bind.

The code to parse the "thread" keyword on bind lines was changed to
check the thread numbers against the value provided with
max-threads-per-group, if any. However, at the time those "thread"
keywords are parsed, max-threads-per-group may not have been set yet, and
that breaks the feature. Revert to checking against MAX_THREADS_PER_GROUP
instead; it should have no major impact.

MEDIUM: counters: Remove some extra tests

Before updating counters, a few tests are made to check if the counters
exist, but those counters should always exist at this point, so just
remove them.
This commit should have no impact, but can easily be reverted with no
functional impact if various crashes appear.

MEDIUM: counters: Dynamically allocate per-thread group counters

Instead of statically allocating the per-thread group counters,
based on the max number of thread groups available, allocate
them dynamically, based on the number of thread groups actually
used. That way we can increase the maximum number of thread
groups without using an unreasonable amount of memory.
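
The idea can be sketched as follows, together with the non-NULL check on
the array before touching a per-group entry. The structure and function
names are hypothetical and much simpler than the real counters code.

```c
#include <stdlib.h>

/* Hypothetical per-thread-group counters, for illustration only */
struct tg_counters_sketch {
	unsigned long long cum_conn;
};

struct shared_sketch {
	struct tg_counters_sketch *tg;	/* one entry per group in use, or NULL */
	unsigned int nb_tg;
};

/* Allocates one zeroed entry per thread group actually used instead of a
 * fixed maximum. Returns 1 on success, 0 on allocation failure.
 */
int counters_alloc(struct shared_sketch *sc, unsigned int nb_tgroups)
{
	sc->nb_tg = 0;
	sc->tg = calloc(nb_tgroups, sizeof(*sc->tg));
	if (!sc->tg)
		return 0;
	sc->nb_tg = nb_tgroups;
	return 1;
}

/* <tgid> is 1-based. Nothing is done if the counters were not allocated. */
void counters_inc_conn(struct shared_sketch *sc, unsigned int tgid)
{
	if (!sc->tg || tgid == 0 || tgid > sc->nb_tg)
		return;
	sc->tg[tgid - 1].cum_conn++;
}

void counters_free(struct shared_sketch *sc)
{
	free(sc->tg);
	sc->tg = NULL;
	sc->nb_tg = 0;
}
```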

BUG/MINOR: net_helper: fix IPv6 header length processing

The IPv6 header contains a payload length that excludes the 40 bytes of
IPv6 packet header, which differs from IPv4's total length which includes
it. As a result, the parser was wrong and would only see the IP part and
not the TCP one unless sufficient options were present to cover it.

This issue came in 3.4-dev2 with recent commit e88e03a6e4 ("MINOR:
net_helper: add ip.fp() to build a simplified fingerprint of a SYN"),
so no backport is needed.
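
The difference between the two length fields can be sketched like this.
The function name ip_total_len() is an assumption for the example, and the
sketch ignores IPv6 jumbograms, where the payload length field is zero.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns the total length in bytes announced by the
 * IP packet starting at <pkt> (<len> bytes available), or 0 if the version
 * is unknown or the fixed header is truncated.
 */
size_t ip_total_len(const uint8_t *pkt, size_t len)
{
	if (len < 1)
		return 0;

	switch (pkt[0] >> 4) {
	case 4:
		if (len < 20)
			return 0;
		/* IPv4: bytes 2-3 hold the total length, header included */
		return ((size_t)pkt[2] << 8) | pkt[3];
	case 6:
		if (len < 40)
			return 0;
		/* IPv6: bytes 4-5 hold the payload length, which excludes
		 * the 40-byte fixed header
		 */
		return 40 + (((size_t)pkt[4] << 8) | pkt[5]);
	default:
		return 0;
	}
}
```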

BUG/MINOR: hlua_fcn: ensure Patref:add_bulk() is given a table object before using it

As reported by GH user @kanashimia in GH #3241, providing anything else
than a table to Patref:add_bulk() method could cause a segfault because
we were calling lua_next() with the lua object without ensuring it
actually is a table.

Let's add the missing lua_istable() check on the stack object before
calling lua_next() function on it.

It should be backported up to 3.2 with 884dc62 ("MINOR: hlua_fcn:
add Patref:add_bulk()")

BUG/MINOR: hlua_fcn: fix broken yield for Patref:add_bulk()

In GH #3241, GH user @kanashimia reported that the Patref:add_bulk()
method would raise a Lua exception when called with more than 101
elements at once.

As identified by @kanashimia there was an error in the way the
add_bulk() method was forced to yield after 101 elements precisely.
The yield is there to ensure Lua doesn't eat too many resources at
once and doesn't impact haproxy's core responsiveness, but the check
for the yield was misplaced resulting in improper stack content upon
resume.

Thanks to user @kanashimia who even provided a reproducer which helped
a lot to troubleshoot the issue.

This fix should be backported up to 3.2 with 884dc62 ("MINOR: hlua_fcn:
add Patref:add_bulk()") where the bug was introduced.

BUG/MINOR: stats-file: Use a 16bits variable when loading tgid

Now that the tgid stored in the stats file has been increased to 16bits
by commit 022cb3ab7fdce74de2cf24bea865ecf7015e5754, don't forget to
increase the variable size when reading it from the file, too.
This should have no impact given the maximum thread group limit is still
32.

MINOR: stats: Increase the tgid from 8bits to 16bits

Increase the size of the stored tgid in the stat file from 8bits to
16bits, so that we can have more than 256 thread groups. 65536 should be
enough for some time.

This bumps the stat file minor version, as the structure changes.

MINOR: receiver: Dynamically alloc the "members" field of shard_info

Instead of always allocating MAX_TGROUPS members, allocate them
dynamically, using the number of thread groups we'll use, so that
increasing MAX_TGROUPS will not have a huge impact on the structure
size.

CLEANUP: connection: Remove outdated note about CO_FL `0x00002000` being unused

This flag is used as of commit dcce9369129f6ca9b8eed6b451c0e20c226af2e3
("MINOR: connections: Add a new CO_FL_SSL_NO_CACHED_INFO flag"). This patch
should be backported to 3.3. Apparently dcce9369129 has been backported
to 3.2 and 3.1 already, with that change already applied, so no need for a
backport there.

MINOR: tcp-sample: permit retrieving tcp_info from the connection/session stage

The fc_xxx info that are retrieved over tcp_info could currently not
be accessed before a stream is created due to a test that verified the
existence of a stream. The rationale here was that the function works
both for frontend and backend. Let's always retrieve these info from
the session for the frontend case so that it now becomes possible to
set variables at connection/session time. The doc did not mention this
limitation so this could almost be considered as a bug.

MINOR: sample: also support retrieving fc.timer.handshake without a stream

Some timers, like the handshake timer, are stored in the session and are
only copied to the logs struct when a stream is created. But this means
we can't measure it without a stream, nor store it once for all in a
variable at session creation time. Let's extend the sample fetch function
to retrieve it from the session when no stream is present. The doc did not
mention this limitation so this could almost be considered as a bug.

MINOR: cfgparse: remove duplicate "force-persist" in common kw list

"force-persist" proxy keyword is listed twice in common_kw_list. This
patch removes the duplicated occurrence.

This could be backported up to 2.4.

MEDIUM: config: warn if some userlist hashes are too slow

It was reported in GH #2956 and more recently in GH #3235 that some
hashes are way too slow. The former triggers watchdog warnings during
checks, the second sees the config parsing take 20 seconds. This is
always due to the use of hash algorithms that are not suitable for use
in low-latency environments like web. They might be fine for a local
auth though. The difficulty, as explained by Philipp Hossner, is that
developers are not aware of this cost and adopt this without suspecting
any side effect.

The proposal here is to measure the crypt() call time and emit a warning
if it takes more than 10ms (which is already extreme). This was tested
by Philipp and confirmed to catch his case.
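
Such a measurement can be sketched as below. This is a generic
illustration, not the code added to the config parser: time_call_ms() and
warn_if_slow() are hypothetical names, and the timed call is passed as a
callback so that the sketch does not depend on crypt() being available.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <time.h>

/* Hypothetical helper: runs <fn>(<arg>) and returns the elapsed time in
 * milliseconds, measured with the monotonic clock.
 */
double time_call_ms(void (*fn)(void *), void *arg)
{
	struct timespec start, stop;

	clock_gettime(CLOCK_MONOTONIC, &start);
	fn(arg);
	clock_gettime(CLOCK_MONOTONIC, &stop);

	return (stop.tv_sec - start.tv_sec) * 1000.0 +
	       (stop.tv_nsec - start.tv_nsec) / 1e6;
}

/* Emits a warning on stderr and returns 1 if <elapsed_ms> exceeds
 * <limit_ms>, otherwise returns 0.
 */
int warn_if_slow(const char *what, double elapsed_ms, double limit_ms)
{
	if (elapsed_ms <= limit_ms)
		return 0;

	fprintf(stderr, "warning: %s took %.1f ms (limit: %.1f ms)\n",
	        what, elapsed_ms, limit_ms);
	return 1;
}
```

A caller would wrap the hash computation in a callback, pass it to
time_call_ms() and give the result to warn_if_slow() with a 10 ms limit.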

This is marked medium as it might start to report warnings on config
suffering from this problem without ever detecting it till now.

BUG/MINOR: ech/quic: enable ech configuration also for quic listeners

Patch dba4fd24 ("MEDIUM: ssl/ech: config and load keys") introduced
ECH configuration for bind lines, but the QUIC configuration parser
still suffers from not using the same code as the TCP/TLS one, so the
init for QUIC was missed.

Must be backported in 3.3.

CI: github: remove ERR=1 temporarily from the ECH job

The ECH job still fails to compile since the openssl 4.0 deprecated
functions were not removed yet. Let's remove ERR=1 temporarily.

We do know that there's a regression in OpenSSL 4.0 with these
reg-tests though:

Error: #    top  TEST reg-tests/ssl/set_ssl_crlfile.vtc FAILED (0.219) exit=2
Error: #    top  TEST reg-tests/ssl/set_ssl_cafile.vtc FAILED (0.236) exit=2
Error: #    top  TEST reg-tests/quic/set_ssl_crlfile.vtc FAILED (0.196) exit=2

REGTESTS: ssl: Fix reg-tests curve check

OpenSSL changed the output from "Server Temp Key" in prior versions to
"Peer Temp Key" in recent ones.
https://github.com/openssl/openssl/commit/a39dc27c2573da14e85ca8961970c82009bd4ff6
It looks like it affects OpenSSL >=3.5.0
This broke the reg-test for e.g. Debian 13 builds, using OpenSSL 3.5.1

Fixes bug #3238

Could be backported to every branch.

Signed-off-by: Christian Ruppert <idl0r@qasl.de>

BUG/MINOR: cli/stick-tables: argument to "show table" is optional

Discussed in issue #3187, the CLI help is confusing for the "show table"
command as it seems that the argument is mandatory.

This patch adds the arguments between square brackets to remove the
confusion.

BUILD: sockpair: fix build issue on macOS related to variable-length arrays

In GH issue #3226, Sergey Fedorov (@barracuda156) reported that since
commit 10c14a1ed0 ("MINOR: proto_sockpair: send_fd_uxst: init iobuf,
cmsghdr, cmsgbuf to zeros"), macOS 10.6.8 with gcc 14.3.0 doesn't build
anymore:

  src/proto_sockpair.c: In function 'send_fd_uxst':
  src/proto_sockpair.c:246:49: error: variable-sized object may not be initialized except with an empty initializer
    246 |         char cmsgbuf[CMSG_SPACE(sizeof(int))] = {0};
        |                                                 ^
  src/proto_sockpair.c:247:45: error: variable-sized object may not be initialized except with an empty initializer
    247 |         char buf[CMSG_SPACE(sizeof(int))] = {0};
        |                                             ^

Upon investigation, it appears that the CMSG_SPACE() macro on this OS
looks too complex for gcc to consider it as a constant, so it takes
these buffers for variable-length arrays and cannot initialize them.

Let's move to a simple memset() instead, which Sergey confirmed fixes
the problem.
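
The approach can be sketched as follows. This is a simplified stand-in for
send_fd_uxst(), written for illustration only: the function name
send_fd_sketch() and its exact layout are assumptions.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Hypothetical sketch: sends <fd> over the UNIX socket <sock> along with
 * one dummy byte. Returns the sendmsg() result.
 */
int send_fd_sketch(int sock, int fd)
{
	/* no "= {0}" initializer here: if the compiler does not see
	 * CMSG_SPACE() as a constant, this is a variable-length array
	 */
	char cmsgbuf[CMSG_SPACE(sizeof(int))];
	char iobuf[1];
	struct iovec iov;
	struct msghdr msg;
	struct cmsghdr *cmsg;

	/* memset() works whether or not the arrays are variable-length */
	memset(cmsgbuf, 0, sizeof(cmsgbuf));
	memset(iobuf, 0, sizeof(iobuf));
	memset(&msg, 0, sizeof(msg));

	iov.iov_base = iobuf;
	iov.iov_len = sizeof(iobuf);

	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = cmsgbuf;
	msg.msg_controllen = sizeof(cmsgbuf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(fd));

	return (int)sendmsg(sock, &msg, 0);
}
```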

This needs to be backported as far as 3.1. Thanks to Sergey for the
report, the bisect and testing the fix.

MINOR: cfgparse: Refactor "userlist" parser to print it in -dKall operation

This patch covers issue https://github.com/haproxy/haproxy/issues/3221.

The parser for the "userlist" section did not use the standard keyword
registration mechanism. Instead, it relied on a series of strcmp()
comparisons to identify keywords such as "group" and "user".

This had two main drawbacks:
1. The keywords were not discoverable by the "-dKall" dump option,
   making it difficult for users to see all available keywords for the
   section.
2. The implementation was inconsistent with the parsers for other
   sections, which have been progressively refactored to use the
   standard cfg_kw_list infrastructure.

This patch refactors the userlist parser to align it with the project's
standard conventions.

The parsing logic for the "group" and "user" keywords has been extracted
from the if/else block in cfg_parse_users() into two new dedicated
functions:
- cfg_parse_users_group()
- cfg_parse_users_user()

These two keywords are now registered via a dedicated cfg_kw_list,
making them visible to the rest of the HAProxy ecosystem, including the
-dKall dump.

BUG/MINOR: cfgparse: wrong section name upon error

When an unknown keyword was used in the "userlist" section, the error was
mentioning the "users" section, instead of "userlist".

Could be backported to every branch.

BUILD: tools: memchr definition changed in C23

New gcc and clang versions from fedora rawhide seem to use the C23
standard by default. This version changes the definition of some
string.h functions, which now return a const char * instead of a char *.

src/tools.c: In function ‘fgets_from_mem’:
src/tools.c:7200:17: warning: assignment discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]
7200 | new_pos = memchr(*position, '\n', size);
| ^

Strangely, -Wdiscarded-qualifiers does not seem to catch all the
memchr.
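
A const-correct use of memchr() that builds both before and after C23 can
be sketched as below. The helper name next_line() is hypothetical and this
is not the fgets_from_mem() code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns a pointer to the byte following the first
 * '\n' found in the <size> bytes at <buf>, or NULL if there is none. The
 * input is const, so the result is kept const as well, which matches what
 * C23 returns for a const argument.
 */
const char *next_line(const char *buf, size_t size)
{
	const char *nl = memchr(buf, '\n', size);

	return nl ? nl + 1 : NULL;
}
```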

Should fix issue #3228.

This could be backported in previous versions.

BUILD: ssl: strchr definition changed in C23

New gcc and clang versions from fedora rawhide seem to use the C23
standard by default. This version changes the definition of some
string.h functions, which now return a const char * instead of a char *.

src/ssl_sock.c: In function ‘SSL_CTX_keylog’:
src/ssl_sock.c:4475:17: error: assignment discards ‘const’ qualifier from pointer target type [-Werror=discarded-qualifiers]
4475 | lastarg = strrchr(line, ' ');

Strangely, -Wdiscarded-qualifiers does not seem to catch all the
strrchr.

Should fix issue #3228.

This could be backported in previous versions.

[RELEASE] Released version 3.4-dev2

Released version 3.4-dev2 with the following main changes :
    - BUG/MEDIUM: mworker/listener: ambiguous use of RX_F_INHERITED with shards
    - BUG/MEDIUM: http-ana: Properly detect client abort when forwarding response (v2)
    - BUG/MEDIUM: stconn: Don't report abort from SC if read0 was already received
    - BUG/MEDIUM: quic: Don't try to use hystart if not implemented
    - CLEANUP: backend: Remove useless test on server's xprt
    - CLEANUP: tcpcheck: Remove useless test on the xprt used for healthchecks
    - CLEANUP: ssl-sock: Remove useless tests on connection when resuming TLS session
    - REGTESTS: quic: fix a TLS stack usage
    - REGTESTS: list all skipped tests including 'feature cmd' ones
    - CI: github: remove openssl no-deprecated job
    - CI: github: add a job to test the master branch of OpenSSL
    - CI: github: openssl-master.yml misses actions/checkout
    - BUG/MEDIUM: backend: Do not remove CO_FL_SESS_IDLE in assign_server()
    - CI: github: use git prefix for openssl-master.yml
    - BUG/MEDIUM: mux-h2: synchronize all conditions to create a new backend stream
    - REGTESTS: fix error when no test are skipped
    - MINOR: cpu-topo: Turn the cpu policy configuration into a struct
    - MEDIUM: cpu-topo: Add a "threads-per-core" keyword to cpu-policy
    - MEDIUM: cpu-topo: Add a "cpu-affinity" option
    - MEDIUM: cpu-topo: Add a new "max-threads-per-group" global keyword
    - MEDIUM: cpu-topo: Add the "per-thread" cpu_affinity
    - MEDIUM: cpu-topo: Add the "per-ccx" cpu_affinity
    - BUG/MINOR: cpu-topo: fix -Wlogical-not-parentheses build with clang
    - DOC: config: fix number of values for "cpu-affinity"
    - MINOR: tools: add a secure implementation of memset
    - MINOR: mux-h2: add missing glitch count for non-decodable H2 headers
    - MINOR: mux-h2: perform a graceful close at 75% glitches threshold
    - MEDIUM: mux-h1: implement basic glitches support
    - MINOR: mux-h1: perform a graceful close at 75% glitches threshold
    - MEDIUM: cfgparse: acknowledge that proxy ID auto numbering starts at 2
    - MINOR: cfgparse: remove useless checks on no server in backend
    - OPTIM/MINOR: proxy: do not init proxy management task if unused
    - MINOR: patterns: preliminary changes for reorganization
    - MEDIUM: patterns: reorganize pattern reference elements
    - CLEANUP: patterns: remove dead code
    - OPTIM: patterns: cache the current generation
    - MINOR: tcp: add new bind option "tcp-ss" to instruct the kernel to save the SYN
    - MINOR: protocol: support a generic way to call getsockopt() on a connection
    - MINOR: tcp: implement the get_opt() function
    - MINOR: tcp_sample: implement the fc_saved_syn sample fetch function
    - CLEANUP: assorted typo fixes in the code, commits and doc
    - BUG/MEDIUM: cpu-topo: Don't forget to reset visited_ccx.
    - BUG/MAJOR: set the correct generation ID in pat_ref_append().
    - BUG/MINOR: backend: fix the conn_retries check for TFO
    - BUG/MINOR: backend: inspect request not response buffer to check for TFO
    - MINOR: net_helper: add sample converters to decode ethernet frames
    - MINOR: net_helper: add sample converters to decode IP packet headers
    - MINOR: net_helper: add sample converters to decode TCP headers
    - MINOR: net_helper: add ip.fp() to build a simplified fingerprint of a SYN
    - MINOR: net_helper: prepare the ip.fp() converter to support more options
    - MINOR: net_helper: add an option to ip.fp() to append the TTL to the fingerprint
    - MINOR: net_helper: add an option to ip.fp() to append the source address
    - DOC: config: fix the length attribute name for stick tables of type binary / string
    - MINOR: mworker/cli: only keep positive PIDs in proc_list
    - CLEANUP: mworker: remove duplicate list.h include
    - BUG/MINOR: mworker/cli: fix show proc pagination using reload counter
    - MINOR: mworker/cli: extract worker "show proc" row printer
    - MINOR: cpu-topo: Factorize code
    - MINOR: cpu-topo: Rename variables to better fit their usage
    - BUG/MEDIUM: peers: Properly handle shutdown when trying to get a line
    - BUG/MEDIUM: mux-h1: Take care to update <kop> value during zero-copy forwarding
    - MINOR: threads: Avoid using a thread group mask when stopping.
    - MINOR: hlua: Add support for lua 5.5
    - MEDIUM: cpu-topo: Add an optional directive for per-group affinity
    - BUG/MEDIUM: mworker: can't use signals after a failed reload
    - BUG/MEDIUM: stconn: Move data from <kip> to <kop> during zero-copy forwarding
    - DOC: config: fix a few typos and refine cpu-affinity
    - MINOR: receiver: Remove tgroup_mask from struct shard_info
    - BUG/MINOR: quic: fix deprecated warning for window size keyword

BUG/MINOR: quic: fix deprecated warning for window size keyword

QUIC configuration was cleaned up in the previous release. Several
global keyword names were changed to unify the configuration. For each
of them the older keyword is marked as deprecated, with a warning to
mention the newer alternative.

This patch fixes the warning for 'tune.quic.frontend.default-max-size'
as the alternative proposed was not correct. The proper value now is
'tune.quic.fe.cc.max-win-size'.

This must be backported up to 3.3.

MINOR: receiver: Remove tgroup_mask from struct shard_info

The only purpose of tgroup_mask seems to be to calculate how many
tgroups share the same shard, but this is information we can
calculate differently, we just have to increment the number when a new
receiver is added to the shard, and decrement it when one is detached
from the shard. Removing thread group masks will allow us to increase
the maximum number of thread groups past 64.

DOC: config: fix a few typos and refine cpu-affinity

There were two typos in the recently updated parts about per-group.
Also, change the commas to ':' after the options values, as sometimes
it would be confusing. Last, place quotes around keyword names so that
they're explicitly referred to as language keywords. No backport is
needed.

BUG/MEDIUM: stconn: Move data from <kip> to <kop> during zero-copy forwarding

The <kip> of producer was not forwarded to <kop> of consumer when zero-copy
data forwarding was tried. Because of the issue, the chunking of emitted H1
messages could be invalid.

To fix the bug, sc_ep_fwd_kip() must be called at this stage.

This fix is related to the previous one (529a8dbfb "BUG/MEDIUM: mux-h1: Take
care to update <kop> value during zero-copy forwarding"). Both are required
to fully fix the issue #3230.

This patch must be backported to 3.3.

BUG/MEDIUM: mworker: can't use signals after a failed reload

In issue #3229 it was reported that the master couldn't reload after a
failed reload following a wrong configuration.

It is still possible to do a reload using the "reload" command of the
master CLI. But all signals are blocked.

The problem was introduced in 709cde6d0 ("BUG/MEDIUM: mworker: signals
inconsistencies during startup and reload") which fixes the blocking of
signals during the reload.

However, the patch missed a case: run_master_in_recovery_mode() is not
called when the worker fails to parse the configuration, it is only
called when the master is failing.

To handle this case, the mworker_unblock_signals() function must be
called upon mworker_on_new_child_failure(). But since this is called in
an haproxy signal handler it would mess with the signals.

Instead, the patch adds a task which is started by the signal handler,
and restores the signals outside of it.

This must be backported as far as 3.1.

MEDIUM: cpu-topo: Add an optional directive for per-group affinity

When using per-group affinity, add an optional new directive. It accepts
two values: "auto", the new default, where the available CPUs are split
equally across the groups when multiple thread groups are created, and
"loose", the old default, where all groups are bound to all available
CPUs.

MINOR: hlua: Add support for lua 5.5

Lua 5.5 adds an extra argument to lua_newstate(). Since there are
already a few other ifdefs in hlua.c checking for the Lua version,
and there's a single call place, let's do the same here. This should
be safe for backporting if needed.

Signed-off-by: Mike Lothian <mike@fireburn.co.uk>

MINOR: threads: Avoid using a thread group mask when stopping.

Remove the "stopped_tgroup_mask" variable, that indicated which thread
groups were stopping, and instead just use "stopped_tgroups", a counter
indicating how many thread groups are stopping. We want to remove all
thread group masks, so that we can increase the maximum number of thread
groups past 64.
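
As an illustration of why a counter scales where a 64-bit mask does not,
here is a minimal sketch. The names stopped_tgroups_sketch and
tgroup_mark_stopped() are hypothetical and not taken from the haproxy
sources; only the idea of counting stopping groups comes from the
message above.

```c
#include <stdatomic.h>

/* Hypothetical counter: number of thread groups that reported stopping.
 * Unlike a bit mask, it is not limited to 64 groups.
 */
static atomic_uint stopped_tgroups_sketch;

/* Called once by each thread group when it starts stopping.
 * Returns 1 when the caller is the last of <nb_tgroups> groups to stop,
 * 0 otherwise.
 */
static int tgroup_mark_stopped(unsigned int nb_tgroups)
{
	unsigned int prev = atomic_fetch_add(&stopped_tgroups_sketch, 1);

	return prev + 1 == nb_tgroups;
}
```

With three groups, the first two calls return 0 and the third returns 1.
The counter cannot tell which groups stopped, only how many.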

BUG/MEDIUM: mux-h1: Take care to update <kop> value during zero-copy forwarding

Since the extra field was removed from the HTX structure, a regression was
introduced when forwarding chunked messages. The <kop> value was not
decreased as it should be when data were sent via the zero-copy
forwarding. Because of this bug, it was possible to announce a chunk size
larger than the chunk data sent.

To fix the bug, a helper function was added to properly update the <kop>
value when a chunk size is emitted. This function is now called when a new
chunk is announced, including during zero-copy forwarding.

As a workaround, "tune.disable-zero-copy-forwarding" or just
"tune.h1.zero-copy-fwd-send off" can be set in the global section.

This patch should fix the issue #3230. It must be backported to 3.3.

BUG/MEDIUM: peers: Properly handle shutdown when trying to get a line

When a shutdown was reported to a peer applet, the event was not properly
handled if it failed to receive data. The function responsible for getting
data was exiting too early if the applet buffer was empty, without testing
the sedesc status. Because of this issue, it was possible to have frozen
peer applets. For instance, it happened on client timeout. With too many
frozen applets, it was possible to reach the maxconn.

This patch should fix the issue #3234. It must be backported to 3.3.

MINOR: cpu-topo: Rename variables to better fit their usage

Rename "visited_tsid" and "visited_ccx" to "touse_tsid" and
"touse_ccx". Contrary to visited_ccx_set and visited_cl_set, they are
not there to remember which tsid/ccx we already visited, they are there
to know which tsid/ccx we should use, so make that clear.

MINOR: cpu-topo: Factorize code

Factorize the code common to cpu_policy_group_by_ccx() and
cpu_policy_group_by_cluster() into a new function,
cpu_policy_assign_threads().

MINOR: mworker/cli: extract worker "show proc" row printer

Introduce cli_append_worker_row() to centralize formatting of a single
worker row. Also, replace duplicated row-printing code in both current
and old workers loops with the helper. Motivation: Reduces LOC and
improves readability by removing duplication.

BUG/MINOR: mworker/cli: fix show proc pagination using reload counter

After commit 594408cd612b5 ("BUG/MINOR: mworker/cli: 'show proc' is limited
by buffer size"), related to ticket #3204, the "show proc" logic
has been fixed to be able to print more than 202 processes. However, this
fix can lead to the omission of entries in case they have the same
timestamp.

To fix this, we use the unique reload counter instead of the timestamp.
On partial flush, set ctx->next_reload = child->reloads.
On resume skip entries with child->reloads >= ctx->next_reload.
Finally, we clear ctx->next_reload at the end of a complete dump so
subsequent show proc starts from the top.

Could be backported in all stable branches.

CLEANUP: mworker: remove duplicate list.h include

Drop the second #include <haproxy/list.h> from mworker.c.
No functional change; reduces redundancy and keeps includes tidy.

MINOR: mworker/cli: only keep positive PIDs in proc_list

Change mworker_env_to_proc_list() to check that child->pid > 0 before
LIST_APPEND, avoiding invalid PIDs (0/-1) in the process list.
This has no functional impact beyond stricter validation and it aligns
with existing kill safeguards.

DOC: config: fix the length attribute name for stick tables of type binary / string

The stick-table doc was reworked and moved in 3.2 with commit da67a89f3
("DOC: config: move stick-tables and peers to their own section"), however
the optional length attribute for binary/string types was mistakenly
spelled "length" while it's "len".

This must be backported to 3.2.

MINOR: net_helper: add an option to ip.fp() to append the source address

The new value 4 will permit to append the source address to the
fingerprint, making it easier to build rules checking a specific path.

MINOR: net_helper: add an option to ip.fp() to append the TTL to the fingerprint

With mode value 1, the TTL will be appended immediately after the 7 bytes,
making it an 8-byte fingerprint.

MINOR: net_helper: prepare the ip.fp() converter to support more options

It can make sense to support extra components in the fingerprint to ease
configuration, so let's change the 0/1 value to a bit field. We also turn
the current 1 (TCP options list) to 2 so that we'll reuse 1 for the TTL.
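
A minimal sketch of how such a bit field can be decoded follows. The
enum and function names are hypothetical; only the numeric values come
from the commit messages (1 for the TTL, 2 for the TCP options list, and
4 for the source address, which is added by a later commit). Rejecting
unknown bits is an assumption, not something the messages state.

```c
/* Hypothetical names; only the values 1, 2 and 4 come from the commit
 * messages.
 */
enum ip_fp_mode_sketch {
	IP_FP_TTL_SKETCH      = 1, /* append the TTL */
	IP_FP_TCP_OPTS_SKETCH = 2, /* append the TCP options list */
	IP_FP_SRC_ADDR_SKETCH = 4, /* append the source address */
};

/* Returns non-zero when <mode> only uses known bits (assumed policy). */
static int ip_fp_mode_is_valid(unsigned int mode)
{
	const unsigned int known = IP_FP_TTL_SKETCH |
	                           IP_FP_TCP_OPTS_SKETCH |
	                           IP_FP_SRC_ADDR_SKETCH;

	return (mode & ~known) == 0;
}

/* Returns non-zero when <mode> requests the component <bit>. */
static int ip_fp_mode_has(unsigned int mode, enum ip_fp_mode_sketch bit)
{
	return (mode & (unsigned int)bit) != 0;
}
```

With a bit field, a mode of 3 requests both the TTL and the TCP options
list, which the former 0/1 value could not express.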

MINOR: net_helper: add ip.fp() to build a simplified fingerprint of a SYN

Here we collect all the stuff that depends on the sender's settings,
such as TOS, IP version, TTL range, presence of DF bit or IP options,
presence of DATA in the SYN, CWR+ECE flags, TCP header length, wscale,
initial window, mss, as well as the list of TCP extension kinds. It's
obviously fairly limited but can help avoid blacklisting certain
valid clients sharing the same IP address as a misbehaving one.

It supports both a short and a long mode depending on the argument.
These can be used with the tcp-ss bind option. The doc was updated
accordingly.

MINOR: net_helper: add sample converters to decode TCP headers

This adds the following converters, used to decode fields
in an incoming tcp header:

   tcp.dst, tcp.flags, tcp.seq, tcp.src, tcp.win,
   tcp.options.mss, tcp.options.tsopt, tcp.options.tsval,
   tcp.options.wscale, tcp.options_list

These can be used with the tcp-ss bind option. The doc was updated
accordingly.

MINOR: net_helper: add sample converters to decode IP packet headers

This adds a few converters that help decode parts of IP packets:
  - ip.data : returns the next header (typically TCP)
  - ip.df   : returns the dont-fragment flags
  - ip.dst  : returns the destination IPv4/v6 address
  - ip.hdr  : returns only the IP header
  - ip.proto: returns the upper level protocol (udp/tcp)
  - ip.src  : returns the source IPv4/v6 address
  - ip.tos  : returns the TOS / TC field
  - ip.ttl  : returns the TTL/HL value
  - ip.ver  : returns the IP version (4 or 6)

These can be used with the tcp-ss bind option. The doc was updated
accordingly.

MINOR: net_helper: add sample converters to decode ethernet frames

This adds a few converters that help decode parts of ethernet frame
headers:
  - eth.data : returns the next header (typically IP)
  - eth.dst  : returns the destination MAC address
  - eth.hdr  : returns only the ethernet header
  - eth.proto: returns the ethernet proto
  - eth.src  : returns the source MAC address
  - eth.vlan : returns the VLAN ID when present

These can be used with the tcp-ss bind option. The doc was updated
accordingly.

BUG/MINOR: backend: inspect request not response buffer to check for TFO

In 2.6, do_connect_server() was introduced by commit 0a4dcb65f ("MINOR:
stream-int/backend: Move si_connect() in the backend scope") and changed
the approach to work with a stream instead of a stream-interface. However
si_oc(si) was wrongly turned to &s->res instead of &s->req, which breaks
TFO by always inspecting the response channel to figure whether there are
data pending.

This fix can be backported to all versions till 2.6.

BUG/MINOR: backend: fix the conn_retries check for TFO

In 2.6, the retries counter on a stream was changed from retries left
to retries done via commit 731c8e6cf ("MINOR: stream: Simplify retries
counter calculation"). However, one comparison fell through the cracks
in order to detect whether or not we can use TFO (only first attempt),
resulting in TFO never working anymore.

This may be backported to all versions till 2.6.

BUG/MAJOR: set the correct generation ID in pat_ref_append().

This fixes crashes when creating more than one new revision of a map or
acl file and purging the previous version.

BUG/MEDIUM: cpu-topo: Don't forget to reset visited_ccx.

We want to reset visited_ccx, as introduced by commit
8aef5bec1ef57eac449298823843d6cc08545745, each time we run the loop,
otherwise the chances of its content being correct are very low, and
we will likely end up being bound to the wrong threads.
This was reported in github issue #3224.

CLEANUP: assorted typo fixes in the code, commits and doc

MINOR: tcp_sample: implement the fc_saved_syn sample fetch function

This function retrieves the copy of a SYN packet that the system has
kept for us when bind option "tcp-ss" was set to 1 or above. It's
recommended to copy it to a local variable because it will be freed
after being read. It allows inspecting all parts of an incoming SYN
packet, provided that it was preserved (e.g. not possible with SYN
cookies). The doc provides examples of how to use it.

MINOR: tcp: implement the get_opt() function

It relies on the generic sock_conn_get_opt() function and will permit
sample fetch functions to retrieve generic TCP-level info.

MINOR: protocol: support a generic way to call getsockopt() on a connection

It's regularly needed to call getsockopt() on a connection, but each
time the calling code has to do all the job by itself. This commit adds
a "get_opt()" callback on the protocol struct, that directly calls
getsockopt() on the connection's FD. A generic implementation for
standard sockets is provided, though QUIC would likely require a
different approach, or maybe a mapping. Due to the overlap between
IP/TCP/socket option values, it is necessary for the caller to indicate
both the level and the option. An abstraction of the level could be
done, but the caller would nonetheless have to know the optname, which
is generally defined in the same include files. So for now we'll
consider that this callback is only for very specific use.

The levels and optnames are purposely passed as signed ints so that it
is possible to further extend the API by using negative levels for
internal namespaces.
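
The shape of such a helper can be sketched as below. This is not the
haproxy implementation: struct conn_sketch and conn_get_opt_sketch() are
hypothetical names, and rejecting negative levels is an assumption made
here because this sketch defines no internal namespace. Only
getsockopt() itself is a real API.

```c
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical stand-in for a connection: only the FD matters here. */
struct conn_sketch {
	int fd;
};

/* Generic getsockopt() wrapper. <level> and <optname> are signed ints so
 * that negative levels remain available for internal namespaces; this
 * sketch defines none and rejects them.
 * Returns 0 on success and updates <*len> with the size written, or -1
 * on error.
 */
static int conn_get_opt_sketch(const struct conn_sketch *conn, int level,
                               int optname, void *buf, int *len)
{
	socklen_t slen;

	if (!conn || conn->fd < 0 || level < 0 || !buf || !len || *len < 0)
		return -1;

	slen = (socklen_t)*len;
	if (getsockopt(conn->fd, level, optname, buf, &slen) < 0)
		return -1;

	*len = (int)slen;
	return 0;
}
```

The caller passes both the level and the option name, for example
SOL_SOCKET and SO_TYPE, exactly as it would to getsockopt().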

MINOR: tcp: add new bind option "tcp-ss" to instruct the kernel to save the SYN

This option enables TCP_SAVE_SYN on the listening socket, which will
cause the kernel to try to save a copy of the SYN packet header (L2,
IP and TCP are supported). This can permit to check the source MAC
address of a client, or find certain TCP options such as a source
address encapsulated using RFC7974. It could also be used as an
alternate approach to retrieving the source and destination addresses
and ports. For now setting the option is enabled, but sample fetch
functions and converters will be needed to extract info.

OPTIM: patterns: cache the current generation

This makes a significant difference when loading large files and during
commit and clear operations, thanks to improved cache locality. In the
measurements below, master refers to the code before any of the changes
to the patterns code, not the code before this one commit.

Timing the replacement of 10M entries from the CLI with this command
which also reports timestamps at start, end of upload and end of clear:

  $ (echo "prompt i"; echo "show activity"; echo "prepare acl #0";
     awk '{print "add acl @1 #0",$0}' < bad-ip.map; echo "show activity";
     echo "commit acl @1 #0"; echo "clear acl @0 #0";echo "show activity") |
    socat -t 10 - /tmp/sock1 | grep ^uptim

master, on a 3.7 GHz EPYC, 3 samples:

  uptime_now: 6.087030
  uptime_now: 25.981777  => 19.9 sec insertion time
  uptime_now: 29.286368  => 3.3 sec commit+clear

  uptime_now: 5.748087
  uptime_now: 25.740675  => 20.0s insertion time
  uptime_now: 29.039023  => 3.3 s commit+clear

  uptime_now: 7.065362
  uptime_now: 26.769596  => 19.7s insertion time
  uptime_now: 30.065044  => 3.3s commit+clear

And after this commit:

  uptime_now: 6.119215
  uptime_now: 25.023019  => 18.9 sec insertion time
  uptime_now: 27.155503  => 2.1 sec commit+clear

  uptime_now: 5.675931
  uptime_now: 24.551035  => 18.9s insertion
  uptime_now: 26.652352  => 2.1s commit+clear

  uptime_now: 6.722256
  uptime_now: 25.593952  => 18.9s insertion
  uptime_now: 27.724153  => 2.1s commit+clear

Now timing the startup time with a 10M entries file (on another machine)
on master, 20 samples:

Standard Deviation, s: 0.061652677408033
Mean:        4.217

And after this commit:

Standard Deviation, s: 0.081821371548669
Mean:        3.78

CLEANUP: patterns: remove dead code

Situations where we are iterating over elements and find one with a
different generation ID cannot arise anymore since the elements are kept
per-generation.

MEDIUM: patterns: reorganize pattern reference elements

Instead of a global list (and tree) of pattern reference elements, we
now have an intermediate pat_ref_gen structure and store the elements in
those. This simplifies the logic of some operations such as commit and
clear, and improves performance in some cases - numbers to be provided
in a subsequent commit after one important optimization is added.

A lot of the changes are due to adding an extra level of indirection,
changing many cases where we iterate over all elements to an outer loop
iterating over the generation and an inner one iterating over the
elements of the current generation. It is therefore easier to read this
patch using 'git diff -w'.
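
The outer/inner loop pattern described above can be sketched as follows.
All structure and function names are hypothetical simplifications (plain
singly linked lists, no tree), not the actual pat_ref_gen layout.

```c
#include <stddef.h>

/* Hypothetical element: one pattern belonging to a single generation. */
struct elt_sketch {
	struct elt_sketch *next;
	const char *pattern;
};

/* Hypothetical generation object: owns the elements of one generation. */
struct gen_sketch {
	struct gen_sketch *next;
	unsigned int gen_id;
	struct elt_sketch *elts;
};

/* Hypothetical pattern reference: a list of generations. */
struct ref_sketch {
	struct gen_sketch *gens;
};

/* Visit every element: outer loop over the generations, inner loop over
 * the elements of the current generation.
 */
static size_t ref_count_all(const struct ref_sketch *ref)
{
	size_t n = 0;

	for (const struct gen_sketch *g = ref->gens; g; g = g->next)
		for (const struct elt_sketch *e = g->elts; e; e = e->next)
			n++;
	return n;
}

/* Count the elements of one generation without touching the others. */
static size_t ref_count_gen(const struct ref_sketch *ref, unsigned int gen_id)
{
	for (const struct gen_sketch *g = ref->gens; g; g = g->next) {
		size_t n = 0;

		if (g->gen_id != gen_id)
			continue;
		for (const struct elt_sketch *e = g->elts; e; e = e->next)
			n++;
		return n;
	}
	return 0;
}
```

Operations that target a single generation, such as commit or clear,
only need to locate that generation instead of filtering every element
by its generation ID.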

MINOR: patterns: preliminary changes for reorganization

Safe and non-functional changes that only add currently unused
structures, field, functions and macros, in preparation of larger
changes that alter the way pattern reference elements are stored.

This includes code to create and lookup generation objects, and
macros to iterate over the generations of a pattern reference.

OPTIM/MINOR: proxy: do not init proxy management task if unused

Each proxy has its own task for internal purposes. Currently, it is
only used either by frontends or if a stick-table is present.

This commit makes the task allocation optional, limiting it to the cases
where it is required. Thus, it is no longer allocated for backend-only
proxies without a stick-table.

MINOR: cfgparse: remove useless checks on no server in backend

A legacy check could be activated at compile time to reject backends
without servers. In practice this is not used anymore and does not make
much sense with the introduction of dynamic servers.

MEDIUM: cfgparse: acknowledge that proxy ID auto numbering starts at 2

Each frontend/backend/listen proxy is assigned a unique ID. It can
either be set explicitly via the 'id' keyword, or automatically assigned
during post-parsing depending on the available values.

It was expected that the first automatically assigned value would start
at '1'. However, due to a legacy bug this is not the case as this value
is always skipped. Thus, automatically assigned proxies always start at
'2' or more.

To avoid breaking the current existing state, this situation is now
acknowledged with the current patch. The code is rewritten with an
explicit warning to ensure that this won't be fixed without knowing the
current status. A new regtest also ensures this.

MINOR: mux-h1: perform a graceful close at 75% glitches threshold

This avoids hitting the hard wall for connections with non-compliant
peers that are accumulating errors. We recycle the connection early
enough to permit to reset the counter. Example below with a threshold
set to 100:

Before, 1% errors:
  $ h1load -H "Host : blah" -c 1 -n 10000000 0:4445
  #     time conns tot_conn  tot_req      tot_bytes    err  cps  rps  bps   ttfb
           1     1     1039   103872        6763365   1038 1k03 103k 54M1 9.426u
           2     1     2128   212793       14086140   2127 1k08 108k 58M5 8.963u
           3     1     3215   321465       21392137   3214 1k08 108k 58M3 8.982u
           4     1     4307   430684       28735013   4306 1k09 109k 58M6 8.935u
           5     1     5390   538989       36016294   5389 1k08 108k 58M1 9.021u

After, no more errors:
  $ h1load -H "Host : blah" -c 1 -n 10000000 0:4445
  #     time conns tot_conn  tot_req      tot_bytes    err  cps  rps  bps   ttfb
           1     1     1509   113161        7487809      0 1k50 113k 59M9 8.482u
           2     1     3002   225101       15114659      0 1k49 111k 60M9 8.582u
           3     1     4508   338045       22809911      0 1k50 112k 61M5 8.523u
           4     1     5971   447785       30286861      0 1k46 109k 59M7 8.772u
           5     1     7472   560335       37955271      0 1k49 112k 61M2 8.537u
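
The decision can be sketched as below. The enum and function names are
hypothetical, the CPU usage condition that also gates the hard kill is
left out, and treating a threshold of 0 as "no limit" as well as making
the 75% comparison inclusive are assumptions.

```c
/* Hypothetical outcome of checking a connection's glitch counter. */
enum glitch_action_sketch {
	GLITCH_CONTINUE,       /* keep using the connection */
	GLITCH_GRACEFUL_CLOSE, /* recycle it so a new one restarts from 0 */
	GLITCH_KILL,           /* hard limit reached */
};

static enum glitch_action_sketch glitch_check_sketch(unsigned int glitches,
                                                     unsigned int threshold)
{
	if (!threshold)
		return GLITCH_CONTINUE; /* assumption: 0 disables the check */

	if (glitches >= threshold)
		return GLITCH_KILL;

	/* 75% of the threshold, computed with integers only */
	if ((unsigned long long)glitches * 4 >= (unsigned long long)threshold * 3)
		return GLITCH_GRACEFUL_CLOSE;

	return GLITCH_CONTINUE;
}
```

With a threshold of 100, the connection is recycled from the 75th glitch
onwards, well before the hard limit at 100.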

MEDIUM: mux-h1: implement basic glitches support

We now count glitches for each parsing error, including those that
have been accepted via accept-unsafe-violations-*. Front and back
are considered and the connection gets killed on error once the
threshold is reached or passed and the CPU usage is beyond the
configured limit (0 by default). This was tested with:

   curl -ivH "host : blah" 0:4445{,,,,,,,,,}

which sends 10 requests to a configuration having a threshold of 5.
The global keywords are named similarly to H2 and quic:

     tune.h1.be.glitches-threshold xxxx
     tune.h1.fe.glitches-threshold xxxx

The glitches count of each connection is also reported when non-null
in the connection dumps (e.g. "show fd").

MINOR: mux-h2: perform a graceful close at 75% glitches threshold

This avoids hitting the hard wall for connections with non-compliant
peers that would be accumulating errors over long connections. We now
permit to recycle the connection early enough to reset the connection
counter.

This was tested artificially by adding this to h2c_frt_handle_headers():

  h2c_report_glitch(h2c, 1, "new stream");

or this to h2_detach():

  h2c_report_glitch(h2c, 1, "detaching");

and injecting using h2load -c 1 -n 1000 0:4445 on a config featuring
tune.h2.fe.glitches-threshold 1000:

  finished in 8.74ms, 85802.54 req/s, 686.62MB/s
  requests: 1000 total, 751 started, 751 done, 750 succeeded, 250 failed, 250 errored, 0 timeout
  status codes: 750 2xx, 0 3xx, 0 4xx, 0 5xx
  traffic: 6.00MB (6293303) total, 132.57KB (135750) headers (space savings 29.84%), 5.86MB (6144000) data
                       min         max         mean         sd        +/- sd
  time for request:        9us       178us        10us         6us    99.47%
  time for connect:      139us       139us       139us         0us   100.00%
  time to 1st byte:      339us       339us       339us         0us   100.00%
  req/s           :   87477.70    87477.70    87477.70        0.00   100.00%

The failures are due to h2load not supporting reconnection.

MINOR: mux-h2: add missing glitch count for non-decodable H2 headers

One rare error case, which could produce a protocol error on the stream
when response headers could not be decoded, wasn't being accounted as a
glitch, so let's fix it.

MINOR: tools: add a secure implementation of memset

This guarantees that the compiler will not optimize away the memset()
call if it detects a dead store.

Use this to clear SSL passphrases.

No backport needed.
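
One common way to obtain this guarantee is to write through a volatile
pointer, as in the sketch below. This is not necessarily how haproxy
implements it, and memset_secure_sketch() is a hypothetical name. Some
platforms also provide explicit_bzero() for the same purpose.

```c
#include <stddef.h>

/* Fill <len> bytes at <dst> with <c>. The stores go through a volatile
 * pointer, so the compiler must perform them even if the buffer is never
 * read again afterwards.
 */
static void *memset_secure_sketch(void *dst, int c, size_t len)
{
	volatile unsigned char *p = dst;

	while (len--)
		*p++ = (unsigned char)c;
	return dst;
}
```

A typical use is wiping a passphrase buffer right before it goes out of
scope, where a plain memset() would be a dead store.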

DOC: config: fix number of values for "cpu-affinity"

It said "accepts 2 values" then went on to enumerate 5 since more were
added one at a time. Let's fix it by removing the number. No backport
is needed.

BUG/MINOR: cpu-topo: fix -Wlogical-not-parentheses build with clang

src/cpu_topo.c:1325:15: warning: logical not is only applied to the left hand side of this bitwise operator [-Wlogical-not-parentheses]
1325 |                         } else if (!cpu_policy_conf.flags & CPU_POLICY_ONE_THREAD_PER_CORE)
      |                                    ^                      ~
src/cpu_topo.c:1325:15: note: add parentheses after the '!' to evaluate the bitwise operator first
1325 |                         } else if (!cpu_policy_conf.flags & CPU_POLICY_ONE_THREAD_PER_CORE)
      |                                    ^
      |                                     (                                                     )
src/cpu_topo.c:1325:15: note: add parentheses around left hand side expression to silence this warning
1325 |                         } else if (!cpu_policy_conf.flags & CPU_POLICY_ONE_THREAD_PER_CORE)
      |                                    ^
      |                                    (                     )
src/cpu_topo.c:1533:15: warning: logical not is only applied to the left hand side of this bitwise operator [-Wlogical-not-parentheses]
1533 |                         } else if (!cpu_policy_conf.flags & CPU_POLICY_ONE_THREAD_PER_CORE)
      |                                    ^                      ~
src/cpu_topo.c:1533:15: note: add parentheses after the '!' to evaluate the bitwise operator first
1533 |                         } else if (!cpu_policy_conf.flags & CPU_POLICY_ONE_THREAD_PER_CORE)
      |                                    ^
      |                                     (                                                     )
src/cpu_topo.c:1533:15: note: add parentheses around left hand side expression to silence this warning
1533 |                         } else if (!cpu_policy_conf.flags & CPU_POLICY_ONE_THREAD_PER_CORE)
      |                                    ^
      |                                    (                     )

No backport needed.
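
The precedence problem behind this warning can be reproduced with the
sketch below. The flag name and its value are hypothetical (the real
value of CPU_POLICY_ONE_THREAD_PER_CORE is not shown here), and the
buggy form is written with explicit parentheses so that it compiles
without the warning while keeping the same meaning.

```c
/* Hypothetical flag value, for illustration only. */
#define ONE_THREAD_PER_CORE_SKETCH 0x2u

/* What "!flags & FLAG" actually means: '!' binds tighter than '&', so
 * the negation yields 0 or 1 first, and that result is then masked.
 */
static int flag_unset_buggy(unsigned int flags)
{
	return (!flags) & ONE_THREAD_PER_CORE_SKETCH;
}

/* Intended test: true when the flag is not set in <flags>. */
static int flag_unset_fixed(unsigned int flags)
{
	return !(flags & ONE_THREAD_PER_CORE_SKETCH);
}
```

With this flag value the buggy form is always 0, whatever the flags,
while the fixed form correctly reports whether the bit is absent.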

MEDIUM: cpu-topo: Add the "per-ccx" cpu_affinity

Add a new cpu-affinity keyword, "per-ccx".
If used, each thread will be bound to all the hardware threads available
in one CCX of the thread group.

MEDIUM: cpu-topo: Add the "per-thread" cpu_affinity

Add a new cpu-affinity keyword, "per-thread".
If used, each thread will be bound to only one hardware thread of the
thread group.
If used in conjunction with the "threads-per-core 1" cpu_policy, then
each thread will be bound on a different core.