git.ipfire.org Git - thirdparty/haproxy.git/log

BUG/MINOR: tcpcheck: Don't use sni as pool-conn-name for non-SSL connections

When we try to ruse connection to perform an healtcheck, the SNI, from the
tcpcheck connection or the healthcheck itself, must not be used as
connection name for non-SSL connections.

This patch must be backported to 3.2.

OPTIM: tcpcheck: Don't set SNI and ALPN for non-ssl connections

There is no reason to set the SNI and ALPN for non-ssl connections. It is
not really an issue because ssl_sock_set_servername() and
ssl_sock_set_alpn() functions will do nothing. But it is cleaner this way
and this could avoid bugs in future.

No backport needed, because there is no bug.

OPTIM: proto_rhttp: Don't set SNI for non-ssl connections

There is no reason to set the SNI for non-ssl connections. It is not really
an issue because ssl_sock_set_servername() function will do nothing. But
there is no reason to uselessly evaluate an expression.

No backport needed, because there is no bug.

OPTIM: backend: Don't set SNI for non-ssl connections

There is no reason to set the SNI for non-ssl connections. It is not really
an issue because ssl_sock_set_servername() function will do nothing. But
there is no reason to uselessly evaluate an expression.

No backport needed, because there is no bug.

BUG/MINOR: server: Update healthcheck when server settings are changed via CLI

not all changes are concerned. But when the SSL is enabled or disabled for a
server, the healthcheck xprt must be eventually be updated too. This happens
when the healthcheck relies on the server settings.

In the same spirit, when the healthcheck address and port are updated, we
must fallback on the raw xprt if the SSL is not explicitly enabled for the
healthcheck with a "check-ssl" parameter.

This patch should be backported to all stable versions.

BUG/MEDIUM: server: Use sni as pool connection name for SSL server only

By default, for a given server, when no pool-conn-name is specified, the
configured sni is used. However, this must only be done when SSL is in-use
for the server. Of course, it is uncommon to have a sni expression for
now-ssl server. But this may happen.

In addition, the SSL may be disabled via the CLI. In that case, the
pool-conn-name must be discarded if it was copied from the sni. And, we must
of course take care to set it if the ssl is enabled.

Finally, when the attac-srv action is checked, we now checked the
pool-conn-name expression.

This patch should be backported as far as 3.0.

MINOR: server: Parse sni and pool-conn-name expressions in a dedicated function

This change is mandatory to fix an issue. The parsing of sni and
pool-conn-name expressions (from string to expression) is now handled in a
dedicated function. This will avoid to duplicate the same code at different
places.

MEDIUM: stats-file: add some BUG_ON() guards to ensure exported structs are not changed by accident

Add two BUG_ON() in shm_stats_file_prepare() which will trigger if
exported structures (shm_stats_file_hdr and shm_stats_file_object) change
in size, because it means that they will become incompatible with older
versions and thus precautions should be taken by the developer to ensure
compatibility with olders versions, or at least detect incompatible
versions by changing the version number to prevent bugs resulting
from inconsistent mapping between versions. The BUG_ON() may be
safely adjusted then.

Please note that it doesn't protect against accidental struct member
re-ordering if the resulting struct size is equal..

MINOR: stats-file: reserve some bytes in exported structs

We may need additional struct members in shm_stats_file_object and
shm_stats_file_hdr, yet since these structs are exported they should
not change in size nor ordering else it would require a version change
to break compability on purpose since mapping would differ.

Here we reserve 64 additional bytes in shm_stats_file_object, and
128 bytes in shm_stats_file_hdr for future usage.

BUILD: stats-file: fix aligment issues

Document some byte holes and fix some potential aligment issues
between 32 and 64 bits architectures to ensure the shm_stats_file memory
mapping is consistent between operating systems.

MINOR: compiler: add ALWAYS_PAD() macro

same as THREAD_PAD() but doesn't depend on haproxy being compiled with
thread support. It may be useful for memory (or files) that may be
shared between multiple processed.

OPTIM: stats-file: don't unnecessarily die hard on shm_stats_file_reuse_object()

shm_stats_file_reuse_object() has a non negligible cost, especially if
the shm file contains a lot of objects because the functions scans the
whole shm file to find available slots.

During startup, if no existing objects could be mapped in the shm
file shm_stats_file_add_object() for each object (server, fe, be or
listener) with a GUID set. On large config it means
shm_stats_file_add_object() could be called a lot of times in a row.

With current implementation, each shm_stats_file_add_object() call
leverages shm_stats_file_reuse_object(), so the more objects are defined
in the config, the slower the startup will be.

To try to optimize startup time a bit with large configs, we don't
sytematically call shm_stats_file_reuse_object(), especially when we
know that the previous attempt to reuse objects failed. In this case
we add a small tempo between failed attempts to reuse objects because
we assume the new attempt will probably fail anyway. (For slots to
become available, either an old process has to clean its entries,
or they have to time out which implies that the clock needs to be updated)

DOC: config: document "shm-stats-file" directive

Add some documentation for "shm-stats-file" and
"shm-stats-file-max-objects" experimental directives related to the use
of shared memory for storing stats counters (see previous commits for
implementation details)

MEDIUM: stats-file/counters: store and preload stats counters as shm file objects

This is the last patch of the shm stats file series, in this patch we
implement the logic to store and fetch shm stats objects and associate
them to existing shared counters on the current process.

Shm objects are stored in the same memory location as the shm stats file
header. In fact they are stored right after it. All objects (struct
shm_stats_file_object) have the same size (no matter their type), which
allows for easy object traversal without having to check the object's
type, and could permit the use of external tools to scan the SHM in the
future. Each object stores a guid (of GUID_MAX_LEN+1 size) and tgid
which allows to match corresponding shared counters indexes. Also,
as stated before, each object stores the list of users making use of
it. Objects are never released (the map can only grow), but unused
objects (when no more users or active users are found in objects->users),
the object is automatically recycled. Also, each object stores its
type which defines how the object generic data member should be handled.

Upon startup (or reload), haproxy first tries to scan existing shm to
find objects that could be associated to frontends, backends, listeners
or servers in the current config based on GUID. For associations that
couldn't be made, haproxy will automatically create missing objects in
the SHM during late startup. When haproxy matches with an existing object,
it means the counter from an older process is preserved in the new
process, so multiple processes temporarily share the same counter for as
long as required for older processes to eventually exit.

MINOR: stats-file: add process slot management for shm stats file

Now that all processes tied to the same shm stats file now share a
common clock source, we introduce the process slot notion in this
patch.

Each living process registers itself in a map at a free index: each slot
stores information about the process' PID and heartbeat. Each process is
responsible for updating its heartbeat, a slot is considered as "free" if
the heartbeat was never set or if the heartbeat is expired (60 seconds of
inactivity). The total number of slots is set to 64, this is on purpose
because it allows to easily store the "users" of a given shm object using
a 64 bits bitmask. Given that when haproxy is reloaded olders processes
are supposed to die eventually, it should be large enough (64 simultaneous
processes) to be safe. If we manage to reach this limit someday, more
slots could be added by splitting "users" bitmask on multiple 64bits
variable.

MEDIUM: stats-file: processes share the same clock source from shm-stats-file

The use of the "shm-stats-file" directive now implies that all processes
using the same file now share a common clock source, this is required
for consistency regarding time-related operations.

The clock source is stored in the shm stats file header.
When the directive is set, all processes share the same clock
(global_now_ms and global_now_ns both point to variables in the map),
this is required for time-based counters such as freq counters to work
consistently. Since all processes manipulate global clock with atomic
operations exclusively during runtime, and don't systematically relies
on it (thanks to local now_ms and now_ns), it is pretty much transparent.

MINOR: stats-file: introduce shm-stats-file directive

add initial support for the "shm-stats-file" directive and
associated "shm-stats-file-max-objects" directive. For now they are
flagged as experimental directives.

The shared memory file is automatically created by the first process.
The file is created using open() so it is up to the user to provide
relevant path (either on regular filesystem or ramfs for performance
reasons). The directive takes only one argument which is path of the
shared memory file. It is passed as-is to open().

The maximum number of objects per thread-group (hard limit) that can be
stored in the shm is defined by "shm-stats-file-max-objects" directive,

Upon initial creation, the main shm stats file header is provisioned with
the version which must remains the same to be compatible between processes
and defaults to 2k. which means approximately 1mb max per thread group
and should cover most setups. When the limit is reached (during startup)
an error is reported by haproxy which invites the user to increase the
"shm-stats-file-max-objects" if desired, but this means more memory will
be allocated. Actual memory usage is low at start, because only the mmap
(mapping) is provisionned with the maximum number of objects to avoid
relocating the memory area during runtime, but the actual shared memory
file is dynamically resized when objects are added (resized by following
half power of 2 curve when new objects are added, see upcoming commits)

For now only the file is created, further logic will be implemented in
upcoming commits.

MINOR: counters: retrieve detailed errmsg upon failure with counters_{fe,be}_shared_prepare()

counters_{fe,be}_shared_prepare now take an extra <errmsg> parameter
that contains additional hints about the error in case of failure.

It must be freed accordingly since it is allocated using memprintf

OPTIM: stick-tables: exit expiry faster when the update lock is held

It helps keep the contention level low: when we hold the update lock
that we know other parts may be relying on (peers, track-sc etc),
we decrease the remaining visit counters 4 times as fast to further
reduce the contention. At this point no more warnings are seen during
intense synchronization (2x64 cores, 1.5M req/s with a track-sc each,
5M entries in use).

MINOR: stick-tables: limit the number of visited nodes during expiration

As reported by Felipe in GH issue #3084, on large systems it's not
sufficient to leave the expiration process after a certain number of
expired entries, because if they accumulate too fast, it's possible
to still spend some time visiting many (e.g. those still in use),
which takes time.

Thus here we're taking a stricter approach consisting in counting the
number of visited entries, which allows to leave early if we can't do
the expected work in a reasonable amount of time.

In order to avoid always stopping on first shards and never visiting
last ones, we're always starting from a random shard number and looping
from that one. This way even if we always leave early, all shards will
be handled equally.

This should be backported to 3.2.

BUG/MEDIUM: peers: don't fail twice to grab the update lock

When the expire task is running fast (i.e. running almost alone), it's
super hard to grab the update lock and peers can easily trigger the
watchdog because the time it takes to grab this lock is multiplied by
the number of updates to perform. This is easier to trigger at the end
of an injection session where the expire task is omni-present. Let's
just record that we failed once and don't fail a second time in the
loop.

This should be backported to 3.2, but probably not further given that
this area changed significantly in 3.2.

BUG/MINOR: stick-tables: never leave used entries without expiration

When trying to kill/expire entries, if a ref-counted entry is found,
let's requeue it with its expiration timer instead of leaving it out,
because other ref-counters (e.g. peers) will not purge it otherwise,
leaving it orphan. This one seems trickier to trigger, though it seems
to happen sometimes when peers are late and a long resync is active
and competing with intense calls to process_table_expire() (i.e. when
no other acitvity is there).

This must be backported to 3.2. It's likely that older versions are
affected as well, but possibly differently since the expiration
mechanism changed between 3.1 and 3.2, so better not take unneeded
risks there.

BUG/MEDIUM: stick-tables: don't leave the expire loop with elements deleted

In 3.2, the table expiration latency was improved by commit 994cc58576
("MEDIUM: stick-tables: Limit the number of entries we expire"), however
it introduced an issue by which it's possible to leave the loop after a
certain number of elements were expired, without requeuing the deleted
elements. The issue it causes is that other places with a non-null ref_cnt
will not necessarily delete it themselves, resulting in orphan elements in
the table. These ones will then pollute it and force recycling old ones
more often which in turn results in an increase of the contention.

Let's check for the expiration counter before deleting the element so
that it can be found upon next visit.

This fix must be backported to 3.2. It is directly related to GH
issue #3084. Thanks to Felipe and Ricardo for sharing precious info
and testing a candidate fix.

MEDIUM: cfgparse: warn when using user/group when built statically

In issue #3013, an user observed a crash at startup of haproxy when
building statically and using the "user" global section.

This is a known problem of the glibc and the linker even warn about
this:

> warning: Using 'getgrnam' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
> warning: Using 'getpwnam' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking

Let's emit a warning when using user/group in this case.

CI: fix syntax of Quic Interop pipelines

previously, wrong syntax of passing build arguments was used, thus
previously images were built using default SSLLIB=QuicTLS-1.1.1

MINOR: quic: Add more information about RX packets

This patch is very useful to debug issues at RX packet processing level.

Should be easily backported as far as 2.6 (for debug purposes).

BUILD: acl: silence a possible null deref warning in parse_acl_expr()

The fix in commit 441cd614f9 ("BUG/MINOR: acl: set arg_list->kw to
aclkw->kw string literal if aclkw is found") involves an unchecked
access to "al" after that one is tested for possibly being NULL. This
rightfully upsets Coverity (GH #3095) and might also trigger warnings
depending on the compilers. However, no known caller to date passes
a NULL arg list here so there's no way to trigger this theoretical
bug.

This should be backported along with the fix above to avoid emitting
warnings, possibly as far as 2.6 since that fix was tagged as such.

BUG/MINOR: haproxy: be sure not to quit too early on soft stop

The fix in 4a9e3e102e ("BUG/MINOR: haproxy: only tid 0 must not sleep
if got signal") had the nasty side effect of breaking the graceful
reload operations: threads whose id is non-zero could quit too early and
not process incoming traffic, which is visible with broken connections
during reloads. They just need to ignore the the stopping condition
until the signal queue is empty. In any case, it's the thread in charge
of the signal queue which will notify them once it receives the signal.

It was verified that connections are no longer broken with this fix,
and that the issue that required it (#2537, looping threads on reload)
does not re-appear with the reproducer, while it still did without the
fix above. Since the fix above was backported to every stable version,
this one will also have to.

DOC: configuration: rework the jwt_verify keyword documentation

Split the documentation in multiple sections:

- Explanation about what it does and how
- <alg> parameter with array of parameters
- <key> parameter with details about certificates and public keys
- Return value

Others changes:

- certificates does not need to be known during configuration parsing
- differences between public key and certificate

MEDIUM: quic: strengthen BUG_ON() for unpad Initial packet on client

To avoid anti-amplification limit, it is required that Initial packet
are padded to be at least 1.200 bytes long. On server side, this only
applies to ack-eliciting packets. However, for client side, this is
mandatory for every packets.

This patch adjusts qc_txb_store() BUG_ON statement used to catch too
small Initial packets. On QUIC client side, ack-eliciting flag is now
ignored, thus every packets are checked.

This is labelled as MEDIUM as this BUG_ON() is known to be easily
triggered, as QUIC datagrams encoding function are complex. However,
it's important that a QUIC endpoint respects it, else the peer will drop
the invalid packet and could immediately close the connection.

BUG/MINOR: quic: pad Initial pkt with CONNECTION_CLOSE on client

Currently, when connection is closing, only CONNECTION_CLOSE frame is
emitted via qc_prep_pkts()/qc_do_build_pkt(). Also, only the first
registered encryption level is considered while the others are
dismissed. This results in a single packet datagram.

This can cause issues for QUIC client support, as padding is required
for every Initial packet, contrary to server side where only
ack-eliciting packets are eligible. Thus a client must add padding to a
CONNECTION_CLOSE frame on Initial level.

This patch adjusts qc_prep_pkts() to ensure such packet will be
correctly padded on client side. It sets <final_packet> variable which
instructs that if padding is necessary it must be apply immediately on
the current encryption level instead of the last one.

It could appear as unnecessary to pad a CONNECTION_CLOSE packet, as the
peer will enter in draining state when processing it. However, RFC
mandates that a client Initial packet too small must be dropped by the
server, so there is a risk that the CONNECTION_CLOSE is simply discarded
prior to its processing if stored in a too small datagram.

No need to backport as this is a QUIC backend issue only.

BUG/MINOR: quic: fix padding issue on INITIAL retransmit

On loss detection timer expiration, qc_dgrams_retransmit() is used to
reemit lost packets. Different code paths are present depending on the
active encryption level.

If Initial level is still initialized, retransmit is performed both for
Initial and Handshake spaces, by first retrieving the list of lost
frames for each of them.

Prior to this patch, Handshake level was always registered for emission
after Initial, even if it dit not have any frame to reemit. In this
case, most of the time it would result in a datagram containing Initial
reemitted frames packet coalesced with a Handshake packet consisting
only of a PADDING frame. This is because padding is only added for the
last registered QEL.

For QUIC backend support, this may cause issues. This is because
contrary to QUIC server side, Initial and Handshake levels keys are not
derived simultaneously for a QUIC client. Thus, if the latter keys are
unavailable, Handshake packet cannot be encoded in sending, leaving a
single Initial packet. However, this is now too late to add PADDING.
Thus the resulting datagram is invalid : this triggers the BUG_ON()
assert failure located on qc_txb_store().

This patch fixes this by amending qc_dgrams_retransmit(). Now, Handshake
level is only registered for emission if there is frame to retransmit,
which implies that Handshake keys are already available. Thus, PADDING
will now either be added at Initial or Handshake level as expected.

Note that this issue should not be present on QUIC frontend, due to
Initial and Handshake keys derivation almost simultaneously. However,
this should still be backported up to 3.0.

BUG/MINOR: quic: fix room check if padding requested

qc_prep_pkts() activates padding when building an Initial packet. This
ensures that resulting datagram will always be at least 1.200 bytes,
which is mandatory to prevent deadlock over anti-amplication.

Prior to padding activation, a check is performed to ensure that output
buffer is big enough for a padded datagram. However, this did not take
into account previously built packets which would be coalesced in the
same datagram. Thus this patch fixes this comparison check.

In theory, prior to this patch, in some cases Initial packets could not
be built despite a datagram of the proper size. Currently, this probably
never happens as Initial packet is always the first encoded in a
datagram, thus there is no coalesced packet prior to it. However, there
is no hard requirement on this, so it's better to reflect this in the
code.

This should be backported up to 2.6.

MINOR: quic/flags: complete missing flags

Add missing quic_conn flags definition for dev utility.

BUG/MINOR: quic: ignore AGAIN ncbuf err when parsing CRYPTO frames

This fix follows this previous one:

BUG/MINOR: quic: reorder fragmented RX CRYPTO frames by their offsets

which is not sufficient when a client fragments and mixes its CRYPTO frames AND
leaveswith holes by packets. ngtcp2 (and perhaps chrome) splits theire CRYPTO
frames but without hole by packet. In such a case, the CRYPTO parsing leads to
QUIC_RX_RET_FRM_AGAIN errors which cannot be fixed when the peer resends its packets.
Indeed, even if the peer resends its frames in a different order, this does not
help because since the previous commit, the CRYPTO frames are ordered on haproxy side.

This issue was detected thanks to the interopt tests with quic-go as client. This
client fragments its CRYPTO frames, mixes them, and generate holes, and most of
the times with the retry test.

To fix this, when a QUIC_RX_RET_FRM_AGAIN error is encountered, the CRYPTO frames
parsing is not stop. This leaves chances to the next CRYPTO frames to be parsed.

Must be backported as far as 2.6 as the commit mentioned above.

BUG/MINOR: tools: Add OOM check for malloc() in indent_msg()

This patch adds a missing out-of-memory (OOM) check after
the call to `malloc()` in `indent_msg()`. If memory
allocation fails, the function returns NULL to prevent
undefined behavior.

Co-authored-by: Christian Norbert Menges <christian.norbert.menges@sap.com>

BUG/MINOR: compression: Add OOM check for calloc() in parse_compression_options()

This patch adds a missing out-of-memory (OOM) check after
the call to `calloc()` in `parse_compression_options()`. If
memory allocation fails, an error message is set, the function
returns -1, and parsing is aborted to ensure safe handling
of low-memory conditions.

Co-authored-by: Christian Norbert Menges <christian.norbert.menges@sap.com>

BUG/MINOR: cfgparse: Add OOM check for calloc() in cfg_parse_listen()

This commit adds a missing out-of-memory (OOM) check
after the call to `calloc()` in `cfg_parse_listen()`.
If memory allocation fails, an alert is logged, error
codes are set, and parsing is aborted to prevent
undefined behavior.

Co-authored-by: Christian Norbert Menges <christian.norbert.menges@sap.com>

BUG/MINOR: acl: Add OOM check for calloc() in smp_fetch_acl_parse()

This patch adds a missing out-of-memory (OOM) check after
the call to `calloc()` in `smp_fetch_acl_parse()`. If
memory allocation fails, an error message is set and
the function returns 0, improving robustness in
low-memory situations.

Co-authored-by: Christian Norbert Menges <christian.norbert.menges@sap.com>

BUG/MINOR: log: Add OOM checks for calloc() and malloc() in logformat parser and dup_logger()

This patch adds missing out-of-memory (OOM) checks after calls
to `calloc()` and `malloc()` in the logformat parser and the
`dup_logger()` function. If memory allocation fails, an error
is reported or NULL is returned, preventing undefined behavior
in low-memory conditions.

Co-authored-by: Christian Norbert Menges <christian.norbert.menges@sap.com>

BUG/MINOR: halog: Add OOM checks for calloc() in filter_count_srv_status() and filter_count_url()

This patch adds missing out-of-memory (OOM) checks after calls to
calloc() in the functions `filter_count_srv_status()` and `filter_count_url()`.
If memory allocation fails, an error message is printed to stderr
and the process exits with status 1. This improves robustness
and prevents undefined behavior in low-memory situations.

Co-authored-by: Christian Norbert Menges <christian.norbert.menges@sap.com>

BUG/MINOR: acl: Properly detect overwritten matching method

A bug was introduced by the commit 6ea50ba46 ("MINOR: acl; Warn when
matching method based on a suffix is overwritten"). The test on the match
function, when defined was not correct. It is now fixed.

No backport needed, except if the commit above is backported.

BUG/MINOR: server: Duplicate healthcheck's sni inherited from default server

It is not really an issue, but the "check-sni" value inerited from a default
server is not duplicated while the paramter value is duplicated during the
parsing. So here there is a small leak if several "check-sni" parameters are
used on the same server line. The previous value is never released. But to
fix this issue, the value inherited from the default server must also be
duplicated. At the end it is safer this way and consistant with the parsing
of the "sni" parameter.

It is harmless so there is no reason to backport this patch.

BUG/MEDIUM: server: Duplicate healthcheck's alpn inherited from default server

When "check-alpn" parameter is inherited from the default server, the value
is not duplicated, the pointer of the default server is used. However, when
this parameter is overridden, the old value is released. So the "check-alpn"
value of the default server is released. So it is possible to have a UAF if
if another server inherit from the same the default server.

To fix the issue, the "check-alpn" parameter must be handled the same way
the "alpn" is. The default value is duplicated. So it could be safely
released if it is forced on the server line.

This patch should fix the issue #3096. It must be backported to all stable
versions.

MINOR: acl; Warn when matching method based on a suffix is overwritten

From time to time, issues are reported about string matching based on suffix
(for instance path_beg). Each time, it appears these ACLs are used in
conjunction with a converter or followed by an explicit matching method
(-m).

Unfortunatly, it is not an issue but an expected behavior, while it is not
obvious. matching suffixes can be consider as aliases on the corresponding
'-m' matching method. Thus "path_beg" is equivalent to "path -m beg". When a
converter is used the original matching (string) is used and the suffix is
lost. When followed by an explicit matching method, it overwrites the
matching method based on the suffix.

It is expected but confusing. Thus now a warning is emitted because it is a
configuration issue for sure. Following sample fetch functions are concerned:

* base
* path
* req.cook
* req.hdr
* res.hdr
* url
* urlp

The configuration manual was modified to make it less ambiguous.

MINOR: acl: Only allow one '-m' matching method

Several '-m' explicit matching method was allowed, but only the last one was
really used. There is no reason to specify several matching method and it is
most probably an error or a lack of understanding of how matchings are
performed. So now, an error is triggered during the configuration parsing to
avoid any bad usage.

REG-TESTS: map_redirect: Don't use hdr_dom in ACLs with "-m end" matching method

hdr_dom() is a alias of "hdr() -m dom". So using it with another explicit
matching method does not work because the matching on the domain will never
be performed. Only the last matching method is used. The scripts was working
by chance because no port was set on host header values.

The script was fixed by using "host_only" converter. In addition, host
header values were changed to have a port now.

MINOR: conn/muxes/ssl: add ASSUME_NONNULL() prior to _srv_add_idle

When manipulating idle backend connections for input/output processing,
special care is taken to ensure the connection cannot be accessed by
another thread, for example via a takeover. When processing is over,
connection is reinserted in its original list.

A connection can either be attached to a session (private ones) or a
server idle tree. In the latter case, <srv> is guaranteed to be non null
prior to _srv_add_idle() thanks to CO_FL_LIST_MASK comparison with conn
flags. This patch adds an ASSUME_NONNULL() to better reflect this.

This should fix coverity reports from github issue #3095.

BUG/MAJOR: mux-quic: fix crash on reload during emission

MUX QUIC restricts buffer allocation per connection based on the
underlying congestion window. If a QCS instance cannot allocate a new
buffer, it is put in a buf_wait list. Typically, this will cause stream
upper layer to subscribe for sending.

A BUG_ON() was present on snd_buf and nego_ff callback prologue to
ensure that these functions were not called if QCS is already in
buf_wait list. The objective was to guarantee that there is no wake up
on a stream if it cannot allocate a buffer.

However, this BUG_ON() is not correct, as it can be fired legitimely.
Indeed, stream layer can retry emission even if no wake up occured. This
case can happen on reload. Thus, BUG_ON() will cause an unexpected
crash.

Fix this by removing these BUG_ON(). Instead, snd_buf/nego_ff callbacks
ensure that QCS is not subscribed in buf_wait list. If this is the case,
a nul value will be returned, which is sufficient for the stream layer
to pause emission and subscribe if necessary.

Occurences for this crash have been reported on the mailing list. It is
also the subject of github issue #3080, which should be fixed with this
patch.

This must be backported up to 3.0.

BUG/MEDIUM: quic: CRYPTO frame freeing without eb_delete()

Since this commit:

BUG/MINOR: quic: reorder fragmented RX CRYPTO frames by their offsets

when they are parsed, the CRYPTO frames are ordered by their offsets into an ebtree.
Then their data are provided to the ncbufs.

But in case of error, when qc_handle_crypto_frm() returns QUIC_RX_RET_FRM_FATAL
or QUIC_RX_RET_FRM_AGAIN), they remain attached to their tree. Then
from <err> label, they are deteleted and deleted (with a while(node) { eb_delete();
qc_frm_free();} loop). But before this loop, these statements directly
free the frame without deleting it from its tree, if this is a CRYPTO frame,
leading to a use after free when running the loop:

if (frm)
qc_frm_free(qc, &frm);

This issue was detected by the interop tests, with quic-go as client. Weirdly, this
client sends CRYPTO frames by packet with holes.
Must be backported as far as 2.6 as the commit mentioned above.

CLEANUP: quic: remove a useless CRYPTO frame variable assignment

This modification should have arrived with this commit:

MINOR: quic: remove ->offset qf_crypto struct field

Since this commit, the CRYPTO offset node key assignment is done at parsing time
when calling qc_parse_frm() from qc_parse_pkt_frms().

This useless assigment has been reported in GH #3095 by coverity.

This patch should be easily backported as far as 2.6 as the one mentioned above
to ease any further backport to come.

DOC: proxy-protocol: Make example for PP2_SUBTYPE_SSL_SIG_ALG accurate

The docs call out that this field is the algorithm used to
sign the certificate. However, the example only had the hash portion of
the signature algorithm. This change updates the example to be accurate
based on a value written by HAProxy, which is based on an OID for
signature algorithms. I based example on a real TLV written by
HAProxy on my machine with all SSL TLVs enabled in config.

BUG/BUILD: stats: fix build due to missing stat enum definition

Recently, new server counter for private idle connections have been
added to statistics output. However, the patch was missing
ST_I_PX_PRIV_IDLE_CUR enum definition.

No need to backport.

MEDIUM: proxy: Reject some header names for 'http-send-name-header' directive

From time to time, we saw the 'http-send-name-header' directive used to
overwrite the Host header to workaround limitations of a buggy application.
Most of time, this led to troubles. This was never officially supported and
each time we strongly discouraged anyone to do so. We already thought to
deprecate this directive, but it seems to be still used by few people. So
for now, we decided to strengthen the tests performed on it.

The header name is now checked during the configuration parsing to forbid
some risky names. 'Host', 'Content-Length', 'Transfer-Encoding' and
'Connection' header names are now rejected. But more headers could be added
in future.

MINOR: proxy: extend "show servers conn" output

CLI command "show servers conn" is used as a debugging tool to monitor
the number of connections per server. This patch extends its output by
adding the content of two server counters.

<served> is the first added column. It represents the number of active
streams on a server. <curr_sess_idle_conns> is the second added column.
This is a recently added value which account private idle connections
referencing a server.

MINOR: stats: display new curr_sess_idle_conns server counter

Add a new stats column in proxy stats to display server counter for
private idle connections. This counter has been introduced recently.

The value is displayed on CSV output on the last column before modules.
It is also displayed on HTLM page alongside other idle server counters.

MINOR: doc: add missing statistics column

Complete documentation with missing description of newly added columns.

This must be backported up to 2.8.

MINOR: doc: add missing statistics column

Complete documentation with missing description of newly added columns.

This should be backported up to 2.4

DOC: configuration: confuse "strict-mode" with "zero-warning"

4b10302fd8 ("MINOR: cfgparse: implement a simple if/elif/else/endif
macro block handler") introduces a confusion between "strict-mode" and
"zero-warning".

This patch fixes the issue by changing "strict-mode" by "zero-warning"
in section 2.4. Conditional blocks.

Must be backported as far as 2.4.

OPTIM: backend: set release on takeover for strict maxconn

When strict maxconn is enforced on a server, it may be necessary to kill
an idle connection to never exceed the limit. To be able to delete a
connection from any thread, takeover is first used to migrate it on the
current thread prior to its deletion.

As takeover is performed to delete a connection instead of reusing it,
<release> argument can be set to true. This removes unnecessary
allocations of resources prior to connection deletion. As such, this
patch is a small optimization for strict maxconn implementation.

Note that this patch depends on the previous one which removes any
assumption in takeover implementation that thread isolation is active if
<release> is true.

MINOR: muxes: adjust takeover with buf_wait interaction

Takeover operation defines an argument <release>. It's a boolean which
if set indicate that freed connection resources during the takeover does
not have to be reallocated on the new thread. Typically, it is set to
false when takever is performed to reuse a connection. However, when
used to be able to delete a connection from a different thread,
<release> should be set to true.

Previously, <release> was only set in conjunction with "del server"
handler. This operation was performed under thread isolation, which
guarantee that not thread-safe operation such as removal from buf_wait
list could be performed on takeover if <release> was true. In the
contrary case, takeover operation would fail.

Recently, "del server" handler has been adjusted to remove idle
connection cleanup with takeover. As such, <release> is never set to
true in remaining takeover usage.

However, takeover is also used to enforce strict-maxconn on a server.
This is performed to delete a connection from any thread, which is the
primary reason of <release> to true. But for the moment as takeover
implementers considers that thread isolation is active if <release> is
set, this is not yet applicable for strict-maxconn usage.

Thus, the purpose of this patch is to adjust takeover implementation.
Remove assumption between <release> and thread-isolation mode. It's not
possible to remove a connection from a buf_wait list, an error will be
return in any case.

DOC: unreliable sockpair@ on macOS

We discovered that the sockpair@ protocol is unreliable in macOS, this
is the same problem that we fixed in d7f6819. But it's not possible to
implement a acknowledgment once the socket are in non-blocking mode.

The problem was discovered in issue #3045.

Must be backported in every stable versions.

BUILD: mworker: fix ignoring return value of ‘read’

Fix read return value unused result.

src/haproxy.c: In function ‘main’:
src/haproxy.c:3630:17: error: ignoring return value of ‘read’ declared with attribute ‘warn_unused_result’ [-Werror=unused-result]
3630 | read(sock_pair[1], &c, 1);
| ^~~~~~~~~~~~~~~~~~~~~~~~~

Must be backported where d7f6819 is backported.

MAJOR: server: do not remove idle conns in del server

Do not remove anymore idle and purgeable connections directly under the
"del server" handler. The main objective of this patch is to reduce the
amount of work performed under thread isolation. This should improve
"del server" scheduling with other haproxy tasks.

Another objective is to be able to properly support dynamic servers with
QUIC. Indeed, takeover is not yet implemented for this protocol, hence
it is not possible to rely on cleanup of idle connections performed by a
single thread under "del server" handler.

With this change it is not possible anymore to remove a server if there
is still idle connections referencing it. To ensure this cannot be
performed, srv_check_for_deletion() has been extended to check server
counters for idle and idle private connections.

Server deletion should still remain a viable procedure, as first it is
mandatory to put the targetted server into maintenance. This step forces
the cleanup of its existing idle connections. Thanks to a recent change,
all finishing connections are also removed immediately instead of
becoming idle. In short, this patch transforms idle connections removal
from a synchronous to an asynchronous procedure. However, this should
remain a steadfast and quick method achievable in less than a second.

This patch is considered major as some users may notice this change when
removing a server. In particular with the following CLI commands
pipeline:
"disable server <X>; shutdown sessions server <X>; del server <X>"

Server deletion will now probably fail, as idle connections purge cannot
be completed immediately. Thus, it is now highly advise to always use a
small delay "wait srv-removable" before "del server" to ensure that idle
connections purge is executed prior.

Along with this change, documentation for "del server" and related
"shutdown sessions server" has been refined, in particular to better
highlight under what conditions a server can be removed.

MEDIUM: session: account on server idle conns attached to session

This patch adds a new member <curr_sess_idle_conns> on the server. It
serves as a counter of idle connections attached on a session instead of
regular idle/safe trees. This is used only for private connections.

The objective is to provide a method to detect if there is idle
connections still referencing a server.

This will be particularly useful to ensure that a server is removable.
Currently, this is not yet necessary as idle connections are directly
freed via "del server" handler under thread isolation. However, this
procedure will be replaced by an asynchronous mechanism outside of
thread isolation.

Careful: connections attached to a session but not idle will not be
accounted by this counter. These connections can still be detected via
srv_has_streams() so "del server" will be safe.

This counter is maintain during the whole lifetime of a private
connection. This is mandatory to guarantee "del server" safety and is
conform with other idle server counters. What this means it that
decrement is performed only when the connection transitions from idle to
in use, or just prior to its deletion. For the first case, this is
covered by session_get_conn(). The second case is trickier. It cannot be
done via session_unown_conn() as a private connection may still live a
little longer after its removal from session, most notably when
scheduled for idle purging.

Thus, conn_free() has been adjusted to handle the final decrement. Now,
conn_backend_deinit() is also called for private connections if
CO_FL_SESS_IDLE flag is present. This results in a call to
srv_release_conn() which is responsible to decrement server idle
counters.

MAJOR: server: implement purging of private idle connections

When a server goes into maintenance, or if its IP address is changed,
idle connections attached to it are scheduled for deletion via the purge
mechanism. Connections are moved from server idle/safe list to the purge
list relative to their thread. Connections are freed on their owned
thread by the scheduled purge task.

This patch extends this procedure to also handle private idle
connections stored in sessions instead of servers. This is possible
thanks via <sess_conns> list server member. A call to the newly
defined-function session_purge_conns() is performed on each list
element. This moves private connections from their session to the purge
list alongside other server idle connections.

This change relies on the serie of previous commits which ensure that
access to private idle connections is now thread-safe, with idle_conns
lock usage and careful manipulation of private idle conns in
input/output handlers.

The main benefit of this patch is that now all idle connections
targetting a server set in maintenance are removed. Previously, private
connections would remain until their attach sessions were closed.

MEDIUM: mux-quic: enforce thread-safety of backend idle conns

Complete QUIC MUX for backend side. Ensure access to idle connections
are performed in a thread-safe way. Even if takeover is not yet
implemented for this protocol, it is at least necessary to ensure that
there won't be any issue with idle connections purging mechanism.

This change will also be necessary to ensure that QUIC servers can
safely be removed via CLI "del server". This is not yet sufficient as
currently server deletion still relies on takeover for idle connections
removal. However, this will be adjusted in a future patch to instead use
idle connections standard purging mechanism.

MEDIUM: conn/muxes/ssl: remove BE priv idle conn from sess on IO

This is a direct follow-up of previous patch which adjust idle private
connections access via input/output handlers.

This patch implement the handlers prologue part. Now, private idle
connections require a similar treatment with non-private idle
connections. Thus, private conns are removed temporarily from its
session under protection of idle_conns lock.

As locking usage is already performed in input/output handler,
session_unown_conn() cannot be called. Thus, a new function
session_detach_idle_conn() is implemented in session module, which
performs basically the same operation but relies on external locking.

MEDIUM: conn/muxes/ssl: reinsert BE priv conn into sess on IO completion

When dealing with input/output on a connection related handler, special
care must be taken prior to access the connection if it is considered as
idle, as it could be manipulated by another thread. Thus, connection is
first removed from its idle tree before processing. The connection is
reinserted on processing completion unless it has been freed during it.

Idle private connections are not concerned by this, because takeover is
not applied on them. However, a future patch will implement purging of
these connections along with regular idle ones. As such, it is necessary
to also protect private connections usage now. This is the subject of
this patch and the next one.

With this patch, input/output handlers epilogue of
muxes/SSL/conn_notify_mux() are adjusted. A new code path is able to
deal with a connection attached to a session instead of a server. In
this case, session_reinsert_idle_conn() is used. Contrary to
session_add_conn(), this new function is reserved for idle connections
usage after a temporary removal.

Contrary to _srv_add_idle() used by regular idle connections,
session_reinsert_idle_conn() may fail as an allocation can be required.
If this happens, the connection is immediately destroyed.

This patch has no effect for now. It must be coupled with the next one
which will temporarily remove private idle connections on input/output
handler prologue.

MINOR: muxes: enforce thread-safety for private idle conns

When a backend connnection becomes idle, muxes must activate some
protection to mark future access on it as dangerous. Indeed, once a
connection is inserted in an idle list, it may be manipulated by another
thread, either via takeover or scheduled for purging.

Private idle connections are stored into a session instead of the server
tree. They are never subject to a takeover for reuse or purge mechanism.
As such, currently they do not require the same level of protection.

However, a new patch will introduce support for private idle connections
purging. Thus, the purpose of this patch is to ensure protection is
activated as well now.

TASK_F_USR1 was already set on them as an anticipation for such need.
Only some extra operations were missing, most notably xprt_set_idle()
invokation. Also, return path of muxes detach operation is adjusted to
ensure such connection are never accessed after insertion.

MINOR: server: cleanup idle conns for server in maint already stopped

When a server goes into maintenance mode, its idle connections are
scheduled for an immediate purge. However, this is not the case if the
server is already in stopped state, for example due to a health check
failure.

Adjust _srv_update_status_adm() to ensure that idle connections are
always scheduled for purge when going into maintenance in both cases.

The main advantage of this patch is to ensure consistent behavior for
server maintenance mode.

Note that it will also become necessary as server deletion will be
adjusted with a future patch. Idle connection closure won't be performed
by "del server" handler anymore, so it's important to ensure that a full
cleanup is always performed prior to executing it, else the server may
not be removable during a certain delay.

MEDIUM: session: close new idle conns if server in maintenance

Previous patch ensures that a backend connection going into idle state
is rejected and freed if its target server is in maintenance.

This patch introduces a similar change for connections attached in the
session. session_check_idle_conn() now returns an errorl if connection
target server is in maintenance, similarly to session max idle conns
limit reached. This is sufficient to instruct muxes to delete the
connection immediately.

MEDIUM: server: close new idle conns if server in maintenance

Currently, when a server is set on maintenance mode, its idle connection
are scheduled for purge. However, this does not prevent currently used
connection to become idle later on, even if the server is still off.

Change this behavior : an idle connection is now rejected by the server
if it is in maintenance. This is implemented with a new condition in
srv_add_to_idle_list() which returns an error value. In this case, muxes
stream detach callback will immediately free the connection.

A similar change is also performed in each MUX and SSL I/O handlers and
in conn_notify_mux(). An idle connection is not reinserted in its idle
list if server is in maintenance, but instead it is immediately freed.

MINOR: server: shard by thread sess_conns member

Server member <sess_conns> is a mt_list which contains every backend
connections attached to a session which targets this server. These
connecions are not present in idle server trees.

The main utility of this list is to be able to cleanup these connections
prior to removing a server via "del server" CLI. However, this procedure
will be adjusted by a future patch. As such, <sess_conns> member must be
moved into srv_per_thread struct. Effectively, this duplicates a list
for every threads.

This commit does not introduce functional change. Its goal is to ensure
that these connections are now ordered by their owning thread, which
will allow to implement a purge, similarly to idle connections attached
to servers.

MEDIUM: session: protect sess conns list by idle_conns_lock

Introduce idle_conns_lock usage to protect manipulation to <priv_conns>
session member. This represents a list of intermediary elements used to
store backend connections attached to a session to prevent their sharing
across multiple clients.

Currently, this patch is unneeded as sessions are only manipulated on a
single-thread. Indeed, contrary to idle connections stored in servers,
takeover is not implemented for connections attached to a session.
However, a future patch will introduce purging of these connections,
which is already performed for connections attached to servers. As this
can be executed by any thread, it is necessary to introduce
idle_conns_lock usage to protect their manipulation.

MINOR: session: refactor alloc/lookup of sess_conns elements

By default backend connections are stored into idle/avail server trees.
However, if such connections cannot be shared between multiple clients,
session serves as the alternative storage.

To be able to quickly reuse a backend conn from a session, they are
indexed by their target, which is either a server or a backend proxy.
This is the purpose of 'struct sess_priv_conns' intermediary stockage
element.

Lookup and allocation of these elements are performed in several session
function, for example to add, get or remove a backend connection from a
session. The purpose of this patch is to simplify this by providing two
internal functions sess_alloc_sess_conns() and sess_get_sess_conns().

Along with this, a new BUG_ON() is added into session_unown_conn(),
which ensure that sess_priv_conns element is found when the connection
is removed from the session.

MINOR: session: uninline functions related to BE conns management

Move from header to source file functions related to session management
of backend connections. These functions are big enough to remove inline
attribute.

MINOR: session: document explicitely that session_add_conn() is safe

A set of recent patches have simplified management of backend connection
attached to sessions. The API is now stricter to prevent any misuse.

One of this change is the addition of a BUG_ON() in session_add_conn(),
which ensures that a connection is not attached to a session if its
<owner> field points to another entry.

On older haproxy releases, this assertion could not be enforced due to
NTLM as a connection is turned as private during its transfer. When
using a true multiplexed protocol on the backend side, the connection
could be assigned in turn to several sessions. However, NTLM is now only
applied for HTTP/1.1 as it does not make sense if the connection is
already shared.

To better clarify this situation, extend the comment on BUG_ON() inside
session_add_conn().

BUG/MINOR: mux-quic: do not access conn after idle list insert

Once a connection is inserted into the server idle/safe tree during
stream detach, it is not accessed anymore by the muxes without
idle_conns_lock protection. This is because the connection could have
been already stolen by a takeover operation.

Adjust QUIC MUX detach implementation to follow the same pattern. Note
that, no bug can occur due to takeover as QUIC does not implement it.
However, prior to this patch, there may still exist race-conditions with
idle connection purging.

No backport needed.

BUG/MINOR: server: decrement session idle_conns on del server

When a server is deleted, each of its idle connections are removed. This
is also performed for every private connections stored on sessions which
referenced the target server.

As mentionned above, these private connections are idle, guaranteed by
srv_check_for_deletion(). A BUG_ON() on CO_FL_SESS_IDLE is already
present to guarantee this. Thus, these connections are accounted on the
session to enforce max-session-srv-conns limit.

However, this counter is not decremented during private conns cleanup on
"del server" handler. This patch fixes this by adding a decrement for
every private connections removed via "del server".

This should be backported up to 3.0.

MINOR: cli: display failure reason on wait command

wait CLI command can be used to wait until either a defined timeout or a
specific condition is reached. So far, srv-removable is the only event
supported. This is tested via srv_check_for_deletion().

This is implemented via srv_check_for_deletion(), which is
able to report a message describing the reason if the condition is
unmet.

Previously, wait return a generic string, to specify if the condition is
met, the timer has expired or an immediate error is encountered. In case
of srv-removable, it did not report the real reason why a server could
not be removed.

This patch improves wait command with srv-removable. It now displays the
last message returned by srv_check_for_deletion(), either on immediate
error or on timeout. This is implemented by using dynamic string output
with cli_dynmsg/dynerr() functions.

BUG/MINOR: connection: remove extra session_unown_conn() on reverse

When a connection is reversed via rhttp protocol on the edge endpoint,
it migrates from frontend to backend side. This operation is performed
by conn_reverse(). During this transition, the conn owning session is
freed as it becomes unneeded.

Prior to this patch, session_unown_conn() was also called during
frontend to backend migration. However, this is unnecessary as this
function is only used for backend connection reuse. As such, this patch
removes this unnecessary call.

This does not cause any harm to the process as session_unown_conn() can
handle a connection not inserted yet. However, for clarity purpose it's
better to backport this patch up to 3.0.

BUG/MINOR: connection: rearrange union list members

A connection can be stored in several lists, thus there is several
attach points in struct connection. Depending on its proxy side, either
frontend or backend, a single connection will only access some of them
during its lifetime.

As an optimization, these attach points are organized in a union.
However, this repartition was not correctly achieved along
frontend/backend side delimitation.

Furthermore, reverse HTTP has recently been introduced. With this
feature, a connection can migrate from frontend to backend side or vice
versa. As such, it becomes even more tedious to ensure that these
members are always accessed in a safe way.

This commit rearrange these fields. First, union is now clearly splitted
between frontend and backend only elements. Next, backend elements are
initialized with conn_backend_init(), which is already used during
connection reversal on an edge endpoint. A new function
conn_frontend_init() serves to initialize the other members, called both
on connection first instantiation and on reversal on a dialer endpoint.

This model is much cleaner and should prevent any access to fields from
the wrong side.

Currently, there is no known case of wrong access in the existing code
base. However, this cleanup is considered an improvement which must be
backported up to 3.0 to remove any possible undefined behavior.

BUG/MEDIUM: mworker: fix startup and reload on macOS

Since the mworker rework in haproxy 3.1, the worker need to tell the
master that it is ready. This is done using the sockpair protocol by
sending a _send_status message to the master.

It seems that the sockpair protocol is buggy on macOS because of a known
issue around fd transfer documented in sendmsg(2):

https://man.freebsd.org/cgi/man.cgi?sendmsg(2) BUGS section

  Because sendmsg() does not necessarily block until the data has been
  transferred, it is possible to transfer an open file descriptor across
  an AF_UNIX domain socket (see recv(2)), then close() it before it has
  actually been sent, the result being that the receiver gets a closed
  file descriptor. It is left to the application to implement an
  acknowledgment mechanism to prevent this from happening.

Indeed the recv side of the sockpair is closed on the send side just
after the send_fd_uxst(), which does not implement an acknowledgment
mechanism. So the master might never recv the _send_status message.

In order to implement an acknowledgment mechanism, a blocking read() is
done before closing the recv fd on the sending side, so we are sure that
the message was read on the other side.

This was only reproduced on macOS, meaning the master CLI is also
impacted on macOS. But no solution was found on macOS for it.
Implementing an acknowledgment mechanism would complexify too much the
protocol in non-blocking mode.

The problem was reported in ticket #3045, reproduced and analyzed by
@cognet.

Must be backported as far as 3.1.

BUG/MINOR: acl: set arg_list->kw to aclkw->kw string literal if aclkw is found

During configuration parsing *args can contain different addresses, it is
changing from line to line. smp_resolve_args() is called after the
configuration parsing, it uses arg_list->kw to create an error message, if a
userlist referenced in some ACL is absent. This leads to wrong keyword names
reported in such message or some garbage is printed.

It does not happen in the case of sample fetches. In this case arg_list->kw is
assigned to a string literal from the sample_fetch struct returned by
find_sample_fetch(). Let's do the same in parse_acl_expr(), when find_acl_kw()
lookup returns a corresponding acl_keyword structure.

This fixes the issue #3088 at GitHub.
This should be backported in all stable versions since 2.6 including 2.6.

BUG/MINOR: mux-quic: trace with non initialized qcc

This issue leads to crashes when the QUIC mux traces are enabled and could be
reproduced with -dMfail. When the qcc allocation fails (qcc_init()) haproxy
crashes into qmux_dump_qcc_info() because ->conn qcc member is initialized:

Program terminated with signal SIGSEGV, Segmentation fault.
at src/qmux_trace.c:146
146 const struct quic_conn *qc = qcc->conn->handle.qc;
[Current thread is 1 (LWP 1448960)]
(gdb) p qcc
$1 = (const struct qcc *) 0x7f9c63719fa0
(gdb) p qcc->conn
$2 = (struct connection *) 0x155550508
(gdb)

This patch simply fixes the TRACE() call concerned to avoid <qcc> object
dereferencing when it is NULL.

Must be backported as far as 3.0.

MINOR: quic: remove ->offset qf_crypto struct field

This patch follows this previous bug fix:

BUG/MINOR: quic: reorder fragmented RX CRYPTO frames by their offsets

where a ebtree node has been added to qf_crypto struct. It has the same
meaning and type as ->offset_node.key field with ->offset_node an eb64tree node.
This patch simply removes ->offset which is no more useful.

This patch should be easily backported as far as 2.6 as the one mentioned above
to ease any further backport to come.

DOC: configuration: clarify 'default-crt' and implicit default certificates

Clarify the behavior of implicit default certificates when used on the
same line as the default-crt keyword.

Should be backported as far as 3.2

MEDIUM: ssl: convert diag to warning for strict-sni + default-crt

Previous patch emits a diag warning when both 'strict-sni' +
'default-crt' are used on the same bind line.

This patch converts this diagnostic warning to a real warning, so the
previous patch could be backported without breaking configurations.

This was discussed in #3082.

MINOR: ssl: diagnostic warning when both 'default-crt' and 'strict-sni' are used

It possible to use both 'strict-sni' and 'default-crt' on the same bind
line, which does not make much sense.

This patch implements a check which will look for default certificates
in the sni_w tree when strict-sni is used. (Referenced by their empty
sni ""). default-crt sets the CKCH_INST_EXPL_DEFAULT flag in
ckch_inst->is_default, so its possible to differenciate explicits
default from implicit default.

Could be backported as far as 3.0.

This was discussed in ticket #3082.

BUG/MINOR: quic: reorder fragmented RX CRYPTO frames by their offsets

This issue impacts the QUIC listeners. It is the same as the one fixed by this
commit:

BUG/MINOR: quic: repeat packet parsing to deal with fragmented CRYPTO

As chrome, ngtcp2 client decided to fragment its CRYPTO frames but in a much
more agressive way. This could be fixed with a list local to qc_parse_pkt_frms()
to please chrome thanks to the commit above. But this is not sufficient for
ngtcp2 which often splits its ClientHello message into more than 10 fragments
with very small ones. This leads the packet parser to interrupt the CRYPTO frames
parsing due to the ncbuf gap size limit.

To fix this, this patch approximatively proceeds the same way but with an
ebtree to reorder the CRYPTO by their offsets. These frames are directly
inserted into a local ebtree. Then this ebtree is reused to provide the
reordered CRYPTO data to the underlying ncbuf (non contiguous buffer). This way
there are very few less chances for the ncbufs used to store CRYPTO data
to reach a too much fragmented state.

Must be backported as far as 2.6.

BUG/MEDIUM: quic-be: avoid crashes when releasing Initial pktns

This bug arrived with this fix:

BUG/MINOR: quic-be: missing Initial packet number space discarding

leading to crashes when dereferencing ->ipktns.

Such crashes could be reproduced with -dMfail option. To reach them, the
memory allocations must fail. So, this is relatively rare, except on systems
with limited memory.

To fix this, do not call quic_pktns_discard() if ->ipktns is NULL.

No need to backport.

DOC: configuration: reword 'generate-certificates'

Reword the 'generate-certificates' keyword documentation to clarify
what's happening upon error.

This was discussed in ticket #3082.

MINOR: proxy: handle shared listener counters preparation from proxy_postcheck()

We used to allocate and prepare listener counters from
check_config_validity() all at once. But it isn't correct, since at that
time listeners's guid are not inserted yet, thus
counters_fe_shared_prepare() cannot work correctly, and so does
shm_stats_file_preload() which is meant to be called even earlier.

Thus in this commit (and to prepare for upcoming shm shared counters
preloading patches), we handle the shared listener counters prep in
proxy_postcheck(), which means that between the allocation and the
prep there is the proper window for listener's guid insertion and shm
counters preloading.

No change of behavior expected when shm shared counters are not
actually used.

MEDIUM: server: split srv_init() in srv_preinit() + srv_postinit()

We actually need more granularity to split srv postparsing init tasks:
Some of them are required to be run BEFORE the config is checked, and
some of them AFTER the config is checked.

Thus we push the logic from 368d0136 ("MEDIUM: server: add and use
srv_init() function") a little bit further and split the function
in two distinct ones, one of them executed under check_config_validity()
and the other one using REGISTER_POST_SERVER_CHECK() hook.

SRV_F_CHECKED flag was removed because it is no longer needed,
srv_preinit() is only called once, and so is srv_postinit().

MINOR: haproxy: abort config parsing on fatal errors for post parsing hooks

When pre-check and post-check postparsing hooks= are evaluated in
step_init_2() potential fatal errors are ignored during the iteration
and are only taken into account at the end of the loop. This is not ideal
because some errors (ie: memory errors) could cause multiple alert
messages in a row, which could make troubleshooting harder for the user.

Let's stop as soon as a fatal error is encountered for post parsing
hooks, as we use to do everywhere else.

BUG/MEDIUM: spoe: Improve error detection in SPOE applet on client abort

It is possible to interrupt a SPOE applet without reporting an error. For
instance, when the client of the parent stream aborts. Thanks to this patch,
we take care to report an error on the SPOE applet to be sure to interrupt
the processing. It is especially important if the connection to the agent is
queued. Thanks to 886a248be ("BUG/MEDIUM: mux-spop: Reject connection
attempts from a non-spop frontend"), it is no longer an issue. But there is
no reason to continue to process if the parent stream is gone.

In addition, in the SPOE filter, if the processing is interrupted when the
filter is destroyed, no specific status code was set. It is not a big deal
because it cannot be logged at this stage. But it can be used to notify the SPOE
applet. So better to set it.

This patch should be backported as far as 3.1.

REGTESTS: jwt: create dynamically "cert.ecdsa.pem"

Stop declaring "cert.ecdsa.pem" in a crt-store, and add it dynamically
over the stats socket insted.

This way we fully verify a JWS signature with a certificate which never
existed at HAProxy startup.

BUG/MEDIUM: mux-spop: Reject connection attempts from a non-spop frontend

It is possible to crash the process by initializing a connection to a SPOP
server from a non-spop frontend. It is of course unexpected and invalid. And
there are some checks to prevent that when the configuration is
loaded. However, it is not possible to handle all cases, especially the
"use_backend" rules relying on log-format strings.

It could be good to improve the backend selection by checking the mode
compatibility (for now, it is only performed for the HTTP).

But at the end, this can also be handled by the SPOP multiplexer when it is
initialized. If the opposite SD is not attached to an SPOE agent, we should
fail the mux initialization and return an internal error.

This patch must be backported as far as 3.1.

MEDIUM: applet: Set .rcv_buf and .snd_buf functions on default ones if not set

Based on the applet flags, it is possible to set .rcv_buf and .snd_buf
callback functions if necessary. If these functions are not defined for an
applet using the new API, it means the default functions must be used.

We also take care to choose the raw version or the htx version, depending on
the applet flags.