MEDIUM: httpclient/logs: rely on per-proxy post-check instead of global one
httpclient used to register a global post-check function to iterate over
all known proxies and post-initialize httpclient related ones (mainly
for logs initialization).
But we currently have an issue: post_sink_resolve() function which is
also registered using REGISTER_POST_CHECK() macro conflicts with
httpclient_postcheck() function.
This is because post_sink_resolve() relies on proxy->logsrvs to be
correctly initialized already, and httpclient_postcheck() may create
and insert new logsrvs entries to existing proxies when executed.
So depending on which function runs first, we could run into trouble.
Fortunately, to this day, everything works "by accident" because the
http_client.c file is loaded before the sink.c file when compiling the
source code.
But as soon as one of the two functions is moved to another file, or files
are renamed, or the Makefile build recipe is changed, this could break at
any time.
To prevent post_sink_resolve() from randomly failing in the future, we now
make the httpclient postcheck rely on per-proxy post-checks, by slightly
modifying the httpclient_postcheck() function so that it can be registered
using the REGISTER_POST_PROXY_CHECK() macro.
As per-proxy post-check functions are executed right after config parsing
for each known proxy (vs global post-checks, which are executed a bit later
in the init process), we can be certain that functions registered using the
global post-check macro, i.e. post_sink_resolve(), will always be executed
after the httpclient postcheck, effectively resolving the ordering conflict.
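For reference, the shape of the change looks like this (a simplified
sketch; the actual checks and field names in src/http_client.c may differ):
    /* Per-proxy post-check: executed right after config parsing for each
     * proxy, thus before global post-checks such as post_sink_resolve().
     * Sketch only, not the literal haproxy code. */
    static int httpclient_postcheck(struct proxy *px)
    {
            if (!(px->cap & PR_CAP_INT))
                    return ERR_NONE; /* only internal httpclient proxies */
            /* ... per-proxy log initialization, which may append new
             * logsrv entries to px->logsrvs ... */
            return ERR_NONE;
    }
    REGISTER_POST_PROXY_CHECK(httpclient_postcheck);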
This should normally not cause visible behavior changes, and while it
could be considered a bug, it's probably not worth backporting since the
only way to trigger the issue is through code refactors; unless we want to
backport it to ease code maintenance of course, in which case it should
easily apply to >= 2.7.
MEDIUM: sink: don't perform implicit truncations when maxlen is not set
maxlen now defaults to ~0 (instead of BUFSIZE) to make sure no implicit
truncation will be performed when the option is not specified, since the
doc doesn't mention any default value for maxlen. As such, if the payload
is too big, it will be dropped (this is the default expected behavior).
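Consider for instance a setup along these lines (an illustrative
reconstruction, not the exact config from the commit):
|global
| log ring@test-ring len 5000 local0
|ring test-ring
| maxlen 1000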
Such a setup would result in emitted logs being silently truncated to 1000,
because the test-ring maxlen is smaller than the log directive's maxlen.
In this patch we're adding an extra check in post_sink_resolve() to detect
this kind of confusing setups and warn the user about the implicit
truncation when DIAG mode is on.
This commit depends on:
- "MINOR: sink: simplify post_sink_resolve function"
MINOR: log/sink: detect when log maxlen exceeds sink size
To prevent logs from being silently (and unexpectedly) dropped at runtime,
we check that the maxlen parameter of the log directives is strictly
smaller than the targeted ring size.
|global
| tune.bufsize 16384
| log tcp@127.0.0.1:514 len 32768
| log myring@127.0.0.1:514 len 32768
|ring myring
| # no explicit size
On such configs, a diag warning will be reported.
This commit depends on:
- "MINOR: sink: simplify post_sink_resolve function"
- "MINOR: ring: add a function to compute max ring payload"
When the user specifies a maxlen parameter that is greater than the size of
a given ring section, a warning is emitted to report that the maximum
length exceeds the size, and maxlen is then forced to the size.
The logic is good but imprecise, because it doesn't take into account
the slight overhead from storing payloads into the ring.
In practice, we cannot store a single message whose length is exactly
equal to the size: doing so results in the message being dropped at
runtime.
Thanks to the ring_max_payload() function introduced in "MINOR: ring: add
a function to compute max ring payload", we can now deduce the maximum
value for the maxlen parameter before it results in messages being
dropped.
When maxlen is set to an improper value, the warning is emitted and
maxlen is forced to the maximum "single" payload length that can fit in
the ring buffer, preventing messages from being dropped unexpectedly.
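To illustrate the idea (a hypothetical, self-contained sketch; the real
computation in src/ring.c and haproxy's actual varint encoding differ in
their details):
    #include <stddef.h>
    #include <stdint.h>

    /* bytes needed for a simple 7-bit varint length prefix (assumption) */
    static size_t varint_len(uint64_t v)
    {
            size_t len = 1;

            while (v >= 0x80) {
                    v >>= 7;
                    len++;
            }
            return len;
    }

    /* largest single message fitting in a ring of <size> bytes: part of
     * the area is consumed by bookkeeping and the length prefix itself */
    static size_t max_payload(size_t size)
    {
            size_t max = size - 2; /* assumed fixed per-ring overhead */

            return max - varint_len(max);
    }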
This commit depends on:
- "MINOR: ring: add a function to compute max ring payload"
Andrew Hopkins [Tue, 5 Sep 2023 23:32:50 +0000 (16:32 -0700)]
CI: Update matrix.py so all code is contained in functions.
Refactor matrix.py so all the logic is contained inside either
helper functions or a new main function. Run the new main function
by default. This lets other GitHub actions use functions in the
python code without generating the whole matrix.
Andrew Hopkins [Tue, 5 Sep 2023 23:27:28 +0000 (16:27 -0700)]
CI: add support to matrix.py to determine the latest AWS-LC release
Refactor the existing OpenSSL tag parsing logic to share some of the GitHub
tag logic. OpenSSL and AWS-LC don't follow the same naming convention, so
each library has its own sorting logic.
MINOR: http_ana: position the FINAL flag for http_after_res execution
Ensure that the ACT_OPT_FINAL flag is always set when executing actions
from http_after_res context.
This will permit Lua functions to be executed as http_after_res actions,
since hlua_ctx_resume() automatically disables "yielding" when such a flag
is set: the hlua handler will only allow one-shot executions at this point
(Lua or not, we don't want to reschedule http_after_res actions).
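The gist of the change (sketch; the surrounding rule-evaluation code is
simplified from src/http_ana.c):
    /* http_after_res actions cannot be rescheduled: always run them in
     * "final" mode so that hlua_ctx_resume() forbids yielding */
    int opts = ACT_OPT_FINAL;

    switch (rule->action_ptr(rule, px, sess, s, opts)) {
    case ACT_RET_CONT:
            break; /* evaluate the next rule */
    default:
            break; /* stop/error handling omitted in this sketch */
    }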
BUG/MINOR: hlua/action: incorrect message on E_YIELD error
When hlua_action error messages were reworked in d5b073cf1
("MINOR: lua: Improve error message"), an error was made for the
E_YIELD case.
Indeed, everywhere else the E_YIELD error is handled, a "yield is not
allowed" or similar error message is reported to the user. But here we
currently have: "aborting Lua processing on expired timeout", which is
quite misleading because this error message normally refers to the
HLUA_E_ETMOUT case.
Thus, we now report the proper error message thanks to this patch.
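The corrected mapping looks like this (sketch; SEND_ERR() is the message
helper used in src/hlua.c, and fcn_name is shorthand for the rule's
function name):
    switch (err) {
    case HLUA_E_ETMOUT:
            SEND_ERR(px, "Lua function '%s': aborting Lua processing on expired timeout.\n", fcn_name);
            break;
    case HLUA_E_YIELD:
            /* previously reported the ETMOUT message by mistake */
            SEND_ERR(px, "Lua function '%s': yield is not allowed.\n", fcn_name);
            break;
    }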
This should be backported to all stable versions.
[on 2.0, the patch needs to be slightly adapted]
BUG/MINOR: ring/cli: Don't expect input data when showing events
The "show events" command may wait for now events if "-w" option is used. In
this case, no timeout must be triggered. So we explicitly state no input
data are expected. This disables the read timeout on the client side.
This patch should be backported to 2.8. It is probably useless to backport
it further. In all cases, it depends on the commit "BUG/MINOR: applet:
Always expect data when CLI is waiting for a new command"
BUG/MINOR: applet: Always expect data when CLI is waiting for a new command
There is a mechanism for applets to disable the read timeout on the opposite
side if it is not waiting for any data. Of course, there is also a way to
re-activate it. But it must be explicitly handled by the applets.
For the CLI, some commands may state no input data are expected. So we must
be sure to reset its state when the applet is waiting for a new command. For
now, it is not a bug because no CLI command uses this mechanism.
BUG/MEDIUM: stconn: Always update stream's expiration date after I/O
It is a revert of the following patches:
* d7111e7ac ("MEDIUM: stconn: Don't requeue the stream's task after I/O")
* 3479d99d5 ("BUG/MEDIUM: stconn: Update stream expiration date on blocked sends")
Because the first one is reverted, the second one is useless and can be reverted
too.
The issue here is that I/O may be performed without stream wakeup. So if no
expiration date was set on the last call to process_stream(), the stream is
never rescheduled and no timeout can be detected. This especially happens on
TCP streams because fast-forward is enabled very early.
Instead of tracking all places where the stream's expiration date must be
updated, it is now centralized in sc_notify(), as it was performed before
the timeout refactoring.
BUG/MEDIUM: stconn/stream: Forward shutdown on write timeout
The commit 7f59d68fe2 ("BUG/MEDIUM: stconn: Flush output data before
forwarding close to write side") introduced a regression. When a write
timeout is detected, the shutdown is no longer forwarded. Depending on the
channels' state, it may block the processing, waiting for the client or the
server to leave.
The commit above tries to avoid truncating messages on shutdown, but on a
write timeout, if the channel is not empty, there is nothing more we can do
to send this data: the endpoint is unable to send. In this case, we must
forward the shutdown.
BUG/MEDIUM: applet: Report an error if the applet requests more room on an aborted SC
If an abort was performed and the applet still requests more room, it means
the applet has not properly handled the error on its own. At least the CLI
applet is concerned. Instead of reviewing all applets, the error is now
handled in the task_run_applet() function.
Because of this bug, a session could remain blocked infinitely and could
also lead to a wakeup loop.
This patch must only be backported to 2.8 for now, and only to lower
versions if a bug is reported, because it is a bit sensitive and the code
in older versions is quite different.
BUG/MEDIUM: stconn: Report read activity when a stream is attached to front SC
It only concerns the front SC. But it is important to report a read activity
when a stream is created and attached to the front SC, especially in TCP. In
HTTP, when this happens, the request was necessarily received. But in TCP,
the client may open a connection without sending anything. We must still
report a first read activity in this case to be able to properly report
client timeout.
BUG/MEDIUM: applet: Fix API for function to push new data in channels buffer
All applets only check the -1 error value (need room) for applet_put*
functions, while the underlying functions may also return -2 if the input is
closed or -3 if the data length is invalid. It means applets already handle
the other cases on their own.
The API should be fixed but for now, to ease backports, we only fix the
applet_put* functions to always return -1 on error. This way, at least from
the applets' point of view, the API is consistent.
This patch should be backported to 2.8. Probably not further. Except if we
suspect it could fix a bug.
BUG/MINOR: stconn: Don't inhibit shutdown on connection on error
In the SC function responsible for performing the shutdown, there is a
statement inhibiting the shutdown if an error was encountered on the SC.
This statement is inherited from a very old version and should in fact be
removed. The error may be set from the stream. In this case the shutdown
must be performed. In all cases, it is not a big deal if the shutdown is
performed twice because underlying functions already handle multiple calls.
This patch does not fix any bug. Thus there is no reason to backport it.
BUG/MINOR: quic: Wrong RTT computation (srtt and rtt_var)
Because several variables (rtt_var, srtt) were stored as multiples of their
real values, some calculations were less accurate than expected.
Stop storing 4*rtt_var values, and 8*srtt values.
Adjust all the impacted statements.
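For reference, this aligns the code with the standard estimator (RFC 6298),
keeping srtt and rtt_var at their real scale (a self-contained sketch, not
the haproxy code):
    /* RFC 6298-style RTT estimation on real-scale values */
    static void rtt_update(unsigned *srtt, unsigned *rtt_var, unsigned rtt)
    {
            unsigned diff;

            if (!*srtt) { /* first sample */
                    *srtt = rtt;
                    *rtt_var = rtt / 2;
                    return;
            }
            diff = *srtt > rtt ? *srtt - rtt : rtt - *srtt;
            *rtt_var = (3 * *rtt_var + diff) / 4; /* 3/4 var + 1/4 |srtt - rtt| */
            *srtt = (7 * *srtt + rtt) / 8;        /* 7/8 srtt + 1/8 rtt */
    }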
MINOR: httpclient: allow to configure the timeout.connect
When using the httpclient, one could be bothered by it returning
after a very long time when failing. By default the httpclient has 3
retries and a connect timeout of 5s, which can result in a pause of
20s upon failure.
This patch allows the user to configure the "timeout connect" of the
httpclient so it could reduce the time to return an error.
MINOR: httpclient: allow to configure the retries
When using the httpclient, one could be bothered by it returning after
a very long time when failing. By default the httpclient has 3 retries
and a connect timeout of 5s, which can result in a pause of 20s
upon failure.
This patch allows the user to configure the retries of the httpclient so
it could reduce the time to return an error.
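Assuming the new settings follow the existing httpclient.* global keyword
naming (illustrative values):
|global
| httpclient.retries 1
| httpclient.timeout.connect 2s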
MEDIUM: mworker: display a more accessible message when a worker crashes
Should fix issue #1034.
Display a more accessible message about what to do when a worker crashes.
Example:
$ ./haproxy -W -f haproxy.cfg
[NOTICE] (308877) : New worker (308884) forked
[NOTICE] (308877) : Loading success.
[NOTICE] (308877) : haproxy version is 2.9-dev4-d90d3b-58
[NOTICE] (308877) : path to executable is ./haproxy
[ALERT] (308877) : Current worker (308884) exited with code 139 (Segmentation fault)
[WARNING] (308877) : A worker process unexpectedly died and this can only be explained by a bug in haproxy or its dependencies.
Please check that you are running an up to date and maintained version of haproxy and open a bug report.
HAProxy version 2.9-dev4-d90d3b-58 2023/09/05 - https://haproxy.org/
Status: development branch - not safe for use in production.
Known bugs: https://github.com/haproxy/haproxy/issues?q=is:issue+is:open
Running on: Linux 6.2.0-31-generic #31-Ubuntu SMP PREEMPT_DYNAMIC Mon Aug 14 13:42:26 UTC 2023 x86_64
[ALERT] (308877) : exit-on-failure: killing every processes with SIGTERM
[WARNING] (308877) : All workers exited. Exiting... (139)
MEDIUM: threads: detect excessive thread counts vs cpu-map
This detects when there are more threads bound via cpu-map than CPUs
enabled in cpu-map, or when there are more total threads than the total
number of CPUs available at boot (for unbound threads) and configured
for bound threads. In this case, a warning is emitted explaining the
problems this will cause and how to address the situation.
Note that some configurations will not be detected as faulty because
the algorithmic complexity to resolve all arrangements grows in O(N!).
This means that having 3 threads on 2 CPUs and one thread on 2 CPUs
will not be detected as it's 4 threads for 4 CPUs. But at least configs
such as T0:(1,4) T1:(1,4) T2:(2,4) T3:(3,4) will not trigger a warning
since they're valid.
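For example, a config like this one (illustrative) binds 8 threads to only
4 CPUs and would now be reported:
|global
| nbthread 8
| cpu-map auto:1/1-8 0-3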
It's very easy to mess up some cpu-map directives and to leave some
threads unbound. Let's add a test that checks that either all threads
are bound or none are, so that we do not face the intermediate
situation where some are pinned and others are left wandering around,
possibly on the same CPUs as bound ones.
Note that this should not be backported, or maybe turned into a
notice only, as it appears that it will easily catch invalid
configs and that may break updates for some users.
MINOR: cpuset: centralize a reliable bound cpu detection
Till now the CPUs that were bound were only retrieved in
thread_cpus_enabled() in order to count the number of CPUs allowed,
and it relied on arch-specific code.
Let's slightly arrange this into ha_cpuset_detect_bound() that
reuses the ha_cpuset struct and the accompanying code. This makes
the code much clearer without having to carry along some arch-specific
stuff out of this area.
Note that the macos-specific code used in thread.c only counts online
CPUs but does not retrieve a mask, so for now we can't infer anything
from it and can't implement this there.
In addition and more importantly, this function is reliable in that
it will only return a value when the detection is accurate, and will
not return incomplete sets on operating systems where we don't have
an exact list, such as online CPUs.
BUILD: checks: shut up yet another stupid gcc warning
gcc has always had hallucinations regarding value ranges, and this one
is interesting; it affects branches 4.7 to 11.3 at least. When building
without threads, the randomly picked new_tid, which is reduced to a multiply
by 1 shifted right by 32 bits (hence a constant output of 0), shows this
warning:
src/check.c: In function 'process_chk_conn':
src/check.c:1150:32: warning: array subscript [-1, 0] is outside array bounds of 'struct thread_ctx[1]' [-Warray-bounds]
In file included from include/haproxy/thread.h:28,
from include/haproxy/list.h:26,
from include/haproxy/action.h:28,
from src/check.c:31:
or this one when trying to force the test to see that it cannot be zero(!):
src/check.c: In function 'process_chk_conn':
src/check.c:1150:54: warning: array subscript [0, 0] is outside array bounds of 'struct thread_ctx[1]' [-Warray-bounds]
1150 | uint t2_act = _HA_ATOMIC_LOAD(&ha_thread_ctx[thr2].active_checks);
| ~~~~~~~~~~~~~^~~~~~
include/haproxy/atomic.h:66:40: note: in definition of macro 'HA_ATOMIC_LOAD'
66 | #define HA_ATOMIC_LOAD(val) *(val)
| ^~~
src/check.c:1150:24: note: in expansion of macro '_HA_ATOMIC_LOAD'
1150 | uint t2_act = _HA_ATOMIC_LOAD(&ha_thread_ctx[thr2].active_checks);
| ^~~~~~~~~~~~~~~
Let's just add an ALREADY_CHECKED() statement there, no other check seems
to get rid of it. No backport is needed.
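For reference, the workaround boils down to this (sketch):
    uint thr2 = new_tid;

    ALREADY_CHECKED(thr2); /* hide the value's origin from gcc's range analysis */
    uint t2_act = _HA_ATOMIC_LOAD(&ha_thread_ctx[thr2].active_checks);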
BUILD: bug: make BUG_ON() void to avoid a rare warning
When building without threads, the recently introduced BUG_ON(tid != 0)
turns to a constant expression that evaluates to 0 and that is not used,
resulting in this warning:
src/connection.c: In function 'conn_free':
src/connection.c:584:3: warning: statement with no effect [-Wunused-value]
This is because the whole thing is declared as an expression for clarity.
Make it return void to avoid this. No backport is needed.
Andrew Hopkins [Thu, 6 Jul 2023 22:41:46 +0000 (15:41 -0700)]
BUILD: ssl: Build with new cryptographic library AWS-LC
This adds a new option for the Makefile, USE_OPENSSL_AWSLC, and
updates the documentation with instructions to use HAProxy with
AWS-LC.
Update the type of the OCSP callback retrieved with
SSL_CTX_get_tlsext_status_cb with the actual type for
libcrypto versions greater than 1.0.2. This doesn't affect
OpenSSL which casts the callback to void* in SSL_CTX_ctrl.
MINOR: properly mark the end of the CLI command in error messages
In several places in the file src/ssl_ckch.c, in the message about the
incorrect use of a CLI command, the end of the CLI command is not
correctly marked with a closing quote (').
BUG/MINOR: stream: further protect stream_dump() against incomplete sessions
As found by Coverity in issue #2273, the fix in commit e64bccab2 ("BUG/MINOR:
stream: protect stream_dump() against incomplete streams") was still not
enough, as scf/scb are still dereferenced to dump their flags and states.
Chris Staite [Mon, 4 Sep 2023 08:52:26 +0000 (08:52 +0000)]
BUG/MEDIUM: h1-htx: Ensure chunked parsing with full output buffer
A previous fix to ensure that there is sufficient space in the output buffer
to place parsed data (#2053) introduced an issue: if the output buffer is
filled on a chunk boundary, no data is parsed, but the congested flag is not
set because the state is not H1_MSG_DATA.
The check to ensure that there is sufficient space in the output buffer is
actually already performed in all downstream functions before it is used.
This makes the early optimisation that avoids the state transition to
H1_MSG_DATA needless. Therefore, in order to allow the chunk parser to
continue in this edge case we can simply remove the early check. This
ensures that the state can progress and set the congested flag correctly
in the caller.
This patch fixes #2262. The upstream change that caused this logic error was
backported as far as 2.5, therefore it makes sense to backport this fix back
that far also.
BUG/MEDIUM: connection: fix pool free regression with recent ppv2 TLV patches
In commit fecc573da ("MEDIUM: connection: Generic, list-based allocation
and look-up of PPv2 TLVs") there was a tiny mistake: elements of length
<= 128 are allocated from pool_pp_128, but only those of length < 128 are
released to this pool; the other ones go to pool_pp_256. Because of this,
elements of size exactly 128 are allocated from the 128-byte pool and
released to the 256-byte one.
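The mismatch looked like this (a sketch of the inconsistent bounds, not the
literal code):
    /* allocation side: 128-byte elements taken from pool_pp_128 */
    pool = (len <= 128) ? pool_pp_128 : pool_pp_256;

    /* release side (buggy): 128-byte elements returned to pool_pp_256 */
    pool = (len < 128) ? pool_pp_128 : pool_pp_256;
The fix is simply to use the same inclusive bound on both sides.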
It can be reproduced a few times by running sample_fetches/tlvs.vtc 1000
times with -DDEBUG_DONT_SHARE_POOLS -DDEBUG_MEMORY_POOLS -DDEBUG_EXPR
-DDEBUG_STRICT=2 -DDEBUG_POOL_INTEGRITY -DDEBUG_POOL_TRACING
-DDEBUG_NO_POOLS. Not sure why it doesn't reproduce more often though.
No backport is needed. This should address github issues #2275 and #2274.
If not correctly parsed, an ACK frame must be ignored without any further
treatment. Before this patch, an ACK frame could be partially parsed, then
some errors could be detected, which led newly acknowledged packets to be
released in a wrong way by free_quic_tx_pkts(), called from
qc_parse_ack_frm(). But there is no reason to release such packets because
of a malformed ACK frame.
This patch modifies qc_parse_ack_frm(). The handling of newly acknowledged
TX packets is now done in two steps. It first collects the newly acknowledged
packets by calling qc_newly_acked_pkts(). Then it proceeds the same way as
before for the treatment of haproxy TX packets acknowledged by the peer. If
the ACK frame could not be fully parsed, the newly acknowledged packets are
put back where they were detached from: the tree of TX packets for their
encryption level.
Display the address of the frame to be released as soon as we enter
quic_release_frm(), whose job is obviously to release the memory allocated
for the frame <frm> passed as parameter.
There are very few chances this bug may occur. Furthermore the consequences
are not dramatic: an RTT sampling may be ignored. I guess this may happen
when the now_ms global value wraps.
Do not rely on the time a packet was sent to decide whether it is a newly
acknowledged packet, but on its presence or not in the TX packet ebtree.
BUG/MEDIUM: stconn: Don't block sends if there is a pending shutdown
For the same reason as the previous patch, we must not block the sends
when there is a pending shutdown. In other words, we must consider that
sends are allowed when there is a pending shutdown.
This patch must slowly be backported as far as 2.2. It should partially fix
issue #2249.
BUG/MEDIUM: stconn: Wake applets on sending path if there is a pending shutdown
An applet is not woken up on sending path if it is not waiting for data or
if it states it will not consume data. However, it is important to still
wake it up if there is a pending shutdown. Otherwise, the event may be
missed and some data may remain blocked in the channel's buffer.
Because of this bug, it is possible to have a stream stuck if data are also
blocked on the opposite channel. It is for instance possible to hit the bug
with the stats applet and a client not consuming data.
This patch must slowly be backported as far as 2.2. It should partially fix
issue #2249.
BUG/MINOR: stconn: Don't report blocked sends during connection establishment
The server timeout must not be handled during the connection establishment,
to not supersede the connect timeout. To do so, we must not consider
outgoing data to be blocked during this stage. Concretely, it means the fsb
time must not be updated during connection establishment.
It is not an issue with regular clients because the server timeout is only
defined once the connection is established. However, it may be an issue for
the HTTP client, when the server timeout is lower than the connect timeout.
In this case, an early 502 may be reported with no connection retries.
BUG/MEDIUM: stconn: Update stream expiration date on blocked sends
When outgoing data are blocked, we must update the stream expiration date
and requeue the task. It is important to be sure to properly handle write
timeouts, especially if the stream cannot expire on reads. This bug was
introduced when the handling of channel timeouts was refactored to be
managed by the stream-connectors.
It is an issue if there is no server timeout and the client does not consume
the response (or the opposite, but it is less common). It is also possible to
trigger the same scenario with applets on the server side because, most of
the time, there is no server timeout.
Willy Tarreau [Thu, 31 Aug 2023 15:51:00 +0000 (17:51 +0200)]
MINOR: checks: also consider the thread's queue for rebalancing
Let's also check for other threads when the current one is queueing,
let's not wait for the load to be high. Now this totally eliminates
differences between threads.
Willy Tarreau [Thu, 31 Aug 2023 14:54:22 +0000 (16:54 +0200)]
MEDIUM: checks: implement a queue in order to limit concurrent checks
The progressive adoption of OpenSSL 3 and its abysmal handshake
performance has started to reveal situations where it simply isn't
possible anymore to successfully run health checks on many servers,
because between the moment all the checks are started and the moment
the handshake finally completes, the timeout has expired!
This also has consequences on production traffic which gets
significantly delayed as well, all that for lots of checks. While it's
possible to increase the check delays, it doesn't solve everything as
checks still take a huge amount of time to converge in such conditions.
Here we take a different approach, by making it possible to enforce a
maximum number of concurrent checks per thread and implementing an ordered
queue. Thanks to this, if a thread about to start a check has reached
its limit, it will add the check at the end of a queue, and it will be
processed once another check finishes.
efficient, with all checks completing in a reasonable amount of time
and not being disturbed by the rest of the traffic from other checks.
They're just cycling slower, but at the speed the machine can handle.
One must understand however that if some complex checks perform multiple
exchanges, they will occupy a check slot for the whole required duration.
This is why the limit is not enforced by default.
Tests on SSL show that a limit of 5-50 checks per thread on local
servers gives excellent results already, so that could be a good starting
point.
Willy Tarreau [Wed, 23 Aug 2023 09:39:00 +0000 (11:39 +0200)]
MEDIUM: checks: search more aggressively for another thread on overload
When the current thread is overloaded (more running checks than the
configured limit), we'll try more aggressively to find another thread.
Instead of just opportunistically looking for one half as loaded, now if
the current thread has more than 1% more active checks than another one,
or has more than a configured limit of concurrent running checks, it will
search for a more suitable thread among 3 other random ones in order to
migrate the check there. The number of migrations remains very low (~1%)
and the check load is spread very fairly across all threads (~1% as well).
The new parameter is called tune.max-checks-per-thread.
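For example (illustrative), to cap each thread at 50 concurrent checks:
|global
| tune.max-checks-per-thread 50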
Willy Tarreau [Wed, 23 Aug 2023 09:25:25 +0000 (11:25 +0200)]
MINOR: check: also consider the random other thread's active checks
When checking if it's worth transferring a sleeping thread to another
random thread, let's also check if that random other thread has less
checks than the current one, which is another reason for transferring
the load there.
This commit adds a function "check_thread_cmp_load()" to compare two
threads' loads in order to simplify the decision making.
The minimum active check count before starting to consider rebalancing
the load was raised from 2 to 3, because tests show that at 15k
concurrent checks, with 2, 50% are evaluated for rebalancing and 30%
are rebalanced, while with 3, this is cut in half.
Willy Tarreau [Thu, 17 Aug 2023 14:25:28 +0000 (16:25 +0200)]
MINOR: checks: maintain counters of active checks per thread
Let's keep two check counters per thread:
- one for "active" checks, i.e. checks that are no more sleeping
and are assigned to the thread. These include sleeping and
running checks ;
- one for "running" checks, i.e. those which are currently
executing on the thread.
By doing so, we'll be able to spread the health checks load a bit better
and refrain from sending too many at once per thread. The counters are
atomic since a migration increments the target thread's active counter.
These numbers are reported in "show activity", which allows to check
per thread and globally how many checks are currently pending and running
on the system.
Ideally, we should only consider checks in the process of establishing
a connection since that's really the expensive part (particularly with
OpenSSL 3.0). But the inner layers are really not suitable for doing
this. However, knowing the number of active checks is already a good
enough hint.
Willy Tarreau [Thu, 31 Aug 2023 14:06:10 +0000 (16:06 +0200)]
MINOR: check/activity: collect some per-thread check activity stats
We now count the number of times a check was started on each thread
and the number of times a check was adopted. This helps understand
better what is observed regarding checks.
Willy Tarreau [Wed, 23 Aug 2023 09:58:24 +0000 (11:58 +0200)]
MINOR: check: remember when we migrate a check
The goal here is to explicitly mark that a check was migrated so that
we don't do it again. This will allow us to perform other actions on
the target thread while still knowing that we don't want to be migrated
again. The new READY bit combines with SLEEPING to form 4 possible states:
  SLP  RDY  State      Description
   0    0   -          (reserved)
   0    1   RUNNING    Check is bound to current thread and running
   1    0   SLEEPING   Check is sleeping, not bound to a thread
   1    1   MIGRATING  Check is migrating to another thread
Thus we set READY upon migration and check for it before migrating; this
is sufficient to prevent a second migration. To make things a bit clearer,
the SLEEPING bit was swapped with FASTINTER so that SLEEPING and READY are
adjacent.
MINOR: checks: pin the check to its thread upon wakeup
When a check leaves the sleeping state, we must pin it to the thread that
is processing it. It's normally always the case after the first execution,
but initial checks that start assigned to any thread (-1) could be assigned
much later, causing problems with planned changes involving queuing. Thus
it's better to do it early, so that all threads start properly pinned.
Willy Tarreau [Thu, 17 Aug 2023 14:09:05 +0000 (16:09 +0200)]
MINOR: checks: start the checks in sleeping state
The CHK_ST_SLEEPING state was introduced by commit d114f4a68 ("MEDIUM:
checks: spread the checks load over random threads") to indicate that
a check was not currently bound to a thread and that it could easily
be migrated to any other thread. However it did not start the checks
in this state, meaning that they were not redispatchable on startup.
Sometimes under heavy load (e.g. when using SSL checks with OpenSSL 3.0)
the cost of setting up new connections is so high that some threads may
experience connection timeouts on startup. In this case it's better if
they can transfer their excess load to other idle threads. By just
marking the check as sleeping upon startup, we can do this and
significantly reduce the number of failed initial checks.
BUG/MINOR: checks: do not queue/wake a bounced check
A small issue was introduced with commit d114f4a68 ("MEDIUM: checks:
spread the checks load over random threads"): when a check is bounced
to another thread, its expiration time is set to TICK_ETERNITY. This
makes it show as not expired upon first wakeup on the next thread,
thus being detected as "woke up too early" and being instantly
rescheduled. Only after this next wakeup will it be properly
considered.
Several approaches were attempted to fix this. The best one seems to
consist in resetting t->expire and expired upon wakeup, and changing
the !expired test for !tick_is_expired() so that we don't trigger on
this case.
Willy Tarreau [Mon, 21 Aug 2023 10:12:12 +0000 (12:12 +0200)]
MEDIUM: server/ssl: pick another thread's session when we have none yet
The per-thread SSL context in servers causes a burst of connection
renegotiations on startup, both for the forwarded traffic and for the
health checks. Health checks have been seen to continue to cause SSL
rekeying for several minutes after a restart on large thread-count
machines. The reason is that the context is exclusively per-thread
and that the more threads there are, the more likely it is for a new
connection to start on a thread that doesn't have such a context yet.
In order to improve this situation, this commit ensures that a thread
starting an SSL connection to a server without a session will first
look at the last session that was updated by another thread, and will
try to use it. In order to minimize the contention, we're using a read
lock here to protect the data, and the first-level index is an integer
containing the thread number, that is always valid and may always be
dereferenced. This way the session retrieval algorithm becomes quite
simple:
- if the last thread index is valid, then try to use the same session
under a read lock ;
- if any error happens, then atomically nuke the index so that other
threads don't use it and the next one to update a connection updates
it again
And for the ssl_sess_new_srv_cb(), we have this:
- update the entry under a write lock if the new session is valid,
otherwise kill it if the session is not valid;
- atomically update the index if it was 0 and the new one is valid,
otherwise atomically nuke it if the session failed.
Note that even if only the pointer is destroyed, the element will be
re-allocated by the next thread during ssl_sess_new_srv_cb().
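In pseudo-C, the retrieval side looks like this (the field names and lock
label are illustrative, not the actual ones):
    int i = HA_ATOMIC_LOAD(&srv->ssl_ctx.last_valid_tid); /* hypothetical: 0 = none, tid+1 otherwise */

    if (i) {
            HA_RWLOCK_RDLOCK(OTHER_LOCK, &srv->ssl_ctx.reused_sess[i - 1].sess_lock);
            /* try SSL_set_session() with that thread's stored session;
             * on failure, atomically reset last_valid_tid to 0 so that
             * the next thread storing a valid session republishes it */
            HA_RWLOCK_RDUNLOCK(OTHER_LOCK, &srv->ssl_ctx.reused_sess[i - 1].sess_lock);
    }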
Right now a session is picked even if the SNI doesn't match, because
we don't know the SNI yet during ssl_sock_init(), but that's essentially
a matter of API, since connect_server() figures the SNI very early, then
calls conn_prepare() which calls ssl_sock_init(). Thus in the future we
could easily imagine storing a number of SNI-based contexts instead of
storing contexts per thread.
It could be worth backporting this to one LTS version after some
observation, though this is not strictly necessary. The current commit
depends on the following ones:
BUG/MINOR: ssl_sock: fix possible memory leak on OOM
MINOR: ssl_sock: avoid iterating realloc(+1) on stored context
DOC: ssl: add some comments about the non-obvious session allocation stuff
CLEANUP: ssl: keep a pointer to the server in ssl_sock_init()
MEDIUM: ssl_sock: always use the SSL's server name, not the one from the tid
MEDIUM: server/ssl: place an rwlock in the per-thread ssl server session
MINOR: server/ssl: maintain an index of the last known valid SSL session
MINOR: server/ssl: clear the shared good session index on failure
MEDIUM: server/ssl: pick another thread's session when we have none yet
Willy Tarreau [Mon, 21 Aug 2023 10:04:01 +0000 (12:04 +0200)]
MINOR: server/ssl: clear the shared good session index on failure
If we fail to set the session using SSL_set_session(), we want to quickly
erase our index from the shared one so that any other thread with a valid
session replaces it.
Willy Tarreau [Mon, 21 Aug 2023 09:55:42 +0000 (11:55 +0200)]
MINOR: server/ssl: maintain an index of the last known valid SSL session
When a thread creates a new session for a server, if none was known yet,
we assign the thread id (hence the reused_sess index) to a shared variable
so that other threads will later be able to find it when they don't have
one yet. For now we only set and clear the pointer upon session creation;
we do not yet pick it.
Note that we could have done it per thread-group, so as to avoid any
cross-thread exchanges, but it's anticipated that this is essentially
used during startup, at a moment where the cost of inter-thread contention
is very low compared to the ability to restart at full speed, which
explains why instead we store a single entry.
Willy Tarreau [Mon, 21 Aug 2023 06:41:49 +0000 (08:41 +0200)]
MEDIUM: server/ssl: place an rwlock in the per-thread ssl server session
The goal will be to permit a thread to update its session while having
it shared with other threads. For now we only place the lock and arrange
the code around it so that this is quite light; for now, only the owner
thread uses this lock, so there is no contention.
Note that there is a subtlety in the openssl API regarding
i2d_SSL_SESSION(), in that it fills the area pointed to by its argument
with a dump of the session and returns a size that's equal to the
previously allocated one. As such, it does modify the shared area even
if that's not obvious at first glance.
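The classical two-call i2d pattern makes this visible (standard OpenSSL
usage):
    /* first call: compute the encoded size without writing anything */
    int len = i2d_SSL_SESSION(sess, NULL);

    /* second call: writes <len> bytes into the shared area and advances p */
    unsigned char *p = area;
    i2d_SSL_SESSION(sess, &p);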
Willy Tarreau [Wed, 30 Aug 2023 10:02:33 +0000 (12:02 +0200)]
MEDIUM: ssl_sock: always use the SSL's server name, not the one from the tid
In ssl_sock_set_servername(), we're retrieving the current server name
from the current thread, hoping it will not have changed. This is a
bit dangerous as strictly speaking it's not easy to prove that no other
connection had to use one between the moment it was retrieved in
ssl_sock_init() and the moment it's being read here. In addition, this
forces us to maintain one session per thread while this is not the real
need, in practice we only need one session per SNI. And the current model
prevents us from sharing sessions between threads.
This had been done in 2.5 via commit e18d4e828 ("BUG/MEDIUM: ssl: backend
TLS resumption with sni and TLSv1.3"), but as analyzed with William, it
turns out that a saner approach consists in keeping the call to
SSL_get_servername() there and instead to always assign the SNI to the
current SSL context via SSL_set_tlsext_host_name() immediately when the
session is retrieved. This way the session and SNI are consulted atomically
and the host name is only checked from the session and not from possibly
changing elements.
As a bonus the rdlock that was added by that commit could now be removed,
though it didn't cost much.
Willy Tarreau [Mon, 21 Aug 2023 09:17:10 +0000 (11:17 +0200)]
DOC: ssl: add some comments about the non-obvious session allocation stuff
The SSL session allocation/reuse part is far from being trivial, and
there are some necessary tricks such as allocating then immediately
freeing that are required by the API due to internal refcount. All of
this is particularly hard to grasp, even with the scarce man pages.
Let's document a little bit what's granted and expected along this path
to help the reader later.
Willy Tarreau [Mon, 21 Aug 2023 06:12:12 +0000 (08:12 +0200)]
MINOR: ssl_sock: avoid iterating realloc(+1) on stored context
The SSL context storage in servers is per-thread, and the contents are
allocated for a length that is determined from the session. It turns out
that placing some traces there revealed that the realloc() that is called
to grow the area can be called multiple times in a row even for just
health checks, to grow the area by just one or two bytes. Given that
malloc() allocates in multiples of 8 or 16 anyway, let's round the
allocated size up to the nearest multiple of 8 to avoid this unneeded
operation.
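I.e. something along these lines (sketch):
    /* round up to the next multiple of 8 so that growing by one or two
     * bytes doesn't trigger a new realloc() every time */
    len = (len + 7) & ~(size_t)7;
    tmp = realloc(ptr, len);
    if (tmp)
            ptr = tmp;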
MINOR: sample: Refactor fc_pp_unique_id by wrapping the generic TLV fetch
The fetch logic is redundant and can be simplified by simply
calling the generic fetch with the correct TLV ID set as an
argument, similar to fc_pp_authority.
MINOR: sample: Refactor fc_pp_authority by wrapping the generic TLV fetch
We already have a call that can retrieve a TLV of any type.
Therefore, the fetch logic is redundant and can be simplified
by simply calling the generic fetch with the correct TLV ID
set as an argument.
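The resulting shape (sketch; the generic helper shown here is hypothetical,
the actual function name may differ):
    static int smp_fetch_fc_pp_authority(const struct arg *args, struct sample *smp,
                                         const char *kw, void *private)
    {
            /* delegate to the generic TLV fetch with a fixed TLV ID */
            return smp_fetch_pp_tlv_id(PP2_TYPE_AUTHORITY, smp); /* hypothetical */
    }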
Based on the new, generic allocation infrastructure, a new sample
fetch fc_pp_tlv is introduced. It is an abstraction for existing
PPv2 TLV sample fetches. It takes any valid TLV ID as argument and
returns the value as a string, similar to fc_pp_authority and
fc_pp_unique_id.
MEDIUM: connection: Generic, list-based allocation and look-up of PPv2 TLVs
In order to be able to implement fetches in the future that allow
retrieval of any TLVs, a new generic data structure for TLVs is introduced.
Existing TLV fetches for PP2_TYPE_AUTHORITY and PP2_TYPE_UNIQUE_ID are
migrated to use this new data structure. TLV related pools are updated
to not rely on the type, but only on the size. Pools accommodate the TLV
list element with their associated value. For now, two pools for 128 B and
256 B values are introduced. More fine-grained solutions are possible
in the future, if necessary.
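Roughly (a simplified sketch of the idea, not the exact struct layout):
    /* one received PPv2 TLV, linked into the connection */
    struct conn_tlv_list {
            struct list list;    /* chaining into the connection's TLV list */
            unsigned char type;  /* PP2 TLV type byte */
            unsigned short len;  /* length of <value> */
            char value[0];       /* allocated from pool_pp_128 or pool_pp_256
                                  * depending on <len> only, not on the type */
    };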
CLEANUP/MINOR: connection: Improve consistency of PPv2 related constants
This patch improves readability by scoping HAProxy-related PPv2 constants
with an 'HA' prefix. Besides, a new constant for the length of a CRC32C
TLV is introduced. The length is derived from the PPv2 spec: 32 bits.
Willy Tarreau [Tue, 29 Aug 2023 08:24:26 +0000 (10:24 +0200)]
MEDIUM: capabilities: enable support for Linux capabilities
For a while there has been the constraint of having to run as root for
transparent proxying, and we're starting to see some cases where QUIC is
not running in socket-per-connection mode due to the missing capability
that would be needed to bind a privileged port. It's not realistic to
ask all QUIC users on port 443 to run as root, so instead let's provide
a basic support for capabilities at least on linux. The ones currently
supported are cap_net_raw, cap_net_admin and cap_net_bind_service. The
mechanism was made OS-specific with a dedicated file because it really
is. It can be easily refined later for other OSes if needed.
A new keyword "setcaps" is added to the global section, to enumerate the
capabilities that must be kept when switching from root to non-root. This
is ignored in other situations though. HAProxy has to be built with
USE_LINUX_CAP=1 for this to be supported, which is enabled by default
for linux-glibc, linux-glibc-legacy and linux-musl.
A good way to test this is to start haproxy with a config along these lines
(an illustrative sketch; the user name and certificate path are placeholders):
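|global
| setcap cap_net_bind_service
| user haproxy
|frontend h3
| mode http
| bind quic4@:443 ssl crt /etc/haproxy/site.pem alpn h3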
Then run it under "sudo strace -e trace=bind,setuid" and connect to it from
an H3 client. The bind() syscall must succeed despite the user id having
been switched.
Willy Tarreau [Tue, 29 Aug 2023 08:22:46 +0000 (10:22 +0200)]
DOC: config: mention uid dependency on the tune.quic.socket-owner option
This option defaults to "connection" but is also dependent on the user
being allowed to bind the specified port. Since QUIC can easily run on
non-privileged ports, usually this is not a problem, but if bound to port
443 it will usually fail. Let's mention this.
Willy Tarreau [Mon, 28 Aug 2023 15:05:22 +0000 (17:05 +0200)]
BUG/MINOR: stream: protect stream_dump() against incomplete streams
If a stream is interrupted during its initialization by a panic signal
and tries to dump itself, it may cause a crash during the dump due to
scf and/or scb not being fully initialized. This may also happen while
releasing an endpoint to attach a new one. The effect is that instead
of dying on an abort, the process dies on a segv. This race is ultra-
rare but totally possible. E.g:
#0 se_fl_test (test=1, se=0x0) at include/haproxy/stconn.h:98
#1 sc_ep_test (test=1, sc=0x7ff8d5cbd560) at include/haproxy/stconn.h:148
#2 sc_conn (sc=0x7ff8d5cbd560) at include/haproxy/stconn.h:223
#3 stream_dump (buf=buf@entry=0x7ff9507e7678, s=0x7ff4c40c8800, pfx=pfx@entry=0x55996c558cb3 ' ' <repeats 13 times>, eol=eol@entry=10 '\n') at src/stream.c:2840
#4 0x000055996c493b42 in ha_task_dump (buf=buf@entry=0x7ff9507e7678, task=<optimized out>, pfx=pfx@entry=0x55996c558cb3 ' ' <repeats 13 times>) at src/debug.c:328
#5 0x000055996c493edb in ha_thread_dump_one (thr=thr@entry=18, from_signal=from_signal@entry=0) at src/debug.c:227
#6 0x000055996c493ff1 in ha_thread_dump (buf=buf@entry=0x7ff9507e7678, thr=thr@entry=18) at src/debug.c:270
#7 0x000055996c494257 in ha_panic () at src/debug.c:430
#8 ha_panic () at src/debug.c:411
(...)
#23 0x000055996c341fe8 in ssl_sock_close (conn=<optimized out>, xprt_ctx=0x7ff8dcae3880) at src/ssl_sock.c:6699
#24 0x000055996c397648 in conn_xprt_close (conn=0x7ff8c297b0c0) at include/haproxy/connection.h:148
#25 conn_full_close (conn=0x7ff8c297b0c0) at include/haproxy/connection.h:192
#26 h1_release (h1c=0x7ff8c297b3c0) at src/mux_h1.c:1074
#27 0x000055996c39c9f0 in h1_detach (sd=<optimized out>) at src/mux_h1.c:3502
#28 0x000055996c474de4 in sc_detach_endp (scp=scp@entry=0x7ff9507e3148) at src/stconn.c:375
#29 0x000055996c4752a5 in sc_reset_endp (sc=<optimized out>, sc@entry=0x7ff8d5cbd560) at src/stconn.c:475
Note that this cannot happen on "show sess" since a stream never leaves
process_stream in such an uninitialized state, thus it's really only the
crash dump that may cause this.
BUG/MINOR: ssl/cli: can't find ".crt" files when replacing a certificate
The bug was introduced by commit 26654 ("MINOR: ssl: add "crt" in the
cert_exts array").
When looking for a .crt directly in the cert_exts array, the
ssl_sock_load_pem_into_ckch() function will be called with an argument
which does not have its ".crt" extension anymore.
If "ssl-load-extra-del-ext" is used, this is not a problem, since we try
to add the ".crt" when doing the lookup in the tree.
However, when directly using a ".crt" without this option, the lookup
for the file in the tree will fail.
The fix removes the "crt" entry from the array since it does not seem to
be really useful without a rework of all the lookups.
Willy Tarreau [Sat, 26 Aug 2023 15:22:32 +0000 (17:22 +0200)]
BUILD: pools: import plock.h to build even without thread support
In 2.9-dev4, commit 544c2f2d9 ("MINOR: pools: use EBO to wait for
unlock during pool_flush()") broke the thread-less build by calling
pl_wait_new_long() without explicitly including plock.h which is
normally included by thread.h when threads are enabled.
Willy Tarreau [Sat, 26 Aug 2023 15:27:24 +0000 (17:27 +0200)]
BUILD: import: guard plock.h against multiple inclusion
Surprisingly there's no include guard in plock.h, though there is one in
atomic-ops.h. Let's add one; otherwise we cannot safely include the file
multiple times.
Willy Tarreau [Sat, 26 Aug 2023 15:05:19 +0000 (17:05 +0200)]
BUG/MEDIUM: mux-h2: fix crash when checking for reverse connection after error
If the connection is closed in h2_release(), which is indicated by ret<0, we
must not dereference conn anymore. This was introduced in 2.9-dev4 by commit 5053e8914 ("MEDIUM: h2: prevent stream opening before connection reverse
completed") and detected after a few hours of runtime thanks to running with
pool integrity checks and caller enabled. No backport is needed.
Willy Tarreau [Fri, 25 Aug 2023 15:57:22 +0000 (17:57 +0200)]
[RELEASE] Released version 2.9-dev4
Released version 2.9-dev4 with the following main changes :
- DEV: flags/show-sess-to-flags: properly decode fd.state
- BUG/MINOR: stktable: allow sc-set-gpt(0) from tcp-request connection
- BUG/MINOR: stktable: allow sc-add-gpc from tcp-request connection
- DOC: typo: fix sc-set-gpt references
- SCRIPTS: git-show-backports: automatic ref and base detection with -m
- REGTESTS: Do not use REQUIRE_VERSION for HAProxy 2.5+ (3)
- DOC: jwt: Add explicit list of supported algorithms
- BUILD: Makefile: add the USE_QUIC option to make help
- BUILD: Makefile: add USE_QUIC_OPENSSL_COMPAT to make help
- BUILD: Makefile: realigned USE_* options in make help
- DEV: makefile: fix POSIX compatibility for "range" target
- IMPORT: plock: also support inlining the int code
- IMPORT: plock: always expose the inline version of the lock wait function
- IMPORT: lorw: support inlining the wait call
- MINOR: threads: inline the wait function for pthread_rwlock emulation
- MINOR: atomic: make sure to always relax after a failed CAS
- MINOR: pools: use EBO to wait for unlock during pool_flush()
- BUILD/IMPORT: fix compilation with PLOCK_DISABLE_EBO=1
- MINOR: quic+openssl_compat: Do not start without "limited-quic"
- MINOR: quic+openssl_compat: Emit an alert for "allow-0rtt" option
- BUG/MINOR: quic: allow-0rtt warning must only be emitted with quic bind
- BUG/MINOR: quic: ssl_quic_initial_ctx() uses error count not error code
- MINOR: pattern: do not needlessly lookup the LRU cache for empty lists
- IMPORT: xxhash: update xxHash to version 0.8.2
- MINOR: proxy: simplify parsing 'backend/server'
- MINOR: connection: centralize init/deinit of backend elements
- MEDIUM: connection: implement passive reverse
- MEDIUM: h2: reverse connection after SETTINGS reception
- MINOR: server: define reverse-connect server
- MINOR: backend: only allow reuse for reverse server
- MINOR: tcp-act: parse 'tcp-request attach-srv' session rule
- REGTESTS: provide a reverse-server test
- MINOR: tcp-act: define optional arg name for attach-srv
- MINOR: connection: use attach-srv name as SNI reuse parameter on reverse
- REGTESTS: provide a reverse-server test with name argument
- MINOR: proto: define dedicated protocol for active reverse connect
- MINOR: connection: extend conn_reverse() for active reverse
- MINOR: proto_reverse_connect: parse rev@ addresses for bind
- MINOR: connection: prepare init code paths for active reverse
- MEDIUM: proto_reverse_connect: bootstrap active reverse connection
- MINOR: proto_reverse_connect: handle early error before reversal
- MEDIUM: h2: implement active connection reversal
- MEDIUM: h2: prevent stream opening before connection reverse completed
- REGTESTS: write a full reverse regtest
- BUG/MINOR: h2: fix reverse if no timeout defined
- CI: fedora: fix "dnf" invocation syntax
- BUG/MINOR: hlua_fcn: potentially unsafe stktable_data_ptr usage
- DOC: lua: fix Sphinx warning from core.get_var()
- DOC: lua: fix core.register_action typo
- BUG/MINOR: ssl_sock: fix possible memory leak on OOM
- MEDIUM: map/acl: Improve pat_ref_set() efficiency (for "set-map", "add-acl" action perfs)
- MEDIUM: map/acl: Improve pat_ref_set_elt() efficiency (for "set-map", "add-acl"action perfs)
- MEDIUM: map/acl: Accelerate several functions using pat_ref_elt struct ->head list
- MEDIUM: map/acl: Replace map/acl spin lock by a read/write lock.
- DOC: map/acl: Remove the comments about map/acl performance issue
- DOC: Explanation of be_name and be_id fetches
- MINOR: connection: simplify removal of idle conns from their trees
- MINOR: server: move idle tree insert in a dedicated function
- MAJOR: connection: purge idle conn by last usage
Amaury Denoyelle [Mon, 21 Aug 2023 16:32:29 +0000 (18:32 +0200)]
MAJOR: connection: purge idle conn by last usage
Backend idle connections are purged on a recurring occurence during the
process lifetime. An estimated number of needed connections is
calculated and the excess is removed periodically.
Before this patch, purge was done directly using the idle then the safe
connection tree of a server instance. This has a major drawback to take
no account of a specific ordre and it may removed functional connections
while leaving ones which will fail on the next reuse.
The problem can be worse when using criteria to differentiate idle
connections, such as the SSL SNI. In this case, the purge may remove
connections with a high reuse rate while leaving connections whose criteria
were never matched once, thus drastically reducing the reuse rate.
To improve this, introduce an alternative storage for idle connections,
used in parallel with the idle/safe trees. Now, each connection inserted
in one of these trees is also inserted in the new list at
`srv_per_thread.idle_conn_list`. This guarantees that recently used
connections are present at the end of the list.
During the purge, use this list instead of the idle/safe trees: remove
connections from the front of the list first, as they were not reused
recently. This will ensure that connections that are frequently reused are
not purged, and should increase the reuse rate, particularly if distinct
idle connection criteria are in use.
Amaury Denoyelle [Fri, 25 Aug 2023 13:48:39 +0000 (15:48 +0200)]
MINOR: server: move idle tree insert in a dedicated function
Define a new function _srv_add_idle(). This is a simple wrapper to
insert a connection in the server idle tree. It is reserved for simple
usage and requires the idle_conns lock to be held. In most cases,
srv_add_to_idle_list() should be used.
This patch does not have any functional change. However, it will help
with the next patch, as idle connections will always be inserted in a list
as secondary storage along with the idle/safe trees.
Amaury Denoyelle [Mon, 21 Aug 2023 12:24:17 +0000 (14:24 +0200)]
MINOR: connection: simplify removal of idle conns from their trees
Small change of API for conn_delete_from_tree(). Now the connection
instance is taken as argument instead of its inner node.
No functional change is introduced with this commit. It slightly
simplifies the invocation of conn_delete_from_tree(). The most useful
change is that this function will be extended in the next patch to be able
to remove the connection from its new idle list at the same time as from
its idle tree.
Sébastien Gross [Wed, 23 Aug 2023 09:49:11 +0000 (05:49 -0400)]
DOC: Explanation of be_name and be_id fetches
The be_name and be_id fetches contain data related to the current
backend and can be used in frontend responses. Yet, in cases where
no backend is used due to a local response or backend selection
failure, these fetches retain details of the current frontend.
This patch enhances the clarity of the values provided by these
fetches.
MEDIUM: map/acl: Replace map/acl spin lock by a read/write lock.
Replace ->lock type of pat_ref struct by HA_RWLOCK_T.
Replace all calls to HA_SPIN_LOCK() (resp. HA_SPIN_UNLOCK()) by HA_RWLOCK_WRLOCK()
(resp. HA_RWLOCK_WRUNLOCK()) when a write access is required.
Only one read access is needed: it is in the "show map" command
callback, cli_io_handler_map_lookup(), where a HA_SPIN_LOCK() call is
replaced by HA_RWLOCK_RDLOCK() (resp. HA_SPIN_UNLOCK() by
HA_RWLOCK_RDUNLOCK()).
Replace HA_SPIN_INIT() calls by HA_RWLOCK_INIT() calls.
Store a pointer to the expression (struct pattern_expr) into the data
structure used to chain/store the map element references (struct
pat_ref_elt), e.g. the struct pattern_tree when stored into an ebtree, or
struct pattern_list when chained to a list.
Modify pat_ref_set_elt() to stop inspecting all the expressions attached to
a map and to look up the <elt> element passed as parameter to retrieve the
sample data to be parsed. Indeed, thanks to the pointer added above to each
pattern tree node or list element, they can all be inspected directly from
the <elt> passed as parameter and its ->tree_head and ->list_head members:
the pattern tree nodes are stored in elt->tree_head, and the pattern list
elements are chained to the elt->list_head list. This inspection was also
the job of pattern_find_smp(), which is no longer useful. This patch
removes the code of this function.
Organize reference to pattern element of map (struct pat_ref_elt) into an ebtree:
- add an eb_root member to the map (pat_ref struct) and an ebpt_node to its
element (pat_ref_elt struct),
- modify the code to insert these nodes into their ebtrees each time they are
allocated. This is done in pat_ref_append().
Note that the ->head member (struct list) of the map (struct pat_ref)
could have been removed, but was not, because it is still necessary to
dump the map contents from the CLI in the order the map elements were
inserted.
This patch also modifies http_action_set_map(), which is the callback at
least used by the "set-map" action. The pat_ref_elt element returned by
pat_ref_find_elt() is no longer ignored, but reused if not NULL by
pat_ref_set() as the first element to look up from. The latter is also
modified to use the ebtree attached to the map in place of the ->head list
attached to each map element (pat_ref_elt struct).
Also modify pat_ref_find_elt() to make it use the ->eb_root map ebtree
added to the map by this patch, in place of inspecting all the elements
with a strcmp() call.
Willy Tarreau [Mon, 21 Aug 2023 06:45:35 +0000 (08:45 +0200)]
BUG/MINOR: ssl_sock: fix possible memory leak on OOM
That's the classical realloc() issue: if it returns NULL, the old area
is not freed but we erase the pointer. It was brought by commit e18d4e828
("BUG/MEDIUM: ssl: backend TLS resumption with sni and TLSv1.3"), and
should be backported where this commit was backported.
"converter" was used in place of "action" as a result of a copy-paste
error probably.
Also, rephrasing the "actions" keyword explanation to prevent confusion
between action name (which is the name of new action about to be created)
and action facilities where we want to expose the new action.
This could be backported to every stable versions.