git.ipfire.org Git - thirdparty/haproxy.git/log

MEDIUM: queue: use a trylock on the server's queue

Doing so makes sure that threads attempting to wake up new connections
for a server will give up early if another thread is already in charge
of this. The goal is to avoid unneeded contention on low server counts.

Now with a single server with 16 threads in roundrobin we get the same
performance as with multiple servers, i.e. ~575kreq/s instead of ~496k
before. Leastconn is seeing a similar jump, from ~460 to ~560k (the
difference being the calls to fwlc_srv_reposition).

The overhead of process_srv_queue() is now around 2% instead of ~20%
previously.

MEDIUM: queue: take the proxy lock only during the px queue accesses

There's no point keeping the proxy lock held for a long time, it's
only needed when checking the proxy's queue, and keeping it prevents
multiple servers from dequeuing in parallel. Let's move it into
pendconn_process_next_strm() and release it ASAP. The pendconn
remains under the server queue lock's protection, guaranteeing that
no stream will release it while it's being touched.

For roundrobin, the performance increases by 76% (327k to 575k) on
16 threads. Even with a single server and maxconn=100, the performance
increases from 398 to 496 kreq/s. For leastconn, almost no change is
visible (less than one percent) but this is expected since most of the
time there is spent in fwlc_reposition() and fwlc_get_next_server().

MINOR: queue: use atomic-ops to update the queue's index (v2)

Doing so allows to retrieve and update the pendconn's queue index outside
of the queue's lock and to save one more percent CPU on a highly-contented
backend.

MINOR: queue: factor out the proxy/server queuing code (v2)

The code only differed by the nbpend_max counter. Let's have a pointer
to it and merge the two variants to always use a generic queue. It was
initially considered to put the max inside the queue structure itself,
but the stats support clearing values and maxes and this would have been
the only counter having to be handled separately there. Given that we
don't need this max anywhere outside stats, let's keep it where it is
and have a pointer to it instead.

The CAS loop to update the max remains. It was naively thought that it
would have been faster without atomic ops inside the lock, but this is
not the case for the simple reason that it is a max, it converges very
quickly and never has to perform the check anymore. Thus this code is
better out of the lock.

The queue_idx is still updated inside the lock since that's where the
idx is updated, though it could be performed using atomic ops given
that it's only used to roughly count places for logging.

MEDIUM: queue: determine in process_srv_queue() if the proxy is usable (v2)

By doing so we can move some evaluations outside of the lock and the
loop.

MEDIUM: queue: simplify again the process_srv_queue() API (v2)

This basically undoes the API changes that were performed by commit
0274286dd ("BUG/MAJOR: server: fix deadlock when changing maxconn via
agent-check") to address the deadlock issue: since process_srv_queue()
doesn't use the server lock anymore, it doesn't need the "server_locked"
argument, so let's get rid of it before it gets used again.

MEDIUM: queue: use a dedicated lock for the queues (v2)

Till now whenever a server or proxy's queue was touched, this server
or proxy's lock was taken. Not only this requires distinct code paths,
but it also causes unnecessary contention with other uses of these locks.

This patch adds a lock inside the "queue" structure that will be used
the same way by the server and the proxy queuing code. The server used
to use a spinlock and the proxy an rwlock, though the queue only used
it for locked writes. This new version uses a spinlock since we don't
need the read lock part here. Tests have not shown any benefit nor cost
in using this one versus the rwlock so we could change later if needed.

The lower contention on the locks increases the performance from 362k
to 374k req/s on 16 threads with 20 servers and leastconn. The gain
with roundrobin even increases by 9%.

This is tagged medium because the lock is changed, but no other part of
the code touches the queues, with nor without locking, so this should
remain invisible.

MEDIUM: queue: update px->served and lb's take_conn once per loop

There's no point doing atomic incs over px->served/px->totpend under the
locks from the inner loop, as this value is used by the LB algorithms but
not during the dequeuing step. In addition, the LB algo's take_conn()
doesn't need to be refreshed for each and every connection taken
under the lock, it can be performed once at the end and out of the
lock.

While the gain on roundrobin is not noticeable (only the atomic inc),
on leastconn which uses take_conn(), the performance increases from
355k to 362k req/s on 16 threads.

Revert "MEDIUM: queue: make pendconn_process_next_strm() only return the pendconn"

This reverts commit 5304669e1b1a213d2754755a47735ecd5549ce7b.

The recent changes since 5304669e1 MEDIUM: queue: make
pendconn_process_next_strm() only return the pendconn opened a tiny race
condition between stream_free() and process_srv_queue(), as the pendconn
is accessed outside of the lock, possibly while it's being freed. A
different approach is required.

Revert "MINOR: queue: update proxy->served once out of the loop"

This reverts commit 3e92a31783b545dd58c4be6c588808763e0042bc.

The recent changes since 5304669e1 MEDIUM: queue: make
pendconn_process_next_strm() only return the pendconn opened a tiny race
condition between stream_free() and process_srv_queue(), as the pendconn
is accessed outside of the lock, possibly while it's being freed. A
different approach is required.

Revert "MEDIUM: queue: refine the locking in process_srv_queue()"

This reverts commit 1b648c857bb9e0fb857e86838bcca0c9ed01e2bd.

The recent changes since 5304669e1 MEDIUM: queue: make
pendconn_process_next_strm() only return the pendconn opened a tiny race
condition between stream_free() and process_srv_queue(), as the pendconn
is accessed outside of the lock, possibly while it's being freed. A
different approach is required.

Revert "MEDIUM: queue: use a dedicated lock for the queues"

This reverts commit fcb8bf8650ec6b5614d1b88db54f1200ebd96cbd.

The recent changes since 5304669e1 MEDIUM: queue: make
pendconn_process_next_strm() only return the pendconn opened a tiny race
condition between stream_free() and process_srv_queue(), as the pendconn
is accessed outside of the lock, possibly while it's being freed. A
different approach is required.

Revert "MEDIUM: queue: simplify again the process_srv_queue() API"

This reverts commit c83e45e9b001591633188a480a896c935d3c9625.

The recent changes since 5304669e1 MEDIUM: queue: make
pendconn_process_next_strm() only return the pendconn opened a tiny race
condition between stream_free() and process_srv_queue(), as the pendconn
is accessed outside of the lock, possibly while it's being freed. A
different approach is required.

Revert "MINOR: queue: factor out the proxy/server queuing code"

This reverts commit 3eecdb65c5a6b933399ebb0ac4ef86a7b97cf85d.

The recent changes since 5304669e1 MEDIUM: queue: make
pendconn_process_next_strm() only return the pendconn opened a tiny race
condition between stream_free() and process_srv_queue(), as the pendconn
is accessed outside of the lock, possibly while it's being freed. A
different approach is required.

Revert "MINOR: queue: use atomic-ops to update the queue's index"

This reverts commit 1335eb9867b76c8e4570ad4242dc728287af3d56.

The recent changes since 5304669e1 MEDIUM: queue: make
pendconn_process_next_strm() only return the pendconn opened a tiny race
condition between stream_free() and process_srv_queue(), as the pendconn
is accessed outside of the lock, possibly while it's being freed. A
different approach is required.

Revert "MEDIUM: queue: determine in process_srv_queue() if the proxy is usable"

This reverts commit de814dd4228fa5e528d9ac1f0e1c4c7325ee52d3.

The recent changes since 5304669e1 MEDIUM: queue: make
pendconn_process_next_strm() only return the pendconn opened a tiny race
condition between stream_free() and process_srv_queue(), as the pendconn
is accessed outside of the lock, possibly while it's being freed. A
different approach is required.

Revert "MEDIUM: queue: move the queue lock manipulation to pendconn_process_next_strm()"

This reverts commit 9a6d0ddbd6788a05c6730bf0e9e8550d7c5b11aa.

The recent changes since 5304669e1 MEDIUM: queue: make
pendconn_process_next_strm() only return the pendconn opened a tiny race
condition between stream_free() and process_srv_queue(), as the pendconn
is accessed outside of the lock, possibly while it's being freed. A
different approach is required.

Revert "MEDIUM: queue: unlock as soon as possible"

This reverts commit 5b3927531145384bad8d2b46ca21f017afde81c7.

The recent changes since 5304669e1 MEDIUM: queue: make
pendconn_process_next_strm() only return the pendconn opened a tiny race
condition between stream_free() and process_srv_queue(), as the pendconn
is accessed outside of the lock, possibly while it's being freed. A
different approach is required.

Revert "MINOR: queue: make pendconn_first() take the lock by itself"

This reverts commit 772e968b06f9348afea7f5785c97214a84c75d19.

The recent changes since 5304669e1 MEDIUM: queue: make
pendconn_process_next_strm() only return the pendconn opened a tiny race
condition between stream_free() and process_srv_queue(), as the pendconn
is accessed outside of the lock, possibly while it's being freed. A
different approach is required.

DOC: Replace issue templates by issue forms

GitHub's issue forms are the next evolution of issue templates and allow for
showing an actual form with separate inputs when creating an issue. They ensure
that all the required fields are filled in and automatically format code parts
(e.g. haproxy -vv or the configuration) as actual code blocks, possibly with
syntax highlighting.

Co-authored-by: Maximilian Mader <max@bastelstu.be>

CLEANUP: dns: Remove a forgotten debug message

A debug message was forgotten in the dns part.

This patch should fix the issue #1304. It must be backported to 2.4.

DOC: config: Add missing actions in "tcp-request session" documentation

set-src/set-src-port and set-dst/set-dst-port actions were not listed in the
documentation of "tcp-request session".

This patch may be backported to all stable versions.

MINOR: tcp-act: Add set-src/set-src-port for "tcp-request content" rules

If it possible to set source IP/Port from "tcp-request connection",
"tcp-request session" and "http-request" rules but not from "tcp-request
content" rules. There is no reason for this limitation and it may be a
problem for anyone wanting to call a lua fetch to dynamically set source
IP/Port from a TCP proxy. Indeed, to call a lua fetch, we must have a
stream. And there is no stream when "tcp-request connection/session" rules
are evaluated.

Thanks to this patch, "set-src" and "set-src-port" action are now supported
by "tcp_request content" rules.

This patch is related to the issue #1303. It may be backported to all stable
versions.

CLEANUP: backend: remove impossible case of round-robin + consistent hash

In 1.4, consistent hashing was brought by commit 6b2e11be1 ("[MEDIUM]
backend: implement consistent hashing variation") which took care of
replacing all direct calls to map_get_server_rr() with an alternate
call to chash_get_next_server() if consistent hash was being used.

One of them, however, cannot happen because a preliminary test for
static round-robin is being done prior to the call, so we're certain
that if it matches it cannot use a consistent hash tree.

Let's remove it.

MINOR: queue: make pendconn_first() take the lock by itself

Dealing with the queue lock in the caller remains complicated. Let's
change pendconn_first() to take the queue instead of the tree head,
and handle the lock itself. It now returns an element with a locked
queue or no element with an unlocked queue. It can avoid locking if
the queue is already empty.

MEDIUM: queue: unlock as soon as possible

There's no point keeping the server's queue lock after seeing that the
server's queue is empty, just like there's no need to keep the proxy's
lock when its queue is empty. This patch checks for emptiness and
releases these locks as soon as possible.

With this the performance increased from 524k to 530k on 16 threads
with round-robin.

MEDIUM: queue: move the queue lock manipulation to pendconn_process_next_strm()

By placing the lock there, it becomes possible to lock the proxy
later and to unlock it earlier. The server unlocking also happens slightly
earlier.

The performance on roundrobin increases from 481k to 524k req/s on 16
threads. Leastconn shows about 513k req/s (the difference being the
take_conn() call).

The performance profile changes from this:
   9.32%  hap-pxok            [.] process_srv_queue
   7.56%  hap-pxok            [.] pendconn_dequeue
   6.90%  hap-pxok            [.] pendconn_add

to this:
   7.42%  haproxy             [.] process_srv_queue
   5.61%  haproxy             [.] pendconn_dequeue
   4.95%  haproxy             [.] pendconn_add

MEDIUM: queue: determine in process_srv_queue() if the proxy is usable

By doing so we can move some evaluations outside of the lock and the
loop. In the round robin case, the performance increases from 497k to
505k rps on 16 threads with 100 servers.

MINOR: queue: use atomic-ops to update the queue's index

Doing so allows to retrieve and update the pendconn's queue index outside
of the queue's lock and to save one more percent CPU on a highly-contented
backend.

MINOR: queue: factor out the proxy/server queuing code

The code only differed by the nbpend_max counter. Let's have a pointer
to it and merge the two variants to always use a generic queue. It was
initially considered to put the max inside the queue structure itself,
but the stats support clearing values and maxes and this would have been
the only counter having to be handled separately there. Given that we
don't need this max anywhere outside stats, let's keep it where it is
and have a pointer to it instead.

The CAS loop to update the max remains. It was naively thought that it
would have been faster without atomic ops inside the lock, but this is
not the case for the simple reason that it is a max, it converges very
quickly and never has to perform the check anymore. Thus this code is
better out of the lock.

The queue_idx is still updated inside the lock since that's where the
idx is updated, though it could be performed using atomic ops given
that it's only used to roughly count places for logging.

MEDIUM: queue: simplify again the process_srv_queue() API

This basically undoes the API changes that were performed by commit
0274286dd ("BUG/MAJOR: server: fix deadlock when changing maxconn via
agent-check") to address the deadlock issue: since process_srv_queue()
doesn't use the server lock anymore, it doesn't need the "server_locked"
argument, so let's get rid of it before it gets used again.

MEDIUM: queue: use a dedicated lock for the queues

Till now whenever a server or proxy's queue was touched, this server
or proxy's lock was taken. Not only this requires distinct code paths,
but it also causes unnecessary contention with other uses of these locks.

This patch adds a lock inside the "queue" structure that will be used
the same way by the server and the proxy queuing code. The server used
to use a spinlock and the proxy an rwlock, though the queue only used
it for locked writes. This new version uses a spinlock since we don't
need the read lock part here. Tests have not shown any benefit nor cost
in using this one versus the rwlock so we could change later if needed.

The lower contention on the locks increases the performance from 491k
to 507k req/s on 16 threads with 20 servers and leastconn. The gain
with roundrobin even increases by 6%.

The performance profile changes from this:
  13.03%  haproxy             [.] fwlc_srv_reposition
   8.08%  haproxy             [.] fwlc_get_next_server
   3.62%  haproxy             [.] process_srv_queue
   1.78%  haproxy             [.] pendconn_dequeue
   1.74%  haproxy             [.] pendconn_add

to this:
  11.95%  haproxy             [.] fwlc_srv_reposition
   7.57%  haproxy             [.] fwlc_get_next_server
   3.51%  haproxy             [.] process_srv_queue
   1.74%  haproxy             [.] pendconn_dequeue
   1.70%  haproxy             [.] pendconn_add

At this point the differences are mostly measurement noise.

This is tagged medium because the lock is changed, but no other part of
the code touches the queues, with nor without locking, so this should
remain invisible.

MINOR: server: replace the pendconns-related stuff with a struct queue

Just like for proxies, all three elements (pendconns, nbpend, queue_idx)
were moved to struct queue.

MINOR: proxy: replace the pendconns-related stuff with a struct queue

All three elements (pendconns, nbpend, queue_idx) were moved to struct
queue.

MINOR: queue: create a new structure type "queue"

This structure will be common to proxies and servers and will contain
everything needed to handle their respective queues. For now it's only
a tree head, a length and an index.

MINOR: lb/api: remove the locked argument from take_conn/drop_conn

This essentially reverts commit 2b4370078 ("MINOR: lb/api: let callers
of take_conn/drop_conn tell if they have the lock") that was merged
during 2.4 before the various locks could be eliminated at the lower
layers. Passing that information complicates the cleanup of the queuing
code and it's become useless.

MEDIUM: queue: refine the locking in process_srv_queue()

The lock in process_srv_queue() was placed around the whole loop to
avoid the cost of taking/releasing it multiple times. But in practice
almost all calls to this function only dequeue a single connection, so
that argument doesn't really stand. However by placing the lock inside
the loop, we'd make it possible to release it before manipulating the
pendconn and waking the task up. That's what this patch does.

This increases the performance from 431k to 491k req/s on 16 threads
with 20 servers under leastconn.

The performance profile changes from this:
  14.09%  haproxy             [.] process_srv_queue
  10.22%  haproxy             [.] fwlc_srv_reposition
   6.39%  haproxy             [.] fwlc_get_next_server
   3.97%  haproxy             [.] pendconn_dequeue
   3.84%  haproxy             [.] pendconn_add

to this:
  13.03%  haproxy             [.] fwlc_srv_reposition
   8.08%  haproxy             [.] fwlc_get_next_server
   3.62%  haproxy             [.] process_srv_queue
   1.78%  haproxy             [.] pendconn_dequeue
   1.74%  haproxy             [.] pendconn_add

The difference is even slightly more visible in roundrobin which
does not have take_conn() call.

MINOR: queue: update proxy->served once out of the loop

It's not needed during all these operations and doesn't even affect
queueing in the LB algo, so we can safely update it out of the loop
and the lock.

MEDIUM: queue: make pendconn_process_next_strm() only return the pendconn

It used to do far too much under the lock, including waking up tasks,
updating counters and repositionning entries in the load balancing algo.

This patch first moves all that stuff out of the function into the only
caller (process_srv_queue()). The decision to update the LB algo is now
taken out of the lock. The wakeups could be performed outside of the
loop by using a local list.

This increases the performance from 377k to 431k req/s on 16 threads
with 20 servers under leastconn.

The perf profile changes from this:
  23.17%  haproxy             [.] process_srv_queue
   6.58%  haproxy             [.] pendconn_add
   6.40%  haproxy             [.] pendconn_dequeue
   5.48%  haproxy             [.] fwlc_srv_reposition
   3.70%  haproxy             [.] fwlc_get_next_server

to this:
  13.95%  haproxy             [.] process_srv_queue
   9.96%  haproxy             [.] fwlc_srv_reposition
   6.21%  haproxy             [.] fwlc_get_next_server
   3.96%  haproxy             [.] pendconn_dequeue
   3.75%  haproxy             [.] pendconn_add

REGTESTS: fix maxconn update with agent-check

Correct the typo in the parameter used to update the 'maxconn' via
agent-check. The test is also completed to detect the update of maxconn
using CLI 'show stats'.

BUG/MAJOR: server: fix deadlock when changing maxconn via agent-check

The server_parse_maxconn_change_request locks the server lock. However,
this function can be called via agent-checks or lua code which already
lock it. This bug has been introduced by the following commit :

  commit 79a88ba3d09f7e2b73ae27cb5d24cc087a548fa6
  BUG/MAJOR: server: prevent deadlock when using 'set maxconn server'

This commit tried to fix another deadlock with can occur because
previoulsy server_parse_maxconn_change_request requires the server lock
to be held. However, it may call internally process_srv_queue which also
locks the server lock. The locking policy has thus been updated. The fix
is functional for the CLI 'set maxconn' but fails to address the
agent-check / lua counterparts.

This new issue is fixed in two steps :
- changes from the above commit have been reverted. This means that
  server_parse_maxconn_change_request must again be called with the
  server lock.

- to counter the deadlock fixed by the above commit, process_srv_queue
  now takes an argument to render the server locking optional if the
  caller already held it. This is only used by
  server_parse_maxconn_change_request.

The above commit was subject to backport up to 1.8. Thus this commit
must be backported in every release where it is already present.

CLEANUP: Prevent channel-t.h from being detected as C++ by GitHub

GitHub uses github/linguist to determine the programming language used for each
source file to show statistics and to power the search. In cases of unique file
extensions this is easy, but for `.h` files the situation is less clear as they
are used for C, C++, Objective C and more. In these cases linguist makes use of
heuristics to determine the language.

One of these heuristics for C++ is that the file contains a line beginning with
`try`, only preceded by whitespace indentation. This heuristic matches the long
comment at the bottom of `channel-t.h`, as one sentence includes the word `try`
after a linebreak.

Fix this misdetection by changing the comment to follow the convention that all
lines start with an asterisk.

MINOR: queue: update the stream's pend_pos before queuing it

Since commit c7eedf7a5 ("MINOR: queue: reduce the locked area in
pendconn_add()") the stream's pend_pos is set out of the lock, after
the pendconn is queued. While this entry is only manipulated by the
stream itself and there is no bug caused by this right now, it's a
bit dangerous because another thread could decide to look at this
field during dequeuing and could randomly see something else. Also
in case of crashes, memory inspection wouldn't be as trustable.
Let's assign the pendconn before it can be found in the queue.

REGTESTS: server: test ssl support for dynamic servers

Create a new regtest to test SSL support for dynamic servers.

The first step of the test is to create the ca-file via the CLI. Then a
dynamic server is created with the ssl option using the ca-file. A
client request is made through it to achieve the test.

MINOR: ssl: support ssl keyword for dynamic servers

Activate the 'ssl' keyword for dynamic servers. This is the final step
to have ssl dynamic servers feature implemented. If activated,
ssl_sock_prepare_srv_ctx will be called at the end of the 'add server'
CLI handler.

At the same time, update the management doc to list all ssl keywords
implemented for dynamic servers.

MINOR: ssl: enable a series of ssl keywords for dynamic servers

These keywords are deemed safe-enough to be enable on dynamic servers.
Their parsing functions are simple and can be called at runtime.

- allow-0rtt
- alpn
- ciphers
- ciphersuites
- force-sslv3/tlsv10/tlsv11/tlsv12/tlsv13
- no-sslv3/tlsv10/tlsv11/tlsv12/tlsv13
- no-ssl-reuse
- no-tls-tickets
- npn
- send-proxy-v2-ssl
- send-proxy-v2-ssl-cn
- sni
- ssl-min-ver
- ssl-max-ver
- tls-tickets
- verify
- verifyhost

'no-ssl-reuse' and 'no-tls-tickets' are enabled to override the default
behavior.

'tls-tickets' is enable to override a possible 'no-tls-tickets' set via
the global option 'ssl-default-server-options'.

'force' and 'no' variants of tls method options are useful to override a
possible 'ssl-default-server-options'.

MINOR: ssl: support crl arg for dynamic servers

File-access through ssl_store_load_locations_file is deactivated if
srv_parse_crl is used at runtime for a dynamic server. The crl must
have already been loaded either in the config or through the 'ssl crl'
CLI commands.

MINOR: ssl: support crt arg for dynamic servers

File-access through ssl_store_load_locations_file is deactivated if
srv_parse_crt is used at runtime for a dynamic server. The cert must
have already been loaded either in the config or through the 'ssl cert'
CLI commands.

MINOR: ssl: support ca-file arg for dynamic servers

File-access through ssl_store_load_locations_file is deactivated if
srv_parse_ca_file is used at runtime for a dynamic server. The ca-file
must have already been loaded either in the config or through the 'ssl
ca-file' CLI commands.

MINOR: ssl: split parse functions for alpn/check-alpn

This will be in preparation for support of ssl on dynamic servers. The
'alpn' keyword will be allowed for dynamic servers but not the
'check-alpn'.

The alpn parsing is extracted into a new function parse_alpn. Each
srv_parse_alpn and srv_parse_check_alpn called it.

MINOR: ssl: render file-access optional on server crt loading

The function ssl_sock_load_srv_cert will be used at runtime for dynamic
servers. If the cert is not loaded on ckch tree, we try to access it
from the file-system.

Now this access operation is rendered optional by a new function
argument. It is only allowed at parsing time, but will be disabled for
dynamic servers at runtime.

MINOR: server: disable CLI 'set server ssl' for dynamic servers

'set server ssl' uses ssl parameters from default-server. As dynamic
servers does not reuse any default-server parameters, this command has
no sense for them.

MINOR: ssl: check allocation in parse npn/sni

These checks are especially required now as this function will be used
at runtime for dynamic servers.

MINOR: ssl: check allocation in parse ciphers/ciphersuites/verifyhost

These checks are especially required now as this function will be used
at runtime for dynamic servers.

MINOR: ssl: check allocation in ssl_sock_init_srv

These checks are especially required now as this function will be used
at runtime for dynamic servers.

MINOR: ssl: always initialize random generator

Explicitly call ssl_initialize_random to initialize the random generator
in init() global function. If the initialization fails, the startup is
interrupted.

This commit is in preparation for support of ssl on dynamic servers. To
be able to activate ssl on dynamic servers, it is necessary to ensure
that the random generator is initialized on startup regardless of the
config. It cannot be called at runtime as access to /dev/urandom is
required.

This also has the effect to fix the previous non-consistent behavior.
Indeed, if bind or server in the config are using ssl, the
initialization function was called, and if it failed, the startup was
interrupted. Otherwise, the ssl initialization code could have been
called through the ssl server for lua, but this times without blocking
the startup on error. Or not called at all if lua was deactivated.

MINOR: ssl: fix typo in usage for 'new ssl ca-file'

Fix the usage for the command new ssl ca-file, which has a missing '-'
dash separator.

BUG/MINOR: cache: Correctly handle existing-but-empty 'accept-encoding' header

RFC 7231#5.3.4 makes a difference between a completely missing
'accept-encoding' header and an 'accept-encoding' header without any values.

This case was already correctly handled by accident, because an empty accept
encoding does not match any known encoding. However this resulted in the
'other' encoding being added to the bitmap. Usually this also succeeds in
serving cached responses, because the cached response likely has no
'content-encoding', thus matching the identity case instead of not serving the
response, due to the 'other' encoding. But it's technically not 100% correct.

Fix this by special-casing 'accept-encoding' values with a length of zero and
extend the test to check that an empty accept-encoding is correctly handled.
Due to the reasons given above the test also passes without the change in
cache.c.

Vary support was added in HAProxy 2.4. This fix should be backported to 2.4+.

BUG/MINOR: server/cli: Fix locking in function processing "set server" command

The commit c7b391aed ("BUG/MEDIUM: server/cli: Fix ABBA deadlock when fqdn
is set from the CLI") introduced 2 bugs. The first one is a typo on the
server's lock label (s/SERVER_UNLOCK/SERVER_LOCK/). The second one is about
the server's lock itself. It must be acquired to execute the "agent-send"
subcommand.

The patch above is marked to be backported as far as 1.8. Thus, this one
must also backported as far 1.8.

BUG/MINOR: server/cli: Don't forget to lock server on agent-send subcommand

BUG/MINOR: resolvers: Use resolver's lock in resolv_srvrq_expire_task()

The commit dcac41806 ("BUG/MEDIUM: resolvers: Add a task on servers to check
SRV resolution status") introduced a type. In resolv_srvrq_expire_task()
function, the resolver's lock must be used instead of the resolver itself.

This patch must be backported with the patch above (at least as far as 2.2).

BUG/MINOR: backend: do not set sni on connection reuse

When reusing a backend connection, do not reapply the SNI on the
connection. It should already be defined when the connection was
instantiated on a previous connect_server invocation. As the SNI is a
parameter used to select a connection, only connection with same value
can be reused.

The impact of this bug is unknown and may be null. No memory leak has
been reported by valgrind. So this is more a cleaning fix.

This commit relies on the SF_SRV_REUSED flag and thus depends on the
following fix :
BUG/MINOR: backend: restore the SF_SRV_REUSED flag original purpose

This should be backported up to 2.4.

BUG/MINOR: backend: restore the SF_SRV_REUSED flag original purpose

The SF_SRV_REUSED flag was set if a stream reused a backend connection.
One of its purpose is to count the total reuse on the backend in
opposition to newly instantiated connection.

However, the flag was diverted from its original purpose since the
following commit :

e8f5f5d8b228d71333fb60229dc908505baf9222
BUG/MEDIUM: servers: Only set SF_SRV_REUSED if the connection if fully ready.

With this change, the flag is not set anymore if the mux is not ready
when a connection is picked for reuse. This can happen for multiplexed
connections which are inserted in the available list as soon as created
in http-reuse always mode. The goal of this change is to not retry
immediately this request in case on an error on the same server if the
reused connection is not fully ready.

This change is justified for the retry timeout handling but it breaks
other places which still uses the flag for its original purpose. Mainly,
in this case the wrong 'connect' backend counter is incremented instead
of the 'reuse' one. The flag is also used in http_return_srv_error and
may have an impact if a http server error is replied for this stream.

To fix this problem, the original purpose of the flag is restored by
setting it unconditionaly when a connection is reused. Additionally, a
new flag SF_SRV_REUSED_ANTICIPATED is created. This flag is set when the
connection is reused but the mux is not ready yet. For the timeout
handling on error, the request is retried immediately only if the stream
reused a connection without this newly anticipated flag.

This must be backported up to 2.1.

BUG/MEDIUM: resolvers: Add a task on servers to check SRV resolution status

When a server relies on a SRV resolution, a task is created to clean it up
(fqdn/port and address) when the SRV resolution is considered as outdated
(based on the resolvers 'timeout' value). It is only possible if the server
inherits outdated info from a state file and is no longer selected to be
attached to a SRV item. Note that most of time, a server is attached to a
SRV item. Thus when the item becomes obsolete, the server is cleaned
up.

It is important to have such task to be sure the server will be free again
to have a chance to be resolved again with fresh information. Of course,
this patch is a workaround to solve a design issue. But there is no other
obvious way to fix it without rewritting all the resolvers part. And it must
be backportable.

This patch relies on following commits:
* MINOR: resolvers: Clean server in a dedicated function when removing a SRV item
* MINOR: resolvers: Remove server from named_servers tree when removing a SRV item

All the series must be backported as far as 2.2 after some observation
period. Backports to 2.0 and 1.8 must be evaluated.

MINOR: resolvers: Remove server from named_servers tree when removing a SRV item

When a server is cleaned up because the corresponding SRV item is removed,
we always remove the server from the srvrq's name_servers tree. For now, it
is useless because, if a server was attached to a SRV item, it means it was
already removed from the tree. But it will be mandatory to fix a bug.

MINOR: resolvers: Clean server in a dedicated function when removing a SRV item

A dedicated function is now used to clean up servers when a SRV item becomes
obsolete or when a requester is removed from a resolution. This patch is
mandatory to fix a bug.

BUG/MEDIUM: server/cli: Fix ABBA deadlock when fqdn is set from the CLI

To perform servers resolution, the resolver's lock is first acquired then
the server's lock when necessary. However, when the fqdn is set via the CLI,
the opposite is performed. So, it is possible to experience an ABBA
deadlock.

To fix this bug, the server's lock is acquired and released for each
subcommand of "set server" with an exception when the fqdn is set. The
resolver's lock is first acquired. Of course, this means we must be sure to
have a resolver to lock.

This patch must be backported as far as 1.8.

BUG/MINOR: server: Forbid to set fqdn on the CLI if SRV resolution is enabled

If a server is configured to rely on a SRV resolution, we must forbid to
change its fqdn on the CLI. Indeed, in this case, the server retrieves its
fqdn from the SRV resolution. If the fqdn is changed via the CLI, this
conflicts with the SRV resolution and leaves the server in an undefined
state. Most of time, the SRV resolution remains enabled with no effect on
the server (no update). Some time the A/AAAA resolution for the new fqdn is
not enabled at all. It depends on the server state and resolver state when
the CLI command is executed.

This patch must be backported as far as 2.0 (maybe to 1.8 too ?) after some
observation period.

CLEANUP: server: a separate function for initializing the per_thr field

To avoid repeating the same source code, allocating memory and initializing
the per_thr field from the server structure is transferred to a separate
function.

CI: ssl: keep the old method for ancient OpenSSL versions

I forgot about OpenSSL 1.0.2, which neither supports the build_sw target
to build only the software, nor reliably supports parallel builds. Given
that we're building 1.0.2 and 3.0.0, let's stay on the safe side and
keep 1.x sequential.

CI: ssl: do not needlessly build the OpenSSL docs

1/4 of the OpenSSL build time is spent building the docs, let's just
build the software and not the doc, by replacing the "all" target
with "build_sw". With this my build time drops from 1'28 to 1'09.

Nothing was done for the other libs, as it's unknown whether they
provide specific build targets.

CI: ssl: enable parallel builds for OpenSSL on Linux

Running the "make all" phase on my machine with -j$(nproc) shrinks the
build time from 4'52 to 1'28. It will not be that big of a change in
the CI since it looks like two CPUs are exposed, but it should still
remain a net win. Let's enable it. The install phase obviously remains
sequential however.

REGTESTS: Remove support for REQUIRE_BINARIES

This is no longer used since the migration to the native `feature cmd`
functionality.

REGTESTS: Replace REQUIRE_BINARIES with 'command -v'

This migrates the tests to the native `feature cmd` functionality of VTest.

REGTESTS: Replace REQUIRE_OPTIONS with 'haproxy -cc' for 2.5+ tests

This migrates the tests for HAProxy versions that support '-cc' to the native
VTest functionality.

REGTESTS: Replace REQUIRE_VERSION=2.5 with 'haproxy -cc'

This is safe, because running `haproxy -cc 'version_atleast(2.5-dev0)'` on
HAProxy 2.4 will also result in an exit code of 1.

CI: Replace the requirement for 'sudo' with a call to 'ulimit -n'

Using 'sudo' required quite a few workarounds in various places. Setting an
explicit 'ulimit -n' removes the requirement for 'sudo', resulting in a cleaner
workflow configuration.

CLEANUP: assorted typo fixes in the code and comments

This is 24th iteration of typo fixes

BUG/MINOR: mux-h2/traces: bring back the lost "sent H2 REQ/RES" traces

In 2.4, commit d1ac2b90c ("MAJOR: htx: Remove the EOM block type and
use HTX_FL_EOM instead") changed the HTX processing to destroy the
blocks as they are processed. So the traces that were emitted at the
end of the send headers functions didn't have anything to show.

Let's move these traces earlier in the function, right before the HTX
processing, so that everything is still in place.

This should be backported to 2.4.

BUG/MINOR: mux-h2/traces: bring back the lost "rcvd H2 REQ" trace

Since commit 7d013e796 ("BUG/MEDIUM: mux-h2: Xfer rxbuf to the upper
layer when creating a front stream"), the rxbuf is lost during the
call to h2c_frt_stream_new(), so the trace that happens later cannot
find a request there and we've lost the useful part indicating what
the request looked like. Let's move the trace before this call.

This should be backported to 2.4.

MINOR: mux-h2: obey http-ignore-probes during the preface

We're seeing some browsers setting up multiple connections and closing
some to just keep one. It looks like they do this in case they'd
negotiate H1. This results in aborted prefaces and log pollution about
bad requests and "PR--" in the status flags.

We already have an option to ignore connections with no data, it's called
http-ignore-probes. But it was not used by the H2 mux. However it totally
makes sense to use it during the preface.

This patch changes this so that connections aborted before sending the
preface can avoid being logged.

This should be backported to 2.4 and 2.3 at least, and probably even
as far as 2.0.

BUG/MINOR: stats: make "show stat typed desc" work again

As part of the changes to support per-module stats data in 2.3-dev6
with commit ee63d4bd6 ("MEDIUM: stats: integrate static proxies stats
in new stats"), a small change resulted in the description field to
be replaced by the name field, making it pointless. Let's fix this
back.

This should fix issue #1291. Thanks to Nick Ramirez for reporting this
issue.

This patch can be backported to 2.3.

CLEANUP: mux-h2/traces: better align user messages

"sent H2 request" was already misaligned with the 3 other ones
(sent/rcvd, request/response), and now with "new H2 connection" that's
yet another alignment making the traces even less legible. Let's just
realign all 5 messages, this even eases quick pointer comparisons. This
should probably be backported to 2.4 as it's where it's the most likely
to be used in the mid-term.

MINOR: mux-h2/trace: report a few connection-level info during h2_init()

It is currently very difficult to match some H2 trace outputs against
some log extracts because there's no exactly equivalent info.

This patch tries to address this by adding a TRACE_USER() call in h2_init()
that is matched in h2_trace() to report:
  - connection pointer and direction
  - frontend's name or server's name
  - transport layer and control layer (e.g. "SSL/tcpv4")
  - source and/or destination depending on what is set

This now permits to get something like this at verbosity level complete:

  <0>2021-06-16T18:30:19.810897+02:00 [00|h2|1|mux_h2.c:1006] new H2 connection : h2c=0x19fee50(F,PRF) : conn=0x7f373c026850(IN) fe=h2gw RAW/tcpv4 src=127.0.0.1:19540
  <0>2021-06-16T18:30:19.810919+02:00 [00|h2|1|mux_h2.c:2731] rcvd H2 request  : h2c=0x19fee50(F,FRH)
  <0>2021-06-16T18:30:19.810998+02:00 [00|h2|1|mux_h2.c:1006] new H2 connection : h2c=0x1a04ee0(B,PRF) : conn=0x1a04ce0(OUT) sv=h2gw/s1 RAW/tcpv4 dst=127.0.0.1:4446

MINOR: connection: add helper conn_append_debug_info()

This function appends to a buffer some information from a connection.
This will be used by traces and possibly some debugging as well. A
frontend/backend/server, transport/control layers, source/destination
ip:port, connection pointer and direction are reported depending on
the available information.

BUG/MINOR: mux-h1: do not skip the error response on bad requests

Since 2.4-dev3 with commit c4bfa59f1 ("MAJOR: mux-h1: Create the client
stream as later as possible"), a request error doesn't result in any
error response if "option http-ignore-probes" is set, there's just a
close. This is caused by an unneeded b_reset() in h1_process_demux()'s
error path, which makes h1_handle_bad_req() believe there was an empty
request. There is no reason for this reset to be there, it must have
been a leftover of an earlier attempt at dealing with the error, let's
drop it.

This should be backported to 2.4.

MINOR: backend: only skip LB when there are actual connections

In 2.3, a significant improvement was brought against situations where
the queue was heavily used, because some LB algos were still checked
for no reason before deciding to put the request into the queue. This
was commit 82cd5c13a ("OPTIM: backend: skip LB when we know the backend
is full").

As seen in previous commit ("BUG/MAJOR: queue: set SF_ASSIGNED when
setting strm->target on dequeue") the dequeuing code is extremely
tricky, and the optimization above tends to emphasize transient issues
by making them permanent until the next reload, which is not acceptable
as the code must always be robust against any bad situation.

This commit brings a protection against such a situation by slightly
relaxing the test. Instead of checking that there are pending connections
in the backend queue, it also verifies that the backend's connections are
not solely composed of queued connections, which would then indicate we
are in this situation. This is not rocket science, but at least if the
situation happens, we know that it will unlock by itself once the streams
have left, as new requests will be allowed to reach the servers and to
flush the queue again.

This needs to be backported to 2.4 and 2.3.

BUG/MAJOR: queue: set SF_ASSIGNED when setting strm->target on dequeue

Commit 82cd5c13a ("OPTIM: backend: skip LB when we know the backend is
full") has uncovered a long-burried bug in the dequeing code: when a
server releases a connection, it picks a new one from the proxy's or
its queue. Technically speaking it only picks a pendconn which is a
link between a position in the queue and a stream. It then sets this
pendconn's target to itself, and wakes up the stream's task so that
it can try to connect again.

The stream then goes through the regular connection setup phases,
calls back_try_conn_req() which calls pendconn_dequeue(), which
sets the stream's target to the pendconn's and releases the pendconn.
It then reaches assign_server() which sees no SF_ASSIGNED and calls
assign_server_and_queue() to perform load balancing or queuing. This
one first destroys the stream's target and gets ready to perform load
balancing. At this point we're load-balancing for no reason since we
already knew what server was available. And this is where the commit
above comes into play: the check for the backend's queue above may
detect other connections that arrived in between, and will immediately
return FULL, forcing this request back into the queue. If the server
had a very low maxconn (e.g. 1 due to a long slowstart), it's possible
that this evicted connection was the last one on the server and that
no other one will ever be present to process the queue. Usually a
regularly processed request will still have its own srv_conn that will
be used during stream_free() to dequeue other connections. But if the
server had a down-up cycle, then a call to pendconn_grab_from_px()
may start to dequeue entries which had no srv_conn and which will have
no server slot to offer when they expire, thus maintaining the situation
above forever. Worse, as new requests arrive, there are always some
requests in the queue and the situation feeds on itself.

The correct fix here is to properly set SF_ASSIGNED in pendconn_dequeue()
when the stream's target is assigned (as it's what this flag means), so
as to avoid a load-balancing pass when dequeuing.

Many thanks to Pierre Cheynier for the numerous detailed traces he
provided that helped narrow this problem down.

This could be backported to all stable versions, but in practice only
2.3 and above are really affected since the presence of the commit
above. Given how tricky this code is it's better to limit it to those
versions that really need it.

CLEANUP: shctx: remove the different inter-process locking techniques

With a single process, we don't need to USE_PRIVATE_CACHE, USE_FUTEX
nor USE_PTHREAD_PSHARED anymore. Let's only keep the basic spinlock
to lock between threads.

MEDIUM: config: warn about "bind-process" deprecation

Let's indicate that "bind-process" is deprecated and scheduled for
removal in 2.7, as it only supports "1".

DOC: update references to process numbers in cpu-map and bind-process

Let's mention that cpu-map is limited to process 1 and that bind-process
is deprecated. Other minor adjustments were made to "process" on bind
lines.

MEDIUM: global: remove the relative_pid from global and mworker

The relative_pid is always 1. In mworker mode we also have a
child->relative_pid which is always equalt relative_pid, except for a
master (0) or external process (-1), but these types are usually tested
for, except for one place that was amended to carefully check for the
PROC_O_TYPE_WORKER option.

Changes were pretty limited as most usages of relative_pid were for
designating a process in stats output and peers protocol.

CLEANUP: global: remove unused definition of MAX_PROCS

This one was forced to 1 and the only reference was a test to verify it
was comprised between 1 and LONGBITS.

MEDIUM: cpu-set: make the proc a single bit field and not an array

We only have a single process now so we don't need to store the per-proc
CPU binding anymore.

MEDIUM: config: simplify cpu-map handling

As there's no more nbproc>1, we can remove some loops and tests in cpu-map.
Both the lack of thread number and thread 1 can count as the whole process
now (which is still used for whole process binding when threads are disabled).

MEDIUM: global: remove dead code from nbproc/bind_proc removal

Lots of places iterating over nbproc or comparing with nbproc could be
simplified. Further, "bind-process" and "process" parsing that was
already limited to process 1 or "all" or "odd" resulted in a bind_proc
field that was either 0 or 1 during the init phase and later always 1.

All the checks for compatibilities were removed since it's not possible
anymore to run a frontend and a backend on different processes or to
have peers and stick-tables bound on different ones. This is the largest
part of this patch.

The bind_proc field was removed from both the proxy and the receiver
structs.

Since the "process" and "bind-process" directives are still parsed,
configs making use of correct values allowing process 1 will continue
to work.

CLEANUP: global: remove pid_bit and all_proc_mask

They were already set to 1 and never changed. Let's remove them and
replace their references with 1.

CLEANUP: global: remove the nbproc field from the global structure

Let's use 1 in the rare places where it was still referenced since it's
now its only possible value.

MINOR: mworker: remove the initialization loop over processes

There was a loop used to prepare structures for all current processes.
Let's just assume there's a single iteration now.

MEDIUM: init: remove the loop over processes during init

There was a loop iterating over all nbproc values during init that
couldn't be immediately removed because the loop's index was used
to distinguish a child from a parent. That's now fixed by replacing
the iterator with an in_parent flag. All bindings that were checking
(1UL << proc) or cpu_map.proc[proc] were adjusted to always use zero
for proc.

CLEANUP: global: remove unused definition of stopping_task[]

This is a leftover of a previous attempt that was introduced in 2.4 by
commit d3a88c1c3 ("MEDIUM: connection: close front idling connection on
soft-stop"). It can be backported, as the variable doesn't exist.