git.ipfire.org Git - thirdparty/haproxy.git/log

BUG/MEDIUM: filters: Forward all filtered data at the end of http filtering

When http filtering ends, if there are some filtered data not forwarded yet, we
forward them, in flt_http_end(). Most of time, this doesn't happen, except when
a tunnel is established using a CONNECT. In this case, there is not EOM on the
request and there is no body. Thus the headers are never forwarded, blocking the
stream.

This patch must be backported as far as 2.0. Prior versions don't suffer of this
bug because there is no HTX support. On the 2.0, the change is only applicable
on HTX streams. A special test must be performed to make sure.

REGTESTS: Add sample_fetches/cook.vtc

Add a reg-test verifying the fix in dea7c209f8a77b471323dd97bdc1ac4d7a17b812.

Some parts of the configuration used in the were taken from the initial bug
report from Maciej.

Should be backported together with dea7c209f8a77b471323dd97bdc1ac4d7a17b812
(all stable versions).

Co-authored-by: Maciej Zdeb <maciej@zdeb.pl>

REGTEST: make ssl_client_samples and ssl_server_samples require to 2.2

Some missing sample fetches was backported to 2.2 making these tests compatible
with the 2.2.

MINOR: cfgparse: tighten the scope of newnameserver variable, free it on error.

This should fix issue GH #931.

Also remove a misleading comment.

This commit can be backported as far as 1.9

CLEANUP: config: Return ERR_NONE from config callbacks instead of 0

Return ERR_NONE instead of 0 on success for all config callbacks that should
return ERR_* codes. There is no change because ERR_NONE is a macro equals to
0. But this makes the return value more explicit.

MINOR: config/mux-h2: Return ERR_ flags from init_h2() instead of a status

post-check function callbacks must return ERR_* flags. Thus, init_h2() is fixed
to return ERR_NONE on success or (ERR_ALERT|ERR_FATAL) on error.

This patch may be backported as far as 2.2.

MINOR: init: Fix the prototype for per-thread free callbacks

Functions registered to release memory per-thread have no return value. But the
registering function and the function pointer in per_thread_free_fct structure
specify it should return an integer. This patch fixes it.

This patch may be backported as far as 2.0.

BUG/MINOR: tcpcheck: Don't warn on unused rules if check option is after

When tcp-check or http-check rules are used, if the corresponding check option
(option tcp-check and option httpchk) is declared after the ruleset, a warning
is emitted about an unused check ruleset while there is no problem in reality.

This patch must be backported as far as 2.2.

MINOR: spoe: Don't close connection in sync mode on processing timeout

In sync mode, if an applet receives a ack while the processing delay has already
expired, there is not frame waiting for this ack. But there is no reason to
close the connection in this case. The ack may be ignored and the connection may
be reused to process another frame. The only reason to trigger an error and
close the connection is when the wrong ack is received while there is still a
frame waiting for its ack. In sync mode, this should never happen.

This patch may be backported in all versions supporting the SPOE.

BUG/MAJOR: spoe: Be sure to remove all references on a released spoe applet

When a SPOE applet is used to send a frame, a reference on this applet is saved
in the spoe context of the offladed stream. But, if the applet is released
before receving the corresponding ack, we must be sure to remove this
reference. This was performed for fragmented frames only. But it must also be
performed for a spoe contexts in the applet waiting_queue and in the thread
waiting_queue (used in async mode).

This bug leads to a memory corruption when an offloaded stream try to update the
state of a released applet because it still have a reference on it. There are
many ways to trigger this bug. The easiest is probably during reloads. On the
old process, all applets are woken up to be released ASAP.

Many thanks to Maciej Zdeb to report the bug and to work on it for 2
months. Without his help, it would have been much more difficult to fix the
bug. It is always a huge pleasure to see how some users are enthousiast and
helpful. Thanks again Maciej !

This patch must be backported to all versions where the spoe is supported (>=
1.7).

BUG/MINOR: http-htx: Handle warnings when parsing http-error and http-errors

First of all, this patch is tagged as a bug. But in fact, it only fixes a bug in
the 2.2. On the 2.3 and above, it only add the ability to display warnings, when
an http-error directive is parsed from a proxy section and when an errorfile
directive is parsed from a http-errors section.

But on the 2.2, it make sure to display the warning emitted on a content-length
mismatch when an errorfile is parsed. The following is only applicable to the
2.2.

commit "BUG/MINOR: http-htx: Just warn if payload of an errorfile doesn't match
the C-L" (which is only present in 2.2, 2.1 and 2.0 trees, i.e see commit
7bf3d81d3cf4b9f4587 in 2.2 tree), is changing the behavior of `http_str_to_htx`
function. It may now emit warnings. And, it is the caller responsibility to
display it.

But the warning is missing when an 'http-error' directive is parsed from
a proxy section. It is also missing when an 'errorfile' directive is
parsed from a http-errors section.

This bug only exists on the 2.2. On earlier versions, these directives
are not supported and on later ones, an error is triggered instead of a
warning.

Thanks to William Dauchy that spotted the bug.

This patch must be backported as far as 2.2.

MINOR: check: report error on incompatible connect proto

Report an error when using an explicit proto for a connect rule with
non-compatible mode in regards with the selected check type (tcp-check
vs http-check).

MINOR: check: report error on incompatible proto

If the check mux has been explicitly defined but is incompatible with
the selected check type (tcp-check vs http-check), report a warning and
prevent haproxy startup.

BUG/MEDIUM: check: reuse srv proto only if using same mode

Only reuse the mux from server if the check is using the same mode.
For example, this prevents a tcp-check on a h2 server to select the h2
multiplexer instead of passthrough.

This bug was introduced by the following commit :
BUG/MEDIUM: checks: Use the mux protocol specified on the server line
It must be backported up to 2.2.

Fixes github issue #945.

BUG/MINOR: http-fetch: Fix calls w/o parentheses of the cookie sample fetches

req.cook, req.cook_val, req.cook_cnt and and their response counterparts may be
called without cookie name. In this case, empty parentheses may be used, or no
parentheses at all. In both, the result must be the same. But only the first one
works. The second one always returns a failure. This patch fixes this bug.

Note that on old versions (< 2.2), both cases fail.

This patch must be backported in all stable versions.

BUG/MINOR: http-fetch: Extract cookie value even when no cookie name

HTTP sample fetches dealing with the cookies (req/res.cook,
req/res.cook_val and req/res.cook_cnt) must be prepared to be called
without cookie name. For the first two, the first cookie value is
returned, regardless its name. For the last one, all cookies are counted.

To do so, http_extract_cookie_value() may now be called with no cookie
name (cookie_name_l set to 0). In this case, the matching on the cookie
name is ignored and the first value found is returned.

Note this patch also fixes matching on cookie values in ACLs.

This should be backported in all stable versions.

BUG/MEDIUM: peers: fix decoding of multi-byte length in stick-table messages

There is a bug in peer_recv_msg() due to an incorrect cast when trying
to decode the varint length of a stick-table message, causing lengths
comprised between 128 and 255 to consume one extra byte, ending in
protocol errors. The root cause of this is that peer_recv_msg() tries
hard to reimplement all the parsing and control that is already done in
intdecode() just to measure the length before calling it. And it got it
wrong.

Let's just get rid of this unneeded code duplication and solely rely on
intdecode() instead. The bug was introduced in 2.0 as part of a cleanup
pass on this code with commit 95203f218 ("MINOR: peers: Move high level
receive code to reduce the size of I/O handler."), so this patch must
be backported to 2.0.

Thanks to Yves Lafon for reporting the problem.

BUG/MINOR: peers: Missing TX cache entries reset.

The TX part of a cache for a dictionary is made of an reserved array of ebtree nodes
which are pointers to dictionary entries. So when we flush the TX part of such a
cache, we must not only remove these nodes to dictionary entries from their ebtree.
We must also reset their values. Furthermore, the LRU key and the last lookup
result must also be reset.

BUG/MINOR: peers: Do not ignore a protocol error for dictionary entries.

If we could not decode the ID of a dictionary entry from a peer update message,
we must inform the remote peer about such an error as this is done for
any other decoding error.

MINOR: peers: Add traces to peer_treat_updatemsg().

Add minimalistic traces for peers with only one event to diagnose potential
issues when decode peer update messages.

BUG/MEDIUM: stats: prevent crash if counters not alloc with dummy one

Define a per-thread counters allocated with the greatest size of any
stat module counters. This variable is named trash_counters.

When using a proxy without allocated counters, return the trash counters
from EXTRA_COUNTERS_GET instead of a dangling pointer to prevent
segfault.

This is useful for all the proxies used internally and not
belonging to the global proxy list. As these objects does not appears on
the stat report, it does not matter to use the dummy counters.

For this fix to be functional, the extra counters are explicitly
initialized to NULL on proxy/server/listener init functions.

Most notably, the crash has already been detected with the following
vtc:
- reg-tests/lua/txn_get_priv.vtc
- reg-tests/peers/tls_basic_sync.vtc
- reg-tests/peers/tls_basic_sync_wo_stkt_backend.vtc
There is probably other parts that may be impacted (SPOE for example).

This bug was introduced in the current release and do not need to be
backported. The faulty commits are
"MINOR: ssl: count client hello for stats" and
"MINOR: ssl: add counters for ssl sessions".

BUG/MINOR: stats: free dynamically stats fields/lines on shutdown

Register a new function on POST DEINIT to free stats fields/lines for
each domain.

This patch does not fix a critical bug but may be backported to 2.3.

MEDIUM: cache: Change caching conditions

Do not cache responses that do not have an explicit expiration time
(s-maxage or max-age Cache-Control directives or Expires header) or a
validator (ETag or Last-Modified headers) anymore, as suggested in
RFC 7234#3.
The TX_FLAG_IGNORE flag is used instead of the TX_FLAG_CACHEABLE so as
not to change the behavior of the checkcache option.

BUG/MINOR: lua: set buffer size during map lookups

This size is used by some pattern matching to determine if there
is sufficient room in the buffer to add final \0 if necessary.
If the size is not set, the conditions use uninitialized value.

Note: it seems this bug can't cause a crash.

Should be backported until 2.2 (at least)

BUG/MINOR: pattern: a sample marked as const could be written

The functions add final 0 to string if the final 0 is not set,
but don't check the flag CONST. This patch duplicates the strings
if the final zero is not set and the string is CONST.

Should be backported until 2.2 (at least)

REGTEST: ssl: mark reg-tests/ssl/ssl_crt-list_filters.vtc as broken

This regtest requires a version of OpenSSL which supports the
ClientHello callback which is only the case of recents SSL libraries
(openssl 1.1.1).

This was reported in issue #944.

CI: Expand use of GitHub Actions for CI

Travis is becoming overall increasingly unreliable lately. We've already
seen that the timing sensitive tests regularly fail and thus they were
disabled.

Additionally they recently announced a new pricing model that caps the number
of minutes for Open Source projects:
https://blog.travis-ci.com/2020-11-02-travis-ci-new-billing

GitHub Actions VMs are working well, possibly allowing to use custom runners
for special tasks in the future.

In addition to this better performance its workflow configuration language
is more expressive compared to the Travis CI one. Specifically the build
matrix does not need to be specified in YAML. Instead it can be generated
ad-hoc using a script. This allows us to cleanly define the various build
configurations without having an unreadable 80 line mess where the flags
are inconsistently activated. As an example in the current Travis CI
configuration the prometheus exporter is tested together with LibreSSL 2.9.2
for whatever reason.

In addition to all the previous points the UI of Travis is not that nice.
On GitHub you are just seeing that "Travis failed" without any details which
exact job failed. This requires you to visit the slow Travis page and look
up the details there. GitHub Actions creates a single entry for each
configuration that is tested, allowing you to see the details without needing
to leave GitHub.

This new GitHub Actions workflow aims to reproduce the configurations tested
in Travis. It comes close, but is not completely there, yet. Consider this
patch a proof of concept that will evolve in the future, ideally with Ilya's
expertise.

The current configurations are as follows. Each one is tested with both gcc
and clang.
- All features disabled (no USE flags)
- All features enabled (all USE flags)
- Standalone test of each of the supported compression libraries:
  - USE_ZLIB=1
  - USE_SLZ=1
- Standalone test of various SSL libraries:
  - stock (the SSL installed by default on the VM)
  - OpenSSL 1.0.2u
  - LibreSSL 2.9.2, 3.0.2, 3.1.1
- All features enabled with ASAN (clang only)

Future additions of new tests should take care to not test unrelated stuff.
Instead a distinct configuration should be added.

Additionally there is a Mac OS test with clang and all features disabled.

Known issues:
- Apparently the git commit is not properly detected during build. The HEAD
  currently shows as 2.4-dev0.

BUG/MEDIUM: ssl/crt-list: correctly insert crt-list line if crt already loaded

In issue #940, it was reported that the crt-list does not work correctly
anymore. Indeed when inserting a crt-list line which use a certificate
previously seen in the crt-list, this one won't be inserted in the SNI
list and will be silently ignored.

This bug was introduced by commit 47da821 "MEDIUM: ssl: emulates the
multi-cert bundles in the crtlist".

This patch also includes a reg-test which tests this issue.

This bugfix must be backported in 2.3.

REGTEST: ssl: test wildcard and multi-type + exclusions

This test checks that the bug #818 and #810 are fixed.

It test if there is no inconsistency with multiple certificate types and
that the exclusion of the certificate is correctly working with a negative
filter.

BUILD: http-htx: fix build warning regarding long type in printf

Commit a66adf41e ("MINOR: http-htx: Add understandable errors for the
errorfiles parsing") added a warning when loading malformed error files,
but this warning may trigger another build warning due to the %lu format
used. Let's simply cast it for output since it's just used for end user
output.

This must be backported to 2.0 like the commit above.

BUILD: ssl: silence build warning on uninitialised counters

Since commit d0447a7c3 ("MINOR: ssl: add counters for ssl sessions"),
gcc 9+ complains about this:

  CC      src/ssl_sock.o
src/ssl_sock.c: In function 'ssl_sock_io_cb':
src/ssl_sock.c:5416:3: warning: 'counters_px' may be used uninitialized in this function [-Wmaybe-uninitialized]
5416 |   ++counters_px->reused_sess;
      |   ^~~~~~~~~~~~~~~~~~~~~~~~~~
src/ssl_sock.c:5133:23: note: 'counters_px' was declared here
5133 |  struct ssl_counters *counters, *counters_px;
      |                                  ^~~~~~~~~~~

Either a listener or a server are expected there, so ther counters are
always initialized and the compiler cannot know this. Let's preset
them and test before updating the counter, we're not in a hot path
here.

No backport is needed.

MINOR: server: remove idle lock in srv_cleanup_connections

This function used to grab the idle lock when scanning the threads for
idle connections, but it doesn't need it since the lock only protects
the tree. Let's remove it.

DOC: config: Fix a typo on ssl_c_chain_der

There is a typo on the ssl_c_chain_der sample fetch
(s/ssl_c_der_chain/ssl_c_chain_der/). This implies a move of the fetch to keep
it at the right place.

This should be backported as far as 2.2 or anywhere the commit a598b500b
("MINOR: ssl: add ssl_{c,s}_chain_der fetch methods") is.

MINOR: ssl: add counters for ssl sessions

Add counters for newly established and resumed sessions.

MINOR: ssl: count client hello for stats

Add a counter for ssl client_hello received on frontends.

MINOR: ssl: instantiate stats module

This module is responsible for providing statistics for ssl. It allocates
counters for frontend/backend/listener/server objects.

MINOR: http-htx: Add understandable errors for the errorfiles parsing

No details are provided when an error occurs during the parsing of an errorfile,
Thus it is a bit hard to diagnose where the problem is. Now, when it happens, an
understandable error message is reported.

This patch is not a bug fix in itself. But it will be required to change an
fatal error into a warning in last stable releases. Thus it must be backported
as far as 2.0.

BUG/MINOR: ssl: don't report 1024 bits DH param load error when it's higher

The default dh_param value is 2048 and it's preset to zero unless explicitly
set, so we must not report a warning about DH param not being loadble in 1024
bits when we're going to use 2048. Thanks to Dinko for reporting this.

This should be backported to 2.2.

CLEANUP: cfgparse: remove duplicate registration for transparent build options

Since commit 37bafdcbb ("MINOR: sock_inet: move the IPv4/v6 transparent mode code
to sock_inet"), build options for transparent proxying are registered twice.
This patch removes the older one.

MEDIUM: pattern: turn the pattern chaining to single-linked list

It does not require heavy deletion from the expr anymore, so we can now
turn this to a single-linked list since most of the time we want to delete
all instances of a given pattern from the head. By doing so we save 32 bytes
of memory per pattern. The pat_unlink_from_head() function was adjusted
accordingly.

MINOR: pattern: prepare removal of a pattern from the list head

Instead of using LIST_DEL() on the pattern itself inside an expression,
we look it up from its head. The goal is to get rid of the double-linked
list while this usage remains exclusively for freeing on startup error!

MINOR: pattern: during reload, delete elements frem the ref, not the expression

Instead of scanning all elements from the expression and using the
slow delete path there, let's use the faster way which involves
pat_delete_gen() while the elements are detached from ther reference.

MEDIUM: pattern: make pat_ref_prune() rely on pat_ref_purge_older()

When purging all of a reference, it's much more efficient to scan the
reference patterns from the reference head and delete all derivative
patterns than to scan the expressions. The only thing is that we need
to proceed both for the current and next generations, in case there is
a huge gap between the two. With this, purging 20M IP addresses in small
batches of 100 takes roughly 3 seconds.

MINOR: pattern: add pat_ref_purge_older() to purge old entries

This function will be usable to purge at most a specified number of old
entries from a reference. Entries are declared old if their generation
number is in the past compared to the one passed in argument. This will
ease removal of early entries when new ones have been appended.

We also call malloc_trim() when available, at the end of the series,
because this is one place where there is a lot of memory to save. Reloads
of 1M IP addresses used in an ACL made the process grow up to 1.7 GB RSS
after 10 reloads and roughly stabilize there without this call, versus
only 260 MB when the call is present. Sadly there is no direct equivalent
for jemalloc, which stabilizes around 800MB-1GB.

MINOR: pattern: implement pat_ref_load() to load a pattern at a given generation

pat_ref_load() basically combines pat_ref_append() and pat_ref_commit().
It's very similar to pat_ref_add() except that it also allows to set the
generation ID and the line number. pat_ref_add() was modified to directly
rely on it to avoid code duplication. Note that a previous declaration
of pat_ref_load() was removed as it was just a leftover of an earlier
incarnation of something possibly similar, so no existing functionality
was changed here.

MINOR: pattern: add pat_ref_commit() to commit a previously inserted element

This function will be used after a successful pat_ref_append() to propagate
the pattern to all use places (including parsing and indexing). On failure,
it will entirely roll back all insertions and free the pattern itself. It
also preserves the generation number so that it is convenient for use in
association with pat_ref_append(). pat_ref_add() was modified to rely on
it instead of open-coding the insertion and roll-back.

MEDIUM: pattern: only match patterns that match the current generation

Instead of matching any pattern found in the tree, only match those
matching the current generation of entries. This will make sure that
reloads are atomic, regardless of the time they take to complete, and
that newly added data are not matched until the whole reference is
committed. For consistency we proceed the same way on "show map" and
"show acl".

This will have no impact for now since generations are not used.

MINOR: pattern: store a generation number in the reference patterns

Right now it's not possible to perform a safe reload because we don't
know what patterns were recently added or were already present. This
patch adds a generation counter to the reference patterns so that it
is possible to know what generation of the reference they were loaded
with. A reference now has two generations, the current one, used for
all additions, and the next one, allocated to those wishing to update
the contents. The generation wraps at 2^32 so comparisons must be made
relative to the current position.

The idea will be that upon full reload, the caller will first get a new
generation ID, will insert all new patterns using it, will then switch
the current ID to the new one, and will delete all entries older than
the current ID. This has the benefit of supporting chunked updates that
remain consistent and that won't block the whole process for ages like
pat_ref_reload() currently does.

MINOR: pattern: introduce pat_ref_delete_by_ptr() to delete a valid reference

Till now the only way to remove a known reference was via
pat_ref_delete_by_id() which scans the whole list to find a matching pointer.
Let's add pat_ref_delete_by_ptr() which takes a valid pointer. It can be
called by the function above after the pointer is found, and can also be
used to roll back a failed insertion much more efficiently.

CLEANUP: pattern: remove pat_delete_fcts[] and pattern_head->delete()

These ones are not used anymore, so let's remove them to remove a bit
of the complexity. The ACL keyword's delete() function could be removed
as well, though most keyword declarations are positional and we have a
high risk of introducing a mistake here, so let's not touch the ACL part.

CLEANUP: acl: don't reference the generic pattern deletion function anymore

A few ACL keyword used to reference pat_delete_gen() as the deletion
function but this is not needed since it's the default one now. Let's
just remove this reference.

MINOR: pattern: perform a single call to pat_delete_gen() under the expression

When we're removing an element under the expression lock, we don't need
anymore to run over all ->delete() functions via the expressions, since
we know that the single function does it fine now. Note that at this
point, pattern->delete() is not used at all through out the code anymore.

MINOR: pattern: remerge the list and tree deletion functions

pat_del_tree_gen() was already chained onto pat_del_list_gen() to deal
with remaining cases, so let's complete the merge and have a generic
pattern deletion function acting on the reference and taking care of
reliably removing all elements.

MEDIUM: pattern: change the pat_del_* functions to delete from the references

This is the next step in speeding up entry removal. Now we don't scan
the whole lists or trees for elements pointing to the target reference,
instead we start from the reference and delete all linked patterns.

This simplifies some delete functions since we don't need anymore to
delete multiple times from an expression since all nodes appear after
the reference element. We can now have one generic list and one generic
tree deletion function.

This required the replacement of pattern_delete() with an open-coded
version since we now need to lock all expressions first before proceeding.
This means there is a high risk of lock inversion here but given that the
expressions are always scanned in the same order from the same head, this
must not happen.

Now deleting first entries is instantaneous, and it's still slow to
delete the last ones when looking up their ID since it still requires
to look them up by a full scan, but it's already way faster than
previously. Typically removing the last 10 IP from a 20M entries ACL
with a full-scan each took less than 2 seconds.

It would be technically possible to make use of indexed entries to
speed up most lookups for removal by value (e.g. IP addresses) but
that's for later.

MEDIUM: pattern: link all final elements from the reference

There is a data model issue in the current pattern design that makes
pattern deletion extremely expensive: there's no direct way from a
reference to access all indexed occurrences. As such, the only way
to remove all indexed entries corresponding to a reference update
is to scan all expressions's lists and trees to find a link to the
reference. While this was possibly OK when map removal was not
common and most maps were small, this is not conceivable anymore
with GeoIP maps containing 10M+ entries and del-map operations that
are triggered from http-request rulesets.

This patch introduces two list heads from the pattern reference, one
for the objects linked by lists and one for those linked by tree node.
Ideally a single list would be enough but the linked elements are too
much unrelated to be distinguished at the moment, so we'll need two
lists. However for the long term a single-linked list will suffice but
for now it's not possible due to the way elements are removed from
expressions. As such this patch adds 32 bytes of memory usage per
reference plus 16 per indexed entry, but both will be cut in half
later.

The links are not yet used for deletion, this patch only ensures the
list is always consistent.

MINOR: pattern: make the delete and prune functions more generic

Now we have a single prune() function to act on an expression, and one
delete function for the lists and one for the trees. The presence of a
pointer in the lists is enough to warrant a free, and we rely on the
PAT_SF_REGFREE flag to decide whether to free using free() or regfree().

MINOR: pattern: new sflag PAT_SF_REGFREE indicates regex_free() is needed

Currently we have no way to know how to delete/prune a pattern in a
generic way. A pattern doesn't contain its own type so we don't know
what function to call. Tree nodes are roughly OK but not lists where
regex are possible. Let's add one new bit for sflags at index time to
indicate that regex_free() will be needed upon deletion. It's not used
for now.

CLEANUP: pattern: delete the back refs at once during pat_ref_reload()

It's pointless to delete a backref and relink it to the next entry since
the next entry is going to do the exact same and so on until all of them
are deleted. Let's simply delete backrefs on reload.

MINOR: pattern: move the update revision to the pat_ref, not the expression

It's not possible to uniquely update a single expression without updating
the pattern reference, I don't know why we've put the revision in the
expression back then, given that it in fact provides an update for a
full pattern. Let's move the revision into the reference's head instead.

MEDIUM: pattern: call malloc_trim() on pat_ref_reload()

This is one case where we may release large amounts of data at once. Tests
show that without this, after 10 full reloads of an ACL containing 1M IP
addresses, the memory usage grew and stabilized around 1.7 GB of RSS. With
this change, it stays around 260 MB and is stable across reloads.

MEDIUM: pools: call malloc_trim() from pool_gc()

If available it definitely makes sense to call it since it's also
called when stopping to reclaim the maximum possible memory.

MINOR: compat: automatically include malloc.h on glibc

This is in order to access malloc_trim() which is convenient after
clearing huge maps to reclaim memory. When this is detected, we also
define HA_HAVE_MALLOC_TRIM.

REGTEST: converter: Add a regtest for MQTT converters

This new script tests mqtt_is_valid() and mqtt_get_field_value() converters used
to validate and extract information from a MQTT (Message Queuing Telemetry
Transport) message.

MINOR: sample: Add converts to parses MQTT messages

This patch implements a couple of converters to validate and extract data from a
MQTT (Message Queuing Telemetry Transport) message. The validation consists of a
few checks as well as "packet size" validation. The extraction can get any field
from the variable header and the payload.

This is limited to CONNECT and CONNACK packet types only. All other messages are
considered as invalid. It is not a problem for now because only the first packet
on each side can be parsed (CONNECT for the client and CONNACK for the server).

MQTT 3.1.1 and 5.0 are supported.

Reviewed and Fixed by Christopher Faulet <cfaulet@haproxy.com>

REGTEST: converter: Add a regtest for fix converters

This new script tests fix_is_valid() and fix_tag_value() converters used to
validate and extract information from a FIX (Financial Information eXchange)
message.

MINOR: sample: Add converters to parse FIX messages

This patch implements a couple of converters to validate and extract tag value
from a FIX (Financial Information eXchange) message. The validation consists in
a few checks such as mandatory fields and checksum computation. The extraction
can get any tag value based on a tag string or tag id.

This patch requires the istend() function. Thus it depends on "MINOR: ist: Add
istend() function to return a pointer to the end of the string".

Reviewed and Fixed by Christopher Faulet <cfaulet@haproxy.com>

MINOR: ist: Add istend() function to return a pointer to the end of the string

istend() is a shortcut to istptr() + istlen().

[RELEASE] Released version 2.4-dev0

Released version 2.4-dev0 with the following main changes :
- MINOR: version: it's development again.
- DOC: mention in INSTALL that it's development again

DOC: mention in INSTALL that it's development again

This reverts commit 975908c831f8fa00ccf1afd1f7cfe2ad80a8c8e6.

MINOR: version: it's development again.

This reverts commit 0badabc381bc9c0d6be7479d528a5d549d6185f4.

[RELEASE] Released version 2.3.0

Released version 2.3.0 with the following main changes :
    - CLEANUP: pattern: remove unused entry "tree" in pattern.val
    - BUILD: ssl: use SSL_CTRL_GET_RAW_CIPHERLIST instead of OpenSSL versions
    - BUG/MEDIUM: filters: Don't try to init filters for disabled proxies
    - BUG/MINOR: proxy/server: Skip per-proxy/server post-check for disabled proxies
    - BUG/MINOR: checks: Report a socket error before any connection attempt
    - BUG/MINOR: server: Set server without addr but with dns in RMAINT on startup
    - MINOR: server: Copy configuration file and line for server templates
    - BUG/MEDIUM: mux-pt: Release the tasklet during an HTTP upgrade
    - BUILD: ssl: use HAVE_OPENSSL_KEYLOG instead of OpenSSL versions
    - MINOR: debug: don't count free(NULL) in memstats
    - BUG/MINOR: filters: Skip disabled proxies during startup only
    - MINOR: mux_h2: capitalize frame type in stats
    - MINOR: mux_h2: add stat for total count of connections/streams
    - MINOR: stats: do not display empty stat module title on html
    - BUG/MEDIUM: stick-table: limit the time spent purging old entries
    - BUG/MEDIUM: listener: only enable a listening listener if needed
    - BUG/MEDIUM: listener: never suspend inherited sockets
    - BUG/MEDIUM: listener: make the master also keep workers' inherited FDs
    - MINOR: fd: add fd_want_recv_safe()
    - MEDIUM: listeners: make use of fd_want_recv_safe() to enable early receivers
    - REGTESTS: mark abns_socket as working now
    - CLEANUP: mux-h2: Remove the h1 parser state from the h2 stream
    - MINOR: sock: add a check against cross worker<->master socket activities
    - CI: github actions: limit OpenSSL no-deprecated builds to "default,bug,devel" reg-tests
    - BUG/MEDIUM: server: make it possible to kill last idle connections
    - MINOR: mworker/cli: the master CLI use its own applet
    - MINOR: ssl: define SSL_CTX_set1_curves_list to itself on BoringSSL
    - BUILD: ssl: use feature macros for detecting ec curves manipulation support
    - DOC: Add dns as an available domain to show stat
    - BUILD: makefile: usual reorder of objects for faster builds
    - DOC: update INSTALL to mention that TCC is supported
    - DOC: mention in INSTALL that haproxy 2.3 is a stable version
    - MINOR: version: mention that it's stable now

MINOR: version: mention that it's stable now

This version will be maintained up to around Q1 2022.

DOC: mention in INSTALL that haproxy 2.3 is a stable version

One day we'll have a script to update the status and support dates :-)

DOC: update INSTALL to mention that TCC is supported

TinyCC as found at https://repo.or.cz/tinycc.git does work to some extents
and is very convenient for developers, so let's mention it.

BUILD: makefile: usual reorder of objects for faster builds

Reordered the objets by reverse build times made the total build time
go down from 17.7s to 17.2s at -O2 using make -j8 on my PC, and from
~3.2 to ~2.7s on the build farm.

DOC: Add dns as an available domain to show stat

Within management.txt, proxy was listed as the only available option. "dns"
is now supported so let's add that. This change also updates the command to list
the available options <dns|proxy> for "domain" as previously it only specified
<domain>, which could be confusing as a user may think this field accepts
dynamic options when it actually requires a specific keyword.

BUILD: ssl: use feature macros for detecting ec curves manipulation support

Let us use SSL_CTX_set1_curves_list, defined by OpenSSL, as well as in
openssl-compat when SSL_CTRL_SET_CURVES_LIST is present (BoringSSL),
for feature detection instead of versions.

MINOR: ssl: define SSL_CTX_set1_curves_list to itself on BoringSSL

OpenSSL 1.0.2 and onwards define SSL_CTX_set1_curves_list which is both a
function and a macro. OpenSSL 1.0.2 to 1.1.0 define SSL_CTRL_SET_CURVES_LIST
as a macro, which disappeared from 1.1.1. BoringSSL only has that one and
not the former macro but it does have the function. Let's keep the test on
the macro matching the function name by defining the macro to itself when
needed.

MINOR: mworker/cli: the master CLI use its own applet

Following the patch b4daee ("MINOR: sock: add a check against cross
worker<->master socket activities"), this patch adds a dedicated applet
for the master CLI. It ensures that the CLI connection can't be
used with the master rights in the case of bugs.

BUG/MEDIUM: server: make it possible to kill last idle connections

In issue #933, @jaroslawr provided a report indicating that when using
many threads and many servers, it's very difficult to terminate the last
idle connections on each server. The issue has two causes in fact. The
first one is that during the calculation of the estimate of needed
connections, we round the computation up while in previous round it was
already rounded up, so we end up adding 1 to 1 which once divided by 2
remains 1. The second issue is that servers are not woken up anymore for
purging their connections if they don't have activity. The only reason
that was there to wake them up again was in case insufficient connections
were purged. And even then the purge task itself was not woken up. But
that is not enough for getting rid of the long tail of old connections
nor updating est_need_conns.

This patch makes sure to properly wake up as long as at least one idle
connection remains, and not to round up the needed connections anymore.

Prior to this patch, a test involving many connections which suddenly
stopped would keep many idle connections, now they're effectively halved
every pool-purge-delay.

This needs to be backported to 2.2.

CI: github actions: limit OpenSSL no-deprecated builds to "default,bug,devel" reg-tests

MINOR: sock: add a check against cross worker<->master socket activities

Given that the previous issues caused spurious worker socket wakeups in
the master for inherited FDs that couldn't be closed, let's add a strict
test in the I/O callback to make sure that an accept() event is always
caught by the appropriate type of process (master for master listeners,
worker for worker listeners).

CLEANUP: mux-h2: Remove the h1 parser state from the h2 stream

Since the h2 multiplexer no longer relies on the legacy HTTP representation, and
uses exclusively the HTX, the H1 parser state (h1m) is no longer used by the h2
streams. Thus it can be removed.

This patch may be backported as far as 2.1.

REGTESTS: mark abns_socket as working now

William noticed that the real issue in the abns test was that it was
failing to rebind some listeners, issues which were addressed in the
previous commits (in some cases, the master would even accept traffic
on the worker's socket which was not properly disabled, causing some
of the strange error messages).

The other issue documented in commit 2ea15a080 was the lack of
reliability when the test was run in parallel. This is caused by the
abns socket which uses a hard-coded address, making all tests in
parallel to step onto each others' toes. Since vtest cannot provide
abns sockets, we're instead concatenating the number of the listening
port that vtest allocated for another frontend to the abns path, which
guarantees to make them unique in the system.

The test works fine in all cases now, even with 100 in parallel.

MEDIUM: listeners: make use of fd_want_recv_safe() to enable early receivers

We used to refrain from calling fd_want_recv() if fd_updt was not allocated
but it's not the right solution as this does not allow the FD to be set.
Instead, let's use the new fd_want_recv_safe() which will update the FD and
create an update entry only if possible. In addition, the equivalent test
before calling fd_stop_recv() was removed as totally useless since there's
not fd_updt creation in this case.

MINOR: fd: add fd_want_recv_safe()

This does the same as fd_want_recv() except that it does check for
fd_updt[] to be allocated, as this may be called during early listener
initialization. Previously we used to check fd_updt[] before calling
fd_want_recv() but this is not correct since it does not update the
FD flags. This method will be safer.

BUG/MEDIUM: listener: make the master also keep workers' inherited FDs

In commit 374e9af35 ("MEDIUM: listener: let do_unbind_listener() decide
whether to close or not") it didn't appear necessary to have the master
process keep open the workers' inherited FDs. But this is actually
necessary to handle the reload on "bind fd@foo" situations, otherwise
the FD may be reassigned and the new socket cannot be set up, sometimes
causing "socket operation on non-socket" or other types of errors.

William found that this was the cause for the consistent failures of the
abns regtest, which already used to fail very often before this and was
as such marked as broken.

Interestingly I didn't have this issue with my test configs because
the FD number I used was higher and within the range of other listening
sockets. But this means that one of these wouldn't work as expected.

No backport is needed, this was introduced as part of the listeners
rework in 2.3.

BUG/MEDIUM: listener: never suspend inherited sockets

It is not acceptable to suspend an inherited socket because we'd kill
its listening state, making it possibly unrecoverable for future
processes. The situation which can trigger this is when there is an
abns socket in a config and an inherited FD on another listener. Upon
soft reload, the abns fails to bind, a SIGTTOU is sent to the old
process which suspends everything, including the inherited FD, then
the new process can bind and tell the old one to quit. Except that the
new FD was not set back to the listen state, which is detected by
listener_accept() which can pause it. It's only upon second reload
that the FD works again.

The solution is to refrain from suspending such FDs since we don't own
them. And the next process will get them right anyway from its config.
For now only TCP and UDP face this issue so it's better to address this
on a protocol basis

No backport is needed, this is related to the new listeners in 2.3.

BUG/MEDIUM: listener: only enable a listening listener if needed

The test on listener->state == LI_LISTEN is not sufficient to decide
if we need to enable a listener. Indeed, there is a very special case
which is the inherited FD shared, which has to reflect the real socket
state even after the previous test, and as such needs to remain in
LI_LISTEN state. In this case we don't want a worker to start the
master's listener nor conversely. Let's add a specific test for this.

BUG/MEDIUM: stick-table: limit the time spent purging old entries

An interesting case was reported with threads and moderately sized
stick-tables. Sometimes the watchdog would trigger during the purge.
It turns out that the stick tables were sized in the 10s of K entries
which is the order of magnitude of the possible number of connections,
and that threads were used over distinct NUMA nodes. While at first
glance nothing looks problematic there, actually there is a risk that
a thread trying to purge the table faces 100% of entries still in use
by a connection with (ts->ref_cnt > 0), and ends up scanning the whole
table, while other threads on the other NUMA node are causing the
cache lines to bounce back and forth and considerably slow down its
progress to the point of possibly spending hundreds of milliseconds
there, multiplied by the number of queued threads all failing on the
same point.

Interestingly, smaller tables would not trigger it because the scan
would be faster, and larger ones would not trigger it because plenty
of entries would be idle!

The most efficient solution is to increase the table size to be large
enough for this never to happen, but this is not reliable. We could
have a parallel list of idle entries but that would significantly
increase the storage and processing cost only to improve a few rare
corner cases.

This patch takes a more pragmatic approach, it considers that it will
not visit more than twice the number of nodes to be deleted, which
means that it accepts to fail up to 50% of the time. Given that very
small batches are programmed each time (1/256 of the table size), this
means the operation will finish quickly (128 times faster than now),
and will reduce the inter-thread contention. If this needs to be
reconsidered, it will probably mean that the batch size needs to be
fixed differently.

This needs to be backported to stable releases which extensively use
threads, typically 2.0.

Kudos to Nenad Merdanovic for figuring the root cause triggering this!

MINOR: stats: do not display empty stat module title on html

If a stat module is not available on the current proxy scope, do not
display its title on the related html box. This is clearer for the user.

MINOR: mux_h2: add stat for total count of connections/streams

Add counters for total number of http2 connections/stream since haproxy
startup. Contrary to open_conn/stream, they are never reset to zero.

MINOR: mux_h2: capitalize frame type in stats

http/2 frame type names are capitalized in the rfc, use the same
notation on the stats labels.

BUG/MINOR: filters: Skip disabled proxies during startup only

This partially reverts the patch 400829cd2 ("BUG/MEDIUM: filters: Don't try to
init filters for disabled proxies"). Disabled proxies must not be skipped in
flt_deinit() and flt_deinit_all_per_thread() when HAProxy is stopped because,
obvioulsy, at this step, all proxies appear as disabled (or stopped, it is the
same state). It is safe to do so because, during startup, filters declared on
disabled proxies are removed. Thus they don't exist anymore during shutdown.

This patch must be backported in all versions where the patch above is.

MINOR: debug: don't count free(NULL) in memstats

The mem stats are pretty convenient to spot leaks, except that they count
free(NULL) as 1, and the code does actually have quite a number of free(foo)
guards where foo is NULL if the object was already freed. Let's just not
count these ones so that the stats remain consistent. Now it's possible
to compare the strdup()/malloc() and free() and verify they are consistent.

BUILD: ssl: use HAVE_OPENSSL_KEYLOG instead of OpenSSL versions

let us use HAVE_OPENSSL_KEYLOG for feature detection instead
of versions

BUG/MEDIUM: mux-pt: Release the tasklet during an HTTP upgrade

When a TCP connection is upgraded to HTTP, the passthrough multiplexer owning
the client connection is detroyed and replaced by an HTTP multiplexer. When it
happens, the connection context is changed (it is in fact the mux itself). Thus,
when the mux-pt is destroyed, the connection is not released. But, only the
connection must be kept. Everything else concerning the mux must be
released. Especially, the tasklet used for I/O subscriptions. In this part,
there was a bug and the tasklet was never released.

This patch should fix the issue #935. It must be backported as far as 2.0.

MINOR: server: Copy configuration file and line for server templates

When servers based on server templates are initialized, the configuration file
and line are now copied. This helps to emit understandable warning and alert
messages.

This patch may be backported if needed, as far as 1.8.

BUG/MINOR: server: Set server without addr but with dns in RMAINT on startup

On startup, if a server has no address but the dns resolutions are configured,
"none" method is added to the default init-addr methods, in addition to "last"
and "libc". Thus on startup, this server is set to RMAINT mode if no address is
found. It is only performed if no other init-addr method is configured.

Setting the RMAINT mode on startup is important to inhibit the health checks.

For instance, following servers will now be set to RMAINT mode on startup :

  server srv nofound.tld:80 check resolvers mydns
  server srv _http._tcp.service.local check resolvers mydns
  server-template srv 1-3 _http._tcp.service.local check resolvers mydns

while followings ones will trigger an error :

  server srv nofound.tld:80 check
  server srv nofound.tld:80 check resolvers mydns init-addr libc
  server srv _http._tcp.service.local check
  server srv _http._tcp.service.local check resolvers mydns init-addr libc
  server-template srv 1-3 _http._tcp.service.local check resolvers mydns init-addr libc

This patch must be backported as far as 1.8.

BUG/MINOR: checks: Report a socket error before any connection attempt

When a health-check fails, if no connection attempt was performed, a socket
error must be reported. But this was only done if the connection was not
allocated. It must also be done if there is no control layer. Otherwise, a
L7TOUT will be reported instead.

It is possible to not having a control layer for a connection if the connection
address family is invalid or not defined.

This patch must be backported to 2.2.