Instead of calling the external password command for all loaded
encrypted certificates, we will keep a local password cache.
The passwords won't be stored as plain text, they will be stored
obfuscated into the password cache. The obfuscation is simply based on a
XOR'ing with a random number built during init.
After init is performed, the password cache is overwritten and freed so
that no dangling info allowing to dump the passwords remains.
MEDIUM: ssl: Add certificate password callback that calls external command
When a certificate is protected by a password, we can provide the
password via the dedicated pem_password_cb param provided to
PEM_read_bio_PrivateKey.
HAProxy will fetch the password automatically during init by calling a
user-defined external command that should dump the right password on its
standard output (see new 'ssl-passphrase-cmd' global option).
MINOR: init: Use devnullfd in stdio_quiet calls instead of recreating a fd everytime
Since commit "65760d MINOR: init: Make devnullfd global and create it
earlier in init" the devnullfd file descriptor pointing to /dev/null
is created regardless of the process's parameters so we can use it in
all 'stdio_quiet' calls instead or recreating an FD.
MINOR: init: Make devnullfd global and create it earlier in init
The devnull fd might be needed during configuration parsing, if some
options require to fork/exec for instance. So we now create it much
earlier in the init process and without depending on the '-q' or '-d'
parameters.
BUG/MINOR: init: Do not close previously created fd in stdio_quiet
During init we were calling 'stdio_quiet' and passing the previously
created 'devnullfd' file descriptor. But the 'stdio_quiet' was also
closed afterwards which raised an error (EBADF).
If we keep from closing FDs that were opened outside of the
'stdio_quiet' function we will let the caller manage its FD and avoid
double close calls.
This patch can be backported to all stable branches.
Huangbin Zhan [Tue, 28 Oct 2025 03:39:54 +0000 (11:39 +0800)]
MINOR: http: fix 405,431,501 default errorfile
A few typos were present in the default errorfiles for the status codes
above (missing dot at the end of the sentence, extra closing bracket).
This fixes them. This can be backported.
Ilia Shipitsin [Thu, 23 Oct 2025 16:39:49 +0000 (18:39 +0200)]
CI: disable fail-fast on fedora rawhide builds
Previously builds were dependent in terms that if one fails, other are
stopped. By their nature those builds are independent, let's not to fail
them altogether
Willy Tarreau [Wed, 29 Oct 2025 07:03:01 +0000 (08:03 +0100)]
MINOR: ssl-sample: add ssl_fc_early_rcvd() to detect use of early data
We currently have ssl_fc_has_early() which says that early data are still
unconfirmed by a final handshake, but nothing to see if a client has been
able to use early data at all, which is a problem because such mechanisms
generally depend on multiple factors and it's hard to know when they start
to work. This new sample fetch function will indicate that some early data
were seen over that front connection, i.e. this can be used to confirm
that at some point the client was able to push some. This is essentially
a debugging tool that has no practical use case other than debugging.
Willy Tarreau [Wed, 29 Oct 2025 06:57:28 +0000 (07:57 +0100)]
DOC: config: slightly clarify the ssl_fc_has_early() behavior
Clarify that it's about handshake *completion*, and also mention that
the action to be used to wait for the handshake is "wait-for-handshake",
which was not mentioned.
Willy Tarreau [Wed, 29 Oct 2025 07:11:51 +0000 (08:11 +0100)]
DOC: config: fix confusing typo about ACL -m ("now" vs "not")
A one-letter typo in the doc update comint with commit 6ea50ba462 ("MINOR:
acl; Warn when matching method based on a suffix is overwritten") inverts
the meaning of the sentence. It was "is not allowed" and not
"is now allowed". Needs to be backported only if the commit above ever is
(unlikely).
Amaury Denoyelle [Tue, 28 Oct 2025 10:23:18 +0000 (11:23 +0100)]
BUG/MINOR: acl: warn if "_sub" derivative used with an explicit match
Recently, a new warning is displayed when an ACL derivative match method
is override with another '-m' method. This is implemented via the
following patch :
BUG/MEDIUM: ssl: Crash because of dangling ckch_store reference in a ckch instance
When updating CAs via the CLI, we need to create new copies of all the
impacted ckch instances (as in referenced in the ckch_inst_link list of
the updated CA) in order to use them instead of the old ones once the
updated is completed. This relies on the ckch_inst_rebuild function that
would set the ckch_store field of the ckch_inst. But we forgot to also
add the newly created instances in the ckch_inst list of the
corresponding ckch_store.
When updating a certificate afterwards, we iterate over all the
instances linked in the ckch_inst list of the ckch_store (which is
missing some instances because of the previous command) and rebuild the
instances before replacing the ckch_store. The previous ckch_store,
still referenced by the dangling ckch instance then gets deleted which
means that the instance keeps a reference to a free'd object.
Then if we were to once again update the CA file, we would iterate over
the ckch instances referenced in the cafile_entry's ckch_inst_link list,
which includes the first mentioned ckch instance with the dead
ckch_store reference. This ends up crashing during the ckch_inst_rebuild
operation.
This bug was raised in GitHub #3165.
This patch should be backported to all stable branches.
Willy Tarreau [Mon, 27 Oct 2025 14:48:59 +0000 (15:48 +0100)]
BUG/MEDIUM: cli: do not return ACKs one char at a time
Since 3.0 where the CLI started to use rcv_buf, it appears that some
external tools sending chained commands are randomly experiencing
failures. Each time this happens when the whole command is sent as a
single packet, immediately followed by a close. This is not a correct
way to use the CLI but this has been working for ages for simple
netcat-based scripts, so we should at least try to preserve this.
The cause of the failure is that the first LF that acks a command is
immediately sent back to the client and rejected due to the closed
connection. This in turn forwards the error back to the applet which
aborts its processing.
Before 3.0 the responses would be queued into the buffer, then sent
back to the channel, and would all fail at once. This changed when
snd_buf/rcv_buf were implemented because the applets are much more
responsive and since they yield between each command, they can
deliver one ACK at a time that is immediately forwarded down the
chain.
An easy way to observe the problem is to send 5 map updates, a shutdown,
and immediately close via tcploop, and in parallel run a periodic
"show map" to count the number of elements:
Before 3.0, there would always be 5 elements. Since 3.0 and before 20ec1de214 ("MAJOR: cli: Refacor parsing and execution of pipelined
commands"), almost always 2. And since that commit above in 3.2, almost
always one. Doing the same using socat or netcat shows almost always 5...
It's entirely timing-dependent, and might even vary based on the RTT
between the client and haproxy!
The approach taken here consists in doing the same principle as MSG_MORE
or Nagle but on the response buffer: the applet doesn't need to send a
single ACK for each command when it has already been woken up and is
scheduled to come back to work. It's fine (and even desirable) that
ACKs are grouped in a single packet as much as possible.
For this reason, this patch implements APPCTX_CLI_ST1_YIELD, a new CLI
flag which indicates that the applet left in yielding condition, i.e.
it has not finished its work. This flag is used by .rcv_buf to hold
pending data. This way we won't return partial responses for no reason,
and we can continue to emulate the previous behavior.
One very nice benefit to this is that it saves huge amounts of CPU on
the client. In the test below that tries to update 1M map entries, the
CPU used by socat went from 100% to 0% and the total transfer time
dropped by 28%:
This patch will need to be backported to 3.0, and depends on these two
patches to be backported as well:
MINOR: applet: do not put SE_FL_WANT_ROOM on rcv_buf() if the channel is empty
MINOR: cli: create cli_raw_rcv_buf() from the generic applet_raw_rcv_buf()
Willy Tarreau [Mon, 27 Oct 2025 14:48:04 +0000 (15:48 +0100)]
MINOR: cli: create cli_raw_rcv_buf() from the generic applet_raw_rcv_buf()
This is in preparation for a future fix. For now it's simply a pure
copy of the original function, but dedicated to the CLI. It will
have to be backported to 3.0.
Willy Tarreau [Mon, 27 Oct 2025 14:41:05 +0000 (15:41 +0100)]
MINOR: applet: do not put SE_FL_WANT_ROOM on rcv_buf() if the channel is empty
appctx_rcv_buf() prepares all the work to schedule the transfers between
the applet and the channel, and it takes care of setting the various flags
that indicate what condition is blocking the transfer from progressing.
There is one limitation though. In case an applet refrains from sending
data (e.g. rate-limited, prefers to aggregate blocks etc), it will leave
a possibly empty channel buffer, and keep some data in its outbuf. The
data in its outbuf will be seen by the function above as an indication
of a channel full condition, so it will place SE_FL_WANT_ROOM. But later,
sc_applet_recv() will see this flag with a possibly empty channel, and
will rightfully trigger a BUG_ON().
appctx_rcv_buf() should be more accurate in fact. It should only set
SE_FL_RCV_MORE when more data are present in the applet, then it should
either set or clear SE_FL_WANT_ROOM dependingon whether the channel is
empty or not.
Right now it doesn't seem possible to trigger this condition in the
current state of applets, but this will become possible with a future
bugfix that will have to be backported, so this patch will need to be
backported to 3.0.
Olivier Houchard [Fri, 24 Oct 2025 11:49:23 +0000 (13:49 +0200)]
MEDIUM: quic: Fix build with openssl-compat
As the QUIC options have been split into backend and frontend, there is
no more GTUNE_QUIC_LISTEN_OFF to be found in global.tune.options, look
for QUIC_TUNE_FE_LISTEN_OFF in quic_tune.fe instead.
This should fix the build with USE_QUIC and USE_QUIC_OPENSSL_COMPAT.
Olivier Houchard [Fri, 24 Oct 2025 10:43:47 +0000 (12:43 +0200)]
BUG/MEDIUM: mt_list: Use atomic operations to prevent compiler optims
As a folow-up to f40f5401b9f24becc6fdd2e77d4f4578bbecae7f, explicitely
use atomic operations to set the prev and next fields, to make sure the
compiler can't assume anything about it, and just does it.
This should be backported after f40f5401b9 up to 2.8.
Willy Tarreau [Fri, 24 Oct 2025 08:42:06 +0000 (10:42 +0200)]
BUILD: openssl-compat: fix build failure with OPENSSL=0 and KTLS=1
The USE_KTLS test is currently being done outside of the USE_OPENSSL
guard so disabling USE_OPENSSL still results in build failures on
libcs built with support for kernels before 4.17, because we enable
KTLS by default on linux. Let's move the KTLS block inside the
USE_OPENSSL guard instead.
Willy Tarreau [Fri, 24 Oct 2025 05:46:53 +0000 (07:46 +0200)]
BUG/MINOR: stick-tables: properly index string-type keys
This is one of the rare pleasant surprises of fixing an almost 16-years
old bug that remained unnoticed since the feature was implemented. In
1.4-dev7, commit 3bd697e071 ("[MEDIUM] Add stick table (persistence)
management functions and types") introduced stick-tables with multiple
key types, including strings, IP addresses and integers. Entries are
coded in binary and their binary representation is indexed. A special
case was made for strings in order to index them as zero-terminated
strings. However, there's one subtlety. While strings indeed have a
zero appended, they're still indexed using ebmb_insert(), which means
that all the bytes till the configured size are indexed as well. And
while these bytes generally come from a temporary storage that often
contains zeroes, or that is longer than the configured string length
and will result in truncation, it's not always the case and certain
traffic patterns with certain configurations manage to occasionally
present unpadded strings resulting in apparent duplicate keys appearing
in the dump, as shown in GH issue #3161. It seems to be essentially
reproducible at boot, and not to be particularly affected by mixed
patterns. These keys are in fact not exact duplicates in memory, but
everywhere they're used (including during synchronization), they are
equal.
What's interesting is that when this happens, one key can be presented
to a peer with its own data and will be indexed as the only one, possibly
replacing contents from the previous key, which might replace them again
later once updated in turn. This is visible in the dump of the issue
above, where key "localhost:8001" was split into two entries, one with a
request count of one and the other with a request count of 499999, and
indeed, all peers see only that last value, which overwrote the first
one.
This fix must be backported to all stable branches. Special kudos to
Mark Wort for undelining that one.
This is a second attempt at fixing issues on 32bits systems which would
trigger the following BUG_ON() statement:
FATAL: bug condition "sizeof(struct shm_stats_file_object) != 544" matched at src/stats-file.c:825 shm_stats_file_object struct size changed, is is part of the exported API: ensure all precautions were taken (ie: shm_stats_file version change) before adjusting this
This is a drop-in replacement for d30b88a6c + 4693ee0ff, as suggested by
Willy.
Indeed, on supported platforms unsigned int can be assumed to be 4 bytes
long, and long can be assumed to be 8 bytes long. As such, the previous
attempt was overkill and added unecessary maintenance complexity which
could result in bugs if not used properly. Moreover, it would only
partially solve the issue, since on little endian vs big endian
architectures, the provisioned memory areas (originating from the same
shm stats file) could be read differently by the host.
Instead we fix the aligments issues, and this alone helps to ensure
struct memory consistency on 64 vs 32bits platforms. It was tested
on both i386 and i586.
last_change and last_sess counters are now stored as unsigned int, as
it helped to fix the alignment issues and they were found to be used
as 32bits integers anyway.
Thanks to Willy for problem analysis and the patch proposal.
This reverts commit 466a603b59ed77e9787398ecf1baf77c46ae57b1.
Due to the last 2 commits, this macro is now unused, and will probably
never be used, so let's get rid of that for now.
Revert "MEDIUM: freq-ctr: use explicit-size types for freq-ctr struct"
This reverts commit 4693ee0ff7a5fa4a12ff69b1a33adca142e781ac.
As discussed in GH #3168, this works but it is not the proper way to fix
the issue. See following commits.
This reverts commit d30b88a6cc47d662e92b524ad5818be312401d0e.
As discussed in GH #3168, this works but it is not the proper way to fix
the issue. See following commits.
BUG/MEDIUM: applet: Improve again spinning loops detection with the new API
A first attempt to fix this issue was already pushed (54b7539d6 "BUG/MEDIUM:
apppet: Improve spinning loop detection with the new API"). But it not was
fully accurrate. Indeed, we must check if something was received or sent by
the applet before incrementing the call rate. But we must also take care the
applet is allowed to receive or send data. That is what is performed in this
patch.
This patch must be backported as far as 3.0 with the patch above.
BUG/MINOR: quic: rename and duplicate stream settings
Several settings can be set to control stream multiplexing and
associated receive window. Previously, all of these settings were
configured using prefix "tune.quic.frontend.", despite being applied
blindly on both sides.
Fix this by duplicating these settings specific to frontend and backend
side. Options are also renamed to use the standardize prefix
"tune.quic.[be|fe].stream." notation.
Also, each option is individually renamed to better reflect its purpose
and hide technical details relative to QUIC transport parameter naming :
* max-data-size -> stream.rxbuf
* max-streams-bidi -> stream.max-concurrent
* stream-data-ratio -> stream.data-ratio
BUG/MINOR: quic: split max-idle-timeout option for FE/BE usage
Streamline max-idle-timeout option. Rename it to use the newer cohesive
naming scheme 'tune.quic.fe|be.'.
Two different fields were already defined in global struct. These fields
are moved into quic_tune along with other QUIC settings. However, no
parser was defined for backend option, this commit fixes this.
On frontend side, a quic_conn can have a dedicated FD or use the
listener one. These different modes can be activated via a global QUIC
tune setting.
This patch adjusts the option. First, it is renamed to the more
meaningful name 'tune.quic.fe.sock-per-conn'. Also, arguments are now
either 'default-on' or 'force-off'. The objective is to better highlight
reliationship with 'quic-socket' bind option.
The older option is deprecated and will be removed in 3.5.
A QUIC global tune setting is defined to be able to force Retry emission
prior to handshake. By definition, this ability is only supported by
QUIC servers, hence it is a frontend option only.
Rename the option to use "fe" prefix. The old option name is deprecated
and will be removed in 3.5
QUIC global memory can be limited across the entire process via a global
tune setting. Previously, this setting used to misleading "frontend"
prefix. As this is applied as a sum between all QUIC connections, both
from frontend and backend sides, remove the prefix. The new option name
is "tune.quic.mem.tx-max".
The older option name is deprecated and will be removed in 3.5.
This patch is similar to the previous one, except that it is focused on
Tx QUIC settings. It is now possible to toggle GSO and pacing on
frontend and backend sides independently.
As with previous patch, option are renamed to use "fe/be" unified
prefixes. This is part of the current serie of commits which unify QUI
settings. Older options are deprecated and will be removed on 3.5
release.
MINOR: quic: split congestion controler options for FE/BE usage
Various settings can be configured related to QUIC congestion controler.
This patch duplicates them to be able to set independent values on
frontend and backend sides.
As with previous patch, option are renamed to use "fe/be" unified
prefixes. This is part of the current serie of commits which unify QUIC
settings. Older options are deprecated and will be removed on 3.5
release.
MINOR: quic: duplicate glitches FE option on BE side
Previously, QUIC glitches support was only implemented for frontend
side. Extend this so that the option can be specified separately both on
frontend and backend sides. Function _qcc_report_glitch() now retrieves
the relevant max value based on connection side.
In addition to this, option has been renamed to use "fe/be" prefixes.
This is part of the current serie of commits which unify QUIC settings.
Older options are deprecated and will be removed on 3.5 release.
MINOR: quic: rename "no-quic" to "tune.quic.listen"
Rename the option to quickly enable/disable every QUIC listeners. It now
takes an argument on/off. The documentation is extended to reflect the
fact that QUIC backend are not impacted by this option.
The older keyword is simply removed. Deprecation is considered
unnecessary as this setting is only useful during debugging.
MINOR: quic: prepare support for options on FE/BE side
A major reorganization of QUIC settings is going to be performed. One of
its objective is to clearly define options which can be separately
configured on frontend and backend proxy sides.
To implement this, quic_tune structure is extended to support fe and be
options. A set of macros/functions is also defined : it allows to
retrieve an option defined on both sides with unified code, based on
proxy side of a quic_conn/connection instance.
Remove parsing code for tune.quic.frontend.conn-tx-buffers.limit. This
option was deprecated for some time and in fact was noop and not
mentionned anymore in the documentation.
Olivier Houchard [Thu, 23 Oct 2025 12:41:23 +0000 (14:41 +0200)]
BUG/MEDIUM: mt_lists: Avoid el->prev = el->next = el
Avoid setting both el->prev and el->next on the same line.
The goal is to set both el->prev and el->next to el, but a naive
compiler, such as when we're using -O0, will set el->next first, then
will set el->prev to the value of el->next, but if we're unlucky,
el->next will have been set to something else by another thread.
So explicitely set both to what we want.
MINOR: acme: display the complete challenge_ready command in the logs
When using a wildcard DNS domain in the ACME configuration, for example
*.example.com, one might think that it needs to use the challenge_ready
command with this domain. But that's not the case, the challenge_ready
command takes the domain asked by the ACME server, which is stripped of
the wildcard.
In order to be clearer, the log message shows exactly the command the
user should sent, which is clearer.
MINOR: acme: add the dns-01-record field to the sink
The dns-01-record field in the dpapi sink, output the authentication
token which is needed in the TXT record in order to validate the DNS-01
challenge.
Olivier Houchard [Thu, 23 Oct 2025 08:48:38 +0000 (10:48 +0200)]
BUG/MEDIUM: stick-tables: Don't loop if there's nothing left
Before waking up the expiration task again at the end of it, make sure
the next date is set. If there's nothing left to do, then task_exp will
be TASK_ETERNITY and we then don't want to be waken up again.
Willy Tarreau [Wed, 22 Oct 2025 16:55:29 +0000 (18:55 +0200)]
BUG/MEDIUM: build: limit excessive and counter-productive gcc-15 vectorization
In https://bugs.gentoo.org/964719, Dan Goodliffe reported that using
CFLAGS="-O3 -march=westmere" creates a binary that segfaults on startup
with gcc-15. This could be reproduced here, is isolated to gcc-15 and
-O3, and is caused by gcc emitting "movdqa" instructions to read unaligned
longs taken from chars that were carefully isolated within ifdefs checking
for support for unaligned integers on the platform...
Some experiments showed that changing all casts all over the code using
either typedef-enforced align(1) or using the packed union trick does
the job, it needs a more in-depth validation since it's obvious that
it doesn't produce the same code at all (at least on more modern
machines).
However, the offending optimization option could be isolated, it's
"-fvect-cost-model=dynamic" which causes this, while -O2 uses
"-fvect-cost-model=very-cheap". Turning it back to very-cheap solves the
issue, reduces the code, and yields an extra 5% performance increase on
the http-request rate (181k vs 172k on a single core)! This could at
least partially explain why it has been observed several times over
the last few years that -O3 yields bigger and slower code than -O2.
It was also verified that the option doesn't change the emitted code
at -O0..-O2,-Os,-Oz, but only at -O3.
This patch detects the presence of this option and turns it on to
address the problem that some distros are facing after an upgrade to
gcc-15. As such it should be backported to recent LTS and stable
branches. Here, 3.1 was used, so it seems legit to at least target
the last two LTS branches (i.e. go as far as 3.0).
Thanks to Dan Goodliffe for sharing a working reproducer, Sam James
for starting the investigations and Christian Ruppert for bringing
the issue to us.
As reported by @tianon on GH #3168, running haproxy on 32bits i386
platform would trigger the following BUG_ON() statement:
FATAL: bug condition "sizeof(struct shm_stats_file_object) != 544" matched at src/stats-file.c:825
shm_stats_file_object struct size changed, is is part of the exported API: ensure all precautions were taken (ie: shm_stats_file version change) before adjusting this
In fact, some efforts were already taken to ensure shm_stats_file_object
struct size remains consistent on 64 vs 32 bits platforms, since
shm_stats_file_object is part of the public API and directly exposed in
the stats file.
However, some parts were overlooked: some structs that are embedded in
shm_stats_file_object struct itself weren't using fixed-width integers,
and would sometime be unaligned. The result of this is that it was
up to the compiler (platform-dependent) to choose how to deal with such
ambiguities, which could cause the struct mapping/size to be inconsistent
from one platform to another.
Hopefully this was caught by the BUG_ON() statement and with the precious
help of @tianon
To fix this, we now use fixed-width integers everywhere for members
(and submembers) of shm_stats_file_object struct, and we use explicit
padding where missing to avoid automatic padding when we don't expect
one. As for the previous commit, we leverage FIXED_SIZE() and
FIXED_SIZE_ARRAY() macro to set the expected width for each integer
without causing build issues on platform that don't support larger
integers.
No backport needed, this feature was introduced during 3.3-dev.
MEDIUM: freq-ctr: use explicit-size types for freq-ctr struct
freq-ctr struct is used by the shm_stats_file API, and more precisely,
it is used in the shm_stats_file_object struct for counters.
shm_stats_file_object struct requires to be plateform-independent, thus
we switch to using explicit size types (AKA fixed width integer types)
for freq-ctr, in the attempt to make freq-ctr size and memory mapping
consistent from one platform to another.
We cannot simply use fixed-width integer because some of them are
involved in atomic operations, and forcing a given width could
cause build issues on some platforms where atomic ops are not
implemented for large integers. Instead we leverage the FIXED_SIZE
macro to keep handling the integers as before, but forcing them to
be stored using expected number of bytes (unused bytes will simply
be ignored).
FIXED_SIZE() macro can be used to instruct the compiler that the struct
member named <name>, handled as <type>, must be stored using <size> bytes
and that even if the type used is actualler smaller than the expected size
FIXED_SIZE_ARRAY(), similar to FIXED_SIZE() but for arrays: it takes an
extra argument which is the number of members.
They may be used for portability concerns to ensure a structure mapping
remains consistent between platforms.
Amaury Denoyelle [Mon, 20 Oct 2025 12:29:55 +0000 (14:29 +0200)]
MINOR: quic: remove received CRYPTO temporary tree storage
The previous commit switch from ncbuf to ncbmbuf as storage for received
CRYPTO frames. The latter ensures that buffering of such frames cannot
fail anymore due to gaps size.
Previously, extra mechanism were implemented on QUIC frames parsing
function to overcome the limitation of ncbuf on gaps size. Before
insertion, CRYPTO frames were stored in a temporary tree to order their
insertion. As this is not necessary anymore, this commit removes the
temporary tree insertion.
This commit is closely associated to the previous bug fix. As it
provides a neat optimization and code simplication, it can be backported
with it, but not in the next immediate release to spot potential
regression.
Amaury Denoyelle [Thu, 16 Oct 2025 14:43:16 +0000 (16:43 +0200)]
BUG/MAJOR: quic: use ncbmbuf for CRYPTO handling
In QUIC, TLS handshake messages such as ClientHello are encapsulated in
CRYPTO frames. Each QUIC implementation can split the content in several
frames of random sizes. In fact, this feature is now used by several
clients, based on chrome so-called "Chaos protection" mechanism :
To support this, haproxy uses a ncbuf storage to store received CRYPTO
frames before passing it to the SSL library. However, this storage
suffers from a limitation as gaps between two filled blocks cannot be
smaller than 8 bytes. Thus, depending on the size of received CRYPTO
frames and their order, ncbuf may not be sufficient. Over time, several
mechanisms were implemented in haproxy QUIC frames parsing to overcome
the ncbuf limitation.
However, reports recently highlight that with some clients haproxy is
not able to deal with CRYPTO frames reception. In particular, this is
the case with the latest ngtcp2 release, which implements a similar
chaos protection mechanism via the following patch. It also seems that
this impacts haproxy interaction with firefox.
To fix haproxy CRYPTO frames buffering once and for all, an alternative
non-contiguous buffer named ncbmbuf has been recently implemented. This
type does not suffer from gaps size limitation, albeit at the cost of a
small reduction in the size available for data storage.
Thus, the purpose of this current patch is to replace ncbuf with the
newer ncbmbuf for QUIC CRYPTO frames parsing. Now, ncbmb_add() is used
to buffer received frames which is guaranteed to suceed. The only
remaining case of error is if a received frame offset and length exceed
the ncbmbuf data storage, which would result in a CRYPTO_BUFFER_EXCEEDED
error code.
A notable behavior change when switching to ncbmbuf implementation is
that NCB_ADD_COMPARE mode cannot be used anymore during add. Instead,
crypto frame content received at a similar offset will be overwritten.
A final note regarding STREAM frames parsing. For now, it is considered
unnecessary to switch from ncbuf in this case. Indeed, QUIC clients does
not perform aggressive fragmentation for them. Keeping ncbuf ensure that
the data storage size is bigger than the equivalent ncbmbuf area.
This should fix github issue #3141.
This patch must be backported up to 2.6. It is first necessary to pick
the relevant commits for ncbmbuf implementation prior to it.
Amaury Denoyelle [Thu, 16 Oct 2025 13:05:50 +0000 (15:05 +0200)]
MINOR: ncbmbuf: implement advance operation
Implement ncbmb_advance() function for the ncbmbuf type. This allows to
remove bytes in front of the buffer, regardless of the existing gaps.
This is implemented by resetting the corresponding bits of the bitmap.
As the previous patch, this commit must be backported prior to the fix
to come on QUIC CRYPTO frames parsing.
Amaury Denoyelle [Wed, 15 Oct 2025 09:56:16 +0000 (11:56 +0200)]
MINOR: ncbmbuf: implement ncbmb_data()
Implement ncbmb_data() function for the ncbmbuf type. Its purpose is
similar to its ncbuf counterpart : it returns the size in bytes of data
starting at a specific offset until the next gap.
As the previous patch, this commit must be backported prior to the fix
to come on QUIC CRYPTO frames parsing.
Extend private API for ncbmbuf type by defining an iterator type for the
buffer bitmap handling. The purpose is to provide a simple method to
iterate over the bitmap one byte at a time, with a proper bitmask set to
hide irrelevant bits.
This internal type is unused for now, but will become useful when
implementing ncb_data() and ncb_advance() functions.
As the previous patch, this commit must be backported prior to the fix
to come on QUIC CRYPTO frames parsing.
Amaury Denoyelle [Tue, 14 Oct 2025 14:17:14 +0000 (16:17 +0200)]
MINOR: ncbmbuf: implement add
This patch implements add operation for ncbmbuf type.
This function is simpler than its ncbuf counterpart. Indeed, for now
only NCB_ADD_OVERWRT mode is supported. This compromise has been chosen
as ncbmbuf will be first used for QUIC CRYPTO frames handling, which
does not mandate to compare existing filled blocks during insertion.
As the previous patch, this commit must be backported prior to the fix
to come on QUIC CRYPTO frames parsing.
Amaury Denoyelle [Tue, 14 Oct 2025 09:29:40 +0000 (11:29 +0200)]
MINOR: ncbmbuf: define new ncbmbuf type
Define ncbmbuf which is an alternative non-contiguous buffer
implementation. "bm" abbreviation stands for bitmap, which reflects how
gaps and filled blocks are encoded. The main purpose of this
implementation is to get rid of the ncbuf limitation regarding the
minimal size for gaps between two blocks of data.
This commit adds the new module ncbmbuf. Along with it, some utility
functions such as ncbmb_make(), ncbmb_init() and ncbmb_is_empty() are
defined. Public API of ncbmbuf will be extended in the following
patches.
This patch is not considered a bug fix. However, it will be required to
fix issue encountered on QUIC CRYPTO frames parsing. Thus, it will be
necessary to backport the current patch prior to the fix to come.
Amaury Denoyelle [Tue, 14 Oct 2025 09:35:21 +0000 (11:35 +0200)]
MINOR: ncbuf: extract common types
ncbuf is a module which provide a non-contiguous buffer type
implementation. This patch extracts some basic types related to it into
a new file ncbuf_common.h.
This patch will be useful to provide a new non-contiguous buffer
alternative implementation based on a bitmap.
This patch is not a bug fix. However, it is necessary for ncbmbuf
implementation which will be required to fix a QUIC issue on CRYPTO
frames parsing. This, it will be necessary to backport the current patch
prior to the fix to come.
Willy Tarreau [Wed, 22 Oct 2025 07:01:54 +0000 (09:01 +0200)]
BUG/MAJOR: pools: fix default pool alignment
The doc in commit 977feb5617 ("DOC: api: update the pools API with the
alignment and typed declarations") says that alignment of zero means
the type's alignment. And this is followed by the DECLARE_TYPED_POOL()
macro. Yet this is not what is done in create_pool_from_reg() which
only raises the alignment to a void* if lower, while it should start
from the type's. The effect is haproxy refusing to start on some 32-bit
platforms since that commit, displaying an error such as:
"BUG in the code: at src/mux_h2.c:454, requested creation of pool
'h2s' aligned to 4 while type requires alignment of 8! Please
report to developers. Aborting."
Let's just apply the default type's alignment.
Thanks to @tianon for reporting this in GH issue #3168. No backport is
needed since aligned pools are 3.3-only.
Amaury Denoyelle [Tue, 21 Oct 2025 13:06:39 +0000 (15:06 +0200)]
BUG/MEDIUM: h3: properly encode response after interim one in same buf
Recently, proper support for interim responses forwarding to HTTP/3
client has been implemented. However, there was still an issue if two
responses are both encoded in the same snd_buf() iteration.
The issue is caused due to H3 HEADERS frame encoding method : 5 bytes
are reserved in front of the buffer to encode both H3 frame type and
varint length field. After proper headers encoding, output buffer head
is adjusted so that length can be encoded using the minimal varint size.
However, if the buffer is not empty due to a previous response already
encoded but not yet emitted, messing with the buffer head will corrupt
the entire H3 message. This only happens when encoding of both responses
is done in the same snd_buf() iteration, or at least without emission to
quic_conn layer in between.
The result of this bug is that the HTTP/3 client will be unable to parse
the response, most of the time reporting a formatting error. This can
be reproduced using the following netcat as HTTP/1 server to haproxy :
$ while sleep 0.2; do \
printf "HTTP/1.1 100 continue\r\n\r\nHTTP/1.1 200 ok\r\nContent-length: 5\r\nConnection: close\r\n\r\nblah\n" | nc -lp8002
done
To fix this, only adjust buffer head if content is empty. If this is not
the case, frame length is simply encoded as a 4-bytes varint size so
that messages are contiguous in the buffer.
BUG/MEDIUM: h1-htx: Don't set HTX_FL_EOM flag on 1xx informational messages
1xx informational messages are part of the HTTP response. It is not expected
to have a HX_FL_EOM flag set after parsing such messages when received from
a server. It is espacially important whne an informational messages is
processed on client side while the final response was not recieved yet, to
not erroneously detect the end of the message.
The HTTP multiplexers seem to ignore the HTX_FL_EOM flag for information
messages, but it remains an error from the HTX specification point of
view. So it must be fixed.
While it should theorically be backported as far as 3.0, it is a good idea
to not do so for now because no bug was reported and regressions may happen.
Olivier Houchard [Wed, 15 Oct 2025 14:01:21 +0000 (16:01 +0200)]
MEDIUM: stick-tables: Stop as soon as stktable_trash_oldest succeeds.
stktable_trash_oldest() goes through all the shards, trying to free a
number of entries. Going through each shard is expensive, as we have to
take the shard lock, so stop as soon as we free'd at least one entry, as
it is only called when we want to make room for one entry.
Olivier Houchard [Tue, 14 Oct 2025 16:51:32 +0000 (18:51 +0200)]
MEDIUM: stick-tables: Stop if stktable_trash_oldest() fails.
In stksess_new(), if the table is full, we call stktable_trash_oldest()
to remove a few entries so that we have some room for a new one.
It is unlikely, but possible, that stktable_trash_oldest() will fail. If
so, just give up and do not add the new entry, instead of adding it
anyway.
Give up if stktable_trash_oldest() fails to free any entry
MEDIUM: stick-tables: Use a per-shard expiration task
Instead of having per-table expiration tasks, just use one per shard.
The task will now go through all the tables to expire entries. When a
table gets an expiration earlier than the one previously known, it will
be put in a mt-list, and the task will be responsible to put it into an
eb32, ordered based on the next expiration.
Each per-shard task will run on a different thread, so it should lead to
a better load distribution than the per-table tasks.
Olivier Houchard [Thu, 16 Oct 2025 13:45:52 +0000 (15:45 +0200)]
MINOR: initcalls: Add a new initcall stage, STG_INIT_2
Add a new initcall stage, STG_INIT_2, for stuff to be called after
step_init_2() is called, so after we know for sure that global.nbthread
will be set.
Modify stick-tables stkt_late_init() to run at STG_INIT_2 instead of
STG_INIT, in anticipation for it to be enhanced and have a need for
global.nbthread.
Willy Tarreau [Mon, 20 Oct 2025 12:50:27 +0000 (14:50 +0200)]
BUG/MEDIUM: cli: also free the trash chunk on the error path
Since commit 20ec1de214 ("MAJOR: cli: Refacor parsing and execution of
pipelined commands"), command not returning any response (e.g. "quit")
don't pass through the free_trash_chunk() call, possibly leaking the
cmdline buffer. A typical way to reproduce it is to loop on "quit" on
the CLI, though it very likely affects other specific commands.
Let's make sure in the release handler that we always release that
chunk in any case. This must be backported to 3.2.
BUG/MINOR: quic-be: unchecked connections during handshakes
This bug impacts only the backends.
The ->conn (pointer to struct connection) member validity of the ssl_sock_ctx
struct was not checked before being dereferenced, leading to possible crashes
in qc_ssl_do_hanshake() during handshake.
This was reported by GH #3163 issue.
No need to backport because the QUIC backend support arrived with 3.3
Olivier Houchard [Sun, 19 Oct 2025 21:17:55 +0000 (23:17 +0200)]
BUG/MEDIUM: mt_list: Make sure not to unlock the element twice
In mt_list_delete(), if the element was not in a list, then n and p will
point to it, and so setting n->prev and n->next will be enough to unlock it.
Don't do it twice, as once it's been done the first time, another thread may
be working with it, and may have added it to a list already, and doing it
a second time can lead to list inconsistencies.
Willy Tarreau [Sat, 18 Oct 2025 09:24:05 +0000 (11:24 +0200)]
[RELEASE] Released version 3.3-dev10
Released version 3.3-dev10 with the following main changes :
- BUG/MEDIUM: connections: Only avoid creating a mux if we have one
- BUG/MINOR: sink: retry attempt for sft server may never occur
- CLEANUP: mjson: remove MJSON_ENABLE_RPC code
- CLEANUP: mjson: remove MJSON_ENABLE_PRINT code
- CLEANUP: mjson: remove MJSON_ENABLE_NEXT code
- CLEANUP: mjson: remove MJSON_ENABLE_BASE64 code
- CLEANUP: mjson: remove unused defines and math.h
- BUG/MINOR: http-ana: Reset analyse_exp date after 'wait-for-body' action
- CLEANUP: mjson: remove unused defines from mjson.h
- BUG/MINOR: acme: avoid overflow when diff > notAfter
- DEV: patchbot: use git reset+checkout instead of pull
- MINOR: proxy: explicitly permit abortonclose on frontends and clarify the doc
- REGTESTS: fix h2_desync_attacks to wait for the response
- REGTESTS: http-messaging: fix the websocket and upgrade tests not to close early
- MINOR: proxy: only check abortonclose through a dedicated function
- MAJOR: proxy: enable abortonclose by default on HTTP proxies
- MINOR: proxy: introduce proxy_abrt_close_def() to pass the desired default
- MAJOR: proxy: enable abortonclose by default on TLS listeners
- MINOR: h3/qmux: Set QC_SF_UNKNOWN_PL_LENGTH flag on QCS when headers are sent
- MINOR: stconn: Add two fields in sedesc to replace the HTX extra value
- MINOR: h1-htx: Increment body len when parsing a payload with no xfer length
- MINOR: mux-h1: Set known input payload length during demux
- MINOR: mux-fcgi: Set known input payload length during demux
- MINOR: mux-h2: Use <body_len> H2S field for payload without content-length
- MINOR: mux-h2: Set known input payload length of the sedesc
- MINOR: h3: Set known input payload length of the sedesc
- MINOR: stconn: Move data from kip to kop when data are sent to the consumer
- MINOR: filters: Reset knwon input payload length if a data filter is used
- MINOR: hlua/http-fetch: Use <kip> instead of HTX extra field to get body size
- MINOR: cache: Use the <kip> value to check too big objects
- MINOR: compression: Use the <kip> value to check body size
- MEDIUM: mux-h1: Stop to use HTX extra value when formatting message
- MEDIUM: htx: Remove the HTX extra field
- MEDIUM: acme: don't insert acme account key in ckchs_tree
- BUG/MINOR: acme: memory leak from the config parser
- CI: cirrus-ci: bump FreeBSD image to 14-3
- BUG/MEDIUM: ssl: take care of second client hello
- BUG/MINOR: ssl: always clear the remains of the first hello for the second one
- BUG/MEDIUM: stconn: Properly forward kip to the opposite SE descriptor
- MEDIUM: applet: Forward <kip> to applets
- DEBUG: mux-h1: Dump <kip> and <kop> values with sedesc info
- BUG/MINOR: ssl: leak in ssl-f-use
- BUG/MINOR: ssl: leak crtlist_name in ssl-f-use
- BUILD: makefile: disable tail calls optimizations with memory profiling
- BUG/MEDIUM: apppet: Improve spinning loop detection with the new API
- BUG/MINOR: ssl: Free global_ssl structure contents during deinit
- BUG/MINOR: ssl: Free key_base from global_ssl structure during deinit
- MEDIUM: jwt: Remove certificate support in jwt_verify converter
- MINOR: jwt: Add new jwt_verify_cert converter
- MINOR: jwt: Do not look into ckch_store for jwt_verify converter
- MINOR: jwt: Add new "jwt" certificate option
- MINOR: jwt: Add specific error code for known but unavailable certificate
- DOC: jwt: Add doc about "jwt_verify_cert" converter
- MINOR: ssl: Dump options in "show ssl cert"
- MINOR: jwt: Add new "add/del/show ssl jwt" CLI commands
- REGTEST: jwt: Test new CLI commands
- BUG/MINOR: ssl: Potential NULL deref in trace macro
- MINOR: regex: use a thread-local match pointer for pcre2
- BUG/MEDIUM: pools: fix bad freeing of aligned pools in UAF mode
- MEDIUM: pools: detect() when munmap() fails in UAF mode
- TESTS: quic: useless param for b_quic_dec_int()
- BUG/MEDIUM: pools: fix crash on filtered "show pools" output
- BUG/MINOR: pools: don't report "limited to the first X entries" by default
- BUG/MAJOR: lb-chash: fix key calculation when using default hash-key id
- BUG/MEDIUM: stick-tables: Don't forget to dec count on failure.
- BUG/MINOR: quic: check applet_putchk() for 'show quic' first line
- TESTS: quic: fix uninit of quic_cc_path const member
- BUILD: ssl: can't build when using -DLISTEN_DEFAULT_CIPHERS
- BUG/MAJOR: quic: uninitialized quic_conn_closed struct members
- BUG/MAJOR: quic: do not reset QUIC backends fds in closing state
- BUG/MINOR: quic: SSL counters not handled
- DOC: clarify the experimental status for certain features
- MINOR: config: remove experimental status on tune.disable-fast-forward
- MINOR: tree-wide: add missing TAINTED flags for some experimental directives
- MEDIUM: config: warn when expose-experimental-directives is used for no reason
- BUG/MEDIUM: threads/config: drop absent threads from thread groups
- REGTESTS: remove experimental from quic/retry.vtc
Willy Tarreau [Fri, 17 Oct 2025 18:55:43 +0000 (20:55 +0200)]
REGTESTS: remove experimental from quic/retry.vtc
Recent commit 8b7a82cd30 ("MEDIUM: config: warn when
expose-experimental-directives is used for no reason") triggered on
this test exactly for the reason it was made for. The tests were just
done without quic on it. Let's drop the unneeded option.
Willy Tarreau [Fri, 17 Oct 2025 18:36:00 +0000 (20:36 +0200)]
BUG/MEDIUM: threads/config: drop absent threads from thread groups
Thread groups can be assigned arbitrary thread ranges, but if the
mentioned threads do not exist, this causes crashes in listener_accept()
or some connections to be ignored. The reason is that the calculated
mask is derived from the thread group's enabled threads count. Examples:
This commit removes missing threads, emits a warning when the thread
group just has less threads than requested, and an error when it is
left with no threads at all.
This must be backported to 3.1 since the issue is present there already.
Willy Tarreau [Fri, 17 Oct 2025 16:15:12 +0000 (18:15 +0200)]
MEDIUM: config: warn when expose-experimental-directives is used for no reason
If users start to enable expose-experimental-directives for the purpose
of testing one specific feature, there are chances that the option remains
forever and hides the experimental status of other options.
Let's emit a warning if the option appears and is not used. This will
remind users that they can now drop it, and help keep configs safe for
future upgrades.
Willy Tarreau [Fri, 17 Oct 2025 15:57:40 +0000 (17:57 +0200)]
MINOR: tree-wide: add missing TAINTED flags for some experimental directives
We normally taint the process when using experimental directives, but
a handful of places were missed so we don't always know that they are
in use. Let's fix these places (hint for future directives, just look
for places checking for "experimental_directives_allowed", and add
"mark_tainted(TAINTED_CONFIG_EXP_KW_DECLARED);").
Willy Tarreau [Fri, 17 Oct 2025 16:55:03 +0000 (18:55 +0200)]
MINOR: config: remove experimental status on tune.disable-fast-forward
The option was turned to off by default in 2.8 with commit 2f7c82bfd
("BUG/MINOR: haproxy: Fix option to disable the fast-forward"), however
at the same time it should have dropped its experimental status since
the feature is enabled by default. The only goal of the option is to
debug something, like many other tune.xxx options. The option should
still normally not be used without being invited to do so by developers
looking for something specific though.
This could be backported if desired to simplify debugging, though this
has never been needed for now.
Willy Tarreau [Fri, 17 Oct 2025 16:39:03 +0000 (18:39 +0200)]
DOC: clarify the experimental status for certain features
Certain features require "expose-experimental-directives" to be set in
the global section. Let's clarify that experimental featuers are only
maintained in best effort mode, may break during the stable cycle, and
are generally not maintained beyond the release of the next LTS branch
since it is extremely challenging, and early adopters are expected to
upgrade to benefit from improvements anyway.
The SSL counters were not handled at all for QUIC connections. This patch
implement ssl_sock_update_counters() extracting the code from ssl_sock.c
and call this function where applicable both in TLS/TCP and QUIC parts.
BUG/MAJOR: quic: do not reset QUIC backends fds in closing state
This bug impacts only the backends.
When entering the closing state, a quic_closed_conn is used to replace the quic_conn.
In this state, the ->fd value was reset to -1 value calling qc_init_fd(). This value
is used by qc_may_use_saddr() which supposes it cannot be -1 for a backend, leading
->li to be dereferencd, which is legal only for a listener.
This bug impacts only the backend but with possible crash when qc_may_use_saddr()
is called: qc_test_fd() is false leading qc->li to be dereferenced. This is legal
only for a listener.
This patch prevents such fd value resettings for backends.
No need to backport because the QUIC backends support arrived with 3.3.
BUG/MAJOR: quic: uninitialized quic_conn_closed struct members
A quic_conn_closed struct is initialized to replace the quic_conn when the
connection enters the closing to reduce the connection memory footprint.
->max_udp_payload quic_conn_close was not initialized leading to possible
BUG_ON()s in qc_rcv_buf() when comparing the RX buf size to this payload.
->cntrs counters were alon not initialized with the only consequence
to generate wrong values for these counters.
BUILD: ssl: can't build when using -DLISTEN_DEFAULT_CIPHERS
Emeric reported that he can't build haproxy anymore since 9bc6a034
("BUG/MINOR: ssl: Free global_ssl structure contents during deinit").
src/ssl_sock.c:7020:40: error: comparison with string literal results in unspecified behavior [-Werror=address]
7020 | if (global_ssl.listen_default_ciphers != LISTEN_DEFAULT_CIPHERS)
| ^~
src/ssl_sock.c:7023:41: error: comparison with string literal results in unspecified behavior [-Werror=address]
7023 | if (global_ssl.connect_default_ciphers != CONNECT_DEFAULT_CIPHERS)
| ^~
src/ssl_sock.c: At top level:
Indeed the mentionned patch is checking the pointer in order to free
something freeable, but that can't work because these constant are
strings literal which can be passed from the compiler and not pointers.
Also the test is not useful, because these strings are strdup() in
__ssl_sock_init, so they can be free directly.
Must be backported in every stable branches with 9bc6a034.
Amaury Denoyelle [Mon, 13 Oct 2025 16:16:22 +0000 (18:16 +0200)]
BUG/MINOR: quic: check applet_putchk() for 'show quic' first line
Ensure applet_putchk() return value is checked when outputing via the
CLI 'show quic' header line.
This is only to align with other usages of the same function, as trash
output buffer should always be large enough for it. As such, the command
is simply aborted if this is not the case.
This should fix coverity report from github issue #3139.
Olivier Houchard [Tue, 14 Oct 2025 16:11:31 +0000 (18:11 +0200)]
BUG/MEDIUM: stick-tables: Don't forget to dec count on failure.
In stksess_new(), if we failed to allocate memory for the new stksess,
don't forget to decrement the table entry count, as nobody else will
do it for us.
An artificially high count could lead to at least purging entries while
there is no need to.
Willy Tarreau [Thu, 16 Oct 2025 08:30:57 +0000 (10:30 +0200)]
BUG/MAJOR: lb-chash: fix key calculation when using default hash-key id
A subtle regression was introduced in 3.0 by commit faa8c3e02 ("MEDIUM:
lb-chash: Deterministic node hashes based on server address"). When keys
are calculated from the server's ID (which is the default), due to the
reorganisation of the code, the key ended up being hashed twice instead
of being multiplied by the scaling range.
While most users will never notice it, it is blocking some large cache
users from upgrading from 2.8 to 3.0 or 3.2 because the keys are
redistributed.
After a check with users on the mailing list [1] it was estimated that
keep the current situation is the worst choice because those who have
not yet upgraded will face the problem while by fixing it, those who
already have and for whom it happened smoothly will handle it just
right again.
As such this fix must be backported to 3.0 without waiting (in order
to preserve those who upgrade from two redistributions). Please note
that only configurations featuring "hash-type consistent" and not
having "hash-key" present with a value other than "id" are affected,
others are not (e.g. "hash-key addr" is unaffected).
Willy Tarreau [Thu, 16 Oct 2025 06:38:35 +0000 (08:38 +0200)]
BUG/MINOR: pools: don't report "limited to the first X entries" by default
With the fix in commit 982805e6a3 ("BUG/MINOR: pools: Fix the dump of
pools info to deal with buffers limitations"), the max count is now
compared to the number of dumped pools instead of the configured
numbered, and keeping >= is no longer valid because maxcnt is set by
default to the same value when not set, so this means that since this
patch we're always displaying "limited to the first X entries" where X
is the number of dumped entries even in the absence of any limitation.
Let's just fix the comparison to only show this when the limit is lower.
This must be backported to 3.2 where the patch above already is.
Willy Tarreau [Thu, 16 Oct 2025 06:27:44 +0000 (08:27 +0200)]
BUG/MEDIUM: pools: fix crash on filtered "show pools" output
The truncation of pools output that was adressed in commit 982805e6a3
("BUG/MINOR: pools: Fix the dump of pools info to deal with buffers
limitations") required to split the pools filling from dumping. However
there is a problem when a limit is passed that is lower than the number
of pools or if a pool name is specified or if pool caches are disabled,
because in this case the number of filled slots will be lower than the
initially allocated one, and empty entries will be visited either by the
sort functions when filling the entries if "byxxx" is specified, or by
the dump function after the last entry, but none of these functions was
expecting to be passed a NULL entry.
Let's just re-adjust nbpools to match the number of filled entries at
the end. Anyway the totals are calculated on the number of dumped
entries.
This must be backported to 3.2 since the fix above was backported there
as well.
The third parameter passed to b_quic_dec_int() is unitialized. This is not a bug.
But this disturbs coverity for an unknown reason as revealed by GH issue #3154.
This patch takes the opportunity to use NULL as passed value to avoid using such
an uneeded third parameter.
Should be backported to 3.2 where this unit test was introduced.
Willy Tarreau [Mon, 13 Oct 2025 17:22:31 +0000 (19:22 +0200)]
MEDIUM: pools: detect() when munmap() fails in UAF mode
Better check that munmap() always works, otherwise it means we might
have miscalculated an address, and if it fails silently, it will eat
all the memory extremely quickly. Let's add a BUG_ON() on munmap's
return.
Willy Tarreau [Mon, 13 Oct 2025 17:15:55 +0000 (19:15 +0200)]
BUG/MEDIUM: pools: fix bad freeing of aligned pools in UAF mode
As reported by Christopher, in UAF mode memory release of aligned
objects as introduced in commit ef915e672a ("MEDIUM: pools: respect
pool alignment in allocations") does not work. The padding calculation
in the freeing code is no longer correct since it now depends on the
alignment, so munmap() fails on EINVAL. Fortunately we don't care much
about it since we know it's the low bits of the passed address, which
is much simpler to compute, since all mmaps are page-aligned.
There's no need to backport this, as this was introduced in 3.3.
Willy Tarreau [Mon, 13 Oct 2025 14:47:50 +0000 (16:47 +0200)]
MINOR: regex: use a thread-local match pointer for pcre2
The pcre2 matching requires an array of matches for grouping, that is
allocated when executing the rule by pre-processing it, and that is
immediately freed after use. This is quite inefficient and results in
annoying patterns in "show profiling" that attribute the allocations
to libpcre2 and the releases to haproxy.
A good suggestion from Dragan is to pre-allocate these per thread,
since the entry is not specific to a regex. In addition we're already
limited to MAX_MATCH matches so we don't even have the problem of
having to grow it while parsing nor processing.
The current patch adds a per-thread pair of init/deinit functions to
allocate a thread-local entry for that, and gets rid of the dynamic
allocations. It will result in cleaner memory management patterns and
slightly higher performance (+2.5%) when using pcre2.
MINOR: jwt: Add new "add/del/show ssl jwt" CLI commands
The new "add/del ssl jwt <file>" commands allow to change the "jwt" flag
of an already loaded certificate. It allows to delete certificates used
for JWT validation, which was not yet possible.
The "show ssl jwt" command iterates over all the ckch_stores and dumps
the ones that have the option set.
DOC: jwt: Add doc about "jwt_verify_cert" converter
Add information about the new "jwt_verify_cert" converter and update the
existing "jwt_converter" doc to remove mentions of certificates from it.
Add information about the new "jwt" certificate option.
MINOR: jwt: Add specific error code for known but unavailable certificate
A certificate that does not have the 'jwt' flag enabled cannot be used
for JWT validation. We now raise a specific return value so that such a
case can be identified.
This option can be used to enable the use of a given certificate for JWT
verification. It defaults to 'off' so certificates that are declared in
a crt-store and will be used for JWT verification must have a
"jwt on" option in the configuration.
This converter will be in charge of performing the same operation as the
'jwt_verify' one except that it takes a full-on pem certificate path
instead of a public key path as parameter.
The certificate path can be either provided directly as a string or via
a variable. This allows to use certificates that are not known during
init to perform token validation.
MEDIUM: jwt: Remove certificate support in jwt_verify converter
The jwt_verify converter will not take full-on certificates anymore
in favor of a new soon to come jwt_verify_cert. We might end up with a
new jwt_verify_hmac in the future as well which would allow to deprecate
the jwt_verify converter and remove the need for a specific internal
tree for public keys.
The logic to always look into the internal jwt tree by default and
resolve to locking the ckch tree as little as possible will also be
removed. This allows to get rid of the duplicated reference to
EVP_PKEYs, the one in the jwt tree entry and the one in the ckch_store.
BUG/MINOR: ssl: Free global_ssl structure contents during deinit
Some fields of the global_ssl structure are strings that are strdup'ed
but never freed. There is only one static global_ssl structure so not
much memory is used but we might as well free it during deinit.
This patch can be backported to all stable branches.
BUG/MEDIUM: apppet: Improve spinning loop detection with the new API
Conditions to detect the spinning loop for applets based on the new API are
not accurrate. We cannot continue to check the channel's buffers state to
know if an applet has made some progress. At least, we must also check the
applet's buffers.
After digging to find the right way to do, it was clear that the best is to
use something similar to what is performed for the streams, namely, checking
read and write events. And in fact, it is quite easy to do with the new
API. So let's do so.