Willy Tarreau [Wed, 3 Jun 2026 13:01:51 +0000 (15:01 +0200)]
[RELEASE] Released version 3.4.0
Released version 3.4.0 with the following main changes :
- BUG/MINOR: tcpcheck: Check LDAP response to not read more data than available
- BUG/MINOR: ssl-gencert: validate SNI characters to prevent SAN certificate injection
- BUG/MINOR: mux-h1: H2 preface rejection doesn't update stick-table glitches
- BUG/MEDIUM: cpu-topo: Enforce thread-hard-limit on policy
- BUG/MEDIUM: qmux: do not crash on too large record
- BUG/MEDIUM: qmux: do not crash on receiving an invalid first frame
- BUG/MINOR: qmux: reject too large initial record
- Revert "BUG/MEDIUM: dns: fix long loops in additional records parse on name failure"
- BUG/MINOR: qpack: Fix index calculation in debug functions
- BUG/MINOR: qpack: fix potential null-pointer dereference in qpack_dht_insert()
- CLEANUP: qpack: fix copy-paste typo in value Huffman debug string
- BUG/MINOR: qpack: fix sign bit mask in qpack_decode_fs_pfx()
- CLEANUP: qpack: fix copy-paste typo in value Huffman debug string for WLN
- BUG/MINOR: qpack: fix huff_dec() error handling in qpack_decode_fs()
- CLEANUP: qpack: move encoded macros to qpack-t.h to avoid duplication
- BUG/MEDIUM: quic: handle ECONNREFUSED on RX side
- BUG/MINOR: quic: Fix memory leak in quic_deallocate_dghdlrs()
- BUG/MEDIUM: lua: defer Lua VM initialisation to the first Lua config keyword
- REGTESTS: lua: fix tune.lua.openlibs in Lua reg-tests
- BUG/MINOR: mux-h2: Count padding for connection flow control on error path
- BUILD: addons: convert 51d addon to EXTRA_MAKE
- BUILD: addons: convert deviceatlas addon to EXTRA_MAKE
- BUILD: addons: convert WURFL addon to EXTRA_MAKE
- MINOR: mux_quic/flags: add missing flags
- BUG/MINOR: mux_quic: open an idle QCS on reset on BE side
- BUG/MINOR: mux_quic: fix BE conn removal on app shutdown
- BUG/MINOR: mux_quic: prevent BE reuse with an errored conn
- BUG/MINOR: quic: fix ack range node pool_free call passing wrong pointer type
- MEDIUM: quic: optimize HKDF operations by reusing per-thread contexts
- BUG/MEDIUM: quic: reset cwnd in slow_start on persistent congestion (cubic)
- BUG/MEDIUM: quic: reset consecutive_losses on exit from recovery period (cubic)
- BUG/MINOR: quic: update drs->lost before calling on_ack_recv
- Revert "MEDIUM: quic: optimize HKDF operations by reusing per-thread contexts"
- BUG/MEDIUM: lua: register hlua_init() as a pre-check to fix crash without Lua config
- REGTESTS: quic: disable quic/ocsp_auto_update for now
- BUG/MINOR: threads: set at least grp_max when mtpg is too small
- BUG/MEDIUM: threads: ignore max-threads-per-group when thread-groups is set
- CLEANUP: thread: indicate when max-threads-per-group is ignored
- MINOR: cpu-topo: notify when cpu-policy is ignored due to other settings
- MINOR: thread: report when thread-groups or nbthread results in less threads
- BUILD: makefile: include EXTRA_MAKE in the .build_opts construction
- BUG/MINOR: quic: Fix another buffer overflow with sockaddr_in46
- MINOR: quic: Copy sin6_flowinfo and sin6_scope_id too
- BUILD: Makefile: put EXTRA_MAKE help at the right place
- BUG/MINOR: cache: fix cache tree iteration
- BUG/MEDIUM: resolvers: Wait a bit before calling the xprt prepare_srv
- CLEANUP: addons/51degrees: initialize variables
- MINOR: addons/51degrees: handle memory allocation failures
- CLEANUP: ncbmbuf: improve handling of memory allocation errors in unit tests
- CLEANUP: admin/halog: improve handling of memory allocation errors
- DOC: internals: clarify ambiguous wording in core-principles
- DOC: internals: add a threat model definition
- DOC: add security.txt describing how to report security issues
- DOC: security: also add a note to exclude dev/ and admin/
- BUG/MEDIUM: qmux: Close connection on invalid frame
- CLEANUP: fix comment typo
- BUG/MEDIUM: h3: fix MAX_PUSH_ID handling
- BUG/MINOR: cache: Fix copy of value when parsing maxage
- BUG/MEDIUM: mux-h1: Dup connection/upgrade value to parse it when making headers
- BUG/MEDIUM: htx: Fix headers rollback on partial copy in htx_xfer()
- MINOR: deinit: release the in-memory copy of shared libs
- MINOR: debug: add -dA to dump an archive of all dependencies
- BUG/MEDIUM: ssl: Make sure the alpn length is small enough
- BUG/MINOR: applet: Commit changes into input buffer after sending HTX data
- BUG/MINOR: mux-spop: Fix possible off-by-one OOB read in spop_get_varint()
- BUG/MEDIUM: leastconn: Unlock the write lock on allocation failure
- BUG/MINOR: tasks: Increase the right niced_task counter
- BUILD: makefile: search for Lua 5.5 as well
- DEV: dev/gdb: improve ebtree pointer handling
- DEV: dev/gdb: add simple task dump
- DEV: dev/gdb: add simple thread dump
- DEV: dev/gdb: add fdtab dump
- DOC: config: add a few more explanation in http-reusee regarding sni-auto
- REGTESTS: add basic QMux tests
- BUG/MINOR: http-act: Properly handle final evaluation in pause action
- BUILD: makefile/lua: use the system's default library before all other variants
- BUG/MINOR: startup: unbreak chroot with CAP_SYS_CHROOT
- BUG/MINOR: haterm: do not try to bind QUIC when not supported
- BUG/MINOR: haterm: also apply the tcp-bind-opts to clear TCP "bind" lines
- CLEANUP: haterm: do not try to bind to SSL when not built in
- MINOR: haterm: enable ktls on the SSL bind line when supported
- CI: github: replace cirrus by a vmactions/freebsd-vm job
- BUILD: makefile: fix build error with GNU make 4.2.1 and /bin/dash
- BUG/MEDIUM: channel: Fix condition to know if a channel may send
- BUG/MEDIUM: vars: Properly eval set-var-fmt action for emtpy log-format string
- CI: github: run illumos job weekly on Mondays at 03:00 instead of monthly
- BUG/MEDIUM: stream: Don't use small buffer on queuing with a request data filter
- BUG/MINOR: jwe: don't write randoms past MAX_DECRYPTED_CEK_LEN in RSA_PKCS1_PADDING
- BUG/MEDIUM: chunk: do not rely on small trash by default for expressions
- CLEANUP: map: always test pat->ref in sample_conv_map_key()
- DEV: patchbot: prepare for new version 3.5-dev
- MINOR: version: mention that it's 3.4 LTS now.
Willy Tarreau [Wed, 3 Jun 2026 12:36:37 +0000 (14:36 +0200)]
CLEANUP: map: always test pat->ref in sample_conv_map_key()
sample_conf_map_key() calls pattern_exec_match() which may return a
static pattern with ref=NULL when passed with fill=1 (which is the
case) and pat->match == NULL (which doesn't seem to be the case). It
doesn't seem it could happen with standard maps, as only "-m found"
drops has a NULL ->match function and there's no keyword associated
with it) but maybe this could happen with maps implemented in Lua,
though this remains unlikely.
Anyway better clarify the situation by always checking that the ref
is non-null before dereferencing it, it will at least avoid warnings
from code coverage tools.
Willy Tarreau [Wed, 3 Jun 2026 12:00:31 +0000 (14:00 +0200)]
BUG/MEDIUM: chunk: do not rely on small trash by default for expressions
There's a corner case with get_trash_chunk_sz() combined with the use
of small bufs: if some incoming data is going to be inflated by a
converter in a non-predictable way (say url_enc etc) then there are
two possibilities:
- either we try to allocate a size that corresponds to the data, but
we risk to allocate a small buf to convert a 900B chunk, that will
now fail if it contains too many non-printable chars;
- or we try to allocate 3x the size to be conservative, but without
large bufs we'd fail to transcode any chunk larger than 5.3kB, even
if it contains only printable chars.
The approach should definitely be refined and it is not 100% reliable
for now. Better temporarily ignore the small buffers for these particular
cases where the savings are not relevant, and see how to pass the knowledge
of the expected size ranges deeper down the API in 3.5. We may possibly rely
on the current trash size (instead of contents) or other mechanisms that
are yet to be specified. alloc_small_trash_chunk() gets the same change
BTW for the same reasons.
The comment for get_trash_chunk_sz() was updated to restate the importance
of being conservative when requesting a size.
Willy Tarreau [Wed, 3 Jun 2026 11:23:32 +0000 (13:23 +0200)]
BUG/MINOR: jwe: don't write randoms past MAX_DECRYPTED_CEK_LEN in RSA_PKCS1_PADDING
The recent fix in commit 1a5a33396d ("BUG/MEDIUM: jwe: substitute random
CEK on RSA1_5 decryption failure per RFC 7516 #11.5") writes 8 bytes at
once but stops at the last one, so it can overflow the sample by 7 bytes.
This is totally harmless since the max size is 64 bytes, but better stop
at the boundary. A final loop completes one byte at a time by construction
so that we can adapt to any value of MAX_DECRYPTED_CEK_LEN, but the compiler
will not emit it since we stop at 64.
BUG/MEDIUM: stream: Don't use small buffer on queuing with a request data filter
When there is a filter registered on the request data forwarding, we must
disable usage of the small buffers. For now it is safer to do so because we
don't know if the filter will properly handle the small buffers. In
addition, there is a true issue because it is possible to never re-arm the
receives in that case because the buffer reserve must be respected. This
leads to think a small buffer is always full, even empty one.
CI: github: run illumos job weekly on Mondays at 03:00 instead of monthly
The previous schedule (25th of each month) provided too little coverage
frequency. Switch to a weekly run every Monday at 03:00 UTC to catch
regressions sooner.
BUG/MEDIUM: vars: Properly eval set-var-fmt action for emtpy log-format string
When the log-format string was empty, in action_store() function, a fallback was
performed on the expression evaluation, thinking a set-var() was performed.
However, it is possible to have an empty log-format string. At least, on 3.2 and
3.0, it is allowed to parse an empty log-format string, quoted empty string are
not rejected.
So, on 3.2 and 3.0, it was possible to have a "set-var-fmt" action in the config
leading to parse an empty log-format string. Doing so, a crash could be
experienced when the action was executed because the fallback on the expression
evaluation led to dereference a NULL pointer.
To fix the issue, during parsing the action type is now set to a different value
for a "set-var" or a "set-var-fmt" action. And this action type is tested during
execution to perform the right action.
This patch should fix issue #3406. It must be backported as far as 3.0. Only 3.2
and 3.0 are affected by the issue.
BUG/MEDIUM: channel: Fix condition to know if a channel may send
Historically, we considered a channel cannot send before the connection was
established. This was useful to know if the reserve should still be
respected for the receives. This was because it was possible to rewrite the
request on connection retry (because of http-send-name-header option).
However noadays, it is a useless limitation. Once data forwarding is
started, there is no longer rewrites on the request at the stream layer
(http-send-name-header option is handled by the muxes). And, since it is
possible to use small buffers to queue requests, it could be an issue,
because the reserve and the small buffer size are the same by default. Once
a small request was finally dequeued, the receives on client side were not
re-armed because we should still respect the reserve on receives
(channel_recv_limit() was returning 0 in that case).
To fix the issue, we must consider a channel may send since the underlying
stconn has reached the SC_ST_REQ state, instead of SC_ST_EST. Doing so, we
are able to ignore the reserve earlier and the receives can be re-armed even
with small buffers.
There is no reason to backport this patch, except if an issue is reported,
because only the 3.4 is concerned. But it could theorically be backported to
all stable versions.
Willy Tarreau [Wed, 3 Jun 2026 10:01:47 +0000 (12:01 +0200)]
BUILD: makefile: fix build error with GNU make 4.2.1 and /bin/dash
The latest fix in the Makefile in commit 9993688954 ("BUILD: makefile/lua:
use the system's default library before all other variants") broke the
build on a machine with GNU make 4.2.1 and /bin/dash:
Makefile:690: *** unterminated call to function 'shell': missing ')'. Stop.
It's caused by the '#' in '#include'. Protecting it with a backslash
fixes the make issue but moves it to the shell where it's echoed in the
output. Printf '\043' works but not sure if it's everywhere yet. At this
point better just revert that tiny part which was made to refine the
presence check for lua.h by checking that it contains valid C code. If
the commit above is backported, this one will have to be as well.
Willy Tarreau [Tue, 2 Jun 2026 17:19:25 +0000 (19:19 +0200)]
MINOR: haterm: enable ktls on the SSL bind line when supported
When both USE_LINUX_SPLICE and USE_KTLS are enabled, it's worth
enabling kTLS on the bind line as it significantly increases the
local bit rate as well as through TLS accelerators (up to x2/x3).
The -dT option remains available to disable it. It was verified to
gracefully downgrade when not supported (e.g. OpenSSL 3.0.1 does
this).
Willy Tarreau [Tue, 2 Jun 2026 16:57:05 +0000 (18:57 +0200)]
CLEANUP: haterm: do not try to bind to SSL when not built in
When built without USE_OPENSSL, the binding errors are dirty, speaking
about crt-store and stuff like this. Better just indicate that SSL
support was not built in and explain how to enable it.
Willy Tarreau [Tue, 2 Jun 2026 16:52:56 +0000 (18:52 +0200)]
BUG/MINOR: haterm: also apply the tcp-bind-opts to clear TCP "bind" lines
Commit 92581043fb ("MINOR: haterm: add long options for QUIC and TCP
"bind" settings") added --tcp-bind-opts. The doc (and commit) says that
it applies to TCP bind lines but it only applied to the TCP/SSL ones,
not the clear ones. Let's fix it. No backport needed, this is only 3.4.
Willy Tarreau [Tue, 2 Jun 2026 16:46:01 +0000 (18:46 +0200)]
BUG/MINOR: haterm: do not try to bind QUIC when not supported
When building without QUIC support (e.g. an SSL library not supporting
it), we'll get errors when trying to bind to the SSL port that QUIC is
not supported because the quic binding was unconditional. Let's only
place it when QUIC is supported. No backport needed, this is only 3.4.
Maxime Henrion [Mon, 1 Jun 2026 14:49:02 +0000 (10:49 -0400)]
BUG/MINOR: startup: unbreak chroot with CAP_SYS_CHROOT
The use of the unshare() mechanism to get the ability to chroot as an
unprivileged user produced a warning on some configurations where the
haproxy process has the CAP_SYS_CHROOT capability. We now only attempt
to use it when a previous chroot() call failed because of insufficient
privileges.
This should fix GitHub issue #3395. No backport needed.
Willy Tarreau [Tue, 2 Jun 2026 14:34:37 +0000 (16:34 +0200)]
BUILD: makefile/lua: use the system's default library before all other variants
The recent update to the makefile in commit bfbca23dc2 ("BUILD: makefile:
search for Lua 5.5 as well") to enable searching for Lua 5.5 revealed a
problem by which we were using the fallback versions before the main one
(e.g. /usr/include/lua-5.4/lua.h before /usr/include/lua.h). However, the
libs often contain the version in their name so that we can end up linking
with 5.5 while 5.4 was used in the include.
This was detected only when enabling lua 5.5 because in Lua 5.4
"luaL_openlibs()" was a symbol and became an inline in 5.5, preventing
from using a mix of the two versions.
The current change is minimal in that it skips all fallbacks when lua.h
is present in /usr/include, and includes it in the test to make sure that
the directory found contains valid C. LUA_LIB checks for lua before the
variants so as to remain consistent with the system provided version.
Thanks to @gene-git for reporting this problem in GH issue #3404.
This may have to be backported after a period of observation if users
face build issues for older releases on newer distros. In this case,
backporting 1c0f781994 ("MINOR: hlua: Add support for lua 5.5") would
equally be needed. However this will result in the system's version
being used first, which may or may not be desired.
BUG/MINOR: http-act: Properly handle final evaluation in pause action
The ACT_OPT_FINAL flag was not properly handled in the pause action. When
this flag is set, because of an abort or an unexpected error, an action must
no longer yield. However, in the pause action, this flag was never tested.
In case of client abort for instance, this could trigger an internal error
instead of a client error.
This patch should fix the issue #3403. It must be backported as far as 3.2.
Willy Tarreau [Tue, 2 Jun 2026 07:10:27 +0000 (09:10 +0200)]
DOC: config: add a few more explanation in http-reusee regarding sni-auto
The default sni-auto that aims at not upsetting certain servers doing
excessive checks of SNI vs host has some drawbacks (lower reuse ratio)
that are particularly hard to diagnose, so let's explain how connections
are reused/purged when dealing with many hosts, and how to cheat as well.
Let's also mention the expression used by "sni-auto" since it was only
mentioned in the code.
Willy Tarreau [Wed, 27 May 2026 16:51:28 +0000 (18:51 +0200)]
DEV: dev/gdb: add fdtab dump
Three functions are provided here:
fd_dump: lists all FDs
fd_dump_conn: lists all FDs holding a connection
fd_dump_listener: lists all FDs holding a listener
They take no argument, and dump some of the known info. E.g. for
a connection, ctrl, xprt, flags, mux, sessions, frontend's name
and session's age are reported. Example:
Willy Tarreau [Wed, 27 May 2026 16:50:11 +0000 (18:50 +0200)]
DEV: dev/gdb: add simple thread dump
The thread_dump function dumps the list of known threads and a few info
on them (pointer, current run queue, flags etc). This should help more
easily spot a particular one and find stuck ones.
Willy Tarreau [Wed, 27 May 2026 16:49:51 +0000 (18:49 +0200)]
DEV: dev/gdb: add simple task dump
New functions task_dump_wq and task_dump_rq can be used to dump tasks
in a wait queue or in a run queue respectively. For the wait queue (the
most common usage), one needs to pass either the thread-local's timers,
or the thread group ones for shared tasks:
Willy Tarreau [Wed, 27 May 2026 16:49:27 +0000 (18:49 +0200)]
DEV: dev/gdb: improve ebtree pointer handling
The ebtree descent functions currently use $arg0 as is and it's up to
the user to manually type the required casts that are never obvious
(particularly when coming from a pointer). Let's put the eb_root* cast
in the function to be more user-friendly.
Willy Tarreau [Mon, 1 Jun 2026 16:44:04 +0000 (18:44 +0200)]
BUILD: makefile: search for Lua 5.5 as well
Support for Lua 5.5 was brought in 3.4-dev2 with commit 1c0f781994
("MINOR: hlua: Add support for lua 5.5") but the Makefile doesn't look
for it, which can be quite confusing on recent distros which start to
ship with it. Let's add it to the looked up names.
BUG/MINOR: tasks: Increase the right niced_task counter
In __task_wakeup(), for a niced task, we don't always want to increase
the niced_task counter of the running thread's thread group, if we are
waking up the task of another thread, who belongs to a different thread
group, then we want to increment that thread group's counter instead, as
that's the one that will get decremented later.
So just increase the counter for the target thread'd thread group,
instead of using tg_ctx.
The impact is probably pretty minor, niced task shared amongst thread
are not very common, and the impact would mostly mean we'd run more/less
tasks in one run of process_runnable_tasks() than expected.
This should be backported as far as 2.8.
BUG/MEDIUM: leastconn: Unlock the write lock on allocation failure
When we fail to allocate a new tree element, we're still holding the
write lock, so we should do an write unlock, not a read unlock, or the
lock will get corrupted and most likely this will end in a deadlock.
BUG/MINOR: mux-spop: Fix possible off-by-one OOB read in spop_get_varint()
In spop_get_varint(), -1 is returned if there is not enough data in the
buffer to decode the variable integer. However a strict comparison agasint
b_data() was performed, which is wrong. A failure must be reported if the
index is greater or equal to b_data().
BUG/MINOR: applet: Commit changes into input buffer after sending HTX data
After sending HTX data to an applet, htx_to_buf() must be called on the
applet buffer to commit changes (and possibly to reset the buffer if it is
empty). This was performed on the output buffer while it should in fact be
performed on the input buffer. So let's fix it.
BUG/MEDIUM: ssl: Make sure the alpn length is small enough
When the check for server hash was introduced to make sure we're using
the right alpn, the logic to store the new alpn was flawed. We should
always check that the new alpn length is small enough to fit in the
buffer, no matter if the server hash is not the same or not. So always
check the length first, and only check if the alpn or the server changed
after.
This should be backported whenever commit de3f245df032073dec134f5d0f597d73a7b2575d has been backported.
Willy Tarreau [Fri, 29 May 2026 09:39:05 +0000 (11:39 +0200)]
MINOR: debug: add -dA to dump an archive of all dependencies
This adds "-dA[file]" on the command line, which dumps an archive of all
dependencies detected at runtime into the designated file in tar format.
This is equivalent to "set-dumpable libs", but instead of keeping the libs
in memory, it dumps them into a file. This may be used after a core dump,
in order to provide all necessary libraries to developers to permit them
to exploit the core. This may not be available on all operating systems.
Willy Tarreau [Fri, 29 May 2026 09:20:12 +0000 (11:20 +0200)]
MINOR: deinit: release the in-memory copy of shared libs
When shared libs were loaded via "set-dumpable libs", better release
them upon deinit, it will make valgrind happier. For this we now have
a new function free_collected_libs() in tools.c and call it in deinit().
BUG/MEDIUM: htx: Fix headers rollback on partial copy in htx_xfer()
In htx_xfer() function, when headers are partially copied, depending on the
flags, a rollback may be performed to remove all copied headers from the
destination message. However, there was an issue in the loop performing the
rollback. Instead of decrementing the returned value using the size of the
HTX block from the destination message, the one from the source message was
used. So the wrong value was be returned and in worst case, it could
overflow.
In addition, the BUG_ON() in the loop was removed because test condition was
wrong.
BUG/MEDIUM: mux-h1: Dup connection/upgrade value to parse it when making headers
When message headers are formatted, the connection and upgrade header values
are parsed to be sanitized and to fill H1M flags. The values are modified in
place without changing the HTX message information accordingly (the block
info and the HTX info). It could be an issue if the output buffer is full
and the header cannot be formatted. Because the formatting can be stopped
with a HTX message in hazardous state.
It should be quite difficult to trigger this issue. But now, a copy of the
value is performed before parsing it. So only the copy will be altered,
leaving the HTX message in a safe state.
This patch must be backported to all stable versions.
BUG/MINOR: cache: Fix copy of value when parsing maxage
During maxage parsing, the size of the value was not properly computed when
it was copied into the trash chunk. The name (max-age or s-maxage) must be
skipped with the '=' character. But instead of doing a subtraction, and
addition was performed, adding 2 extra bytes to the value used for the
convertion to integer.
In addition, the "chunk_memcat(chk, "", 1)" operation to add a trailing
NULL-byte was replaced by "*(b_tail(chk)) = '\0'". It a bit easier to
understand.
This patch should be backported to all stable versions.
MAX_PUSH_ID frames are emitted by the client only on the control stream.
These conditions are checked via h3_check_frame_valid() since the
following patch.
However control stream test was inverted by mistake. This patch fixes
it.
Due to this bug, H3 connections were improperly closed on error by
haproxy for clients which send MAX_PUSH_ID frames. This has been
detected on the QUIC interop with aioquic and neqo clients.
Olivier Houchard [Thu, 28 May 2026 20:01:01 +0000 (22:01 +0200)]
BUG/MEDIUM: qmux: Close connection on invalid frame
In qcc_qmux_recv(), when calling qmux_parse_frm(), also treat negative
values as an error, and close the connexion. qmux_parse_frm() will
return -1 if the frame is of an invalid type, and we don't want to
process any further, or we will crash.
Willy Tarreau [Sun, 31 May 2026 20:42:53 +0000 (22:42 +0200)]
DOC: add security.txt describing how to report security issues
Move the security contact out of intro.txt into a dedicated, easily
searchable doc/security.txt that points reporters at the threat model
first, and reference it from intro.txt's contacts section and the
documentation index.
Willy Tarreau [Sun, 31 May 2026 18:28:08 +0000 (20:28 +0200)]
DOC: internals: add a threat model definition
Add doc/internals/threat-model.txt describing what does and does not
qualify as a security vulnerability in HAProxy so that reporters and
developers have a common understanding of the threat model, and make it
clear that anything non-critical should be handled in the open and
not hidden behind embargoes.
The document lists assets to protect, what constitutes an attack, what
are the mitigations in place, and the severity ordering of various
risks. This may in the long term also help developers make better
choices of default settings and option names, and may also justify
changing default settings over time when modern operating systems
bring new possibilities.
A section also lists some invariants and defaults in an attempt to
limit the risk of reporting theoretical issues that are technically
impossible to happen in the field.
This is an initial version meant to be refined as cases arise. It
was incrementally designed and cross-checked with the help of three
independent LLMs (Qwen, Gemini and Claude) until each correctly
classified a set of sample reports against it. In the current state
they do not raise any residual ambiguities anymore.
Willy Tarreau [Sat, 30 May 2026 09:43:43 +0000 (09:43 +0000)]
DOC: internals: clarify ambiguous wording in core-principles
After testing against a few LLMs, it appeared that several entries in
the core principles document were ambiguous or imprecise and could be
misread (size_t, pools, trash, dwcas, comparison, ncbuf). No more
complaint after this rewording so this will be sufficient for now.
Ilia Shipitsin [Sun, 31 May 2026 07:19:21 +0000 (09:19 +0200)]
CLEANUP: admin/halog: improve handling of memory allocation errors
Found via cppcheck --force --enable=all --output-file=haproxy.log :
admin/halog/halog.c:1805:2: warning: If memory allocation fails, then there is a possible null pointer dereference: ustat [nullPointerOutOfMemory]
admin/halog/halog.c:1806:2: warning: If memory allocation fails, then there is a possible null pointer dereference: ustat [nullPointerOutOfMemory]
admin/halog/halog.c:1809:2: warning: If memory allocation fails, then there is a possible null pointer dereference: ustat [nullPointerOutOfMemory]
admin/halog/halog.c:1810:2: warning: If memory allocation fails, then there is a possible null pointer dereference: ustat [nullPointerOutOfMemory]
admin/halog/halog.c:1814:2: warning: If memory allocation fails, then there is a possible null pointer dereference: ustat [nullPointerOutOfMemory]
Ilia Shipitsin [Sun, 31 May 2026 07:14:39 +0000 (09:14 +0200)]
CLEANUP: ncbmbuf: improve handling of memory allocation errors in unit tests
Found via cppcheck --force --enable=all --output-file=haproxy.log :
src/ncbmbuf.c:192:9: warning: If memory allocation fails, then there is a possible null pointer dereference: area [nullPointerOutOfMemory]
src/ncbmbuf.c:373:9: warning: If memory allocation fails, then there is a possible null pointer dereference: data [nullPointerOutOfMemory]
src/ncbmbuf.c:546:9: warning: If memory allocation fails, then there is a possible null pointer dereference: data [nullPointerOutOfMemory]
Found via cppcheck --force --enable=all --output-file=haproxy.log :
addons/51degrees/51d.c:130:3: warning: If memory allocation fails, then
there is a possible null pointer dereference: name [nullPointerOutOfMemory]
addons/51degrees/51d.c:922:4: warning: If memory allocation fails, then
there is a possible null pointer dereference: _51d_property_list [nullPointerOutOfMemory]
Olivier Houchard [Fri, 29 May 2026 17:38:03 +0000 (19:38 +0200)]
BUG/MEDIUM: resolvers: Wait a bit before calling the xprt prepare_srv
We can't call call the prepare_srv() method too early, because it needs
global.nbthreads to be properly set, which won't be true at post_parse
time. So instead, make it so that code runs later, as a post_check
function, when it will be safe to do so.
This should be backported up to 2.8.
This should fix github issue #3402
Maxime Henrion [Fri, 29 May 2026 15:55:39 +0000 (11:55 -0400)]
BUG/MINOR: cache: fix cache tree iteration
Ever since the introduction of multiple cache trees, the "show cache"
CLI command was not properly showing the contents of each tree, but was
only showing the first one.
Fix that by properly resetting next_key when we switch to the next tree.
Olivier Houchard [Fri, 29 May 2026 14:17:36 +0000 (16:17 +0200)]
MINOR: quic: Copy sin6_flowinfo and sin6_scope_id too
In in46un_to_addr(), when copying a struct sockaddr_in6, copy the
sin6_flowinfo and sin6_scope_id, as they are part of the structure too.
They are unlikely to be of any use for us, but this is more correct
anyway.
Olivier Houchard [Fri, 29 May 2026 14:03:26 +0000 (16:03 +0200)]
BUG/MINOR: quic: Fix another buffer overflow with sockaddr_in46
Very similarly to what was fixed with commit 63f853957af3ee062493bb3700f964ce456125b0, we cast a sockaddr_in46 in
quic_dgram_parse() to sockaddr_storage while providing source and
destination addresses to qc_handle_conn_migration(), which will then
copy the whole sockaddr_storage, thus reading memory past what was
provided.
While this most likely won't have any impact, let's do the right thing,
and use in46un_to_addr() to generate a real sockaddr_storage.
This does not need to be backported.
Willy Tarreau [Fri, 29 May 2026 09:07:38 +0000 (11:07 +0200)]
BUILD: makefile: include EXTRA_MAKE in the .build_opts construction
EXTRA_MAKE allows to source an external makefile to bring new options
that will result in including add-ons etc. It must be part of the
construction of .build_opts that decides whether or not existing .o
are reusable or need to be rebuilt, otherwise we can end up with a mix
of .o built with some options and others with different options.
Willy Tarreau [Wed, 27 May 2026 09:07:09 +0000 (11:07 +0200)]
MINOR: thread: report when thread-groups or nbthread results in less threads
Some setups where the number of threads is forced without any binding
(no cpu-map), are quite suspicious if they result in less threads than
available CPUs, and not even predictably bound, so we want to notify
the user that this might be an oversight.
Similarly, when thread-groups is forced and not nbthread (and no cpu-map),
and the final number of threads is lower than the hard-limit or the number
of CPUs we also indicate the impact and how to remedy it. This can happen
for example when starting on a machine with more than 64 CPUs and
thread-groups forced to 1, or on more than 128 CPUs and thread-groups
forced to 2 (e.g. when moving an older config to a new platform).
It is possible that some of these conditions might need to be readjusted
in the future to catch other traps or to relax certain commonly used,
valid cases, so for now it is preferable not to backport this patch.
Willy Tarreau [Thu, 28 May 2026 15:41:18 +0000 (17:41 +0200)]
MINOR: cpu-topo: notify when cpu-policy is ignored due to other settings
The cpu-policy directive is ignored when nbthreads, thread-groups, or
cpu-map are set. In addition, first-usable-node is ignored when the
process was externally restricted (e.g. taskset). This is difficult to
debug when it happens because multiple parameters come into the mix and
it's easy to forget to unset one. Let's emit a notice when this happens
and the policy was forced. This way, it remains silent with the default
policy, but if it was forced, the incompatibility is reported.
It's worth noting that ll the cpu-policy functions take a char **err
but none uses it. It could have been useful here instead of calling
ha_notice() all along, but one needs to determine who the consumers
are and who will be responsible for freeing the message, so let's go
with ha_notice() given that were were already some diag_warnings in
these functions.
Willy Tarreau [Thu, 28 May 2026 15:19:44 +0000 (17:19 +0200)]
CLEANUP: thread: indicate when max-threads-per-group is ignored
Since it's easy to get caught by some parameters being ignored, let's
detect when mtpg was explicitly set and report a notice if it is ignored
due to thread-groups being set. For this we need to avoid presetting
the value in the global section and only set it when entering function
thread_detect_count(), which is OK since the value cannot be used before.
Willy Tarreau [Thu, 28 May 2026 14:48:51 +0000 (16:48 +0200)]
BUG/MEDIUM: threads: ignore max-threads-per-group when thread-groups is set
As documented, max-threads-per-group is the default number of threads
to arrange in a group before creating another group, and is only meant
to be used when thread-groups is not set.
However it was always enforced, so configs like:
global
thready-groups 2
which were sufficient in 3.2 and above to start with 64-128 threads
are now suddenly limited to 32 threads! Let's relax the limit when
thread-groups is set!
Willy Tarreau [Thu, 28 May 2026 15:15:57 +0000 (17:15 +0200)]
BUG/MINOR: threads: set at least grp_max when mtpg is too small
When starting, say, 128 threads with max-threads-per-group set to 2
and MAX_TGROUPS set to the default 32, instead of setting the resulting
number of groups to 32 and threads to 64, they're set to 1 and 32
respectively because the condition to raise grp_min is not satisfied.
Let's cut the condition in two parts to also permit to raise it at
least to grp_max.
Willy Tarreau [Thu, 28 May 2026 16:44:36 +0000 (18:44 +0200)]
REGTESTS: quic: disable quic/ocsp_auto_update for now
It was made from the split of the original one into the SSL and the QUIC
variant. However there's a catch: both use the same certificates which
includes the OCSP URL 127.0.0.1:12345, and both need to start a server
on that port. Depending on the number of parallel process and their
speed, they might very well work, or totally fail due to a binding
conflict and the fact that the test runs for a few seconds.
Let's disable the QUIC variant for now, since the whole point of the
test is to verify all the sequencing, the SSL one is greatly sufficient.
Maybe a better approach can be found later.
BUG/MEDIUM: lua: register hlua_init() as a pre-check to fix crash without Lua config
Commit 1c59c39171529 deferred hlua_init() to be called lazily from the
config keyword handlers (lua-load, lua-load-per-thread,
lua-prepend-path, tune.lua.openlibs), with a call inside
hlua_post_init() as a safety net for the case where no Lua directive
appears in the configuration at all.
The problem is hlua_init() is a function that allocates internal
servers (socket_proxy, socket_tcp, socket_ssl) that must exist before
haproxy initialize the configuration. But hlua_post_init() is done too
far after this initialization, so the safety net does not work
correctly.
This would results in a crash in the deinit() if no lua
configuration was loaded in haproxy.
Core was generated by `./haproxy -W -f /dev/null'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00005671c72b1047 in _ceb_first (root=0x30, kofs=16, key_type=CEB_KT_U64, key_len=0,
is_dup_ptr=0x7ffc13197a14) at include/import/cebtree-prv.h:1160
1160 if (!*root)
(gdb) bt
#0 0x00005671c72b1047 in _ceb_first (root=0x30, kofs=16, key_type=CEB_KT_U64, key_len=0,
is_dup_ptr=0x7ffc13197a14) at include/import/cebtree-prv.h:1160
#1 _ceb64_first (root=0x30, kofs=16) at src/_ceb_int.c:73
#2 ceb64_ofs_first (root=0x30, kofs=16) at src/_ceb_int.c:66
#3 0x00005671c6be5e6e in srv_close_idle_conns (srv=0x5671fd592a80) at src/server.c:7676
#4 0x00005671c6d3be17 in deinit_proxy (p=0x5671fd5d7780) at src/proxy.c:393
#5 0x00005671c6d3c536 in proxy_drop (p=0x5671fd5d7780) at src/proxy.c:479
#6 0x00005671c6aed998 in hlua_deinit () at src/hlua.c:14934
#7 0x00005671c6db2e41 in deinit () at src/haproxy.c:2846
#8 0x00005671c6db3d98 in deinit_and_exit (status=0) at src/haproxy.c:2966
#9 0x00005671c6db6111 in main (argc=4, argv=0x7ffc131983c8) at src/haproxy.c:3997
The fix is to do the initialization earlier, in a pre-check callback.
BUG/MINOR: quic: update drs->lost before calling on_ack_recv
The QUIC congestion control algorithm impacted by this bug is BBR.
In qc_notify_cc_of_newly_acked_pkts(), drs->lost was updated after
quic_cc_drs_on_ack_recv(), causing the current sample's lost count to
miss the bytes_lost from the current loss detection round. This meant
that rs->lost = drs->lost - rs->prior_lost would always be 0 for the
current losses, since both drs->lost and rs->prior_lost (captured at
packet send time) excluded the current bytes_lost.
Moving drs->lost += bytes_lost before on_ack_recv ensures that the
rate sample correctly includes the newly detected lost bytes, matching
the BBR algorithm's intent where C.delta_lost = C.lost - C.prior_lost
should reflect all losses since the last sample.
Must be backported as far as 3.1 where delivery rate sampling was
implemented.
BUG/MEDIUM: quic: reset consecutive_losses on exit from recovery period (cubic)
When exiting the recovery period and re-entering congestion avoidance,
the consecutive_losses counter was not reset. This meant that if a loss
event arrived immediately after the ACK that ended recovery, the counter
would still hold the value that triggered recovery, causing an immediate
re-entry into recovery (recovery -> CA -> recovery loop).
Resetting consecutive_losses to 0 on recovery exit matches the behavior
of resetting it on ACK in CA, ensuring a clean slate for the new
congestion avoidance period.
BUG/MEDIUM: quic: reset cwnd in slow_start on persistent congestion (cubic)
The cubic slow_start callback was only resetting the internal cubic state
without reducing the congestion window, unlike newreno which calls
quic_cc_path_reset(). Per RFC 9002, persistent congestion should trigger
both entry into slow start and a reduction of the congestion window.
MEDIUM: quic: optimize HKDF operations by reusing per-thread contexts
Allocating and freeing an OpenSSL EVP_PKEY_CTX context via
EVP_PKEY_CTX_new_id() and EVP_PKEY_CTX_free() on every HKDF cryptographic
operation (such as during stateless reset token generation) induces
unnecessary memory allocation overhead.
Optimize this by introducing a global per-thread context array
'quic_tls_hkdf_ctxs'. These contexts are allocated and initialized once
at startup via a POST_CHECK hook (quic_tls_alloc_hkdf_ctxs) and are
properly freed at exit via a POST_DEINIT hook (quic_tls_dealloc_hkdf_ctxs).
The functions quic_hkdf_extract(), quic_hkdf_expand(), and
quic_hkdf_extract_and_expand() now reuse the pre-allocated context
corresponding to the current thread ID ('tid'), removing dynamic
allocations from these frequent execution paths.
As a cleanup, quic_hkdf_expand() is now static and unexported from the
header file.
Should be easily backported to all versions for optimization purposes.
BUG/MINOR: quic: fix ack range node pool_free call passing wrong pointer type
In quic_insert_new_range(), the variable 'first' is a struct eb64_node*,
but pool_free expects a struct quic_arng_node*. While the addresses are identical
(since 'first' is the first member of quic_arng_node), this is technically
incorrect and should use eb64_entry() for proper type safety.
Amaury Denoyelle [Thu, 28 May 2026 13:55:56 +0000 (15:55 +0200)]
BUG/MINOR: mux_quic: prevent BE reuse with an errored conn
When a backend connection is reused, qcm_strm_attach() callback is used.
A BUG_ON() is present to ensure that the connection is not already on
error. This should be guaranteed by the fact that idle insertion is
skipped for such connections.
However, when a connection is flagged on error, it is not immediately
removed from its idle/avail pool. Thus, there is a risk that it is
reused, triggering the aformentioned BUG_ON() statement.
This issue should be avoided via avail_streams callback which should
return 0, forcing the caller to cancel the connection reuse. In QUIC,
this callback implementation relies on internal qcc_be_is_reusable().
However, it lacked checks for error status.
To fix this, extend qcc_be_is_reusable() to properly check connection
errors or an expired timeout.
Previously, these parameters were already checked by qcc_is_dead(). As
it also relies on qcc_be_is_reusable(), this patch also rearranges it to
avoid duplicate checks for backend connections.
Amaury Denoyelle [Thu, 28 May 2026 14:44:03 +0000 (16:44 +0200)]
BUG/MINOR: mux_quic: fix BE conn removal on app shutdown
When QUIC application layer is shut for a backend connection, the
connection is immediately removed from its idle pool. This is a nice
optimization as this prevents a future streams to try to reuse an
unusable connection. This is implemented since the following commit.
However, this removal is not correctly performed as it is used
conn_delete_from_tree(). For private connections, this can cause crashes
as they are stored in the session instead. Thus, connection status is
now properly check, and alternatively session_unown_conn() is used if
stored in the session.
BUG/MINOR: mux_quic: open an idle QCS on reset on BE side
On the backend side, a QCS may be opened but resetted immediately. No
STREAM frame will be emitted prior to the RESET_STREAM. When the latter
is sent, qcs_close_local() will mark the QCS Tx channel as closed.
In this case, a BUG_ON() would be triggered as there is QCS Tx channel
is not yet marked as opened. To prevent this, add a qcs_idle_open() call
when the stream is resetted, but only for the backend side.
BUG/MINOR: mux-h2: Count padding for connection flow control on error path
When DATA frame are received, we take care to update the counter used to
send WINDOW_UPDATE for the connection. It is also performed on error path
when DATA frames are processed. However, when this happened, only the frame
length was accounted while the padding must also be considered.
To fix the issue, the full frame length (h2c->dfl), which include the
padding length, must be added to the amount of newly received data
(h2c->rcvd_c).
The issue was introduced with commit eeacca75d ("BUG/MINOR: mux-h2: count
rejected DATA frames against the connection's flow control") and backported
to 2.8.
REGTESTS: lua: fix tune.lua.openlibs in Lua reg-tests
These tests were using "tune.lua.openlibs none" with lua-load, which
was a no-op in the old code since Lua states 0 and 1 were always
initialised before config parsing with all standard libraries.
Now that the Lua VM is initialised lazily, the restriction correctly
applies to state 0 as well. Replace "none" with the minimal set of
libraries actually required by each test's Lua code:
BUG/MEDIUM: lua: defer Lua VM initialisation to the first Lua config keyword
HAProxy used to call hlua_init() unconditionally from step_init_1(),
before any configuration file was parsed. As a consequence, Lua states
0 and 1 were always created with hlua_openlibs_flags set to its default
value (HLUA_OPENLIBS_ALL), regardless of any tune.lua.openlibs directive
that appeared later in the global section. With multiple threads, states
2..N were created correctly in hlua_post_init() after the config had been
parsed, while states 0 and 1 retained the full standard-library set.
This produced the observable bug reported in GitHub issue #3396: a script
loaded with lua-load-per-thread could see require() as a function on
thread 1 but nil on thread 2 when tune.lua.openlibs was used to restrict
the available libraries.
The initialisation is now lazy. hlua_init() is idempotent: it returns
immediately if the states already exist (hlua_states[0] != NULL). It is
called explicitly from the three config keyword handlers that need the
Lua states to be live before they can do their work (lua-load,
lua-load-per-thread, lua-prepend-path) and from tune.lua.openlibs, after
the hlua_openlibs_flags variable has been updated, so that the states are
always created with the correct library set.
hlua_post_init() calls hlua_init() unconditionally as a safety net,
covering the case where no Lua directive appeared in the configuration at
all (no global section, or only pure-tuning directives such as timeouts
and memory limits), and ensuring correct behaviour with multiple
consecutive global sections.
As a result of this change, tune.lua.openlibs must now appear before
lua-load, lua-load-per-thread, and lua-prepend-path in the configuration;
if any of those keywords is encountered first, the Lua states will already
be initialised and tune.lua.openlibs with a non-default value will return
a parse error.
BUG/MINOR: quic: Fix memory leak in quic_deallocate_dghdlrs()
When deallocating the QUIC datagram handlers, the per-thread buffer
allocated inside quic_dghdlrs[i].buf.buffer was missing a free().
This led to a memory leak on exit or reload.
Fix this by freeing each thread buffer before releasing the main
quic_dghdlrs array.
Unlike the detection performed during sendto() for an unreachable peer,
ECONNREFUSED was not handled when received via recvmsg() as an ICMP
"host unreachable" message.
This patch tracks ECONNREFUSED errors on the receive path.
Note that this detection is entirely dependent on the remote host effectively
sending an ICMP "host unreachable" message and on the absence of any network
filtering (e.g., firewalls) that would drop such ICMP packets. Without
receiving this ICMP signal, the connection state cannot be updated through
this mechanism.
At a higher level, similar to how this error is handled on sendto(),
the connection is now terminated as soon as possible by calling
qc_kill_conn(). This triggers a call to qc_notify_err(). When the mux
does not exist, it attempts to create one via conn_create_mux(). While
the latter systematically fails if the connection is flagged with
CO_FL_ERROR, it has the useful side effect of waking the stconn stream
attached to the connection during a session opening without a mux
(e.g., for H3).
This issue was caught by haload (upcoming tool).
Must be backported as far as 2.6 because it impacts both the QUIC
frontends and backends.
CLEANUP: qpack: move encoded macros to qpack-t.h to avoid duplication
QPACK_LFL_WLN_BIT and related encoded field line bitmasks were defined
in both qpack-enc.c and qpack-dec.c. Moved them to qpack-t.h where
they are shared between encoder and decoder, eliminating the duplicate
definitions.
Should be backported to ease any further commit to come.
BUG/MINOR: qpack: fix huff_dec() error handling in qpack_decode_fs()
The <nlen> variable is a signed integer, but the check for a Huffman
decoding error was written as 'nlen == (uint32_t)-1'.
With standard compiler type promotion rules, this comparison happens to
work as intended when huff_dec() returns -1. However, relying on implicit
unsigned promotions for signed error checking is fragile. If a compiler
applies different promotion semantics, or if huff_dec() returns any other
negative error code, the failure would go undetected, leading to buffer
corruption or a crash via b_add() and ist2().
Fix this by using 'nlen < 0', removing any ambiguity regardless of the
compiler used.
CLEANUP: qpack: fix copy-paste typo in value Huffman debug string for WLN
In qpack_decode_fs(), inside the QPACK_LFL_WLN_BIT branch (Literal field
line with literal name), the debug message printed "[name huff ...]" instead
of "[value huff ...]" after decoding the value string.
This is a harmless copy-paste typo from the preceding name decoding block.
Even if this is a cleanup, should be easily backported to ease any further
backport.
BUG/MINOR: qpack: fix sign bit mask in qpack_decode_fs_pfx()
The sign bit of the Delta Base integer encoding was extracted using
mask 0x8 (bit 3) instead of 0x80 (bit 7). This was likely a copy-paste
error from other QPACK instructions using 3-bit varints.
According to RFC 9204 Section 5.2.1, for prefix instructions, the sign
bit 'S' is the most significant bit (bit 7) of the first byte, followed
by a 7-bit varint.
This fix is harmless for current HTTP/3 traffic: per RFC 9204, the Delta
Base calculation is strictly used for dynamic table entry references.
Since HAProxy's QPACK dynamic table is currently disabled and the extracted
sign bit is not yet used in the decoding logic (only in debug prints),
this code path has no impact on production for now.
CLEANUP: qpack: fix copy-paste typo in value Huffman debug string
In qpack_decode_fs(), when decoding a literal field line with a literal
value, the debug message mistakenly printed "[name huff ...]" instead of
"[value huff ...]" after a successful Huffman decoding of the value string.
This is a harmless copy-paste typo from the field name decoding block
just above, fix it to prevent confusion when debugging QPACK streams.
Should be easily backported to all versions to ease further modifications
into the QPACK code.
BUG/MINOR: qpack: fix potential null-pointer dereference in qpack_dht_insert()
When defragmenting the QPACK dynamic header table upfront during an
insertion, qpack_dht_defrag() can fail and return NULL if memory
allocation or re-allocation fails.
However, qpack_dht_insert() was blindly using the returned pointer
without validation, immediately leading to a null-pointer dereference
on 'dht->wrap'.
Fix this by checking if 'dht' is NULL after the defrag call and return
an error (-1).
Note that this has no impact on production yet because the QPACK dynamic
table is currently not enabled/used, so qpack_dht_insert() is never called.
BUG/MINOR: qpack: Fix index calculation in debug functions
Although qpack_idx_to_name and qpack_idx_to_value are currently only
called within uncompiled debug code, they contained an index bug. They
passed absolute indexes directly to qpack_get_dte instead of relative
dynamic table indexes.
This patch fixes the logic by subtracting QPACK_SHT_SIZE and guarding
against static table index lookups.
The commit broke the resolvers. All responses are marked as invalid. The
resolv_read_name() function can return 0 on error, but it seems also
possible to return 0 when no label name was found. And depending on the
caller, it can be an error... or not.
So, let's revert it. This might trigger a watchdog but doesn't seem to and
once fixed it makes things worse.
Amaury Denoyelle [Wed, 27 May 2026 13:34:01 +0000 (15:34 +0200)]
BUG/MINOR: qmux: reject too large initial record
Initial max_record_size is set to 16382. If the first received record
size is larger, abort xprt_qmux layer immediately without having to wait
for the timeout.
Amaury Denoyelle [Wed, 27 May 2026 13:35:34 +0000 (15:35 +0200)]
BUG/MEDIUM: qmux: do not crash on receiving an invalid first frame
With QMux, each peer has to first emit a transport parameters frame. If
the received frame is different, xprt_qmux handshake cannot proceed.
This patch removes the BUG_ON() in this case, replacing it with a safer
connection closure.
In the future, a graceful close with CONNECTION_CLOSE frame should be
implemented.
Amaury Denoyelle [Wed, 27 May 2026 13:30:04 +0000 (15:30 +0200)]
BUG/MEDIUM: qmux: do not crash on too large record
Remove BUG_ON() when reading a QMux record larger than the buffer. It is
now replaced by a safer error handling. In the future, a proper
CONNECTION_CLOSE emission should be implemented for this case.
Olivier Houchard [Wed, 27 May 2026 09:13:52 +0000 (11:13 +0200)]
BUG/MEDIUM: cpu-topo: Enforce thread-hard-limit on policy
When a policy is set, and the number of threads is calculated
dynamically, make sure we enforce thread-hard-limit, and do not create
thread groups based on how many thread we would have created without
the limit.
This should be backported to 3.3 and 3.2. The patch won't apply cleanly
there, because the code has changed since then, but it should be very
similar, only we'll have to check "cpu_count" there, where in 3.4 we
check "thr_count".
commit 72fd357814e1 ("MEDIUM: mux-h1: Return an error on h2 upgrade
attempts if not allowed") added an h1_report_glitch() call on the new
405 path but exits via "goto no_parsing", which skips the
session_add_glitch_ctr() call at the end of the parse block. As a
result fc_glitches increments correctly but the per-session stick
counters never see it, breaking sc_glitch_cnt-based rate limiting of
the H2-preface-over-H1 abuse pattern.
No backport needed beyond the branches that took 72fd357814.
[cf: Patch was edited to move the goto label instead of duplicating
the call to session_add_glitch_ctr]
BUG/MINOR: ssl-gencert: validate SNI characters to prevent SAN certificate injection
ssl_sock_add_san_ext() builds the Subject Alternative Name extension by
concatenating "DNS:" + servername and passing the result to
X509V3_EXT_nconf_nid(). OpenSSL's nconf parser splits the value string on
commas into multiple type:value SAN entries. The SNI comes from unauthenticated
TLS ClientHello data -- an attacker can embed commas and colons (e.g.,
"host,dns:internal.corp,ip:10.0.0.1") to inject arbitrary GENERAL_NAME entries
into certificates signed by HAProxy's configured CA.
This is a CA issuance-policy violation: the operator expects one certificate
per SNI hostname, but an attacker can obtain certificates containing additional
hostnames/IPs/emails without access to the CA private key.
Fix by adding ssl_sock_sni_is_valid() that validates the SNI contains only
DNS-label-legal characters (alphanumeric, hyphens, dots). The check is
performed at the start of ssl_sock_do_create_cert() before any allocation.
Commas, colons, spaces, and other special characters cause certificate
generation to fail, preventing SAN injection while allowing all valid
hostname values.
BUG/MINOR: tcpcheck: Check LDAP response to not read more data than available
tcpcheck_ldap_expect_bindrsp() parses ASN.1 BER-encoded LDAP responses from
the health check target. After reading the outer message size and validating
protocol fields, it encounters a long-form BER length for the bindResponse
value (high bit set in the length byte). The code reads nbytes = (*ptr &
0x7f) then advances ptr by 1 + nbytes without checking that enough bytes
remain in the receive buffer. So, it is possible to read more data than
available.
Note that it is only possible if the LDAP response was forged because the
message length was already checked. LDAP response remains quite short and it
is not possible to read outside the buffer area. So at worst, garbage are
parsed and a wrong result is reported by the LDAP health-check. Most
probably an error will be reported.
This patch could be backported to all stable versions.
Willy Tarreau [Tue, 26 May 2026 19:56:40 +0000 (21:56 +0200)]
[RELEASE] Released version 3.4-dev14
Released version 3.4-dev14 with the following main changes :
- MINOR: config: shm-stats-file is no longer experimental
- BUILD: proxy: unstatify the proxies_del_lock to avoid a warning without threads
- BUG/MEDIUM: net_helper: fix a remaining possibly infinite loop in converters
- MINOR: ssl_sock: remove unneeded check on QMux flags
- MINOR: connection: define xprt_add_l6hs()
- MINOR: xprt_qmux: define default value for get_alpn
- MINOR: connection: define mask CO_FL_WAIT_XPRT_L6
- MINOR: session: support QMux in clear on FE side
- MINOR: backend: support QMux in clear for BE side
- BUG/MINOR: ocsp: Manage date too far away in the future
- MINOR: mux_quic: handle STOP_SENDING in QMux
- MINOR: mux_quic: handle MAX_STREAMS for uni stream in QMux
- MINOR: mux_quic: do not crash on unhandled QMux frame reception
- BUG/MEDIUM: applet: Properly handle receives of size 0
- BUG/MEDIUM: resolvers: Fix test on dn label size in resolv_dn_label_to_str()
- BUG/MEDIUM: ssl-gencert: Unlock LRU cache if failing to generate certificate
- BUG/MINOR: quic: fix ODCID lookup from derived value
- BUG/MEDIUM: dict: hold lock while decrementing refcount in dict_entry_unref
- BUG/MINOR: tcpchecks: Limit parsing of agent-check reply to the buffer
- BUG/MEDIUM: hlua: Fix integer underflow when receiving line from lua cosocket
- BUG/MEDIUM: cli: Fix parsing of pattern finishing a command payload
- BUG/MEDIUM: acme: NUL terminate response buffer before PEM parsing
- BUILD: intops: mask the fail value in array_size_or_fail()
- BUG/MEDIUM: log-forward: make sure the month is unsigned
- BUG/MEDIUM: regex: allocate a large enough pcre2 match for all matches
- BUG/MEDIUM: tcpcheck/spoe: bound the SPOP error code to valid values
- BUG/MEDIUM: cache: fix a refcount leak for missed secondary entries
- BUG/MINOR: log: free logformat expr on compile failure in cfg_parse_log_profile
- BUG/MINOR: resolvers: fix room for trailing zero in resolv_dn_label_to_str()
- BUG/MINOR: resolvers: fix risk of appending garbage past the domain name
- BUG/MINOR: mux-h2: validate HEADERS frame length before reading stream dep
- BUG/MINOR: log: look for the end of priority before the end of the buffer
- BUG/MINOR: dict: fix refcount race on insert collision
- BUG/MINOR: init: use more than ha_random64() for the cluster secret
- BUG/MINOR: sample: limit the be2hex converter's chunk size
- CLEANUP: resolvers: use read_n32() instead of open-coded big-endian read
- CLEANUP: resolvers: remove pool_free(NULL) in SRV additional record matching
- CLEANUP: resolvers: fix comment typos and wrong filenames in file headers
- BUG/MINOR: haterm: fix the random suffix multiplication
- MINOR: haterm: enable h3 for TCP bindings
- MINOR: haterm: do not emit a warning when not using SSL
- BUG/MEDIUM: h1: drop headers whose names contain invalid chars
- BUG/MEDIUM: h1: limit status codes to 3 digits by default
- BUG/MEDIUM: cache: always verify the primary hash in get_secondary_entry()
- BUG/MINOR: cache: also recognize directives in the form "token="
- BUG/MINOR: resolvers: relax size checks in authority record parsing
- BUG/MINOR: sample: request an extra output byte for the url_dec converter
- BUG/MINOR: http-fetch: check against the whole token in get_http_auth()
- BUG/MEDIUM: acme: protect against risk of null-deref on connection failure
- BUG/MINOR: http-ext: always check remaining data when reading rfc7239 nodeport
- BUG/MINOR: base64: return empty string for empty input in base64dec()
- BUG/MINOR: payload: fix the handshake length bounds check smp_client_hello_parse()
- BUG/MINOR: ssl-hello: make use of the null-terminated servername
- BUG/MINOR: resolvers: switch to a better PRNG for query IDs
- BUG/MINOR: addons/51d: NUL-terminate headers before passing them to Trie API
- BUG/MEDIUM: tools: insert an XXH64 layer on the PRNG output
- MINOR: tools: provide a function to generate a hashed random pair
- MEDIUM: init: fall back to ha_random64_pair_hashed() for the cluster secret
- MEDIUM: tools: use the hashed random pair for UUID generation
- MEDIUM: h1: use ha_random64_pair_hashed() for the WebSocket key
- MEDIUM: quic: use ha_random64_pair_hashed() to generate the QUIC retry tokens
- MEDIUM: tools: switch the main PRNG to a thread-local xoshiro256**
- BUG/MEDIUM: h3: reject client push stream
- BUG/MINOR: h3: reject server push stream
- BUG/MINOR: h3: reject client CANCEL_PUSH frame
- BUG/MINOR: h3: adjust error on PUSH_PROMISE frame reception
- BUG/MINOR: h3: reject server MAX_PUSH_ID frame
- BUG/MEDIUM: auth: fix unconfigured password NULL deref
- BUG/MINOR: h3: add missing break on rcv_buf()
- BUG/MINOR: hlua: prevent Lua from passing CR/LF/NUL in HTTP headers
- BUG/MINOR: qmux: do not crash on frame parsing issue
- BUG/MINOR: quic: reject packet too short for HP decryption
- BUG/MINOR: jwe: enforce GCM tag length to 128 bits
- BUG/MEDIUM: jwe: substitute random CEK on RSA1_5 decryption failure per RFC 7516 #11.5
- BUG/MEDIUM: mux-fcgi: reject stream ID 0 for application records
- MINOR: http: Add function to remove all occurrences of a value in a header
- MINOR: h1: Add a H1M flag to specify a non-empty 'Upgrade:' header was parsed
- BUG/MEDIUM: h1-htx: Sanitize parsing to properly handle upgrade requests
- BUG/MINOR: mux-fcgi: Use relative offset to compute contig data in demux buf
- BUG/MINOR: mux-spop: Use relative offset to compute contig data in demux buf
- CLEANUP: mux-fcgi/mux-spop: Remove copy/pasted comment about slow realign
CLEANUP: mux-fcgi/mux-spop: Remove copy/pasted comment about slow realign
A comment about the condition to perform a slow realign of the demux buffer
was abusively copy/pasted from the FCGI multiplexer at different places in
the FCGI and SPOP multiplexers. Let's remove these comments.
BUG/MINOR: mux-spop: Use relative offset to compute contig data in demux buf
b_contig_data() should be called with a head-relative offset (0 for the
beginning of readable data). However, in the SPOP multiplexer, to get
contiguous data available in the demux buffer, it is called with
b_head_ofs(dbuf) which returns an absolute buffer position (b->head). So
b->head is counted twice. Because of this bug, the demux buffer could be
realigned while it should not and conversely.
Instead, the offset 0 must be used. So let's fix it.
BUG/MINOR: mux-fcgi: Use relative offset to compute contig data in demux buf
b_contig_data() should be called with a head-relative offset (0 for the
beginning of readable data). However, in the FCGI multiplexer, to get
contiguous data available in the demux buffer, it is called with
b_head_ofs(dbuf) which returns an absolute buffer position (b->head). So
b->head is counted twice. Because of this bug, the demux buffer could be
realigned while it should not and conversely.
Instead, the offset 0 must be used. So let's fix it.
BUG/MEDIUM: h1-htx: Sanitize parsing to properly handle upgrade requests
Thanks to previous patches, the request messages are now sanitized to
properly handle Upgrade requests. Now, if a 'connection: upgrade' header
value was found while no 'Upgrade' header, the 'upgrade' values is removed
from the 'connection' header. Conversely the opposite is also performed. If
'Upgrade' header was found, but no "conneciotn: upgrade" header value, all
occurrences of 'Upgrade' header are refused.
This patch depends on following ones:
* MINOR: h1: Add a H1M flag to specify a non-empty 'Upgrade:' header was parsed
* MINOR: http: Add function to remove all occurrences of a value in a header
It should fix the issue 3397. But the H2 part should be reviewed too, and
probably the H1 response parsing, to be consistent with this change.