BUG/MEDIUM: mux-fcgi: Use a safe loop to resume each stream eligible for sending
At the end of fcgi_send(), if the connection is not full anymore, we loop on
the send list to resume FCGI stream for sending. But a streams may be
removed from the this list during the loop. So a safe loop must be used.
This patch should be backported to all stable versions.
BUG/MAJOR: resolvers: Properly lowered the names found in DNS response
Names found in DNS responses are lowered to be compared. A name is composed
of several labels, strings precedeed by their length on one byte. For
instance:
3www7haproxy3org
There is an bug when labels are lowered. The label length is not skipped and
tolower() function is called on it. So for label length in the range [65-90]
(uppercase char), 32 is added to the label length due to the conversion of a
uppercase char to lowercase. This bugs can lead to OOB read later in the
resolvers code.
The fix is quite obvious, the label length must be skipped when the label is
lowered.
Thank you to Kamil Frankowicz for having reported this.
This patch must be backported to all stable versions.
BUG/MAJOR: fcgi: Fix param decoding by properly checking its size
In functions used to decode a FCGI parameter, the test on the data length
before reading the parameter's name and value did not consider the offset
value used to skip already parsed data. So it was possible to read more data
than available (OOB read). To do so, a malicious FCGI server must send a
forged GET_VALUES_RESULT record containing a parameter with wrong name/value
length.
Thank you to Kamil Frankowicz for having reported this.
This patch must be backported to all stable versions.
BUG/MINOR: sample: Fix sample to retrieve the number of bytes received and sent
There was an issue in the if/else statement in smp_fetch_bytes() function.
When req.bytes_in or req.bytes_out was requested, res.bytes_in was always
returned. It is now fixed.
BUG/MINOR: channel: Increase the stconn bytes_in value in channel_add_input()
This function is no longer used. So it is not really an bug. But it is still
available and could be used by legacy applets. In that case, we must take
care to increment the stconn bytes_in value accordingly when input data are
inserted.
MINOR: stream: Display the currently running filter per channel in stream dump
Since the 3.1, when stream's info are dump, it is possible to print the
yielding filter on each channel, if any. It was useful to detect buggy
filter on spinning loop. But it is not possible to detect a filter consuming
too much CPU per-execution. We can see a filter was executing in the
backtrace reported by the watchdog, but we are unable to spot the specific
one.
Thanks to this patch, it is now possible. When a dump is emitted, the
running or yield filters on each channel are now displayed with their
current state (RUNNING or YIELDING).
This patch could be backported as far as 3.2 because it could be useful to
spot issues. But the filter API was slightly refactored in 3.4, so this
patch should be adapted.
MINOR: filters: Set last_entity when a filter fails on stream_start callback
On the stream, the last_entity should reference the last rule or the last
filter evaluated during the stream processing. However, this info was not
saved when a filter failed on strem_start callback function. It is now
fixed.
MINOR: filters: Use filter API as far as poissible to break loops on filters
When the filters API was refactored to improve loops on filters, some places
were not updated (or not fully updated). Some loops were not relying on
resume_filter_list_break() while it was possible. So let's do so with this
patch.
DEBUG: stream: Display the currently running rule in stream dump
Since the 2.5, when stream's info are dump, it is possible to print the
yielding rule, if any. It was useful to detect buggy rules on spinning
loop. But it is not possible to detect a rule consuming too much CPU
per-execution. We can see a rule was executing in the backtrace reported by
the watchdog, but we are unable to spot the specific rule.
Thanks to this patch, it is now possible. When a dump is emitted, the
running or yield rule is now displayed with its current state (RUNNING or
YIELDING).
This patch could be backported as far as 3.2 because it could be useful to
spot issues.
MEDIUM: http-ana: Use the version of the opposite side for internal messages
When the response is send by HAProxy, from a applet or for the analyzers,
The request version is used for the response. The main reason is that there
is not real version for the response when this happens. "HTTP/1.1" is used,
but it is in fact just an HTX response. So the version of the request is
used.
In the same manner, when the request is sent from an applet (httpclient),
the response version is used, once available.
The purpose of this change is to return the most accurate version from the
user point of view.
MEDIUM: http-fetch: Rework how HTTP message version is retrieved
Thanks to previous patches, we can now rely on the version stored in the
http_msg structure to get the request or the response version.
"req.ver" and "res.ver" sample fetch functions returns the string
representation of the version, without the prefix, so "<major>.<minor>", but
only if the version is valid. For the response, "res.ver" may be added from
a health-check context, in that case, the HTX message is used.
"capture.req.ver" and "capture.res.ver" does the same but the "HTTP/" prefix
is added to the result. And "capture.res.ver" cannot be called from a
health-check.
To ease the version formatting and avoid code duplication, an helper
function was added. So these samples are now relying on "get_msg_version()".
MINOR: http-ana: Save the message version in the http_msg structure
When the request or the response is received, the numerical value of the
message version is now saved. To do so, the field "vsn" was added in the
http_msg structure. It is an unsigned char. The 4 MSB bits are used for the
major digit and the 4 LSB bits for the minor one.
Of couse, the version must be valid. the HTX_SL_F_NOT_HTTP flag of the
start-line is used to be sure the version is valid. But because this flag is
quite new, we also take care the string representation of the version is 8
bytes length. 0 means the version is not valid.
BUG/MINOR: h1-htx: Be sure that H1 response version starts by "HTTP/"
When the response is parsed, we test the version to be sure it is
valid. However, the protocol was not tested. Now we take care that the
response version starts by "HTTP/", otherwise an error is returned.
Of course, it is still possible to by-pass this test with
"accept-unsafe-violations-in-http-response" option.
This patch could be backported to all stable versions.
MINOR: h1-htx: Reports non-HTTP version via dedicated flags
Now, when the HTTP version format is not strictly valid, flags are set on
the h1 parser and the HTX start-line. H1_MF_NOT_HTTP is set on the H1 parser
and HTX_SL_F_NOT_HTTP is set on the HTX start-line. These flags were
introduced to avoid parsing again and again the version to know if it is a
valid version or not, escpecially because it is most of time valid.
MINOR: htx: Add a function to retrieve the HTTP version from a start-line
htx_sl_vsn() function can now be used to retrieve the ist string
representing the HTTP version from a start-line passed as parameter. This
function takes care to return the right part of the start-line, depending on
its type (request or response).
BUG/MINOR: hlua: Properly enable/disable receives for TCP applets
From a lua TCP applet, in functions used to retrieve data (receive,
try_receive and getline), we must take care to disable receives when data
are returned (or on failure) and to restart receives when these functions
are called again. In addition, when an applet execution is finished, we must
restart receives to properly drain the request.
This patch should be backported to 3.3. On older version, no bug was
reported so we can wait a report first.
BUG/MEDIUM: hlua: Fix end of request detection when retrieving payload
When the lua HTTP applet was refactored to use its own buffers, a bug was
introduced in receive() and getline() function. We rely on HTX_FL_EOM flag
to detect the end of the request. But there is nothing preventing extra
calls to these function, when the whole request was consumed. If this
happens, the call will yield waiting for more data with no way to stop it.
To fix the issue, APPLET_REQ_RECV flag was added to know the whole request
was received.
This patch should fix #3293. It must be backported to 3.3.
BUG/MINOR: hlua: Properly enable/disable line receives from HTTP applet
From a lua HTTP applet, in the getline() function, we must take care to
disable receives when a line is retrieved and to restart receives when the
function is called again. In addition, when an applet execution is finished,
we must restart receives to properly drain the request.
This patch could help to fix #3293. It must be backported to 3.3. On older
version, no bug was reported so we can wait a report first. But in that
case, hlua_applet_http_recv() should also be fixed (on 3.3 it was fixed
during the applets refactoring).
BUG/MINOR: quic: fix OOB read in preferred_address transport parameter
This bug impacts only the QUIC backend. A QUIC server does receive
a server preferred address transport parameter.
In quic_transport_param_dec_pref_addr(), the boundary check for the
connection ID was inverted and incorrect. This could lead to an
out-of-bounds read during the following memcpy.
This patch fixes the comparison to ensure the buffer has enough input data
for both the CID and the mandatory Stateless Reset Token.
Thank you to Kamil Frankowicz for having reported this.
BUG/MINOR: qpack: fix 1-byte OOB read in qpack_decode_fs_pfx()
In qpack_decode_fs_pfx(), if the first qpack_get_varint() call
consumes the entire buffer, the code would perform a 1-byte
out-of-bounds read when accessing the sign bit via **raw.
This patch adds an explicit length check at the beginning of
qpack_get_varint(), which systematically secures all other callers
against empty inputs. It also adds a necessary check before the
second varint call in qpack_decode_fs_pfx() to ensure data is still
available before dereferencing the pointer to extract the sign bit,
returning QPACK_RET_TRUNCATED if the buffer is exhausted.
Thank you to Kamil Frankowicz for having reported this.
BUG/MAJOR: qpack: unchecked length passed to huffman decoder
A call to huffman decoder function (huff_dec()) is made from qpack_decode_fs()
without checking the buffer length passed to this function, leading to OOB read
which can crash the process.
Thank you to Kamil Frankowicz for having reported this.
Willy Tarreau [Wed, 4 Mar 2026 10:37:12 +0000 (11:37 +0100)]
BUG/MEDIUM: hpack: correctly deal with too large decoded numbers
The varint hpack decoder supports unbounded numbers but returns 32-bit
results. This means that possible truncation my happen on some field
lengths or indexes that would be emitted as quantities that do not fit
in a 32-bit number. The final value will also depend on how the left
shift operation behaves on the target architecture (e.g. whether bits
are lost or used modulo 31). This could lead to a desynchronization of
the HPACK stream decoding compared to what an external observer would
see (e.g. from a network traffic capture). However, there isn't any
impact between streams, HPACK is performed at the connection level,
not at the stream level, so no stream may try to leverage this
limitation to have any effect on another one.
For the fix, instead of adding checks everywhere in the loop and for
the final stage, let's rewrite the decoder to compare the read value
to a max value that is shifted by 7 bits for every 7 bits read. This
allows a sender to continue to emit zeroes for higher bits without
being blocked, while detecting that a received value would overflow.
The loop is now simpler as it deals both with values with the higher
bit set and the final ones, and stops once the final value was recorded.
A test on non-zero before performing the shift was added to please
ubsan, though in practice zero shifted by any quantity remains zero.
But the test is cheap so that's OK.
Thanks to Guillaume Meunier, Head of Vulnerability Operations Center
France at Orange Cyberdefense, for reporting this bug.
On the backend side, QUIC MUX may be started preemptively before the
ALPN negotiation. This is useful notably for 0-RTT implementation.
However, this was a source of crashes. ALPN was expected to be retrieved
from the server cache, however QUIC MUX still used the ALPN from the
transport layer. This could cause a crash, especially when several
connections runs in parallel as the server cache is shared among
threads.
Thanks to the previous patch which reworks QUIC MUX init, this solution
can now be fixed. Indeed, if conn_get_alpn() is not successful, MUX can
look at the server cache again to use the expected value.
Note that this could still prevent the MUX to work as expected if the
server cache is resetted between connect_server() and MUX init. Thus,
the ultimate solution would be to copy the cached ALPN into the
connection. This problem is not specific to QUIC though, and must be
fixed in a separate patch.
This patch reworks the installation of app-ops layer by QUIC MUX.
Previously, app_ops field was stored directly into the quic_conn
structure. Then the MUX reused it directly during its qmux_init().
This patch removes app_ops field from quic_conn and replaces it with a
copy of the negotiated ALPN. By using quic_alpn_to_app_ops(), it ensures
it remains compatible with a known application layer.
On the MUX layer, qcc_install_app_ops() now uses the standard
conn_get_alpn() to retrieve the ALPN from the transport layer. This is
done via the newly defined <get_alpn> QUIC xprt callback.
This new architecture should be cleaner as it better highlights the
responsibility of each layers in the ALPN/app negotiation.
MINOR: mux-quic: add function for ALPN to app-ops conversion
Extract the conversion from ALPN to qcc_app_ops type from quic_conn
source file into QUIC MUX. The newly created function is named
quic_alpn_to_app_ops(). This will serve as a central point to identify
which ALPNs are currently supported in our QUIC stack.
This patch is purely a small refactoring. It will be useful for the next
one which rework MUX app-ops layer init. The current cleanup allows
notably to remove H3/hq-interop headers from quic_conn source file.
MINOR: quic/h3: reorganize stream reject after MUX closure
The QUIC MUX layer is closed after its transport counterpart. This may
be necessary then to reject any new streams opened by the remote peer.
This operation is dependent however from the application protocol.
Previously, a function qc_h3_request_reject() was directly implemented
in quic_conn source file for use when HTTP/3 was previously negotiated.
However, this solution was not evolutive and broke layering.
This patch introduces a new proper separation with a <strm_reject>
callback defined in quic_conn structure. When set, it will be used to
preemptively close any new stream. QUIC MUX is responsible to set it
just before its closure.
No functional change. This patch is purely a refactoring with a better
architecture design. Especially, H3 specific code from transport layer
is now completely removed.
BUG/MEDIUM: stream: Handle TASK_WOKEN_RES as a stream event
The conversion of TASK_WOKEN_RES to a stream event was missing. Among other
things, this wakeup reason is used when a stream is dequeued. So it was
possible to skip the connection establishment if the stream was also woken
up for a timer reason. When this happened, the stream was blocked till the
queue timeout expiration.
Converting TASK_WOKEN_RES to STRM_EVT_RES fixes the issue.
This patch should fix the issue #3290. It must be backported as far as 3.2.
BUG/MINOR: hlua: fix return with push nil on proxy check
hlua_check_proxy() may now return NULL if the target proxy instance has
been flagged for deletion. Thus, proxies method have been adjusted and
may push nil to report such case.
This patch fixes these error paths. When nil is pushed, 1 must be
returned instead of 0. This represents the count of pushed values on the
stack which can be retrieved by the caller.
REGTESTS: complete "del backend" with unnamed defaults ref free
Complete delete backend regtests by checking deletion of a proxy with a
reference on an unnamed defaults instance. This operation is sensible as
the defaults refcount is decremented, and when the last backend is
removed, the defaults is also freed.
This patch finalizes "del backend" handler by implementing the proper
proxy deletion.
After ensuring backend deletion can be performed, several steps are
executed. First, any watcher elements are updated to point on the next
proxy instance. The backend is then removed from ID and name global
trees and is finally detached from proxies_list.
Once the backend instance is removed from proxies_list, the backend
cannot be found by new elements. Thread isolation is lifted and
proxy_drop() is called, which will purge the proxy if its refcount is
null. Thanks to recently introduced PROXIES_DEL_LOCK, proxy_drop() is
thread safe.
MINOR: proxy: use atomic ops for default proxy refcount
Default proxy refcount <def_ref> is used to comptabilize reference on a
default proxy instance by standard proxies. Currently, this is necessary
when a default proxy defines TCP/HTTP rules or a tcpcheck ruleset.
Transform every access on <def_ref> so that atomic operations are now
used. Currently, this is not strictly needed as default proxies
references are only manipulated at init or deinit in single thread mode.
However, when dynamic backends deletion will be implemented, <def_ref>
will be decremented at runtime also.
MEDIUM: proxy: add lock for global accesses during default free
This patch is similar to the previous one, but this time it deals with
functions related to defaults proxies instances. Lock PROXIES_DEL_LOCK
is used to protect accesses on global collections.
This patch will be necessary to implement dynamic backend deletion, even
if defaults won't be use as direct target of a "del backend" CLI.
However, a backend may have a reference on a default instance. When the
backend is freed, this references is released, which can in turn cause
the freeing of the default proxy instance. All of this will occur at
runtime, outside of thread isolation.
MEDIUM: proxy: add lock for global accesses during proxy free
Define a new lock with label PROXIES_DEL_LOCK. Its purpose is to protect
operations performed on global lists or trees while a proxy is freed.
Currently, this lock is unneeded as proxies are only freed on
single-thread init or deinit. However, with the incoming dynamic backend
deletion, this operation will be also performed at runtime, outside of
thread isolation.
Amaury Denoyelle [Wed, 18 Feb 2026 13:00:08 +0000 (14:00 +0100)]
MINOR: cli: implement wait on be-removable
Implement be-removable argument to CLI wait. This is implemented via
be_check_for_deletion() invokation, also used by "del backend" handler.
The objective is to test whether a backend instance can be removed. If
this is not the case, the command may returns immediately if the target
proxy is incompatible with dynamic removal or if a user action is
required. Else, the command will wait until the temporary restriction is
lifted.
Amaury Denoyelle [Mon, 23 Feb 2026 11:01:39 +0000 (12:01 +0100)]
MINOR: server: mark backend removal as forbidden if QUIC was used
Currenly, quic_conn on the backend side may access their parent proxy
instance during their lifetime. In particular, this is the case for
counters update, with <prx_counters> field directly referencing a proxy
memory zone.
As such, this prevents safe backend removal. One solution would be to
check if the upper connection instance is still alive, as a proxy cannot
be removed if connection are still active. However, this would
completely prevent proxy counters update via
quic_conn_prx_cntrs_update(), as this is performed on quic_conn release.
Another solution would be to use refcount, or a dedicated counter on the
which account for QUIC connections on a backend instance. However,
refcount is currently only used by short-term references, and it could
also have a negative impact on performance.
Thus, the simplest solution for now is to disable a backend removal if a
QUIC server is/was used in it. This is considered acceptable for now as
QUIC on the backend side is experimental.
Amaury Denoyelle [Mon, 12 Jan 2026 14:26:45 +0000 (15:26 +0100)]
MINOR: proxy: prevent backend deletion if server still exists in it
Ensure a backend instance cannot be removed if there is still server in
it. This is checked via be_check_for_deletion() to ensure "del backend"
cannot be executed. The only solution is to use "del server" to remove
on the servers instances.
This check only covers servers not yet targetted via "del server". For
deleted servers not yet purged (due to their refcount), the proxy
refcount is incremented but this does not block "del backend"
invokation.
MINOR: proxy: prevent deletion of backend referenced by config elements
Define a new proxy flag PR_FL_NON_PURGEABLE. This is used to mark every
proxy instance explicitely referenced in the config. Such instances
cannot be deleted at runtime.
Static use_backend/default_backend rules are handled in
proxy_finalize(). Also, sample expression proxy references are protected
via smp_resolve_args().
Note that this last case also incidentally protects any proxies
referenced via a CLI "set var" expression. This should not be the case
as in this case variable value is instantly resolved so the proxy
reference is not needed anymore. This also affects dynamic servers.
MINOR: proxy: prevent backend removal when unsupported
Prevent removal of a backend which relies on features not compatible
with dynamic backends. This is the case if either dispatch or
transparent option is used, or if a stick-table is declared.
These limitations are similar to the "add backend" ones.
Amaury Denoyelle [Wed, 18 Feb 2026 08:55:13 +0000 (09:55 +0100)]
MINOR: lua: handle proxy refcount
Implement proxy refcount for Lua proxy class. This is similar to the
server class.
In summary, proxy_take() is used to increment refcount when a Lua proxy
is instantiated. proxy_drop() is called via Lua garbage collector. To
ensure a deleted backend is released asap, hlua_check_proxy() now
returns NULL if PR_FL_DELETED is set.
This approach is directly dependable on Lua GC execution. As such, it
probably suffers from the same limitations as the ones already described
in the previous commit. With the current patch, "del backend" is not
directly impacted though. However, the final proxy deinit may happen
after a long period of time, which could cause memory pressure increase.
One final observations regarding deinit : it is necessary to delay a
BUG_ON() which checks that defaults proxies list is empty. Now this must
be executed after Lua deinit (called via post_deinit_list). This should
guarantee that all proxies and their defaults refcount are null.
MINOR: server: take proxy refcount when deleting a server
When a server is deleted via "del server", increment refcount of its
parent backend. This is necessary as the server is not referenced
anymore in the backend, but can still access it via its own <proxy>
member. Thus, backend removal must not happen until the complete purge
of the server.
The proxy refcount is released in srv_drop() if the flag SRV_F_DELETED
is set, which indicates that "del server" was used. This operation is
performed after the complete release of the server instance to ensure no
access will be performed on the proxy via itself. The refcount must not
be decremented if a server is freed without "del server" invokation.
Another solution could be for servers to always increment the refcount.
However, for now in haproxy refcount usage is limited, so the current
approach is preferred. It should also ensure that if the refcount is
still incremented, it may indicate that some servers are not completely
purged themselves.
Note that this patch may cause issues if "del backend" are used in
parallel with LUA scripts referencing servers. Currently, any servers
referenced by LUA must be released by its garbage collector to ensure it
can be finally freed. However, it appeas that in some case the gc does
not run for several minutes. At least this has been observed with Lua
version 5.4.8. In the end, this will result in indefinitely blocking of
"del backend" commands.
Amaury Denoyelle [Fri, 27 Feb 2026 08:20:15 +0000 (09:20 +0100)]
MINOR: proxy: rename default refcount to avoid confusion
Rename proxy conf <refcount> to <def_ref>. This field only serves for
defaults proxy instances. The objective is to avoid confusion with the
newly introduced <refcount> field used for dynamic backends.
As an optimization, it could be possible to remove <def_ref> and only
use <refcount> also for defaults proxies usage. However for now the
simplest solution is implemented.
Amaury Denoyelle [Fri, 13 Feb 2026 10:29:00 +0000 (11:29 +0100)]
MINOR: proxy: add refcount to proxies
Implement refcount notion into proxy structure. The objective is to be
able to increment refcount on proxy to prevent its deletion temporarily.
This is similar to the server refcount : "del backend" is not blocked
and will remove the targetted instance from the global proxies_list.
However, the final free operation is delayed until the refcount is null.
As stated above, the API is similar to servers. Proxies are initialized
with a refcount of 1. Refcount can be incremented via proxy_take(). When
no longer useful, refcount is decremented via proxy_drop() which
replaces the older free_proxy(). Deinit is only performed once refcount
is null.
This commit also defines flag PR_FL_DELETED. It is set when a proxy
instance has been removed via a "del backend" CLI command. This should
serve as indication to modules which may still have a refcount on the
target proxy so that they can release it as soon as possible.
Note that this new refcount is completely ignored for a default proxy
instance. For them, proxy_take() is pure noop. Free is immediately
performed on first proxy_drop() invokation.
Amaury Denoyelle [Tue, 10 Feb 2026 10:20:12 +0000 (11:20 +0100)]
MINOR: stats: protect proxy iteration via watcher
Define a new <px_watch> watcher member in stats applet context. It is
used to register the applet on a proxy when iterating over the proxies
list. <obj1> is automatically updated via the watcher interaction.
Watcher is first initialized prior to stats_dump_proxies() invocation.
This guarantees that stats dump is safe even if applet yields and a
backend is removed in parallel.
Amaury Denoyelle [Tue, 10 Feb 2026 10:18:11 +0000 (11:18 +0100)]
MINOR: proxy: define proxy watcher member
Define a new member watcher_list in proxy. It will be used to register
modules which iterate over the proxies list. This will ensure that the
operation is safe even if a backend is removed in parallel.
Add "del backend" handler which is restricted to admin level. Along with
it, a new function be_check_for_deletion() is used to test if the
backend is removable.
Correct documentation for srv_detach() which previously stated that this
function could be called for a server even if not stored in its proxy
list. In fact there is a BUG_ON() which detects this case.
Amaury Denoyelle [Mon, 12 Jan 2026 09:53:57 +0000 (10:53 +0100)]
MINOR: proxy: convert proxy flags to uint
Proxy flags member were of type char. This will soon enough not be
sufficient as new flags will be defined. As such, convert flags member
to unsigned int type.
Amaury Denoyelle [Wed, 18 Feb 2026 13:00:22 +0000 (14:00 +0100)]
BUG/MINOR: proxy: add dynamic backend into ID tree
Add missing proxy_index_id() call in "add backend" handler. This step is
responsible to store the newly created proxy instance in the
used_proxy_id global tree.
Amaury Denoyelle [Thu, 26 Feb 2026 10:18:43 +0000 (11:18 +0100)]
BUG/MINOR: promex: fix server iteration when last server is deleted
Servers iteration via promex is now resilient to server runtime deletion
thanks to the watcher mechanism. However, the watcher was not correctly
initialized which could cause duplicate metrics reporting.
This issue happens when promex dump yielded when manipulating the last
server of a proxy. If this server is removed in parallel, <sv> pointer
will be set to NULL when promex resumes. Instead of switching to another
proxy, the code would reuse the same one and iterate again on the same
server list.
To fix this issue, <sv> pointer must not be reinitialized just after a
resumption point. Instead, this is now performed before
promex_dump_srv_metrics(), or just after switching to another proxy
instance. Thus, on resumption, if promex_dump_srv_metrics() is started
with <sv> as NULL, it means that the server was deleted and the end of
the current proxy list is reached, hence iteration is restarted on the
next proxy instance.
Note that ctx.p[1] does not need to be manually updated at the end of
promex_dump_srv_metrics() as srv_watch already does that.
Amaury Denoyelle [Tue, 24 Feb 2026 16:05:37 +0000 (17:05 +0100)]
MINOR: promex: test applet resume in stress mode
Implement a stress mode with force yield for promex applet each time a
metric is displayed. This is implemented by returning 0 in
promex_dump_ts() each time the output buffer is not empty.
To test this, haproxy must be compiled with DEBUG_STRESS and the
following configuration must be used :
Willy Tarreau [Thu, 26 Feb 2026 16:18:41 +0000 (17:18 +0100)]
BUG/MINOR: call EXTRA_COUNTERS_FREE() before srv_free_params() in srv_drop()
As seen with the last changes to counters allocation, the move of the
counters storage to the thread group as operated in commit 04a9f86a85
("MEDIUM: counters: add a dedicated storage for extra_counters in various
structs") causes some random errors when using ASAN, because the extra
counters are freed in srv_drop() after calling srv_free_params(), which
is responsible for freeing the per-thread group storage.
For the proxies however it's OK because free calls are made before the
call to deinit_proxy() which frees the per_tgrp area.
Willy Tarreau [Wed, 25 Feb 2026 08:58:08 +0000 (09:58 +0100)]
MEDIUM: counters: make EXTRA_COUNTERS_GET() consider tgid
Now we store and retrieve only counters for the current tgid when more
than one is supported. This allows to significantly reduce contention
on shared stats. The haterm utility saw its performance increase from
4.9 to 5.8M req/s in H1, and 6.0 to 7.6M for H2, both with 5 groups of
16 threads, showing that we don't necessarily need insane amounts of
groups.
Willy Tarreau [Wed, 25 Feb 2026 17:38:55 +0000 (18:38 +0100)]
MEDIUM: counters: return aggregate extra counters in ->fill_stats()
Now thanks to new macro EXTRA_COUNTERS_AGGR() we can iterate over all
thread groups storages when returning the data for a given metric. This
remains convenient and mostly transparent. The caller continues to pass
the pointer to the metric in the first group, and offsets are calculated
for all other groups and data summed. For now all groups except the
first one contain only zeroes but reported values are nevertheless
correct.
Willy Tarreau [Wed, 25 Feb 2026 09:13:21 +0000 (10:13 +0100)]
MINOR: counters: add EXTRA_COUNTERS_BASE() to retrieve extra_counters base storage
The goal is to always retrieve the storage address of the first thread
group for the given module. This will be used to iterate over all thread
groups. For now it returns the same value as EXTRA_COUNTERS_GET().
Willy Tarreau [Wed, 25 Feb 2026 15:06:21 +0000 (16:06 +0100)]
MEDIUM: counters: store the number of thread groups accessing extra_counters
In order to be able to properly allocate all storage and retrieve data
from there, we'll need to know how many thread groups are supposed to
access it. Let's store the number of thread groups at init time. If the
tgrp_step is zero, there's always only one tg though.
Now EXTRA_COUNTERS_ALLOC() takes this number of thread groups in argument
and stores it in the structure. It also allocates as many areas as needed,
incrementing the datap pointer by the step for each of them.
EXTRA_COUNTERS_FREE() uses this info to free all allocated areas.
EXTRA_COUNTERS_INIT() initializes all allocated areas, this is used
elsewhere to clear/preset counters, e.g. in proxy_stats_clear_counters().
It involves a memcpy() call for each array, which is normally preset to
something empty but might also be used to preset certain non-scalar
fields such as an instance name.
Willy Tarreau [Wed, 25 Feb 2026 08:53:13 +0000 (09:53 +0100)]
MINOR: counters: store a tgroup step for extra_counters to access multiple tgroups
We'll need to permit any user to update its own tgroup's extra counters
instead of the global ones. For this we now store the per-tgroup step
between two consecutive data storages, for when they're stored in a
tgroup array. When shared (e.g. resolvers or listeners), we just store
zero to indicate that it doesn't scale with tgroups. For now only the
registration was handled, it's not used yet.
Willy Tarreau [Tue, 24 Feb 2026 19:08:49 +0000 (20:08 +0100)]
MEDIUM: counters: add a dedicated storage for extra_counters in various structs
Servers, proxies, listeners and resolvers all use extra_counters. We'll
need to move the storage to per-tgroup for those where it matters. Now
we're relying on an external storage, and the data member of the struct
was replaced with a pointer to that pointer to data called datap. When
the counters are registered, these datap are set to point to relevant
locations. In the case of proxies and servers, it points to the first
tgrp's storage. For listeners and resolvers, it points to a local
storage. The rationale here is that listeners are limited to a single
group anyway, and that resolvers have a low enough load so that we do
not care about contention there.
Willy Tarreau [Wed, 25 Feb 2026 13:24:33 +0000 (14:24 +0100)]
CLEANUP: counters: only retrieve zeroes for unallocated extra_counters
Since version 2.4 with commit 7f8f6cb926 ("BUG/MEDIUM: stats: prevent
crash if counters not alloc with dummy one") we can afford to always
update extra_counters because we know they're always either allocated
or linked to a dedicated trash. However, the ->fill_stats() callbacks
continue to access such values, making it technically possible to
retrieve random counters from this trash, which is not really clean.
Let's implement an explicit test in the ->fill_stats() functions to
only return 0 for the metric when not allocated like this. It's much
cleaner because it guarantees that we're returning an empty counter
in this case rather than random values.
The situation currently happens for dummy servers like the ones used
in Lua proxies as well as those used by rings (e.g. used for logging
or traces). Normally, none of the objects retrieved via stats or
Prometheus is concerned by this unallocated extra_counters situation,
so this is more about a cleanup than a real fix.
Willy Tarreau [Wed, 25 Feb 2026 10:44:48 +0000 (11:44 +0100)]
MEDIUM: counters: change the fill_stats() API to pass the module and extra_counters
We'll soon need to iterate over thread groups in the fill_stats() functions,
so let's first pass the extra_counters and stats_module pointers to the
fill_stats functions. They now call EXTRA_COUNTERS_GET() themselves with
these elements in order to retrieve the required pointer. Nothing else
changed, and it's getting even a bit more transparent for callers.
Willy Tarreau [Tue, 24 Feb 2026 18:31:51 +0000 (19:31 +0100)]
CLEANUP: stats: drop stats.h / stats-t.h where not needed
A number of C files include stats.h or stats-t.h, many of which were
just to access the counters. Now those which really need counters rely
on counters.h or counters-t.h, which already reduces the amount of
preprocessed code to be built (~3000 lines or about 0.05%).
Willy Tarreau [Tue, 24 Feb 2026 17:57:05 +0000 (18:57 +0100)]
REORG: stats/counters: move extra_counters to counters not stats
It was always difficult to find extra_counters when the rest of the
counters are now in counters-t.h. Let's move the types to counters-t.h
and the macros to counters.h. Stats include them since they're used
there. But some users could be cleaned from the stats definitions now.
Willy Tarreau [Tue, 24 Feb 2026 17:50:30 +0000 (18:50 +0100)]
CLEANUP: quic-stats: include counters from quic_stats
There's something a bit awkward in the way stats counters are inherited
through the QUIC modules: quic_conn-t includes quic_stats-t.h, which
declares quic_stats_module as extern from a type that's not known from
this file. And anyway externs should not be exported from type defintions
since they're not part of the ABI itself.
This commit moves the declaration to quic_stats.h which now takes care
to include stats-t.h to get the definition of struct stats_module. The
few users who used to learn it through quic_conn-t.h now include it
explicitly. As a bonus this reduces the number of preprocessed lines
by 5000 (~0.1%).
By the way, it looks like struct stats_module could benefit from being
moved off stats-t.h since it's only used at places where the rest of
the stats is not needed. Maybe something to consider for a future
cleanup.
Willy Tarreau [Wed, 25 Feb 2026 09:40:48 +0000 (10:40 +0100)]
CLEANUP: tree-wide: drop a few useless null-checks before free()
We only support platforms where free(NULL) is a NOP so that
null checks are useless before free(). Let's drop them to keep
the code clean. There were a few in cfgparse-global, flt_trace,
ssl_sock and stats.
Willy Tarreau [Tue, 24 Feb 2026 19:28:50 +0000 (20:28 +0100)]
BUG/MINOR: server: adjust initialization order for dynamic servers
It appears that in cli_parse_add_server(), we're calling srv_alloc_lb()
and stats_allocate_proxy_counters_internal() before srv_preinit() which
allocates the thread groups. LB algos can make use of the per_tgrp part
which is initialized by srv_preinit(). Fortunately for now no algo uses
both tgrp and ->server_init() so this explains why this remained
unnoticed to date. Also, extra counters will soon require per_tgrp to
already be initialized. So let's move these between srv_preinit() and
srv_postinit(). It's possible that other parts will have to be moved
in between.
This could be backported to recent versions for the sake of safety but
it looks like the current code cannot tell the difference.
Willy Tarreau [Wed, 25 Feb 2026 23:18:44 +0000 (00:18 +0100)]
BUG/MEDIUM: mux-h2: make sure to always report pending errors to the stream
Some stream parsing errors that do not affect the connection result in
the parsed block not being transferred from the rx buffer to the channel
and not being reported upstream in rcv_buf(), causing the stconn to time
out. Let's detect this condition, and propagate term flags anyway since
no more progress will be made otherwise.
This should be backported at least till 3.2, probably even 2.8.
Willy Tarreau [Wed, 25 Feb 2026 21:39:35 +0000 (22:39 +0100)]
MINOR: mux-h2: add a new setting, "tune.h2.log-errors" to tweak error logging
The H2 mux currently logs whenever some decoding fails. Most of the errors
happen at the connection level, but some are even at the stream level,
meaning that multiple logs can be emitted for a given connection, which
can quickly use some resource for little value. This new setting allows
to tweak this and decide to only log errors that affect the connection,
or even none at all.
BUG/MINOR: quic: missing app ops init during backend 0-RTT sessions
The QUIC mux requires "application operations" (app ops), which are a list
of callbacks associated with the application level (i.e., h3, h0.9) and
derived from the ALPN. For 0-RTT, when the session cache cannot be reused
before activation, the current code fails to reach the initialization of
these app ops, causing the mux to crash during its initialization.
To fix this, this patch restores the behavior of
ssl_sock_srv_try_reuse_sess(), whose purpose was to reuse sessions stored
in the session cache regardless of whether 0-RTT was enabled, prior to
this commit:
MEDIUM: quic-be: modify ssl_sock_srv_try_reuse_sess() to reuse backend
sessions (0-RTT)
With this patch, this function now does only one thing: attempt to reuse a
session, and that's it!
This patch allows ignoring whether a session was successfully reused from
the cache or not. This directly fixes the issue where app ops
initialization was skipped upon a session cache reuse failure. From a
functional standpoint, starting a mux without reusing the session cache
has no negative impact; the mux will start, but with no early data to
send.
Finally, there is the case where the ALPN is reset when the backend is
stopped. It is critical to continue locking read access to the ALPN to
secure shared access, which this patch does. It is indeed possible for the
server to be stopped between the call to connect_server() and
quic_reuse_srv_params(). But this cannot prevent the mux to start
without app ops. This is why a 'TODO' section was added, as a reminder that a
race condition regarding the ALPN reset still needs to be fixed.
Olivier Houchard [Wed, 18 Feb 2026 10:22:04 +0000 (10:22 +0000)]
BUG/MEDIUM: cpu-topo: Distribute CPUs fairly across groups
Make sure CPUs are distributed fairly across groups, in case the number
of groups to generate is not a divider of the number of CPUs, otherwise
we may end up with a few groups that will have no CPU bound to them.
This was introduced in 3.4-dev2 with commit 56fd0c1a5c ("MEDIUM: cpu-topo:
Add an optional directive for per-group affinity"). No backport is
needed unless this commit is backported.
When "mode haterm" was set in a "defaults" section, it could not be
overridden in subsequent sections using the "mode" keyword. This is because
the proxy stream instantiation callback was not being reset to the
default stream_new() value.
This could break the stats URI with a configuration such as:
defaults
mode haterm
# ...
frontend stats
bind :8181
mode http
stats uri /
This patch ensures the ->stream_new_from_sc() proxy callback is reset
to stream_new() when the "mode" keyword is parsed for any mode other
than "haterm".
Amaury Denoyelle [Mon, 23 Feb 2026 09:10:05 +0000 (10:10 +0100)]
MINOR: proxy: improve code when checking server name conflicts
During proxy_finalize(), a lookup is performed over the servers by name
tree to detect any collision. Only the first conflict for each server
instance is reported to avoid a combinatory explosion with too many
alerts shown.
Previously, this was written using a for loop without any iteration.
Replace this by a simple if statement as this is cleaner.
Amaury Denoyelle [Mon, 23 Feb 2026 09:04:45 +0000 (10:04 +0100)]
MINOR: ncbmbuf: improve itbmap_next() code
itbmap_next() advances an iterator over a ncbmbuf buffer storage. When
reaching the end of the buffer, <b> field is set to NULL, and the caller
is expected to stop working with the iterator.
Complete this part to ensure that itbmap type is fully initialized in
case null iterator value is returned. This is not strictly required
given the above description, but this is better to avoid any possible
future mistake.
Willy Tarreau [Mon, 23 Feb 2026 15:19:48 +0000 (16:19 +0100)]
MINOR: traces: always mark trace_source as thread-aligned
Some perf profiles occasionally show that reading the trace source's
state can take some time, which is not expected at all. It just happens
that the trace_source is not cache-aligned so depending on linkage, it
may share a cache line with a more active variable, thereby inducing a
slow down to all threads trying to read the variable.
Let's always mark it aligned to avoid this. For now the problem was not
observed again.
BUG/MEDIUM: spoe: Acquire context buffer in applet before consuming a frame
Changes brought to support large buffers revealed a bug in the SPOE applet
when a frame is copied in the SPOE context buffer. A b_xfer() was performed
without allocating the SPOE context buffer. It is not expected. As stated in
the function documentation, the caller is responsible for ensuring there is
enough space in the destination buffer. So first of all, it must ensure this
buffer was allocated.
With recent changes, we are able to hit a BUG_ON() because the swap is no
longer possible if source and destination buffers size are not the same.
This patch should fix the issue #3286. It could be backported as far as 3.1.
Willy Tarreau [Mon, 23 Feb 2026 14:42:04 +0000 (15:42 +0100)]
CLENAUP: cfgparse: accept-invalid-http-* does not support "no"/"defaults"
Some options do not support "no" nor "defaults" and they're placed after
the check for their absence. However, "accept-invalid-http-request" and
"accept-invalid-http-response" still used to check for the flags that
come with these prefixes, but Coverity noticed this was dead code in
github issue #3272. Let's just drop the test.
CLEANUP: ssl: Remove a useless variable from ssl_gen_x509()
This function was recently created by moving code from acme_gen_tmp_x509()
(in acme.c) to ssl_gencrt.c (ssl_gen_x509()). The <ctmp> variable was
initialized and then freed without ever being used. This was already the
case in the original acme_gen_tmp_x509() function.
CLEANUP: haterm: avoid static analyzer warnings about rand() use
Avoid such a warnings from coverity:
CID 1645121: (#1 of 1): Calling risky function (DC.WEAK_CRYPTO)
dont_call: random should not be used for security-related applications,
because linear congruential algorithms are too easy to break.
Hyeonggeun Oh [Mon, 23 Feb 2026 03:49:37 +0000 (12:49 +0900)]
MINOR: ssl: clarify error reporting for unsupported keywords
This patch changes the registration of the following keywords to be
unconditional:
- ssl-dh-param-file
- ssl-engine
- ssl-propquery, ssl-provider, ssl-provider-path
- ssl-default-bind-curves, ssl-default-server-curves
- ssl-default-bind-sigalgs, ssl-default-server-sigalgs
- ssl-default-bind-client-sigalgs, ssl-default-server-client-sigalgs
Instead of excluding them at compile time via #ifdef guards in the keyword
registration table, their parsing functions now check feature availability
at runtime and return a descriptive error when the feature is missing.
For features controlled by the SSL library (providers, curves, sigalgs,
DH), the error message includes the actual OpenSSL version string via
OpenSSL_version(OPENSSL_VERSION), so users can immediately identify which
library they are running rather than seeing cryptic internal macro names.
For ssl-dh-param-file, the message also includes "(no DH support)" as a
hint, since OPENSSL_NO_DH can be set either by an OpenSSL build or by
HAProxy itself in certain configurations.
For ssl-engine, which depends on a HAProxy build-time flag (USE_ENGINE),
the message retains the flag name as it is more actionable for the user.
This addresses issue https://github.com/haproxy/haproxy/issues/3246.
In acme_req_finalize(), acme_req_challenge(), acme_req_neworder(),
acme_req_account(), and acme_post_as_get(), the success path always
calls unconditionally memprintf(errmsg, ...).
This may result in a leak of errmsg.
Additionally, acme_res_chkorder(), acme_res_finalize(), acme_res_auth(),
and acme_res_neworder() had unused 'out:' labels that were removed.
365a696 ("MINOR: acme: emit a log for DNS-01 challenge response")
introduces the auth->dns member which is istdup(). But this member is
never free, instead auth->token was freed twice by mistake.
Amaury Denoyelle [Fri, 20 Feb 2026 10:05:40 +0000 (11:05 +0100)]
MINOR: quic: add BUG_ON() on half_open_conn counter access from BE
half_open_conn is a proxy counter used to account for quic_conn in
half-open state : this represents a connection whose address is not yet
validated (handshake successful, or via token validation).
This counter only has sense for the frontend side. Currently, code is
safe as access is only performed if quic_conn is not yet flagged with
QUIC_FL_CONN_PEER_VALIDATED_ADDR, which is always set for backend
connections.
To better reflect this, add a BUG_ON() when half_open_conn is
incremented/decremented to ensure this never occurs for backend
connections.
Amaury Denoyelle [Fri, 20 Feb 2026 10:05:20 +0000 (11:05 +0100)]
BUG/MINOR: quic: fix counters used on BE side
quic_conn is initialized with a pointer to its proxy counters. These
counters are then updated during the connection lifetime.
Counters pointer was incorrect for backend quic_conn, as it always
referenced frontend counters. For pure backend, no stats would be
updated. For listen instances, this resulted in incorrect stats
reporting.
Fix this by correctly set proxy counters based on the connection side.
BUG/MINOR: haterm: missing allocation check in copy_argv()
This is a very minor bug with a very low probability of occurring.
However, it could be flagged by a static analyzer or result in a small
contribution, which is always time-consuming for very little gain.
MINOR: haterm: add long options for QUIC and TCP "bind" settings
Add the --quic-bind-opts and --tcp-bind-opts long options to append
settings to all QUIC and TCP bind lines. This requires modifying the argv
parser to first process these new options, ensuring they are available
during the second argv pass to be added to each relevant "bind" line.
MINOR: haterm: provide -b and -c options (RSA key size, ECDSA curves)
Add -b and -c options to the haterm argv parser. Use -b to specify the RSA
private key size (in bits) and -c to define the ECDSA certificate curves.
These self-signed certificates are required for haterm SSL bindings.