git.ipfire.org Git - thirdparty/haproxy.git/log

Aurelien DARRAGON [Thu, 5 Dec 2024 09:28:50 +0000 (10:28 +0100)]

MINOR: stktable: implement "recv-only" table option

When the "recv-only" keyword is added to a stick table declaration (in a
peers or proxy section), haproxy considers that the table is only used for
data retrieval from a remote location and not to perform local updates.
As such, it enables the retrieval of local-only values such as conn_cur,
which are ignored by default. This can be useful in contexts where we
want to know about local-only values such as conn_cur from a remote peer.

To do this, add stktable struct flags which default to NONE and enable
the RECV_ONLY flag on the table when the "recv-only" keyword is found in
the table declaration. Then, in peer_treat_updatemsg(), when handling
table updates, don't ignore data updates for local-only values if the
flag is set.
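
The decision described above can be sketched as follows. This is only an
illustration: the struct, the flag names and the helper are hypothetical
stand-ins, not the actual stktable or peers code.

```c
#include <stdbool.h>

/* Hypothetical names: the commit only states that the stktable struct
 * gained flags defaulting to NONE, plus a RECV_ONLY flag. */
#define TBL_FL_NONE      0x00u
#define TBL_FL_RECV_ONLY 0x01u

struct table_sketch {
    unsigned int flags; /* TBL_FL_* bits, TBL_FL_NONE by default */
};

/* Returns true if a data type received from a peer should be stored.
 * Local-only types (conn_cur for instance) are dropped unless the table
 * was declared with "recv-only". */
static bool accept_peer_data(const struct table_sketch *t, bool is_local_only)
{
    if (!is_local_only)
        return true;
    return (t->flags & TBL_FL_RECV_ONLY) != 0;
}
```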

Amaury Denoyelle [Wed, 4 Dec 2024 15:25:53 +0000 (16:25 +0100)]

BUG/MINOR: quic: remove startup alert if GSO unsupported

This patch is similar to the previous one, but for GSO support. Convert
the alert-level message into a diag report only visible with the -dD
argument.

This must be backported up to 3.1.

Amaury Denoyelle [Wed, 4 Dec 2024 15:25:03 +0000 (16:25 +0100)]

BUG/MINOR: quic: remove startup alert if conn socket-owner unsupported

QUIC relies on several advanced network API features from the kernel to
perform optimally. Checks are performed during startup to ensure that
these features are supported. A fallback is automatically performed for
every incompatible feature.

Besides the automatic fallback mechanism, a message is also reported to
the user at the same time. Previously, alert level was used, but this is
incorrect as it is reserved for unrecoverable errors which should
prevent haproxy from starting. Warning level could be used, but this can
annoy users running in zero-warning mode.

This patch removes the alert message when 'socket-owner connection' mode
cannot be activated and converts it to a diag-level message. This allows
users to start without having to modify their configuration to hide a
warning. Besides, several feature fallbacks, such as the one for the
polling mechanism, do not emit any warning either, so it's better to
adopt a similar behavior for QUIC features.

This must be backported up to 2.8.

Amaury Denoyelle [Thu, 28 Nov 2024 10:27:15 +0000 (11:27 +0100)]

BUG/MEDIUM: mux-quic: remove pacing status when everything is sent

TASK_F_USR1 is used by MUX tasklet when emission has been interrupted
due to pacing. When the tasklet runs again, only qcc_purge_sending()
will be called as an optimization.

Pacing status is only removed via qcc_wakeup(). Until then, TASK_F_USR1
is not cleared. This causes an issue once emission with pacing has
completed if the MUX tasklet is woken up for a recv subscribe, as
qcc_wakeup() is not used by the quic-conn layer. The tasklet will
incorrectly run only for pacing emission, without handling the reception
process. Worse, a crash will occur if the QCC Tx frames list is empty,
due to a BUG_ON() in qcc_purge_sending().

Recv subscribe is only used for 0-RTT, when QUIC MUX is instantiated
before quic-conn handshake completion. Thus, this bug can only be
reproduced with 0-RTT. Furthermore, MUX must already have emitted at
least a few response bytes with pacing, before QUIC handshake
completion. It cannot easily be reproduced, at least with CLI clients
where the handshake is always already completed before MUX exchanges.

To fix this, remove TASK_F_USR1 when pacing emission has been completed.
At least, this prevents the BUG_ON() in qcc_purge_sending() as it won't
be called with an empty QCC Tx frame list anymore. However, this bug has
revealed that the MUX tasklet architecture is not suitable for handling
both reception and emission. This will be improved in a future series of
patches.

This should fix github issue #2796.

This must be backported up to 3.1.

Willy Tarreau [Wed, 4 Dec 2024 14:58:49 +0000 (15:58 +0100)]

BUG/MINOR: init: do not call fork_poller() for non-forked processes

In 3.1-dev10, commit 8dd4efe42f ("MAJOR: mworker: move master-worker
fork in init()") made the fork_poller() code unconditional, while it
is only desirable for processes that have been forked from a parent
(standalone daemon mode) or from a master (master-worker mode). The
call can be expensive in some cases as it will create a new poller,
scan and try to migrate to it all existing FDs till the highest known
one. With very high numbers of FDs, this can take several seconds to
start.

This should be backported to 3.1.

Willy Tarreau [Wed, 4 Dec 2024 07:26:24 +0000 (08:26 +0100)]

BUG/MEDIUM: init: make sure only daemonized processes change their session

Commit 8dd4efe42f ("MAJOR: mworker: move master-worker fork in init()")
introduced some sensitive changes to the startup code (which was
expected), and one sensitive change is that the second call to setsid()
was accidentally made unconditional. As such it even applies to foreground
processes, resulting in foreground processes being detached from the
terminal and no longer responding to Ctrl-C or Ctrl-Z. A simple example
is starting haproxy -db under sudo: a new shell is then required to stop
it.

This patch removes this second setsid(), as it is already done in
apply_daemon_mode().

This must be backported to 3.1.

Frederic Lecaille [Wed, 4 Dec 2024 17:47:15 +0000 (18:47 +0100)]

BUG/MINOR: quic: fix bbr_inflight() calls with wrong gain value

This patch fixes two wrong calls to bbr_inflight().

The aim of bbr_target_inflight() is to compute the number of bytes BBR has
to put on the network as bytes in flight (sent but not yet acked). It must
call bbr_inflight() with the current window gain value instead of a wrong
fixed gain value of 100 (in percent).

bbr_is_time_to_cruise() also called bbr_inflight() with a wrong gain value
as parameter, due to a confusion between the value mentioned by the RFC (1,
meaning 100% of the current window) and our implementation, which needs a
value in percent (so 100 instead of 1 here). Note that the aim of
bbr_is_time_to_cruise() is to make BBR decide to leave the probing_bw down
state. As a side effect of the bug, BBR stayed in this state for too long
while the bottleneck bandwidth was decreasing, leading to large
oscillations between the minimum and maximum bottleneck bandwidth
estimations.
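
To illustrate the unit confusion, here is a minimal sketch of a gain
applied in percent. The helper name and its signature are hypothetical
and do not reproduce bbr_inflight(); the point is only that passing 1
where 100 is expected divides the result by 100.

```c
#include <stdint.h>

/* Hypothetical helper: scales a bandwidth-delay product by a gain
 * expressed in percent, the unit the commit says the implementation
 * expects (100 means 100% of the window). */
static uint64_t inflight_with_gain(uint64_t bdp_bytes, unsigned int gain_pct)
{
    return bdp_bytes * gain_pct / 100;
}
```

With a 120000-byte BDP, a gain of 100 yields 120000 bytes while a gain of
1 (the RFC notation used by mistake) yields only 1200 bytes.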

This patch must be backported to 3.1 where BBR was first implemented.

Willy Tarreau [Thu, 28 Nov 2024 18:42:22 +0000 (19:42 +0100)]

MINOR: tasklet: set TASK_WOKEN_OTHER on tasklets by default

Now when tasklets are woken up via tasklet_wakeup(), tasklet_wakeup_on()
or tasklet_wakeup_after(), either the optional wakeup flags will be used,
or TASK_WOKEN_OTHER will be used.

This allows tasklet handlers waking up for any given cause to notice
whether or not they were also woken for another reason. For example, a
mux handler could skip heavy parts when seeing that TASK_WOKEN_OTHER is
absent, proving that no standard tasklet_wakeup() was done, for example
in response to a subscribe().

The benefit of the TASK_WOKEN_* flags is that they're purged during the
wakeup, and that they're easy to check for using TASK_WOKEN_ANY.
TASK_F_UEVT1 and TASK_F_UEVT2 are also usable for private use (e.g. wakeup
from a stream to a connection inside a mux).

In the future, code dealing with subscribe events should probably start to
place TASK_WOKEN_IO, as is done for upper layers.
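
As a rough sketch of the default described above (flag values, struct and
function names are all illustrative, not the real task API), a wakeup
without explicit flags falls back to the "other" cause, and the handler
reads then purges the causes:

```c
/* Illustrative values only; the real TASK_WOKEN_* bits live in haproxy's
 * task headers and are not reproduced here. */
#define SK_WOKEN_IO    0x01u
#define SK_WOKEN_OTHER 0x02u
#define SK_WOKEN_ANY   (SK_WOKEN_IO | SK_WOKEN_OTHER)

struct tasklet_sketch {
    unsigned int state;
};

/* Wake with the explicit flags when given, else with SK_WOKEN_OTHER. */
static void sketch_wakeup(struct tasklet_sketch *tl, unsigned int flags)
{
    tl->state |= flags ? flags : SK_WOKEN_OTHER;
}

/* Handler side: fetch the wakeup causes and purge them from the state. */
static unsigned int sketch_take_causes(struct tasklet_sketch *tl)
{
    unsigned int causes = tl->state & SK_WOKEN_ANY;

    tl->state &= ~SK_WOKEN_ANY;
    return causes;
}
```

A handler can then skip its heavy parts when SK_WOKEN_OTHER is absent from
the returned causes.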

Willy Tarreau [Thu, 28 Nov 2024 14:11:46 +0000 (15:11 +0100)]

MINOR: tools: add a new macro DEFVAL() to provide a default argument

This is like DEFZERO and DEFNULL, but this one allows the default value
to be specified as the first argument.
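
One way such a macro can be built is sketched below. The SK_* names and
the SCALE() wrapper are hypothetical and haproxy's actual definition may
differ; the sketch relies on the GNU ", ##__VA_ARGS__" extension supported
by gcc and clang.

```c
/* Expands to its second argument, whatever follows it. */
#define SK_SECOND_ARG(a, b, ...) b

/* Expands to the optional argument when one is passed, else to <dflt>.
 * The GNU ", ##__VA_ARGS__" extension drops the comma when no optional
 * argument is given. */
#define SK_DEFVAL(dflt, ...) SK_SECOND_ARG(dflt, ##__VA_ARGS__, dflt)

static int scale(int v, int factor)
{
    return v * factor;
}

/* SCALE(v) multiplies by the default 2, SCALE(v, f) multiplies by f. */
#define SCALE(v, ...) scale((v), SK_DEFVAL(2, ##__VA_ARGS__))
```

This gives function-like macros an optional trailing argument without
duplicating the underlying function.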

Valentine Krasnobaeva [Mon, 2 Dec 2024 15:05:16 +0000 (16:05 +0100)]

BUG/MINOR: startup: fix pidfile creation

The pidfile should be created at the latest initialization stage, when we
are sure that the process is able to start successfully; otherwise the PID
value written in this file is no longer valid.

So, for the standalone mode, let's move the block which opens the pidfile
and put it just before applying "chroot". In master-worker mode, the master
doesn't perform chroot, so it creates the pidfile only when the "READY"
message from the newly forked worker is received.

This should be backported only to 3.1.

Valentine Krasnobaeva [Mon, 2 Dec 2024 15:04:56 +0000 (16:04 +0100)]

BUG/MINOR: startup: close pidfd and free global.pidfile in handle_pidfile()

After the master-worker mode refactoring, global.pidfile is only used in
handle_pidfile(), which opens the provided file and writes the PID into it.
So it's more appropriate to also perform the close(pidfd) and
ha_free(&global.pidfile) in this function.

This commit prepares the fix of the pidfile creation, as it is currently
created very early, when we are not yet sure that the process has
successfully started. In master-worker mode, handle_pidfile() can be called
in the master process context, so let's make it accessible from other
compilation units via global.h.

This should be backported only to 3.1.

Valentine Krasnobaeva [Mon, 2 Dec 2024 13:47:17 +0000 (14:47 +0100)]

BUG/MINOR: signal: register default handler for SIGINT in signal_init()

When haproxy is launched in the background and in a subshell (see example
below), according to the POSIX standard (2.11. Signals and Error Handling),
it inherits the SIG_IGN signal handler for SIGINT and SIGQUIT from the
subshell.

$ (./haproxy -f env4.cfg &)

So, when haproxy is launched like this, it doesn't stop upon receiving
SIGINT. This can be a root cause of some unexpected timeouts when haproxy
is started under VTest, as VTest sends SIGINT to the process in order to
terminate it. To fix this, let's explicitly register the default signal
handler for SIGINT in the signal_init() initcall.

This should be backported to all stable versions.

Aurelien DARRAGON [Mon, 2 Dec 2024 15:44:00 +0000 (16:44 +0100)]

MINOR: hlua: fix ambiguous hlua usage in hlua_filter_delete()

In GH #2804, @Bbulatov reported that the result of hlua_stream_ctx_get()
was used and de-referenced without checking if it's NULL in
hlua_filter_delete() while other functions used to check for NULL before
de-referencing it.

In fact hlua_stream_ctx_get() can only return NULL if
hlua_stream_ctx_prepare() failed or was not called on the current stream.

Now because of the filter's API, since hlua_filter_delete() is mapped as
detach method and hlua_filter_new() as attach method, and since
hlua_filter_new() is responsible for calling hlua_stream_ctx_prepare(),
there's no reason hlua_filter_delete() should be called if
hlua_filter_new() failed or wasn't called. Thus we can assume that hlua
can never be NULL in hlua_filter_delete(), so we add a BUG_ON() to ensure
it is always the case and remove the ambiguity.

Aurelien DARRAGON [Mon, 2 Dec 2024 15:22:28 +0000 (16:22 +0100)]

BUG/MINOR: listener: fix potential null pointer dereference in listener_release()

As reported by @Bbulatov on GH #2804, fe is found at multiple places in
listener_release(): in some places it is first checked against NULL before
being de-referenced while in some other places it is not, which is
ambiguous and could hide a bug.

In practice, fe cannot be NULL for now, but it might not be the case in
the future as we want to keep the possibility to run isolated listeners
(that is, without proxy attached).

We've already ensured this was the case with a57786e ("BUG/MINOR:
listener: null pointer dereference suspected by coverity"), but
this promise was recently broken by 65ae134 ("BUG/MINOR: listener: Wake
proxy's mngmt task up if necessary on session release").

Let's fix that by conditioning the block with an "else if" statement
instead of a regular "else".

No need for backport except if multi-connection protocols (ie: FTP) were
to be backported as well.

William Lallemand [Mon, 2 Dec 2024 14:19:41 +0000 (15:19 +0100)]

CI: github: allow coredumps on aws-lc and wolfssl jobs

The weekly aws-lc and wolfssl jobs lack a `ulimit -c` call, which is
needed to be able to get the coredumps.

Frederic Lecaille [Fri, 29 Nov 2024 13:39:48 +0000 (14:39 +0100)]

BUILD: quic: fix a build error about an non initialized timestamp

This is to please an unidentified compiler which complains that the
<time_ns> variable may be used uninitialized, even though it is only left
uninitialized when it is not used.

This build issue arrived with this commit:
BUG/MINOR: improve BBR throughput on very fast links

Should be backported to 3.1 with this previous commit.

Christopher Faulet [Fri, 29 Nov 2024 13:31:21 +0000 (14:31 +0100)]

BUG/MINOR: h1-htx: Use default reason if not set when formatting the response

When the response status line is formatted before sending it to the client,
if there is no reason set, HAProxy should add one that matches the status
code, as stated in the configuration manual. However, this is not done.

It is possible to hit this bug when the response comes from a H2 server,
because there is no reason field in HTTP/2 and above.
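
The expected behavior can be sketched as follows. The function names, the
small status table and the "Unknown" fallback are assumptions made for the
illustration and are not taken from the h1-htx code.

```c
#include <stddef.h>

/* Small illustrative subset of status codes; "Unknown" is an arbitrary
 * fallback chosen for this sketch. */
static const char *default_reason(unsigned int status)
{
    switch (status) {
    case 200: return "OK";
    case 404: return "Not Found";
    case 421: return "Misdirected Request";
    case 503: return "Service Unavailable";
    default:  return "Unknown";
    }
}

/* Keeps the reason provided by the server when there is one, else picks
 * the default matching the status code (typical for H2 responses, which
 * carry no reason). */
static const char *pick_reason(unsigned int status, const char *reason,
                               size_t len)
{
    if (reason && len > 0)
        return reason;
    return default_reason(status);
}
```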

This patch should fix the issue #2798. It should be backported to all stable
versions.

Christopher Faulet [Thu, 28 Nov 2024 09:01:41 +0000 (10:01 +0100)]

BUG/MEDIUM: http-ana: Reset request flag about data sent to perform a L7 retry

It is possible to lose the request after several L7 retries, leading to
crashes, because the request channel flag stating that some data were sent
is not properly reset.

When an L7 retry is performed, some flags on different entities must be
reset to be sure a new connection will be properly retried, just as if it
were the first one, mainly because there was no connection establishment
failure. One of them, on the request channel, is not reset: the flag
stating that some data were already sent. This is a problem because this
flag is used during the connection establishment to know if an error is
triggered at the connection level or at the data level. In the latter
case, the error must be handled by the HTTP response analyzer, to
eventually perform another L7 retry.

Because the CF_WROTE_DATA flag is not removed when an L7 retry is
performed, a subsequent connection establishment error may be handled as
an L7 error while in fact the request was never sent. It also means the
request was never saved in the buffer used to perform L7 retries. Thus, on
the next L7 retry, the request is just lost. This inevitably leads to
undefined behavior. One possible outcome is a crash, when the request is
used to perform the load-balancing.
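
The role of the flag can be pictured with the sketch below. The flag
value, the struct and both helpers are hypothetical simplifications of the
channel and analyzer code; only the name CF_WROTE_DATA comes from the
commit.

```c
#define SK_CF_WROTE_DATA 0x1u /* illustrative value */

struct chn_sketch {
    unsigned int flags;
};

enum err_level { ERR_AT_CONN, ERR_AT_DATA };

/* An error is attributed to the data level only if data were sent. */
static enum err_level classify_error(const struct chn_sketch *req)
{
    return (req->flags & SK_CF_WROTE_DATA) ? ERR_AT_DATA : ERR_AT_CONN;
}

/* On L7 retry, forget that data were sent on the previous attempt so that
 * a later connection failure is seen as a connection-level error. */
static void prepare_l7_retry(struct chn_sketch *req)
{
    req->flags &= ~SK_CF_WROTE_DATA;
}
```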

This patch should fix issue #2793. It must be backported to all stable
versions.

Amaury Denoyelle [Fri, 29 Nov 2024 13:28:09 +0000 (14:28 +0100)]

BUG/MEDIUM: quic: prevent stream freeze on pacing

On snd_buf completion, the QUIC MUX tasklet is scheduled if new data has
been transferred from the stream layer. Thanks to qcc_wakeup(), pacing
status is removed from the tasklet, which ensures the next emission will
reset Tx frames and use the new data.

The tasklet is not scheduled if the MUX is already subscribed on send due
to a previous blocking condition. This is an optimization to prevent an
unneeded IO handler execution. However, this causes a bug if an emission
is currently delayed due to pacing. As pacing status is not removed on
snd_buf, the next emission process will continue emission with older data
without taking the newly transferred data into account.

This causes a transfer freeze. Unless there is some activity on the
connection, the transfer will eventually be aborted due to idle timeout.

To fix this, remove TASK_F_USR1 if tasklet wakeup is not called due to
send subscription. Note that this code is also duplicated in done_ff for
zero-copy transfer.

This must be backported up to 3.1.

Aurelien DARRAGON [Fri, 29 Nov 2024 07:42:01 +0000 (08:42 +0100)]

BUG/MEDIUM: event_hdl: fix uninitialized value in async mode when no data is provided

In _event_hdl_publish(), when we prepare the asynchronous event and no
<data> was provided (set to NULL), we forgot to initialize the _data
event_hdl_async_event struct member to NULL, which leads to uninitialized
reads in event_hdl_async_free_event() when the event is freed:

==1002331== Conditional jump or move depends on uninitialised value(s)
==1002331==    at 0x35D9D1: event_hdl_async_free_event (event_hdl.c:224)
==1002331==    by 0x1CC8EC: hlua_event_runner (hlua.c:9917)
==1002331==    by 0x39AD3F: run_tasks_from_lists (task.c:641)
==1002331==    by 0x39B7B4: process_runnable_tasks (task.c:883)
==1002331==    by 0x314B48: run_poll_loop (haproxy.c:2976)
==1002331==    by 0x315218: run_thread_poll_loop (haproxy.c:3190)
==1002331==    by 0x18061D: main (haproxy.c:3747)
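
A reduced picture of the fix, using a hypothetical struct and helpers
rather than the real event_hdl types: the optional payload pointer is
assigned on every path, NULL included, so that the free routine never
tests an indeterminate value.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the async event struct. */
struct async_event_sketch {
    int type;
    void *_data; /* optional payload, may be NULL */
};

/* Allocates an event; <data> may be NULL and is stored as such. */
static struct async_event_sketch *new_event(int type, void *data)
{
    struct async_event_sketch *ev = malloc(sizeof(*ev));

    if (!ev)
        return NULL;
    ev->type = type;
    ev->_data = data; /* always assigned, even when no data is provided */
    return ev;
}

/* Frees the event and its payload; returns 1 if a payload was attached. */
static int free_event(struct async_event_sketch *ev)
{
    int had_data = ev->_data != NULL;

    free(ev->_data); /* free(NULL) is a no-op */
    free(ev);
    return had_data;
}
```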

The bug severity was set to MEDIUM because of its nature, and it's best
if this patch can be backported up to 2.8. But in practice it can only be
triggered with events that don't provide optional data: since PAT_REF
events are the first native events making use of this feature, this bug
shouldn't be an issue before f72a66e ("MINOR: pattern: publish event_hdl
events on pat_ref updates").

Aurelien DARRAGON [Fri, 29 Nov 2024 06:33:51 +0000 (07:33 +0100)]

BUG/MINOR: hlua_fcn: fix Patref:set() force parameter

Patref:set(key, val[, force]) takes an optional "force" parameter (defaults
to false) to force the entry to be created if it doesn't already exist.

To retrieve the value, lua_tointeger() was used in place of
lua_toboolean(), and because of that force is not enabled if "true"
is passed as parameter (only numbers were recognized) despite the
documentation mentioning that "force" is a boolean.

To fix the issue, we replace lua_tointeger() with lua_toboolean().

Also, the doc was updated to rename "bool" to "boolean" for the "force"
parameter to stay consistent with historical naming in the file.

No backport needed unless 9ee37de5c ("MINOR: hlua_fcn: add Patref:set()")
is.

Aurelien DARRAGON [Thu, 28 Nov 2024 16:39:00 +0000 (17:39 +0100)]

DOC: lua: prefer Patref:{set,add}() over legacy methods for acl and maps

Patref:set() can achieve the same thing as core.set_map()
Patref:add() can achieve the same thing as core.add_acl()
Patref:del() can achieve the same thing as core.del_map() and
core.del_acl()

As a bonus, Patref:{set,add} are more efficient than their legacy core
equivalents, because they don't require a systematic pattern reference
lookup for each individual operation.

Let's mention that in the doc to encourage Patref methods adoption.

Aurelien DARRAGON [Wed, 27 Nov 2024 16:06:58 +0000 (17:06 +0100)]

MINOR: hlua_fcn: add Patref:event_sub()

Just like we did for server events, in this patch we expose the PAT_REF
event family (see "MINOR: event_hdl: add PAT_REF events") in Lua.

Unlike server events, Patref events don't provide additional event data,
and the registration can only take place from a Patref object (ie: not
globally).

Thanks to this commit it now becomes possible to trigger actions when
updates are performed on a map (or acl list) being monitored, without
the need to loop or use inefficient workarounds.

Aurelien DARRAGON [Tue, 26 Nov 2024 12:03:23 +0000 (13:03 +0100)]

MINOR: hlua_fcn: add Patref:add_bulk()

There is no cli equivalent for this one. It is similar to Patref:add()
except that it takes a table as parameter (for acl: table of keys, for
maps: table of keys:values). The goal is to add multiple entries at once
to limit locking time to the strict minimum. It is recommended to use this
one over Patref:add() when adding multiple entries at once.

Aurelien DARRAGON [Tue, 26 Nov 2024 10:26:27 +0000 (11:26 +0100)]

MINOR: hlua_fcn: add Patref:set()

Just like "set map" on the cli, the Patref:set() method (only relevant
for maps) can be used to modify an existing entry's value in the pattern
reference pointed to by the Lua Patref object. Lookup is performed on the
key. The update will target the live pattern reference version, unless
Patref:prepare() is ongoing.

Aurelien DARRAGON [Tue, 26 Nov 2024 08:20:10 +0000 (09:20 +0100)]

MINOR: hlua_fcn: add Patref:del()

Just like "del map" and "del acl" on the cli, the Patref:del() method can
be used to delete an existing entry in the pattern reference pointed to
by the Lua Patref object. The update will target the live pattern
reference version, unless Patref:prepare() is ongoing.

Aurelien DARRAGON [Tue, 26 Nov 2024 07:38:23 +0000 (08:38 +0100)]

MINOR: hlua_fcn: add Patref:add()

Just like "add map" and "add acl" on the cli, the Patref:add() method can
be used to add a new entry to the pattern reference pointed to by the
Lua Patref object. The update will target the live pattern reference
version, unless Patref:prepare() is ongoing.

Aurelien DARRAGON [Mon, 25 Nov 2024 14:29:40 +0000 (15:29 +0100)]

MINOR: hlua_fcn: add Patref:giveup()

If Patref:prepare() was used and the new version (generation) isn't going
to be committed, calling Patref:giveup() will allow allocated resources
to be freed and reused. It is a good habit to call this if commit()
isn't called after a prepare().

Aurelien DARRAGON [Thu, 21 Nov 2024 15:46:26 +0000 (16:46 +0100)]

MINOR: hlua_fcn: add Patref:purge() method

It is a special Lua Patref method: it bypasses the commit/prepare logic
and purges all the pattern reference items pointed to by the Patref Lua
object (all versions, not just the current one). It doesn't have a cli
equivalent: it leverages pat_ref_purge_range().

Aurelien DARRAGON [Thu, 21 Nov 2024 15:32:05 +0000 (16:32 +0100)]

MINOR: hlua_fcn: add Patref:prepare() method

Just like the "prepare map" or "prepare acl" on the cli, but for Lua:
it leverages the pattern API to create a subset (ie: a new generation id)
that will automatically be used as target for following Patref operations
(add/set/del...) until the "commit" method is invoked to atomically push
the pending updates.

Aurelien DARRAGON [Tue, 19 Nov 2024 15:40:20 +0000 (16:40 +0100)]

MINOR: hlua_fcn: add Patref:commit() method

The commit() method may be used to commit pending updates on the local
patref object.

hlua_patref flags were added: HLUA_PATREF_FL_GEN means the patref object
has been updated and is associated with a new revision (curr_gen) in order
to prepare and commit the pending updates.

Upon commit, the pattern API is leveraged with curr_gen as revision to
commit new object items. Once commit is performed, previous (pending)
revisions that are older than the committed one are cleaned up (similar
to what's done with commit on the cli). Also, Patref function APIs now
take into account curr_gen to perform lookups.

Aurelien DARRAGON [Thu, 21 Nov 2024 10:28:01 +0000 (11:28 +0100)]

MINOR: pattern: add pat_ref_may_commit() helper function

pat_ref_may_commit() may be used to know if a given generation ID is still
valid, which means it may still be committed at some point. Otherwise, it
means that another pending generation ID older than the tested one was
already committed, and thus other generation IDs below this one are stale
and must be regenerated.

Aurelien DARRAGON [Tue, 19 Nov 2024 14:34:11 +0000 (15:34 +0100)]

MINOR: hlua_fcn: wrap pat_ref struct for patref class

In order to extend the patref class features, let's wrap the pat_ref struct
into hlua_patref struct. This way we may add additional data alongside the
pat_ref pointer to store additional context required for pat_ref data
manipulation from lua.

Since the wrapper (hlua_patref) is an allocated object, we declare the _gc
metamethod for patref class in order to properly cleanup resources when
they are out of scope.

Aurelien DARRAGON [Tue, 19 Nov 2024 11:01:51 +0000 (12:01 +0100)]

MINOR: hlua_fcn: implement index and pair metamethods for patref class

The patref object may now leverage index and pair metamethods to list and
access patref elements at a specific index (=key).

Also, patref:is_map() method may be used to know if the patref stores acl
(key only) or map-style (key:value) patterns.

Aurelien DARRAGON [Thu, 14 Nov 2024 16:37:54 +0000 (17:37 +0100)]

MINOR: hlua: add core.get_patref method

core.get_patref() method may be used to get a reference to a pattern
object (pat_ref struct which is used for maps and acl storage) from
Lua by providing the reference name (filename for files, or prefix+name
for opt or virtual pattern references).

Lua documentation was updated.

Aurelien DARRAGON [Thu, 7 Nov 2024 16:26:32 +0000 (17:26 +0100)]

MINOR: hlua: add patref class

Implement the patref class to expose the internal pat_ref pattern struct
in Lua. This is some prerequisite work needed to be able to manipulate
existing generic pattern object lists (acl/map) from Lua, because the Map
class can only be used to perform matching ops on Map files.

Aurelien DARRAGON [Fri, 18 Oct 2024 16:40:41 +0000 (18:40 +0200)]

MINOR: pattern: publish event_hdl events on pat_ref updates

Now that PAT_REF events were defined in previous commit, let's actually
publish them from pattern API where relevant. Unlike server events,
pattern reference events are only published in the pat_ref subscriber's
list on purpose, because in some setups patref updates (updates performed
on a map for instance from action or cli) are very frequent, and we don't
want to impact pattern API performance just for that.

Moreover, as the main use case is to be able to subscribe to maps updates
from Lua, allowing a per-pattern reference registration is already enough.

No additional data is provided for such events (also for performance
reasons).

Care was taken not to publish events when the update doesn't affect the
live subset (the one targeted by curr_gen).

Aurelien DARRAGON [Wed, 6 Nov 2024 16:10:52 +0000 (17:10 +0100)]

MINOR: event_hdl: add PAT_REF events

This is some prerequisite work for implementing PAT_REF events.

In this commit we define the PAT_REF event_hdl family (which gets family
slot id #2), with the following supported events:

  - EVENT_HDL_SUB_PAT_REF_ADD: element was added to the current version of
    the pattern ref
  - EVENT_HDL_SUB_PAT_REF_DEL: element was deleted from the current
    version of the pattern ref
  - EVENT_HDL_SUB_PAT_REF_SET: element was modified in the current version
    of the pattern ref
  - EVENT_HDL_SUB_PAT_REF_COMMIT: pending element(s) was/were committed in
    the current version of the pattern ref
  - EVENT_HDL_SUB_PAT_REF_CLEAR: all elements were cleared from the
    current version of the pattern ref

The goal is to be able to track a pat_ref struct in order to be notified
when it is updated. For performance reasons, events from this family won't
provide any additional info, and will only be published in the pat_ref
subscription list. Indeed, pat_ref may be updated at a relatively high
frequency (or worse, batch work), so we cannot afford doing expensive
treatment for each update.

Frederic Lecaille [Wed, 27 Nov 2024 18:39:34 +0000 (19:39 +0100)]

BUG/MINOR: improve BBR throughput on very fast links

This patch fixes the loss of information when computing the delivery rate
(quic_cc_drs.c) on links with very low latency, due to the use of 32-bit
variables with millisecond precision.

Initialize the quic_conn task with the TASK_F_WANTS_TIME flag to ask the
scheduler to update the call date of this task. This allows this task to
get a nanosecond resolution on the call date by calling task_mono_time().
This is enabled only for congestion control algorithms with delivery rate
estimation support (BBR only at this time).

Store the send date of each TX packet with nanosecond precision, obtained
thanks to task_mono_time(), into the new ->time_sent_ns member of the
quic_tx_packet struct.

Make use of this new timestamp in the delivery rate estimation algorithm
(quic_cc_drs.c).

Rename the current ->time_sent member of the quic_tx_packet struct to
->time_sent_ms to make the unit used by this variable (millisecond)
explicit, and update the code which uses this variable. The logic found in
quic_loss.c is not modified at all.
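
The precision loss can be illustrated with two hypothetical helpers (they
are not the quic_cc_drs.c code): on a sub-millisecond interval, the
millisecond value rounds down to 0 and no rate can be derived, while the
nanosecond value still gives a usable estimate.

```c
#include <stdint.h>

/* Delivery rate in bytes per second from an interval in milliseconds.
 * Returns 0 when the interval rounds down to 0 ms. */
static uint64_t rate_from_ms(uint64_t bytes, uint32_t interval_ms)
{
    return interval_ms ? bytes * 1000 / interval_ms : 0;
}

/* Same computation from an interval in nanoseconds. */
static uint64_t rate_from_ns(uint64_t bytes, uint64_t interval_ns)
{
    return interval_ns ? bytes * 1000000000ULL / interval_ns : 0;
}
```

For 150000 bytes delivered in 200 microseconds, the millisecond interval
is 0 and yields no rate, whereas the nanosecond interval (200000 ns)
yields 750000000 bytes per second.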

Must be backported to 3.1.

Aurelien DARRAGON [Thu, 28 Nov 2024 11:58:37 +0000 (12:58 +0100)]

MINOR: log: always consider "+M" option in lf_text_len()

Historically, when lf_text_len() or lf_text() were called with a NULL
string and "+M" option was set, "-" would be printed.

However, if the input string was simply an empty one with len > 0, then
nothing would be printed. This can happen if lf_text() is called with
an empty string because in this case len is set to size (indeed, for
performance reasons we don't pre-compute the length, we stop as soon
as we encounter a NUL byte).

In practice, a lot of call sites making use of lf_text() or lf_text_len()
try their best to avoid calling lf_text() with an empty string, and
instead explicitly call lf_text_len() with NULL as parameter to consider
the "+M" option.

But this is not enough: as shown in GH #2797, there could still be places
where lf_text() is called with an empty string. In such a case, instead of
ignoring the "+M" option, let's check after _lf_text_len() if the returned
pointer differs from the original one. If both are equal, then it means
that nothing was printed (ie: result of empty string): in that case we
check the "+M" option to print "-" when possible.
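
The pointer comparison described above can be sketched like this. Both
functions are hypothetical simplifications (the real helpers take a
logformat node and handle escaping); the "mandatory" argument stands for
the "+M" option.

```c
#include <stddef.h>

/* Copies <src> into <dst> (at most size-1 bytes), NUL-terminates it and
 * returns a pointer to the terminator. */
static char *copy_text(char *dst, const char *src, size_t size)
{
    if (size == 0)
        return dst;
    while (size > 1 && *src) {
        *dst++ = *src++;
        size--;
    }
    *dst = '\0';
    return dst;
}

/* Prints <src>, or "-" when nothing was printed and <mandatory> is set.
 * "Nothing was printed" is detected by comparing the returned pointer
 * with the initial one, which covers both NULL and empty strings. */
static char *text_or_dash(char *dst, const char *src, size_t size,
                          int mandatory)
{
    char *end = dst;

    if (src)
        end = copy_text(dst, src, size);
    if (end == dst && mandatory && size > 1) {
        *end++ = '-';
        *end = '\0';
    }
    return end;
}
```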

While this commit seems harmless, it's probably better to avoid
backporting it since it could break existing applications relying on the
historical behavior.

Aurelien DARRAGON [Thu, 28 Nov 2024 11:03:17 +0000 (12:03 +0100)]

BUG/MINOR: log: fix lf_text() behavior with empty string

As reported by Baptiste in GH #2797, if a logformat alias leveraging
lf_text() ends up printing nothing (empty string), the whole logformat
evaluation stops, leading to a garbage log message.

This bug was introduced during the 3.0 cycle in fcb7e4b ("MINOR: log: add
lf_rawtext{_len}() functions"). At that time I genuinely thought that
if strlcpy2() returned 0, it was due to a lack of space, actually
forgetting that the function may simply be called with an empty string.

Because of that, lf_text() would return NULL if called with an empty
string, and since all lf_*() helpers are expected to return NULL on
error, this explains why the logformat evaluation immediately stops in
this case.

To fix the issue, let's simply consider that strlcpy2() returning 0 is
not an error, like it was already the case before.
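
A simplified view of the corrected logic is given below. copy_bounded() is
a stand-in for strlcpy2() and append_text() a stand-in for the lf_*()
helper; returning NULL when the buffer has no room at all is a choice made
for this sketch, not something stated by the commit.

```c
#include <stddef.h>

/* Stand-in for strlcpy2(): copies at most size-1 bytes, NUL-terminates
 * and returns the number of bytes copied. */
static size_t copy_bounded(char *dst, const char *src, size_t size)
{
    size_t n = 0;

    if (size == 0)
        return 0;
    while (n + 1 < size && src[n]) {
        dst[n] = src[n];
        n++;
    }
    dst[n] = '\0';
    return n;
}

/* Returns the new write position. Copying 0 bytes is a valid outcome for
 * an empty string and is not reported as an error. */
static char *append_text(char *dst, size_t size, const char *src)
{
    if (size == 0)
        return NULL; /* sketch-only convention: no room at all */
    return dst + copy_bounded(dst, src, size);
}
```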

It should be backported to 3.1 and 3.0 with fcb7e4b.

Christopher Faulet [Thu, 28 Nov 2024 10:45:51 +0000 (11:45 +0100)]

MINOR: proxy: Add support of 421-Misdirected-Request in retry-on status

The "421" status can now be specified on retry-on directives. PR_RE_* flags
were updated to remain sorted.

This patch should fix the issue #2794. It is quite simple so it may safely
be backported to 3.1 if necessary.

Christopher Faulet [Wed, 27 Nov 2024 09:04:45 +0000 (10:04 +0100)]

BUG/MEDIUM: sock: Remove FD_POLL_HUP during connect() if FD_POLL_ERR is not set

epoll_wait() may return EPOLLHUP and/or EPOLLRDHUP after an asynchronous
connect(), to indicate that the peer accepted the connection then
immediately closed before epoll_wait() returned. When this happens,
sock_conn_check() is called to check whether or not the connection was
correctly established, and after that the receive channel of the socket is assumed to
already be closed. This lets haproxy send the request at best (if RDHUP and
not HUP) then immediately close.

Over the last two years, there were a few reports about this spuriously
happening on connections where network captures proved that the server had
not closed at all (and sometimes even received the request and responded to
it after haproxy had closed). The logs show that a successful connection is
immediately reported in error after the request was sent. After
investigation, it appeared that an EPOLLHUP, or possibly an EPOLLRDHUP, can
be reported by epoll_wait() during the connect() but in sock_conn_check(),
the connect() reports a success. So the connection is validated but the HUP
is handled on the first receive and an error is reported.

The same behavior could be observed on health-checks, leading HAProxy to
consider the server as DOWN while it is not.

The only explanation at this point is that it is a kernel bug, notably
because it does not even match the documentation for connect() nor epoll. In
addition for now it was only observed with Ubuntu kernels 5.4 and 5.15 and
was never reproduced on any other one.

We have no reproducer but here is the typical strace observed:

socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 114
fcntl(114, F_SETFL, O_RDONLY|O_NONBLOCK) = 0
setsockopt(114, SOL_TCP, TCP_NODELAY, [1], 4) = 0
connect(114, {sa_family=AF_INET, sin_port=htons(11000), sin_addr=inet_addr("A.B.C.D")}, 16) = -1 EINPROGRESS (Operation now in progress)
epoll_ctl(19, EPOLL_CTL_ADD, 114, {events=EPOLLIN|EPOLLOUT|EPOLLRDHUP, data={u32=114, u64=114}}) = 0
epoll_wait(19, [{events=EPOLLIN, data={u32=15, u64=15}}, {events=EPOLLIN, data={u32=151, u64=151}}, {events=EPOLLIN, data={u32=59, u64=59}}, {events=EPOLLIN|EPOLLRDHUP, data={u32=114, u64=114}}], 200, 0) = 4
epoll_ctl(19, EPOLL_CTL_MOD, 114, {events=EPOLLOUT, data={u32=114, u64=114}}) = 0
epoll_wait(19, [{events=EPOLLOUT, data={u32=114, u64=114}}, {events=EPOLLIN, data={u32=15, u64=15}}, {events=EPOLLIN, data={u32=10, u64=10}}, {events=EPOLLIN, data={u32=165, u64=165}}], 200, 0) = 4
connect(114, {sa_family=AF_INET, sin_port=htons(11000), sin_addr=inet_addr("A.B.C.D")}, 16) = 0
sendto(114, "POST "..., 1009, MSG_DONTWAIT|MSG_NOSIGNAL, NULL, 0) = 1009
close(114)                              = 0

Some resources about this issue:
  - https://www.spinics.net/lists/netdev/msg876470.html
  - https://github.com/haproxy/haproxy/issues/1863
  - https://github.com/haproxy/haproxy/issues/2368

So, to work around the issue, we have decided to remove the FD_POLL_HUP flag on
the FD during the connection establishment if FD_POLL_ERR is not reported
too in sock_conn_check(). This way, the call to connect() is able to
validate or reject the connection. At the end, if the HUP or RDHUP flags
were valid, either connect() would report the error itself, or the next
recv() would return 0 confirming the closure that the poller tried to
report. EPOLLRDHUP is only an optimization to save a syscall anyway, and
this pattern is so rare that nobody will ever notice the extra call to
recv().
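
As a rough sketch of the workaround, assuming simplified flag values (the
POLL_ERR_FLAG and POLL_HUP_FLAG names and filter_connect_events() are
hypothetical and do not reflect haproxy's actual definitions), the filtering
done while the connection is being established could look like this:

```c
/* Hypothetical flag values for illustration only. */
#define POLL_ERR_FLAG 0x01u
#define POLL_HUP_FLAG 0x02u

/* Returns the poll state to use while the connection is still being
 * established: a hang-up report is dropped unless an error accompanies
 * it, so that connect() gets to validate or reject the connection.
 */
unsigned int filter_connect_events(unsigned int state)
{
	if ((state & POLL_HUP_FLAG) && !(state & POLL_ERR_FLAG))
		state &= ~POLL_HUP_FLAG;
	return state;
}
```

If the hang-up was genuine, nothing is lost: the next recv() simply returns 0.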

Please note that at least one reporter confirmed that using poll() instead
of epoll() also addressed the problem, so that can also be a temporary
workaround for those discovering the problem without the ability to
immediately upgrade.

The event is accounted via a COUNT_IF(), to be able to spot it in future
issues, just in case.

This patch should fix issues #1863 and #2368. It may be related
to #2751. It should be backported as far as 2.4. In 3.0 and below, the
COUNT_IF() must be removed.

commit | commitdiff | tree

Willy Tarreau [Tue, 26 Nov 2024 16:24:21 +0000 (17:24 +0100)]

DEV: patchbot: prepare for new version 3.2-dev

The bot will now load the prompt for the upcoming 3.2 version so we have
to rename the files and update their contents to match the current version.

commit | commitdiff | tree

Willy Tarreau [Tue, 26 Nov 2024 16:20:40 +0000 (17:20 +0100)]

MINOR: version: this is development again (3.2)

This basically reverts commit b629f366a7 ("MINOR: version: mention that
3.1 is stable now").

commit | commitdiff | tree

Aurelien DARRAGON [Wed, 20 Nov 2024 18:03:00 +0000 (19:03 +0100)]

MEDIUM: pattern: always consider gen_id for pat_ref lookup operations

Historically, pat_ref lookup operations were performed on the whole
pat_ref elements list. As such, set, find and delete operations on a given
key would cause any matching element in pat_ref to be considered.

When prepare/commit operations were added, gen_id was implemented in
order to be able to work on a subset from pat_ref without impacting
the current (live) version from pat_ref, until a new subset is committed
to replace the current one.

While the logic was good, there remained a design flaw from the historical
implementation: indeed, legacy functions such as pat_ref_set(),
pat_ref_delete() and pat_ref_find_elt() kept performing the lookups on the
whole set of elements instead of considering only elements from the current
subset. Because of this, mixing new prepare/commit operations with legacy
operations could yield unexpected results.

For instance, before this commit:

  echo "add map #0 key oldvalue" | socat /tmp/ha.sock -
  echo "prepare map #0" | socat /tmp/ha.sock -
  New version created: 1
  echo "add map @1 #0 key newvalue" | socat /tmp/ha.sock -
  echo "del map #0 key" | socat /tmp/ha.sock -
  echo "commit map @1 #0" | socat /tmp/ha.sock -

  -> the result would be that "key" entry doesn't exist anymore after the
  commit, while we would expect the new value to be there instead.

Thanks to the previous commits, we may finally fix this issue: for set,
find_elt and delete operations, the current generation id is considered.

With the above example, it means that the "del map #0 key" would only
target elements from the current subset, thus elements in "version 1" of
the map would be immune to the delete (as we would expect it to work).
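
To illustrate the generation-aware behavior, here is a minimal sketch using a
flat array. The struct elt layout and gen_delete() are hypothetical and much
simpler than the real pat_ref structures; only the filtering on gen_id is the
point.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified element; the real pat_ref elements differ. */
struct elt {
	const char *key;
	const char *value;
	unsigned int gen_id;
	int deleted;
};

/* Marks as deleted every element matching key within generation gen_id
 * only. Elements from other generations are left untouched.
 * Returns the number of elements removed.
 */
int gen_delete(struct elt *elts, size_t count, unsigned int gen_id, const char *key)
{
	int removed = 0;
	size_t i;

	for (i = 0; i < count; i++) {
		if (elts[i].deleted || elts[i].gen_id != gen_id)
			continue;
		if (strcmp(elts[i].key, key) == 0) {
			elts[i].deleted = 1;
			removed++;
		}
	}
	return removed;
}
```

Deleting "key" from generation 0 leaves the entry prepared in generation 1
intact, so it is still there once that generation is committed.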

commit | commitdiff | tree

Aurelien DARRAGON [Wed, 20 Nov 2024 16:26:23 +0000 (17:26 +0100)]

MEDIUM: pattern: consider gen_id in pat_ref_set_from_node()

Don't set all duplicates from a given node if they don't have the same
gen_id. Indeed, now we consider the gen_id to only work on the same
pattern ref revision.

commit | commitdiff | tree

Aurelien DARRAGON [Wed, 20 Nov 2024 17:18:54 +0000 (18:18 +0100)]

MINOR: pattern: add pat_ref_gen_delete() function

pat_ref_gen_delete(ref, gen_id, key) tries to delete all samples belonging
to <gen_id> and matching <key> under <ref>

The goal is to be able to target a single subset from <ref>

commit | commitdiff | tree

Aurelien DARRAGON [Wed, 20 Nov 2024 17:07:52 +0000 (18:07 +0100)]

MINOR: pattern: add pat_ref_gen_find_elt() function

pat_ref_gen_find_elt(ref, gen_id, key) tries to find <elt> element
belonging to <gen_id> and matching <key> in <ref> reference.

The goal is to be able to target a single subset from <ref>

commit | commitdiff | tree

Aurelien DARRAGON [Wed, 20 Nov 2024 16:30:39 +0000 (17:30 +0100)]

MINOR: pattern: add pat_ref_gen_set() function

pat_ref_gen_set(ref, gen_id, value, err) modifies to <value> the sample
of all patterns matching <key> and belonging to <gen_id> (generation id)
under <ref>

The goal is to be able to target a single subset from <ref>

commit | commitdiff | tree

Aurelien DARRAGON [Wed, 20 Nov 2024 15:22:22 +0000 (16:22 +0100)]

MINOR: pattern: split pat_ref_set()

Split the pat_ref_set() function into 2 distinct functions. Indeed, since
0844bed7d3 ("MEDIUM: map/acl: Improve pat_ref_set() efficiency (for
"set-map", "add-acl" action perfs)"), pat_ref_set() prototype was updated
to include an extra <elt> argument. But the logic behind is not explicit
because the function will not only try to set <elt>, but also its
duplicate (unlike pat_ref_set_elt() which only tries to update <elt>).

Thus, to make it clearer and better distinguish between the key-based
lookup version and the elt-based one, restore pat_ref_set() previous
prototype and add a dedicated pat_ref_set_elt_duplicate() that takes
<elt> as argument and tries to update <elt> and all duplicates.

commit | commitdiff | tree

Willy Tarreau [Tue, 26 Nov 2024 14:33:57 +0000 (15:33 +0100)]

[RELEASE] Released version 3.2-dev0

Released version 3.2-dev0 with the following main changes :
- exact copy of 3.1.0

commit | commitdiff | tree

Willy Tarreau [Tue, 26 Nov 2024 14:24:10 +0000 (15:24 +0100)]

[RELEASE] Released version 3.1.0

Released version 3.1.0 with the following main changes :
    - BUG/MAJOR: mux-h1: Properly handle wrapping on obuf when dumping the first-line
    - BUILD: activity/memprofile: fix a build warning in the posix_memalign handler
    - BUG/MINOR: quic: Avoid BUG_ON() on ->on_pkt_lost() BBR callback call
    - CI: update to the latest AWS-LC version
    - CI: update to the latest WolfSSL version
    - DOC: ot: mention planned deprecation of the OT filter
    - Revert "CI: update to the latest WolfSSL version"
    - CI: github: add a WolfSSL job which tries the latest version
    - BUILD: systemd: fix usage of reserved name "sun" in the address field
    - BUILD: init: use the more portable FD_CLOEXEC for /dev/null
    - CI: github: improve the Wolfssl job
    - CI: github: improve the AWS-LC job
    - BUG/MINOR: mux-quic: fix show quic report of QCS prepared bytes
    - BUG/MEDIUM: quic: fix sending performance due to qc_prep_pkts() return
    - MINOR: mux-quic: use sched call time for pacing
    - CI: github: allow to run the Illumos job manually
    - BUILD: tcp_sample: var_fc_counter defined but not used
    - CI: github: add 'workflow_dispatch' on remaining build jobs
    - DOC: config: refine a little bit the text on QUIC pacing
    - MINOR: proto_sockpair: send_fd_uxst: init iobuf, cmsghdr, cmsgbuf to zeros
    - MINOR: startup: rename on_new_child_failure to mworker_on_new_child_failure
    - REORG: startup: move on_new_child_failure in mworker.c
    - MINOR: startup: prefix prepare_master and run_master with mworker_*
    - REORG: startup: move mworker_prepare_master in mworker.c
    - MINOR: startup: keep updating verbosity modes only in haproxy.c
    - REORG: startup: move mworker_run_master and mworker_loop in mworker.c
    - REORG: startup: move mworker_reexec and mworker_reload in mworker.c
    - MINOR: startup: prefix apply_master_worker_mode with mworker_*
    - REORG: startup: move mworker_apply_master_worker_mode in mworker.c
    - MINOR: cfgparse-quic: strengthen quic-cc-algo parsing
    - BUG/MAJOR: quic: fix wrong packet building due to already acked frames
    - DEV: lags/show-sess-to-flags: Properly handle fd state on server side
    - BUG/MEDIUM: http-ana: Don't release too early the L7 buffer
    - MINOR: quic: make bbr consider the max window size setting
    - DOC: quic: Amend the pacing information about BBR.
    - BUG/MEDIUM: quic: prevent EMSGSIZE with GSO for larger bufsize
    - MINOR: cli: Add a "help" keyword to show sess
    - MINOR: cli/quic: Add a "help" keyword to show quic
    - DOC: management: mention "show sess help" and "show quic help"
    - DOC: install: update the list of supported versions
    - MINOR: version: mention that 3.1 is stable now

commit | commitdiff | tree

Christopher Faulet [Tue, 26 Nov 2024 14:11:36 +0000 (15:11 +0100)]

MINOR: version: mention that 3.1 is stable now

This version will be maintained up to around Q1 2026. The INSTALL file
also mentions it.

commit | commitdiff | tree

Willy Tarreau [Tue, 26 Nov 2024 14:18:48 +0000 (15:18 +0100)]

DOC: install: update the list of supported versions

OpenSSL up to 3.4 was tested, and gcc up to 14 was tested, so let's
reflect this in the install doc.

commit | commitdiff | tree

Willy Tarreau [Tue, 26 Nov 2024 14:00:51 +0000 (15:00 +0100)]

DOC: management: mention "show sess help" and "show quic help"

These ones were recently added but we forgot to update the doc.

commit | commitdiff | tree

Olivier Houchard [Mon, 25 Nov 2024 17:35:51 +0000 (18:35 +0100)]

MINOR: cli/quic: Add a "help" keyword to show quic

Add a help keyword to show quic, that will provide a longer explanation
of all the available options than what is provided by the command "help".

commit | commitdiff | tree

Olivier Houchard [Mon, 25 Nov 2024 17:34:01 +0000 (18:34 +0100)]

MINOR: cli: Add a "help" keyword to show sess

Add a help keyword to show sess, that will provide a longer explanation of
all the available options than what is provided by the command "help".

commit | commitdiff | tree

Amaury Denoyelle [Tue, 26 Nov 2024 10:03:30 +0000 (11:03 +0100)]

BUG/MEDIUM: quic: prevent EMSGSIZE with GSO for larger bufsize

A UDP datagram cannot be greater than 65535 bytes, as UDP length header
field is encoded on 2 bytes. As such, sendmsg() will reject a bigger
input with error EMSGSIZE. By default, this does not cause any issue as
QUIC datagrams are limited to 1252 bytes and sent individually.

However, with GSO support, values bigger than 1252 bytes are specified
on sendmsg(). If using a bufsize equal to or greater than 65535, the syscall
could reject the input buffer with EMSGSIZE. As this value is not
expected, the connection is immediately closed by haproxy and the
transfer is interrupted.

This bug can easily be reproduced by requesting a large object on loopback
interface and using a bufsize of 65535 bytes. In fact, the limit is
slightly less than 65535, as extra room is also needed for IP + UDP
headers.

Fix this by reducing the count of datagrams encoded in a single GSO
invocation via qc_prep_pkts(). Previously, it was set to 64 as specified
by man 7 udp. However, with 1252-byte datagrams, this is still too many.
Reduce it to a value of 52. Input to sendmsg() will thus be restricted to
at most 65104 bytes if the last datagram is full.

If there is still data available for encoding in qc_prep_pkts(), they
will be written in a separate batch of datagrams. qc_send_ppkts() will
then loop over the whole QUIC Tx buffer and call sendmsg() for each
series of at most 52 datagrams.
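
The arithmetic behind the value 52 can be checked with a short sketch.
max_gso_dgrams() is a hypothetical helper, and the 28 bytes of header room
used in the example assume an IPv4 header without options (20 bytes) plus the
UDP header (8 bytes).

```c
#define UDP_MAX_LEN 65535u  /* largest value of the 16-bit UDP length field */

/* Largest number of equally sized datagrams that fits in a single
 * sendmsg() call, after reserving hdr_room bytes for the IP and UDP
 * headers. Returns 0 on nonsensical input.
 */
unsigned int max_gso_dgrams(unsigned int dgram_sz, unsigned int hdr_room)
{
	if (!dgram_sz || hdr_room >= UDP_MAX_LEN)
		return 0;
	return (UDP_MAX_LEN - hdr_room) / dgram_sz;
}
```

With 1252-byte datagrams, 64 of them would need 80128 bytes, which exceeds
the limit, while 52 of them need 65104 bytes and still leave room for the
headers.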

This does not need to be backported.

commit | commitdiff | tree

Frederic Lecaille [Tue, 26 Nov 2024 06:46:17 +0000 (07:46 +0100)]

DOC: quic: Amend the pacing information about BBR.

BBR handles its own burst size itself (mentioned as send_quantum in BBR RFC).

commit | commitdiff | tree

Frederic Lecaille [Tue, 26 Nov 2024 06:37:58 +0000 (07:37 +0100)]

MINOR: quic: make bbr consider the max window size setting

Limit the BBR congestion control window size as this is done for all the other
congestion control algorithms with tune.quic.frontend.default-max-window-size
or as first argument passed to "bbr" option for "quic-cc-algo".

commit | commitdiff | tree

Christopher Faulet [Mon, 25 Nov 2024 21:05:27 +0000 (22:05 +0100)]

BUG/MEDIUM: http-ana: Don't release too early the L7 buffer

In some cases, the buffer used to store the request to be able to perform a
L7 retry is released too early, leading to a crash because a retry
is performed with an empty request.

First, there is a test on invalid 101 responses that may be caught by the
"junk-response" retry policy. Then, it is possible to get an error
(empty-response, bad status code...) after an interim response. In both
cases, the L7 buffer is already released while it should not.

To fix the issue, the L7 buffer is now released at the end of the
AN_RES_WAIT_HTTP analyser, but only when a response was successfully
received and processed. In all error cases, the stream is quickly released,
with the L7 buffer. So there is no leak and it is safer this way.

This patch may fix the issue #2793. It must be backported as far as 2.4.

commit | commitdiff | tree

Christopher Faulet [Mon, 25 Nov 2024 20:57:27 +0000 (21:57 +0100)]

DEV: lags/show-sess-to-flags: Properly handle fd state on server side

It must be handled as an hexadecimal value.

commit | commitdiff | tree

Frederic Lecaille [Mon, 25 Nov 2024 10:14:20 +0000 (11:14 +0100)]

BUG/MAJOR: quic: fix wrong packet building due to already acked frames

If a packet build was asked to probe the peer with frames which have just
been acked, the frames build run by qc_build_frms() could be cancelled by
qc_stream_frm_is_acked() whose aim is to check that current frames to
be built have not been already acknowledged. In this case the packet build run
by qc_do_build_pkt() is not interrupted, leading to the build of an empty packet
which should be ack-eliciting.

This is a bug detected by the BUG_ON() statement in qc_do_build_pkt():

    BUG_ON(qel->pktns->tx.pto_probe &&
           !(pkt->flags & QUIC_FL_TX_PACKET_ACK_ELICITING));

Thank you to @Tristan971 for having reported this issue in GH #2709

This is an old bug which must be backported as far as 2.6.

commit | commitdiff | tree

Amaury Denoyelle [Mon, 25 Nov 2024 14:37:46 +0000 (15:37 +0100)]

MINOR: cfgparse-quic: strengthen quic-cc-algo parsing

quic-cc-algo is a bind keyword which is used to specify the congestion
control algorithm. It is parsed via function bind_parse_quic_cc_algo().

The parsing function was too lax as it used strncmp for algo token
matching. This could cause surprises when specifying an invalid algorithm
whose name starts identically to another entry, especially if extra
parameters are specified in parentheses, as in this case the parameter
values will be completely ignored and default values used instead.

To fix this, convert algo argument to ist. Then, use istsplit() to
extract algo token from the optional extra arguments and compare the
whole value with isteq().
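
The difference between the two matching strategies can be sketched in plain
C. This does not use the ist API: algo_prefix_matches() and
algo_token_matches() are hypothetical helpers, and "cubicx" is a made-up
invalid algorithm name.

```c
#include <string.h>

/* The lax check: only compares the first strlen(algo) bytes, so any
 * argument merely starting with the algorithm name is accepted.
 */
int algo_prefix_matches(const char *arg, const char *algo)
{
	return strncmp(arg, algo, strlen(algo)) == 0;
}

/* The strict check: isolates the token located before the optional
 * '(' and compares it as a whole with the algorithm name.
 */
int algo_token_matches(const char *arg, const char *algo)
{
	size_t tok_len = strcspn(arg, "(");

	return tok_len == strlen(algo) && memcmp(arg, algo, tok_len) == 0;
}
```

With the lax check, "cubicx(10)" is taken for "cubic" and its parameters are
silently lost, while the strict check rejects it.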

commit | commitdiff | tree

Valentine Krasnobaeva [Fri, 22 Nov 2024 22:42:17 +0000 (23:42 +0100)]

REORG: startup: move mworker_apply_master_worker_mode in mworker.c

mworker_apply_master_worker_mode() is called only in master-worker mode, so
let's move it to mworker.c

commit | commitdiff | tree

Valentine Krasnobaeva [Fri, 22 Nov 2024 22:33:31 +0000 (23:33 +0100)]

MINOR: startup: prefix apply_master_worker_mode with mworker_*

This patch prepares the move of apply_master_worker_mode in mworker.c. So,
let's at first rename it to mworker_apply_master_worker_mode.

commit | commitdiff | tree

Valentine Krasnobaeva [Mon, 25 Nov 2024 11:04:35 +0000 (12:04 +0100)]

REORG: startup: move mworker_reexec and mworker_reload in mworker.c

Let's move mworker_reexec() and mworker_reload() in mworker.c. mworker_reload()
is called only within the functions, which are already in mworker.c. So, this
reorganization allows mworker_reload() to be declared static.

commit | commitdiff | tree

Valentine Krasnobaeva [Fri, 22 Nov 2024 22:15:39 +0000 (23:15 +0100)]

REORG: startup: move mworker_run_master and mworker_loop in mworker.c

mworker_run_master() is called only in master mode. mworker_loop() is static
and called only in mworker_run_master(). So let's move both of these functions in
mworker.c.

We also need here to make run_thread_poll_loop() accessible from other units,
as it's used in mworker_loop().

commit | commitdiff | tree

Valentine Krasnobaeva [Fri, 22 Nov 2024 22:11:05 +0000 (23:11 +0100)]

MINOR: startup: keep updating verbosity modes only in haproxy.c

This commit prepares the move of mworker_run_master() in mworker.c.

Let's remove from its definition the code which adjusts verbosity
depending on other global run time modes (daemon or foreground). This part
should stay in main(), where all verbosity modes are handled for
different mode combinations.

commit | commitdiff | tree

Valentine Krasnobaeva [Fri, 22 Nov 2024 22:07:00 +0000 (23:07 +0100)]

REORG: startup: move mworker_prepare_master in mworker.c

mworker_prepare_master() performs some preparation routines for the new worker
process, which will be forked during the startup. It's called only in
master-worker mode, so let's move it in mworker.c.

commit | commitdiff | tree

Valentine Krasnobaeva [Fri, 22 Nov 2024 21:58:53 +0000 (22:58 +0100)]

MINOR: startup: prefix prepare_master and run_master with mworker_*

This patch prepares the move of prepare_master() and run_master() definitions
into mworker.c. So, let's at first prefix their names with mworker_*.

commit | commitdiff | tree

Valentine Krasnobaeva [Fri, 22 Nov 2024 21:39:20 +0000 (22:39 +0100)]

REORG: startup: move on_new_child_failure in mworker.c

mworker_on_new_child_failure() performs some routines for the worker process,
if it has failed the reload. As it's called only in mworker_catch_sigchld()
from mworker.c, let's move mworker_on_new_child_failure() in mworker.c as well.
This way it can also be declared static.

commit | commitdiff | tree

Valentine Krasnobaeva [Fri, 22 Nov 2024 21:41:46 +0000 (22:41 +0100)]

MINOR: startup: rename on_new_child_failure to mworker_on_new_child_failure

This patch prepares the moving of on_new_child_failure definition into
mworker.c. So, let's rename it accordingly and let's also update its
description.

commit | commitdiff | tree

Valentine Krasnobaeva [Fri, 22 Nov 2024 15:43:45 +0000 (16:43 +0100)]

MINOR: proto_sockpair: send_fd_uxst: init iobuf, cmsghdr, cmsgbuf to zeros

In master-worker mode, the worker process now uses send_fd_uxst() to send
'_send_status' command to master. Since refactoring, this started to trigger
the following Valgrind reports:

==810584== Syscall param sendmsg(msg.msg_iov[0]) points to uninitialised byte(s)
==810584==    at 0x4AAC99D: __libc_sendmsg (sendmsg.c:28)
==810584==    by 0x4AAC99D: sendmsg (sendmsg.c:25)
==810584==    by 0x56350F: send_fd_uxst (proto_sockpair.c:271)
==810584==    by 0x3AA25C: main (haproxy.c:4151)
==810584==  Address 0x1ffefffbfe is on thread 1's stack
==810584==  in frame #1, created by send_fd_uxst (proto_sockpair.c:241)
==810584==
==810584== Syscall param sendmsg(msg.msg_control) points to uninitialised byte(s)
==810584==    at 0x4AAC99D: __libc_sendmsg (sendmsg.c:28)
==810584==    by 0x4AAC99D: sendmsg (sendmsg.c:25)
==810584==    by 0x56350F: send_fd_uxst (proto_sockpair.c:271)
==810584==    by 0x3AA25C: main (haproxy.c:4151)
==810584==  Address 0x1ffefffc14 is on thread 1's stack
==810584==  in frame #1, created by send_fd_uxst (proto_sockpair.c:241)
==810584==

So, let's initialize with zeros all buffers which are passed to the sendmsg()
syscall used in send_fd_uxst(), to avoid these Valgrind messages. They
clutter the Valgrind output and could hide other, more important
reports.
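
A minimal sketch of the idea, not the actual send_fd_uxst() code: the
function name send_fd_zeroed() and the 2-byte payload are assumptions made
for the example. Every buffer reachable from the msghdr is zeroed before
being filled and handed to sendmsg().

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Sends fd over the UNIX socket sock as SCM_RIGHTS ancillary data.
 * The payload, the control buffer, the iovec and the msghdr are all
 * zeroed first so that no uninitialised byte reaches the syscall.
 * Returns the sendmsg() result.
 */
ssize_t send_fd_zeroed(int sock, int fd)
{
	char iobuf[2];
	union {
		struct cmsghdr hdr; /* forces suitable alignment */
		char buf[CMSG_SPACE(sizeof(int))];
	} ctl;
	struct iovec iov;
	struct msghdr msg;
	struct cmsghdr *cmsg;

	memset(iobuf, 0, sizeof(iobuf));
	memset(&ctl, 0, sizeof(ctl));
	memset(&iov, 0, sizeof(iov));
	memset(&msg, 0, sizeof(msg));

	iov.iov_base = iobuf;
	iov.iov_len = sizeof(iobuf);

	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctl.buf;
	msg.msg_controllen = sizeof(ctl.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(fd));

	return sendmsg(sock, &msg, 0);
}
```

The extra memset() calls cost very little on such small buffers, which is
why zeroing everything is an easy way to keep the Valgrind output clean.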

commit | commitdiff | tree

Willy Tarreau [Mon, 25 Nov 2024 13:30:15 +0000 (14:30 +0100)]

DOC: config: refine a little bit the text on QUIC pacing

The QUIC pacing options changed a few times during their development.
For example the unit is now in datagrams not bytes. Also a few
sentences were slightly ambiguous so let's reword this.

No backport is needed.

commit | commitdiff | tree

William Lallemand [Mon, 25 Nov 2024 13:03:13 +0000 (14:03 +0100)]

CI: github: add 'workflow_dispatch' on remaining build jobs

Add 'workflow_dispatch' on the remaining scheduled build jobs that do
not have it.

This keyword allows a job to be started manually from the "Actions" interface
in github.

commit | commitdiff | tree

William Lallemand [Mon, 25 Nov 2024 10:41:26 +0000 (11:41 +0100)]

BUILD: tcp_sample: var_fc_counter defined but not used

var_fc_counter is not used on Illumos and emits a warning:

  src/tcp_sample.c:291:12: warning: ‘var_fc_counter’ defined but not used [-Wunused-function]
    291 | static int var_fc_counter(struct arg *args, char **err)
        |            ^~~~~~~~~~~~~~

Let's add an ifdef to build it.

commit | commitdiff | tree

William Lallemand [Mon, 25 Nov 2024 10:30:04 +0000 (11:30 +0100)]

CI: github: allow to run the Illumos job manually

Add the "workflow_dispatch" option to the Illumos CI so it can be run
manually from the github actions page.

commit | commitdiff | tree

Amaury Denoyelle [Thu, 21 Nov 2024 15:20:15 +0000 (16:20 +0100)]

MINOR: mux-quic: use sched call time for pacing

QUIC pacing was recently implemented to limit burst and improve overall
bandwidth. This is used only for MUX STREAM emission. Pacing requires
nanosecond resolution. As such, it used now_cpu_time() which relies on
clock_gettime() syscall.

The usage of clock_gettime() has several drawbacks :
* it is a syscall and thus requires a context-switch which may hurt
  performance
* it is not available on all systems
* timestamp is retrieved multiple times during a single task execution,
  thus yielding different values which may tamper with pacing calculation

Improve this by using task_mono_time() instead. This returns task call
time from the scheduler thread context. It requires the flag
TASK_F_WANTS_TIME on QUIC MUX tasklet to force the scheduler to update
call time with now_mono_time(). This solves all the limitations listed
above:
* syscall invocation is only performed once before tasklet execution,
  thus reducing context-switch impact
* on non-compatible systems, a millisecond timer is used as a fallback
  which should ensure that pacing works decently for them
* timer value is now guaranteed to be fixed during task execution

commit | commitdiff | tree

Amaury Denoyelle [Fri, 22 Nov 2024 14:43:16 +0000 (15:43 +0100)]

BUG/MEDIUM: quic: fix sending performance due to qc_prep_pkts() return

qc_prep_pkts() is a QUIC transport level function which encodes one or
several datagrams in a buffer before sending them. It returns the number
of encoded datagram. This is especially important when pacing is used to
limit packet bursts.

This datagram accounting was not trivial as qc_prep_pkts() used several
code paths depending on the condition of the current encoded packet.
Thus, there were several places where the local variable dgram_cnt could
have been incremented. This was implemented by the following commit :

commit 5cb8f8a6224db96f4386277c41ddae4a29a4130d
MINOR: quic: support a max number of built packet per send iteration

However, there is a bug due to a missing increment when all frames from
the current QEL have been encoded. In this case, the encoding continues
in the same datagram to coalesce a future packet. However, if this is the
last QEL, the encoding loop will then break. As first_pkt is not NULL,
qc_txb_store() is called outside but dgram_cnt is not yet incremented.

In particular, this causes qc_prep_pkts() to return 0 when there are only
small STREAM frames to emit for the application QEL. In qc_send(), this is
interpreted as a value which prevents further emission for the current
invocation. Thus, it may hurt performance, both without and with
pacing.

To fix this, remove the multiple dgram_cnt increments. Now, it is modified
only in a single place which should cover every case, and renders the
code easier to validate.

The most notable case where the bug is visible is when using cubic with
pacing without any burst, with quic-cc-algo cubic(,1). First, transfer
bandwidth on average was suboptimal, with significant variation. Worse,
it could sometimes fall dramatically for a particular stream without
recovering before returning to an expected level on the next one.

No need to backport.

commit | commitdiff | tree

Amaury Denoyelle [Thu, 21 Nov 2024 14:18:41 +0000 (15:18 +0100)]

BUG/MINOR: mux-quic: fix show quic report of QCS prepared bytes

On show quic, each MUX stream is listed with its various indicators
for buffering on Rx and Tx. In particular, txoff displays in parentheses
the current level of data prepared by the upper stream instance not yet
emitted by QUIC transport layer.

This value is only accessible after a subtract operation. However,
there was a typo which caused the result to be always 0. Fix this by
reusing the correct offsets in the calculation.

This should be backported up to 3.0.

commit | commitdiff | tree

William Lallemand [Mon, 25 Nov 2024 10:14:33 +0000 (11:14 +0100)]

CI: github: improve the AWS-LC job

Like the WolfSSL job, improve the AWS-LC job by adding the socat command
so all SSL reg-tests can be run.
Also add gdb and output of corefiles.

commit | commitdiff | tree

William Lallemand [Mon, 25 Nov 2024 09:54:39 +0000 (10:54 +0100)]

CI: github: improve the Wolfssl job

Improve the WolfSSL job by adding the missing socat command.
Also add gdb and output corefiles like it's done on the VTest job.

commit | commitdiff | tree

Willy Tarreau [Mon, 25 Nov 2024 07:43:25 +0000 (08:43 +0100)]

BUILD: init: use the more portable FD_CLOEXEC for /dev/null

In 3.1-dev10, commit 8dd4efe42f ("MAJOR: mworker: move master-worker
fork in init()"), the FD associated to /dev/null was made CLOEXEC
using O_CLOEXEC. Unfortunately this is not portable on older OSes,
doesn't build on Solaris for example, and was even reported as breaking
moderately old Linux OSes for other projects. Better not use it unless
absolutely certain it will work (currently we only use it for Linux
namespaces, which are optional), and use the conventional FD_CLOEXEC
instead.
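
For reference, the conventional approach can be sketched as follows.
open_devnull_cloexec() is a hypothetical helper name and the O_RDWR open
mode is an assumption; only open(), fcntl() and close() are used.

```c
#include <fcntl.h>
#include <unistd.h>

/* Opens /dev/null and marks the descriptor close-on-exec with fcntl()
 * instead of passing O_CLOEXEC to open().
 * Returns the file descriptor, or -1 on error.
 */
int open_devnull_cloexec(void)
{
	int fd = open("/dev/null", O_RDWR);
	int flags;

	if (fd < 0)
		return -1;

	/* preserve the other descriptor flags, then add FD_CLOEXEC */
	flags = fcntl(fd, F_GETFD);
	if (flags < 0 || fcntl(fd, F_SETFD, flags | FD_CLOEXEC) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}
```

Unlike O_CLOEXEC, this takes two steps and is thus not atomic with respect
to a concurrent fork()/exec() from another thread, which is the price paid
for portability.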

No backport is needed.

commit | commitdiff | tree

Willy Tarreau [Mon, 25 Nov 2024 07:04:09 +0000 (08:04 +0100)]

BUILD: systemd: fix usage of reserved name "sun" in the address field

systemd.c doesn't build on Solaris / Illumos because it uses "sun" as
the field name in a structure, while "sun" is the name of the macro
used to detect Solaris:

  src/systemd.c: In function 'sd_notify':
  src/systemd.c:43:22: error: expected identifier or '(' before numeric constant
     struct sockaddr_un sun;
                        ^
  src/systemd.c:44:2: warning: no semicolon at end of struct or union
    } socket_addr = {
    ^

Admittedly, the OS could have instead defined "sun" to itself to avoid
this. Any other name will work, let's just use "ux" for the short form
of "unix".
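
A small sketch of a declaration that avoids the reserved name. The
union sock_addr type and sock_addr_set_path() are hypothetical and are not
the actual systemd.c code; the point is only that the member is called "ux".
The sun_family and sun_path members are unaffected since they are different
tokens.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Address storage whose UNIX member is named "ux" rather than "sun",
 * so that it still compiles where "sun" is a predefined macro.
 */
union sock_addr {
	struct sockaddr sa;
	struct sockaddr_un ux;
};

/* Fills addr with the UNIX socket path, truncated if too long. */
void sock_addr_set_path(union sock_addr *addr, const char *path)
{
	memset(addr, 0, sizeof(*addr));
	addr->ux.sun_family = AF_UNIX;
	strncpy(addr->ux.sun_path, path, sizeof(addr->ux.sun_path) - 1);
}
```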

The problem appeared in 3.0-dev with commit aa3632962f ("MEDIUM:
mworker: get rid of libsystemd"), though by then this file was only
built when USE_SYSTEMD was set, which was not the case for non-linux
platforms. However since 3.1-dev14 with commit 15845247db ("MEDIUM:
mworker: remove USE_SYSTEMD requirement for -Ws"), all platforms
now build this file.

No backport is needed even though it will not hurt to have it in 3.0
for completeness.

commit | commitdiff | tree

William Lallemand [Fri, 22 Nov 2024 16:03:09 +0000 (17:03 +0100)]

CI: github: add a WolfSSL job which tries the latest version

Like the AWS-LC job, add a CI job which looks for the latest WolfSSL
version and tries to build it.

The patch adds a function which determines the latest version of WolfSSL
from the github tag, and the yml which describes the job.

commit | commitdiff | tree

William Lallemand [Fri, 22 Nov 2024 15:23:44 +0000 (16:23 +0100)]

Revert "CI: update to the latest WolfSSL version"

This reverts commit 03f57fcf94dae61906b56d10d1fb21f7afaae4fc.

Looks like the 5.7.4 version is broken with HAProxy, let's revert the CI
for now.

commit | commitdiff | tree

Willy Tarreau [Fri, 22 Nov 2024 15:06:09 +0000 (16:06 +0100)]

DOC: ot: mention planned deprecation of the OT filter

Miroslav mentioned below that he's currently working on an OpenTelemetry
replacement for the OpenTracing filter since OpenTracing itself is no
longer maintained nor supported:

https://github.com/haproxy/haproxy/issues/2782#issuecomment-2493576327

Given that he aims for 3.2, let's already settle on an upcoming deprecation
of the filter for 3.3 with a removal for 3.5. This will leave time to finish
the development and permit users to switch smoothly. At this point no warning
is emitted (since the users have no alternative) but better mention this plan
in the doc to make them aware of future changes.

commit | commitdiff | tree

William Lallemand [Fri, 22 Nov 2024 15:05:32 +0000 (16:05 +0100)]

CI: update to the latest WolfSSL version

Update the CI to the 5.7.4 WolfSSL version.

William Lallemand [Fri, 22 Nov 2024 15:03:28 +0000 (16:03 +0100)]

CI: update to the latest AWS-LC version

Update the CI to the 1.39.0 AWS-LC version.

Frederic Lecaille [Fri, 22 Nov 2024 14:40:05 +0000 (15:40 +0100)]

BUG/MINOR: quic: Avoid BUG_ON() on ->on_pkt_lost() BBR callback call

The per-packet delivery rate sample is applied to ack-eliciting packets only,
by calling the ->drs_on_transmit() BBR callback. So ->on_pkt_lost(), which
inspects the delivery rate sampling information during packet loss detection,
must not be called for non ack-eliciting packets. Otherwise it would face
uninitialized variables, with a high chance of triggering a BUG_ON().

As BBR is only implemented in the current development version, there is
no need to backport this patch.
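
The guard described above can be sketched as follows. This is only an
illustration under simplifying assumptions: the tx_pkt and cc_state structures
and all function names are hypothetical stand-ins, not the real QUIC/BBR
types, and assert() plays the role of BUG_ON().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified packet descriptor (not HAProxy's real type). */
struct tx_pkt {
	bool ack_eliciting;   /* packet carries ack-eliciting frames */
	bool drs_initialized; /* delivery rate sampling info was recorded */
	size_t len;
};

/* Hypothetical congestion controller state. */
struct cc_state {
	size_t lost_bytes;
};

/* Stand-in for ->drs_on_transmit(): only ack-eliciting packets are sampled. */
static void pkt_on_transmit(struct tx_pkt *pkt)
{
	if (pkt->ack_eliciting)
		pkt->drs_initialized = true;
}

/* Stand-in for ->on_pkt_lost(): it relies on the sampling information. */
static void cc_on_pkt_lost(struct cc_state *cc, const struct tx_pkt *pkt)
{
	assert(pkt->drs_initialized); /* would be a BUG_ON() */
	cc->lost_bytes += pkt->len;
}

/* Loss detection: never run the callback on non ack-eliciting packets. */
static void handle_lost_pkt(struct cc_state *cc, const struct tx_pkt *pkt)
{
	if (!pkt->ack_eliciting)
		return;
	cc_on_pkt_lost(cc, pkt);
}
```

Without the early return in handle_lost_pkt(), a lost non ack-eliciting
packet would reach the assertion with drs_initialized still unset.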

Willy Tarreau [Fri, 22 Nov 2024 08:41:02 +0000 (09:41 +0100)]

BUILD: activity/memprofile: fix a build warning in the posix_memalign handler

A "return NULL" statement was placed for error handling in the
posix_memalign() handler instead of an int errno value, by recent
commit 5ddc8b3ad4 ("MINOR: activity/memprofile: monitor non-portable
calls as well"). Surprisingly the warning only triggered on gcc-4.8.
Let's use ENOMEM instead. No backport needed.
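
As a reminder of the calling convention involved, here is a minimal sketch of
a posix_memalign()-style wrapper. The name checked_memalign() and its max_size
limit are assumptions made for the example and do not come from the memprofile
code. The point is only that such a function returns an int error code, so the
failure path must return a value such as ENOMEM rather than NULL.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical wrapper following the posix_memalign() convention:
 * returns 0 on success, or an errno value (not a pointer) on failure.
 */
static int checked_memalign(void **ptr, size_t align, size_t size, size_t max_size)
{
	assert(ptr != NULL);

	if (size > max_size)
		return ENOMEM; /* int error code, never "return NULL" */

	return posix_memalign(ptr, align, size);
}
```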

Christopher Faulet [Thu, 21 Nov 2024 21:01:12 +0000 (22:01 +0100)]

BUG/MAJOR: mux-h1: Properly handle wrapping on obuf when dumping the first-line

The formatting of the first-line, for a request or a response, does not
properly handle the wrapping of the output buffer. This may lead to data
corruption for the current response or possibly for the previous one.

Utility functions used to format the first-line of the request or the
response rely on the chunk API. So it is not expected to pass a buffer that
wraps. Unfortunately, because of a change performed during the 2.9 dev cycle,
the output buffer was directly used instead of a non-wrapping buffer created
from it with the b_make() function. It is not an issue for the request because
its start-line is always the first block formatted in the output buffer. But
for the response, the output buffer may not be empty and may wrap. In that
case, the response start-line is dumped at a random position in the buffer,
corrupting data. AFAIK, it is only an issue if HTTP request pipelining
is used.

To fix the issue, we now take care to create a non-wrapping buffer from the
output buffer.

This patch should fix issues #2779 and #2996. It must be backported as far as
2.9.
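
The idea of the fix can be illustrated with a small self-contained sketch. The
struct ring type and the helpers below are hypothetical and much simpler than
HAProxy's buffer API; only the principle is the same: build a linear view over
the contiguous free room at the tail, let the non-wrapping formatting code
write there, then commit the written bytes to the real buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical ring buffer: <data> bytes stored from <head>, may wrap. */
struct ring {
	char *area;
	size_t size;
	size_t head;
	size_t data;
};

/* Offset of the first free byte, taking wrapping into account. */
static size_t ring_tail_ofs(const struct ring *r)
{
	return (r->head + r->data) % r->size;
}

/* Free bytes available at the tail without wrapping. */
static size_t ring_contig_space(const struct ring *r)
{
	size_t tail = ring_tail_ofs(r);

	if (r->data == r->size)
		return 0;              /* buffer is full */
	if (tail >= r->head)
		return r->size - tail; /* free room runs to the end of the area */
	return r->head - tail;         /* data wraps: free room stops at head */
}

/* Linear (non-wrapping) view over the contiguous free room. */
static struct ring ring_linear_view(struct ring *r)
{
	struct ring v = { r->area + ring_tail_ofs(r), ring_contig_space(r), 0, 0 };
	return v;
}

/* Chunk-style append: only valid on a buffer that does not wrap. */
static int linear_append(struct ring *v, const char *s)
{
	size_t len = strlen(s);

	assert(v->head == 0);
	if (len > v->size - v->data)
		return 0;
	memcpy(v->area + v->data, s, len);
	v->data += len;
	return 1;
}

/* Account in the real buffer for what was written through the view. */
static void ring_commit(struct ring *r, const struct ring *v)
{
	r->data += v->data;
}
```

Passing the wrapping buffer itself to linear_append() would write at
area + data, which is not the tail once the buffer has wrapped.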

Willy Tarreau [Thu, 21 Nov 2024 22:26:41 +0000 (23:26 +0100)]

[RELEASE] Released version 3.1-dev14

Released version 3.1-dev14 with the following main changes :
    - MINOR: acl: export find_acl_default()
    - MINOR: sample: extend the "when" converter to support an ACL
    - MINOR: cfgparse: parse tune.{rcvbuf,sndbuf}.{client,server} as sizes
    - MINOR: cfgparse: parse tune.{rcvbuf,sndbuf}.{frontend,backend} as sizes
    - MINOR: cfgparse: parse tune.pipesize as a size
    - MINOR: cfgparse: parse tune.recv_enough as a size
    - MINOR: cfgparse: parse tune.bufsize as a size
    - MINOR: cfgparse: parse tune.bufsize.small as a size
    - REGTESTS: silence the "log format ignored" warnings
    - REGTESTS: silence warning "previous 'http-response' action is final"
    - REGTESTS: make the unit explicit for very short timeouts
    - REGTESTS: silence warnings about content-type being ignored
    - REGTESTS: remove a duplicate "option httpslog" in the defaults section
    - REGTESTS: silence warning "L6 sample fetches ignored" in cond_set_var
    - REGTESTS: add missing timeouts to 30 tests
    - REGTESTS: only use tune.ssl.default-dh-param when not using AWS-LC
    - REGTESTS: enable -dW on almost all tests to fail on warnings
    - MEDIUM: config: warn on unitless timeouts < 100 ms
    - MINOR: tools: make parse_size_err() support 32/64 bits
    - MINOR: ring: support unit suffixes in the size
    - MINOR: cfgparse-global: parse options to allow non std keywords in discovery mode
    - BUG/MINOR: mworker-prog: don't warn about deprecated section with expose-deprecated-directives
    - MINOR: cli: make "show env" accessible via master CLI without enabling debug
    - MINOR: config: show HAPROXY_BRANCH in "show env" output
    - MINOR: http-ana: Add option to keep query-string on a localtion-based redirect
    - MINOR: http-ana: Add support for "set-cookie-fmt" option to redirect rules
    - MINOR: agent-check: Be able to set absolute weight via an agent
    - MINOR: stream: Add an option to "show sess" command to dump the captured URI
    - DOC: config: A a space before ':' for {bs,fs}.aborted and {bs,fs}.rst_code
    - DOC: config: Fix a typo in "1.3.1. The Request line"
    - MINOR: http: Add support for HTTP 414/431 status codes
    - DEV: phash: Update 414 and 431 status codes to phash
    - MINIR: mux-h1: Return 414 or 431 when appropriate
    - BUG/MINOR: http_ana: Report -1 for %Tr for invalid response only
    - DOC: config: Slightly improve the %Tr documentation
    - DOC: config: Move wait_end in section about internal samples
    - DOC: config: Move fs.* and bs.* in section about L5 samples
    - MINOR: stats-file: add the filename in the warning
    - MEDIUM: stats-file: explicitely ignore comments starting by //
    - DOC: quic: rename max-window-size as with default prefix
    - MINOR: mux-quic: add missing values for show flags
    - MINOR: quic: simplify qc_prep_pkts() exit path
    - MINOR: quic: support a max number of built packet per send iteration
    - MINOR: quic: extend qc_send_mux() return type with a dedicated enum
    - MINOR: quic: define quic_pacing module
    - MINOR: quic/pacing: implement quic_pacer engine
    - MINOR: quic/pacing: support pacing emission on quic_conn layer
    - MINOR: quic/pacing: add burst support
    - MINOR: mux-quic: define a tx STREAM frame list member
    - MINOR: mux-quic: encapsulate QCC tasklet wakeup
    - MAJOR: mux-quic: support pacing emission
    - MINOR: quic: use dynamic cc_algo on bind_conf
    - MINOR: quic: extend quic-cc-algo optional parameters
    - MEDIUM: quic: define cubic-pacing congestion algorithm
    - MINOR: mux_quic/pacing: display pacing info on show quic
    - MEDIUM: stats-file: silently ignore be/fe mistmatch
    - REGTESTS: use -dW by default on every reg-tests
    - DOC: lua: fix yield-dependent methods expected contexts
    - DOC: sched: add missing scheduler API documentation for tasklet_wakeup_after()
    - DOC: sched: document the missing TASK_F_UEVT* flags
    - CLEANUP: tinfo: move sched_*_date/*_mono_time to the thread-local area
    - MINOR: stream: don't update s->lat_time when the wakeup date is not set
    - MINOR: tinfo/clock: turn sched_call_date to 64-bits
    - MINOR: sched: add TASK_F_WANTS_TIME to make the scheduler update the call date
    - MINOR: tools: add new macro DEFZERO to provide a default zero argument
    - MINOR: tasklet: make the low-level tasklet API take a flag
    - MINOR: tasklet: support an optional set of wakeup flags to tasklet_wakeup_on()
    - DOC: configuration: explain the rules regarding spaces in arguments
    - DOC: configuration: explain quotes and spaces in conditional blocks
    - DOC: configuration: wrap long line for "strstr()" conditional expression
    - BUG/MINOR: http-ana: Adjust the server status before the L7 retries
    - MINOR: http-fetch: Add an option to 'query" to get the QS with the '?'
    - BUG/MINOR: cfgparse-quic: fix renaming of max-window-size
    - MEDIUM: mworker: remove USE_SYSTEMD requirement for -Ws
    - CI: vtest: temporarily build from the sd-notify PR
    - MINOR: systemd: replace SOCK_CLOEXEC by fcntl call to FD_CLOEXEC
    - BUILD: makefile: make ERR apply to build options as well
    - MINOR: startup: set HAPROXY_LOCALPEER only once
    - DOC: configuration: update "Environment variables" chapter
    - DOC: config: indent the list of environment variables
    - OPTION: map/hlua: make core.set_map() lookup more efficient
    - REGTESTS: switch to -Ws for master-worker reg-tests
    - REGTESTS: disable temporarly mworker test on OSX
    - MINOR: quic: Add the congestion window initial value to QUIC path
    - MINOR: window_filter: Implement windowed filter (only max)
    - MINOR: quic: implement delivery rate sampling algorithm
    - MINOR: quic: implement BBR congestion control algorithm for QUIC
    - MINOR: quic: quic_cc modifications to support BBR
    - MINOR: quic: quic_loss modifications to support BBR
    - MINOR: quic: RX part modifications to support BBR
    - MINOR: quic: TX part modifications to support BBR.
    - MINOR: quic: add "bbr" new "quic-cc-algo" option
    - BUG/MEDIUM: mux-h2: Increase max number of headers when encoding HEADERS frames
    - BUG/MEDIUM: mux-h2: Check the number of headers in HEADERS frame after decoding
    - BUG/MEDIUM: h3: Properly limit the number of headers received
    - BUG/MEDIUM: h3: Increase max number of headers when sending headers
    - DOC: config: Improve documentation of tune.http.maxhdr directive
    - DOC: management: Clearly state "show errors" only reports malformed H1 messages
    - BUILD: makefile: build flags.c before haproxy to speed up the build
    - BUILD: makefile: reorder object files by build time
    - MINOR: config: Improve warnings on misplaced rules by adding an optional arg
    - CLEANUP: cfgparse: Add direction in functions name that warn on misplaced rules
    - MINOR: cfgparse: Emit a warning for misplaced "tcp-response content" rules
    - BUG/MINOR: cfgparse-quic: fix bbr initialization
    - MINOR: cfgparse-quic: activate pacing only via burst argument
    - MINOR: quic: Useless rate sample member initialization
    - BUG/MINOR: cfgparse-quic: fix warning for cc-aglo with 0 burst
    - MINOR: quic: support pacing for newreno and nocc
    - BUG/MINOR: quic: Missing application limitations tracking for BBR
    - MINOR: cfgparse-global: add cfg_parse_global_chroot
    - MINOR: cfgparse-global: add more checks for "chroot" argument
    - BUG/MINOR: startup: fix UAF when set the default for log_tag
    - MINOR: capabilities: rename program_name argument to progname
    - MINOR: startup: use global progname variable
    - MINOR: cfgparse-global: add cfg_parse_global_localpeer
    - BUG/MINOR: config: allow to check HAPROXY_LOCALPEER in config
    - BUG/MINOR: startup: init_early: remove obsolete comment
    - BUG/MEDIUM: debug: don't set the STUCK flag from debug_handler()
    - BUG/MEDIUM: wdt: fix the stuck detection for warnings
    - BUG/MINOR: activity/memprofile: reinitialize the free calls on DSO summary
    - MINOR: activity/memprofile: offer a function to unregister stale info
    - BUG/MEDIUM: pools/memprofile: always clean stale pool info on pool_destroy()
    - MINOR: activity: better report nil than ffff in unknown callers
    - CLEANUP: activity: better use a mask to tests freeing methods
    - MINOR: activity/memprofile: also monitor strdup() activity
    - MINOR: activity/memprofile: monitor non-portable calls as well
    - MINOR: activity: interrupt the show profile dump more often
    - MINOR: tools: resolve main() only once in resolve_sym_name()
    - MINOR: tools: add a new function "resolve_dso_name" to find a symbol's DSO
    - MINOR: activity/memprofile: use resolve_dso_name() for the DSO summary
    - REGTESTS: relax strerror matching to avoid a failure on libmusl
    - REGTESTS: don't rely on the base64 utility when openssl base64 is already used

Willy Tarreau [Thu, 21 Nov 2024 19:59:36 +0000 (20:59 +0100)]

REGTESTS: don't rely on the base64 utility when openssl base64 is already used

Regtest ocsp_auto_update.vtc used to fail here on FreeBSD because the
base64 utility was not installed by default. Once installed it would
still fail because the utility doesn't support -w to wrap lines. Since
the regtest already relies on openssl base64 for a few commands, let's
just rely on it for the other ones. The only limitation is that openssl
freezes on lines longer than 1024 bytes, and doesn't seem to process more
than 255 chars at once, which might be the reason for using base64 -w 1000
in the first place (the script was probably tested like this). Instead
sed is efficient at wrapping long lines and does the job pretty well.
The output was fixed at 72 chars so that the output is also readable on
a terminal for debugging.
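
The regtest does the wrapping with sed. For illustration only, the same
fixed-width folding can be sketched in C as below; wrap_lines() is a
hypothetical helper written for this example and is not part of the regtest
or of HAProxy.

```c
#include <assert.h>
#include <stddef.h>

/* Copy <in> to <out>, inserting a line feed after every <width> characters.
 * Returns the output length, or 0 if <out> is too small (an empty input
 * also yields 0).
 */
static size_t wrap_lines(const char *in, size_t width, char *out, size_t out_size)
{
	size_t i, o = 0;

	assert(width > 0 && out_size > 0);

	for (i = 0; in[i] != '\0'; i++) {
		if (i > 0 && i % width == 0) {
			if (o + 1 >= out_size)
				return 0;
			out[o++] = '\n';
		}
		if (o + 1 >= out_size)
			return 0;
		out[o++] = in[i];
	}
	out[o] = '\0';
	return o;
}
```

Calling it with a width of 72 gives the same line length as the one chosen
in the regtest.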

Willy Tarreau [Thu, 21 Nov 2024 19:26:46 +0000 (20:26 +0100)]

REGTESTS: relax strerror matching to avoid a failure on libmusl

The regtest 4be_1srv_smtpchk_httpchk_layer47errors.vtc fails on musl
because it reports "Network unreachable" for -ENETUNREACH while the
check matches "Network is unreachable" as on other OSes. Let's just
replace " is" with ".*". It now works on both glibc and musl.
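
The effect of the relaxed pattern can be checked with a few lines of C. This
sketch uses POSIX regcomp()/regexec() as a stand-in, which is an assumption:
the regtest itself is matched by the test tool, not by this code, and
msg_matches() is a hypothetical helper.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Returns 1 if <msg> matches the POSIX extended regex <pattern>, else 0. */
static int msg_matches(const char *pattern, const char *msg)
{
	regex_t re;
	int ret;

	assert(pattern != NULL && msg != NULL);

	if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
		return 0;
	ret = regexec(&re, msg, 0, NULL, 0);
	regfree(&re);
	return ret == 0;
}
```

With "Network.*unreachable" both the glibc and the musl wording match, while
the stricter "Network is unreachable" only matches the glibc one.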

Willy Tarreau [Thu, 21 Nov 2024 14:16:37 +0000 (15:16 +0100)]

MINOR: activity/memprofile: use resolve_dso_name() for the DSO summary

Let's simplify the code by making use of this simpler and sometimes
more efficient variant.

Willy Tarreau [Thu, 21 Nov 2024 14:15:53 +0000 (15:15 +0100)]

MINOR: tools: add a new function "resolve_dso_name" to find a symbol's DSO

In the memprofile summary per DSO, we currently pay a high price by
calling dladdr() on each symbol when doing the summary per DSO at the
end, even though we're not interested in these details: we just want the
DSO name, which can be made cheaper to obtain and easier to manipulate. So
let's create resolve_dso_name() to only extract minimal information from
an address. At the moment it still uses dladdr() though it avoids all the
extra expensive work, and will further be able to leverage the same
mechanism as "show libs" to instantly spot DSO from address ranges.

Willy Tarreau [Thu, 21 Nov 2024 13:14:49 +0000 (14:14 +0100)]

MINOR: tools: resolve main() only once in resolve_sym_name()

resolve_sym_name() calls dladdr(main) for each symbol in order to compare
the first address with other symbols. But this is pointless and quite
expensive in outputs to "show profiling" for example. Let's just keep a
local copy and have a variable indicating if the resolution is needed/
in progress/done to save the value for subsequent calls.
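
The "resolve once" pattern with a needed/in progress/done variable can be
sketched as follows. Everything here is hypothetical: expensive_lookup()
merely stands in for the costly dladdr(main) call and counts its invocations,
and the names do not come from tools.c.

```c
#include <assert.h>
#include <stddef.h>

enum resolve_state { RES_NEEDED = 0, RES_IN_PROGRESS, RES_DONE };

static int expensive_calls; /* how many times the slow lookup really ran */

/* Hypothetical stand-in for the costly resolution of main()'s address. */
static const void *expensive_lookup(void)
{
	static const char anchor = 0;

	expensive_calls++;
	return &anchor;
}

/* Returns the cached result, resolving it on the first call only. Returns
 * NULL if called again while the resolution is still in progress.
 */
static const void *get_main_base(void)
{
	static enum resolve_state state = RES_NEEDED;
	static const void *cached;

	if (state == RES_NEEDED) {
		state = RES_IN_PROGRESS; /* prevents a nested resolution */
		cached = expensive_lookup();
		assert(cached != NULL);
		state = RES_DONE;
	}
	return state == RES_DONE ? cached : NULL;
}
```

This sketch is not thread-safe; it only shows how later calls reuse the saved
value instead of repeating the lookup.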

Mirror of https://github.com/haproxy/haproxy.git