Willy Tarreau [Fri, 19 Jan 2024 16:23:07 +0000 (17:23 +0100)]
MEDIUM: stick-tables: add a new stored type for glitch_cnt and glitch_rate
This adds a new pair of stored types in the stick-tables:
- glitch_cnt
- glitch_rate
These keep count of the number of glitches reported on a front connection,
in order to decide how to act with a badly defective client or a potential
attacker. For now nothing updates these counters, but all the infrastructure
needed to configure, update and retrieve them was added, including the doc.
No regtest was added yet since they're not filled yet.
Willy Tarreau [Fri, 19 Jan 2024 15:55:17 +0000 (16:55 +0100)]
DOC: internal: update missing data types in peers-v2.0.txt
This is apparently the only location where the stored data types are
documented, but it was quite outdated as it stopped at gpc1 rate. This
patch adds the missing types (up to and including gpc_rate).
Willy Tarreau [Thu, 8 Feb 2024 13:37:56 +0000 (14:37 +0100)]
MINOR: mux-h2: count late reduction of INITIAL_WINDOW_SIZE as a glitch
It's quite uncommon for a client to decide to change the connection's
initial window size after the settings exchange phase, unless it tries
to increase it. One of the impacts is that it updates all
streams, which can be expensive depending on the stacks, and may even
be used to construct an attack. For this reason, we now count a glitch
when this happens.
A test with h2spec shows that it triggers 9 glitches across a full test.
Willy Tarreau [Fri, 19 Jan 2024 17:20:21 +0000 (18:20 +0100)]
MINOR: mux-h2: count excess of CONTINUATION frames as a glitch
Here we consider that if a HEADERS frame is made of more than 4 fragments
whose average size is lower than 1kB, that's very likely an abuse so we
count a glitch per 16 fragments, which means 1 glitch per 1kB frame in a
16kB buffer. This means that an abuser sending 1600 1-byte frames would
increase the counter by 100, and that sending 100 headers per request in
individual frames each results in a count of ~7 to be added per request.
A test consisting in sending 100M requests made of 101 frames each over
a connection resulted in ~695M glitches to be counted for this connection.
Note that no special care is taken to avoid wrapping since it already takes
a very long time to reach 100M and there's no particular impact of wrapping
here (roughly 1M/s).
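The counting rule described above can be sketched as follows. This is only an
illustration and not the code from the patch: the function name
sk_continuation_glitches() and its parameters are hypothetical, and rounding
the fragment count up to the next multiple of 16 is an assumption, chosen
because it matches the two figures quoted above (1600 fragments give 100,
101 fragments give 7).

```c
#include <stddef.h>

/* Hypothetical helper: returns the number of glitches to account for a
 * HEADERS block received as <fragments> frames totalling <total_len> bytes.
 * Assumption: one glitch per started group of 16 fragments.
 */
static unsigned int sk_continuation_glitches(unsigned int fragments, size_t total_len)
{
    /* up to 4 fragments is considered normal */
    if (fragments <= 4)
        return 0;

    /* an average of 1kB or more per fragment looks legitimate */
    if (total_len / fragments >= 1024)
        return 0;

    return (fragments + 15) / 16;
}
```

With this rule, a full 16kB buffer filled with 1kB fragments is not counted,
while many tiny fragments are counted in proportion to their number.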
Willy Tarreau [Thu, 8 Feb 2024 14:01:36 +0000 (15:01 +0100)]
BUG/MINOR: mux-h2: count rejected DATA frames against the connection's flow control
RFC9113 clarified a point regarding the payload from DATA frames sent to
closed streams. It must always be counted against the connection's flow
control. In practice it should have no visible effect, but if
repeated upload attempts are aborted, this might cause the client's
window to progressively shrink since the discarded data is never ACKed.
It's probably not necessary to backport this, unless another patch
depends on it.
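A minimal sketch of the accounting rule, assuming hypothetical structures
(sk_h2_conn, sk_h2_stream) and field names that are not taken from the mux-h2
code: the connection-level counter is always updated, even when the payload is
rejected because the stream is closed.

```c
#include <stdint.h>

/* Hypothetical connection state: bytes received that must be returned to
 * the peer in a later connection-level WINDOW_UPDATE.
 */
struct sk_h2_conn {
    uint32_t to_ack;
};

/* Hypothetical stream state */
struct sk_h2_stream {
    int closed;
    uint32_t to_ack;
};

/* Accounts for a DATA frame of <len> bytes. Returns 1 if the payload is
 * accepted, 0 if it is rejected because the stream is closed.
 */
static int sk_h2_recv_data(struct sk_h2_conn *c, struct sk_h2_stream *s, uint32_t len)
{
    /* always counted against the connection's flow control */
    c->to_ack += len;

    if (s->closed)
        return 0; /* payload dropped, but the connection window is refilled */

    s->to_ack += len;
    return 1;
}
```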
MINOR: stream: rename "txn.redispatch" to "txn.redispatched"
The fetch will return true if the stream was redispatched: this is a
past action, thus we rename the fetch to better reflect its true
meaning and prevent confusions.
Documentation was updated.
While at it, the fetch was moved from internal states section to Layer 4
section, which is where it belongs.
An extra space was placed at the start of the "bytes_out" description,
and dconv had a hard time properly rendering the text in HTML
format because of that.
Finally, remove an extra line feed.
This should be backported in 2.9 with c7424a1ba ("MINOR: samples:
implement bytes_in and bytes_out samples")
REGTESTS: ssl: Fix empty line in cli command input
The 'set ssl cert' command was failing because of empty lines in the
contents of the PEM file used to perform the update.
We were also missing the issuer in the newly created ckch_store, which
then raised an error when committing the transaction.
BUG/MINOR: ssl: Reenable ocsp auto-update after an "add ssl crt-list"
If a certificate that has an OCSP uri is unused and gets added to a
crt-list with the ocsp auto update option "on", it would not have been
inserted into the auto update tree because this insertion was only
working on the first call of the ssl_sock_load_ocsp function.
If the configuration used a crt-list like the following:
cert1.pem *
cert2.pem [ocsp-update on] *
Then calling "del ssl crt-list" on the second line and reverting
the delete by calling "add ssl crt-list" with the same line would
leave cert2.pem out of the ocsp update list (this can be checked
with the "show ssl ocsp-updates" command).
This patch ensures that in such a case we still perform the insertion in
the update tree.
BUG/MINOR: ssl: Destroy ckch instances before the store during deinit
The ckch_store's free'ing function might end up calling
'ssl_sock_free_ocsp' if the corresponding certificate had ocsp data.
This ocsp cleanup function expects the 'refcount_instance' member of
the certificate_ocsp structure to be 0, meaning that no live
ckch instance kept a reference on this certificate_ocsp structure.
But since in ckch_store_free we were destroying the ckch_data before
destroying the linked instances, the BUG_ON would fail during a standard
deinit. Reversing the cleanup order fixes the problem.
BUG/MEDIUM: ocsp: Separate refcount per instance and per store
With the current way OCSP responses are stored, a single OCSP response
is stored (in a certificate_ocsp structure) when it is loaded during a
certificate parsing, and each ckch_inst that references it increments
its refcount. The reference to the certificate_ocsp is actually kept in
the SSL_CTX linked to each ckch_inst, in an ex_data entry that gets
freed when the context is freed.
One of the downsides of this implementation is that if every ckch_inst
referencing a certificate_ocsp gets destroyed, then the OCSP response is
removed from the system. So if we were to remove all crt-list lines
containing a given certificate (that has an OCSP response), the response
would be destroyed even if the certificate remains in the system (as an
unused certificate). In such a case, we would want the OCSP response not
to be "usable", since it is not used by any ckch_inst, but still remain
in the OCSP response tree so that if the certificate gets reused (via an
"add ssl crt-list" command for instance), its OCSP response is still
known as well. But we would also like such an entry not to be updated
automatically anymore once no instance uses it. An easy way to do it
could have been to keep a reference to the certificate_ocsp structure in
the ckch_store as well, on top of all the ones in the ckch_instances,
and to remove the ocsp response from the update tree once the refcount
falls to 1, but it would not work because of the way the ocsp response
tree keys are calculated. They are decorrelated from the ckch_store and
are the actual OCSP_CERTIDs, which is a combination of the issuer's name
hash and key hash, and the certificate's serial number. So two copies of
the same certificate but with different names would still point to the
same ocsp response tree entry.
The solution that answers all the needs expressed above is actually
to have two reference counters in the certificate_ocsp structure, one
for the actual ckch instances and one for the ckch stores. If the
instance refcount becomes 0 then we remove the entry from the auto
update tree, and if the store reference becomes 0 we can then remove the
OCSP response from the tree. This would allow to chain some "del ssl
crt-list" and "add ssl crt-list" CLI commands without losing any
functionality.
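The two-counter idea can be sketched as below. This is a simplified model and
not the actual structure: only 'refcount_instance' is named in this log, the
'refcount_store' name and the two boolean fields standing for tree membership
are assumptions, and re-inserting into the update tree ignores the
"ocsp-update" mode for brevity.

```c
/* Simplified, hypothetical model of a certificate_ocsp entry */
struct sk_certificate_ocsp {
    int refcount_store;    /* references held by ckch stores (name assumed) */
    int refcount_instance; /* references held by ckch instances */
    int in_update_tree;    /* 1 while scheduled for auto update */
    int in_response_tree;  /* 1 while the OCSP response is known */
};

/* A ckch instance starts using the response */
static void sk_ocsp_take_instance(struct sk_certificate_ocsp *o)
{
    if (o->refcount_instance++ == 0)
        o->in_update_tree = 1;
}

/* A ckch instance is destroyed: stop auto updates once nobody uses it */
static void sk_ocsp_release_instance(struct sk_certificate_ocsp *o)
{
    if (o->refcount_instance > 0 && --o->refcount_instance == 0)
        o->in_update_tree = 0;
}

/* A ckch store is destroyed: forget the response once no store needs it */
static void sk_ocsp_release_store(struct sk_certificate_ocsp *o)
{
    if (o->refcount_store > 0 && --o->refcount_store == 0)
        o->in_response_tree = 0;
}
```

In this model, deleting every crt-list line only stops the updates, and the
response stays known until the certificate itself is removed.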
BUG/MINOR: ssl: Clear the ckch instance when deleting a crt-list line
When deleting a crt-list line through a "del ssl crt-list" call on the
CLI, we ended up free'ing the corresponding ckch instances without fully
clearing their contents. It left some dangling references on other
objects because the attached SSL_CTX was not deleted, as well as all the
ex_data referenced by it (OCSP responses for instance).
MINOR: ssl: Use OCSP_CERTID instead of ckch_store in ckch_store_build_certid
The only useful information taken out of the ckch_store in order to copy
an OCSP certid into a buffer (later used as a key for entries in the
OCSP response tree) is the ocsp_certid field of the ckch_data structure.
We then don't need to pass a pointer to the full ckch_store to
ckch_store_build_certid or even any information related to the store
itself.
The ckch_store_build_certid is then converted into a helper function
that simply takes an OCSP_CERTID and converts it into a char buffer.
BUG/MINOR: ssl: Duplicate ocsp update mode when dup'ing ckch
When calling ckchs_dup (during a "set ssl cert" CLI command), if the
modified store had OCSP auto update enabled then the new certificate
would not keep the previous update mode and would not appear in the auto
update list.
MINOR: applet: Use an option to disable zero-copy forwarding for all applets
At the beginning of the 3.0-dev cycle, the zero-copy forwarding support was
added only for the cache applet with an option to disable it. This was a
hack, waiting for a better integration with applets. It is now possible to
implement the zero-copy forwarding for any applets. So the specific option
for the cache applet was renamed to be used for all applets. And this option
is now also checked for the stats applet.
Concretely, 'tune.cache.zero-copy-forwarding' was renamed to
'tune.applet.zero-copy-forwarding'.
MINOR: cache: Remove unused .data_sent field from the cache applet context
This field was introduced when the first implementation of the zero-copy
forwarding was added. It is now useless. However, we must still save the
body-size of the object in the cache.
MAJOR: stats: Send stats dump over HTTP using zero-copy forwarding
Just like for the cache applet, it is now possible to send the response to the
opposite side using the zero-copy forwarding. Internal functions were
slightly updated but there is nothing special to say, except that the requested
size during the nego stage is not exact.
MEDIUM: mux-h1: Support zero-copy forwarding for chunks with an unknown size
Till now, for chunked messages, the H1 mux used the size requested during
the zero-copy forwarding negotiation as the chunk size. And till now, this
was accurate because the requested size was indeed the chunk size on the
producer side.
But this will be a problem to implement the zero-copy forwarding on some
applets because the content size is not known during the nego but only when
it is produced. Thanks to previous patches, it is now possible to know the
requested size is not exact and we are able to reserve a larger space to
write the chunk size later, in h1_done_ff(), with some padding.
MINOR: mux-h1: Stop zero-copy forwarding during nego for too big requested size
Now, during the zero-copy forwarding negotiation, when the requested size is
exact, we are able to check if it is bigger than the expected one or
not. If it is indeed bigger than expected, the zero-copy forwarding is
disabled and the error will be triggered later on the normal sending path.
MEDIUM: stconn: Notify requested size during zero-copy forwarding nego is exact
It is now possible to use a flag during zero-copy forwarding negotiation to
specify that the requested size is exact. It means the producer really expects to
receive at least this amount of data.
It can be used by consumer to prepare some processing at this stage, based
on the requested size. For instance, in the H1 mux, it is used to write the
next chunk size.
MINOR: mux-h1: Be able to define the length of a chunk size when it is prepended
It is now possible to impose the length used to represent the chunk size in the
function used to prepend the chunk size in a buffer (so before the chunk
itself). It is thus possible to reserve a specific space for an unknown
chunk size and pad it with leading '0' to use all the space and avoid
holes.
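As an illustration of the padding technique, here is a minimal sketch that
writes a chunk size on a fixed number of hexadecimal digits. The function name
sk_write_chunk_size() is hypothetical and this is not the H1 mux code. Leading
zeros are valid here because the HTTP/1.1 grammar defines a chunk size as one
or more hex digits.

```c
#include <stddef.h>

/* Writes <size> at <dst> as exactly <digits> lowercase hex digits,
 * left-padded with '0', followed by CRLF. <dst> must have room for
 * <digits> + 2 bytes. Returns the number of bytes written, or 0 if
 * <size> does not fit in <digits> digits (in which case the digit area
 * may have been overwritten).
 */
static size_t sk_write_chunk_size(char *dst, size_t size, size_t digits)
{
    static const char hex[] = "0123456789abcdef";
    size_t i;

    /* fill the reserved area from the last digit to the first one */
    for (i = digits; i > 0; i--) {
        dst[i - 1] = hex[size & 0xF];
        size >>= 4;
    }

    if (size != 0)
        return 0; /* the value needs more than <digits> digits */

    dst[digits] = '\r';
    dst[digits + 1] = '\n';
    return digits + 2;
}
```

A producer that does not know the chunk size in advance can thus reserve
<digits> + 2 bytes, append the payload, then come back and fill the reserved
area once the real size is known.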
MINOR: stconn: Add support for flags during zero-copy forwarding negotiation
During zero-copy forwarding negotiation, a pseudo flag was already used to
notify the consumer if the producer is able to use kernel splicing or not. But
this was not extensible. So, now we use a true bitfield to be able to pass flags
during the negotiation. NEGO_FF_FL_* flags may be used now.
Of course, for now, there is only one flag, the kernel splicing support on
producer side (NEGO_FF_FL_MAY_SPLICE).
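A minimal sketch of such a bitfield follows. Only the NEGO_FF_FL_MAY_SPLICE
name comes from this log. Its numeric value, the NEGO_FF_FL_NONE name, the
SK_NEGO_FF_FL_EXACT_SIZE flag and the helper function are assumptions made for
the example.

```c
/* Flag values are assumed for the example */
#define NEGO_FF_FL_NONE          0x00000000u
#define NEGO_FF_FL_MAY_SPLICE    0x00000001u /* producer may use kernel splicing */
#define SK_NEGO_FF_FL_EXACT_SIZE 0x00000002u /* hypothetical second flag */

/* Hypothetical consumer-side check: splicing is used only if both sides
 * support it. Returns 1 or 0.
 */
static int sk_consumer_may_splice(unsigned int flags, int consumer_supports_splice)
{
    return consumer_supports_splice && (flags & NEGO_FF_FL_MAY_SPLICE);
}
```

Since each flag owns one bit, new flags can be added later without changing
the prototype of the negotiation functions.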
MEDIUM: cache: Temporarily remove zero-copy forwarding support
The cache applet will be refactored to use its own buffer. Thus, for now,
the zero-copy forwarding support is removed and it will be reintroduced
later.
MAJOR: stats: Update HTTP stats applet to handle its own buffers
The HTTP stats applet and all its internal functions were adapted to use the
applet's own buffers instead of the channels' ones. The CLI part was not
refactored yet, thus there are still some accesses to channels in the file. But
for the HTTP part, we no longer use the channels at all.
To do so, the HTTP stats applet now uses the default .rcv_buf and .snd_buf
callback functions. In addition, it sets appctx flags instead of SE ones.
MEDIUM: stats: Don't interrupt processing on partial post
We no longer test the opposite stream-connector to detect aborted partial
post. Applets must not try to access info outside their scope. This makes
the code more sensitive to changes and it is a common source of bugs.
Tests on the sedesc flags at the beginning of the I/O handler should be
enough.
MEDIUM: applet: Add support for zero-copy forwarding from an applet
Thanks to this patch, it is possible for an applet to directly send data to
the opposite endpoint. To do so, it must implement the <fastfwd> appctx callback
function and set the SE_FL_MAY_FASTFWD flag.
Everything will be handled by appctx_fastfwd() function. The applet is only
responsible to transfer data. If it sets <to_forward> value, it is used to
limit the amount of data to forward.
MINOR: applet: Add callback function to deal with zero-copy forwarding
This patch introduces the support for the callback function responsible to
produce data via the zero-copy forwarding mechanism. There is no
implementation for now. But the <to_forward> field was added in the appctx
structure to let an applet inform how much data it wants to forward. It is
not mandatory but it will be used during the zero-copy forwarding
negotiation.
MEDIUM: applet: Use appctx flags to report EOS/EOI/ERROR to SE
We have introduced flags to deal with end of input, end of stream and errors
at the applet level. With this patch we make the link with the endpoint
descriptor.
In appctx_rcv_buf(), applet flags are converted to SE flags.
MINOR: applet: Add an appctx flag to report shutdown to applets
There is no shutdown for reads and send with applets. Both are performed
when the appctx is released. So instead of 2 flags, like for
muxes/connections, only one flag is used. But the idea is the same:
acknowledge the event at the applet level.
MINOR: applet: Remove appctx state field to only use the flags
The appctx state was never really used as a state. It is only used to know
when an applet should be freed on the next wakeup. This can be converted to
a flag and the state can be removed. This is what this patch does.
MINOR: applet: Add flags to deal with ends of input, ends of stream and errors
Dedicated appctx flags to report EOI, EOS and errors (pending or terminal) were
added with the functions to set these flags. It is pretty similar to what is
done in most muxes.
MINOR: applet: Add flags on the appctx and stop abusing its state
Till now, we've extended the appctx state to add some flags. However, the
field name is misleading. So a bitfield was added to handle real flags. And
helper functions to manipulate this bitfield were added.
MEDIUM: applet: Add the applet handler based on IN/OUT buffers
A dedicated function to run applets was introduced, in addition to the old
one, to deal with applets that use their own buffers. The main difference
here is that this handler does not use channels at all. It performs a
synchronous send before calling the applet and performs a synchronous
receive just after.
MEDIUM: stconn: Add functions to handle applets I/O from the SC layer
There is no tasklet to handle I/O subscriptions for applets, but functions
to deal with receives and sends from the SC layer were added. It means a
function to retrieve data from an applet with this synchronous version and a
function to push data to an applet with this synchronous version.
It is pretty similar to the functions used for muxes but there are some
differences. So for now, we keep them separated.
Zero-copy forwarding is not supported for now. In addition, there is no
subscription mechanism.
MINOR: applet: Implement default functions to exchange data with channels
In this patch, we add default functions to copy data from a channel to the
<inbuf> buffer of an applet (appctx_rcv_buf) and another one to copy data
from the <outbuf> buffer of an applet to a channel (appctx_snd_buf).
These functions are not used for now, but they will be used by applets to
define their <rcv_buf> and <snd_buf> callback functions. Of course, it will
be possible for a specific applet to implement its own functions but these
ones should be good enough for most of applets. HTX and RAW buffers are
supported.
MINOR: applet: Add support for callback functions to exchange data with channels
For now, it is not usable, but this patch introduces the support of callback
functions, in the applet structure, to exchange data between channels and
applets. It is pretty similar to callback functions defined by muxes.
MINOR: applet: Add dedicated IN/OUT buffers for appctx
It is the first patch of a series aimed to align applets on connections.
Here, dedicated buffers are added for applets. For now, buffers are
initialized and helper functions to deal with allocation are added. In
addition, flags to report allocation failures or full buffers are also
introduced. <inbuf> will be used to push data to the applet from the stream
and <outbuf> will be used to push data from the applet to the stream.
MINOR: stconn: Be prepared to handle error when a SC is attached to an applet
sc_attach_applet() was changed to be able to fail and callers were updated
accordingly. For now it cannot fail but if this changes, callers will be
prepared to handle errors.
MINOR: stconn: Explicitly use an appctx to attach a stconn on it
In sc_attach_applet, an untyped pointer (void *) was used to attach a SC on
an applet. There is no reason to not use the right type here. So now a
pointer on an appctx is explicitly used.
MINOR: stconn: Be able to detect applets using HTX
IS_HTX_SC() macro is only usable if the stream-connector is attached to a
connection. It is a bit restrictive because this cannot work if the SC is
attached to an applet. So let's fix that by adding the support of applets
too.
MINOR: task: Move wait_event in the task header file
wait_event structure was in connection header file because it is only used
by connections and muxes. But, this may change. For instance applets may be
good candidates to use it too. So, the structure is moved to the task header
file instead.
BUG/MINOR: quic: fix possible integer wrap around in cubic window calculation
Avoid loss of precision when computing K cubic value.
Same issue when computing the congestion window value from the cubic increase
function formula, with a possible integer variable wrap around.
Depends on this commit:
MINOR: quic: Code clarifications for QUIC CUBIC (RFC 9438)
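For reference, RFC 9438 defines the increase function as
W_cubic(t) = C * (t - K)^3 + W_max, so the cubed term is where an integer can
wrap. The sketch below only illustrates one generic way to guard such a cube,
using a 64-bit intermediate and saturation. It is not the fix applied by the
patch, and the sk_cube_sat() name and the choice to saturate are assumptions.

```c
#include <stdint.h>

/* Returns d^3, saturated to UINT64_MAX instead of silently wrapping */
static uint64_t sk_cube_sat(uint32_t d)
{
    /* (2^32 - 1)^2 is below 2^64, so the square cannot wrap */
    uint64_t sq = (uint64_t)d * d;

    /* check that the last multiplication fits in 64 bits */
    if (d != 0 && sq > UINT64_MAX / d)
        return UINT64_MAX;

    return sq * d;
}
```

The same cube computed on 32-bit operands would already wrap for a difference
of 2000, since 2000^3 exceeds 2^32.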
CLEANUP: quic: Code clarifications for QUIC CUBIC (RFC 9438)
The first version of our QUIC CUBIC implementation is confusing because it relies
on the Linux kernel's TCP CUBIC implementation and refers to RFC 8312, which was
obsoleted by RFC 9438 (August 2023) after our implementation was written. RFC 8312
is a bit hard to understand, and RFC 9438 brings many clarifications.
RFC 9438 is about "CUBIC for Fast and Long-Distance Networks". Our implementation
for QUIC is not very well documented. As it was difficult to reread this
code, this patch only adds some comments at complicated locations and renames
some macros and variables, without any logic modification at all.
The aim of this patch is to first add comments and rename variables/macros
so as to avoid embedding too many code modifications in the same big patch.
Further code modifications will come to adapt this CUBIC implementation to
RFC 9438.
Rename some macros:
CUBIC_BETA -> CUBIC_BETA_SCALED
CUBIC_C -> CUBIC_C_SCALED
CUBIC_BETA_SCALE_SHIFT -> CUBIC_SCALE_FACTOR_SHIFT (this is the scaling factor
which is used only for CUBIC_BETA_SCALED)
CUBIC_DIFF_TIME_LIMIT -> CUBIC_TIME_LIMIT
CUBIC_ONE_SCALED was added (scaled value of 1).
These cubic struct members were renamed:
->tcp_wnd -> ->W_est
->origin_point -> ->W_target
->epoch_start -> ->t_epoch
->remaining_tcp_inc -> ->remaining_W_est_inc
Local variables to quic_cubic_update() were renamed:
t -> elapsed_time
diff -> t
delta -> W_cubic_t
Add a graphic curve of the CUBIC Increase function.
Add big copied & pasted RFC 9438 extracts in relation with the 3 different increase
function regions.
Same thing for the fast convergence.
Fix a typo about the reference to QUIC RFC 9002.
Must be backported as far as 2.6 to ease any further modifications to come.
Willy Tarreau [Mon, 5 Feb 2024 15:20:13 +0000 (16:20 +0100)]
MINOR: debug: add an optional message argument to the BUG_ON() family
This commit adds support for an optional second argument to BUG_ON(),
WARN_ON(), CHECK_IF(), that can be a constant string. When such an
argument is given, it will be printed on a second line after the
existing first message that contains the condition.
This can be used to provide more human-readable explanations about
what happened, such as "too low on memory" or "memory corruption
detected" that may help a user resolve the incident by themselves.
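One portable way to accept an optional second argument in a macro is sketched
below. It does not reproduce HAProxy's macros: all SK_* and sk_* names are
hypothetical, the report goes to a static buffer and stderr instead of
crashing, and the argument-selection trick is just one possible technique. It
assumes the condition contains no top-level comma.

```c
#include <stdio.h>
#include <string.h>

/* Last report emitted, kept so that callers can inspect it */
static char sk_last_report[256];

/* Emits the condition on a first line and, if <msg> is not empty, the
 * message on a second line.
 */
static void sk_report(const char *file, int line, const char *cond, const char *msg)
{
    int len = snprintf(sk_last_report, sizeof(sk_last_report),
                       "WARNING: check (%s) matched at %s:%d\n", cond, file, line);

    if (*msg && len > 0 && (size_t)len < sizeof(sk_last_report))
        snprintf(sk_last_report + len, sizeof(sk_last_report) - len,
                 "  %s\n", msg);
    fputs(sk_last_report, stderr);
}

/* Argument selectors: a trailing dummy argument is always appended by the
 * caller so that the variadic part is never empty.
 */
#define SK_FIRST(a, ...)     a
#define SK_STR_FIRST(a, ...) #a
#define SK_SECOND(a, b, ...) b

/* SK_WARN_ON(cond) or SK_WARN_ON(cond, "constant string") */
#define SK_WARN_ON(...)                                       \
    do {                                                      \
        if (SK_FIRST(__VA_ARGS__, 0))                         \
            sk_report(__FILE__, __LINE__,                     \
                      SK_STR_FIRST(__VA_ARGS__, 0),           \
                      SK_SECOND(__VA_ARGS__, "", 0));         \
    } while (0)
```

For instance, SK_WARN_ON(x > 3) reports only the condition, while
SK_WARN_ON(x > 3, "too low on memory") adds the explanation on a second line.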
Willy Tarreau [Mon, 5 Feb 2024 15:16:08 +0000 (16:16 +0100)]
MINOR: debug: support passing an optional message in ABORT_NOW()
The ABORT_NOW() macro is not much used since we have BUG_ON(), but
there are situations where it makes sense, typically if the program
must always die regardless of DEBUG_STRICT, or if the condition must
always be evaluated (e.g. decompress something and check it).
It's not convenient not to have any hint about what happened there. But
providing too much info also results in wiping some registers, making
the trace less exploitable, so a compromise must be found.
What this patch does is to provide the support for an optional argument
to ABORT_NOW(). When an argument is passed (a string), then a message
will be emitted with the file name, line number, the message and a
trailing LF, before the stack dump and the crash. It should be used
reasonably, for example in functions that have multiple calls that need
to be more easily distinguished.
BUG/MINOR: ssl: Fix error message after ssl_sock_load_ocsp call
If we were to enable 'ocsp-update' on a certificate that does not have
an OCSP URI, we would exit ssl_sock_load_ocsp with a negative error code
which would raise a misleading error message ("<cert> has an OCSP URI
and OCSP auto-update is set to 'on' ..."). This patch simply fixes the
error message but an error is still raised.
This issue was raised in GitHub #2432.
It can be backported up to branch 2.8.
Willy Tarreau [Mon, 5 Feb 2024 14:06:05 +0000 (15:06 +0100)]
MINOR: debug: make BUG_ON() catch build errors even without DEBUG_STRICT
As seen in previous commit 59acb27001 ("BUILD: quic: Variable name typo
inside a BUG_ON()."), it can sometimes happen that with DEBUG forced
without DEBUG_STRICT, BUG_ON() statements are ignored. Sadly, it means
that typos there are not even build-tested.
This patch makes these statements reference sizeof(cond) to make sure
the condition is parsed. This doesn't result in any code being emitted,
but makes sure the expression is correct so that an issue such as the one
above will fail to build (which was verified).
This may be backported as it can help spot failed backports as well.
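The sizeof() technique can be sketched as follows. SK_BUG_ON() and
SK_DEBUG_STRICT are hypothetical stand-ins for the real macro and build
option, so this only illustrates the principle: the operand of sizeof is
parsed and type-checked by the compiler but never evaluated.

```c
#include <stdlib.h>

#ifdef SK_DEBUG_STRICT
/* checks enabled: crash when the condition is true */
# define SK_BUG_ON(cond) do { if (cond) abort(); } while (0)
#else
/* checks disabled: no code is emitted, but <cond> must still compile */
# define SK_BUG_ON(cond) do { (void)sizeof(cond); } while (0)
#endif
```

With checks disabled, SK_BUG_ON(undeclared_var > 0) still fails to build,
while a valid condition costs nothing at run time.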
BUILD: debug: remove leftover parentheses in ABORT_NOW()
Since d480b7b ("MINOR: debug: make ABORT_NOW() store the caller's line
number when using abort"), building with 'DEBUG_USE_ABORT' fails with:
|In file included from include/haproxy/api.h:35,
| from include/haproxy/activity.h:26,
| from src/ev_poll.c:20:
|include/haproxy/thread.h: In function ‘ha_set_thread’:
|include/haproxy/bug.h:107:47: error: expected ‘;’ before ‘_with_line’
| 107 | #define ABORT_NOW() do { DUMP_TRACE(); abort()_with_line(__LINE__); } while (0)
| | ^~~~~~~~~~
|include/haproxy/bug.h:129:25: note: in expansion of macro ‘ABORT_NOW’
| 129 | ABORT_NOW(); \
| | ^~~~~~~~~
|include/haproxy/bug.h:123:9: note: in expansion of macro ‘__BUG_ON’
| 123 | __BUG_ON(cond, file, line, crash, pfx, sfx)
| | ^~~~~~~~
|include/haproxy/bug.h:174:30: note: in expansion of macro ‘_BUG_ON’
| 174 | # define BUG_ON(cond) _BUG_ON (cond, __FILE__, __LINE__, 3, "FATAL: bug ", "")
| | ^~~~~~~
|include/haproxy/thread.h:201:17: note: in expansion of macro ‘BUG_ON’
| 201 | BUG_ON(!thr->ltid_bit);
| | ^~~~~~
|compilation terminated due to -Wfatal-errors.
|make: *** [Makefile:1006: src/ev_poll.o] Error 1
This is because of a leftover: abort()_with_line(__LINE__);
^^
Fixing it by removing the extra parentheses after 'abort' since the
abort() call is now performed under abort_with_line() helper function.
This was raised by Ilya in GH #2440.
No backport is needed, unless the above commit gets backported.
BUG/MINOR: quic: Wrong ack ranges handling when reaching the limit.
Acknowledgement ranges are used to build ACK frames. To avoid allocating too
many such objects, a limit was set to 32 (QUIC_MAX_ACK_RANGES) by this commit:
MINOR: quic: Do not allocate too much ack ranges
But there is an inversion when removing the oldest range from its tree.
eb64_first() must be used in place of eb64_last(). Note that this patch
only does this modification in addition to rename <last> variable to <first>.
This bug leads such a h2load command to block when a request ends up not
being acknowledged by haproxy even if correctly served:
There is a remaining question to be answered. In such a case, haproxy refuses to
reopen the stream, which is a good thing, but shouldn't haproxy acknowledge the
request (since it was correctly parsed again)?
Note that to be easily reproduced, this setting had to be applied to the client
network interface:
tc qdisc add dev eth1 root netem delay 100ms 1s loss random
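The eviction rule can be illustrated with a sorted array instead of the tree
used by haproxy. Everything below is a hypothetical sketch (sk_* names, array
storage, no merging of adjacent ranges). It only shows that, once the limit is
exceeded, the entry to drop is the first one, i.e. the one covering the lowest
packet numbers.

```c
#include <stdint.h>
#include <string.h>

#define SK_MAX_ACK_RANGES 32

struct sk_ack_range {
    uint64_t first, last; /* packet numbers, inclusive */
};

struct sk_ack_ranges {
    /* sorted by ascending <first>, one spare slot for the insertion */
    struct sk_ack_range r[SK_MAX_ACK_RANGES + 1];
    size_t nb;
};

/* Inserts [first,last], assumed not to overlap any stored range, then
 * drops the oldest range if the limit is exceeded.
 */
static void sk_ack_ranges_add(struct sk_ack_ranges *ar, uint64_t first, uint64_t last)
{
    size_t pos = 0;

    while (pos < ar->nb && ar->r[pos].first < first)
        pos++;

    memmove(&ar->r[pos + 1], &ar->r[pos], (ar->nb - pos) * sizeof(ar->r[0]));
    ar->r[pos].first = first;
    ar->r[pos].last = last;
    ar->nb++;

    if (ar->nb > SK_MAX_ACK_RANGES) {
        /* oldest range = lowest packet numbers = first entry */
        memmove(&ar->r[0], &ar->r[1], (ar->nb - 1) * sizeof(ar->r[0]));
        ar->nb--;
    }
}
```

Dropping the last entry instead would discard the most recently received
packets, which are precisely the ones the peer is still waiting to see
acknowledged.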
Willy Tarreau [Sat, 3 Feb 2024 10:55:26 +0000 (11:55 +0100)]
MINOR: acl: add extra diagnostics about suspicious string patterns
As noticed in this thread, some bogus configurations are not always easy
to spot: https://www.mail-archive.com/haproxy@formilux.org/msg44558.html
Here it was about config keywords being used in ACL patterns where strings
were expected, hence they're always valid.
Since we have the diag mode (-dD) we can perform some extra checks when
it's used, and emit them to suggest the user there might be an issue.
Here we detect a few common words (logic such as "and"/"or"/"||" etc),
C++/JS comments mistakenly used to try to isolate final args, and words
that have the exact name of a sample fetch or an ACL keyword. These checks
are only done in diag mode of course.
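A minimal sketch of such a check is shown below. The function name and the
exact word list are assumptions (only "and", "or" and "||" are quoted above),
and the lookup of sample fetch and ACL keyword names is left out since it
needs haproxy's keyword registry.

```c
#include <string.h>

/* Returns 1 if <word>, found where a string pattern is expected, looks
 * like a logic operator or the start of a C++/JS comment, 0 otherwise.
 */
static int sk_pattern_is_suspicious(const char *word)
{
    /* "&&" is an assumed addition to the words quoted in the log */
    static const char *const logic[] = { "and", "or", "||", "&&", NULL };
    int i;

    for (i = 0; logic[i]; i++)
        if (strcmp(word, logic[i]) == 0)
            return 1;

    /* comment markers mistakenly used to isolate final arguments */
    if (strncmp(word, "//", 2) == 0 || strncmp(word, "/*", 2) == 0)
        return 1;

    return 0;
}
```

Such a match is only a hint: the pattern remains syntactically valid, which is
why the result is reported as a diagnostic and not as an error.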
Willy Tarreau [Sat, 3 Feb 2024 11:05:08 +0000 (12:05 +0100)]
BUG/MINOR: diag: run the final diags before quitting when using -c
Final diags were added in 2.4 by commit 5a6926dcf ("MINOR: diag: create
cfgdiag module"), but it's called too late in the startup process,
because when "-c" is passed, the call is not made, while it's its primary
use case. Let's just move the call earlier.
Note that currently the check in this function is limited to verifying
unicity of server cookies in a backend, so it can be backported as far
as 2.4, but there is little value in insisting if it doesn't backport
easily.
Willy Tarreau [Sat, 3 Feb 2024 11:01:58 +0000 (12:01 +0100)]
BUG/MINOR: diag: always show the version before dumping a diag warning
Diag warnings were added in 2.4 by commit 7b01a8dbd ("MINOR: global:
define diagnostic mode of execution") but probably due to the split
function that checks for the mode, they did not reuse the emission of
the version string before the first warning, as was brought in 2.2 by
commit bebd21206 ("MINOR: init: report in "haproxy -c" whether there
were warnings or not"). The effect is that diag warnings are emitted
before the version string if there is no other warning nor error. Let's
just proceed like for the two other ones.
This can be backported to 2.4, though this is of very low importance.
Willy Tarreau [Fri, 2 Feb 2024 16:09:09 +0000 (17:09 +0100)]
MINOR: debug: make ABORT_NOW() store the caller's line number when using abort
Placing DO_NOT_FOLD() before abort() only works in -O2 but not in -Os which
continues to place only 5 calls to abort() in h3.o for call places. The
approach taken here is to replace abort() with a new function that wraps
it and stores the line number in the stack. This slightly increases the
code size (+0.1%) but when unwinding a crash, the line number remains
present now. This is a very low cost, especially if we consider that
DEBUG_USE_ABORT is almost only used by code coverage tools and occasional
debugging sessions.
Willy Tarreau [Fri, 2 Feb 2024 16:05:36 +0000 (17:05 +0100)]
MINOR: debug: make sure calls to ha_crash_now() are never merged
As indicated in previous commit, we don't want calls to ha_crash_now()
to be merged, since it will make gdb return a wrong line number. This
was found to happen with gcc 4.7 and 4.8 in h3.c where 26 calls end up
as only 5 to 18 "ud2" instructions depending on optimizations. By
calling DO_NOT_FOLD() just before provoking the trap, we can reliably
avoid this folding problem. Note that this does not address the case
where abort() is used instead (DEBUG_USE_ABORT).
Willy Tarreau [Fri, 2 Feb 2024 16:00:01 +0000 (17:00 +0100)]
MINOR: compiler: add a new DO_NOT_FOLD() macro to prevent code folding
Modern compilers sometimes perform function tail merging and identical
code folding, which consist in merging identical occurrences of same
code paths, generally final ones (e.g. before a return, a jump or an
unreachable statement). In the case of ABORT_NOW(), it can happen that
the compiler merges all of them into a single one in a function,
defeating the purpose of the check which initially was to figure where
the bug occurred.
Here we're creating a DO_NOT_FOLD() macro which makes use of the line
number and passes it as an integer argument to an empty asm() statement.
The effect is a code position dependency which prevents the compiler
from merging the code till that point (though it may still merge the
following code). In practice it's efficient at stopping the compilers
from merging calls to ha_crash_now(), which was the initial purpose.
It may also be used to force certain optimization constructs since it
gives more control to the developer.
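The description above corresponds to a construct of the following form, given
here as a sketch under the assumption of a GCC or Clang compatible compiler
(extended asm is not standard C). The SK_ prefix and the sk_classify()
function are hypothetical and only serve the example.

```c
/* Empty asm statement taking the line number as an immediate operand: it
 * emits no instruction but makes each call site distinct for the compiler.
 */
#define SK_DO_NOT_FOLD() do { __asm__ volatile("" :: "i"(__LINE__)); } while (0)

/* Two error paths ending with the same code: without the macro, the
 * compiler would be free to merge them into a single one.
 */
static int sk_classify(int v)
{
    if (v < 0) {
        SK_DO_NOT_FOLD();
        return -1;
    }
    if (v > 100) {
        SK_DO_NOT_FOLD();
        return -1;
    }
    return v;
}
```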
First, checks on the resolver scope were added. Then, because of the recent
changes, the logs emitted by vtest are now too big and this makes the script
fail. So tests on NaN values are now performed on a smaller request. This
reduces the logs enough to pass.
MEDIUM: promex: Add support for filters on metric names
It is now possible to filter the metrics on their name, by listing
explicitly metrics to dump or on the opposite to exclude only some metrics
from the dump. To do so, a comma-separated list of metrics must be
specified. If a name is preceded by a minus (-), the metric is excluded from
the dump. If at least one metric is specified to be explicitly dumped, all
metrics are no longer dumped, but only those explicitly listed.
The list is specified via one or more "metrics" parameters in the uri
query-string. For instance:
# Dump all metrics, except "haproxy_server_check_status"
/metrics?metrics=-haproxy_server_check_status
# Only dump frontends, backends and servers status
/metrics?metrics=haproxy_frontend_status,haproxy_backend_status,haproxy_server_status
Included and Excluded metrics can be mixed. Only the intersection will be
dumped.
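The filtering rules above can be summarized with a small sketch. It is not the exporter's code: the function name sketch_metric_is_dumped() is hypothetical, and it assumes the filter was already extracted from the query-string as a single comma-separated string.

```c
#include <string.h>

/* Hypothetical helper: returns 1 if metric <name> would be dumped for the
 * comma-separated <filter>, otherwise 0. A name prefixed with '-' is an
 * exclusion. As soon as one name is listed without '-', only the listed
 * names are dumped.
 */
int sketch_metric_is_dumped(const char *filter, const char *name)
{
	size_t nlen = strlen(name);
	int has_include = 0;
	int included = 0;
	const char *p = filter;

	while (*p) {
		size_t len = strcspn(p, ",");          /* length of this item */
		int exclude = (len > 0 && *p == '-');
		const char *tok = p + exclude;         /* name without the '-' */
		size_t tlen = len - exclude;

		if (tlen > 0) {
			int match = (tlen == nlen && memcmp(tok, name, nlen) == 0);

			if (exclude) {
				if (match)
					return 0;      /* explicitly excluded */
			} else {
				has_include = 1;
				if (match)
					included = 1;
			}
		}
		p += len;
		if (*p == ',')
			p++;
	}
	/* no inclusion list means "dump everything not excluded" */
	return !has_include || included;
}
```

With the first example above, "haproxy_server_check_status" is rejected and every other metric is accepted. With the second one, only the three listed names are accepted.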
MINOR: promex: Always pass the final name and description to promex_dmp_ts()
It is easier this way, especially for promex modules. And because name and
description are now explicitly passed to this function, there is no reason
to still pass the metric, its type is enough. The function is easier to read
this way.
MINOR: promex: Rename dump functions to use the right wording
In Prometheus, a time series is a stream of timestamped values belonging to
the same metric and the same set of labeled dimensions. Thus the exporter
dumps time series and not metrics.
Thus, promex_dump_metric(), promex_dump_metric_header() and
promex_metric_to_str() functions were renamed to replace "metric"
MEDIUM: promex/resolvers: Dump resolvers metrics via a promex module
Just like for stick-tables, this patch adds a promex module to dump
resolvers metrics. It adds the "resolver" scope and for now, it dumps the
following metrics:
MEDIUM: promex: Dump metrics of registered modules with a way to filter them
This patch adds a dump loop on the registered modules. It is very similar to
other dump loops. When a module is registered, an implicit scope is created
with the module's name. It means a module name must be unique. It also means
that the metrics dump of modules can be filtered via the "scope" parameter.
MEDIUM: promex: Add a registration mechanism to support modules
In this patch we add a registration mechanism for modules. To do so, a
module must define the "promex_module" structure. The dump itself will be
based on 2 contexts. One for all the dump and another one for each metric
time-series. These contexts are used as restart points when the dump is
interrupted.
Modules must also provide the following 6 callback functions, some of which
are optional:
* start_metric_dump(): It is an optional callback function. If defined, it
  is responsible for initializing the dump context used as the first
  restart point.
* stop_metric_dump(): It is an optional callback function. If defined, it
  is responsible for deinitializing the dump context.
* metric_info(): This one is mandatory. It returns the info about the
  metric: name, type, flags and description.
* start_ts(): This one is mandatory. It initializes the context for a time
  series for a given metric. This context is the second restart point.
* next_ts(): This one is mandatory. It iterates on time series for a given
  metric. It is also responsible for handling the end of a time series and
  for deinitializing the context.
* fill_ts(): It fills info on the time series for a given metric: the
  labels and the value.
In addition, a module must set its name and declare the number of metrics it
exposes.
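To illustrate how such callbacks could fit together, here is a rough sketch. Every type, signature and name below (sketch_module, sketch_metric, demo_*, sketch_dump_module()) is hypothetical and was invented for the example; the real "promex_module" structure and its callbacks' prototypes are not reproduced here, and the interruption/restart logic is left out.

```c
#include <stddef.h>

/* Hypothetical metric description returned by metric_info() */
struct sketch_metric {
	const char *name;
	const char *desc;
};

/* Hypothetical module descriptor: a name, a metric count, 6 callbacks */
struct sketch_module {
	const char *name;   /* unique, also used as the implicit scope */
	int nb_metrics;     /* number of metrics exposed by the module */
	void *(*start_metric_dump)(void);                        /* optional */
	void  (*stop_metric_dump)(void *dump_ctx);               /* optional */
	int   (*metric_info)(int id, struct sketch_metric *m);
	void *(*start_ts)(void *dump_ctx, int id);
	void *(*next_ts)(void *dump_ctx, void *ts_ctx, int id);
	int   (*fill_ts)(void *dump_ctx, void *ts_ctx, int id, long *val);
};

/* Toy module: one metric made of three time series */
static long demo_values[] = { 10, 20, 30 };

static int demo_metric_info(int id, struct sketch_metric *m)
{
	if (id != 0)
		return 0;
	m->name = "demo_value";
	m->desc = "Demo values";
	return 1;
}

static void *demo_start_ts(void *dump_ctx, int id)
{
	(void)dump_ctx; (void)id;
	return &demo_values[0];          /* first time series */
}

static void *demo_next_ts(void *dump_ctx, void *ts_ctx, int id)
{
	long *cur = (long *)ts_ctx + 1;  /* move to the next time series */

	(void)dump_ctx; (void)id;
	return (cur < demo_values + 3) ? cur : NULL;
}

static int demo_fill_ts(void *dump_ctx, void *ts_ctx, int id, long *val)
{
	(void)dump_ctx; (void)id;
	*val = *(long *)ts_ctx;          /* labels are omitted in this toy */
	return 1;
}

struct sketch_module demo_module = {
	.name = "demo",
	.nb_metrics = 1,
	.metric_info = demo_metric_info,
	.start_ts = demo_start_ts,
	.next_ts = demo_next_ts,
	.fill_ts = demo_fill_ts,
};

/* Walks all metrics and time series of <mod>, returns the sum of values */
long sketch_dump_module(const struct sketch_module *mod)
{
	void *dump_ctx = mod->start_metric_dump ? mod->start_metric_dump() : NULL;
	long sum = 0;

	for (int id = 0; id < mod->nb_metrics; id++) {
		struct sketch_metric m;
		void *ts;

		if (!mod->metric_info(id, &m))
			continue;
		for (ts = mod->start_ts(dump_ctx, id); ts; ts = mod->next_ts(dump_ctx, ts, id)) {
			long v;

			if (mod->fill_ts(dump_ctx, ts, id, &v))
				sum += v;
		}
	}
	if (mod->stop_metric_dump)
		mod->stop_metric_dump(dump_ctx);
	return sum;
}
```

In this toy, the pointer returned by start_ts() and next_ts() plays the role of the second restart point: a dump interrupted in the middle of the inner loop could resume from it.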
MEDIUM: promex: Simplify the context using generic pointers for restart points
Instead of using typed pointers to save the restart points, we now use
generic pointers. 4 pointers can be saved now. This replaces the 5 typed
pointers used before. So, we save 8 bytes but it is also more generic and
this will be used by the promex modules.
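The size difference mentioned above can be checked with a small sketch. The two structures and their member names are hypothetical and only mimic the shape of the change (5 typed pointers replaced by 4 generic ones); they are not the applet's real context.

```c
#include <stddef.h>

/* Hypothetical "before": one typed pointer per kind of restart point */
struct sketch_typed_ctx {
	struct sketch_proxy    *px;
	struct sketch_listener *li;
	struct sketch_server   *sv;
	struct sketch_table    *st;
	struct sketch_module   *mod;
};

/* Hypothetical "after": 4 generic pointers, their meaning depends on
 * the part of the dump which is in progress.
 */
struct sketch_generic_ctx {
	void *p[4];
};

/* Returns the number of bytes saved by the generic layout */
size_t sketch_ctx_saving(void)
{
	return sizeof(struct sketch_typed_ctx) - sizeof(struct sketch_generic_ctx);
}
```

The saving is the size of one pointer, that is 8 bytes on a 64-bit platform.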
MINOR: promex: Always limit the number of labels dumped for each metric
It was not an issue until now, but a way to register modules on promex will
be added. Thus it is important to add some extra checks. Here, we take care
to never dump more than the max labels allowed.
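Such a check could look like the sketch below. The limit SKETCH_MAX_LABELS, the sketch_label structure and the output format are assumptions made for the example, not the exporter's actual code.

```c
#include <stdio.h>

#define SKETCH_MAX_LABELS 2   /* hypothetical limit, kept small for the demo */

struct sketch_label {
	const char *name;     /* NULL marks the end of the array */
	const char *value;
};

/* Writes at most SKETCH_MAX_LABELS labels as name="value" pairs into
 * <out>, separated by commas, and returns the number of labels written.
 * Extra labels are silently ignored.
 */
int sketch_dump_labels(char *out, size_t size, const struct sketch_label *labels)
{
	size_t used = 0;
	int i;

	if (size == 0)
		return 0;
	out[0] = '\0';
	for (i = 0; i < SKETCH_MAX_LABELS && labels[i].name; i++) {
		int ret = snprintf(out + used, size - used, "%s%s=\"%s\"",
		                   i ? "," : "", labels[i].name, labels[i].value);

		if (ret < 0 || (size_t)ret >= size - used) {
			out[used] = '\0';   /* drop the truncated label */
			break;
		}
		used += ret;
	}
	return i;
}
```

The important part is the loop condition: whatever a module provides, the index never goes past the maximum number of labels.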
MINOR: promex: Add info in the promex context to dump extra counters
The context of the promex applet was extended to support the dump of extra
counters. These counters are not dumped yet, but info to interrupt and
restart the dump are required. The stats module and the relative field
number for this module can now be saved.
In addition, support for the "extra-counters" parameter was added on the
query-string to dump these counters. Otherwise, no extra-counters are
dumped.
MINOR: promex: Add a param to override the description when a metric is dumped
When a metric is dumped, it is now possible to specify a custom
description. We will add the support for extra counters. The list of these
counters is retrieved dynamically. Thus the description must be dynamic
too. Note it was already possible to customize the metric name.
MEDIUM: stats: Be able to access a specific field into a stats module
It is now possible to selectively retrieve extra counters from stats
modules. H1, H2, QUIC and H3 fill_stats() callback functions are updated to
return a specific counter.
MINOR: stats: Be able to access to registered stats modules from anywhere
The list of modules registered on the stats to expose extra counters is now
public. It is required to export these counters into the Prometheus
exporter.
MEDIUM: tcp-act/backend: support for set-bc-{mark,tos} actions
set-bc-{mark,tos} actions are pretty similar to set-fc-{mark,tos}, but they
set mark/tos on packets sent from haproxy to the server. They act on the
whole backend/srv connection, from connect() to connection teardown, thus
they may only be used before the connection to the server is instantiated,
meaning that they are only relevant for request-oriented rules such as
tcp-request or http-request rules. For now their use is limited to content
request rules, because tos and mark information is stored directly within
the stream, thus it is required that the stream already exists.
Stream flags are used in combination with dedicated stream struct members
to pass 'tos' and 'mark' information so that they are correctly considered
during the stream connection assignment logic (prior to actually connecting
to the server).
'tos' and 'mark' fd sockopts are taken into account in conn hash parameters
for the connection reuse mechanism.
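To illustrate why the value has to be known before the connection is set up, the sketch below applies a TOS value to a socket before any connect() so that it covers the whole connection. It only uses the standard setsockopt()/getsockopt() calls with IP_TOS on an IPv4 TCP socket; the helper names are hypothetical and this is not haproxy's code. The mark would typically be applied the same way with SO_MARK on Linux, which usually requires extra privileges and is therefore left out.

```c
#include <netinet/in.h>
#include <netinet/ip.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: creates an IPv4 TCP socket and sets its TOS
 * before any connect(), so that all packets of the future connection
 * carry it. Returns the fd, or -1 on failure.
 */
int sketch_socket_with_tos(int tos)
{
	int fd = socket(AF_INET, SOCK_STREAM, 0);

	if (fd < 0)
		return -1;
	if (setsockopt(fd, IPPROTO_IP, IP_TOS, &tos, sizeof(tos)) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/* Hypothetical helper: reads back the TOS currently set on <fd>,
 * or returns -1 on failure.
 */
int sketch_read_tos(int fd)
{
	int tos = 0;
	socklen_t len = sizeof(tos);

	if (getsockopt(fd, IPPROTO_IP, IP_TOS, &tos, &len) < 0)
		return -1;
	return tos;
}
```

Since two sockets with different tos or mark values are not interchangeable, it makes sense that these values take part in the hash used to decide whether an idle connection may be reused.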
MEDIUM: tcp-act: <expr> support for set-fc-{mark,tos} actions
In this patch we add the possibility to use a sample expression as argument
for set-fc-{mark,tos} actions. To make it backward compatible with the
previous behavior, during parsing we first try to parse the value as
an integer (decimal or hex notation), and then fall back to expr parsing
in case of failure.
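The "integer first, expression otherwise" approach can be sketched as follows. The function and enum names are hypothetical, and the expression parsing itself is not shown: the sketch only decides which of the two parsers should handle the argument.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

enum sketch_arg_kind {
	SKETCH_ARG_INT,    /* <val> holds the parsed integer */
	SKETCH_ARG_EXPR,   /* the argument must be parsed as an expression */
};

/* Hypothetical helper: tries to parse <arg> as a decimal or 0x-prefixed
 * hexadecimal integer. On success the value is stored into <val>. On
 * failure the caller is expected to fall back to expression parsing.
 */
enum sketch_arg_kind sketch_parse_arg(const char *arg, unsigned long *val)
{
	char *end;
	int base = (arg[0] == '0' && (arg[1] == 'x' || arg[1] == 'X')) ? 16 : 10;

	*val = 0;
	if (!isdigit((unsigned char)arg[0]))
		return SKETCH_ARG_EXPR;   /* rejects signs, spaces, names */

	errno = 0;
	*val = strtoul(arg, &end, base);
	if (errno == 0 && *end == '\0')
		return SKETCH_ARG_INT;

	*val = 0;
	return SKETCH_ARG_EXPR;
}
```

Trying the integer form first keeps existing configurations working unchanged, since anything that used to be accepted is still handled by the same code path.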