MINOR: stktable: fix potential build issue in smp_to_stkey
smp_to_stkey() uses an ambiguous cast from 64bit integer to 32 bit
unsigned integer. While it is intended, let's make the cast less
ambiguous by explicitly casting the right part of the assignment to the
proper type.
Command line argument -dt can be used to activate traces during startup.
Via its optional argument, it is possible to change settings for a
particular trace source. It is also possible to update every registered
sources by specifying an empty name.
Support the trace source alias "all". This is an alternative to the
empty name to update every sources.
MINOR: trace: ensure -dt priority over traces config section
Traces can be activated on startup either via -dt command line argument
or via the traces configuration section. This can caused confusion as it
may not be clear as trace source can be completed or overriden by one or
the other.
Fix the precedence to give the priority to the command line argument.
Now, each trace source configured via -dt is first resetted to a default
state before applying new settings. Then, it is impossible to change a
trace source via the configuration file if it was already targetted via
-dt argument.
Traces can be activated on startup via -dt command line argument. To
facilitate its usage, display a usage description and examples when
"help" is specified.
BUG/MEDIUM: queues: Adjust the proxy counters when appropriate
In process_srv_queue(), if we manage to successfully run an extra task,
don't forget to adjust the proxy's totpend and served counters accordingly.
Having an inaccurate served could lead to various subtle bugs, as it is
used when making load balancing decisions.
As discussed in GH #1750, we were lacking a sample fetch to be able to
retrieve the key from the currently tracked counter entry. To do so,
sc_key fetch can now be used. It returns a sample with the correct type
(table key type) corresponding to the tracked counter entry (from previous
track-sc rules).
If no entry is currently tracked, it returns nothing.
It can be used using the standard form "sc_key(<sc_number>)" or the legacy
form: "sc0_key", "sc1_key", "sc2_key"
stksess_getkey(t, ts) returns a stktable_key struct pointer filled with
data from input <ts> entry in <t> table. Returned pointer uses the
static_table_key variable. Indeed, stktable_key struct is more convenient
to manipulate than having to deal with the key extraction from stktsess
struct directly.
BUG/MINOR: stktable: fix big-endian compatiblity in smp_to_stkey()
When smp_to_stkey() deals with SINT samples, since stick-tables deals with
32 bits integers while SINT sample is 64 bit integer, inplace conversion
was done in smp_to_stkey. For that the 64 bit integer was truncated before
the key would point to it. Unfortunately this only works on little endian
architectures because with big endian ones, the key would point to the
wrong 32bit range.
To fix the issue and make the conversion endian-proof, let's re-assign
the sample as 32bit integer before the key points to it.
Thanks to Willy for having spotted the bug and suggesting the above fix.
Willy Tarreau [Thu, 9 Jan 2025 08:21:04 +0000 (09:21 +0100)]
[RELEASE] Released version 3.2-dev3
Released version 3.2-dev3 with the following main changes :
- DOC: config: add missing "track-sc0" in action keywords matrix
- BUG/MINOR: stktable: invalid use of stkctr_set_entry() with mixed table types
- BUG/MAJOR: mux-quic: fix BUG_ON on empty STREAM emission
- BUG/MEDIUM: mux-h2: Count copied data when looping on RX bufs in h2_rcv_buf()
- Revert "BUG/MAJOR: mux-quic: fix BUG_ON on empty STREAM emission"
- BUG/MAJOR: mux-quic: properly fix BUG_ON on empty STREAM emission
- MINOR: mux-quic: add traces on sd attach
- BUG/MEDIUM: mux-quic: do not attach on already closed stream
- BUG/MINOR: compression: handle a possible strdup() failure
- BUG/MINOR: pool: handle a possible strdup() failure
- BUG/MINOR: cfgparse-tcp: handle a possible strdup() failure
- BUG/MINOR: log: Allow to use if/unless conditionnals for do-log action
- MINOR: config: Alert about extra arguments for errorfile and errorloc
- BUG/MINOR: mux-quic: fix wakeup on qcc_set_error()
- MINOR: mux-quic: change return value of qcs_attach_sc()
- BUG/MINOR: mux-quic: handle closure of uni-stream
- BUG/MEDIUM: promex/resolvers: Don't dump metrics if no nameserver is defined
- BUG/MAJOR: ssl/ocsp: fix NULL conn object dereferencing to access QUIC TLS counters
- MEDIUM: errors: get rid of shm_open()
- BUILD: makefile: do not clean standalone binaries on a simple "make clean"
- BUILD: makefile: add a qinfo macro to pass info in quiet mode
- DEV: ncpu: add a simple utility to help with NUMA development
- DEV: ncpu: implement a wrapper mode
- DEV: ncpu: make the wrapper work both as a lib and executable
- BUG/MEDIUM: h1-htx: Properly handle bodyless messages
- MINOR: tools: add a few functions to simply check for a file's existence
Willy Tarreau [Mon, 6 Jan 2025 18:34:41 +0000 (19:34 +0100)]
MINOR: tools: add a few functions to simply check for a file's existence
At many places we'd like to be able to simply construct a path from a
format string and check if that path corresponds to an existing file,
directory etc. Here we add 3 functions, a generic one to test that a
path corresponds to a given file mode (e.g. S_IFDIR, S_IFREG etc), and
two other ones specifically checking for a file or a dir for easier
use.
During h1 parsing, there are some postparsing checks to detect bodyless
messages and switch the parsing in DONE state. However, a case was not
properly handled. Responses to HEAD requests with a "transfer-encoding"
header. The response parser remained blocked waiting for the response body.
To fix the issue, the postparsing was sliglty modified. Instead of trying to
handle bodyless messages in a common way between the request and the
response, it is now performed in the dedicated postparsing functions. It is
easier to enumerate all cases, especially because there is already a test
for responses to HEAD requests.
This patch should fix the issue #2836. It must be backported as far as 2.9.
Trying to preload such an executable indeed returns:
ERROR: ld.so: object '/path/to/ncpu.so' from LD_PRELOAD cannot be preloaded (cannot dynamically load position-independent executable): ignored.
Note that the code still supports it since libc.so is both an executable
and a lib. The approach taken here is the same as in the nousr.so wrapper.
It consists in dropping the DF_1_PIE flag from the resulting executable
since it's what the dynamic linker is looking for. This flag is found in
FLAGS_1 in the .dynamic section. As readelf -a suggests, it's after the
tag 0x6ffffffb. The value is 0x08000000. We're using objdump to figure the
length and offset of the struct, dd to extract the 3 parts, and sed to
patch the binary.
It's likely that it will only work on 64-bit little endian, though tests
should be performed to see what to do on other platforms. At least on
x86_64, ld.so is happy and it continues to be possible to use the binary
as a .so, and that the platform where most of the development happens so
that's fine.
In any case the wrapper and the standard shared lib are still made two
distinct files so that it's possible to use the non-patched version on
unsupported OSes or architectures.
Willy Tarreau [Wed, 8 Jan 2025 08:04:09 +0000 (09:04 +0100)]
DEV: ncpu: implement a wrapper mode
The wrapper mode allows to present itself as LD_PRELOAD before loading
haproxy, which is often more convenient since it allows to pass the
number of CPUs in argument. However, this mode is no longer supported by
modern glibcs, so a future patch will come to implement a trick that was
tested to work at least on x86.
Willy Tarreau [Wed, 8 Jan 2025 07:48:16 +0000 (08:48 +0100)]
DEV: ncpu: add a simple utility to help with NUMA development
Collecting captures of /sys isn't sufficient for NUMA development because
haproxy detects the number of CPUs at boot time and will not be able to
inspect more than this number. Let's just have a small utility to report
a fake number of CPUs, that will be loaded using LD_PRELOAD. It checks
the NCPU variable if it exists and will present this number of CPUs, or
if it does not exist, will expose the maximum supported number.
Willy Tarreau [Wed, 8 Jan 2025 09:29:39 +0000 (10:29 +0100)]
BUILD: makefile: add a qinfo macro to pass info in quiet mode
Some commands such as $(cmd_CC) etc already handle the quiet vs verbose
mode in the makefile, but sometimes we may want to pass other info. The
new "qinfo" macro can be called with a 9-char string argument (spaces
included) as a prefix for some commands, to emit that string when in
quiet mode. The caller must fill the spaces needed for alignment. E.g:
Willy Tarreau [Wed, 8 Jan 2025 10:22:18 +0000 (11:22 +0100)]
BUILD: makefile: do not clean standalone binaries on a simple "make clean"
Running "make clean" currently gets rid of a number of auxiliary tools,
including the standalone ones that do not depend on haproxy's build
options. This is a bit annoying as they have to be rebuilt each time.
Let's move them to the distclean target instead.
Since 5ee266b7 ("MINOR: error: simplify startup_logs_init_shm"), the FD
of the startup logs is always closed and the HAPROXY_STARTUPLOGS_FD
variable is not used anymore. Which means we only need a mmap.
Indeed the shm_open() function was only needed to keep the shm between
the exec() of the master so we can get the logs stored there after doing
the final exec() in wait mode. Since the wait mode doesn't exist
anymore and the parsing is done in a worker, we only need to share a
memory zone between the master and the worker.
This patch removes shm_open() and replace it with a simple mmap(), this
way the shared startup-logs become more portable and USE_SHM_OPEN is not
required anymore.
This bug arrived with this commit in the current dev branch:
056ec51c26 MEDIUM: ssl/ocsp: counters for OCSP stapling
and could occur for QUIC connections during handshake when the underlying
<conn> connection object is not already initialized. So in this case the TLS
counters attached to TLS listeners cannot be accessed through this object but
from the QUIC connection object.
Modify the code to initialize the listener (<li> variable) for both QUIC
and TCP connections, then initialize the variables for the TLS counters
if the listener is also initialized.
Thank you to @Tristan971 for having reported this issue in GH #2833.
Must be backported with the commit mentioned above if it is planned to be
backported.
BUG/MEDIUM: promex/resolvers: Don't dump metrics if no nameserver is defined
A 'resolvers' section may be defined without any nameserver. In that case,
we must take care to not dump corresponding Prometheus metrics. However
there is an issue that could lead to a crash or a strange infinite loop
because we are looping on an empty list and, at some point, we are
dereferencing an invalid pointer.
There is an issue because the loop on the nameservers of a resolvers section
is performed via callback functions and not the standard list_for_each_entry
macro. So we must take care to properly detect end of the list and empty
lists for nameservers. But the fix is not so simple because resolvers
sections with and without nameservers may be mixed.
To fix the issue, in rslv_promex_start_ts() and rslv_promex_next_ts(), when the
next resolvers section must be evaluated, a loop is now used to properly skip
empty sections.
This patch is related to #2831. Not sure it fixes it. It must be backported
as far as 3.0.
This commit is a direct follow-up to the previous one. As already
described, a previous fix was merged to prevent streamdesc attach
operation on already completed QCS instances scheduled for purging. This
was implemented by skipping app proto decoding.
However, this has a bad side-effect for remote uni-directional stream.
If receiving a FIN stream frame on such a stream, it will considered as
complete because streamdesc are never attached to a uni stream. Due to
the mentionned new fix, this prevent analysis of this last frame for
every uni stream.
To fix this, do not skip anymore app proto decoding for completed QCS.
Update instead qcs_attach_sc() to transform it as a noop function if QCS
is already fully closed before streamdesc instantiation. However,
success return value is still used to prevent an invalid decoding error
report.
The impact of this bug should be minor. Indeed, HTTP3 and QPACK uni
streams are never closed by the client as this is invalid due to the
spec. The only issue was that this prevented QUIC MUX to close the
connection with error H3_ERR_CLOSED_CRITICAL_STREAM.
This must be backported along the previous patch, at least to 3.1, and
eventually to 2.8 if mentionned patches are merged there.
MINOR: mux-quic: change return value of qcs_attach_sc()
A recent fix was introduced to ensure that a streamdesc instance won't
be attached to an already completed QCS which is eligible to purging.
This was performed by skipping application protocol decoding if a QCS is
in such a state. Here is the patch responsible for this change. caf60ac696a29799631a76beb16d0072f65eef12
BUG/MEDIUM: mux-quic: do not attach on already closed stream
However, this is too restrictive, in particular for unidirection stream
where no streamdesc is never attached. To fix this behavior, first
qcs_attach_sc() API has been modified. Instead of returning a streamdesc
instance, it returns either 0 on success or a negative error code.
There should be no functional changes with this patch. It is only to be
able to extend qcs_attach_sc() with the possibility of skipping
streamdesc instantiation while still keeping a success return value.
This should be backported wherever the above patch has been merged. For
the record, it was scheduled for immediate backport on 3.1, plus merging
on older releases up to 2.8 after a period of observation.
BUG/MINOR: mux-quic: fix wakeup on qcc_set_error()
The following patch was a major refactoring of QUIC MUX. It removes
pacing specific code path. In particular, qcc_wakeup() utility function
was removed and replaced by its tasklet_wakup() usage. 41f0472d967b2deb095d5adc8a167da973fbee3d
MEDIUM: mux-quic: remove pacing specific code on qcc_io_cb
However, an incorrect substitution was performed in qcc_set_error(). As
such, there was no explicit wakeup in case an error is detected by QUIC
MUX or the app protocol layer. This may lead to missing error reporting
to clients.
Fix this by re-add tasklet_wakup() usage into qcc_set_error().
This must be backported up to 3.1 where above patch is scheduled.
MINOR: config: Alert about extra arguments for errorfile and errorloc
errorfile and errorloc directives expect excatly two arguments. But extra
arguments were just ignored while an error should be emitted. It is now
fixed.
This patch could be backported as far as 2.2 if necessary.
BUG/MINOR: log: Allow to use if/unless conditionnals for do-log action
The do-log action does not accept argument for now. But an error was
triggered if any extra arguments was found, preventing the use of if/unless
conditionnals.
When an action is parsed, expected arguments must be tested to detect
missing ones but not unexpected extra arguments because this should be
performed by the conditionnal parser. So just removing the test in the
do-log parser function is enough to fix the issue.
Amaury Denoyelle [Tue, 31 Dec 2024 15:06:32 +0000 (16:06 +0100)]
BUG/MEDIUM: mux-quic: do not attach on already closed stream
Due to QUIC packet reordering, a stream may be opened via a new
RESET_STREAM or STOP_SENDING frame. This would cause either Tx or Rx
channel to be immediately closed.
This can cause an issue with current QUIC MUX implementation with QCS
purging. QCS are inserted into QCC purge list when transfer could be
considered as completed. In most cases, this happens after full
request/response exchange. However, it can also happens after request
reception if RESET_STREAM/STOP_SENDING are received first.
A BUG_ON() crash will occur if a STREAM frame is received after. In this
case, streamdesc instance will be attached via qcs_attach_sc() to handle
the new request. However, QCS is already considered eligible to purging.
It could cause it to be released while its streamdesc instance remains.
A BUG_ON() crash detects this problem in qcc_purge_streams().
To fix this, extend qcc_decode_qcs() to skip app proto rcv_buf
invokation if QCS is considered completed. A similar condition was
already implemented when read was previously aborted after a
STOP_SENDING emission by QUIC MUX.
This crash was reproduced on haproxy.org. Here is the output of the
backtrace :
Core was generated by `./haproxy-dev -db -f /etc/haproxy/haproxy-current.cfg -sf 16495'.
Program terminated with signal SIGILL, Illegal instruction.
#0 0x00000000004e442b in qcc_purge_streams (qcc=0x774cca0) at src/mux_quic.c:2661
2661 BUG_ON_HOT(!qcs_is_completed(qcs));
[Current thread is 1 (LWP 1457)]
[ ## gdb ## ] bt
#0 0x00000000004e442b in qcc_purge_streams (qcc=0x774cca0) at src/mux_quic.c:2661
#1 0x00000000004e4db7 in qcc_io_process (qcc=0x774cca0) at src/mux_quic.c:2744
#2 0x00000000004e5a54 in qcc_io_cb (t=0x7f71193940c0, ctx=0x774cca0, status=573504) at src/mux_quic.c:2886
#3 0x0000000000b4f792 in run_tasks_from_lists (budgets=0x7ffdcea1e670) at src/task.c:603
#4 0x0000000000b5012f in process_runnable_tasks () at src/task.c:883
#5 0x00000000007de4a3 in run_poll_loop () at src/haproxy.c:2771
#6 0x00000000007deb9f in run_thread_poll_loop (data=0x1335a00 <ha_thread_info>) at src/haproxy.c:2985
#7 0x00000000007dfd8d in main (argc=6, argv=0x7ffdcea1e958) at src/haproxy.c:3570
This BUG_ON() crash can only happen since 3.1 refactoring. Indeed, purge
list was only implemented on this version. As such, please backport it
on 3.1 immediately. However, a logic issue remains for older version as
a stream could be attached on a fully closed QCS. Thus, it should be
backported up to 2.8, this time after a period of observation.
Amaury Denoyelle [Tue, 31 Dec 2024 14:18:52 +0000 (15:18 +0100)]
MINOR: mux-quic: add traces on sd attach
Add traces into qcs_attach_sc(). This function is called when a request
is received on a QCS stream and a streamdesc instance is attached. This
will be useful to facilitate debugging.
BUG/MAJOR: mux-quic: properly fix BUG_ON on empty STREAM emission
Properly fix BUG_ON() occurence when QUIC MUX emits only empty STREAM
frames. This was addressed by a previous patch but it causes another
regression so a revert was needed.
BUG_ON() on qcc_build_frms() return value is invalid. Indeed,
qcc_build_frms() may return 0, but this does not imply that frame list
is empty, as encoded frames can have a zero length payload. As such,
simply remove this invalid BUG_ON().
Above patch tried to fix a BUG_ON() occurence when MUX only emitted
empty STREAM frames via qcc_build_frms(). Return value of qcs_send() was
changed from the payload STREAM frame to the whole frame length.
However, this is invalid as this return value is used to ensure
connection flow-control is not exceeded on sending retry. This causes
occurence of BUG_ON() crash in qcc_io_send() as send-list is not
properly purged after QCS emission.
Reverts this incorrect fix. The original issue will be properly dealt in
the next commit.
This commit must be backported to 3.1 if reverted commit was already
applied on it.
BUG/MEDIUM: mux-h2: Count copied data when looping on RX bufs in h2_rcv_buf()
When data was copied from RX buffers to the channel buffer, more data than
expected could be moved because amount of data copied was never decremented
from the limit. This could lead to a stream dead lock when the compression
filter was inuse.
The issue was introduced by commit 4eb3ff1 ("MAJOR: mux-h2: make streams use
the connection's buffers") but revealed by 3816c38 ("MAJOR: mux-h2: permit a
stream to allocate as many buffers as desired").
Because a h2 stream can now have several RX buffers, in h2_rcv_buf(), we
loop on these buffers to fill the channel buffer. However, we must still
take care to respect the limit to not copy to much data. However, the
"count" variable was never decremented to reflect amount of data already
copied. So, it was possible to exceed the limit.
It was an issue when the compression filter was inuse because the channel
buffer could be fully filled, preventing the compression to be
performed. When this happened, the stream was infinitly blocked because the
compression filter was asking for some space but nothing was scheduled to be
forwarded.
This patch should fix the issue #2826. It must be backported to 3.1.
Amaury Denoyelle [Tue, 31 Dec 2024 14:21:19 +0000 (15:21 +0100)]
BUG/MAJOR: mux-quic: fix BUG_ON on empty STREAM emission
A BUG_ON() is present in qcc_io_send() to ensure that encoded frame list
is empty if qcc_build_frms() previously returned 0.
This BUG_ON() may be triggered if empty STREAM frame is encoded for
standalone FIN. Indeed, qcc_build_frms() returns the sum of all STREAM
payload length. In case only empty STREAM frames are generated, return
value will be 0, despite new frames encoded and inserted into frame
list.
To fix this, change return value of qcs_send(). This now returns the
whole STREAM frame length, both header and payload included. This
ensures that qcc_build_frms() won't return a nul value if new frames are
encoded, even empty ones.
BUG/MINOR: stktable: invalid use of stkctr_set_entry() with mixed table types
Some actions such as "sc0_get_gpc0" (using smp_fetch_sc_stkctr()
internally) can take an optional table name as parameter to perform the
lookup on a different table from the tracked one but using the key from
the tracked entry. It is done by leveraging the stktable_lookup() function
which was originally meant to perform intra-table lookups.
Calling sc0_get_gpc0() with a different table name will result in
stktable_lookup() being called to perform lookup using a stktsess from
a different table. While it is theorically fine, it comes with a pitfall:
both tables (the one from where the stktsess originates and the actual
target table) should rely on the exact same key type and length.
Failure to do so actually results in undefined behavior, because the key
type and/or length from one table is used to perform the lookup in
another table, while the underlying lookup API expects explicit type and
key length.
For instance, consider the below example:
peers testpeers
bind 127.0.0.1:10001
server localhost
table test type binary len 1 size 100k expire 1h store gpc0
table test2 type string size 100k expire 1h store gpc0
Performing a curl request to localhost:8080 will cause unitialized reads
because string "ok" from test2 table will be compared as a string against
"AA" binary sample which is not NULL terminated:
==2450742== Conditional jump or move depends on uninitialised value(s)
==2450742== at 0x484F238: strlen (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==2450742== by 0x27BCE6: stktable_lookup (stick_table.c:539)
==2450742== by 0x281470: smp_fetch_sc_stkctr (stick_table.c:3580)
==2450742== by 0x283083: smp_fetch_sc_get_gpc0 (stick_table.c:3788)
==2450742== by 0x2A805C: sample_process (sample.c:1376)
So let's prevent that by adding some comments in stktable_set_entry()
func description, and by adding a check in smp_fetch_sc_stkctr() to ensure
both source stksess and target table share the same key properties.
While it could be relevant to backport this in all stable versions, it is
probably safer to wait for some time before doing so, to ensure that no
existing configs rely on this ambiguity because the fact that the target
table and source stksess entry need to share the same key type and length
is not explicitly documented.
DOC: config: add missing "track-sc0" in action keywords matrix
In d54e8f8107 ("DOC: config: reorganize actions into their own section"),
"track-sc0" keyword was properly documented but the keyword was not placed
in the action keywords matrix alongside other track-sc* statements. It
was probably overlooked, so let's fix that.
Willy Tarreau [Wed, 25 Dec 2024 14:17:01 +0000 (15:17 +0100)]
[RELEASE] Released version 3.2-dev2
Released version 3.2-dev2 with the following main changes :
- MINOR: build: define DEBUG_STRESS
- MINOR: applet: define applet_putchk_stress() alternative
- MINOR: stats: use stress mode to force reentrant dumps
- CI: scripts: add support for AWS-LC-FIPS in build-ssl.sh
- MINOR: ssl: add "FIPS" details in haproxy -vv
- MEDIUM: ssl: rename 'OpenSSL' by 'SSL library' in haproxy -vv
- CI: github: let's add an AWS-LC-FIPS job
- MINOR: window_filter: rely on the time to update the filter samples (QUIC/BBR)
- BUG/MINOR: quic: wrong logical statement in in_recovery_period() (BBR)
- BUG/MINOR: quic: fix BBB max bandwidth oscillation issue.
- BUG/MINOR: quic: wrong bbr_target_inflight() implementation
- BUG/MINOR: quic: remove max_bw filter from delivery rate sampling
- BUG/MINOR: quic: underflow issue for bbr_inflight_hi_from_lost_packet()
- BUG/MINOR: quic: reduce packet losses at least during ProbeBW_CRUISE (BBR)
- MINOR: quic: reduce the private data size of QUIC cc algos
- CLEANUP: quic: remove a wrong comment about ->app_limited (drs)
- BUG/MINOR: quic: fix the wrong tracked recovery start time value
- BUG/MINOR: quic: too permissive exit condition for high loss detection in Startup (BBR)
- BUG/MINOR: cli: cli_snd_buf: preserve \r\n for payload lines
- REGTESTS: ssl: add a PEM with mix of LF and CRLF line endings
- BUG/MINOR: quic: missing Startup accelerating probing bw states
- CLEANUP: quic: Rename some BBR functions in relation with bw probing
- REORG: startup: move global.maxconn calculations in limits.c
- REORG: startup: move code that applies limits to limits.c
- REORG: startup: move nofile limit checks in limits.c
- MINOR: ssl: add utils functions to extract X509 notAfter date
- MINOR: ssl/cli: allow to filter expired certificates with 'show ssl sni'
- MINOR: ssl/cli: add -A to the 'show ssl sni' command description
- BUG/MINOR: ssl/cli: 'show ssl cert' escape the first '*' of a filename
- BUG/MINOR: ssl/cli: 'show ssl crl-file' escape the first '*' of a filename
- BUG/MINOR: ssl/cli: 'show ssl ca-file' escape the first '*' of a filename
- BUG/MEDIUM: stconn: Only consider I/O timers to update stream's expiration date
- BUG/MEDIUM: queues: Make sure we call process_srv_queue() when leaving
- BUG/MEDIUM: queues: Do not use pendconn_grab_from_px().
- CLEANUP: queues: Remove pendconn_grab_from_px().
- BUILD: debug: only dump/reset glitch counters when really defined
- MINOR: compiler: add a __has_builtin() macro to detect features more easily
- MINOR: compiler: rely on builtin detection for __builtin_unreachable()
- MINOR: compiler: add a new "ASSUME" macro to help the compiler
- MINOR: compiler: also enable __builtin_assume() for ASSUME()
- MINOR: compiler: add ASSUME_NONNULL() to tell the compiler a pointer is valid
- MINOR: bug: make BUG_ON() fall back to ASSUME
- CLEANUP: cache: use ASSUME_NONNULL() instead of DISGUISE()
- CLEANUP: hlua: use ASSUME_NONNULL() instead of ALREADY_CHECKED()
- CLEANUP: htx: use ASSUME_NONNULL() to mark the start line as non-null
- CLEANUP: mux-fcgi: use ASSUME_NONNULL() to indicate that the first block exists
- CLEANUP: stats: use ASSUME_NONNULL() to indicate that the first block exists
- CLEANUP: quic: replace ALREADY_CHECKED() with ASSUME_NONNULL() at a few places
- CLEANUP: ssl-sock: drop two now unneeded ALREADY_CHECKED()
- BUG/MEDIUM: mux-quic: do not mix qcc_io_send() return codes with pacing
- CLEANUP: mux-quic: remove unused qcc member send_retry_list
- MINOR: quic: add traces
- MINOR: mux-quic: refactor wait-for-handshake support
- MEDIUM/OPTIM: mux-quic: define a recv_list for demux resumption
- MEDIUM/OPTIM: mux-quic: implement purg_list
- MINOR: mux-quic: extract code to build STREAM frames list
- MINOR: mux-quic: split STREAM and RS/SS emission
- MEDIUM/OPTIM: mux-quic: do not rebuild frms list on every send
- MEDIUM: mux-quic: remove pacing specific code on qcc_io_cb
- MINOR: trace: implement tracing disabling API
- MINOR: mux-quic: hide traces when woken up on pacing only
- MINOR: ssl/cli: add a 'Uncommitted' status for 'show ssl' commands
- MINOR: ssl/ocsp: Add extra details in error logs when possible
- BUILD: ssl/ocsp: error: â\80\98%.*sâ\80\99 directive argument is null
- MEDIUM: ssl/ocsp: OCSP response is expired with OCSP_MAX_RESPONSE_TIME_SKEW
- MINOR: ssl: improve HAVE_SSL_OCSP ifdef
- DOC: config: add example for server "track" keyword
- DOC: config: reorder "tune.lua.*" keywords by alphabetical order
- DOC: config: add "tune.lua.burst-timeout" to the list of global parameters
- MINOR: hlua: add option to preserve bool type from smp to lua
- REGTESTS: fix lua-based regtests using tune.lua.smp-preserve-bool
- BUG/MEDIUM: mux-quic: prevent BUG_ON() by refreshing frms on MAX_DATA
- CLEANUP: mux-quic: remove dead err label in qcc_build_frms()
- BUG/MINOR: h2/rhttp: fix HTTP2 conn counters on reverse
- MINOR: hlua: rename "tune.lua.preserve-smp-bool" to "tune.lua.bool-sample-conversion"
- MINOR: ssl: change visibility of ssl_stats_module
- MINOR: ssl: rework the error management in the OCSP callback
- MEDIUM: ssl/ocsp: counters for OCSP stapling
- CI: limit aws-lc and libressl Quic Interop to "haproxy" only
- BUG/MEDIUM: queue: Make process_srv_queue return the number of streams
- CI: github: try to build the latest WolfSSL master weekly
- CI: github: activate ASAN on the WolfSSL weekly job
- BUG/MINOR: stats: fix segfault caused by uninitialized value in "show schema json"
- MINOR: stktable: add stktable_get_data_type_idx() helper function
- MINOR: stktable: support optional index for array types in {set, clear, show} table commands
- CI: scripts: allow to build wolfssl with --enable-debug
- CI: github: activate debug in wolfssl weekly build
- BUG/MEDIUM: queues: Stricly respect maxconn for outgoing connections
- MEDIUM: queue: Handle the race condition between queue and dequeue differently
- CLEANUP: Remove pendconn_must_try_again().
- BUILD: compat: add missing fcntl.h before defining F_SETPIPE_SZ
- BUILD: mworker: always initialize the saveptr of strtok_r()
- BUILD: limits: make normalize_rlim() take an rlim_t to fix build on m68k
- BUG/MINOR: checks: handle a possible strdup() failure
- BUG/MINOR: listener: handle a possible strdup() failure
- BUG/MINOR: mux_h1: handle a possible strdup() failure
- BUG/MINOR: debug: handle a possible strdup() failure
The reason is the comparison between a ulong limit and RLIM_INFINITY.
Indeed, on m68k, rlim_t is an unsigned long long. Let's just change
the function's input type to take an rlim_t instead. This also allows
to get rid of the casts in the call place.
This can be backported to 3.1 though it's not important given the low
prevalence of this platform for such use cases.
Willy Tarreau [Wed, 25 Dec 2024 11:10:13 +0000 (12:10 +0100)]
BUILD: mworker: always initialize the saveptr of strtok_r()
Building with some libcs which define strtok_r() as an inline function
can yield a possibly uninitialized warning due to a loop dereferencing
this save pointer early, even though the doc clearly mentions that it
is ignored. This is actually more of a mismatch between the compiler
and the libc (gcc-4.7 and glibc-2.23 in that case). It's trivial to
set s2 to NULL here so let's do it to please this old couple. Note
that while the warning is triggered in all supported versions, there's
no point backporting it since it's unlikely this combination will be
relevant outside of backwards compatibility checks now.
Willy Tarreau [Wed, 25 Dec 2024 10:53:11 +0000 (11:53 +0100)]
BUILD: compat: add missing fcntl.h before defining F_SETPIPE_SZ
n 1.5-dev8, 13 years ago, support for setting pipe size was added by
commit bd9a0a778 ("OPTIM/MINOR: make it possible to change pipe size
(tune.pipesize)"). For compatibility purposes, it was defining
F_SETPIPE_SZ in compat.h if it was not set. It apparently always had
F_SETPIPE_SZ defined before being included.
Now in 3.2-dev1, commit fbc534a6f ("REORG: startup: move nofile limit
checks in limits.c") reordered a few includes and ended up with
mworker-prog.c including compat.h before fcntl.h, causing a redefinition
error on certain libcs:
CC src/mworker-prog.o
In file included from /usr/include/bits/fcntl.h:61:0,
from /usr/include/fcntl.h:35,
from include/haproxy/limits.h:11,
from include/haproxy/mworker.h:18,
from src/mworker-prog.c:27:
/usr/include/bits/fcntl-linux.h:203:0: warning: "F_SETPIPE_SZ" redefined [enabled by default]
In file included from include/haproxy/api-t.h:35:0,
from include/haproxy/api.h:33,
from src/mworker-prog.c:23:
include/haproxy/compat.h:161:0: note: this is the location of the previous definition
Let's simply include fcntl.h in compat.h before the macro is redefined.
There's normally no need to backport this, though it's harmless to do
it if needed.
MEDIUM: queue: Handle the race condition between queue and dequeue differently
There is a small race condition, where a server would check if there is
something left in the proxy queue, and adding something to the proxy
queue. If the server checks just before the stream is added to the queue,
and it no longer has any stream to deal with, then nothing will take
care of the stream, that may stay in the queue forever.
This was worked around with commit 5541d4995d, by checking for that exact
condition after adding the stream to the queue, and trying again to get
a server assigned if it is detected.
That fix lead to multiple infinite loops, that got fixed, but it is not
unlikely that it could happen again. So let's fix the initial problem
differently : a single server may mark itself as ready, and it removes
itself once used. The principle is that when we discover that the just
queued stream is alone with no active request anywhere ot dequeue it,
instead of rebalancing it, it will be assigned to that current "ready"
server that is available to handle it. The extra cost of the atomic ops
is negligible since the situation is super rare.
Olivier Houchard [Tue, 17 Dec 2024 13:30:46 +0000 (13:30 +0000)]
BUG/MEDIUM: queues: Stricly respect maxconn for outgoing connections
The "served" field of struct server is used to know how many connections
are currently in use for a server. But served used to be incremented way
after the server was picked, so there were race conditions that could
lead more than maxconn connections to be allocated for one server. To
fix this, increment served way earlier, and make sure at the time that
it never goes past maxconn.
We now should never have more outgoing connections than set by maxconn.
MINOR: stktable: support optional index for array types in {set, clear, show} table commands
As discussed in GH #2286, {set, clear, show} table commands were unable
to deal with array types such as gpt, because they handled such types as
a non-array types, thus only the first entry (ie: gpt[0]) was considered.
In this patch we add an extra logic around array-types handling so that
it is possible to specify an array index right after the type, like this:
set table peer/table key mykey data.gpt[2] value
# where 2 is the entry index that we want to access
If no index is specified, then it implicitly defaults to 0 to mimic
previous behavior.
BUG/MINOR: stats: fix segfault caused by uninitialized value in "show schema json"
Since b3d5708 ("MINOR: stats: remove implicit static trash_chunk usage")
a segfault can occur when issuing "show schema json" on the stats socket.
Indeed, now the dumping functions don't rely on trash_chunk anymore, but
instead they rely on the appctx->chunk buffer. However, unlike other
stats dumping commands, the "show schema json" only have an io handler,
and no parse function. With other command, the parse function is
responsible for pre-setting some data, including applet ctx reservation.
Thus due to "show schema json" lacking parsing function, the applet ctx is
used uninitialized, which is a bug obviously.
To fix the issue we simply add a parse function for "show schema json",
although all it does for now is calling applet_reserve_svcctx() for the
current applet ctx.
This issue was reported by @dsuch in GH #2825. It must be backported up
to 3.0.
Olivier Houchard [Mon, 23 Dec 2024 14:17:25 +0000 (14:17 +0000)]
BUG/MEDIUM: queue: Make process_srv_queue return the number of streams
Make process_srv_queue() return the number of streams unqueued, as
pendconn_grab_from_px() did, as that number is used by
srv_update_status() to generate logs.
Add 2 counters in the SSL stats module for OCSP stapling.
- ssl_ocsp_staple is the number of OCSP response successfully stapled
with the handshake
- ssl_failed_ocsp_stapled is the number of OCSP response that we
couldn't staple, it could be because of an error or because the
response is expired.
These counters are incremented in the OCSP stapling callback, so if no
OCSP was configured they won't never increase. Also they are only
working in frontends.
MINOR: hlua: rename "tune.lua.preserve-smp-bool" to "tune.lua.bool-sample-conversion"
A better name was found for the option implemented in ec74438
("MINOR: hlua: add option to preserve bool type from smp to lua")
Indeed, "tune.lua.preserve-smp-bool {on | off}" wasn't explicit enough
nor did it encourage the adoption of the new "fixed" behavior (vs
historical behavior which is now considered as a bug).
Thus it becomes "tune.lua.bool-sample-conversion { normal | pre-3.1-bug }"
which actively encourage users to switch the new behavior after having
patched in-use Lua script if needed. From a technical point of view,
the logic remains the same, as the option currently defaults to
"pre-3.1-bug" to prevent script breakage, and a warning is emitted if
the option isn't set explicily and Lua is used.
Documentation and regtests were updated.
Must be backported in 3.1 with ec74438 and f2838f5 ("REGTESTS: fix
lua-based regtests using tune.lua.smp-preserve-bool")
Amaury Denoyelle [Mon, 16 Dec 2024 10:00:23 +0000 (11:00 +0100)]
BUG/MINOR: h2/rhttp: fix HTTP2 conn counters on reverse
Dedicated HTTP/2 stats proxy counters are available for current and
total number of HTTP/2 connection on both frontend and backend sides.
Both counters are simply incremented into h2_init().
This causes issues when using reverse HTTP. First, increment is not
performed on the expected side, as it is triggered before
h2_conn_reverse() which switches a connection from frontend to backend
or vice versa. For example on active revers side, h2_total_connections
is incremented on the backend only even after connection is reversed and
attached to a listener for the remainder of its lifetime.
h2_open_connections suffers from a similar but arguably worst behavior
as it is also decremented. If increment and decrement operations are not
performed on the same proxy side, which happens for every connection
which has been successfully reversed, it causes an invalid counter
value, possibly with an integer overflow.
To fix this, delay increment operations on reverse HTTP from h2_init()
to h2_conn_reverse(). Both counters are updated only after reverse has
completed, thus using the expected frontend or backend side.
To prevent overflow on h2_open_connections, ensure h2_release()
decrement is not performed if a connection is freed before achieving its
reversal, as in this case it would not have been accounted by H2
counters.
Amaury Denoyelle [Thu, 19 Dec 2024 15:34:18 +0000 (16:34 +0100)]
CLEANUP: mux-quic: remove dead err label in qcc_build_frms()
STREAM frames emission in qcc_build_frms() has been splitted from
RESET_STREAM/STOP_SENDING into qcc_emit_rs_ss(). Now, the former cannot
fail, as such err label can be removed as it is unreachable.
Amaury Denoyelle [Thu, 19 Dec 2024 13:17:37 +0000 (14:17 +0100)]
BUG/MEDIUM: mux-quic: prevent BUG_ON() by refreshing frms on MAX_DATA
QUIC MUX emission has been optimized recently by recycling STREAM frames
list between emission cycles. This is done via qcc frms list member. If
new data is available, frames list must be cleared before the next
emission to force the encoding of new STREAM frames.
If a refresh frames list is missed, it would lead to incomplete data
emission on the next transfer. In most cases, this is detected via a
BUG_ON() inside qcc_io_send(), as qcs instances remains in send_list
after a qcc_send_frames() full emission.
A bug was recently found which causes this BUG_ON() crash. This is
directly related to flow control. Indeed, when sending credit is
increased on the connection or a stream, frames list should be cleared
as new larger STREAM frames could be encoded. This was already performed
on MAX_DATA/MAX_STREAM_DATA reception but only if flow-control limit was
unblocked. However this is not the proper condition and it may lead to
insufficient frames refresh and thus this BUG_ON() crash.
Fix this by adjusting the condition for frames refresh on flow control
credit increase. Now, frames list is cleared if real offset is not
blocked and soft offset was equal or greater to the previous limit.
Indeed, this is the only case in which frames refreshing is necessary as
it would result in bigger encoded STREAM frames.
This bug was detected on QUIC interop with go-x-net client. It can also
be reproduced, albeit not systematically, using the following command :
$ ngtcp2-client -q --no-quic-dump --no-http-dump \
--exit-on-all-streams-close --max-data 10 \
127.0.0.1 20443 -n10 "http://127.0.0.1:20443/?s=10k"
This bug appeared with the following patch. As it is scheduled for 3.1
backporting, the current fix should be backported with it. 14710b5e6bf76834343d58db22e00b72590b16fe
MEDIUM/OPTIM: mux-quic: do not rebuild frms list on every send
REGTESTS: fix lua-based regtests using tune.lua.smp-preserve-bool
Because of the previous commit, configs making use of lua script without
setting "tune.lua.smp-preserve-bool" explicitly now raise a warning.
However, since 6f746af91 ("REGTESTS: use -dW by default on every
reg-tests"), regtests are not allowed to raise warnings anymore.
Because of this the CI now fails for every tests that relies on Lua.
To fix this, let's explicitly set the "tune.lua.smp-preserve-bool" for
all tests involving Lua. Here we set the value to "on" because we know
it is safe to do so, and this way it will be future-proof.
If ec7443827 ("MINOR: hlua: add option to preserve bool type from smp to
lua") is backported, then this patch must be backported with it (if it
is not trivial to backport, then simply follow this rule: grep for
"lua-load" in reg-tests directory, then for each match, make sure to set
the tune.smp-preserve-bool tunable in the global section.
MINOR: hlua: add option to preserve bool type from smp to lua
As discussed in GH #2814, there is an ambiguity in hlua implementation
that causes haproxy smp boolean type to be pushed as an integer on the
Lua stack. On the other hand, when doing Lua to haproxy smp conversion,
the boolean type is properly perserved. Of course this situation is not
desirable and can lead to unexpected results. However we cannot simply
fix the behavior because in Lua boolean and integer types are not
are completely distinct types and cannot be used interchangeably. So in
order to prevent breaking existing scripts logic, in this patch we add a
dedicated lua tunable named "tune.lua.smp-preserve-bool" which can take
the following values:
- "on" : when converting haproxy smp to lua, boolean type is preserved
- "off": when converting haproxy smp to lua, boolean is converted to
integer (legacy behavior)
For now, the tunable defaults to "off" to preserve historical behavior.
However, when the option isn't set explicitly and lua is used, a warning
will be emitted in order to raise user's awareness about this ambiguity.
It is expected that the tunable could default to "on" in future versions,
thus it is recommended to avoid setting it to "off" except when using
existing Lua scripts that still rely on the old behavior regarding boolean
smp to Lua conversion, and that they cannot be fixed easily.
This should solve issue GH #2814. It may be relevant to backport this in
haproxy 3.1.
DOC: config: add "tune.lua.burst-timeout" to the list of global parameters
"tune.lua.burst-timeout" was properly defined but not listed in the list
of global parameters as it was overlooked in 58e36e5b1 ("MEDIUM: hlua:
introduce tune.lua.burst-timeout")
Allow to build correctly without OCSP. It could be disabled easily with
OpenSSL build with OPENSSL_NO_OCSP. Or even with
DEFINE="-DOPENSSL_NO_OCSP" on haproxy make line.
MEDIUM: ssl/ocsp: OCSP response is expired with OCSP_MAX_RESPONSE_TIME_SKEW
When a OCSP response has a nextUpdate date which is
OCSP_MAX_RESPONSE_TIME_SKEW (300) seconds in the future, the OCSP
stapling callback ssl_sock_ocsp_stapling_cbk() returns SSL_TLSEXT_ERR_NOACK.
However we don't emit an error when trying to load the file.
There is a OCSP_check_validity() check using
OCSP_MAX_RESPONSE_TIME_SKEW, but it checks that the OCSP response is not
thisUpdate is not too much in the past.
This patch emits an error during loading so we don't try to load an OCSP
response which would never be emitted because of OCSP_MAX_RESPONSE_TIME_SKEW.
MINOR: ssl/ocsp: Add extra details in error logs when possible
When the ocsp response auto update process fails during insertion or
while validating the received ocsp response, we call
ssl_sock_update_ocsp_response or ssl_ocsp_check_response respectively
and both these functions take an 'err' parameter in which detailed error
messages can be written. Until now, those error messages were discarded
and the only information given to the user was a generic error
(ERR_CHECK or ERR_INSERT) which does not help much.
We now keep a pointer to the last error message in the certificate_ocsp
structure and dump its content in the update logs as well as in the
"show ssl ocsp-updates" cli command.
Amaury Denoyelle [Thu, 12 Dec 2024 14:28:35 +0000 (15:28 +0100)]
MINOR: mux-quic: hide traces when woken up on pacing only
Previous commit aligned default and pacing emission. This is a cleaner
and more robust code. However, it may disrupt traces analysis when
pacing is rescheduled until timer expiration.
Hide traces when qcc_io_cb() is woken up only due to pacing and timer is
not yet expired. This is implemented by using special TASK_WOKEN_IO for
pacing.
Amaury Denoyelle [Tue, 17 Dec 2024 15:28:16 +0000 (16:28 +0100)]
MINOR: trace: implement tracing disabling API
Define a set of functions to temporarily disable/reactivate tracing for
the current thread. This could be useful when wanting to quickly remove
tracing output for some code parts.
The API relies on a disable/resume set of functions, with a thread-local
counter. This counter is tested under __trace_enabled(). It is a
cumulative value so that the same count of resume must be issued after
several disable usage. There is also the possibility to force reset the
counter to 0 before restoring the old value.
Amaury Denoyelle [Thu, 12 Dec 2024 13:53:49 +0000 (14:53 +0100)]
MEDIUM: mux-quic: remove pacing specific code on qcc_io_cb
Pacing was recently implemented by QUIC MUX. Its tasklet is rescheduled
until next emission timer is reached. To improve performance, an
alternate execution of qcc_io_cb was performed when rescheduled due to
pacing. This was implemented using TASK_F_USR1 flag.
However, this model is fragile, in particular when several events
happened alongside pacing scheduling. This has caused some issue
recently, most notably when MUX is subscribed on transport layer on
receive for handshake completion while pacing emission is performed in
parallel. MUX qcc_io_cb() would not execute the default code path, which
means the reception event is silently ignored.
Recent patches have reworked several parts of qcc_io_cb. The objective
was to improve performance with better algorithm on send and receive
part. Most notable, qcc frames list is only cleared when new data is
available for emission. With this, pacing alternative code is now mostly
unneeded. As such, this patch removes it. The following changes are
performed :
* TASK_F_USR1 is now not used by QUIC MUX. As such, tasklet_wakeup()
default invokation can now replace obsolete wrappers
qcc_wakeup/qcc_wakeup_pacing
* qcc_purge_sending is removed. On pacing rescheduling, all qcc_io_cb()
is executed. This is less error-prone, in particular when pacing is
mixed with other events like receive handling. This renders the code
less fragile, as it completely solves the described issue above.
Amaury Denoyelle [Thu, 12 Dec 2024 11:03:37 +0000 (12:03 +0100)]
MEDIUM/OPTIM: mux-quic: do not rebuild frms list on every send
A newly introduced frames list member has been defined into QCC instance
with pacing implementation. This allowed to preserve STREAM frames built
between different emission scheduled by pacing, without having to
regenerate it if no new QCS data is available.
Generalize this principle outside of pacing scheduling. Now, the frames
list will be reused accross several qcc_io_send() usage. Frames list is
only cleared when necessary. This will force its refreshing in the next
qcc_io_send() via qcc_build_frms_list().
Frames list refreshing is performed in the following cases :
* on successful transfer from stream snd_buf / done_ff / shut
* on stream reset or read abort
* on max_data/max_stream_data reception with window increase
Note that the two first cases are in fact covered directly due to
qcc_send_stream() usage when QCS is (re)inserted into the send_list.
The main objective of this patch will be to remove QUIC MUX pacing
specific code path. It could also provide better performance as emission
of large frames may often be rescheduled due to transport layer, either
on congestion or full socket buffer. When QUIC MUX is rescheduled, no
new data is available and frames list can be reuse as-is, avoiding an
unecessary loop over send_list.
Amaury Denoyelle [Fri, 13 Dec 2024 16:44:38 +0000 (17:44 +0100)]
MINOR: mux-quic: split STREAM and RS/SS emission
This commit is a follow-up of the previous one which defines function
qcc_build_frms(). This function implements looping over qcc send_list,
to both encode and send individually any STOP_SENDING and RESET_STREAM,
but also encode STREAM frames as a preparator step. STREAM frames were
then sent as a list outside of qcc_build_frms() via qcc_send_frames().
Extract STOP_SENDING/RESET_STREAM encoding and emission step into a new
function qcc_emit_rs_ss(). The code is thus cleaner. In particular it
highlights that an error during STOP_SENDING/RESET_STREAM emission stage
is fatal and prevent any STREAM frames processing.
Amaury Denoyelle [Wed, 11 Dec 2024 16:53:29 +0000 (17:53 +0100)]
MINOR: mux-quic: extract code to build STREAM frames list
Extracts code responsible to generate STREAM, RESET_STREAM and
STOP_SENDING frames for each qcs instances registered in qcc send_list.
It is moved from qcc_io_send() to its owned new function
qcc_build_frms().
This commit does not bring functional change. It is a preparatory step
to adapt QUIC MUX send mechanism to allow reusing of qcc frms list
accross qcc_io_send() invokation.
As a side change, qcc_tx_frms_free() is renamed to qcc_clear_frms().
This better highlights its relationship with qcc_build_frms().
Amaury Denoyelle [Tue, 17 Dec 2024 10:16:34 +0000 (11:16 +0100)]
MEDIUM/OPTIM: mux-quic: implement purg_list
This commit is part of the current serie which aims to refactor and
improve overall performance of QUIC MUX I/O handler.
qcc_io_process() is responsible to perform some internal operations on
QUIC MUX after I/O completion. It is notably called on every qcc_io_cb()
tasklet handler.
The most intensive work on it is the purging of QCS instances after
transfer completion. This was implemented by looping on QCC streams tree
and inspecting the state of every QCS. The purpose of this commit is to
optimize this processing.
A new purg_list QCC member is defined. It is responsible to list every
QCS instances whose transfer has been completed. It is thus safe to
reuse <el_send> QCS list attach point. Stream purging will thus only
loop on purg_list instead of every known QCS.
MEDIUM/OPTIM: mux-quic: define a recv_list for demux resumption
This commit is part of the current serie which aims to refactor and
improve overall performance of QUIC MUX I/O handler.
Define a recv_list element into qcc structure. This is used to
registered every instance of qcs which are currently blocked on
demuxing, which happen on no more space in <rx.appbuf>.
The purpose of this patch is to reduce qcc_io_recv() CPU usage. Now,
only recv_list iteration is performed, instead of the previous looping
over every qcs instances. This is useful as qcc_io_recv() is called each
time qcc_io_cb() is scheduled, even if only sending condition was the
wakeup origin.
A qcs is not inserted into recv_list immediately after blocking on demux
full buffer. Instead, this is only done after unblocking via stream
rcv_buf callback, which ensure that new buffer space is available.
Amaury Denoyelle [Fri, 29 Nov 2024 15:18:12 +0000 (16:18 +0100)]
MINOR: mux-quic: refactor wait-for-handshake support
This commit refactors wait-for-handshake support from QUIC MUX. The flag
logic QC_CF_WAIT_HS is inverted : it is now positionned only if MUX is
instantiated before handshake completion. When the handshake is
completed, the flag is removed.
The flag is now set directly on initialization via qmux_init(). Removal
via qcc_wait_for_hs() is moved from qcc_io_process() to qcc_io_recv().
This is deemed more logical as QUIC MUX is scheduled on RECV to be
notify by the transport layer about handshake termination. Moreover,
qcc_wait_for_hs() is now called if recv subscription is still active.
This commit is the first of a serie which aims to refactor QUIC MUX I/O
handler and improves its overall performance. The ultimate objective is
to be able to stream qcc_io_cb() by removing pacing specific code path
via qcc_purge_sending().
Amaury Denoyelle [Wed, 11 Dec 2024 16:55:29 +0000 (17:55 +0100)]
BUG/MEDIUM: mux-quic: do not mix qcc_io_send() return codes with pacing
With pacing implementation, qcc_send_frames() return code has been
extended to report emission interruption due to pacing limitation. This
is used only in qcc_io_send().
However, its invokation may be skipped using 'sent_done' label. This
happens on emission failure of a STOP_SENDING or RESET_STREAM (either
memory allocation failure, or transport layer rejection). In this case,
return values are mixed as qcs_send() is wrongly compared against pacing
interruption condition. This value corresponds to the length of the last
built STREAM frames.
If by mischance the last frame was 1 byte long, qcs_send() return value
is equal to pacing interruption condition. This has several effects. If
pacing is activated, it may lead to unneeded wakeup on QUIC MUX. Worst,
if pacing is not used, a BUG_ON() crash will be triggered.
Fix this by using a different variable dedicated to qcc_send_frames()
return value. By default it is initialized to 0. This ensures that
pacing code won't be activated in case qcc_send_frames() is not used.
Willy Tarreau [Tue, 17 Dec 2024 14:02:39 +0000 (15:02 +0100)]
CLEANUP: ssl-sock: drop two now unneeded ALREADY_CHECKED()
In ssl_sock_bind_verifycbk() a BUG_ON() checks the validity of "ctx" and
"bind_conf". There was a pair of ALREADY_CHECKED() macros after BUG_ON()
for the case where DEBUG_STRICT=0. But this is now addressed so we can
remove these two macros and rely on the BUG_ON() instead.
Willy Tarreau [Tue, 17 Dec 2024 13:54:20 +0000 (14:54 +0100)]
CLEANUP: quic: replace ALREADY_CHECKED() with ASSUME_NONNULL() at a few places
There were 4 instances of ALREADY_CHECKED() used to tell the compiler that
the argument couldn't be NULL by design. Let's change them to the cleaner
ASSUME_NONNULL(). Functions like qc_snd_buf() were slightly reduced in
size (-24 bytes).
Apparently gcc-13 sees a potential case that others don't see, and it's
likely a bug since depending what is masked, it will completely change
the output warnings to the point of contradicting itself. After many
attempts, it appears that just checking that CMSG_FIRSTHDR(msg) is not
null suffices to calm it down, so the strange warnings might have been
the result of an overoptimization based on a supposed UB in the first
place. At least now all versions up to 13.2 as well as clang are happy.
Willy Tarreau [Tue, 17 Dec 2024 13:31:08 +0000 (14:31 +0100)]
CLEANUP: stats: use ASSUME_NONNULL() to indicate that the first block exists
In stats_scope_ptr(), the validity of blk() was assumed using
ALREADY_CHECKED(blk), but we can now use the cleaner ASSUME_NONNULL().
In addition this simplifies the BUG_ON() check that follows.
Willy Tarreau [Thu, 7 Nov 2024 10:12:40 +0000 (11:12 +0100)]
MINOR: bug: make BUG_ON() fall back to ASSUME
When the strict level is zero and BUG_ON() is not implemented, some
possible null-deref warnings are emitted again because some were
covering for these cases. Let's make it fall back to ASSUME() so that
the compiler continues to know that the tested expression never happens.
It also allows to further optimize certain functions by helping the
compiler eliminate certain tests for impossible values. However it
requires that the expression is really evaluated before passing the
result through ASSUME() otherwise it was shown that gcc-11 and above
will fail to evaluate its implications and will continue to emit the
null-deref warnings in case the expression is non-trivial (e.g. it
has multiple terms).
We don't do it for BUG_ON_HOT() however because the extra cost of
evaluating the condition is generally not welcome in fast paths,
particularly when that BUG_ON_HOT() was kept disabled for
performance reasons.
Willy Tarreau [Tue, 17 Dec 2024 09:42:07 +0000 (10:42 +0100)]
MINOR: compiler: add ASSUME_NONNULL() to tell the compiler a pointer is valid
At plenty of places we have ALREADY_CHECKED() or DISGUISE() on a pointer
just to avoid "possibly null-deref" warnings. These ones have the side
effect of weakening optimizations by passing through an assembly step.
Using ASSUME_NONNULL() we can avoid that extra step. And when the
__builtin_unreachable() builtin is not present, we fall back to the old
method using assembly. The macro returns the input value so that it may
be used both as a declarative way to claim non-nullity or directly inside
an expression like DISGUISE().
Willy Tarreau [Tue, 17 Dec 2024 08:19:20 +0000 (09:19 +0100)]
MINOR: compiler: also enable __builtin_assume() for ASSUME()
Clang apparently has __builtin_assume() which does exactly the same
as our macro, since at least v3.8. Let's enable it, in case it may
even better detect assumptions vs unreachable code.
Willy Tarreau [Thu, 7 Nov 2024 10:09:33 +0000 (11:09 +0100)]
MINOR: compiler: add a new "ASSUME" macro to help the compiler
This macro takes an expression, tests it and calls an unreachable
statement if false. This allows the compiler to know that such a
combination does not happen, and totally eliminate tests that would
be related to this condition. When the statement is not available
in the compiler, we just perform a break from a do {} while loop
so that the expression remains evaluated if needed (e.g. function
call).
Willy Tarreau [Tue, 17 Dec 2024 08:10:53 +0000 (09:10 +0100)]
MINOR: compiler: rely on builtin detection for __builtin_unreachable()
Due to __builtin_unreachable() only being associated to gcc 4.5 and
above, it turns out it was not enabled for clang. It's not used *that*
much but still a little bit, so let's enable it now. This reduces the
code size by 0.2% and makes it a bit more efficient.
Willy Tarreau [Tue, 17 Dec 2024 07:54:23 +0000 (08:54 +0100)]
MINOR: compiler: add a __has_builtin() macro to detect features more easily
We already have a __has_attribute() macro to detect when the compiler
supports a specific attribute, but we didn't have the equivalent for
builtins. clang-3 and gcc-10 have __has_builtin() for this. Let's just
bring it using the same mechanism as __has_attribute(), which will allow
us to simply define the macro's value for older compilers. It will save
us from keeping that many compiler-specific tests that are incomplete
(e.g. the __builtin_unreachable() test currently doesn't cover clang).
Willy Tarreau [Mon, 16 Dec 2024 08:31:27 +0000 (09:31 +0100)]
BUILD: debug: only dump/reset glitch counters when really defined
If neither DEBUG_GLITCHES nor DEBUG_STRICT is set, we end up with
no dbg_cnt section, resulting in debug_parse_cli_counters not
building due to __stop_dbg_cnt and __start_dbg_cnt not being defined.
Let's just condition the end of the function to these conditions.
An alternate approach (less elegant) is to always declare a dummy
entry of type DBG_COUNTER_TYPES in debug.c.
This must be backported to 3.1 since it was brought with glitches.