MEDIUM: queue: use tasklet_instant_wakeup() to wake tasks
It's long been known that queues didn't scale with threads for various
reasons ranging from the cost of the queue lock to the cost of the
massive amount of inter-thread wakeups.
But some recent reports showing deplorable perfs with threads used at
100% CPU helped us notice that the two elements above add on top of
each other:
- with plenty of inter-thread wakeups, the scheduler takes a lot of
time to dequeue pending tasks from the shared queue ;
- the lock held by the scheduler to do this slows down subsequent
task_wakeup() calls from the the queue that are made under the
queue's lock
- the queue's lock slows down addition of new requests to the queue
and adds up to the number of needed queue entries for a steady
traffic.
But the cost of the share queue has no reason for being paid because
it had already been paid when process_stream() added the request to
the queue. As such an instant wakeup is perfectly fit for this.
This is exactly what this patch does, it uses tasklet_instant_wakeup()
to dequeue pending requests, which has the effect of not bloating the
shared queue, hence not requiring the global queue lock, which in turn
results in the wakeup to be much faster, and the queue lock to be much
shorter. In the end, a test with 4k concurrent connections that was
being limited to 40-80k requests/s before with 16 threads, some of
which were stuck at 100% CPU now reaches 570k req/s with 4% idle.
Given that it's been found that it was possible to trigger the watchdog
on the queue lock under extreme conditions, and that such conditions
could happen when users want to protect their servers during a DoS, it
would definitely make sense to backport it to the most recent releases
(2.5 and 2.4 seem like good candidates especially because their scheduler
is modern enough to receive the change above). If a backport is performed,
the following patch is needed:
MINOR: task: add a new task_instant_wakeup() function
MINOR: task: add a new task_instant_wakeup() function
This function's purpose is to wake up either a local or remote task,
bypassing the tree-based run queue. It is meant for fast wakeups that
are supposed to be equivalent to those used with tasklets, i.e. a task
had to pause some processing and can complete (typically a resource
becomes available again). In all cases, it's important to keep in mind
that the task must have gone through the regular scheduling path before
being blocked, otherwise the task priorities would be ignored.
The reason for this is that some wakeups are massively inter-thread
(e.g. server queues), that these inter-thread wakeups cause a huge
contention on the shared runqueue lock. A user reported 47% CPU spent
in process_runnable_tasks with only 32 threads and 80k requests in
queues. With this mechanism, purely one-to-one wakeups can avoid
taking the lock thanks to the mt_list used for the shared tasklet
queue.
Right now the shared tasklet queue moves everything to the TL_URGENT
queue. It's not dramatic but it would seem better to have a new shared
list dedicated to tasks, and that would deliver into TL_NORMAL, for an
even better fairness. This could be improved in the future.
BUG/MAJOR: connection: Never remove connection from idle lists outside the lock
Since the idle connections management changed to use eb-trees instead of MT
lists, a lock must be acquired to manipulate servers idle/safe/available
connection lists. However, it remains an unprotected use in
connect_server(), when a connection is removed from an idle list if the mux
has no more streams available. Thus it is possible to remove a connection
from an idle list on a thread, while another one is looking for a idle
connection. Of couse, this may lead to a crash.
To fix the bug, we must take care to acquire the idle connections lock
first. The bug was introduced by the commit f232cb3e9 ("MEDIUM: connection:
replace idle conn lists by eb trees").
MEDIUM: httpclient/ssl: verify is configurable and disabled by default
Disable temporary the SSL verify by default in the httpclient. The
initialization of the @system-ca during the init of the httpclient is a
problem in some cases.
The verify can be reactivated with "httpclient-ssl-verify required" in
the global section.
The httpclient HTTPS requests now enable the "verify required" option.
To achieve this, the "@system-ca" ca-file is configured in the
httpclient ssl server. Which means all the system CAs will be loaded at
haproxy startup.
Change the init order of the httpclient, a different init sequence is
required to allow a more complicated init.
The init is splitted in two parts:
- the first part is executed before config_check_validity(), which
allows to create proxy and more advanced stuff than STG_INIT, because
we might want to use stuff already initialized in haproxy (trash
buffers for example)
- the second part is executed after the config_check_validity(),
currently it is used for the log configuration.
This adds a call to function <fct> to the list of functions to be called at
the step just before the configuration validity checks. This is useful when you
need to create things like it would have been done during the configuration
parsing and where the initialization should continue in the configuration
check.
It could be used for example to generate a proxy with multiple servers using
the configuration parser itself. At this step the trash buffers are allocated.
Threads are not yet started so no protection is required. The function is
expected to return non-zero on success, or zero on failure. A failure will make
the process emit a succinct error message and immediately exit.
MINOR: conn-stream: Make cs_detach_* private and use cs_destroy() from outside
A conn-stream is never detached from an endpoint or an application alone,
except on a reset. Thus, to avoid any error, these functions are now
private. And cs_destroy() function is added to destroy a conn-stream. This
function is called when a stream is released, on the front and back
conn-streams, and when a health-check is finished.
MINOR: stream: Don't needlessly detach server endpoint on early client abort
When a client abort is detected with the server conn-stream in CS_ST_INI
state, there is no reason to detach the endpoing because we know there is no
endpoint attached to this conn-stream. This patch depends on the commit
"BUG/MEDIUM: conn-stream: Set back CS to RDY state when the appctx is
created".
BUG/MEDIUM: conn-stream: Set back CS to RDY state when the appctx is created
When an appctx is created on the server side, we now set the corresponding
conn-stream to ready state (CS_ST_RDY). When it happens, the backend
conn-stream is in CS_ST_INI state. It is not consistant to let the
conn-stream in this state because it means it is possible to have a target
installed in CS_ST_INI state, while with a connection, the conn-stream is
switch to CS_ST_RDY or CS_ST_EST state.
It is especially anbiguous because we may be tempted to think there is no
endpoint attached to the conn-stream before the CS_ST_CON state. And it is
indeed the reason for a bug leading to a crash because a cs_detach_endp() is
performed if an abort is detected on the backend conn-stream in CS_ST_INI
state. With a mux or a appctx attached to the conn-stream, "->endp" field is
set to NULL. It is unexpected. The API will be changed to be sure it is not
possible. But it exposes a consistency issue with applets.
So, the conn-stream must not stay in CS_ST_INI state when an appctx is
attached. But there is no reason to set it in CS_ST_REQ. The conn-stream
must be set to CS_ST_RDY to handle applets and connections in the same
way. Note that if only the target is set but no appctx is created, the
backend conn-stream is switched from CS_ST_INI to CS_ST_REQ state to be able
to create the corresponding appctx. This part is unchanged.
This patch depends on the commit "MINOR: backend: Don't allow to change
backend applet".
The ambiguity exists on previous versions. But the issue is
2.6-specific. Thus, no backport is needed.
BUG/MINOR: backend: Don't allow to change backend applet
This part was inherited from haproxy-1.5. But since a while (at least 1.8),
the backend applet, once created, is no longer changed. Thus there is no
reason to still check if the target has changed. And in fact, if it was
still possible, there would be a memory leak because the old applet would be
lost and never released.
There is no reason to backport this fix because the leak only exists on a
dead code path.
BUG/MINOR: cache: Disable cache if applet creation fails
When we want to serve a resource from the cache, if the applet creation
fails, the "cache-use" action must not yield. Otherwise, the stream will
hang. Instead, we now disable the cache. Thus the request may be served by
the server.
MINOR: conn-stream: Rely on endpoint shutdown flags to shutdown an applet
cs_applet_shut() now relies on CS_EP_SH* flags to performed the applet
shutdown. It means the applet release callback is called if there is no
CS_EP_SHR or CS_EP_SHW flags set. And it set these flags, CS_EP_SHRR and
CS_EP_SHWN more specifically, before exiting.
This way, cs_applet_shut() is the really equivalent to cs_conn_shut().
This function does not release the applet but only call the applet release
callback. It is equivalent to cs_conn_shut() but for applets. Thus the
function is renamed cs_applet_shut().
CLEANUP: conn-stream: Rename cs_conn_close() and cs_conn_drain_and_close()
These functions don't close the connection but only perform shutdown for
reads and writes at the mux level. It is a bit ambiguous. Thus,
cs_conn_close() is renamed cs_conn_shut() and cs_conn_drain_and_close() is
renamed cs_conn_drain_and_shut(). These both functions rely on
cs_conn_shutw() and cs_conn_shutr().
DEV: stream: Fix conn-streams dump in full stream message
Since the recent changes about the conn-streams, the stream dump in "show
sess all" command is a bit mangled. front and back conn-stream are now
properly displayed (csf and csb). In addition, when there is no backend
endpoint, "APPCTX" was always reported. Now, "NONE" is reported in this
case.
As anticipated in commit 211ea252d ("BUG/MINOR: logs: fix logsrv leaks
on clean exit"), there were indeed other corner cases that were not
properly covered. Setting the http client's ring_name to NULL make the
sink lookup crash on startup in sink_find () with a config as simple as:
global
log ring@buf0 local0
The fields must be properly initialized (both config file name and
the ring_name). This only needs to be backported if/when the commit
above is backported.
Do not initialize mux task timeout if timeout client is set to 0 in the
configuration. Check for the task before queuing it in qc_io_cb() or
qc_detach().
This fix a crash when timeout client is 0 or undefined.
Unsubscribe from lower layer on qc_release. This ensures that the lower
layer won't wake up a null tasklet after the MUX has been released and
may prevent a crash.
BUG/MEDIUM: quic: Possible crash with released mux
It is possible the xprt layer have to process retransmitted STREAM frames after
the mux was released. In this case, there is no need to try to wake it up.
MINOR: ssl: Add 'show ssl providers' cli command and providers list in -vv option
Starting from OpenSSLv3, providers are at the core of cryptography
functions. Depending on the provider used, the way the SSL
functionalities work could change. This new 'show ssl providers' CLI
command allows to show what providers were loaded by the SSL library.
This is required because the provider configuration is exclusively done
in the OpenSSL configuration file (/usr/local/ssl/openssl.cnf for
instance).
A new line is also added to the 'haproxy -vv' output containing the same
information.
Complete qc_send function. After having processed each qcs emission, it
will now retry send on qcs where transfer can continue. This is useful
when qc_stream_desc buffer is full and there is still data present in
qcs buf.
To implement this, each eligible qcs is inserted in a new list
<qcc.send_retry_list>. This is done on send notification from the
transport layer through qcc_streams_sent_done(). Retry emission until
send_retry_list is empty or the transport layer cannot proceed more
data.
Several send operations are now called on two different places. Thus a
new _qc_send_qcs() function is defined to factorize the code.
This change should maximize the throughput during QUIC transfers.
MINOR: quic: limit total stream buffers per connection
MUX streams can now allocate multiple buffers for sending. quic-conn is
responsible to limit the total count of allowed allocated buffers. A
counter is stored in the new field <stream_buf_count>.
For the moment, the value is hardcoded to 30.
On stream buffer allocation failure, the qcc MUX is flagged with
QC_CF_CONN_FULL. The MUX is then woken up as soon as a buffer is freed,
most notably on ACK reception.
Acknowledge of STREAM has been complexified with the introduction of
stream multi buffers. Two functions are executing roughly the same set
of instructions in xprt_quic.c.
To simplify this, move the code complexity in a new function
qc_stream_desc_ack(). It will handle offset calculation, removal of
data, freeing oldest buffer and freeing stream instance if required.
The qc_stream_desc API is cleaner as qc_stream_desc_free_buf() ambiguous
function has been removed.
Complete the qc_stream_desc type to support multiple buffers on
emission. The main objective is to increase the transfer throughput.
The MUX is now able to transfer more data without having to wait ACKs.
To implement this feature, a new type qc_stream_buf is declared. it
encapsulates a buffer with a list element. New functions are defined to
retrieve the current buffer, release it or allocate a new one. Each
buffer is kept in the qc_stream_desc list until all of its data is
acknowledged.
On the MUX side, a qcs uses the current stream buffer to transfer data.
Once the buffer is full, it is released and a new one will be allocated
on a future qc_send() invocation.
Add a new member <qc> in qc_stream_desc structure. This change is
possible since previous patch which add quic-conn argument to
qc_stream_desc_new().
The purpose of this change is to simplify the future evolution of
qc-stream-desc API. This will avoid to repeat qc as argument in various
functions which already used a qc_stream_desc.
MINOR: quic-stream: use distinct tree nodes for quic stream and qcs
Simplify the model qcs/qc_stream_desc. Each types has now its own tree
node, stored respectively in qcc and quic-conn trees. It is still
necessary to mark the stream as detached by the MUX once all data is
transfered to the lower layer.
This might improve slightly the performance on ACK management as now
only the lookup in quic-conn is necessary. On the other hand, memory
size of qcs structure is increased.
REORG: quic: use a dedicated module for qc_stream_desc
Regroup all type definitions and functions related to qc_stream_desc in
the source file src/quic_stream.c.
qc_stream_desc complexity will be increased with the development of Tx
multi-buffers. Having a dedicated module is useful to mix it with
pure transport/quic-conn code.
MINOR: mux-quic: split xfer and STREAM frames build
Split qcs_push_frame() in two functions.
The first one is qcs_xfer_data(). Its purpose is to transfer data from
qcs.tx.buf to qc_stream_desc buffer. The second function is named
qcs_build_stream_frm(). It generates a STREAM frame using qc_stream_desc
buffer as payload.
The trace events previously associated with qcs_push_frame() has also
been split in two to reflect the new code structure.
The purpose of this refactoring is first to better reflect how sending
is implemented. It will also simplify the implementation of Tx
multi-buffer per streams.
The DH parameters used for OpenSSL versions 1.1.1 and earlier where
changed. For OpenSSL 1.0.2 and LibreSSL the newly introduced
ssl_get_dh_by_nid function is not used since we keep the original
parameters.
DHE ciphers do not present a security risk if the key is big enough but
they are slow and mostly obsoleted by ECDHE. This patch removes any
default DH parameters. This will effectively disable all DHE ciphers
unless a global ssl-dh-param-file is defined, or
tune.ssl.default-dh-param is set, or a frontend has DH parameters
included in its PEM certificate. In this latter case, only the frontends
that have DH parameters will have DHE ciphers enabled.
Adding explicitely a DHE ciphers in a "bind" line will not be enough to
actually enable DHE. We would still need to know which DH parameters to
use so one of the three conditions described above must be met.
MINOR: ssl: Use DH parameters defined in RFC7919 instead of hard coded ones
RFC7919 defined sets of DH parameters supposedly strong enough to be
used safely. We will then use them when we can instead of our hard coded
ones (namely the ffdhe2048 and ffdhe4096 named groups).
The ffdhe2048 and ffdhe4096 named groups were integrated in OpenSSL
starting with version 1.1.1. Instead of duplicating those parameters in
haproxy for older versions of OpenSSL, we will keep using our own
parameters when they are not provided by the SSL library.
We will also need to keep our 1024 bits DH parameters since they are
considered not safe enough to have a dedicated named group in RFC7919
but we must still keep it for retrocompatibility with old Java clients.
Released version 2.6-dev6 with the following main changes :
- CLEANUP: connection: reduce the with of the mux dump output
- CI: Update to actions/checkout@v3
- CI: Update to actions/cache@v3
- DOC: adjust QUIC instruction in INSTALL
- BUG/MINOR: stats: define the description' background color in dark color scheme
- BUILD: ssl: add USE_ENGINE and disable the openssl engine by default
- BUILD: makefile: pass USE_ENGINE to cflags
- BUILD: xprt-quic: replace ERR_func_error_string() with ERR_peek_error_func()
- DOC: install: document the fact that SSL engines are not enabled by default
- CI: github actions: disable -Wno-deprecated
- BUILD: makefile: silence unbearable OpenSSL deprecation warnings
- MINOR: sock: check configured limits at the sock layer, not the listener's
- MINOR: connection: add a new flag CO_FL_FDLESS on fd-less connections
- MINOR: connection: add conn_fd() to retrieve the FD only when it exists
- MINOR: stream: only dump connections' FDs when they are valid
- MINOR: connection: use conn_fd() when displaying connection errors
- MINOR: connection: skip FD-based syscalls for FD-less connections
- MEDIUM: connection: panic when calling FD-specific functions on FD-less conns
- MINOR: mux-quic: properly set the flags and name fields
- MINOR: connection: rearrange conn_get_src/dst to be a bit more extensible
- MINOR: protocol: add get_src() and get_dst() at the protocol level
- MINOR: quic-sock: provide a pair of get_src/get_dst functions
- MEDIUM: ssl: improve retrieval of ssl_sock_ctx and SSL detection
- MEDIUM: ssl: stop using conn->xprt_ctx to access the ssl_sock_ctx
- MEDIUM: xprt-quic: implement get_ssl_sock_ctx()
- MEDIUM: quic: move conn->qc into conn->handle
- BUILD: ssl: fix build warning with previous changes to ssl_sock_ctx
- BUILD: ssl: add an unchecked version of __conn_get_ssl_sock_ctx()
- MINOR: ssl: refine the error testing for fc_err and fc_err_str
- BUG/MINOR: sock: do not double-close the accepted socket on the error path
- CI: cirrus: switch to FreeBSD-13.0
- MINOR: log: add '~' to frontend when the transport layer provides SSL
- BUILD/DEBUG: lru: fix printf format in debug code
- BUILD: peers: adjust some printf format to silence cppcheck
- BUILD/DEBUG: hpack-tbl: fix format string in standalone debug code
- BUILD/DEBUG: hpack: use unsigned int in printf format in debug code
- BUILD: halog: fix some incorrect signs in printf formats for integers
- BUG/MINOR: h3: fix build with DEBUG_H3
- BUG/MINOR: mux-h2: do not send GOAWAY if SETTINGS were not sent
- BUG/MINOR: cache: do not display expired entries in "show cache"
- BUG/MINOR: mux-h1: Don't release unallocated CS on error path
- MINOR: applet: Make .init callback more generic
- MINOR: conn-stream: Add flags to set the type of the endpoint
- MEDIUM: applet: Set the appctx owner during allocation
- MAJOR: conn-stream: Invert conn-stream endpoint and its context
- REORG: Initialize the conn-stream by hand in cs_init()
- MEDIUM: conn-stream: Add an endpoint structure in the conn-stream
- MINOR: conn-stream: Move some CS flags to the endpoint
- MEDIUM: conn-stream: Be able to pass endpoint to create a conn-stream
- MEDIUM: conn-stream: Pre-allocate endpoint to create CS from muxes and applets
- REORG: applet: Uninline appctx_new function
- MAJOR: conn-stream: Share endpoint struct between the CS and the mux/applet
- MEDIUM: conn-stream: Move remaning flags from CS to endpoint
- MINOR: mux-pt: Rely on the endpoint instead of the conn-stream when possible
- MINOR: conn-stream: Add ISBACK conn-stream flag
- MINOR: conn-stream: Add header file with util functions related to conn-streams
- MEDIUM: tree-wide: Use CS util functions instead of SI ones
- MINOR: stream-int/txn: Move buffer for L7 retries in the HTTP transaction
- CLEANUP: http-ana: Remove http_alloc_txn() function
- MINOR: stream-int/stream: Move conn_retries counter in the stream
- MINOR: stream: Simplify retries counter calculation
- MEDIUM: stream-int/conn-stream: Move src/dst addresses in the conn-stream
- MINOR: stream-int/conn-stream: Move half-close timeout in the conn-stream
- MEDIUM: stream-int/stream: Use connect expiration instead of SI expiration
- MINOR: stream-int/conn-stream: Report error to the CS instead of the SI
- MEDIUM: conn-stream: Use endpoint error instead of conn-stream error
- MINOR: channel: Use conn-streams as channel producer and consumer
- MINOR: stream-int: Remove SI_FL_KILL_CON to rely on conn-stream endpoint only
- MINOR: mux-h2/mux-fcgi: Fully rely on CS_EP_KILL_CONN
- MINOR: stream-int: Remove SI_FL_NOLINGER/NOHALF to rely on CS flags instead
- MINOR: stream-int: Remove SI_FL_DONT_WAKE to rely on CS flags instead
- MINOR: stream-int: Remove SI_FL_INDEP_STR to rely on CS flags instead
- MINOR: stream-int: Remove SI_FL_SRC_ADDR to rely on stream flags instead
- CLEANUP: stream-int: Remove unused SI_FL_CLEAN_ABRT flag
- MINOR: stream: Only save previous connection state for the server side
- MEDIUM: stream-int: Move SI err_type in the stream
- MEDIUM: stream-int/conn-stream: Move stream-interface state in the conn-stream
- MINOR: stream-int/stream: Move si_retnclose() in the stream scope
- MINOR: stream-int/backend: Move si_connect() in the backend scope
- MINOR: stream-int/conn-stream: Move si_conn_ready() in the conn-stream scope
- MINOR: conn-stream/connection: Move SHR/SHW modes in the connection scope
- MEDIUM: conn-stream: Be prepared to fail to attach a cs to a mux
- MEDIUM: stream-int/conn-stream: Handle I/O subscriptions in the conn-stream
- MINOR: conn-stream: Rename CS functions dedicated to connections
- MINOR: stream-int/conn-stream: Move si_shut* and si_chk* in conn-stream scope
- MEDIUM: stream-int/conn-stream: Move si_ops in the conn-stream scope
- MINOR: applet: Use the CS to register and release applets instead of SI
- MINOR: connection: unconst mux's get_fist_cs() callback function
- MINOR: stream-int/connection: Move conn_si_send_proxy() in the connection scope
- REORG: stream-int: Export si_cs_recv(), si_cs_send() and si_cs_process()
- REORG: stream-int: Move si_is_conn_error() in the header file
- REORG: conn-stream: Move cs_shut* and cs_chk* in cs_utils
- REORG: conn-stream: Move cs_app_ops in conn_stream.c
- MINOR: stream-int-conn-stream: Move si_update_* in conn-stream scope
- MINOR: stream-int/stream: Move si_update_both in stream scope
- MEDIUM: conn-stream/applet: Add a data callback for applets
- MINOR: stream-int/conn-stream: Move stream_int_read0() in the conn-stream scope
- MINOR: stream-int/conn-stream: Move stream_int_notify() in the conn-stream scope
- MINOR: stream-int/conn-stream: Move si_cs_io_cb() in the conn-stream scope
- MINOR: stream-int/conn-stream: Move si_sync_recv/send() in conn-stream scope
- MINOR: conn-stream: Move si_conn_cb in the conn-stream scope
- MINOR: stream-int/conn-stream Move si_is_conn_error() in the conn-stream scope
- MINOR: stream-int/conn-stream: Move si_alloc_ibuf() in the conn-stream scope
- CLEANUP: stream-int: Remove unused SI functions
- MEDIUM: stream-int/conn-stream: Move blocking flags from SI to CS
- MEDIUM: stream-int/conn-stream: Move I/O functions to conn-stream
- REORG: stream-int/conn-stream: Move remaining functions to conn-stream
- MINOR: stream: Use conn-stream to report server error
- MINOR: http-ana: Use CS to perform L7 retries
- MEDIUM: stream: Don't use the stream-int anymore in process_stream()
- MINOR: conn-stream: Remove the stream-interface from the conn-stream
- DEV: flags: No longer dump SI flags
- CLEANUP: tree-wide: Remove any ref to stream-interfaces
- CLEANUP: conn-stream: Don't export internal functions
- DOC: conn-stream: Add comments on functions of the new CS api
- MEDIUM: check: Use a new conn-stream for each health-check run
- CLEANUP: muxes: Remove MX_FL_CLEAN_ABRT flag
- MINOR: conn-stream: Use a dedicated function to conditionally remove a CS
- CLEANUP: conn-stream: rename cs_register_applet() to cs_applet_create()
- MINOR: muxes: Improve show_fd callbacks to dump endpoint flags
- MINOR: mux-h1: Rely on the endpoint instead of the conn-stream when possible
- BUG/MINOR: quic: Avoid starting the mux if no ALPN sent by the client
- BUILD: debug: mark the __start_mem_stats/__stop_mem_stats symbols as weak
- BUILD: initcall: mark the __start_i_* symbols as weak, not global
- BUG/MINOR: mux-h2: do not use timeout http-keep-alive on backend side
- BUG/MINOR: mux-h2: use timeout http-request as a fallback for http-keep-alive
- MINOR: muxes: Don't expect to have a mux without connection in destroy callback
- MINOR: muxes: Don't handle proto upgrade for muxes not supporting it
- MINOR: muxes: Don't expect to call release function with no mux defined
- MINOR: conn-stream: Use unsafe functions to get conn/appctx in cs_detach_endp
- BUG/MEDIUM: mux-h1: Don't request more room on partial trailers
- BUILD: http-client: Avoid dead code when compiled without SSL support
- BUG/MINOR: mux-quic: prevent a crash in session_free on mux.destroy
- BUG/MINOR: quic-sock: do not double free session on conn init failure
- BUG/MINOR: quic: fix return value for error in start
- MINOR: quic: emit CONNECTION_CLOSE on app init error
- BUILD: sched: workaround crazy and dangerous warning in Clang 14
- BUILD: compiler: use a more portable set of asm(".weak") statements
- BUG/MEDIUM: stream: do not abort connection setup too early
- CLEANUP: extcheck: do not needlessly preset the server's address/port
- MINOR: extcheck: fill in the server's UNIX socket address when known
- BUG/MEDIUM: connection: Don't crush context pointer location if it is a CS
- BUG/MEDIUM: quic: properly clean frames on stream free
- BUG/MEDIUM: fcgi-app: Use http_msg flags to know if C-L header can be added
- BUG/MEDIUM: compression: Don't forget to update htx_sl and http_msg flags
- MINOR: tcp_sample: clarifying samples support per os, for further expansion.
- MINOR: tcp_sample: extend support for get_tcp_info to macOs.
- SCRIPTS: announce-release: update the doc's URL
- DOC: lua: update a few doc URLs
- SCRIPTS: announce-release: add shortened links to pending issues
SCRIPTS: announce-release: add shortened links to pending issues
The list of URLs now also adds pending bugs, reviewed bugs, and code
reports. The redirect is performed on haproxy.org since github URLs
are far too large here.
The HAProxy doc was updated to point to docs.haproxy.org.
The HAProxy API doc was returning a 404, let's point to version 2.6.
This should be backported with 1.9dev modified to match the respective
versions.
David CARLIER [Mon, 11 Apr 2022 11:53:11 +0000 (12:53 +0100)]
MINOR: tcp_sample: extend support for get_tcp_info to macOs.
MacOS can feed fc_rtt, fc_rttvar, fc_sacked, fc_lost and fc_retrans
so let's expose them on this platform.
Note that at the tcp(7) level, the API is slightly different, as
struct tcp_info is called tcp_connection_info and TCP_INFO is
called TCP_CONNECTION_INFO, so for convenience these ones were
defined to point to their equivalent. However there is a small
difference now in that tcpi_rtt is called tcpi_rttcur on this
platform, which forces us to make a special case for it before
other platforms.
David CARLIER [Mon, 11 Apr 2022 11:41:24 +0000 (12:41 +0100)]
MINOR: tcp_sample: clarifying samples support per os, for further expansion.
While there is some overlap between what each OS provides in terms of
retrievable info, each set is not a real subset of another one and this
results in increasing complexity when trying to add support for new OSes.
Let's just condition each item to the OS that support it. It's not pretty
but at least it will avoid a real mess later.
Note that fc_rtt and fc_rttvar are supported on any OS that has TCP_INFO,
not just linux/freebsd/netbsd, so we continue to expose them unconditionally.
BUG/MEDIUM: compression: Don't forget to update htx_sl and http_msg flags
If the response is compressed, we must update the HTX start-line flags and
the HTTP message flags. It is especially important if there is another
filter enabled. Otherwise, there is no way to know the C-L header was
removed and T-E one was added. Except by looping on headers.
This patch is related to the issue #1660. It must backported as far as 2.0
(for HTX part only).
BUG/MEDIUM: fcgi-app: Use http_msg flags to know if C-L header can be added
Instead of relying on the HTX start-line flags, it is better to rely on
http_msg flags to know if a content-length header can be added or not. In
addition, if the header is added, HTTP_MSGF_CNT_LEN flag must be added.
Because of this bug, an invalid message can be emitted when the response is
compressed because it may contain C-L and a T-E headers.
This patch should fix the issue #1660. It must be backported as far as 2.2.
BUG/MEDIUM: quic: properly clean frames on stream free
A released qc_stream_desc is freed as soon as all its buffer content has
been acknowledged. However, it may still contains other frames waiting
for ACK pointing to deleted buffer content. This can happen on
retransmission.
When freeing a qc_stream_desc, free all its frames in acked_frms tree to
fix memory leak. This may also possibly fix a crash on retransmission.
Now, the frames are properly removed from a packet. This ensure we do
not retransmit a frame whose buffer is deallocated.
BUG/MEDIUM: connection: Don't crush context pointer location if it is a CS
The issue only concerns the backend connection. The conn-stream is now owned
by the stream and persists during all the stream life. Thus we must not
crush it when the backend connection is released.
MINOR: extcheck: fill in the server's UNIX socket address when known
While it's often a pain to try to figure a UNIX socket address, the
server ones are reliable and may be emitted in the check provided
they are retrieved in time. We cannot rely on addr_to_str() because
it only reports "unix" since it may be used to log client addresses
or listener addresses (which are renamed).
The address length was extended to 256 chars to deal with long paths
as previously it was limited to INET6_ADDRSTRLEN+1.
This addresses github issue #101. There's no point backporting this,
external checks are almost never used.
CLEANUP: extcheck: do not needlessly preset the server's address/port
During the config parsing we preset the server's address and port, but
that's pointless since it's replaced during each check in order to deal
with the possibility that the address was changed since.
BUG/MEDIUM: stream: do not abort connection setup too early
Github issue #472 reports a problem with short client connections making
stick-table entries disappear. The problem is in fact totally different
and stems at the connection establishment step.
What happens is that the stick-table there has a single entry. The
"stick-on" directive is forced to purge an existing entry before being
able to create a new one. The new entry will be committed during the
call to process_store_rules() on the response path.
But if the client sends the FIN immediately after the connection is set
up (e.g. using nc -z) then the SHUTR is received and will cancel the
connection setup just after it starts. This cancellation will induce a
call to cs_shutw() which will in turn leave the server-side state in
ST_DIS. This transition from ST_CON to ST_DIS doesn't belong to the
list of handled transition during the connection setup so it will be
handled right after on the regular path, causing the connection to be
closed. Because of this, we never pass through back_establish() and
the backend's analysers are never set on the response channel, which
is why process_store_rules() is not called and the stick-tables entry
never committed.
The comment above the code that causes this transition clearly says
that the function is to be used after the connection is established
with the server, but there's no such protection, and we always have
the AUTO_CLOSE flag there (but there's hardly any available condition
to eliminate it).
This patch adds a test for the connection not being in ST_CON or for
option abortonclose being set. It's sufficient to do the job and it
should not cause issues.
One concern was that the transition could happen during cs_recv()
after the connection switches from CON to RDY then the read0 would
be taken into account and would cause DIS to appear, which is not
handled either. But that cannot happen because cs_recv() doesn't do
anything until it's in ST_EST state, hence the read0() cannot be
called from CON/RDY. Thus the transition from CON to DIS is only
possible in back_handle_st_con() and back_handle_st_rdy() both of
which are called when dealing with the transition already, or when
abortonclose is set and the client aborts before connect() succeeds.
It's possible that some further improvements could be made to detect
this specific transition but it doesn't seem like anything would have
to be added.
This issue was first reported on 2.1. The abortonclose area is very
sensitive so it would be wise to backport slowly, and probably no
further than 2.4.
BUILD: compiler: use a more portable set of asm(".weak") statements
The two recent patches b12966af1 ("BUILD: debug: mark the
__start_mem_stats/__stop_mem_stats symbols as weak") and 2a06e248f
("BUILD: initcall: mark the __start_i_* symbols as weak, not global")
aimed at fixing a build warning and resulted in a build breakage on
MacOS which doesn't have a ".weak" asm statement.
We've already had MacOS-specific asm() statements for section names, so
this patch continues on this trend by moving HA_GLOBL() to compiler.h
and using ".globl" on MacOS since apparently nobody complains there.
It is debatable whether to expose this only when !USE_OBSOLETE_LINKER
or all the time, but since these are just macroes it's no big deal to
let them be available when needed and let the caller decide on the
build conditions.
If any of the patches above is backported, this one will need to as
well.
BUILD: sched: workaround crazy and dangerous warning in Clang 14
Ilya reported in issue #1638 that Clang 14 has invented a new warning
that encourages to modify the code in a way that is not always
equivalent, by turning "|" to "||" between some logical operators,
except that the first one guarantees that all members of the expression
will always be evaluated while the latter will stop at the first one
which is true!
This warning triggers in thread_has_tasks(), which is not sensitive to
such change of behavior but which is built this way because it results
in branchless code for something that most often evaluates to false for
all terms. As such it was out of question to turn this to less efficient
compare-and-jump that needlessly pollute the branch predictor, so the
workaround consists in casting each expression to (int). It was verified
that the code is the same.
Yet another example of how-to-introduce-bugs-by-fixing-valid-code
through warnings invented around a beer without thinking longer!
This may need to be backported to a few older branches in case this
compiler lands in recent distros or if gcc finds it wise to imitate it.
MINOR: quic: emit CONNECTION_CLOSE on app init error
Emit a CONNECTION_CLOSE if the app layer cannot be properly initialized
on qc_xprt_start. This force the quic-conn to enter the closing state
before being closed.
Without this, quic-conn normal operations continue, despite the
app-layer reported as not initialized. This behavior is undefined, in
particular when handling STREAM frames.
BUG/MINOR: quic: fix return value for error in start
Fix the return value used in quic-conn start callback for error. The
caller expects a negative value in this case.
Without this patch, the quic-conn and the connection stack are not
closed despite an initialization failure error, which is an undefined
behavior and may cause a crash in the end.
BUG/MINOR: quic-sock: do not double free session on conn init failure
In the quic_session_accept, connection is in charge to call the
quic-conn start callback. If this callback fails for whatever reason,
there is a crash because of an explicit session_free.
This happens because the connection is now the owner of the session due
to previous conn_complete_session call. It will automatically calls
session_free. Fix this by skipping the session_free explicit invocation
on error.
In practice, currently this has never happened as there is only limited
cases of failures for conn_xprt_start for QUIC.
BUG/MINOR: mux-quic: prevent a crash in session_free on mux.destroy
Implement qc_destroy. This callback is used to quickly release all MUX
resources.
session_free uses this callback. Currently, it can only be called if
there was an error during connection initialization. If not defined, the
process crashes.
BUILD: http-client: Avoid dead code when compiled without SSL support
When an HTTP client is started on an HAProxy compiled without the SSL
support, an error is triggered when HTTPS is used. In this case, the freshly
created conn-stream is released. But this code is specific to the non-SSL
part. Thus it is moved the in right #if/#else section.
BUG/MEDIUM: mux-h1: Don't request more room on partial trailers
The commit 744451c7c ("BUG/MEDIUM: mux-h1: Properly detect full buffer cases
during message parsing") introduced a regression if trailers are not
received in one time. Indeed, in this case, nothing is appended in the
channel buffer, while there are some data in the input buffer. In this case,
we must not request more room to the upper layer, especially because the
channel buffer can be empty.
To fix the issue, on trailers parsing, we consider the H1 stream as
congested when the max size allowed is reached. Of course, the H1 stream is
also considered as congested if the trailers are too big and the channel
buffer is not empty.
This patch should fix the issue #1657. It must be backported as far as 2.0.
MINOR: muxes: Don't handle proto upgrade for muxes not supporting it
Several muxes (h2, fcgi, quic) don't support the protocol upgrade. For these
muxes, there is no reason to have code to support it. Thus in the destroy
callback, there is now a BUG_ON() and the release function is simplified
because the connection is always owned by the mux..
MINOR: muxes: Don't expect to have a mux without connection in destroy callback
Once a mux initialized, the underlying connection alwaus exists from its
point of view and it is never removed until the mux is released. It may be
owned by another mux during an upgrade. But the pointer remains set. Thus
there is no reason to test it in the destroy callback function.
BUG/MINOR: mux-h2: use timeout http-request as a fallback for http-keep-alive
The doc states that timeout http-keep-alive is not set, timeout http-request
is used instead. As implemented in commit 15a4733d5 ("BUG/MEDIUM: mux-h2:
make use of http-request and keep-alive timeouts"), we use http-keep-alive
unconditionally between requests, with a fallback on client/server. Let's
make sure http-request is always used as a fallback for http-keep-alive
first.
This needs to be backported wherever the commit above is backported.
BUG/MINOR: mux-h2: do not use timeout http-keep-alive on backend side
Commit 15a4733d5 ("BUG/MEDIUM: mux-h2: make use of http-request and
keep-alive timeouts") omitted to check the side of the connection, and
as a side effect, automatically enabled timeouts on idle backend
connections, which is totally contrary to the principle that they
must be autonomous.
This needs to be backported wherever the patch above is backported.
BUILD: initcall: mark the __start_i_* symbols as weak, not global
Just like for previous fix, these symbols are marked ".globl" during
their declaration, but their later mention uses __attribute__((weak)),
so it's better to only use ".weak" during the declaration so that the
symbol's class does not change.
No need to backport this unless someone reports build issues.
BUILD: debug: mark the __start_mem_stats/__stop_mem_stats symbols as weak
Building with clang and DEBUG_MEM_STATS shows the following warnings:
warning: __start_mem_stats changed binding to STB_WEAK [-Wsource-mgr]
warning: __stop_mem_stats changed binding to STB_WEAK [-Wsource-mgr]
The reason is that the symbols are declared using ".globl" while they
are also referenced as __attribute__((weak)) elsewhere. It turns out
that a weak symbol is implicitly a global one and that the two classes
are exclusive, thus it may confuse the linker. Better fix this.
BUG/MINOR: quic: Avoid starting the mux if no ALPN sent by the client
If the client does not sent an ALPN, the SSL ALPN negotiation callback
is not called. However, the handshake is reported as successful. Check
just after SSL_do_handshake if an ALPN was negotiated. If not, emit a
CONNECTION_CLOSE with a TLS alert to close the connection.
This prevent a crash in qcc_install_app_ops() called with null as second
parameter value.
MINOR: mux-h1: Rely on the endpoint instead of the conn-stream when possible
Instead of testing if a conn-stream exists or not, we rely on CS_EP_ORPHAN
endpoint flag. In addition, if possible, we access the endpoint from the
h1s. Finally, the endpoint flags are now reported in trace messages.
MINOR: conn-stream: Use a dedicated function to conditionally remove a CS
cs_free_cond() must now be used to remove a CS. cs_free() may be used on
error path to release a freshly allocated but unused CS. But in all other
cases cs_free_cond() must be used. This function takes care to release the
CS if it is possible (no app and detached from any endpoint).
In fact, this function is only used internally. From the outside,
cs_detach_* functions are used.
MEDIUM: check: Use a new conn-stream for each health-check run
It is a partial revert of 54e85cbfc ("MAJOR: check: Use a persistent
conn-stream for health-checks"). But with the CS refactoring, the result is
cleaner now. A CS is allocated when a new health-check run is started. The
same CS is then used throughout the run. If there are several connections,
the endpoint is just reset. At the end of the run, the CS is released. It
means, in the tcp-check part, the CS is always defined.
MEDIUM: stream: Don't use the stream-int anymore in process_stream()
process_stream() and all associated functions now manipulate conn-streams.
stream-interfaces are no longer used. In addition, function to dump info
about a stream no longer print info about stream-interfaces.
MEDIUM: stream-int/conn-stream: Move I/O functions to conn-stream
cs_conn_io_cb(), cs_conn_sync_recv() and cs_conn_sync_send() are moved in
conn_stream.c. Associated functions are moved too (cs_notify, cs_conn_read0,
cs_conn_recv, cs_conn_send and cs_conn_process).
MEDIUM: stream-int/conn-stream: Move blocking flags from SI to CS
Remaining flags and associated functions are move in the conn-stream
scope. These flags are added on the endpoint and not the conn-stream
itself. This way it will be possible to get them from the mux or the
applet. The functions to get or set these flags are renamed accordingly with
the "cs_" prefix and updated to manipualte a conn-stream instead of a
stream-interface.
MINOR: conn-stream: Move si_conn_cb in the conn-stream scope
si_conn_cb variable is renamed cs_data_conn_cb. In addtion, its associated
functions are also renamed. si_cs_recv(), si_cs_send() and si_cs_process() are
renamed cs_conn_recv(), cs_conn_send and cs_conn_process(). These functions are
updated to manipulate conn-streams instead of stream-interfaces.
MEDIUM: conn-stream/applet: Add a data callback for applets
data callbacks were only used for streams attached to a connection and
for health-checks. However there is a callback used by task_run_applet. So,
si_applet_wake_cb() is first renamed to cs_applet_process() and it is
defined as the data callback for streams attached to an applet. This way,
this part now manipulates a conn-stream instead of a stream-interface. In
addition, applets are no longer handled as an exception for this part.
MINOR: stream-int/stream: Move si_update_both in stream scope
si_update_both() is renamed stream_update_both_cs() and moved in stream.c.
The function is slightly changed to manipulate the stream instead the front
and back conn-streams.
MINOR: stream-int-conn-stream: Move si_update_* in conn-stream scope
si_update_rx(), si_update_tx() and si_update() are renamed cs_update_rx(),
cs_upate_tx() and cs_update() and updated to manipulate a conn-stream
instead of a stream-interface.