Willy Tarreau [Mon, 28 Feb 2022 15:11:33 +0000 (16:11 +0100)]
DEBUG: buf: add BUG_ON_HOT() to most buffer management functions
A number of tests are now performed in low-level buffer management
functions to verify that we're not appending data to a full buffer
for example, or that the buffer passed in argument is consistent in
that its data don't outweigh its size. The few functions that already
involve memcpy() or memmove() instead got a BUG_ON() that will always
be enabled, since the overhead remains minimalist.
Willy Tarreau [Mon, 28 Feb 2022 15:10:00 +0000 (16:10 +0100)]
DEBUG: buf: replace some sensitive BUG_ON() with BUG_ON_HOT()
The buffer ring management functions br_* were all stuffed with BUG_ON()
statements that never triggered and that are on some fast paths (e.g. in
mux_h2). Let's turn them to BUG_ON_HOT() instead.
Willy Tarreau [Mon, 28 Feb 2022 14:25:58 +0000 (15:25 +0100)]
DEBUG: add two new macros to enable debugging in hot paths
Two new BUG_ON variants, BUG_ON_HOT() and CHECK_IF_HOT() are introduced
to debug hot paths (such as low-level API functions). These ones must
not be enabled by default as they would significantly affect performance
but they may be enabled by setting DEBUG_STRICT to a value above 1. In
this case, DEBUG_STRICT_ACTION is mostly respected with a small change,
which is that the no_crash variant of BUG_ON() isn't turned to a regular
warning but to a one-time warning so as not to spam with warnings in a
hot path. It is for this reason that there is no WARN_ON_HOT().
Willy Tarreau [Mon, 28 Feb 2022 13:59:25 +0000 (14:59 +0100)]
DEBUG: implement 4 levels of choices between warn and crash.
We used to have DEBUG_STRICT_NOCRASH to disable crashes on BUG_ON().
Now we have other levels (WARN_ON(), CHECK_IF()) so we need something
finer-grained.
This patch introduces DEBUG_STRICT_ACTION which takes an integer value.
0 disables crashes and is the equivalent of DEBUG_STRICT_NOCRASH. 1 is
the default and only enables crashes on BUG_ON(). 2 also enables crashes
on WARN_ON(), and 3 also enables warnings on CHECK_IF(), and is suited
to developers and CI.
Willy Tarreau [Mon, 28 Feb 2022 13:15:41 +0000 (14:15 +0100)]
DEBUG: improve BUG_ON output message accuracy
Now we'll explicitly mention if the test was a bug/warn/check, and
"FATAL" is only displayed when the process crashes. The non-crashing
BUG_ON() also suggests to report to developers.
Willy Tarreau [Mon, 28 Feb 2022 10:51:23 +0000 (11:51 +0100)]
DEBUG: rename WARN_ON_ONCE() to CHECK_IF()
The only reason for warning once is to check if a condition really
happens. Let's use a term that better translates the intent, that's
important when reading the code.
Willy Tarreau [Fri, 25 Feb 2022 16:12:11 +0000 (17:12 +0100)]
[RELEASE] Released version 2.6-dev2
Released version 2.6-dev2 with the following main changes :
- DOC: management: rework the Master CLI section
- DOC: management: add expert and experimental mode in 9.4.1
- CLEANUP: cleanup a commentary in pcli_parse_request()
- BUG/MINOR: mworker/cli: don't display help on master applet
- MINOR: mworker/cli: mcli-debug-mode enables every command
- MINOR: mworker/cli: add flags in the prompt
- BUG/MINOR: httpclient: Revisit HC request and response buffers allocation
- BUG/MEDIUM: httpclient: Xfer the request when the stream is created
- MINOR: httpclient: Don't limit data transfer to 1024 bytes
- BUILD: ssl: adjust guard for X509_get_X509_PUBKEY(x)
- REGTESTS: ssl: skip show_ssl_ocspresponse.vtc when BoringSSL is used
- MINOR: quic: Do not modify a marked as consumed datagram
- MINOR: quic: Wrong datagram buffer passed to quic_lstnr_dgram_dispatch()
- MINOR: quic: Remove a useless test in quic_get_dgram_dcid()
- BUG/MINOR: ssl: Remove empty lines from "show ssl ocsp-response <id>" output
- CLEANUP: ssl: Remove unused ssl_sock_create_cert function
- MINOR: ssl: Use high level OpenSSL APIs in sha2 converter
- MINOR: ssl: Remove EC_KEY related calls when preparing SSL context
- REGTESTS: ssl: Add test for "curves" and "ecdhe" SSL options
- MINOR: ssl: Remove EC_KEY related calls when creating a certificate
- REGTESTS: ssl: Add test for "generate-certificates" SSL option
- MINOR: ssl: Remove call to SSL_CTX_set_tlsext_ticket_key_cb with OpenSSLv3
- MINOR: ssl: Remove call to HMAC_Init_ex with OpenSSLv3
- MINOR: h3: hardcode the stream id of control stream
- MINOR: mux-quic: remove quic_transport_params_update
- MINOR: quic: rename local tid variable
- MINOR: quic: remove unused xprt rcv_buf operation
- MINOR: quic: take out xprt snd_buf operation
- CI: enable QUIC for Coverity scan
- BUG/MINOR: mworker: does not erase the pidfile upon reload
- MINOR: ssl: Remove call to ERR_func_error_string with OpenSSLv3
- MINOR: ssl: Remove call to ERR_load_SSL_strings with OpenSSLv3
- REGTESTS: ssl: Add tests for DH related options
- MINOR: ssl: Create HASSL_DH wrapper structure
- MINOR: ssl: Add ssl_sock_get_dh_from_bio helper function
- MINOR: ssl: Factorize ssl_get_tmp_dh and append a cbk to its name
- MINOR: ssl: Add ssl_sock_set_tmp_dh helper function
- MINOR: ssl: Add ssl_sock_set_tmp_dh_from_pkey helper function
- MINOR: ssl: Add ssl_new_dh_fromdata helper function
- MINOR: ssl: Build local DH of right size when needed
- MINOR: ssl: Set default dh size to 2048
- MEDIUM: ssl: Replace all DH objects by EVP_PKEY on OpenSSLv3 (via HASSL_DH type)
- MINOR: ssl: Remove calls to SSL_CTX_set_tmp_dh_callback on OpenSSLv3
- MINOR: quic: Remove an RX buffer useless lock
- MINOR: quic: Variable used before being checked in ha_quic_add_handshake_data()
- MINOR: quic: EINTR error ignored
- MINOR: quic: Potential overflow expression in qc_parse_frm()
- MINOR: quic: Possible overflow in qpack_get_varint()
- CLEANUP: h3: Unreachable target in h3_uqs_init()
- MINOR: quic: Possible memleak in qc_new_conn()
- MINOR: quic: Useless statement in quic_crypto_data_cpy()
- BUG/MEDIUM: pools: ensure items are always large enough for the pool_cache_item
- BUG/MINOR: pools: always flush pools about to be destroyed
- CLEANUP: pools: don't needlessly set a call mark during refilling of caches
- DEBUG: pools: add extra sanity checks when picking objects from a local cache
- DEBUG: pools: let's add reverse mapping from cache heads to thread and pool
- DEBUG: pools: replace the link pointer with the caller's address on pool_free()
- BUG/MAJOR: sched: prevent rare concurrent wakeup of multi-threaded tasks
- MINOR: quic: use a global dghlrs for each thread
- BUG/MEDIUM: quic: fix crash on CC if mux not present
- MINOR: qpack: fix typo in trace
- BUG/MINOR: quic: fix FIN stream signaling
- BUG/MINOR: h3: fix the header length for QPACK decoding
- MINOR: h3: remove transfer-encoding header
- MINOR: h3: add documentation on h3_decode_qcs
- MINOR: h3: set properly HTX EOM/BODYLESS on HEADERS parsing
- MINOR: mux-quic: implement rcv_buf
- MINOR: mux-quic: set EOS on rcv_buf
- MINOR: h3: set CS_FL_NOT_FIRST
- MINOR: h3: report frames bigger than rx buffer
- MINOR: h3: extract HEADERS parsing in a dedicated function
- MINOR: h3: implement DATA parsing
- MINOR: quic: Wrong smoothed rtt initialization
- MINOR: quic: Wrong loss delay computation
- MINOR: quic: Code never reached in qc_ssl_sess_init()
- MINOR: quic: ha_quic_set_encryption_secrets without server specific code
- MINOR: quic: Avoid warning about NULL pointer dereferences
- MINOR: quic: Useless test in quic_lstnr_dghdlr()
- MINOR: quic: Non checked returned value for cs_new() in hq_interop_decode_qcs()
- MINOR: h3: Dead code in h3_uqs_init()
- MINOR: quic: Non checked returned value for cs_new() in h3_decode_qcs()
- MINOR: quic: Possible frame parsers array overrun
- MINOR: quic: Do not retransmit too much packets.
- MINOR: quic: Move quic_rxbuf_pool pool out of xprt part
- MINOR: h3: report error on HEADERS/DATA parsing
- BUG/MINOR: jwt: Double free in deinit function
- BUG/MINOR: jwt: Missing pkey free during cleanup
- BUG/MINOR: jwt: Memory leak if same key is used in multiple jwt_verify calls
- BUG/MINOR: httpclient/cli: display junk characters in vsn
- MINOR: h3: remove unused return value on decode_qcs
- BUG/MAJOR: http/htx: prevent unbounded loop in http_manage_server_side_cookies
- BUG/MAJOR: spoe: properly detach all agents when releasing the applet
- REGTESTS: server: close an occasional race on dynamic_server_ssl.vtc
- REGTESTS: peers: leave a bit more time to peers to synchronize
- BUG/MEDIUM: h2/hpack: fix emission of HPACK DTSU after settings change
- BUG/MINOR: mux-h2: update the session's idle delay before creating the stream
- BUG/MINOR: httpclient: reinit flags in httpclient_start()
- BUG/MINOR: mailers: negotiate SMTP, not ESMTP
- MINOR: httpclient: sets an alternative destination
- MINOR: httpclient/lua: add 'dst' optionnal field
- BUG/MINOR: ssl: Add missing return value check in ssl_ocsp_response_print
- BUG/MINOR: ssl: Fix leak in "show ssl ocsp-response" CLI command
- BUG/MINOR: ssl: Missing return value check in ssl_ocsp_response_print
- CLEANUP: httpclient/cli: fix indentation alignment of the help message
- BUG/MINOR: tools: url2sa reads ipv4 too far
- BUG/MEDIUM: httpclient: limit transfers to the maximum available room
- DEBUG: buffer: check in __b_put_blk() whether the buffer room is respected
- MINOR: mux-quic: fix a possible null dereference in qc_timeout_task
- BUG/MEDIUM: htx: Be sure to have a buffer to perform a raw copy of a message
- BUG/MEDIUM: mux-h1: Don't wake h1s if mux is blocked on lack of output buffer
- BUG/MAJOR: mux-h2: Be sure to always report HTX parsing error to the app layer
- DEBUG: stream-int: Check CS_FL_WANT_ROOM is not set with an empty input buffer
- MINOR: quic: do not modify offset node if quic_rx_strm_frm in tree
- MINOR: h3: fix compiler warning variable set but not used
- MINOR: mux-quic: fix uninitialized return on qc_send
- MINOR: quic: fix handling of out-of-order received STREAM frames
- MINOR: pools: mark most static pool configuration variables as read-mostly
- CLEANUP: pools: remove the now unused pool_is_crowded()
- REGTESTS: fix the race conditions in 40be_2srv_odd_health_checks
- BUG/MEDIUM: stream: Abort processing if response buffer allocation fails
- MINOR: httpclient/lua: ability to set a server timeout
- BUG/MINOR: httpclient/lua: missing pop for new timeout parameter
- DOC: httpclient/lua: fix the type of the dst parameter
- CLEANUP: httpclient: initialize the client in stage INIT not REGISTER
- CLEANUP: muxes: do not use a dynamic trash in list_mux_protos()
- CLEANUP: vars: move the per-process variables initialization to vars.c
- CLEANUP: init: remove the ifdef on HAPROXY_MEMMAX
- MINOR: pools: disable redundant poisonning on pool_free()
- MINOR: pools: introduce a new pool_debugging global variable
- MINOR: pools: switch the fail-alloc test to runtime only
- MINOR: pools: switch DEBUG_DONT_SHARE_POOLS to runtime
- MINOR: pools: add a new debugging flag POOL_DBG_COLD_FIRST
- MINOR: pools: add a new debugging flag POOL_DBG_INTEGRITY
- MINOR: pools: make the global pools a runtime option.
- MEDIUM: pools: replace CONFIG_HAP_POOLS with a runtime "NO_CACHE" flag.
- MINOR: pools: store the allocated size for each pool
- MINOR: pools: get rid of POOL_EXTRA
- MINOR: pools: replace DEBUG_POOL_TRACING with runtime POOL_DBG_CALLER
- MINOR: pools: replace DEBUG_MEMORY_POOLS with runtime POOL_DBG_TAG
- MINOR: pools: add a debugging flag for memory poisonning option
- MEDIUM: initcall: move STG_REGISTER earlier
- MEDIUM: init: split the early initialization in its own function
- MINOR: init: extract args parsing to their own function
- MEDIUM: init: handle arguments earlier
- MINOR: pools: delegate parsing of command line option -dM to a new function
- MINOR: pools: support setting debugging options using -dM
- BUILD: makefile: enable both DEBUG_STRICT and DEBUG_MEMORY_POOLS by default
- CI: github: enable pool debugging by default
- DOC: Fix usage/examples of deprecated ACLs
- DOC: internal: update the pools API to mention boot-time settings
- DOC: design: add design thoughts for later simplification of the pools
- DOC: design: commit the temporary design notes on thread groups
- MINOR: stream-int: Handle appctx case first when releasing the endpoint
- MINOR: connection: Be prepared to handle conn-stream with no connection
- MINOR: stream: Handle appctx case first when creating a new stream
- MINOR: connection: Add a function to detach a conn-stream from the connection
- MINOR: stream-int: Add function to reset a SI endpoint
- MINOR: stream-int: Add function to attach a connection to a SI
- MINOR: stream-int: Be able to allocate a CS without connection
- MEDIUM: stream: No longer release backend conn-stream on connection retry
- MEDIUM: stream: Allocate backend CS when the stream is created
- REORG: conn_stream: move conn-stream stuff in dedicated files
- MEDIUM: conn-stream: No longer access connection field directly
- MEDIUM: conn-stream: Be prepared to use an appctx as conn-stream endpoint
- MAJOR: conn_stream/stream-int: move the appctx to the conn-stream
- MEDIUM: applet: Set the conn-stream as appctx owner instead of the stream-int
- MEDIUM: conn_stream: Add a pointer to the app object into the conn-stream
- MINOR: stream: Add pointer to front/back conn-streams into stream struct
- MINOR: stream: Slightly rework stream_new to separate CS/SI initialization
- MINOR: stream-int: Always access the stream-int via the conn-stream
- MINOR: backend: Always access the stream-int via the conn-stream
- MINOR: stream: Always access the stream-int via the conn-stream
- MINOR: http-ana: Always access the stream-int via the conn-stream
- MINOR: cli: Always access the stream-int via the conn-stream
- MINOR: log: Always access the stream-int via the conn-stream
- MINOR: frontend: Always access the stream-int via the conn-stream
- MINOR: proxy: Always access the stream-int via the conn-stream
- MINOR: peers: Always access the stream-int via the conn-stream
- MINOR: debug: Always access the stream-int via the conn-stream
- MINOR: hlua: Always access the stream-int via the conn-stream
- MINOR: cache: Always access the stream-int via the conn-stream
- MINOR: dns: Always access the stream-int via the conn-stream
- MINOR: http-act: Always access the stream-int via the conn-stream
- MINOR: httpclient: Always access the stream-int via the conn-stream
- MINOR: tcp-act: Always access the stream-int via the conn-stream
- MINOR: sink: Always access the stream-int via the conn-stream
- MINOR: conn-stream: Rename cs_detach() to cs_detach_endp()
- CLEANUP: conn-stream: Don't export conn-stream pool
- MAJOR: stream/conn_stream: Move the stream-interface into the conn-stream
- CLEANUP: stream-int: rename si_reset() to si_init()
- MINOR: conn-stream: Release a CS when both app and endp are detached
- MINOR: stream: Don't destroy conn-streams but detach app and endp
- MAJOR: check: Use a persistent conn-stream for health-checks
- CLEANUP: conn-stream: Remove cs_destroy()
- CLEANUP: backend: Don't export connect_server anymore
- BUG/MINOR: h3/hq_interop: Fix CS and stream creation
- BUILD: tree-wide: Avoid warnings about undefined entities retrieved from a CS
- BUG/MINOR: proxy: preset the error message pointer to NULL in parse_new_proxy()
- BUG/MEDIUM: quic: fix received ACK stream calculation
- BUILD: stream: fix build warning with older compilers
- BUG/MINOR: debug: fix get_tainted() to properly read an atomic value
- DEBUG: move the tainted stuff to bug.h for easier inclusion
- DEBUG: cleanup back trace generation
- DEBUG: cleanup BUG_ON() configuration
- DEBUG: mark ABORT_NOW() as unreachable
- DBEUG: add a new WARN_ON() macro
- DEBUG: make the _BUG_ON() macro return the condition
- DEBUG: add a new WARN_ON_ONCE() macro
- DEBUG: report BUG_ON() and WARN_ON() in the tainted flags
- MINOR: quic: adjust buffer handling for STREAM transmission
- MINOR: quic: liberate the TX stream buffer after ACK processing
- MINOR: quic: add a TODO for a memleak frame on ACK consume
Amaury Denoyelle [Thu, 24 Feb 2022 09:50:58 +0000 (10:50 +0100)]
MINOR: quic: add a TODO for a memleak frame on ACK consume
The quic_frame instance containing the quic_stream must be freed when
the corresponding ACK has been received. However when implementing this
on qcs_try_to_consume, some data transfers are interrupted and cannot
complete (DC test from interop test suite).
Amaury Denoyelle [Thu, 24 Feb 2022 09:56:33 +0000 (10:56 +0100)]
MINOR: quic: liberate the TX stream buffer after ACK processing
The sending buffer of each stream is cleared when processing ACKs
corresponding to STREAM emitted frames. If the buffer is empty, free it
and offer it as with other dynamic buffers usage.
This should reduce memory consumption as before an opened stream
confiscate a buffer during its whole lifetime even if there is no more
data to transmit.
Amaury Denoyelle [Wed, 23 Feb 2022 09:54:42 +0000 (10:54 +0100)]
MINOR: quic: adjust buffer handling for STREAM transmission
Simplify the data manipulation of STREAM frames on TX. Only stream data
and len field are used to generate a valid STREAM frames from the
buffer. Do not use the offset field, which required that a single buffer
instance should be shared for every frames on a single stream.
Willy Tarreau [Fri, 25 Feb 2022 07:55:11 +0000 (08:55 +0100)]
DEBUG: add a new WARN_ON_ONCE() macro
This one will maintain a static counter per call place and will only
emit the warning on the first call. It may be used to invite users to
report an unexpected event without spamming them with messages.
Willy Tarreau [Fri, 25 Feb 2022 07:52:39 +0000 (08:52 +0100)]
DBEUG: add a new WARN_ON() macro
This is the same as BUG_ON() except that it never crashes and only emits
a warning and a backtrace, inviting users to report the problem. This will
be usable for non-fatal issues that should not happen and need to be fixed.
This way the BUG_ON() when using DEBUG_STRICT_NOCRASH is effectively an
equivalent of WARN_ON().
Willy Tarreau [Fri, 25 Feb 2022 08:01:36 +0000 (09:01 +0100)]
DEBUG: mark ABORT_NOW() as unreachable
The purpose is to make the program die at this point, so let's help the
compiler optimise the code (especially in sensitive areas) by telling it
that ABORT_NOW() does not return. This reduces the overall code size by
~0.5%.
Willy Tarreau [Fri, 25 Feb 2022 07:45:52 +0000 (08:45 +0100)]
DEBUG: cleanup BUG_ON() configuration
The BUG_ON() macro handling is complicated because it relies on a
conditional CRASH_NOW() macro whose definition depends on DEBUG_STRICT
and DEBUG_STRICT_NOCRASH. Let's rethink the whole thing differently,
and instead make the underlying _BUG_ON() macro take a crash argument
to decide whether to crash or not, as well as a prefix and a suffix for
the message, that will allow to distinguish between variants. Now the
suffix is set to a message explaining we don't crash when needed.
This also allows to get rid of the CRASH_NOW() macro and to define
much simpler new macros.
Willy Tarreau [Fri, 25 Feb 2022 09:10:00 +0000 (10:10 +0100)]
DEBUG: move the tainted stuff to bug.h for easier inclusion
The functions needed to manipulate the "tainted" flags were located in
too high a level to be callable from the lower code layers. Let's move
them to bug.h.
Willy Tarreau [Fri, 25 Feb 2022 08:56:29 +0000 (09:56 +0100)]
BUG/MINOR: debug: fix get_tainted() to properly read an atomic value
get_tainted() was using an atomic store from the atomic value to a
local one instead of using an atomic load. In practice it has no effect
given the relatively rare updates of this field and the fact that it's
read only when dumping "show info" output, but better fix it.
Willy Tarreau [Thu, 24 Feb 2022 18:35:05 +0000 (19:35 +0100)]
BUILD: stream: fix build warning with older compilers
GCC 6 was not very good at value propagation and is often mislead about
risks of null derefs. Since 2.6-dev commit 13a35e575 ("MAJOR: conn_stream/
stream-int: move the appctx to the conn-stream"), it sees a risk of null-
deref in stream_upgrade_from_cs() after checking cs_conn_mux(cs). Let's
disguise the result so that it doesn't complain anymore. The output code
is exactly the same. The same method could be used to shut warnings at
-O1 that affect the same compiler by the way.
Amaury Denoyelle [Thu, 24 Feb 2022 16:39:57 +0000 (17:39 +0100)]
BUG/MEDIUM: quic: fix received ACK stream calculation
Adjust the handling of ACK for STREAM frames. When receiving a ACK, the
corresponding frames from the acknowledged packet are retrieved. If a
frame is of type STREAM, we compare the frame STREAM offset with the
last offset known of the qcs instance.
The comparison was incomplete as it did not treat a acked offset smaller
than the known offset. Previously, the acked frame was incorrectly
buffered in the qcs.tx.acked_frms. On reception of future ACKs, when
trying to process the buffered acks via qcs_try_to_consume, the loop is
interrupted on the smallest offset different from the qcs known offset :
in this case it will be the previous smaller range. This is a real bug
as it prevents all buffered ACKs to be processed, eventually filling the
qcs sending buffer and cause the transfer to stall.
Fix this by properly properly handle smaller acked offset. First check
if the offset length is greater than the qcs offset and mark as
acknowledged the difference on the qcs. If not, the frame is not
buffered and simply ignored.
Willy Tarreau [Thu, 24 Feb 2022 15:37:19 +0000 (16:37 +0100)]
BUG/MINOR: proxy: preset the error message pointer to NULL in parse_new_proxy()
As reported by Coverity in issue #1568, a missing initialization of the
error message pointer in parse_new_proxy() may result in displaying garbage
or crashing in case of memory allocation error when trying to create a new
proxy on startup.
BUILD: tree-wide: Avoid warnings about undefined entities retrieved from a CS
Since recent changes related to the conn-stream/stream-interface
refactoring, GCC reports potential null pointer dereferences when we get the
appctx, the stream or the stream-interface from the conn-strem. Of course,
depending on the time, these entities may be null. But at many places, we
know they are defined and it is safe to get them without any check. Thus, we
use ALREADY_CHECKED() macro to silent these warnings.
Note that the refactoring is unfinished, so it is not a real issue for now.
MAJOR: check: Use a persistent conn-stream for health-checks
In the same way a stream has always valid conn-streams, when a health-checks
is created, a conn-stream is now created and the health-check is attached on
it, as an app. This simplify a bit the connect part when a health-check is
running.
MINOR: conn-stream: Release a CS when both app and endp are detached
cs_detach_app() function is added to detach an app from a conn-stream. And
now, both cs_detach_app() and cs_detach_endp() release the conn-stream when
both the app and the endpoint are detached.
MAJOR: stream/conn_stream: Move the stream-interface into the conn-stream
Thanks to all previous changes, it is now possible to move the
stream-interface into the conn-stream. To do so, some SI functions are
removed and their conn-stream counterparts are added. In addition, the
conn-stream is now responsible to create and release the
stream-interface. While the stream-interfaces were inlined in the stream
structure, there is now a pointer in the conn-stream. stream-interfaces are
now dynamically allocated. Thus a dedicated pool is added. It is a temporary
change because, at the end, the stream-interface structure will most
probably disappear.
MINOR: conn-stream: Rename cs_detach() to cs_detach_endp()
Because cs_detach() is releated to the endpoint only, the function is
renamed. The main purpose of this patch is to be able to add a function to
detach the conn-stream from the application.
MINOR: sink: Always access the stream-int via the conn-stream
To be able to move the stream-interface from the stream to the conn-stream,
all access to the SI is done via the conn-stream. This patch is limited to
the sink part.
MINOR: tcp-act: Always access the stream-int via the conn-stream
To be able to move the stream-interface from the stream to the conn-stream,
all access to the SI is done via the conn-stream. This patch is limited to
the tcp-act part.
MINOR: httpclient: Always access the stream-int via the conn-stream
To be able to move the stream-interface from the stream to the conn-stream,
all access to the SI is done via the conn-stream. This patch is limited to
the httpclient part.
MINOR: http-act: Always access the stream-int via the conn-stream
To be able to move the stream-interface from the stream to the conn-stream,
all access to the SI is done via the conn-stream. This patch is limited to
the http-act part.
MINOR: dns: Always access the stream-int via the conn-stream
To be able to move the stream-interface from the stream to the conn-stream,
all access to the SI is done via the conn-stream. This patch is limited to
the dns part.
MINOR: cache: Always access the stream-int via the conn-stream
To be able to move the stream-interface from the stream to the conn-stream,
all access to the SI is done via the conn-stream. This patch is limited to
the cache part.
MINOR: hlua: Always access the stream-int via the conn-stream
To be able to move the stream-interface from the stream to the conn-stream,
all access to the SI is done via the conn-stream. This patch is limited to
the hlua part.
MINOR: debug: Always access the stream-int via the conn-stream
To be able to move the stream-interface from the stream to the conn-stream,
all access to the SI is done via the conn-stream. This patch is limited to
the debug part.
MINOR: peers: Always access the stream-int via the conn-stream
To be able to move the stream-interface from the stream to the conn-stream,
all access to the SI is done via the conn-stream. This patch is limited to
the peers part.
MINOR: proxy: Always access the stream-int via the conn-stream
To be able to move the stream-interface from the stream to the conn-stream,
all access to the SI is done via the conn-stream. This patch is limited to
the proxy part.
MINOR: frontend: Always access the stream-int via the conn-stream
To be able to move the stream-interface from the stream to the conn-stream,
all access to the SI is done via the conn-stream. This patch is limited to
the frontend part.
MINOR: log: Always access the stream-int via the conn-stream
To be able to move the stream-interface from the stream to the conn-stream,
all access to the SI is done via the conn-stream. This patch is limited to
the log part.
MINOR: cli: Always access the stream-int via the conn-stream
To be able to move the stream-interface from the stream to the conn-stream,
all access to the SI is done via the conn-stream. This patch is limited to
the cli part.
MINOR: http-ana: Always access the stream-int via the conn-stream
To be able to move the stream-interface from the stream to the conn-stream, all
access to the SI is done via the conn-stream. This patch is limited to the
http-ana part.
MINOR: stream: Always access the stream-int via the conn-stream
To be able to move the stream-interface from the stream to the conn-stream,
all access to the SI is done via the conn-stream. This patch is limited to
the stream part.
MINOR: backend: Always access the stream-int via the conn-stream
To be able to move the stream-interface from the stream to the conn-stream,
all access to the SI is done via the conn-stream. This patch is limited to
the backend part.
MINOR: stream-int: Always access the stream-int via the conn-stream
To be able to move the stream-interface from the stream to the conn-stream,
all access to the SI is done via the conn-stream. This patch is limited to
the stream-interface part.
MINOR: stream: Add pointer to front/back conn-streams into stream struct
frontend and backend conn-streams are now directly accesible from the
stream. This way, and with some other changes, it will be possible to remove
the stream-interfaces from the stream structure.
MEDIUM: conn_stream: Add a pointer to the app object into the conn-stream
In the same way the conn-stream has a pointer to the stream endpoint , this
patch adds a pointer to the application entity in the conn-stream
structure. For now, it is a stream or a health-check. It is mandatory to
merge the stream-interface with the conn-stream.
MEDIUM: applet: Set the conn-stream as appctx owner instead of the stream-int
Because appctx is now an endpoint of the conn-stream, there is no reason to
still have the stream-interface as appctx owner. Thus, the conn-stream is
now the appctx owner.
MAJOR: conn_stream/stream-int: move the appctx to the conn-stream
Thanks to previous changes, it is now possible to set an appctx as endpoint
for a conn-stream. This means the appctx is no longer linked to the
stream-interface but to the conn-stream. Thus, a pointer to the conn-stream
is explicitly stored in the stream-interface. The endpoint (connection or
appctx) can be retrieved via the conn-stream.
MEDIUM: conn-stream: No longer access connection field directly
To be able to handle applets as a conn-stream endpoint, we must be prepared
to handle different types of endpoints. First of all, the conn-strream's
connection must no longer be used directly.
MEDIUM: stream: Allocate backend CS when the stream is created
Because the backend conn-stream is no longer released during connection
retry and because it is valid to have conn-stream with no connection, it is
possible to allocated it when the stream is created. This means, from now, a
stream has always valid frontend and backend conn-streams. It is the first
step to merge the SI and the CS.
MEDIUM: stream: No longer release backend conn-stream on connection retry
The backend conn-stream is no longer released on connection retry. This
means the conn-stream is detached from the underlying connection but not
released. Thus, during connection retries, the stream has always an
allocated conn-stream with no connection. All previous changes were made to
make this possible.
Note that .attach() mux callback function was changed to get the conn-stream
as argument. The muxes are no longer responsible to create the conn-stream
when a server connection is attached to a stream.
MINOR: stream-int: Add function to attach a connection to a SI
si_attach_conn() function should be used to attach a connection to a
stream-interface. It created a conn-stream if necessary. This function is
mandatory to be able to keep the backend conn-stream during connection
retries.
MINOR: stream-int: Add function to reset a SI endpoint
si_reset_endpoint() function may be used to reset the SI's endpoint without
releasing the conn-stream if the endpoint is a connection. If the endpoint
is an appctx, it is released. This change is mandatory to merge the SI and
the CS and keep the backend conn-stream attached to the stream during
connection retries.
MINOR: connection: Add a function to detach a conn-stream from the connection
cs_detach() function is added to detach a conn-stream from the underlying
connection. This part will evovle to handle applets too. Concretely,
cs_destroy() is split to detach the conn-stream from its endpoint, via
cs_detach(), and then, the conn-stream is released, via cs_free().
MINOR: stream: Handle appctx case first when creating a new stream
In the same way the previous commit, when a stream is created, the appctx
case is now handled before the conn-stream one. The purpose of this change
is to limit bugs during the SI/CS refactoring.
MINOR: connection: Be prepared to handle conn-stream with no connection
The conn-stream will progressively replace the stream-interface. Thus, a
stream will have to allocate the backend conn-stream during its
creation. This means it will be possible to have a conn-stream with no
connection. To prepare this change, we test the conn-stream's connection
when we retrieve it.
MINOR: stream-int: Handle appctx case first when releasing the endpoint
Stream-interfaces will be moved in the conn-stream and the appctx will be
moved at the same level than the muxes. Idea is to merge the
stream-interface and the conn-stream and have a better symmetry between the
muxes and the applets. To limit bugs during this refactoring, when the SI
endpoint is released, the appctx case is handled first.
Willy Tarreau [Thu, 24 Feb 2022 07:59:08 +0000 (08:59 +0100)]
DOC: design: add design thoughts for later simplification of the pools
The pools currently have plenty of options (and some usefull ones were
even lost with the modern design), but most of them could be categorized
along a few use cases, namely, performance, reliability, debuggability.
This document explores various ways to try to combine them and their
effect in a less complex way for the long term.
Willy Tarreau [Wed, 23 Feb 2022 16:58:46 +0000 (17:58 +0100)]
CI: github: enable pool debugging by default
This enables DEBUG_MEMORY_POOLS and DEBUG_POOL_INTEGRITY so that by
default the tests run under stricter checks, which are likely to
catch more bugs. Note that these ones are permanently used in prod
on haproxy.org.
Willy Tarreau [Wed, 23 Feb 2022 16:50:37 +0000 (17:50 +0100)]
BUILD: makefile: enable both DEBUG_STRICT and DEBUG_MEMORY_POOLS by default
The first one will enable all currently deployed BUG_ON() checks. These
ones are safe from a performance perspective and from a reliability
perspective. New ones may be added later with different categories
(hot path, detection of uncertain events, etc).
DEBUG_MEMORY_POOLS enables the "tag" pool debugging option by default,
so that pools may be better traced in dumps. This one alone results in
almost imperceptible performance difference, and 8 extra bytes per
allocated object.
Both options are safe for production use (they're among those enabled
all the time on haproxy.org) and allow to produce much more trustable
bug reports which should save a few round trips with the reporters.
Willy Tarreau [Wed, 23 Feb 2022 14:20:53 +0000 (15:20 +0100)]
MINOR: pools: support setting debugging options using -dM
The 9 currently available debugging options may now be checked, set, or
cleared using -dM. The directive now takes a comma-delimited list of
options after the optional poisonning byte. With "help", the list of
available options is displayed with a short help and their current
status.
Willy Tarreau [Fri, 18 Feb 2022 17:54:40 +0000 (18:54 +0100)]
MINOR: pools: delegate parsing of command line option -dM to a new function
New function pool_parse_debugging() is now dedicated to parsing options
of -dM. For now it only handles the optional memory poisonning byte, but
the function may already return an informative message to be printed for
help, a warning or an error. This way we'll reuse it for the settings
that will be needed for configurable debugging options.
Willy Tarreau [Wed, 23 Feb 2022 16:25:00 +0000 (17:25 +0100)]
MEDIUM: init: handle arguments earlier
The argument parser runs too late, we'll soon need it before creating
pools, hence just after init_early(). No visible change is expected but
this part is sensitive enough to be placed into its own commit for easier
bisection later if needed.
Willy Tarreau [Thu, 17 Feb 2022 17:10:36 +0000 (18:10 +0100)]
MINOR: init: extract args parsing to their own function
The cmdline argument parsing was performed quite late, which prevents
from retrieving elements that can be used to initialize the pools and
certain sensitive areas. The goal is to improve this by parsing command
line arguments right after the early init stage. This is possible
because the cmdline parser already does very little beyond retrieving
config elements that are used later.
Doing so requires to move the parser code to a separate function and
to externalize a few variables out of the function as they're used
later in the boot process, in the original function.
This patch creates init_args() but doesn't move it upfront yet, it's
still executed just before init(), which essentially corresponds to
what was done before (only the trash buffers, ACLs and Lua were
initialized earlier and are not needed for this).
The rest is not modified and as expected no change is observed.
Note that the diff doesn't to justice to the change as it makes it
look like the early init() code was moved to a new function after
the function was renamed, while in fact it's clearly the parser
itself which moved.
Willy Tarreau [Thu, 17 Feb 2022 16:45:58 +0000 (17:45 +0100)]
MEDIUM: init: split the early initialization in its own function
There are some delicate chicken-and-egg situations in the initialization
code, because the init() function currently does way too much (it goes
as far as parsing the config) and due to this it must be started very
late. But it's also in charge of initializing a number of variables that
are needed in early boot (e.g. hostname/pid for error reporting, or
entropy for random generators).
This patch carefully extracts all the early code that depends on
absolutely nothing, and places it immediately after the STG_LOCK init
stage. The only possible failures at this stage are only allocation
errors and they continue to provoke an immediate exit().
Some environment variables, hostname, date, pid etc are retrieved at
this stage. The program's arguments are also copied there since they're
needed to be kept intact for the master process.
Willy Tarreau [Fri, 18 Feb 2022 13:51:49 +0000 (14:51 +0100)]
MEDIUM: initcall: move STG_REGISTER earlier
The STG_REGISTER init level is used to register known keywords and
protocol stacks. It must be called earlier because some of the init
code already relies on it to be known. For example, "haproxy -vv"
for now is constrained to start very late only because of this.
This patch moves it between STG_LOCK and STG_ALLOC, which is fine as
it's used for static registration.
Willy Tarreau [Wed, 23 Feb 2022 13:15:18 +0000 (14:15 +0100)]
MINOR: pools: add a debugging flag for memory poisonning option
Now -dM will set POOL_DBG_POISON for consistency with the rest of the
pool debugging options. As such now we only check for the new flag,
which allows the default value to be preset.
Willy Tarreau [Wed, 23 Feb 2022 09:20:37 +0000 (10:20 +0100)]
MINOR: pools: replace DEBUG_MEMORY_POOLS with runtime POOL_DBG_TAG
This option used to allow to store a marker at the end of the area, which
was used as a canary and detection against wrong freeing while the object
is used, and as a pointer to the last pool_free() caller when back in cache.
Now that we can compute the offsets at runtime, let's check it at run time
and continue the code simplification.
Willy Tarreau [Wed, 23 Feb 2022 09:10:33 +0000 (10:10 +0100)]
MINOR: pools: replace DEBUG_POOL_TRACING with runtime POOL_DBG_CALLER
This option used to allow to store a pointer to the caller of the last
pool_alloc() or pool_free() at the end of the area. Now that we can
compute the offsets at runtime, let's check it at run time and continue
the code simplification. In __pool_alloc() we now always calculate the
return address (which is quite cheap), and the POOL_DEBUG_TRACE_CALLER()
calls are conditionned on the status of debugging option.
Willy Tarreau [Wed, 23 Feb 2022 09:03:11 +0000 (10:03 +0100)]
MINOR: pools: get rid of POOL_EXTRA
This macro is build-time dependent and is almost unused, yet where it
cannot easily be avoided. Now that we store the distinction between
pool->size and pool->alloc_sz, we don't need to maintain it and we
can instead compute it on the fly when creating a pool. This is what
this patch does. The variables are for now pretty static, but this is
sufficient to kill the macro and will allow to set them more dynamically.
Willy Tarreau [Wed, 23 Feb 2022 07:57:59 +0000 (08:57 +0100)]
MINOR: pools: store the allocated size for each pool
The allocated size is the visible size plus the extra storage. Since
for now we can store up to two extra elements (mark and tracer), it's
convenient because now we know that the mark is always stored at
->size, and the tracer is always before ->alloc_sz.
Willy Tarreau [Tue, 22 Feb 2022 15:23:09 +0000 (16:23 +0100)]
MEDIUM: pools: replace CONFIG_HAP_POOLS with a runtime "NO_CACHE" flag.
Like previous patches, this replaces the build-time code paths that were
conditionned by CONFIG_HAP_POOLS with runtime paths conditionned by
!POOL_DBG_NO_CACHE. One trivial test had to be added in the hot path in
__pool_alloc() to refrain from calling pool_get_from_cache(), and another
one in __pool_free() to avoid calling pool_put_to_cache().
All cache-specific functions were instrumented with a BUG_ON() to make
sure we never call them with cache disabled. Additionally the cache[]
array was not initialized (remains NULL) so that we can later drop it
if not needed. It's particularly huge and should be turned to dynamic
with a pointer to a per-thread area where all the objects are located.
This will solve the memory usage issue and will improve locality, or
even help better deal with NUMA machines once each thread uses its own
arena.
Willy Tarreau [Tue, 22 Feb 2022 08:21:13 +0000 (09:21 +0100)]
MINOR: pools: make the global pools a runtime option.
There were very few functions left that were specific to global pools,
and even the checks they used to participate to are not directly on the
most critical path so they can suffer an extra "if".
What's done now is that pool_releasable() always returns 0 when global
pools are disabled (like the one before) so that pool_evict_last_items()
never tries to place evicted objects there. As such there will never be
any object in the free list. However pool_refill_local_from_shared() is
bypassed when global pools are disabled so that we even avoid the atomic
loads from this function.
The default global setting is still adjusted based on the original
CONFIG_NO_GLOBAL_POOLS that is set depending on threads and the allocator.
The global executable only grew by 1.1kB by keeping this code enabled,
and the code is simplified and will later support runtime options.
Willy Tarreau [Mon, 21 Feb 2022 17:42:53 +0000 (18:42 +0100)]
MINOR: pools: add a new debugging flag POOL_DBG_INTEGRITY
The test to decide whether or not to enforce integrity checks on cached
objects is now enabled at runtime and conditionned by this new debugging
flag. While previously it was not a concern to inflate the code size by
keeping the two functions static, they were moved to pool.c to limit the
impact. In pool_get_from_cache(), the fast code path remains fast by
having both flags tested at once to open a slower branch when either
POOL_DBG_COLD_FIRST or POOL_DBG_INTEGRITY are set.
Willy Tarreau [Mon, 21 Feb 2022 17:30:25 +0000 (18:30 +0100)]
MINOR: pools: add a new debugging flag POOL_DBG_COLD_FIRST
When enabling pools integrity checks, we usually prefer to allocate cold
objects first in order to maximize the time the objects spend in the
cache. In order to make this configurable at runtime, let's introduce
a new debugging flag to control this allocation order. It is currently
preset by the DEBUG_POOL_INTEGRITY build-time setting.
Willy Tarreau [Mon, 21 Feb 2022 16:31:50 +0000 (17:31 +0100)]
MINOR: pools: switch DEBUG_DONT_SHARE_POOLS to runtime
This test used to appear at a single location in create_pool() to
enable a check on the pool name or unconditionally merge similarly
sized pools.
This patch introduces POOL_DBG_DONT_MERGE and conditions the test on
this new runtime flag, that is preset according to the aforementioned
debugging option.
Willy Tarreau [Mon, 21 Feb 2022 16:16:22 +0000 (17:16 +0100)]
MINOR: pools: switch the fail-alloc test to runtime only
The fail-alloc test used to be enabled/disabled at build time using
the DEBUG_FAIL_ALLOC macro, but it happens that the cost of the test
is quite cheap and that it can be enabled as one of the pool_debugging
options.
This patch thus introduces the first POOL_DBG_FAIL_ALLOC option, whose
default value depends on DEBUG_FAIL_ALLOC. The mem_should_fail() function
is now always built, but it was made static since it's never used outside.
Willy Tarreau [Fri, 18 Feb 2022 17:35:59 +0000 (18:35 +0100)]
MINOR: pools: introduce a new pool_debugging global variable
This read-mostly variable will be used at runtime to enable/disable
certain pool-debugging features and will be set by the command-line
parser. A future option -dP will take a number of debugging features
as arguments to configure this variable's contents.
Willy Tarreau [Wed, 23 Feb 2022 10:45:09 +0000 (11:45 +0100)]
MINOR: pools: disable redundant poisonning on pool_free()
The poisonning performed on pool_free() used to help a little bit with
use-after-free detection, but usually did more harm than good in that
it was never possible to perform post-mortem analysis on released
objects once poisonning was enabled on allocation. Now that there is
a dedicated DEBUG_POOL_INTEGRITY, let's get rid of this annoyance
which is not even documented in the management manual.
Willy Tarreau [Thu, 17 Feb 2022 15:47:03 +0000 (16:47 +0100)]
CLEANUP: vars: move the per-process variables initialization to vars.c
There's no point keeping the vars_init_head() call in init() when we
already have a vars_init() registered at the right time to do that,
and it complexifies the boot sequence, so let's move it there.
Willy Tarreau [Fri, 18 Feb 2022 10:07:40 +0000 (11:07 +0100)]
CLEANUP: muxes: do not use a dynamic trash in list_mux_protos()
Let's not use a trash there anymore. The function is called at very
early boot (for "haproxy -vv"), and the need for a trash prevents the
arguments from being parsed earlier. Moreover, the function only uses
a FILE* on output with fprintf(), so there's not even any benefit in
using chunk_printf() on an intermediary variable, emitting the output
directly is both clearer and safer.
Willy Tarreau [Fri, 18 Feb 2022 15:23:14 +0000 (16:23 +0100)]
CLEANUP: httpclient: initialize the client in stage INIT not REGISTER
REGISTER is meant to only assemble static lists, not to initialize
code that may depend on some elements possibly initialized at this
level. For example the init code currently looks up transport protocols
such as XPRT_RAW and XPRT_SSL which ought to be themselves registered
from at REGISTER stage, and which currently work only because they're
still registered directly from a constructor. INIT is perfectly suited
for this level.
BUG/MEDIUM: stream: Abort processing if response buffer allocation fails
In process_stream(), we force the response buffer allocation before any
processing to be able to return an error message. It is important because,
when an error is triggered, the stream is immediately closed. Thus we cannot
wait for the response buffer allocation.
When the allocation fails, the stream analysis is stopped and the expiration
date of the stream's task is updated before exiting process_stream(). But if
the stream was woken up because of a connection or an analysis timeout, the
expiration date remains blocked in the past. This means the stream is woken
up in loop as long as the response buffer is not properly allocated.
Alone, this behavior is already a bug. But because the mechanism to handle
buffer allocation failures is totally broken since a while, this bug becomes
more problematic. Because, most of time, the watchdog will kill HAProxy in
this case because it will detect a spinning loop.
To fix it, at least temporarily, an allocation failure at this stage is now
reported as an error and the processing is aborted. It's not satisfying but
it is better than nothing. If the buffers allocation mechanism is
refactored, this part will be reviewed.
This patch must be backported, probably as far as 2.0. It may be perceived
as a regression, but the actual behavior is probably even worse. And
because it was not reported, it is probably not a common situation.