git.ipfire.org Git - thirdparty/haproxy.git/log

]> git.ipfire.org Git - thirdparty/haproxy.git/log

projects / thirdparty / haproxy.git / log

Frédéric Lécaille [Mon, 24 Apr 2023 13:49:36 +0000 (15:49 +0200)]

CLEANUP: quic: Rename several <buf> variables into quic_sock.c

Rename some variables which are not struct buffer variables.

Should be backported to 2.7.

commit | commitdiff | tree

Frédéric Lécaille [Mon, 24 Apr 2023 13:44:18 +0000 (15:44 +0200)]

CLEANUP: quic: Rename <buf> variable into qc_parse_hd_form()

There is no struct buffer variable manipulated by this function.

Should be backported to 2.7.

commit | commitdiff | tree

Frédéric Lécaille [Mon, 24 Apr 2023 13:41:07 +0000 (15:41 +0200)]

CLEANUP: quic: Rename <buf> variable into quic_packet_read_long_header()

Make this function be more readable: there is no struct buffer variable passed
as parameter to this function.

Should be backported to 2.7.

commit | commitdiff | tree

Frédéric Lécaille [Mon, 24 Apr 2023 13:29:56 +0000 (15:29 +0200)]

CLEANUP: quic: Rename several <buf> variables at low level

Make quic_stateless_reset_token_cpy(), quic_derive_cid() and quic_get_cid_tid()
be more readable: there is no struct buffer variable manipulated by these
functions.

Should be backported to 2.7.

commit | commitdiff | tree

Frédéric Lécaille [Mon, 24 Apr 2023 13:24:58 +0000 (15:24 +0200)]

CLEANUP: quic: Rename quic_get_dgram_dcid() <buf> variable

quic_get_dgram_dcid() does not manipulate any struct buffer variable.

Should be backported to 2.7.

commit | commitdiff | tree

Frédéric Lécaille [Mon, 24 Apr 2023 13:02:34 +0000 (15:02 +0200)]

CLEANUP: quic: Make qc_build_pkt() be more readable

There is no <buf> variable passed to this function.
Also rename <buf_end> to <end> to mimic others functions.
Rename <beg> to <first_byte> and <end> to <last_byte>.

Should be backported to 2.7.

commit | commitdiff | tree

Frédéric Lécaille [Mon, 24 Apr 2023 12:54:48 +0000 (14:54 +0200)]

CLEANUP: quic: Rename <buf> variable for several low level functions

Make quic_build_packet_long_header(), quic_build_packet_short_header() and
quic_apply_header_protection() be more readable: there is no struct buffer
variables used by these functions.

Should be backported to 2.7.

commit | commitdiff | tree

Frédéric Lécaille [Mon, 24 Apr 2023 12:43:57 +0000 (14:43 +0200)]

CLEANUP: quic: Rename <buf> variable into quic_rx_pkt_parse()

Make this function be more readable: there is no struct buffer variable used
by this function.

Should be backported to 2.7.

commit | commitdiff | tree

Frédéric Lécaille [Mon, 24 Apr 2023 12:38:33 +0000 (14:38 +0200)]

CLEANUP: quic: Rename <buf> variable into quic_padding_check()

Make quic_padding_check() be more readable: there is not struct buffer variable
used by this function.

Should be backported to 2.7.

commit | commitdiff | tree

Frédéric Lécaille [Mon, 24 Apr 2023 12:35:18 +0000 (14:35 +0200)]

CLEANUP: quic: Rename <buf> variable to <token> in quic_generate_retry_token()

Make quic_generate_retry_token() be more readable: there is no struct buffer
variable used in this function.

Should be backported to 2.7.

commit | commitdiff | tree

Frédéric Lécaille [Mon, 24 Apr 2023 10:05:46 +0000 (12:05 +0200)]

CLEANUP: quic: Remove useless parameters passes to qc_purge_tx_buf()

Remove the pointer to the connection passed as parameters to qc_purge_tx_buf()
and other similar function which came with qc_purge_tx_buf() implementation.
They were there do track the connection during tests.

Must be backported to 2.7.

commit | commitdiff | tree

Amaury Denoyelle [Mon, 24 Apr 2023 13:32:23 +0000 (15:32 +0200)]

CLEANUP: quic: rename frame variables

Rename all frame variables with the suffix _frm. This helps to
differentiate frame instances from other internal objects.

This should be backported up to 2.7.

commit | commitdiff | tree

Amaury Denoyelle [Mon, 24 Apr 2023 12:26:30 +0000 (14:26 +0200)]

CLEANUP: quic: rename frame types with an explicit prefix

Each frame type used in quic_frame union has been renamed with the
following prefix "qf_". This helps to differentiate frame instances from
other internal objects.

This should be backported up to 2.7.

commit | commitdiff | tree

Frédéric Lécaille [Mon, 24 Apr 2023 09:32:22 +0000 (11:32 +0200)]

BUG/MINOR: quic: Useless I/O handler task wakeups (draining, killing state)

From the idle_timer_task(), the I/O handler must be woken up to send ack. But
there is no reason to do that in draining state or killing state. In draining
state this is even forbidden.

Must be backported to 2.7.

commit | commitdiff | tree

Frédéric Lécaille [Mon, 24 Apr 2023 09:26:06 +0000 (11:26 +0200)]

BUG/MINOR: quic: Useless probing retransmission in draining or killing state

The timer task responsible of triggering probing retransmission did not inspect
the state of the connection before doing its job. But there is no need to
probe the peer when the connection is in draining or killing state. About the
draining state, this is even forbidden.

Must be backported to 2.7 and 2.6.

commit | commitdiff | tree

Frédéric Lécaille [Mon, 24 Apr 2023 09:20:32 +0000 (11:20 +0200)]

BUG/MINOR: quic: Possible leak during probing retransmissions

qc_dgrams_retransmit() prepares two list of frames to be retransmitted into
two datagrams. If the first datagram could not be sent, the TX buffer will
be purged with the prepared packet and its frames, but this was not the case for
the second list of frames.

Must be backported in 2.7.

commit | commitdiff | tree

Frédéric Lécaille [Mon, 24 Apr 2023 09:11:55 +0000 (11:11 +0200)]

BUG/MINOR: quic: Possible memory leak from TX packets

This bug arrived with this commit which was not sufficient:

BUG/MEDIUM: quic: Missing TX buffer draining from qc_send_ppkts()

Indeed, there were also remaining allocated TX packets to be released and
their TX frames.
Implement qc_purge_tx_buf() to do so which depends on qc_free_tx_coalesced_pkts()
and qc_free_frm_list().

Must be backported to 2.7.

commit | commitdiff | tree

Frédéric Lécaille [Mon, 24 Apr 2023 08:59:33 +0000 (10:59 +0200)]

MINOR: quic: Move traces at proto level

These traces has already been useful to debug issues.

Must be backported to 2.7 and 2.6.

commit | commitdiff | tree

Willy Tarreau [Sun, 23 Apr 2023 08:21:37 +0000 (10:21 +0200)]

[RELEASE] Released version 2.8-dev8

Released version 2.8-dev8 with the following main changes :
    - BUG/MEDIUM: cli: Set SE_FL_EOI flag for '_getsocks' and 'quit' commands
    - BUG/MEDIUM: cli: Eat output data when waiting for appctx shutdown
    - BUG/MEDIUM: http-client: Eat output data when waiting for appctx shutdown
    - BUG/MEDIUM: stats: Eat output data when waiting for appctx shutdown
    - BUG/MEDIUM: log: Eat output data when waiting for appctx shutdown
    - BUG/MEDIUM: dns: Kill idle DNS sessions during stopping stage
    - BUG/MINOR: resolvers: Wakeup DNS idle task on stopping
    - BUG/MEDIUM: resolvers: Force the connect timeout for DNS resolutions
    - MINOR: hlua: Stop to check the SC state when executing a hlua cli command
    - BUG/MEDIUM: mux-h1: Report EOI when a TCP connection is upgraded to H2
    - BUG/MEDIUM: mux-h2: Never set SE_FL_EOS without SE_FL_EOI or SE_FL_ERROR
    - MINOR: quic: Trace fix in quic_pto_pktns() (handshaske status)
    - BUG/MINOR: quic: Wrong packet number space probing before confirmed handshake
    - MINOR: quic: Modify qc_try_rm_hp() traces
    - MINOR: quic: Dump more information at proto level when building packets
    - MINOR: quic: Add a trace for packet with an ACK frame
    - MINOR: activity: add a line reporting the average CPU usage to "show activity"
    - BUG/MINOR: stick_table: alert when type len has incorrect characters
    - MINOR: thread: keep a bitmask of enabled groups in thread_set
    - MINOR: fd: optimize fd_claim_tgid() for use in fd_insert()
    - MINOR: fd: add a lock bit with the tgid
    - MINOR: fd: implement fd_migrate_on() to migrate on a non-local thread
    - MINOR: receiver: reserve special values for "shards"
    - MINOR: bind-conf: support a new shards value: "by-group"
    - BUG/MEDIUM: fd: don't wait for tmask to stabilize if we're not in it.
    - MINOR: quic: Add packet loss and maximum cc window to "show quic"
    - BUG/MINOR: quic: Ignored less than 1ms RTTs
    - MINOR: quic: Add connection flags to traces
    - BUG/MEDIUM: quic: Code sanitization about acknowledgements requirements
    - BUG/MINOR: quic: Possible wrapped values used as ACK tree purging limit.
    - BUG/MINOR: quic: SIGFPE in quic_cubic_update()
    - MINOR: quic: Display the packet number space flags in traces
    - MINOR: quic: Remove a useless test about probing in qc_prep_pkts()
    - BUG/MINOR: quic: Wrong Application encryption level selection when probing
    - CI: bump "actions/checkout" to v3 for cross zoo matrix
    - CI: enable monthly test on Fedora Rawhide
    - BUG/MINOR: stream: Fix test on SE_FL_ERROR on the wrong entity
    - BUG/MEDIUM: stream: Report write timeouts before testing the flags
    - BUG/MEDIUM: stconn: Do nothing in sc_conn_recv() when the SC needs more room
    - MINOR: stream: Uninline and export sess_set_term_flags() function
    - MINOR: filters: Review and simplify errors handling
    - REGTESTS: fix the race conditions in log_uri.vtc
    - MINOR: channel: Forwad close to other side on abort
    - MINOR: stream: Introduce stream_abort() to abort on both sides in same time
    - MINOR: stconn: Rename SC_FL_SHUTR_NOW in SC_FL_ABRT_WANTED
    - MINOR: channel/stconn: Replace channel_shutr_now() by sc_schedule_abort()
    - MINOR: stconn: Rename SC_FL_SHUTW_NOW in SC_FL_SHUT_WANTED
    - MINOR: channel/stconn: Replace channel_shutw_now() by sc_schedule_shutdown()
    - MINOR: stconn: Rename SC_FL_SHUTR in SC_FL_ABRT_DONE
    - MINOR: channel/stconn: Replace sc_shutr() by sc_abort()
    - MINOR: stconn: Rename SC_FL_SHUTW in SC_FL_SHUT_DONE
    - MINOR: channel/stconn: Replace sc_shutw() by sc_shutdown()
    - MINOR: tree-wide: Replace several chn_cons() by the corresponding SC
    - MINOR: tree-wide: Replace several chn_prod() by the corresponding SC
    - BUG/MINOR: cli: Don't close when SE_FL_ERR_PENDING is set in cli analyzer
    - MINOR: stconn: Stop to set SE_FL_ERROR on sending path
    - MEDIUM: stconn: Forbid applets with more to deliver if EOI was reached
    - MINOR: stconn: Don't clear SE_FL_ERROR when endpoint is reset
    - MINOR: stconn: Add a flag to ack endpoint errors at SC level
    - MINOR: backend: Set SC_FL_ERROR on connection error
    - MINOR: stream: Set SC_FL_ERROR on channels' buffer allocation error
    - MINOR: tree-wide: Test SC_FL_ERROR with SE_FL_ERROR from upper layer
    - MEDIUM: tree-wide: Stop to set SE_FL_ERROR from upper layer
    - MEDIUM: backend: Stop to use SE flags to detect connection errors
    - MEDIUM: stream: Stop to use SE flags to detect read errors from analyzers
    - MEDIUM: stream: Stop to use SE flags to detect endpoint errors
    - MEDIUM: stconn: Rely on SC flags to handle errors instead of SE flags
    - BUG/MINOR: stconn: Don't set SE_FL_ERROR at the end of sc_conn_send()
    - BUG/MINOR: quic: Do not use ack delay during the handshakes
    - CLEANUP: use "offsetof" where appropriate
    - MINOR: ssl: remove OpenSSL 1.0.2 mention into certificate loading error
    - BUG/MEDIUM: http-ana: Properly switch the request in tunnel mode on upgrade
    - BUG/MEDIUM: log: Properly handle client aborts in syslog applet
    - MINOR: stconn: Add a flag to report EOS at the stream-connector level
    - MINOR: stconn: Propagate EOS from a mux to the attached stream-connector
    - MINOR: stconn: Propagate EOS from an applet to the attached stream-connector
    - MINOR: mux-h2: make the initial window size configurable per side
    - MINOR: mux-h2: make the max number of concurrent streams configurable per side
    - BUG/MINOR: task: allow to use tasklet_wakeup_after with tid -1
    - CLEANUP: quic: remove unused QUIC_LOCK label
    - CLEANUP: quic: remove unused scid_node
    - CLEANUP: quic: remove unused qc param on stateless reset token
    - CLEANUP: quic: rename quic_connection_id vars
    - MINOR: quic: remove uneeded tasklet_wakeup after accept
    - MINOR: quic: adjust Rx packet type parsing
    - MINOR: quic: adjust quic CID derive API
    - MINOR: quic: remove TID ref from quic_conn
    - MEDIUM: quic: use a global CID trees list
    - MINOR: quic: remove TID encoding in CID
    - MEDIUM: quic: handle conn bootstrap/handshake on a random thread
    - MINOR: quic: do not proceed to accept for closing conn
    - MINOR: protocol: define new callback set_affinity
    - MINOR: quic: delay post handshake frames after accept
    - MEDIUM: quic: implement thread affinity rebinding
    - BUG/MINOR: quic: transform qc_set_timer() as a reentrant function
    - MINOR: quic: properly finalize thread rebinding
    - MAJOR: quic: support thread balancing on accept
    - MINOR: listener: remove unneeded local accept flag
    - BUG/MINOR: http-ana: Update analyzers on both sides when switching in TUNNEL mode
    - CLEANUP: backend: Remove useless debug message in assign_server()
    - CLEANUP: cli: Remove useless debug message in cli_io_handler()
    - BUG/MEDIUM: stconn: Propagate error on the SC on sending path
    - MINOR: config: add "no-alpn" support for bind lines
    - REGTESTS: add a new "ssl_alpn" test to test ALPN negotiation
    - DOC: add missing documentation for "no-alpn" on bind lines
    - MINOR: ssl: do not set ALPN callback with the empty string
    - MINOR: ssl_crtlist: dump "no-alpn" on "show crtlist" when "no-alpn" was set
    - MEDIUM: config: set useful ALPN defaults for HTTPS and QUIC
    - BUG/MEDIUM: quic: prevent crash on Retry sending
    - BUG/MINOR: cfgparse: make sure to include openssl-compat
    - MINOR: clock: add now_mono_time_fast() function
    - MINOR: clock: add now_cpu_time_fast() function
    - MEDIUM: hlua: reliable timeout detection
    - MEDIUM: hlua: introduce tune.lua.burst-timeout
    - CLEANUP: hlua: avoid confusion between internal timers and tick based timers
    - MINOR: hlua: hook yield on known lua state
    - MINOR: hlua: safe coroutine.create()
    - BUG/MINOR: quic: Stop removing ACK ranges when building packets
    - MINOR: quic: Do not allocate too much ack ranges
    - BUG/MINOR: quic: Unchecked buffer length when building the token
    - BUG/MINOR: quic: Wrong Retry token generation timestamp computing
    - BUG/MINOR: mux-quic: fix crash with app ops install failure
    - BUG/MINOR: mux-quic: properly handle STREAM frame alloc failure
    - BUG/MINOR: h3: fix crash on h3s alloc failure
    - BUG/MINOR: quic: prevent crash on qc_new_conn() failure
    - BUG/MINOR: quic: consume Rx datagram even on error
    - CLEANUP: errors: fix obsolete function comments
    - CLEANUP: server: fix update_status() function comment
    - MINOR: server/event_hdl: add proxy_uuid to event_hdl_cb_data_server
    - MINOR: hlua/event_hdl: rely on proxy_uuid instead of proxy_name for lookups
    - MINOR: hlua/event_hdl: expose proxy_uuid variable in server events
    - MINOR: hlua/event_hdl: fix return type for hlua_event_hdl_cb_data_push_args
    - MINOR: server/event_hdl: prepare for upcoming refactors
    - BUG/MINOR: event_hdl: don't waste 1 event subtype slot
    - CLEANUP: event_hdl: updating obsolete comment for EVENT_HDL_CB_DATA
    - CLEANUP: event_hdl: fix comment typo about _sync assertion
    - MINOR: event_hdl: dynamically allocated event data members
    - MINOR: event_hdl: provide event->when for advanced handlers
    - MINOR: hlua/event_hdl: timestamp for events
    - DOC: lua: restore 80 char limitation
    - BUG/MINOR: server: incorrect report for tracking servers leaving drain
    - MINOR: server: explicitly commit state change in srv_update_status()
    - BUG/MINOR: server: don't miss proxy stats update on server state transitions
    - BUG/MINOR: server: don't miss server stats update on server state transitions
    - BUG/MINOR: server: don't use date when restoring last_change from state file
    - MINOR: server: central update for server counters on state change
    - MINOR: server: propagate server state change to lb through single function
    - MINOR: server: propagate lb changes through srv_lb_propagate()
    - MINOR: server: change adm_st_chg_cause storage type
    - MINOR: server: srv_append_status refacto
    - MINOR: server: change srv_op_st_chg_cause storage type
    - CLEANUP: server: remove unused variables in srv_update_status()
    - CLEANUP: server: fix srv_set_{running, stopping, stopped} function comment
    - MINOR: server: pass adm and op cause to srv_update_status()
    - MEDIUM: server: split srv_update_status() in two functions
    - MINOR: server/event_hdl: prepare for server event data wrapper
    - MINOR: quic: support migrating the listener as well
    - MINOR: quic_sock: index li->per_thr[] on local thread id, not global one
    - MINOR: listener: support another thread dispatch mode: "fair"
    - MINOR: receiver: add a struct shard_info to store info about each shard
    - MINOR: receiver: add RX_F_MUST_DUP to indicate that an rx must be duped
    - MEDIUM: proto: duplicate receivers marked RX_F_MUST_DUP
    - MINOR: proto: skip socket setup for duped FDs
    - MEDIUM: config: permit to start a bind on multiple groups at once
    - MINOR: listener: make accept_queue index atomic
    - MEDIUM: listener: rework thread assignment to consider all groups
    - MINOR: listener: use a common thr_idx from the reference listener
    - MINOR: listener: resync with the thread index before heavy calculations
    - MINOR: listener: make sure to avoid ABA updates in per-thread index
    - MINOR: listener: always compare the local thread as well
    - MINOR: Make `tasklet_free()` safe to be called with `NULL`
    - CLEANUP: Stop checking the pointer before calling `tasklet_free()`
    - CLEANUP: Stop checking the pointer before calling `pool_free()`
    - CLEANUP: Stop checking the pointer before calling `task_free()`
    - CLEANUP: Stop checking the pointer before calling `ring_free()`
    - BUG/MINOR: cli: clarify error message about stats bind-process
    - CI: cirrus-ci: bump FreeBSD image to 13-1
    - REGTESTS: remove unsupported "stats bind-process" keyword
    - CI: extend spellchecker whitelist, add "clen" as well
    - CLEANUP: assorted typo fixes in the code and comments
    - BUG/MINOR: sock_inet: use SO_REUSEPORT_LB where available
    - BUG/MINOR: tools: check libssl and libcrypto separately
    - BUG/MINOR: config: fix NUMA topology detection on FreeBSD
    - BUILD: sock_inet: forward-declare struct receiver
    - BUILD: proto_tcp: export the correct names for proto_tcpv[46]
    - CLEANUP: protocol: move the l3_addrlen to plug a hole in proto_fam
    - CLEANUP: protocol: move the nb_receivers to plug a hole in protocol
    - REORG: listener: move the bind_conf's thread setup code to listener.c
    - MINOR: proxy: make proxy_type_str() recognize peers sections
    - MEDIUM: peers: call bind_complete_thread_setup() to finish the config
    - MINOR: protocol: add a flags field to store info about protocols
    - MINOR: protocol: move the global reuseport flag to the protocols
    - MINOR: listener: automatically adjust shards based on support for SO_REUSEPORT
    - MINOR: protocol: add a function to check if some features are supported
    - MINOR: sock: add a function to check for SO_REUSEPORT support at runtime
    - MINOR: protocol: perform a live check for SO_REUSEPORT support
    - MINOR: listener: do not restrict CLI to first group anymore
    - MINOR: listener: add a new global tune.listener.default-shards setting
    - MEDIUM: listener: switch the default sharding to by-group

commit | commitdiff | tree

Willy Tarreau [Sat, 22 Apr 2023 22:51:59 +0000 (00:51 +0200)]

MEDIUM: listener: switch the default sharding to by-group

Sharding by-group is exactly identical to by-process for a single
group, and will use the same number of file descriptors for more than
one group, while significantly lowering the kernel's locking overhead.

Now that all special listeners (cli, peers) are properly handled, and
that support for SO_REUSEPORT is detected at runtime per protocol, there
should be no more reason for now switching to by-group by default.

That's what this patch does. It does only this and nothing else so that
it's easy to revert, should any issue be raised.

Testing on an AMD EPYC 74F3 featuring 24 cores and 48 threads distributed
into 8 core complexes of 3 cores each, shows that configuring 8 groups
(one per CCX) is sufficient to simply double the forwarded connection
rate from 112k to 214k/s, reducing kernel locking from 71 to 55%.

commit | commitdiff | tree

Willy Tarreau [Sat, 22 Apr 2023 20:06:23 +0000 (22:06 +0200)]

MINOR: listener: add a new global tune.listener.default-shards setting

This new setting accepts "by-process", "by-group" and "by-thread" and
will dictate how listeners will be sharded by default when nothing is
specified. While the default remains "by-process", "by-group" should be
much more efficient with many threads, while not changing anything for
single-group setups.

commit | commitdiff | tree

Willy Tarreau [Sat, 22 Apr 2023 20:27:31 +0000 (22:27 +0200)]

MINOR: listener: do not restrict CLI to first group anymore

Now that we're able to run listeners on any set of groups, we don't need
to maintain a special case about the stats socket anymore. It used to be
forced to group 1 only so as to avoid startup failures in case several
groups were configured, but if it's done now, it will automatically bind
the needed FDs to have one per group so this is no more an issue.

commit | commitdiff | tree

Willy Tarreau [Sat, 22 Apr 2023 16:26:56 +0000 (18:26 +0200)]

MINOR: protocol: perform a live check for SO_REUSEPORT support

When testing if a protocol supports SO_REUSEPORT, we're now able to
verify if the OS does really support it. While it may be supported at
build time, it may possibly have been blocked in a container for
example so we'd rather know what it's like.

commit | commitdiff | tree

Willy Tarreau [Sat, 22 Apr 2023 16:25:09 +0000 (18:25 +0200)]

MINOR: sock: add a function to check for SO_REUSEPORT support at runtime

The new function _sock_supports_reuseport() will be used to check if a
protocol type supports SO_REUSEPORT or not. This will be useful to verify
that shards can really work.

commit | commitdiff | tree

Willy Tarreau [Sat, 22 Apr 2023 15:39:30 +0000 (17:39 +0200)]

MINOR: protocol: add a function to check if some features are supported

The new function protocol_supports_flag() checks the protocol flags
to verify if some features are supported, but will support being
extended to refine the tests. Let's use it to check for REUSEPORT.

commit | commitdiff | tree

Willy Tarreau [Sat, 22 Apr 2023 09:38:55 +0000 (11:38 +0200)]

MINOR: listener: automatically adjust shards based on support for SO_REUSEPORT

Now if multiple shards are explicitly requested, and the listener's
protocol doesn't support SO_REUSEPORT, sharding is disabled, which will
result in the socket being automatically duped if needed. A warning is
emitted when this happens. If "shards by-group" or "shards by-thread"
are used, these will automatically be turned down to 1 since we want
this to be possible easily using -dR on the command line without having
to djust the config. For "by-thread", a diag warning will be emitted to
help troubleshoot possible performance issues.

commit | commitdiff | tree

Willy Tarreau [Sat, 22 Apr 2023 13:09:07 +0000 (15:09 +0200)]

MINOR: protocol: move the global reuseport flag to the protocols

Some protocol support SO_REUSEPORT and others not. Some have such a
limitation in the kernel, and others in haproxy itself (e.g. sock_unix
cannot support multiple bindings since each one will unbind the previous
one). Also it's really protocol-dependent and not just family-dependent
because on Linux for some time it was supported for TCP and not UDP.

Let's move the definition to the protocols instead. Now it's preset in
tcp/udp/quic when SO_REUSEPORT is defined, and is otherwise left unset.
The enabled() config condition test validates IPv4 (generally sufficient),
and -dR / noreuseport all protocols at once.

commit | commitdiff | tree

Willy Tarreau [Sat, 22 Apr 2023 13:02:35 +0000 (15:02 +0200)]

MINOR: protocol: add a flags field to store info about protocols

We'll use these flags to know if some protocols are supported, and if
so, with what options/extensions. Reuseport will move there for example.
Two functions were added to globally set/clear a flag.

commit | commitdiff | tree

Willy Tarreau [Sat, 22 Apr 2023 21:52:17 +0000 (23:52 +0200)]

MEDIUM: peers: call bind_complete_thread_setup() to finish the config

The listeners in peers sections were still not handing the thread
groups fine. Shards were silently ignored and if a listener was bound
to more than one group, it would simply fail. Now we can call the
dedicated function to resolve all this and possibly create the missing
extra listeners.

bind_complete_thread_setup() was adjusted to use the proxy_type_str()
instead of writing "proxy" at the only place where this word was still
hard-coded so that we continue to speak about peers sections when
relevant.

commit | commitdiff | tree

Willy Tarreau [Sat, 22 Apr 2023 22:04:36 +0000 (00:04 +0200)]

MINOR: proxy: make proxy_type_str() recognize peers sections

Now proxy_type_str() will emit "peers section" when the mode is set to
peers, so as to ease sharing more code between peers and proxies.

commit | commitdiff | tree

Willy Tarreau [Sat, 22 Apr 2023 21:25:38 +0000 (23:25 +0200)]

REORG: listener: move the bind_conf's thread setup code to listener.c

What used to be only two lines to apply a mask in a loop in
check_config_validity() grew into a 130-line block that performs deeply
listener-specific operations that do not have their place there anymore.
In addition it's worth noting that the peers code still doesn't support
shards nor being bound to more than one group, which is a second reason
for moving that code to its own function. Nothing was changed except
recreating the missing variables from the bind_conf itself (the fe only).

commit | commitdiff | tree

Willy Tarreau [Sat, 22 Apr 2023 12:57:42 +0000 (14:57 +0200)]

CLEANUP: protocol: move the nb_receivers to plug a hole in protocol

This field forces an unaligned hole between two list heads. Let's move
it up where it will be more easily combined with other fields. In
addition, turn it to unsigned while it's still not used.

commit | commitdiff | tree

Willy Tarreau [Sat, 22 Apr 2023 09:18:54 +0000 (11:18 +0200)]

CLEANUP: protocol: move the l3_addrlen to plug a hole in proto_fam

There's a two-byte hole in proto_fam after sock_family, let's move the
l3_addrlen there as a ushort. Note that contrary to what the comment
says, it's still not used by hash algorithms though it could.

commit | commitdiff | tree

Willy Tarreau [Sat, 22 Apr 2023 13:26:02 +0000 (15:26 +0200)]

BUILD: proto_tcp: export the correct names for proto_tcpv[46]

The exported names were not correct (missing the 'v').

commit | commitdiff | tree

Willy Tarreau [Sat, 22 Apr 2023 09:02:04 +0000 (11:02 +0200)]

BUILD: sock_inet: forward-declare struct receiver

Including sock_inet.h without receiver-t.h causes build failures due to
struct receiver not being defined. Let's just forward-declare it.

commit | commitdiff | tree

Willy Tarreau [Sat, 22 Apr 2023 17:26:07 +0000 (19:26 +0200)]

BUG/MINOR: config: fix NUMA topology detection on FreeBSD

In 2.6-dev1, NUMA topology detection was enabled on FreeBSD with commit
f5d48f8b3 ("MEDIUM: cfgparse: numa detect topology on FreeBSD."). But
it suffers from a minor bug which is that it forgets to check for the
number of domains and always emits a confusing warning indicating that
multiple sockets were found while it's not the case.

This can be backported to 2.6.

commit | commitdiff | tree

Willy Tarreau [Sat, 22 Apr 2023 17:47:19 +0000 (19:47 +0200)]

BUG/MINOR: tools: check libssl and libcrypto separately

The lib compatibility checks introduced in 2.8-dev6 with commit c3b297d5a
("MEDIUM: tools: further relax dlopen() checks too consider grouped
symbols") were partially incorrect in that they check at the same time
libcrypto and libssl. But if loading a library that only depends on
libcrypto, the ssl-only symbols will be missing and this might present
an inconsistency. This is what is observed on FreeBSD 13.1 when
libcrypto is being loaded, where it sees two symbols having disappeared.

The fix consists in splitting the checks for libcrypto and libssl.

No backport is needed, unless the patch above finally gets backported.

commit | commitdiff | tree

Willy Tarreau [Sat, 22 Apr 2023 18:12:59 +0000 (20:12 +0200)]

BUG/MINOR: sock_inet: use SO_REUSEPORT_LB where available

On FreeBSD 13.1 I noticed that thread balancing using shards was not
always working. Sometimes several threads would work, but most of the
time a single one was taking all the traffic. This is related to how
SO_REUSEPORT works on FreeBSD since version 12, as it seems there is
no guarantee that multiple sockets will receive the traffic. However
there is SO_REUSEPORT_LB that is designed exactly for this, so we'd
rather use it when available.

This patch may possibly be backported, but nobody complained and it's
not sure that many users rely on shards. So better wait for some feedback
before backporting this.

commit | commitdiff | tree

Ilya Shipitsin [Sat, 22 Apr 2023 18:20:39 +0000 (20:20 +0200)]

CLEANUP: assorted typo fixes in the code and comments

This is 36th iteration of typo fixes

commit | commitdiff | tree

Ilya Shipitsin [Sat, 22 Apr 2023 18:11:35 +0000 (20:11 +0200)]

CI: extend spellchecker whitelist, add "clen" as well

"clen" is all around the code, since codespell cannot distingush
variables names, let us ignore it

commit | commitdiff | tree

Ilya Shipitsin [Sat, 22 Apr 2023 18:09:05 +0000 (20:09 +0200)]

REGTESTS: remove unsupported "stats bind-process" keyword

reg-tests/connection/proxy_protocol_random_fail.vtc fails due to
"stats bind-process":

*** h1 debug|[ALERT] (1476756) : config : parsing [/tmp/haregtests-2023-04-22_19-24-10.diuT6B/vtc.1476661.74f1092e/h1/cfg:7] : 'stats' is
*** h1 debug|not supported anymore.

commit | commitdiff | tree

Ilya Shipitsin [Sat, 22 Apr 2023 17:13:03 +0000 (19:13 +0200)]

CI: cirrus-ci: bump FreeBSD image to 13-1

FreeBSD-13.2 released on April 11, 2023

commit | commitdiff | tree

Willy Tarreau [Sun, 23 Apr 2023 07:40:56 +0000 (09:40 +0200)]

BUG/MINOR: cli: clarify error message about stats bind-process

In 2.7-dev2, "stats bind-process" was removed by commit 94f763b5e
("MEDIUM: config: remove deprecated "bind-process" directives from
frontends") and an error message indicates that it's no more supported.
However it says "stats" is not supported instead of "stats bind-process",
making it a bit confusing.

This should be backported to 2.7.

commit | commitdiff | tree

Tim Duesterhus [Sat, 22 Apr 2023 15:47:35 +0000 (17:47 +0200)]

CLEANUP: Stop checking the pointer before calling `ring_free()`

Changes performed with this Coccinelle patch:

    @@
    expression e;
    @@

    - if (e != NULL) {
     ring_free(e);
    - }

    @@
    expression e;
    @@

    - if (e) {
     ring_free(e);
    - }

    @@
    expression e;
    @@

    - if (e)
     ring_free(e);

    @@
    expression e;
    @@

    - if (e != NULL)
     ring_free(e);

commit | commitdiff | tree

Tim Duesterhus [Sat, 22 Apr 2023 15:47:34 +0000 (17:47 +0200)]

CLEANUP: Stop checking the pointer before calling `task_free()`

Changes performed with this Coccinelle patch:

    @@
    expression e;
    @@

    - if (e != NULL) {
     task_destroy(e);
    - }

    @@
    expression e;
    @@

    - if (e) {
     task_destroy(e);
    - }

    @@
    expression e;
    @@

    - if (e)
     task_destroy(e);

    @@
    expression e;
    @@

    - if (e != NULL)
     task_destroy(e);

commit | commitdiff | tree

Tim Duesterhus [Sat, 22 Apr 2023 15:47:33 +0000 (17:47 +0200)]

CLEANUP: Stop checking the pointer before calling `pool_free()`

Changes performed with this Coccinelle patch:

    @@
    expression e;
    expression p;
    @@

    - if (e != NULL) {
     pool_free(p, e);
    - }

    @@
    expression e;
    expression p;
    @@

    - if (e) {
     pool_free(p, e);
    - }

    @@
    expression e;
    expression p;
    @@

    - if (e)
     pool_free(p, e);

    @@
    expression e;
    expression p;
    @@

    - if (e != NULL)
     pool_free(p, e);

commit | commitdiff | tree

Tim Duesterhus [Sat, 22 Apr 2023 15:47:32 +0000 (17:47 +0200)]

CLEANUP: Stop checking the pointer before calling `tasklet_free()`

Changes performed with this Coccinelle patch:

    @@
    expression e;
    @@

    - if (e != NULL) {
     tasklet_free(e);
    - }

    @@
    expression e;
    @@

    - if (e) {
     tasklet_free(e);
    - }

    @@
    expression e;
    @@

    - if (e)
     tasklet_free(e);

    @@
    expression e;
    @@

    - if (e != NULL)
     tasklet_free(e);

See GitHub Issue #2126

commit | commitdiff | tree

Tim Duesterhus [Sat, 22 Apr 2023 15:47:31 +0000 (17:47 +0200)]

MINOR: Make `tasklet_free()` safe to be called with `NULL`

Make this freeing function safe, like other freeing functions are as discussed
in GitHub issue #2126.

commit | commitdiff | tree

Willy Tarreau [Wed, 19 Apr 2023 16:06:16 +0000 (18:06 +0200)]

MINOR: listener: always compare the local thread as well

By comparing the local thread's load with the least loaded thread's
load, we can further improve the fairness and at the same time also
improve locality since it allows a small ratio of connections not to
be migrated. This is visible on CPU usage with long connections on
very large thread counts (224) and high bandwidth (200G). The cost
of checking the local thread's load remains fairly low so there's no
reason not to do this. We continue to update the index if we select
the local thread, because it means that the two other threads were
both more loaded so we'd rather find better ones.

commit | commitdiff | tree

Willy Tarreau [Thu, 20 Apr 2023 14:52:21 +0000 (16:52 +0200)]

MINOR: listener: make sure to avoid ABA updates in per-thread index

One limitation of the current thread index mechanism is that if the
values are assigned multiple times to the same thread and the index
loops, it can match again the old value, which will not prevent a
competing thread from finishing its CAS and assigning traffic to a
thread that's not the optimal one. The probability is low but the
solution is simple enough and consists in implementing an update
counter in the high bits of the index to force a mismatch in this
case (assuming we don't try to cover for extremely unlikely cases
where the update counter loops while the index remains equal). So
let's do that. In order to improve the situation a little bit, we
now set the index to a ulong so that in 32 bits we have 8 bits of
counter and in 64 bits we have 40 bits.

commit | commitdiff | tree

Willy Tarreau [Wed, 19 Apr 2023 15:19:28 +0000 (17:19 +0200)]

MINOR: listener: resync with the thread index before heavy calculations

During heavy accept competition, the CAS will occasionally fail and
we'll have to go through all the calculation again. While the first
two loops look heavy, they're almost never taken so they're quite
cheap. However the rest of the operation is heavy because we have to
consult connection counts and queue indexes for other threads, so
better double-check if the index is still valid before continuing.
Tests show that it's more efficient do retry half-way like this.

commit | commitdiff | tree

Willy Tarreau [Wed, 29 Mar 2023 15:02:17 +0000 (17:02 +0200)]

MINOR: listener: use a common thr_idx from the reference listener

Instead of seeing each listener use its own thr_idx, let's use the same
for all those from a shard. It should provide more accurate and smoother
thread allocation.

commit | commitdiff | tree

Willy Tarreau [Mon, 27 Mar 2023 08:38:51 +0000 (10:38 +0200)]

MEDIUM: listener: rework thread assignment to consider all groups

Till now threads were assigned in listener_accept() to other threads of
the same group only, using a single group mask. Now that we have all the
relevant info (array of listeners of the same shard), we can spread the
thr_idx to cover all assigned groups. The thread indexes now contain the
group number in their upper bits, and the indexes run over te whole list
of threads, all groups included.

One particular subtlety here is that switching to a thread from another
group also means switching the group, hence the listener. As such, when
changing the group we need to update the connection's owner to point to
the listener of the same shard that is bound to the target group.

commit | commitdiff | tree

Willy Tarreau [Thu, 20 Apr 2023 09:05:28 +0000 (11:05 +0200)]

MINOR: listener: make accept_queue index atomic

There has always been a race when checking the length of an accept queue
to determine which one is more loaded that another, because the head and
tail are read at two different moments. This is not required, we can merge
them as two 16 bit numbers inside a single 32-bit index that is always
accessed atomically. This way we read both values at once and always have
a consistent measurement.

commit | commitdiff | tree

Willy Tarreau [Mon, 27 Feb 2023 15:42:32 +0000 (16:42 +0100)]

MEDIUM: config: permit to start a bind on multiple groups at once

Now it's possible for a bind line to span multiple thread groups. When
this happens, the first one will become the reference and will be entirely
set up, and the subsequent ones will be duplicated from this reference,
so that they can be registered in distinct groups. The reference is
always setup and started first so it is always available when the other
ones are started.

The doc was updated to reflect this new possibility with its limitations
and impacts, and the differences with the "shards" option.

commit | commitdiff | tree

Willy Tarreau [Mon, 27 Feb 2023 15:40:54 +0000 (16:40 +0100)]

MINOR: proto: skip socket setup for duped FDs

It's not strictly necessary, but it's still better to avoid setting
up the same socket multiple times when it's being duplicated to a few
FDs. We don't change that for inherited ones however since they may
really need to be set up, so we only skip duplicated ones.

commit | commitdiff | tree

Willy Tarreau [Mon, 27 Feb 2023 15:39:32 +0000 (16:39 +0100)]

MEDIUM: proto: duplicate receivers marked RX_F_MUST_DUP

The different protocol's ->bind() function will now check the receiver's
RX_F_MUST_DUP flag to decide whether to bind a fresh new listener from
scratch or reuse an existing one and just duplicate it. It turns out
that the existing code already supports reusing FDs since that was done
as part of the FD passing and inheriting mechanism. Here it's not much
different, we pass the FD of the reference receiver, it gets duplicated
and becomes the new receiver's FD.

These FDs are also marked RX_F_INHERITED so that they are not exported
and avoid being touched directly (only the reference should be touched).

commit | commitdiff | tree

Willy Tarreau [Mon, 27 Feb 2023 15:37:21 +0000 (16:37 +0100)]

MINOR: receiver: add RX_F_MUST_DUP to indicate that an rx must be duped

The purpose of this new flag will be to mark that some listeners
duplicate their reference's FD instead of trying to setup a completely
new listener from scratch. This will be used when multiple groups want
to listen to the same socket, via multiple FDs.

commit | commitdiff | tree

Willy Tarreau [Wed, 1 Mar 2023 17:25:58 +0000 (18:25 +0100)]

MINOR: receiver: add a struct shard_info to store info about each shard

In order to create multiple receivers for one multi-group shard, we'll
need some more info about the shard. Here we store:
  - the number of groups (= number of receivers)
  - the number of threads (will be used for accept LB)
  - pointer to the reference rx (to get the FD and to find all threads)
  - pointers to the other members (to iterate over all threads)

For now since there's only one group per shard it remains simple. The
listener deletion code already takes care of removing the current
member from its shards list and moving others' reference to the last
one if it was their reference (so as to avoid o(n^2) updates during
ordered deletes).

Since the vast majority of setups will not use multi-group shards, we
try to save memory usage by only allocating the shard_info when it is
needed, so the principle here is that a receiver shard_info==NULL is
alone and doesn't share its socket with another group.

Various approaches were considered and tests show that the management
of the listeners during boot makes it easier to just attach to or
detach from a shard_info and automatically allocate it if it does not
exist, which is what is being done here.

For now the attach code is not called, but detach is already called
on delete.

commit | commitdiff | tree

Willy Tarreau [Thu, 20 Apr 2023 13:40:38 +0000 (15:40 +0200)]

MINOR: listener: support another thread dispatch mode: "fair"

This new algorithm for rebalancing incoming connections to multiple
threads is simpler and instead of considering the threads load, it will
only cycle through all of them, offering a fair share of the traffic to
each thread. It may be well suited for short-lived connections but is
also convenient for very large thread counts where it's not always certain
that the least loaded thread will always be found.

commit | commitdiff | tree

Willy Tarreau [Fri, 21 Apr 2023 08:46:45 +0000 (10:46 +0200)]

MINOR: quic_sock: index li->per_thr[] on local thread id, not global one

There's a li_per_thread array in each listener for use with QUIC
listeners. Since thread groups were introduced, this array can be
allocated too large because global.nbthread is allocated for each
listener, while only no more than MIN(nbthread,MAX_THREADS_PER_GROUP)
may be used by a single listener. This was because the global thread
ID is used as the index instead of the local ID (since a listener may
only be used by a single group). Let's just switch to local ID and
reduce the allocated size.

commit | commitdiff | tree

Willy Tarreau [Thu, 20 Apr 2023 17:03:49 +0000 (19:03 +0200)]

MINOR: quic: support migrating the listener as well

When migrating a quic_conn to another thread, we may need to also
switch the listener if the thread belongs to another group. When
this happens, the freshly created connection will already have the
target listener, so let's just pick it from the connection and use
it in qc_set_tid_affinity(). Note that it will be the caller's
responsibility to guarantee this.

commit | commitdiff | tree

Aurelien DARRAGON [Fri, 24 Mar 2023 16:02:53 +0000 (17:02 +0100)]

MINOR: server/event_hdl: prepare for server event data wrapper

Adding the possibility to publish an event using a struct wrapper
around existing SERVER events to provide additional contextual info.

Using the specific struct wrapper is not required: it is supported
to cast event data as a regular server event data struct so
that we don't break the existing API.

However, casting event data with a more explicit data type allows
to fetch event-only relevant hints.

commit | commitdiff | tree

Aurelien DARRAGON [Wed, 19 Apr 2023 14:19:58 +0000 (16:19 +0200)]

MEDIUM: server: split srv_update_status() in two functions

Considering that srv_update_status() is now synchronous again since
3ff577e1 ("MAJOR: server: make server state changes synchronous again"),
and that we can easily identify if the update is from an operational
or administrative context thanks to "MINOR: server: pass adm and op cause
to srv_update_status()".

And given that administrative and operational updates cannot be cumulated
(since srv_update_status() is called synchronously and independently for
admin updates and state/operational updates, and the function directly
consumes the changes).

We split srv_update_status() in 2 distinct parts:

Either <type> is 0, meaning the update is an operational update which
is handled by directly looking at cur_state and next_state to apply the
proper transition.
Also, the check to prevent operational state from being applied
if MAINT admin flag is set is no longer needed given that the calling
functions already ensure this (ie: srv_set_{running,stopping,stopped)

Or <type> is 1, meaning the update is an administrative update, where
cur_admin and next_admin are evaluated to apply the proper transition and
deduct the resulting server state (next_state is updated implicitly).

Once this is done, both operations share a common code path in
srv_update_status() to update proxy and servers stats if required.

Thanks to this change, the function's behavior is much more predictable,
it is not an all-in-one function anymore. Either we apply an operational
change, else it is an administrative change. That's it, we cannot mix
the 2 since both code paths are now properly separated.

commit | commitdiff | tree

Aurelien DARRAGON [Tue, 18 Apr 2023 09:00:17 +0000 (11:00 +0200)]

MINOR: server: pass adm and op cause to srv_update_status()

Operational and administrative state change causes are not propagated
through srv_update_status(), instead they are directly consumed within
the function to provide additional info during the call when required.

Thus, there is no valid reason for keeping adm and op causes within
server struct. We are wasting space and keeping uneeded complexity.

We now exlicitly pass change type (operational or administrative) and
associated cause to srv_update_status() so that no extra storage is
needed since those values are only relevant from srv_update_status().

commit | commitdiff | tree

Aurelien DARRAGON [Wed, 19 Apr 2023 08:33:02 +0000 (10:33 +0200)]

CLEANUP: server: fix srv_set_{running, stopping, stopped} function comment

Fixing function comments for the server state changing function since they
still refer to asynchonous propagation of server state which is no longer
in play.
Moreover, there were some mixups between running/stopping.

commit | commitdiff | tree

Aurelien DARRAGON [Tue, 18 Apr 2023 10:08:18 +0000 (12:08 +0200)]

CLEANUP: server: remove unused variables in srv_update_status()

check and px local variable aliases are not very useful.
Let's remove them and use s->check and s->proxy instead.

commit | commitdiff | tree

Aurelien DARRAGON [Tue, 4 Apr 2023 08:17:40 +0000 (10:17 +0200)]

MINOR: server: change srv_op_st_chg_cause storage type

This one is greatly inspired by "MINOR: server: change adm_st_chg_cause storage type".

While looking at current srv_op_st_chg_cause usage, it was clear that
the struct needed some cleanup since some leftovers from asynchronous server
state change updates were left behind and resulted in some useless code
duplication, and making the whole thing harder to maintain.

Two observations were made:

- by tracking down srv_set_{running, stopped, stopping} usage,
  we can see that the <reason> argument is always a fixed statically
  allocated string.
- check-related state change context (duration, status, code...) is
  not used anymore since srv_append_status() directly extracts the
  values from the server->check. This is pure legacy from when
  the state changes were applied asynchronously.

To prevent code duplication, useless string copies and make the reason/cause
more exportable, we store it as an enum now, and we provide
srv_op_st_chg_cause() function to fetch the related description string.
HEALTH and AGENT causes (check related) are now explicitly identified to
make consumers like srv_append_op_chg_cause() able to fetch checks info
from the server itself if they need to.

commit | commitdiff | tree

Aurelien DARRAGON [Fri, 14 Apr 2023 16:07:09 +0000 (18:07 +0200)]

MINOR: server: srv_append_status refacto

srv_append_status() has become a swiss-knife function over time.
It is used from server code and also from checks code, with various
inputs and distincts code paths, making it very hard to guess the
actual behavior of the function (resulting string output).

To simplify the logic behind it, we're dividing it in multiple contextual
functions that take simple inputs and do explicit things, making them
more predictable and easier to maintain.

commit | commitdiff | tree

Aurelien DARRAGON [Mon, 3 Apr 2023 15:40:28 +0000 (17:40 +0200)]

MINOR: server: change adm_st_chg_cause storage type

Even though it doesn't look like it at first glance, this is more like
a cleanup than an actual code improvement:

Given that srv->adm_st_chg_cause has been used to exclusively store
static strings ever since it was implemented, we make the choice to
store it as an enum instead of a fixed-size string within server
struct.

This will allow to save some space in server struct, and will make
it more easily exportable (ie: event handlers) because of the
reduced memory footprint during handling and the ability to later get
the corresponding human-readable message when it's explicitly needed.

commit | commitdiff | tree

Aurelien DARRAGON [Tue, 18 Apr 2023 10:02:48 +0000 (12:02 +0200)]

MINOR: server: propagate lb changes through srv_lb_propagate()

Now that we have a generic srv_lb_propagate(s) function, let's
use it each time we explicitly wan't to set the status down as
well.

Indeed, it is tricky to try to handle "down" case explicitly,
instead we use srv_lb_propagate() which will call the proper
function that will handle the new server state.

This will allow some code cleanup and will prevent any logic
error.

This commit depends on:
- "MINOR: server: propagate server state change to lb through single function"

commit | commitdiff | tree

Aurelien DARRAGON [Mon, 17 Apr 2023 11:53:56 +0000 (13:53 +0200)]

MINOR: server: propagate server state change to lb through single function

Use a dedicated helper function to propagate server state change to
lb algorithms, since it is performed at multiple places within
srv_update_status() function.

commit | commitdiff | tree

Aurelien DARRAGON [Wed, 19 Apr 2023 16:22:21 +0000 (18:22 +0200)]

MINOR: server: central update for server counters on state change

Based on "BUG/MINOR: server: don't miss server stats update on server
state transitions", we're also taking advantage of the new centralized
logic to update down_trans server counter directly from there instead
of multiple places.

commit | commitdiff | tree

Aurelien DARRAGON [Fri, 21 Apr 2023 08:58:32 +0000 (10:58 +0200)]

BUG/MINOR: server: don't use date when restoring last_change from state file

When restoring from a state file: the server "Status" reports weird values on
the html stats page:

"5s UP" becomes -> "? UP" after the restore

This is due to a bug in srv_state_srv_update(): when restoring the states
from a state file, we rely on date.tv_sec to compute the process-relative
server last_change timestamp.

This is wrong because everywhere else we use now.tv_sec when dealing
with last_change, for instance in srv_update_status().

date (which is Wall clock time) deviates from now (monotonic time) in the
long run.

They should not be mixed, and given that last_change is an internal time value,
we should rely on now.tv_sec instead.

last_change export through "show servers state" cli is safe since we export
a delta and not the raw time value in dump_servers_state():

srv_time_since_last_change = now.tv_sec - srv->last_change

--

While this bug affects all stable versions, it was revealed in 2.8 thanks
to 28360dc ("MEDIUM: clock: force internal time to wrap early after boot")
This is due to the fact that "now" immediately deviates from "date",
whereas in the past they had the same value when starting.

Thus prior to 2.8 the bug is trickier since it could take some time for
date and now to deviate sufficiently for the issue to arise, and instead
of reporting absurd values that are easy to spot it could just result in
last_change becoming inconsistent over time.

As such, the fix should be backported to all stable versions.
[for 2.2 the patch needs to be applied manually since
srv_state_srv_update() was named srv_update_state() and can be found in
server.c instead of server_state.c]

commit | commitdiff | tree

Aurelien DARRAGON [Tue, 18 Apr 2023 11:52:27 +0000 (13:52 +0200)]

BUG/MINOR: server: don't miss server stats update on server state transitions

s->last_change and s->down_time updates were manually updated for each
effective server state change within srv_update_status().

This is rather error-prone, and as a result there were still some state
transitions that were not handled properly since at least 1.8.

ie:
- when transitionning from DRAIN to READY: downtime was updated
  (which is wrong since a server in DRAIN state should not be
   considered as DOWN)
- when transitionning from MAINT to READY: downtime was not updated
  (this can be easily seen in the html stats page)

To fix these all at once, and prevent similar bugs from being introduced,
we centralize the server last_change and down_time stats logic at the end
of srv_update_status():

If the server state changed during the call, then it means that
last_change must be updated, with a special case when changing from
STOPPED state which means the server was previously DOWN and thus
downtime should be updated.

This patch depends on:

- "MINOR: server: explicitly commit state change in srv_update_status()"

This could be backported to every stable versions.

commit | commitdiff | tree

Aurelien DARRAGON [Mon, 17 Apr 2023 13:30:11 +0000 (15:30 +0200)]

BUG/MINOR: server: don't miss proxy stats update on server state transitions

backend "down" stats logic has been duplicated multiple times in
srv_update_status(), resulting in the logic now being error-prone.

For example, the following bugfix was needed to compensate for a
copy-paste introduced bug: d332f139
("BUG/MINOR: server: update last_change on maint->ready transitions too")

While the above patch works great, we actually forgot to update the
proxy downtime like it is done for other down->up transitions...
This is simply illustrating that the current design is error-prone,
it is very easy to miss something in this area.

To properly update the proxy downtime stats on the maint->ready transition,
to cleanup srv_update_status() and to prevent similar bugs from being
introduced in the future, proxy/backend stats update are now automatically
performed at the end of the server state change if needed.

Thus we can remove existing updates that were performed at various places
within the function, this simplifies things a bit.

This patch depends on:
- "MINOR: server: explicitly commit state change in srv_update_status()"

This could be backported to all stable versions.

Backport notes:

2.2:

Replace
struct task *srv_cleanup_toremove_conns(struct task *task, void *context, unsigned int state)

by
struct task *srv_cleanup_toremove_connections(struct task *task, void *context, unsigned short state)

commit | commitdiff | tree

Aurelien DARRAGON [Mon, 17 Apr 2023 15:45:58 +0000 (17:45 +0200)]

MINOR: server: explicitly commit state change in srv_update_status()

As shown in 8f29829 ("BUG/MEDIUM: checks: a down server going to
maint remains definitely stucked on down state."), state changes
that don't result in explicit lb state change, require us to perform
an explicit server state commit to make sure the next state is
applied before returning from the function.

This is the case for server state changes that don't trigger lb logic
and only perform some logging.

This is quite error prone, we could easily forget a state change
combination that could result in next_state, next_admin or next_eweight
not being applied. (cur_state, cur_admin and cur_eweight would be left
with unexpected values)

To fix this, we explicitly call srv_lb_commit_status() at the end
of srv_update_status() to enforce the new values, even if they were
already applied. (when a state changes requires lb state update
an implicit commit is already performed)

Applying the state change multiple times is safe (since the next value
always points to the current value).

Backport notes:

2.2:

Replace
struct task *srv_cleanup_toremove_conns(struct task *task, void *context, unsigned int state)

by
struct task *srv_cleanup_toremove_connections(struct task *task, void *context, unsigned short state)

commit | commitdiff | tree

Aurelien DARRAGON [Mon, 27 Mar 2023 08:17:31 +0000 (10:17 +0200)]

BUG/MINOR: server: incorrect report for tracking servers leaving drain

Report message for tracking servers completely leaving drain is wrong:

The check for "leaving drain .. via" never evaluates because the
condition !(s->next_admin & SRV_ADMF_FDRAIN) is always true in the
current block which is guarded by !(s->next_admin & SRV_ADMF_DRAIN).

For tracking servers that leave inherited drain mode, this results in the
following message being emitted:
"Server x/b is UP (leaving forced drain)"

Instead of:
"Server x/b is UP (leaving drain) via x/a"

To this fix: we check if FDRAIN is currently set, else it means that the
drain status is inherited from the tracked server (IDRAIN)

This regression was introduced with 64cc49cf ("MAJOR: servers: propagate server
status changes asynchronously."), thus it may be backported to every stable
versions.

commit | commitdiff | tree

Aurelien DARRAGON [Thu, 20 Apr 2023 10:16:17 +0000 (12:16 +0200)]

DOC: lua: restore 80 char limitation

Restore 80 char limitation throughout the file for easier reading
on the cli, and fix some raw formatting issues without altering
html rendering.

commit | commitdiff | tree

Aurelien DARRAGON [Thu, 20 Apr 2023 09:32:46 +0000 (11:32 +0200)]

MINOR: hlua/event_hdl: timestamp for events

'when' optional argument is provided to lua event handlers.

It is an integer representing the number of seconds elapsed since Epoch
and may be used in conjunction with lua `os.date()` function to provide
a custom format string.

commit | commitdiff | tree

Aurelien DARRAGON [Tue, 4 Apr 2023 19:41:10 +0000 (21:41 +0200)]

MINOR: event_hdl: provide event->when for advanced handlers

For advanced async handlers only
(Registered using EVENT_HDL_ASYNC_TASK() macro):

event->when is provided as a struct timeval and fetched from 'date'
haproxy global variable.

Thanks to 'when', related event consumers will be able to timestamp
events, even if they don't work in real-time or near real-time.
Indeed, unlike sync or normal async handlers, advanced async handlers
could purposely delay the consumption of pending events, which means
that the date wouldn't be accurate if computed directly from within
the handler.

commit | commitdiff | tree

Aurelien DARRAGON [Thu, 23 Mar 2023 18:09:15 +0000 (19:09 +0100)]

MINOR: event_hdl: dynamically allocated event data members

Add the ability to provide a cleanup function for event data passed
via the publishing function.

One use case could be the need to provide valid pointers in the safe
section of the data struct.
Cleanup function will be automatically called with data (or copy of data)
as argument when all handlers consumed the event, which provides an easy
way to release some memory or decrement refcounts to ressources that were
provided through the data struct.
data in itself may not be freed by the cleanup function, it is handled
by the API.

This would allow passing large (allocated) data blocks through the data
struct while keeping data struct size under the EVENT_HDL_ASYNC_EVENT_DATA
size limit.

To do so, when publishing an event, where we would currently do:

        struct event_hdl_cb_data_new_family event_data;

        /* safe data, available from both sync and async contexts
* may not use pointers to short-living resources
*/
        event_data.safe.my_custom_data = x;

        /* unsafe data, only available from sync contexts */
        event_data.unsafe.my_unsafe_data = y;

        /* once data is prepared, we can publish the event */
        event_hdl_publish(NULL,
                          EVENT_HDL_SUB_NEW_FAMILY_SUBTYPE_1,
                          EVENT_HDL_CB_DATA(&event_data));

We could do:

        struct event_hdl_cb_data_new_family event_data;

        /* safe data, available from both sync and async contexts
* may not use pointers to short-living resources,
* unless EVENT_HDL_CB_DATA_DM is used to ensure pointer
* consistency (ie: refcount)
*/
        event_data.safe.my_custom_static_data = x;
event_data.safe.my_custom_dynamic_data = malloc(1);

        /* unsafe data, only available from sync contexts */
        event_data.unsafe.my_unsafe_data = y;

        /* once data is prepared, we can publish the event */
        event_hdl_publish(NULL,
                          EVENT_HDL_SUB_NEW_FAMILY_SUBTYPE_1,
                          EVENT_HDL_CB_DATA_DM(&event_data, data_new_family_cleanup));

With data_new_family_cleanup func which would look like this:

      void data_new_family_cleanup(const void *data)
      {
       const struct event_hdl_cb_data_new_family *event_data = ptr;

/* some data members require specific cleanup once the event
* is consumed
*/
       free(event_data.safe.my_custom_dynamic_data);
/* don't ever free data! it is not ours */
      }

Not sure if this feature will become relevant in the future, so I prefer not
to mention it in the doc for now.

But given that the implementation is trivial and does not put a burden
on the existing API, it's a good thing to have it there, just in case.

commit | commitdiff | tree

Aurelien DARRAGON [Tue, 4 Apr 2023 19:43:31 +0000 (21:43 +0200)]

CLEANUP: event_hdl: fix comment typo about _sync assertion

Fixing a comment relative to EVENT_HDL_ASSERT_SYNC macro where a
typo was made and the comment was lacking some context.

commit | commitdiff | tree

Aurelien DARRAGON [Thu, 23 Mar 2023 16:46:45 +0000 (17:46 +0100)]

CLEANUP: event_hdl: updating obsolete comment for EVENT_HDL_CB_DATA

EVENT_HDL_CB_DATA macro comments were not updated during the API
refactor, fixing that.

commit | commitdiff | tree

Aurelien DARRAGON [Thu, 30 Mar 2023 10:17:47 +0000 (12:17 +0200)]

BUG/MINOR: event_hdl: don't waste 1 event subtype slot

ESUB_INDEX(n) index macro is used exclusively with n > 0
Fixing it so that it starts numbering at 1 instead of 2.

This way, we don't waste a subtype slot in event_hdl_sub_type
struct, and we comply with the structure comments about max
supported event subtypes (currently set at 16).

If 68e692da0 ("MINOR: event_hdl: add event handler base api")
is being backported, then this commit should be backported with it.

commit | commitdiff | tree

Aurelien DARRAGON [Thu, 23 Mar 2023 13:39:51 +0000 (14:39 +0100)]

MINOR: server/event_hdl: prepare for upcoming refactors

This commit does nothing that ought to be mentioned, except that
it adds missing comments and slighty moves some function calls
out of "sensitive" code in preparation of some server code refactors.

commit | commitdiff | tree

Aurelien DARRAGON [Mon, 27 Mar 2023 16:16:21 +0000 (18:16 +0200)]

MINOR: hlua/event_hdl: fix return type for hlua_event_hdl_cb_data_push_args

Changing hlua_event_hdl_cb_data_push_args() return type to void since it
does not return anything useful.
Also changing its name to hlua_event_hdl_cb_push_args() since it does more
than just pushing cb data argument (it also handles event type and mgmt).

Errors catched by the function are reported as lua errors.

commit | commitdiff | tree

Aurelien DARRAGON [Wed, 22 Mar 2023 16:49:04 +0000 (17:49 +0100)]

MINOR: hlua/event_hdl: expose proxy_uuid variable in server events

Adding proxy_uuid to ServerEvent class.
proxy_uuid contains the uuid of the proxy to which the server belongs

commit | commitdiff | tree

Aurelien DARRAGON [Wed, 22 Mar 2023 16:46:12 +0000 (17:46 +0100)]

MINOR: hlua/event_hdl: rely on proxy_uuid instead of proxy_name for lookups

Since "MINOR: server/event_hdl: add proxy_uuid to event_hdl_cb_data_server"
we may now use proxy_uuid variable to perform proxy lookups when
handling a server event.

It is more reliable since proxy_uuid isn't subject to any size limitation

commit | commitdiff | tree

Aurelien DARRAGON [Wed, 22 Mar 2023 16:35:47 +0000 (17:35 +0100)]

MINOR: server/event_hdl: add proxy_uuid to event_hdl_cb_data_server

Expose proxy_uuid variable in event_hdl_cb_data_server struct to
overcome proxy_name fixed length limitation.

proxy_uuid may be used by the handler to perform proxy lookups.
This should be preferred over lookups relying proxy_name.
(proxy_name is suitable for printing / logging purposes but not for
ID lookups since it has a maximum fixed length)

commit | commitdiff | tree

Aurelien DARRAGON [Mon, 27 Mar 2023 09:57:28 +0000 (11:57 +0200)]

CLEANUP: server: fix update_status() function comment

srv_update_status() function comment says that the function "is designed
to be called asynchronously".

While this used to be true back then with 64cc49cf
("MAJOR: servers: propagate server status changes asynchronously.")

This is not true anymore since 3ff577e ("MAJOR: server: make server state changes
synchronous again")

Fixing the comment in order to better reflect current behavior.

commit | commitdiff | tree

Aurelien DARRAGON [Tue, 4 Apr 2023 20:04:35 +0000 (22:04 +0200)]

CLEANUP: errors: fix obsolete function comments

Since 9f903af5 ("MEDIUM: log: slightly refine the output format of
alerts/warnings/etc"), messages generated by ha_{alert,warning,notice}
don't embed date/time information anymore.

Updating some old function comments that kept saying otherwise.

commit | commitdiff | tree

Amaury Denoyelle [Wed, 19 Apr 2023 12:26:16 +0000 (14:26 +0200)]

BUG/MINOR: quic: consume Rx datagram even on error

A BUG_ON crash can occur on qc_rcv_buf() if a Rx packet allocation
failed.

To fix this, datagram are marked as consumed even if a fatal error
occured during parsing. For the moment, only a Rx packet allocation
failure could provoke this. At this stage, it's unknown if the datagram
were partially parsed or not at all so it's better to discard it
completely.

This bug was detected using -dMfail argument.

This should be backported up to 2.7.

commit | commitdiff | tree

Amaury Denoyelle [Wed, 19 Apr 2023 08:45:40 +0000 (10:45 +0200)]

BUG/MINOR: quic: prevent crash on qc_new_conn() failure

Properly initialize el_th_ctx member first on qc_new_conn(). This
prevents a segfault if release should be called later due to memory
allocation failure in the function on qc_detach_th_ctx_list().

This should be backported up to 2.7.

commit | commitdiff | tree

Amaury Denoyelle [Wed, 19 Apr 2023 09:49:16 +0000 (11:49 +0200)]

BUG/MINOR: h3: fix crash on h3s alloc failure

Do not emit a CONNECTION_CLOSE on h3s allocation failure. Indeed, this
causes a crash as the calling function qcs_new() will also try to emit a
CONNECTION_CLOSE which triggers a BUG_ON() on qcc_emit_cc().

This was reproduced using -dMfail.

This should be backported up to 2.7.

commit | commitdiff | tree

Amaury Denoyelle [Wed, 19 Apr 2023 09:42:24 +0000 (11:42 +0200)]

BUG/MINOR: mux-quic: properly handle STREAM frame alloc failure

Previously, if a STREAM frame cannot be allocated for emission, a crash
would occurs due to an ABORT_NOW() statement in _qc_send_qcs().

Replace this by proper error code handling. Each stream were sending
fails are removed temporarily from qcc::send_list to a list local to
_qc_send_qcs(). Once emission has been conducted for all streams,
reinsert failed stream to qcc::send_list. This avoids to reloop on
failed streams on the second while loop at the end of _qc_send_qcs().

This crash was reproduced using -dMfail.

This should be backported up to 2.6.

commit | commitdiff | tree

Amaury Denoyelle [Wed, 19 Apr 2023 15:58:39 +0000 (17:58 +0200)]

BUG/MINOR: mux-quic: fix crash with app ops install failure

On MUX initialization, the application layer is setup via
qcc_install_app_ops(). If this function fails MUX is deallocated and an
error is returned.

This code path causes a crash before connection has been registered
prior into the mux_stopping_data::list for stopping idle frontend conns.
To fix this, insert the connection later in qc_init() once no error can
occured.

The crash was seen on the process closing with SUGUSR1 with a segfault
on mux_stopping_process(). This was reproduced using -dMfail.

This regression was introduced by the following patch :
commit b4d119f0c75ce7c5a977ece18dc975e14f9b460c
BUG/MEDIUM: mux-quic: fix crash on H3 SETTINGS emission

This should be backported up to 2.7.

commit | commitdiff | tree

Frédéric Lécaille [Wed, 19 Apr 2023 15:31:28 +0000 (17:31 +0200)]

BUG/MINOR: quic: Wrong Retry token generation timestamp computing

Again a now_ms variable value used without the ticks API. It is used
to store the generation time of the Retry token to be received back
from the client.

Must be backported to 2.6 and 2.7.

commit | commitdiff | tree

Frédéric Lécaille [Tue, 18 Apr 2023 12:42:40 +0000 (14:42 +0200)]

BUG/MINOR: quic: Unchecked buffer length when building the token

As server, an Initial does not contain a token but only the token length field
with zero as value. The remaining room was not checked before writting this field.

Must be backported to 2.6 and 2.7.

commit | commitdiff | tree

Frédéric Lécaille [Mon, 17 Apr 2023 12:10:14 +0000 (14:10 +0200)]

MINOR: quic: Do not allocate too much ack ranges

Limit the maximum number of ack ranges to QUIC_MAX_ACK_RANGES(32).

Must be backported to 2.6 and 2.7.

Mirror of https://github.com/haproxy/haproxy.git

RSS Atom