git.ipfire.org Git - thirdparty/haproxy.git/log

BUG/MEDIUM: stick-tables: properly mark stktable_data as packed

The stktable_data union is made of types of varying sizes, and depending
on which types are stored in a table, some offsets might not necessarily
be aligned. This results in a bus error for certain regtests (e.g.
lb-services) on MIPS64. This bug may impact MIPS64, SPARC64, armv7 when
accessing a 64-bit counter (e.g. bytes) and depending on how the compiler
emitted the operation, and cause a trap that's emulated by the OS on RISCV
(heavy cost). x86_64 and armv8 are not affected at all.

Let's properly mark the struct with __attribute__((packed)) so that the
compiler emits the suitable unaligned-compatible instructions when
accessing the fields.

This should be backported to all versions where it applies.

BUG/MEDIUM: htx: mark htx_sl as packed since it may be realigned

A test on MIPS64 revealed that the following reg tests would all
fail at the same place in htx_replace_stline() when updating
parts of the request line:
  reg-tests/cache/if-modified-since.vtc
  reg-tests/http-rules/h1or2_to_h1c.vtc
  reg-tests/http-rules/http_after_response.vtc
  reg-tests/http-rules/normalize_uri.vtc
  reg-tests/http-rules/path_and_pathq.vtc

While the status line is normally aligned since it's the first block
of the HTX, it may become unaligned once replaced. The problem is, it
is a structure which contains some u16 and u32, and dereferencing them
on machines not natively supporting unaligned accesses makes them crash
or handle crap. Typically, MIPS/MIPS64/SPARC will crash, ARMv5 will
either crash or (more likely) return swapped values and do crap, and
RISCV will trap and turn to slow emulation.

We can assign the htx_sl struct the packed attribute, but then this
also causes the ints to fill the 2-bytes gap before them, always causing
unaligned accesses for this part on such machines. The patch does a bit
better, by explicitly filling this two-bytes hole, and packing the
struct.

This should be backported to all versions.

BUG/MINOR: qpack: fix error code reported on QPACK decoding failure

qpack_decode_fs() is used to decode QPACK field section on HTTP/3
headers parsing. Its return value is incoherent as it returns either
QPACK_DECOMPRESSION_FAILED defined in RFC 9204 or any other internal
values defined in qpack-dec.h. On failure, such return code is reused by
HTTP/3 layer to be reported via a CONNECTION_CLOSE frame. This is
incorrect if an internal error values was reported as it is not defined
by any specification.

Fir return values of qpack_decode_fs() in two ways. Firstly, fix invalid
usages of QPACK_DECOMPRESSION_FAILED when decoded content is too large
for the correct internal error QPACK_ERR_TOO_LARGE.

Secondly, adjust qpack_decode_fs() API to only returns internal code
values. A new internal enum QPACK_ERR_DECOMP is defined to replace
QPACK_DECOMPRESSION_FAILED. Caller is responsible to convert it to a
suitable error value. For other internal values, H3_INTERNAL_ERROR is
used. This is done through a set of convert functions.

This should be backported up to 2.6. Note that trailers are not
supported in 2.6 so chunk related to h3_trailers_to_htx() can be safely
skipped.

BUG/MINOR: mux-quic: fix error code on shutdown for non HTTP/3

qcc_shutdown() is called whenever the connection must be closed. If
application protocol defined its owned shutdown callback, it is invoked
to use the correct error code. Else transport error code NO_ERROR is
used.

A bug occurs in the latter case as NO_ERROR is used with quic_err_app()
which is reserved for application errro codes. This will trigger the
emission of a CONNECTION_CLOSE of type 0x1d (Application) instead of
0x1c (Transport).

This bug is considered minor as it does not impact QUIC with HTTP/3. It
may only be visible when using experimental HTTP/0.9 protocol.

This should be backported up to 2.6. For 2.6, patch must be completed
rewritten due to code differences. Here is the change to apply :

  diff --git a/src/mux_quic.c b/src/mux_quic.c
  index 26fb70ddf..c48f82e27 100644
  --- a/src/mux_quic.c
  +++ b/src/mux_quic.c
  @@ -1918,7 +1918,9 @@ static void qc_release(struct qcc *qcc)
                          qc_send(qcc);
                  }
                  else {
  -                       qcc_emit_cc_app(qcc, QC_ERR_NO_ERROR, 0);
  +                       /* Duplicate from qcc_emit_cc_app() for Transport error code. */
  +                       if (!(qcc->conn->handle.qc->flags & QUIC_FL_CONN_IMMEDIATE_CLOSE))
  +                               qcc->conn->handle.qc->err = quic_err_transport(QC_ERR_NO_ERROR);
                  }
          }

BUG/MEDIUM: server: clear purgeable conns before server deletion

Since the following commit, idle connections are cleared before a server
is deleted. This is better than blocking server deletion due to inactive
connections :

6e0afb2e274952663957121ea33cb6bae574fc2e
MEDIUM: server: close idle conn on server deletion

A BUG_ON() has been added to ensure that server idle conn counter is nul
after these connections are removed. However, Willy managed to trigger
it easily by repeatedly and randomly delete servers accross a
single-thread haproxy using a server-template with 1000 instances. In
parallel, a h1load client is executed to generate traffic.

This BUG_ON() reflected that it some connections referencing the server
targetted for deletion remained, even though idle server list is empty.
In fact, this is caused by connections scheduled for purging. These
connections are moved from idle server list to a global toremove_list
while still being accounted by the server.

A first approach could be to decrement server idle counter while moving
connection to the purge list. However, this is functionnaly incorrect as
these purgeable connections still reference the server and it could
cause a crash if cleared after it.

The correct fix for this issue is simply to remove every purgeable
connections before a server is deleted. This is implemented by this
patch by extending cli_parse_delete_server(). It could be enough to only
remove connections targetted the deleted server, but as these
connections will be purged anyway it is justified to clear the whole
list.

This must not be backported, unless the above mentionned patch is.

MEDIUM: hlua: take nbthread into account in hlua_get_nb_instruction()

Based on Willy's idea (from 3.0-dev6 announcement message): in this patch
we try to reduce the max latency that can be caused by running lua scripts
with default settings.

Indeed, by default, hlua engine is allowed to process up to 10k
instructions per batch. While this value was found to be the optimal one
for a single thread, it turns out that keeping a thread busy for 10k lua
instructions could increase thread contention. This is especially true
when the script is loaded with 'lua-load', because in that case the
current thread owns the main lua lock and prevent other threads from
making any progress if they're also waiting on the main lock.

Thanks to Thierry Fournier's work, we know that performance-wise we can
reach optimal performance by sticking between 500 and 10k instructions
per batch. Given that, when the script is loaded using 'lua-load', if no
"tune.lua.forced-yield" was set by the user, we automatically divide the
default value (10K) by the number of threads haproxy can use to reduce
thread contention (given that all threads could compete for the main lua
lock), however we make sure not to return a value below 500, because
Thierry's work showed that this would come with a significant performance
loss.

The historical behavior may still be enforced by setting
"tune.lua.forced-yield" to 10000 in the global config section.

MINOR: hlua: add hlua_nb_instruction getter

No functional behavior change, but this will ease the work of dynamically
computing hlua_nb_instruction value depending on various inputs.

DOC: Update UUID references to RFC 9562

When support for UUIDv7 was added in commit
aab6477b67415c4cc260bba5df359fa2e6f49733
the specification still was a draft.

It has since been published as RFC 9562.

This patch updates all UUID references from the obsoleted RFC 4122 and the
draft for RFC 9562 to the published RFC 9562.

REGTESTS: ssl: be more verbose with ocsp_compat_check.vtc

the ocsp_compat_check.vtc reg-test is difficult to debug given than the
haproxy output is piped in `grep -q`.

This patch helps by showing the haproxy output as well as the return
code.

MINOR: rhttp: Don't require SSL when attach-srv name parsing

An attach-srv config line usually looks like this:
tcp-request session attach-srv be/srv name ssl_c_s_dn(CN)

while a rhttp server line usually looks like this:
server srv rhttp@ sni req.hdr(host)

The server sni argument is used as a key for looking up connection in
the connection pool. The attach-srv name argument is used as a key for
inserting connections into the pool. For it to work correctly they must
match. There was a check that either both the attach-srv and server
provide that key or neither does.

It also checked that SSL and SNI was activated on the server. However,
thanks to current connect_server() implementation, it appears that SNI
is usable even without SSL to identify a connection in the pool. Thus,
it can be diverted from its original intent in reverse HTTP case to
serve even without SSL activated. For example, this could be useful to
use `fc_pp_unique_id` as a name expression (DISCLAIMER: note that for
now PROXY protocol is not compatible with rhttp).

Error is still reported if either SNI or name is used without the other.
This patch adjust the message to a more helpful one.

Arguably it would be easier to understand if instead of using `name` and
`sni` for `attach-srv` and `server` rules it used the same term in both
places - like "conn-pool-key" or something. That would make it clear
that the two must match.

BUG/MINOR: log: smp_rgs array issues with inherited global log directives

When a log directive is defined in the global section, each time we use
"log global" in a proxy section, the global log directives are duplicated
for the current proxy. This works by creating a new proxy logger struct
and duplicating every members for each global one.

However, smp_rgs logger member is a special pointer member that is
allocated when "range" is used on a log directive. Currently, we simply
copy the array pointer (from the global one), instead of creating our own
copy. Because of that, range log sampling may not work properly in some
situations prior to 3f1284560 ("MINOR: log: remove the unused curr_idx in
struct smp_log_range") when used in global log directives, for instance:

  global
    log 127.0.0.1:5114 format raw sample 1-2,3:4 local0 info # should receive 75% of all proxy logs
    log 127.0.0.1:5115 format raw sample 4:4 local0 info     # should receive 25% of all proxy logs

  listen proxy1
    log global

  listen proxy2
    log global

May not work as expected, because curr_idx was stored within smp_rgs array
member prior to 3f1284560, and due to this bug, it happens to be shared
between every log directive inherited from a "global" one. The result is
that curr_idx counter will not behave properly because the index will be
increased globally instead of per-log directive, and it could even suffer
from concurrent thread accesses under load since we don't own the global
log directive's lock when manipulating it.

Another issue that was revealed because of this bug is that the smp_rgs
array allocated during config parsing is never freed in free_logger(),
resulting in small memory leak during clean exit.

To fix these issues all at once, let's properly duplicate smp_rgs logger
struct member in dup_logger() like we already do for other special members
so that every log directive have its own sms_rgs copy, and then
systematically free it in free_logger().

While this bug affects all stable versions (including 2.4), it's probably
best to not backport this beyond 2.6 because of 211ea252d
("BUG/MINOR: logs: fix logsrv leaks on clean exit") prerequisite that
first appears in 2.6.

[ada: for versions prior to 2.9, 969e212
("MINOR: log: add dup_logsrv() helper function") and 76acde91
("BUG/MINOR: log: keep the ref in dup_logger()") must be backported
first.
Note: Some ctx adjustments should be performed because 'logger' struct
used to be named 'logsrv' in the past and 2.9 introduced logger target
struct member. Thus it's probably easier to manually apply 76acde91 and
the current bugfix by hand directly on top of 969e212.
]

BUG/MINOR: log: fix leak in add_sample_to_logformat_list() error path

If add_sample_to_logformat_list() fails to allocate new logformat_node,
then we directly jump to error_free label to cleanup the node using
free_logformat_node() before returning an error.

However if the node failed to allocate, then the sample expression that
was allocated just before (not yet assigned) isn't released
(free_logformat_node() is a no-op when NULL is provided). Thus if expr
wasn't assigned to the node during early failure, then it must be manually
released.

This bug was introduced by 2462e5bcc ("BUG/MINOR: log: fix potential
lf->name memory leak") which wasn't marked for backports. It only
affects 3.0.

CI: drop asan.log umbrella completely

asan.log redirection appeared to work poorly, let's cease that practice
for good.

ML: https://www.mail-archive.com/haproxy@formilux.org/msg44844.html

[RELEASE] Released version 3.0-dev11

Released version 3.0-dev11 with the following main changes :
    - BUILD: clock: improve check for pthread_getcpuclockid()
    - CI: add Illumos scheduled workflow
    - CI: netbsd: limit scheduled workflow to parent repo only
    - OPTIM: log: resolve logformat options during postparsing
    - BUG/MINOR: haproxy: only tid 0 must not sleep if got signal
    - REGTEST: add tests for acl() sample fetch
    - BUG/MINOR: acl: support built-in ACLs with acl() sample
    - BUG/MINOR: cfgparse: use curproxy global var from config post validation
    - MEDIUM: stconn/muxes: Add an abort reason for SE shutdowns on muxes
    - MINOR: mux-h2: Set the SE abort reason when a RST_STREAM frame is received
    - MEDIUM: mux-h2: Forward h2 client cancellations to h2 servers
    - MINOR: mux-quic: Set tha SE abort reason when a STOP_SENDING frame is received
    - MINOR: stconn: Add samples to retrieve about stream aborts
    - MINOR: mux-quic: Add .ctl callback function to get info about a mux connection
    - MINOR: muxes: Add ctl commands to get info on streams for a connection
    - MINOR: connection: Add samples to retrieve info on streams for a connection
    - BUG/MEDIUM: log/ring: broken syslog octet counting
    - BUG/MEDIUM: mux-quic: fix crash on STOP_SENDING received without SD
    - DOC: lua: fix filters.txt file location
    - MINOR: dynbuf: pass a criticality argument to b_alloc()
    - MINOR: dynbuf: add functions to help queue/requeue buffer_wait fields
    - MINOR: dynbuf: use the b_queue()/b_requeue() functions everywhere
    - MEDIUM: dynbuf: make the buffer_wq an array of list heads
    - CLEANUP: tinfo: better align fields in thread_ctx
    - MINOR: dynbuf: provide a b_dequeue() function to detach a bw from the queue
    - MEDIUM: dynbuf: generalize the use of b_dequeue() to detach buffer_wait
    - MEDIUM: dynbuf/stream: re-enable queueing upon failed buffer allocation
    - MEDIUM: dynbuf/stream: do not allocate the buffers in the callback
    - MEDIUM: applet: make appctx_buf_available() only wake the applet up, not allocate
    - MINOR: applet: set the blocking flag in the buffer allocation function
    - MINOR: applet: adjust the allocation criticity based on the requested buffer
    - MINOR: dynbuf/mux-h1: use different criticalities for buffer allocations
    - MEDIUM: dynbuf/mux-h1: do not allocate the buffers in the callback
    - MEDIUM: dynbuf: refrain from offering a buffer if more critical ones are waiting
    - MINOR: stconn: report that a buffer allocation succeeded
    - MINOR: stream: report that a buffer allocation succeeded
    - MINOR: applet: report about buffer allocation success
    - MINOR: mux-h1: report that a buffer allocation succeeded
    - MEDIUM: stream: allocate without queuing when retrying
    - MEDIUM: channel: allocate without queuing when retrying
    - MEDIUM: mux-h1: allocate without queuing when retrying
    - MEDIUM: dynbuf: implement emergency buffers
    - MEDIUM: dynbuf: use emergency buffers upon failed memory allocations

MEDIUM: dynbuf: use emergency buffers upon failed memory allocations

Now, if a pool_alloc() fails for a buffer and if conditions are met
based on the queue number, we'll try to get an emergency buffer.

Thanks to this the situation is way more stable now. With only 4 reserve
buffers and 1 buffer it's possible to reliably serve 500 concurrent end-
to-end H1 connections and consult stats in parallel in loops showing the
growing number of buf_wait events in "show activity" without facing an
instant stall like in the past. Lower values still cause quick stalls
though.

It's also apparent that some subsystems do not seem to detach from the
buffer_wait lists when leaving. For example several crashes in the H1
part showed list elements still present after a free(), so maybe some
operations performed inside h1_release() after the b_dequeue() call
can sometimes result in a new allocation. Same for streams, where
the dequeue is done relatively early.

MEDIUM: dynbuf: implement emergency buffers

The buffer reserve set by tune.buffers.reserve has long been unused, and
in order to deal gracefully with failed memory allocations we'll need to
resort to a few emergency buffers that are pre-allocated per thread.

These buffers are only for emergency use, so every time their count is
below the configured number a b_free() will refill them. For this reason
their count can remain pretty low. We changed the default number from 2
to 4 per thread, and the minimum value is now zero (e.g. for low-memory
systems). The tune.buffers.limit setting has always been a problem when
trying to deal with the reserve but now we could simplify it by simply
pushing the limit (if set) to match the reserve. That was already done in
the past with a static value, but now with threads it was a bit trickier,
which is why the per-thread allocators increment the limit on the fly
before allocating their own buffers. This also means that the configured
limit is saner and now corresponds to the regular buffers that can be
allocated on top of emergency buffers.

At the moment these emergency buffers are not used upon allocation
failure. The only reason is to ease bisecting later if needed, since
this commit only has to deal with resource management.

MEDIUM: mux-h1: allocate without queuing when retrying

Now when trying to allocate a buffer, we can check if we've been notified
of availability via the callback, in which case we should not consult the
queue, or if we're doing a first allocation and check the queue. At this
point it still doesn't change much since the stream still doesn't make use
of it but some progress is expected.

MEDIUM: channel: allocate without queuing when retrying

Now when trying to allocate a channel buffer, we can check if we've been
notified of availability via the producer stream connector callback, in
which case we should not consult the queue, or if we're doing a first
allocation and check the queue.

MEDIUM: stream: allocate without queuing when retrying

Now when trying to allocate the work buffer, we can check if we've been
notified of availability via the buf_wait callback, in which case we
should not consult the queue, or if we're doing a first allocation and
check the queue.

MINOR: mux-h1: report that a buffer allocation succeeded

When the buffer allocation callback is notified of a buffer availability,
it will now set a MAYALLOC flag in addition to clearing the ALLOC one, for
each of the 3 levels where we may fail an allocation. The flag will be
cleared upon a successful allocation. This will soon be used to decide to
re-allocate without waiting again in the queue. For now it has no effect.

There's just a trick, we need to clear the various *_ALLOC flags before
testing h1_recv_allowed() otherwise it will return false!

MINOR: applet: report about buffer allocation success

When appctx_buf_available() is called, it now sets APPCTX_FL_IN_MAYALLOC
or APPCTX_FL_OUT_MAYALLOC depending on the reportedly permitted buffer
allocation, and these flags are cleared when the said buffers are
allocated. For now they're not used for anything else.

MINOR: stream: report that a buffer allocation succeeded

When the buffer allocation callback is notified of a buffer availability,
it will now set a MAYALLOC flag on the stream so that the stream knows it
is allowed to bypass the queue checks. For now this is not used.

MINOR: stconn: report that a buffer allocation succeeded

We used to have two states for the channel's input buffer used by the SC,
NEED_BUFF or not, flipped by sc_need_buff() and sc_have_buff(). We want to
have a 3rd state, indicating that we've just got a desired buffer. Let's
add an HAVE_BUFF flag that is set by sc_have_buff() and that is cleared by
sc_used_buff(). This way by looking at HAVE_BUFF we know that we're coming
back from the allocation callback and that the offered buffer has not yet
been used.

MEDIUM: dynbuf: refrain from offering a buffer if more critical ones are waiting

Now b_alloc() will check the queues at the same and higher criticality
levels before allocating a buffer, and will refrain from allocating one
if these are not empty. The purpose is to put some priorities in the
allocation order so that most critical allocators are offered a chance
to complete.

However in order to permit a freshly dequeued task to allocate again while
siblings are still in the queue, there is a special DB_F_NOQUEUE flag to
pass to b_alloc() that will take care of this special situation.

MEDIUM: dynbuf/mux-h1: do not allocate the buffers in the callback

One of the problematic designs with the buffer_wait mechanism is that
the callbacks pre-allocate the buffers and stay in the run queue for
a while, resulting in all of the few buffers being assigned to waiting
tasks instead of being all available to one task that needs them all at
once.

Here we simply stop doing this, the callback clears the waiting flags
and wakes the task up so that it has a chance of still finding some
buffers.

MINOR: dynbuf/mux-h1: use different criticalities for buffer allocations

While it could certainly still be improved, this first approach consists
in assigning buffers like this in the H1 mux:
  - h1c->obuf : DB_MUX_TX
  - h1c->ibuf : DB_MUX_RX
  - h1s->rxbuf: DB_SE_RX

That's done via 3 distinct functions for better code clarity, and it
also allowed to move the missing buffer flags assignment there.

Among possible improvements would be to take into consideration the
state of the parser (i.e. no data yet vs data, or headers vs payload)
so that even server beginning of response or pure payload can be lowered
in priority.

MINOR: applet: adjust the allocation criticity based on the requested buffer

When we want to allocate an in buffer, it's in order to pass data to
the applet, that will consume it, so it must be seen as the same as
a send() from the higher level, i.e. MUX_TX. And for the outbuf, it's
a stream endpoint returning data, i.e. DB_SE_RX.

MINOR: applet: set the blocking flag in the buffer allocation function

Instead of having each caller of appctx_get_buf() think about setting
the blocking flag, better have the function do it, since it's already
handling the queue anyway. This way we're sure that both are consistent.

MEDIUM: applet: make appctx_buf_available() only wake the applet up, not allocate

Now we don't want bufwait handlers to preallocate the resources they
were expecting since it contributes to the shortage. Let's just wake
the applet up and that's all.

MEDIUM: dynbuf/stream: do not allocate the buffers in the callback

One of the problematic designs with the buffer_wait mechanism is that
the callbacks pre-allocate the buffers and stay in the run queue for
a while, resulting in all of the few buffers being assigned to waiting
tasks instead of being all available to one task that needs them all at
once.

Here we simply stop doing this, the callback clears the waiting flags
and wakes the task up so that it has a chance of still finding some
buffers.

MEDIUM: dynbuf/stream: re-enable queueing upon failed buffer allocation

The errors were not working fine anyway since we know that upon low memory
condition everything freezes. However we have a chance to do better now,
so let's start by re-enabling queueing when allocations fail.

MEDIUM: dynbuf: generalize the use of b_dequeue() to detach buffer_wait

Now thanks to this the bufq_map field is expected to remain accurate.

MINOR: dynbuf: provide a b_dequeue() function to detach a bw from the queue

Now that we need to keep the bitmap in sync with the list heads, we don't
want tasks to leave just doing a LIST_DEL_INIT() without updating the map.
Let's provide a b_dequeue() function for that purpose. The function detects
when it's going to remove the last element and figures the queue number
based on the pointer since it points to the root. It's not used yet.

CLEANUP: tinfo: better align fields in thread_ctx

The introduction of buffer_wq[] in thread_ctx pushed a few fields around
and the cache line alignment is less satisfying. And more importantly, even
before this, all the lists in the local parts were 8-aligned, with the first
one split across two cache lines.

We can do better:
  - sched_profile_entry is not atomic at all, the data it points to is
    atomic so it doesn't need to be in the atomic-only region, and it can
    fill the 8-hole before the lists
  - the align(2*void) that was only before tasklets[] moves before all
    lists (and it's a nop for now)

This now makes the lists and buffer_wq[] start on a cache line boundary,
leaves 48 bytes after the lists before the atomic-only cache line, and
leaves a full cache line at the end for 128-alignment. This way we still
have plenty of room in both parts with better aligned fields.

MEDIUM: dynbuf: make the buffer_wq an array of list heads

Let's turn the buffer_wq into an array of 4 list heads. These are chosen
by criticality. The DB_CRIT_TO_QUEUE() macro maps each criticality level
into one of these 4 queues. The goal here clearly is to make it possible
to wake up the most critical queues in priority in order to let some tasks
finish their job and release buffers that others can use.

In order to avoid having to look up all queues, a bit map indicates which
queues are in use, which also allows to avoid looping in the most common
case where queues are empty..

MINOR: dynbuf: use the b_queue()/b_requeue() functions everywhere

The code places that were used to manipulate the buffer_wq manually
now just call b_queue() or b_requeue(). This will simplify the multiple
list management later.

MINOR: dynbuf: add functions to help queue/requeue buffer_wait fields

When failing an allocation we always do the same dance, add the
buffer_wait struct to a list if it's not, and return. Let's just add
dedicated functions to centralize this, this will be useful to implement
a bit more complex logic.

For now they're not used.

MINOR: dynbuf: pass a criticality argument to b_alloc()

The goal is to indicate how critical the allocation is, between the
least one (growing an existing buffer ring) and the topmost one (boot
time allocation for the life of the process).

The 3 tcp-based muxes (h1, h2, fcgi) use a common allocation function
to try to allocate otherwise subscribe. There's currently no distinction
of direction nor part that tries to allocate, and this should be revisited
to improve this situation, particularly when we consider that mux-h2 can
reduce its Tx allocations if needed.

For now, 4 main levels are planned, to translate how the data travels
inside haproxy from a producer to a consumer:
  - MUX_RX:   buffer used to receive data from the OS
  - SE_RX:    buffer used to place a transformation of the RX data for
              a mux, or to produce a response for an applet
  - CHANNEL:  the channel buffer for sync recv
  - MUX_TX:   buffer used to transfer data from the channel to the outside,
              generally a mux but there can be a few specificities (e.g.
              http client's response buffer passed to the application,
              which also gets a transformation of the channel data).

The other levels are a bit different in that they don't strictly need to
allocate for the first two ones, or they're permanent for the last one
(used by compression).

DOC: lua: fix filters.txt file location

At the beginning of the filter class section, we encourage the user to
check out filters.txt file to get to know how the filters API works
within haproxy.

However the file location is incorrect. The proper directory to look for
the file is: doc/internals/api.

It should be backported up to 2.5.

BUG/MEDIUM: mux-quic: fix crash on STOP_SENDING received without SD

Abort reason code received on STOP_SENDING is notified to upper layer
since the following commit :
367ce1ebf3e4cead319a9f01581037c9f0280e77
MINOR: mux-quic: Set tha SE abort reason when a STOP_SENDING frame is received

However, this causes a crash when a STOP_SENDING is received on a QCS
instance without any stream instantiated. Fix this by checking first if
qcs->sd is not NULL before setting abort code.

This bug can easily be reproduced by emitting a STOP_SENDING as first
frame of a stream.

This should fix github issue #2563.

This does not need to be backported.

BUG/MEDIUM: log/ring: broken syslog octet counting

As reported by Tristan in GH #2561, syslog messages sent over rings are
malformed since commit 01aa0a05 ("MEDIUM: ring: change the ring reader
to use the new vector-based API now").

Indeed, take a look at the following log message produced prior to
01aa0a05:

181 <134>1 2024-05-07T09:45:21.543263+02:00 - haproxy 113700 - - 127.0.0.1:56136 [07/May/2024:09:45:21.491] front front/s1 0/0/21/30/51 404 369 - - ---- 1/1/0/0/0 0/0 "GET / HTTP/1.1"

Starting with 01aa0a05, here's the equivalent log message:

<134>1 2024-05-07T09:45:21.543263+02:00 - haproxy 112729 - - 127.0.0.1:56136 [07/May/2024:09:45:21.491] front front/s1 0/0/66/39/105 404 369 - - ---- 1/1/0/0/0 0/0 "GET / HTTP/1.1"-fwr

-> Message is missing octet counting header, and garbage bytes are found
at the end of the payload.

This bug is caused by a small mistake in syslog_applet_append_event():
when the function was refactored to use vector API instead of buffer
API, we used 'trash.area' as starting pointer to write the event instead
of 'trash.area + trash.data', causing existing octet counting prefix
(already written in trash) to be overwritten and trash.data to be
wrongly incremented.

No backport needed (01aa0a05 was introduced during 3.0 development)

MINOR: connection: Add samples to retrieve info on streams for a connection

Thanks to the previous fix, it is now possible to get the number of opened
streams for a connection and the negociated limit. Here, corresponding
sample feches are added, in fc_ and bc_ scopes.

On frontend side, the limit of streams is imposed by HAProxy. But on the
backend side, the limit is defined by the server. it may be useful for
debugging purpose because it may explain slow-downs on some processing.

MINOR: muxes: Add ctl commands to get info on streams for a connection

There are 2 new ctl commands that may be used to retrieve the current number
of streams openned for a connection and its limit (the maximum number of
streams a mux connection supports).

For the PT and H1 muxes, the limit is always 1 and the current number of
streams is 0 for idle connections, otherwise 1 is returned.

For the H2 and the FCGI muxes, info are already available in the mux
connection.

For the QUIC mux, the limit is also directly available. It is the maximum
initial sub-ID of bidirectional stream allowed for the connection. For the
current number of streams, it is the number of SC attached on the connection
and the number of not already attached streams present in the "opening_list"
list.

MINOR: mux-quic: Add .ctl callback function to get info about a mux connection

Other muxes implement this callback function. It was not implemented for the
QUIC mux because it was useless. It will be used to retrieve the current/max
number of stream for a quic connection. So let's added it, adding the
default support for MUX_CTL_EXIT_STATUS command.

MINOR: stconn: Add samples to retrieve about stream aborts

It is now possible to retrieve some info about the abort received for a
server or a client stream, if any.

  * fs.aborted and bs.aborted can be used to know if an abort was received
    on frontend or backend side. A boolean is returned.

  * fs.rst_code and bs.rst_code return the code of the received RESET_STREAM
    frame for a H2 stream or the code of the received STOP_SENDING frame for
    a QUIC stream. In both cases, the error code attached to the frame is
    returned. The sample fetch fails if no such frame was received or if the
    stream is not an H2/QUIC stream.

MINOR: mux-quic: Set tha SE abort reason when a STOP_SENDING frame is received

When STOP_SENDING frame is received for a quic stream, the error code is now
saved in the SE abort reason. To do so, we use the QUIC source
(SE_ABRT_SRC_MUX_QUIC). For now, this code is only set but not used on the
opposite side.

MEDIUM: mux-h2: Forward h2 client cancellations to h2 servers

When a H2 client sends a RST_STREAM(CANCEL) frame to abort a request, the
abort reason is now used on server side, in the H2 mux, to set the
RST_STREAM code. The main use case is to forward client cancellations to
gRPC applications.

This patch should fix the issue #172.

MINOR: mux-h2: Set the SE abort reason when a RST_STREAM frame is received

When RST_STREAM frame is received, the error code is now saved in the SE
abort reason. To do so, we use the H2 source (SE_ABRT_SRC_MUX_H2). For now,
this code is only set but not used on the opposite side.

MEDIUM: stconn/muxes: Add an abort reason for SE shutdowns on muxes

A reason is now passed as parameter to muxes shutdowns to pass additional
info about the abort, if any. No info means no abort or only generic one.

For now, the reason is composed of 2 32-bits integer. The first on represents
the abort code and the other one represents the info about the code (for
instance the source). The code should be interpreted according to the associated
info.

One info is the source, encoding on 5 bits. Other bits are reserverd for now.
For now, the muxes are the only supported source. But we can imagine to extend
it to applets, streams, health-checks...

The current design is quite simple and will most probably evolved.. But the
idea is to let the opposite side forward some errors and let's a mux know
why its stream was aborted. At first glance, a abort reason must only be
evaluated if SE_SHW_SILENT flag is set.

The main goal at short term, is to forward some H2 RST_STREAM codes because
it is mandatory for gRPC applications, mainly to forward gRPC cancellation
from an H2 client to an H2 server. But we can imagine to alter this reason
at the applicative level to enrich it. It would also be used to report more
accurate errors in logs.

BUG/MINOR: cfgparse: use curproxy global var from config post validation

Previously check_config_validity() had its own curproxy variable. This
resulted in the acl() sample fetch being unable to determine which
proxy was in use when used from within log-format statements. This
change addresses the issue by having the check_config_validity()
function use the global variable instead.

BUG/MINOR: acl: support built-in ACLs with acl() sample

Built-in ACLs were not being searched by the acl() sample fetch. This
fixes that so they are searched if no other match is found.

REGTEST: add tests for acl() sample fetch

This adds reg tests for the recently added acl() sample fetch

BUG/MINOR: haproxy: only tid 0 must not sleep if got signal

This patch fixes the commit eea152ee68
("BUG/MINOR: signals/poller: ensure wakeup from signals").

There is some probability that run_poll_loop() becomes inifinite, if
TH_FL_SLEEPING is withdrawn from all threads in the second signal_queue_len
check, when a signal has received just after the first one.

In such particular case, the 'wake' variable, which is used to terminate
thread's poll loop is never reset to 0. So, we never enter to the "stopping"
part of the run_poll_loop() and threads, except the one with id 0 (tid 0
handles signals), will continue to call _do_poll() eternally and will never
sleep, as its TH_FL_SLEEPING flag was unset.

This flag needs to be removed only for the tid 0, as it was done in the first
signal_queue_len check.

This fixes an issue #2537 "infinite loop when shutting down".

This fix must be backported in every stable version.

OPTIM: log: resolve logformat options during postparsing

In lf_buildctx_prepare(), we perform costly bitwise operations for every
nodes to resolve node options and check for incompatibilities with global
options.

In fact, all this logic may safely be performed during postparsing. This
is what we're doing in this commit. Doing so saves us from unnecessary
runtime checks and could help speedup sess_build_logline().

Since checks are not as costly as before (due to them being performed
during postparsing and not on log building path anymore), an complementary
check for OPT_HTTP vs OPT_ENCODE incompatibity was added:

  encoding is ignored if HTTP option is set, unless HTTP option wasn't
  set globally and encoding was set globally, which means encoding
  takes the precedence

Thanks to this patch, lf_buildctx_prepare() now only takes care of
assigning proper typecast and options settings depending if it's used
from global or per-node context, and prepares CBOR-specific structure
members when CBOR encode option is set.

CI: netbsd: limit scheduled workflow to parent repo only

it is not very useful for most of forks.

CI: add Illumos scheduled workflow

this is very initial build only implementation.

BUILD: clock: improve check for pthread_getcpuclockid()

if _POSIX_THREAD_CPUTIME is greater than 0, pthread_getcpuclockid()
is implemented.

This should fix the build on Solaris 11.

Reference: https://docs.oracle.com/cd/E88353_01/html/E37842/unistd-3head.html
ML: https://www.mail-archive.com/haproxy@formilux.org/msg44915.html

[RELEASE] Released version 3.0-dev10

Released version 3.0-dev10 with the following main changes :
    - BUG/MEDIUM: cache: Vary not working properly on anything other than accept-encoding
    - REGTESTS: cache: Add test on 'vary' other than accept-encoding
    - BUG/MINOR: stats: replace objt_* by __objt_* macros
    - CLEANUP: tools/cbor: rename cbor_encode_ctx struct members
    - MINOR: log/cbor: _lf_cbor_encode_byte() explicitly requires non-NULL ctx
    - BUG/MINOR: log: fix global lf_expr node options behavior
    - CLEANUP: log: add a macro to know if a lf_node is configurable
    - MINOR: httpclient: allow to use absolute URI with new flag HC_F_HTTPROXY
    - MINOR: ssl: introduce ocsp_update.http_proxy for ocsp-update keyword
    - BUG/MINOR: log/encode: consider global options for key encoding
    - BUG/MINOR: log/encode: fix potential NULL-dereference in LOGCHAR()
    - BUG/MINOR: log: fix global lf_expr node options behavior (2nd try)
    - MINOR: log/cbor: _lf_cbor_encode_byte() explicitly requires non-NULL ctx (again)
    - BUG/MEDIUM: log: don't ignore disabled node's options
    - BUG/MINOR: stconn: don't wake up an applet waiting on buffer allocation
    - MINOR: sock: rename sock to sock_fd in sock_create_server_socket
    - MEDIUM: proto_uxst: take in account server namespace
    - MEIDUM: unix sock: use my_socketat to create bind socket
    - MINOR: sock_set_mark: take sock family in account
    - MEDIUM: proto: make common fd checks in sock_create_server_socket
    - MINOR: sock: add EPERM case in sock_handle_system_err
    - MINOR: capabilities: add cap_sys_admin support
    - CLEANUP: ssl: clean the includes in ssl_ocsp.c
    - CLEANUP: ssl: move the global ocsp-update options parsing to ssl_ocsp.c
    - MINOR: stats: fix visual alignment for stat_cols_px definition
    - MINOR: stats: convert req_tot as generic column
    - MINOR: stats: prepare stats-file support for values other than FN_COUNTER
    - MINOR: counters: move freq-ctr from proxy/server into counters struct
    - MINOR: stats: support rate in stats-file
    - MINOR: stats: convert rate as generic column for proxy stats
    - MINOR: counters: move last_change into counters struct
    - MINOR: stats: support age in stats-file
    - MINOR: stats: convert age as generic column for proxy stat
    - CLEANUP: ssl: rename new_ckch_store_load_files_path() to ckch_store_new_load_files_path()
    - MINOR: ssl: rename ocsp_update.http_proxy into ocsp-update.httpproxy
    - REORG: stats: define stats-proxy source module
    - MINOR: stats: extract proxy clear-counter in a dedicated function
    - REGTESTS: stats: add test stats-file counters preload
    - CI: netbsd: adjust packages after NetBSD-10 released
    - CLEANUP: assorted typo fixes in the code and comments
    - REGTESTS: replace REQUIRE_VERSION by version_atleast
    - MEDIUM: log: optimizing tmp->type handling in sess_build_logline()
    - BUG/MINOR: log: prevent double spaces emission in sess_build_logline()
    - OPTIM: log: declare empty buffer as global variable
    - OPTIM: log: use thread local lf_buildctx to stop pushing it on the stack
    - OPTIM: log: use lf_buildctx's buffer instead of temporary stack buffers
    - OPTIM: log: speedup date printing in sess_build_logline() when no encoding is used

OPTIM: log: speedup date printing in sess_build_logline() when no encoding is used

In sess_build_logline(), we have multiple fieds such as '%t' that build
a fixed-length string out of a date struct and then print it using
lf_rawtext(). In fact, printing it using lf_rawtext() is only mandatory
to deal with encoding options, but when no encoding is used we can output
the result to tmplog directly. Since most dates generate between 25 and 30
chars, doing so spares us from writing them twice and could help make
sess_build_logline() a bit faster when no encoding is used. (to match with
pre-encoding patch series performance).

OPTIM: log: use lf_buildctx's buffer instead of temporary stack buffers

Now that lf_buildctx isn't pushed on the stack anymore, let's take this
opportunity to store a small buffer of 256 bytes within it, and then use
this buffer as general purpose buffer to build fixed-length strings that
are then printed using lf_{raw}text() function. By doing so we stop
relying on temporary stack buffers.

OPTIM: log: use thread local lf_buildctx to stop pushing it on the stack

Following previous commit's logic, let's move lf_buildctx ctx away from
sess_build_logline() to stop abusing from the stack to push large
structure each time sess_build_logline() is called. Also, don't memset
the structure for each invokation, but only reset members explicitly when
required.

For that we now declare one static lf_buildctx per thread (using
THREAD_LOCAL) and make sess_build_logline() refer to it using a pointer.

OPTIM: log: declare empty buffer as global variable

'empty' buffer used in sess_build_logline() inside a loop, and since it
is only being read from and not modified, until recently it ended up being
cached most of the time and didn't cause overhead due to systematic push
on the stack.

However, due recent encoding work and new added variables on the stack,
we're starting to reach a stack limit and declaring 'empty' buffer within
the loop seems to cause non-negligible CPU overhead.

Since the variable isn't modified during log generation, let's declare
'empty' buffer as a global variable outside from sess_build_logline()
to prevent pushing it on the stack for each node evaluation.

BUG/MINOR: log: prevent double spaces emission in sess_build_logline()

Christian reported in GH #2556 that since 3.0-dev double spaces may be
found in log messages on some cases where it was not the case before.

As we were able to easily reproduce, a quick bisect led us to c6a7138
("MINOR: log: simplify last_isspace in sess_build_logline()"). While
it is true that all switch cases set the last_isspace variable to 0,
there was a subtelty for some fields such as '%hr', '%hrl', '%hs' or
'%hsl' and I overlooked it. Indeed, for '%hr', last_isspace was only set
to 0 if data was emitted, else the assignment didn't occur.

But with c6a7138, last_isspace is always set to 0 as long as the current
node type is not a separator. Because of that, if no data is emitted for
the current node value, and a space was already emitted prior to the
current node, then an extra space could be emitted after the node,
resulting in two spaces being emitted.

Note that while c6a7138 introduces a slight behavior regression regarding
last_isspace logic with the specific fields mentionned above, this
behavior could already be triggered with a failing or empty logformat
node sample expression. Consider this logformat expression:

  log-format "%{-M}o | %[str()] |"

str() will not print anything, and since we disabled mandatory option with
'-M', nothing gets printed for the node sample expression. As a result, we
have the following output:

  "|  |"

Instead of (when mandatory option is enabled):

  "| - |"

Thus in order to stick to the historical behavior, systematically set
last_isspace to 0 for EXPR nodes, and only set last_isspace to 0 when
data was written for TAG nodes. This way, '%hr', '%hrl', '%hs' or
'%hsl' should behave as before.

No backport needed.

MEDIUM: log: optimizing tmp->type handling in sess_build_logline()

Instead of chaining 2 switchcases and performing encoding checks for all
nodes let's actually split the logic in 2: first handle simple node types
(text/separator), and then handle dynamic node types (tag, expr). Encoding
options are only evaluated for dynamic node types.

Also, last_isspace is always set to 0 after next_fmt label, since next_fmt
label is only used for dynamic nodes, thus != LOG_FMT_SEPARATOR.

Since LF_NODE_WITH_OPT() macro (which was introduced recently) is now
unused, let's get rid of it.

No functional change should be expected.

(Use diff -w to check patch changes since reindentation makes the patch
look heavy, but in fact it remains fairly small)

REGTESTS: replace REQUIRE_VERSION by version_atleast

REQUIRE_VERSION usage is deprecated in regtests for version >= 2.5. This
allows to specify full correct 3.0-dev9 version.

CLEANUP: assorted typo fixes in the code and comments

This is 42nd iteration of typo fixes

CI: netbsd: adjust packages after NetBSD-10 released

pcre2 is installed already, installing it from packages lead to
conflict. curl is installed as a preparation for VTest

REGTESTS: stats: add test stats-file counters preload

Define a simple regtest to check stats-file loading on startup. A sample
stats-file is written with some invalid values which should be silently
ignored.

MINOR: stats: extract proxy clear-counter in a dedicated function

Split code related to proxies list looping in cli_parse_clear_counters()
to a new dedicated function. This function is placed in the new module
stats-proxy.

REORG: stats: define stats-proxy source module

Create a new module stats-proxy. Move stats functions related to proxies
list looping in it. This allows to reduce stats source file dividing its
size by half.

MINOR: ssl: rename ocsp_update.http_proxy into ocsp-update.httpproxy

Rename to the option to have a more consistent name.

CLEANUP: ssl: rename new_ckch_store_load_files_path() to ckch_store_new_load_files_path()

Rename the new_ckch_store_load_files_path() function to
ckch_store_new_load_files_path(), in order to be more consistent.

MINOR: stats: convert age as generic column for proxy stat

Convert FN_AGE in stat_cols_px[] as generic columns. These values will
be automatically used for dump/preload of a stats-file.

Remove srv_lastsession() / be_lastsession() function which are now
useless as last_sess is calculated via me_generate_field().

MINOR: stats: support age in stats-file

Extend generic stat column support to be able to fully support age stats
type. Several changes were required.

On output, me_generate_field() has been updated to report the difference
between the current tick with the stored value for FN_AGE type. Also, if
an age stats is hidden in show stats, -1 is returned instead of an empty
metric, which is the value to mark an age as unset.

On counters preload, load_ctr() was updated to handled FN_AGE. A similar
substraction is performed to the current tick value.

MINOR: counters: move last_change into counters struct

last_change was a member present in both proxy and server struct. It is
used as an age statistics to report the last update of the object.

Move last_change into fe_counters/be_counters. This is necessary to be
able to manipulate it through generic stat column and report it into
stats-file.

Note that there is a change for proxy structure with now 2 different
last_change values, on frontend and backend side. Special care was taken
to ensure that the value is initialized only on the proxy side. The
other value is set to 0 unless a listen proxy is instantiated. For the
moment, only backend counter is reported in stats. However, with now two
distinct values, stats could be extended to report it on both side.

MINOR: stats: convert rate as generic column for proxy stats

Convert every FN_RATE in stat_cols_px[] to generic column. Thanks to
prior patch, this allows to automatically dump their value into
stats-file and preload corresponding freq-ctr on process startup.

MINOR: stats: support rate in stats-file

Implement support for FN_RATE stat column into stat-file.

For the output part, only minimal change is required. Reuse the function
read_freq_ctr() to print the same value in both stats output and
stats-file dump.

For counter preloading, define a new utility function
preload_freq_ctr(). This can be used to initialize a freq-ctr type by
preloading previous period value. Reuse this function in load_ctr()
during stats-file parsing.

At the moment, no rate column is defined as generic. Thus, this commit
does not have functional change. This will be changed as soon as FN_RATE
are converted to generic columns.

MINOR: counters: move freq-ctr from proxy/server into counters struct

Move freq-ctr defined in proxy or server structures into their dedicated
fe_counters/be_counters struct.

Functionnaly no change here. This commit will allow to convert rate
stats column to generic one, which is mandatory to manipulate them in
the stats-file.

MINOR: stats: prepare stats-file support for values other than FN_COUNTER

Currently, only FN_COUNTER are dumped and preloaded via a stats-file.
Thus in several places we relied on the assumption that only FN_COUNTER
are valid in stats-file context.

New stats types will soon be implemented as they are also eligilible to
statistics reloading on process startup. Thus, prepare stats-file
functions to remove any FN_COUNTER restriction.

As one of this change, generate_stat_tree() now uses stcol_is_generic()
for stats name tree indexing before stats-file parsing.

Also related to stats-file parsing, individual counter preloading step
as been extracted from line parsing in a dedicated new function
load_ctr(). This will allow to extend it to support multiple mechanism
of counter preloading depending on the stats type.

MINOR: stats: convert req_tot as generic column

req_tot counter is a special case as it is not managed identically
between frontend and backend side.

For the backend side, this metric is available directly into
be_counters, which allows to use a generic stat column definition.

On the frontend side however, the metric value is an aggredate of
multiple fe_counters value. This is the case since the splitting between
HTTP version introduced in the following patch :

9969adbcdc1a79a6e8bb0a6283191d8d330a04f1
MINOR: stats: add by HTTP version cumulated number of sessions and requests

This difference cannot be handled automatically by me_generate_field().
Add a special case in the function to produce it on frontend side
reusing the aggregated value. This not done however for stats-file as
there is no counter to preload.

MINOR: stats: fix visual alignment for stat_cols_px definition

Simply adjust visual alignment in definition of proxy stats columns
definition for ST_I_PX_HANAFAIL column.

CLEANUP: ssl: move the global ocsp-update options parsing to ssl_ocsp.c

Move the global tunel.ssl.ocsp-update option parsing to ssl_ocsp.c.

CLEANUP: ssl: clean the includes in ssl_ocsp.c

Clean the includes in ssl_ocsp.c which were copied from ssl_sock.c and
are not relevant anymore.

Also move the include in the right order.

MINOR: capabilities: add cap_sys_admin support

If 'namespace' keyword is used in the backend server settings or/and in the
bind string, it means that haproxy process will call setns() to change its
default namespace to the configured one and then, it will create a
socket in this new namespace. setns() syscall requires CAP_SYS_ADMIN
capability in the process Effective set (see man 2 setns). Otherwise, the
process must be run as root.

To avoid to run haproxy as root, let's add cap_sys_admin capability in the
same way as we already added the support for some other network capabilities.

As CAP_SYS_ADMIN belongs to CAP_SYS_* capabilities type, let's add a separate
flag LSTCHK_SYSADM for it. This flag is set, if the 'namespace' keyword was
found during configuration parsing. The flag may be unset only in
prepare_caps_for_setuid() or in prepare_caps_from_permitted_set(), which
inspect process EUID/RUID and Effective and Permitted capabilities sets.

If system doesn't support Linux capabilities or 'cap_sys_admin' was not set
in 'setcap', but 'namespace' keyword is presented in the configuration, we
keep the previous strict behaviour. Process, that has changed uid to the
non-priviledged user, will terminate with alert. This alert invites the user
to recheck its configuration.

In the case, when haproxy will start and run under a non-root user and
'cap_sys_admin' is not set, but 'namespace' keyword is presented, this patch
does not change previous behaviour as well. We'll still let the user to try
its configuration, but we inform via warning, that unexpected things, like
socket creation errors, may occur.

MINOR: sock: add EPERM case in sock_handle_system_err

setns() may return EPERM if thread, that tries to move into different
namespace, do not have CAP_SYS_ADMIN capability in its Effective set.
So, extending sock_handle_system_err() with this error allows to send
appropriate log message and set SF_ERR_PRXCOND (SC termination
flag in log) as stream termination error code. This error code can be
simply checked with SF_ERR_MASK at protocol layer.

MEDIUM: proto: make common fd checks in sock_create_server_socket

quic_connect_server(), tcp_connect_server(), uxst_connect_server() duplicate
same code to check different ERRNOs, that socket() and setns() may return.
They also duplicate some runtime condition checks, applied to the obtained
server socket fd.

So, in order to remove these duplications and to improve code readability,
let's encapsulate socket() and setns() ERRNOs handling in
sock_handle_system_err(). It must be called just before fd's runtime condition
checks, which we also move in sock_create_server_socket by the same reason.

MINOR: sock_set_mark: take sock family in account

SO_MARK, SO_USER_COOKIE, SO_RTABLE socket options (used to set the special
mark/ID on socket, in order to perform mark-based routing) are only supported
by AF_INET sockets. So, let's check socket address family, when we enter into
this function.

MEIDUM: unix sock: use my_socketat to create bind socket

As UNIX Domain sockets could be attached to Linux namespaces (see more details
about it from the Linux kernel patch set below:

https://lore.kernel.org/netdev/m1hbl7hxo3.fsf@fess.ebiederm.org),

it is better to use my_socket_at() in order to create UNIX listener's socket.
my_socket_at() takes in account a network namespace, that may be configured
for a frontend in the bind line:

frontend fe_foo
...
bind uxst@frontend.sock user haproxy group haproxy mode 660 namespace frontend

Like this, namespace aware applications as netstat for example, will see this
listening socket in its 'frontend' namespace and not in the root namespace as
it was before.

It is important to mention, that fixes in Linux kernel referenced above allow
to connect to this listener's socket from the root and from any other
namespace. UNIX Domain socket is protected by its permission set, which must
be set with caution on its inode.

MEDIUM: proto_uxst: take in account server namespace

As UNIX Domain sockets could be attached to Linux namespaces (see more details
about it from the Linux kernel patch set below:

https://lore.kernel.org/netdev/m1hbl7hxo3.fsf@fess.ebiederm.org),

it is better to use sock_create_server_socket() in UNIX stream protocol
implementation, as this function calls my_socket_at() and the latter takes
in account server network namespace, which could be configured as in example
below:

       backend be_bar
                ...
                server rpicam0 /run/ustreamer.sock namespace foonet

So, for UNIX Domain socket, used as an address of some backend server, this
patch makes possible to perform connect() to this backend server from the same
network namespace, where the server is running, or where its listening socket
was created.

Using sock_create_server_socket() in UNIX stream protocol implementation also
makes the code of uxst_connect_server() more uniform with tcp_connect_server()
and quic_connect_server().

MINOR: sock: rename sock to sock_fd in sock_create_server_socket

Renaming sock to sock_fd makes it more clear, that sock_create_server_socket
returns the fd of newly created server socket and then we check this fd.
As we heavily use "fd" variable name in all protocol implementations, let's
prefix this one with the name of its object file: sock.o.

BUG/MINOR: stconn: don't wake up an applet waiting on buffer allocation

Since the extension of the buffers API to applets in 3.0-dev, an applet
may find itself unable to allocate a buffer, and will block respectively
on APPCTX_FL_OUTBLK_ALLOC or APPCTX_FL_INBLK_ALLOC depending on the
direction. However the code in sc_applet_process() doesn't consider this
situation when deciding to wake up an applet, so when the condition
arises, the applet keeps ringing and is killed by the loop detector.

The fix is trivial and simply consists in checking for the flags above.
No backport is needed since this is new in 3.0.

BUG/MEDIUM: log: don't ignore disabled node's options

In 3f2e8d0ed ("MEDIUM: log: lf_* build helpers now take a ctx argument")
I made a mistake, because starting with this commit it is no longer
possible from a node to disable global logformat options.
The result is that when an option is set globally, it cannot be disabled
anymore.

For instance, it is not possible to do this anymore:
log-format "%{+X}o %{-X}Ts"

The original intent was to prevent encoding options from being
disabled once enabled globally, because when encoding is enabled globally
we start the object enumeration right away (ie: in CBOR and JSON we
announce dynamic map, and for each node we announce the key..), thus it
doesn't make sense to mix encoding types there, unless encoding is only
used per-node, in which case only the value gets encoded, thus it remains
possible to print a value in JSON/CBOR-compatible format while the next
one shouldn't be printed as-is.

Thus, to restore the original behavior, slightly change the logic in
lf_buildctx_prepare() so that only global encoding options take the
precedence over node's options (instead of all options).

No backport needed.

MINOR: log/cbor: _lf_cbor_encode_byte() explicitly requires non-NULL ctx (again)

The BUG_ON() statement that was added in 9bdea51 ("MINOR: log/cbor:
_lf_cbor_encode_byte() explicitly requires non-NULL ctx") isn't
sufficient as Coverity still thinks the lf_buildctx itself may be NULL
as shown in GH #2554. In fact the original reports complains about the
lf_buildctx itself and I didn't understand it properly, let's add another
check in the BUG_ON() to ensure both cbor_ctx and cbor_ctx->ctx are not
NULL since it is not expected if used properly.

BUG/MINOR: log: fix global lf_expr node options behavior (2nd try)

In 98b44e8 ("BUG/MINOR: log: fix global lf_expr node options behavior"),
I properly restored global node options behavior for when encoding is
not used, however the fix is not optimal when encoding is involved:

Indeed, encoding logic in sess_build_logline() relies on global node
options to know if encoding must be handled expression-wide or
individually. However, because of the above fix, if an expression is
made of 1 or multiple nodes that all set an encoding option manually
(without '%o'), we consider that the option was set globally, but
that's probably not what the user intended. Instead we should only
evaluate global options from '%o', so that it remains possible to
skip global encoding when needed.

No backport needed.

BUG/MINOR: log/encode: fix potential NULL-dereference in LOGCHAR()

When CBOR encoding was added in c614fd3b9 ("MINOR: log: add +cbor encoding
option"), in LOGCHAR(), we forgot to check that we don't assign the NULL
value to tmplog (as we assume that tmplog cannot be NULL at the end of
sess_build_logline())

No backport needed.

BUG/MINOR: log/encode: consider global options for key encoding

In sess_build_logline(), contrary to what's stated in the comment
"only consider global ctx for key encoding", we check for
LOG_OPT_ENCODE flag on the current ctx options instead of global
ones. Because of this, we could end up doing the wrong thing if the
previous node had encoding enabled but it isn't set globally for
instance.

To fix the issue, let's simply check the presence of the flag on
g_options before entering the "key encoding" block.

This bug was introduced with 3f7c8387 ("MINOR: log: add +json encoding
option"), no backport needed.

MINOR: ssl: introduce ocsp_update.http_proxy for ocsp-update keyword

The ocsp_update.http_proxy global option allows to set an HTTP proxy
address which will be used to send the OCSP update request with an
absolute form URI.

MINOR: httpclient: allow to use absolute URI with new flag HC_F_HTTPROXY

The new HC_F_HTTPPROXY flag allows to use an absolute URI within a
request that won't be modified in order to use an http proxy.

CLEANUP: log: add a macro to know if a lf_node is configurable

LF_NODE_WITH_OPT(node) returns true if the node's option may be set and
thus should be considered. Logic is based on logformat node's type:
for now only TAG and FMT nodes can be configured.

BUG/MINOR: log: fix global lf_expr node options behavior

In 507223d5 ("MINOR: log: global lf_expr node options"), a mistake was
made because it was assumed that only the last occurence of %o
(LOG_FMT_GLOBAL) should be kept as global node options.

However, although not documented, it is possible to have multiple %o
within a single logformat expression to change the global settings on the
fly.

For instance, consider this example:

  log-format "%{+X}o test1=%ms %{-X}o test2=%ms %{+X}o test3=%ms"

Prior to 3f2e8d0ed ("MEDIUM: log: lf_* build helpers now take a ctx
argument"), this would output something like this:

  test1=18B test2=395 test3=18B

This is because global options is properly updated as the lf_expr string
is parsed. But now due to 507223d5 and 3f2e8d0ed, only the last %o
occurence is considered. With the above example, this gives:

  test1=18B test2=18B test3=18B

To restore historical behavior, let's partially revert 507223d5: to
compute global node options, we now start with all options enabled and
then for each configurable node in lf_expr_postcheck(), we keep options
common to the current node and previous nodes using AND masking, this way
we really end up with options common to all nodes.

No backport needed.