git.ipfire.org Git - thirdparty/haproxy.git/log

BUG/MEDIUM: h2: Revamp the way send subscriptions works.

Instead of abusing the SUB_CALL_UNSUBSCRIBE flag, revamp the H2 code a bit
so that it just checks if h2s->sending_list is empty to know if the tasklet
of the stream_interface has been waken up or not.
send_wait is now set to NULL in h2_snd_buf() (ideally we'd set it to NULL
as soon as we're waking the tasklet, but it can't be done, because we still
need it in case we have to remove the tasklet from the task list).

BUG/MEDIUM: h2: Make sure we're not already in the send_list in h2_subscribe().

In h2_subscribe(), don't add ourself to the send_list if we're already in it.
That may happen if we try to send and fail twice, as we're only removed
from the send_list if we managed to send data, to promote fairness.
Failing to do so can lead to either an infinite loop, or some random crashes,
as we'd get the same h2s in the send_list twice.

This should be backported to 1.9.

BUG/MEDIUM: muxes: Make sure we unsubcribed when destroying mux ctx.

In the h1 and h2 muxes, make sure we unsubscribed before destroying the
mux context.
Failing to do so will lead in a segfault later, as the connection will
attempt to dereference its conn->send_wait or conn->recv_wait, which pointed
to the now-free'd mux context.

This was introduced by commit 39a96ee16eeec51774f9f52a783cf624a0de4ccb, so
should only be backported if that commit gets backported.

BUILD: cli/threads: fix build in single-threaded mode

Commit a8f57d51a ("MINOR: cli/activity: report the accept queue sizes
in "show activity"") broke the single-threaded build because the
accept-rings are not implemented there. Let's ifdef this out. Ideally
we should start to think about always having such elements initialized
even without threads to improve the test coverage.

BUILD: task/thread: fix single-threaded build of task.c

As expected, commit cde7902ac ("MEDIUM: tasks: improve fairness between
the local and global queues") broke the build with threads disabled,
and I forgot to rerun this test before committing. No backport is
needed.

BUG/MINOR: ssl: Fix 48 byte TLS ticket key rotation

Whenever HAProxy was reloaded with rotated keys, the resumption would be
broken for previous encryption key. The bug was introduced with the addition
of 80 byte keys in 9e7547 (MINOR: ssl: add support of aes256 bits ticket keys
on file and cli.).

This fix needs to be backported to 1.9.

BUG/MEDIUM: map: Fix memory leak in the map converter

The allocated trash chunk is not freed properly and causes a memory leak
exhibited as the growth in the trash pool allocations. Bug was introduced
in commit 271022 (BUG/MINOR: map: fix map_regm with backref).

This should be backported to all branches where the above commit was
backported.

MINOR: tasks: restore the lower latency scheduling when niced tasks are present

In the past we used to reduce the number of tasks consulted at once when
some niced tasks were present in the run queue. This was dropped in 1.8
when the scheduler started to take batches. With the recent fixes it now
becomes possible to restore this behaviour which guarantees a better
latency between tasks when niced tasks are present. Thanks to this, with
the default number of 200 for tune.runqueue-depth, with a parasitic load
of 14000 requests per second, nice 0 gives 14000 rps, nice 1024 gives
12000 rps and nice -1024 gives 16000 rps. The amplitude widens if the
runqueue depth is lowered.

MEDIUM: tasks: only base the nice offset on the run queue depth

The offset calculated for the nice value used to be wrong for a long
time and got even worse when the improved multi-thread sheduler was
implemented because it continued to rely on the run queue size, which
become irrelevant given that we extract tasks in batches, so the run
queue size moves following a sawtooth form.

However the offsets much better reflects insertion positions in the
queue, so it's worth dropping this rq_size component of the equation.
Last point, due to the batches made of runqueue-depth entries at once,
the higher the depth, the lower the effect of the nice setting since
values are picked together in batches and placed into a list. An
intuitive approach consists in multiplying the nice value with the
batch size to allow tasks to participate to a different batch. And
experimentation shows that this works pretty well.

With a runqueue-depth of 16 and a parasitic load of 16000 requests
per second on 100 streams, a default nice of 0 shows 16000 requests
per second for nice 0, 22000 for nice -1024 and 10000 for nice 1024.

The difference is even bigger with a runqueue depth of 5. At 200
however it's much smoother (16000-22000).

MEDIUM: tasks: improve fairness between the local and global queues

Tasks allowed to run on multiple threads, as well as those scheduled by
one thread to run on another one pass through the global queue. The
local queues only see tasks scheduled by one thread to run on itself.
The tasks extracted from the global queue are transferred to the local
queue when they're picked by one thread. This causes a priority issue
because the global tasks experience a priority contest twice while the
local ones experience it only once. Thus if a tasks returns still
running, it's immediately reinserted into the local run queue and runs
much faster than the ones coming from the global queue.

Till 1.9 the tasks going through the global queue were mostly :
  - health checks initialization
  - queue management
  - listener dequeue/requeue

These ones are moderately sensitive to unfairness so it was not that
big an issue.

Since 2.0-dev2 with the multi-queue accept, tasks are scheduled to
remote threads on most accept() and it becomes fairly visible under
load that the accept slows down, even for the CLI.

This patch remedies this by consulting both the local and the global
run queues in parallel and by always picking the task whose deadline
is the earliest. This guarantees to maintain an excellent fairness
between the two queues and removes the cascade effect experienced
by the global tasks.

Now the CLI always continues to respond quickly even in presence of
expensive tasks running for a long time.

This patch may possibly be backported to 1.9 if some scheduling issues
are reported but at this time it doesn't seem necessary.

CLEANUP: task: do not export rq_next anymore

This one hasn't been used anymore since the scheduler changes after 1.8
but it kept being exported and maintained up to date while it's always
reset when scanning the trees. Let's stop exporting it and updating it.

BUG/MEDIUM: muxes: Don't dereference mux context if null in release functions

When a mux context is released, we must be sure it exists before dereferencing
it. The bug was introduced in the commit 39a96ee16 ("MEDIUM: muxes: Be prepared
to don't own connection during the release").

No need to backport this patch, expect if the commit 39a96ee16 is backported
too.

REGTEST: Use HTX by default and add '--no-htx' option to disable it

Because the HTX is now the default mode in HAProxy, it is also enabled by
default in reg-tests. The option '--use-htx' is still available, but deprecated
and have concretly no effect. To run reg-tests with the legacy HTTP mode, you
should use the option '--no-htx'.

MAJOR: htx: Enable the HTX mode by default for all proxies

The legacy HTTP mode is no more the default one. So now, by default, without any
option in your configuration, all proxies will use the HTX mode. The line
"option http-use-htx" in proxy sections are now useless, except to cancel the
legacy HTTP mode. To fallback on legacy HTTP mode, you should use the line "no
option http-use-htx" explicitly.

Note that the reg-tests still work by default on legacy HTTP mode. The HTX will
be enabled by default in a futur commit.

MAJOR: muxes/htx: Handle inplicit upgrades from h1 to h2

The upgrade is performed when an H2 preface is detected when the first request
on a connection is parsed. The CS is destroyed by setting EOS flag on it. A
special flag is added on the HTX message to warn the HTX analyzers the stream
will be closed because of an upgrade. This way, no error and no log are
emitted. When the mux h1 is released, we create a mux h2, without any CS and
passing the buffer with the unparsed H2 preface.

MAJOR: proxy/htx: Handle mux upgrades from TCP to HTTP in HTX mode

It is now possible to upgrade TCP streams to HTX when an HTTP backend is set for
a TCP frontend (both with the HTX enabled). So concretely, in such case, an
upgrade is performed from the mux pt to the mux h1. The current CS and the
channel's buffer are used to initialize the mux h1.

MEDIUM: htx: Allow the option http-use-htx to be used on TCP proxies too

This will be mandatory to allow upgrades from TCP to HTTP in HTX. Of course, raw
buffers will still be used by default on TCP proxies, this option sets or
not. But if you want to handle mux upgrades from a TCP proxy, you must enable
the HTX on it and on all its backends.

There is only a small change in the lua code. Because TCP proxies can be HTX
aware, to exclude TCP services only for HTTP proxies, we must also check the
mode (TCP/HTTP) now.

MEDIUM: connection: Add conn_upgrade_mux_fe() to handle mux upgrades

This function will handle mux upgrades, for frontend connections only. It will
retrieve the best mux in the same way than conn_install_mux_fe except that the
mode and optionnally the proto are forced.

The new multiplexer is initialized using a new context and a specific input
buffer. Then, the old one is destroyed. If an error occurred, everything is
rolled back.

MEDIUM: muxes: Be prepared to don't own connection during the release

This happens during mux upgrades. In such case, when the destroy() callback is
called, the connection points to a different mux's context than the one passed
to the callback. It means the connection is owned by another mux. The old mux is
then released but the connection is not closed.

MINOR: muxes: Pass the context of the mux to destroy() instead of the connection

It is mandatory to handle mux upgrades, because during a mux upgrade, the
connection will be reassigned to another multiplexer. So when the old one is
destroyed, it does not own the connection anymore. Or in other words, conn->ctx
does not point to the old mux's context when its destroy() callback is
called. So we now rely on the multiplexer context do destroy it instead of the
connection.

In addition, h1_release() and h2_release() have also been updated in the same
way.

MEDIUM: muxes: Add an optional input buffer during mux initialization

The mux's callback init() now take a pointer to a buffer as extra argument. It
must be used by the multiplexer as its input buffer. This buffer is always NULL
when a multiplexer is initialized with a fresh connection. But if a mux upgrade
is performed, it may be filled with existing data. Note that, for now, mux
upgrades are not supported. But this commit is mandatory to do so.

MINOR: muxes: Rely on conn_is_back() during init to handle front/back conn

Instead of using the connection context to make the difference between a
frontend connection and a backend connection, we now rely on the function
conn_is_back().

MINOR: filters/htx: Use stream flags instead of px mode to instanciate a filter

In the function flt_stream_add_filter(), if the HTX is enabled, before attaching
a filter to a stream, we test if the filter can handle it or not. If not, the
filter is ignored. Before the proxy mode was tested. Now we test if the stream
is an HTX stream or not.

MINOR: http_fetch/htx: Use stream flags instead of px mode in smp_prefetch_htx

In the function smp_prefetch_htx(), we must know if data in the channel's buffer
are structured or not. Before the proxy mode was tested. Now we test if the
stream is an HTX stream or not. If yes, we know the HTX is used to structure
data in the channel's buffer.

MINOR: http: update the macro IS_HTX_STRM() to check the stream flag SF_HTX

Instead of matching on the frontend options, we now check if the flag SF_HTX is
set or not on the stream to know if it is an HTX stream or not.

MINOR: stream: Set a flag when the stream uses the HTX

The flag SF_HTX has been added to know when a stream uses the HTX or not. It is
set when an HTX stream is created. There are 2 conditions to set it. The first
one is when the HTTP frontend enables the HTX. The second one is when the attached
conn_stream uses an HTX multiplexer.

MINOR: muxes: Add a flag to specify a multiplexer uses the HTX

A multiplexer must now set the flag MX_FL_HTX when it uses the HTX to structured
the data exchanged with channels. the muxes h1 and h2 set this flag. Of course,
for the mux h2, it is set on h2_htx_ops only.

MINOR: mux-h2: Add a mux_ops dedicated to the HTX mode

Instead of using the same mux_ops structure for the legacy HTTP mode and the HTX
mode, a dedicated mux_ops is now used for the HTX mode. Same callbacks are used
for both. But the flags may be different depending on the mode used.

BUG/MINOR: mux-h1: Handle the flag CS_FL_KILL_CONN during a shutdown read/write

This flag is used to explicitly kill the connection when the CS is closed. It
may be set by tcp rules. It must be respect by the mux-h1.

This patch must be backported to 1.9.

MINOR: mux-h1: Don't release the conn_stream anymore when h1s is destroyed

An H1 stream is destroyed when the conn_stream is detached or when the H1
connection is destroyed. In the first case, the CS is released by the caller. In
the second one, because the connection is closed, no CS is attached anymore. In
both, there is no reason to release the conn_stream in h1s_destroy().

MEDIUM: mux-h1: Simplify the connection mode management by sanitizing headers

Connection headers are now sanitized during the parsing and the formatting. This
means "close" and "keep-alive" values are always removed but right flags are
set. This way, client side and server side are independent of each other. On the
input side, after the parsing, neither "close" nor "keep-alive" values
remain. So on the output side, if we found one of these values in a connection
headers, it means it was explicitly added by HAProxy. So it overwrites the other
rules, if applicable. Always sanitizing the output is also a way to simplifiy
conditions to update the connection header. Concretly, only additions of "close"
or "keep-alive" values remain, depending the case.

No need to backport this patch.

MEDIUM: h1: Add an option to sanitize connection headers during parsing

The flag H1_MF_CLEAN_CONN_HDR has been added to let the H1 parser sanitize
connection headers. It means it will remove all "close" and "keep-alive" values
during the parsing. One noticeable effect is that connection headers may be
unfolded. In practice, this is not a problem because it is not frequent to have
multiple values for the connection headers.

If this flag is set, during the parsing The function
h1_parse_next_connection_header() is called in a loop instead of
h1_parse_conection_header().

No need to backport this patch

MINOR: stats/htx: Don't add "Connection: close" header anymore in stats responses

On the client side, as far as possible, we will try to keep connection
alive. So, in most of cases, this header will be removed. So it is better to not
add it at all. If finally the connection must be closed, the header will be
added by the mux h1.

No need to backport this patch.

MINOR: mux-h1: Simplify handling of 1xx responses

Because of previous changes on http tunneling, the synchronization of the
transaction can be simplified. Only the check on intermediate messages remains
and it only concerns the response path.

This patch must be backported to 1.9. It is not strictly speaking required but
it will ease futur backports.

BUG/MEDIUM: htx: Fix the process of HTTP CONNECT with h2 connections

In HTX, the HTTP tunneling does not work if h1 and h2 are mixed (an h1 client
sending requests to an h2 server or this opposite) because the h1 multiplexer
always adds an EOM before switching it to tunnel mode. The h2 multiplexer
interprets it as an end of stream, closing the stream as for any other
transaction.

To make it works again, we need to swith to the tunnel mode without emitting any
EOM blocks. Because of that, HTX analyzers have been updated to switch the
transaction to tunnel mode before end of the message (because there is no end of
message...).

To be consistent, the protocol switching is also handled the same way even
though the 101 responses are not supported in h2.

This patch must be backported to 1.9.

MINOR: proto_htx: Don't adjust transaction mode anymore in HTX analyzers

Because the option http-tunnel is now ignored in HTX, there is no longer any
need to adjust the transaction mode in HTX analyzers. A channel can still be
switch to the tunnel mode for legitimate cases (HTTP CONNECT or switching
protocols). So the function htx_adjust_conn_mode() is now useless.

This patch must be backported to 1.9. It is not strictly speaking required but
it will ease futur backports.

MEDIUM: htx: Deprecate the option 'http-tunnel' and ignore it in HTX

The option http-tunnel disables any HTTP processing past the first
transaction. In HTX, it works for full h1 transactions. As for the legacy HTTP,
it is a workaround, but it works. But it is impossible to make it works with an
h2 connection. In such case, it has no effect, the stream is closed at the end
of the transaction. So to avoid any inconsistancies between h1 and h2
connections, this option is now always ignored when the HTX is enabled. It is
also a good opportinity to deprecate an old and ugly option. A warning is
emitted during HAProxy startup to encourage users to remove this option.

Note that in legacy HTTP, this option only works with full h1 transactions
too. If an h2 connection is established on a frontend with this option enabled,
it will have no effect at all. But we keep it for the legacy HTTP for
compatibility purpose. It will be removed with the legacy HTTP.

So to be short, if you have to really (REALLY) use it, it will only work for
legacy HTTP frontends with H1 clients.

The documentation has been updated accordingly.

This patch must be backported to 1.9. It is not strictly speaking required but
it will ease futur backports.

BUG/MEDIUM: htx: Don't crush blocks payload when append is done on a data block

If there is a data block when a header block is added in a HTX message, its
payload will be inserted after the data block payload. But its index will be
moved before the EOH block. So at this stage, if a new data block is added, we
will try to append its payload to the last data block (because it is also the
tail). Thus the payload of the further header block will be crushed.

This cannot happens if the payloads wrap thanks to the previous fix. But it
happens when the tail is not the front too. So now, in this case, we add a new
block instead of appending.

This patch must be backported in 1.9.

BUG/MEDIUM: htx: Defrag if blocks position is changed and the payloads wrap

When a header is added or when a data block is added before another one, the
blocks position may be changed (but not their payloads position). For instance,
when a header is added, we move the block just before the EOH, if any. When the
payloads wraps, it is pretty annoying because we loose the last inserted
block. It is neither the tail nor the head. And it is not the front either.

It is a design problem. Waiting for fixing this problem, we force a
defragmentation in such case. Anyway, it should be pretty rare, so it's not
really critical.

This patch must be backported to 1.9.

BUG/MINOR: spoe: Be sure to set tv_request when each message fragment is encoded

When a message or a fragment is encoded, the date the frame processing starts
must be set if it is undefined. The test on tv_request field was wrong.

This patch must be backported to 1.9.

BUG/MEDIUM: spoe: Return an error if nothing is encoded for fragmented messages

If the maximum frame size is very small with a large message or argument name,
it is possible to be unable to encode anything. In such case, it is important to
stop processing returning an error otherwise we will retry in loop to encode the
message, failing each time because of the too small frame size.

This patch must be backported to 1.9 and 1.8.

BUG/MEDIUM: spoe: Queue message only if no SPOE applet is attached to the stream

If a SPOE applet is already attached to a stream to handle its messages, we must
not queue them. Otherwise it could be handled by another applet leading to
errors. This happens with fragmented messages only. When the first framgnent is
sent, the SPOE applet sending it is attached to the stream. It should be used to
send all other fragments.

This patch must be backported to 1.9 and 1.8.

MINOR: cli/activity: report the accept queue sizes in "show activity"

Seeing the size of each ring helps understand which threads are
overloaded and why some of them are less often elected than others
by the multi-queue load balancer.

MINOR: cli/listener: report the number of accepts on "show activity"

The "show activity" command reports the number of incoming connections
dispatched per thread but doesn't report the number of connections
received by each thread. It is important to be able to monitor this
value as it can show that for whatever reason a smaller set of threads
is receiving the connections and dispatching them to all other ones.

BUG/MINOR: listener: renice the accept ring processing task

It is not acceptable that the accept queues are handled with a normal
priority since they are supposed to quickly dispatch the incoming
traffic, resulting in tasks which will have their respective nice
values and place in the queue. Let's renice the accept ring tasks
to -1024.

No backport is needed, this is strictly 2.0.

BUG/MINOR: tasks: make sure the first task to be queued keeps its nice value

The run queue offset computed from the nice value depends on the run
queue size, but for the first task to enter the run queue, this size
is zero and the task gets queued just as if its nice value was zero as
well. This is problematic for example for the CLI socket if another
higher priority task gets queued immediately after as it can steal its
place.

This patch simply adds one to the rq_size value to make sure the nice
is never multiplied by zero. The way the offset is calculated is
questionable anyway these days, since with the newer scheduler it seems
that just using the nice value as an offset should work (possibly damped
by the task's number of calls).

This fix must be backported to 1.9. It may possibly be backported to
older versions if it proves to make the CLI more interactive.

BUG/MEDIUM: task/threads: address a fairness issue between local and global tasks

It is possible to hit a fairness issue in the scheduler when a local
task runs for a long time (i.e. process_stream() returns running), and
a global task wants to run on the same thread and remains in the global
queue. What happens in this case is that the condition to extract tasks
from the global queue will rarely be satisfied for very low task counts
since whatever non-null queue size multiplied by a thread count >1 is
always greater than the small remaining number of tasks in the queue.
In theory another thread should pick the task but we do have some mono
threaded tasks in the global queue as well during inter-thread wakeups.

Note that this can only happen with task counts lower than the thread
counts, typically one task in each queue for more than two threads.

This patch works around the problem by allowing a very small unfairness,
making sure that we can always pick at least one task from the global
queue even if there is already one in the local queue.

A better approach will consist in scanning the two trees in parallel
and always pick the best task. This will be more complex and will
constitute a separate patch.

This fix must be backported to 1.9.

BUG/MEDIUM: stream_interface: Don't bother doing chk_rcv/snd if not connected.

If the interface is not in state SI_ST_CON or SI_ST_EST, don't bother
trying to send/recv data, we can't do it anyway, and if we're in SI_ST_TAR,
that may lead to adding the SI_FL_ERR flag back on the stream_interface,
while we don't want it.

This should be backported to 1.9.

BUG/MEDIUM: streams: Only re-run process_stream if we're in a connected state.

In process_stream(), only try again when there's the SI_FL_ERR flag and we're
in a connected state, otherwise we can loop forever.
It used to work because si_update_both() bogusly removed the SI_FL_ERR flag,
and it would never be set at this point. Now it does, so take that into
account.
Many, many thanks to Maciej Zdeb for reporting the problem, and helping
investigating it.

This should be backported to 1.9.

MINOR: ssl: Activate aes_gcm_dec converter for BoringSSL

BoringSSL can support it, no need to disable.

MINOR: skip get_gmtime where tm is unused

For LOG_FMT_TS (%Ts), the tm variable is not used, so save some cycles
on the call to get_gmtime.

Backport: 1.9 1.8
Signed-off-by: Robin H. Johnson <rjohnson@digitalocean.com>

BUG/MEDIUM: pattern: assign pattern IDs after checking the config validity

Pavlos Parissis reported an interesting case where some map identifiers
were not assigned (appearing as -1 in show map). It turns out that it
only happens for log-format expressions parsed in check_config_validity()
that involve maps (log-format, use_backend, unique-id-header), as in the
sample configuration below :

    frontend foo
        bind :8001
        unique-id-format %[src,map(addr.lst)]
        log-format %[src,map(addr.lst)]
        use_backend %[src,map(addr.lst)]

The reason stems from the initial introduction of unique IDs in 1.5 via
commit af5a29d5f ("MINOR: pattern: Each pattern is identified by unique
id.") : the unique_id assignment was done before calling
check_config_validity() so all maps loaded after this call are not
properly configured. From what the function does, it seems they will not
be able to use a cache, will not have a unique_id assigned and will not
be updatable from the CLI.

This fix must be backported to all supported versions.

MINOR: threads: Implement thread_cpus_enabled() for FreeBSD.

Use cpuset_getaffinity() to implement thread_cpus_enabled() on FreeBSD, so
that we can know the number of CPUs available, and automatically launch as
much threads if nbthread isn't specified.

MINOR: initcall: Don't forget to define the __start/stop_init_##stg symbols.

When creating a new initcall, don't forget to define the symbols, as it may
not be done automatically and that would lead to undefined symbols.

This should be backported to 1.9.

BUG/MEDIUM: stream: Don't clear the stream_interface flags in si_update_both.

In commit d7704b534, we introduced and expiration flag on the stream interface,
which is used for the connect, the queue and the turn around. Because the
turn around state isn't an error, the flag was reset in process_stream(), and
later in commit cff6411f9 when introducing the SI_FL_ERR flag, the cleanup
of the flag at this place was erroneously generalized.
To fix this, the SI_FL_EXP flag is only cleared at the end of the turn around
state, and nobody should clear the stream interface flags anymore.

This should be backported to 1.9, it has no known impact on older versions.

BUG/MEDIUM: streams: Store prev_state before calling si_update_both().

As si_update_both() sets prev_state to state for each stream_interface, if
we want to check it changed, copy it before calling si_update_both().

This should be backported to 1.9.

BUG/MEDIUM: streams: Don't remove the SI_FL_ERR flag in si_update_both().

Don't inconditionally remove the SI_FL_ERR code in si_update_both(), which
is called at the end of process_stream(). Doing so was a bug that was there
since the flag was introduced, because we were always setting si->flags to
SI_FL_NONE, however we don't want to lose that one, except if we will retry
connecting, so only remove it in sess_update_st_cer().

This should be backported to 1.9.

BUG/MEDIUM: htx: fix random premature abort of data transfers

It can happen in some cases that the last block of an H2 transfer over
HTX is truncated. This was tracked down to a leftover of an earlier
implementation of htx_xfer_blks() causing the computed size of a block
to be incorrectly calculated if a data block doesn't completely fit into
the target buffer. In practice it causes the EOM block to be attempted to
be emitted with a wrong size and the message to be truncated. One way to
reproduce this is to chain two haproxy instances in h1->h2->h1 with
httpterm as the server and h2load as the client, making many requests
between 8 and 10kB over a single connection. Usually one of the very
first requests will fail.

This fix must be backported to 1.9.

BUG/MEDIUM: h2: Don't attempt to recv from h2_process_demux if we subscribed.

Modify h2c_restart_reading() to add a new parameter, to let it know if it
should consider if the buffer isn't empty when retrying to read or not, and
call h2c_restart_reading() using 0 as a parameter from h2_process_demux().
If we're leaving h2_process_demux() with a non-empty buffer, it means the
frame is incomplete, and we're waiting for more data, and if we already
subscribed, we'll be waken when more data are available.
Failing to do so means we'll be waken up in a loop until more data are
available.

This should be backported to 1.9.

BUG/MEDIUM: peers: fix a case where peer session is not cleanly reset on release.

The deinit took place in only peer_session_release, but in the a case of a
previous call to peer_session_forceshutdown, the session cursors
won't be reset, resulting in a bad state for new session of the same
peer. For instance, a table definition message could be dropped and
so all update messages will be dropped by the remote peer.

This patch move the deinit processing directly in the force shutdown
funtion. Killed session remains in "ST_END" state but ref on peer was
reset to NULL and deinit will be skipped on session release function.

The session release continue to assure the deinit for "active" sessions.

This patch should be backported on all stable version since proto
peers v2.

REGTEST: lua/b00003: Specify the HAProxy pid when the command ss is executed

This avoids confusions with any other haproxy process.

REGTEST: lua/b00003: Relax the regex matching the log message

The timeer TR may be greater than 10ms, making the test fails.

REGTEST: log/b00000: Be sure the client always hits its timeout

To do so, the server does not send anything. Instead it waits 2ms before
closing. The client, on its side, will wait for a response. So it will be
blocked. Becauase the client timeout is set to 1ms, HAProxy should always close
the client connection because it times out.

REGTEST: http-rules/h00003: Use a different client for requests expecting a 301

Because HAProxy may decide to close 301 responses, as others internal responses,
it is safer to use a different client for these requests. This is not the
purpose of this test to verify the keep-alive in such cases.

REGTEST: http-messaging/h00000: Fix the test when the HTX is enabled

The way unexpected bodies are handled for responses to HEAD requests differs from
the legacy HTTP to the HTX. While it is dropped wih the legacy HTTP, in HTX, it
is parsed as the response to the next request. So, in HTX, a 502 error is
returned to the client and the connexion is closed.

This test has been modified to pass in both mode.

REGTEST: http-capture/h00000: Relax a regex matching the log message

Size reported in logs may differ between legacy HTTP and HTX, at least for
now. So in the regtest http-capture/h00000.vtc, we need to relax the regex
matching the log message.

BUG/MINOR: proto_htx: Reset to_forward value when a message is set to DONE

Because we try to forward infinitly message body, when its state is set to DONE,
we must be sure to reset to_foward value of the corresponding
channel. Otherwise, some errors can be errornously triggered.

No need to backport this patch.

BUG/MINOR: htx: Preserve empty HTX messages with an unprocessed parsing error

This let a chance to HTX analyzers to handle the error and send the appropriate
response to the client.

This patch must be backported to 1.9.

MINOR: cli: export HAPROXY_CLI environment variable

Export the HAPROXY_CLI environment variable which contains the list of
all stats sockets (including the sockpair@) separated by semicolons.

MINOR: cli: start addresses by a prefix in 'show cli sockets'

Displays a prefix for every addresses in 'show cli sockets'.
It could be 'unix@', 'ipv4@', 'ipv6@', 'abns@' or 'sockpair@'.

Could be backported in 1.9 and 1.8.

BUG/MINOR: cli: correctly handle abns in 'show cli sockets'

The 'show cli sockets' was not handling the abns sockets. This is a
problem since it uses the AF_UNIX family, it displays nothing
in the path column because the path starts by \0.

Should be backported to 1.9 and 1.8.

MINOR: mworker/cli: show programs in 'show proc'

Show the programs in 'show proc'

Example:

# programs
2285 dataplane-api - 0 0d 00h00m12s
# old programs
2261 dataplane-api - 1 0d 00h00m53s

MEDIUM: mworker-prog: implement program for master-worker

This patch implements the external binary support in the master worker.

To configure an external process, you need to use the program section,
for example:

program dataplane-api
command ./dataplane_api

Those processes are launched at the same time as the workers.

During a reload of HAProxy, those processes are dealing with the same
sequence as a worker:

  - the master is re-executed
  - the master sends a USR1 signal to the program
  - the master launches a new instance of the program

During a stop, or restart, a SIGTERM is sent to the program.

REORG: mworker/cli: move CLI functions to mworker.c

Move the CLI functions of the master worker to mworker.c

MINOR: cli: export cli_parse_default() definition in cli.h

Export the cli_parse_default() function in cli.h so it could be used in
other files.

MINOR: mworker: don't use children variable anymore

The children variable is still used in haproxy, it is not required
anymore since we have the information about the current workers in the
mworker_proc linked list.

The oldpids array is also replaced by this linked list when we
generated the arguments for the master reexec.

MINOR: mworker: calloc mworker_proc structures

Initialize mworker_proc structures to 0 with calloc instead of just
doing a malloc.

REORG: mworker: move mworker_cleanlisteners to mworker.c

REORG: mworker: move signal handlers and related functions

Move the following functions to mworker.c:

void mworker_catch_sighup(struct sig_handler *sh);
void mworker_catch_sigterm(struct sig_handler *sh);
void mworker_catch_sigchld(struct sig_handler *sh);

static void mworker_kill(int sig);
int current_child(int pid);

REORG: mworker: move IPC functions to mworker.c

Move the following functions to mworker.c:

void mworker_accept_wrapper(int fd);
void mworker_pipe_register();

REORG: mworker: move signals functions to mworker.c

Move the following functions to mworker.c:

void mworker_block_signals();
void mworker_unblock_signals();

REORG: mworker: move serializing functions to mworker.c

Move the 2 following functions to mworker.c:

void mworker_proc_list_to_env()
void mworker_env_to_proc_list()

MINOR: ssl: Add aes_gcm_dec converter

The converter can be used to decrypt the raw byte input using the
AES-GCM algorithm, using provided nonce, key and AEAD tag. This can
be useful to decrypt encrypted cookies for example and make decisions
based on the content.

BUILD: Makefile: disable shared cache on AIX 5.1

AIX 5.1 is missing the following builtins used for atomic locking of the
shared inter-process cache :

   .__sync_val_compare_and_swap_4
   .__sync_lock_test_and_set_4
   .__sync_sub_and_fetch_4

Let's simply use the private cache by default since nobody cares on
such old systems. No test was made on a more recent version.

BUILD: define unsetenv on AIX 5.1

This version doesn't have unsetenv(), so let's map it to my_unsetenv() instead.
This wasn't tested on more recent versions.

BUILD: makefile: add _LINUX_SOURCE_COMPAT to build on AIX-51

Not tested on later versions, but at least there _LINUX_SOURCE_COMPAT
must be defined to access the CMSG_SPACE() and CMSG_LEN() macros.

BUILD: makefile: fix build of IPv6 header on aix51

ip6_hdr is called ip6hdr there and is only defined when STEVENS_API is
defined.

BUILD: connection: fix naming of ip_v field

AIX defines ip_v as ip_ff.ip_fv in netinet/ip.h using a macro, and
unfortunately we do have a local variable with such a name and which
uses the same header file. Let's rename the variable to ip_ver to fix
this.

BUILD: use inttypes.h instead of stdint.h

I found on an (old) AIX 5.1 machine that stdint.h didn't exist while
inttypes.h which is expected to include it does exist and provides the
desired functionalities.

As explained here, stdint being just a subset of inttypes for use in
freestanding environments, it's probably always OK to switch to inttypes
instead:

  https://pubs.opengroup.org/onlinepubs/009696799/basedefs/stdint.h.html

Also it's even clearer here in the autoconf doc :

  https://www.gnu.org/software/autoconf/manual/autoconf-2.61/html_node/Header-Portability.html

  "The C99 standard says that inttypes.h includes stdint.h, so there's
   no need to include stdint.h separately in a standard environment.
   Some implementations have inttypes.h but not stdint.h (e.g., Solaris
   7), but we don't know of any implementation that has stdint.h but not
   inttypes.h"

BUILD: re-implement an initcall variant without using executable sections

The current initcall implementation relies on dedicated sections (one
section per init stage) to store the initcall descriptors. Then upon
startup, these sections are scanned from beginning to end and all items
found there are called in sequence.

On platforms like AIX or Cygwin it seems difficult to figure the
beginning and end of sections as the linker doesn't seem to provide
the corresponding symbols. In order to replace this, this patch
simply implements an array of single linked (one per init stage)
which are fed using constructors for each register call. These
constructors are declared static, with a name depending on their
line number in the file, in order to avoid name clashes. The final
effect is the same, except that the method is slightly more expensive
in that it explicitly produces code to register these initcalls :

$ size  haproxy.sections haproxy.constructor
   text    data     bss     dec     hex filename
4060312  249176 1457652 5767140  57ffe4 haproxy.sections
4062862  260408 1457652 5780922  5835ba haproxy.constructor

This mechanism is enabled as an alternative to the default one when
build option USE_OBSOLETE_LINKER is set. This option is currently
enabled by default only on AIX and Cygwin, and may be attempted for
any target which fails to build complaining about missing symbols
__start_init_* and/or __stop_init_*.

Once confirmed as a reliable fix, this will likely have to be backported
to 1.9 where AIX and Cygwin do not build anymore.

MINOR: tools: add an unsetenv() implementation

Older Solaris and AIX versions do not have unsetenv(). This adds a
fairly simple implementation which scans the environment, for use
with those systems. It will simply require to pass the define in
the "DEFINE" macro at build time like this :

DEFINE="-Dunsetenv=my_unsetenv"

MINOR: tools: make memvprintf() never pass a NULL target to vsnprintf()

Most modern platforms don't touch the output buffer when the size
argument is null, but there exist a few old ones (like AIX 5 and
possibly Tru64) where the output will be dereferenced anyway, probably
to write the trailing null, crashing the process. memprintf() uses this
to measure the desired length.

There is a very simple workaround to this consisting in passing a pointer
to a character instead of a NULL pointer. It was confirmed to fix the issue
on AIX 5.1.

BUILD: cache: avoid a build warning with some compilers/linkers

The struct http_cache_applet was fully declared at the beginning
instead of just doing a forward declaration using an extern modifier.
Some linkers report warnings about a redefined symbol since these
really are two complete declarations.

The proper way to do this is to use extern on the first one and to
have a full declaration later. However it's not permitted to have
both static and extern so the change done in commit 0f2229943
("CLEANUP: cache: don't export http_cache_applet anymore") has to
be partially undone.

This should be backported to 1.9 for sanity but has no effet on
most platforms. However on 1.9 the extern keyword must also be
added to include/types/cache.h.

BUILD: chunk: properly declare pool_head_trash as extern

This one was also declared without the extern modifier in an include
file.

This needs to be backported to 1.9.

BUILD: http: properly mark some struct as extern

http_known_methods, HTTP_100 and HTTP_103 were not declared extern and
as such were multiply defined since they were in http.h. There was
apparently no more side effect but it may depend on the platform and
the linker.

This needs to be backported to 1.9.

BUILD: makefile: work around another bug in make 3.80

GNU make 3.80 has an issue with calls to functions inside an if block,
which is just what we recently introduced to simplify the targets
declaration. The fix is easy, it simply consists in assigning the
command to a variable inside the if block and evaluating this command
after the block. This also makes the code slightly more readable so we
can keep compatibility with 3.80 for now.

No backport is needed.

BUILD: makefile: work around an old bug in GNU make-3.80

GNU make-3.80 fails on the .build_opts target, expecting the closing
brace before the first semi-colon in the shell command, it probably
uses a more limited parser for dependencies. Actually it appears it's
enough to place this command in a variable and reference the variable
there. Since it doesn't affect later versions (and the resulting string
is always empty anyway), let's apply the minor change to continue to
comply with the announced dependencies.

This could be backported as far as 1.6.

BUG/MAJOR: checks: segfault during tcpcheck_main

When using TCP health checks (tcp-check connect), it is possible to
crash with a segfault when, for reasons yet to be understood, the
protocol family is unknown.

In the function tcpcheck_main(), proto is dereferenced without a prior
test in case it is NULL, leading to the segfault during proto->connect
dereference.

The line has been unmodified since it was introduced, in commit
69e273f3fcfbfb9cc0fb5a09668faad66cfbd36b. This was the only use of
proto (or more specifically, the return of protocol_by_family()) that
was unprotected; all other callsites perform the test for a NULL
pointer.

This patch should be backported to 1.9, 1.8, 1.7, and 1.6.

BUG/MEDIUM: checks: Don't bother subscribing if we have a connection error.

In __event_srv_chk_r() and __event_srv_chk_w(), don't bother subscribing
if we're waiting for a handshake, but we had a connection error. We will
never be able to send/receive anything on that connection anyway, and
the conn_stream is probably about to be destroyed, and we will crash if
the tasklet is waken up.
I'm not convinced we need to subscribe here at all anyway, but I'd rather
modify the check code as little as possible.

This should be backported to 1.9.

BUG/MEDIUM: mworker: don't free the wrong child when not found

A bug occurs when the sigchld handler is called and a child which is
not in the process list just left, or with an empty process list.

The child variable won't be set and left as an uninitialized variable or
set to the wrong child entry, which can lead to a free of this
uninitialized variable or of the wrong child.

This can lead to a crash of the master during a stop or a reload.

It is not supposed to happen with a worker which was created by the
master. A cause could be a fork made by a dependency. (openssl, lua ?)

This patch strengthens the case of the missing child by doing the free
only if the child was found.

This patch must be backported to 1.9.