git.ipfire.org Git - thirdparty/haproxy.git/log

MINOR: cache: allow caching of OPTIONS request

Allow HAProxy to cache responses to OPTIONS HTTP requests.
This is useful in the use case of "Cross-Origin Resource Sharing" (cors)
to cache CORS responses from API servers.

Since HAProxy does not support Vary header for now, this would be only
useful for "access-control-allow-origin: *" use case.

MINOR: cache: add method to cache hash

Current HTTP cache hash contains only the Host header and the url path.
That said, request method should also be added to the mix to support
caching other request methods on the same URL. IE GET and OPTIONS.

CLEANUP: mux-h2: move the demuxed frame check code in its own function

The frame check code in the demuxer was moved to its own function to
keep the demux function clean enough. This also simplifies the test
case as we can now simply call this function once in H2_CS_FRAME_P
state.

BUG/MEDIUM: stick-table: Wrong stick-table backends parsing.

When parsing references to stick-tables declared as backends, they are added to
a list of proxies (they are proxies!) which refer to this stick-tables.
Before this patch we added them to these list without checking they were already
present, making the silly hypothesis the actions/sample were checked/resolved in the same
order the proxies are parsed.

This patch implement a simple inline function to in_proxies_list() to test
the presence of a proxy in a list of proxies. We use this function when resolving
/checking samples/actions.

This bug was introduced by 015e4d7 commit.

Must be backported to 2.0.

BUG/MEDIUM: checks: make sure to close nicely when we're the last to speak

In SMTP, MySQL and PgSQL checks, we're supposed to finish with a message
to politely quit the server, otherwise some of them will log some errors.
This is the case with Postfix as reported in GH issue #187. Since commit
fe4abe6 ("BUG/MEDIUM: connections: Don't call shutdown() if we want to
disable linger.") we are a bit more aggressive on outgoing connection
closure and checks were not prepared for this.

This patch makes the 3 checks above disable the linger_risk for these
checks so that we close cleanly, with the side effect that it will leave
some TIME_WAIT connections behind (hence why it should not be generalized
to all checks). It's worth noting that in issue #187 it's mentioned that
this patch doesn't seem to be sufficient for Postfix, however based only
on local network activity this looks OK, so maybe this will need to be
improved later.

Given that the patch above was backported to 2.0 and 1.9, this one should
as well.

BUG/MINOR: mux-h2: always reset rcvd_s when switching to a new frame

In Patrick's trace it was visible that after a stream had been missed,
the next stream would receive a WINDOW_UPDATE with the first one's
credit added to its own. This makes sense because in case of error
h2c->rcvd_s is not reset. Given that this counter is per frame, better
reset it when starting to parse a new frame, it's easier and safer.

This must be backported as far as 1.8.

BUG/MINOR: mux-h2: always send stream window update before connection's

In h2_process_mux() if we have some room and an attempt to send a window
update for the connection was pending, it's done first. But it's not done
for the stream, which will have for effect of postponing this attempt
till next pass into h2_process_demux(), at the risk of seeing the send
buffer full again. Let's always try to send both pending frames as soon
as possible.

This should be backported as far as 1.8.

BUG/MEDIUM: mux-h2: do not recheck a frame type after a state transition

Patrick Hemmer reported a rare case where the H2 mux emits spurious
RST_STREAM(STREAM_CLOSED) that are triggered by the send path and do
not even appear to be associated with a previous incoming frame, while
the send path never emits such a thing.

The problem is particularly complex (hence its rarity). What happens is
that when data are uploaded (POST) we must refill the sending stream's
window by sending a WINDOW_UPDATE message (and we must refill the
connection's too). But in a highly bidirectional traffic, it is possible
that the mux's buffer will be full and that there is no more room to build
this WINDOW_UPDATE frame. In this case the demux parser switches to the
H2_CS_FRAME_A state, noting that an "acknowledgement" is needed for the
current frame, and it doesn't change the current stream nor frame type.
But the stream's state was possibly updated (typically OPEN->HREM when
a DATA frame carried the ES flag).

Later the data can leave the buffer, wake up h2_io_cb(), which calls
h2_send() to send pending data, itself calling h2_process_mux() which
detects that there are unacked data in the connection's window so it
emits a WINDOW_UPDATE for the connection and resets the counter. so it
emits a WINDOW_UPDATE for the connection and resets the counter. Then
h2_process() calls h2_process_demux() which continues the processing
based on the current frame type and the current state H2_CS_FRAME_A.
Unfortunately the protocol compliance checks matching the frame type
against the current state are still present. These tests are designed
for new frames only, not for those in progress, but they are not
limited by frame types. Thus the current DATA frame is checked again
against the current stream state that is now HREM, and fails the test
with a STREAM_CLOSED error.

The quick and backportable solution consists in adding the test for
this ACK and bypass all these checks that were already validated prior
to the state transition. A better long-term solution would consist in
having a new state between H and P indicating the frame is new and
needs to be checked ("N" for new?) and apply the protocol tests only
in this state. In addition everywhere we decide to send a window
update, we should send a stream WU first if there are unacked data
for the current stream. Last, rcvd_s should always be reset when
transitioning to FRAME_H (and a BUGON for this in dev would help).

The bug will be way harder to trigger on 2.0 than on 1.8/1.9 because
we have a ring buffer for the connection so the buffer full situations
are extremely rare.

This fix must be backpored to all versions having H2 (as far as 1.8).

Special thanks to Patrick for providing exploitable traces.

BUG/MINOR: mux-h2: do not send REFUSED_STREAM on aborted uploads

If the server decides to close early, we don't want to send a
REFUSED_STREAM error but a CANCEL, so that the client doesn't want
to retry. The test in h2_do_shutw() was wrong for this as it would
handle the HLOC case like the case where nothing had been sent for
this stream, which is wrong. Now h2_do_shutw() does nothing in this
case and lets h2_do_shutr() decide.

Note that this partially undoes f983d00a1 ("BUG/MINOR: mux-h2: make
the do_shut{r,w} functions more robust against retries").

This must be backported to 2.0. The patch above was not backported to
1.9 for being too risky there, but if it eventually gets to it, this
one will be needed as well.

BUG/MINOR: mux-h2: use CANCEL, not STREAM_CLOSED in h2c_frt_handle_data()

There is a test on the existence of the conn_stream when receiving data,
to be sure to have somewhere to deliver it. Right now it responds with
STREAM_CLOSED, which is not correct since from an H2 point of view the
stream is not closed and a peer could be upset to see this. After some
analysis, it is important to keep this test to be sure not to fill the
rxbuf then stall the connection. Another option could be to modiffy
h2_frt_transfer_data() to silently discard any contents but the CANCEL
error code is designed exactly for this and to save the peer from
continuing to stream data that will be discarded, so better switch to
using this.

This must be backported as far as 1.8.

BUG/MINOR: mux-h2: don't refrain from sending an RST_STREAM after another one

The test in h2s_send_rst_stream() is excessive, it refrains from sending an
RST_STREAM if *the last frame* was an RST_STREAM, regardless of the stream
ID. In a context where both clients and servers abort a lot, it could happen
that one RST_STREAM is dropped from responses from time to time, causing
delays to the client.

This must be backported to 2.0, 1.9 and 1.8.

BUILD: travis-ci: enable daily Coverity scan

Coverity scan takes some time, also we are limited to 3 scans per day.
So, running it once a day seem to be good idea.

BUG/MEDIUM: pollers: Clear the poll_send bits as well.

In _update_fd(), if we're about to remove the FD from the poller, remove
both the receive and the send bits, instead of removing the receive bits
twice.

BUG/MEDIUM: fd: Always reset the polled_mask bits in fd_dodelete().

In fd_dodelete(), always reset the polled_mask bits, instead on only doing
it if we're closing the file descriptor. We call the poller clo() method
anyway, and failing to do so means that if fd_remove() is used while the
fd is polled, the poller won't attempt to poll on a fd with the same value
as the old one.
This leads to fd being stuck in the SSL code while using the async engine.

This should be backported to 2.0, 1.9 and 1.8.

DOC: Add 'Question.md' issue template, discouraging asking questions

This commit adds a new `Question.md` issue template that aims to prevent
users asking questions within the issue tracker, because the other two
templates do not fit "question", thus leading to people not seeing that
the issue tracker is not for questions.

This template inspired by TypeScript's Question template at:
https://github.com/microsoft/TypeScript/blob/master/.github/ISSUE_TEMPLATE/Question.md

BUG/MEDIUM: proxy: Make sure to destroy the stream on upgrade from TCP to H2

In stream_set_backend(), if we have a TCP stream, and we want to upgrade it
to H2 instead of attempting ot reuse the stream, just destroy the
conn_stream, make sure we don't log anything about the stream, and pretend
we failed setting the backend, so that the stream will get destroyed.
New streams will then be created by the mux, as if the connection just
happened.
This fixes a crash when upgrading from TCP to H2, as the H2 mux totally
ignored the conn_stream provided by the upgrade, as reported in github
issue #196.

This should be backported to 2.0.

BUG/MEDIUM: mux-h2: split the stream's and connection's window sizes

The SETTINGS frame parser updates all streams' window for each
INITIAL_WINDOW_SIZE setting received on the connection (like h2spec
does in test 6.5.3), which can start to be expensive if repeated when
there are many streams (up to 100 by default). A quick test shows that
it's possible to parse only 35000 settings per second on a 3 GHz core
for 100 streams, which is rather small.

Given that window sizes are relative and may be negative, there's no
point in pre-initializing them for each stream and update them from
the settings. Instead, let's make them relative to the connection's
initial window size so that any change immediately affects all streams.
The only thing that remains needed is to wake up the streams that were
unblocked by the update, which is now done once at the end of
h2_process_demux() instead of once per setting. This now results in
5.7 million settings being processed per second, which is way better.

In order to keep the change small, the h2s' mws field was renamed to
"sws" for "stream window size", and an h2s_mws() function was added
to add it to the connection's initial window setting and determine the
window size to use when muxing. The h2c_update_all_ws() function was
renamed to h2c_unblock_sfctl() since it's now only used to unblock
previously blocked streams.

This needs to be backported to all versions till 1.8.

BUG/MEDIUM: mux-h2: unbreak receipt of large DATA frames

Recent optimization in commit 4d7a88482 ("MEDIUM: mux-h2: don't try to
read more than needed") broke the receipt of large DATA frames because
it would unconditionally subscribe if there was some room left, thus
preventing any new rx from being done since subscription may only be
done once the end was reached, as indicated by ret == 0.

However, fixing this uncovered that in HTX mode previous versions might
occasionally be affected as well, when an available frame is the same
size as the maximum data that may fit into an HTX buffer, we may end
up reading that whole frame and still subscribe since it's still allowed
to receive, thus causing issues to read the next frame.

This patch will only work for 2.1-dev but a minor adaptation will be
needed for earlier versions (down to 1.9, where subscribe() was added).

BUG/MINOR: stream-int: also update analysers timeouts on activity

Between 1.6 and 1.7, some parts of the stream forwarding process were
moved into lower layers and the stream-interface had to keep the
stream's task up to date regarding the timeouts. The analyser timeouts
were not updated there as it was believed this was not needed during
forwarding, but actually there is a case for this which is "option
contstats" which periodically triggers the analyser timeout, and this
change broke the option in case of sustained traffic (if there is some
I/O activity during the same millisecond as the timeout expires, then
the update will be missed).

This patch simply brings back the analyser expiration updates from
process_stream() to stream_int_notify().

It may be backported as far as 1.7, taking care to adjust the fields
names if needed.

BUG/MEDIUM: ssl: open the right path for multi-cert bundle

Multi-cert bundle was not working anymore because we tried to open the
wrong path.

BUG/MINOR: stream-int: make sure to always release empty buffers after sending

There are some situations, after sending a request or response, upon I/O
completion, or applet execution, where we end up with an empty buffer that
was not released. This results in excessive memory usage (back to 1.5) and
a lower CPU cache efficiency since buffers are not recycled as fast. This
has changed since the places where we send have changed with the new
layering, but not all cases susceptible of leaving an empty buffer were
properly spotted. Doing so reduces the memory pressure on buffers by about
2/3 in high traffic tests.

This should be backported to 2.0 and maybe 1.9.

BUG/MAJOR: http/sample: use a static buffer for raw -> htx conversion

Multiple calls to smp_fetch_fhdr use the header context to keep track of
header parsing position; however, when using header sampling on a raw
connection, the raw buffer is converted into an HTX structure each time, and
this was done in the trash areas; so the block reference would be invalid on
subsequent calls.

This patch must be backported to 2.0 and 1.9.

BUG/MEDIUM: lb-chash: Ensure the tree integrity when server weight is increased

When the server weight is increased in consistant hash, extra nodes have to be
allocated. So a realloc() is performed on the nodes array of the server. the
previous commit 962ea7732 ("BUG/MEDIUM: lb-chash: Remove all server's entries
before realloc() to re-insert them after") have fixed the size used during the
realloc() to avoid segfaults. But another bug remains. After the realloc(), the
memory area allocated for the nodes array may change, invalidating all node
addresses in the chash tree.

So, to fix the bug, we must remove all server's entries from the chash tree
before the realloc to insert all of them after, old nodes and new ones. The
insert will be automatically handled by the loop at the end of the function
chash_queue_dequeue_srv().

Note that if the call to realloc() failed, no new entries will be created for
the server, so the effective server weight will be unchanged.

This issue was reported on Github (#189).

This patch must be backported to all versions since the 1.6.

BUG/MINOR: ssl: fix ressource leaks on error

Commit 36b84637 "MEDIUM: ssl: split the loading of the certificates"
introduce leaks on fd/memory in case of error.

BUG/MEDIUM: ssl: don't free the ckch in multi-cert bundle

When using a ckch we should never try to free its content, because it
won't be usable after and can result in a NULL derefence during
parsing.

The content was previously freed because the ckch wasn't stored in a
tree to be used later, now that we use it multiple time, we need to keep
the data.

BUILD: ssl: BoringSSL add EVP_PKEY_base_id

Remove EVP_PKEY_base_id compatibility, it is now included in BoringSSL.

REGTESTS: checks: make 4be_1srv_health_checks more reliable

This test occasionally fails on the Travis CI tests because the
"in progress" bit is sometimes still set (or set again) in the show
servers state output and is not expected in all regexes (some do
already cover it), like in this one :

https://travis-ci.com/haproxy/haproxy/jobs/221324920

Let's extend the remaining ones to accept this as well. Other tests
do not seem affected as they only expect sequences of digits there.

MINOR: wdt: also consider that waiting in the thread dumper is normal

It happens that upon looping threads the watchdog fires, starts a dump,
and other threads expire their budget while waiting for the other threads
to get dumped and trigger a watchdog event again, adding some confusion
to the traces. With this patch the situation becomes clearer as we export
the list of threads being dumped so that the watchdog can check it before
deciding to trigger. This way such threads in queue for being dumped are
not attempted to be reported in turn.

This should be backported to 2.0 as it helps understand stack traces.

BUG/MINOR: debug: fix a small race in the thread dumping code

If a thread dump is requested from a signal handler, it may interrupt
a thread already waiting for a dump to complete, and may see the
threads_to_dump variable go to zero while others are waiting, steal
the lock and prevent other threads from ever completing. This tends
to happen when dumping many threads upon a watchdog timeout, to threads
waiting for their turn.

Instead now we proceed in two steps :
  1) the last dumped thread sets all bits again
  2) all threads only wait for their own bit to appear, then clear it
     and quit

This way there's no risk that a bit performs a double flip in the same
loop and threads cannot get stuck here anymore.

This should be backported to 2.0 as it clarifies stack traces.

BUG/MEDIUM: ssl: does not try to free a DH in a ckch

ssl_sock_load_dh_params() should not free the DH * of a ckch, or the
ckch won't be usable during the next call.

BUG/BUILD: ssl: fix build with openssl < 1.0.2

Recent changes use struct cert_key_and_chain to load all certificates in
frontends, this structure was previously used only to load multi-cert
bundle, which is supported only on >= 1.0.2.

MEDIUM: mux-h2: don't try to read more than needed

The h2_recv() loop was historically built around a loop to deal with
the callback model but this is not needed anymore, as it the upper
layer wants more data, it will simply try to read again. Right now
50% of the recvfrom() calls made over H2 return EAGAIN. With this
change it doesn't happen anymore. Note that the code simply consists
in breaking the loop, and reporting real data receipt instead of
always returning 1.

A test was made not to subscribe if we actually read data but it
doesn't change anything since we might be subscribed very early
already.

MEDIUM: pollers: Remember the state for read and write for each threads.

In the poller code, instead of just remembering if we're currently polling
a fd or not, remember if we're polling it for writing and/or for reading, that
way, we can avoid to modify the polling if it's already polled as needed.

MAJOR: fd: Get rid of the fd cache.

Now that the architecture was changed so that attempts to receive/send data
always come from the upper layers, instead of them only trying to do so when
the lower layer let them know they could try, we can finally get rid of the
fd cache. We don't really need it anymore, and removing it gives us a small
performance boost.

MINOR: ssl: clean ret variable in ssl_sock_load_ckchn

In ssl_sock_load_ckchn, ret variable is now in a half dead usage.
Remove it to clean compilation warnings.

CLEANUP: ssl: ssl_sock_load_crt_file_into_ckch

Fix comments for this function and remove free before alloc call: ckch
call is correctly balanced (alloc/free).

MINOR: ssl: do not look at DHparam with OPENSSL_NO_DH

OPENSSL_NO_DH can be defined to avoid obsolete and heavy DH processing.
With OPENSSL_NO_DH, parse the entire PEM file to look at DHparam is wast
of time.

MINOR: ssl: check private key consistency in loading

Load a PEM certificate and use it in CTX are now decorrelated.
Checking the certificate and private key consistency can be done
earlier: in loading phase instead CTX set phase.

MINOR: ssl: add extra chain compatibility

cert_key_and_chain handling is now outside openssl 1.0.2 #if: the
code must be libssl compatible. SSL_CTX_add1_chain_cert and
SSL_CTX_set1_chain requires openssl >= 1.0.2, replace it by legacy
SSL_CTX_add_extra_chain_cert when SSL_CTX_set1_chain is not provided.

MINOR: ssl: use STACK_OF for chain certs

Used native cert chain manipulation with STACK_OF from ssl lib.

BUG/MAJOR: queue/threads: avoid an AB/BA locking issue in process_srv_queue()

A problem involving server slowstart was reported by @max2k1 in issue #197.
The problem is that pendconn_grab_from_px() takes the proxy lock while
already under the server's lock while process_srv_queue() first takes the
proxy's lock then the server's lock.

While the latter seems more natural, it is fundamentally incompatible with
mayn other operations performed on servers, namely state change propagation,
where the proxy is only known after the server and cannot be locked around
the servers. Howwever reversing the lock in process_srv_queue() is trivial
and only the few functions related to dynamic cookies need to be adjusted
for this so that the proxy's lock is taken for each server operation. This
is possible because the proxy's server list is built once at boot time and
remains stable. So this is what this patch does.

The comments in the proxy and server structs were updated to mention this
rule that the server's lock may not be taken under the proxy's lock but
may enclose it.

Another approach could consist in using a second lock for the proxy's queue
which would be different from the regular proxy's lock, but given that the
operations above are rare and operate on small servers list, there is no
reason for overdesigning a solution.

This fix was successfully tested with 10000 servers in a backend where
adjusting the dyncookies in loops over the CLI didn't have a measurable
impact on the traffic.

The only workaround without the fix is to disable any occurrence of
"slowstart" on server lines, or to disable threads using "nbthread 1".

This must be backported as far as 1.8.

MEDIUM: ssl: load DH param in struct cert_key_and_chain

Load the DH param at the same time as the certificate, we don't need to
open the file once more and read it again. We store it in the ckch_node.

There is a minor change comparing to the previous way of loading the DH
param in a bundle. With a bundle, the DH param in a certificate file was
never loaded, it only used the global DH or the default DH, now it's
able to use the DH param from a certificate file.

MEDIUM: ssl: lookup and store in a ckch_node tree

Don't read a certificate file again if it was already stored in the
ckchn tree. It allows HAProxy to start more quickly if the same
certificate is used at different places in the configuration.

HAProxy lookup in the ssl_sock_load_cert() function, doing it at this
level allows to skip the reading of the certificate in the filesystem.

If the certificate is not found in the tree, we insert the ckch_node in
the tree once the certificate is read on the filesystem, the filename or
the bundle name is used as the key.

MEDIUM: ssl: split the loading of the certificates

Split the functions which open the certificates.

Instead of opening directly the certificates and inserting them directly
into a SSL_CTX, we use a struct cert_key_and_chain to store them in
memory and then we associate a SSL_CTX to the certificate stored in that
structure.

Introduce the struct ckch_node for the multi-cert bundles so we can
store multiple cert_key_and_chain in the same structure.

The functions ssl_sock_load_multi_cert() and ssl_sock_load_cert_file()
were modified so they don't open the certicates anymore on the
filesystem. (they still open the sctl and ocsp though). These functions
were renamed ssl_sock_load_ckchn() and ssl_sock_load_multi_ckchn().

The new function ckchn_load_cert_file() is in charge of loading the
files in the cert_key_and_chain. (TODO: load ocsp and sctl from there
too).

The ultimate goal is to be able to load a certificate from a certificate
tree without doing any filesystem access, so we don't try to open it
again if it was already loaded, and we share its configuration.

MEDIUM: ssl: use cert_key_and_chain struct in ssl_sock_load_cert_file()

This structure was only used in the case of the multi-cert bundle.

Using these primitives everywhere when we load the file are a first step
in the deduplication of the code.

MINOR: ssl: merge ssl_sock_load_cert_file() and ssl_sock_load_cert_chain_file()

This commit merges the function ssl_sock_load_cert_file() and
ssl_sock_load_cert_chain_file().

The goal is to refactor the SSL code and use the cert_key_and_chain
struct to load everything.

BUG/MINOR: htx: Fix free space addresses calculation during a block expansion

When the payload of a block is shrinked or enlarged, addresses of the free
spaces must be updated. There are many possible cases. One of them is
buggy. When there is only one block in the HTX message and its payload is just
before the tail room and it needs to be moved in the head room to be enlarged,
addresses are not correctly updated. This bug may be hit by the compression
filter.

This patch must be backported to 2.0.

BUG/MINOR: hlua: Only execute functions of HTTP class if the txn is HTTP ready

The flag HLUA_TXN_HTTP_RDY was added in the previous commit to know when a
function is called for a channel with a valid HTTP message or not. Of course it
also depends on the calling direction. In this commit, we allow the execution of
functions of the HTTP class only if this flag is set.

Nobody seems to use them from an unsupported context (for instance, trying to
set an HTTP header from a tcp-request rule). But it remains a bug leading to
undefined behaviors or crashes.

This patch may be backported to all versions since the 1.6. It depends on the
commits "MINOR: hlua: Add a flag on the lua txn to know in which context it can
be used" and "MINOR: hlua: Don't set request analyzers on response channel for
lua actions".

MINOR: hlua: Add a flag on the lua txn to know in which context it can be used

When a lua action or a lua sample fetch is called, a lua transaction is
created. It is an entry in the stack containing the class TXN. Thanks to it, we
can know the direction (request or response) of the call. But, for some
functions, it is also necessary to know if the buffer is "HTTP ready" for the
given direction. "HTTP ready" means there is a valid HTTP message in the
channel's buffer. So, when a lua action or a lua sample fetch is called, the
flag HLUA_TXN_HTTP_RDY is set if it is appropriate.

MINOR: hlua: Don't set request analyzers on response channel for lua actions

Setting some requests analyzers on the response channel was an old trick to be
sure to re-evaluate the request's analyers after the response's ones have been
called. It is no more necessary. In fact, this trick was removed in the version
1.8 and backported up to the version 1.6.

This patch must be backported to all versions since 1.6 to ease the backports of
fixes on the lua code.

BUG/MEDIUM: hlua: Check the calling direction in lua functions of the HTTP class

It is invalid to manipulate responses from http-request rules or to manipulate
requests from http-response rules. When http-request rules are evaluated, the
connection to server is not yet established, so there is no response at all. And
when http-response rules are evaluated, the request has already been sent to the
server.

Now, the calling direction is checked. So functions "txn.http:req_*" can now
only be called from http-request rules and the functions "txn.http:res_*" can
only be called from http-response rules.

This issue was reported on Github (#190).

This patch must be backported to all versions since the 1.6.

BUG/MINOR: hlua/htx: Reset channels analyzers when txn:done() is called

For HTX streams, when txn:done() is called, the work is delegated to the
function http_reply_and_close(). But it is not enough. The channel's analyzers
must also be reset. Otherwise, some analyzers may still be called while
processing should be aborted.

For instance, if the function is called from an http-request rules on the
frontend, request analyzers on the backend side are still called. So we may try
to add an header to the request, while this one was already reset.

This patch must be backported to 2.0 and 1.9.

REGTESTS: checks: exclude freebsd target for tcp-check_multiple_ports.vtc

This patch excludes freebsd, osx and generic targets for this vtc.

Basic tcp checks performed by haproxy on a linux system leverage the
TCP_QUICKACK option which implies that the connection is never
established from the perspective of the backend server. On other systems
a regular tcp 3 way handshake is performed immediately followed by a
reset, which from the perspective of the server is an aborted connection.

When we run this regtest on FreeBSD (or anything other than linux) there
is a race condition in the server_thread() function of the vtc_server.c
file. If we receive the reset when we are in accept() then fd is -1 and
vtest calls vtc_fatal, failing the test.

Other checks specific reg-tests were excluded on FreeBSD, osx and
generic for the same reason, but were at the time documented as being
disabled because they used TCP_DEFER_ACCEPT. These commits are
15685c791 ("REGTEST: Exclude freebsd target for some reg tests") and
03c6ab0cb ("REGTEST: exclude osx and generic targets for
40be_2srv_odd_health_checks")

MEDIUM: h1: Don't wake the H1 tasklet if we got the whole request.

In h1_rcv_buf(), don't wake the H1 tasklet to attempt to receive more data
if we got the whole request. It will lead to a recv and maybe to a subscribe
while it may not be needed.
If the connection is keep alive, the tasklet will be woken up later by
h1_detach(), so that we'll be able to get the next request, or an end of
connection.

MEDIUM: h1: Don't try to subscribe if we managed to read data.

In h1_recv(), don't subscribe if we managed to receive data. We may not have
to, if we received a complete request, and a new receive will be attempted
later, as the tasklet is woken up either by h1_rcv_buf() or by h1_detach.

DOC: improve the wording in CONTRIBUTING about how to document a bug fix

Insufficiently described bug fixes are still too frequent. It's a real
pain to create each new maintenance release, as 3/4 of the time is spent
trying to guess what problem a patch fixes, which is already important
in order to decide whether to pick the fix or not, but is even more
capital in order to write understandable release notes.

Christopher rightfully demands that a patch tagged "BUG" MUST ABSOLUTELY
describe the problem and why this problem is a bug. Describing the fix
is one thing but if the bug is unknown, why would there be a fix ? How
can a stable maintainer be convinced to take a fix if its author didn't
care about checking whether it was a real bug ? This patch tries to
explain a bit better what really needs to appear in the commit message
and how to describe a bug.

To be backported to all relevant stable versions.

BUG/MINOR: log: make sure writev() is not interrupted on a file output

Since 1.9 we support sending logs to various non-blocking outputs like
stdou/stderr or flies, by using writev() which guarantees that it only
returns after having written everything or nothing. However the syscall
may be interrupted while doing so, and this is visible when writing to
a tty during debug sessions, as some logs occasionally appear interleaved
if an xterm or SSH connection is not very fast. Performance here is not a
critical concern, log correctness is. Let's simply take the logger's lock
around the writev() call to prevent multiple senders from stepping onto
each other's toes.

This may be backported to 2.0 and 1.9.

BUG/MEDIUM: streams: Don't switch the SI to SI_ST_DIS if we have data to send.

In sess_established(), don't immediately switch the backend stream_interface
to SI_ST_DIS if we only got a SHUTR. We may still have something to send,
ie if the request is a POST, and we should be switched to SI_ST8DIS later
when the shutw will happen.

This should be backported to 2.0 and 1.9.

BUG/MEDIUM: lb-chash: Fix the realloc() when the number of nodes is increased

When the number of nodes is increased because the server weight is changed, the
nodes array must be realloc. But its new size is not correctly set. Only the
total number of nodes is used to set the new size. But it must also depends on
the size of a node. It must be the total nomber of nodes times the size of a
node.

This issue was reported on Github (#189).

This patch must be backported to all versions since the 1.6.

BUILD: threads: add the definition of PROTO_LOCK

This one was added by commit daacf3664 ("BUG/MEDIUM: protocols: add a
global lock for the init/deinit stuff") but I forgot to add it to the
include file, breaking DEBUG_THREAD.

MEDIUM: mux-h1: Add the support of headers adjustment for bogus HTTP/1 apps

There is no standard case for HTTP header names because, as stated in the
RFC7230, they are case-insensitive. So applications must handle them in a
case-insensitive manner. But some bogus applications erroneously rely on the
case used by most browsers. This problem becomes critical with HTTP/2
because all header names must be exchanged in lowercase. And HAProxy uses the
same convention. All header names are sent in lowercase to clients and servers,
regardless of the HTTP version.

This design choice is linked to the HTX implementation. So, for previous
versions (2.0 and 1.9), a workaround is to disable the HTX mode to fall
back to the legacy HTTP mode.

Since the legacy HTTP mode was removed, some users reported interoperability
issues because their application was not able anymore to handle HTTP/1 message
received from HAProxy. So, we've decided to add a way to change the case of some
headers before sending them. It is now possible to define a "mapping" between a
lowercase header name and a version supported by the bogus application. To do
so, you must use the global directives "h1-case-adjust" and
"h1-case-adjust-file". Then options "h1-case-adjust-bogus-client" and
"h1-case-adjust-bogus-server" may be used in proxy sections to enable the
conversion. See the configuration manual for more info.

Of course, our advice is to urgently upgrade these applications for
interoperability concerns and because they may be vulnerable to various types of
content smuggling attacks. But, if your are really forced to use an unmaintained
bogus application, you may use these directive, at your own risks.

If it is relevant, this feature may be backported to 2.0.

BUG/MINOR: proxy: always lock stop_proxy()

There is one unprotected call to stop_proxy() from the manage_proxy()
task, so there is a single caller by definition, but there is also
another such call from the CLI's "shutdown frontend" parser. This
one does it under the proxy's lock but the first one doesn't use it.
Thus it is theorically possible to corrupt the list of listeners in a
proxy by issuing "shutdown frontend" and SIGUSR1 exactly at the same
time. While it sounds particularly contrived or stupid, it could
possibly happen with automated tools that would send actions via
various channels. This could cause the process to loop forever or
to crash and thus stop faster than expected.

This might be backported as far as 1.8.

BUG/MEDIUM: protocols: add a global lock for the init/deinit stuff

Dragan Dosen found that the listeners lock is not sufficient to protect
the listeners list when proxies are stopping because the listeners are
also unlinked from the protocol list, and under certain situations like
bombing with soft-stop signals or shutting down many frontends in parallel
from multiple CLI connections, it could be possible to provoke multiple
instances of delete_listener() to be called in parallel for different
listeners, thus corrupting the protocol lists.

Such operations are pretty rare, they are performed once per proxy upon
startup and once per proxy on shut down. Thus there is no point trying
to optimize anything and we can use a global lock to protect the protocol
lists during these manipulations.

This fix (or a variant) will have to be backported as far as 1.8.

BUG/CRITICAL: http_ana: Fix parsing of malformed cookies which start by a delimiter

When client-side or server-side cookies are parsed, HAProxy enters in an
infinite loop if a Cookie/Set-Cookie header value starts by a delimiter (a colon
or a semicolon). Depending on the operating system, the service may become
degraded, unresponsive, or may trigger haproxy's watchdog causing a service stop
or automatic restart.

To fix this bug, in the loop parsing the attributes, we must be sure to always
skip delimiters once the first attribute-value pair was parsed, empty or
not. The credit for the fix goes to Olivier.

CVE-2019-14241 was assigned to this bug. This patch fixes the Github issue #181.

This patch must be backported to 2.0 and 1.9. However, the patch will have to be
adapted.

BUG/MINOR: http_htx: Support empty errorfiles

Empty error files may be used to disable the sending of any message for specific
error codes. A common use-case is to use the file "/dev/null". This way the
default error message is overridden and no message is returned to the client. It
was supported in the legacy HTTP mode, but not in HTX. Because of a bug, such
messages triggered an error.

This patch must be backported to 2.0 and 1.9. However, the patch will have to be
adapted.

BUG/MINOR: http_ana: Be sure to have an allocated buffer to generate an error

In http_reply_and_close() and http_server_error(), we must be sure to have an
allocated buffer (buf.size > 0) to consider it as a valid HTX message. For now,
there is no way to hit this bug. But a fix to support "empty" error messages in
HTX is pending. Such empty messages, after parsing, will be converted into
unallocated buffer (buf.size == 0).

This patch must be backported to 2.0 and 1.9. owever, the patch will have to be
adapted.

BUG/MEDIUM: tcp-checks: do not dereference inexisting conn_stream

Github user @jpulz reported a crash with tcp-checks in issue #184
where cs==NULL. If we enter the function with cs==NULL and check->result
!= CHK_RES_UKNOWN, we'll go directly to out_end_tcpcheck and dereference
cs. We must validate there that cs is valid (and conn at the same time
since it would be NULL as well).

This fix must be backported as far as 1.8.

BUG/MINOR: mux-h1: Close server connection if input data remains in h1_detach()

With the previous commit 03627245c ("BUG/MEDIUM: mux-h1: Trim excess server data
at the end of a transaction"), we try to avoid to handle junk data coming from a
server as a response. But it only works for data already received. Starting from
the moment a server sends an invalid response, it is safer to close the
connection too, because more data may come after and there is no good reason to
handle them.

So now, when a conn_stream is detached from a server connection, if there are
some unexpected input data, we simply trim them and close the connection
ASAP. We don't close it immediately only if there are still some outgoing data
to deliver to the server.

This patch must be backported to 2.0 and 1.9.

MEDIUM: backend: remove impossible cases from connect_server()

Now that we start by releasing any possibly existing connection,
the conditions simplify a little bit and some of the complex cases
can be removed. A few comments were also added for non-obvious cases.

MEDIUM: backend: always release any existing prior connection in connect_server()

When entering connect_server() we're not supposed to have a connection
already, except when retrying a failed connection, which is pretty rare.
Let's simplify the code by starting to unconditionally release any existing
connection. For now we don't go further, as this change alone will lead to
quite some simplification that'd rather be done as a separate cleanup.

MEDIUM: lua: do not allocate the remote connection anymore

Lua cosockets do not need to allocate the remote connection anymore.
However this was trickier than expected because some tests were made
on this remote connection's existence to detect establishment instead
of relying on the stream interface's state (which is how it's now done).
The flag SF_ADDR_SET was set a bit too early (before assigning the
address) so this was moved to the right place. It should not have had
any impact beyond confusing debugging.

The only remaining occurrence of the remote connection knowledge now
is for getsockname() which requires to access the connection to send
the syscall, and it's unlikely that we'll need to change this before
QUIC or so.

MINOR: peers: now remove the remote connection setup code

The connection is not needed anymore, the backend does the job.

MAJOR: stream: store the target address into s->target_addr

When forcing the outgoing address of a connection, till now we used to
allocate this outgoing connection and set the address into it, then set
SF_ADDR_SET. With connection reuse this causes a whole lot of issues and
difficulties in the code.

Thanks to the previous changes, it is now possible to store the target
address into the stream instead, and copy the address from the stream to
the connection when initializing the connection. assign_server_address()
does this and as a result SF_ADDR_SET now reflects the presence of the
target address in the stream, not in the connection. The http_proxy mode,
the peers and the master's CLI now use the same mechanism. For now the
existing connection code was not removed to limit the amount of tricky
changes, but the allocated connection is not used anymore.

This change also revealed a latent issue that we've been having around
option http_proxy : the address was set in the connection but neither the
SF_ADDR_SET nor the SF_ASSIGNED flags were set. It looks like the connection
could establish only due to the fact that it existed with a non-null
destination address.

MINOR: stream: add a new target_addr entry in the stream structure

The purpose will be to store the target address there and not to
allocate a connection just for this anymore. For now it's only placed
in the struct, a few fields were moved to plug some holes, and the
entry is freed on release (never allocated yet for now). This must
have no impact. Note that in order to fit, the store_count which
previously was an int was turned into a short, which is way more
than enough given that the hard-coded limit is 8.

MINOR: connection: don't use clear_addr() anymore, just release the address

Now that we have dynamically allocated addresses, there's no need to
clear an address before reusing it, just release it. Note that this
is not equivalent to saying that an address is never zero, as shown in
assign_server_address() where an address 0.0.0.0 can still be assigned
to a connection for the time it takes to modify it.

MAJOR: connection: remove the addr field

Now addresses are dynamically allocated when needed. Each connection is
created with src=dst=NULL, these entries are allocated on the fly, and
released when the connection is released.

MEDIUM: connection: make sure all address producers allocate their address

This commit places calls to sockaddr_alloc() at the places where an address
is needed, and makes sure that the allocation is properly tested. This does
not add too many error paths since connection allocations are already in the
vicinity and share the same error paths. For the two cases where a
clear_addr() was called, instead the address was not allocated.

MINOR: connection: create a new pool for struct sockaddr_storage

This pool will be used to allocate storage for source and destination
addresses used in connections. Two functions sockaddr_{alloc,free}()
were added and will have to be used everywhere an address is needed.
These ones are safe for progressive replacement as they check that the
existing pointer is set before replacing it. The pool is not yet used
during allocation nor freeing. Also they operate on pointers to pointers
so they will perform checks and replace values. The free one nulls the
pointer.

MEDIUM: backend: turn all conn->addr.{from,to} to conn->{src,dst}

All reads were carefully reviewed for only reading already checked
values. Assignments were commented indicating that an allocation will
be needed once they become dynamic. The memset() used to clear the
addresses should then be turned to a free() and a NULL assignment.

MINOR: http: convert conn->addr.from to conn->src in sample fetches

These calls are safe because the address' validity was already checked
prior to reaching that code.

MINOR: frontend: switch from conn->addr.{from,to} to conn->{src,dst}

All these values were already checked, it's safe to use them as-is.

MINOR: checks: replace conn->addr.to with conn->dst

Two places will require a dynamic address allocation since the connection
is created from scratch. For the source address it looks like the
clear_addr() call will simply have to be removed as the pointer will
already be NULL.

MINOR: log: use conn->{src,dst} instead of conn->addr.{from,to}

This is used to retrieve the addresses to be logged (client, frontend,
backend, server). In all places the validity check was already performed.

MINOR: sockpair: use conn->dst for the target address in ->connect()

No extra check is needed since the destination must be set there.

MINOR: unix: use conn->dst for the target address in ->connect()

No extra check is needed since the destination must be set there.

MINOR: tcp: replace conn->addr.{from,to} with conn->{src,dst}

Most of the locations were already safe, only two places needed to have
one extra check to avoid assuming that cli_conn->src is necessarily set
(it is in practice but let's stay safe).

MINOR: session: use conn->src instead of conn->addr.from

In session_accept_fd() we'll soon have to dynamically allocate the
address, or better, steal it from the caller and define a strict calling
convention regarding who's responsible for the freeing. In the simpler
session_prepare_log_prefix(), just add an attempt to retrieve the address
if not yet set and do not dereference it on failure.

MINOR: proxy: switch to conn->src in error snapshots

The source address was taken unchecked from a client connection. In
practice we know it's set but better strengthen this now.

MINOR: stream: switch from conn->addr.{from,to} to conn->{src,dst}

No allocation is needed there. Some extra checks were added in the
stream dump code to make sure the source address is effectively valid
(it always is but it doesn't cost much to be certain).

MINOR: htx: switch from conn->addr.{from,to} to conn->{src,dst}

One place (transparent proxy) will require an allocation when the
address becomes dynamic. A few dereferences of the family were adjusted
to preliminary check for the address pointer to exist at all. The
remaining operations were already performed under control of a
successful retrieval.

MINOR: peers: use conn->dst for the peer's target address

The target address is duplicated from the peer's configured one. For
now we keep the target address as-is but we'll have to dynamically
allocate it and place it into the stream instead. Maybe a sockaddr_dup()
will help by the way.

The "show peers" part is safe as it's already called after checking
the addresses' validity.

MINOR: lua: switch to conn->dst for a connection's target address

This one will soon need a dynamic allocation, though this will be
temporary as ideally the address will be placed on the stream and no
connection will be allocated anymore.

MINOR: ssl-sock: use conn->dst instead of &conn->addr.to

This part can be definitive as the check was already in place.

MINOR: connection: use conn->{src,dst} instead of &conn->addr.{from,to}

This is in preparation for the switch to dynamic address allocation,
let's migrate the code using the old fields to the pointers instead.
Note that no extra check was added for now, the purpose is only to
get the code to use the pointers and still work.

In the proxy protocol message handling we make sure the addresses are
properly allocated before declaring them unset.

MINOR: connection: add new src and dst fields

At the moment we're facing difficulties with connection reuse based on
the fact that connections may be allocated very early only to set a
target address in transparent mode. With the imminent removal of the
legacy mode, the connection reuse by a same stream will not exist
anymore and all this awful complexity is not justified anymore. However
we still need to be able to assign addresses somewhere.

Thus instead of allocating a connection, we'll only place addresses where
needed in the stream during operations. But this takes quite some room
(typically 128 bytes). This is a nice opportunity for cleaning all this
up and dynamically allocatating the addresses fields, which will result
in actually saving memory from connection structs since most of the time
the client's "to" address is not used and the server's "from" is not used
either, thus saving ~256 bytes per end-to-end connection.

For now these new "src" and "dst" pointers point to addr.from and addr.to.
This will allow us to smoothly update the whole code to use these pointers
prior to going further and switching them to pools.

CLEANUP: connection: remove the now unused conn_get_{from,to}_addr()

These functions are not used anymore. They didn't report failures and
as such were often misused. conn_get_src() and conn_get_dst() now
replaced them everywhere.

MINOR: http: check the source address via conn_get_src() in sample fetch functions

In smp_fetch_url32_src() and smp_fetch_base32_src() it's better to
validate that the source address was properly initialized since it
will soon be dynamic, thus let's call conn_get_src().

MINOR: lua: use conn_get_{src,dst} to retrieve connection addresses

This replaces the previous conn_get_{from,to}_addr() and reuses the
existing error checks.

MINOR: http/htx: use conn_get_dst() to retrieve the destination address

When adding the X-Original-To header, let's use conn_get_dst() and make
sure it succeeds, since previous call to conn_get_to_addr() was unchecked.

MINOR: log: use conn_get_{dst,src}() to retrieve the cli/frt/bck/srv/ addresses

This also allows us to check that the operation succeeded without
logging whatever remained in the memory area in case of failure.