Willy Tarreau [Fri, 9 Oct 2020 09:14:35 +0000 (11:14 +0200)]
REGTESTS: mark abns_socket as broken
This test is inherently racy. It regularly pops up on the CI, and I've
spent one hour chasing a bug that apparently doesn't exist, just because
I'm running it 10 times in a row and it reports from 4 to 8 failures
when built at -O2 and generally even more at -O0. The logs are very
confusing, often reporting that it failed with status 0, with nothing
else wrong. I suspect it might sometimes be the shell command that fails
if it executes faster than haproxy finishes to start up, which would
also explain the relation with the optimization level. E.g:
> Testing with haproxy version: 2.2.0
> # top TEST reg-tests/seamless-reload/abns_socket.vtc FAILED (3.006) exit=2
> # top TEST reg-tests/seamless-reload/abns_socket.vtc FAILED (3.006) exit=2
> # top TEST reg-tests/seamless-reload/abns_socket.vtc FAILED (3.009) exit=2
> # top TEST reg-tests/seamless-reload/abns_socket.vtc FAILED (3.008) exit=2
> # top TEST reg-tests/seamless-reload/abns_socket.vtc FAILED (3.007) exit=2
> # top TEST reg-tests/seamless-reload/abns_socket.vtc FAILED (3.007) exit=2
> 6 tests failed, 0 tests skipped, 4 tests passed
Some of the failures include this, suggesting that some barriers could
help:
---- h1 haproxy h1 PID file check failed:
Could not read PID file '/tmp/haregtests-2020-10-09_11-19-40.kgsDB4/vtc.30539.04dbea7f/h1/pid
Since it has been causing false positives and consumed way more
troubleshooting time than it saved, let's mark it as broken so that it
doesn't waste more time. We can bring it back when someone manages to
figure what the problem is.
BUG/MINOR: http-htx: Expect no body for 204/304 internal HTTP responses
204 and 304 HTTP responses must no contain message body. These status codes are
correctly handled when the responses are received from a server. But there is no
specific processing for internal HTTP reponses (errorfile and http replies).
Now, when errorfiles or an http replies are parsed during the configuration
parsing, an error is triggered if a 204/304 message contains a body. An extra
check is also performed to ensure the body length matches the announce
content-length.
This patch should fix the issue #891. It must be backported as far as 2.0. For
2.1 and 2.0, only the http_str_to_htx() function must be fixed.
http_parse_http_reply() function does not exist.
BUG/MINOR: http: Fix content-length of the default 500 error
96 bytes is announce in the C-L header for a message of body of 97 bytes. This
bug was introduced by the patch 46a030cdd ("CLEANUP: assorted typo fixes in the
code and comments").
This patch must be backported in all versions where the patch above is (the 2.2
for now).
BUG/MEDIUM: mux-h2: Don't handle pending read0 too early on streams
This patch is similar to the previous one on the fcgi. Same is true for the
H2. But the bug is far harder to trigger because of the protocol cinematic. But
it may explain strange aborts in some edge cases.
A read0 received on the connection must not be handled too early by H2 streams.
If the demux buffer is not empty, the pending read0 must not be considered. The
H2 streams must not be passed in half-closed remote state in
h2s_wake_one_stream() and the CS_FL_EOS flag must not be set on the associated
conn-stream in h2_rcv_buf(). To sum up, it means, if there are still data
pending in the demux buffer, no abort must be reported to the streams.
To fix the issue, a dedicated function has been added, responsible for detecting
pending read0 for a H2 connection. A read0 is reported only if the demux buffer
is empty. This function is used instead of conn_xprt_read0_pending() at some
places.
Note that the HREM stream state should not be used to report aborts. It is
performed on h2s_wake_one_stream() function and it is a legacy of the very first
versions of the mux-h2.
This patch should be backported as far as 2.0. In the 1.8, the code is too
different to apply it like that. But it is probably useless because the mux-h2
can only be installed on the client side.
BUG/MEDIUM: mux-fcgi: Don't handle pending read0 too early on streams
A read0 received on the connection must not be handled too early by FCGI
streams. If the demux buffer is not empty, the pending read0 must not be
considered. The FCGI streams must not be passed in half-closed remote state in
fcgi_strm_wake_one_stream() and the CS_FL_EOS flag must not be set on the
associated conn-stream in fcgi_rcv_buf(). To sum up, it means, if there are
still data pending in the demux buffer, no abort must be reported to the
streams.
To fix the issue, a dedicated function has been added, responsible for detecting
pending read0 for a FCGI connection. A read0 is reported only if the demux
buffer is empty. This function is used instead of conn_xprt_read0_pending() at
some places.
This patch should fix the issue #886. It must be backported as far as 2.1.
Willy Tarreau [Fri, 9 Oct 2020 03:56:56 +0000 (05:56 +0200)]
BUG/MINOR: makefile: fix a tiny typo in the target list
Previous commit 382001b46 ("BUILD: Add a DragonFlyBSD target") introduced
a tiny typo in the target list ("iopenbs" vs "openbsd"). This will have to
be backported if that patch is backported.
Willy Tarreau [Thu, 8 Oct 2020 16:05:56 +0000 (18:05 +0200)]
DOC: fix a confusing typo on a regsub example
Sébastien reported a confusing example in the doc about regsub when used
with quotes. Nested quotes are already not trivial to grasp, but when
typos are there and result in something valid, it's even worse. The closing
quote ought to have been inside the brackets. However haproxy will not make
any difference because the single quotes delimit a word and the delimited
word remains the same. Let's just not add yet another level of confusion.
MINOR: mux-h1: Don't wakeup the H1C when output buffer become available
There is no reason to wake up the H1 connection when a new output buffer is
retrieved after an allocation failure because only the H1 stream will fill it.
BUG/MINOR: mux-h1: Always set the session on frontend h1 stream
The session is always defined for a frontend connection. When a new client
connection is established, the session is set for the first H1 stream. But on
keep-alived connections, it is not set for the followings H1 streams while it is
possible.
This patch is tagged as a bug because it fixes an inconsistency in the H1
streams creation. But it does not fixed a known bug.
BUG/MINOR: mux-h1: Be sure to only set CO_RFL_READ_ONCE for the first read
The condition to set CO_RFL_READ_ONCE flag is not really accurate. We must check
the request state on frontend connection only and, in the opposite, the response
state on backend connection only. Only the parsed side must be considered, not
the opposite one.
CLEANUP: ssl: Release cached SSL sessions on deinit
On deinit, when the server SSL ctx is released, we must take care to release the
cached SSL sessions stored in the array <ssl_ctx.reused_sess>. There are
global.nbthread entries in this array, each one may have a pointer on a cached
session.
This patch should fix the issue #802. No backport needed.
Tim Duesterhus [Mon, 14 Sep 2020 16:01:33 +0000 (18:01 +0200)]
CLEANUP: cache: Fix leak of cconf->c.name during config check
During the config check, the post parsing is not performed. Thus, cache filters
are not fully initialized and their cache name are never released. To be able to
release them, a flag is now set when a cache filter is fully initialized. On
deinit, if the flag is not set, it means the cache name must be freed.
The patch should fix #849. No backport needed.
[Cf: Tim is the patch author, but I added the commit message]
BUG/MINOR: proto_tcp: Report warning messages when listeners are bound
When a TCP listener is bound, in the tcp_bind_listener() function, a warning
message may be reported and should be displayed on verbose mode. But the warning
message is actually lost if the socket is successfully bound because we don't
fill the <errmsg> variable in this case.
This patch should fix the issue #863. No backport is needed.
BUG/MINOR: peers: Inconsistency when dumping peer status codes.
A peer connection status must be considered as valid only if there is an applet
which has been instantiated for the connection to the peer. So, ->statuscode
should be considered as the last known peer connection status from the last
connection to this peer if any. To reflect this, "statuscode" field of peer dump
is renamed to "last_statuscode".
This patch also add "active"/"inactive" field after the peer location type
("remote" or "local") if an applet has been instantiated for this peer connection
or not.
Thank you to Emeric for having noticed this issue.
Remove variable declaration inside a for-loop. This was introduced by my
patches serie of the implementation of dynamic stats. This is not
supported by older gcc, notably on the freebsd environment of the ci.
Use the new stats module API to integrate the dns counters in the
standard stats. This is done in order to avoid code duplication, keep
the code related to cli out of dns and use the full possibility of the
stats function, allowing to print dns stats in csv or json format.
MINOR: stats: display extra proxy stats on the html page
Integrate the additional proxy stats on the html stats page. For each
module, a new column is displayed with the individual stats available as
a tooltip.
MINOR: stats: support clear counters for dynamic stats
Add a boolean 'clearable' on stats module structure. If set, it forces
all the counters to be reset on 'clear counters' cli command. If not,
the counters are reset only when 'clear counters all' is used.
MEDIUM: stats: integrate static proxies stats in new stats
This is executed on startup with the registered statistics module. The
existing statistics have been merged in a list containing all
statistics for each domain. This is useful to print all available
statistics in a generic way.
Allocate extra counters for all proxies/servers/listeners instances.
These counters are allocated with the counters from the stats modules
registered on startup.
MEDIUM: stats: add abstract type to store counters
Implement a small API to easily add extra counters inside a structure
instance. This will be used to implement dynamic statistics linked on
every type of object as needed.
The counters are stored in a dynamic array inside the relevant objects.
MEDIUM: stats: define an API to register stat modules
A stat module can be registered to quickly add new statistics on
haproxy. It must be attached to one of the available stats domain. The
register must be done using INITCALL on STG_REGISTER.
The stat module has a name which should be unique for each new module in
a domain. It also contains a statistics list with their name/desc and a
pointer to a function used to fill the stats from the module counters.
The module also provides the initial counters values used on
automatically allocated counters. The offset for these counters
are stored in the module structure.
MEDIUM: stats: add delimiter for static proxy stats on csv
Use the character '-' to mark the end of static statistics on proxy
domain. After this marker, the order of the fields is not guaranteed and
should be parsed with care.
MINOR: stats: define additional flag px cap on domain
This flag can be used to determine on what type of proxy object the
statistics should be relevant. It will be useful when adding dynamic
statistics. Currently, this flag is not used.
MINOR: stats: define the concept of domain for statistics
The domain option will be used to have statistics attached to other
objects than proxies/listeners/servers. At the moment, only the PROXY
domain is available.
Add an argument 'domain' on the 'show stats' cli command to specify the
domain. Only 'domain proxy' is available now. If not specified, proxy
will be considered the default domain.
For HTML output, only proxy statistics will be displayed.
MINOR: hlua: Display debug messages on stderr only in debug mode
Debug Messages emitted in lua using core.Debug() or core.log() are now only
displayed on stderr if HAProxy is started in debug mode (-d parameter on the
command line). There is no change for other message levels.
This patch should fix the issue #879. It may be backported to all stable
versions.
MINOR: stats: hide px/sv/li fields in applet struct
Use an opaque pointer to store proxy instance. Regroup server/listener
as a single opaque pointer. This has the benefit to render the structure
more evolutive to support statistics on other types of objects in the
future.
This patch is needed to extend stat support for components other than
proxies objects.
The prometheus module has been adapted for these changes.
MINOR: stats: add stats size as a parameter for csv/json dump
Render the stats size parametric in csv/json dump functions. This is
needed for the future patch which provides dynamic stats. For now the
static value ST_F_TOTAL_FIELDS is provided.
Remove unused parameter px on stats_dump_one_line.
This patch is needed to extend stat support to components other than
proxies objects.
Un-mark stats_dump_one_line and stats_putchk as static and export them
in the header file. These functions will be reusable by other components to
print their statistics.
This patch is needed to extend stat support to components other than
proxies objects.
The json schema seems to be invalid when checking using the validator
from https://www.jsonschemavalidator.net/. Correct it using the
following specification :
http://json-schema.org/draft/2019-09/json-schema-validation.html#rfc.section.9.1
The impact of the bug it not well known as I am not sure of how useful
the json schema is for users. It is probably not used at all or else
this bug would have been reported.
Willy Tarreau [Fri, 2 Oct 2020 15:52:49 +0000 (17:52 +0200)]
BUG/MEDIUM: queue: make pendconn_cond_unlink() really thread-safe
A crash reported in github issue #880 looks impossible unless
pendconn_cond_unlink() occasionally sees a null leaf_p when attempting
to remove an entry, which seems to be confirmed by the reporter. What
seems to be happening is that depending on compiler optimizations,
this pointer can appear as null while pointers are moved if one of
the node's parents is removed from or inserted into the tree. There's
no explicit null of the pointer during these operations but those
pointers are rewritten in multiple steps and nothing prevents this
situation from happening, and there are no particular barrier nor
atomic ops around this.
This test was used to avoid unnecessary locking, for already deleted
entries, but looking at the code it appears that pendconn_free() already
resets s->pend_pos that's used as <p> there, and that the other call
reasons are after an error where the connection will be dropped as
well. So we don't save anything by doing this test, and make it
unsafe. The older code used to check for list emptiness there and
not inside pendconn_unlink(), which explains why the code has stayed
there. Let's just remove this now.
Thanks to @jaroslawr for reporting this issue in great details and for
testing the proposed fix.
This should be backpored to 1.8, where the test on LIST_ISEMPTY should
be moved to pendconn_unlink() instead (inside the lock, just like 2.0+).
Update the documentation with the new bundle behavior which does not use
the same OpenSSL certificate store anymore but loads the PEM separately
as multiple "crt" were specified.
BUG/MINOR: tcpcheck: Set socks4 and send-proxy flags before the connect call
Since the health-check refactoring in the 2.2, the checks through a socks4 proxy
are broken. To fix this bug, CO_FL_SOCKS4 flag must be set on the connection
before calling the connect() callback function because this flags is checked to
use the right destination address. The same is done for the CO_FL_SEND_PROXY
flag for a consistency purpose.
A reg-test has been added to test the "check-via-socks4" directive.
MEDIUM: tcp-rules: Warn if a track-sc* content rule doesn't depend on content
The warning is only emitted for HTTP frontend. Idea is to encourage the usage of
"tcp-request session" rules to track counters that does not depend on the
request content. The documentation has been updated accordingly.
The warning is important because since the multiplexers were added in the
processing chain, the HTTP parsing is performed at a lower level. Thus parsing
errors are detected in the multiplexers, before the stream creation. In HTTP/2,
the error is reported by the multiplexer itself and the stream is never
created. This difference has a certain number of consequences, one of which is
that HTTP request counting in stick tables only works for valid H2 request, and
HTTP error tracking in stick tables never considers invalid H2 requests but only
invalid H1 ones. And the aim is to do the same with the mux-h1. This change will
not be done for the 2.3, but the 2.4. At the end, H1 and H2 parsing errors will
be caught by the multiplexers, at the session level. Thus, tracking counters at
the content level should be reserved for rules using a key based on the request
content or those using ACLs based on the request content.
To be clear, a warning will be emitted for the following rules :
DOC: tcp-rules: Refresh details about L7 matching for tcp-request content rules
Because the parsing of HTTP message is now performed in the HTTP multiplexers,
the content is immediatly available when "tcp-request content" rules are
evaluated for an HTTP frontend. So, it is a good idea to make the documentation
explicit on this point. In addition, because in all cases, the parsing is
already performed, there is no reason to still use "tcp-request content" rules
based on L7 matching, although it is still valid. The recommended way is to use
"http-request" rules instead. Again, it is a good idea to update the
documentation on this point.
Eric Salama [Fri, 2 Oct 2020 09:58:19 +0000 (11:58 +0200)]
BUG/MINOR: Fix several leaks of 'log_tag' in init().
We use chunk_initstr() to store the program name as the default log-tag.
If we use the log-tag directive in the config file, this chunk will be
destroyed and replaced. chunk_initstr() sets the chunk size to 0 so we
will free the chunk itself, but not its content.
This happens for a global section and also for a proxy.
We fix this by using chunk_initlen() instead of chunk_initstr().
We also check that the memory allocation was successfull, otherwise we quit.
This fixes github issue #850.
It can be backported as far as 1.9, with minor adjustments to includes.
Tim Duesterhus [Tue, 29 Sep 2020 16:00:28 +0000 (18:00 +0200)]
MINOR: ssl: Add error if a crt-list might be truncated
Similar to warning during the parsing of the regular configuration file
that was added in 2fd5bdb439da29f15381aeb57c51327ba57674fc this patch adds
a warning to the parsing of a crt-list if the file does not end in a
newline (and thus might have been truncated).
The logic essentially just was copied over. It might be good to refactor
this in the future, allowing easy re-use within all line-based config
parsers.
Willy Tarreau [Thu, 1 Oct 2020 16:04:40 +0000 (18:04 +0200)]
BUILD: tools: fix minor build issue on isspace()
Previous commit fa41cb679 ("MINOR: tools: support for word expansion
of environment in parse_line") introduced two new isspace() on a char
and broke the build on systems using an array disguised in a macro
instead of a function (like cygwin). Just use the usual cast.
MINOR: tools: support for word expansion of environment in parse_line
Allow the syntax "${...[*]}" to expand an environment variable
containing several values separated by spaces as individual arguments. A
new flag PARSE_OPT_WORD_EXPAND has been added to toggle this feature on
parse_line invocation. In case of an invalid syntax, a new error
PARSE_ERR_WRONG_EXPAND will be triggered.
This feature has been asked on the github issue #165.
Willy Tarreau [Thu, 1 Oct 2020 02:05:39 +0000 (04:05 +0200)]
BUILD: makefile: add an EXTRAVERSION variable to ease local naming
Sometimes it's desirable to append local version naming to packages,
and currently it can only be done using SUBVERS which is already set
by default to the git commit ID and patch count since last known tag,
making the addition a bit complicated.
Let's just add a new EXTRAVERSION field that is empty by default, and
systematically appended verbatim to the version string everywhere. This
way it becomes trivial to append some local strings, such as:
make TARGET=foo EXTRAVERSION=+$(quilt applied|wc -l)
-> 2.3-dev5-5018aa-15+1
or :
make TARGET=foo EXTRAVERSION=-$(date +%F)
-> 2.3-dev5-5018aa-15-20200110
Let's be careful not to add double quotes (used as the string delimiter)
nor spaces (which can confuse version parsers on the output). The extra
version is also used to name a tarball. It's always pre-initialized to an
empty string so that it's not accidently inherited from the environment.
It's not reported in "make version" to avoid fooling tools (it would be
pointless anyway).
As a side effect it also becomes possible to force VERSION and SUBVERS
to an empty string and use EXTRAVERSION alone to force a specific version
(could possibly be useful when bisecting from patch queues outside of Git
for example).
Brad Smith [Wed, 30 Sep 2020 05:04:48 +0000 (01:04 -0400)]
BUILD: makefile: Fix building with closefrom() support enabled
I noticed the USE_CLOSEFROM define was not being passed along like the rest
during the build.
Looking around I see this was broken with the following two commits and related
series..
BUILD: Makefile: also report disabled options in the BUILD_OPTIONS variable
http://git.haproxy.org/?p=haproxy.git;a=commit;h=05fd82da76d1bbc8d65d63ab246bda7cbcf8481a
OPTIM: backend: skip LB when we know the backend is full
For some algos (roundrobin, static-rr, leastconn, first) we know that
if there is any request queued in the backend, it's because a previous
attempt failed at finding a suitable server after trying all of them.
This alone is sufficient to decide that the next request will skip the
LB algo and directly reach the backend's queue. Doing this alone avoids
an O(N) lookup when load-balancing on a saturated farm of N servers,
which starts to be very expensive for hundreds of servers, especially
under the lbprm lock. This change alone has increased the request rate
from 110k to 148k RPS for 200 saturated servers on 8 threads, and
fwlc_reposition_srv() doesn't show up anymore in perf top. See github
issue #880 for more context.
It could have been the same for random, except that random is performed
using a consistent hash and it only considers a small set of servers (2
by default), so it may result in queueing at the backend despite having
some free slots on unknown servers. It's no big deal though since random()
only performs two attempts by default.
For hashing algorithms this is pointless since we don't queue at the
backend, except when there's no hash key found, which is the least of
our concerns here.
OPTIM: backend/random: never queue on the server, always on the backend
If random() returns a server whose maxconn is reached or the queue is
used, instead of adding the request to the server's queue, better add
it to the backend queue so that it can be served by any server (hence
the fastest one).
REGTEST: fix host part in balance-uri-path-only.vtc
We used to set it to ${h1_px_addr} but it randomly fails on certain
hosts (FreeBSD and OSX) where the address is surprisingly set to "::1"
while the Host field contains 127.0.0.1 (hence two different address
families). While this is likely a minor issue in vtest, we don't need
to depend on this and can easily hard-code 127.0.0.1 which is already
used in other tests.
William Dauchy [Sat, 26 Sep 2020 11:35:52 +0000 (13:35 +0200)]
DOC: crt: advise to move away from cert bundle
especially when starting to use `new ssl cert` runtime API, it might
become a bit confusing for users to mix bundle and single cert,
especially when it comes to use the commit command:
e.g.:
- start the process with `crt` loading a bundle
- use `set ssl cert my_cert.pem.ecdsa`: API detects it as a replacement
of a bundle.
- `commit` has to be done on the bundle: `commit ssl cert my_cert.pem`
however:
- add a new cert: `new ssl cert my_cert.pem.rsa`: added as a single
certificate
- `commit` has to be done on the certificate: `commit ssl cert
my_cert.pem.rsa`
this should resolve github issue #872
this should probably be backported in >= v2.2 in order to encourage
people to move away from bundle certificates loading.
Signed-off-by: William Dauchy <w.dauchy@criteo.com>
Brad Smith [Sun, 27 Sep 2020 03:05:25 +0000 (23:05 -0400)]
BUILD: makefile: Update feature flags for OpenBSD
Update the OpenBSD target features being enabled.
I updated the list of features after noticing
"BUILD: makefile: disable threads by default on OpenBSD".
The Makefile utilizing gcc(1) by default resulted in utilizing
our legacy and obsolete compiler (GCC 4.2.1) instead of the
proper system compiler (Clang), which does support TLS. With
"BUILD: makefile: change default value of CC from gcc to cc"
that is resolved.
Released version 2.3-dev5 with the following main changes :
- DOC: Fix typo in iif() example
- CLEANUP: Update .gitignore
- BUILD: introduce possibility to define ABORT_NOW() conditionally
- CI: travis-ci: help Coverity to recognize abort()
- BUG/MINOR: Fix type passed of sizeof() for calloc()
- CLEANUP: Do not use a fixed type for 'sizeof' in 'calloc'
- CLEANUP: tree-wide: use VAR_ARRAY instead of [0] in various definitions
- BUILD: connection: fix build on clang after the VAR_ARRAY cleanup
- BUG/MINOR: ssl: verifyhost is case sensitive
- BUILD: makefile: change default value of CC from gcc to cc
- CI: travis-ci: split asan step out of running tests
- BUG/MINOR: server: report correct error message for invalid port on "socks4"
- BUG/MEDIUM: ssl: Don't call ssl_sock_io_cb() directly.
- BUG/MINOR: ssl/crt-list: crt-list could end without a \n
- BUG/MINOR: log-forward: fail on unknown keywords
- MEDIUM: log-forward: use "dgram-bind" instead of "bind" for the listener
- BUG/MEDIUM: log-forward: always quit on parsing errors
- MEDIUM: ssl: remove bundle support in crt-list and directories
- MEDIUM: ssl/cli: remove support for multi certificates bundle
- MINOR: ssl: crtlist_dup_ssl_conf() duplicates a ssl_bind_conf
- MINOR: ssl: crtlist_entry_dup() duplicates a crtlist_entry
- MEDIUM: ssl: emulates the multi-cert bundles in the crtlist
- MEDIUM: ssl: emulate multi-cert bundles loading in standard loading
- CLEANUP: ssl: remove test on "multi" variable in ckch functions
- CLEANUP: ssl/cli: remove test on 'multi' variable in CLI functions
- CLEANUP: ssl: remove utility functions for bundle
- DOC: explain bundle emulation in configuration.txt
- BUILD: fix build with openssl < 1.0.2 since bundle removal
- BUG/MINOR: log: gracefully handle the "udp@" address format for log servers
- BUG/MINOR: dns: gracefully handle the "udp@" address format for nameservers
- MINOR: listener: create a new struct "settings" in bind_conf
- MINOR: listener: move bind_proc and bind_thread to struct settings
- MINOR: listener: move the interface to the struct settings
- MINOR: listener: move the network namespace to the struct settings
- REORG: listener: create a new struct receiver
- REORG: listener: move the listening address to a struct receiver
- REORG: listener: move the receiving FD to struct receiver
- REORG: listener: move the listener's proto to the receiver
- MINOR: listener: make sock_find_compatible_fd() check the socket type
- REORG: listener: move the receiver part to a new file
- MINOR: receiver: link the receiver to its settings
- MINOR: receiver: link the receiver to its owner
- MINOR: listener: prefer to retrieve the socket's settings via the receiver
- MINOR: receiver: add a receiver-specific flag to indicate the socket is bound
- MINOR: listener: move the INHERITED flag down to the receiver
- MINOR: receiver: move the FOREIGN and V6ONLY options from listener to settings
- MINOR: sock: make sock_find_compatible_fd() only take a receiver
- MINOR: protocol: rename the ->bind field to ->listen
- MINOR: protocol: add a new ->bind() entry to bind the receiver
- MEDIUM: sock_inet: implement sock_inet_bind_receiver()
- MEDIUM: tcp: make use of sock_inet_bind_receiver()
- MEDIUM: udp: make use of sock_inet_bind_receiver()
- MEDIUM: sock_unix: implement sock_unix_bind_receiver()
- MEDIUM: uxst: make use of sock_unix_bind_receiver()
- MEDIUM: sockpair: implement sockpair_bind_receiver()
- MEDIUM: proto_sockpair: make use of sockpair_bind_receiver()
- MEDIUM: protocol: explicitly start the receiver before the listener
- MEDIUM: protocol: do not call proto->bind() anymore from bind_listener()
- MINOR: protocol: add a new proto_fam structure for protocol families
- MINOR: protocol: retrieve the family-specific fields from the family
- CLEANUP: protocol: remove family-specific fields from struct protocol
- MINOR: protocol: add a real family for existing FDs
- CLEANUP: tools: make str2sa_range() less awful for fd@ and sockpair@
- MINOR: tools: make str2sa_range() take more options than just resolve
- MINOR: tools: add several PA_O_PORT_* flags in str2sa_range() callers
- MEDIUM: tools: make str2sa_range() validate callers' port specifications
- MEDIUM: config: remove all checks for missing/invalid ports/ranges
- MINOR: tools: add several PA_O_* flags in str2sa_range() callers
- MINOR: listener: remove the inherited arg to create_listener()
- MINOR: tools: make str2sa_range() optionally return the fd
- MINOR: log: detect LOG_TARGET_FD from the fd and not from the syntax
- MEDIUM: tools: make str2sa_range() resolve pre-bound listeners
- MINOR: config: do not test an inherited socket again
- MEDIUM: tools: make str2sa_range() check for the sockpair's FD usability
- MINOR: tools: start to distinguish stream and dgram in str2sa_range()
- MEDIUM: tools: make str2sa_range() only report AF_CUST_UDP on listeners
- MINOR: tools: remove the central test for "udp" in str2sa_range()
- MINOR: cfgparse: add str2receiver() to parse dgram receivers
- MINOR: log-forward: use str2receiver() to parse the dgram-bind address
- MEDIUM: config: make str2listener() not accept datagram sockets anymore
- MINOR: listener: pass the chosen protocol to create_listeners()
- MINOR: tools: make str2sa_range() directly return the protocol
- MEDIUM: tools: make str2sa_range() check that the protocol has ->connect()
- MINOR: protocol: add the control layer type in the protocol struct
- MEDIUM: protocol: store the socket and control type in the protocol array
- MEDIUM: tools: make str2sa_range() use protocol_lookup()
- MEDIUM: proto_udp: replace last AF_CUST_UDP* with AF_INET*
- MINOR: tools: drop listener detection hack from str2sa_range()
- BUILD: sock_unix: add missing errno.h
- MINOR: sock_inet: report the errno string in binding errors
- MINOR: sock_unix: report the errno string in binding errors
- BUILD: sock_inet: include errno.h
- MINOR: h2/trace: also display the remaining frame length in traces
- BUG/MINOR: h2/trace: do not display "stream error" after a frame ACK
- BUG/MEDIUM: h2: report frame bits only for handled types
- BUG/MINOR: http-fetch: Don't set the sample type during the htx prefetch
- BUG/MINOR: Fix memory leaks cfg_parse_peers
- BUG/MINOR: config: Fix memory leak on config parse listen
- MINOR: backend: make the "whole" option of balance uri take only one bit
- MINOR: backend: add a new "path-only" option to "balance uri"
- REGTESTS: add a few load balancing tests
- BUG/MEDIUM: listeners: do not pause foreign listeners
- BUG/MINOR: listeners: properly close listener FDs
- BUILD: trace: include tools.h
Miroslav Zagorac [Thu, 24 Sep 2020 07:15:46 +0000 (09:15 +0200)]
BUILD: trace: include tools.h
If the TRACE option is used when compiling the haproxy source,
the following error occurs on debian 9.13:
src/calltrace.o: In function `make_line':
.../src/calltrace.c:204: undefined reference to `rdtsc'
src/calltrace.o: In function `calltrace':
.../src/calltrace.c:277: undefined reference to `rdtsc'
collect2: error: ld returned 1 exit status
Makefile:866: recipe for target 'haproxy' failed
The code dealing with zombie proxies in soft_stop() is bogus, it uses
close() instead of fd_delete(), leaving a live entry in the fdtab with
a dangling pointer to a free memory location. The FD might be reassigned
for an outgoing connection for the time it takes the proxy to completely
stop, or could be dumped on the CLI's "show fd" command. In addition,
the listener's FD was not even reset, leaving doubts about whether or
not it will happen again in deinit().
And in deinit(), the loop in charge of closing zombie FDs is particularly
unsafe because it closes the fd then calls unbind_listener() then
delete_listener() hoping none of them will touch it again. Since it
requires some mental efforts to figure what's done there, let's correctly
reset the fd here as well and close it using fd_delete() to eliminate any
remaining doubts.
It's uncertain whether this should be backported. Zombie proxies are rare
and the situations capable of triggering such issues are not trivial to
setup. However it's easy to imagine how things could go wrong if backported
too far. Better wait for any matching report if at all (this code has been
there since 1.8 without anobody noticing).
BUG/MEDIUM: listeners: do not pause foreign listeners
There's a nasty case with listeners that belong to foreign processes.
If a proxy is defined this way:
global
nbproc 2
frontend f
bind :1111 process 1
bind :2222 process 2
and if stats expose-fd listeners is set, the listeners' FDs will not
be closed on the processes that don't use them. At this point it's not
a big deal, except that they're shared between processes and that a
"disable frontend f" issued on one process will pause all of them and
cause the other process to see accept() fail, turning its own listener
to state LI_LIMITED to try to leave it some time to recover. But it
will never recover, even after an enable.
The root cause of the issue is that the ZOMBIE state doesn't cover
this situation since it's only for a proxy being entirely bound to a
process.
What we do here to address this is that we refrain from pausing a
file descriptor that belongs to a foreign process in pause_listener().
This definitely solves the problem. A similar test is present in
resume_listener() and is the reason why the FD doesn't recover upon the
"enable" action by the way.
This ought to be backported to 1.8 where seamless reload was integrated.
The config above should be sufficient to validate that the fix works;
after a pair of "disable/enable frontend" no process will handle the
traffic to one of the ports anymore.
This adds "balance-rr" to test round robin, "balance-uri" to test the
default balance-uri method, and "balance-uri-path-only" which mixes H1
and H2 through "balance uri path-only" and verifies that they reach
the same server.
Note that for the latter, "proto h2" explicitly had to be placed on
the listening socket otherwise it would timeout. This may indicate an
issue in the H1->H2 upgrade depending how the H2 preface is sent maybe.
MINOR: backend: add a new "path-only" option to "balance uri"
Since we've fixed the way URIs are handled in 2.1, some users have started
to experience inconsistencies in "balance uri" between requests received
over H1 and the same ones received over H2. This is caused by the fact
that H1 rarely uses absolute URIs while H2 always uses them. Similar
issues were reported already around replace-uri etc, leading to "pathq"
recently being introduced, so this isn't new.
Here what this patch does is add a new option to "balance uri" to indicate
that the hashing should only start at the path and not cover the authority.
This makes H1 relative URIs and H2 absolute URI hashes equally again.
Some extra options could be added to normalize URIs by always hashing the
authority (or host) in front of them, which would make sure that both
absolute and relative requests provide the same hash. This is left for
later if needed.
BUG/MINOR: config: Fix memory leak on config parse listen
This memory leak happens if there is two or more defaults section. When
the default proxy is reinitialized, the structure member containing the
config filename must be freed.
Fix github issue #851.
Should be backported as far as 1.6.
BUG/MINOR: http-fetch: Don't set the sample type during the htx prefetch
A subtle bug was introduced by the commit a6d9879e6 ("BUG/MEDIUM: htx:
smp_prefetch_htx() must always validate the direction"), for the "method"
sample fetch only. The sample data type and the method id are always
overwritten because smp_prefetch_htx() function is called later in the
sample fetch evaluation. The bug is in the smp_prefetch_htx() function but
it is only visible for the "method" sample fetch, for an unknown method.
In fact, when smp_prefetch_htx() is called, the sample object is
altered. The data type is set to SMP_T_BOOL and, on success, the data value
is set to 1. Thus, if the caller has already set some infos into the sample
object, they may be lost. AFAIK, there is no reason to do so. It is
inherited from the legacy HTTP code and I honestely don't known why it was
done this way. So, instead of fixing the "method" sample fetch to set useful
info after the call to smp_prefetch_htx() function, I prefer to not alter
the sample object in smp_prefetch_htx().
This patch must be backported as far as 2.0. On the 2.0, only the HTX part
must be fixed.
BUG/MEDIUM: h2: report frame bits only for handled types
As part of his GREASE experiments on Chromium, Bence Béky reported in
https://lists.w3.org/Archives/Public/ietf-http-wg/2020JulSep/0202.html
and https://bugs.chromium.org/p/chromium/issues/detail?id=1127060 that
a certain combination of frame type and frame flags was causing an error
on app.slack.com. It turns out that it's haproxy that is causing this
issue because the frame type is wrongly assumed to support padding, the
frame flags indicate padding is present, and the frame is too short for
this, resulting in an error.
The reason why only some frame types are affected is due to the frame
type being used in a bit shift to match against a mask, and where the
5 lower bits of the frame type only are used to compute the frame bit.
If the resulting frame bit matches a DATA, HEADERS or PUSH_PROMISE frame
bit, then padding support is assumed and the test is enforced, resulting
in a PROTOCOL_ERROR or FRAME_SIZE_ERROR depending on the payload size.
We must never match any such bit for unsupported frame types so let's
add a check for this. This must be backported as far as 1.8.
Thanks to Cooper Bethea for providing enough context to help narrow the
issue down and to Bence Béky for creating a simple reproducer.
BUG/MINOR: h2/trace: do not display "stream error" after a frame ACK
When sending a frame ACK, the parser state is not equal to H2_CS_FRAME_H
and we used to report it as an error, which is not true. In fact we should
only indicate when we skip remaining data.
I was careful to have it for sock_unix.c but missed it for sock_inet
which broke with commit 36722d227 ("MINOR: sock_inet: report the errno
string in binding errors") depending on the build options. No backport
is needed.
MINOR: sock_inet: report the errno string in binding errors
With the socket binding code cleanup it becomes easy to add more info to
error messages. One missing thing used to be the error string, which is
now added after the generic one, for example:
MINOR: tools: drop listener detection hack from str2sa_range()
We used to resort to a trick to detect whether the caller was a listener
or an outgoing socket in order never to present an AF_CUST_UDP* socket
to a log server nor a nameserver. This is no longer necessary, the socket
type alone will be enough.
MEDIUM: proto_udp: replace last AF_CUST_UDP* with AF_INET*
We don't need to cheat with the sock_domain anymore, we now always have
the SOCK_DGRAM sock_type as a complementary selector. This patch restores
the sock_domain to AF_INET* in the udp* protocols and removes all traces
of the now unused AF_CUST_*.
MEDIUM: tools: make str2sa_range() use protocol_lookup()
By doing so we can remove the hard-coded mapping from AF_INET to AF_CUST_UDP
but we still need to keep the test on the listeners as long as these dummy
families remain present in the code.
MEDIUM: protocol: store the socket and control type in the protocol array
The protocol array used to be only indexed by socket family, which is very
problematic with UDP (requiring an extra family) and with the forthcoming
QUIC (also requiring an extra family), especially since that binds them to
certain families, prevents them from supporting dgram UNIX sockets etc.
In order to address this, we now start to register the protocols with more
info, namely the socket type and the control type (either stream or dgram).
This is sufficient for the protocols we have to deal with, but could also
be extended further if multiple protocol variants were needed. But as is,
it still fits nicely in an array, which is convenient for lookups that are
instant.
MINOR: protocol: add the control layer type in the protocol struct
This one will be needed to more accurately select a protocol. It may
differ from the socket type for QUIC, which uses dgram at the socket
layer and provides stream at the control layer. The upper level requests
a control layer only so we need this field.
MEDIUM: tools: make str2sa_range() check that the protocol has ->connect()
Most callers of str2sa_range() need the protocol only to check that it
provides a ->connect() method. It used to be used to verify that it's a
stream protocol, but it might be a bit early to get rid of it. Let's keep
the test for now but move it to str2sa_range() when the new flag PA_O_CONNECT
is present. This way almost all call places could be cleaned from this.
There's a strange test in the server address parsing code that rechecks
the family from the socket which seems to be a duplicate of the previously
removed tests. It will have to be rechecked.