Olivier Houchard [Sun, 29 Mar 2020 22:23:57 +0000 (00:23 +0200)]
MEDIUM: connections: Revamp the way idle connections are killed
The original algorithm always killed half the idle connections. This doesn't
take into account the way the load can change. Instead, we now kill half
of the exceeding connections (exceeding connection being the number of
used + idle connections past the last maximum used connections reached).
That way if we reach a peak, we will kill much less, and it'll slowly go back
down when there's less usage.
Olivier Houchard [Wed, 25 Mar 2020 18:41:03 +0000 (19:41 +0100)]
MINOR: servers: Add a counter for the number of currently used connections.
Add a counter to know the current number of used connections, as well as the
max, this will be used later to refine the algorithm used to kill idle
connections, based on current usage.
Jerome Magnin [Sun, 29 Mar 2020 07:37:12 +0000 (09:37 +0200)]
MEDIUM: stream: support use-server rules with dynamic names
With server-template was introduced the possibility to scale the
number of servers in a backend without needing a configuration change
and associated reload. On the other hand it became impractical to
write use-server rules for these servers as they would only accept
existing server labels as argument. This patch allows the use of
log-format notation to describe targets of a use-server rules, such
as in the example below:
listen test
bind *:1234
use-server %[hdr(srv)] if { hdr(srv) -m found }
use-server s1 if { path / }
server s1 127.0.0.1:18080
server s2 127.0.0.1:18081
If a use-server rule is applied because it was conditionned by an
ACL returning true, but the target of the use-server rule cannot be
resolved, no other use-server rule is evaluated and we fall back to
load balancing.
This feature was requested on the ML, and bumped with issue #563.
Jerome Magnin [Fri, 27 Mar 2020 21:08:40 +0000 (22:08 +0100)]
MINOR: listener: add so_name sample fetch
Add a sample fetch for the name of a bind. This can be useful to
take decisions when PROXY protocol is used and we can't rely on dst,
such as the sample config below.
defaults
mode http
listen bar
bind 127.0.0.1:1111
server s1 127.0.1.1:1234 send-proxy
listen foo
bind 127.0.1.1:1234 name foo accept-proxy
http-request return status 200 hdr dst %[dst] if { dst 127.0.1.1 }
Emmanuel Hocdet [Fri, 28 Feb 2020 15:00:34 +0000 (16:00 +0100)]
MINOR: ssl: skip self issued CA in cert chain for ssl_ctx
First: self issued CA, aka root CA, is the enchor for chain validation,
no need to send it, client must have it. HAProxy can skip it in ssl_ctx.
Second: the main motivation to skip root CA in ssl_ctx is to be able to
provide it in the chain without drawback. Use case is to provide issuer
for ocsp without the need for .issuer and be able to share it in
issuers-chain-path. This concerns all certificates without intermediate
certificates. It's useless for BoringSSL, .issuer is ignored because ocsp
bits doesn't need it.
Baptiste Assmann [Wed, 19 Feb 2020 00:08:51 +0000 (01:08 +0100)]
BUG/MEDIUM: dns: improper parsing of aditional records
13a9232ebc63fdf357ffcf4fa7a1a5e77a1eac2b introduced parsing of
Additionnal DNS response section to pick up IP address when available.
That said, this introduced a side effect for other query types (A and
AAAA) leading to consider those responses invalid when parsing the
Additional section.
This patch avoids this situation by ensuring the Additional section is
parsed only for SRV queries.
Olivier Houchard [Wed, 25 Mar 2020 11:24:11 +0000 (12:24 +0100)]
BUG/MEDIUM: mux_h1: Process a new request if we already received it.
In h1_detach(), if our input buffer isn't empty, don't just subscribe(), we
may hold a new request, and there's nothing left to read. Instead, call
h1_process() directly, so that a new stream is created.
Failure to do so means if we received the new request to early, the
connecetion will just hang, as it happens when using svn.
BUG/MINOR: peers: Use after free of "peers" section.
When a "peers" section has not any local peer, it is removed of the list
of "peers" sections by check_config_validity(). But a stick-table which
refers to a "peers" section stores a pointer to this peers section.
These pointer must be reset to NULL value for each stick-table refering to
such a "peers" section to prevent stktable_init() to start the peers frontend
attached to the peers section dereferencing the invalid pointer.
Furthemore this patch stops the peers frontend as this is done for other
configurations invalidated by check_config_validity().
Thank you to Olivier D for having reported this issue with such a
simple configuration file which made haproxy crash when started with
-c option for configuration file validation.
defaults
mode http
peers mypeers
peer toto 127.0.0.1:1024
backend test
stick-table type ip size 10k expire 1h store http_req_rate(1h) peers mypeers
BUG/MINOR: peers: init bind_proc to 1 if it wasn't initialized
Tim reported that in master-worker mode, if a stick-table is declared
but not used in the configuration, its associated peers listener won't
bind.
This problem is due the fact that the master-worker and the daemon mode,
depend on the bind_proc field of the peers proxy to disable the listener.
Unfortunately the bind_proc is left to 0 if no stick-table were used in
the configuration, stopping the listener on all processes.
This fixes sets the bind_proc to the first process if it wasn't
initialized.
Should fix bug #558. Should be backported as far as 1.8.
Emmanuel Hocdet [Fri, 28 Feb 2020 15:00:34 +0000 (16:00 +0100)]
MINOR: ssl: rework add cert chain to CTX to be libssl independent
SSL_CTX_set1_chain is used for openssl >= 1.0.2 and a loop with
SSL_CTX_add_extra_chain_cert for openssl < 1.0.2.
SSL_CTX_add_extra_chain_cert exist with openssl >= 1.0.2 and can be
used for all openssl version (is new name is SSL_CTX_add0_chain_cert).
This patch use SSL_CTX_add_extra_chain_cert to remove any #ifdef for
compatibilty. In addition sk_X509_num()/sk_X509_value() replace
sk_X509_shift() to extract CA from chain, as it is used in others part
of the code.
Emmanuel Hocdet [Mon, 23 Mar 2020 09:31:47 +0000 (10:31 +0100)]
BUG/MINOR: ssl: memory leak when find_chain is NULL
This bug was introduced by 85888573 "BUG/MEDIUM: ssl: chain must be
initialized with sk_X509_new_null()". No need to set find_chain with
sk_X509_new_null(), use find_chain conditionally to fix issue #516.
Willy Tarreau [Mon, 23 Mar 2020 08:43:45 +0000 (09:43 +0100)]
[RELEASE] Released version 2.2-dev5
Released version 2.2-dev5 with the following main changes :
- CLEANUP: ssl: is_default is a bit in ckch_inst
- BUG/MINOR: ssl/cli: sni_ctx' mustn't always be used as filters
- DOC: ssl: clarify security implications of TLS tickets
- CLEANUP: remove support for Linux i686 vsyscalls
- CLEANUP: drop support for USE_MY_ACCEPT4
- CLEANUP: remove support for USE_MY_EPOLL
- CLEANUP: remove support for USE_MY_SPLICE
- CLEANUP: remove the now unused common/syscall.h
- BUILD: make dladdr1 depend on glibc version and not __USE_GNU
- BUILD: wdt: only test for SI_TKILL when compiled with thread support
- BUILD: Makefile: the compiler-specific flags should all be in SPEC_CFLAGS
- CLEANUP: ssl: separate the directory loading in a new function
- BUG/MINOR: buffers: MT_LIST_DEL_SAFE() expects the temporary pointer.
- BUG/MEDIUM: mt_lists: Make sure we set the deleted element to NULL;
- MINOR: init: move the maxsock calculation code to compute_ideal_maxsock()
- MEDIUM: init: always try to push the FD limit when maxconn is set from -m
- BUG/MAJOR: list: fix invalid element address calculation
- BUILD: stream-int: fix a few includes dependencies
- MINOR: mt_lists: Appease gcc.
- MINOR: lists: Implement function to convert list => mt_list and mt_list => list
- MINOR: servers: Kill priv_conns.
- MINOR: lists: fix indentation.
- BUG/MEDIUM: random: align the state on 2*64 bits for ARM64
- BUG/MEDIUM: connections: Don't assume the connection has a valid session.
- BUG/MEDIUM: pools: Always update free_list in pool_gc().
- BUG/MINOR: haproxy: always initialize sleeping_thread_mask
- BUG/MINOR: listener/mq: do not dispatch connections to remote threads when stopping
- BUG/MINOR: haproxy/threads: try to make all threads leave together
- Revert "BUILD: travis-ci: enable s390x builds"
- BUILD: travis-ci: enable regular s390x builds
- DOC: proxy_protocol: Reserve TLV type 0x05 as PP2_TYPE_UNIQUE_ID
- MINOR: proxy_protocol: Ingest PP2_TYPE_UNIQUE_ID on incoming connections
- MEDIUM: proxy_protocol: Support sending unique IDs using PPv2
- CLEANUP: connection: Add blank line after declarations in PP handling
- CLEANUP: assorted typo fixes in the code and comments
- CI: add spellcheck github action
- DOC: correct typo in alert message about rspirep
- CI: travis: switch linux builds to clang-9
- MINOR: debug: add a new DISGUISE() macro to pass a value as identity
- MINOR: debug: consume the write() result in BUG_ON() to silence a warning
- MINOR: use DISGUISE() everywhere we deliberately want to ignore a result
- BUILD: pools: silence build warnings with DEBUG_MEMORY_POOLS and DEBUG_UAF
- CLEANUP: connection: Stop directly setting an ist's .ptr
- CI: travis: revert to clang-7 for BoringSSL tests
- BUILD: on ARM, must be linked to libatomic.
- BUILD: makefile: fix regex syntax in ARM platform detection
- BUG/MEDIUM: peers: resync ended with RESYNC_PARTIAL in wrong cases.
- REORG: ssl: move ssl_sock_load_cert()
- MINOR: ssl: pass ckch_inst to ssl_sock_load_ckchs()
- MEDIUM: ssl: allow crt-list caching
- MINOR: ssl: directories are loaded like crt-list
- BUG/MINOR: ssl: can't open directories anymore
- BUG/MEDIUM: spoe: dup agent's engine_id string from trash.area
- MINOR: fd: Use a separate lock for logs instead of abusing the fd lock.
- MINOR: mux_pt: Don't try to remove the connection from the idle list.
- MINOR: ssl/cli: show/dump ssl crt-list
- BUG/MINOR: ssl/cli: free the trash chunk in dump_crtlist
- MEDIUM: fd: Introduce a running mask, and use it instead of the spinlock.
- BUG/MINOR: ssl: memory leak in crtlist_parse_file()
- MINOR: tasks: Provide the tasklet to the callback.
- BUG/MINOR: ssl: memleak of struct crtlist_entry
- BUG/MINOR: pattern: Do not pass len = 0 to calloc()
- BUILD: makefile: fix expression again to detect ARM platform
- CI: travis: re-enable ASAN on clang
- CI: travis: proper group output redirection together with travis_wait
- DOC: assorted typo fixes in the documentation
- MINOR: wdt: Move the definitions of WDTSIG and DEBUGSIG into types/signal.h.
- BUG/MEDIUM: wdt: Don't ignore WDTSIG and DEBUGSIG in __signal_process_queue().
- MINOR: memory: Change the flush_lock to a spinlock, and don't get it in alloc.
- MINOR: ssl/cli: 'new ssl cert' command
- MINOR: ssl/cli: show certificate status in 'show ssl cert'
- MEDIUM: sessions: Don't be responsible for connections anymore.
- MEDIUM: servers: Split the connections into idle, safe, and available.
- MINOR: fd: Implement fd_takeover().
- MINOR: connections: Add a new mux method, "takeover".
- MINOR: connections: Make the "list" element a struct mt_list instead of list.
- MINOR: connections: Add a flag to know if we're in the safe or idle list.
- MEDIUM: connections: Attempt to get idle connections from other threads.
- MEDIUM: mux_h1: Implement the takeover() method.
- MEDIUM: mux_h2: Implement the takeover() method.
- MEDIUM: mux_fcgi: Implement the takeover() method.
- MEDIUM: connections: Kill connections even if we are reusing one.
- BUG/MEDIUM: connections: Don't forget to decrement idle connection counters.
- BUG/MINOR: ssl: Do not free garbage pointers on memory allocation failure
- BUG/MINOR: ssl: Correctly add the 1 for the sentinel to the number of elements
- BUG/MINOR: ssl: crtlist_dup_filters() must return NULL with fcount == 0
- BUG/MEDIUM: build: Fix compilation by spelling decl correctly.
- BUILD/MEDIUM: fd: Declare fd_mig_lock as extern.
- CI: run travis-ci builds on push only, skip pull requests
- CI: temporarily disable unstable travis arm64 builds
- BUG/MINOR: ssl/cli: free BIO upon error in 'show ssl cert'
- BUG/MINOR: connections: Make sure we free the connection on failure.
- BUG/MINOR: ssl/cli: fix a potential NULL dereference
- BUG/MEDIUM: h1: Make sure we subscribe before going into idle list.
- BUG/MINOR: connections: Set idle_time before adding to idle list.
- MINOR: muxes: Note that we can't usee a connection when added to the srv idle.
- REGTEST: increase timeouts on the seamless-reload test
- BUG/MINOR: haproxy/threads: close a possible race in soft-stop detection
- CLEANUP: haproxy/threads: don't check global_tasks_mask twice
In run_thread_poll_loop() we test both for (global_tasks_mask & tid_bit)
and thread_has_tasks(), but the former is useless since this test is
already part of the latter.
Willy Tarreau [Mon, 23 Mar 2020 08:27:28 +0000 (09:27 +0100)]
BUG/MINOR: haproxy/threads: close a possible race in soft-stop detection
Commit 4b3f27b ("BUG/MINOR: haproxy/threads: try to make all threads
leave together") improved the soft-stop synchronization but it left a
small race open because it looks at tasks_run_queue, which can drop
to zero then back to one while another thread picks the task from the
run queue to insert it into the tasklet_list. The risk is very low but
not null. In addition the condition didn't consider the possible presence
of signals in the queue.
This patch moves the stopping detection just after the "wake" calculation
which already takes care of the various queues' sizes and signals. It
avoids needlessly duplicating these tests.
The bug was discovered during a code review but will probably never be
observed. This fix may be backported to 2.1 and 2.0 along with the commit
above.
Willy Tarreau [Mon, 23 Mar 2020 08:11:51 +0000 (09:11 +0100)]
REGTEST: increase timeouts on the seamless-reload test
The abns_socket in seamless-reload regtest regularly fails in Travis-CI
on smaller machines only (typically the ppc64le and sometimes s390x).
The error always reports an incomplete HTTP header as seen from the
client. And this can occasionally be reproduced on the minicloud ppc64le
image when setting a huge file descriptors limit (1 million).
What happens in fact is the following: depending on the binding order,
some connections from the client might reach the TCP listener on the
old instance and be forwarded to the ABNS listener of the second
instance just being prepared to start up. But due to the huge number
of FDs, setting them up takes slightly more time and the 20ms server
timeout may expire before the new instance finishes its startup. This
can result in an occasional 504, except that since the client timeout
is the same as the server timeout, both sides are closed at the same
time and the client doesn't receive the 504.
In addition a second problem plugs onto this: by default http-reuse is
enabled. Some requests being forwarded to the older instance will be
sent over an already established connection. But the CPU used by the
starting process using many FDs will be taken away from the older
process, whose abns listener will not see a request for more than 20ms,
and will decide to kill the idle client connection. At the same moment
the TCP proxy forwards a request over this closing connection, it
detects the close and silently closes the other side to let the
client retry, which is detected by the vtest client as another case
of empty header. This is easier to reproduce in VMs with few CPUs
(2 or less) and some noisy neighbors such as a few spinning loops in
background.
Let's just increase this tests' timeout to avoid this. While a few
ms are close to the scheduler's granularity, this test is never
supposed to trigger the timeouts so it's safe to go higher without
impacts on the test execution time. At one second the problem seems
impossible to reproduce on the minicloud VMs.
Olivier Houchard [Sun, 22 Mar 2020 22:25:51 +0000 (23:25 +0100)]
MINOR: muxes: Note that we can't usee a connection when added to the srv idle.
In the various muxes, add a comment documenting that once
srv_add_to_idle_list() got called, any thread may pick that conenction up,
so it is unsafe to access the mux context/the connection, the only thing we
can do is returning.
Olivier Houchard [Sun, 22 Mar 2020 18:59:52 +0000 (19:59 +0100)]
BUG/MINOR: connections: Set idle_time before adding to idle list.
In srv_add_to_idle_list(), make sure we set the idle_time before we add
the connection to an idle list, not after, otherwise another thread may
grab it, set the idle_time to 0, only to have the original thread set it
back to now_ms.
This may have an impact, as in conn_free() we check idle_time to decide
if we should decrement the idle connection counters for the server.
Olivier Houchard [Sun, 22 Mar 2020 18:56:03 +0000 (19:56 +0100)]
BUG/MEDIUM: h1: Make sure we subscribe before going into idle list.
In h1_detach(), make sure we subscribe before we call
srv_add_to_idle_list(), not after. As soon as srv_add_to_idle_list() is
called, and it is put in an idle list, another thread can take it, and
we're no longer allowed to subscribe.
This fixes a race condition when another thread grabs a connection as soon
as it is put, the original owner would subscribe, and thus the new thread
would fail to do so, and to activate polling.
Olivier Houchard [Fri, 20 Mar 2020 13:26:32 +0000 (14:26 +0100)]
BUG/MINOR: connections: Make sure we free the connection on failure.
In connect_server(), make sure we properly free a newly created connection
if we somehow fail, and it has not yet been attached to a conn_stream, or
it would lead to a memory leak.
This should appease coverity for backend.c, as reported in inssue #556.
[wt: arm64 shows timeouts during packages downloads and causes all
builds to be reported as failures; building for arm64 on real hardware
is still done on a regular basis and works fine however]
Tim Duesterhus [Thu, 19 Mar 2020 15:12:09 +0000 (16:12 +0100)]
BUG/MINOR: ssl: Do not free garbage pointers on memory allocation failure
In `ckch_inst_sni_ctx_to_sni_filters` use `calloc()` to allocate the filter
array. When the function fails to allocate memory for a single entry the
whole array will be `free()`d using free_sni_filters(). With the previous
`malloc()` the pointers for entries after the failing allocation could
possibly be a garbage value.
Olivier Houchard [Thu, 19 Mar 2020 22:52:28 +0000 (23:52 +0100)]
BUG/MEDIUM: connections: Don't forget to decrement idle connection counters.
In conn_backend_get(), when we manage to get an idle connection from the
current thread's pool, don't forget to decrement the idle connection
counters, or we may end up not reusing connections when we could, and/or
killing connections when we shouldn't.
Olivier Houchard [Mon, 16 Mar 2020 12:49:00 +0000 (13:49 +0100)]
MEDIUM: connections: Kill connections even if we are reusing one.
In connect_server(), if we notice we have more file descriptors opened than
we should, there's no reason not to close a connection just because we're
reusing one, so do it anyway.
MEDIUM: connections: Attempt to get idle connections from other threads.
In connect_server(), if we no longer have any idle connections for the
current thread, attempt to use the new "takeover" mux method to steal a
connection from another thread.
This should have no impact right now, given no mux implements it.
MINOR: connections: Make the "list" element a struct mt_list instead of list.
Make the "list" element a struct mt_list, and explicitely use
list_from_mt_list to get a struct list * where it is used as such, so that
mt_list_for_each_entry will be usable with it.
Olivier Houchard [Wed, 19 Feb 2020 16:18:57 +0000 (17:18 +0100)]
MINOR: connections: Add a new mux method, "takeover".
Add a new mux method, "takeover", that will attempt to make the current thread
responsible for the connection.
It should return 0 on success, and non-zero on failure.
Implement a new function, fd_takeover(), that lets you become the thread
responsible for the fd. On architectures that do not have a double-width CAS,
use a global rwlock.
fd_set_running() was also changed to be able to compete with fd_takeover(),
either using a dooble-width CAS on both running_mask and thread_mask, or
by claiming a reader on the global rwlock. This extra operation should not
have any measurable impact on modern architectures where threading is
relevant.
Olivier Houchard [Thu, 13 Feb 2020 18:12:07 +0000 (19:12 +0100)]
MEDIUM: servers: Split the connections into idle, safe, and available.
Revamp the server connection lists. We know have 3 lists :
- idle_conns, which contains idling connections
- safe_conns, which contains idling connections that are safe to use even
for the first request
- available_conns, which contains connections that are not idling, but can
still accept new streams (those are HTTP/2 or fastcgi, and are always
considered safe).
Olivier Houchard [Mon, 20 Jan 2020 12:56:01 +0000 (13:56 +0100)]
MEDIUM: sessions: Don't be responsible for connections anymore.
Make it so sessions are not responsible for connection anymore, except for
connections that are private, and thus can't be shared, otherwise, as soon
as a request is done, the session will just add the connection to the
orphan connections pool.
This will break http-reuse safe, but it is expected to be fixed later.
Olivier Houchard [Wed, 18 Mar 2020 14:48:29 +0000 (15:48 +0100)]
MINOR: memory: Change the flush_lock to a spinlock, and don't get it in alloc.
The flush_lock was introduced, mostly to be sure that pool_gc() will never
dereference a pointer that has been free'd. __pool_get_first() was acquiring
the lock to, the fear was that otherwise that pointer could get free'd later,
and then pool_gc() would attempt to dereference it. However, that can not
happen, because the only functions that can free a pointer, when using
lockless pools, are pool_gc() and pool_flush(), and as long as those two
are mutually exclusive, nobody will be able to free the pointer while
pool_gc() attempts to access it.
So change the flush_lock to a spinlock, and don't bother acquire/release
it in __pool_get_first(), that way callers of __pool_get_first() won't have
to wait while the pool is flushed. The worst that can happen is we call
__pool_refill_alloc() while the pool is getting flushed, and memory can
get allocated just to be free'd.
Olivier Houchard [Wed, 18 Mar 2020 12:10:05 +0000 (13:10 +0100)]
BUG/MEDIUM: wdt: Don't ignore WDTSIG and DEBUGSIG in __signal_process_queue().
When running __signal_process_queue(), we ignore most signals. We can't,
however, ignore WDTSIG and DEBUGSIG, otherwise that thread may end up
waiting for another one that could hold a glibc lock, while the other thread
wait for this one to enter debug_handler().
So make sure WDTSIG and DEBUGSIG aren't ignored, if they are defined.
This probably explains the watchdog deadlock described in github issue
Olivier Houchard [Wed, 18 Mar 2020 12:07:19 +0000 (13:07 +0100)]
MINOR: wdt: Move the definitions of WDTSIG and DEBUGSIG into types/signal.h.
Move the definition of WDTSIG and DEBUGSIG from wdt.c and debug.c into
types/signal.h, so that we can access them in another file.
We need those definition to avoid blocking those signals when running
__signal_process_queue().
Willy Tarreau [Wed, 18 Mar 2020 08:35:58 +0000 (09:35 +0100)]
CI: travis: re-enable ASAN on clang
As spotted by Tim, ASAN is disabled on clang-9 due to an exact compiler
name match. Let's relax the rule and accept "clang" and "clang-*". More
context here: https://www.mail-archive.com/haproxy@formilux.org/msg36688.html
Willy Tarreau [Wed, 18 Mar 2020 07:20:45 +0000 (08:20 +0100)]
BUILD: makefile: fix expression again to detect ARM platform
I messed up the fix in 67b095e ("BUILD: makefile: fix regex syntax in
ARM platform detection"), I tried it by hand in the shell without "-v"
but left it in the expression. It works on ARM because it only finds
lines starting with '#' but on other platforms it insists for -latomic.
Tim Duesterhus [Tue, 17 Mar 2020 20:08:24 +0000 (21:08 +0100)]
BUG/MINOR: pattern: Do not pass len = 0 to calloc()
The behavior of calloc() when being passed `0` as `nelem` is implementation
defined. It may return a NULL pointer.
Avoid this issue by checking before allocating. While doing so adjust the local
integer variables that are used to refer to memory offsets to `size_t`.
There is a memleak of the entry structure in crtlist_load_cert_dir(), in
the case we can't stat the file, or this is not a regular file. Let's
move the entry allocation so it's done after these tests.
Olivier Houchard [Tue, 17 Mar 2020 17:15:04 +0000 (18:15 +0100)]
MINOR: tasks: Provide the tasklet to the callback.
When tasklet were introduced, it has been decided not to provide the tasklet
to the callback, but NULL instead. While it may have been reasonable back
then, maybe to be able to differentiate a task from a tasklet from the
callback, it also means that we can't access the tasklet from the handler if
the context provided can't be trusted.
As no handler is shared between a task and a tasklet, and there are now
other means of distinguishing between task and tasklet, just pass the
tasklet pointer too.
This may be backported to 2.1, 2.0 and 1.9 if needed.
Olivier Houchard [Thu, 27 Feb 2020 16:26:13 +0000 (17:26 +0100)]
MEDIUM: fd: Introduce a running mask, and use it instead of the spinlock.
In the struct fdtab, introduce a new mask, running_mask. Each thread should
add its bit before using the fd.
Use the running_mask instead of a lock, in fd_insert/fd_delete, we'll just
spin as long as the mask is non-zero, to be sure we access the data
exclusively.
fd_set_running_excl() spins until the mask is 0, fd_set_running() just
adds the thread bit, and fd_clr_running() removes it.
show ssl crt-list [<filename>]: Without a specified filename, display
the list of crt-lists used by the configuration. If a filename is
specified, it will displays the content of this crt-list, with a line
identifier at the beginning of each line. This output must not be used
as a crt-list file.
dump ssl crt-list <filename>: Dump the content of a crt-list, the output
can be used as a crt-list file.
Note: It currently displays the default ssl-min-ver and ssl-max-ver
which are potentialy not in the original file.
Olivier Houchard [Tue, 10 Mar 2020 17:01:25 +0000 (18:01 +0100)]
MINOR: mux_pt: Don't try to remove the connection from the idle list.
Don't bother trying to remove the connection from the idle list, as the
only connections the mux_pt handles are now the TCP-mode connections, and
those are never added to the idle list.
Kevin Zhu [Fri, 13 Mar 2020 02:39:51 +0000 (10:39 +0800)]
BUG/MEDIUM: spoe: dup agent's engine_id string from trash.area
The agent's engine_id forgot to dup from trash, all engine_ids point to
the same address "&trash.area", the engine_id changed at run time and will
double free when release agents and trash.
This bug was introduced by the commit ee3bcddef ("MINOR: tools: add a generic
function to generate UUIDs").
The commit 6be66ec ("MINOR: ssl: directories are loaded like crt-list")
broke the directory loading of the certificates. The <crtlist> wasn't
filled by the crtlist_load_cert_dir() function. And the entries were
not correctly initialized. Leading to a segfault during startup.
Generate a directory cache with the crtlist and crtlist_entry structures.
With this new model, directories are a special case of the crt-lists.
A directory is a crt-list which allows only one occurence of each file,
without SSL configuration (ssl_bind_conf) and without filters.
The crtlist structure defines a crt-list in the HAProxy configuration.
It contains crtlist_entry structures which are the lines in a crt-list
file.
crt-list are now loaded in memory using crtlist and crtlist_entry
structures. The file is read only once. The generation algorithm changed
a little bit, new ckch instances are generated from the crtlist
structures, instead of being generated during the file loading.
The loading function was split in two, one that loads and caches the
crt-list and certificates, and one that looks for a crt-list and creates
the ckch instances.
Filters are also stored in crtlist_entry->filters as a char ** so we can
generate the sni_ctx again if needed. I won't be needed anymore to parse
the sni_ctx to do that.
A crtlist_entry stores the list of all ckch_inst that were generated
from this entry.
MINOR: ssl: pass ckch_inst to ssl_sock_load_ckchs()
Pass a pointer to the struct ckch_inst to the ssl_sock_load_ckchs()
function so we can manipulate the ckch_inst from
ssl_sock_load_cert_list_file() and ssl_sock_load_cert().
Emeric Brun [Mon, 16 Mar 2020 09:51:01 +0000 (10:51 +0100)]
BUG/MEDIUM: peers: resync ended with RESYNC_PARTIAL in wrong cases.
This bug was introduced with peers.c code re-work (7d0ceeec80):
"struct peer" flags are mistakenly checked instead of
"struct peers" flags to check the resync status of the local peer.
The issue was reported here:
https://github.com/haproxy/haproxy/issues/545
This bug affects all branches >= 2.0 and should be backported.
Willy Tarreau [Mon, 16 Mar 2020 07:10:56 +0000 (08:10 +0100)]
CI: travis: revert to clang-7 for BoringSSL tests
Building BoringSSL with clang9 fails:
https://travis-ci.com/github/haproxy/haproxy/jobs/298267505
https://bugs.chromium.org/p/boringssl/issues/detail?id=323
These are purposely there to crash the process at specific locations.
But the annoying warnings do not help with debugging and they are not
even reliable as the compiler may decide to optimize them away. Let's
pass the pointer through DISGUISE() to avoid this.
Willy Tarreau [Sat, 14 Mar 2020 10:03:20 +0000 (11:03 +0100)]
MINOR: use DISGUISE() everywhere we deliberately want to ignore a result
It's more generic and versatile than the previous shut_your_big_mouth_gcc()
that was used to silence annoying warnings as it's not limited to ignoring
syscalls returns only. This allows us to get rid of the aforementioned
function and the shut_your_big_mouth_gcc_int variable, that started to
look ugly in multi-threaded environments.
Willy Tarreau [Sat, 14 Mar 2020 09:58:35 +0000 (10:58 +0100)]
MINOR: debug: consume the write() result in BUG_ON() to silence a warning
Tim reported that BUG_ON() issues warnings on his distro, as the libc marks
some syscalls with __attribute__((warn_unused_result)). Let's pass the
write() result through DISGUISE() to hide it.
Willy Tarreau [Sat, 14 Mar 2020 09:42:26 +0000 (10:42 +0100)]
MINOR: debug: add a new DISGUISE() macro to pass a value as identity
This does exactly the same as ALREADY_CHECKED() but does it inline,
returning an identical copy of the scalar variable without letting
the compiler know how it might have been transformed. This can
forcefully disable certain null-pointer checks or result checks when
known undesirable. Typically forcing a crash with *(DISGUISE(NULL))=0
will not cause a null-deref warning.
This message comes when we run:
haproxy -c -V -f /etc/haproxy/haproxy.cfg
[ALERT] 072/233727 (30865) : parsing [/etc/haproxy/haproxy.cfg:34] : The 'rspirep' directive is not supported anymore sionce HAProxy 2.1. Use 'http-response replace-header' instead.
[ALERT] 072/233727 (30865) : Error(s) found in configuration file : /etc/haproxy/haproxy.cfg
[ALERT] 072/233727 (30865) : Fatal errors found in configuration.
Ilya Shipitsin [Tue, 10 Mar 2020 07:06:11 +0000 (12:06 +0500)]
CLEANUP: assorted typo fixes in the code and comments
These are mostly comments in the code. A few error messages were fixed
and are of low enough importance not to deserve a backport. Some regtests
were also fixed.
Tim Duesterhus [Fri, 13 Mar 2020 11:34:25 +0000 (12:34 +0100)]
CLEANUP: connection: Add blank line after declarations in PP handling
This adds the missing blank lines in `make_proxy_line_v2` and
`conn_recv_proxy`. It also adjusts the type of the temporary variable
used for the return value of `recv` to be `ssize_t` instead of `int`.
Tim Duesterhus [Fri, 13 Mar 2020 11:34:24 +0000 (12:34 +0100)]
MEDIUM: proxy_protocol: Support sending unique IDs using PPv2
This patch adds the `unique-id` option to `proxy-v2-options`. If this
option is set a unique ID will be generated based on the `unique-id-format`
while sending the proxy protocol v2 header and stored as the unique id for
the first stream of the connection.
This feature is meant to be used in `tcp` mode. It works on HTTP mode, but
might result in inconsistent unique IDs for the first request on a keep-alive
connection, because the unique ID for the first stream is generated earlier
than the others.
Now that we can send unique IDs in `tcp` mode the `%ID` log variable is made
available in TCP mode.
Willy Tarreau [Fri, 13 Mar 2020 03:10:31 +0000 (04:10 +0100)]
BUILD: travis-ci: enable regular s390x builds
Previous patch didn't only disable removal of the reg-test but
disabled s390x entirely. Now we enable it like other platforms.
This is an attempt at fixing build issue #504.
Willy Tarreau [Thu, 12 Mar 2020 16:28:01 +0000 (17:28 +0100)]
BUG/MINOR: haproxy/threads: try to make all threads leave together
There's a small issue with soft stop combined with the incoming
connection load balancing. A thread may dispatch a connection to
another one at the moment stopping=1 is set, and the second one could
stop by seeing (jobs - unstoppable_jobs) == 0 in run_poll_loop(),
without ever picking these connections from the queue. This is
visible in that it may occasionally cause a connection drop on
reload since no remaining thread will ever pick that connection
anymore.
In order to address this, this patch adds a stopping_thread_mask
variable by which threads acknowledge their willingness to stop
when their runqueue is empty. And all threads will only stop at
this moment, so that if finally some late work arrives in the
thread's queue, it still has a chance to process it.
Willy Tarreau [Thu, 12 Mar 2020 16:33:29 +0000 (17:33 +0100)]
BUG/MINOR: listener/mq: do not dispatch connections to remote threads when stopping
When stopping there is a risk that other threads are already in the
process of stopping, so let's not add new work in their queue and
instead keep the incoming connection local.
Surprizingly the variable was never initialized, though on most
platforms it's zeroed at boot, and it is relatively harmless
anyway since in the worst case the bits are updated around poll().
This was introduced by commit 79321b95a85 and needs to be backported
as far as 1.9.
Olivier Houchard [Thu, 12 Mar 2020 18:05:39 +0000 (19:05 +0100)]
BUG/MEDIUM: pools: Always update free_list in pool_gc().
In pool_gc(), when we're not using lockless pool, always update free_list,
and read from it the next element to free. As we now unlock the pool while
we're freeing the item, another thread could have updated free_list in our
back. Not doing so could lead to segfaults when pool_gc() is called.
Olivier Houchard [Thu, 12 Mar 2020 14:30:17 +0000 (15:30 +0100)]
BUG/MEDIUM: connections: Don't assume the connection has a valid session.
Don't assume the connection always has a valid session in "owner".
Instead, attempt to retrieve the session from the stream, and modify
the error snapshot code to not assume we always have a session, or the proxy
for the other end.
Willy Tarreau [Wed, 11 Mar 2020 23:31:18 +0000 (00:31 +0100)]
BUG/MEDIUM: random: align the state on 2*64 bits for ARM64
x86_64 and ARM64 do support the double-word atomic CAS. However on
ARM it must be done only on aligned data. The random generator makes
use of such double-word atomic CAS when available but didn't enforce
alignment, which causes ARM64 to crash early in the startup since
commit 52bf839 ("BUG/MEDIUM: random: implement a thread-safe and
process-safe PRNG").
This commit just unconditionally aligns the arrays. It must be
backported to all branches where the commit above is backported
(likely till 2.0).
Remove the list of private connections from server, it has been largely
unused, we only inserted connections in it, but we would never actually
use it.
Olivier Houchard [Wed, 11 Mar 2020 13:57:52 +0000 (14:57 +0100)]
MINOR: lists: Implement function to convert list => mt_list and mt_list => list
Implement mt_list_to_list() and list_to_mt_list(), to be able to convert
from a struct list to a struct mt_list, and vice versa.
This is normally of no use, except for struct connection's list field, that
can go in either a struct list or a struct mt_list.
Willy Tarreau [Wed, 11 Mar 2020 13:10:23 +0000 (14:10 +0100)]
BUILD: stream-int: fix a few includes dependencies
The stream-int code doesn't need to load server.h as it doesn't use
servers at all. However removing this one reveals that proxy.h was
lacking types/checks.h that used to be silently inherited from
types/server.h loaded before in stream_interface.h.
Willy Tarreau [Wed, 11 Mar 2020 10:54:04 +0000 (11:54 +0100)]
BUG/MAJOR: list: fix invalid element address calculation
Ryan O'Hara reported that haproxy breaks on fedora-32 using gcc-10
(pre-release). It turns out that constructs such as:
while (item != head) {
item = LIST_ELEM(item.n);
}
loop forever, never matching <item> to <head> despite a printf there
showing them equal. In practice the problem is that the LIST_ELEM()
macro is wrong, it assigns the subtract of two pointers (an integer)
to another pointer through a cast to its pointer type. And GCC 10 now
considers that this cannot match a pointer and silently optimizes the
comparison away. A tested workaround for this is to build with
-fno-tree-pta. Note that older gcc versions even with -ftree-pta do
not exhibit this rather surprizing behavior.
This patch changes the test to instead cast the null-based address to
an int to get the offset and subtract it from the pointer, and this
time it works. There were just a few places to adjust. Ideally
offsetof() should be used but the LIST_ELEM() API doesn't make this
trivial as it's commonly called with a typeof(ptr) and not typeof(ptr*)
thus it would require to completely change the whole API, which is not
something workable in the short term, especially for a backport.
With this change, the emitted code is subtly different even on older
versions. A code size reduction of ~600 bytes and a total executable
size reduction of ~1kB are expected to be observed and should not be
taken as an anomaly. Typically this loop in dequeue_proxy_listeners() :
while ((listener = MT_LIST_POP(...)))
used to produce this code where the comparison is performed on RAX
while the new offset is assigned to RDI even though both are always
identical:
There is extremely little chance that this fix wakes up a dormant bug as
the emitted code effectively does what the source code intends.
This must be backported to all supported branches (dropping MT_LIST_ELEM
and the spoa_example parts as needed), since the bug is subtle and may
not always be visible even when compiling with gcc-10.