MEDIUM: connections: Attempt to get idle connections from other threads.
In connect_server(), if we no longer have any idle connections for the
current thread, attempt to use the new "takeover" mux method to steal a
connection from another thread.
This should have no impact right now, given no mux implements it.
MINOR: connections: Make the "list" element a struct mt_list instead of list.
Make the "list" element a struct mt_list, and explicitely use
list_from_mt_list to get a struct list * where it is used as such, so that
mt_list_for_each_entry will be usable with it.
Olivier Houchard [Wed, 19 Feb 2020 16:18:57 +0000 (17:18 +0100)]
MINOR: connections: Add a new mux method, "takeover".
Add a new mux method, "takeover", that will attempt to make the current thread
responsible for the connection.
It should return 0 on success, and non-zero on failure.
Implement a new function, fd_takeover(), that lets you become the thread
responsible for the fd. On architectures that do not have a double-width CAS,
use a global rwlock.
fd_set_running() was also changed to be able to compete with fd_takeover(),
either using a dooble-width CAS on both running_mask and thread_mask, or
by claiming a reader on the global rwlock. This extra operation should not
have any measurable impact on modern architectures where threading is
relevant.
Olivier Houchard [Thu, 13 Feb 2020 18:12:07 +0000 (19:12 +0100)]
MEDIUM: servers: Split the connections into idle, safe, and available.
Revamp the server connection lists. We know have 3 lists :
- idle_conns, which contains idling connections
- safe_conns, which contains idling connections that are safe to use even
for the first request
- available_conns, which contains connections that are not idling, but can
still accept new streams (those are HTTP/2 or fastcgi, and are always
considered safe).
Olivier Houchard [Mon, 20 Jan 2020 12:56:01 +0000 (13:56 +0100)]
MEDIUM: sessions: Don't be responsible for connections anymore.
Make it so sessions are not responsible for connection anymore, except for
connections that are private, and thus can't be shared, otherwise, as soon
as a request is done, the session will just add the connection to the
orphan connections pool.
This will break http-reuse safe, but it is expected to be fixed later.
Olivier Houchard [Wed, 18 Mar 2020 14:48:29 +0000 (15:48 +0100)]
MINOR: memory: Change the flush_lock to a spinlock, and don't get it in alloc.
The flush_lock was introduced, mostly to be sure that pool_gc() will never
dereference a pointer that has been free'd. __pool_get_first() was acquiring
the lock to, the fear was that otherwise that pointer could get free'd later,
and then pool_gc() would attempt to dereference it. However, that can not
happen, because the only functions that can free a pointer, when using
lockless pools, are pool_gc() and pool_flush(), and as long as those two
are mutually exclusive, nobody will be able to free the pointer while
pool_gc() attempts to access it.
So change the flush_lock to a spinlock, and don't bother acquire/release
it in __pool_get_first(), that way callers of __pool_get_first() won't have
to wait while the pool is flushed. The worst that can happen is we call
__pool_refill_alloc() while the pool is getting flushed, and memory can
get allocated just to be free'd.
Olivier Houchard [Wed, 18 Mar 2020 12:10:05 +0000 (13:10 +0100)]
BUG/MEDIUM: wdt: Don't ignore WDTSIG and DEBUGSIG in __signal_process_queue().
When running __signal_process_queue(), we ignore most signals. We can't,
however, ignore WDTSIG and DEBUGSIG, otherwise that thread may end up
waiting for another one that could hold a glibc lock, while the other thread
wait for this one to enter debug_handler().
So make sure WDTSIG and DEBUGSIG aren't ignored, if they are defined.
This probably explains the watchdog deadlock described in github issue
Olivier Houchard [Wed, 18 Mar 2020 12:07:19 +0000 (13:07 +0100)]
MINOR: wdt: Move the definitions of WDTSIG and DEBUGSIG into types/signal.h.
Move the definition of WDTSIG and DEBUGSIG from wdt.c and debug.c into
types/signal.h, so that we can access them in another file.
We need those definition to avoid blocking those signals when running
__signal_process_queue().
Willy Tarreau [Wed, 18 Mar 2020 08:35:58 +0000 (09:35 +0100)]
CI: travis: re-enable ASAN on clang
As spotted by Tim, ASAN is disabled on clang-9 due to an exact compiler
name match. Let's relax the rule and accept "clang" and "clang-*". More
context here: https://www.mail-archive.com/haproxy@formilux.org/msg36688.html
Willy Tarreau [Wed, 18 Mar 2020 07:20:45 +0000 (08:20 +0100)]
BUILD: makefile: fix expression again to detect ARM platform
I messed up the fix in 67b095e ("BUILD: makefile: fix regex syntax in
ARM platform detection"), I tried it by hand in the shell without "-v"
but left it in the expression. It works on ARM because it only finds
lines starting with '#' but on other platforms it insists for -latomic.
Tim Duesterhus [Tue, 17 Mar 2020 20:08:24 +0000 (21:08 +0100)]
BUG/MINOR: pattern: Do not pass len = 0 to calloc()
The behavior of calloc() when being passed `0` as `nelem` is implementation
defined. It may return a NULL pointer.
Avoid this issue by checking before allocating. While doing so adjust the local
integer variables that are used to refer to memory offsets to `size_t`.
There is a memleak of the entry structure in crtlist_load_cert_dir(), in
the case we can't stat the file, or this is not a regular file. Let's
move the entry allocation so it's done after these tests.
Olivier Houchard [Tue, 17 Mar 2020 17:15:04 +0000 (18:15 +0100)]
MINOR: tasks: Provide the tasklet to the callback.
When tasklet were introduced, it has been decided not to provide the tasklet
to the callback, but NULL instead. While it may have been reasonable back
then, maybe to be able to differentiate a task from a tasklet from the
callback, it also means that we can't access the tasklet from the handler if
the context provided can't be trusted.
As no handler is shared between a task and a tasklet, and there are now
other means of distinguishing between task and tasklet, just pass the
tasklet pointer too.
This may be backported to 2.1, 2.0 and 1.9 if needed.
Olivier Houchard [Thu, 27 Feb 2020 16:26:13 +0000 (17:26 +0100)]
MEDIUM: fd: Introduce a running mask, and use it instead of the spinlock.
In the struct fdtab, introduce a new mask, running_mask. Each thread should
add its bit before using the fd.
Use the running_mask instead of a lock, in fd_insert/fd_delete, we'll just
spin as long as the mask is non-zero, to be sure we access the data
exclusively.
fd_set_running_excl() spins until the mask is 0, fd_set_running() just
adds the thread bit, and fd_clr_running() removes it.
show ssl crt-list [<filename>]: Without a specified filename, display
the list of crt-lists used by the configuration. If a filename is
specified, it will displays the content of this crt-list, with a line
identifier at the beginning of each line. This output must not be used
as a crt-list file.
dump ssl crt-list <filename>: Dump the content of a crt-list, the output
can be used as a crt-list file.
Note: It currently displays the default ssl-min-ver and ssl-max-ver
which are potentialy not in the original file.
Olivier Houchard [Tue, 10 Mar 2020 17:01:25 +0000 (18:01 +0100)]
MINOR: mux_pt: Don't try to remove the connection from the idle list.
Don't bother trying to remove the connection from the idle list, as the
only connections the mux_pt handles are now the TCP-mode connections, and
those are never added to the idle list.
Kevin Zhu [Fri, 13 Mar 2020 02:39:51 +0000 (10:39 +0800)]
BUG/MEDIUM: spoe: dup agent's engine_id string from trash.area
The agent's engine_id forgot to dup from trash, all engine_ids point to
the same address "&trash.area", the engine_id changed at run time and will
double free when release agents and trash.
This bug was introduced by the commit ee3bcddef ("MINOR: tools: add a generic
function to generate UUIDs").
The commit 6be66ec ("MINOR: ssl: directories are loaded like crt-list")
broke the directory loading of the certificates. The <crtlist> wasn't
filled by the crtlist_load_cert_dir() function. And the entries were
not correctly initialized. Leading to a segfault during startup.
Generate a directory cache with the crtlist and crtlist_entry structures.
With this new model, directories are a special case of the crt-lists.
A directory is a crt-list which allows only one occurence of each file,
without SSL configuration (ssl_bind_conf) and without filters.
The crtlist structure defines a crt-list in the HAProxy configuration.
It contains crtlist_entry structures which are the lines in a crt-list
file.
crt-list are now loaded in memory using crtlist and crtlist_entry
structures. The file is read only once. The generation algorithm changed
a little bit, new ckch instances are generated from the crtlist
structures, instead of being generated during the file loading.
The loading function was split in two, one that loads and caches the
crt-list and certificates, and one that looks for a crt-list and creates
the ckch instances.
Filters are also stored in crtlist_entry->filters as a char ** so we can
generate the sni_ctx again if needed. I won't be needed anymore to parse
the sni_ctx to do that.
A crtlist_entry stores the list of all ckch_inst that were generated
from this entry.
MINOR: ssl: pass ckch_inst to ssl_sock_load_ckchs()
Pass a pointer to the struct ckch_inst to the ssl_sock_load_ckchs()
function so we can manipulate the ckch_inst from
ssl_sock_load_cert_list_file() and ssl_sock_load_cert().
Emeric Brun [Mon, 16 Mar 2020 09:51:01 +0000 (10:51 +0100)]
BUG/MEDIUM: peers: resync ended with RESYNC_PARTIAL in wrong cases.
This bug was introduced with peers.c code re-work (7d0ceeec80):
"struct peer" flags are mistakenly checked instead of
"struct peers" flags to check the resync status of the local peer.
The issue was reported here:
https://github.com/haproxy/haproxy/issues/545
This bug affects all branches >= 2.0 and should be backported.
Willy Tarreau [Mon, 16 Mar 2020 07:10:56 +0000 (08:10 +0100)]
CI: travis: revert to clang-7 for BoringSSL tests
Building BoringSSL with clang9 fails:
https://travis-ci.com/github/haproxy/haproxy/jobs/298267505
https://bugs.chromium.org/p/boringssl/issues/detail?id=323
These are purposely there to crash the process at specific locations.
But the annoying warnings do not help with debugging and they are not
even reliable as the compiler may decide to optimize them away. Let's
pass the pointer through DISGUISE() to avoid this.
Willy Tarreau [Sat, 14 Mar 2020 10:03:20 +0000 (11:03 +0100)]
MINOR: use DISGUISE() everywhere we deliberately want to ignore a result
It's more generic and versatile than the previous shut_your_big_mouth_gcc()
that was used to silence annoying warnings as it's not limited to ignoring
syscalls returns only. This allows us to get rid of the aforementioned
function and the shut_your_big_mouth_gcc_int variable, that started to
look ugly in multi-threaded environments.
Willy Tarreau [Sat, 14 Mar 2020 09:58:35 +0000 (10:58 +0100)]
MINOR: debug: consume the write() result in BUG_ON() to silence a warning
Tim reported that BUG_ON() issues warnings on his distro, as the libc marks
some syscalls with __attribute__((warn_unused_result)). Let's pass the
write() result through DISGUISE() to hide it.
Willy Tarreau [Sat, 14 Mar 2020 09:42:26 +0000 (10:42 +0100)]
MINOR: debug: add a new DISGUISE() macro to pass a value as identity
This does exactly the same as ALREADY_CHECKED() but does it inline,
returning an identical copy of the scalar variable without letting
the compiler know how it might have been transformed. This can
forcefully disable certain null-pointer checks or result checks when
known undesirable. Typically forcing a crash with *(DISGUISE(NULL))=0
will not cause a null-deref warning.
This message comes when we run:
haproxy -c -V -f /etc/haproxy/haproxy.cfg
[ALERT] 072/233727 (30865) : parsing [/etc/haproxy/haproxy.cfg:34] : The 'rspirep' directive is not supported anymore sionce HAProxy 2.1. Use 'http-response replace-header' instead.
[ALERT] 072/233727 (30865) : Error(s) found in configuration file : /etc/haproxy/haproxy.cfg
[ALERT] 072/233727 (30865) : Fatal errors found in configuration.
Ilya Shipitsin [Tue, 10 Mar 2020 07:06:11 +0000 (12:06 +0500)]
CLEANUP: assorted typo fixes in the code and comments
These are mostly comments in the code. A few error messages were fixed
and are of low enough importance not to deserve a backport. Some regtests
were also fixed.
Tim Duesterhus [Fri, 13 Mar 2020 11:34:25 +0000 (12:34 +0100)]
CLEANUP: connection: Add blank line after declarations in PP handling
This adds the missing blank lines in `make_proxy_line_v2` and
`conn_recv_proxy`. It also adjusts the type of the temporary variable
used for the return value of `recv` to be `ssize_t` instead of `int`.
Tim Duesterhus [Fri, 13 Mar 2020 11:34:24 +0000 (12:34 +0100)]
MEDIUM: proxy_protocol: Support sending unique IDs using PPv2
This patch adds the `unique-id` option to `proxy-v2-options`. If this
option is set a unique ID will be generated based on the `unique-id-format`
while sending the proxy protocol v2 header and stored as the unique id for
the first stream of the connection.
This feature is meant to be used in `tcp` mode. It works on HTTP mode, but
might result in inconsistent unique IDs for the first request on a keep-alive
connection, because the unique ID for the first stream is generated earlier
than the others.
Now that we can send unique IDs in `tcp` mode the `%ID` log variable is made
available in TCP mode.
Willy Tarreau [Fri, 13 Mar 2020 03:10:31 +0000 (04:10 +0100)]
BUILD: travis-ci: enable regular s390x builds
Previous patch didn't only disable removal of the reg-test but
disabled s390x entirely. Now we enable it like other platforms.
This is an attempt at fixing build issue #504.
Willy Tarreau [Thu, 12 Mar 2020 16:28:01 +0000 (17:28 +0100)]
BUG/MINOR: haproxy/threads: try to make all threads leave together
There's a small issue with soft stop combined with the incoming
connection load balancing. A thread may dispatch a connection to
another one at the moment stopping=1 is set, and the second one could
stop by seeing (jobs - unstoppable_jobs) == 0 in run_poll_loop(),
without ever picking these connections from the queue. This is
visible in that it may occasionally cause a connection drop on
reload since no remaining thread will ever pick that connection
anymore.
In order to address this, this patch adds a stopping_thread_mask
variable by which threads acknowledge their willingness to stop
when their runqueue is empty. And all threads will only stop at
this moment, so that if finally some late work arrives in the
thread's queue, it still has a chance to process it.
Willy Tarreau [Thu, 12 Mar 2020 16:33:29 +0000 (17:33 +0100)]
BUG/MINOR: listener/mq: do not dispatch connections to remote threads when stopping
When stopping there is a risk that other threads are already in the
process of stopping, so let's not add new work in their queue and
instead keep the incoming connection local.
Surprizingly the variable was never initialized, though on most
platforms it's zeroed at boot, and it is relatively harmless
anyway since in the worst case the bits are updated around poll().
This was introduced by commit 79321b95a85 and needs to be backported
as far as 1.9.
Olivier Houchard [Thu, 12 Mar 2020 18:05:39 +0000 (19:05 +0100)]
BUG/MEDIUM: pools: Always update free_list in pool_gc().
In pool_gc(), when we're not using lockless pool, always update free_list,
and read from it the next element to free. As we now unlock the pool while
we're freeing the item, another thread could have updated free_list in our
back. Not doing so could lead to segfaults when pool_gc() is called.
Olivier Houchard [Thu, 12 Mar 2020 14:30:17 +0000 (15:30 +0100)]
BUG/MEDIUM: connections: Don't assume the connection has a valid session.
Don't assume the connection always has a valid session in "owner".
Instead, attempt to retrieve the session from the stream, and modify
the error snapshot code to not assume we always have a session, or the proxy
for the other end.
Willy Tarreau [Wed, 11 Mar 2020 23:31:18 +0000 (00:31 +0100)]
BUG/MEDIUM: random: align the state on 2*64 bits for ARM64
x86_64 and ARM64 do support the double-word atomic CAS. However on
ARM it must be done only on aligned data. The random generator makes
use of such double-word atomic CAS when available but didn't enforce
alignment, which causes ARM64 to crash early in the startup since
commit 52bf839 ("BUG/MEDIUM: random: implement a thread-safe and
process-safe PRNG").
This commit just unconditionally aligns the arrays. It must be
backported to all branches where the commit above is backported
(likely till 2.0).
Remove the list of private connections from server, it has been largely
unused, we only inserted connections in it, but we would never actually
use it.
Olivier Houchard [Wed, 11 Mar 2020 13:57:52 +0000 (14:57 +0100)]
MINOR: lists: Implement function to convert list => mt_list and mt_list => list
Implement mt_list_to_list() and list_to_mt_list(), to be able to convert
from a struct list to a struct mt_list, and vice versa.
This is normally of no use, except for struct connection's list field, that
can go in either a struct list or a struct mt_list.
Willy Tarreau [Wed, 11 Mar 2020 13:10:23 +0000 (14:10 +0100)]
BUILD: stream-int: fix a few includes dependencies
The stream-int code doesn't need to load server.h as it doesn't use
servers at all. However removing this one reveals that proxy.h was
lacking types/checks.h that used to be silently inherited from
types/server.h loaded before in stream_interface.h.
Willy Tarreau [Wed, 11 Mar 2020 10:54:04 +0000 (11:54 +0100)]
BUG/MAJOR: list: fix invalid element address calculation
Ryan O'Hara reported that haproxy breaks on fedora-32 using gcc-10
(pre-release). It turns out that constructs such as:
while (item != head) {
item = LIST_ELEM(item.n);
}
loop forever, never matching <item> to <head> despite a printf there
showing them equal. In practice the problem is that the LIST_ELEM()
macro is wrong, it assigns the subtract of two pointers (an integer)
to another pointer through a cast to its pointer type. And GCC 10 now
considers that this cannot match a pointer and silently optimizes the
comparison away. A tested workaround for this is to build with
-fno-tree-pta. Note that older gcc versions even with -ftree-pta do
not exhibit this rather surprizing behavior.
This patch changes the test to instead cast the null-based address to
an int to get the offset and subtract it from the pointer, and this
time it works. There were just a few places to adjust. Ideally
offsetof() should be used but the LIST_ELEM() API doesn't make this
trivial as it's commonly called with a typeof(ptr) and not typeof(ptr*)
thus it would require to completely change the whole API, which is not
something workable in the short term, especially for a backport.
With this change, the emitted code is subtly different even on older
versions. A code size reduction of ~600 bytes and a total executable
size reduction of ~1kB are expected to be observed and should not be
taken as an anomaly. Typically this loop in dequeue_proxy_listeners() :
while ((listener = MT_LIST_POP(...)))
used to produce this code where the comparison is performed on RAX
while the new offset is assigned to RDI even though both are always
identical:
There is extremely little chance that this fix wakes up a dormant bug as
the emitted code effectively does what the source code intends.
This must be backported to all supported branches (dropping MT_LIST_ELEM
and the spoa_example parts as needed), since the bug is subtle and may
not always be visible even when compiling with gcc-10.
Willy Tarreau [Tue, 10 Mar 2020 16:54:54 +0000 (17:54 +0100)]
MEDIUM: init: always try to push the FD limit when maxconn is set from -m
When a maximum memory setting is passed to haproxy and maxconn is not set
and ulimit-n is not set, it is expected that maxconn will be set to the
highest value permitted by this memory setting, possibly affecting the
FD limit.
When maxconn was changed to be deduced from the current process's FD limit,
the automatic setting above was partially lost because it now remains
limited to the current FD limit in addition to being limited to the
memory usage. For unprivileged processes it does not change anything,
but for privileged processes the difference is important. Indeed, the
previous behavior ensured that the new FD limit could be enforced on
the process as long as the user had the privilege to do so. Now this
does not happen anymore, and some people rely on this for automatic
sizing in VM environments.
This patch implements the ability to verify if the setting will be
enforceable on the process or not. First it computes maxconn based on
the memory limits alone, then checks if the process is willing to accept
them, otherwise tries again by respecting the process' hard limit.
Thanks to this we now have the best of the pre-2.0 behavior and the
current one, in that privileged users will be able to get as high a
maxconn as they need just based on the memory limit, while unprivileged
users will still get as high a setting as permitted by the intersection
of the memory limit and the process' FD limit.
Ideally, after some observation period, this patch along with the
previous one "MINOR: init: move the maxsock calculation code to
compute_ideal_maxsock()" should be backported to 2.1 and 2.0.
Willy Tarreau [Tue, 10 Mar 2020 16:08:53 +0000 (17:08 +0100)]
MINOR: init: move the maxsock calculation code to compute_ideal_maxsock()
The maxsock value is currently derived from global.maxconn and a few other
settings, some of which also depend on global.maxconn. This makes it
difficult to check if a limit is already too high or not during the maxconn
automatic sizing.
Let's move this code into a new function, compute_ideal_maxsock() which now
takes a maxconn in argument. It performs the same operations and returns
the maxsock value if global.maxconn were to be set to that value. It now
replaces the previous code to compute maxsock.
Olivier Houchard [Tue, 10 Mar 2020 16:41:53 +0000 (17:41 +0100)]
BUG/MEDIUM: mt_lists: Make sure we set the deleted element to NULL;
In MT_LIST_DEL_SAFE(), when the code was changed to use a temporary variable
instead of using the provided pointer directly, we shouldn't have changed
the code that set the pointer to NULL, as we really want the pointer
provided to be nullified, otherwise other parts of the code won't know
we just deleted an element, and bad things will happen.
Olivier Houchard [Tue, 10 Mar 2020 16:39:21 +0000 (17:39 +0100)]
BUG/MINOR: buffers: MT_LIST_DEL_SAFE() expects the temporary pointer.
When calling MT_LIST_DEL_SAFE(), give him the temporary pointer "tmpelt",
as that's what is expected. We want to be able to set that pointer to NULL,
to let other parts of the code know we deleted an element.
CLEANUP: ssl: separate the directory loading in a new function
In order to store and cache the directory loading, the directory loading
was separated from ssl_sock_load_cert() and put in a new function
ssl_sock_load_cert_dir() to be more readable.
Willy Tarreau [Tue, 10 Mar 2020 08:37:08 +0000 (09:37 +0100)]
BUILD: Makefile: the compiler-specific flags should all be in SPEC_CFLAGS
We used to have -Wall -Wextra -Werror in COPTS which are flags fed by
the various USE_* options, and all other warnings in SPEC_CFLAGS. This
makes it impossible to remove these -W* entries (typically -Wextra).
Let's move these 3 flags into SPEC_CFLAGS where they should have been.
Now it's possible to override SPEC_CFLAGS to match any compiler's
specificities, or to clear all warnings at once, or to replace them
all with "-w" to silence warnings.
Willy Tarreau [Tue, 10 Mar 2020 08:26:17 +0000 (09:26 +0100)]
BUILD: wdt: only test for SI_TKILL when compiled with thread support
SI_TKILL is not necessarily defined on older systems and is used only
with the pthread_kill() call a few lines below, so it should also be
subject to the USE_THREAD condition.
Willy Tarreau [Tue, 10 Mar 2020 06:51:48 +0000 (07:51 +0100)]
BUILD: make dladdr1 depend on glibc version and not __USE_GNU
Technically speaking the call was implemented in glibc 2.3 so we must
rely on this and not on __USE_GNU which is an internal define of glibc
to track use of GNU_SOURCE.
Willy Tarreau [Tue, 10 Mar 2020 06:20:10 +0000 (07:20 +0100)]
CLEANUP: remove support for USE_MY_SPLICE
The splice() syscall has been supported in glibc since version 2.5 issued
in 2006 and is present on supported systems so there's no need for having
our own arch-specific syscall definitions anymore.
Willy Tarreau [Tue, 10 Mar 2020 06:08:10 +0000 (07:08 +0100)]
CLEANUP: remove support for USE_MY_EPOLL
This was made to support epoll on patched 2.4 kernels, and on early 2.6
using alternative libcs thanks to the arch-specific syscall definitions.
All the features we support have been around since 2.6.2 and present in
glibc since 2.3.2, neither of which are found in field anymore. Let's
simply drop this and use epoll normally.
Willy Tarreau [Tue, 10 Mar 2020 06:02:46 +0000 (07:02 +0100)]
CLEANUP: drop support for USE_MY_ACCEPT4
The accept4() syscall has been present for a while now, there is no more
reason for maintaining our own arch-specific syscall implementation for
systems lacking it in libc but having it in the kernel.
Willy Tarreau [Tue, 10 Mar 2020 05:48:17 +0000 (06:48 +0100)]
CLEANUP: remove support for Linux i686 vsyscalls
This was introduced 10 years ago to squeeze a few CPU cycles per syscall
on 32-bit x86 machines and was already quite old by then, requiring to
explicitly enable support for this in the kernel. We don't even know if
it still builds, let alone if it works at all on recent kernels! Let's
completely drop this now.
Lukas Tribus [Mon, 9 Mar 2020 23:56:09 +0000 (00:56 +0100)]
DOC: ssl: clarify security implications of TLS tickets
Clarifies security implications of TLS ticket usage when not
rotating TLS ticket keys, after commit 7b5e136458 ("DOC:
improve description of no-tls-tickets").
BUG/MINOR: ssl/cli: sni_ctx' mustn't always be used as filters
Since commit 244b070 ("MINOR: ssl/cli: support crt-list filters"),
HAProxy generates a list of filters based on the sni_ctx in memory.
However it's not always relevant, sometimes no filters were configured
and the CN/SAN in the new certificate are not the same.
This patch fixes the issue by using a flag filters in the ckch_inst, so
we are able to know if there were filters or not. In the late case it
uses the CN/SAN of the new certificate to generate the sni_ctx.
note: filters are still only used in the crt-list atm.
Willy Tarreau [Mon, 9 Mar 2020 13:57:20 +0000 (14:57 +0100)]
[RELEASE] Released version 2.2-dev4
Released version 2.2-dev4 with the following main changes :
- MEDIUM: buffer: remove the buffer_wq lock
- MINOR: ssl: move find certificate chain code to its own function
- MINOR: ssl: resolve issuers chain later
- MINOR: ssl: resolve ocsp_issuer later
- MINOR: ssl/cli: "show ssl cert" command should print the "Chain Filename:"
- BUG/MINOR: h2: reject again empty :path pseudo-headers
- MINOR: wdt: always clear sigev_value to make valgrind happy
- MINOR: epoll: always initialize all of epoll_event to please valgrind
- BUG/MINOR: sample: Make sure to return stable IDs in the unique-id fetch
- BUG/MEDIUM: ssl: chain must be initialized with sk_X509_new_null()
- BUILD: cirrus-ci: suppress OS version check when installing packages
- BUG/MINOR: http_ana: make sure redirect flags don't have overlapping bits
- CLEANUP: fd: remove the FD_EV_STATUS aggregate
- CLEANUP: fd: remove some unneeded definitions of FD_EV_* flags
- MINOR: fd: merge the read and write error bits into RW error
- BUG/MINOR: dns: ignore trailing dot
- MINOR: contrib/prometheus-exporter: Add the last heathcheck duration metric
- BUG/MINOR: http-htx: Do case-insensive comparisons on Host header name
- MINOR: mux-h1: Remove useless case-insensitive comparisons
- MINOR: rawsock: always mark the FD not ready when we're certain it happens
- MEDIUM: connection: make the subscribe() call able to wakeup if ready
- MEDIUM: connection: don't stop receiving events in the FD handler
- MEDIUM: mux-h1: do not blindly wake up the tasklet at end of request anymore
- BUG/MINOR: arg: don't reject missing optional args
- MINOR: tools: make sure to correctly check the returned 'ms' in date2std_log
- MINOR: debug: report the task handler's pointer relative to main
- BUG/MEDIUM: debug: make the debug_handler check for the thread in threads_to_dump
- MINOR: haproxy: export main to ease access from debugger
- MINOR: haproxy: export run_poll_loop
- MINOR: task: export run_tasks_from_list
- BUILD: tools: remove obsolete and conflicting trace() from standard.c
- MINOR: tools: add new function dump_addr_and_bytes()
- MINOR: tools: add resolve_sym_name() to resolve function pointers
- MINOR: debug: use resolve_sym_name() to dump task handlers
- MINOR: cli: make "show fd" rely on resolve_sym_name()
- MEDIUM: debug: add support for dumping backtraces of stuck threads
- MINOR: debug: call backtrace() once upon startup
- MINOR: ssl: add "ca-verify-file" directive
- BUG/MINOR: wdt: do not return an error when the watchdog couldn't be enabled
- BUILD: Makefile: include librt before libpthread
- MEDIUM: wdt: fall back to CLOCK_REALTIME if CLOCK_THREAD_CPUTIME is not available
- MINOR: wdt: do not depend on USE_THREAD
- MINOR: debug: report the number of entries in the backtrace
- MINOR: debug: improve backtrace() on aarch64 and possibly other systems
- MINOR: debug: use our own backtrace function on clang+x86_64
- MINOR: debug: dump the whole trace if we can't spot the starting point
- BUILD: tools: unbreak resolve_sym_name() on non-GNU platforms
- BUILD: tools: rely on __ELF__ not USE_DL to enable use of dladdr()
- CLEANUP: contrib/spoa_example: Fix several typos
- BUILD: makefile: do not modify the build options during make reg-tests
- BUG/MEDIUM: connection: stop polling for sending when the event is ready
- MEDIUM: stream-int: make sure to try to immediately validate the connection
- MINOR: tcp/uxst/sockpair: only ask for I/O when really waiting for a connect()
- MEDIUM: connection: only call ->wake() for connect() without I/O
- OPTIM: connection: disable receiving on disabled events when the run queue is too high
- OPTIM: mux-h1: subscribe rather than waking up at a few other places
- REGTEST: Add unique-id reg-test
- MINOR: stream: Add stream_generate_unique_id function
- MINOR: stream: Use stream_generate_unique_id
- BUG/MINOR: connection/debug: do not enforce !event_type on subscribe() anymore
- MINOR: ssl/cli: support crt-list filters
- MINOR: ssl: reach a ckch_store from a sni_ctx
- DOC: fix incorrect indentation of http_auth_*
- BUG/MINOR: ssl-sock: do not return an uninitialized pointer in ckch_inst_sni_ctx_to_sni_filters
- MINOR: debug: add CLI command "debug dev write" to write an arbitrary size
- MINOR: ist: Add `IST_NULL` macro
- MINOR: ist: Add `int isttest(const struct ist)`
- MINOR: ist: Add `struct ist istalloc(size_t)` and `void istfree(struct ist*)`
- CLEANUP: Use `isttest()` and `istfree()`
- MINOR: ist: Add `struct ist istdup(const struct ist)`
- MINOR: proxy: Make `header_unique_id` a `struct ist`
- MEDIUM: stream: Make the `unique_id` member of `struct stream` a `struct ist`
- OPTIM: startup: fast unique_id allocation for acl.
- DOC: configuration.txt: fix various typos
- DOC: assorted typo fixes in the documentation and Makefile
- BUG/MINOR: init: make the automatic maxconn consider the max of soft/hard limits
- BUG/MAJOR: proxy_protocol: Properly validate TLV lengths
- CLEANUP: proxy_protocol: Use `size_t` when parsing TLVs
- MINOR: buf: Add function to insert a string at an absolute offset in a buffer
- MINOR: htx: Add a function to return a block at a specific offset
- MINOR: htx: Use htx_find_offset() to truncate an HTX message
- MINOR: flt_trace: Use htx_find_offset() to get the available payload length
- BUG/MINOR: filters: Use filter offset to decude the amount of forwarded data
- BUG/MINOR: filters: Forward everything if no data filters are called
- BUG/MEDIUM: cache/filters: Fix loop on HTX blocks caching the response payload
- BUG/MEDIUM: compression/filters: Fix loop on HTX blocks compressing the payload
- BUG/MINOR: http-ana: Reset request analysers on a response side error
- BUG/MINOR: lua: Abort when txn:done() is called from a Lua action
- BUG/MINOR: lua: Ignore the reserve to know if a channel is full or not
- MINOR: lua: Add function to know if a channel is a response one
- MINOR: lua: Stop using the lua txn in hlua_http_get_headers()
- MINOR: lua: Stop using the lua txn in hlua_http_rep_hdr()
- MINOR: lua: Stop using lua txn in hlua_http_del_hdr() and hlua_http_add_hdr()
- MINOR: lua: Remove the flag HLUA_TXN_HTTP_RDY
- MINOR: lua: Rename hlua_action_wake_time() to hlua_set_wake_time()
- BUG/MINOR: lua: Init the lua wake_time value before calling a lua function
- BUG/MINOR: http-rules: Return ACT_RET_ABRT to abort a transaction
- BUG/MINOR: http-rules: Preserve FLT_END analyzers on reject action
- BUG/MINOR: http-rules: Fix a typo in the reject action function
- MINOR: cache/filters: Initialize the cache filter when stream is created
- MINOR: compression/filters: Initialize the comp filter when stream is created
- BUG/MINOR: rules: Preserve FLT_END analyzers on silent-drop action
- BUG/MINOR: rules: Return ACT_RET_ABRT when a silent-drop action is executed
- BUG/MINOR: rules: Increment be_counters if backend is assigned for a silent-drop
- BUG/MINOR: http-rules: Abort transaction when a redirect is applied on response
- BUILD: buffer: types/{ring.h,checks.h} should include buf.h, not buffer.h
- BUILD: ssl: include mini-clist.h
- BUILD: global: must not include common/standard.h but only types/freq_ctr.h
- BUILD: freq_ctr: proto/freq_ctr needs to include common/standard.h
- BUILD: listener: types/listener.h must not include standard.h
- BUG/MEDIUM: random: initialize the random pool a bit better
- BUG/MEDIUM: random: implement per-thread and per-process random sequences
- Revert "BUG/MEDIUM: random: implement per-thread and per-process random sequences"
- BUILD: cirrus-ci: get rid of unstable freebsd images
- MINOR: tools: add 64-bit rotate operators
- BUG/MEDIUM: random: implement a thread-safe and process-safe PRNG
- MINOR: backend: use a single call to ha_random32() for the random LB algo
- BUG/MINOR: checks/threads: use ha_random() and not rand()
- MINOR: sample: make all bits random on the rand() sample fetch
- MINOR: tools: add a generic function to generate UUIDs
- DOC: fix typo about no-tls-tickets
- DOC: improve description of no-tls-tickets
- DOC: assorted typo fixes in the documentation
- CLEANUP: remove unused code in 'my_ffsl/my_flsl' functions
Willy Tarreau [Sun, 8 Mar 2020 16:48:17 +0000 (17:48 +0100)]
MINOR: tools: add a generic function to generate UUIDs
We currently have two UUID generation functions, one for the sample
fetch and the other one in the SPOE filter. Both were a bit complicated
since they were made to support random() implementations returning an
arbitrary number of bits, and were throwing away 33 bits every 64. Now
we don't need this anymore, so let's have a generic function consuming
64 bits at once and use it as appropriate.
Willy Tarreau [Sun, 8 Mar 2020 17:01:10 +0000 (18:01 +0100)]
MINOR: sample: make all bits random on the rand() sample fetch
The rand() sample fetch supports being limited to a certain range, but
it only uses 31 bits and scales them as requested, which means that when
the requested output range is larger than 31 bits, the least significant
one is not random and may even be constant.
Let's make use of the whole 32 bits now that we have access ot them.
Willy Tarreau [Sun, 8 Mar 2020 16:53:53 +0000 (17:53 +0100)]
BUG/MINOR: checks/threads: use ha_random() and not rand()
In order to honor spread_checks we currently call rand() which is not
thread safe and which must never turn its internal state to zero. This
is not thread safe, let's use ha_random() instead. This is a complement
to commimt 52bf839394 ("BUG/MEDIUM: random: implement a thread-safe and
process-safe PRNG") and may be backported with it.
Willy Tarreau [Sun, 8 Mar 2020 16:31:39 +0000 (17:31 +0100)]
MINOR: backend: use a single call to ha_random32() for the random LB algo
For the random LB algorithm we need a random 32-bit hashing key that used
to be made of two calls to random(). Now we can simply perform a single
call to ha_random32() and get rid of the useless operations.
Willy Tarreau [Sat, 7 Mar 2020 23:42:37 +0000 (00:42 +0100)]
BUG/MEDIUM: random: implement a thread-safe and process-safe PRNG
This is the replacement of failed attempt to add thread safety and
per-process sequences of random numbers initally tried with commit 1c306aa84d ("BUG/MEDIUM: random: implement per-thread and per-process
random sequences").
This new version takes a completely different approach and doesn't try
to work around the horrible OS-specific and non-portable random API
anymore. Instead it implements "xoroshiro128**", a reputedly high
quality random number generator, which is one of the many variants of
xorshift, which passes all quality tests and which is described here:
http://prng.di.unimi.it/
While not cryptographically secure, it is fast and features a 2^128-1
period. It supports fast jumps allowing to cut the period into smaller
non-overlapping sequences, which we use here to support up to 2^32
processes each having their own, non-overlapping sequence of 2^96
numbers (~7*10^28). This is enough to provide 1 billion randoms per
second and per process for 2200 billion years.
The implementation was made thread-safe either by using a double 64-bit
CAS on platforms supporting it (x86_64, aarch64) or by using a local
lock for the time needed to perform the shift operations. This ensures
that all threads pick numbers from the same pool so that it is not
needed to assign per-thread ranges. For processes we use the fast jump
method to advance the sequence by 2^96 for each process.
Before this patch, the following config:
global
nbproc 8
frontend f
bind :4445
mode http
log stdout format raw daemon
log-format "%[uuid] %pid"
redirect location /
There are multiple benefits to using this method. First, it doesn't
depend anymore on a non-portable API. Second it's thread safe. Third it
is fast and more proven than any hack we could attempt to try to work
around the deficiencies of the various implementations around.
This commit depends on previous patches "MINOR: tools: add 64-bit rotate
operators" and "BUG/MEDIUM: random: initialize the random pool a bit
better", all of which will need to be backported at least as far as
version 2.0. It doesn't require to backport the build fixes for circular
include files dependecy anymore.
It breaks the build on all non-glibc platforms. I got confused by the
man page (which possibly is the most confusing man page I've ever read
about a standard libc function) and mistakenly understood that random_r
was portable, especially since it appears in latest freebsd source as
well but not in released versions, and with a slightly different API :-/
We need to find a different solution with a fallback. Among the
possibilities, we may reintroduce this one with a fallback relying on
locking around the standard functions, keeping fingers crossed for no
other library function to call them in parallel, or we may also provide
our own PRNG, which is not necessarily more difficult than working
around the totally broken up design of the portable API.
Willy Tarreau [Fri, 6 Mar 2020 18:04:55 +0000 (19:04 +0100)]
BUG/MEDIUM: random: implement per-thread and per-process random sequences
As mentioned in previous patch, the random number generator was never
made thread-safe, which used not to be a problem for health checks
spreading, until the uuid sample fetch function appeared. Currently
it is possible for two threads or processes to produce exactly the
same UUID. In fact it's extremely likely that this will happen for
processes, as can be seen with this config:
global
nbproc 8
frontend f
bind :4445
mode http
log stdout daemon format raw
log-format "%[uuid] %pid"
redirect location /
What this patch does is to use a distinct per-thread and per-process
seed to make sure the same sequences will not appear, and will then
extend these seeds by "burning" a number of randoms that depends on
the global random seed, the thread ID and the process ID. This adds
roughly 20 extra bits of randomness, resulting in 52 bits total per
thread and per process.
It only takes a few milliseconds to burn these randoms and given
that threads start with a different seed, we know they will not
catch each other. So these random extra bits are essentially added
to ensure randomness between boots and cluster instances.
This replaces all uses of random() with ha_random() which uses the
thread-local state.
This must be backported as far as 2.0 or any version having the
UUID sample-fetch function since it's the main victim here.
It's important to note that this patch, in addition to depending on
the previous one "BUG/MEDIUM: init: initialize the random pool a bit
better", also depends on the preceeding build fixes to address a
circular dependency issue in the include files that prevented it
from building. Part or all of these patches may need to be backported
or adapted as well.
Willy Tarreau [Fri, 6 Mar 2020 17:57:15 +0000 (18:57 +0100)]
BUG/MEDIUM: random: initialize the random pool a bit better
Since the UUID sample fetch was created, some people noticed that in
certain virtualized environments they manage to get exact same UUIDs
on different instances started exactly at the same moment. It turns
out that the randoms were only initialized to spread the health checks
originally, not to provide "clean" randoms.
This patch changes this and collects more randomness from various
sources, including existing randoms, /dev/urandom when available,
RAND_bytes() when OpenSSL is available, as well as the timing for such
operations, then applies a SHA1 on all this to keep a 160 bits random
seed available, 32 of which are passed to srandom().
It's worth mentioning that there's no clean way to pass more than 32
bits to srandom() as even initstate() provides an opaque state that
must absolutely not be tampered with since known implementations
contain state information.
At least this allows to have up to 4 billion different sequences
from the boot, which is not that bad.
Note that the thread safety was still not addressed, which is another
issue for another patch.
This must be backported to all versions containing the UUID sample
fetch function, i.e. as far as 2.0.