MINOR: ssl/cli: update ocsp/issuer/sctl file from the CLI
It is now possible to update new parts of a CKCH from the CLI.
Currently you will be able to update a PEM (by default), a OCSP response
in base64, an issuer file, and a SCTL file.
Each update will creates a new CKCH and new sni_ctx structure so we will
need a "commit" command later to apply several changes and create the
sni_ctx only once.
Split the ssl_sock_load_crt_file_into_ckch() in two functions:
- ssl_sock_load_files_into_ckch() which is dedicated to opening every
files related to a filename during the configuration parsing (PEM, sctl,
ocsp, issuer etc)
- ssl_sock_load_pem_into_ckch() which is dedicated to opening a PEM,
either in a file or a buffer
Willy Tarreau [Wed, 23 Oct 2019 09:06:35 +0000 (11:06 +0200)]
BUG/MINOR: mux-h2: do not emit logs on backend connections
The logs were added to the H2 mux so that we can report logs in case
of errors that prevent a stream from being created, but as a side effect
these logs are emitted twice for backend connections: once by the H2 mux
itself and another time by the upper layer stream. It can even happen
more with connection retries.
This patch makes sure we do not emit logs for backend connections.
Willy Tarreau [Wed, 23 Oct 2019 04:59:31 +0000 (06:59 +0200)]
BUG/MEDIUM: pattern: make the pattern LRU cache thread-local and lockless
As reported in issue #335, a lot of contention happens on the PATLRU lock
when performing expensive regex lookups. This is absurd since the purpose
of the LRU cache was to have a fast cache for expressions, thus the cache
must not be shared between threads and must remain lockless.
This commit makes the LRU cache thread-local and gets rid of the PATLRU
lock. A test with 7 threads on 4 cores climbed from 67kH/s to 369kH/s,
or a scalability factor of 5.5.
Given the huge performance difference and the regression caused to
users migrating from processes to threads, this should be backported at
least to 2.0.
Thanks to Brian Diekelman for his detailed report about this regression.
Willy Tarreau [Wed, 23 Oct 2019 04:21:05 +0000 (06:21 +0200)]
BUG/MINOR: stick-table: fix an incorrect 32 to 64 bit key conversion
As reported in issue #331, the code used to cast a 32-bit to a 64-bit
stick-table key is wrong. It only copies the 32 lower bits in place on
little endian machines or overwrites the 32 higher ones on big endian
machines. It ought to simply remove the wrong cast dereference.
This bug was introduced when changing stick table keys to samples in
1.6-dev4 by commit bc8c404449 ("MAJOR: stick-tables: use sample types
in place of dedicated types") so it the fix must be backported as far
as 1.6.
Emeric Brun [Tue, 8 Oct 2019 16:27:37 +0000 (18:27 +0200)]
BUG/MINOR: ssl: fix memcpy overlap without consequences.
A trick is used to set SESSION_ID, and SESSION_ID_CONTEXT lengths
to 0 and avoid ASN1 encoding of these values.
There is no specific function to set the length of those parameters
to 0 so we fake this calling these function to a different value
with the same buffer but a length to zero.
But those functions don't seem to check the length of zero before
performing a memcpy of length zero but with src and dst buf on the
same pointer, causing valgrind to bark.
So the code was re-work to pass them different pointers even
if buffer content is un-used.
In a second time, reseting value, a memcpy overlap
happened on the SESSION_ID_CONTEXT. It was re-worked and this is
now reset using the constant global value SHCTX_APPNAME which is a
different pointer with the same content.
This patch should be backported in every version since ssl
support was added to haproxy if we want valgrind to shut up.
This is tracked in github issue #56.
Baptiste Assmann [Mon, 21 Oct 2019 13:13:48 +0000 (15:13 +0200)]
BUG/MINOR: dns: allow srv record weight set to 0
Processing of SRV record weight was inaccurate and when a SRV record's
weight was set to 0, HAProxy enforced it to '1'.
This patch aims at fixing this without breaking compability with
previous behavior.
Willy Tarreau [Tue, 22 Oct 2019 08:42:10 +0000 (10:42 +0200)]
REGTESTS: make seamless-reload depend on 1.9 and above
Since latest updates, vtest requires the master CLI when running in
master-worker mode, and this one is only available starting with 1.9.
The seamless reload test is the only one depending on this and now
fails on 1.8, so let's adjust it accordingly.
When running haproxy -c, the cache parser is trying to allocate the size
of the cache. This can be a problem in an environment where the RAM is
limited.
This patch moves the cache allocation in the post_check callback which
is not executed during a -c.
This patch may be backported at least to 2.0 and 1.9. In 1.9, the callbacks
registration mechanism is not the same. So the patch will have to be adapted. No
need to backport it to 1.8, the code is probably too different.
BUG/MINOR: stick-table: Never exceed (MAX_SESS_STKCTR-1) when fetching a stkctr
When a stick counter is fetched, it is important that the requested counter does
not exceed (MAX_SESS_STKCTR -1). Actually, there is no bug with a default build
because, by construction, MAX_SESS_STKCTR is defined to 3 and we know that we
never exceed the max value. scN_* sample fetches are numbered from 0 to 2. For
other sample fetches, the value is tested.
But there is a bug if MAX_SESS_STKCTR is set to a lower value. For instance
1. In this case the counters sc1_* and sc2_* may be undefined.
This patch fixes the issue #330. It must be backported as far as 1.7.
BUG/MINOR: ssl: Fix fd leak on error path when a TLS ticket keys file is parsed
When an error occurred in the function bind_parse_tls_ticket_keys(), during the
configuration parsing, the opened file is not always closed. To fix the bug, all
errors are catched at the same place, where all ressources are released.
This patch fixes the bug #325. It must be backported as far as 1.7.
BUG/MINOR: mworker/cli: reload fail with inherited FD
When using the master CLI with 'fd@', during a reload, the master CLI
proxy is stopped. Unfortunately if this is an inherited FD it is closed
too, and the master CLI won't be able to bind again during the
re-execution. It lead the master to fallback in waitpid mode.
This patch forbids the inherited FDs in the master's listeners to be
closed during a proxy_stop().
This patch is mandatory to use the -W option in VTest versions that contain the
-mcli feature.
(https://github.com/vtest/VTest/commit/86e65f1024453b1074d239a88330b5150d3e44bb)
Emeric Brun [Thu, 17 Oct 2019 12:53:03 +0000 (14:53 +0200)]
BUG/MEDIUM: ssl: 'tune.ssl.default-dh-param' value ignored with openssl > 1.1.1
If openssl 1.1.1 is used, c2aae74f0 commit mistakenly enables DH automatic
feature from openssl instead of ECDH automatic feature. There is no impact for
the ECDH one because the feature is always enabled for that version. But doing
this, the 'tune.ssl.default-dh-param' was completely ignored for DH parameters.
This patch fix the bug calling 'SSL_CTX_set_ecdh_auto' instead of
'SSL_CTX_set_dh_auto'.
Currently some users may use a 2048 DH bits parameter, thinking they're using a
1024 bits one. Doing this, they may experience performance issue on light hardware.
This patch warns the user if haproxy fails to configure the given DH parameter.
In this case and if openssl version is > 1.1.0, haproxy will let openssl to
automatically choose a default DH parameter. For other openssl versions, the DH
ciphers won't be usable.
A commonly case of failure is due to the security level of openssl.cnf
which could refuse a 1024 bits DH parameter for a 2048 bits key:
Emeric Brun [Thu, 17 Oct 2019 11:27:40 +0000 (13:27 +0200)]
CLEANUP: ssl: make ssl_sock_load_dh_params handle errcode/warn
ssl_sock_load_dh_params used to return >0 or -1 to indicate success
or failure. Make it return a set of ERR_* instead so that its callers
can transparently report its status. Given that its callers only used
to know about ERR_ALERT | ERR_FATAL, this is the only code returned
for now. An error message was added in the case of failure and the
comment was updated.
Emeric Brun [Thu, 17 Oct 2019 11:25:14 +0000 (13:25 +0200)]
CLEANUP: ssl: make ssl_sock_put_ckch_into_ctx handle errcode/warn
ssl_sock_put_ckch_into_ctx used to return 0 or >0 to indicate success
or failure. Make it return a set of ERR_* instead so that its callers
can transparently report its status. Given that its callers only used
to know about ERR_ALERT | ERR_FATAL, this is the only code returned
for now. And a comment was updated.
Emeric Brun [Thu, 17 Oct 2019 11:16:58 +0000 (13:16 +0200)]
CLEANUP: ssl: make ckch_inst_new_load_(multi_)store handle errcode/warn
ckch_inst_new_load_store() and ckch_inst_new_load_multi_store used to
return 0 or >0 to indicate success or failure. Make it return a set of
ERR_* instead so that its callers can transparently report its status.
Given that its callers only used to know about ERR_ALERT | ERR_FATAL,
his is the only code returned for now. And the comment was updated.
Willy Tarreau [Wed, 16 Oct 2019 15:06:25 +0000 (17:06 +0200)]
CLEANUP: ssl: make ssl_sock_load_ckchs() return a set of ERR_*
ssl_sock_load_ckchs() used to return 0 or >0 to indicate success or
failure even though this was not documented. Make it return a set of
ERR_* instead so that its callers can transparently report its status.
Given that its callers only used to know about ERR_ALERT | ERR_FATAL,
this is the only code returned for now. And a comment was added.
Willy Tarreau [Wed, 16 Oct 2019 14:42:19 +0000 (16:42 +0200)]
CLEANUP: ssl: make ssl_sock_load_cert*() return real error codes
These functions were returning only 0 or 1 to mention success or error,
and made it impossible to return a warning. Let's make them return error
codes from ERR_* and map all errors to ERR_ALERT|ERR_FATAL for now since
this is the only code that was set on non-zero return value.
In addition some missing comments were added or adjusted around the
functions' return values.
REGTEST: mcli/mcli_show_info: launch a 'show info' on the master CLI
This test launches a HAProxy process in master worker with 'nbproc 4'.
It sends a "show info" to the process 3 and verify that the right
process replied.
This regtest depends on the support of the master CLI for VTest.
Olivier Houchard [Fri, 18 Oct 2019 12:18:29 +0000 (14:18 +0200)]
BUG/MEDIUM: mux_pt: Only call the wake emthod if nobody subscribed to receive.
In mux_pt_io_cb(), instead of always calling the wake method, only do so
if nobody subscribed for receive. If we have a subscription, just wake the
associated tasklet up.
Olivier Houchard [Fri, 18 Oct 2019 11:56:40 +0000 (13:56 +0200)]
BUG/MEDIUM: mux_pt: Don't destroy the connection if we have a stream attached.
There's a small window where the mux_pt tasklet may be woken up, and thus
mux_pt_io_cb() get scheduled, and then the connection is attached to a new
stream. If this happen, don't do anything, and just let the stream know
by calling its wake method. If the connection had an error, the stream should
take care of destroying it by calling the detach method.
This reverts commit "BUG/MEDIUM: mux_pt: Make sure we don't have a
conn_stream before freeing.".
mux_pt_io_cb() is only used if we have no associated stream, so we will
never have a cs, so there's no need to check that, and we of course have to
destroy the mux in mux_pt_detach() if we have no associated session, or if
there's an error on the connection.
Willy Tarreau [Fri, 18 Oct 2019 04:43:53 +0000 (06:43 +0200)]
BUG/MEDIUM: task: make tasklets either local or shared but not both at once
Tasklets may be woken up to run on the calling thread or by a specific thread
(the owner). But since we use a non-thread safe mechanism when the calling
thread is also the for the owner, there may sometimes be collisions when two
threads decide to wake the same tasklet up at the same time and one of them
is the owner.
This is more of a matter of usage than code, in that a tasklet usually is
designed to be woken up and executed on the calling thread only (most cases)
or on a specific thread. Thus it is a property of the tasklet itself as this
solely depends how the code is constructed around it.
This patch performs a small change to address this. By default tasklet_new()
creates a "local" tasklet, which will run on the calling thread, like in 2.0.
This is done by setting tl->tid to a negative value. If the caller wants the
tasklet to run exclusively on a specific thread, it just has to set tl->tid,
which is already what shared tasklet callers do anyway.
Willy Tarreau [Fri, 18 Oct 2019 06:50:49 +0000 (08:50 +0200)]
BUG/MAJOR: idle conns: schedule the cleanup task on the correct threads
The idle cleanup tasks' masks are wrong for threads 32 to 64, which
causes the wrong thread to wake up and clean the connections that it
does not own, with a risk of crash or infinite loop depending on
concurrent accesses. For thread 32, any thread between 32 and 64 will
be woken up, but for threads 33 to 64, in fact threads 1 to 32 will
run the task instead.
This issue only affects deployments enabling more than 32 threads. While
is it not common in 1.9 where this has to be explicit, and can easily be
dealt with by lowering the number of threads, it can be more common in
2.0 since by default the thread count is determined based on the number
of available processors, hence the MAJOR tag which is mostly relevant
to 2.x.
The problem was first introduced into 1.9-dev9 by commit 0c18a6fe3
("MEDIUM: servers: Add a way to keep idle connections alive.") and was
later moved to cfgparse.c by commit 980855bd9 ("BUG/MEDIUM: server:
initialize the orphaned conns lists and tasks at the end").
This patch needs to be backported as far as 1.9, with care as 1.9 is
slightly different there (uses idle_task[] instead of idle_conn_cleanup[]
like in 2.x).
Willy Tarreau [Fri, 18 Oct 2019 06:45:41 +0000 (08:45 +0200)]
BUG/MEDIUM: tasklet: properly compute the sleeping threads mask in tasklet_wakeup()
The use of ~(1 << tid) to compute the sleeping_mask in tasklet_wakeup()
will result in breakage above 32 threads, because (1<<31) = 0xFFFFFFFF8000000,
and upper values will lead to theorically undefined results, but practically
will wrap over 0x1 to 0x80000000 again and indicate wrong sleeping masks. It
seems that the main visible effect maybe extra latency on some threads or
short CPU loops on others.
Olivier Houchard [Thu, 17 Oct 2019 16:02:53 +0000 (18:02 +0200)]
BUG/MEDIUM: mux_pt: Make sure we don't have a conn_stream before freeing.
On error, make sure we don't have a conn_stream before freeing the connection
and the associated mux context. Otherwise a stream will still reference
the connection, and attempt to use it.
If we still have a conn_stream, it will properly be free'd when the detach
method is called, anyway.
Olivier Houchard [Thu, 17 Oct 2019 15:46:01 +0000 (17:46 +0200)]
BUG/MEDIUM: lists: Handle 1-element-lists in MT_LIST_BEHEAD().
In MT_LIST_BEHEAD(), explicitely set the next element of the prev to NULL,
instead of setting it to the prev of the next. If we only had one element,
then we'd set the next and the prev to the element itself, and thus it would
make the element appear to be outside any list.
BUG/MINOR: tcp: Don't alter counters returned by tcp info fetchers
There are 2 kinds of tcp info fetchers. Those returning a time value (fc_rtt and
fc_rttval) and those returning a counter (fc_unacked, fc_sacked, fc_retrans,
fc_fackets, fc_lost, fc_reordering). Because of a bug, the counters were handled
as time values, and by default, were divided by 1000 (because of an invalid
conversion from us to ms). To work around this bug and have the right value, the
argument "us" had to be specified.
So now, tcp info fetchers returning a counter don't support any argument
anymore. To not break old configurations, if an argument is provided, it is
ignored and a warning is emitted during the configuration parsing.
In addition, parameter validiation is now performed during the configuration
parsing.
BUG/MINOR: mworker/ssl: close openssl FDs unconditionally
Patch 56996da ("BUG/MINOR: mworker/ssl: close OpenSSL FDs on reload")
fixes a issue where the /dev/random FD was leaked by OpenSSL upon a
reload in master worker mode. Indeed the FD was not flagged with
CLOEXEC.
The fix was checking if ssl_used_frontend or ssl_used_backend were set
to close the FD. This is wrong, indeed the lua init code creates an SSL
server without increasing the backend value, so the deinit is never
done when you don't use SSL in your configuration.
To reproduce the problem you just need to build haproxy with openssl and
lua with an openssl which does not use the getrandom() syscall. No
openssl nor lua configuration are required for haproxy.
Willy Tarreau [Thu, 17 Oct 2019 07:28:28 +0000 (09:28 +0200)]
BUG/MINOR: cache: also cache absolute URIs
The recent changes to address URI issues mixed with the recent fix to
stop caching absolute URIs have caused the cache not to cache H2 requests
anymore since these ones come with a scheme and authority. Let's unbreak
this by using absolute URIs all the time, now that we keep host and
authority in sync. So what is done now is that if we have an authority,
we take the whole URI as it is as the cache key. This covers H2 and H1
absolute requests. If no authority is present (most H1 origin requests),
then we prepend "https://" and the Host header. The reason for https://
is that most of the time we don't care about the scheme, but since about
all H2 clients use this scheme, at least we can share the cache between
H1 and H2.
No backport is needed since the breakage only affects 2.1-dev.
Willy Tarreau [Thu, 17 Oct 2019 08:38:10 +0000 (10:38 +0200)]
MINOR: istbuf: add b_fromist() to make a buffer from an ist
A lot of our chunk-based functions are able to work on a buffer pointer
but not on an ist. Instead of duplicating all of them to also take an
ist as a source, let's have a macro to make a temporary dummy buffer
from an ist. This will only result in structure field manipulations
that the compiler will quickly figure to eliminate them with inline
functions, and in other cases it will just use 4 words in the stack
before calling a function, instead of performing intermediary
conversions.
Willy Tarreau [Thu, 17 Oct 2019 04:53:55 +0000 (06:53 +0200)]
BUILD: travis-ci: limit build to branches "master" and "next"
Occasionally some short-lived branches are pushed to help developers
rebase their work, these ones do not need to be built. This patch
explicitly lists "master" and "next" as the two only branches of
interest. It also adds a comment with the URL for the build status.
MINOR: mux-h1: Force close mode for proxy responses with an unfinished request
When a response generated by HAProxy is handled by the mux H1, if the
corresponding request has not fully been received, the close mode is
forced. Thus, the client is notified the connection will certainly be closed
abruptly, without waiting the end of the request.
MINOR: htx: Add a flag on HTX to known when a response was generated by HAProxy
The flag HTX_FL_PROXY_RESP is now set on responses generated by HAProxy,
excluding responses returned by applets and services. It is an informative flag
set by the applicative layer.
BUG/MINOR: http-htx: Properly set htx flags on error files to support keep-alive
When an error file was loaded, the flag HTX_SL_F_XFER_LEN was never set on the
HTX start line because of a bug. During the headers parsing, the flag
H1_MF_XFER_LEN is never set on the h1m. But it was the condition to set
HTX_SL_F_XFER_LEN on the HTX start-line. Instead, we must only rely on the flags
H1_MF_CLEN or H1_MF_CHNK.
Because of this bug, it was impossible to keep a connection alive for a response
generated by HAProxy. Now the flag HTX_SL_F_XFER_LEN is set when an error file
have a content length (chunked responses are unsupported at this stage) and the
connection may be kept alive if there is no connection header specified to
explicitly close it.
Willy Tarreau [Wed, 16 Oct 2019 07:44:55 +0000 (09:44 +0200)]
MINOR: version: make the version strings variables, not constants
It currently is not possible to figure the exact haproxy version from a
core file for the sole reason that the version is stored into a const
string and as such ends up in the .text section that is not part of a
core file. By turning them into variables we move them to the data
section and they appear in core files. In order to help finding them,
we just prepend an extra variable in front of them and we're able to
immediately spot the version strings from a core file:
(These are haproxy_version and haproxy_date respectively). This may be
backported to 2.0 since this part is not support to impact anything but
the developer's time spent debugging.
246c024 ("MINOR: ssl: load the ocsp in/from the ckch") broke the loading
of OCSP files. The function ssl_sock_load_ocsp_response_from_file() was
not returning 0 upon success which lead to an error after the .ocsp was
read.
BUG/MINOR: ssl: fix error messages for OCSP loading
The error messages for OCSP in ssl_sock_load_crt_file_into_ckch() add a
double extension to the filename, that can be confusing. The messages
reference a .issuer.issuer file.
Miroslav Zagorac [Mon, 14 Oct 2019 15:15:56 +0000 (17:15 +0200)]
BUG/MINOR: WURFL: fix send_log() function arguments
If the user agent data contains text that has special characters that
are used to format the output from the vfprintf() function, haproxy
crashes. String "%s %s %s" may be used as an example.
% curl -A "%s %s %s" localhost:10080/index.html
curl: (52) Empty reply from server
REGTESTS: Send valid URIs in peers reg-tests and fix HA config to avoid warnings
Absolute path must be used, otherwise, the requests are rejected by HAProxy
because of the recent changes. In addition, the configuration has been slightly
updated to remove warnings at startup.
MINOR: h1: Reject requests if the authority does not match the header host
As stated in the RCF7230#5.4, a client must send a field-value for the header
host that is identical to the authority if the target URI includes one. So, now,
by default, if the authority, when provided, does not match the value of the
header host, an error is triggered. To mitigate this behavior, it is possible to
set the option "accept-invalid-http-request". In that case, an http error is
captured without interrupting the request parsing.
MINOR: h1: Reject requests with different occurrences of the header host
There is no reason for a client to send several headers host. It even may be
considered as a bug. However, it is totally invalid to have different values for
those. So now, in such case, an error is triggered during the request
parsing. In addition, when several headers host are found with the same value,
only the first instance is kept and others are skipped.
When the option "accept-invalid-http-request" is enabled, some parsing errors
are ignored. But the position of the error is reported. In legacy HTTP mode,
such errors were captured. So, we now do the same in the H1 multiplexer.
If required, this patch may be backported to 2.0 and 1.9.
MINOR: mux-h1: Xfer as much payload data as possible during output processing
When an outgoing HTX message is formatted to a raw message, DATA blocks may be
splitted to not tranfser more data than expected. But if the buffer is almost
full, the formatting is interrupted, leaving some unused free space in the
buffer, because data are too large to be copied in one time.
Now, we transfer as much data as possible. When the message is chunked, we also
count the size used to encode the data.
BUG/MINOR: mux-h1: Mark the output buffer as full when the xfer is interrupted
When an outgoing HTX message is formatted to a raw message, if we fail to copy
data of an HTX block into the output buffer, we mark it as full. Before it was
only done calling the function buf_room_for_htx_data(). But this function is
designed to optimize input processing.
BUG/MINOR: chunk: Fix tests on the chunk size in functions copying data
When raw data are copied or appended in a chunk, the result must not exceed the
chunk size but it can reach it. Unlike functions to copy or append a string,
there is no terminating null byte.
This patch must be backported as far as 1.8. Note in 1.8, the functions
chunk_cpy() and chunk_cat() don't exist.
BUG/MEDIUM: htx: Catch chunk_memcat() failures when HTX data are formatted to h1
In functions htx_*_to_h1(), most of time several calls to chunk_memcat() are
chained. The expected size is always compared to available room in the buffer to
be sure the full copy will succeed. But it is a bit risky because it relies on
the fact the function chunk_memcat() evaluates the available room in the buffer
in a same way than htx ones. And, unfortunately, it does not. A bug in
chunk_memcat() will always leave a byte unused in the buffer. So, for instance,
when a chunk is copied in an almost full buffer, the last CRLF may be skipped.
To fix the issue, we now rely on the result of chunk_memcat() only.
The operation is locked at the ckch level with a HA_SPINLOCK_T which
prevents the ckch architecture (ckch_store, ckch_inst..) to be modified
at the same time. So you can't do a certificate update at the same time
from multiple CLI connections.
SNI trees are also locked with a HA_RWLOCK_T so reading operations are
locked only during a certificate update.
Bundles are supported but you need to update each file (.rsa|ecdsa|.dsa)
independently. If a file is used in the configuration as a bundle AND
as a unique certificate, both will be updated.
Bundles, directories and crt-list are supported, however filters in
crt-list are currently unsupported.
The code tries to allocate every SNIs and certificate instances first,
so it can rollback the operation if that was unsuccessful.
If you have too much instances of the certificate (at least 20000 in my
tests on my laptop), the function can take too much time and be killed
by the watchdog. This will be fixed later. Also with too much
certificates it's possible that socat exits before the end of the
generation without displaying a message, consider changing the socat
timeout in this case (-t2 for example).
The size of the certificate is currently limited by the maximum size of
a payload, that must fit in a buffer.
MEDIUM: ssl: ssl_sock_load_ckchs() alloc a ckch_inst
The ssl_sock_load_{multi}_ckchs() function were renamed and modified:
- allocate a ckch_inst and loads the sni in it
- return a ckch_inst or NULL
- the sni_ctx are not added anymore in the sni trees from there
- renamed in ckch_inst_new_load_{multi}_store()
- new ssl_sock_load_ckchs() function calls
ckch_inst_new_load_{multi}_store() and add the sni_ctx to the sni trees.
MEDIUM: ssl: introduce the ckch instance structure
struct ckch_inst represents an instance of a certificate (ckch_node)
used in a bind_conf. Every sni_ctx created for 1 ckch_node in a
bind_conf are linked in this structure.
This patch allocate the ckch_inst for each bind_conf and inserts the
sni_ctx in its linked list.
The ssl_sock_add_cert_sni() function never return an error when a
sni_ctx allocation fail. It silently ignores the problem and continues
to try to allocate other snis.
It is unlikely that a sni allocation will succeed after one failure and
start a configuration without all the snis. But to avoid any problem we
return a -1 upon an sni allocation error and stop the configuration
parsing.
This patch must be backported in every version supporting the crt-list
sni filters. (as far as 1.5)
Olivier Houchard [Fri, 11 Oct 2019 14:35:01 +0000 (16:35 +0200)]
MEDIUM: task: Split the tasklet list into two lists.
As using an mt_list for the tasklet list is costly, instead use a regular list,
but add an mt_list for tasklet woken up by other threads, to be run on the
current thread. At the beginning of process_runnable_tasks(), we just take
the new list, and merge it into the task_list.
This should give us performances comparable to before we started using a
mt_list, but allow us to use tasklet_wakeup() from other threads.
Willy Tarreau [Fri, 4 Oct 2019 16:02:40 +0000 (18:02 +0200)]
MINOR: list: add new macro MT_LIST_BEHEAD
This macro atomically cuts the head of a list and returns the list
of elements as a detached list, meaning that they're all linked
together without any head. If the list was empty, NULL is returned.
Willy Tarreau [Fri, 11 Oct 2019 14:31:46 +0000 (16:31 +0200)]
BUILD: stats: fix missing '=' sign in array declaration
I introduced this mistake when adding the description for the stats
metrics, it's even amazing it built and worked at all! This was
reported by Travis CI on non-GNU platforms :
src/stats.c:92:39: warning: use of GNU 'missing =' extension in designator [-Wgnu-designator]
[INF_NAME] { .name = "Name", .desc = "Product name" },
^
=
No backport is needed.
Willy Tarreau [Fri, 11 Oct 2019 12:15:47 +0000 (14:15 +0200)]
BUG/MEDIUM: applet: always check a fast running applet's activity before killing
In issue #277 is reported a strange problem related to a fast-spinning
applet which seems to show valid progress being made. It's uncertain how
this can happen, maybe some very specific timing patterns manage to place
just a few bytes in each buffer and result in the peers applet being called
a lot. But it appears possible to artificially cross the spinning threshold
by asking for monster stats page (500 MB) and limiting the send() size to
1 MSS (1460 bytes), causing the stats page to be called for very small
blocks which most often do not leave enough room to place a new chunk.
The idea developed in this patch consists in not crashing for an applet
which reaches a very high call rate if it shows some indication of
progress. Detecting progress on applets is not trivial but in our case
we know that they must at least not claim to wait for a buffer allocation
if this buffer is present, wait for room if the buffer is empty, ask for
more data without polling if such data are still present, nor leave with
an empty input buffer without having written anything nor read anything
from the other side while a shutw is pending.
Doing so doesn't affect normal behaviors nor abuses of our existing
applets and does at least protect against an applet performing an
early return without processing events, or one causing an endless
loop by asking for impossible conditions.
Willy Tarreau [Wed, 9 Oct 2019 14:41:38 +0000 (16:41 +0200)]
MINOR: stats: fill all the descriptions for "show info" and "show stat"
Now "show info desc", "show info typed desc" and "show stat typed desc"
will report (hopefully) accurate descriptions of each field. These ones
were verified in the code. When some metrics are specific to the process
or the thread, they are indicated. Sometimes a config option is known
for a setting and it is reported as well. The purpose mainly is to help
sysadmins in field more easily sort out issues vs non-issues. In part
inspired by this very informative talk :
$ socat - /var/run/haproxy.sock <<< "show info desc"
Name: HAProxy:"Product name"
Version: 2.1-dev2-991035-31:"Product version"
Release_date: 2019/10/09:"Date of latest source code update"
Nbthread: 1:"Number of started threads (global.nbthread)"
Nbproc: 1:"Number of started worker processes (global.nbproc)"
Process_num: 1:"Relative process number (1..Nbproc)"
Pid: 11975:"This worker process identifier for the system"
Uptime: 0d 0h00m10s:"How long ago this worker process was started (days+hours+minutes+seconds)"
Uptime_sec: 10:"How long ago this worker process was started (seconds)"
Memmax_MB: 0:"Worker process's hard limit on memory usage in MB (-m on command line)"
PoolAlloc_MB: 0:"Amount of memory allocated in pools (in MB)"
PoolUsed_MB: 0:"Amount of pool memory currently used (in MB)"
PoolFailed: 0:"Number of failed pool allocations since this worker was started"
Ulimit-n: 300000:"Hard limit on the number of per-process file descriptors"
Maxsock: 300000:"Hard limit on the number of per-process sockets"
Maxconn: 149982:"Hard limit on the number of per-process connections (configured or imposed by Ulimit-n)"
Hard_maxconn: 149982:"Hard limit on the number of per-process connections (imposed by Memmax_MB or Ulimit-n)"
CurrConns: 0:"Current number of connections on this worker process"
CumConns: 1:"Total number of connections on this worker process since started"
CumReq: 1:"Total number of requests on this worker process since started"
MaxSslConns: 0:"Hard limit on the number of per-process SSL endpoints (front+back), 0=unlimited"
CurrSslConns: 0:"Current number of SSL endpoints on this worker process (front+back)"
CumSslConns: 0:"Total number of SSL endpoints on this worker process since started (front+back)"
Maxpipes: 0:"Hard limit on the number of pipes for splicing, 0=unlimited"
PipesUsed: 0:"Current number of pipes in use in this worker process"
PipesFree: 0:"Current number of allocated and available pipes in this worker process"
ConnRate: 0:"Number of front connections created on this worker process over the last second"
ConnRateLimit: 0:"Hard limit for ConnRate (global.maxconnrate)"
MaxConnRate: 0:"Highest ConnRate reached on this worker process since started (in connections per second)"
SessRate: 0:"Number of sessions created on this worker process over the last second"
SessRateLimit: 0:"Hard limit for SessRate (global.maxsessrate)"
MaxSessRate: 0:"Highest SessRate reached on this worker process since started (in sessions per second)"
SslRate: 0:"Number of SSL connections created on this worker process over the last second"
SslRateLimit: 0:"Hard limit for SslRate (global.maxsslrate)"
MaxSslRate: 0:"Highest SslRate reached on this worker process since started (in connections per second)"
SslFrontendKeyRate: 0:"Number of SSL keys created on frontends in this worker process over the last second"
SslFrontendMaxKeyRate: 0:"Highest SslFrontendKeyRate reached on this worker process since started (in SSL keys per second)"
SslFrontendSessionReuse_pct: 0:"Percent of frontend SSL connections which did not require a new key"
SslBackendKeyRate: 0:"Number of SSL keys created on backends in this worker process over the last second"
SslBackendMaxKeyRate: 0:"Highest SslBackendKeyRate reached on this worker process since started (in SSL keys per second)"
SslCacheLookups: 0:"Total number of SSL session ID lookups in the SSL session cache on this worker since started"
SslCacheMisses: 0:"Total number of SSL session ID lookups that didn't find a session in the SSL session cache on this worker since started"
CompressBpsIn: 0:"Number of bytes submitted to HTTP compression in this worker process over the last second"
CompressBpsOut: 0:"Number of bytes out of HTTP compression in this worker process over the last second"
CompressBpsRateLim: 0:"Limit of CompressBpsOut beyond which HTTP compression is automatically disabled"
Tasks: 10:"Total number of tasks in the current worker process (active + sleeping)"
Run_queue: 1:"Total number of active tasks+tasklets in the current worker process"
Idle_pct: 100:"Percentage of last second spent waiting in the current worker thread"
node: wtap.local:"Node name (global.node)"
Stopping: 0:"1 if the worker process is currently stopping, otherwise zero"
Jobs: 14:"Current number of active jobs on the current worker process (frontend connections, master connections, listeners)"
Unstoppable Jobs: 0:"Current number of unstoppable jobs on the current worker process (master connections)"
Listeners: 13:"Current number of active listeners on the current worker process"
ActivePeers: 0:"Current number of verified active peers connections on the current worker process"
ConnectedPeers: 0:"Current number of peers having passed the connection step on the current worker process"
DroppedLogs: 0:"Total number of dropped logs for current worker process since started"
BusyPolling: 0:"1 if busy-polling is currently in use on the worker process, otherwise zero (config.busy-polling)"
FailedResolutions: 0:"Total number of failed DNS resolutions in current worker process since started"
TotalBytesOut: 0:"Total number of bytes emitted by current worker process since started"
BytesOutRate: 0:"Number of bytes emitted by current worker process over the last second"
Willy Tarreau [Wed, 9 Oct 2019 13:44:21 +0000 (15:44 +0200)]
MINOR: stats: make "show stat" and "show info"
Now "show info" supports "desc" after the default and "typed" formats,
and "show stat" supports this after the typed format. In both cases
this appends the description for the represented metric between double
quotes. The same could be done for JSON output but would possibly require
to update the schema first.
Willy Tarreau [Wed, 9 Oct 2019 05:39:11 +0000 (07:39 +0200)]
MINOR: stats: prepare to add a description with each stat/info field
Several times some users have expressed the non-intuitive aspect of some
of our stat/info metrics and suggested to add some help. This patch
replaces the char* arrays with an array of name_desc so that we now have
some reserved room to store a description with each stat or info field.
These descriptions are currently empty and not reported yet.
Willy Tarreau [Wed, 9 Oct 2019 09:43:59 +0000 (11:43 +0200)]
MINOR: stats: support the "desc" output format modifier for info and stat
Now "show info" and "show stat" can parse "desc" as an output format
modifier that will be passed down the chain to add some descriptions
to the fields depending on the format in use. For now it is not
exploited.
Willy Tarreau [Wed, 9 Oct 2019 09:27:51 +0000 (11:27 +0200)]
MINOR: stats: uniformize the calling convention of the dump functions
Some functions used to take flags + appctx with flags==appctx.flags,
others neither, others just one of them. Some functions used to have
the flags before the object being dumped (server) while others had
it after (listener). This patch aims at cleaning this up a little bit
by following this principle:
- low-level functions which do not need the appctx take flags only
- medium-level functions which already use the appctx for other
reasons do not keep the flags
- top-level functions which already have the stream-int don't need
the flags nor the appctx.