BUG/MINOR: stats: Remove a break preventing ST_F_QCUR to be set for servers
There is an extra break statement wrongly placed in stats_fill_sv_stats()
function, just before filling the ST_F_QCUR field. It prevents this field to
be set to the right value for servers.
No backport needed except if commit 3a9a4992 ("MEDIUM: stats: allow to
select one field in `stats_fill_sv_stats`") is backported.
Tim Duesterhus [Tue, 26 Jan 2021 18:24:24 +0000 (19:24 +0100)]
BUILD: Include stdlib.h in compiler.h if DEBUG_USE_ABORT is set
Building with `"DEBUG=-DDEBUG_STRICT=1 -DDEBUG_USE_ABORT=1"` previously emitted the warning:
In file included from include/haproxy/api.h:35:0,
from src/mux_pt.c:13:
include/haproxy/buf.h: In function ‘br_init’:
include/haproxy/bug.h:42:90: warning: implicit declaration of function ‘abort’ [-Wimplicit-function-declaration]
#define ABORT_NOW() do { extern void ha_backtrace_to_stderr(); ha_backtrace_to_stderr(); abort(); } while (0)
^
include/haproxy/bug.h:56:21: note: in expansion of macro ‘ABORT_NOW’
#define CRASH_NOW() ABORT_NOW()
^
include/haproxy/bug.h:68:4: note: in expansion of macro ‘CRASH_NOW’
CRASH_NOW(); \
^
include/haproxy/bug.h:62:35: note: in expansion of macro ‘__BUG_ON’
#define _BUG_ON(cond, file, line) __BUG_ON(cond, file, line)
^
include/haproxy/bug.h:61:22: note: in expansion of macro ‘_BUG_ON’
#define BUG_ON(cond) _BUG_ON(cond, __FILE__, __LINE__)
^
include/haproxy/buf.h:875:2: note: in expansion of macro ‘BUG_ON’
BUG_ON(size < 2);
^
This patch fixes that issue. The `DEBUG_USE_ABORT` option exists for use with
static analysis tools. No backport needed.
Since the server SSL_CTX is now stored in the ckch_inst, it is not
needed anymore to pass an SSL_CTX to ckch_inst_new_load_srv_store() and
ssl_sock_load_srv_ckchs().
CLEANUP: ssl/cli: rework free in cli_io_handler_commit_cert()
The new feature allowing the change of server side certificates
introduced duplicated free code. Rework the code in
cli_io_handler_commit_cert() to be more consistent.
MINOR: ssl: Remove client_crt member of the server's ssl context
The client_crt member is not used anymore since the server's ssl context
initialization now behaves the same way as the bind lines one (using
ckch stores and instances).
MEDIUM: ssl: Enable backend certificate hot update
When trying to update a backend certificate, we should find a
server-side ckch instance thanks to which we can rebuild a new ssl
context and a new ckch instance that replace the previous ones in the
server structure. This way any new ssl session will be built out of the
new ssl context and the newly updated certificate.
This resolves a subpart of GitHub issue #427 (the certificate part)
MEDIUM: ssl: Load client certificates in a ckch for backend servers
In order for the backend server's certificate to be hot-updatable, it
needs to fit into the implementation used for the "bind" certificates.
This patch follows the architecture implemented for the frontend
implementation and reuses its structures and general function calls
(adapted for the server side).
The ckch store logic is kept and a dedicated ckch instance is used (one
per server). The whole sni_ctx logic was not kept though because it is
not needed.
All the new functions added in this patch are basically server-side
copies of functions that already exist on the frontend side with all the
sni and bind_cond references removed.
The ckch_inst structure has a new 'is_server_instance' flag which is
used to distinguish regular instances from the server-side ones, and a
new pointer to the server's structure in case of backend instance.
Since the new server ckch instances are linked to a standard ckch_store,
a lookup in the ckch store table will succeed so the cli code used to
update bind certificates needs to be covered to manage those new server
side ckch instances.
MINOR: ssl: Server ssl context prepare function refactoring
Split the server's ssl context initialization into the general ssl
related initializations and the actual initialization of a single
SSL_CTX structure. This way the context's initialization will be
usable by itself from elsewhere.
Amaury Denoyelle [Tue, 26 Jan 2021 13:35:26 +0000 (14:35 +0100)]
REORG: backend: simplify conn_backend_get
Reorganize the conditions for the reuse of idle/safe connections :
- reduce code by using variable to store reuse mode and idle/safe conns
counts
- consider that idle/safe/avail lists are properly allocated if
max_idle_conns not null. An allocation failure prevents haproxy
startup.
BUG/MEDIUM: filters/htx: Fix data forwarding when payload length is unknown
It is only a problem on the response path because the request payload length
it always known. But when a filter is registered to analyze the response
payload, the filtering may hang if the server closes just after the headers.
The root cause of the bug comes from an attempt to allow the filters to not
immediately forward the headers if necessary. A filter may choose to hold
the headers by not forwarding any bytes of the payload. For a message with
no payload but a known payload length, there is always a EOM block to
forward. Thus holding the EOM block for bodyless messages is a good way to
also hold the headers. However, messages with an unknown payload length,
there is no EOM block finishing the message, but only a SHUTR flag on the
channel to mark the end of the stream. If there is no payload when it
happens, there is no payload at all to forward. In the filters API, it is
wrongly detected as a condition to not forward the headers.
Because it is not the most used feature and not the obvious one, this patch
introduces another way to hold the message headers at the begining of the
forwarding. A filter flag is added to explicitly says the headers should be
hold. A filter may choose to set the STRM_FLT_FL_HOLD_HTTP_HDRS flag and not
forwad anything to hold the headers. This flag is removed at each call, thus
it must always be explicitly set by filters. This flag is only evaluated if
no byte has ever been forwarded because the headers are forwarded with the
first byte of the payload.
reg-tests/filters/random-forwarding.vtc reg-test is updated to also test
responses with unknown payload length (with and without payload).
depending on cases, if the limit was not configured or enabled, NaN is
returned instead. It should not be an issue for users, even better than
before as it provides more precise info.
William Dauchy [Mon, 25 Jan 2021 16:29:03 +0000 (17:29 +0100)]
MEDIUM: stats: allow to select one field in `stats_fill_sv_stats`
prometheus approach requires to output all values for a given metric
name; meaning we iterate through all metrics, and then iterate in the
inner loop on all objects for this metric.
In order to allow more code reuse, adapt the stats API to be able to
select one field or fill them all otherwise.
This patch follows what has already been done on frontend and backend
side.
From this patch it should be possible to remove most of the duplicate
code on prometheuse side for the server.
A few things to note though:
- state require prior calculation, so I moved that to a sort of helper
`stats_fill_be_stats_computestate`.
- all ST_F*TIME fields requires some minor compute, so I moved it at te
beginning of the function under a condition.
William Dauchy [Mon, 25 Jan 2021 16:29:02 +0000 (17:29 +0100)]
MINOR: contrib/prometheus-exporter: use fill_be_stats for backend dump
use `stats_fill_be_stats` when possible to avoid duplicating code; make
use of field selector to get the needed field only.
the only difference is on `haproxy_backend_downtime_seconds_total` as
stats.c is testing `px->srv`. This behaviour is present since commit 7344f4789321ef8ce2ce17cf73dabd672f7c8c6 ("MINOR: stats: only report
backend's down time if it has servers"). The end result is a NaN instead
of a zero when no server are present.
William Dauchy [Mon, 25 Jan 2021 16:29:01 +0000 (17:29 +0100)]
MEDIUM: stats: allow to select one field in `stats_fill_be_stats`
prometheus approach requires to output all values for a given metric
name; meaning we iterate through all metrics, and then iterate in the
inner loop on all objects for this metric.
In order to allow more code reuse, adapt the stats API to be able to
select one field or fill them all otherwise.
This patch follows what has already been done on frontend side.
From this patch it should be possible to remove most of the duplicate
code on prometheuse side for the backend
A few things to note though:
- status and uweight field requires prior compute, so I moved that to a
sort of helper `stats_fill_be_stats_computesrv`.
- all ST_F*TIME fields requires some minor compute, so I moved it at te
beginning of the function under a condition.
Tim Duesterhus [Sat, 23 Jan 2021 16:50:21 +0000 (17:50 +0100)]
DOC: Improve documentation of the various hdr() fetches
GitHub issue #796 notes that many administrators miss the fact that the `hdr()`
fetch (without the `f`) splits the header value at commas. This is only
mentioned at the end of a long paragraph.
This patch attempts to improve the documentation by:
- Explaning the "comma issue" as early as possible.
- Adding newlines to split the explanation into distinct sections.
- Reducing duplication by making the `res` siblings refer to their `req`
counterparts.
This patch may be backported as long as it applies cleanly. During the
refactoring I needed to adjust several explanations for consistency and not all
of them might be available in older branches.
William Dauchy [Fri, 22 Jan 2021 20:09:48 +0000 (21:09 +0100)]
CLEANUP: stats: improve field selection for frontend http fields
while working on backend/servers I realised I could have written that in
a better way and avoid one extra break. This is slightly improving
readiness.
also while being here, fix function declaration which was not 100%
accurate.
this patch does not change the behaviour of the code.
BUG/MINOR: stats: Init the metric variable when frontend stats are filled
In stats_fill_fe_stats(), some fields are conditionnal (ST_F_HRSP_* for
instance). But unlike unimplemented fields, for those fields, the <metric>
variable is used to fill the <stats> array, but it is not initialized. This
bug as no impact, because these fields are not used. But it is better to fix
it now to avoid future bugs.
To fix it, the metric is now defined and initialized into the for loop.
The bug was introduced by the commit 0ef54397 ("MEDIUM: stats: allow to
select one field in `stats_fill_fe_stats`"). No backport is needed except if
the above commit is backported. It fixes the issue #1063.
BUG/MINOR: stats: Continue to fill frontend stats on unimplemented metric
A regression was introduced by the commit 0ef54397b ("MEDIUM: stats: allow
to select one field in `stats_fill_fe_stats`"). stats_fill_fe_stats()
function fails on unimplemented metrics for frontends. However, not all
stats metrics are used by frontends. For instance ST_F_QCUR. As a
consequence, the frontends stats are always skipped.
To fix the bug, we just skip unimplemented metric for frontends. An error is
triggered only if a specific field is given and is unimplemented.
No backport is needed except if the above commit is backported.
Willy Tarreau [Fri, 22 Jan 2021 15:19:46 +0000 (16:19 +0100)]
[RELEASE] Released version 2.4-dev6
Released version 2.4-dev6 with the following main changes :
- MINOR: converter: adding support for url_enc
- BUILD: SSL: guard TLS13 ciphersuites with HAVE_SSL_CTX_SET_CIPHERSUITES
- BUILD: ssl: guard EVP_PKEY_get_default_digest_nid with ASN1_PKEY_CTRL_DEFAULT_MD_NID
- BUILD: ssl: guard openssl specific with SSL_READ_EARLY_DATA_SUCCESS
- BUILD: Makefile: exclude broken tests by default
- CLEANUP: cfgparse: replace "realloc" with "my_realloc2" to fix to memory leak on error
- BUG/MINOR: hlua: Fix memory leak in hlua_alloc
- MINOR: contrib/prometheus-exporter: export build_info
- DOC: fix some spelling issues over multiple files
- CLEANUP: Fix spelling errors in comments
- SCRIPTS: announce-release: fix typo in help message
- CI: github: add a few more words to the codespell ignore list
- DOC: Add maintainers for the Prometheus exporter
- BUG/MINOR: sample: fix concat() converter's corruption with non-string variables
- BUG/MINOR: server: Memory leak of proxy.used_server_addr during deinit
- CLEANUP: sample: remove uneeded check in json validation
- MINOR: reg-tests: add a way to add service dependency
- BUG/MINOR: sample: check alloc_trash_chunk return value in concat()
- BUG/MINOR: reg-tests: fix service dependency script
- MINOR: reg-tests: add base prometheus test
- Revert "BUG/MINOR: dns: SRV records ignores duplicated AR records"
- BUG/MINOR: sample: Memory leak of sample_expr structure in case of error
- BUG/MINOR: check: Don't perform any check on servers defined in a frontend
- BUG/MINOR: init: enforce strict-limits when using master-worker
- MINOR: contrib/prometheus-exporter: avoid connection close header
- MINOR: contrib/prometheus-exporter: use fill_info for process dump
- BUG/MINOR: init: Use a dynamic buffer to set HAPROXY_CFGFILES env variable
- MINOR: config: Add failifnotcap() to emit an alert on proxy capabilities
- MINOR: server: Forbid server definitions in frontend sections
- BUG/MINOR: threads: Fixes the number of possible cpus report for Mac.
- CLEANUP: pattern: rename pat_ref_commit() to pat_ref_commit_elt()
- MINOR: pattern: add the missing generation ID manipulation functions
- MINOR: peers: Add traces for peer control messages.
- BUG/MINOR: dns: SRV records ignores duplicated AR records (v2)
- BUILD: peers: fix build warning about unused variable
- BUG/MEDIUM: stats: add missing INF_BUILD_INFO definition
- MINOR: cache: Do not store responses with an unknown encoding
- BUG/MINOR: peers: Possible appctx pointer dereference.
- MINOR: build: discard echoing in help target
- MINOR: cache: Remove the `hash` part of the accept-encoding secondary key
- CLEANUP: cache: Use proper data types in secondary_key_cmp()
- CLEANUP: Rename accept_encoding_hash_cmp to accept_encoding_bitmap_cmp
- BUG/MINOR: peers: Wrong "new_conn" value for "show peers" CLI command.
- MINOR: contrib: Make the wireshark peers dissector compile for more distribs.
- BUG/MINOR: mux_h2: missing space between "st" and ".flg" in the "show fd" helper
- CLEANUP: tools: make resolve_sym_name() take a const pointer
- CLEANUP: cli: make "show fd" use a const connection to access other fields
- MINOR: cli: make "show fd" also report the xprt and xprt_ctx
- MINOR: xprt: add a new show_fd() helper to complete some "show fd" dumps.
- MINOR: ssl: provide a "show fd" helper to report important SSL information
- MINOR: xprt/mux: export all *_io_cb functions so that "show fd" resolves them
- MINOR: mux-h2: make the "show fd" helper also decode the h2s subscriber when known
- MINOR: mux-h1: make the "show fd" helper also decode the h1s subscriber when known
- MINOR: mux-fcgi: make the "show fd" helper also decode the fstrm subscriber when known
- CI: Pin VTest to a known good commit
- MINOR: cli: give the show_fd helpers the ability to report a suspicious entry
- MINOR: cli/show_fd: report some easily detectable suspicious states
- MINOR: ssl/show_fd: report some FDs as suspicious when possible
- MINOR: mux-h2/show_fd: report as suspicious an entry with too many calls
- MINOR: mux-h1/show_fd: report as suspicious an entry with too many calls
- BUG/MINOR: mworker: define _GNU_SOURCE for strsignal()
- BUG/MEDIUM: tcpcheck: Don't destroy connection in the wake callback context
- BUG/MEDIUM: mux-h2: Xfer rxbuf to the upper layer when creating a front stream
- MINOR: http: Add HTTP 501-not-implemented error message
- MINOR: muxes: Add exit status for errors about not implemented features
- MINOR: mux-h1: Be prepared to return 501-not-implemented error during parsing
- MEDIUM: mux-h1: Return a 501-not-implemented for upgrade requests with a body
- DOC: Remove space after comma in converter signature
- DOC: Rename '<var name>' to '<var>' in converter signature
- MINOR: stats: duplicate 3 fields in bytes in info
- MINOR: stats: add new start time field
- MINOR: contrib/prometheus-exporter: merge info description from stats
- MEDIUM: stats: allow to select one field in `stats_fill_fe_stats`
- MINOR: contrib/prometheus-exporter: use fill_fe_stats for frontend dump
- MINOR: contrib/prometheus-exporter: Don't needlessly set empty label for metrics
- MINOR: contrib/prometheus-exporter: Split the PROMEX_FL_STATS_METRIC flag
- MINOR: contrib/prometheus-exporter: Add promex_metric struct defining a metric
- MEDIUM: contrib/prometheus-exporter: Rework matrices defining Promex metrics
- BUG/MINOR: stream: Don't update counters when TCP to H2 upgrades are performed
- BUG/MEDIUM: mux-h2: fix read0 handling on partial frames
- MINOR: debug: always export the my_backtrace function
- MINOR: debug: extract the backtrace dumping code to its own function
- MINOR: debug: create ha_backtrace_to_stderr() to dump an instant backtrace
- MEDIUM: debug: now always print a backtrace on CRASH_NOW() and friends
- MINOR: debug: let ha_dump_backtrace() dump a bit further for some callers
- BUILD: debug: fix build warning by consuming the write() result
- MINOR: lua: remove unused variable
- BUILD/MINOR: lua: define _GNU_SOURCE for LLONG_MAX
Bertrand Jacquin [Thu, 21 Jan 2021 19:14:46 +0000 (19:14 +0000)]
MINOR: lua: remove unused variable
hlua_init() uses 'idx' only in openssl related code, while 'i' is used
in shared code and is safe to be reused. This commit replaces the use of
'idx' with 'i'
Willy Tarreau [Fri, 22 Jan 2021 14:58:26 +0000 (15:58 +0100)]
BUILD: debug: fix build warning by consuming the write() result
When writing commit a8459b28c ("MINOR: debug: create
ha_backtrace_to_stderr() to dump an instant backtrace") I just forgot
that some distros are a bit extremist about the syscall return values.
src/debug.c: In function `ha_backtrace_to_stderr':
src/debug.c:147:3: error: ignoring return value of `write', declared with attribute warn_unused_result [-Werror=unused-result]
write(2, b.area, b.data);
^~~~~~~~~~~~~~~~~~~~~~~~
CC src/h1_htx.o
Let's apply the usual tricks to shut them up. No backport is needed.
Willy Tarreau [Fri, 22 Jan 2021 13:48:34 +0000 (14:48 +0100)]
MINOR: debug: let ha_dump_backtrace() dump a bit further for some callers
The dump state is now passed to the function so that the caller can adjust
the behavior. A new series of 4 values allow to stop *after* dumping main
instead of before it or any of the usual loops. This allows to also report
BUG_ON() that could happen very high in the call graph (e.g. startup, or
the scheduler itself) while still understanding what the call path was.
Willy Tarreau [Fri, 22 Jan 2021 13:15:46 +0000 (14:15 +0100)]
MEDIUM: debug: now always print a backtrace on CRASH_NOW() and friends
The purpose is to enable the dumping of a backtrace on BUG_ON(). While
it's very useful to know that a condition was met, very often some
caller context is missing to figure how the condition could happen.
From now on, on systems featuring backtrace, a backtrace of the calling
thread will also be dumped to stderr in addition to the unexpected
condition. This will help users of DEBUG_STRICT as they'll most often
find this backtrace in their logs even if they can't find their core
file.
A new "debug dev bug" expert-mode CLI command was added to test the
feature.
Willy Tarreau [Fri, 22 Jan 2021 13:12:27 +0000 (14:12 +0100)]
MINOR: debug: create ha_backtrace_to_stderr() to dump an instant backtrace
This function calls the ha_dump_backtrace() function with a locally
allocated buffer and sends the output slightly indented to fd #2. It's
meant to be used as an emergency backtrace dump.
Willy Tarreau [Fri, 22 Jan 2021 12:52:41 +0000 (13:52 +0100)]
MINOR: debug: extract the backtrace dumping code to its own function
The backtrace dumping code was located into the thread dump function
but it looks particularly convenient to be able to call it to produce
a dump in other situations, so let's move it to its own function and
make sure it's called last in the function so that we can benefit from
tail merging to save one entry.
Willy Tarreau [Fri, 22 Jan 2021 11:12:29 +0000 (12:12 +0100)]
MINOR: debug: always export the my_backtrace function
In order to simplify the code and remove annoying ifdefs everywhere,
let's always export my_backtrace() and make it adapt to the situation
and return zero if not supported. A small update in the thread dump
function was needed to make sure we don't use its results if it fails
now.
Willy Tarreau [Wed, 20 Jan 2021 09:53:13 +0000 (10:53 +0100)]
BUG/MEDIUM: mux-h2: fix read0 handling on partial frames
Since commit aade4edc1 ("BUG/MEDIUM: mux-h2: Don't handle pending read0
too early on streams"), we've met a few cases where an early connection
close wouldn't be properly handled if some data were pending in a frame
header, because the test now considers the buffer's contents before
accepting to report the close, but given that frame headers or preface
are consumed at once, the buffer cannot make progress when it's stuck
at intermediary lengths.
In order to address this, this patch introduces two flags in the h2c
connection to store any reported shutdown and failed parsing. The idea
is that we cannot rely on conn_xprt_read0_pending() in the parser since
it wouldn't consider data pending in the buffer nor intermediary layers,
but we know for certain that after a read0 is reported by the transport
layer in presence of an RD_SH on the connection, no more progress will
be made there. This alone is not sufficient to decide to end processing,
we can only do this once these final data have been submitted to a parser.
Therefore, now when a parser fails on missing data, we check if a read0
has already been reported on this connection, and if so we set a new
END_REACHED flag on the connection to indicate a failure to process the
final data. The h2c_read0_pending() function now simply reports this
flag's status. This way we're certain that the input shutdown is only
considered after the demux attempted to parse the last frame.
Maybe over the long term the subscribe() API should be improved to
synchronously fail when trying to subscribe for an even that will not
happen. This may be an elegant solution that could possibly work across
multiple layers and even muxes, and be usable at a few specific places
where that's needed.
Given the patch above was backported as far as 2.0, this one should be
backported there as well. It is possible that the fcgi mux has the same
issue, but this was not analysed yet.
Thanks to Pierre Cheynier for providing detailed traces allowing to
quickly narrow the problem down, and to Olivier for his analysis.
BUG/MINOR: stream: Don't update counters when TCP to H2 upgrades are performed
When a TCP to H2 upgrade is performed, the SF_IGNORE flag is set on the
stream before killing it. This happens when a TCP/SSL client connection is
routed to a HTTP backend and the h2 alpn detected. The SF_IGNORE flag was
added for this purpose, to skip some processing when the stream is aborted
before a mux upgrade. Some counters updates were skipped this way. But some
others are still updated.
Now, all counters update at the end of process_stream(), before releasing
the stream, are ignored if SF_IGNORE flag is set. Note this stream is
aborted because we switch from a mono-stream to a multi-stream
multiplexer. It works differently for TCP to H1 upgrades.
This patch should be backported as far as 2.0 after some observation period.
The global and stats matrices are replaced by a simpler ones. Now we have
only 2 arrays of prometheus metrics. Their flags are used to match on the
entity type. This simplify a bit the metrics definition. For now, labels and
descriptions are still outside of these arrays, because the labels must be
reworked to be more dynamic and the descrptions must be replaced by stats
ones as far as possible.
MINOR: contrib/prometheus-exporter: Add promex_metric struct defining a metric
This structure will be used to define a Prometheus metric, i.e its name, its
type (gauge or counter) and its flags. The flags will be used to know for
which entities the metric is defined (global, frontend, backend and/or server).
MINOR: contrib/prometheus-exporter: Split the PROMEX_FL_STATS_METRIC flag
PROMEX_FL_STATS_METRIC flag is splitted in 3 flags to easily identify the
processed entity type (frontend, backend or server). Thus, now we are using
PROMEX_FL_FRONT_METRIC, PROMEX_FL_BACK_METRIC or PROMEX_FL_SRV_METRIC. These
flags will be used to know if a metric is defined and must be exported for a
given entity type.
William Dauchy [Sun, 17 Jan 2021 17:27:45 +0000 (18:27 +0100)]
MEDIUM: stats: allow to select one field in `stats_fill_fe_stats`
prometheus approach requires to output all values for a given metric
name; meaning we iterate through all metrics, and then iterate in the
inner loop on all objects for this metric.
In order to allow more code reuse, adapt the stats API to be able to
select one field or fill them all otherwise.
From this patch it should be possible to remove most of the duplicate
code on prometheuse side for the frontend.
William Dauchy [Fri, 15 Jan 2021 21:41:39 +0000 (22:41 +0100)]
MINOR: contrib/prometheus-exporter: merge info description from stats
Now that units are coherent we can merge the info description from
haproxy stats.
Description were not always the same, but I guess we may eventually
improve them in the future.
William Dauchy [Fri, 15 Jan 2021 21:41:38 +0000 (22:41 +0100)]
MINOR: stats: add new start time field
Another patch in order to try to reconciliate haproxy stats and
prometheus. Here I'm adding a proper start time field in order to make
proper use of uptime field.
That being done we can move the calculation in `fill_info`
William Dauchy [Fri, 15 Jan 2021 21:41:37 +0000 (22:41 +0100)]
MINOR: stats: duplicate 3 fields in bytes in info
in order to prepare a possible merge of fields between haproxy stats and
prometheus, duplicate 3 fields:
INF_MEMMAX
INF_POOL_ALLOC
INF_POOL_USED
Those were specifically named in MB unit which is not what prometheus
recommends. We therefore used them but changed the unit while doing the
calculation. It created a specific case for that, up to the description.
This patch:
- removes some possible confusion, i.e. using MB field for bytes
- will permit an easier merge of fields such as description
First consequence for now, is that we can remove the calculation on
prometheus side and move it on `fill_info`.
MEDIUM: mux-h1: Return a 501-not-implemented for upgrade requests with a body
If an HTTP protocol upgrade request with a payload is received, a
501-not-implemented error is now returned to the client. It is valid from
the RFC point of view but will be incompatible with the way the H2
websockets will be handled by HAProxy. And it is probably a very uncommon
way to do perform protocol upgrades.
MINOR: mux-h1: Be prepared to return 501-not-implemented error during parsing
With this patch, the H1 mux is now able to return 501-not-implemented errors
to client during the request parsing. However, no such errors are returned
for now.
MINOR: muxes: Add exit status for errors about not implemented features
The MUX_ES_NOTIMPL_ERR exit status is added to allow the multiplexers to
report errors about not implemented features. This will be used by the H1
mux to return 501-not-implemented errors.
Add the support for the 501-not-implemented status code with the
corresponding default message. The documentation is updated accordingly
because it is now part of status codes HAProxy may emit via an errorfile or
a deny/return HTTP action.
BUG/MEDIUM: mux-h2: Xfer rxbuf to the upper layer when creating a front stream
Just like the H1 muliplexer, when a new frontend H2 stream is created, the
rxbuf is xferred to the stream at the upper layer.
Originally, it is not a bug fix, but just an api standardization. And in
fact, it fixes a crash when a h2 stream is aborted after the request parsing
but before the first call to process_stream(). It crashes since the commit 8bebd2fe5 ("MEDIUM: http-ana: Don't process partial or empty request
anymore"). It is now totally unexpected to have an HTTP stream without a
valid request. But here the stream is unable to get the request because the
client connection was aborted. Passing it during the stream creation fixes
the bug. But the true problem is that the stream-interfaces are still
relying on the connection state while only the muxes should do so.
BUG/MEDIUM: tcpcheck: Don't destroy connection in the wake callback context
When a tcpcheck ruleset uses multiple connections, the existing one must be
closed and destroyed before openning the new one. This part is handled in
the tcpcheck_main() function, when called from the wake callback function
(wake_srv_chk). But it is indeed a problem, because this function may be
called from the mux layer. This means a mux may call the wake callback
function of the data layer, which may release the connection and the mux. It
is easy to see how it is hazardous. And actually, depending on the
scheduling, it leads to crashes.
Thus, we must avoid to release the connection in the wake callback context,
and move this part in the check's process function instead. To do so, we
rely on the CHK_ST_CLOSE_CONN flags. When a connection must be replaced by a
new one, this flag is set on the check, in tcpcheck_main() function, and the
check's task is woken up. Then, the connection is really closed in
process_chk_conn() function.
This patch must be backported as far as 2.2, with some adaptations however
because the code is not exactly the same.
Bertrand Jacquin [Thu, 21 Jan 2021 01:31:46 +0000 (01:31 +0000)]
BUG/MINOR: mworker: define _GNU_SOURCE for strsignal()
glibc < 2.10 requires _GNU_SOURCE in order to make use of strsignal(),
otherwise leading to SEGV at runtime.
$ make V=1 TARGET=linux-glibc-legacy USE_THREAD= USE_ACCEPT4=
..
src/mworker.c: In function 'mworker_catch_sigchld':
src/mworker.c:285: warning: implicit declaration of function 'strsignal'
src/mworker.c:285: warning: pointer/integer type mismatch in conditional expression
..
$ make V=1 reg-tests REGTESTS_TYPES=slow,default
..
###### Test case: reg-tests/mcli/mcli_start_progs.vtc ######
## test results in: "/tmp/haregtests-2021-01-19_15-18-07.n24989/vtc.29077.28f6153d"
---- h1 Bad exit status: 0x008b exit 0x0 signal 11 core 128
---- h1 Assert error in haproxy_wait(), src/vtc_haproxy.c line 792: Condition(*(&h->fds[1]) >= 0) not true. Errno=0 Success
..
$ gdb ./haproxy /tmp/core.0.haproxy.30270
..
Core was generated by `/root/haproxy/haproxy -d -W -S fd@8 -dM -f /tmp/haregtests-2021-01-19_15-18-07.'.
Program terminated with signal 11, Segmentation fault.
#0 0x00002aaaab387a10 in strlen () from /lib64/libc.so.6
(gdb) bt
#0 0x00002aaaab387a10 in strlen () from /lib64/libc.so.6
#1 0x00002aaaab354b69 in vfprintf () from /lib64/libc.so.6
#2 0x00002aaaab37788a in vsnprintf () from /lib64/libc.so.6
#3 0x00000000004a76a3 in memvprintf (out=0x7fffedc680a0, format=0x5a5d58 "Current worker #%d (%d) exited with code %d (%s)\n", orig_args=0x7fffedc680d0)
at src/tools.c:3868
#4 0x00000000004bbd40 in print_message (label=0x58abed "ALERT", fmt=0x5a5d58 "Current worker #%d (%d) exited with code %d (%s)\n", argp=0x7fffedc680d0)
at src/log.c:1066
#5 0x00000000004bc07f in ha_alert (fmt=0x5a5d58 "Current worker #%d (%d) exited with code %d (%s)\n") at src/log.c:1109
#6 0x0000000000534b7b in mworker_catch_sigchld (sh=<value optimized out>) at src/mworker.c:293
#7 0x0000000000556af3 in __signal_process_queue () at src/signal.c:88
#8 0x00000000004f6216 in signal_process_queue () at include/haproxy/signal.h:39
#9 run_poll_loop () at src/haproxy.c:2859
#10 0x00000000004f63b7 in run_thread_poll_loop (data=<value optimized out>) at src/haproxy.c:3028
#11 0x00000000004faaac in main (argc=<value optimized out>, argv=0x7fffedc68498) at src/haproxy.c:904
Willy Tarreau [Thu, 21 Jan 2021 07:53:50 +0000 (08:53 +0100)]
MINOR: ssl/show_fd: report some FDs as suspicious when possible
If a subscriber's tasklet was called more than one million times, if
the ssl_ctx's connection doesn't match the current one, or if the
connection appears closed in one direction while the SSL stack is
still subscribed, the FD is reported as suspicious. The close cases
may occasionally trigger a false positive during very short and rare
windows. Similarly the 1M calls will trigger after 16GB are transferred
over a given connection. These are rare enough events to be reported as
suspicious.
Willy Tarreau [Thu, 21 Jan 2021 08:07:29 +0000 (09:07 +0100)]
MINOR: cli/show_fd: report some easily detectable suspicious states
A file descriptor which maps to a connection but has more than one
thread in its mask, or an FD handle that doesn't correspond to the FD,
or wiht no mux context, or an FD with no thread in its mask, or with
more than 1 million events is flagged as suspicious.
Willy Tarreau [Thu, 21 Jan 2021 07:26:06 +0000 (08:26 +0100)]
MINOR: cli: give the show_fd helpers the ability to report a suspicious entry
Now the show_fd helpers at the transport and mux levels return an integer
which indicates whether or not the inspected entry looks suspicious. When
an entry is reported as suspicious, "show fd" will suffix it with an
exclamation mark ('!') in the dump, that is supposed to help detecting
them.
For now, helpers were adjusted to adapt to the new API but none of them
reports any suspicious entry yet.
Willy Tarreau [Wed, 20 Jan 2021 13:55:01 +0000 (14:55 +0100)]
MINOR: xprt/mux: export all *_io_cb functions so that "show fd" resolves them
In FD dumps it's often very important to figure what upper layer function
is going to be called. Let's export the few I/O callbacks that appear as
tasklet functions so that "show fd" can resolve them instead of printing
a pointer relative to main. For example:
Willy Tarreau [Wed, 20 Jan 2021 13:41:29 +0000 (14:41 +0100)]
MINOR: ssl: provide a "show fd" helper to report important SSL information
The SSL context contains a lot of important details that are currently
missing from debug outputs. Now that we detect ssl_sock, we can perform
some sanity checks, print the next xprt, the subscriber callback's context,
handler and number of calls. The process function is also resolved. This
now gives for example on an H2 connection:
Willy Tarreau [Wed, 20 Jan 2021 14:30:56 +0000 (15:30 +0100)]
MINOR: xprt: add a new show_fd() helper to complete some "show fd" dumps.
Just like we did for the muxes, now the transport layers will have the
ability to provide helpers to report more detailed information about their
internal context. When the helper is not known, the pointer continues to
be dumped as-is if it's not NULL. This way a transport with no context nor
dump function will not add a useless "xprt_ctx=(nil)" but the pointer will
be emitted if valid or if a helper is defined.
Willy Tarreau [Wed, 20 Jan 2021 13:40:04 +0000 (14:40 +0100)]
MINOR: cli: make "show fd" also report the xprt and xprt_ctx
These ones are definitely missing from some dumps, let's report them! We
print the xprt's name instead of its useless pointer, as well as its ctx
when xprt is not NULL.
Willy Tarreau [Wed, 20 Jan 2021 13:13:46 +0000 (14:13 +0100)]
CLEANUP: cli: make "show fd" use a const connection to access other fields
Over time the code has uglified, casting fdt.owner as a struct connection
for about everything. Let's have a const struct connection* there and take
this opportunity for passing all fields as const as well.
Additionally a misplaced closing parenthesis on the output was fixed.
Willy Tarreau [Wed, 20 Jan 2021 13:37:59 +0000 (14:37 +0100)]
CLEANUP: tools: make resolve_sym_name() take a const pointer
When 0c439d895 ("BUILD: tools: make resolve_sym_name() return a const")
was written, the pointer argument ought to have been turned to const for
more flexibility. Let's do it now.
MINOR: contrib: Make the wireshark peers dissector compile for more distribs.
With a 2.6.8 wireshark, this module could not compile because of ws_version.h
missing header. This patch offers the possibility to compile this plugin without
having to include this header. Furthermore with my wireshark version a
"plugin_release" object is required to make it be loaded by wireshark. This is
a string which seems to have to match a dotted string made of you wireshark
major and minor versions.
BUG/MINOR: peers: Wrong "new_conn" value for "show peers" CLI command.
This counter could be hugely incremented by the peer task responsible of managing
peer synchronizations and reconnections, for instance when a peer is not reachable
there is a period where the appctx is not created. If we receive stick-table
updates before the peer session (appctx) is instantiated, we reach the code
responsible of incrementing the "new_conn" counter.
With this patch we increment this counter only when we really instantiate a new
peer session thanks to peer_session_create().
Tim Duesterhus [Mon, 18 Jan 2021 12:41:16 +0000 (13:41 +0100)]
MINOR: cache: Remove the `hash` part of the accept-encoding secondary key
As of commit 6ca89162dc881df8fecd7713ca1fe5dbaa66b315 this hash no longer is
required, because unknown encodings are not longer stored and known encodings
do not use the cache.
Bertrand Jacquin [Sun, 17 Jan 2021 18:47:47 +0000 (18:47 +0000)]
MINOR: build: discard echoing in help target
When V=1 is used in conjuction with help, the output becomes pretty
difficult to read properly.
$ make TARGET=linux-glibc V=1 help
..
DEBUG_USE_ABORT: use abort() for program termination, see include/haproxy/bug.h for details
echo; \
if [ -n "" ]; then \
if [ -n "" ]; then \
echo "Current TARGET: "; \
else \
echo "Current TARGET: (custom target)"; \
fi; \
else \
echo "TARGET not set, you may pass 'TARGET=xxx' to set one among :";\
echo " linux-glibc, linux-glibc-legacy, solaris, freebsd, dragonfly, netbsd,"; \
echo " osx, openbsd, aix51, aix52, aix72-gcc, cygwin, haiku, generic,"; \
echo " custom"; \
fi
TARGET not set, you may pass 'TARGET=xxx' to set one among :
linux-glibc, linux-glibc-legacy, solaris, freebsd, dragonfly, netbsd,
osx, openbsd, aix51, aix52, aix72-gcc, cygwin, haiku, generic,
custom
echo;echo "Enabled features for TARGET '' (disable with 'USE_xxx=') :"
Enabled features for TARGET '' (disable with 'USE_xxx=') :
set -- POLL ; echo " $*" | (fmt || cat) 2>/dev/null
POLL
echo;echo "Disabled features for TARGET '' (enable with 'USE_xxx=1') :"
MINOR: cache: Do not store responses with an unknown encoding
If a server varies on the accept-encoding header and it sends a response
with an encoding we do not know (see parse_encoding_value function), we
will not store it. This will prevent unexpected errors caused by
cache collisions that could happen in accept_encoding_hash_cmp.
Willy Tarreau [Fri, 15 Jan 2021 16:08:38 +0000 (17:08 +0100)]
BUILD: peers: fix build warning about unused variable
Previous commit da2b0844f ("MINOR: peers: Add traces for peer control
messages.") introduced a build warning on some compiler versions after
the removal of variable "peers" in peer_send_msgs() because variable
"s" was used only to assign this one, and variable "si" to assign "s".
Let's remove both to fix the warning. No backport is needed.
This bug happens when a service has multiple records on the same host
and the server provides the A/AAAA resolution in the response as AR
(Additional Records).
In such condition, the first occurence of the host will be taken from
the Additional section, while the second (and next ones) will be process
by an independent resolution task (like we used to do before 2.2).
This can lead to a situation where the "synchronisation" of the
resolution may diverge, like described in github issue #971.
Because of this behavior, HAProxy mixes various type of requests to
resolve the full list of servers: SRV+AR for all "first" occurences and
A/AAAA for all other occurences of an existing hostname.
IE: with the following type of response:
;; ANSWER SECTION:
_http._tcp.be2.tld. 3600 IN SRV 5 500 80 A2.tld.
_http._tcp.be2.tld. 3600 IN SRV 5 500 86 A3.tld.
_http._tcp.be2.tld. 3600 IN SRV 5 500 80 A1.tld.
_http._tcp.be2.tld. 3600 IN SRV 5 500 85 A3.tld.
;; ADDITIONAL SECTION:
A2.tld. 3600 IN A 192.168.0.2
A3.tld. 3600 IN A 192.168.0.3
A1.tld. 3600 IN A 192.168.0.1
A3.tld. 3600 IN A 192.168.0.3
the first A3 host is resolved using the Additional Section and the
second one through a dedicated A request.
When linking the SRV records to their respective Additional one, a
condition was missing (chek if said SRV record is already attached to an
Additional one), leading to stop processing SRV only when the target
SRV field matches the Additional record name. Hence only the first
occurence of a target was managed by an additional record.
This patch adds a condition in this loop to ensure the record being
parsed is not already linked to an Additional Record. If so, we can
carry on the parsing to find a possible next one with the same target
field value.
MINOR: peers: Add traces for peer control messages.
Display traces when sending/receiving peer control messages (synchronisation, heartbeat).
Add remaining traces when parsing malformed messages (acks, stick-table definitions)
or ignoring them.
Also add traces when releasing session or when reaching the PEER_SESS_ST_ERRPROTO
peer protocol state.
Willy Tarreau [Fri, 15 Jan 2021 13:11:59 +0000 (14:11 +0100)]
CLEANUP: pattern: rename pat_ref_commit() to pat_ref_commit_elt()
It's about the third time I get confused by these functions, half of
which manipulate the reference as a whole and those manipulating only
an entry. For me "pat_ref_commit" means committing the pattern reference,
not just an element, so let's rename it. A number of other ones should
really be renamed before 2.4 gets released :-/
David CARLIER [Fri, 15 Jan 2021 08:09:56 +0000 (08:09 +0000)]
BUG/MINOR: threads: Fixes the number of possible cpus report for Mac.
There is no low level api to achieve same as Linux/FreeBSD, we rely
on CPUs available. Without this, the number of threads is just 1 for
Mac while having 8 cores in my M1.
Backporting to 2.1 should be enough if that's possible.
MINOR: server: Forbid server definitions in frontend sections
An fatal error is now reported if a server is defined in a frontend
section. til now, a warning was just emitted and the server was ignored. The
warning was added in the 1.3.4 when the frontend/backend keywords were
introduced to allow a smooth transition and to not break existing
configs. It is old enough now to emit an fatal error in this case.
This patch is related to the issue #1043. It may be backported at least as
far as 2.2, and possibly to older versions. It relies on the previous commit
("MINOR: config: Add failifnotcap() to emit an alert on proxy capabilities").
MINOR: config: Add failifnotcap() to emit an alert on proxy capabilities
This function must be used to emit an alert if a proxy does not have at
least one of the requested capabilities. An additional message may be
appended to the alert.
BUG/MINOR: init: Use a dynamic buffer to set HAPROXY_CFGFILES env variable
The HAPROXY_CFGFILES env variable is built using a static trash chunk, via a
call to get_trash_chunk() function. This chunk is reserved during the whole
configuration parsing. It is far too large to guarantee it will not be
reused during the configuration parsing. And in fact, it happens in the lua
code since the commit f67442efd ("BUG/MINOR: lua: warn when registering
action, conv, sf, cli or applet multiple times"), when a lua script is
loaded.
To fix the bug, we now use a dynamic buffer instead. And we call memprintf()
function to handle both the allocation and the formatting. Allocation errors
at this stage are fatal.
This patch should fix the issue #1041. It must be backported as far as 2.0.
Jerome Magnin [Tue, 12 Jan 2021 19:19:38 +0000 (20:19 +0100)]
BUG/MINOR: init: enforce strict-limits when using master-worker
The strict-limits global option was introduced with commit 0fec3ab7b
("MINOR: init: always fail when setrlimit fails"). When used in
conjuction with master-worker, haproxy will not fail when a setrlimit
fails. This happens because we only exit() if master-worker isn't used.
This patch removes all tests for master-worker mode for all cases covered
by strict-limits scope.
This should be backported from 2.1 onward.
This should fix issue #1042.
BUG/MINOR: check: Don't perform any check on servers defined in a frontend
If a server is defined in a frontend, thus a proxy without the backend
capability, the 'check' and 'agent-check' keywords are ignored. This way, no
check is performed on an ignored server. This avoids a segfault because some
part of the tcpchecks are not fully initialized (or released for frontends
during the post-check).
In addition, an test on the server's proxy capabilities is performed when
checks or agent-checks are initialized and nothing is performed for servers
attached to a non-backend proxy.
This patch should fix the issue #1043. It must be backported as far as 2.2.
The first part of the patch introduces a bug. When a dns answer item is
allocated, its <ar_item> is only initialized at the end of the parsing, when
the item is added in the answer list. Thus, we must not try to release it
during the parsing.
The second part is also probably buggy. It fixes the issue #971 but reverts
a fix for the issue #841 (see commit fb0884c8297 "BUG/MEDIUM: dns: Don't
store additional records in a linked-list"). So it must be at least
revalidated.
This revert fixes a segfault reported in a comment of the issue #971. It
must be backported as far as 2.2.
William Dauchy [Sun, 10 Jan 2021 20:13:06 +0000 (21:13 +0100)]
MINOR: reg-tests: add base prometheus test
Add a base test to start with something, even though this is not
necessarily complete.
Also make use of the recent REQUIRE_SERVICE option to exclude it from
test list of it was not build with prometheus included.
note: I thought it was possible to send multiple requests within the
same client, but I'm getting "HTTP header is incomplete" from the second
request
William Dauchy [Sun, 10 Jan 2021 20:13:05 +0000 (21:13 +0100)]
BUG/MINOR: reg-tests: fix service dependency script
I badly tested my previous patch forgetting to remove the "+" testing
present in options, and not in services; the list of services do not
have any "+" at the beginning of each service