Willy Tarreau [Mon, 13 Jun 2022 13:59:39 +0000 (15:59 +0200)]
MINOR: task: move profiling bit to per-thread
Instead of having a global mask of all the profiled threads, let's have
one flag per thread in each thread's flags. They are never accessed more
than one at a time an are better located inside the threads' contexts for
both performance and scalability.
Amaury Denoyelle [Mon, 13 Jun 2022 09:30:46 +0000 (11:30 +0200)]
BUG/MEDIUM: mux-quic: fix segfault on flow-control frame cleanup
LIST_ELEM macro was incorrectly used in the loop when purging
flow-control frames from qcc.lfctl.frms on MUX release. This caused a
segfault in qc_release() due to an invalid quic_frame pointer instance.
The occurence of this bug seems fairly rare. To happen, some
flow-control frames must have been allocated but not yet sent just as
the MUX release is triggered.
I did not find a reproducer scenario. Instead, I artificially triggered
it by inserting a quic_frame in qcc.lfctl.frms just before purging it in
qc_release() using the following snippet.
BUG/MEDIUM: cli: Notify cli applet won't consume data during request processing
The CLI applet process one request after another. Thus, when several
requests are pipelined, it is important to notify it won't consume remaining
outgoing data while it is processing a request. Otherwise, the applet may be
woken up in loop. For instance, it may happen with the HTTP client while we
are waiting for the server response if a shutr is received.
This patch must be backported in all supported versions after an observation
period. But a massive refactoring was performed in 2.6. So, for the 2.5 and
below, the patch will have to be adapted. Note also that, AFAIK, the bug can
only be triggered by the HTTP client for now.
BUG/MEDIUM: stconn: Don't wakeup applet for send if it won't consume data
in .chk_snd applet callback function, we must not wake up an applet if
SE_FL_WONT_CONSUME flag is set. Indeed, if an applet explicitly specify it
will not consume any outgoing data, it is useless to wake it up when more
data are sent. Note the applet may still be woken up for another reason. In
this case SE_FL_WONT_CONSUME flag will be removed. It is the applet
responsibility to set it again, if necessary.
This patch must be backported to 2.6 after an observation period. On earlier
versions, the bug probably exists too. However, because a massive
refactoring was performed in 2.6, the patch will have to be adapted and
carefully reviewed/tested if it is backported..
CLEANUP: check: Remove useless tests on check's stream-connector
Since the conn-stream refactoring, from the time the health-check is in
progress, its stream-connector is always defined. So, some tests on it are
useless and can be removed.
BUG/MINOR: tcp-rules: Make action call final on read error and delay expiration
When a TCP content ruleset is evaluated, we stop waiting for more data if
the inspect-delay is reached, if there is a read error or if we know no more
data will be received. This last point is only valid for ACLs. An action may
decide to yield for another reason. For instance, in the SPOE, the
"send-spoe-group" action yields while the agent response is not
received. Thus, now, an action call is final only when the inspect-delay is
reached or if there is a read error. But it is possible for an action to
yield if the buffer is full or if CF_EOI flag is set.
This patch could be backported to all supported versions.
Amaury Denoyelle [Fri, 10 Jun 2022 13:16:40 +0000 (15:16 +0200)]
BUG/MINOR: mux-quic: fix memleak on frames rejected by transport
When the MUX transfers a big amount of data to the client, the transport
layer may reject some of them because of the congestion controller
limit. Frames built by the MUX are thus dropped, even if streams transferred
data are kept in buffers for future new frames.
Thus, the MUX is required to free rejected frames. This fixes a memory
leak which may grow with important data transfers.
It should be backported to 2.6 after it has been tested and validated.
Amaury Denoyelle [Fri, 10 Jun 2022 13:18:12 +0000 (15:18 +0200)]
MINOR: mux-quic: complete BUG_ON on TX flow-control enforcing
TX flow-control enforcing is not straightforward : it requires the usage
of several counters at stream and connection level, in part due to the
difficult sending API between MUX and quic-conn layers.
To strengthen this part and ensures it behaves as expected, some
existing BUG_ON statements were adjusted and new one were added. This
should help to catch errors as early as possible, as in the case with
github issue #1738.
Amaury Denoyelle [Fri, 10 Jun 2022 13:16:21 +0000 (15:16 +0200)]
BUG/MEDIUM: mux-quic: fix flow control connection Tx level
The flow control enforced at connection level is incorrectly calculated.
There is a risk of exceeding the limit. In most cases, this results in a
segfault produced by a BUG_ON which is here to catch this kind of error.
If not compiled with DEBUG_STRICT, this should generate a connection
closed by the client due to the flow control overflow.
The problem is encountered when transfered payload is big enough to fill
the transport congestion window. In this case, some data are rejected by
the transport layer and kept by the MUX to be reemitted later. However,
these preserved data are not counted on the connection flow control when
resubmitted, which gradually amplify the gap between expected and real
consumed flow control.
To fix this, handle the flow-control at the connection level in the same
way as the stream level. A new field qcc.tx.offsets is incremented as
soon as data are transfered between stream TX buffers. The field
qcc.tx.sent_offsets is preserved to count bytes handled by the transport
layer and stop the MUX transfer if limit is reached.
As already stated, this bug can occur during transfers with enough
emitted data, over multiple streams. When using a single stream, the
flow control at the stream level hides it.
The BUG_ON crash is reproduced systematically with quiche client :
$ quiche-client --no-verify --http-version HTTP/3 -n 10000 https://127.0.0.1:20443/10K
This must be backported up to 2.6 when confirmed to work as expected.
Willy Tarreau [Fri, 10 Jun 2022 13:12:21 +0000 (15:12 +0200)]
BUG/MINOR: cli/stats: add missing trailing LF after "show info json"
This is the continuation of commit 5a0e7ca5d ("BUG/MINOR: cli/stats: add
missing trailing LF after JSON outputs"). There's also a "show info json"
command which was also missing the trailing LF. It's constructed exactly
like the "show stat json", in that it dumps a series of fields without any
LF. The difference however is that for the stats output, everything was
enclosed in an array which required an LF *after* the closing bracket,
while here there's no such array so we have to emit the LF after the loop.
That makes the two functions a bit inconsistent, it's quite annoying, but
making them better would require sending one LF per line in the stats
output, which is not particularly interesting.
Given that it took 5+ years to spot that this code wasn't working as
expected it doesn't seem worth investing much time trying to refactor
it to make it look cleaner at the risk of breaking other obscure parts.
Willy Tarreau [Fri, 10 Jun 2022 09:11:44 +0000 (11:11 +0200)]
BUG/MINOR: server: do not enable DNS resolution on disabled proxies
Leonhard Wimmer reported an interesting bug in github issue #1742.
Servers in disabled proxies that are configured for resolution are still
subscribed to DNS resolutions, but the LB algos are not initialized at
all since the proxy is disabled, so when the server state changes,
attempts to update its status cause a crash when the server's weight
is recalculated via a divide by the proxy's total weight which is zero.
This should be backported to all versions. Beware that before 2.5 or
so, there's no PR_FL_DISABLED flag, instead px->disabled should be
used (2.3-2.4) or PR_STSTOPPED for older versions.
Willy Tarreau [Fri, 10 Jun 2022 07:21:22 +0000 (09:21 +0200)]
BUG/MINOR: cli/stats: add missing trailing LF after JSON outputs
Patrick Hemmer reported that we have a bug in the CLI commands
"show stat json" and "show schema json" in that they forget the trailing
LF that's required to mark the end of the response. This has been the
case since the introduction of the feature in 1.8-dev1 by commit 6f6bb380e
("MEDIUM: stats: Add show json schema"), so this fix may be backported to
all versions.
Function used to parse SETTINGS frame is incorrect as it does not stop
at the frame length but continue to parse beyond it. In most cases, it
will result in a connection closed with error H3_FRAME_ERROR.
This bug can be reproduced with clients that sent more than just a
SETTINGS frame on the H3 control stream. This is notably the case with
aioquic which emit a MAX_PUSH_ID after SETTINGS.
This bug has been introduced in the current dev release, by the
following patch 62eef85961f4a2a241e0b24ef540cc91f156b842
MINOR: mux-quic: simplify decode_qcs API
thus, it does not need to be backported.
Frame type has changed during HTTP/3 specification process. Adjust it to
reflect the latest RFC 9114 status.
Concretly, type for GOAWAY and MAX_PUSH_ID frames has been adjusted.
The impact of this bug is limited as currently these frames are not
handled by haproxy and are ignored.
Glenn Strauss [Sun, 5 Jun 2022 02:11:50 +0000 (22:11 -0400)]
OPTIM: mux-h2: increase h2_settings_initial_window_size default to 64k
This changes the default from RFC 7540's default 65535 (64k-1) to avoid
avoid some degenerative WINDOW_UPDATE behaviors in the wild observed with
clients using 65536 as their buffer size, and have to complete each block
with a 1-byte frame, which with some servers tend to degenerate in 1-byte
WU causing more 1-byte frames to be sent until the transfer almost only
uses 1-byte frames.
More details here: https://github.com/nghttp2/nghttp2/issues/1722
As mentioned in previous commit (MEDIUM: mux-h2: try to coalesce outgoing
WINDOW_UPDATE frames) the issue could not be reproduced with haproxy but
individual WU frames are sent so theoretically nothing prevents this from
happening. As such it should be backported as a workaround for already
deployed clients after watching for any possible side effect with rare
clients. As an added benefit, uploads from curl now use less DATA frames
(all are 16384 now). Note that the previous patch alone is sufficient to
stop the issue with curl in case this one would need to be reverted.
Willy Tarreau [Wed, 8 Jun 2022 14:32:22 +0000 (16:32 +0200)]
MEDIUM: mux-h2: try to coalesce outgoing WINDOW_UPDATE frames
Glenn Strauss from Lighttpd reported a corner case affecting curl+lighttpd
that causes some uploads to degenerate to extremely suboptimal conditions
under certain circumstances, and noted that many other implementations
were possibly not safe against this degradation.
Glenn's detailed analysis is available here:
https://github.com/nghttp2/nghttp2/issues/1722
In short, curl uses a 65536 bytes buffer and the default stream window
is 65535, with 16384 bytes per frame. Curl will then send 3 frames of
16384 bytes followed by one of 16383, will wait for a window update to
send the last byte before recycling the buffer to read the next 64kB.
On each round like this, one extra single-byte frame will be sent, and
if ACKs for these single-byte frames are not aggregated, this will only
allow the client to send one extra byte at a time. At some point it is
possible (at least Glenn observed it) to have mostly 1-byte frames in
the transfer, resulting in huge CPU usage and a long transfer.
It was not possible to reproduce this with haproxy, even when playing
with frame sizes, buffer sizes nor window sizes. One reason seems to
be that we're using the same buffer size for the connection and the
stream and that the frame headers prevent the filling of the window
from happening on the same boundaries as on the sender. However it
does occasionally happen to see up to two 1-byte data frames in a row,
indicating that there's definitely room for improvement.
The WINDOW_UPDATE frames for the connection are sent at the end of the
demuxing, but the ones for the streams are currently sent immediately
after a DATA frame is processed, mostly for convenience. But we don't
need to proceed like this, we already have the counter of unacked bytes
in rcvd_s, so we can simply use that to decide when to send an ACK. It
must just be done before processing a new frame. The benefit is that
contiguous frames for the same stream will now only produce a single
WU, like for the connection. On complicated tests involving a client
that was limited to 100 Mbps transfers and a dummy Lua-based payload
consumer, it was possible to see the number of stream WU frames being
halved for a 100 MB transfer, which is already a nice saving anyway.
Glenn proposed a better workaround consisting in increasing the
default window size to 65536. This will be done in a separate patch
so that both can be studied independently in field and backported as
needed.
This patch is not much complicated and shold be backportable. It just
needs to be tested in development first.
BUG/MINOR: h3: fix incorrect BUG_ON assert on SETTINGS parsing
BUG_ON() assertion to check for incomplete SETTINGS frame is incorrect.
It should check if frame length is greater, not smaller, than current
buffer data. Anyway, this BUG_ON() is useless as h3_decode_qcs()
prevents parsing of an incomplete frame, except for H3 DATA. Remove it
to fix this bug.
This bug was introduced in the current dev tree by commit
commit 62eef85961f4a2a241e0b24ef540cc91f156b842
MINOR: mux-quic: simplify decode_qcs API
Thus it does not need to be backported.
This fixes crashes which happen with DEBUG_STRICT=2. Most notably, this
is reproducible with clients that emit more than just a SETTINGS frame
on the H3 control stream. It can be reproduced with aioquic for example.
The info field in the log message may change. For instance, on FreeBSD, a
"broken pipe" is reported. Thus, the expected log message must be more
generic.
BUG/MEDIUM: mailers: Set the object type for check attached to an email alert
The health-check attached to an email alert has no type. It is unexpected,
and since the 2.6, it is important because we rely on it to know the
application type in front of a connection at the stream-connector
level. Because the object type is not set, the SE descriptor is not properly
initialized, leading to a segfault when a connection to the SMTP server is
established.
This patch must be backported to 2.6 and may be backported as far as
2.0. However, it is only an issue for the 2.6 and upper.
BUG/MINOR: checks: Properly handle email alerts in trace messages
There is no server for email alerts. So the trace messages must be adapted
to handle this case. Information related to the server are now skipped for
email alerts and "[EMAIL]" prefix is used.
BUG/MINOR: trace: Test server existence for health-checks to get proxy
Email alerts are based on health-checks but with no server. Thus, in
__trace() function, responsible to write a trace message, we must be
prepared to have no server and thus no proxy.
Willy Tarreau [Tue, 7 Jun 2022 10:03:48 +0000 (12:03 +0200)]
DEV: tcploop: add a new "bind" command to bind to ip/port.
The Listen command automatically relies on it (without passing its
argument), and both Listen and Connect now support working with the
existing socket, so that it's possible to Bind an ip:port on an
existing socket or to create a new one for the purpose of listening
or connecting. It now becomes possible to do:
Willy Tarreau [Tue, 7 Jun 2022 09:46:57 +0000 (11:46 +0200)]
DEV: tcploop: make it possible to change the target address of a connect()
Sometimes it's more convenient to be able to specify where to connect on
the connect() statement, let's make it possible to pass it in argument to
the C command.
Willy Tarreau [Wed, 8 Jun 2022 10:14:23 +0000 (12:14 +0200)]
BUILD: compiler: implement unreachable for older compilers too
Benoit Dolez reported that gcc-4.4 emits several "may be used
uninitialized" warnings around places where there are BUG_ON()
or ABORT_NOW(). The reason is that __builtin_unreachable() was
introduced in gcc-4.5 thus older ones do not know that the code
after such statements is not reachable.
This patch solves the problem by deplacing the statement with
an infinite loop on older versions. The compiler knows that the
code following it cannot be reached, and this is quite cheap
(2 to 4 bytes depending on architectures). It even reduces the
code size a little bit as the compiler doesn't have to optimize
for branches that do not exist.
Benoit DOLEZ [Wed, 8 Jun 2022 07:28:56 +0000 (09:28 +0200)]
BUILD: quic: fix anonymous union for gcc-4.4
Building QUIC with gcc-4.4 on el6 shows this error:
src/xprt_quic.c: In function 'qc_release_lost_pkts':
src/xprt_quic.c:1905: error: unknown field 'loss' specified in initializer
compilation terminated due to -Wfatal-errors.
make: *** [src/xprt_quic.o] Error 1
make: *** Waiting for unfinished jobs....
Initializing an anonymous form of union like :
struct quic_cc_event ev = {
(...)
.loss.time_sent = newest_lost->time_sent,
(...)
};
generates an error with gcc-4.4 but not when initializing the
fields outside of the declaration.
Without this, QUIC MUX won't consider the call as an error and will try
to remove one byte from the buffer. This may cause a BUG_ON failure if
the buffer is empty at this stage.
This bug was introduced in the current dev tree. Does not need to be
backported.
MINOR: mux-quic/h3: adjust demuxing function return values
Clean the API used by decode_qcs() and transcoder internal functions.
Parsing functions now returns a ssize_t which represents the number of
consumed bytes or a negative error code. The total consumed bytes is
returned via decode_qcs().
The API is now unified and cleaner. The MUX can thus simply use the
return value of decode_qcs() instead of substracting the data bytes in
the buffer before and after the call. Transcoders functions are not
anymore obliged to remove consumed bytes from the buffer which was not
obvious.
Slightly modify decode_qcs function used by transcoders. The MUX now
gives a buffer instance on which each transcoder is free to work on it.
At the return of the function, the MUX removes consume data from its own
buffer.
This reduces the number of invocation to qcs_consume at the end of a
full demuxing process. The API is also cleaner with the transcoders not
responsible of calling it with the risk of having the input buffer
freed if empty.
As a mirror to qcc/qcs types, add a h3c pointer into h3s struct. This
should help to clean up H3 code and avoid to use qcs.qcc.ctx to retrieve
the h3c instance.
MINOR: connection: support HTTP/3.0 for smp_*_http_major fetch
smp_fc_http_major may be used to return the http version as an integer
used on the frontend or backend side. Previously, the handler only
checked for version 2 or 1 as a fallback. Extend it to support version 3
with the QUIC mux.
BUG/MINOR: ssl_ckch: Fix another possible uninitialized value
Commit d6c66f06a ("MINOR: ssl_ckch: Remove service context for "set ssl
crl-file" command") introduced a regression leading to a build error because
of a possible uninitialized value. It is now fixed.
BUILD: ssl_ckch: Fix build error about a possible uninitialized value
A build error is reported about the path variable in the switch statement on
the commit type, in cli_io_handler_commit_cafile_crlfile() function. The
enum contains only 2 values, but a default clause has been added to return an
error to make GCC happy.
BUG/MINOR: ssl_ckch: Fix possible uninitialized value in show_crlfile I/O handler
Commit 9a99e5478 ("BUG/MINOR: ssl_ckch: Dump CRL transaction only once if
show command yield") introduced a regression leading to a build error
because of a possible uninitialized value. It is now fixed.
BUG/MINOR: ssl_ckch: Fix possible uninitialized value in show_cafile I/O handler
Commit 5a2154bf7 ("BUG/MINOR: ssl_ckch: Dump CA transaction only once if
show command yield") introduced a regression leading to a build error
because of a possible uninitialized value. It is now fixed.
BUG/MINOR: ssl_ckch: Fix possible uninitialized value in show_cert I/O handler
Commit 3e94f5d4b ("BUG/MINOR: ssl_ckch: Dump cert transaction only once if
show command yield") introduced a regression leading to a build error
because of a possible uninitialized value. It is now fixed.
MINOR: ssl_ckch: Simplify structure used to commit changes on CA/CRL entries
The same type is used for CA and CRL entries. So, in commit_cert_ctx
structure, there is no reason to have different fields for the CA and CRL
entries.
BUG/MINOR: ssl_ckch: Init right field when parsing "commit ssl crl-file" cmd
.next_ckchi_link field must be initialized to NULL instead of .next_ckchi in
cli_parse_commit_crlfile() function. Only '.nex_ckchi_link' is used in the
I/O handler.
This patch must be backported as far as 2.5 with some adaptations for the 2.5.
BUG/MINOR: ssl_ckch: Dump cert transaction only once if show command yield
When loaded SSL certificates are displayed via "show ssl cert" command, the
in-progess transaction, if any, is also displayed. However, if the command
yield, the transaction is re-displayed again and again.
To fix the issue, old_ckchs field is used to remember the transaction was
already displayed.
BUG/MINOR: ssl_ckch: Dump CA transaction only once if show command yield
When loaded CA files are displayed via "show ssl ca-file" command, the
in-progress transaction, if any, is also displayed. However, if the command
yield, the transaction is re-displayed again and again.
To fix the issue, old_cafile_entry field is used to remember the transaction
was already displayed.
BUG/MINOR: ssl_ckch: Dump CRL transaction only once if show command yield
When loaded CRL files are displayed via "show ssl crl-file" command, the
in-progess transaction, if any, is also displayed. However, if the command
yield, the transaction is re-displayed again and again.
To fix the issue, old_crlfile_entry field is used to remember the transaction
was already displayed.
BUG/MINOR: ssl_ckch: Use right type for old entry in show_crlfile_ctx
Because of a typo (I guess), an unknown type is used for the old entry in
show_crlfile_ctx structure. Because this field is unused, there is no
compilation error. But it must be a cafile_entry and not a crlfile_entry.
Note this field is not used for now, but it will be used.
MINOR: ssl_ckch: Simplify I/O handler to commit changes on CA/CRL entry
Simplify cli_io_handler_commit_cafile_crlfile() handler function by
retrieving old and new entries at the beginning. In addition the path is
also retrieved at this stage. This removes several switch statements.
Note that the ctx was already validated by the corresponding parsing
function. Thus there is no reason to test the pointers.
While it is not a bug, this patch may help to fix issue #1731.
CLEANUP: ssl_ckch: Use corresponding enum for commit_cacrlfile_ctx.cafile_type
There is an enum to determine the entry entry type when changes are
committed on a CA/CRL entry. So use it in the service context instead of an
integer.
REGTESTS: http_request_buffer: Increase client timeout to wait "slow" clients
The default client timeout is too small to be sure to always wait end of
slow clients (the last 2 clients use a delay to send their request). But it
cannot be increased because it will slow down the regtest execution. So a
dedicated frontend with a higher client timeout has been added. This
frontend is used by "slow" clients. The other one is used for normal
requests.
REGTESTS: abortonclose: Add a barrier to not mix up log messages
Depending on the timing, time to time, the log message for "/c4" request can
be received before the one for "/c2" request. To (hopefully) fix the issue,
a barrier has been added to wait "/c2" log message before sending other
requests.
MEDIUM: http-ana: Always report rewrite failures as PRXCOND in logs
Rewrite failures in http rules are reported as proxy errors (PRXCOND) in
logs. However, other rewrite errors are reported as internal errors. For
instance, it happens when we fail to add X-Forwarded-For header. It is not
consistent and it is confusing. So now, all rewite failures are reported as
proxy errors.
BUG/MEDIUM: httpclient: Rework CLI I/O handler to handle full buffer cases
'httpclient' command does not properly handle full buffer cases. When the
response buffer is full, we exit to retry later. However, the context flags
are updated. It means when this happens, we may loose a part of the
response.
So now, flags are preserved when we fail to push data into the response
buffer. In addition, instead of dumping one part per call, we now try to
dump as much data as possible.
Finally, when there is no more data, because everything was dumped or
because we are waiting for more data from the HTTP client, the applet is
updated accordingly by calling applet_have_no_more_data(). Otherwise, when
some data are blocked, applet_putchk() already takes care to update the SE
flags. So, it is useless to call sc_need_room().
This patch should fix the issue #1723. It must be backported as far as
2.5. But a massive refactoring was performed in 2.6. So, for the 2.5 and
below, the patch will have to be adapted.
BUG/MEDIUM: httpclient: Don't remove HTX header blocks before duplicating them
Commit 534645d6 ("BUG/MEDIUM: httpclient: Fix loop consuming HTX blocks from
the response channel") introduced a regression. When the response is
consumed, The HTX header blocks are removed before duplicating them. Thus,
the first header block is always lost.
BUG/MEDIUM: ssl/crt-list: Rework 'add ssl crt-list' to handle full buffer cases
'add ssl crt-list' command is also concerned. This patch is similar to the
previous ones. Full buffer cases when we try to push the reply are not
properly handled. To fix the issue, the functions responsible to add a
crt-list entry were reworked.
First, the error message is now part of the service context. This way, if we
cannot push the error message in the reponse buffer, we may retry later. To
do so, a dedicated state was created (ADDCRT_ST_ERROR,). Then, the success
message is also handled in a dedicated state (ADDCRT_ST_SUCCESS). This way
we are able to retry to push it if necessary. Finally, the dot displayed for
each new instance is now immediatly pushed in the response buffer, and
before the update. This way, we are able to retry too if necessary.
This patch should fix the issue #1724. It must be backported as far as
2.2. But a massive refactoring was performed in 2.6. So, for the 2.5 and
below, the patch will have to be adapted.
BUG/MEDIUM: ssl_ckch: Rework 'commit ssl ca-file' to handle full buffer cases
'commit ssl crl-file' command is also concerned. This patch is similar to
the previous one. Full buffer cases when we try to push the reply are not
properly handled. To fix the issue, the functions responsible to commit CA
or CRL entry changes were reworked.
First, the error message is now part of the service context. This way, if we
cannot push the error message in the reponse buffer, we may retry later. To
do so, a dedicated state was created (CACRL_ST_ERROR). Then, the success
message is also handled in a dedicated state (CACRL_ST_SUCCESS). This way we
are able to retry to push it if necessary. Finally, the dot displayed for
each updated CKCH instance is now immediatly pushed in the response buffer,
and before the update. This way, we are able to retry too if necessary.
This patch should fix the issue #1722. It must be backported as far as
2.5. But a massive refactoring was performed in 2.6. So, for the 2.5, the
patch will have to be adapted.
BUG/MEDIUM: ssl_ckch: Rework 'commit ssl cert' to handle full buffer cases
When changes on a certificate are commited, a trash buffer is used to create
the response. Once done, the message is copied in the response buffer.
However, if the buffer is full, there is no way to retry and the message is
lost. The same issue may happen with the error message. It is a design issue
of cli_io_handler_commit_cert() function.
To fix it, the function was reworked. First, the error message is now part
of the service context. This way, if we cannot push the error message in the
reponse buffer, we may retry later. To do so, a dedicated state was created
(CERT_ST_ERROR). Then, the success message is also handled in a dedicated
state (CERT_ST_SUCCESS). This way we are able to retry to push it if
necessary. Finally, the dot displayed for each updated CKCH instance is now
immediatly pushed in the response buffer, and before the update. This way,
we are able to retry too if necessary.
This patch should fix the issue #1725. It must be backported as far as
2.2. But massive refactoring was performed in 2.6. So, for the 2.5 and
below, the patch must be adapted.
BUG/MINOR: ssl_ckch: Don't duplicate path when replacing a CA/CRL entry
When a CA or CRL entry is replaced (via 'set ssl ca-file' or 'set ssl
crl-file' commands), the path is duplicated and used to identify the ongoing
transaction. However, if the same command is repeated, the path is still
duplicated but the transaction is not changed and the duplicated path is not
released. Thus there is a memory leak.
By reviewing the code, it appears there is no reason to duplicate the
path. It is always the filename path of the old entry. So, a reference on it
is now used. This simplifies the code and this fixes the memory leak.
BUG/MINOR: ssl_ckch: Don't duplicate path when replacing a cert entry
When a certificate entry is replaced (via 'set ssl cert' command), the path
is duplicated and used to identify the ongoing transaction. However, if the
same command is repeated, the path is still duplicated but the transaction
is not changed and the duplicated path is not released. Thus there is a
memory leak.
By reviewing the code, it appears there is no reason to duplicate the
path. It is always the path of the old entry. So, a reference on it is now
used. This simplifies the code and this fixes the memory leak.
BUG/MEDIUM: ssl_ckch: Don't delete CA/CRL entry if it is being modified
When a CA or a CRL entry is being modified, we must take care to no delete
it because the corresponding ongoing transaction still references it. If we
do so, it leads to a null-deref and a crash may be exeperienced if changes
are commited.
BUG/MEDIUM: ssl_ckch: Don't delete a cert entry if it is being modified
When a certificate entry is being modified, we must take care to no delete
it because the corresponding ongoing transaction still references it. If we
do so, it leads to a null-deref and a crash may be exeperienced if changes
are commited.
Willy Tarreau [Tue, 31 May 2022 14:58:21 +0000 (16:58 +0200)]
[RELEASE] Released version 2.6.0
Released version 2.6.0 with the following main changes :
- DOC: Fix formatting in configuration.txt to fix dconv
- CLEANUP: tcpcheck: Remove useless test on the stream-connector in tcpcheck_main
- CLEANUP: muxes: Consider stream's sd as defined in .show_fd callback functions
- MINOR: quic: Ignore out of packet padding.
- CLEANUP: quic: Useless QUIC_CONN_TX_BUF_SZ definition
- CLEANUP: quic: No more used handshake output buffer
- MINOR: quic: QUIC transport parameters split.
- MINOR: quic: Transport parameters dump
- DOC: quic: Update documentation for QUIC Retry
- MINOR: quic: Tunable "max_idle_timeout" transport parameter
- MINOR: quic: Tunable "initial_max_streams_bidi" transport parameter
- MINOR: quic: Clarifications about transport parameters value
- MINOIR: quic_stats: add QUIC connection errors counters
- BUG/MINOR: quic: Largest RX packet numbers mixing
- MINOR: quic_stats: Add transport new counters (lost, stateless reset, drop)
- DOC: quic: Documentation update for QUIC
- MINOR: quic: Connection TX buffer setting renaming.
- MINOR: h3: Add a statistics module for h3
- MINOR: quic: Send STOP_SENDING frames if mux is released
- MINOR: quic: Do not drop packets with RESET_STREAM frames
- BUG/MINOR: qpack: fix buffer API usage on prefix integer encoding
- BUG/MINOR: qpack: support bigger prefix-integer encoding
- BUG/MINOR: h3: do not report bug on unknown method
- SCRIPTS: add make-releases-json to recreate a releases.json file in download dirs
- SCRIPTS: make publish-release try to launch make-releases-json
- MINOR: htx: add an unchecked version of htx_get_head_blk()
- BUILD: htx: use the unchecked version of htx_get_head_blk() where needed
- BUILD: quic: use inttypes.h instead of stdint.h
- DOC: internal: remove totally outdated diagrams
- DOC: remove the outdated ROADMAP file
- DOC: add maintainers for QUIC and HTTP/3
- MINOR: h3: define h3 trace module
- MINOR: h3: add traces on frame recv
- MINOR: h3: add traces on frame send
- MINOR: h3: add traces on h3s init/end
- EXAMPLES: remove completely outdated acl-content-sw.cfg
- BUILD: makefile: reorder objects by build time
- DOC: fix a few spelling mistakes in the docs
- BUG/MEDIUM: peers/cli: fix "show peers" crash
- CLEANUP: peers/cli: stop misusing the appctx local variable
- CLEANUP: peers/cli: make peers_dump_peer() take an appctx instead of an stconn
- BUG/MINOR: peers: set the proxy's name to the peers section name
- MINOR: server: indicate when no address was expected for a server
- BUG/MINOR: peers: detect and warn on init_addr/resolvers/check/agent-check
- DOC: peers: indicate that some server settings are not usable
- DOC: peers: clarify when entry expiration date is renewed.
- DOC: peers: fix port number and addresses on new peers section format
- DOC: gpc/gpt: add commments of gpc/gpt array definitions on stick tables.
- DOC: install: update supported OpenSSL versions in the INSTALL doc
- MINOR: ncbuf: adjust ncb_data with NCBUF_NULL
- BUG/MINOR: h3: fix frame demuxing
- BUG/MEDIUM: h3: fix H3_EXCESSIVE_LOAD when receiving H3 frame header only
- BUG/MINOR: quic: Fix QUIC_EV_CONN_PRSAFRM event traces
- CLEANUP: quic: remove useless check on local UNI stream reception
- BUG/MINOR: qpack: do not consider empty enc/dec stream as error
- DOC: intro: adjust the numbering of paragrams to keep the output ordered
- MINOR: version: mention that it's LTS now.
Willy Tarreau [Tue, 31 May 2022 14:23:06 +0000 (16:23 +0200)]
DOC: intro: adjust the numbering of paragrams to keep the output ordered
The HTML version appeared with sections in a different order where
3.3.10..3.3.16 were placed between 3.3.1 and 3.3.2. This patch just slips
them into an intermediary section so that we now have "basic features",
"standard features", and "advanced features".
Amaury Denoyelle [Tue, 31 May 2022 13:21:27 +0000 (15:21 +0200)]
BUG/MINOR: qpack: do not consider empty enc/dec stream as error
When parsing QPACK encoder/decoder streams, h3_decode_qcs() displays an
error trace if they are empty. Change the return code used in QPACK code
to avoid this trace.
To uniformize with MUX/H3 code, 0 is now used to indicate success.
Beyond this spurious error trace, this bug has no impact.
Amaury Denoyelle [Tue, 31 May 2022 13:17:02 +0000 (15:17 +0200)]
CLEANUP: quic: remove useless check on local UNI stream reception
The MUX now provides a single API for both uni and bidirectional
streams. It is responsible to reject reception on a local unidirectional
stream with the error STREAM_STATE_ERROR. This is already implemented in
qcc_recv(). As such, remove this duplicated check from xprt_quic.c.
Amaury Denoyelle [Tue, 31 May 2022 12:18:33 +0000 (14:18 +0200)]
BUG/MEDIUM: h3: fix H3_EXCESSIVE_LOAD when receiving H3 frame header only
The H3 frame demuxing code is incorrect when receiving a STREAM frame
which contains only a new H3 frame header without its payload.
In this case, the check on frames bigger than the buffer size is
incorrect. This is because the buffer has been freed via
qcs_consume()/qc_free_ncbuf() as it was emptied after H3 frame header
parsing. This causes the connection to be incorrectly closed with
H3_EXCESSIVE_LOAD error.
This bug was reproduced with xquic client on the interop and with the
command-line invocation :
$ ./interop_client -l d -k $SSLKEYLOGFILE -a <addr> -p <port> -D /tmp \
-A h3 -U https://<addr>:<port>/hello_world.txt
Note also that h3_is_frame_valid() invocation has been moved before the
new buffer size check. This ensures that first we check the frame
validity before returning from the function. It's also better
positionned as this is only needed when a new H3 frame header has been
parsed.
Amaury Denoyelle [Tue, 31 May 2022 09:44:52 +0000 (11:44 +0200)]
BUG/MINOR: h3: fix frame demuxing
The H3 demuxing code was not fully correct. After parsing the H3 frame
header, the check between frame length and buffer data is wrong as we
compare a copy of the buffer made before the H3 header removal.
Fix this by improving the H3 demuxing code API. h3_decode_frm_header()
now uses a ncbuf instance, this prevents an unnecessary cast
ncbuf/buffer in h3_decode_qcs() which resolves this error.
This bug was not triggered at this moment. Its impact should be really
limited.
Amaury Denoyelle [Tue, 31 May 2022 09:44:25 +0000 (11:44 +0200)]
MINOR: ncbuf: adjust ncb_data with NCBUF_NULL
Replace ncb_blk_is_null() by ncb_is_null() as a prelude to ncb_data().
The result is the same : the function will return 0 if the buffer is
uninitialized. However, it is clearer to directly call ncb_is_null() to
reflect this.
Willy Tarreau [Tue, 31 May 2022 09:37:37 +0000 (11:37 +0200)]
DOC: install: update supported OpenSSL versions in the INSTALL doc
OpenSSL 3.0 is now supported but was not mentioned. Also, it was
found that OpenSSL 0.9.8 doesn't build anymore since 2.5 due to
some of the functions used in the JWT token processing, and since
nobody complained, it seems it's not worth fixing it so support for
it was removed.
Emeric Brun [Fri, 25 Mar 2022 13:13:23 +0000 (14:13 +0100)]
DOC: gpc/gpt: add commments of gpc/gpt array definitions on stick tables.
Some users misunderstood that the parameter of gpc() gpt()
store types on the table line presents the number of elements
of the array to store and not an index of gpt/gpc tag/counter.
Willy Tarreau [Tue, 31 May 2022 08:22:12 +0000 (10:22 +0200)]
DOC: peers: indicate that some server settings are not usable
Let's make it clear in the peers documentation that not all server
parameters may be used, as there is some confusion around this, and
the doc was even misleading by saying that all parameters were
supported.
Willy Tarreau [Tue, 31 May 2022 07:42:44 +0000 (09:42 +0200)]
BUG/MINOR: peers: detect and warn on init_addr/resolvers/check/agent-check
Some server keywords are currently silently ignored in the peers
section, which is not good because it wastes time on user-side, trying
to make something work while it cannot by design.
With this patch we at least report a few of them (the most common ones),
which are init_addr, resolvers, check, agent-check. Others might follow.
This may be backported to 2.5 to encourage some cleaning of bogus configs.
Willy Tarreau [Tue, 31 May 2022 07:25:34 +0000 (09:25 +0200)]
MINOR: server: indicate when no address was expected for a server
When parsing a peers section, it's particularly difficult to make the
difference between the local peer which doesn't have any address, and
other peers which need one, and the error messages do not help because
with just:
peers foo
bind :8001
server foo 127.0.0.1:8001
server bar 127.0.0.2:8001
One can get such a confusing message when the local peer is "bar":
It's not clear there why the other peer doesn't trigger an error.
With this commit we add a hint in the error message when no address
was expected. The error remains quite generic (since deep into the
server code) but at least the useer gets a hint about why the keyword
wasn't understood:
[peers.cfg:15] : 'server foo/bar' : unknown keyword '127.0.0.1:8001'.
Hint: no address was expected for this server.
Willy Tarreau [Tue, 31 May 2022 07:10:19 +0000 (09:10 +0200)]
BUG/MINOR: peers: set the proxy's name to the peers section name
For some poor historical reasons, the name of a peers proxy used to be
set to the name of the local peer itself. That causes some confusion when
multiple sections are present because the same proxy name appears at
multiple places in "show peers", but since 2.5 where parsing errors include
the proxy name, a config like this one :
peers foo
server foobar blah
Would report this when the local peer name isn't "foobar":
'server (null)/foobar' : invalid address: 'blah' in 'blah'
And this when it is foobar:
'server foobar/foobar' : invalid address: 'blah' in 'blah'
This is wrong, confusing and not very practical. This commit addresses
all this by using the peers section's name when it's created. This now
allows to report messages such as:
'server foo/foobar' : invalid address: 'blah' in 'blah'
Which make it clear that the section is called "foo" and the server
"foobar".
This may be backported to 2.5, though the patch may be simplified if
needed, by just adding the change at the output of init_peers_frontend().
Willy Tarreau [Tue, 31 May 2022 06:55:54 +0000 (08:55 +0200)]
CLEANUP: peers/cli: make peers_dump_peer() take an appctx instead of an stconn
By having the appctx in argument this function wouldn't have experienced
the previous bug. Better do that now to avoid proliferation of awkward
functions.
Willy Tarreau [Tue, 31 May 2022 06:53:25 +0000 (08:53 +0200)]
CLEANUP: peers/cli: stop misusing the appctx local variable
In the context of a CLI command, it's particularly not welcome to use
an "appctx" variable that is not the current one. In addition it was
created for use at exactly 6 places in 2 lines. Let's just remove it
and stick to peer->appctx which is used elsewhere in the function and
is unambiguous.
Willy Tarreau [Tue, 31 May 2022 06:49:29 +0000 (08:49 +0200)]
BUG/MEDIUM: peers/cli: fix "show peers" crash
Commit d0a06d52f ("CLEANUP: applet: use applet_put*() everywhere possible")
replaced most accesses to the conn_stream with simpler accesses to the
appctx. Unfortunately, in all the CLI functions using an appctx, one
makes an exception where the appctx is not the caller's but the one being
inspected! When no peers connection is active, the early exit immediately
crashes.
Willy Tarreau [Mon, 30 May 2022 17:24:27 +0000 (19:24 +0200)]
BUILD: makefile: reorder objects by build time
As usual, let's sort objects by inverse build time at -O2. It will
still vary based on the options but keeps them optimally sorted for
parallel builds.
This config probably last worked on 1.3, maybe 1.4, but it uses too
many obsolete statements and it silently errors because of the "quiet"
directive, which adds to the confusion. Let's remove it.