Olivier Houchard [Mon, 25 Mar 2019 13:10:42 +0000 (14:10 +0100)]
BUG/MEDIUM: h2: Remove the tasklet from the task list if unsubscribing.
In h2_unsubscribe(), if we unsubscribe on SUB_CALL_UNSUBSCRIBE, then remove
ourself from the sending_list, and remove the tasklet from the task list.
We're probably about to destroy the stream anyway, so we don't want the
tasklet to run, or to stay in the sending_list, or it could lead to a crash.
Olivier Houchard [Mon, 25 Mar 2019 13:08:01 +0000 (14:08 +0100)]
BUG/MEDIUM: h2: Follow the same logic in h2_deferred_shut than in h2_snd_buf.
In h2_deferred_shut(), don't just set h2s->send_wait to NULL, instead, use
the same logic as in h2_snd_buf() and only do so if we successfully sent data
(or if we don't want to send them anymore). Setting it to NULL can lead to
crashes.
Olivier Houchard [Mon, 25 Mar 2019 12:25:02 +0000 (13:25 +0100)]
BUG/MEDIUM: h2: only destroy the h2s if h2s->cs is NULL.
In h2_deferred_shut(), only attempt to destroy the h2s if h2s->cs is NULL.
h2s->cs being non-NULL means it's still referenced by the stream interface,
so it may try to use it later, and that could lead to a crash.
MEDIUM: proto_htx: Reintroduce the infinite forwarding on data
This commit was reverted because of bugs. Now it should be ok. Difference with
the commit f52170d2f ("MEDIUM: proto_htx: Switch to infinite forwarding if there
is no data filte") is that when the infinite forwarding is enabled, the message
is switched to the state HTTP_MSG_DONE if the flag CF_EOI is set.
BUG/MEDIUM: http/htx: Fix handling of the option abortonclose
Because the flag CF_SHUTR is no more set to mark the end of the message by the
H2 multiplexer, we can rely on it again to detect aborts. there is no more need
to make a check on the flag SI_FL_CLEAN_ABRT when the option abortonclose is
enabled. So, this option should work as before for h2 clients.
This patch must be backported to 1.9 with the previous EOI patches.
MINOR: channel: Report EOI on the input channel if it was reached in the mux
The flag CF_EOI is now set on the input channel when the flag CS_FL_EOI is set
on the corresponding conn_stream. In addition, if a read activity is reported
when this flag is set, the stream is woken up.
MINOR: connection: and new flag to mark end of input (EOI)
Since the begining, in the H2 multiplexer, when the end of a message is reached,
the flag CS_FL_(R)EOS is set on the conn_stream to notify the upper layer that
all data were received and consumed and there is no longer any expected. The
stream-interface converts it into a shutdown read. But it leads to some
ambiguities with the real shutr. Once it was reported at the end of the message,
there is no way to report it when the read0 is received. For this reason, aborts
after the message was fully received cannot be reported. And on the channel
side, it is hard to make the difference between a shutr because the end of the
message was reached and a shutr because of an abort.
For these reasons, there is now a flag to mark the end of the message. It is
called CS_FL_EOI (end-of-input) because it is only used on the receipt path.
This flag is only declared and not used yet.
This patch will be used by future bug fixes and will have to be backported
to 1.9.
BUG/MINOR: proto-http: Don't forward request body anymore on error
In the commit 93e02d8b7 ("MINOR: proto-http/proto-htx: Make error handling
clearer during data forwarding"), a return clause was removed by error in the
function http_request_forward_body(). This bug seems not having any visible
impact.
Olivier Houchard [Fri, 22 Mar 2019 16:37:16 +0000 (17:37 +0100)]
BUG/MEDIUM: h2: Try to be fair when sending data.
On the send path, try to be fair, and make sure the first to attempt to
send data will actually be the first to send data when it's possible (ie
when the mux' buffer is not full anymore).
To do so, use a separate list element for the sending_list, and only remove
the h2s from the send_list/fctl_list if we successfully sent data. If we did
not, we'll keep our place in the list, and will be able to try again next time.
Radek Zajic [Fri, 22 Mar 2019 10:21:54 +0000 (10:21 +0000)]
BUG/MINOR: log: properly format IPv6 address when LOG_OPT_HEXA modifier is used.
In lf_ip(), when LOG_OPT_HEXA modifier is used, there is a code to format the
IP address as a hexadecimal string. This code does not properly handle cases
when the IP address is IPv6. In such case, the code only prints `00000000`.
This patch adds support for IPv6. For legacy IPv4, the format remains
unchanged. If IPv6 socket is used to accept IPv6 connection, the full IPv6
address is returned. For example, IPv6 localhost, ::1, is printed as 00000000000000000000000000000001.
If IPv6 socket accepts IPv4 connection, the IPv4 address is mapped by the
kernel into the IPv4-mapped-IPv6 address space (RFC4291, section 2.5.5.2)
and is formatted as such. For example, 127.0.0.1 becomes ::ffff:127.0.0.1,
which is printed as 00000000000000000000FFFF7F000001.
Pierre Cheynier [Thu, 21 Mar 2019 16:15:47 +0000 (16:15 +0000)]
BUG/MEDIUM: ssl: ability to set TLS 1.3 ciphers using ssl-default-server-ciphersuites
Any attempt to put TLS 1.3 ciphers on servers failed with output 'unable
to set TLS 1.3 cipher suites'.
This was due to usage of SSL_CTX_set_cipher_list instead of
SSL_CTX_set_ciphersuites in the TLS 1.3 block (protected by
OPENSSL_VERSION_NUMBER >= 0x10101000L & so).
Willy Tarreau [Thu, 21 Mar 2019 18:19:36 +0000 (19:19 +0100)]
CLEANUP: mux-h2: add some comments to help understand the code
Some functions' roles and usage are far from being obvious, and diving
into this part each time requires deep concentration before starting to
understand who does what. Let's add a few comments which help figure
some of the useful pieces.
Willy Tarreau [Thu, 21 Mar 2019 16:47:28 +0000 (17:47 +0100)]
MINOR: mux-h2: copy small data blocks more often and reduce the number of pauses
We tend to refrain from sending data a bit too much in the H2 mux :
whenever there are pending data in the buffer and we try to copy
something larger than 1/4 of the buffer we prefer to pause. This
is suboptimal for medium-sized objects which have to send their
headers and later their data.
This patch slightly changes this by allowing a copy of a large block
if it fits at once and if the realign cost is small, i.e. the pending
data are small or the block fits in the contiguous area. Depending on
the object size this measurably improves the download performance by
between 1 and 10%, and possibly lowers the transfer latency for medium
objects.
Olivier Houchard [Thu, 21 Mar 2019 14:50:58 +0000 (15:50 +0100)]
BUG/MEDIUM: mux-h2: Use the right list in h2_stop_senders().
In h2_stop_senders(), when we're about to move the h2s about to send back
to the send_list, because we know the mux is full, instead of putting them
all in the send_list, put them back either in the fctl_list or the send_list
depending on if they are waiting for the flow control or not. This also makes
sure they're inserted in their arrival order and not reversed.
Olivier Houchard [Thu, 21 Mar 2019 14:48:46 +0000 (15:48 +0100)]
BUG/MEDIUM: mux-h2: Don't bother keeping the h2s if detaching and nothing to send.
In h2_detach(), don't bother keeping the h2s even if it was waiting for
flow control if we no longer are subscribed for receiving or sending, as
nobody will do anything once we can write in the mux, anyway. Failing to do
so may lead to h2s being kept opened forever.
Olivier Houchard [Thu, 21 Mar 2019 14:47:13 +0000 (15:47 +0100)]
BUG/MEDIUM: mux-h2: Make sure we destroyed the h2s once shutr/shutw is done.
If we're waiting until we can send a shutr and/or a shutw, once we're done
and not considering sending anything, destroy the h2s, and eventually the
h2c if we're done with the whole connection, or it will never be done.
This commit was merged too early, some areas are not ready and
transfers from H1 to H2 often stall. Christopher suggested to wait
for the other parts to be ready before reintroducing it.
MINOR: http/applets: Handle all applets intercepting HTTP requests the same way
In addition to stats and cache applets, there are also HTTP applet services
declared in an http-request rule. All these applets are now handled the same
way. Among other things, the header Expect is handled at the same place for all
these applets.
MINOR: stats/cache: Handle the header Expect when applets are registered
First of all, it is a way to handle 100-Continue for the cache without
duplicating code. Then, for the stats, it is no longer necessary to wait for the
request body.
BUG/MEDIUM: lua: Fully consume large requests when an HTTP applet ends
In Lua, when an HTTP applet ends (in HTX and legacy HTTP), we must flush
remaining outgoing data on the request. But only outgoing data at time the
applet is called are consumed. If a request with a huge body is sent, an error
is triggerred because a SHUTW is catched for an unfinisehd request.
Now, we consume request data until the end. In fact, we don't try to shutdown
the request's channel for write anymore.
This patch must be backported to 1.9 after some observation period. It should
probably be backported in prior versions too. But honnestly, with refactoring
on the connection layer and the stream interface in 1.9, it is probably safer
to not do so.
BUG/MINOR: stats: Fully consume large requests in the stats applet
In the stats applet (in HTX and legacy HTTP), after a response is fully sent to
a client, the request is consumed. It is done at the end, after all the response
was copied into the channel's buffer. But only outgoing data at time the applet
is called are consumed. Then the applet is closed. If a request with a huge body
is sent, an error is triggerred because a SHUTW is catched for an unfinisehd
request.
Now, we consume request data until the end. In fact, we don't try to shutdown
the request's channel for write anymore.
This patch must be backported to 1.9 after some observation period. It should
probably be backported in prior versions too. But honnestly, with refactoring
on the connection layer and the stream interface in 1.9, it is probably safer
to not do so.
BUG/MINOR: cache: Fully consume large requests in the cache applet
In the cache applet (in HTX and legacy HTTP), when an cached object is sent to a
client, the request must be consumed. It is done at the end, after all the
response was copied into the channel's buffer. But only outgoing data at time
the applet is called are consumed. Then the applet is closed. If a request with
a huge body is sent, an error is triggerred because a SHUTW is catched on an
unfinished request.
Now, we consume request data as soon as possible and we do it until the end. In
fact, we don't try to shutdown the request's channel for write anymore.
This patch must be backported to 1.9 after some observation period.
MEDIUM: proto_htx: Switch to infinite forwarding if there is no data filter
Because in HTX the parsing is done by the multiplexers, there is no reason to
limit the amount of data fast-forwarded. Of course, it is only true when there
is no data filter registered on the corresponding channel. So now, we enable the
infinite forwarding when possible. However, the HTTP message state remains
HTTP_MSG_DATA. Then, when infinite forwarding is enabled, if the flag CF_SHUTR
is set, the state is switched to HTTP_MSG_DONE.
Willy Tarreau [Tue, 19 Mar 2019 07:08:10 +0000 (08:08 +0100)]
MINOR: init: report the list of optionally available services
It's never easy to guess what services are built in. We currently have
the prometheus exporter in contrib/ which is the only extension for now.
Let's enumerate all available ones just like we do for filterr and pollers.
Willy Tarreau [Mon, 18 Mar 2019 15:31:18 +0000 (16:31 +0100)]
BUILD: tools: fix a build warning on some 32-bit archs
Some recent versions of gcc apparently can detect that x >> 32 will not
work on a 32-bit architecture, but are failing to see that the code will
not be built since it's enclosed in "if (sizeof(LONG) > 4)" or equivalent.
Just shift right twice by 16 bits in this case, the compiler correctly
replaces it by a single 32-bit shift.
MINOR: muxes: Report the Last read with a dedicated flag
For conveniance, in HTTP muxes (h1 and h2), the end of the stream and the end of
the message are reported the same way to the stream, by setting the flag
CS_FL_EOS. In the stream-interface, when CS_FL_EOS is detected, a shutdown for
read is reported on the channel side. This is historical. With the legacy HTTP
layer, because the parsing is done by the stream in HTTP analyzers, the EOS
really means a shutdown for read.
Most of time, for muxes h1 and h2, it works pretty well, especially because the
keep-alive is handled by the muxes. The stream is only used for one
transaction. So mixing EOS and EOM is good enough. But not everytime. For now,
client aborts are only reported if it happens before the end of the request. It
is an error and it is properly handled. But because the EOS was already
reported, client aborts after the end of the request are silently
ignored. Eventually an error can be reported when the response is sent to the
client, if the sending fails. Otherwise, if the server does not reply fast
enough, an error is reported when the server timeout is reached. It is the
expected behaviour, excpect when the option abortonclose is set. In this case,
we must report an error when the client aborts. But as said before, this event
can be ignored. So to be short, for now, the abortonclose is broken.
In fact, it is a design problem and we have to rethink all channel's flags and
probably the conn-stream ones too. It is important to split EOS and EOM to not
loose information anymore. But it is not a small job and the refactoring will be
far from straightforward.
So for now, temporary flags are introduced. When the last read is received, the
flag CS_FL_READ_NULL is set on the conn-stream. This way, we can set the flag
SI_FL_READ_NULL on the stream interface. Both flags are persistant. And to be
sure to wake the stream, the event CF_READ_NULL is reported. So the stream will
always have the chance to handle the last read.
This patch must be backported to 1.9 because it will be used by another patch to
fix the option abortonclose.
MINOR: mux-h2: Set REFUSED_STREAM error to reset a stream if no data was never sent
According to the H2 spec (see #8.1.4), setting the REFUSED_STREAM error code
is a way to indicate that the stream is being closed prior to any processing
having occurred, such as when a server-side H1 keepalive connection is closed
without sending anything (which differs from the regular error case since
haproxy doesn't even generate an error message). Any request that was sent on
the reset stream can be safely retried. So, when a stream is closed, if no
data was ever sent back (ie. the flag H2_SF_HEADERS_SENT is not set), we can
set the REFUSED_STREAM error code on the RST_STREAM frame.
BUG/MEDIUM: mux-h2: Always wakeup streams with no id to avoid frozen streams
This only happens for server streams because their id is assigned when the first
message is sent. If these streams are not woken up, some events can be lost
leading to frozen streams. For instance, it happens when a server closes its
connection before sending its preface.
Willy Tarreau [Mon, 18 Mar 2019 10:02:57 +0000 (11:02 +0100)]
BUG/MINOR: http/counters: fix missing increment of fe->srv_aborts
When a server aborts a transfer, we used to increment the backend's
counter but not the frontend's during the forwarding phase. This fixes
it. It might be backported to all supported versions (possibly removing
the htx part) though it is of very low importance.
BUG/MAJOR: stats: Fix how huge POST data are read from the channel
When the body length is greater than a chunk size (so if length of POST data
exceeds the buffer size), the requests is rejected with the status code
STAT_STATUS_EXCD. Otherwise the stats applet will wait to have all the data to
copy and parse them. But there is a problem when the total request size
(including the headers) is just lower than the buffer size but greater the
buffer size less the reserve. In such case, the body length is considered as
enough small to be processed but not entierly received. So the stats applet
waits for more data. But because outgoing data are still there, the channel's
buffer is considered as full and nothing more can be read, leading to a freeze
of the session.
Note this bug is pretty easy to reproduce with the legacy HTTP. It is harder
with the HTX but still possible. To fix the bug, in the stats applet, when the
request is not fully received, we check if at least the reserve remains
available the channel's buffer.
This patch must be backported as far as 1.5. But because the HTX does not exist
in 1.8 and lower, it will have to be adapted for these versions.
BUG/MAJOR: spoe: Fix initialization of thread-dependent fields
A bug was introduced in the commit b0769b ("BUG/MEDIUM: spoe: initialization
depending on nbthread must be done last"). The code depending on global.nbthread
was moved from cfg_parse_spoe_agent() to spoe_check() but the pointer on the
agent configuration was not updated to use the filter's one. The variable
curagent is a global variable only valid during the configuration parsing. In
spoe_check(), conf->agent must be used instead.
Willy Tarreau [Fri, 15 Mar 2019 16:29:53 +0000 (17:29 +0100)]
BUILD: Makefile: resolve LEVEL before calling run-regtests
Calling "make reg-tests V=1" shows --LEVEL "$LEVEL" which is not quite
useful. Let's use "$(LEVEL)" instead of "$$LEVEL" so that make resolves
the variable before launching the command. This way the reported command
is usable from the shell.
Willy Tarreau [Fri, 15 Mar 2019 16:28:36 +0000 (17:28 +0100)]
BUILD: Makefile: allow the reg-tests target to be verbose
When debugging reg-tests, it's quite annoying not to be able to figure
the syntax to call the scripts. Let's replace the '@' with '$(Q)' as for
other commands so that launching them with "V=1" is enough to reveal the
command line.
Willy Tarreau [Fri, 15 Mar 2019 16:16:34 +0000 (17:16 +0100)]
BUILD: listener: shut up a build warning when threads are disabled
We get this with __decl_hathreads due to the lone semi-colon, let's move
it at the end of the innermost declaration :
src/listener.c: In function 'listener_accept':
src/listener.c:601:2: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement]
It's a temporary revert. This commit suggested to update to vtest
commit 4e43cc1 to fix handling of HEAD requests, but the compression
was broken two commits before, leaving us with no single version of
vtest being able to run all tests anymore.
Let's temporary disable HEAD again in the tests so that we can use
any version up to and including a2e82a8 for the time it takes vtest
to fix the compression.
BUG/MINOR: stats: Be more strict on what is a valid request to the stats applet
First of all, only GET, HEAD and POST methods are now allowed. Others will be
rejected with the status code STAT_STATUS_IVAL (invalid request). Then, for the
legacy HTTP, only POST requests with a content-length are allowed. Now, chunked
encoded requests are also considered as invalid because the chunk formatting
will interfere with the parsing of POST parameters. In HTX, It is not a problem
because data are unchunked.
This patch must be backported to 1.9. For prior versions too, but HTX part must
be removed. The patch introducing the status code STAT_STATUS_IVAL must also be
backported.
MINOR: stats: Move stuff about the stats status codes in stats files
The status codes definition (STAT_STATUS_*) and their string representation
stat_status_codes) have been moved in stats files. There is no reason to keep
them in proto_http files.
BUG/MINOR: lua/htx: Don't forget to call htx_to_buf() when appropriate
When htx_from_buf() is used to get an HTX message from a buffer, htx_to_buf()
must always be called when finish. Some calls to htx_to_buf() were missing.
BUG/MINOR: stats/htx: Call channel_add_input() when response headers are sent
This function will only increment the total amount of bytes read by a channel
because at this stage there is no fast forwarding. So the bug is pretty limited.
BUG/MINOR: mux-h1: Don't report an error on EOS if no message was received
An error is reported if the EOS is detected before the end of the message. But
we must be carefull to not report an error if there is no message at all.
Olivier Houchard [Thu, 14 Mar 2019 23:23:10 +0000 (00:23 +0100)]
BUG/MEDIUM: tasks: Make sure we wake sleeping threads if needed.
When waking a task on a remote thread, we currently check 1) if this
thread was sleeping, and 2) if it was already marked as active before
writing to its pipe. Unfortunately this doesn't always work as desired
because only one thread from the mask is woken up, while the
active_tasks_mask indicates all eligible threads for this task. As a
result, if one multi-thread task (e.g. a health check) wakes up to run
on any thread, then an accept() dispatches an incoming connection on
thread 2, this thread will already have its bit set in active_tasks_mask
because of the previous wakeup and will not be woken up.
This is easily noticeable on 2.0-dev by injecting on a multi-threaded
listener with a single connection at a time while health checks are
running quickly in the background : the injection runs slowly with
random response times (the poll timeouts). In 1.9 it affects the
dequeing of server connections, which occasionally experience pauses
if multiple threads share the same queue.
The correct solution consists in adjusting the sleeping_thread_mask
when waking another thread up. This mask reflects threads that are
sleeping, hence that need to be signaled to wake up. Threads with a
bit in active_tasks_mask already don't have their sleeping_thread_mask
bit set before polling so the principle remains consistent. And by
doing so we can remove the old_active_mask field.
Willy Tarreau [Thu, 14 Mar 2019 18:10:55 +0000 (19:10 +0100)]
BUG/MEDIUM: threads/fd: do not forget to take into account epoll_fd/pipes
Each thread uses one epoll_fd or kqueue_fd, and a pipe (thus two FDs).
These ones have to be accounted for in the maxsock calculation, otherwise
we can reach maxsock before maxconn. This is difficult to observe but it
in fact happens when a server connects back to the frontend and has checks
enabled : the check uses its FD and serves to fill the loop. In this case
all FDs planed for the datapath are used for this.
Olivier Houchard [Thu, 14 Mar 2019 15:14:04 +0000 (16:14 +0100)]
BUG/MAJOR: tasks: Use the TASK_GLOBAL flag to know if we're in the global rq.
In task_unlink_rq, to decide if we should logk the global runqueue lock,
use the TASK_GLOBAL flag instead of relying on t->thread_mask being tid_bit,
as it could be so while still being in the global runqueue if another thread
woke that task for us.
Willy Tarreau [Wed, 13 Mar 2019 14:03:53 +0000 (15:03 +0100)]
BUG/MEDIUM: listener: make sure we don't pick stopped threads
Dragan Dosen reported that after the multi-queue changes, appending
"process 1/even" on a bind line can make the process immediately crash
when delivering a first connection. This is due to the fact that I
believed that thread_mask(mask) applied the all_threads_mask value,
but it doesn't. And in case of even/odd the bits cover more than the
available threads, resulting in too high a thread number being selected
and a non-existing task to be woken up.
Willy Tarreau [Wed, 13 Mar 2019 13:03:28 +0000 (14:03 +0100)]
BUG/MEDIUM: list: fix incorrect pointer unlocking in LIST_DEL_LOCKED()
Injecting on a saturated listener started to exhibit some deadlocks
again between LIST_POP_LOCKED() and LIST_DEL_LOCKED(). Olivier found
it was due to a leftover from a previous debugging session. This patch
fixes it.
This will have to be backported if the other LIST_*_LOCKED() patches
are backported.
Willy Tarreau [Wed, 13 Mar 2019 09:10:49 +0000 (10:10 +0100)]
MINOR: config: continue to rely on DEFAULT_MAXCONN to set the minimum maxconn
Some packages used to rely on DEFAULT_MAXCONN to set the default global
maxconn value to use regardless of the initial ulimit. The recent changes
made the lowest bound set to 100 so that it is compatible with almost any
environment. Now that DEFAULT_MAXCONN is not needed for anything else, we
can use it for the lowest bound set when maxconn is not configured. This
way it retains its original purpose of setting the default maxconn value
eventhough most of the time the effective value will be higher thanks to
the automatic computation based on "ulimit -n".
Willy Tarreau [Wed, 13 Mar 2019 09:03:07 +0000 (10:03 +0100)]
MINOR: config: remove obsolete use of DEFAULT_MAXCONN at various places
This entry was still set to 2000 but never used anymore. The only places
where it appeared was as an alias to SYSTEM_MAXCONN which forces it, so
let's turn these ones to SYSTEM_MAXCONN and remove the default value for
DEFAULT_MAXCONN. SYSTEM_MAXCONN still defines the upper bound however.
MINOR: threads: Add macros to do atomic operation with no memory barrier.
Add variants of the HA_ATOMIC* macros, prefixed with a _, that do the
atomic operation with no barrier generated by the compiler. It is expected
the developer adds barriers manually if needed.
MEDIUM: threads: Use __ATOMIC_SEQ_CST when using the newer atomic API.
When using the new __atomic* API, ask the compiler to generate barriers.
A variant of those functions that don't generate barriers will be added later.
Before that, using HA_ATOMIC* would not generate any barrier, and some parts
of the code should be reviewed and missing barriers should be added.
This should probably be backported to 1.8 and 1.9.
Implement __ha_barrier functions to be used when trying to protect data
modified by atomic operations (except when using HA_ATOMIC_STORE).
On intel, atomic operations either use the LOCK prefix and xchg, and both
atc as full barrier, so there's no need to add an extra barrier.
Willy Tarreau [Thu, 7 Mar 2019 17:44:12 +0000 (18:44 +0100)]
OPTIM: task: limit the impact of memory barriers in taks_remove_from_task_list()
In this function we end up with successive locked operations then a
store barrier, and in addition the compiler has to emit less efficient
code due to a longer jump. There's no need for absolutely updating the
tasks_run_queue counter before clearing the task's leaf pointer, so
let's swap the two operations and benefit from a single barrier as much
as possible. This code is on the hot path and shows about half a percent
of improvement with 8 threads.