MEDIUM: proto_htx: Switch to infinite forwarding if there is no data filter
Because in HTX the parsing is done by the multiplexers, there is no reason to
limit the amount of data fast-forwarded. Of course, it is only true when there
is no data filter registered on the corresponding channel. So now, we enable the
infinite forwarding when possible. However, the HTTP message state remains
HTTP_MSG_DATA. Then, when infinite forwarding is enabled, if the flag CF_SHUTR
is set, the state is switched to HTTP_MSG_DONE.
Willy Tarreau [Tue, 19 Mar 2019 07:08:10 +0000 (08:08 +0100)]
MINOR: init: report the list of optionally available services
It's never easy to guess what services are built in. We currently have
the prometheus exporter in contrib/ which is the only extension for now.
Let's enumerate all available ones just like we do for filters and pollers.
Willy Tarreau [Mon, 18 Mar 2019 15:31:18 +0000 (16:31 +0100)]
BUILD: tools: fix a build warning on some 32-bit archs
Some recent versions of gcc apparently can detect that x >> 32 will not
work on a 32-bit architecture, but fail to see that the code will
not be built since it's enclosed in "if (sizeof(LONG) > 4)" or equivalent.
Just shift right twice by 16 bits in this case; the compiler correctly
replaces it with a single 32-bit shift.
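As an illustration only (the function and variable below are made up, not the
actual haproxy code), the workaround boils down to this:

    #include <stdint.h>

    /* Return the upper half of a possibly 64-bit value without writing a
     * literal ">> 32", which some gcc versions warn about on 32-bit targets
     * even though the branch is dead there.
     */
    static inline unsigned long upper_half(unsigned long x)
    {
        if (sizeof(x) > 4) {
            /* two 16-bit shifts; on 64-bit targets the compiler folds
             * them back into a single 32-bit shift
             */
            return (x >> 16) >> 16;
        }
        return 0;
    }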
MINOR: muxes: Report the Last read with a dedicated flag
For convenience, in the HTTP muxes (h1 and h2), the end of the stream and the end of
the message are reported the same way to the stream, by setting the flag
CS_FL_EOS. In the stream-interface, when CS_FL_EOS is detected, a shutdown for
read is reported on the channel side. This is historical. With the legacy HTTP
layer, because the parsing is done by the stream in the HTTP analyzers, the EOS
really means a shutdown for read.
Most of the time, for the h1 and h2 muxes, it works pretty well, especially because the
keep-alive is handled by the muxes. The stream is only used for one
transaction. So mixing EOS and EOM is good enough. But not always. For now,
client aborts are only reported if they happen before the end of the request. That
is an error and it is properly handled. But because the EOS was already
reported, client aborts after the end of the request are silently
ignored. Eventually an error can be reported when the response is sent to the
client, if the sending fails. Otherwise, if the server does not reply fast
enough, an error is reported when the server timeout is reached. This is the
expected behaviour, except when the option abortonclose is set. In this case,
we must report an error when the client aborts. But as said before, this event
can be ignored. So, to be short, for now, abortonclose is broken.
In fact, it is a design problem and we have to rethink all the channel's flags and
probably the conn-stream ones too. It is important to split EOS and EOM so as not to
lose information anymore. But it is not a small job and the refactoring will be
far from straightforward.
So for now, temporary flags are introduced. When the last read is received, the
flag CS_FL_READ_NULL is set on the conn-stream. This way, we can set the flag
SI_FL_READ_NULL on the stream interface. Both flags are persistent. And to be
sure to wake the stream, the event CF_READ_NULL is reported. So the stream will
always have the chance to handle the last read.
This patch must be backported to 1.9 because it will be used by another patch to
fix the option abortonclose.
MINOR: mux-h2: Set REFUSED_STREAM error to reset a stream if no data was ever sent
According to the H2 spec (see #8.1.4), setting the REFUSED_STREAM error code
is a way to indicate that the stream is being closed prior to any processing
having occurred, such as when a server-side H1 keepalive connection is closed
without sending anything (which differs from the regular error case since
haproxy doesn't even generate an error message). Any request that was sent on
the reset stream can be safely retried. So, when a stream is closed, if no
data was ever sent back (ie. the flag H2_SF_HEADERS_SENT is not set), we can
set the REFUSED_STREAM error code on the RST_STREAM frame.
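For illustration, a reduced sketch of that decision; the H2_SF_HEADERS_SENT name
comes from the text above, while the structure and the flag value are made up:

    #include <stdint.h>

    #define H2_SF_HEADERS_SENT  0x00000010   /* illustrative value */

    /* subset of the RFC 7540 error codes */
    enum h2_err_sketch {
        H2_ERR_REFUSED_STREAM = 0x7,
        H2_ERR_CANCEL         = 0x8,
    };

    struct h2s_sketch {
        uint32_t flags;
    };

    /* If nothing was ever sent back on this stream, REFUSED_STREAM tells the
     * client the request was not processed at all and may safely be retried;
     * otherwise fall back to a regular CANCEL.
     */
    static enum h2_err_sketch h2s_rst_error(const struct h2s_sketch *h2s)
    {
        if (!(h2s->flags & H2_SF_HEADERS_SENT))
            return H2_ERR_REFUSED_STREAM;
        return H2_ERR_CANCEL;
    }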
BUG/MEDIUM: mux-h2: Always wakeup streams with no id to avoid frozen streams
This only happens for server streams because their id is assigned when the first
message is sent. If these streams are not woken up, some events can be lost
leading to frozen streams. For instance, it happens when a server closes its
connection before sending its preface.
Willy Tarreau [Mon, 18 Mar 2019 10:02:57 +0000 (11:02 +0100)]
BUG/MINOR: http/counters: fix missing increment of fe->srv_aborts
When a server aborts a transfer, we used to increment the backend's
counter but not the frontend's during the forwarding phase. This fixes
it. It might be backported to all supported versions (possibly removing
the htx part) though it is of very low importance.
BUG/MAJOR: stats: Fix how huge POST data are read from the channel
When the body length is greater than a chunk size (so if the length of the POST data
exceeds the buffer size), the request is rejected with the status code
STAT_STATUS_EXCD. Otherwise the stats applet waits to have all the data to
copy and parse them. But there is a problem when the total request size
(including the headers) is just lower than the buffer size but greater than the
buffer size minus the reserve. In such a case, the body length is considered
small enough to be processed but it is not entirely received. So the stats applet
waits for more data. But because outgoing data are still there, the channel's
buffer is considered as full and nothing more can be read, leading to a freeze
of the session.
Note this bug is pretty easy to reproduce with the legacy HTTP. It is harder
with the HTX but still possible. To fix the bug, in the stats applet, when the
request is not fully received, we check that at least the reserve remains
available in the channel's buffer.
This patch must be backported as far as 1.5. But because the HTX does not exist
in 1.8 and lower, it will have to be adapted for these versions.
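The heart of the fix can be summarized by this simplified check (the helper
name and its parameters are illustrative, not the actual applet code):

    #include <stdbool.h>
    #include <stddef.h>

    /* Returns true if the applet must stop waiting for the rest of the POST
     * body: when less than the reserve remains free in the channel's buffer,
     * the missing bytes can never be received and waiting would freeze the
     * session, so the request is rejected with STAT_STATUS_EXCD instead.
     */
    static bool stats_post_would_freeze(size_t buf_size, size_t buf_data,
                                        size_t reserve)
    {
        return (buf_size - buf_data) < reserve;
    }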
BUG/MAJOR: spoe: Fix initialization of thread-dependent fields
A bug was introduced in the commit b0769b ("BUG/MEDIUM: spoe: initialization
depending on nbthread must be done last"). The code depending on global.nbthread
was moved from cfg_parse_spoe_agent() to spoe_check() but the pointer on the
agent configuration was not updated to use the filter's one. The variable
curagent is a global variable only valid during the configuration parsing. In
spoe_check(), conf->agent must be used instead.
Willy Tarreau [Fri, 15 Mar 2019 16:29:53 +0000 (17:29 +0100)]
BUILD: Makefile: resolve LEVEL before calling run-regtests
Calling "make reg-tests V=1" shows --LEVEL "$LEVEL" which is not quite
useful. Let's use "$(LEVEL)" instead of "$$LEVEL" so that make resolves
the variable before launching the command. This way the reported command
is usable from the shell.
Willy Tarreau [Fri, 15 Mar 2019 16:28:36 +0000 (17:28 +0100)]
BUILD: Makefile: allow the reg-tests target to be verbose
When debugging reg-tests, it's quite annoying not to be able to figure
the syntax to call the scripts. Let's replace the '@' with '$(Q)' as for
other commands so that launching them with "V=1" is enough to reveal the
command line.
Willy Tarreau [Fri, 15 Mar 2019 16:16:34 +0000 (17:16 +0100)]
BUILD: listener: shut up a build warning when threads are disabled
We get this with __decl_hathreads due to the lone semi-colon; let's move
it to the end of the innermost declaration :
src/listener.c: In function 'listener_accept':
src/listener.c:601:2: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement]
It's a temporary revert. This commit suggested to update to vtest
commit 4e43cc1 to fix handling of HEAD requests, but the compression
was broken two commits before, leaving us with no single version of
vtest being able to run all tests anymore.
Let's temporarily disable HEAD again in the tests so that we can use
any version up to and including a2e82a8 for the time it takes vtest
to fix the compression.
BUG/MINOR: stats: Be more strict on what is a valid request to the stats applet
First of all, only GET, HEAD and POST methods are now allowed. Others will be
rejected with the status code STAT_STATUS_IVAL (invalid request). Then, for the
legacy HTTP, only POST requests with a content-length are allowed. Now, chunked
encoded requests are also considered as invalid because the chunk formatting
will interfere with the parsing of POST parameters. In HTX, it is not a problem
because the data are unchunked.
This patch must be backported to 1.9. For prior versions too, but HTX part must
be removed. The patch introducing the status code STAT_STATUS_IVAL must also be
backported.
MINOR: stats: Move stuff about the stats status codes in stats files
The status codes definition (STAT_STATUS_*) and their string representation
(stat_status_codes) have been moved to the stats files. There is no reason to keep
them in the proto_http files.
BUG/MINOR: lua/htx: Don't forget to call htx_to_buf() when appropriate
When htx_from_buf() is used to get an HTX message from a buffer, htx_to_buf()
must always be called when finished. Some calls to htx_to_buf() were missing.
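The rule being enforced is that each htx_from_buf() call must be paired with an
htx_to_buf() once the message has been modified, so that the buffer metadata
stays in sync with the HTX content. A sketch of the expected pattern (the
header paths and the htx_add_data() usage are assumed from haproxy's internal
API, not verified against this exact version):

    #include <common/buffer.h>   /* assumed internal headers */
    #include <common/htx.h>

    static void example_append_data(struct buffer *buf, const struct ist data)
    {
        /* map the buffer as an HTX message */
        struct htx *htx = htx_from_buf(buf);

        htx_add_data(htx, data);

        /* always commit the HTX message back to the buffer, even on error
         * paths, otherwise the buffer's length no longer matches the HTX
         * content
         */
        htx_to_buf(htx, buf);
    }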
BUG/MINOR: stats/htx: Call channel_add_input() when response headers are sent
This function will only increment the total amount of bytes read by a channel
because at this stage there is no fast forwarding. So the bug is pretty limited.
BUG/MINOR: mux-h1: Don't report an error on EOS if no message was received
An error is reported if the EOS is detected before the end of the message. But
we must be careful not to report an error if there is no message at all.
Olivier Houchard [Thu, 14 Mar 2019 23:23:10 +0000 (00:23 +0100)]
BUG/MEDIUM: tasks: Make sure we wake sleeping threads if needed.
When waking a task on a remote thread, we currently check 1) if this
thread was sleeping, and 2) if it was already marked as active before
writing to its pipe. Unfortunately this doesn't always work as desired
because only one thread from the mask is woken up, while the
active_tasks_mask indicates all eligible threads for this task. As a
result, if one multi-thread task (e.g. a health check) wakes up to run
on any thread, then an accept() dispatches an incoming connection on
thread 2, this thread will already have its bit set in active_tasks_mask
because of the previous wakeup and will not be woken up.
This is easily noticeable on 2.0-dev by injecting on a multi-threaded
listener with a single connection at a time while health checks are
running quickly in the background : the injection runs slowly with
random response times (the poll timeouts). In 1.9 it affects the
dequeuing of server connections, which occasionally experience pauses
if multiple threads share the same queue.
The correct solution consists in adjusting the sleeping_thread_mask
when waking another thread up. This mask reflects threads that are
sleeping, hence that need to be signaled to wake up. Threads with a
bit in active_tasks_mask already don't have their sleeping_thread_mask
bit set before polling so the principle remains consistent. And by
doing so we can remove the old_active_mask field.
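A condensed sketch of the idea using C11 atomics (the real code uses haproxy's
own atomic macros and wake-up pipes; the names are taken from the text above):

    #include <stdatomic.h>

    static _Atomic unsigned long sleeping_thread_mask;

    /* Wake the threads in 'mask': atomically clear their bits from
     * sleeping_thread_mask and only signal those that were really asleep,
     * so a thread that already has its active bit set still gets signaled
     * exactly when it needs to be.
     */
    void wake_threads_sketch(unsigned long mask, void (*signal_thread)(int tid))
    {
        unsigned long was_sleeping;

        was_sleeping = atomic_fetch_and(&sleeping_thread_mask, ~mask) & mask;

        while (was_sleeping) {
            int tid = __builtin_ctzl(was_sleeping);

            signal_thread(tid);                  /* e.g. write to its pipe */
            was_sleeping &= was_sleeping - 1;    /* clear lowest set bit */
        }
    }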
Willy Tarreau [Thu, 14 Mar 2019 18:10:55 +0000 (19:10 +0100)]
BUG/MEDIUM: threads/fd: do not forget to take into account epoll_fd/pipes
Each thread uses one epoll_fd or kqueue_fd, and a pipe (thus two FDs).
These ones have to be accounted for in the maxsock calculation, otherwise
we can reach maxsock before maxconn. This is difficult to observe but it
in fact happens when a server connects back to the frontend and has checks
enabled : the check uses its FD and serves to fill the loop. In this case
all the FDs planned for the datapath are used for this.
Olivier Houchard [Thu, 14 Mar 2019 15:14:04 +0000 (16:14 +0100)]
BUG/MAJOR: tasks: Use the TASK_GLOBAL flag to know if we're in the global rq.
In task_unlink_rq, to decide if we should take the global runqueue lock,
use the TASK_GLOBAL flag instead of relying on t->thread_mask being tid_bit,
as it could be so while still being in the global runqueue if another thread
woke that task for us.
Willy Tarreau [Wed, 13 Mar 2019 14:03:53 +0000 (15:03 +0100)]
BUG/MEDIUM: listener: make sure we don't pick stopped threads
Dragan Dosen reported that after the multi-queue changes, appending
"process 1/even" on a bind line can make the process immediately crash
when delivering a first connection. This is due to the fact that I
believed that thread_mask(mask) applied the all_threads_mask value,
but it doesn't. And in case of even/odd the bits cover more than the
available threads, resulting in too high a thread number being selected
and a non-existing task to be woken up.
Willy Tarreau [Wed, 13 Mar 2019 13:03:28 +0000 (14:03 +0100)]
BUG/MEDIUM: list: fix incorrect pointer unlocking in LIST_DEL_LOCKED()
Injecting on a saturated listener started to exhibit some deadlocks
again between LIST_POP_LOCKED() and LIST_DEL_LOCKED(). Olivier found
it was due to a leftover from a previous debugging session. This patch
fixes it.
This will have to be backported if the other LIST_*_LOCKED() patches
are backported.
Willy Tarreau [Wed, 13 Mar 2019 09:10:49 +0000 (10:10 +0100)]
MINOR: config: continue to rely on DEFAULT_MAXCONN to set the minimum maxconn
Some packages used to rely on DEFAULT_MAXCONN to set the default global
maxconn value to use regardless of the initial ulimit. The recent changes
made the lowest bound set to 100 so that it is compatible with almost any
environment. Now that DEFAULT_MAXCONN is not needed for anything else, we
can use it for the lowest bound set when maxconn is not configured. This
way it retains its original purpose of setting the default maxconn value
even though most of the time the effective value will be higher thanks to
the automatic computation based on "ulimit -n".
Willy Tarreau [Wed, 13 Mar 2019 09:03:07 +0000 (10:03 +0100)]
MINOR: config: remove obsolete use of DEFAULT_MAXCONN at various places
This entry was still set to 2000 but never used anymore. The only places
where it appeared was as an alias to SYSTEM_MAXCONN which forces it, so
let's turn these ones to SYSTEM_MAXCONN and remove the default value for
DEFAULT_MAXCONN. SYSTEM_MAXCONN still defines the upper bound however.
MINOR: threads: Add macros to do atomic operation with no memory barrier.
Add variants of the HA_ATOMIC* macros, prefixed with a _, that do the
atomic operation with no barrier generated by the compiler. It is expected
the developer adds barriers manually if needed.
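For illustration, the difference between the two families of macros maps onto
the gcc __atomic builtins roughly like this (the exact macro bodies in haproxy
may differ):

    /* ordered variant: the operation also acts as a full barrier */
    #define EX_ATOMIC_ADD(val, i)   __atomic_add_fetch((val), (i), __ATOMIC_SEQ_CST)

    /* relaxed variant: only the operation itself is atomic, no ordering is
     * implied, so the caller must insert barriers explicitly where needed
     */
    #define _EX_ATOMIC_ADD(val, i)  __atomic_add_fetch((val), (i), __ATOMIC_RELAXED)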
MEDIUM: threads: Use __ATOMIC_SEQ_CST when using the newer atomic API.
When using the new __atomic* API, ask the compiler to generate barriers.
A variant of those functions that don't generate barriers will be added later.
Before that, using HA_ATOMIC* would not generate any barrier, and some parts
of the code should be reviewed and missing barriers should be added.
This should probably be backported to 1.8 and 1.9.
Implement __ha_barrier functions to be used when trying to protect data
modified by atomic operations (except when using HA_ATOMIC_STORE).
On Intel, atomic operations either use the LOCK prefix or xchg, and both
act as a full barrier, so there's no need to add an extra barrier.
Willy Tarreau [Thu, 7 Mar 2019 17:44:12 +0000 (18:44 +0100)]
OPTIM: task: limit the impact of memory barriers in task_remove_from_task_list()
In this function we end up with successive locked operations then a
store barrier, and in addition the compiler has to emit less efficient
code due to a longer jump. There's no need for absolutely updating the
tasks_run_queue counter before clearing the task's leaf pointer, so
let's swap the two operations and benefit from a single barrier as much
as possible. This code is on the hot path and shows about half a percent
of improvement with 8 threads.
Dragan Dosen [Thu, 7 Mar 2019 14:24:23 +0000 (15:24 +0100)]
BUG/MEDIUM: 51d: fix possible segfault on deinit_51degrees()
When haproxy is built with 51Degrees support, but not configured to use
51Degrees database, a segfault can occur when deinit_51degrees()
function is called, eg. during soft-stop on SIGUSR1 signal.
Only builds that use Pattern algorithm are affected.
This fix must be backported to all stable branches where 51Degrees
support is available. Additional adjustments are required for some
branches due to API and naming changes.
Before c8d5b95 the "maxconn" of the backend of dynamic "use_backend"
rules was not modified (adjusting it would not make sense, so leaving it
alone was correct). When implementing proxy_adjust_all_maxconn(), commit
c8d5b95 missed this case.
With this patch we adjust the "maxconn" of the backend of such rules only if
they are not dynamic.
Without this patch reg-tests/http-rules/h00003.vtc could make haproxy crash.
Willy Tarreau [Wed, 6 Mar 2019 14:26:33 +0000 (15:26 +0100)]
MINOR: listener: move thr_idx from the bind_conf to the listener
Tests show that it's slightly faster to have this field in the listener.
The cache walk patterns are under heavy stress and having only this field
written to in the bind_conf was wasting a cache line that was heavily
read. Let's move this close to the other entries already written to in
the listener. Warning, the position does have an impact on peak performance.
Willy Tarreau [Tue, 5 Mar 2019 18:25:26 +0000 (19:25 +0100)]
CLEANUP: listener: remove old thread bit mapping
Now that the P2C algorithm for the accept queue is removed, we don't
need to map a number to a thread bit anymore, so let's remove all
these fields which are taking quite some space for no reason.
Willy Tarreau [Tue, 5 Mar 2019 07:46:28 +0000 (08:46 +0100)]
MEDIUM: listener: change the LB algorithm again to use two round robins instead
At this point, the random used in the hybrid queue distribution algorithm
provides little benefit over a periodic scan, can even have a slightly
worse worst case, and it requires to establish a mapping between a
discrete number and a thread ID among a mask.
This patch introduces a different approach using two indexes. One scans
the thread mask from the left, the other one from the right. The related
threads' loads are compared, and the least loaded one receives the new
connection. Then one index is adjusted depending on the load resulting
from this election, so that we start the next election from two known
lightly loaded threads.
This approach provides an extra 1% peak performance boost over the previous
one, which likely corresponds to the removal of the extra work on the
random and the previously required two mappings of index to thread.
A test was attempted with two indexes going in the same direction but it
was much less interesting because the same thread pairs were compared most
of the time with the load climbing in a ladder-like model. With the reverse
directions this cannot happen.
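A very rough sketch of the two-index election (load accounting, masks and index
wrapping are heavily simplified here; this is not the listener code):

    /* Scan the allowed thread mask from the left with one index and from the
     * right with the other, compare the two candidates' loads, give the
     * connection to the lighter one, and advance the index that pointed to
     * the thread which just took the connection so that the next election
     * starts from two lightly loaded threads.
     */
    static int pick_thread_sketch(unsigned long mask, const unsigned int *load,
                                  unsigned int *idx_l, unsigned int *idx_r,
                                  unsigned int nbthread)
    {
        unsigned int l = *idx_l, r = *idx_r;

        while (!(mask & (1UL << (l % nbthread))))
            l++;
        while (!(mask & (1UL << (r % nbthread))))
            r--;

        if (load[l % nbthread] <= load[r % nbthread]) {
            *idx_l = l + 1;
            *idx_r = r;
            return l % nbthread;
        }
        *idx_l = l;
        *idx_r = r - 1;
        return r % nbthread;
    }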
Willy Tarreau [Tue, 5 Mar 2019 11:04:55 +0000 (12:04 +0100)]
MINOR: tools: implement my_flsl()
We already have my_ffsl() to find the lowest bit set in a word, and
this patch implements the search for the highest bit set in a word.
On x86 it uses the bsr instruction and on other architectures it
uses an efficient implementation.
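On architectures without a usable instruction, the fallback amounts to
something like the naive loop below (the real fallback is smarter than this;
it is shown only to illustrate the semantics of returning the 1-based position
of the highest bit set, or 0 for a zero word):

    static inline unsigned int my_flsl_sketch(unsigned long a)
    {
        unsigned int pos = 0;

        while (a) {
            a >>= 1;
            pos++;
        }
        return pos;
    }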
Willy Tarreau [Mon, 4 Mar 2019 18:57:34 +0000 (19:57 +0100)]
MINOR: listener: improve incoming traffic distribution
By picking two randoms following the P2C algorithm, we seldom observe
asymmetric loads on bursts of small session counts. This is typically
what makes h2load take a bit of time to complete the last 100% because
if a thread gets two connections while the other ones only have one,
it takes twice the time to complete its work.
This patch proposes a modification of the p2c algorithm which seems
more suitable to this case : it mixes a rotating index with a random.
This way, we're certain that all threads are consulted in turn and at
the same time we're not forced to use the ones we're giving a chance.
This significantly increases the traffic rate. Now h2load shows faster
completion and the average request rates on H2 and the TLS resume rate
increases by a bit more than 5% compared to pure p2c.
The index was placed into the struct bind_conf because it's faster
there and it's the best place to optimally distribute traffic among a
group of listeners. It's the only runtime-modified element there and
it will be quite cache-hot.
Willy Tarreau [Wed, 6 Mar 2019 18:34:25 +0000 (19:34 +0100)]
MINOR: task: use LIST_DEL_INIT() to remove a task from the queue
By using LIST_DEL_INIT() instead of LIST_DEL()+LIST_INIT() we manage
to bump the peak connection rate by no less than 3% on 8 threads.
The perf top profile shows much less contention in this area which
suffered from the second reload.
Willy Tarreau [Wed, 6 Mar 2019 18:32:11 +0000 (19:32 +0100)]
MINOR: lists: add a LIST_DEL_INIT() macro
It turns out that we call LIST_DEL+LIST_INIT very frequently and that
the compiler doesn't know what pointers get modified in the e->n->p
and e->p->n dance, so when LIST_INIT() is called, it reloads these
pointers, which is quite a bit of a mess in terms of performance.
This patch adds LIST_DEL_INIT() to perform the two operations at once
using local temporary variables so that the compiler knows these
pointers are left unaffected.
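A simplified sketch of the macro described above (the real one uses the
project's own list types and naming):

    /* detach an element and re-initialize it in one operation, going through
     * local temporaries so the compiler knows the writes to n->p and p->n
     * cannot alias the element's own pointers and does not reload them
     */
    #define LIST_DEL_INIT_SKETCH(el) ({                  \
            typeof(el) __e = (el);                       \
            typeof(__e->n) __n = __e->n;                 \
            typeof(__e->p) __p = __e->p;                 \
            __n->p = __p;                                \
            __p->n = __n;                                \
            __e->n = __e->p = __e;                       \
            __e;                                         \
    })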
REGTEST: Enable reg tests with HEAD HTTP method usage.
This patch enables the part of this reg test which could not work due to a vtest
(formerly varnishtest) bug.
NOTE: You must have a vtest version with 4e43cc1 commit for this bug fix to make this
script succeed (see https://github.com/vtest/VTest/commit/4e43cc1fec45213b64503812599847c02045c8fa
for more information).
MINOR: sample: Add a protocol buffers specific converter.
This patch adds the "protobuf" protocol buffers specific converter which
may be used in combination with "ungrpc" as a first converter to extract
a protocol buffers field value. It is simply implemented by reusing
protobuf_field_lookup(), the protocol buffers specific parser already
used by the "ungrpc" converter, which only parses a gRPC header in addition to
parsing protocol buffers messages.
Update the documentation for this new "protobuf" converter.
MINOR: sample: Extract some protocol buffers specific code.
We move the code responsible for parsing protocol buffers messages
inside gRPC messages from sample.c to include/proto/protocol_buffers.h
so that it can be reused to cascade the "ungrpc" converter.
Lukas Tribus [Tue, 5 Mar 2019 22:14:32 +0000 (23:14 +0100)]
BUG/MINOR: ssl: fix warning about ssl-min/max-ver support
In 84e417d8 ("MINOR: ssl: support Openssl 1.1.1 early callback for
switchctx") the code was extended to also support OpenSSL 1.1.1
(code already supported BoringSSL). A configuration check warning
was updated but with the wrong logic, the #ifdef needs a && instead
of an ||.
Willy Tarreau [Tue, 5 Mar 2019 17:14:03 +0000 (18:14 +0100)]
MINOR: config: relax the range checks on cpu-map
Emeric reports that when MAX_THREADS and/or MAX_PROCS are set to lower
values, referencing thread or process numbers higher than these limits
in cpu-map returns errors. This is annoying because these typically are
silent settings that are expected to be used only when set. Let's switch
back to LONGBITS for this limit.
Willy Tarreau [Tue, 5 Mar 2019 12:35:35 +0000 (13:35 +0100)]
CLEANUP: wurfl: remove dead, broken and unmaintained code
Since the "wurfl" device detection engine was merged slightly more than
two years ago (2016-11-04), it never received a single fix nor update.
For almost two years it didn't receive even the minimal review or changes
needed to be compatible with threads, and it's remained build-broken for
about the last 9 months, consecutive to the last buffer API changes,
without anyone ever noticing! When asked on the list, nobody confirmed
using it :
And obviously nobody even cared to verify that it did still build. So we
are left with this broken code with no user and no maintainer. It might
even suffer from remotely exploitable vulnerabilities without anyone
being able to check if it presents any risk. It's a pain to update each
time there is an API change because it doesn't build as it depends on
external libraries that are not publicly accessible, leading to careful
blind changes. It slows down the whole project. This situation is not
acceptable at all.
It's time to cure the problem where it is. This patch removes all this
dead, non-buildable, non-working code. If anyone ever decides to use it,
which I seriously doubt based on history, it could be reintegrated, but
this time the following guarantees will be required :
- someone has to step up as a maintainer and have his name listed in
the MAINTAINERS file (I should have been more careful last time).
This person will take the sole blame for all issues and will be
responsible for fixing the bugs and incompatibilities affecting
this code, and for making it evolve to follow regular internal API
updates.
- support building on a standard distro with automated tools (i.e. no
more "click on this site, register your e-mail and download an
archive then figure how to place this into your build system").
Dummy libs are OK though as long as they allow the mainline code to
build and start.
- multi-threaded support must be fixed. I mean seriously, not worked
around with a check saying "please disable threads, we've been busy
fishing for the last two years".
This may be backported to 1.9 given that the code has never worked there
either, thus at least we're certain nobody will miss it.
From now on, "ungrpc" may take a second optional argument to provide
the protocol buffers type used to encode the field value to be extracted.
When absent, the field value is extracted as a binary sample which may then be
followed by other converters like "hex" which takes a binary sample as input.
When this second argument is a type which does not match the one found by "ungrpc",
the field is considered as not found even if present.
With this patch we also remove the useless "varint" and "svarint" converters.
Update the documentation about the "ungrpc" converter.
Parsing protocol buffer fields always consists in skipping the field
if it is not found, or storing the field value if it is.
So, with this patch we factorize the code for the "ungrpc" converter a little bit.
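The primitive underneath all of this is the varint decoding; as a reminder of
how these fields are encoded (this sketch is not the haproxy parser itself):

    #include <stddef.h>
    #include <stdint.h>

    /* Decode a protocol buffers varint: 7 data bits per byte, MSB set on
     * every byte except the last. Returns the number of bytes consumed,
     * or 0 if the buffer ends before the varint does.
     */
    static size_t pbuf_varint_sketch(const unsigned char *buf, size_t len,
                                     uint64_t *value)
    {
        uint64_t v = 0;
        size_t i;

        for (i = 0; i < len && i < 10; i++) {
            v |= (uint64_t)(buf[i] & 0x7f) << (7 * i);
            if (!(buf[i] & 0x80)) {
                *value = v;
                return i + 1;
            }
        }
        return 0;
    }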
Willy Tarreau [Tue, 5 Mar 2019 09:47:37 +0000 (10:47 +0100)]
BUG/MEDIUM: h2/htx: verify that :path doesn't contain invalid chars
While the legacy code converts h2 to h1 and provides some control over
what is passed, in htx mode there is no such control and it is possible
to pass control chars and linear white spaces in the path, which are
possibly reencoded differently once passed to the H1 side.
HTX supports parse error reporting using a special flag. Let's check
the correctness of the :path pseudo header and report any anomaly in
the HTX flag.
Thanks to Jérôme Magnin for reporting this bug with a working reproducer.
This fix must be backported to 1.9 along with the two previous patches
("MINOR: htx: unconditionally handle parsing errors in requests or
responses" and "MINOR: mux-h2: always pass HTX_FL_PARSING_ERROR
between h2s and buf on RX").
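For reference, the class of characters being refused in :path boils down to a
check like the one below (simplified; the real code flags the HTX message with
HTX_FL_PARSING_ERROR rather than returning a boolean):

    #include <stdbool.h>
    #include <stddef.h>

    /* Reject a :path value containing control characters or linear white
     * space, which could be re-encoded differently once translated to HTTP/1.
     */
    static bool path_is_clean(const char *p, size_t len)
    {
        size_t i;

        for (i = 0; i < len; i++) {
            unsigned char c = (unsigned char)p[i];

            if (c < 0x21 || c == 0x7f)   /* CTLs, SP, HT, DEL */
                return false;
        }
        return true;
    }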
Willy Tarreau [Tue, 5 Mar 2019 09:51:11 +0000 (10:51 +0100)]
MINOR: mux-h2: always pass HTX_FL_PARSING_ERROR between h2s and buf on RX
In order to allow the H2 parser to report parsing errors, we must make
sure to always pass the HTX_FL_PARSING_ERROR flag from the h2s htx to
the conn_stream's htx.
Willy Tarreau [Tue, 5 Mar 2019 09:43:32 +0000 (10:43 +0100)]
MINOR: htx: unconditionally handle parsing errors in requests or responses
The htx request and response processing functions currently only check
for HTX_FL_PARSING_ERROR on incomplete messages because that's how mux_h1
delivers these. However with H2 we have to detect some parsing errors in
the format of certain pseudo-headers (e.g. :path), so we do have a complete
message but we want to report an error.
Let's move the parse error check earlier so that it always triggers when
the flag is present. It was also moved for htx_wait_for_request_body()
since we definitely want to be able to abort processing such an invalid
request even if it appears complete, but it was not changed in the forward
functions so as not to truncate contents before the position of the first
error.
Willy Tarreau [Mon, 4 Mar 2019 10:19:49 +0000 (11:19 +0100)]
BUG/MEDIUM: list: fix again LIST_ADDQ_LOCKED
Well, that's becoming embarrassing. Now this fixes commit 4ef6801c
("BUG/MEDIUM: list: correct fix for LIST_POP_LOCKED's removal of last
element") which itself tried to fix commit 285192564. This fix only
works under low contention and was tested with the listener's queue.
With the idle conns it's obvious that it's still wrong since adding
more than one element to the list leaves a LLIST_BUSY pointer into
the list's head. This was visible when accumulating idle connections
in a server's list.
This new version of the fix almost goes back to the original code,
except that since then we addressed issues with expectedly idempotent
operations that were not. Now the code has been verified on paper again
and has survived 300 million connections spread over 4 threads.
This will have to be backported if the commit above is backported.
MINOR: sample: Replace "req.ungrpc" smp fetch by a "ungrpc" converter.
This patch simply extracts the code of smp_fetch_req_ungrpc() for "req.ungrpc"
from http_fetch.c and moves it to sample.c with very few modifications.
Furthermore, smp_fetch_body_buf(), which was used to fetch the body contents, is no longer needed.
Willy Tarreau [Mon, 4 Mar 2019 07:03:25 +0000 (08:03 +0100)]
BUG/MAJOR: mux-h2: fix race condition between close on both ends
A crash in H2 was reported in issue #52. It turns out that there is a
small but existing race by which a conn_stream could detach itself
using h2_detach(), not being able to destroy the h2s due to pending
output data blocked by flow control, then upon next h2s activity
(transfer_data or trailers parsing), an ES flag may need to be turned
into a CS_FL_REOS bit, causing a dereference of a NULL stream. This is
a side effect of the fact that we still have a few places which
incorrectly depend on the CS flags, while these flags should only be
set by h2_rcv_buf() and h2_snd_buf().
All candidate locations along this path have been secured against this
risk, but the code should really evolve to stop depending on CS anymore.
This fix must be backported to 1.9 and possibly partially to 1.8.
Willy Tarreau [Fri, 1 Mar 2019 16:38:08 +0000 (17:38 +0100)]
REGTEST: fix a spurious "nbthread 4" in the connection test
Commit 26f6ae12c ("MAJOR: config: disable support for nbproc and nbthread
in parallel") revealed that there was accidently nbproc+nbthread in this
test while nbproc is the one expected. This likely is a leftover from a
previous attempt at reproducing the issue.
Willy Tarreau [Fri, 1 Mar 2019 14:43:14 +0000 (15:43 +0100)]
MEDIUM: init: make the global maxconn default to what rlim_fd_cur permits
The global maxconn value is often a pain to configure :
- in development the user never has the permissions to increase the
rlim_cur value too high and gets warnings all the time ;
- in some production environments, users may have limited actions on
it or may only be able to act on rlim_fd_cur using ulimit -n. This
is sometimes particularly true in containers or whatever environment
where the user has no privilege to upgrade the limits.
- keeping config homogenous between machines is even less easy.
We already had the ability to automatically compute maxconn from the
memory limits when they were set. This patch goes a bit further by also
computing the limit permitted by the configured limit on the number of
FDs. For this it simply reverses the rlim_fd_cur calculation to determine
maxconn based on the number of reserved sockets for listeners & checks,
the number of SSL engines and the number of pipes (absolute or relative).
This way it becomes possible to make maxconn always be the highest possible
value resulting in maxsock matching what was set using "ulimit -n", without
ever setting it. Note that we adjust to the soft limit, not the hard one,
since it's what is configured with ulimit -n. This allows users to also
limit to low values if needed.
Just like before, the calculated value is reported in verbose mode.
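In rough terms the computation just inverts the usual "maxsock from maxconn"
formula. A deliberately simplified sketch (the real accounting of SSL engine
FDs and relative pipe limits is more involved):

    /* Each connection typically needs two FDs (client side + server side)
     * and each splicing pipe two more; a fixed number of FDs is reserved
     * for listeners, checks, peers, stats, etc. Inverting that gives the
     * highest maxconn fitting under the "ulimit -n" soft limit.
     */
    static int compute_maxconn_sketch(int rlim_fd_cur, int reserved_fds,
                                      int maxpipes)
    {
        return (rlim_fd_cur - reserved_fds - 2 * maxpipes) / 2;
    }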
Willy Tarreau [Fri, 1 Mar 2019 08:39:42 +0000 (09:39 +0100)]
MINOR: init: move some maxsock updates earlier
We'll need to know the global maxsock before the maxconn calculation.
Actually only two components were calculated too late, the peers FD
and the stats FD. Let's move them a few lines upward.
Willy Tarreau [Fri, 1 Mar 2019 13:19:31 +0000 (14:19 +0100)]
MINOR: init: make the maxpipe computation more accurate
The default number of pipes is adjusted based on the sum of frontends
and backends maxconn/fullconn settings. Now that it is possible to have
a null maxconn on a frontend to indicate "unlimited" with commit c8d5b95e6 ("MEDIUM: config: don't enforce a low frontend maxconn value
anymore"), the sum of maxconn may remain low and limited to the only
frontends/backends where this limit is set.
This patch considers this new unlimited case when doing the check, and
automatically switches to the default value which is maxconn/4 in this
case. All the calculation was moved to a distinct function for ease of
use. This function also supports returning unlimited (-1) when the
value depends on global.maxconn and this latter is not yet set.
Willy Tarreau [Fri, 1 Mar 2019 09:21:55 +0000 (10:21 +0100)]
BUG/MINOR: mworker: be careful to restore the original rlim_fd_cur/max on reload
When the master re-execs itself on reload, it doesn't restore the initial
rlim_fd_cur/rlim_fd_max values, which have been modified by the ulimit-n
or global maxconn directives. This is a problem, because if these values
were set really low it could prevent the process from restarting, and if
they were set very high, this could have some implications on the restart
time, or later on the computed maxconn.
Let's simply reset these values to the ones we had at boot to maintain
the system in a consistent state.
A backport could be performed to 1.9 and maybe 1.8. This patch depends on
the two previous ones.