BUG/MINOR: contrib/prometheus-exporter: Don't use channel_htx_recv_max()
The function htx_free_data_space() must be used instead. Otherwise, if there are
some output data not already forwarded, the maximum amount of data that may be
inserted into the buffer may be greater than what we can really insert.
BUG/MEDIUM: checks: Make sure the tasklet won't run if the connection is closed.
wake_srv_chk() can be called from conn_fd_handler(), and may decide to
destroy the conn_stream and the connection, by calling cs_close(). If that
happens, we have to make sure the tasklet isn't scheduled to run, or it will
probably crash trying to access the connection or the conn_stream.
This fixes a crash that can be seen when using tcp checks.
This should be backported to 1.9 and 2.0.
For 1.9, the call should be instead :
task_remove_from_tasklet_list((struct task *)check->wait_list.task);
That function was renamed in 2.0.
BUG/MEDIUM: connections: Always call shutdown, with no linger.
Revert commit fe4abe62c7c5206dff1802f42d17014e198b9141.
The goal was to make sure for health-checks, we would not get sockets in
TIME_WAIT. To do so, we would not call shutdown() if linger_risk is set.
However that is wrong, and that means shutw would never be forwarded to
the server, and thus we could get connections that are never properly closed.
Instead, to fix the original problem as described here :
https://www.mail-archive.com/haproxy@formilux.org/msg34080.html
Just make sure the checks code calls cs_shutr() before calling cs_shutw().
If shutr has been called, conn_sock_shutw() will make no attempt to call
shutdown(), as it knows close() will be called.
We should really review and revamp the shutr/shutw code, as described in
github issue #142.
BUG/MINOR: mux-h1: Skip trailers for non-chunked outgoing messages
Unlike H1, H2 messages may contain trailers while the header "Content-Length"
is set. Indeed, because of the framed structure of HTTP/2, it is no longer
necessary to use the chunked transfer encoding. So Trailing HEADERS frames,
after all DATA frames, may be added on messages with an explicit content length.
But in H1, it is impossible to have trailers on non-chunked messages. So when
outgoing messages are formatted by the H1 multiplexer, if the message is not
chunked, all trailers must be dropped.
This patch must be backported to 2.0 and 1.9. However, the patch will have to be
adapted for 1.9.
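The rule above can be illustrated with a minimal sketch. It is not the H1 multiplexer's code: struct out_msg and format_msg_end() are hypothetical names, and the message is reduced to a "chunked" flag plus raw trailer lines.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified view of an outgoing H1 message. */
struct out_msg {
    bool chunked;          /* "Transfer-Encoding: chunked" is in use */
    const char *trailers;  /* raw trailer lines ending with CRLF, or NULL */
};

/* Format what follows the last body byte into dst.
 * Returns the number of bytes that snprintf() would write. */
int format_msg_end(const struct out_msg *m, char *dst, size_t size)
{
    if (!m->chunked) {
        /* Non-chunked H1 message: nothing may follow the body, so
         * trailers (e.g. received from an H2 peer) are dropped. */
        if (size)
            dst[0] = '\0';
        return 0;
    }
    /* Chunked message: last chunk, optional trailers, final CRLF. */
    return snprintf(dst, size, "0\r\n%s\r\n", m->trailers ? m->trailers : "");
}
```

With a non-chunked message the function emits nothing even when trailers are present, while a chunked one gets the last chunk followed by the trailers.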
BUG/MEDIUM: checks: unblock signals in external checks
As discussed in issue #140, processes are forked with signals blocked
resulting in haproxy's kill being ignored. This happens when the command
takes more time to complete than the configured check timeout or interval.
Just calling "sleep 30" every second makes the problem obvious.
The fix simply consists in unblocking the signals in the child after the
fork. It needs to be backported to all stable branches containing external
checks and where signals are blocked on startup. It's unclear when it
started, but the following config exhibits the issue :
    global
        external-check
    listen www
        bind :8001
        timeout client 5s
        timeout server 5s
        timeout connect 5s
        option external-check
        external-check command "$PWD/sleep10.sh"
        server local 127.0.0.1:80 check inter 200
    $ cat sleep10.sh
    #!/bin/sh
    exec /bin/sleep 10
The "sleep" processes keep accumulating for 10 seconds and stabilize
around 25 when the bug is present. Just issuing "killall sleep" has no
effect on them, and stopping haproxy leaves these processes behind.
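The fix described above (unblocking signals in the child after the fork) can be sketched with plain POSIX calls. This is an illustration only, not haproxy's code: spawn_unblocked() is a hypothetical helper name.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that executes `path` with every signal unblocked.
 * Returns the child's pid in the parent, or -1 if fork() failed.
 * (Hypothetical helper, not a haproxy function.) */
pid_t spawn_unblocked(const char *path, char *const argv[])
{
    pid_t pid = fork();

    if (pid == 0) {
        sigset_t none;

        /* The blocked-signal mask is inherited across fork() and
         * preserved by exec, so reset it before running the command;
         * otherwise a later kill() from the parent stays pending. */
        sigemptyset(&none);
        sigprocmask(SIG_SETMASK, &none, NULL);
        execv(path, argv);
        _exit(127); /* only reached if execv() failed */
    }
    return pid;
}
```

A parent that runs with signals blocked can then spawn "/bin/sleep 10" this way and terminate it with kill(pid, SIGTERM); without the sigprocmask() call the signal would never be delivered to the child.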
BUG/MINOR: mworker/cli: don't output a \n before the response
When using a level lower than admin on the master CLI, a \n is output
before the response. This is caused by the response to the "operator" or
"user" command that is sent before the actual command.
To fix this problem we introduce the flag APPCTX_CLI_ST1_NOLF, which asks
for a command response not to be followed by the final \n.
This patch special-cases the "operator" and "user" commands followed by
a "-" so that their responses are not followed by \n.
BUG/MEDIUM: mux-h1: Always release H1C if a shutdown for writes was reported
We must take care of this when the stream is detached from the
connection. Otherwise, on the server side, the connection is inserted in the list
of idle connections of the session. But when reused, because the shutdown for
writes was already caught, nothing is sent to the server and the session is
blocked with a frozen connection.
This patch must be backported to 2.0 and 1.9. It is related to the issue #136
reported on Github.
Olivier Houchard [Fri, 28 Jun 2019 12:10:33 +0000 (14:10 +0200)]
BUG/MEDIUM: ssl: Don't attempt to set alpn if we're not using SSL.
Checks use ssl_sock_set_alpn() to set the ALPN if check-alpn is used, however
check-alpn failed to check if the connection was indeed using SSL, and thus,
would crash if check-alpn was used on a non-SSL connection. Fix this by
making sure the connection uses SSL before attempting to set the ALPN.
BUG/MINOR: mux-h1: Make format errors during output formatting fatal
These errors are unexpected at this stage and there is not much more to do than
to close the connection and leave. So now, when it happens, the flag
H1C_F_CS_ERROR is set on the H1 connection and the flag HTX_FL_PARSING_ERROR is
set on the channel's HTX message.
BUG/MEDIUM: mux-h1: Use buf_room_for_htx_data() to detect too large messages
During headers parsing, an error is returned if the message is too large and
does not fit in the input buffer. The mux h1 used the function b_full() to do
so. But to allow zero copy transfers, in h1_recv(), the input buffer is
pre-aligned and thus a few bytes always remain free.
To fix the bug, as during the trailers parsing, the function
buf_room_for_htx_data() should be used instead.
BUG/MEDIUM: proto_htx: Don't add EOM on 1xx informational messages
Since the commit b75b5eaf ("MEDIUM: htx: 1xx messages are now part of the final
reponses"), these messages are part of the response and should not contain
EOM. This block is skipped during responses parsing, but analyzers still add it
for "100-Continue" and "103-Early-Hints". It can also be added for error files
with 1xx status code.
Now, when HAProxy generates such transitional responses, it does not emit EOM
blocks. And informational messages are now forbidden in error files.
BUG/MINOR: memory: Set objects size for pools in the per-thread cache
When a memory pool is created, it may be allocated from a static array. This
happens for "most common" pools, allocated first. Objects of these pools may
also be cached in a pool cache. Of course, to not cache too many entries, we
track the number of cached objects and the total size of the cache.
But the objects size of each pool in the cache (ie, pool_cache[tid][idx].size,
where tid is the thread-id and idx is the index of the pool) was never set. So
the total size of the cache was never limited. Now when a pool is created, if
these objects may be cached, we set the corresponding objects size in the pool
cache.
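The accounting problem can be shown with a small model. It only borrows the pool_cache[tid][idx].size notation from the message above; the limits, the byte counter and both function names are assumptions made for the illustration, not HAProxy's definitions.

```c
#include <stddef.h>

#define MAX_THREADS 4     /* assumed value for the sketch */
#define MAX_POOLS   8     /* assumed value for the sketch */
#define CACHE_LIMIT 1024  /* hypothetical per-thread byte budget */

struct pool_cache_head {
    size_t size;     /* size of one object of this pool */
    unsigned count;  /* number of objects currently cached */
};

static struct pool_cache_head pool_cache[MAX_THREADS][MAX_POOLS];
static size_t pool_cache_bytes[MAX_THREADS];

/* Record the object size of pool idx for every thread. This is the
 * step that was missing: it must happen when the pool is created. */
void pool_cache_set_size(int idx, size_t size)
{
    for (int tid = 0; tid < MAX_THREADS; tid++)
        pool_cache[tid][idx].size = size;
}

/* Try to cache one object; returns 1 if cached, 0 if the budget is full. */
int pool_cache_put(int tid, int idx)
{
    size_t size = pool_cache[tid][idx].size;

    if (pool_cache_bytes[tid] + size > CACHE_LIMIT)
        return 0;
    pool_cache[tid][idx].count++;
    pool_cache_bytes[tid] += size;
    return 1;
}
```

If pool_cache_set_size() is never called, size stays at 0, the byte counter never grows, and pool_cache_put() accepts objects forever, which is the unbounded behaviour the message describes.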
BUG/MAJOR: mux-h1: Don't crush trash chunk area when outgoing message is formatted
When an outgoing HTX message is formatted before sending it, a trash chunk is
used to do the formatting. Its content is then copied into the output buffer of
the H1 connection. There are some tricks to avoid this last copy. First, if
possible we perform a zero-copy by swapping the area of the HTX buffer with the
one of the output buffer. If zero-copy is not possible, but if the output buffer
is empty, we don't use a trash chunk. To do so, we change the area of the trash
chunk to point on the one of the output buffer. But it is terribly wrong. Trash
chunks are global variables, allocated statically. If the area is changed, the
old one is lost. Worse, the area of the output buffer is dynamically allocated,
so it is released when emptied, leaving the trash chunk with a freed area (in
fact, it is a bit more complicated because buffers are allocated from a memory
pool).
So, honestly, I don't know why we never experienced any problem because of this bug
until now. To fix it, we still use a temporary buffer, but we assign it to a
trash chunk only when other solutions were excluded. This way, we never
overwrite the area of a trash chunk.
BUG/MINOR: htx: Save hdrs_bytes when the HTX start-line is replaced
The HTX start-line contains the number of bytes held by all headers as seen by
the mux during the parsing. So it must not be updated during analysis. It was
done when the start-line is replaced, so this update was removed at this
place. But we still save it from the old start-line to not lose it. It should
not be used outside the mux, but there is no reason to skip it. It is a bug,
however it should have no impact.
Olivier Houchard [Mon, 24 Jun 2019 16:57:39 +0000 (18:57 +0200)]
BUG/MEDIUM: ssl: Don't do anything in ssl_subscribe if we have no ctx.
In ssl_subscribe(), make sure we have a ssl_sock_ctx before doing anything.
When ssl_sock_close() is called, it wakes any subscriber up, and that
subscriber may decide to subscribe again, for some reason. If we no longer
have a context, there's not much we can do.
Olivier Houchard [Mon, 24 Jun 2019 16:19:40 +0000 (18:19 +0200)]
BUG/MEDIUM: connections: Always add the xprt handshake if needed.
In connect_server(), we used to only call xprt_add_hs() if CO_FL_SEND_PROXY
was set during the function call, we would not do it if the flag was set
before connect_server() was called. The rationale at the time was if the flag
was already set, then the XPRT was already present. But now the xprt_handshake
always removes itself, so we have to re-add it each time, or it wouldn't be
done if the first connection attempt failed.
While I'm there, check any non-ssl handshake flag, instead of just
CO_FL_SEND_PROXY, or we'd miss the SOCKS4 flags.
Olivier Houchard [Mon, 24 Jun 2019 14:08:08 +0000 (16:08 +0200)]
BUG/MEDIUM: stream_interface: Don't add SI_FL_ERR the state is < SI_ST_CON.
Only add SI_FL_ERR if the stream_interface is connected, or is attempting
a connection. We may get there because the stream_interface's tasklet
was woken up, but before it actually runs, process_stream() may be called,
detect that there was an error, and change the state of the stream_interface
to SI_ST_TAR. When the stream_interface's tasklet then runs, the connection
may still have CO_FL_ERROR, but that error was already accounted for, so
just ignore it.
BUG/MEDIUM: mworker: don't call the thread and fdtab deinit
Before switching to wait mode, the per thread deinit should not be
called, because we didn't initiate threads and fdtab.
The problem is that the master could crash if we try to reload HAProxy.
The commit 944e619 ("MEDIUM: mworker: wait mode use standard init code
path") removed the deinit code by accident, but its fix 7c756a8
("BUG/MEDIUM: mworker: fix FD leak upon reload") was incomplete and did
not take care of the WAIT_MODE.
Tim Duesterhus [Sun, 23 Jun 2019 20:10:12 +0000 (22:10 +0200)]
BUG/MINOR: mworker-prog: Fix segmentation fault during cfgparse
Consider this configuration:
    frontend fe_http
        mode http
        bind *:8080
        default_backend be_http
    backend be_http
        mode http
        server example example.com:80
    program foo bar
Running with valgrind results in:
    ==16252== Invalid read of size 8
    ==16252==    at 0x52AE3F: cfg_parse_program (mworker-prog.c:233)
    ==16252==    by 0x4823B3: readcfgfile (cfgparse.c:2180)
    ==16252==    by 0x47BCED: init (haproxy.c:1649)
    ==16252==    by 0x404E22: main (haproxy.c:2714)
    ==16252==  Address 0x48 is not stack'd, malloc'd or (recently) free'd
Check whether `ext_child` is valid before attempting to free it and its
contents.
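The guard described above follows a common pattern, sketched here on a made-up structure. struct ext_child_sketch and its fields are assumptions for the example and do not mirror the real mworker structure.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the per-program child description. */
struct ext_child_sketch {
    char *command;  /* may be NULL if parsing stopped early */
    char **argv;    /* NULL-terminated array, or NULL */
};

/* Release a child description. A NULL pointer, which is what the error
 * path gets when the section was never fully parsed, is ignored
 * instead of being dereferenced. */
void ext_child_free(struct ext_child_sketch *child)
{
    if (!child)
        return;
    free(child->command);
    if (child->argv) {
        for (size_t i = 0; child->argv[i]; i++)
            free(child->argv[i]);
        free(child->argv);
    }
    free(child);
}
```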
Willy Tarreau [Sat, 22 Jun 2019 06:24:16 +0000 (08:24 +0200)]
BUILD: makefile: do not rely on shell substitutions to determine git version
Solaris's default shell doesn't support substitutions at the beginning or
end of variables, which are still used to determine the version based on
git. Since we added --abbrev=0 we don't need the last one. And using cut
it's trivial to replace the first one, actually simplifying the whole
expression.
Willy Tarreau [Sat, 22 Jun 2019 06:13:24 +0000 (08:13 +0200)]
BUILD: makefile: adjust the sed expression of "make help" for solaris
Solaris's sed doesn't take the 'p' argument on the 's' command so
nothing is printed. Just passing ';p' fixes this without affecting
other implementations. Additionally, optional characters cannot be
matched using a question mark, which is always searched verbatim, so
the leading '#' wasn't stripped. Using \{0,1\} works fine everywhere
so let's use this instead.
Willy Tarreau [Sat, 22 Jun 2019 05:51:02 +0000 (07:51 +0200)]
BUILD: makefile: use :space: instead of digits to count commits
The 'tr' command on Solaris doesn't conform to POSIX and requires
brackets around ranges. So the sequence '0-9' is understood as the
3 characters '0', '-', and '9'. This causes tagged versions (those
with no commit after the last commit) to be numbered as an empty
string, resulting in an error being reported while computing the
version number.
All implementations support '[:space:]' to delete leading spaces,
so let's use this instead.
Willy Tarreau [Sat, 22 Jun 2019 05:41:38 +0000 (07:41 +0200)]
BUILD: mworker: silence two printf format warnings around getpid()
getpid() is documented as returning a pid_t result, not
necessarily an int. This causes a build warning on Solaris 10
because '%d' or '%u' are used in the format passed to snprintf().
Let's just cast the result as an int (respectively unsigned int).
This can be backported to 2.0 and possibly older versions though
it really has no impact.
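The cast mentioned above looks like this in isolation. format_pid() is a hypothetical helper used only to show the pattern.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Write the current process ID into dst as decimal text.
 * pid_t is not guaranteed to be an int, so the value is cast
 * explicitly to match the "%d" conversion. */
int format_pid(char *dst, size_t size)
{
    return snprintf(dst, size, "%d", (int)getpid());
}
```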
BUG/MAJOR: sample: Wrong stick-table name parsing in "if/unless" ACL condition.
This bug was introduced by 1b8e68e commit which supposed the stick-table was always
stored in struct arg at parsing time. This is never the case with the usage of
"if/unless" conditions in stick-table declared as backends. In this case, this is
the name of the proxy which must be considered as the stick-table name.
BUG/MEDIUM: lb_fwlc: Don't test the server's lb_tree from outside the lock
In the function fwlc_srv_reposition(), the server's lb_tree is tested from
outside the lock. So it is possible to remove it after the test and then call
eb32_insert() in fwlc_queue_srv() with a NULL root pointer, which is
invalid. Moving the test in the scope of the lock fixes the bug.
This issue was reported on Github, issue #126.
This patch must be backported to 2.0, 1.9 and 1.8.
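The race can be illustrated with a generic pthread sketch. It does not use HAProxy's locking macros or ebtree; struct tree_sketch, struct server_sketch and srv_reposition_sketch() are hypothetical names, and the "reposition" is reduced to a counter update.

```c
#include <pthread.h>
#include <stddef.h>

struct tree_sketch { int nodes; };                      /* stand-in for the LB tree */
struct server_sketch { struct tree_sketch *lb_tree; };  /* NULL when dequeued */

static pthread_mutex_t lb_lock = PTHREAD_MUTEX_INITIALIZER;

/* Re-queue a server. The lb_tree pointer is tested only once the lock
 * is held, so another thread cannot clear it between the test and the
 * insertion. Returns 1 if the server was re-queued, 0 otherwise. */
int srv_reposition_sketch(struct server_sketch *s)
{
    int done = 0;

    pthread_mutex_lock(&lb_lock);
    if (s->lb_tree) {
        s->lb_tree->nodes++;  /* stands for dequeue + insert */
        done = 1;
    }
    pthread_mutex_unlock(&lb_lock);
    return done;
}
```

Testing s->lb_tree before taking the lock would leave a window in which the pointer becomes NULL after the test, which is the situation the fix removes.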
BUG/MEDIUM: mux-h2: Remove the padding length when a DATA frame size is checked
When a DATA frame is processed for a message with a content-length, we first
take care to not have a frame size that exceeds the remaining to
read. Otherwise, an error is triggered. But we must remove the padding length
from the frame size because the padding is not included in the announced
content-length.
BUG/MEDIUM: mux-h2: Reset padlen when several frames are demux
In the function h2_process_demux(), if several frames are parsed, the padding
length must be reset between each frame. Otherwise we may wrongly think a frame
has a padding block because the previous one was padded.
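Both padding fixes (this one and the previous one) can be pictured with a simplified demux loop. It is a sketch under assumptions: struct frame_sketch and demux_data_frames() are invented for the example, and only the length arithmetic is modelled. In HTTP/2 a padded DATA frame carries a one-byte Pad Length field plus the padding itself, and both count in the frame length but not in the content-length.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, already-decoded DATA frame header. */
struct frame_sketch {
    uint32_t len;    /* frame payload length, padding included */
    uint8_t padlen;  /* value of the Pad Length field */
    int padded;      /* PADDED flag set on this frame */
};

/* Account n DATA frames against the remaining announced body size.
 * Returns 0 on success, -1 if a frame is malformed or carries more
 * data than what remains to be read. */
int demux_data_frames(const struct frame_sketch *f, size_t n, uint64_t *body_left)
{
    for (size_t i = 0; i < n; i++) {
        uint32_t padlen = 0;  /* reset for each frame */
        uint32_t data;

        if (f[i].padded)
            padlen = (uint32_t)f[i].padlen + 1;  /* Pad Length byte + padding */
        if (padlen > f[i].len)
            return -1;
        data = f[i].len - padlen;  /* only real data counts */
        if (data > *body_left)
            return -1;
        *body_left -= data;
    }
    return 0;
}
```

Declaring padlen inside the loop is what guarantees that an unpadded frame following a padded one is not treated as padded.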
BUG/MEDIUM: htx: Fully update HTX message when the block value is changed
Everywhere the value length of a block is changed, calling the function
htx_set_blk_value_len(), the HTX message must be updated. But at many places,
because of the recent changes in the HTX structure, this update was only
partially done. tail_addr and head_addr values were not systematically updated.
In fact, the function htx_set_blk_value_len() was designed as an internal
function to the HTX API. And we used it from outside by convenience. But it is
really painful and error prone to let the caller update the HTX message. So
now, we use the function htx_change_blk_value_len() wherever possible. It
changes the value length of a block and updates the HTX message accordingly.
MINOR: htx: Add the function htx_change_blk_value_len()
As its name suggests, this function changes the value length of a block. But it
also updates the HTX message accordingly. It simplifies the HTX API. The function
htx_set_blk_value_len() is still available and must be used with caution because
this one does not update the HTX message. It just updates the HTX block. It
should be considered as an internal function. When possible,
htx_change_blk_value_len() should be used instead.
This function is used to fix a bug affecting the 2.0. So, this patch must be
backported to 2.0.
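The difference between the two functions can be modelled as follows. This is a deliberately reduced sketch: the structures and the *_sketch names are assumptions, and the real HTX message tracks more state (such as tail_addr and head_addr) than the single data counter used here.

```c
#include <stdint.h>

struct blk_sketch {
    uint32_t name_len;
    uint32_t value_len;
};

struct msg_sketch {
    uint32_t data;  /* total payload bytes held by the message */
};

/* Low-level setter: touches the block only, like an internal helper. */
static void set_blk_value_len_sketch(struct blk_sketch *blk, uint32_t len)
{
    blk->value_len = len;
}

/* Public-style helper: changes the value length and keeps the
 * message-level accounting consistent with it. */
void change_blk_value_len_sketch(struct msg_sketch *msg, struct blk_sketch *blk,
                                 uint32_t len)
{
    msg->data = msg->data - blk->value_len + len;
    set_blk_value_len_sketch(blk, len);
}
```

Callers that use only the low-level setter would leave msg->data stale, which is the kind of partial update the related bug fix addresses.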
Tim Duesterhus [Mon, 17 Jun 2019 14:10:07 +0000 (16:10 +0200)]
BUG/MEDIUM: compression: Set Vary: Accept-Encoding for compressed responses
Make HAProxy set the `Vary: Accept-Encoding` response header if it compressed
the server response.
Technically the `Vary` header SHOULD also be set for responses that would
normally be compressed based off the current configuration, but are not due
to a missing or invalid `Accept-Encoding` request header or due to the
maximum compression rate being exceeded.
Not setting the header in these cases does no real harm, though: An
uncompressed response might be returned by a Cache, even if a compressed
one could be retrieved from HAProxy. This increases the traffic to the end
user if the cache is unable to compress itself, but it saves another
roundtrip to HAProxy.
see the discussion on the mailing list: https://www.mail-archive.com/haproxy@formilux.org/msg34221.html
Message-ID: 20190617121708.GA2964@1wt.eu
A small issue remains: The User-Agent is not added to the `Vary` header,
despite being relevant to the response. Adding the User-Agent header would
make responses effectively uncacheable and it's unlikely to see a Mozilla/4
in the wild in 2019.
Add a reg-test to ensure the behaviour as described in this commit message.
see issue #121
Should be backported to all branches with compression (i.e. 1.6+).
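As a rough illustration of the behaviour, the sketch below emits the extra response headers only when the body was actually compressed. add_compression_headers() is a hypothetical function and does not reflect how the compression filter builds headers.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Append the headers implied by compression to dst.
 * Returns the number of bytes that snprintf() would write,
 * or 0 when the response is left uncompressed. */
int add_compression_headers(char *dst, size_t size, bool compressed,
                            const char *algo)
{
    if (!compressed) {
        if (size)
            dst[0] = '\0';
        return 0;
    }
    /* The response now depends on the request's Accept-Encoding,
     * so caches must be told through the Vary header. */
    return snprintf(dst, size,
                    "Content-Encoding: %s\r\nVary: Accept-Encoding\r\n", algo);
}
```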
BUG/MINOR: mux-h1: Add the header connection in lower case in outgoing messages
When necessary, this header is directly added in outgoing messages by the H1
multiplexer. Because there is no HTX conversion first, the header name is not
converted to its lower case version. So, it must be added in lower case by
the multiplexer.
Baptiste Assmann [Thu, 13 Jun 2019 11:24:29 +0000 (13:24 +0200)]
MEDIUM: server: server-state global file stored in a tree
Server states can be recovered from either a "global" file (all backends)
or a "local" file (per backend).
The way the algorithm to parse the state file was first implemented was good
enough for a low number of backends and servers per backend.
Basically, for each backend the state file (global or local) is opened,
parsed entirely and for each line we check if it contains data related to
a server from the backend we're currently processing.
We must read the file entirely, just in case some lines for the current
backend are stored at the end of the file.
This does not scale at all!
This patch changes the behavior above for the "global" file only. Now,
the global file is read and parsed once and all lines it contains are
stored in a tree, for faster discovery.
This results in far fewer fopen, fgets, and strcmp calls, which makes
loading very big state files very quick now.
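The "parse once, look up many times" idea can be sketched with the POSIX tsearch()/tfind() binary tree, which stands in here for HAProxy's own tree implementation. The key format (one string per server) and both function names are assumptions made for the example.

```c
#include <search.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* One stored state-file line, indexed by a hypothetical key. */
struct state_line {
    char key[64];
    char line[256];
};

static int state_cmp(const void *a, const void *b)
{
    return strcmp(((const struct state_line *)a)->key,
                  ((const struct state_line *)b)->key);
}

/* Store one line in the tree. Returns 0 on success, -1 on allocation
 * failure. If the key already exists, the first line is kept. */
int state_tree_add(void **root, const char *key, const char *line)
{
    struct state_line *node = calloc(1, sizeof(*node));
    void *res;

    if (!node)
        return -1;
    snprintf(node->key, sizeof(node->key), "%s", key);
    snprintf(node->line, sizeof(node->line), "%s", line);
    res = tsearch(node, root, state_cmp);
    if (!res) {
        free(node);
        return -1;
    }
    if (*(struct state_line **)res != node)
        free(node);  /* duplicate key */
    return 0;
}

/* Look a line up by key; returns NULL when it is absent. */
const char *state_tree_find(void *const *root, const char *key)
{
    struct state_line probe;
    void *res;

    memset(&probe, 0, sizeof(probe));
    snprintf(probe.key, sizeof(probe.key), "%s", key);
    res = tfind(&probe, root, state_cmp);
    return res ? (*(struct state_line **)res)->line : NULL;
}
```

The file is read a single time to fill the tree, then each backend performs one lookup per server instead of rescanning the whole file.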
BUG/MEDIUM: h2/htx: Update data length of the HTX when the cookie list is built
When an H2 request is converted into an HTX message, all cookie headers are
grouped into one, each value separated by a semicolon (;). To do so, we add the
header "cookie" with the first value and then we update the value by appending
other cookies. But during this operation, only the size of the HTX block is
updated. And not the data length of the whole HTX message.
It is an old bug and it seems to have worked by chance until now. But it may lead to
undefined behaviour from time to time.
Willy Tarreau [Sun, 16 Jun 2019 18:00:26 +0000 (20:00 +0200)]
[RELEASE] Released version 2.0.0
Released version 2.0.0 with the following main changes :
- MINOR: fd: Don't use atomic operations when it's not needed.
- DOC: mworker-prog: documentation for the program section
- MINOR: http: add a new "http-request replace-uri" action
- BUG/MINOR: 51d/htx: The _51d_fetch method, and the methods it calls are now HTX aware.
- MINOR: 51d: Added dummy libraries for the 51Degrees module for testing.
- MINOR: mworker: change formatting in uptime field of "show proc"
- MINOR: mworker: add the HAProxy version in "show proc"
- MINOR: doc: Remove -Ds option in man page
- MINOR: doc: add master-worker in the man page
- MINOR: doc: mention HAPROXY_LOCALPEER in the man
- BUILD: Silence gcc warning about unused return value
- CLEANUP: 51d: move the 51d dummy lib to contrib/51d/src to match the real lib
- BUILD: travis-ci: add 51Degree device detection, update openssl to 1.1.1c
- MINOR: doc: update the manpage and usage message about -S
- BUILD/MINOR: 51d: Updated build registration output to indicate thatif the library is a dummy one or not.
- BUG/MEDIUM: h1: Don't wait for handshake if we had an error.
- BUG/MEDIUM: h1: Wait for the connection if the handshake didn't complete.
- BUG/MINOR: task: prevent schedulable tasks from starving under high I/O activity
- BUG/MINOR: fl_trace/htx: Be sure to always forward trailers and EOM
- BUG/MINOR: channel/htx: Call channel_htx_full() from channel_full()
- BUG/MINOR: http: Use the global value to limit the number of parsed headers
- BUG/MINOR: htx: Detect when tail_addr meet end_addr to maximize free rooms
- BUG/MEDIUM: htx: Don't change position of the first block during HTX analysis
- CLEANUP: channel: Remove channel_htx_fwd_payload() and channel_htx_fwd_all()
- BUG/MEDIUM: proto_htx: Introduce the state ENDING during forwarding
- MINOR: htx: Add 3 flags on the start-line to deal with the request schemes
- MINOR: h2: Set flags about the request's scheme on the start-line
- MINOR: mux-h1: Set flags about the request's scheme on the start-line
- MINOR: mux-h2: Forward clients scheme to servers checking start-line flags
- MEDIUM: server: server-state only rely on server name
- CLEANUP: connection: rename the wait_event.task field to .tasklet
- CLEANUP: tasks: rename task_remove_from_tasklet_list() to tasklet_remove_*
- BUG/MEDIUM: connections: Don't call shutdown() if we want to disable linger.
- DOC: add some environment variables in section 2.3
- BUILD: makefile: clarify the "help" output and list options
- BUG/MINOR: mux-h1: Wake busy mux for I/O when message is fully sent
- BUG: tasks: fix bug introduced by latest scheduler cleanup
- BUG/MEDIUM: mux-h2: fix early close with option abortonclose
- BUG/MEDIUM: connections: Don't use ALPN to pick mux when in mode TCP.
- BUG/MEDIUM: connections: Don't try to send early data if we have no mux.
- BUG/MEDIUM: mux-h2: properly account for the appended data in HTX
- BUILD: makefile: further clarify the "help" output and list targets
- BUILD: makefile: rename "linux2628" to "linux-glibc" and remove older targets
- BUILD: travis-ci: switch to linux-glibc instead of linux2628
- DOC: update few references to the linux* targets and change them to linux-glibc
- BUILD: makefile: detect and reject recently removed linux targets
- BUILD: makefile: enable linux namespaces by default on linux
- BUILD: makefile: enable TFO on linux platforms
- BUILD: makefile: enable getaddrinfo on the linux-glibc target
- DOC: small updates to the CONTRIBUTING file
- BUG/MEDIUM: ssl: Make sure we initiate the handshake after using early data.
- CLEANUP: removed obsolete examples an move a few to better places
- DOC: Fix typos in CONTRIBUTING
- DOC: update the outdated ROADMAP file
- DOC: create a BRANCHES file to explain the life cycle
- DOC: mention in INSTALL haproxy 2.0 is a long-term supported stable version
- BUILD: travis-ci: TFO and GETADDRINFO are now enabled by default
- BUILD: makefile: make the obsolete target detection compatible with make-3.80
- BUILD: tools: work around an internal compiler bug in gcc-3.4
- BUILD: pattern: work around an internal compiler bug in gcc-3.4
- BUILD: makefile: enable USE_RT on Solaris
- BUILD: makefile: do not use echo -n
- DOC: mention a few common build errors in the INSTALL file
Willy Tarreau [Sun, 16 Jun 2019 17:39:44 +0000 (19:39 +0200)]
DOC: mention a few common build errors in the INSTALL file
These are some errors met when trying to build with gcc 3.4 on an
old (13 years-old) Solaris 10 and on an even older Linux 2.4 with
glibc 2.2.5. A few options were enough to fix the build there.
Willy Tarreau [Sat, 15 Jun 2019 19:58:44 +0000 (21:58 +0200)]
DOC: update the outdated ROADMAP file
At least the load balancing was done. Other points are being carried
since 1.5 or so, they should go into the issue tracker with no version
indication.
Olivier Houchard [Sat, 15 Jun 2019 18:59:30 +0000 (20:59 +0200)]
BUG/MEDIUM: ssl: Make sure we initiate the handshake after using early data.
When we're done sending/receiving early data, and we add the handshake
flags on the connection, make sure we wake the associated tasklet up, so that
the handshake will be initiated.
Willy Tarreau [Sat, 15 Jun 2019 15:15:12 +0000 (17:15 +0200)]
DOC: small updates to the CONTRIBUTING file
There's an abstract explaining what is discussed in the file, a small
explanation of how the project works, which justifies the measures
taken here, and instructions about what to do when a patch is ignored,
or how to annoy everyone.
Willy Tarreau [Fri, 14 Jun 2019 16:33:56 +0000 (18:33 +0200)]
BUILD: makefile: enable getaddrinfo on the linux-glibc target
getaddrinfo() has been available since glibc 2.3.3 or so and is generally
enabled by distro packagers. The main reason for not enabling it on Linux
in the past is that it was known broken on some libc alternatives. It's
the right moment to enable it by default with glibc.
Willy Tarreau [Fri, 14 Jun 2019 14:57:42 +0000 (16:57 +0200)]
BUILD: makefile: enable TFO on linux platforms
TCP Fast Open is supported on all supported Linux kernels and on all
kernels shipped in supported distros, except the older 2.6.32 that
comes with RHEL6. However the option is harmless, will not prevent
from building and smoothly falls back even if forcefully enabled, so
it makes sense to enable it by default. It's still possible to pass
"USE_TFO=" to force it disabled if really desired.
Willy Tarreau [Fri, 14 Jun 2019 14:54:51 +0000 (16:54 +0200)]
BUILD: makefile: enable linux namespaces by default on linux
Oldest kernel found on a supported Linux distro (2.6.32 + backports on
RHEL6) supports network namespaces, so we have no reason not to enable
them by default on the linux-glibc target.
Willy Tarreau [Fri, 14 Jun 2019 14:44:49 +0000 (16:44 +0200)]
BUILD: makefile: detect and reject recently removed linux targets
We've just removed old linux targets "linux22", "linux24", "linux24e",
"linux26" and "linux2628" and it's likely that many build scripts and
packages will still reference these. So let's have the makefile detect
these and reject with instructions instead of silently building with
incorrect options.
Willy Tarreau [Fri, 14 Jun 2019 16:40:48 +0000 (18:40 +0200)]
DOC: update few references to the linux* targets and change them to linux-glibc
The INSTALL guide, the Lua doc and the Prometheus exporter's README all
used to reference "linux2628", "linux26" or even "linux". These were all
updated to consistently reflect "linux-glibc" instead. The default options
were updated there as well so that it should build cleanly on most distros.
Willy Tarreau [Fri, 14 Jun 2019 14:32:09 +0000 (16:32 +0200)]
BUILD: makefile: rename "linux2628" to "linux-glibc" and remove older targets
The linux targets have become more than confusing over time. We used to
have "linux2628" to match the features available in kernels 2.6.28 and
above, without consideration for the libc, and due to many new features
appearing later in kernels, some other options were added that are not
enabled by default in linux2628, so this target doesn't make any sense
anymore. The older ones (linux 2.2, linux 2.4, ...) do not make sense
either since these versions are not supported anymore. Let's clean things
up by creating a new "linux-glibc" target that matches what is available
by default on Linux kernels and glibc present on supported distros at the
time of release. Other libc implementation may use a custom or generic
target or be added later if needed.
Willy Tarreau [Sat, 15 Jun 2019 09:34:41 +0000 (11:34 +0200)]
BUG/MEDIUM: mux-h2: properly account for the appended data in HTX
When commit 0350b90e3 ("MEDIUM: htx: make htx_add_data() never defragment
the buffer") was introduced, it made htx_add_data() actually be able to
add less data than it was asked for, and the callers must use the returned
value to know how much was added. The H2 code used to rely on the frame
length instead of the return value. A version of the code doing this was
written but is obviously not the one that got merged, resulting in breaking
large uploads or downloads when HTX would have instead defragmented the
buffer because the HTX side sees less contents than what the H2 side sees.
This patch fixes this again. No backport is needed.
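The contract described above, where the producer may accept fewer bytes than requested and the caller must trust the returned count, can be sketched on a toy buffer. struct buf_sketch, buf_add_data() and transfer_frame() are hypothetical names, not the HTX or H2 functions.

```c
#include <stddef.h>
#include <string.h>

struct buf_sketch {
    char area[16];
    size_t used;
};

/* Append as much of data as fits; returns the number of bytes
 * actually copied, which may be less than len. */
size_t buf_add_data(struct buf_sketch *b, const char *data, size_t len)
{
    size_t room = sizeof(b->area) - b->used;

    if (len > room)
        len = room;
    memcpy(b->area + b->used, data, len);
    b->used += len;
    return len;
}

/* Caller side: consume only what was accepted, not the frame length,
 * and report what is left for a later attempt. */
size_t transfer_frame(struct buf_sketch *b, const char *frame,
                      size_t frame_len, size_t *frame_left)
{
    size_t sent = buf_add_data(b, frame, frame_len);

    *frame_left = frame_len - sent;
    return sent;
}
```

Advancing by frame_len instead of the returned value would make the sender believe more data was stored than the buffer really holds, which matches the mismatch between the two sides described in the message.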
Olivier Houchard [Fri, 14 Jun 2019 22:14:05 +0000 (00:14 +0200)]
BUG/MEDIUM: connections: Don't try to send early data if we have no mux.
In connect_server(), if we don't yet have a mux, because we're choosing
one depending on the ALPN, don't attempt to send early data. We can't do
it because those data would depend on the mux, that will only be determined
by the handshake.
Willy Tarreau [Sat, 15 Jun 2019 07:55:50 +0000 (09:55 +0200)]
BUG/MEDIUM: mux-h2: fix early close with option abortonclose
Olivier found that commit 99ad1b3e8 ("MINOR: mux-h2: stop relying on
CS_FL_REOS") managed to break abortonclose again with H2. What happens
is that while the CS_FL_REOS flag was set on some transitions to the
HREM state, it's not set on all and is in fact only set when the low
level connection is closed. So making the replacement condition match
the HREM and ERROR states is not correct and causes completely correct
requests to advertise an early close of the connection layer while
only the stream's input is closed.
In order to avoid this, we now properly split the checks for the CLOSED
state and for the closed connection. This way there is no risk to set
the EOS flag too early on the connection.
Willy Tarreau [Fri, 14 Jun 2019 16:05:54 +0000 (18:05 +0200)]
BUG: tasks: fix bug introduced by latest scheduler cleanup
In commit 86eded6c6 ("CLEANUP: tasks: rename task_remove_from_tasklet_list()
to tasklet_remove_*") which consisted in removing the casts between tasks
and tasklet, I was a bit too fast to believe that we only saw tasklets in
this function since process_runnable_tasks() also uses it with tasks under
a cast. So removing the bookkeeping on task_list_size was not appropriate.
Bah, the joy of casts which hide the real thing...
This patch does two things at once to address this mess once and for all:
  - it restores the decrement of task_list_size when it's a real task,
    but moves it to process_runnable_task() since it's the only place
    where it's allowed to call it with a task
  - it moves the increment there as well and renames
    task_insert_into_tasklet_list() to tasklet_insert_into_tasklet_list()
    for obvious consistency reasons.
This way the increment/decrement of task_list_size is made at the only
places where the cast is enforced, so it has less risks to be missed.
The comments on top of these functions were updated to reflect that they
are only supposed to be used with tasklets and that the caller is responsible
for keeping task_list_size up to date if it decides to enforce a task there.
Now we don't have to worry anymore about how these functions work outside
of the scheduler, which is better longterm-wise. Thanks to Christopher for
spotting this mistake.
BUG/MINOR: mux-h1: Wake busy mux for I/O when message is fully sent
If a mux is in busy mode when the outgoing EOM is consumed, it is important to
wake it up for I/O. Because in busy mode, the mux is not subscribed for
receive. Otherwise, it depends on the applicative layer to shutdown the H1
stream. Waking it up allows the mux to catch the read0 as soon as possible.
Willy Tarreau [Fri, 14 Jun 2019 13:52:01 +0000 (15:52 +0200)]
BUILD: makefile: clarify the "help" output and list options
The list of enabled and disabled build options now appears separately
at the end of "make help". This is convenient to know what is enabled
by default on a given target. For example :
$ make help TARGET=linux2628
Enabled features for TARGET 'linux2628' (disable with 'USE_xxx=') :
EPOLL NETFILTER POLL THREAD TPROXY LINUX_TPROXY LINUX_SPLICE LIBCRYPT
CRYPT_H FUTEX ACCEPT4 CPU_AFFINITY DL RT PRCTL THREAD_DUMP
Olivier Houchard [Fri, 14 Jun 2019 13:26:06 +0000 (15:26 +0200)]
BUG/MEDIUM: connections: Don't call shutdown() if we want to disable linger.
In conn_sock_shutw(), avoid calling shutdown() if linger_risk is set. Not
doing so will result in getting sockets in TIME_WAIT for some time.
This is particularly observable with health checks.
Willy Tarreau [Fri, 14 Jun 2019 12:47:49 +0000 (14:47 +0200)]
CLEANUP: tasks: rename task_remove_from_tasklet_list() to tasklet_remove_*
The function really only operates on tasklets, its arguments are always
tasklets cast as tasks to match the function's type, to be cast back to
a struct tasklet. Let's rename it to tasklet_remove_from_tasklet_list(),
take a struct tasklet, and get rid of the undesired task casts.
Willy Tarreau [Fri, 14 Jun 2019 12:42:29 +0000 (14:42 +0200)]
CLEANUP: connection: rename the wait_event.task field to .tasklet
It's really confusing to call it a task because it's a tasklet and used
in places where tasks and tasklets are used together. Let's rename it
to tasklet to remove this confusion.
Baptiste Assmann [Tue, 11 Jun 2019 12:51:49 +0000 (14:51 +0200)]
MEDIUM: server: server-state only rely on server name
Since commit 7da71293e431b5ebb3d6289a55b0102331788ee6 has been added, the
server name (srv->id in the code) is now unique per backend, which
means it can reliably be used to identify a server recovered from the
server-state file.
This patch cleans up the parsing of the server-state file and ensures we use
only the server name as a reliable key.
MINOR: mux-h2: Forward clients scheme to servers checking start-line flags
By default, the scheme "https" is always used. But when an explicit scheme was
defined and when this scheme is "http", we use it in the request sent to the
server. This is done by checking flags of the start-line. If the flag
HTX_SL_F_HAS_SCHM is set, it means an explicit scheme was defined on the client
side. And if the flag HTX_SL_F_SCHM_HTTP is set, it means the scheme "http" was
used.
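The selection rule above can be sketched as follows. The flag names come from the commit messages, but the bit values and the h2_pick_scheme() helper are assumptions made for this illustration, not the actual mux-h2 code.

```c
/* Flag names are from the commit messages; the bit values are assumed. */
#define HTX_SL_F_HAS_SCHM   0x01u  /* an explicit scheme was defined */
#define HTX_SL_F_SCHM_HTTP  0x02u  /* the explicit scheme is "http" */
#define HTX_SL_F_SCHM_HTTPS 0x04u  /* the explicit scheme is "https" */

/* Hypothetical helper: returns the scheme to send to the server.
 * "https" is the default, "http" is only used when the client
 * explicitly asked for it. */
static const char *h2_pick_scheme(unsigned int sl_flags)
{
    if ((sl_flags & HTX_SL_F_HAS_SCHM) && (sl_flags & HTX_SL_F_SCHM_HTTP))
        return "http";
    return "https";
}
```

With this rule, a request without any explicit scheme (for example an H1 request with an origin-form URI) is still sent with "https".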
MINOR: mux-h1: Set flags about the request's scheme on the start-line
We first try to figure out if the URI of the start-line is absolute or not. So,
if it does not start with a slash ("/"), it means the URI is an absolute one and
the flag HTX_SL_F_HAS_SCHM is set. Then checks are performed to know if the
scheme is "http" or "https" and the corresponding flag is set,
HTX_SL_F_SCHM_HTTP or HTX_SL_F_SCHM_HTTPS. Other schemes, for instance ftp, are
ignored.
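A minimal sketch of this detection, assuming hypothetical bit values for the flags and a hypothetical h1_uri_scheme_flags() helper. It applies the rule exactly as stated above and does not special-case other request targets such as "*".

```c
#include <strings.h>   /* strncasecmp() (POSIX) */

/* Flag names are from the commit messages; the bit values are assumed. */
#define HTX_SL_F_HAS_SCHM   0x01u
#define HTX_SL_F_SCHM_HTTP  0x02u
#define HTX_SL_F_SCHM_HTTPS 0x04u

/* Hypothetical helper: returns the scheme flags for a NUL-terminated
 * request URI taken from the start-line. */
static unsigned int h1_uri_scheme_flags(const char *uri)
{
    unsigned int flags;

    if (uri[0] == '/')               /* not an absolute URI: no scheme */
        return 0;

    flags = HTX_SL_F_HAS_SCHM;
    if (strncasecmp(uri, "http://", 7) == 0)
        flags |= HTX_SL_F_SCHM_HTTP;
    else if (strncasecmp(uri, "https://", 8) == 0)
        flags |= HTX_SL_F_SCHM_HTTPS;
    /* other schemes (e.g. ftp) only keep HTX_SL_F_HAS_SCHM */
    return flags;
}
```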
MINOR: h2: Set flags about the request's scheme on the start-line
The flag HTX_SL_F_HAS_SCHM is always set because H2 requests always have an
explicit scheme. Then, the pseudo-header ":scheme" is tested. If it is set to
"http", the flag HTX_SL_F_SCHM_HTTP is set. Otherwise, for all other cases, the
flag HTX_SL_F_SCHM_HTTPS is set. For now, it seems reasonable to have a fallback
on the scheme "https".
MINOR: htx: Add 3 flags on the start-line to deal with the request schemes
The first one, HTX_SL_F_HAS_SCHM, will be used to know the request has an
explicit scheme. So, in H2, it is always true because the pseudo-header
":scheme" is mandatory. In H1, it is only true when an absolute URI is found on
the start-line. The other flags, HTX_SL_F_SCHM_HTTP and HTX_SL_F_SCHM_HTTPS,
will be used to know which scheme the request has. For now, other protocols are
not handled.
The aim of these flags is to pass this information to the backend side in
general, and to the H2 mux in particular. So the multiplexer will have a chance
to use this information to send the right scheme to the server.
BUG/MEDIUM: proto_htx: Introduce the state ENDING during forwarding
This state is used in legacy HTTP when everything was received from an
endpoint but a filter doesn't forward all the data. It is used to not report a
client or a server abort, depending on the channel's flags.
The same must be done on HTX streams. Otherwise, the message may be
truncated. For instance, it may happen with the trace filter when random
forwarding is enabled on the response channel.
BUG/MEDIUM: htx: Don't change position of the first block during HTX analysis
In the HTX structure, the field <first> is used to know where to (re)start the
analysis. It may differ from the message's head. It is especially important to
update it to handle 1xx messages, to be sure to restart the analysis on the next
message (another 1xx message or the final one). It is also updated when some
data are forwarded (the headers or part of the body). But this update is an
error and must never be done at the analysis level. It is a bug, because some
sample fetches may be used after the data forwarding (but before the first send
of course). At this stage, if the first block position does not point on the
start-line, most of HTTP sample fetches fail.
So now, when something is forwarded by HTX analyzers, the first block position
is no longer updated.
This issue was reported on Github. See #119. No backport needed.
BUG/MINOR: htx: Detect when tail_addr meet end_addr to maximize free rooms
When a block's payload is moved during an expansion or when the whole block is
removed, the addresses of free spaces are updated accordingly. We must be
careful to reset them when <tail_addr> becomes equal to <end_addr>. In this
situation, we can maximize the free space between the blocks and their payload
and set the other one to 0. It is also important to be sure to never have
<end_addr> greater than <tail_addr>.
BUG/MINOR: http: Use the global value to limit the number of parsed headers
Instead of using the macro MAX_HTTP_HDR to limit the number of headers parsed
before throwing an error, we now use the custom global variable
global.tune.max_http_hdr.
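The difference between a compile-time limit and a tunable one can be sketched as follows. This is an illustration only: the tune structure stands in for global.tune, count_headers() is a hypothetical helper, and the default value given to MAX_HTTP_HDR is an assumption.

```c
#include <string.h>

#define MAX_HTTP_HDR 101  /* compile-time default; the value is assumed */

/* Stand-in for global.tune.max_http_hdr: initialized from the macro but
 * possibly changed by the configuration at runtime. */
static struct { int max_http_hdr; } tune = { MAX_HTTP_HDR };

/* Hypothetical helper: counts the CRLF-terminated header lines in <buf>
 * up to the empty line. Returns the count, or -1 once the runtime limit
 * is exceeded (the macro is deliberately not used here). */
static int count_headers(const char *buf)
{
    int n = 0;

    while (*buf && !(buf[0] == '\r' && buf[1] == '\n')) {
        const char *eol = strstr(buf, "\r\n");

        if (!eol)
            break;                   /* incomplete line: stop counting */
        if (++n > tune.max_http_hdr)
            return -1;               /* too many headers */
        buf = eol + 2;
    }
    return n;
}
```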
BUG/MINOR: channel/htx: Call channel_htx_full() from channel_full()
When channel_full() is called for an HTX stream, we fall back on the HTX
version. This function is called, among others, from tcp_inspect_request(). With
this patch, the inspect delay is respected again.
BUG/MINOR: flt_trace/htx: Be sure to always forward trailers and EOM
Previous fix about the random forwarding on the message body was not enough to
fix the bug in all cases. Among others, when there is no data but only the EOM,
we must forward everything.
This patch must be backported to 1.9 if the patch 0bdeeaacb ("BUG/MINOR:
flt_trace/htx: Only apply the random forwarding on the message body.") is also
backported.
Willy Tarreau [Fri, 14 Jun 2019 06:30:10 +0000 (08:30 +0200)]
BUG/MINOR: task: prevent schedulable tasks from starving under high I/O activity
With both I/O and tasks in the same tasklet list, we now have a very
smooth and responsive scheduler, providing a good fairness between I/O
activities. With the lower layers relying on tasklet a lot (I/O wakeup,
subscribe, etc), there may often be a large number of totally autonomous
tasklets doing their business such as forwarding data between two muxes.
But the task scheduler historically refrained from picking tasks from the
priority-ordered run queue to put them into the tasklet list until the
latter had fewer than max_runqueue_depth entries. This was to make sure that
low-latency, high-priority tasks would have an opportunity to be dequeued
before others even if they arrive late. But the counter used for this is
still the tasklet list size, which contains countless I/O events. This
causes an unfairness between unbounded I/Os and bounded tasks, resulting
for example in the CLI responding slower when forwarding 40 Gbps of HTTP
traffic spread over a thousand connections.
A good solution consists in sticking to the initial intent of
max_runqueue_depth which is to limit the number of tasks in the list
(to maintain fairness between them) and not to limit the number of these
tasks among tasklets. It just turns out that the task_list_size initially
was this task counter and changed over time to be a tasklet list size.
Let's simply refrain from updating it for pure tasklets so that it takes
back its original role of counting real tasks as its name implies. With
this change the CLI becomes instantly responsive under load again.
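The effect on the scheduler can be illustrated with a small sketch. The pick_budget() helper is hypothetical and the numbers used in the note below are arbitrary; only max_runqueue_depth and task_list_size are names taken from the text above.

```c
/* Hypothetical helper: number of tasks that may still be moved from the
 * priority-ordered run queue to the tasklet list. <task_list_size> is
 * whatever the scheduler counts as already queued. */
static unsigned int pick_budget(unsigned int max_runqueue_depth,
                                unsigned int task_list_size)
{
    if (task_list_size >= max_runqueue_depth)
        return 0;                    /* list considered full: pick nothing */
    return max_runqueue_depth - task_list_size;
}
```

With a depth of 200, 5 queued tasks and 1000 pending I/O tasklets, counting everything gives pick_budget(200, 1005), which is 0, so no task is picked. Counting real tasks only gives pick_budget(200, 5), which leaves room for 195 tasks regardless of the I/O load.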
This patch may possibly be backported to 1.9 though it requires some
careful checks.
Olivier Houchard [Thu, 13 Jun 2019 15:54:33 +0000 (17:54 +0200)]
BUG/MEDIUM: h1: Wait for the connection if the handshake didn't complete.
In h1_init(), also add the H1C_F_CS_WAIT_CONN flag if the handshake didn't
complete, otherwise we may end up letting the upper layer send data too
soon.
Ben51Degrees [Thu, 13 Jun 2019 15:51:59 +0000 (16:51 +0100)]
BUILD/MINOR: 51d: Updated build registration output to indicate if the library is a dummy one or not.
When built with the dummy 51Degrees library for testing, the output will
include "(dummy library)" to ensure it is clear that this is not
the API.
Willy Tarreau [Thu, 13 Jun 2019 13:56:10 +0000 (15:56 +0200)]
CLEANUP: 51d: move the 51d dummy lib to contrib/51d/src to match the real lib
This way the directory structure remains the same as with the real lib and
one can apply the same build options regardless of where the lib is stored,
removing any possible confusion.
Tim Duesterhus [Wed, 12 Jun 2019 18:47:30 +0000 (20:47 +0200)]
BUILD: Silence gcc warning about unused return value
gcc (Ubuntu 5.4.0-6ubuntu1~16.04.11) 5.4.0 20160609
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
complains:
> src/debug.c: In function "ha_panic":
> src/debug.c:162:2: warning: ignoring return value of "write", declared with attribute warn_unused_result [-Wunused-result]
> (void) write(2, trash.area, trash.data);
> ^
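As the warning above shows, the (void) cast is not enough for this gcc version when write() is declared with warn_unused_result. One common way to silence it is to actually store the result, sketched below. The write_best_effort() helper is hypothetical and is not necessarily what the fix uses.

```c
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: best-effort write whose result is deliberately
 * ignored. Storing the return value in a variable is what satisfies
 * warn_unused_result; the cast only avoids an unused-variable warning. */
static inline void write_best_effort(int fd, const void *buf, size_t len)
{
    ssize_t ret = write(fd, buf, len);

    (void)ret; /* nothing useful can be done on failure */
}
```

This fits a panic path where an error on the write cannot be reported anywhere anyway.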