git.ipfire.org Git - thirdparty/haproxy.git/log

MINOR: stream: measure and report a stream's call rate in "show sess"

Quite a few times some bugs have made a stream task incorrectly
handle a complex combination of events, which was often reported as
"100% CPU", and was usually caused by the event not being properly
identified and flushed, and the stream's handler called in loops.

This patch adds a call rate counter to the stream struct. It's not
huge, it's really inexpensive (especially compared to the rest of the
processing function) and will easily help spot such tasks in "show sess"
output, possibly even allowing to kill them.

A future patch should probably consist in alerting when they're above a
certain threshold, possibly sending a dump and killing them. Some options
could also consist in aborting in order to get an analyzable core dump
and let a service manager restart a fresh new process.

MINOR: tasks/activity: report the context switch and task wakeup rates

It's particularly useful to spot runaway tasks to see this. The context
switch rate covers all tasklet calls (tasks and I/O handlers) while the
task wakeups only covers tasks picked from the run queue to be executed.
High values there will indicate either an intense traffic or a bug that
mades a task go wild.

CLEANUP: task: report calls as unsigned in show sess

The "show sess" output used signed ints to report the number of calls,
which is confusing for runaway tasks where the call count can turn
negative.

BUG/MINOR: htx: Exclude TCP proxies when the HTX mode is handled during startup

When tests are performed on the HTX mode during HAProxy startup, only HTTP
proxies are considered. It is important because, since the commit 1d2b586cd
("MAJOR: htx: Enable the HTX mode by default for all proxies"), the HTX is
enabled on all proxies by default. But for TCP proxies, it is "deactivated".

This patch must be backported to 1.9.

BUG/MAJOR: muxes: Use the HTX mode to find the best mux for HTTP proxies only

Since the commit 1d2b586cd ("MAJOR: htx: Enable the HTX mode by default for all
proxies"), the HTX is enabled by default for all proxies, HTTP and TCP, but also
CLI and HEALTH proxies. But when the best mux is retrieved, only HTTP and TCP
modes are checked. If the TCP mode is not explicitly set, it is considered as an
HTTP proxy. It is an hidden bug introduced when the option "http-use-htx" was
added. It has no effect until the commit 1d2b586cd. But now, when a stats socket
is created for the master process, the mux h1 is installed on all incoming
connections to the CLI proxy, leading to segfaults because HTX operations are
performed on raw buffers.

So to fix the buf, when a mux is installed, all proxies are considered as TCP
proxies, except HTTP ones. This way, CLI and HEALTH proxies will be handled as
TCP proxies.

This patch must be backported to 1.9 although it has no effect. It is safer to
not keep hidden bugs.

BUG/MAJOR: lb/threads: fix AB/BA locking issue in round-robin LB

An occasional divide by zero in the round-robin scheduler was addressed
in commit 9df86f997 ("BUG/MAJOR: lb/threads: fix insufficient locking on
round-robin LB") by grabing the server's lock in fwrr_get_server_from_group().

But it happens that this is not the correct approach as it introduces a
case of AB/BA deadlock reported by Maksim Kupriianov. This happens when
a server weight changes from/to zero while another thread extracts this
server from the tree. The reason is that the functions used to manipulate
the state work under the server's lock and grab the LB lock while the ones
used in LB work under the LB lock and grab the server's lock when needed.

This commit mostly reverts the changes above and instead further completes
the locking analysis performed on this code to identify areas that really
need to be protected by the server's lock, since this is the only algorithm
which happens to have this requirement. This audit showed that in fact all
locations which require the server's lock are already protected by the LB
lock. This was not noticed the first time due to the server's lock being
taken instead and due to some functions misleadingly using atomic ops to
modify server fields which are under the LB lock protection (these ones
were now removed).

The change consists in not taking the server's lock anymore here, and
instead making sure that the aforementioned function which used to
suffer from the server's weight becoming zero only uses a copy of the
weight which was preliminary verified to be non-null (when the weight
is null, the server will be removed from the tree anyway so there is
no need to recalculate its position).

With this change, the code survived an injection at 200k req/s split
on two servers with weights changing 50 times a second.

This commit must be backported to 1.9 only.

BUG/MEDIUM: ssl: Return -1 on recv/send if we got EAGAIN.

In ha_ssl_read()/ha_ssl_write(), if we couldn't send/receive data because
we got EAGAIN, return -1 and not 0, as older SSL versions expect that.
This should fix the problems with OpenSSL < 1.1.0.

BUG/MINOR: spoe: Don't systematically wakeup SPOE stream in the applet handler

This can lead to wakeups in loop between the SPOE stream and the SPOE applets
waiting to receive agent messages (mainly AGENT-HELLO and AGENT-DISCONNECT).

This patch must be backported to 1.9 and 1.8.

BUG/MEDIUM: stream: Fix the way early aborts on the client side are handled

A regression was introduced with the commit c9aecc8ff ("BUG/MEDIUM: stream:
Don't request a server connection if a shutw was scheduled"). Among other this,
it breaks the CLI when the shutr on the client side is handled with the client
data. To depend on the flag CF_SHUTW_NOW to not establish the server connection
when an error on the client side is detected is the right way to fix the bug,
because this flag may be set without any error on the client side.

So instead, we abort the request where the error is handled and only when the
backend stream-interface is in the state SI_ST_INI. This way, there is no
ambiguity on the reason why the abort accurred. The stream-interface is also
switched to the state SI_ST_CLO.

This patch must be backported to 1.9. If the commit c9aecc8ff is backported to
previous versions, this one MUST also be backported. Otherwise, it MAY be
backported to older versions that 1.9 with caution.

BUG/MAJOR: stream: Missing DNS context initializations.

Fix some missing initializations wich came with 333939c commit (MINOR: action:
new '(http-request|tcp-request content) do-resolve' action). The DNS contexts of
streams which were allocated were not initialized by stream_new(). This leaded to
accesses to non-allocated memory when freeing these contexts with stream_free().

REGTEST: make the "run-regtests" script search for tests in reg-tests by default

It happens almost daily to me that make regtests fails because the script
found a temporary, old, or broken VTC file that was lying in my work dir,
leaving me no place to hide it. This is a real pain as some tests take ages
to fail, so let's make this script only look up for tests where they are
expected to be stored, under reg-tests only. It remains possible to force
the location on the command line though.

REGTEST: adapt some reg tests after renaming.

Some reg tests and their dependencies have been renamed. They may be
referenced by the .vtc files. So, this patch modifies also the references
to these dependencies.

REGTEST: rename the reg test files.

We rename all the VTC files to avoid name collisions when importing/backporting.

REGTEST: replace LEVEL option by a more human readable one.

This patch replaces LEVEL variable by REGTESTS_TYPES variable which is more
mnemonic and human readable. It is uses as a filter to run the reg tests scripts
where a commented #REGTEST_TYPE may be defined to designate their types.
Running the following command:

$ REGTESTS_TYPES=slow,default

will start all the reg tests where REGTEST_TYPE is defines as 'slow' or 'default'.
Note that 'default' is also the default value of REGTEST_TYPE when not specified
dedicated to run all the current h*.vtc files. When REGTESTS_TYPES is not specified
there is no filter at all. All the tests are run.

This patches also defines REGTEST_TYPE with 'slow' value for all the s*.vtc files,
'bug' value for al the b*.vtc files, 'broken' value for all the k*.vtc files.

MINOR: log: Extract some code to send syslog messages.

This patch extracts the code of __send_log() responsible of sending a syslog
message to a syslog destination represented as a logsrv struct to define
__do_send_log() function. __send_log() calls __do_send_log() for each syslog
destination of a proxy after having prepared some of its parameters.

MINOR: action: new '(http-request|tcp-request content) do-resolve' action

The 'do-resolve' action is an http-request or tcp-request content action
which allows to run DNS resolution at run time in HAProxy.
The name to be resolved can be picked up in the request sent by the
client and the result of the resolution is stored in a variable.
The time the resolution is being performed, the request is on pause.
If the resolution can't provide a suitable result, then the variable
will be empty. It's up to the admin to take decisions based on this
statement (return 503 to prevent loops).

Read carefully the documentation concerning this feature, to ensure your
setup is secure and safe to be used in production.

This patch creates a global counter to track various errors reported by
the action 'do-resolve'.

MINOR: obj_type: new object type for struct stream

This patch creates a new obj_type for the struct stream in HAProxy.

MINOR: dns: move callback affection in dns_link_resolution()

In dns.c, dns_link_resolution(), each type of dns requester is managed
separately, that said, the callback function is affected globaly (and
points to server type callbacks only).
This design prevents the addition of new dns requester type and this
patch aims at fixing this limitation: now, the callback setting is done
directly into the portion of code dedicated to each requester type.

MINOR: dns: dns_requester structures are now in a memory pool

dns_requester structure can be allocated at run time when servers get
associated to DNS resolution (this happens when SRV records are used in
conjunction with service discovery).
Well, this memory allocation is safer if managed in an HAProxy pool,
furthermore with upcoming HTTP action which can perform DNS resolution
at runtime.

This patch moves the memory management of the dns_requester structure
into its own pool.

MINOR: contrib: dummy wurfl library

This is dummy version of the Scientiamobile WURFL C API that can be used
to successfully build/run haproxy compiled with USE_WURFL=1.
It is marked as version 1.11.2.100 to distinguish it from any real version
of the lib. It has no external dependencies so it should work out of the
box by building it like this :

$ make -C contrib/wurfl

In order to use it, simply reference this directory as the WURFL include
and library paths :

$ make TARGET=<target> USE_WURFL=1 WURFL_INC=$PWD/contrib/wurfl WURFL_LIB=$PWD/contrib/wurfl

MINOR: wurfl: enabled multithreading mode

Initially excluded multithreaded mode is completely supported (libwurfl is fully MT safe).
Internal tests now are run also with multithreading enabled.

DOC: wurfl: added point of contact in MAINTAINERS file

CLEANUP: wurfl: removed deprecated methods

last 2 major releases of libwurfl included a complete review of engine options with
the result of deprecating many features. The patch removes unecessary code and fixes
the documentation.
Can be backported on any version of haproxy.

[wt: must not be backported since it removes config keywords and would
thus break existing configurations]
Signed-off-by: Willy Tarreau <w@1wt.eu>

BUILD: wurfl: build fix for 1.9/2.0 code base

This applies the required changes for the new buffer API that came in 1.9.
This patch must be backported to 1.9.

MINOR: wurfl: indicate in haproxy -vv the wurfl version in use

It also explicitly mentions that the library is the dummy one when it
is detected.

We have this output now :

$ ./haproxy -vv |grep -i wurfl
Built with WURFL support (dummy library version 1.11.2.100)

BUILD: add USE_WURFL to the list of known build options

Since the removal of WURFL we've improved the build system to report
known build options, let's reference this option there as well.

Revert "CLEANUP: wurfl: remove dead, broken and unmaintained code"

This reverts commit 8e5e1e7bf000f2603a085c76be118214d22c55b4.

The following patches will fix this code and may be backported.

MINOR: ssl/cli: async fd io-handlers printable on show fd

This patch exports the async fd iohandlers and make them printable
doing a 'show fd' on cli.

MINOR: gcc: Fix a silly gcc warning in connect_server()

Don't know why it happens now, but gcc seems to think srv_conn may be NULL when
a reused connection is removed from the orphan list. It happens when HAProxy is
compiled with -O2 with my gcc (8.3.1) on fedora 29... Changing a little how
reuse parameter is tested removes the warnings. So...

This patch may be backported to 1.9.

BUG/MINOR: da: Get the request channel to call CHECK_HTTP_MESSAGE_FIRST()

Since the commit 89dc49935 ("BUG/MAJOR: http_fetch: Get the channel depending on
the keyword used"), the right channel must be passed as argument when the macro
CHECK_HTTP_MESSAGE_FIRST is called.

This patch must be backported to 1.9.

BUG/MINOR: 51d: Get the request channel to call CHECK_HTTP_MESSAGE_FIRST()

Since the commit 89dc49935 ("BUG/MAJOR: http_fetch: Get the channel depending on
the keyword used"), the right channel must be passed as argument when the macro
CHECK_HTTP_MESSAGE_FIRST is called.

This patch must be backported to 1.9.

BUG/MEDIUM: stream: Don't request a server connection if a shutw was scheduled

If a shutdown for writes was performed on the client side (CF_SHUTW is set on
the request channel) while the server connection is still unestablished (the
stream-int is in the state SI_ST_INI), then it is aborted. It must also be
aborted when the shudown for write is pending (only CF_SHUTW_NOW is
set). Otherwise, some errors on the request channel can be ignored, leaving the
stream in an undefined state.

This patch must be backported to 1.9. It may probably be backported to all
suported versions, but it is unclear if the bug is visbile for older versions
than 1.9. So it is probably safer to wait bug reports on these versions to
backport this patch.

BUG/MEDIUM: thread/http: Add missing locks in set-map and add-acl HTTP rules

Locks are missing in the rules "http-request set-map" and "http-response
add-acl" when an acl or map update is performed. Pattern elements must be
locked.

This patch must be backported to 1.9 and 1.8. For the 1.8, the HTX part must be
ignored.

BUG/MEDIUM: h1: Don't parse chunks CRLF if not enough data are available

As specified in the function comment, the function h1_skip_chunk_crlf() must not
change anything and return zero if not enough data are available. This must
include the case where there is no data at all. On this point, it must do the
same that other h1 parsing functions. This bug is made visible since the commit
91f77d599 ("BUG/MINOR: mux-h1: Process input even if the input buffer is
empty").

This patch must be backported to 1.9.

MINOR: proto_tcp: tcp-request content: enable set-dst and set-dst-var

The set-dst and set dst-var are available at both 'tcp-request
connection' and 'http-request' but not at the layer in the middle.
This patch fixes this miss and enables both set-dst and set-dst-var at
'tcp-request content' layer.

REGTEST: Missing REQUIRE_VERSION declarations.

checks/s00001.vtc needs support for "srvrecord" which came with 1.8 version.
peers/s_basic_sync.vtc and s_tls_basic_sync.vtc need support for "server"
keyword usage in "peers" section which came with 2.0 version.

BUG/MINOR: acl: properly detect pattern type SMP_T_ADDR

Since 1.6-dev4 with commit b2f8f087f ("MINOR: map: The map can return
IPv4 and IPv6"), maps can return both IPv4 and IPv6 addresses, which
is represented as SMP_T_ADDR at the output of the map converter. But
the ACL parser only checks for either SMP_T_IPV4 or SMP_T_IPV6 and
requires to see an explicit matching method specified. Given that it
uses the same pattern parser for both address families, it implicitly
is also compatible with SMP_T_ADDR, which ought to have been added
there.

This fix should be backported as far as 1.6.

BUG/MEDIUM: maps: only try to parse the default value when it's present

Maps returning an IP address (e.g. map_str_ip) support an optional
default value which must be parsed. Unfortunately the parsing code does
not check for this argument's existence and uncondtionally tries to
resolve the argument whenever the output is of type address, resulting
in segfaults at parsing time when no such argument is provided. This
patch adds the appropriate check.

This fix may be backported as far as 1.6.

MEDIUM: connections: Add a way to control the number of idling connections.

As by default we add all keepalive connections to the idle pool, if we run
into a pathological case, where all client don't do keepalive, but the server
does, and haproxy is configured to only reuse "safe" connections, we will
soon find ourself having lots of idling, unusable for new sessions, connections,
while we won't have any file descriptors available to create new connections.

To fix this, add 2 new global settings, "pool_low_ratio" and "pool_high_ratio".
pool-low-fd-ratio is the % of fds we're allowed to use (against the maximum
number of fds available to haproxy) before we stop adding connections to the
idle pool, and destroy them instead. The default is 20. pool-high-fd-ratio is
the % of fds we're allowed to use (against the maximum number of fds available
to haproxy) before we start killing idling connection in the event we have to
create a new outgoing connection, and no reuse is possible. The default is 25.

MINOR: fd: Add a counter of used fds.

Add a new counter, ha_used_fds, that let us know how many file descriptors
we're currently using.

MEDIUM: enable travis-ci builds

currently only xenial/clang build is enabled. osx and xenial/gcc
will be enabled later.

travis-ci is cloud based continuous integration, builds will be
started automatically if they are enabled for certain repo or fork.

Signed-off-by: Ilya Shipitsin <chipitsine@gmail.com>

MINOR: peers: adds counters on show peers about tasks calls.

This patch adds a counter of calls on the orchestator peers task
and a counter on the tasks linked to applet i/o handler for
each peer.

Those two counters are useful to detect if a peer sync is active
or frozen.

This patch is related to the commit:
"MINOR: peers: Add a new command to the CLI for peers."
and should be backported with it.

BUILD/medium: ssl: Fix build with OpenSSL < 1.1.0

Make sure it builds with OpenSSL < 1.1.0, a lot of the BIO_get/set methods
were introduced with OpenSSL 1.1.0, so fallback with the old way of doing
things if needed.

MEDIUM: ssl: provide our own BIO.

Instead of letting the OpenSSL code handle the file descriptor directly,
provide a custom BIO, that will use the underlying XPRT to send/recv data.
This will let us implement QUIC later, and probably clean the upper layer,
if/when the SSL code provide its own subscribe code, so that the upper layers
won't have to care if we're still waiting for the handshake to complete or not.

MEDIUM: connections: Provide a xprt_ctx for each xprt method.

For most of the xprt methods, provide a xprt_ctx. This will be useful later
when we'll want to be able to stack xprts.
The init() method now has to create and provide the said xprt_ctx if needed.

MEDIUM: ssl: provide its own subscribe/unsubscribe function.

In order to prepare for the possibility of using different kinds of xprt
with ssl, make the ssl code provide its own subscribe and unsubscribe
functions, right now it just calls conn_subscribe and conn_unsubsribe.

MEDIUM: connections: Move some fields from struct connection to ssl_sock_ctx.

Move xprt_st, tmp_early_data and sent_early_data from struct connection to
struct ssl_sock_ctx, as they are only used in the SSL code.

MEDIUM: ssl: Give ssl_sock its own context.

Instead of using directly a SSL * as xprt_ctx, give ssl_sock its own context.
It's useless for now, but will be useful later when we'll want to be able to
stack xprts.

MEDIUM: tasks: Use __ha_barrier_store after modifying global_tasks_mask.

Now that we no longer use atomic operations to update global_tasks_mask,
as it's always modified while holding the TASK_RQ_LOCK, we have to use
__ha_barrier_store() instead of __ha_barrier_atomic_store() to ensure
any modification of global_tasks_mask is seen before modifying
active_tasks_mask.

This should be backported to 1.9.

BUG/MINOR: mworker: disable busy polling in the master process

When enabling busy polling, we don't want the master to use it, or it
wastes a dedicated processor to this!

Must be backported to 1.9.

MINOR: contrib/prometheus-exporter: Follow best practices about metrics type

In short, _total metrics are now counters and others are gauges.

No backport needed. See issue #81 on github.

MINOR: contrib/prometheus-exporter: Rename some metrics to be more usable

Some metrics have been renamed and their type adapted to be more usable in
Prometheus:

  * haproxy_process_uptime_seconds -> haproxy_process_start_time_seconds
  * haproxy_process_max_memory -> haproxy_process_max_memory_bytes
  * haproxy_process_pool_allocated_total -> haproxy_process_pool_allocated_bytes
  * haproxy_process_pool_used_total -> haproxy_process_pool_used_bytes
  * haproxy_process_ssl_cache_lookups -> haproxy_process_ssl_cache_lookups_total
  * haproxy_process_ssl_cache_misses -> haproxy_process_ssl_cache_misses_total

No backport needed. See issue #81 on github.

MINOR: contrib/prometheus-exporter: Remove usless rate metrics

Following metrics have been removed:

  * haproxy_frontend_connections_rate_current (ST_F_CONN_RATE)
  * haproxy_frontend_http_requests_rate_current (ST_F_REQ_RATE)
  * haproxy_*_current_session_rate (ST_F_RATE)

These rates can be deduced using the total value with this kind of formula:

  rate(haproxy_frontend_connections_total[1m])

No backport needed. See issue #81 on github.

BUG/MINOR: contrib/prometheus-exporter: Fix a typo in the run-queue metric type

No backport needed.

MEDIUM: tasks: Don't account a destroyed task as a runned task.

In process_runnable_tasks(), if the task we're about to run has been
destroyed, and should be free, don't account for it in the number of task
we ran. We're only allowed a maximum number of tasks to run per call to
process_runnable_tasks(), and freeing one shouldn't take the slot of a
valid task.

MEDIUM: tasks: Merge task_delete() and task_free() into task_destroy().

task_delete() was never used without calling task_free() just after, and
task_free() was only used on error pathes to destroy a just-created task,
so merge them into task_destroy(), that will remove the task from the
wait queue, and make sure the task is either destroyed immediately if it's
not in the run queue, or destroyed when it's supposed to run.

CLEANUP: task: remain consistent when using the task's handler

A pointer "process" is assigned the task's handler in
process_runnable_tasks(), we have no reason to use t->process
right after it is assigned.

MINOR: task/thread: factor out a wake-up condition

The wakeup condition in task_wakeup() is redundant as it is already
validated by the CAS. Better move the __task_wakeup() call there, it
also has the merit of being easier to audit this way. This also reduces
the code size by around 1.8 kB :

  $ size haproxy-?
     text    data     bss     dec     hex filename
  2153806  100208 1307676 3561690  3658da haproxy-1
  2152094  100208 1307676 3559978  36522a haproxy-2

BUG/MAJOR: task: make sure never to delete a queued task

Commit 0c7a4b6 ("MINOR: tasks: Don't set the TASK_RUNNING flag when
adding in the tasklet list.") revealed a hole in the way tasks may
be freed : they could be removed while in the run queue when the
TASK_QUEUED flag was present but not the TASK_RUNNING one. But it
seems the issue was emphasized by commit cde7902 ("MEDIUM: tasks:
improve fairness between the local and global queues") though the
code it replaces was already affected given how late the TASK_RUNNING
flag was set after removal from the global queue.

At the moment the task is picked from the global run queue, if it
is the last one, the global run queue lock is dropped, and then
the TASK_RUNNING flag was added. In the mean time another thread
might have performed a task_free(), and immediately after, the
TASK_RUNNING flag was re-added to the task, which was then added
to the tasklet list. The unprotected window was extremely faint
but does definitely exist and inconsistent task lists have been
observed a few times during very intensive tests over the last few
days. From this point various options are possible, the task might
have been re-allocated while running, and assigned state 0 and/or
state QUEUED while it was still running, resulting in the tast not
being put back into the tree.

This commit simply makes sure that tests on TASK_RUNNING before removing
the task also cover TASK_QUEUED.

It must be backported to 1.9 along with the previous ones touching
that area.

BUG/MEDIUM: applets: Don't use task_in_rq().

When deciding if we want to wake the task of an applet up, don't give up
if task_in_rq returns 1, as there's a race condition and another thread
may run it. Instead, always attempt to task_wakeup(), at worst the task
is already in the run queue, and nothing will happen.

MINOR: tasks: Don't set the TASK_RUNNING flag when adding in the tasklet list.

Now that TASK_QUEUED is enforced, there's no need to set TASK_RUNNING when
removing the task from the runqueue to add it to the tasklet list. The flag
will only be set right before we run the task.

MEDIUM: tasks: No longer use rq.node.leaf_p as a lock.

Now that we have the warranty that a task won't be added in the runqueue
while the TASK_QUEUED or the TASK_RUNNING flag is set, don't bother trying
to lock the task by setting leaf_p to 0x1 while inserting it in the runqueue
or having it in the tasklet_list, as nobody else will attempt to add it.

MINOR: tasks: Don't consider we can wake task with tasklet_wakeup().

In tasklet_wakeup(), don't bother checking if the tasklet is really a task,
calling tasklet_wakeup() with a task is invalid.

BUG/MEDIUM: tasks: Make sure we modify global_tasks_mask with the rq_lock.

When modifying global_tasks_mask, make sure we hold the rq_lock, or we might
remove the bit while it has been re-set by somebody else, and we make not
be waked when needed.

BUG/MEDIUM: tasks: Make sure we set TASK_QUEUED before adding a task to the rq.

Make sure we set TASK_QUEUED in every case before adding the task to the
run queue. task_wakeup() now checks if either TASK_QUEUED or TASK_RUNNING
is set, and if neither is set, add TASK_QUEUED and effectively add the task
to the runqueue.
No longer use __task_wakeup() anywhere except in task_wakeup(), always use
task_wakeup() instead.
With the old code, process_runnable_task() may re-add a task in the runqueue
without setting the TASK_QUEUED flag, and there were race conditions that could
lead to a task having the TASK_QUEUED flag but not in the runqueue, thus
being unschedulable.

This should be backported to 1.9.

BUG/MINOR: http_fetch/htx: Use HTX versions if the proxy enables the HTX mode

Because the HTX is now the default mode for all proxies (HTTP and TCP), it is
better to match on the proxy options to know if the HTX is enabled or not. This
way, if a TCP proxy explicitly disables the HTX mode, the legacy version of HTTP
fetches will be used.

No backport needed except if the patch activating the HTX by default for all
proxies is backported.

BUG/MINOR: http_fetch/htx: Allow permissive sample prefetch for the HTX

As for smp_prefetch_http(), there is now a way to successfully perform a
prefetch in HTX, even if the message forwarding already begun. It is used for
the sample fetches "req.proto_http" and "method".

This patch must be backported to 1.9.

BUG/MAJOR: http_fetch: Get the channel depending on the keyword used

All HTTP samples are buggy because the channel tested in the prefetch functions
(HTX and legacy HTTP) is chosen depending on the sample direction and not the
keyword really used. It means the request channel is used if the sample is
called during the request analysis and the response channel is used if it is
called during the response analysis, regardless the sample really called. For
instance, if you use the sample "req.ver" in an http-response rule, the response
channel will be prefeched because it is called during the response analysis,
while the request channel should have been used instead. So some assumptions on
the validity of the sample may be made on the wrong channel. It is the first
bug.

Then the same error is done in some samples themselves. So fetches are performed
on the wrong channel. For instance, the header extraction (req.fhdr, res.fhdr,
req.hdr, res.hdr...). If the sample "req.hdr" is used in an http-response rule,
then the matching is done on the response headers and not the request ones. It
is the second bug.

Finally, the last one but not the least, in some samples, the right channel is
used. But because the prefetch was done on the wrong one, this channel may be in
a undefined state. For instance, using the sample "req.ver" in an http-response
rule leads to a matching on a posibility released buffer.

To fix all these bugs, the right channel is now chosen in sample fetches, before
the prefetch. If the same function is used to fetch requests and responses
elements, then the keyword is used to choose the right one. This channel is then
used by the functions smp_prefetch_htx() and smp_prefetch_http(). Of course, it
is also used by the samples themselves to extract information.

This patch must be backported to all supported versions. For version 1.8 and
priors, it must be totally refactored. First because there is no HTX into these
versions. Then the buffers API has changed in HAProxy 1.9. The files
http_fetch.{ch} doesn't exist on old versions.

BUG/MEDIUM: htx: Don't return the start-line if the HTX message is empty

In the function htx_get_stline(), NULL must be returned if the HTX message
doesn't contain any element.

This patch must be backported to 1.9.

MINOR: mux-h1: Handle read0 during TCP splicing

It avoids a roundtrip with underlying I/O callbacks to do so. If a read0 is
handled at the end of h1_rcv_pipe(), the flag CS_FL_REOS is set on the
conn_stream. And if there is no data in the pipe, the flag CS_FL_EOS is also
set.

This path may be backported to 1.9.

BUG/MEDIUM: mux-h1: Enable TCP splicing to exchange data only

Use the TCP splicing only when the input parser is in the state H1_MSG_DATA or
H1_MSG_TUNNEL and don't transfer more than then known expected length for these
data (unlimited for the tunnel mode). In other states or when all data are
transferred, the TCP splicing is disabled.

This patch must be backported to 1.9.

BUG/MEDIUM: mux-h1: Notify the stream waiting for TCP splicing if ibuf is empty

When a stream-interface want to use the TCP splicing to forward its data, it
notifies the mux h1. We will then flush the input buffer and don't read more
data. So the stream-interface will not be notified for read anymore, except if
an error or a read0 is detected. It is a problem everytime the receive I/O
callback is called again. It happens when the pipe is full or when no data are
received on the pipe. It also happens when the input buffer is freshly
flushed. Because the TCP splicing is enabled, nothing is done in h1_recv() and
the stream-interface is never woken up. So, now, in h1_recv(), if the TCP
splicing is used and the input buffer is empty, the stream-interface is notified
for read.

This patch must be backported to 1.9.

BUG/MINOR: mux-h1: Don't switch the parser in busy mode if other side has done

There is no reaon to switch the input parser in busy mode if all the output has
been processed.

This patch must be backported to 1.9.

BUG/MINOR: mux-h1: Process input even if the input buffer is empty

It is required, at least, to add the EOM block and finish the message when the
TCP splicing was used to send all data. Otherwise, there is no way to finish the
parsing.

This patch must be backported to 1.9.

REGTESTS: exclude tests that require ssl, pcre if no such feature is enabled

Signed-off-by: Ilya Shipitsin <chipitsine@gmail.com>

BUG/MINOR: mworker: ensure that we still quits with SIGINT

Since the fix "BUG/MINOR: mworker: don't exit with an ambiguous value"
we are leaving with a EXIT_SUCCESS upon a SIGINT.

We still need to quit with a SIGINT when a worker leaves with a SIGINT.

This is done this way because vtest expect a 130 during the process
stop, haproxy without mworker returns a 130, so it should be the same in
mworker mode.

This should be backported in 1.9, with the previous patch ("BUG/MINOR:
mworker: don't exit with an ambiguous value").

Code has moved, mworker_catch_sigchld() is in haproxy.c.

BUG/MINOR: mworker: don't exit with an ambiguous value

When the sigchld handler is called and waitpid() returns -1,
the behavior of waitpid() with the status variable is undefined.
It is not a good idea to exit with the value contained in it.

Since this exit path does not use the exitcode variable, it means that
this is an expected and successful exit.

This should be backported in 1.9, code has moved,
mworker_catch_sigchld() is in haproxy.c.

BUG/MINOR: mworker: mworker_kill should apply on every children

Commit 3f12887 ("MINOR: mworker: don't use children variable anymore")
introduced a regression.

The previous behavior was to send a signal to every children, whether or
not they are former children. Instead of this, we only send a signal to
the current children, so we don't try to kill -INT or -TERM all
processes during a reload.

No backport needed.

BUG/MINOR: listener/mq: correctly scan all bound threads under low load

When iterating on the CLI using "show activity" and no other load, it
was visible that the last thread was always skipped. This was caused by
the way the thread bits were walking : t1 was updated after t2 to make
sure it never equals t2 (thus it skips t2), and in case of a tie we
choose t1. This results in the chosen thread never to equal t2 unless
the other ones already have one connection. In addition to this, t2 was
recalulated upon each pass due to the fact that only the 31th bit was
looked at instead of looking at the t2'th bit.

This patch fixes this by updating t2 after t1 so that t1 is free to
walk over all positions under equal load. No measurable performance
gains are expected from this though, but it at least removes one
strange indicator which could lead to some suspicion.

No backport is needed.

MINOR: init: add a "set-dumpable" global directive to enable core dumps

It's always a pain to get a core dump when enabling user/group setting
(which disables the dumpable flag on Linux), when using a chroot and/or
when haproxy is started by a service management tool which requires
complex operations to just raise the core dump limit.

This patch introduces a new "set-dumpable" global directive to work
around these troubles by doing the following :

  - remove file size limits     (equivalent of ulimit -f unlimited)
  - remove core size limits     (equivalent of ulimit -c unlimited)
  - mark the process dumpable again (equivalent of suid_dumpable=1)

Some of these will depend on the operating system. This way it becomes
much easier to retrieve a core file. Temporarily moving the chroot to
a user-writable place generally enough.

MINOR: mworker: export HAPROXY_MWORKER=1 when running in mworker mode

Export HAPROXY_MWORKER=1 in an environment variable when running in
mworker mode.

MINOR: cli: don't add a semicolon at the end of HAPROXY_CLI

Only add the semicolon when there is several CLI in HAPROXY_CLI and
HAPROXY_MASTER_CLI.

MEDIUM: mworker/cli: export the HAPROXY_MASTER_CLI variable

It works the same way as the HAPROXY_CLI variable, it exports the
listeners addresses separated by semicolons.

CLEANUP: mworker: remove the type field in mworker_proc

Since the introduction of the options field, we can use it to store the
type of process.

type = 'm' is replaced by PROC_O_TYPE_MASTER
type = 'w' is replaced by PROC_O_TYPE_WORKER
type = 'e' is replaced by PROC_O_TYPE_PROG

The old values are still used in the HAPROXY_PROCESSES environment
variable to pass the information during a reload.

MEDIUM: mworker-prog: implements 'option start-on-reload'

This option is already the default, but its opposite 'no option
start-on-reload' allows the master to keep a previous instance of a
program and don't start a new one upon a reload.

The old program will then appear as a current one in "show proc" and
could also trigger an exit-on-failure upon a segfault.

MEDIUM: mworker: store the leaving state of a process

Previously we were assuming than a process was in a leaving state when
its number of reload was greater than 0. With mworker programs it's not
the case anymore so we need to store a leaving state.

BUG/MAJOR: lb/threads: fix insufficient locking on round-robin LB

Maksim Kupriianov reported very strange crashes in fwrr_update_position()
which didn't make sense because of an apparent divide overflow except that
the value was not null in the core.

It happens that while the locking is correct in all the functions' call
graph, the uppermost one (fwrr_get_next_server()) incorrectly expected
that its target server was already locked when called. This stupid
assumption causd the server lock not to be held when calling the other
ones, explaining how it was possible to change the server's eweight by
calling srv_lb_commit_status() under the server lock yet collide with
its unprotected usage.

This commit makes sure that fwrr_get_server_from_group() retrieves a
locked server and that fwrr_get_next_server() is responsible for
unlocking the server before returning it. There is one subtlety in
this function which is that it builds a list of avoided servers that
were full while scanning the tree, and all of them are queued in a
full state so they must be unlocked upon return.

Many thanks to Maksim for providing detailed info allowing to narrow
down this bug.

This fix must be backported to 1.9. In 1.8 the lock seems much wider
and changes to the server's state are performed under the rendez-vous
point so this it doesn't seem possible that it happens there.

DOC: update for "show peers" CLI command.

Add the documentation for the new "show peers" CLI command which comes with
this commit "MINOR: peers: Add a new command to the CLI for peers.".

MINOR: peers: Add a new command to the CLI for peers.

Implements "show peers [peers section]" new CLI command to dump information
about the peers and their stick-tables to be synchronized and others internal.

May be backported as far as 1.5.

BUILD: htx: fix a used uninitialized warning on is_cookie2

gcc-3.4 reports this which actually looks like a valid warning when
looking at the code, it's unsure why others didn't notice it :

src/proto_htx.c: In function `htx_manage_server_side_cookies':
src/proto_htx.c:4266: warning: 'is_cookie2' might be used uninitialized in this function

BUILD: do not specify "const" on functions returning structs or scalars

Older compilers (like gcc-3.4) warn about the use of "const" on functions
returning a struct, which makes sense since the return may only be copied :

include/common/htx.h:233: warning: type qualifiers ignored on function return type

Let's simply drop "const" here.

BUILD: address a few cases of "static <type> inline foo()"

Older compilers don't like to see "inline" placed after the type in a
function declaration, it must be "static inline <type>" only. This
patch touches various areas. The warnings were seen with gcc-3.4.

BUG/MEDIUM: Threads: Only use the gcc >= 4.7 builtins when using gcc >= 4.7.

Move the definition of the various _HA_ATOMIC_* macros that use
__atomic_* in the #if GCC_VERSION >= 4.7, not just after it, so that we
can build with older versions of gcc again.

MINOR: connections: Remove the SUB_CALL_UNSUBSCRIBE flag.

Garbage collect SUB_CALL_UNSUBSCIRBE, as it's now unused.

BUG/MEDIUM: h2: Revamp the way send subscriptions works.

Instead of abusing the SUB_CALL_UNSUBSCRIBE flag, revamp the H2 code a bit
so that it just checks if h2s->sending_list is empty to know if the tasklet
of the stream_interface has been waken up or not.
send_wait is now set to NULL in h2_snd_buf() (ideally we'd set it to NULL
as soon as we're waking the tasklet, but it can't be done, because we still
need it in case we have to remove the tasklet from the task list).

BUG/MEDIUM: h2: Make sure we're not already in the send_list in h2_subscribe().

In h2_subscribe(), don't add ourself to the send_list if we're already in it.
That may happen if we try to send and fail twice, as we're only removed
from the send_list if we managed to send data, to promote fairness.
Failing to do so can lead to either an infinite loop, or some random crashes,
as we'd get the same h2s in the send_list twice.

This should be backported to 1.9.

BUG/MEDIUM: muxes: Make sure we unsubcribed when destroying mux ctx.

In the h1 and h2 muxes, make sure we unsubscribed before destroying the
mux context.
Failing to do so will lead in a segfault later, as the connection will
attempt to dereference its conn->send_wait or conn->recv_wait, which pointed
to the now-free'd mux context.

This was introduced by commit 39a96ee16eeec51774f9f52a783cf624a0de4ccb, so
should only be backported if that commit gets backported.

BUILD: cli/threads: fix build in single-threaded mode

Commit a8f57d51a ("MINOR: cli/activity: report the accept queue sizes
in "show activity"") broke the single-threaded build because the
accept-rings are not implemented there. Let's ifdef this out. Ideally
we should start to think about always having such elements initialized
even without threads to improve the test coverage.

BUILD: task/thread: fix single-threaded build of task.c

As expected, commit cde7902ac ("MEDIUM: tasks: improve fairness between
the local and global queues") broke the build with threads disabled,
and I forgot to rerun this test before committing. No backport is
needed.

BUG/MINOR: ssl: Fix 48 byte TLS ticket key rotation

Whenever HAProxy was reloaded with rotated keys, the resumption would be
broken for previous encryption key. The bug was introduced with the addition
of 80 byte keys in 9e7547 (MINOR: ssl: add support of aes256 bits ticket keys
on file and cli.).

This fix needs to be backported to 1.9.