git.ipfire.org Git - thirdparty/haproxy.git/log

MINOR: buffer: remove bi_getblk() and bi_getblk_nc()

These ones were relying on bi_ptr() and are not used. They may be
reimplemented later in the channel if needed.

MINOR: buffer: replace calls to buffer_space_wraps() with b_space_wraps()

And remove the unused function.

MINOR: channel/buffer: replace b_{adv,rew} with c_{adv,rew}

These ones manipulate the output data count which will be specific to
the channel soon, so prepare the call points to use the channel only.
The b_* functions are now unused and were removed.

MINOR: buffer: remove buffer_slow_realign() and the swap_buffer allocation code

Since all call places can use the trash now, this is not needed anymore.

MINOR: h2: use b_slow_realign() with the trash as a swap buffer

H2 doesn't use the trash so it can make use of it as a swap area when
calling b_slow_realign(). This way we don't need buffer_slow_realign()
anymore.

MEDIUM: channel: make channel_slow_realign() take a swap buffer

The few call places where it's used can use the trash as a swap buffer,
which is made for this exact purpose. This way we can rely on the
generic b_slow_realign() call.

MINOR: channel/buffer: replace buffer_slow_realign() with channel_slow_realign() and b_slow_realign()

Where relevant, the channel version is used instead. The buffer version
was ported to be more generic and now takes a swap buffer and the output
byte count to know where to set the alignment point. The H2 mux still
uses buffer_slow_realign() with buf->o but it will change later.

MINOR: channel/buffer: use c_realign_if_empty() instead of buffer_realign()

This patch removes buffer_realign() and replaces it with c_realign_if_empty()
instead.

MINOR: channel: add a few basic functions for the new buffer API

This adds :
  - c_orig()  : channel buffer's origin
  - c_size()  : channel buffer's size
  - c_wrap()  : channel buffer's wrapping location
  - c_data()  : channel buffer's total data count
  - c_room()  : room left in channel buffer's
  - c_empty() : true if channel buffer is empty
  - c_full()  : true if channel buffer is full

  - c_ptr()   : pointer to an offset relative to input data in the buffer
  - c_adv()   : advances the channel's buffer (bytes become part of output)
  - c_rew()   : rewinds the channel's buffer (output bytes not output anymore)
  - c_realign_if_empty() : realigns the buffer if it's empty

  - co_data() : # of output data
  - co_head() : beginning of output data
  - co_tail() : end of output data
  - ci_data() : # of input data
  - ci_head() : beginning of input data
  - ci_tail() : end of input data
  - ci_stop() : location after ci_tail()
  - ci_next() : pointer to next input byte

And for the ci_* / co_* functions above, the "__*" variants which disable
wrapping checks, and the "_ofs" variants which return an offset relative to
the buffer's origin instead.

MINOR: compression: pass the channel to http_compression_buffer_end()

This will be needed to access the output data count from the channel
after the buffer/channel changes.

MINOR: buffer: introduce b_realign_if_empty()

Many places deal with buffer realignment after data removal. The method
is always the same : if the buffer is empty, set its pointer to the origin.
Let's have a function for this so that we have less code to change with the
new API.

MINOR: buffer: Add b_set_data().

Add a new function that lets you set the amount of input in a buffer.
For now it extends/truncates b->i except if the total length is
below b->o in which case it clears i and adjusts o.

MINOR: buffer: Introduce b_sub(), b_add(), and bo_add()

Instead of doing b->i -= directly, introduce b_sub(), that does the job, to
make it easier to switch to the future API.

Also add b_add(), that increases b->i, instead of using it directly, and
bo_add(), that does increase b->o.

MINOR: buffer: add a few basic functions for the new API

Here's the list of newly introduced functions :

- b_data(), returning the total amount of data in the buffer (currently i+o)

- b_orig(), returning the origin of the storage area, that is, the place of
  position 0.

- b_wrap(), pointer to wrapping point (currently data+size)

- b_size(), returning the size of the buffer

- b_room(), returning the amount of bytes left available

- b_full(), returning true if the buffer is full, otherwise false

- b_stop(), pointer to end of data mark (currently p+i), used to compute
  distances or a stop pointer for a loop.

- b_peek(), this one will help make the transition to the new buffer model.
  It returns a pointer to a position in the buffer known from an offest
  relative to the beginning of the data in the buffer. Thus, we can replace
  the following occurrences :

     bo_ptr(b)     => b_peek(b, 0);
     bo_end(b)     => b_peek(b, b->o);
     bi_ptr(b)     => b_peek(b, b->o);
     bi_end(b)     => b_peek(b, b->i + b->o);
     b_ptr(b, ofs) => b_peek(b, b->o + ofs);

- b_head(), pointer to the beginning of data (currently bo_ptr())

- b_tail(), pointer to first free place (currently bi_ptr())

- b_next() / b_next_ofs(), pointer to the next byte, taking wrapping
  into account.

- b_dist(), returning the distance between two pointers belonging to a buffer

- b_reset(), which resets the buffer

- b_space_wraps(), indicating if the free space wraps around the buffer

- b_almost_full(), indicating if 3/4 or more of the buffer are used

Some of these are provided with the unchecked variants using the "__"
prefix, or with the "_ofs" suffix indicating they return a relative
position to the buffer's origin instead of a pointer.

Cc: Olivier Houchard <ohouchard@haproxy.com>

MINOR: buffer: switch buffer sizes and offsets to size_t

Passing unsigned ints everywhere is painful, and will cause some headache
later when we'll want to integrate better with struct ist which already
uses size_t. Let's switch buffers to use size_t instead.

MINOR: buffer: implement a new file for low-level buffer manipulation functions

The buffer code currently depends on pools and other stuff and is not
really autonomous anymore. The rewrite of the new API is an opportunity
to clean this up. This patch creates a new file (buf.h) which does not
depend on other elements and which will only contain what is needed to
perform the most basic buffer operations. The new API will be introduced
in this file and the conversion will be finished once buffer.h is empty.

The definition of struct buffer was moved to this new file, using more
explicity stdint types for the sizes and offsets.

Most new functions will be implemented in two variants :

__b_something() : unchecked variant, no wrapping is expected
b_something() : wrapping-checked variant

This way callers will be able to select which one to use depending on
the use cases.

MINOR: tasklet: Set process to NULL.

Some consumers expect the process to be NULL when a tasklet it created, so
do so.

BUG/MEDIUM: h2: make sure the last stream closes the connection after a timeout

If a timeout strikes on the connection side with some active streams,
there is a corner case which can sometimes cause the following sequence
to happen :

  - There are active streams but there are data in the mux buffer
    (eg: a client suddenly disconnected during a download with pending
    requests). The timeout is active.

  - The timeout strikes, h2_timeout_task() is called, kills the task and
    doesn't close the connection since there are streams left ; The
    connection is marked in H2_CS_ERROR ;

  - the streams are woken up and closed ;

  - when the last stream closes, calling h2_detach(), it sees the
    tree list is empty, but there is no condition allowing the
    connection to be closed (mbuf->o > 0), thus it does nothing ;

  - since the task is dead, there's no more hope to clear this
    situation later

For now we can take care of this by adding a test for the presence of
H2_CS_ERROR and !task, implying the timeout task triggered already
and will not be able to handle this again.

Over the long term it seems like a more reliable test on should be
made, so that it is possible to know whether or not someone is still
able to close this connection.

A big thanks to Janusz Dziemidowicz and Milan Petruzelka for providing
many details helping in figuring this bug.

BUG/MEDIUM: h2: never leave pending data in the output buffer on close

We currently don't process trailers on H2, but this has an impact : on
chunked HTTP/1 responses, we decide to emit the ES bit once we see the
0CRLF. From this point the stream switches to the CLOSED state, which
aborts processing of the remaining bytes. Thus the extra CRLF which ends
trailers is not processed and remains in the buffer. This prevents the
stream from being notified about end of transmission, which in turn keeps
the mux busy and prevents the connection from quitting.

The case of the trailers is not the root cause of this issue, though it
is what triggers it. The root cause is that upon error and/or close, once
we know we're not going to process any more data, we must absolutely flush
any remaining bytes from the output buffer, otherwise there is no way the
stream can quit. This is what this patch does.

It looks very likely related to the issues reported and debugged by
Janusz Dziemidowicz and Milan Petruzelka.

One way to reproduce it is to chain two proxies with the last one emitting
chunked data (typically using the stats page) :

    global
        stats socket /tmp/sock1 mode 666 level admin
        stats timeout 1h
        tune.ssl.default-dh-param 1024
        tune.bufsize 16384

    defaults
        mode http
        timeout connect 4s
        timeout client 10s
        timeout server 20s

    listen px1
        bind :4443 ssl crt rsa+dh2048.pem npn h2 alpn h2
        server s1 127.0.0.1:4445

    listen px2
        bind :4444 ssl crt rsa+dh2048.pem npn h2 alpn h2
        bind :4445
        stats uri /

Then use curl to fetch the stats through px1 :

    curl --http2 -k "https://127.0.0.1:4443/"

When curl is sent to the first one, "show sess" issued to the CLI will
show a remaining session during the client timeout. When curl is aimed at
port 4444 (px2), there is no such remaining session.

This fix needs to be backported to 1.8.

MINOR: h2: add the mux and demux buffer lengths on "show fd"

It is convenient during debugging sessions to know if the mux and demux
buffers are empty/full/other. Let's report this on "show fd" output.

BUG/MEDIUM: h2: don't accept new streams if conn_streams are still in excess

The streams bookkeeping made in H2 is used for protocol compliance only
but it doesn't consider the number of conn_streams still attached to the
mux. It causes an issue when http-request set-nice rules are applied on
H2 requests processed on a saturated machine. Indeed, in this case, the
requests are accepted and assigned a default nice value of zero. When
they are processed, their nice value changes to a higher one (say 1024).
The response is sent through the H2 mux, which detects the end of stream
and decrements the protocol-level stream count (h2c->nb_streams). The
client may then send a new request. But the conn_stream is still attached
and will require a new call to process_stream() to finish, which is made
through the scheduler. Given that the machine is saturated, it is assumed
that many tasks are present in the scheduler. Thus the closing tasks holding
a higher nice value will pass after the new stream creations. If the client
is fast enough with a low latency link, it may add a lot of new stream
creations before the stream terminations have a chance to disappear due
to their high nice value, resulting in a huge amount of memory being used.

The solution consists in letting a mux always monitor its conn_streams and
refrain from creating new ones when it is full. Here the H2 mux checks the
nb_cs counter and sets a new blocked flag (H2_CF_DEM_TOOMANY) if the limit
was reached, so that the frame parser requests a pause in the new stream
creation, leaving some time for the pending conn_streams to vanish.

Several experiments were made using varying thresholds to see if
overbooking would provide any benefit here but it turned out not to be
the case, so the conn_stream limit remains set to the exact streams
limit. Interestingly various performance measurements showed that the
code tends to be slightly faster now than without the limit, probably
due to the smoother memory usage.

This commit requires previous patch ("MINOR: h2: keep a count of the number
of conn_streams attached to the mux"). It needs to be backported to 1.8.

MINOR: h2: keep a count of the number of conn_streams attached to the mux

The h2 mux only knows about the number of H2 streams which are not in a
CLOSED state. This is used for protocol compliance. But it doesn't hold
the number of really attached streams. It is a problem because depending
on scheduling, it is possible that more streams are attached to the mux
than the ones seen at the protocol level, due to some streams taking some
time to be detached. Let's add this count based on the conn_streams.

Note: this patch is part of a series of fixes which will have to be
backported to 1.8.

BUG/MINOR: ssl: properly ref-count the tls_keys entries

Commit 200b0fa ("MEDIUM: Add support for updating TLS ticket keys via
socket") introduced support for updating TLS ticket keys from the CLI,
but missed a small corner case : if multiple bind lines reference the
same tls_keys file, the same reference is used (as expected), but during
the clean shutdown, it will lead to a double free when destroying the
bind_conf contexts since none of the lines knows if others still use
it. The impact is very low however, mostly a core and/or a message in
the system's log upon old process termination.

Let's introduce some basic refcounting to prevent this from happening,
so that only the last bind_conf frees it.

Thanks to Janusz Dziemidowicz and Thierry Fournier for both reporting
the same issue with an easy reproducer.

This fix needs to be backported from 1.6 to 1.8.

REGTEST/MINOR: Unexpected curl URL globling.

With certain curl versions URLs which contain brackets may be interpreted
by the "URL globbing parser". This patch ensures that such brackets
are escaped.

Thank you to Ilya Shipitsin for having reported this issue.

MINOR: dns: new DNS options to allow/prevent IP address duplication

By default, HAProxy's DNS resolution at runtime ensure that there is no
IP address duplication in a backend (for servers being resolved by the
same hostname).
There are a few cases where people want, on purpose, to disable this
feature.

This patch introduces a couple of new server side options for this purpose:
"resolve-opts allow-dup-ip" or "resolve-opts prevent-dup-ip".

MINOR: dns: fix wrong score computation in dns_get_ip_from_response

dns_get_ip_from_response() is used to compare the caller current IP to
the IP available in the records returned by the DNS server.
A scoring system is in place to get the best IP address available.
That said, in the current implementation, there are a couple of issues:
1. a comment does not match what the code does
2. the code does not match what the commet says (score value is not
incremented with '2')

This patch fixes both issues.

Backport status: 1.8

CLEANUP: dns: inacurate comment about prefered IP score

The comment was about "prefered network ip version" while it's actually
"prefered ip version" in the code.
Fixed

Backport status: 1.7 and 1.8
Be careful, this patch may not apply on 1.7, since the score was '4'
for this item at that time.

CLEANUP: dns: remove obsolete macro DNS_MAX_IP_REC

Since a8c6db8d2d97629b2734c1d2be0860b6b11e5709, this macro is not used
anymore and can be safely removed.

Backport status: 1.8

REGTEST/MINOR: Wrong URI syntax.

Ilya Shipitsin reported that with some curl versions this reg test
may fail due to a wrong URI syntax with ::1 ipv6 local address in
this varnishtest script. This patch fixes this syntax issue and
replaces the iteration of "procees" commands by a "shell" command
to start curl processes (must be faster).

Thanks to Ilya Shipitsin for having reported this VTC file bug.

MINOR: systemd: consider exit status 143 as successful

The master process will exit with the status of the last worker. When
the worker is killed with SIGTERM, it is expected to get 143 as an
exit status. Therefore, we consider this exit status as normal from a
systemd point of view. If it happens when not stopping, the systemd
unit is configured to always restart, so it has no adverse effect.

This has mostly a cosmetic effect. Without the patch, stopping HAProxy
leads to the following status:

    â\97\8f haproxy.service - HAProxy Load Balancer
       Loaded: loaded (/lib/systemd/system/haproxy.service; disabled; vendor preset: enabled)
       Active: failed (Result: exit-code) since Fri 2018-06-22 20:35:42 CEST; 8min ago
         Docs: man:haproxy(1)
               file:/usr/share/doc/haproxy/configuration.txt.gz
      Process: 32715 ExecStart=/usr/sbin/haproxy -Ws -f $CONFIG -p $PIDFILE $EXTRAOPTS (code=exited, status=143)
      Process: 32714 ExecStartPre=/usr/sbin/haproxy -f $CONFIG -c -q $EXTRAOPTS (code=exited, status=0/SUCCESS)
     Main PID: 32715 (code=exited, status=143)

After the patch:

    â\97\8f haproxy.service - HAProxy Load Balancer
       Loaded: loaded (/lib/systemd/system/haproxy.service; disabled; vendor preset: enabled)
       Active: inactive (dead)
         Docs: man:haproxy(1)
               file:/usr/share/doc/haproxy/configuration.txt.gz

MINOR: startup: change session/process group settings

Change the way the process groups are set. Indeed setsid() was called
for every processes which caused the worker to have a different process
group than the master.

This patch behave in a better way:

- In daemon mode only, each child do a setsid()
- In master worker + daemon mode, the setsid() is done in the master before
forking the children
- In any foreground mode, we don't do a setsid()

Could be backported in 1.8 but the master-worker mode is mostly used
with systemd which rely on cgroups so that won't affect much people.

BUG/MEDIUM: lua: possible CLOSE-WAIT state with '\n' headers

The Lua parser doesn't takes in account end-of-headers containing
only '\n'. It expects always '\r\n'. If a '\n' is processes the Lua
parser considers it miss 1 byte, and wait indefinitely for new data.

When the client reaches their timeout, it closes the connection.
This close is not detected and the connection keep in CLOSE-WAIT
state.

I guess that this patch fix only a visible part of the problem.
If the Lua HTTP parser wait for data, the timeout server or the
connectio closed by the client may stop the applet.

How reproduce the problem:

HAProxy conf:

   global
      lua-load bug38.lua
   frontend frt
      timeout client 2s
      timeout server 2s
      mode http
      bind *:8080
      http-request use-service lua.donothing

Lua conf

   core.register_service("donothing", "http", function(applet) end)

Client request:

   echo -ne 'GET / HTTP/1.1\n\n' | nc 127.0.0.1 8080

Look for CLOSE-WAIT in the connection with "netstat" or "ss". I
use this script:

   while sleep 1; do ss | grep CLOSE-WAIT; done

This patch must be backported in 1.6, 1.7 and 1.8

Workaround: enable the "hard-stop-after" directive, and perform
periodic reload.

MINOR: stick-tables: make stktable_release() do nothing on NULL

stktable_release() has been involved in two recent crashes by being
used without enough care. Just like any free() function this one is
often called on an exit path with a possibly unsafe argument. Given
that there is another case (smp_fetch_sc_trackers()) which theorically
could call it with an unchecked NULL, though it cannot happen since
the function doesn't support being called with src_* hence cannot make
use of tmpstkctr, let's rather move the check into the function itself
to make it safer for the long term.

This patch could be backported to 1.8 as a strengthening measure.

BUG/MAJOR: stick_table: Complete incomplete SEGV fix

This commit completes the incomplete segmentation fault fix
in commit ac1f3ed64b58bd178865c6f2cc8f6f306d9e1e15.

Likewise it must be backported to haproxy 1.8.

BUG/BUILD: threads: unbreak build without threads

The build without threads was once again broken.

This issue was introduced in commit ba86c6c ("MINOR: threads: Be sure to
remove threads from all_threads_mask on exit").

This is exactly the same problem as last time it happened, because of
all_threads_mask not being defined with USE_THREAD=

This must be backported in 1.8

BUG/MAJOR: Stick-tables crash with segfault when the key is not in the stick-table

When a lookup is done on a key not present in the stick-table the "st"
pointer is NULL and it is used to return the converter result, but it
is used untested with stktable_release().

This regression was introduced in 1.8.10 here:

   BUG/MEDIUM: stick-tables: Decrement ref_cnt in table_* converters
   commit d7bd88009d88dd413e01bc0baa90d6662a3d7718
   Author: Daniel Corbett <dcorbett@haproxy.com>
   Date:   Sun May 27 09:47:12 2018 -0400

Minimal conf for reproducong the problem:

   frontend test
      mode http
      stick-table type ip size 1m expire 1h store gpc0
      bind *:8080
      http-request redirect location /a if { src,in_table(test) }

The segfault is triggered using:

   curl -i http://127.0.0.1:8080/

This patch must be backported in 1.8

REGTEST/MINOR: Add levels to reg-tests target.

With this patch we can provide LEVEL environment variable when
running reg-tests Makefile targe (reg testing) to set the execution
level of the reg-tests make target to run.

LEVEL default value is 1.

LEVEL=1 is to run all h*.vtc files which are the most important
reg testing files (to test haproxy core, HTTP compliance etc).

LEVEL=2 is to run all s*.vtc files which are a bit slow tests,
for instance tests requiring external programs (curl, socat etc).

LEVEL=3 is to run all l*.vtc files which are test files with again
more slow or with little interest.

REGTEST/MINOR: Set HAPROXY_PROGRAM default value.

With this patch, we set HAPROXY_PROGRAM environment variable
default value to the haproxy executable of the current working directory.
So, if the current directory is the haproxy sources directory,
the reg-tests Makefile target may be run with this shorter command:

$ VARNISTEST_PROGRAM=<...> make reg-tests

in place of

$ VARNISTEST_PROGRAM=<...> HAPROXY_PROGRAM=<...> make reg-tests

REGTEST/MINOR: Wrong URI in a reg test for SSL/TLS.

Fix typos where http:// URIs were used in place of https://.

MINOR: threads: Be sure to remove threads from all_threads_mask on exit

When HAProxy is started with several threads, Each running thread holds a bit in
the bitfiled all_threads_mask. This bitfield is used here and there to check
which threads are registered to take part in a specific processing. So when a
thread exits, it seems normal to remove it from all_threads_mask.

No direct impact could be identified with this right now but it would
be better to backport it to 1.8 as a preventive measure to avoid complex
situations like the one in previous bug.

BUG/MEDIUM: threads: Use the sync point to check active jobs and exit

When HAProxy is shutting down, it exits the polling loop when there is no jobs
anymore (jobs == 0). When there is no thread, it works pretty well, but when
HAProxy is started with several threads, a thread can decide to exit because
jobs variable reached 0 while another one is processing a task (e.g. a
health-check). At this stage, the running thread could decide to request a
synchronization. But because at least one of them has already gone, the others
will wait infinitly in the sync point and the process will never die.

To fix the bug, when the first thread (and only this one) detects there is no
active jobs anymore, it requests a synchronization. And in the sync point, all
threads will check if jobs variable reached 0 to exit the polling loop.

This patch must be backported in 1.8.

MINOR: Some spelling cleanup in the comments.

Signed-off-by: Dave Chiluk <chiluk+haproxy@indeed.com>

BUG/MEDIUM: fd: Don't modify the update_mask in fd_dodelete().

Only the pollers should remove bits in the update_mask. Removing it will
mean if the fd is currently in the global update list, it will never be
removed, and while it's mostly harmless in 1.9, in 1.8, only update_mask
is checked to know if the fd is already in the list or not, so we can end
up trying to add a fd that is already in the list, and corrupt it, which
means some fd may not be added to the poller.

This should be backported to 1.8.

DOC: Add new REGTEST tag info about reg testing.

MINOR: reg-tests: Add a few regression testing files.

MINOR: reg-tests: Add reg-tests/README file.

Add reg-tests/README file about how to compile and use varnishtest, and
how to produce patches to add regression testing files to HAProxy sources.

Also update CONTRIBUTING file to encourage the contributors to write
regression testing files.

MINOR: tests: First regression testing file.

Add a makefile target 'reg-tests' to run all regression testing file
found in 'reg-tests' directory.
Add reg-tests/lua/h00000.vtc first regression testing file for a LUA
fixed by f874a83 commit.

BUG/MEDIUM: ssl: do not store pkinfo with SSL_set_ex_data

Bug from 96b7834e: pkinfo is stored on SSL_CTX ex_data and should
not be also stored on SSL ex_data without reservation.
Simply extract pkinfo from SSL_CTX in ssl_sock_get_pkey_algo.

No backport needed.

BUG/MAJOR: ssl: OpenSSL context is stored in non-reserved memory slot

We never saw unexplicated crash with SSL, so I suppose that we are
luck, or the slot 0 is always reserved. Anyway the usage of the macro
SSL_get_app_data() and SSL_set_app_data() seem wrong. This patch change
the deprecated functions SSL_get_app_data() and SSL_set_app_data()
by the new functions SSL_get_ex_data() and SSL_set_ex_data(), and
it reserves the slot in the SSL memory space.

For information, this is the two declaration which seems wrong or
incomplete in the OpenSSL ssl.h file. We can see the usage of the
slot 0 whoch is hardcoded, but never reserved.

#define SSL_set_app_data(s,arg) (SSL_set_ex_data(s,0,(char *)arg))
#define SSL_get_app_data(s) (SSL_get_ex_data(s,0))

This patch must be backported at least in 1.8, maybe in other versions.

BUG/MAJOR: ssl: Random crash with cipherlist capture

The cipher list capture struct is stored in the SSL memory space,
but the slot is reserved in the SSL_CTX memory space. This causes
ramdom crashes.

This patch should be backported to 1.8

BUG/MINOR: lua: Segfaults with wrong usage of types.

Patrick reported that this simple configuration made haproxy segfaults:

    global
        lua-load /tmp/haproxy.lua

    frontend f1
        mode http
        bind :8000
        default_backend b1

        http-request lua.foo

    backend b1
        mode http
        server s1 127.0.0.1:8080

with this '/tmp/haproxy.lua' script:

    core.register_action("foo", { "http-req" }, function(txn)
        txn.sc:ipmask(txn.f:src(), 24, 112)
    end)

This is due to missing initialization of the array of arguments
passed to hlua_lua2arg_check() which makes it enter code with
corrupted arguments.

Thanks a lot to Patrick Hemmer for having reported this issue.

Must be backported to 1.8, 1.7 and 1.6.

BUG/MINOR: tasklets: Just make sure we don't pass a tasklet to the handler.

We can't just set t to NULL if it's a tasklet, or we'd have a hard time
accessing to t->process, so just make sure we pass NULL as the first parameter
of t->process if it's a tasklet.
This should be a non-issue at this point, as tasklets aren't used yet.

MINOR: tasks: Make sure we correctly init and deinit a tasklet.

Up until now, a tasklet couldn't be free'd while it was in the list, it is
no longer the case, so make sure we remove it from the list before freeing it.
To do so, we have to make sure we correctly initialize it, so use LIST_INIT,
instead of setting the pointers to NULL.

DOC: regression testing: Add a short starting guide.

This documentation describes how to write varnish test case (VTC)
files to reg test haproxy.

BUG/MAJOR: map: fix a segfault when using http-request set-map

The bug happens with an existing entry, when you try to overwrite the
value with wrong data, for example, a string when the type is INT.

The code path was not secure and tried to set *err and *merr while
err = merr = NULL when performing an http action.

Must be backported in 1.6, 1.7, 1.8.

BUG/MINOR: signals: ha_sigmask macro for multithreading

The behavior of sigprocmask in an multithreaded environment is
undefined.

The new macro ha_sigmask() calls either pthreads_sigmask() or
sigprocmask() if haproxy was built with thread support or not.

This should be backported to 1.8.

BUG/MINOR: don't ignore SIG{BUS,FPE,ILL,SEGV} during signal processing

We don't have any reason of blocking those signals.

If SIGBUS, SIGFPE, SIGILL, or SIGSEGV are generated while they are blocked, the
result is undefined, unless the signal was generated by kill(2), sigqueue(3), or
raise(3).

This should be backported to 1.8.

BUG/MEDIUM: threads: handle signal queue only in thread 0

Signals were handled in all threads which caused some signals to be lost
from time to time. To avoid complicated lock system (threads+signals),
we prefer handling the signals in one thread avoiding concurrent access.

The side effect of this bug was that some process were not leaving from
time to time during a reload.

This patch must be backported in 1.8.

MINOR: lua: Increase debug information

When an unrecoverable error raises, the user receive poor information
for the trouble shooting. For example:

[ALERT] 157/143755 (21212) : Lua function 'hello-world': runtime error: memory allocation error: block too big.

Unfortunately, the memory allocation error can be throwed by many
function, and we have no informatio to reach the original cause.
This patch add the list of function called from the entry point to
the function in error, like this:

[ALERT] 157/143755 (21212) : Lua function 'hello-world': runtime error: memory allocation error: block too big from [C] method 'req_get_headers', bug35.lua:2 global 'ee', bug35.lua:6 global 'ff', bug35.lua:10 C function line 9.

BUG/MINOR: unix: Make sure we can transfer abns sockets on seamless reload.

When checking if a socket we got from the parent is suitable for a listener,
we just checked that the path matched sockname.tmp, however this is
unsuitable for abns sockets, where we don't have to create a temporary
file and rename it later.
To detect that, check that the first character of the sun_path is 0 for
both, and if so, that &sun_path[1] is the same too.

This should be backported to 1.8.

MINOR: tasks: Don't define rqueue if we're building without threads.

To make sure we don't inadvertently insert task in the global runqueue,
while only the local runqueue is used without threads, make its definition
and usage conditional on USE_THREAD.

BUG/MEDIUM: tasks: Use the local runqueue when building without threads.

When building without threads enabled, instead of just using the global
runqueue, just use the local runqueue associated with the only thread, as
that's what is now expected for a single thread in prcoess_runnable_tasks().
This should fix haproxy when built without threads.

MINOR: task: Fix compiler warning.

Waking up task, when checking if it is a valid entry.
Similarly to commit caa8a37ffe5922efda7fd7b882e96964b40d7135,
casting explicitally to void pointer as HA_ATOMIC_CAS needs.

MINOR: applet: assign the same nice value to a new appctx as its owner task

When an applet is created, let's assign it the same nice value as the task
of the stream which owns it. It ensures that fairness is properly propagated
to applets, and that the CLI can regain a low latency behaviour again. Huge
differences have been seen under extreme loads, with the CLI being called
every 200 microseconds instead of 11 milliseconds.

MINOR: stats: also report the nice and number of calls for applets

Since applets are now part of the main scheduler, it's useful to report
their nice value and the number of calls to the applet handler, to see
where the CPU is spent.

MINOR: task: Fix a compiler warning by adding a cast.

When calling HA_ATOMIC_CAS with a pointer as the target, the compiler
expects a pointer as the new value, so give it one by casting 0x1 to
(void *).

BUG/MINOR: contrib/modsecurity: update pointer on the end of the frame

Similar to commit 94bb4c6 ("BUG/MINOR: spoa: Update pointer on the end of
the frame when a reply is encoded").

This patch should be backported to 1.8.

BUG/MINOR: contrib/mod_defender: update pointer on the end of the frame

Similar to commit 94bb4c6 ("BUG/MINOR: spoa: Update pointer on the end of
the frame when a reply is encoded").

This patch should be backported to 1.8.

BUG/MINOR: contrib/modsecurity: Don't reset the status code during disconnect

When the connection is closed by HAProxy, the status code provided in the
DISCONNECT frame is lost. By retransmitting it in the agent's reply, we are sure
to have it in the SPOE logs.

This patch may be backported in 1.8.

BUG/MINOR: contrib/mod_defender: Don't reset the status code during disconnect

When the connection is closed by HAProxy, the status code provided in the
DISCONNECT frame is lost. By retransmitting it in the agent's reply, we are sure
to have it in the SPOE logs.

This patch may be backported in 1.8.

BUG/MINOR: contrib/spoa_example: Don't reset the status code during disconnect

When the connection is closed by HAProxy, the status code provided in the
DISCONNECT frame is lost. By retransmitting it in the agent's reply, we are sure
to have it in the SPOE logs.

This patch may be backported in 1.8.

MAJOR: spoe: upgrade the SPOP version to 2.0 and remove the support for 1.0

The commit c4dcaff3 ("BUG/MEDIUM: spoe: Flags are not encoded in network order")
introduced an incompatibility with older agents. So the major version of the
SPOP is increased to make the situation unambiguous. And because before the fix,
the protocol is buggy, the support of the version 1.0 is removed to be sure to
not continue to support buggy agents.

The agents in the contrib folder (spoa_example, modsecurity and mod_defender)
are also updated to announce the SPOP version 2.0.

So, to be clear, from the patch, connections to agents announcing the SPOP
version 1.0 will be rejected.

This patch must be backported in 1.8.

DOC: SPOE.txt: fix a typo

DOC: contrib/modsecurity: few typo fixes

Few typo fixes.

BUG/MEDIUM: lua/socket: Buffer error, may segfault

The buffer pointer is already updated. It is again updated
when it is given to the function ci_putblk().

This patch must be backported in 1.6, 1.7 and 1.8

BUG/MEDIUM: lua/socket: Sheduling error on write: may dead-lock

When we write data, we risk to encounter a dead-loack. The
function "stream_int_notify()" cannot be called the the
cosocket because the caller acquire a lock and when the socket
is closed, the cleanup function try to acquire the same lock.,
so a dead-lock raises.

In other way, the function stream_int_update_applet() can't
be called because it schedumes the applet only if some activity
in the buffers were detected. It is not always the case. We
replace this function by appctx_wakeup() which wake up the
applet inconditionnaly.

The last part of the fix is setting right signals. the applet
call the stream_int_update() function if the output buffer si
not empty, and ask for put data if some rite signals are
registered.

This patch must be backported in 1.6, 1.7 and 1.8. Note that it requires
patch "MINOR: task/notification: Is notifications registered" to be
applied.

BUG/MEDIUM: lua/socket: Notification error

Each time the send function yields, a notification must be registered.
Without this notification, the task is never wakeup when data arrives.

Today, the notification is registered only if the buffer is not available.
Other cases like the buffer is too small for all data are not processed.

This patch must be backported in 1.6, 1.7 and 1.8

BUG/MAJOR: lua: Dead lock with sockets

In some cases, when we are waiting for data and the socket
timeout expires, we have a dead lock. The Lua socket locks
the applet socket, and call for a notify. The notify
immediately executes code and try to acquire the same lock,
so ... dead lock.

stream_int_notify() cant be used because it wakeup the applet
task only if the stream have changes. The changes are forces
by Lua, but not repported on the stream.

stream_int_update_applet() cant be used because the deadlock.

So, I inconditionnaly wakeup the applet. This wake is performed
asynchronously, and will call a stream_int_notify().

This patch must be backported in 1.6, 1.7 and 1.8

BUG/MEDIUM: lua/socket: wrong scheduling for sockets

The appctx pointer is given from any variable which are wrong.
This implies the wakeup of wrong applet, and the socket are no
longer responsive.

This behavior is hidden by another inherited error which is
fixed in the next patch.

This patch remove all wrong appctx affectations.

This patch must be backported in 1.6, 1.7 and 1.8

MINOR: task/notification: Is notifications registered ?

This function returns true is some notifications are registered.

This function is usefull for the following patch

BUG/MEDIUM: lua/socket: Sheduling error on write: may dead-lock

It should be backported in 1.6, 1.7 and 1.8

BUG/MEDIUM: spoe: Return an error when the wrong ACK is received in sync mode

This is required to let a message processing timed out. Because, when it
happens, there is no more context attached to the SPOE applet that sent the
NOTIFY frame. So when the ACK is received, it is too late. This is the same
situation when we receive the wrong ACK. It is invalid in sync mode. Otherwise,
the SPOE applet remains in the state "WAITING_SYNC_ACK" until the idle timeout
is reached. In such case, the applet is seen as busy and it is unusable. If this
happens too often, more and more applets will be created because some others are
blocked. If there is a maxconn on the SPOE backend, all processings will be
drastically slowdown.

Returning an error in such cases, in sync mode, allow us to terminate the SPOE
applet. Because it means the agent is unresponsive or too slow.

Note this bug exists only if the sync mode is used.

This patch must be backported in 1.8.

MINOR: dns: Implement `parse-resolv-conf` directive

This introduces a new directive for the `resolvers` section:
`parse-resolv-conf`. When present, it will attempt to add any
nameservers in `/etc/resolv.conf` to the list of nameservers
for the current `resolvers` section.

[Mailing list thread][1].

[1]: https://www.mail-archive.com/haproxy@formilux.org/msg29600.html

MINOR: task: Also consider the task list size when getting global tasks.

We're taking tasks from the global runqueue based on the number of tasks
the thread already have in its local runqueue, but now that we have a task
list, we also have to take that into account.

BUG/MEDIUM: task: Don't forget to decrement max_processed after each task.

When the task list was introduced, we bogusly lost max_processed--, that means
we would execute as much tasks as present in the list, and we would never
set active_tasks_mask, so the thread would go to sleep even if more tasks were
to be executed.

1.9-dev only, no backport is needed.

BUG/MEDIUM: tasks: Don't forget to increase/decrease tasks_run_queue.

Don't forget to increase tasks_run_queue when we're adding a task to the
tasklet list, and to decrease it when we remove a task from a runqueue,
or its value won't be accurate, and could lead to tasks not being executed
when put in the global run queue.

1.9-dev only, no backport is needed.

MINOR: stats: also report the failed header rewrites warnings on the stats page

These ones concern the warnings detected during header addition/insertion.
They are visible in the tooltip reporting the per-status codes stats. The
frontend and backend contain a total of request+response warnings, while
server only has the response warnings.

DOC: management: add the new wrew stats column

This is the number of failed rewrite warnings, per front/listener/back/server.

MINOR: http: Log warning if (add|set)-header fails

This patch adds a warning if an http-(request|reponse) (add|set)-header
rewrite fails to change the respective header in a request or response.

This usually happens when tune.maxrewrite is not sufficient to hold all
the headers that should be added.

BUG/MEDIUM: stick-tables: Decrement ref_cnt in table_* converters

When using table_* converters ref_cnt was incremented
and never decremented causing entries to not expire.

The root cause appears to be that stktable_lookup_key()
was called within all sample_conv_table_* functions which was
incrementing ref_cnt and not decrementing after completion.

Added stktable_release() to the end of each sample_conv_table_*
function and reworked the end logic to ensure that ref_cnt is
always decremented after use.

This should be backported to 1.8

MAJOR: applets: Use tasks, instead of rolling our own scheduler.

There's no real reason to have a specific scheduler for applets anymore, so
nuke it and just use tasks. This comes with some benefits, the first one
being that applets cannot induce high latencies anymore since they share
nice values with other tasks. Later it will be possible to configure the
applets' nice value. The second benefit is that the applet scheduler was
not very thread-friendly, having a big lock around it in prevision of this
change. Thus applet-intensive workloads should now scale much better with
threads.

Some more improvement is possible now : some applets also use a task to
handle timers and timeouts. These ones could now be simplified to use only
one task.

MINOR: tasks: Make the number of tasks to run at once configurable.

Instead of hardcoding 200, make the number of tasks to be run configurable
using tune.runqueue-depth. 200 is still the default.

MAJOR: tasks: Introduce tasklets.

Introduce tasklets, lightweight tasks. They have no notion of priority,
they are just run as soon as possible, and will probably be used for I/O
later.

For the moment they're used to replace the temporary thread-local list
that was used in the scheduler. The first part of the struct is common
with tasks so that tasks can be cast to tasklets and queued in this list.
Once a task is in the tasklet list, it has its leaf_p set to 0x1 so that
it cannot accidently be confused as not in the queue.

Pure tasklets are identifiable by their nice value of -32768 (which is
normally not possible).

MAJOR: tasks: Create a per-thread runqueue.

A lot of tasks are run on one thread only, so instead of having them all
in the global runqueue, create a per-thread runqueue which doesn't require
any locking, and add all tasks belonging to only one thread to the
corresponding runqueue.

The global runqueue is still used for non-local tasks, and is visited
by each thread when checking its own runqueue. The nice parameter is
thus used both in the global runqueue and in the local ones. The rare
tasks that are bound to multiple threads will have their nice value
used twice (once for the global queue, once for the thread-local one).

MINOR: tasks: Change the task API so that the callback takes 3 arguments.

In preparation for thread-specific runqueues, change the task API so that
the callback takes 3 arguments, the task itself, the context, and the state,
those were retrieved from the task before. This will allow these elements to
change atomically in the scheduler while the application uses the copied
value, and even to have NULL tasks later.

BUG/MEDIUM: lua/socket: Length required read doesn't work

The limit of data read works only if all the data is in the
input buffer. Otherwise (if the data arrive in chunks), the
total amount of data is not taken in acount.

Only the current read data are compared to the expected amout
of data.

This patch must be backported from 1.9 to 1.6

BUG/MEDIUM: servers: Add srv_addr default placeholder to the state file

When creating a state file using "show servers state" an empty field is
created in the srv_addr column if the server is from the socket family
AF_UNIX. This leads to a warning on start up when using
"load-server-state-from-file". This patch defaults srv_addr to "-" if
the socket family is not covered.

This patch should be backported to 1.8.

BUG/BUILD: threads: unbreak build without threads

A few users reported that building without threads was accidently broken
after commit 6b96f72 ("BUG/MEDIUM: pollers: Use a global list for fd
shared between threads.") due to all_threads_mask not being defined.
It's OK to set it to zero as other code parts do when threads are
enabled but only one thread is used.

This needs to be backported to 1.8.

BUG/MEDIUM: dns: Delay the attempt to run a DNS resolution on check failure.

When checks fail, the code tries to run a dns resolution, in case the IP
changed.
The old way of doing that was to check, in case the last dns resolution
hadn't expired yet, if there were an applicable IP, which should be useless,
because it has already be done when the resolution was first done, or to
run a new resolution.
Both are a locking nightmare, and lead to deadlocks, so instead, just wake the
resolvers task, that should do the trick.

This should be backported to 1.8.

MINOR: ssl: set SSL_OP_PRIORITIZE_CHACHA

Sets OpenSSL 1.1.1's SSL_OP_PRIORITIZE_CHACHA unconditionally, as per [1]:

When SSL_OP_CIPHER_SERVER_PREFERENCE is set, temporarily reprioritize
ChaCha20-Poly1305 ciphers to the top of the server cipher list if a
ChaCha20-Poly1305 cipher is at the top of the client cipher list. This
helps those clients (e.g. mobile) use ChaCha20-Poly1305 if that cipher
is anywhere in the server cipher list; but still allows other clients to
use AES and other ciphers. Requires SSL_OP_CIPHER_SERVER_PREFERENCE.

[1] https://www.openssl.org/docs/man1.1.1/man3/SSL_CTX_clear_options.html

BUG/MEDIUM: cache: don't cache when an Authorization header is present

RFC 7234 says:

A cache MUST NOT store a response to any request, unless:
[...] the Authorization header field (see Section 4.2 of [RFC7235]) does
not appear in the request, if the cache is shared, unless the
response explicitly allows it (see Section 3.2), [...]

In this patch we completely disable the cache upon the receipt of an
Authorization header in the request. In this case it's not possible to
either use the cache or store into the cache anymore.

Thanks to Adam Eijdenberg of Digital Transformation Agency for raising
this issue.

This patch must be backported to 1.8.