git.ipfire.org Git - thirdparty/haproxy.git/log

MINOR: pools: move the fault injector to __pool_alloc()

Till now it was limited to objects allocated from the OS which means
it had little use as soon as pools were enabled. Let's move it upper
in the layers so that any code can benefit from fault injection. In
addition this allows to pass a new flag POOL_F_NO_FAIL to disable it
if some callers prefer a no-failure approach.

MINOR: pools: use cheaper randoms for fault injections

ha_random() is quite heavy and uses atomic ops or even a lock on some
architectures. Here we don't seek good randoms, just statistical ones,
so let's use the statistical prng instead.

MINOR: tools: add statistical_prng_range() to get a random number over a range

This is simply a multiply and shift from statistical_prng() but it's made
easily accessible.

CLEANUP: pools: rename __pool_free() to pool_put_to_shared_cache()

Now the multi-level cache becomes more visible:

    pool_get_from_local_cache()
    pool_put_to_local_cache()
    pool_get_from_shared_cache()
    pool_put_to_shared_cache()

CLEANUP: pools: rename pool_*_{from,to}_cache() to *_local_cache()

The functions were rightfully called from/to_cache when the thread-local
cache was considered as the only cache, but this is getting terribly
confusing. Let's call them from/to local_cache to make it clear that
it is not related with the shared cache.

As a side note, since pool_evict_from_cache() used not to work for a
particular pool but for all of them at once, it was renamed to
pool_evict_from_local_caches() (plural form).

CLEANUP: pools: rename __pool_get_first() to pool_get_from_shared_cache()

This is exactly what it is, the entry is retrieved from the shared
cache when it is defined. The implementation that is enabled with
CONFIG_HAP_NO_GLOBAL_POOLS continues to return NULL.

CLEANUP: pools: move the lock to the only __pool_get_first() that needs it

Now that __pool_alloc() only surrounds __pool_get_first() with the lock,
let's move it to the only variant that requires it and remove the ugly
ifdefs from the function. This is safe because nobody else calls this
function.

MINOR: pools: call pool_alloc_nocache() out of the pool's lock

In __pool_alloc(), historically we used to use factor out the
pool's lock between __pool_get_first() and __pool_refill_alloc(),
resulting in real malloc() or mmap() calls being performed under
the pool lock (for platforms using the locked shared pools).

As this is not needed anymore, let's move the call out of the
lock, it may improve allocation patterns on some platforms. This
also makes __pool_alloc() cleaner as we see a first attempt to
allocate from the local cache, then a second from the shared
cache then a reall allocation.

CLEANUP: pools: re-merge pool_refill_alloc() and __pool_refill_alloc()

They were strictly equivalent, let's remerge them and rename them to
pool_alloc_nocache() as it's the call which performs a real allocation
which does not check nor update the cache. The only difference in the
past was the former taking the lock and not the second but now the lock
is not needed anymore at this stage since the pool's list is not touched.

In addition, given that the "avail" argument is no longer used by the
function nor by its callers, let's drop it.

MEDIUM: pools: unify pool_refill_alloc() across all models

Now we don't loop anymore trying to refill multiple items at once, and
an allocated object is directly returned to the requester instead of
being stored into the shared pool. This has multiple benefits. The
first one is that no locking is needed anymore on the allocation path
and the second one is that the loop will no longer cause latency spikes.

MINOR: pools: make the basic pool_refill_alloc()/pool_free() update needed_avg

This is a first step towards unifying all the fallback code. Right now
these two functions are the only ones which do not update the needed_avg
rate counter since there's currently no shared pool kept when using them.
But their code is similar to what could be used everywhere except for
this one, so let's make them capable of maintaining usage statistics.

As a side effect the needed field in "show pools" will now be populated.

MINOR: pools: enable the fault injector in all allocation modes

The mem_should_fail() call enabled by DEBUG_FAIL_ALLOC used to be placed
only in the no-cache version of the allocator. Now we can generalize it
to all modes and remove the exclusive test on CONFIG_HAP_NO_GLOBAL_POOLS.

MINOR: pools: rename CONFIG_HAP_LOCAL_POOLS to CONFIG_HAP_POOLS

We're going to make the local pool always present unless pools are
completely disabled. This means that pools are always enabled by
default, regardless of the use of threads. Let's drop this notion
of "local" pools and make it just "pool". The equivalent debug
option becomes DEBUG_NO_POOLS instead of DEBUG_NO_LOCAL_POOLS.

For now this changes nothing except the option and dropping the
dependency on USE_THREAD.

MINOR: pool: remove the size field from pool_cache_head

Everywhere we have access to the pool so we don't need to cache a copy
of the pool's size into the pool_cache_head. Let's remove it.

MEDIUM: pools: move the cache into the pool header

Initially per-thread pool caches were stored into a fixed-size array.
But this was a bit ugly because the last allocated pools were not able
to benefit from the cache at all. As a work around to preserve
performance, a size of 64 cacheable pools was set by default (there
are 51 pools at the moment, excluding any addon and debugging code),
so all in-tree pools were covered, at the expense of higher memory
usage.

In addition an index had to be calculated for each pool, and was used
to acces the pool cache head into that array. The pool index was not
even stored into the pools so it was required to determine it to access
the cache when the pool was already known.

This patch changes this by moving the pool cache head into the pool
head itself. This way it is certain that each pool will have its own
cache. This removes the need for index calculation.

The pool cache head is 32 bytes long so it was aligned to 64B to avoid
false sharing between threads. The extra cost is not huge (~2kB more
per pool than before), and we'll make better use of that space soon.
The pool cache head contains the size, which should probably be removed
since it's already in the pool's head.

CLEANUP: pools: remove unused arguments to pool_evict_from_cache()

In commit fb117e6a8 ("MEDIUM: memory: don't let pool_put_to_cache() free
the objects itself") pool_evict_from_cache() was introduced with no
argument, yet the only call place passes it the pool, the pointer and
the index number!

Let's remove these as they even let the reader think that the function
does something specific to the current pool while it's not the case.

MINOR: pools: drop the unused static history of artificially failed allocs

When building with DEBUG_FAIL_ALLOC we call a random generator to decide
whether the pool alloc should succeed or fail, and there was a preliminary
debugging mechanism to keep sort of a history of the previous decisions. But
it was never used, enforces a lock during the allocation, and forces to use
static variables, all of which are limiting the ability to pursue the pools
cleanups with no real benefit. Let's get rid of them now.

BUG/MINOR: pools/buffers: make sure to always reserve the required buffers

Since recent commit ae07592 ("MEDIUM: pools: add CONFIG_HAP_NO_GLOBAL_POOLS
and CONFIG_HAP_GLOBAL_POOLS") the pre-allocation of all desired reserved
buffers was not done anymore on systems not using the shared cache. This
basically has no practical impact since these ones will quickly be refilled
by all the ones used at run time, but it may confuse someone checking if
they're allocated in "show pools".

That's only 2.4-dev, no backport is needed.

BUG/MINOR: pools: maintain consistent ->allocated count on alloc failures

When running with CONFIG_HAP_NO_GLOBAL_POOLS, it's theoritically possible
to keep an incorrect count of allocated entries in a pool because the
allocated counter was used as a cumulated counter of alloc calls instead
of a number of currently allocated items (it's possible the meaning has
changed over time). The only impact in this mode essentially is that
"show pools" will report incorrect values. But this would only happen on
limited pools, which is not even certain still exist.

This was added by recent commit 0bae07592 ("MEDIUM: pools: add
CONFIG_HAP_NO_GLOBAL_POOLS and CONFIG_HAP_GLOBAL_POOLS") so no backport
is needed.

DOC: Note that URI normalization is experimental

Add a paragraph to the URI normalization documentation that URI normalization
is currently considered to be experimental.

DOC: Add introduction to http-request normalize-uri

This patch adds an introduction to the http-request normalize-uri section,
explaining what to expect from the normalizers and possible issues that might
arise when not being careful.

MEDIUM: http_act: Rename uri-normalizers

This patch renames all existing uri-normalizers into a more consistent naming
scheme:

1. The part of the URI that is being touched.
2. The modification being performed as an explicit verb.

MINOR: uri_normalizer: Add a `percent-upper` normalizer

This normalizer uppercases the hexadecimal characters used in percent-encoding.

See GitHub Issue #714.

MINOR: uri_normalizer: Add a `sort-query` normalizer

This normalizer sorts the `&` delimited query parameters by parameter name.

See GitHub Issue #714.

MINOR: uri_normalizer: Add support for supressing leading `../` for dotdot normalizer

This adds an option to supress `../` at the start of the resulting path.

MINOR: uri_normalizer: Add a `dotdot` normalizer to http-request normalize-uri

This normalizer merges `../` path segments with the predecing segment, removing
both the preceding segment and the `../`.

Empty segments do not receive special treatment. The `merge-slashes` normalizer
should be executed first.

See GitHub Issue #714.

MINOR: uri_normalizer: Add a `merge-slashes` normalizer to http-request normalize-uri

This normalizer merges adjacent slashes into a single slash, thus removing
empty path segments.

See GitHub Issue #714.

MINOR: uri_normalizer: Add `http-request normalize-uri`

This patch adds the `http-request normalize-uri` action that was requested in
GitHub issue #714.

Normalizers will be added in the next patches.

MINOR: uri_normalizer: Add `enum uri_normalizer_err`

This enum will serve as the return type for each normalizer.

MINOR: uri_normalizer: Add uri_normalizer module

This is in preparation for future patches.

BUILD: makefile: Redirect stderr to /dev/null when probing options

It is a workaround to avoid a clang 11 bug that exits with SIGABRT when
stderr is redirected to stdin. This bug was already reported few weeks ago:

https://bugs.llvm.org/show_bug.cgi?id=49463

But because it is pretty annoying, the standard error is now redirected to
/dev/null.

BUG/MINOR: logs: Report the true number of retries if there was no connection

When the session is aborted before any connection attempt to any server, the
number of connection retries reported in the logs is wrong. It happens
because when the retries counter is not strictly positive, we consider the
max number of retries was reached and the backend retries value is used. It
is obviously wrong when no connectioh was performed.

In fact, at this stage, the retries counter is initialized to 0. But the
backend stream-interface is in the INI state. Once it is set to SI_ST_REQ,
the counter is set to the backend value. And it is the only possible state
transition from INI state. Thus it is safe to rely on it to fix the bug.

This patch must be backported to all stable versions.

BUG/MINOR: http_htx: Remove BUG_ON() from http_get_stline() function

The http_get_stline() was designed to be called from HTTP analyzers. Thus
before any data forwarding. To prevent any invalid usage, two BUG_ON()
statements were added. However, it is not a good idea because it is pretty
hard to be sure no HTTP sample fetch will never be called outside the
analyzers context. Especially because there is at least one possible area
where it may happens. An HTTP sample fetch may be used inside the unique-id
format string. On the normal case, it is generated in AN_REQ_HTTP_INNER
analyzer. But if an error is reported too early, the id is generated when
the log is emitted.

So, it is safer to remove the BUG_ON() statements and consider the normal
behavior is to return NULL if the first block is not a start-line. Of
course, this means all calling functions must test the return value or be
sure the start-line is really there.

This patch must be backported as far as 2.0.

MINOR: tcp_samples: Be able to call bc_src/bc_dst from the health-checks

The new L4 sample fetches used to get source and destination info of the
backend connection may now be called from an health-check.

MINOR: tcp_samples: Add samples to get src/dst info of the backend connection

This patch adds 4 new sample fetches to get the source and the destination
info (ip address and port) of the backend connection :

* bc_dst : Returns the destination address of the backend connection
* bc_dst_port : Returns the destination port of the backend connection
* bc_src : Returns the source address of the backend connection
* bc_src_port : Returns the source port of the backend connection

The configuration manual was updated accordingly.

BUG/MINOR: http-fetch: Make method smp safe if headers were already forwarded

When method sample fetch is called, if an exotic method is found
(HTTP_METH_OTHER), when smp_prefetch_htx() is called, we must be sure the
start-line is still there. Otherwise, HAproxy may crash because of a NULL
pointer dereference, for instance if the method sample fetch is used inside
a unique-id format string. Indeed, the unique id may be generated when the
log message is emitted. At this stage, the request channel is empty.

This patch must be backported as far as 2.0. But the bug exists in all
stable versions for the legacy HTTP mode too. Thus it must be adapted to the
legacy HTTP mode and backported to all other stable versions.

BUG/MINOR: ssl-samples: Fix ssl_bc_* samples when called from a health-check

For all ssl_bc_* sample fetches, the test on the keyword when called from a
health-check is inverted. We must be sure the 5th charater is a 'b' to
retrieve a connection.

This patch must be backported as far as 2.2.

MINOR: connection: Make bc_http_major compatible with tcp-checks

bc_http_major sample fetch now works when it is called from a
tcp-check. When it happens, the session origin is a check. The backend
connection is retrieved from the conn-stream attached to the check.

If required, this path may easily be backported as far as 2.2.

BUG/MINOR: connection: Fix fc_http_major and bc_http_major for TCP connections

fc_http_major and bc_http_major sample fetches return the major digit of the
HTTP version used, respectively, by the frontend and the backend
connections, based on the mux. However, in reality, "2" is returned if the
H2 mux is detected, otherwise "1" is inconditionally returned, regardless
the mux used. Thus, if called for a raw TCP connection, "1" is returned.

To fix this bug, we now get the multiplexer flags, if there is one, to be
sure MX_FL_HTX is set.

I guess it was made this way on purpose when the H2 multiplexer was
introduced in the 1.8 and with the legacy HTTP mode there is no other
solution at the connection level. Thus this patch should be backported as
far as 2.2. For the 2.0, it must be evaluated first because of the legacy
HTTP mode.

MINOR: logs: Add support of checks as session origin to format lf strings

When a log-format string is built from an health-check, the session origin
is the health-check itself and not a connection. In addition, there is no
stream. It means for now some formats are not supported: %s, %sc, %b, %bi,
%bp, %si and %sp.

Thanks to this patch, the session origin is converted to a check. So it is
possible to retrieve the backend and the backend connection. Note this
session have no listener, thus %ft format must be guarded.

This patch is light and standalone, thus it may be backported as far as 2.2
if required. However, because the error is human, it is probably better to
wait a bit to be sure everything is properly protected.

BUG/MINOR: checks: Set missing id to the dummy checks frontend

The dummy frontend used to create the session of the tcp-checks is
initialized without identifier. However, it is required because this id may
be used without any guard, for instance in log-format string via "%f" or
when fe_name sample fetch is called. Thus, an unset id may lead to crashes.

This patch must be backported as far as 2.2.

MINOR: threads: Only consider running threads to end a thread harmeless period

When a thread ends its harmeless period, we must only consider running
threads when testing threads_want_rdv_mask mask. To do so, we reintroduce
all_threads_mask mask in the bitwise operation (It was removed to fix a
deadlock).

Note that for now it is useless because there is no way to stop threads or
to have threads reserved for another task. But it is safer this way to avoid
bugs in the future.

BUG/MEDIUM: threads: Ignore current thread to end its harmless period

A previous patch was pushed to fix a deadlock when an isolated thread ends
its harmless period (a9a9e9aac ["BUG/MEDIUM: thread: Fix a deadlock if an
isolated thread is marked as harmless"]). But, unfortunately, the fix is
incomplete. The same must be done in the outer loop, in
thread_harmless_end() function. The current thread must be ignored when
threads_want_rdv_mask mask is tested.

This patch must also be backported as far as 2.0.

DOC: ssl: Certificate hot update works on server certificates

The CLI's "set ssl cert" command also works on backend certificates
(see GitHub issue #427).

It does not need to be backported.

DOC: ssl: Certificate hot update only works on fronted certificates

The CLI's "set ssl cert" command only works on frontend certificates but
the documentation did not specify this limitations yet.

This patch can be backported to all stable branches.

CI: travis-ci: enable weekly graviton2 builds

as documented in https://blog.travis-ci.com/2020-09-11-arm-on-aws
we can build on graviton2, let us expand travis-ci matrix to that
machine type as well

MINOR: sample: converter: Add json_query converter

With the json_query can a JSON value be extacted from a header
or body of the request and saved to a variable.

This converter makes it possible to handle some JSON workload
to route requests to different backends.

MINOR: sample: converter: Add mjson library.

This library is required for the subsequent patch which adds
the JSON query possibility.

It is necessary to change the include statement in "src/mjson.c"
because the imported includes in haproxy are in "include/import"

orig: #include "mjson.h"
new: #include <import/mjson.h>

MINOR: opentracing: transfer of context names without prefix

In order to enable the assignment of a context name, and yet exclude the
use of that name (prefix in this case) when extracting the context from
the HTTP header, a special character '-' has been added, which can be
specified at the beginning of the prefix.

So let's say if we look at examples of the fe-be configuration, we can
transfer the context via an HTTP header without a prefix like this:

  fe/ot.cfg:
        ..
        span "HAProxy session"
            inject "" use-headers
        event on-backend-http-request

Such a context can be read in another process using a name that has a
special '-' sign at the beginning:

  be/ot.cfg:
    ot-scope frontend_http_request
        extract "-ot-ctx" use-headers
        span "HAProxy session" child-of "-ot-ctx" root
        ..

This means that the context name will be '-ot-ctx' but it will not be
used when extracting data from HTTP headers.

Of course, if the context does not have a prefix set, all HTTP headers
will be inserted into the OpenTracing library as context.  All of the
above will only work correctly if that library can figure out what is
relevant to the context and what is not.

MINOR: opentracing: correct calculation of the number of arguments in the args[]

It is possible that some arguments within the configuration line are not
specified; that is, they are set to a blank string.

For example:
  keyword '' arg_2

In that case the content of the args field will be like this:
  args[0]:                  'keyword'
  args[1]:                  NULL pointer
  args[2]:                  'arg_2'
  args[3 .. MAX_LINE_ARGS): NULL pointers

The previous way of calculating the number of arguments (as soon as a
null pointer is encountered) could not place an argument on an empty
string.

All of the above is essential for passing the OpenTracing context via
the HTTP headers (keyword 'inject'), where one of the arguments is the
context name prefix.  This way we can set an empty prefix, which is very
useful if we get context from some other process that can't add a prefix
to that data; or we want to pass the context to some process that cannot
handle the prefix of that data.

CI: cirrus: install "pcre" package

it turned out that our cirrus-ci freebsd builds got broken because
of missing "pcre". Most probably it was installed earlier as a dependency.
let us install it directly.

MINOR: ist: Add `istclear(struct ist*)`

istclear allows one to easily reset an ist to zero-size, while preserving the
previous size, indicating the length of the underlying buffer.

CLEANUP: sample: align samples list in sample.c

MINOR: sample: add ub64dec and ub64enc converters

ub64dec and ub64enc are the base64url equivalent of b64dec and base64
converters. base64url encoding is the "URL and Filename Safe Alphabet"
variant of base64 encoding. It is also used in in JWT (JSON Web Token)
standard.
RFC1421 mention in base64.c file is deprecated so it was replaced with
RFC4648 to which existing converters, base64/b64dec, still apply.

Example:
  HAProxy:
    http-request return content-type text/plain lf-string %[req.hdr(Authorization),word(2,.),ub64dec]
  Client:
    Token=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJ1c2VyIjoiZm9vIiwia2V5IjoiY2hhZTZBaFhhaTZlIn0.5VsVj7mdxVvo1wP5c0dVHnr-S_khnIdFkThqvwukmdg
    $ curl -H "Authorization: Bearer ${TOKEN}" http://haproxy.local
    {"user":"foo","key":"chae6AhXai6e"}

BUG/MEDIUM: sample: Fix adjusting size in field converter

Adjust the size of the sample buffer before we change the "area"
pointer. The change in size is calculated as the difference between the
original pointer and the new start pointer. But since the
`smp->data.u.str.area` assignment results in `smp->data.u.str.area` and
`start` being the same pointer, we always ended up substracting zero.
This changes it to change the size by the actual amount it changed.

I'm not entirely sure what the impact of this is, but the previous code
seemed wrong.

[wt: from what I can see the only harmful case is when the output is
converted to a stick-table key, it could result in zeroing past the
end of the buffer; other cases do not touch beyond ->data]

DOC: internals: update the SSL architecture schema

This commit adds the new fields added to the ckch_inst structure in
order to manage the backend certificate hot update (GitHub #427) and the
bug of the default certificate update (GitHub #1143).

MINOR: cfgparse/proxy: Group alloc error handling during proxy section parsing

All allocation errors in cfg_parse_listen() are now handled in a unique
place under the "alloc_error" label. This simplify a bit error handling in
this function.

BUG/MINOR: cfgparse/proxy: Hande allocation errors during proxy section parsing

At several places during the proxy section parsing, memory allocation was
performed with no check. Result is now tested and an error is returned if
the allocation fails.

This patch may be backported to all stable version but it only fixes
allocation errors during configuration parsing. Thus, it is not mandatory.

BUG/MINOR: listener: Handle allocation error when allocating a new bind_conf

Allocation error are now handled in bind_conf_alloc() functions. Thus
callers, when not already done, are also updated to catch NULL return value.

This patch may be backported (at least partially) to all stable
versions. However, it only fix errors durung configuration parsing. Thus it
is not mandatory.

BUG/MINOR: cfgparse/proxy: Fix some leaks during proxy section parsing

Allocated variables are now released when an error occurred during
use_backend, use-server, force/ignore-parsing, stick-table, stick and stats
directives parsing. For some of these directives, allocation errors have
been added.

This patch may be backported to all stable version but it only fixes leaks
or allocation errors during configuration parsing. Thus, it is not
mandatory. It should fix issue #1119.

BUG/MINOR: hlua: Fix memory leaks on error path when registering a cli keyword

When an error occurred in hlua_register_cli(), the allocated lua function
and keyword must be released to avoid memory leaks.

This patch depends on "MINOR: hlua: Add function to release a lua
function". It may be backported in all stable versions.

BUG/MINOR: hlua: Fix memory leaks on error path when registering a service

When an error occurred in hlua_register_service(), the allocated lua
function and keyword must be released to avoid memory leaks.

This patch depends on "MINOR: hlua: Add function to release a lua
function". It may be backported in all stable versions.

BUG/MINOR: hlua: Fix memory leaks on error path when registering an action

When an error occurred in hlua_register_action(), the allocated lua function
and keyword must be released to avoid memory leaks.

This patch depends on "MINOR: hlua: Add function to release a lua
function". It may be backported in all stable versions.

BUG/MINOR: hlua: Fix memory leaks on error path when parsing a lua action

hen an error occurred in action_register_lua(), the allocated hlua rule and
arguments must be released to avoid memory leaks.

This patch may be backported in all stable versions.

BUG/MINOR: hlua: Fix memory leaks on error path when registering a fetch

When an error occurred in hlua_register_fetches(), the allocated lua
function and keyword must be released to avoid memory leaks.

This patch depends on "MINOR: hlua: Add function to release a lua
function". It may be backported in all stable versions. It should fix #1112.

BUG/MINOR: hlua: Fix memory leaks on error path when registering a converter

When an error occurred in hlua_register_converters(), the allocated lua
function and keyword must be released to avoid memory leaks.

This patch depends on "MINOR: hlua: Add function to release a lua
function". It may be backported in all stable versions.

BUG/MINOR: hlua: Fix memory leaks on error path when registering a task

When an error occurred in hlua_register_task(), the allocated lua context
and task must be released to avoid memory leaks.

This patch may be backported in all stable versions.

MINOR: hlua: Add function to release a lua function

release_hlua_function() must be used to release a lua function. Some fixes
depends on this function.

MINOIR: checks/trace: Register a new trace source with its events

Add the trace support for the checks. Only tcp-check based health-checks are
supported, including the agent-check.

In traces, the first argument is always a check object. So it is easy to get
all info related to the check. The tcp-check ruleset, the conn-stream and
the connection, the server state...

MINOR: trace: Add the checks as a possible trace source

To be able to add the trace support for the checks, a new kind of source
must be added for this purpose.

MINOR: atomic: reimplement the relaxed version of x86 BTS/BTR

Olivier spotted that I messed up during a rebase of commit 92c059c2a
("MINOR: atomic: implement native BTS/BTR for x86"), losing the x86
version of the BTS/BTR and leaving the generic version for it instead
of having this block in the #else. Since this variant is not used for
now it was easy to overlook it. Let's re-implement it here.

MEDIUM: time: make the clock offset global and no per-thread

Since 1.8 for simplicity the time offset used to compensate for time
drift and jumps had been stored per thread. But with a global time,
the complexit has significantly increased.

What this patch does in order to address this is to get back to the
origins of the pre-thread time drift correction, and keep a single
offset between the system's date and the current global date.

The thread first verifies from the before_poll date if the time jumped
backwards or forward, then either fixes it by computing the new most
likely date, or applies the current offset to this latest system date.
In the first case, if the date is out of range, the old one is reused
with the max_wait offset or not depending on the interrupted flag.
Then it compares its date to the global date and updates both so that
both remain monotonic and that the local date always reflects the
latest known global date.

In order to support atomic updates to the offset, it's saved as a
ullong which contains both the tv_sec and tv_usec parts in its high
and low words. Note that a part of the patch comes from the inlining
of the equivalent of tv_add applied to the offset to make sure that
signed ints are permitted (otherwise it depends on how timeval is
defined).

This is significantly more reliable than the previous model as the
global time should move in a much smoother way, and not according
to what thread last updated it, and the thread-local time should
always be very close to the global one.

Note that (at least for debugging) a cheap way to measure processing
lag would consist in measuring the difference between global_now_ms
and now_ms, as long as other threads keep it up-to-date.

MINOR: time: change the global timeval and the the global tick at once

Instead of using two CAS loops, better compute the two units
simultaneously and update them at once. There is no guarantee that
the update will be synchronous, but we don't care, what matters is
that both are monotonically updated and that global_now_ms always
follows the last known value of global_now.

MINOR: time: remove useless variable copies in tv_update_date()

In the global_now loop, we used to set tmp_adj from adjusted, then
set update it from tmp_now, then set adjusted back to tmp_adj, and
finally set now from adjusted. This is a long and unneeded set of
moves resulting from years of code changes. Let's just set now
directly in the loop, stop using adjusted and remove tmp_adj.

MINOR: time: move the time initialization out of tv_update_date()

The time initialization was made a bit complex because we rely on a
dummy negative argument to reset all fields, leaving no distinction
between process-level initialization and thread-level initialization.
This patch changes this by introducing two functions, one for the
process and the second one for the threads. This removes ambigous
test and makes sure that the relevant fields are always initialized
exactly once. This also offers a better solution to the bug fixed in
commit b48e7c001 ("BUG/MEDIUM: time: make sure to always initialize
the global tick") as there is no more special values for global_now_ms.

It's simple enough to be backported if any other time-related issues
are encountered in stable versions in the future.

CLEANUP: time: remove the now unused ms_left_scaled

It was only used by freq_ctr and is not used anymore. In addition the
local curr_sec_ms was removed, as well as the equivalent extern
definitions which did not exist anymore either.

MINOR: freq_ctr: simplify and improve the update function

update_freq_ctr_period() was still not very clean and didn't wait for
the rotation lock to be dropped before trying again, thus maintaining
the contention at a high level. In addition, the rotation update was
made in three steps, which are not very efficient in terms of bus
cycles.

Here the wait loop was reworked so that the fast path remains short
and that the contended path waits for the lock to be dropped before
attempting another write, but it only waits a relax cycle before
attempting a read. The rotation block was simplified to remove a
test that was already validated by the first loop, and so that the
retrieval of the current period, its reset and its increment are all
performed in a single atomic op and the store to the previous period
is performed immediately after.

All this results in significantly smaller code for the inline function
(~1kB total) and a shorter critical path.

MINOR: freq_ctr: add cpu_relax in the rotation loop of update_freq_ctr_period()

When counters are rotated, there is contention between the threads which
can slow down the operation of the thread performing the rotation. Let's
apply a cpu_relax there to let the first thread finish faster.

MEDIUM: freq_ctr: replace the per-second counters with the generic ones

It remains cumbersome to preserve two versions of the freq counters and
two different internal clocks just for this. In addition, the savings
from using two different mechanisms are not that important as the only
saving is a divide that is replaced by a multiply, but now thanks to
the freq_ctr_total() unificaiton the code could also be simplified to
optimize it in case of constants.

This patch turns all non-period freq_ctr functions to static inlines
which call the period-based ones with a period of 1 second. A direct
benefit is that a single internal clock is now needed for any counter
and that they now all rely on ticks.

These 1-second counters are essentially used to report request rates
and to enforce a connection rate limitation in listeners. It was
verified that these continue to work like before.

MINOR: freq_ctr: unify freq_ctr and freq_ctr_period into freq_ctr

Both structures are identical except the name of the field starting
the period and its description. Let's call them all freq_ctr and the
period's start "curr_tick" which is generic.

This is only a temporary change and fields are expected to remain
the same with no code change (verified).

MINOR: freq_ctr: add the missing next_event_delay_period()

There was still no function to compute a wait time for periods, let's
implement it on top of freq_ctr_total() as we'll soon need it for the
per-second one. The divide here is applied on the frequency so that it
will be replaced with a reciprocal multiply when constant.

MEDIUM: freq_ctr: reimplement freq_ctr_remain_period() from freq_ctr_total()

Now the function becomes an inline one and only contains a divide and
a max. The divide will automatically go away with constant periods.

MEDIUM: freq_ctr: make read_freq_ctr_period() use freq_ctr_total()

This one is the easiest to implement, it just requires a call and a
divide of the result. Anti-flapping correction for low-rates was
preserved.

Now calls using a constant period will be able to use a reciprocal
multiply for the period instead of a divide.

MINOR: freq_ctr: add a generic function to report the total value

Most of the functions designed to read a counter over a period go through
the same complex loop and only differ in the way they use the returned
values, so it was worth implementing all this into freq_ctr_total() which
returns the total number of events over a period so that the caller can
finish its operation using a divide or a remaining time calculation. As
a special case, read_freq_ctr_period() doesn't take pending events but
requires to enable an anti-flapping correction at very low frequencies.
Thus the function implements it when pend<0.

Thanks to this function it will be possible to reimplement the other ones
as inline and merge the per-second ones with the arbitrary period ones
without always adding the cost of a 64 bit divide.

MINOR: trace: make trace sources read_mostly

The trace sources are checked at plenty of places in the code and their
contents only change when trace status changes, let's mark them read_mostly.

MINOR: pattern: make the pat_lru_seed read_mostly

This seed is created once at boot and is used in every LRU hash when
caching results. Let's mark it read_mostly.

MINOR: protocol: move __protocol_by_family to read_mostly

This one is used for each outgoing connection and never changes after
boot, move it to read_mostly.

MINOR: server: move idle_conn_task to read_mostly

This pointer is used when adding connections to the idle list and is
never changed, let's move it to the read_mostly section.

MINOR: threads: mark all_threads_mask as read_mostly

This variable almost never changes and is read a lot in time-critical
sections. threads_want_rdv_mask is read very often as well in
thread_harmless_end() and is almost never changed (only when someone
uses thread_isolate()). Let's move both to read_mostly.

MINOR: pool: move pool declarations to read_mostly

All pool heads are accessed via a pointer and should not be shared with
highly written variables. Move them to the read_mostly section.

MINOR: kqueue: move kqueue_fd to read_mostly

This one only contains the list of per-thread kqueue FDs, and is used
a lot during updates. Let's mark it read_mostly to avoid false sharing
of FDs placed at the extremities.

MINOR: epoll: move epoll_fd to read_mostly

This one only contains the list of per-thread epoll FDs, and is used
a lot during updates. Let's mark it read_mostly to avoid false sharing
of FDs placed at the extremities.

MINOR: fd: move a few read-mostly variables to their own section

Some pointer to arrays such as fdtab, fdinfo, polled_mask etc are never
written to at run time but are used a lot. fdtab accesses appear a lot in
perf top because ha_used_fds is in the same cache line and is modified
all the time. This patch moves all these read-mostly variables to the
read_mostly section when defined. This way their cache lines will be
able to remain in shared state in all CPU caches.

MINOR: global: declare a read_mostly section

Some variables are mostly read (mostly pointers) but they tend to be
merged with other ones in the same cache line, slowing their access down
in multi-thread setups. This patch declares an empty, aligned variable
in a section called "read_mostly". This will force a cache-line alignment
on this section so that any variable declared in it will be certain to
avoid false sharing with other ones. The section will be eliminated at
link time if not used.

A __read_mostly attribute was added to compiler.h to ease use of this
section.

CLEANUP: initcall: rely on HA_SECTION_* instead of defining its own

Now initcalls are defined using the regular section definitions from
compiler.h in order to ease maintenance.

MINOR: compiler: add macros to declare section names

HA_SECTION() is used as an attribute to force a section name. This is
required because OSX prepends "__DATA, " in front of the declaration.
HA_SECTION_START() and HA_SECTION_STOP() are used as post-attribute on
variable declaration to designate the section start/end (needed only on
OSX, empty on others).

For platforms with an obsolete linker, all macros are left empty. It would
possibly still work on some of them but this will not be needed anyway.

CLEANUP: initcall: rename HA_SECTION to HA_INIT_SECTION

The HA_SECTION name is too generic and will be reused globally. Let's
rename this one.

MINOR: initcall: uniformize the section names between MacOS and other unixes

Due to length restrictions on OSX the initcall sections are called "i_"
there while they're called "init_" on other OSes. However the start and
end of sections are still called "__start_init_" and "__stop_init_",
which forces to have distinct code between the OSes. Let's switch everyone
to "i_" and rename the symbols accordingly.

MINOR: trace: replace the trace() inline function with an equivalent macro

The trace() function is convenient to avoid calling trace() when traces
are not enabled, but there starts to be some callers which place complex
expressions in their trace calls, which results in all of them to be
evaluated before being passed as arguments to the trace() function. This
needlessly wastes precious CPU cycles.

Let's change the function for a macro, so that the arguments are now only
evaluated when the surce has traces enabled. However having a generic
macro being called "trace()" can easily cause conflicts with innocent
code so we rename it "_trace".

Just doing this has resulted in a 2.5% increase of the HTTP/1 request rate.

CLEANUP: pattern: make all pattern tables read-only

Interestingly, all arrays used to declare patterns were read-write while
only hard-coded. Let's mark them const so that they move from data to
rodata and don't risk to experience false sharing.