Emeric Brun [Wed, 30 Jun 2021 17:04:16 +0000 (19:04 +0200)]
MEDIUM: stick-table: add the new arrays of gpc and gpc_rate
This patch adds the definition of two new array data_types:
'gpc': This is an array of 32bits General Purpose Counters.
'gpc_rate': This is an array on increment rates of General Purpose Counters.
Like for all arrays, they are limited to 100 elements.
This patch also adds actions and fetches to handle
elements of those arrays.
Note: As documented, those new actions and fetches won't
apply to the legacy 'gpc0', 'gpc1', 'gpc0_rate' nor 'gpc1_rate'.
Emeric Brun [Tue, 22 Jun 2021 14:09:55 +0000 (16:09 +0200)]
MEDIUM: peers: handle arrays of std types in peers protocol
This patch adds support of array data_types on the peer protocol.
The table definition message will provide an additionnal parameter
for array data-types: the number of elements of the array.
In case of array of frqp it also provides a second parameter:
the period used to compute freq counter.
The array elements are std_type values linearly encoded in
the update message.
Note: if a remote peer announces an array data_type without
parameters into the table definition message, all updates
on this table will be ignored because we can not
parse update messages consistently.
Emeric Brun [Wed, 30 Jun 2021 16:01:02 +0000 (18:01 +0200)]
MEDIUM: stick-table: handle arrays of standard types into stick-tables
This patch provides the code to handle arrays of some
standard types (SINT, UINT, ULL and FRQP) in stick table.
This way we could define new "array" data types.
Note: the number of elements of an array was limited
to 100 to put a limit and to ensure that an encoded
update message will continue to fit into a buffer
when the peer protocol will handle such data types.
BUG/MINOR: peers: fix data_type bit computation more than 32 data_types
This patch fixes the computation of the bit of the current data_type
in some part of code of peer protocol where the computation is limited
to 32bits whereas the bitfield of data_types can support 64bits.
Without this patch it could result in bugs when we will define more
than 32 data_types.
Backport is useless because there is currently less than 32 data_types
Daniel Black [Thu, 1 Jul 2021 02:09:32 +0000 (12:09 +1000)]
DOC: config: use CREATE USER for mysql-check
CREATE USER has been the standard way of creating users since
MySQL-5.0 (2005).
The current syntax of INSERT INTO mysql.user won't actually work
on MariaDB-10.4+.
Because haproxy doesn't use any resources the MySQL executable comment
syntax provides resource contraints to make it more palatable
to risk adverse users.
/*!50701 is a syntax recognised by MySQL and MariaDB 5.7.1+ when
resource contraints where added.
/*M!100201 is a MariaDB executable comment syntax recognised for MariaDB
for the 10.2.1 where the MAX_STATEMENT_TIME was added.
Willy Tarreau [Wed, 30 Jun 2021 14:16:14 +0000 (16:16 +0200)]
[RELEASE] Released version 2.5-dev1
Released version 2.5-dev1 with the following main changes :
- CLEANUP: ssl: Move ssl_store related code to ssl_ckch.c
- MINOR: ssl: Allow duplicated entries in the cafile_tree
- MEDIUM: ssl: Chain ckch instances in ca-file entries
- MINOR: ssl: Add reference to default ckch instance in bind_conf
- MINOR: ssl: Add helper functions to create/delete cafile entries
- MEDIUM: ssl: Add a way to load a ca-file content from memory
- MINOR: ssl: Add helper function to add cafile entries
- MINOR: ssl: Ckch instance rebuild and cleanup factorization in CLI handler
- MEDIUM: ssl: Add "set+commit ssl ca-file" CLI commands
- REGTESTS: ssl: Add new ca-file update tests
- MINOR: ssl: Add "abort ssl ca-file" CLI command
- MINOR: ssl: Add a cafile_entry type field
- MINOR: ssl: Refactorize the "show certificate details" code
- MEDIUM: ssl: Add "show ssl ca-file" CLI command
- MEDIUM: ssl: Add "new ssl ca-file" CLI command
- MINOR: ssl: Add "del ssl ca-file" CLI command
- REGTESTS: ssl: Add "new/del ssl ca-file" tests
- DOC: ssl: Add documentation about CA file hot update commands
- DOC: internals: update the SSL architecture schema
- MINOR: ssl: Chain instances in ca-file entries
- MEDIUM: ssl: Add "set+commit ssl crl-file" CLI commands
- MEDIUM: ssl: Add "new+del crl-file" CLI commands
- MINOR: ssl: Add "abort ssl crl-file" CLI command
- MEDIUM: ssl: Add "show ssl crl-file" CLI command
- REGTESTS: ssl: Add "new/del ssl crl-file" tests
- REGTESTS: ssl: Add "set/commit ssl crl-file" test
- DOC: ssl: Add documentation about CRL file hot update commands
- BUILD/MINOR: ssl: Fix compilation with SSL enabled
- BUILD/MINOR: ssl: Fix compilation with OpenSSL 1.0.2
- CI: introduce scripts/build-vtest.sh for installing VTest
- CLEANUP: ssl: Fix coverity issues found in CA file hot update code
- CI: github actions: add OpenTracing builds
- BUG/MEDIUM: ebtree: Invalid read when looking for dup entry
- BUG/MAJOR: server: prevent deadlock when using 'set maxconn server'
- BUILD/MINOR: opentracing: fixed build when using clang
- BUG/MEDIUM: filters: Exec pre/post analysers only one time per filter
- BUG/MINOR: http-comp: Preserve HTTP_MSGF_COMPRESSIONG flag on the response
- MINOR: map/acl: print the count of all the map/acl entries in "show map/acl"
- CLEANUP: pattern: remove export of non-existent function pattern_delete()
- MINOR: h1-htx: Update h1 parsing functions to return result as a size_t
- MEDIUM: h1-htx: Adapt H1 data parsing to copy wrapping data in one call
- MINOR: mux-h1/mux-fcgi: Don't needlessly loop on data parsing
- MINOR: h1-htx: Move HTTP chunks parsing into a dedicated function
- MEDIUM: h1-htx: Split function to parse a chunk and the loop on the buffer
- MEDIUM: h1-htx: Add a function to parse contiguous small chunks
- MINOR: h1-htx: Use a correlation table to speed-up small chunks parsing
- MINOR: buf: Add function to realign a buffer with a specific head position
- MINOR: muxes/h1-htx: Realign input buffer using b_slow_realign_ofs()
- CLEANUP: mux-h1: Rename functions parsing input buf and filling output buf
- Revert "MEDIUM: http-ana: Deal with L7 retries in HTTP analysers"
- BUG/MINOR: http-ana: Send the right error if max retries is reached on L7 retry
- BUG/MINOR: http-ana: Handle L7 retries on refused early data before K/A aborts
- MINOR: http-ana: Perform L7 retries because of status codes in response analyser
- MINOR: cfgparse: Fail when encountering extra arguments in macro
- DOC: intro: Fix typo in starter guide
- BUG/MINOR: server: Missing calloc return value check in srv_parse_source
- BUG/MINOR: peers: Missing calloc return value check in peers_register_table
- BUG/MINOR: ssl: Missing calloc return value check in ssl_init_single_engine
- BUG/MINOR: http: Missing calloc return value check in parse_http_req_capture
- BUG/MINOR: proxy: Missing calloc return value check in proxy_parse_declare
- BUG/MINOR: proxy: Missing calloc return value check in proxy_defproxy_cpy
- BUG/MINOR: http: Missing calloc return value check while parsing tcp-request/tcp-response
- BUG/MINOR: http: Missing calloc return value check while parsing tcp-request rule
- BUG/MINOR: compression: Missing calloc return value check in comp_append_type/algo
- BUG/MINOR: worker: Missing calloc return value check in mworker_env_to_proc_list
- BUG/MINOR: http: Missing calloc return value check while parsing redirect rule
- BUG/MINOR: http: Missing calloc return value check in make_arg_list
- BUG/MINOR: proxy: Missing calloc return value check in chash_init_server_tree
- CLEANUP: http-ana: Remove useless if statement about L7 retries
- BUG/MAJOR: stream-int: Release SI endpoint on server side ASAP on retry
- MINOR: backend: Don't release SI endpoint anymore in connect_server()
- BUG/MINOR: vars: Be sure to have a session to get checks variables
- DOC/MINOR: move uuid in the configuration to the right alphabetical order
- CLEANUP: mux-fcgi: Don't needlessly store result of data/trailers parsing
- BUILD: fix compilation for OpenSSL-3.0.0-alpha17
- MINOR: http-ana: Use -1 status for client aborts during queuing and connect
- REGTESTS: Fix http_abortonclose.vtc to support -1 status for some client aborts
- CLEANUP: backend: fix incorrect comments on locking conditions for lb functions
- CLEANUP: reg-tests: Remove obsolete no-htx parameter for reg-tests
- CI: github actions: add OpenSSL-3.0.0 builds
- CI: github actions: -Wno-deprecated-declarations with OpenSSL 3.0.0
- MINOR: errors: allow empty va_args for diag variadic macro
- REORG: errors: split errors reporting function from log.c
- CLEANUP: server: fix cosmetic of error message on sni parsing
- MEDIUM: errors: implement user messages buffer
- MINOR: log: do not discard stderr when starting is over
- MEDIUM: errors: implement parsing context type
- MINOR: errors: use user messages context in print_message
- MINOR: log: display exec path on first warning
- MINOR: errors: specify prefix "config" for parsing output
- MINOR: log: define server user message format
- REORG: server: use parsing ctx for server parsing
- REORG: config: use parsing ctx for server config check
- MINOR: server: use parsing ctx for server init addr
- MINOR: server: use ha_alert in server parsing functions
- DOC: use the req.ssl_sni in examples
- CLEANUP: cfgparse: Remove duplication of `MAX_LINE_ARGS + 1`
- CLEANUP: tools: Make errptr const in `parse_line()`
- MINOR: haproxy: Add `-cc` argument
- BUG: errors: remove printf positional args for user messages context
- CI: Make matrix.py executable and add shebang
- BUILD: make tune.ssl.keylog available again
- BUG/MINOR: ssl: OCSP stapling does not work if expire too far in the future
- Revert "BUG/MINOR: opentracing: initialization after establishing daemon mode"
- BUG/MEDIUM: opentracing: initialization before establishing daemon and/or chroot mode
- SCRIPTS: opentracing: enable parallel builds in build-ot.sh
- BUG/MEDIUM: compression: Fix loop skipping unused blocks to get the next block
- BUG/MEDIUM: compression: Properly get the next block to iterate on payload
- BUG/MEDIUM: compression: Add a flag to know the filter is still processing data
- MINOR: ssl: Keep the actual key length in the certificate_ocsp structure
- MINOR: ssl: Add new "show ssl ocsp-response" CLI command
- MINOR: ssl: Add the OCSP entry key when displaying the details of a certificate
- MINOR: ssl: Add the "show ssl cert foo.pem.ocsp" CLI command
- REGTESTS: ssl: Add "show ssl ocsp-response" test
- BUG/MINOR: server: explicitly set "none" init-addr for dynamic servers
- BUG/MINOR: pools: fix a possible memory leak in the lockless pool_flush()
- BUG/MINOR: pools: make DEBUG_UAF always write to the to-be-freed location
- MINOR: pools: do not maintain the lock during pool_flush()
- MINOR: pools: call malloc_trim() under thread isolation
- MEDIUM: pools: use a single pool_gc() function for locked and lockless
- BUG/MAJOR: pools: fix possible race with free() in the lockless variant
- CLEANUP: pools: remove now unused seq and pool_free_list
- MEDIUM: pools: remove the locked pools implementation
- BUILD: ssl: Fix compilation with BoringSSL
- BUG/MEDIUM: errors: include missing obj_type file
- REGTESTS: ssl: show_ssl_ocspresponce.vtc is broken with BoringSSL
- BUG/MAJOR: htx: Fix htx_defrag() when an HTX block is expanded
- BUG/MINOR: mux-fcgi: Expose SERVER_SOFTWARE parameter by default
- BUG/MINOR: h1-htx: Fix a signess bug with char data type when parsing chunk size
- CLEANUP: l7-retries: do not test the buffer before calling b_alloc()
- BUG/MINOR: resolvers: answser item list was randomly purged or errors
- MEDIUM: resolvers: add a ref on server to the used A/AAAA answer item
- MEDIUM: resolvers: add a ref between servers and srv request or used SRV record
- BUG/MINOR: server-state: load SRV resolution only if params match the config
- MINOR: config: remove support for deprecated option "tune.chksize"
- MINOR: config: completely remove support for "no option http-use-htx"
- MINOR: log: remove the long-deprecated early log-format tags
- MINOR: http: remove the long deprecated "set-cookie()" sample fetch function
- MINOR: config: reject long-deprecated "option forceclose"
- MINOR: config: remove deprecated option "http-tunnel"
- MEDIUM: proxy: remove the deprecated "grace" keyword
- MAJOR: config: remove parsing of the global "nbproc" directive
- BUILD: init: remove initialization of multi-process thread mappings
- BUILD: log: remove unused fmt_directive()
- REGTESTS: Remove REQUIRE_VERSION=1.6 from all tests
- REGTESTS: Remove REQUIRE_VERSION=1.7 from all tests
- CI: github actions: enable alpine/musl builds
- BUG/MAJOR: resolvers: segfault using server template without SRV RECORDs
- DOC: lua: Add a warning about buffers modification in HTTP
- MINOR: ssl: Use OpenSSL's ASN1_TIME convertor when available
- BUG/MINOR: stick-table: insert srv in used_name tree even with fixed id
- BUG/MEDIUM: server: extend thread-isolate over much of CLI 'add server'
- BUG/MEDIUM: server: clear dynamic srv on delete from proxy id/name trees
- BUG/MEDIUM: server: do not forget to generate the dynamic servers ids
- BUG/MINOR: server: do not keep an invalid dynamic server in px ids tree
- BUG/MEDIUM: server: do not auto insert a dynamic server in px addr_node
- BUG/MEDIUM: shctx: use at least thread-based locking on USE_PRIVATE_CACHE
- BUG/MINOR: ssl: use atomic ops to update global shctx stats
- BUG/MINOR: mworker: fix typo in chroot error message
- CLEANUP: global: remove unused definition of stopping_task[]
- MEDIUM: init: remove the loop over processes during init
- MINOR: mworker: remove the initialization loop over processes
- CLEANUP: global: remove the nbproc field from the global structure
- CLEANUP: global: remove pid_bit and all_proc_mask
- MEDIUM: global: remove dead code from nbproc/bind_proc removal
- MEDIUM: config: simplify cpu-map handling
- MEDIUM: cpu-set: make the proc a single bit field and not an array
- CLEANUP: global: remove unused definition of MAX_PROCS
- MEDIUM: global: remove the relative_pid from global and mworker
- DOC: update references to process numbers in cpu-map and bind-process
- MEDIUM: config: warn about "bind-process" deprecation
- CLEANUP: shctx: remove the different inter-process locking techniques
- BUG/MAJOR: queue: set SF_ASSIGNED when setting strm->target on dequeue
- MINOR: backend: only skip LB when there are actual connections
- BUG/MINOR: mux-h1: do not skip the error response on bad requests
- MINOR: connection: add helper conn_append_debug_info()
- MINOR: mux-h2/trace: report a few connection-level info during h2_init()
- CLEANUP: mux-h2/traces: better align user messages
- BUG/MINOR: stats: make "show stat typed desc" work again
- MINOR: mux-h2: obey http-ignore-probes during the preface
- BUG/MINOR: mux-h2/traces: bring back the lost "rcvd H2 REQ" trace
- BUG/MINOR: mux-h2/traces: bring back the lost "sent H2 REQ/RES" traces
- CLEANUP: assorted typo fixes in the code and comments
- CI: Replace the requirement for 'sudo' with a call to 'ulimit -n'
- REGTESTS: Replace REQUIRE_VERSION=2.5 with 'haproxy -cc'
- REGTESTS: Replace REQUIRE_OPTIONS with 'haproxy -cc' for 2.5+ tests
- REGTESTS: Replace REQUIRE_BINARIES with 'command -v'
- REGTESTS: Remove support for REQUIRE_BINARIES
- CI: ssl: enable parallel builds for OpenSSL on Linux
- CI: ssl: do not needlessly build the OpenSSL docs
- CI: ssl: keep the old method for ancient OpenSSL versions
- CLEANUP: server: a separate function for initializing the per_thr field
- BUG/MINOR: server: Forbid to set fqdn on the CLI if SRV resolution is enabled
- BUG/MEDIUM: server/cli: Fix ABBA deadlock when fqdn is set from the CLI
- MINOR: resolvers: Clean server in a dedicated function when removing a SRV item
- MINOR: resolvers: Remove server from named_servers tree when removing a SRV item
- BUG/MEDIUM: resolvers: Add a task on servers to check SRV resolution status
- BUG/MINOR: backend: restore the SF_SRV_REUSED flag original purpose
- BUG/MINOR: backend: do not set sni on connection reuse
- BUG/MINOR: resolvers: Use resolver's lock in resolv_srvrq_expire_task()
- BUG/MINOR: server/cli: Fix locking in function processing "set server" command
- BUG/MINOR: cache: Correctly handle existing-but-empty 'accept-encoding' header
- MINOR: ssl: fix typo in usage for 'new ssl ca-file'
- MINOR: ssl: always initialize random generator
- MINOR: ssl: check allocation in ssl_sock_init_srv
- MINOR: ssl: check allocation in parse ciphers/ciphersuites/verifyhost
- MINOR: ssl: check allocation in parse npn/sni
- MINOR: server: disable CLI 'set server ssl' for dynamic servers
- MINOR: ssl: render file-access optional on server crt loading
- MINOR: ssl: split parse functions for alpn/check-alpn
- MINOR: ssl: support ca-file arg for dynamic servers
- MINOR: ssl: support crt arg for dynamic servers
- MINOR: ssl: support crl arg for dynamic servers
- MINOR: ssl: enable a series of ssl keywords for dynamic servers
- MINOR: ssl: support ssl keyword for dynamic servers
- REGTESTS: server: test ssl support for dynamic servers
- MINOR: queue: update the stream's pend_pos before queuing it
- CLEANUP: Prevent channel-t.h from being detected as C++ by GitHub
- BUG/MAJOR: server: fix deadlock when changing maxconn via agent-check
- REGTESTS: fix maxconn update with agent-check
- MEDIUM: queue: make pendconn_process_next_strm() only return the pendconn
- MINOR: queue: update proxy->served once out of the loop
- MEDIUM: queue: refine the locking in process_srv_queue()
- MINOR: lb/api: remove the locked argument from take_conn/drop_conn
- MINOR: queue: create a new structure type "queue"
- MINOR: proxy: replace the pendconns-related stuff with a struct queue
- MINOR: server: replace the pendconns-related stuff with a struct queue
- MEDIUM: queue: use a dedicated lock for the queues
- MEDIUM: queue: simplify again the process_srv_queue() API
- MINOR: queue: factor out the proxy/server queuing code
- MINOR: queue: use atomic-ops to update the queue's index
- MEDIUM: queue: determine in process_srv_queue() if the proxy is usable
- MEDIUM: queue: move the queue lock manipulation to pendconn_process_next_strm()
- MEDIUM: queue: unlock as soon as possible
- MINOR: queue: make pendconn_first() take the lock by itself
- CLEANUP: backend: remove impossible case of round-robin + consistent hash
- MINOR: tcp-act: Add set-src/set-src-port for "tcp-request content" rules
- DOC: config: Add missing actions in "tcp-request session" documentation
- CLEANUP: dns: Remove a forgotten debug message
- DOC: Replace issue templates by issue forms
- Revert "MINOR: queue: make pendconn_first() take the lock by itself"
- Revert "MEDIUM: queue: unlock as soon as possible"
- Revert "MEDIUM: queue: move the queue lock manipulation to pendconn_process_next_strm()"
- Revert "MEDIUM: queue: determine in process_srv_queue() if the proxy is usable"
- Revert "MINOR: queue: use atomic-ops to update the queue's index"
- Revert "MINOR: queue: factor out the proxy/server queuing code"
- Revert "MEDIUM: queue: simplify again the process_srv_queue() API"
- Revert "MEDIUM: queue: use a dedicated lock for the queues"
- Revert "MEDIUM: queue: refine the locking in process_srv_queue()"
- Revert "MINOR: queue: update proxy->served once out of the loop"
- Revert "MEDIUM: queue: make pendconn_process_next_strm() only return the pendconn"
- MEDIUM: queue: update px->served and lb's take_conn once per loop
- MEDIUM: queue: use a dedicated lock for the queues (v2)
- MEDIUM: queue: simplify again the process_srv_queue() API (v2)
- MEDIUM: queue: determine in process_srv_queue() if the proxy is usable (v2)
- MINOR: queue: factor out the proxy/server queuing code (v2)
- MINOR: queue: use atomic-ops to update the queue's index (v2)
- MEDIUM: queue: take the proxy lock only during the px queue accesses
- MEDIUM: queue: use a trylock on the server's queue
- MINOR: queue: add queue_init() to initialize a queue
- MINOR: queue: add a pointer to the server and the proxy in the queue
- MINOR: queue: store a pointer to the queue into the pendconn
- MINOR: queue: remove the px/srv fields from pendconn
- MINOR: queue: simplify pendconn_unlink() regarding srv vs px
- BUG: backend: stop looking for queued connections once there's no more
- BUG/MINOR: queue/debug: use the correct lock labels on the queue lock
- BUG/MINOR: resolvers: Always attach server on matching record on resolution
- BUG/MINOR: resolvers: Reset server IP when no ip is found in the response
- MINOR: resolvers: Reset server IP on error in resolv_get_ip_from_response()
- BUG/MINOR: checks: return correct error code for srv_parse_agent_check
- BUILD: Makefile: fix linkage for Haiku.
- BUG/MINOR: tcpcheck: Fix numbering of implicit HTTP send/expect rules
- MINOR: http-act/tcp-act: Add "set-log-level" for tcp content rules
- MINOR: http-act/tcp-act: Add "set-nice" for tcp content rules
- MINOR: http-act/tcp-act: Add "set-mark" and "set-tos" for tcp content rules
- CLEANUP: tcp-act: Sort action lists
- BUILD/MEDIUM: tcp: set-mark setting support for FreeBSD.
- BUILD: tcp-act: avoid warning when set-mark / set-tos are not supported
- BUG/MINOR: mqtt: Fix parser for string with more than 127 characters
- BUG/MINOR: mqtt: Support empty client ID in CONNECT message
- BUG/MEDIUM: resolvers: Make 1st server of a template take part to SRV resolution
- CLEANUP: peers: re-write intdecode function comment.
BUG/MEDIUM: resolvers: Make 1st server of a template take part to SRV resolution
The commit 3406766d5 ("MEDIUM: resolvers: add a ref between servers and srv
request or used SRV record") introduced a regression. The first server of a
template based on SRV record is no longer resolved. The same bug exists for
a normal server based on a SRV record.
In fact, the server used during parsing (used as reference when a
server-template line is parsed) is never attached to the corresponding srvrq
object. Thus with following lines, no resolution is performed because
"srvrq->attached_servers" is empty:
server-template test 1 _http.domain.tld resolvers dns ...
server test1 _http.domain.tld resolvers dns ...
This patch should fix the issue #1295 (but not confirmed yet it is the same
bug). It must be backported everywhere the above commit is.
BUG/MINOR: mqtt: Support empty client ID in CONNECT message
As specified by the MQTT specification (MQTT-3.1.3-6), the client ID may be
empty. That means the length of the client ID string may be 0. However, The
MQTT parser does not support empty strings.
So, to fix the bug, the mqtt_read_string() function may now parse empty
string. 2 bytes must be found to decode the string length, but the length
may be 0 now. It is the caller responsibility to test the string emptiness
if necessary. In addition, in mqtt_parse_connect(), the client ID may be
empty now.
This patch should partely fix the issue #1310. It must be backported to 2.4.
BUG/MINOR: mqtt: Fix parser for string with more than 127 characters
Parsing of too long strings (> 127 characters) was buggy because of a wrong
cast on the length bytes. To fix the bug, we rely on mqtt_read_2byte_int()
function. This way, the string length is properly decoded.
This patch should partely fix the issue #1310. It must be backported to 2.4.
Willy Tarreau [Mon, 28 Jun 2021 05:12:22 +0000 (07:12 +0200)]
BUILD: tcp-act: avoid warning when set-mark / set-tos are not supported
Since recent commit 469c06c30 ("MINOR: http-act/tcp-act: Add "set-mark"
and "set-tos" for tcp content rules") there's a build warning (or error)
on Windows due to static function tcp_action_set_mark() not being used
because the set-mark functionality is not supported there. It's caused
by the fact that only the parsing function uses it so if the code is
ifdefed out the function remains unused.
Let's surround it with ifdefs as well, and do the same for
tcp_action_set_tos() which could suffer the same fate on operating systems
not defining IP_TOS.
This may need to be backported if the patch above is backported. Also
be careful, the condition was adjusted to cover FreeBSD after commit f7f53afcf ("BUILD/MEDIUM: tcp: set-mark setting support for FreeBSD.").
MINOR: http-act/tcp-act: Add "set-mark" and "set-tos" for tcp content rules
It is now possible to set the Netfilter MARK and the TOS field value in all
packets sent to the client from any tcp-request rulesets or the "tcp-response
content" one. To do so, the parsing of "set-mark" and "set-tos" actions are
moved in tcp_act.c and the actions evaluation is handled in dedicated functions.
This patch may be backported as far as 2.2 if necessary.
MINOR: http-act/tcp-act: Add "set-nice" for tcp content rules
It is now possible to set the "nice" factor of the current stream from a
"tcp-request content" or "tcp-response content" ruleset. To do so, the
action parsing is moved in stream.c and the action evaluation is handled in
a dedicated function.
This patch may be backported as far as 2.2 if necessary.
MINOR: http-act/tcp-act: Add "set-log-level" for tcp content rules
It is now possible to set the stream log level from a "tcp-request content"
or "tcp-response content" ruleset. To do so, the action parsing is moved in
stream.c and the action evaluation is handled in a dedicated function.
This patch should fix issue #1306. It may be backported as far as 2.2 if
necessary.
BUG/MINOR: tcpcheck: Fix numbering of implicit HTTP send/expect rules
The index of the failing rule is reported in the health-check log message. The
rules index is also used in the check traces. But for implicit HTTP send/expect
rules, the index is wrong. It must be incremented by one compared to the
preceding rule.
Dirkjan Bussink [Fri, 18 Jun 2021 19:57:49 +0000 (19:57 +0000)]
BUG/MINOR: checks: return correct error code for srv_parse_agent_check
In srv_parse_agent_check the error code is not returned in case
something goes wrong. The value 0 is always return.
Additionally, there's a small cleanup of unreachable returns that in
most checks are not present either and removed in two places they were
present. This makes the code consistent across the different checks.
MINOR: resolvers: Reset server IP on error in resolv_get_ip_from_response()
If resolv_get_ip_from_response() returns an error (or an unexpected return
value), the server is set to RMAINT status. However, its address must also
be reset. Otherwise, it is still reported by the cli on "show servers state"
commands. This may be confusing. Note that it is a theorical patch because
this code path does not exist. Thus it is not tagged as a BUG.
BUG/MINOR: resolvers: Reset server IP when no ip is found in the response
For A/AAAA resolution, if no ip is found for a server in the response, the
server is set to RMAINT status. However, its address must also be
reset. Otherwise, it is still reported by the cli on "show servers state"
commands. This may be confusing.
BUG/MINOR: resolvers: Always attach server on matching record on resolution
On A/AAAA resolution, for a given server, if a record is matching, we must
always attach the server to this record. Before it was only done if the
server IP was not the same than the record one. However, it is a problem if
the server IP was not set for a previous resolution. From the libc during
startup for instance. In this case, the server IP is not updated and the
server is not attached to any record. It remains in this state while a
matching record is found in the DNS response. It is especially a problem
when the resolution is used for server-templates.
This bug was introduced by the commit bd78c912f ("MEDIUM: resolvers: add a
ref on server to the used A/AAAA answer item").
This patch should solve the issue #1305. It must be backported to all
versions containing the above commit.
Willy Tarreau [Thu, 24 Jun 2021 14:00:18 +0000 (16:00 +0200)]
BUG/MINOR: queue/debug: use the correct lock labels on the queue lock
A dedicated queue lock was added by commit 16fbdda3c ("MEDIUM: queue:
use a dedicated lock for the queues (v2)") but during its rebase, some
labels were lost and left to SERVER_LOCK / PROXY_LOCK instead of
QUEUE_LOCK. It's harmless but can confuse the lock debugger, so better
fix it.
Willy Tarreau [Thu, 24 Jun 2021 13:51:12 +0000 (15:51 +0200)]
BUG: backend: stop looking for queued connections once there's no more
Commit ae0b12ee0 ("MEDIUM: queue: use a trylock on the server's queue")
introduced a hard to trigger bug that's more visible with a single thread:
if a server dequeues a connection and finds another free slot with no
connection to place there, process_srv_queue() will never break out of
the loop. In multi-thread it almost does not happen because other threads
bring new connections.
Willy Tarreau [Wed, 23 Jun 2021 14:54:16 +0000 (16:54 +0200)]
MINOR: queue: simplify pendconn_unlink() regarding srv vs px
Since the code paths became exactly the same except for what log field
to update, let's simplify the code and move further code out of the
lock. The queue position update and the test for server vs proxy do not
need to be inside the lock.
Willy Tarreau [Wed, 23 Jun 2021 14:43:45 +0000 (16:43 +0200)]
MINOR: queue: remove the px/srv fields from pendconn
Now we directly use p->queue to get to the queue, which is much more
straightforward. The performance on 100 servers and 16 threads
increased from 560k to 574k RPS, or 2.5%.
A lot more simplifications are possible, but the minimum was done at
this point.
Willy Tarreau [Wed, 23 Jun 2021 14:11:02 +0000 (16:11 +0200)]
MINOR: queue: add a pointer to the server and the proxy in the queue
A queue is specific to a server or a proxy, so we don't need to place
this distinction inside all pendconns, it can be in the queue itself.
This commit adds the relevant fields "px" and "sv" into the struct
queue, and initializes them accordingly.
Willy Tarreau [Thu, 24 Jun 2021 06:30:07 +0000 (08:30 +0200)]
MEDIUM: queue: use a trylock on the server's queue
Doing so makes sure that threads attempting to wake up new connections
for a server will give up early if another thread is already in charge
of this. The goal is to avoid unneeded contention on low server counts.
Now with a single server with 16 threads in roundrobin we get the same
performance as with multiple servers, i.e. ~575kreq/s instead of ~496k
before. Leastconn is seeing a similar jump, from ~460 to ~560k (the
difference being the calls to fwlc_srv_reposition).
The overhead of process_srv_queue() is now around 2% instead of ~20%
previously.
Willy Tarreau [Thu, 24 Jun 2021 06:04:24 +0000 (08:04 +0200)]
MEDIUM: queue: take the proxy lock only during the px queue accesses
There's no point keeping the proxy lock held for a long time, it's
only needed when checking the proxy's queue, and keeping it prevents
multiple servers from dequeuing in parallel. Let's move it into
pendconn_process_next_strm() and release it ASAP. The pendconn
remains under the server queue lock's protection, guaranteeing that
no stream will release it while it's being touched.
For roundrobin, the performance increases by 76% (327k to 575k) on
16 threads. Even with a single server and maxconn=100, the performance
increases from 398 to 496 kreq/s. For leastconn, almost no change is
visible (less than one percent) but this is expected since most of the
time there is spent in fwlc_reposition() and fwlc_get_next_server().
Willy Tarreau [Fri, 18 Jun 2021 08:51:58 +0000 (10:51 +0200)]
MINOR: queue: use atomic-ops to update the queue's index (v2)
Doing so allows to retrieve and update the pendconn's queue index outside
of the queue's lock and to save one more percent CPU on a highly-contented
backend.
Willy Tarreau [Fri, 18 Jun 2021 08:21:20 +0000 (10:21 +0200)]
MINOR: queue: factor out the proxy/server queuing code (v2)
The code only differed by the nbpend_max counter. Let's have a pointer
to it and merge the two variants to always use a generic queue. It was
initially considered to put the max inside the queue structure itself,
but the stats support clearing values and maxes and this would have been
the only counter having to be handled separately there. Given that we
don't need this max anywhere outside stats, let's keep it where it is
and have a pointer to it instead.
The CAS loop to update the max remains. It was naively thought that it
would have been faster without atomic ops inside the lock, but this is
not the case for the simple reason that it is a max, it converges very
quickly and never has to perform the check anymore. Thus this code is
better out of the lock.
The queue_idx is still updated inside the lock since that's where the
idx is updated, though it could be performed using atomic ops given
that it's only used to roughly count places for logging.
Willy Tarreau [Tue, 22 Jun 2021 16:47:51 +0000 (18:47 +0200)]
MEDIUM: queue: simplify again the process_srv_queue() API (v2)
This basically undoes the API changes that were performed by commit 0274286dd ("BUG/MAJOR: server: fix deadlock when changing maxconn via
agent-check") to address the deadlock issue: since process_srv_queue()
doesn't use the server lock anymore, it doesn't need the "server_locked"
argument, so let's get rid of it before it gets used again.
Willy Tarreau [Fri, 18 Jun 2021 07:45:27 +0000 (09:45 +0200)]
MEDIUM: queue: use a dedicated lock for the queues (v2)
Till now whenever a server or proxy's queue was touched, this server
or proxy's lock was taken. Not only this requires distinct code paths,
but it also causes unnecessary contention with other uses of these locks.
This patch adds a lock inside the "queue" structure that will be used
the same way by the server and the proxy queuing code. The server used
to use a spinlock and the proxy an rwlock, though the queue only used
it for locked writes. This new version uses a spinlock since we don't
need the read lock part here. Tests have not shown any benefit nor cost
in using this one versus the rwlock so we could change later if needed.
The lower contention on the locks increases the performance from 362k
to 374k req/s on 16 threads with 20 servers and leastconn. The gain
with roundrobin even increases by 9%.
This is tagged medium because the lock is changed, but no other part of
the code touches the queues, with nor without locking, so this should
remain invisible.
Willy Tarreau [Thu, 24 Jun 2021 05:47:08 +0000 (07:47 +0200)]
MEDIUM: queue: update px->served and lb's take_conn once per loop
There's no point doing atomic incs over px->served/px->totpend under the
locks from the inner loop, as this value is used by the LB algorithms but
not during the dequeuing step. In addition, the LB algo's take_conn()
doesn't need to be refreshed for each and every connection taken
under the lock, it can be performed once at the end and out of the
lock.
While the gain on roundrobin is not noticeable (only the atomic inc),
on leastconn which uses take_conn(), the performance increases from
355k to 362k req/s on 16 threads.
The recent changes since 5304669e1 MEDIUM: queue: make
pendconn_process_next_strm() only return the pendconn opened a tiny race
condition between stream_free() and process_srv_queue(), as the pendconn
is accessed outside of the lock, possibly while it's being freed. A
different approach is required.
The recent changes since 5304669e1 MEDIUM: queue: make
pendconn_process_next_strm() only return the pendconn opened a tiny race
condition between stream_free() and process_srv_queue(), as the pendconn
is accessed outside of the lock, possibly while it's being freed. A
different approach is required.
The recent changes since 5304669e1 MEDIUM: queue: make
pendconn_process_next_strm() only return the pendconn opened a tiny race
condition between stream_free() and process_srv_queue(), as the pendconn
is accessed outside of the lock, possibly while it's being freed. A
different approach is required.
The recent changes since 5304669e1 MEDIUM: queue: make
pendconn_process_next_strm() only return the pendconn opened a tiny race
condition between stream_free() and process_srv_queue(), as the pendconn
is accessed outside of the lock, possibly while it's being freed. A
different approach is required.
The recent changes since 5304669e1 MEDIUM: queue: make
pendconn_process_next_strm() only return the pendconn opened a tiny race
condition between stream_free() and process_srv_queue(), as the pendconn
is accessed outside of the lock, possibly while it's being freed. A
different approach is required.
The recent changes since 5304669e1 MEDIUM: queue: make
pendconn_process_next_strm() only return the pendconn opened a tiny race
condition between stream_free() and process_srv_queue(), as the pendconn
is accessed outside of the lock, possibly while it's being freed. A
different approach is required.
The recent changes since 5304669e1 MEDIUM: queue: make
pendconn_process_next_strm() only return the pendconn opened a tiny race
condition between stream_free() and process_srv_queue(), as the pendconn
is accessed outside of the lock, possibly while it's being freed. A
different approach is required.
The recent changes since 5304669e1 MEDIUM: queue: make
pendconn_process_next_strm() only return the pendconn opened a tiny race
condition between stream_free() and process_srv_queue(), as the pendconn
is accessed outside of the lock, possibly while it's being freed. A
different approach is required.
The recent changes since 5304669e1 MEDIUM: queue: make
pendconn_process_next_strm() only return the pendconn opened a tiny race
condition between stream_free() and process_srv_queue(), as the pendconn
is accessed outside of the lock, possibly while it's being freed. A
different approach is required.
The recent changes since 5304669e1 MEDIUM: queue: make
pendconn_process_next_strm() only return the pendconn opened a tiny race
condition between stream_free() and process_srv_queue(), as the pendconn
is accessed outside of the lock, possibly while it's being freed. A
different approach is required.
The recent changes since 5304669e1 MEDIUM: queue: make
pendconn_process_next_strm() only return the pendconn opened a tiny race
condition between stream_free() and process_srv_queue(), as the pendconn
is accessed outside of the lock, possibly while it's being freed. A
different approach is required.
Tim Düsterhus [Wed, 23 Jun 2021 19:38:13 +0000 (21:38 +0200)]
DOC: Replace issue templates by issue forms
GitHub's issue forms are the next evolution of issue templates and allow for
showing an actual form with separate inputs when creating an issue. They ensure
that all the required fields are filled in and automatically format code parts
(e.g. haproxy -vv or the configuration) as actual code blocks, possibly with
syntax highlighting.
Co-authored-by: Maximilian Mader <max@bastelstu.be>
MINOR: tcp-act: Add set-src/set-src-port for "tcp-request content" rules
If it possible to set source IP/Port from "tcp-request connection",
"tcp-request session" and "http-request" rules but not from "tcp-request
content" rules. There is no reason for this limitation and it may be a
problem for anyone wanting to call a lua fetch to dynamically set source
IP/Port from a TCP proxy. Indeed, to call a lua fetch, we must have a
stream. And there is no stream when "tcp-request connection/session" rules
are evaluated.
Thanks to this patch, "set-src" and "set-src-port" action are now supported
by "tcp_request content" rules.
This patch is related to the issue #1303. It may be backported to all stable
versions.
Willy Tarreau [Tue, 22 Jun 2021 15:31:51 +0000 (17:31 +0200)]
CLEANUP: backend: remove impossible case of round-robin + consistent hash
In 1.4, consistent hashing was brought by commit 6b2e11be1 ("[MEDIUM]
backend: implement consistent hashing variation") which took care of
replacing all direct calls to map_get_server_rr() with an alternate
call to chash_get_next_server() if consistent hash was being used.
One of them, however, cannot happen because a preliminary test for
static round-robin is being done prior to the call, so we're certain
that if it matches it cannot use a consistent hash tree.
Willy Tarreau [Fri, 18 Jun 2021 18:32:50 +0000 (20:32 +0200)]
MINOR: queue: make pendconn_first() take the lock by itself
Dealing with the queue lock in the caller remains complicated. Let's
change pendconn_first() to take the queue instead of the tree head,
and handle the lock itself. It now returns an element with a locked
queue or no element with an unlocked queue. It can avoid locking if
the queue is already empty.
Willy Tarreau [Fri, 18 Jun 2021 18:12:11 +0000 (20:12 +0200)]
MEDIUM: queue: unlock as soon as possible
There's no point keeping the server's queue lock after seeing that the
server's queue is empty, just like there's no need to keep the proxy's
lock when its queue is empty. This patch checks for emptiness and
releases these locks as soon as possible.
With this the performance increased from 524k to 530k on 16 threads
with round-robin.
Willy Tarreau [Fri, 18 Jun 2021 17:57:34 +0000 (19:57 +0200)]
MEDIUM: queue: move the queue lock manipulation to pendconn_process_next_strm()
By placing the lock there, it becomes possible to lock the proxy
later and to unlock it earlier. The server unlocking also happens slightly
earlier.
The performance on roundrobin increases from 481k to 524k req/s on 16
threads. Leastconn shows about 513k req/s (the difference being the
take_conn() call).
The performance profile changes from this:
9.32% hap-pxok [.] process_srv_queue
7.56% hap-pxok [.] pendconn_dequeue
6.90% hap-pxok [.] pendconn_add
Willy Tarreau [Fri, 18 Jun 2021 17:45:17 +0000 (19:45 +0200)]
MEDIUM: queue: determine in process_srv_queue() if the proxy is usable
By doing so we can move some evaluations outside of the lock and the
loop. In the round robin case, the performance increases from 497k to
505k rps on 16 threads with 100 servers.
Willy Tarreau [Fri, 18 Jun 2021 08:51:58 +0000 (10:51 +0200)]
MINOR: queue: use atomic-ops to update the queue's index
Doing so allows to retrieve and update the pendconn's queue index outside
of the queue's lock and to save one more percent CPU on a highly-contented
backend.
Willy Tarreau [Fri, 18 Jun 2021 08:21:20 +0000 (10:21 +0200)]
MINOR: queue: factor out the proxy/server queuing code
The code only differed by the nbpend_max counter. Let's have a pointer
to it and merge the two variants to always use a generic queue. It was
initially considered to put the max inside the queue structure itself,
but the stats support clearing values and maxes and this would have been
the only counter having to be handled separately there. Given that we
don't need this max anywhere outside stats, let's keep it where it is
and have a pointer to it instead.
The CAS loop to update the max remains. It was naively thought that it
would have been faster without atomic ops inside the lock, but this is
not the case for the simple reason that it is a max, it converges very
quickly and never has to perform the check anymore. Thus this code is
better out of the lock.
The queue_idx is still updated inside the lock since that's where the
idx is updated, though it could be performed using atomic ops given
that it's only used to roughly count places for logging.
Willy Tarreau [Tue, 22 Jun 2021 16:47:51 +0000 (18:47 +0200)]
MEDIUM: queue: simplify again the process_srv_queue() API
This basically undoes the API changes that were performed by commit 0274286dd ("BUG/MAJOR: server: fix deadlock when changing maxconn via
agent-check") to address the deadlock issue: since process_srv_queue()
doesn't use the server lock anymore, it doesn't need the "server_locked"
argument, so let's get rid of it before it gets used again.
Willy Tarreau [Fri, 18 Jun 2021 07:45:27 +0000 (09:45 +0200)]
MEDIUM: queue: use a dedicated lock for the queues
Till now whenever a server or proxy's queue was touched, this server
or proxy's lock was taken. Not only this requires distinct code paths,
but it also causes unnecessary contention with other uses of these locks.
This patch adds a lock inside the "queue" structure that will be used
the same way by the server and the proxy queuing code. The server used
to use a spinlock and the proxy an rwlock, though the queue only used
it for locked writes. This new version uses a spinlock since we don't
need the read lock part here. Tests have not shown any benefit nor cost
in using this one versus the rwlock so we could change later if needed.
The lower contention on the locks increases the performance from 491k
to 507k req/s on 16 threads with 20 servers and leastconn. The gain
with roundrobin even increases by 6%.
At this point the differences are mostly measurement noise.
This is tagged medium because the lock is changed, but no other part of
the code touches the queues, with nor without locking, so this should
remain invisible.
Willy Tarreau [Fri, 18 Jun 2021 07:21:22 +0000 (09:21 +0200)]
MINOR: queue: create a new structure type "queue"
This structure will be common to proxies and servers and will contain
everything needed to handle their respective queues. For now it's only
a tree head, a length and an index.
Willy Tarreau [Fri, 18 Jun 2021 16:29:25 +0000 (18:29 +0200)]
MINOR: lb/api: remove the locked argument from take_conn/drop_conn
This essentially reverts commit 2b4370078 ("MINOR: lb/api: let callers
of take_conn/drop_conn tell if they have the lock") that was merged
during 2.4 before the various locks could be eliminated at the lower
layers. Passing that information complicates the cleanup of the queuing
code and it's become useless.
Willy Tarreau [Fri, 18 Jun 2021 17:08:23 +0000 (19:08 +0200)]
MEDIUM: queue: refine the locking in process_srv_queue()
The lock in process_srv_queue() was placed around the whole loop to
avoid the cost of taking/releasing it multiple times. But in practice
almost all calls to this function only dequeue a single connection, so
that argument doesn't really stand. However by placing the lock inside
the loop, we'd make it possible to release it before manipulating the
pendconn and waking the task up. That's what this patch does.
This increases the performance from 431k to 491k req/s on 16 threads
with 20 servers under leastconn.
Willy Tarreau [Fri, 18 Jun 2021 16:48:06 +0000 (18:48 +0200)]
MEDIUM: queue: make pendconn_process_next_strm() only return the pendconn
It used to do far too much under the lock, including waking up tasks,
updating counters and repositionning entries in the load balancing algo.
This patch first moves all that stuff out of the function into the only
caller (process_srv_queue()). The decision to update the LB algo is now
taken out of the lock. The wakeups could be performed outside of the
loop by using a local list.
This increases the performance from 377k to 431k req/s on 16 threads
with 20 servers under leastconn.
Amaury Denoyelle [Tue, 22 Jun 2021 14:23:11 +0000 (16:23 +0200)]
REGTESTS: fix maxconn update with agent-check
Correct the typo in the parameter used to update the 'maxconn' via
agent-check. The test is also completed to detect the update of maxconn
using CLI 'show stats'.
Amaury Denoyelle [Fri, 18 Jun 2021 09:11:36 +0000 (11:11 +0200)]
BUG/MAJOR: server: fix deadlock when changing maxconn via agent-check
The server_parse_maxconn_change_request locks the server lock. However,
this function can be called via agent-checks or lua code which already
lock it. This bug has been introduced by the following commit :
This commit tried to fix another deadlock with can occur because
previoulsy server_parse_maxconn_change_request requires the server lock
to be held. However, it may call internally process_srv_queue which also
locks the server lock. The locking policy has thus been updated. The fix
is functional for the CLI 'set maxconn' but fails to address the
agent-check / lua counterparts.
This new issue is fixed in two steps :
- changes from the above commit have been reverted. This means that
server_parse_maxconn_change_request must again be called with the
server lock.
- to counter the deadlock fixed by the above commit, process_srv_queue
now takes an argument to render the server locking optional if the
caller already held it. This is only used by
server_parse_maxconn_change_request.
The above commit was subject to backport up to 1.8. Thus this commit
must be backported in every release where it is already present.
Tim Duesterhus [Sat, 19 Jun 2021 14:56:30 +0000 (16:56 +0200)]
CLEANUP: Prevent channel-t.h from being detected as C++ by GitHub
GitHub uses github/linguist to determine the programming language used for each
source file to show statistics and to power the search. In cases of unique file
extensions this is easy, but for `.h` files the situation is less clear as they
are used for C, C++, Objective C and more. In these cases linguist makes use of
heuristics to determine the language.
One of these heuristics for C++ is that the file contains a line beginning with
`try`, only preceded by whitespace indentation. This heuristic matches the long
comment at the bottom of `channel-t.h`, as one sentence includes the word `try`
after a linebreak.
Fix this misdetection by changing the comment to follow the convention that all
lines start with an asterisk.
Willy Tarreau [Fri, 18 Jun 2021 08:33:47 +0000 (10:33 +0200)]
MINOR: queue: update the stream's pend_pos before queuing it
Since commit c7eedf7a5 ("MINOR: queue: reduce the locked area in
pendconn_add()") the stream's pend_pos is set out of the lock, after
the pendconn is queued. While this entry is only manipulated by the
stream itself and there is no bug caused by this right now, it's a
bit dangerous because another thread could decide to look at this
field during dequeuing and could randomly see something else. Also
in case of crashes, memory inspection wouldn't be as trustable.
Let's assign the pendconn before it can be found in the queue.
Amaury Denoyelle [Fri, 18 Jun 2021 14:30:36 +0000 (16:30 +0200)]
REGTESTS: server: test ssl support for dynamic servers
Create a new regtest to test SSL support for dynamic servers.
The first step of the test is to create the ca-file via the CLI. Then a
dynamic server is created with the ssl option using the ca-file. A
client request is made through it to achieve the test.
Amaury Denoyelle [Wed, 19 May 2021 07:49:41 +0000 (09:49 +0200)]
MINOR: ssl: support ssl keyword for dynamic servers
Activate the 'ssl' keyword for dynamic servers. This is the final step
to have ssl dynamic servers feature implemented. If activated,
ssl_sock_prepare_srv_ctx will be called at the end of the 'add server'
CLI handler.
At the same time, update the management doc to list all ssl keywords
implemented for dynamic servers.
Amaury Denoyelle [Mon, 14 Jun 2021 08:10:32 +0000 (10:10 +0200)]
MINOR: ssl: support crl arg for dynamic servers
File-access through ssl_store_load_locations_file is deactivated if
srv_parse_crl is used at runtime for a dynamic server. The crl must
have already been loaded either in the config or through the 'ssl crl'
CLI commands.
Amaury Denoyelle [Fri, 21 May 2021 14:22:53 +0000 (16:22 +0200)]
MINOR: ssl: support crt arg for dynamic servers
File-access through ssl_store_load_locations_file is deactivated if
srv_parse_crt is used at runtime for a dynamic server. The cert must
have already been loaded either in the config or through the 'ssl cert'
CLI commands.
Amaury Denoyelle [Wed, 19 May 2021 07:46:59 +0000 (09:46 +0200)]
MINOR: ssl: support ca-file arg for dynamic servers
File-access through ssl_store_load_locations_file is deactivated if
srv_parse_ca_file is used at runtime for a dynamic server. The ca-file
must have already been loaded either in the config or through the 'ssl
ca-file' CLI commands.
Amaury Denoyelle [Fri, 21 May 2021 14:22:11 +0000 (16:22 +0200)]
MINOR: ssl: render file-access optional on server crt loading
The function ssl_sock_load_srv_cert will be used at runtime for dynamic
servers. If the cert is not loaded on ckch tree, we try to access it
from the file-system.
Now this access operation is rendered optional by a new function
argument. It is only allowed at parsing time, but will be disabled for
dynamic servers at runtime.
Amaury Denoyelle [Wed, 19 May 2021 13:00:54 +0000 (15:00 +0200)]
MINOR: server: disable CLI 'set server ssl' for dynamic servers
'set server ssl' uses ssl parameters from default-server. As dynamic
servers does not reuse any default-server parameters, this command has
no sense for them.
Amaury Denoyelle [Wed, 19 May 2021 13:35:29 +0000 (15:35 +0200)]
MINOR: ssl: always initialize random generator
Explicitly call ssl_initialize_random to initialize the random generator
in init() global function. If the initialization fails, the startup is
interrupted.
This commit is in preparation for support of ssl on dynamic servers. To
be able to activate ssl on dynamic servers, it is necessary to ensure
that the random generator is initialized on startup regardless of the
config. It cannot be called at runtime as access to /dev/urandom is
required.
This also has the effect to fix the previous non-consistent behavior.
Indeed, if bind or server in the config are using ssl, the
initialization function was called, and if it failed, the startup was
interrupted. Otherwise, the ssl initialization code could have been
called through the ssl server for lua, but this times without blocking
the startup on error. Or not called at all if lua was deactivated.
RFC 7231#5.3.4 makes a difference between a completely missing
'accept-encoding' header and an 'accept-encoding' header without any values.
This case was already correctly handled by accident, because an empty accept
encoding does not match any known encoding. However this resulted in the
'other' encoding being added to the bitmap. Usually this also succeeds in
serving cached responses, because the cached response likely has no
'content-encoding', thus matching the identity case instead of not serving the
response, due to the 'other' encoding. But it's technically not 100% correct.
Fix this by special-casing 'accept-encoding' values with a length of zero and
extend the test to check that an empty accept-encoding is correctly handled.
Due to the reasons given above the test also passes without the change in
cache.c.
Vary support was added in HAProxy 2.4. This fix should be backported to 2.4+.
BUG/MINOR: server/cli: Fix locking in function processing "set server" command
The commit c7b391aed ("BUG/MEDIUM: server/cli: Fix ABBA deadlock when fqdn
is set from the CLI") introduced 2 bugs. The first one is a typo on the
server's lock label (s/SERVER_UNLOCK/SERVER_LOCK/). The second one is about
the server's lock itself. It must be acquired to execute the "agent-send"
subcommand.
The patch above is marked to be backported as far as 1.8. Thus, this one
must also backported as far 1.8.
BUG/MINOR: server/cli: Don't forget to lock server on agent-send subcommand
BUG/MINOR: resolvers: Use resolver's lock in resolv_srvrq_expire_task()
The commit dcac41806 ("BUG/MEDIUM: resolvers: Add a task on servers to check
SRV resolution status") introduced a type. In resolv_srvrq_expire_task()
function, the resolver's lock must be used instead of the resolver itself.
This patch must be backported with the patch above (at least as far as 2.2).
BUG/MINOR: backend: do not set sni on connection reuse
When reusing a backend connection, do not reapply the SNI on the
connection. It should already be defined when the connection was
instantiated on a previous connect_server invocation. As the SNI is a
parameter used to select a connection, only connection with same value
can be reused.
The impact of this bug is unknown and may be null. No memory leak has
been reported by valgrind. So this is more a cleaning fix.
This commit relies on the SF_SRV_REUSED flag and thus depends on the
following fix :
BUG/MINOR: backend: restore the SF_SRV_REUSED flag original purpose
Amaury Denoyelle [Thu, 17 Jun 2021 13:14:49 +0000 (15:14 +0200)]
BUG/MINOR: backend: restore the SF_SRV_REUSED flag original purpose
The SF_SRV_REUSED flag was set if a stream reused a backend connection.
One of its purpose is to count the total reuse on the backend in
opposition to newly instantiated connection.
However, the flag was diverted from its original purpose since the
following commit :
With this change, the flag is not set anymore if the mux is not ready
when a connection is picked for reuse. This can happen for multiplexed
connections which are inserted in the available list as soon as created
in http-reuse always mode. The goal of this change is to not retry
immediately this request in case on an error on the same server if the
reused connection is not fully ready.
This change is justified for the retry timeout handling but it breaks
other places which still uses the flag for its original purpose. Mainly,
in this case the wrong 'connect' backend counter is incremented instead
of the 'reuse' one. The flag is also used in http_return_srv_error and
may have an impact if a http server error is replied for this stream.
To fix this problem, the original purpose of the flag is restored by
setting it unconditionaly when a connection is reused. Additionally, a
new flag SF_SRV_REUSED_ANTICIPATED is created. This flag is set when the
connection is reused but the mux is not ready yet. For the timeout
handling on error, the request is retried immediately only if the stream
reused a connection without this newly anticipated flag.
BUG/MEDIUM: resolvers: Add a task on servers to check SRV resolution status
When a server relies on a SRV resolution, a task is created to clean it up
(fqdn/port and address) when the SRV resolution is considered as outdated
(based on the resolvers 'timeout' value). It is only possible if the server
inherits outdated info from a state file and is no longer selected to be
attached to a SRV item. Note that most of time, a server is attached to a
SRV item. Thus when the item becomes obsolete, the server is cleaned
up.
It is important to have such task to be sure the server will be free again
to have a chance to be resolved again with fresh information. Of course,
this patch is a workaround to solve a design issue. But there is no other
obvious way to fix it without rewritting all the resolvers part. And it must
be backportable.
This patch relies on following commits:
* MINOR: resolvers: Clean server in a dedicated function when removing a SRV item
* MINOR: resolvers: Remove server from named_servers tree when removing a SRV item
All the series must be backported as far as 2.2 after some observation
period. Backports to 2.0 and 1.8 must be evaluated.
MINOR: resolvers: Remove server from named_servers tree when removing a SRV item
When a server is cleaned up because the corresponding SRV item is removed,
we always remove the server from the srvrq's name_servers tree. For now, it
is useless because, if a server was attached to a SRV item, it means it was
already removed from the tree. But it will be mandatory to fix a bug.
MINOR: resolvers: Clean server in a dedicated function when removing a SRV item
A dedicated function is now used to clean up servers when a SRV item becomes
obsolete or when a requester is removed from a resolution. This patch is
mandatory to fix a bug.