BUILD: debug: Add braces to if statement calling only CHECK_IF()
In src/ev_epoll.c, a CHECK_IF() is guarded by an if statement. So, when the
macro is empty, GCC (at least 11.3.1) is not happy because there is an if
statement with an empty body without braces... It is handled by
"-Wempty-body" option.
BUG/MINOR: quic: do not send CONNECTION_CLOSE_APP in initial/handshake
As specified by RFC 9000, it is forbidden to send a CONNECTION_CLOSE of
type 0x1d (CONNECTION_CLOSE_APP) in an Initial or Handshake packet. It
must be converted to type 0x1c (CONNECTION_CLOSE) with APPLICATION_ERROR
code.
CONNECTION_CLOSE_APP are generated by QUIC MUX interaction. Thus,
special care must be taken when dealing with a 0-RTT packet, as this is
the only case where the MUX can be instantiated and quic-conn still on
the Initial or Handshake encryption level.
To enforce RFC 9000, xprt build packet function is now responsible to
translate a CONNECTION_CLOSE_APP if still on Initial/Handshake
encryption. This process is done in a dedicated function named
qc_build_cc_frm().
Without this patch, BUG_ON() statement in qc_build_frm() will be
triggered when building a CONNECTION_CLOSE_APP frame on Initial or
Handshake level. This is because QUIC_FT_CONNECTION_CLOSE_APP frame
builder mask does not allow these encryption levels, as opposed to
QUIC_FT_CONNECTION_CLOSE builder. This crash was reproduced by modifying
the H3 layer to force emission of a CONNECTION_CLOSE_APP on first frame
of a 0-RTT session.
Note however that CONNECTION_CLOSE emission during Handshake is a
complicated process for the server. For the moment, this is still
incomplete on haproxy side. RFC 9000 requires to emit it multiple times
in several packets under different encryption levels, depending on what
we know about the client encryption context.
BUG/MINOR: tools: fix statistical_prng_range()'s output range
This function was added by commit 84ebfabf7 ("MINOR: tools: add
statistical_prng_range() to get a random number over a range") but it
contains a bug on the range, since mul32hi() covers the whole input
range, we must pass it range-1. For now it didn't have any impact, but
if used to find an array's index it will cause trouble.
BUG/MINOR: resolvers: shut off the warning for the default resolvers
When the resolv.conf file is empty or there is no resolv.conf file, an
empty resolvers will be created, which emits a warning during the
postparsing step.
This patch fixes the problem by freeing the resolvers section if the
parsing failed or if the nameserver list is empty.
Must be backported in 2.6, the previous patch which introduces
resolvers_destroy() is also required.
BUG/MEDIUM: tools: avoid calling dlsym() in static builds (try 2)
The first approach in commit 288dc1d8e ("BUG/MEDIUM: tools: avoid calling
dlsym() in static builds") relied on dlopen() but on certain configs (at
least gcc-4.8+ld-2.27+glibc-2.17) it used to catch situations where it
ought not fail.
Let's have a second try on this using dladdr() instead. The variable was
renamed "build_is_static" as it's exactly what's being detected there.
We could even take it for reporting in -vv though that doesn't seem very
useful. At least the variable was made global to ease inspection via the
debugger, or in case it's useful later.
Now it properly detects a static build even with gcc-4.4+glibc-2.11.1 and
doesn't crash anymore.
Brad Smith [Sat, 16 Jul 2022 09:18:50 +0000 (05:18 -0400)]
BUILD: makefile: Fix install(1) handling for OpenBSD/NetBSD/Solaris/AIX
Add a new INSTALL variable to allow overridiing the flags passed to
install(1). install(1) on OpenBSD/NetBSD/Solaris/AIX does not support
the -v flag. With the new INSTALL variable and handling only use the
-v flag with the Linux targets.
Released version 2.7-dev2 with the following main changes :
- BUG/MINOR: qpack: fix build with QPACK_DEBUG
- MINOR: h3: handle errors on HEADERS parsing/QPACK decoding
- BUG/MINOR: qpack: abort on dynamic index field line decoding
- MINOR: qpack: properly handle invalid dynamic table references
- MINOR: task: Add tasklet_wakeup_after()
- BUG/MINOR: quic: Dropped packets not counted (with RX buffers full)
- MINOR: quic: Add new stats counter to diagnose RX buffer overrun
- MINOR: quic: Duplicated QUIC_RX_BUFSZ definition
- MINOR: quic: Improvements for the datagrams receipt
- CLEANUP: h2: Typo fix in h2_unsubcribe() traces
- MINOR: quic: Increase the QUIC connections RX buffer size (upto 64Kb)
- CLEANUP: mux-quic: adjust comment on qcs_consume()
- MINOR: ncbuf: implement ncb_is_fragmented()
- BUG/MINOR: mux-quic: do not signal FIN if gap in buffer
- MINOR: fd: add a new FD_DISOWN flag to prevent from closing a deleted FD
- BUG/MEDIUM: ssl/fd: unexpected fd close using async engine
- MINOR: tinfo: make tid temporarily still reflect global ID
- CLEANUP: config: remove unused proc_mask()
- MINOR: debug: remove mask support from "debug dev sched"
- MEDIUM: task: add and preset a thread ID in the task struct
- MEDIUM: task/debug: move the ->thread_mask integrity checks to ->tid
- MAJOR: task: use t->tid instead of ffsl(t->thread_mask) to take the thread ID
- MAJOR: task: replace t->thread_mask with 1<<t->tid when thread mask is needed
- CLEANUP: task: remove thread_mask from the struct task
- MEDIUM: applet: only keep appctx_new_*() and drop appctx_new()
- MEDIUM: task: only keep task_new_*() and drop task_new()
- MINOR: applet: always use task_new_on() on applet creation
- MEDIUM: task: remove TASK_SHARED_WQ and only use t->tid
- MINOR: task: replace task_set_affinity() with task_set_thread()
- CLEANUP: task: remove the unused task_unlink_rq()
- CLEANUP: task: remove the now unused TASK_GLOBAL flag
- MINOR: task: make rqueue_ticks atomic
- MEDIUM: task: move the shared runqueue to one per thread
- MEDIUM: task: replace the global rq_lock with a per-rq one
- MINOR: task: remove grq_total and use rq_total instead
- MINOR: task: replace global_tasks_mask with a check for tree's emptiness
- MEDIUM: task: use regular eb32 trees for the run queues
- MEDIUM: queue: revert to regular inter-task wakeups
- MINOR: thread: make wake_thread() take care of the sleeping threads mask
- MINOR: thread: move the flags to the shared cache line
- MINOR: thread: only use atomic ops to touch the flags
- MINOR: poller: centralize poll return handling
- MEDIUM: polling: make update_fd_polling() not care about sleeping threads
- MINOR: poller: update_fd_polling: wake a random other thread
- MEDIUM: thread: add a new per-thread flag TH_FL_NOTIFIED to remember wakeups
- MEDIUM: tasks/fd: replace sleeping_thread_mask with a TH_FL_SLEEPING flag
- MINOR: tinfo: add the tgid to the thread_info struct
- MINOR: tinfo: replace the tgid with tgid_bit in tgroup_info
- MINOR: tinfo: add the mask of enabled threads in each group
- MINOR: debug: use ltid_bit in ha_thread_dump()
- MINOR: wdt: use ltid_bit in wdt_handler()
- MINOR: clock: use ltid_bit in clock_report_idle()
- MINOR: thread: use ltid_bit in ha_tkillall()
- MINOR: thread: add a new all_tgroups_mask variable to know about active tgroups
- CLEANUP: thread: remove thread_sync_release() and thread_sync_mask
- MEDIUM: tinfo: add a dynamic thread-group context
- MEDIUM: thread: make stopping_threads per-group and add stopping_tgroups
- MAJOR: threads: change thread_isolate to support inter-group synchronization
- MINOR: thread: add is_thread_harmless() to know if a thread already is harmless
- MINOR: debug: mark oneself harmless while waiting for threads to finish
- MINOR: wdt: do not rely on threads_to_dump anymore
- MEDIUM: debug: make the thread dumper not rely on a thread mask anymore
- BUILD: debug: fix build issue on clang with previous commit
- BUILD: debug: re-export thread_dump_state
- BUG/MEDIUM: threads: fix incorrect thread group being used on soft-stop
- BUG/MEDIUM: thread: check stopping thread against local bit and not global one
- MINOR: proxy: use tg->threads_enabled in hard_stop() to detect stopped threads
- BUILD: Makefile: Add Lua 5.4 autodetect
- CI: re-enable gcc asan builds
- MEDIUM: mworker: set the iocb of the socketpair without using fd_insert()
- MINOR: fd: Add BUG_ON checks on fd_insert()
- CLEANUP: mworker: rename mworker_pipe to mworker_sockpair
- CLEANUP: mux-quic: do not export qc_get_ncbuf
- REORG: mux-quic: reorganize flow-control fields
- MINOR: mux-quic: implement accessor for sedesc
- MEDIUM: mux-quic: refactor streams opening
- MINOR: mux-quic: rename qcs flag FIN_RECV to SIZE_KNOWN
- MINOR: mux-quic: emit FINAL_SIZE_ERROR on invalid STREAM size
- BUG/MINOR: peers/config: always fill the bind_conf's argument
- BUG/MEDIUM: peers/config: properly set the thread mask
- CLEANUP: bwlim: Set pointers to NULL when memory is released
- BUG/MINOR: http-check: Preserve headers if not redefined by an implicit rule
- BUG/MINOR: http-act: Properly generate 103 responses when several rules are used
- BUG/MEDIUM: thread: mask stopping_threads with threads_enabled when checking it
- CLEANUP: thread: also remove a thread's bit from stopping_threads on stop
- BUG/MINOR: peers: fix possible NULL dereferences at config parsing
- BUG/MINOR: http-htx: Fix scheme based normalization for URIs wih userinfo
- MINOR: http: Add function to get port part of a host
- MINOR: http: Add function to detect default port
- BUG/MEDIUM: h1: Improve authority validation for CONNCET request
- MINOR: http-htx: Use new HTTP functions for the scheme based normalization
- BUG/MEDIUM: http-fetch: Don't fetch the method if there is no stream
- REGTEESTS: filters: Fix CONNECT request in random-forwarding script
- MEDIUM: mworker/systemd: send STATUS over sd_notify
- BUG/MINOR: mux-h1: Be sure to commit htx changes in the demux buffer
- BUG/MEDIUM: http-ana: Don't wait to have an empty buf to switch in TUNNEL state
- BUG/MEDIUM: mux-h1: Handle connection error after a synchronous send
- MEDIUM: epoll: don't synchronously delete migrated FDs
- BUILD: debug: silence warning on gcc-5
- BUILD: http: silence an uninitialized warning affecting gcc-5
- BUG/MEDIUM: mux-quic: fix server chunked encoding response
- REORG: mux-quic: rename stream initialization function
- MINOR: mux-quic: rename stream purge function
- MINOR: mux-quic: add traces on frame parsing functions
- MINOR: mux-quic: implement qcs_alert()
- MINOR: mux-quic: filter send/receive-only streams on frame parsing
- MINOR: mux-quic: do not ack STREAM frames on unrecoverable error
- MINOR: mux-quic: support stream opening via MAX_STREAM_DATA
- MINOR: mux-quic: define basic stream states
- MINOR: mux-quic: use stream states to mark as detached
- MEDIUM: mux-quic: implement RESET_STREAM emission
- MEDIUM: mux-quic: implement STOP_SENDING handling
- BUG/MEDIUM: debug: fix possible hang when multiple threads dump at once
- BUG/MINOR: quic: fix closing state on NO_ERROR code sent
- CLEANUP: quic: clean up include on quic_frame-t.h
- MINOR: quic: define a generic QUIC error type
- MINOR: mux-quic: support app graceful shutdown
- MINOR: mux-quic/h3: prepare CONNECTION_CLOSE on release
- MEDIUM: quic: send CONNECTION_CLOSE on released MUX
- CLEANUP: mux-quic: move qc_release()
- MINOR: mux-quic: send one last time before release
- MINOR: h3: store control stream in h3c
- MINOR: h3: implement graceful shutdown with GOAWAY
- BUG/MINOR: threads: produce correct global mask for tgroup > 1
- BUG/MEDIUM: cli/threads: make "show threads" more robust on applets
- BUG/MINOR: thread: use the correct thread's group in ha_tkillall()
- BUG/MINOR: debug: enter ha_panic() only once
- BUG/MEDIUM: debug: fix parallel thread dumps again
- MINOR: cli/streams: show a stream's tgid next to its thread ID
- DEBUG: cli: add a new "debug dev deadlock" expert command
- MINOR: cli/activity: add a thread number argument to "show activity"
- CLEANUP: applet: remove the obsolete command context from the appctx
- MEDIUM: config: remove deprecated "bind-process" directives from frontends
- MEDIUM: config: remove the "process" keyword on "bind" lines
- MINOR: listener/config: make "thread" always support up to LONGBITS
- CLEANUP: fd: get rid of the __GET_{NEXT,PREV} macros
- MEDIUM: debug/threads: make the lock debugging take tgroups into account
- MEDIUM: proto: stop protocols under thread isolation during soft stop
- MEDIUM: poller: program the update in fd_update_events() for a migrated FD
- MEDIUM: poller: disable thread-groups for poll() and select()
- MINOR: thread: remove MAX_THREADS limitation
- MEDIUM: cpu-map: replace the process number with the thread group number
- MINOR: mworker/threads: limit the mworker sockets to group 1
- MINOR: cli/threads: always bind CLI to thread group 1
- MINOR: fd/thread: get rid of thread_mask()
- MEDIUM: task/thread: move the task shared wait queues per thread group
- MINOR: task: move the niced_tasks counter to the thread group context
- DOC: design: add some thoughts about how to handle the update_list
- MEDIUM: conn: make conn_backend_get always scan the same group
- MAJOR: fd: remove pending updates upon real close
- MEDIUM: fd/poller: make the update-list per-group
- MINOR: fd: delete unused updates on close()
- MINOR: fd: make fd_insert() apply the thread mask itself
- MEDIUM: fd: add the tgid to the fd and pass it to fd_insert()
- MINOR: cli/fd: show fd's tgid and refcount in "show fd"
- MINOR: fd: add functions to manipulate the FD's tgid
- MINOR: fd: add fd_get_running() to atomically return the running mask
- MAJOR: fd: grab the tgid before manipulating running
- MEDIUM: fd/poller: turn polled_mask to group-local IDs
- MEDIUM: fd/poller: turn update_mask to group-local IDs
- MEDIUM: fd/poller: turn running_mask to group-local IDs
- MINOR: fd: make fd_clr_running() return the previous value instead
- MEDIUM: fd: make thread_mask now represent group-local IDs
- MEDIUM: fd: make fd_insert() take local thread masks
- MEDIUM: fd: make fd_insert/fd_delete atomically update fd.tgid
- MEDIUM: fd: quit fd_update_events() when FD is closed
- MEDIUM: thread: change thread_resolve_group_mask() to return group-local values
- MEDIUM: listener: switch bind_thread from global to group-local
- MINOR: fd: add fd_reregister_all() to deal with boot-time FDs
- MEDIUM: fd: support stopping FDs during starting
- MAJOR: pollers: rely on fd_reregister_all() at boot time
- MAJOR: poller: only touch/inspect the update_mask under tgid protection
- MEDIUM: fd: support broadcasting updates for foreign groups in updt_fd_polling
- CLEANUP: threads: remove the now unused all_threads_mask and tid_bit
- MINOR: config: change default MAX_TGROUPS to 16
- BUG/MEDIUM: tools: avoid calling dlsym() in static builds
BUG/MEDIUM: tools: avoid calling dlsym() in static builds
Since 2.4 with commit 64192392c ("MINOR: tools: add functions to retrieve
the address of a symbol"), we can resolve symbols. However some old glibc
crash in dlsym() when the program is statically built.
Fortunately even on these old libs we can detect lack of support by
calling dlopen(NULL). Normally it returns a handle to the current
program, but on a static build it returns NULL. This is sufficient to
refrain from calling dlsym() (which will be of very limited use anyway),
so we check this once at boot and use the result when needed.
This may be backported to 2.4. On stable versions, be careful to place
the init code inside an if/endif guard that checks for DL support.
This will allows nbtgroups > 1 to be declared in the config without
recompiling. The theoretical limit is 64, though we'd rather not push
it too far for now as some structures might be enlarged to be indexed
per group. Let's start with 16 groups max, allowing to experiment with
dual-socket machines suffering from up to 8 loosely coupled L3 caches.
It's a good start and doesn't engage us too far.
CLEANUP: threads: remove the now unused all_threads_mask and tid_bit
Since these are not used anymore, let's now remove them. Given the
number of places where we're using ti->ldit_bit, maybe an equivalent
might be useful though.
MEDIUM: fd: support broadcasting updates for foreign groups in updt_fd_polling
We're still facing the situation where it's impossible to update an FD
for a foreign group. That's of particular concern when disabling/enabling
listeners (e.g. pause/resume on signals) since we don't decide which thread
gets the signal and it needs to process all listeners at once.
Fortunately, not that much is unprotected in FDs. This patch adds a test for
tgid's equality in updt_fd_polling() so that if a change is applied for a
foreing group, then it's detected and taken care of separately. The method
consists in forcing the update on all bound threads in this group, adding it
to the group's update_list, and sending a wake-up as would be done for a
remote thread in the local group, except that this is done by grabbing a
reference to the FD's tgid.
Thanks to this, SIGTTOU/SIGTTIN now work for nbtgroups > 1 (after that was
temporarily broken by "MEDIUM: fd/poller: make the update-list per-group").
MAJOR: poller: only touch/inspect the update_mask under tgid protection
With thread groups and group-local masks, the update_mask cannot be
touched nor even checked if it may change below us. In order to avoid
this, we have to grab a reference to the FD's tgid before checking the
update mask. The operations are cheap enough so that we don't notice it
in performance tests. This is expected because the risk of meeting a
reassigned FD during an update remains very low.
It's worth noting that the tgid cannot be trusted during startup nor
during soft-stop since that may come from anywhere at the moment. Since
soft-stop runs under thread isolation we use that hint to decide whether
or not to check that the FD's tgid matches the current one.
The modification is applied to the 3 thread-aware pollers, i.e. epoll,
kqueue, and evports. Also one poll_drop counter was missing for shared
updates, though it might be hard to trigger it.
With this change applied, thread groups are usable in benchmarks.
MAJOR: pollers: rely on fd_reregister_all() at boot time
The poller-specific thread init code now uses that new function to
safely register boot events. This ensures that we don't register an
event for another group and that we properly deal with parallel
thread startup.
It's only done for thread-aware pollers, there's no point in using
that in poll/select though that should work as well.
There's a nasty case during boot, which is the master process. It stops
all listeners from the main thread, and as such we're seeing calls to
fd_delete() from a thread that doesn't match the FD's mask, but more
importantly from a group that doesn't match either. Fortunately this
happens in a process that doesn't see the threads creation, so the FDs
are left intact in the table and we can overwrite the tgid there.
The approach is ugly, it probably shows that we should use a dummy
value for the tgid during boot, that would be replaced once the FDs
migrate to their target, but we also need a way to make sure not to
miss them. Also that doesn't solve the possibility of closing a
listener at run time from the wrong thread group.
MINOR: fd: add fd_reregister_all() to deal with boot-time FDs
At boot the pollers are allocated for each thread and they need to
reprogram updates for all FDs they will manage. This code is not
trivial, especially when trying to respect thread groups, so we'd
rather avoid duplicating it.
Let's centralize this into fd.c with this function. It avoids closed
FDs, those whose thread mask doesn't match the requested one or whose
thread group doesn't match the requested one, and performs the update
if required under thread-group protection.
Willy Tarreau [Tue, 28 Jun 2022 06:30:43 +0000 (08:30 +0200)]
MEDIUM: listener: switch bind_thread from global to group-local
It requires to both adapt the parser and change the algorithm to
redispatch incoming traffic so that local threads IDs may always
be used.
The internal structures now only reference thread group IDs and
group-local masks which are compatible with those now used by the
FD layer and the rest of the code.
Willy Tarreau [Tue, 28 Jun 2022 06:27:43 +0000 (08:27 +0200)]
MEDIUM: thread: change thread_resolve_group_mask() to return group-local values
It used to turn group+local to global but now we're doing the exact
opposite as we want to stick to group-local masks. This means that
"thread 3-4" might very well emit what "thread 2/1-2" used to emit
till now for 2 groups and 4 threads. This is needed because we'll
have to support group-local thread masks in receivers.
However the rest of the code (receivers) is not ready yet for this,
so using this code with more than one thread group will definitely
break some bindings.
MEDIUM: fd: quit fd_update_events() when FD is closed
The IOCB might have closed the FD itself, so it's not an error to
have fd.tgid==0 or anything else, nor to have a null running_mask.
In fact there are different conditions under which we can leave the
IOCB, all of them have been enumerated in the code's comments (namely
FD still valid and used, hence has running bit, FD closed but not yet
reassigned thus running==0, FD closed and reassigned, hence different
tgid and running becomes irrelevant, just like all other masks). For
this reason we have no other solution but to try to grab the tgid on
return before checking the other bits. In practice it doesn't represent
a big cost, because if the FD was closed and reassigned, it's instantly
detected and the bit is immediately released without blocking other
threads, and if the FD wasn't closed this doesn't prevent it from
being migrated to another thread. In the worst case a close by another
thread after a migration will be postponed till the moment the running
bit is cleared, which is the same as before.
MEDIUM: fd: make fd_insert/fd_delete atomically update fd.tgid
These functions need to set/reset the FD's tgid but when they're called
there may still be wakeups on other threads that discover late updates
and have to touch the tgid at the same time. As such, it is not possible
to just read/write the tgid there. It must only be done using operations
that are compatible with what other threads may be doing.
As we're using inc/dec on the refcount, it's safe to AND the area to zero
the lower part when resetting the value. However, in order to set the
value, there's no other choice but fd_claim_tgid() which will assign it
only if possible (via a CAS). This is convenient in the end because it
protects the FD's masks from being modified by late threads, so while
we hold this refcount we can safely reset the thread_mask and a few other
elements. A debug test for non-null masks was added to fd_insert() as it
must not be possible to face this situation thanks to the protection
offered by the tgid.
MEDIUM: fd: make fd_insert() take local thread masks
fd_insert() was already given a thread group ID and a global thread mask.
Now we're changing the few callers to take the group-local thread mask
instead. It's passed directly into the FD's thread mask. Just like for
previous commit, it must not change anything when a single group is
configured.
MEDIUM: fd: make thread_mask now represent group-local IDs
With the change that was started on other masks, the thread mask was
still not fully converted, sometimes being used as a global mask and
sometimes as a local one. This finishes the code modifications so that
the mask is always considered as a group-local mask. This doesn't
change anything as long as there's a single group, but is necessary
for groups 2 and above since it's used against running_mask and so on.
MINOR: fd: make fd_clr_running() return the previous value instead
It's an AND so it destroys information and due to this there's a call
place where we have to perform two reads to know the previous value
then to change it. With a fetch-and-and instead, in a single operation
we can know if the bit was previously present, which is more efficient.
MEDIUM: fd/poller: turn running_mask to group-local IDs
From now on, the FD's running_mask only refers to local thread IDs. However,
there remains a limitation, in updt_fd_polling(), we temporarily have to
check and set shared FDs against .thread_mask, which still contains global
ones. As such, nbtgroups > 1 may break (but this is not yet supported without
special build options).
MEDIUM: fd/poller: turn update_mask to group-local IDs
From now on, the FD's update_mask only refers to local thread IDs. However,
there remains a limitation, in updt_fd_polling(), we temporarily have to
check and set shared FDs against .thread_mask, which still contains global
ones. As such, nbtgroups > 1 may break (but this is not yet supported without
special build options).
MEDIUM: fd/poller: turn polled_mask to group-local IDs
This changes the signification of each bit in the polled_mask so that
now each bit represents a local thread ID for the current group instead
of a global thread ID. As such, all tests now apply to ltid_bit instead
of tid_bit.
No particular check was made to verify that the FD's tgid matches the
current one because there should be no case where this is not true. A
check was added in epoll's __fd_clo() to confirm it never differs unless
expected (soft stop under thread isolation, or master in starting mode
going to exec mode), but that doesn't prevent from doing the job: it
only consists in checking in the group's threads those that are still
polling this FD and to remove them.
Some atomic loads were added at the various locations, and most repetitive
references to polled_mask[fd].xx were turned to a local copy instead making
the code much more clear.
MAJOR: fd: grab the tgid before manipulating running
We now grab a reference to the FD's tgid before manipulating the
running_mask so that we're certain it corresponds to our own group
(hence bits), and we drop it once we've set the bit. For now there's
no measurable performance impact in doing this, which is great. The
lock can be observed by perf top as taking a small share of the time
spent in fd_update_events(), itself taking no more than 0.28% of CPU
under 8 threads.
However due to the fact that the thread groups are not yet properly
spread across the pollers and the thread masks are still wrong, this
will trigger some BUG_ON() in fd_insert() after a few tens of thousands
of connections when threads other than those of group 1 are reached,
and this is expected.
MINOR: fd: add fd_get_running() to atomically return the running mask
The running mask is only valid if the tgid is the expected one. This
function takes a reference on the tgid before reading the running mask,
so that both are checked at once. It returns either the mask or zero if
the tgid differs, thus providing a simple way for a caller to check if
it still holds the FD.
MINOR: fd: add functions to manipulate the FD's tgid
The FD's tgid is refcounted and must be atomically manipulated. Function
fd_grab_tgid() will increase the refcount but only if the tgid matches the
one in argument (likely the current one). fd_claim_tgid() will be used to
self-assign the tgid after waiting for its refcount to reach zero.
fd_drop_tgid() will be used to drop a temporarily held tgid. All of these
are needed to prevent an FD from being reassigned to another group, either
when inspecting/modifying the running_mask, or when checking for updates,
in order to be certain that the mask being seen corresponds to the desired
group. Note that once at least one bit is set in the running mask of an
active FD, it cannot be closed, thus not migrated, thus the reference does
not need to be held long.
MEDIUM: fd: add the tgid to the fd and pass it to fd_insert()
The file descriptors will need to know the thread group ID in addition
to the mask. This extends fd_insert() to take the tgid, and will store
it into the FD.
In the FD, the tgid is stored as a combination of tgid on the lower 16
bits and a refcount on the higher 16 bits. This allows to know when it's
really possible to trust the tgid and the running mask. If a refcount is
higher than 1 it indeed indicates another thread else might be in the
process of updating these values.
Since a closed FD must necessarily have a zero refcount, a test was
added to fd_insert() to make sure that it is the case.
MINOR: fd: make fd_insert() apply the thread mask itself
It's a bit ugly to see that half of the callers of fd_insert() have to
apply all_threads_mask themselves to the bit field they're passing,
because usually it comes from a listener that may have other bits set.
Let's make the function apply the mask itself.
After a poller's ->clo() was called to completely terminate operations
on an FD, there's no reason for keeping updates on this FD, so if any
updates were already programmed it would be nice if we could delete them.
Tests show that __fd_clo() is called roughly half of the time with the
last FD from the local update list, which possibly makes sense if a close
has to appear after a polling change resulting from an incomplete read or
the end of a send().
We can detect this and remove the last entry, which gives less work to
do during the update() call, and eliminates most of the poll_drop_fd
event reports.
Note that while tempting, this must not be backported because it's only
safe to be done now that fd_delete_orphan() clears the update mask as
we need to be certain not to miss it:
- if the update mask is kept up with no entry, we can miss future
updates ;
- if the update mask is cleared too fast, it may result in failure
to add a shared event.
The update-list needs to be per-group because its inspection is based
on a mask and we need to be certain when scanning it if a mask is for
the same thread or another one. Once per-group there's no doubt about
it, even if the FD's polling changes, the entry remains valid. It will
be needed to check the tgid though.
Note that a soft-stop or pause/resume might not necessarily work here
with tgroups>1, because the operation might be delivered to a thread
that doesn't belong to the group and whoe update mask will not reflect
one that is interesting here. We can't do better at this stage.
Dealing with long-lasting updates that outlive a close() is always
going to be quite a problem, not because of the thread that will
discover such updates late, but mostly due to the shared update_list
that will have an entry on hold making it difficult to reuse it, and
requiring that the fd's tgid is changed and the update_mask reset
from a safe location.
After careful inspection, it turns out that all our pollers that support
automatic event removal upon close() do not need any extra bookkeeping,
and that poll and select that use an internal representation already
provide a poller->clo() callback that is already used to update the
local event. As such, it is already safe to reset the update mask and
to remove the event from the shared list just before the final close,
because nothing remains to be done with this FD by the poller.
Doing so considerably simplifies the handling of updates, which will
only have to be inspected by the pollers, while the writers can continue
to consider that the entries are always valid. Another benefit is that
it will be possible to reduce contention on the update_list by just
having one update_list per group (left to be done later if needed).
MEDIUM: conn: make conn_backend_get always scan the same group
We don't want to pick idle connections from another thread group,
this would be very slow by forcing to share undesirable data.
This patch makes sure that we start seeking from the current thread
group's threads only and loops over that range exclusively.
It's worth noting that the next_takeover pointer remains per-server
and will bounce when multiple groups use it at the same time. But we
preserve the perturbation by applying a modulo when retrieving it,
so that when groups are of the same size (most common case), the
index will not even change. At this time it doesn't seem worth
storing one index per group in servers, but that might be an option
if any contention is detected later.
MINOR: task: move the niced_tasks counter to the thread group context
This one is only used as a hint to improve scheduling latency, so there
is no more point in keeping it global since each thread group handles
its own run q
MEDIUM: task/thread: move the task shared wait queues per thread group
Their migration was postponed for convenience only but now's time for
having the shared wait queues per thread group and not just per process,
otherwise the WQ lock uses a huge amount of CPU alone.
Since commit d2494e048 ("BUG/MEDIUM: peers/config: properly set the
thread mask") there must not remain any single case of a receiver that
is bound nowhere, so there's no need anymore for thread_mask().
We're adding a test in fd_insert() to make sure this doesn't happen by
accident though, but the function was removed and its rare uses were
replaced with the original value of the bind_thread msak.
MINOR: cli/threads: always bind CLI to thread group 1
When using multiple groups, the stats socket starts to emit errors and
it's not natural to have to touch the global section just to specify
"thread 1/all". Let's pre-attach these sockets to thread group 1. This
will cause errors when trying to change the group but this really is not
a problem for now as thread groups are not enabled by default. This will
make sure configs remain portable and may possibly be relaxed later.
MINOR: mworker/threads: limit the mworker sockets to group 1
As a side effect of commit 34aae2fd1 ("MEDIUM: mworker: set the iocb of
the socketpair without using fd_insert()"), a config may now refuse to
start if there are multiple groups configured because the default bind
mask may span over multiple groups, and it is not possible to force it
to work differently.
Let's just assign thread group 1 to the master<->worker sockets so that
the thread bindings automatically resolve to a single group. The same was
done for the master side of the socket even if it's not used. It will
avoid being forgotten in the future.
MEDIUM: cpu-map: replace the process number with the thread group number
The principle remains the same, but instead of having a single process
and ignoring extra ones, now we set the affinity masks for the respective
threads of all groups.
MEDIUM: poller: disable thread-groups for poll() and select()
These old legacy pollers are not designed for this. They're still
using a shared list of events for all threads, this will not scale at
all, so there's no point in enabling thread-groups there. Modern
systems have epoll, kqueue or event ports and do not need these ones.
We arrange for failing at boot time, only when thread-groups > 1 so
that existing setups will remain unaffected.
If there's a compelling reason for supporting thread groups with these
pollers in the future, the rework should not be too hard, it would just
consume a lot of memory to have an fd_evts[] array per thread, but that
is doable.
MEDIUM: poller: program the update in fd_update_events() for a migrated FD
When an FD is migrated, all pollers program an update. That's useless
code duplication, and when thread groups will be supported, this will
require an extra round of locking just to verify the update_mask on
return. Let's just program the update direction from fd_update_events()
as it already does for closed FDs, this becomes more logical.
MEDIUM: proto: stop protocols under thread isolation during soft stop
protocol_stop_now() is called from do_soft_stop_now() running on any
thread that received the signal. The problem is that it will call some
listener handlers to close the FD, resulting in an fd_delete() being
called from the wrong group. That's not clean and we cannot even rely
on the thread mask to show up.
One interesting long-term approach could be to have kill queues for
FDs, and maybe we'll need them in the long run. However that doesn't
work well for listeners in this situation.
Let's simply isolate ourselves during this instant. We know we'll be
alone dealing with the close and that the FD will be instantly deleted
since not in use by any other thread. It's not the cleanest solution
but it should last long enough without causing trouble.
MEDIUM: debug/threads: make the lock debugging take tgroups into account
Since we have to use masks to verify owners/waiters, we have no other
option but to have them per group. This definitely inflates the size
of the locks, but this is only used for extreme debugging anyway so
that's not dramatic.
Thus as of now, all masks in the lock stats are local bit masks, derived
from ti->ltid_bit. Since at boot ltid_bit might not be set, we just take
care of this situation (since some structs are initialized under look
during boot), and use bit 0 from group 0 only.
CLEANUP: fd: get rid of the __GET_{NEXT,PREV} macros
They were initially made to deal with both the cache and the update list
but there's no cache anymore and keeping them for the update list adds a
lot of obfuscation that is really not desired. Let's get rid of them now.
Their purpose was simply to get a pointer to fdtab[fd].update.{,next,prev}
in order to perform atomic tests and modifications. The offset passed in
argument to the functions (fd_add_to_fd_list() and fd_rm_from_fd_list())
was the offset of the ->update field in fdtab, and as it's not used anymore
it was removed. This also removes a number of casts, though those used by
the atomic ops have to remain since only scalars are supported.
MEDIUM: config: remove the "process" keyword on "bind" lines
It was deprecated, marked for removal in 2.7 and was already emitting a
warning, let's get rid of it. Note that we've kept the keyword detection
to suggest to use "thread" instead.
MEDIUM: config: remove deprecated "bind-process" directives from frontends
This was already causing a deprecation warning and was marked for removal
in 2.7, now it happens. An error message indicates this doesn't exist
anymore.
CLEANUP: applet: remove the obsolete command context from the appctx
The "ctx" and "st2" parts in the appctx were marked for removal in 2.7
and were emulated using memcpy/memset etc for possible external code.
Let's remove this now.
MINOR: cli/activity: add a thread number argument to "show activity"
The output of "show activity" can be so large that the output is visually
unreadable on a screen. Let's add an option to filter on the desired
column (actually the thread number), use "0" to report only the first
column (aggregated/sum/avg), and use "-1", the default, for the normal
detailed dump.
DEBUG: cli: add a new "debug dev deadlock" expert command
This command will create the requested number of tasks competing on a
lock, resulting in triggering the watchdog and crashing the process.
This will help stress the watchdog and inspect the lock debugging parts.
BUG/MEDIUM: debug: fix parallel thread dumps again
The previous attempt to fix thread dumps in commit 672972604 ("BUG/MEDIUM:
debug: fix possible hang when multiple threads dump at once") still had
some shortcomings. Sometimes parallel dumps are jerky essentially due to
the way that threads synchronize on startup and end. In addition the risk
of waiting forever for a stopped thread exists, and panics happening in
parallel to thread dumps are not more reliable either.
This commit revisits the state transitions so that all threads may request
a dump in parallel, that all of them wait for each other in the handler,
and that one thread is responsible for counting every other and checking
that the total matches the number of active threads.
Then for stopping there's a finishing phase that all threads wait for so
that none quits this area too early. Given that we now know the number of
participants to the dump, we can let them each decrement the counter when
leaving so that another dump may only start after the last participant
has completely left.
Now many thread dumps in parallel are running fine, so do panics. No
backport is needed as this was the result of the changes for thread
groups.
Some panic dumps are mangled or truncated due to the watchdog firing at
the same time on multiple threads and calling ha_panic() simultaneously.
What may happen in this case is that the second one waits for the first
one to finish but as soon as it's done the second one resets the buffer
and dumps again, sometimes resetting the first one's dump. Also the first
one's abort() may trigger while the second one is currently dumping,
resulting in a full dump followed by a truncated one, leading to
confusion. Sometimes some lines appear in the middle of a dump as well.
It doesn't happen often and is easier to trigger by causing massive
deadlocks.
There's no reason for the process to resist to a panic, so we can safely
add a counter and no nothing on subsequent calls. Ideally we'd wait there
forever but as this may happen inside a signal handler (e.g. watchdog),
it doesn't always work, so the easiest thing to do is to return so that
the thread is interrupted as soon as possible and brought to the debug
handler to be dumped.
This should be backported, at least to 2.6 and possibly to older versions
as well.
BUG/MINOR: thread: use the correct thread's group in ha_tkillall()
In ha_tkillall(), the current thread's group was used to check for the
thread being running instead of using the target thread's group mask.
Most of the time it would not have any effect unless some groups are
uneven where it can lead to incomplete thread dumps for example.
BUG/MEDIUM: cli/threads: make "show threads" more robust on applets
Running several concurrent "show threads" in loops might occasionally
cause a segfault when trying to retrieve the stream from appctx_sc()
which may be null while the applet is finishing. It's not easy to
reproduce, it requires 3-5 sessions in parallel for about a minute
or so. The appctx_sc must be checked before passing it to sc_strm().
This must be backported to 2.6 which also has the bug.
BUG/MINOR: threads: produce correct global mask for tgroup > 1
In thread_resolve_group_mask(), if a global thread number is passed
and it belongs to a group greater than 1, an incorrect shift resulted
in shifting that ID again which made it appear nowhere or in a wrong
group possibly. The bug was introduced in 2.5 with commit 627def9e5
("MINOR: threads: add a new function to resolve config groups and
masks") though the groups only starts to be usable in 2.7, so there
is no impact for this bug, hence no backport is needed.
Amaury Denoyelle [Mon, 28 Mar 2022 12:53:45 +0000 (14:53 +0200)]
MINOR: h3: implement graceful shutdown with GOAWAY
Implement graceful shutdown as specified in RFC 9114. A GOAWAY frame is
generated with stream ID to indicate range of processed requests.
This process is done via the release app protocol operation. The MUX
is responsible to emit the generated GOAWAY frame after app release. A
CONNECTION_CLOSE will be emitted once there is no unacknowledged STREAM
frames.
Amaury Denoyelle [Mon, 13 Jun 2022 15:09:01 +0000 (17:09 +0200)]
MINOR: mux-quic: send one last time before release
Call qc_send() on qc_release(). This is mostly useful for application
protocol with a connection closing procedure. Most notably, this will be
useful to implement HTTP/3 GOAWAY emission.
This change is purely cosmetic. qc_release() function is moved just
before qc_io_cb(). It's cleaner as it brings it closer where it is used.
More importantly, this will be required to be able to use it in
qc_send() function.
MEDIUM: quic: send CONNECTION_CLOSE on released MUX
Send a CONNECTION_CLOSE if the MUX has been released and all STREAM data
are acknowledged. This is useful to prevent a client from trying to use
a connection which have the upper layer closed.
To implement this a new function qc_check_close_on_released_mux() has
been added. It is called on QUIC MUX release notification and each time
a qc_stream_desc has been released.
This commit is associated with the previous one :
MINOR: mux-quic/h3: schedule CONNECTION_CLOSE on app release
Both patches are required to prevent the risk of browsers stuck on
webpage loading if MUX has been released. On CONNECTION_CLOSE reception,
the client will reopen a new QUIC connection.
MINOR: mux-quic/h3: prepare CONNECTION_CLOSE on release
When MUX is released, a CONNECTION_CLOSE frame should be emitted. This
will ensure that the client does not use anymore a half-dead connection.
App protocol layer is responsible to provide the error code via release
callback. For HTTP/3 NO_ERROR is used as specified in RFC 9114. If no
release callback is provided, generic QUIC NO_ERROR code is used. Note
that a graceful shutdown is used : quic_conn must emit CONNECTION_CLOSE
frame when possible. This will be provided in another patch.
This change should limit the risk of browsers stuck on webpage loading
if MUX has been released. On CONNECTION_CLOSE reception, the client will
reopen a new QUIC connection.
Adjust qcc_emit_cc_app() to allow the delay of emission of a
CONNECTION_CLOSE. This will only set the error code but the quic-conn
layer is not flagged for immediate close. The quic-conn will be
responsible to shut the connection when deemed suitable.
This change will allow to implement application graceful shutdown, such
as HTTP/3 with GOAWAY emission. This will allow to emit closing frames
on MUX release. Once all work is done at the lower layer, the quic-conn
should emit a CONNECTION_CLOSE with the registered error code.
Define a new structure quic_err to abstract a QUIC error type. This
allows to easily differentiate a transport and an application error
code. This simplifies error transmission from QUIC MUX and H3 layers.
This new type is defined in quic_frame module. It is used to replace
<err_code> field in <quic_conn>. QUIC_FL_CONN_APP_ALERT flag is removed
as it is now useless.
Utility functions are defined to be able to quickly instantiate
transport, tls and application errors.
quic_frame-t.h and xprt_quic-t.h include themselves mutually. This may
cause some troubles later.
In fact, xprt_quic does not need to include quic_frame so remove this.
And as quic_frame is a generic source file which is included in multiple
places, it is useful to also remove the xprt_quic include in it. Use
forward declaration for this.
BUG/MEDIUM: debug: fix possible hang when multiple threads dump at once
A bug in the thread dumper was introduced by commit 00c27b50c ("MEDIUM:
debug: make the thread dumper not rely on a thread mask anymore"). If
two or more threads try to trigger a thread dump exactly at the same
time, the second one may loop indefinitely trying to set the value to 1
while the other ones will wait for it to finish dumping before leaving.
This is a consequence of a logic change using thread numbers instead of
a thread mask, as threads do not need to see all other ones there anymore.
Implement support for STOP_SENDING frame parsing. The stream is resetted
as specified by RFC 9000. This will automatically interrupt all future
send operation in qc_send(). A RESET_STREAM will be sent with the code
extracted from the original STOP_SENDING frame.
Implement functions to be able to reset a stream via RESET_STREAM.
If needed, a qcs instance is flagged with QC_SF_TO_RESET to schedule a
stream reset. This will interrupt all future send operations.
On stream emission, if a stream is flagged with QC_SF_TO_RESET, a
RESET_STREAM frame is generated and emitted to the transport layer. If
this operation succeeds, the stream is locally closed. If upper layer is
instantiated, error flag is set on it.
MINOR: mux-quic: use stream states to mark as detached
Adjust condition to detach a qcs instance : if the stream is not locally
close it is not directly free. This should improve stream closing by
ensuring that either FIN or a RESET_STREAM is sent before destroying it.
Implement a basic state machine to represent stream lifecycle. By
default a stream is idle. It is marked as open when sending or receiving
the first data on a stream.
Bidirectional streams has two states to represent the closing on both
receive and send channels. This distinction does not exists for
unidirectional streams which passed automatically from open to close
state.
This patch is mostly internal and has a limited visible impact. Some
behaviors are slightly updated :
* closed streams are garbage collected at the start of io handler
* send operation is interrupted if a stream is close locally
Outside of this, there is no functional change. However, some additional
BUG_ON guards are implemented to ensure that we do not conduct invalid
operation on a stream. This should strengthen the code safety. Also,
stream states are displayed on trace which should help debugging.
MINOR: mux-quic: support stream opening via MAX_STREAM_DATA
MAX_STREAM_DATA can be used as the first frame of a stream. In this
case, the stream should be opened, if it respects flow-control limit. To
implement this, simply replace plain lookup in stream tree by
qcc_get_qcs() at the start of the parsing function. This automatically
takes care of opening the stream if not already done.
As specified by RFC 9000, if MAX_STREAM_DATA is receive for a
receive-only stream, a STREAM_STATE_ERROR connection error is emitted.
MINOR: mux-quic: do not ack STREAM frames on unrecoverable error
Improve return path for qcc_recv() on STREAM parsing. It returns 0 on
success. On error, a non-zero value is returned which indicates to the
caller that the packet containing the frame should not be acknowledged.
When qcc_recv() generates a CONNECTION_CLOSE or RESET_STREAM, either
directly or via qcc_get_qcs(), an error is returned which ensure that no
acknowledgement is generated. This required an adjustment on
qcc_get_qcs() API which now returns a success/error code. The stream
instance is returned via a new out argument.
MINOR: mux-quic: filter send/receive-only streams on frame parsing
Extend the function qcc_get_qcs() to be able to filter send/receive-only
unidirectional streams. A connection error STREAM_STATE_ERROR is emitted
if this new filter does not match.
This will be useful when various frames handlers are converted with
qcc_get_qcs(). Depending on the frame type, it will be easy to filter on
the forbidden stream types as specified in RFC 9000.
Implement a simple function to notify a possible subscriber or wake up
the upper layer if a special condition happens on a stream. For the
moment, this is only used to replace identical code in
qc_wake_some_streams().
Rename qc_release_detached_streams() to qc_purge_streams(). The aim is
to have a more generic name. It's expected to complete this function to
add other criteria to purge dead streams.
Also the function documentation has been corrected. It does not return a
number of streams. Instead it is a boolean value, to true if at least
one stream was released.
REORG: mux-quic: rename stream initialization function
Rename both qcc_open_stream_local/remote() functions to
qcc_init_stream_local/remote(). This change is purely cosmetic. It will
reduces the ambiguity with the soon to be implemented OPEN states for
QCS instances.
BUG/MEDIUM: mux-quic: fix server chunked encoding response
QUIC MUX was not able to correctly deal with server response using
chunked transfer-encoding. All data will be transfered correctly to the
client but the FIN bit is missing. The transfer will never stop as the
client will wait indefinitely for the FIN bit.
This bug happened because the HTX message representing a chunked encoded
payload contains a final empty block with the EOM flag. However,
emission is skipped by QUIC MUX if there is no data to transfer. To fix
this, the condition was completed to ensure that there is no need to
send the FIN signal. If this is false, data emission will proceed even
if there is no data : this will generate an empty QUIC STREAM frame with
FIN set which will mark the end of the transfer.
To ensure that a FIN STREAM frame is sent only one time,
QC_SF_FIN_STREAM is resetted on send confirmation from the transport in
qcc_streams_sent_done().
This bug was reproduced when dealing with chunked transfer-encoding
response for the HTTP server.
BUILD: http: silence an uninitialized warning affecting gcc-5
When building with gcc-5, one can see this warning:
src/http_fetch.c: In function 'smp_fetch_meth':
src/http_fetch.c:356:6: warning: 'htx' may be used uninitialized in this function [-Wmaybe-uninitialized]
sl = http_get_stline(htx);
^
It's wrong since the only way to reach this code is to have met the
same condition a few lines before and initialized the htx variable.
The reason in fact is that the same test happens on different variables
of distinct types, so the compiler possibly doesn't know that the
condition is the same. Newer gcc versions do not have this problem.
Let's just move the assignment earlier and have the exact same test,
as it's sufficient to shut this up. This may have to be backported
to 2.6 since the code is the same there.
In 2.6, 8a0fd3a36 ("BUILD: debug: work around gcc-12 excessive
-Warray-bounds warnings") disabled some warnings that were reported
around the the BUG() statement. But the -Wnull-dereference warning
isn't known from gcc-5, it only arrived in gcc-6, hence makes gcc-5
complain loudly that it doesn't know this directive. Let's just
condition this one to gcc-6.
Between 1.8 and 1.9 commit d9e7e36c6 ("BUG/MEDIUM: epoll/threads: use one
epoll_fd per thread") split the epoll poller to use one poller per thread
(and this was backported to 1.8). This patch added a call to epoll_ctl(DEL)
on return from the I/O handler as a safe way to deal with a detected thread
migration when that code was still quite fragile. One aspect of this choice
was that by then we wanted to maintain support for the rare old bogus epoll
implementations that failed to remove events on close(), so risking to lose
the event was not an option.
Later in 2.5, commit 200bd50b7 ("MEDIUM: fd: rely more on fd_update_events()
to detect changes") changed the code to perform most of the operations
inside fd_update_events(), but it maintained that oddity, to the point that
strictly all pollers except epoll now just add an update to be dealt with at
the next round.
This approach is much more efficient, because under load and server-side
connection reuse, it's perfectly possible for a thread to see the same FD
several times in a poll loop, the first time to relinquish it after a
migration, then the other thread makes a request, gets its response, and
still during the same loop for the first one, grabbing an idle connection
to send a request and wait for a response will program a new update on
this FD. By using a synchronous epoll_ctl(DEL), we effectively lose the
opportunity to aggregate certain changes in the same update.
Some tests performed locally with 8 threads and one server show that on
average, by using an update instead of a synchronous call, we reduce the
number of epoll_ctl() calls by 25-30% (under low loads it will probably
not change anything).
So this patch implements the same method for all pollers and replaces the
synchronous epoll_ctl() with an update.
BUG/MEDIUM: mux-h1: Handle connection error after a synchronous send
Since commit d1480cc8 ("BUG/MEDIUM: stream-int: do not rely on the
connection error once established"), connection errors are not handled
anymore by the stream-connector once established. But it is a problem for
the H1 mux when an error occurred during a synchronous send in
h1_snd_buf(). Because in this case, the connction error is just missed. It
leads to a session leak until a timeout is reached (client or server).
To fix the bug, the connection flags are now checked in h1_snd_buf(). If
there is an error, it is reported to the stconn level by setting SF_FL_ERROR
flags. But only if there is no pending data in the input buffer.
This patch should solve the issue #1774. It must be backported as far as
2.2.
BUG/MEDIUM: http-ana: Don't wait to have an empty buf to switch in TUNNEL state
When we want to establish a tunnel on a side, we wait to have flush all data
from the buffer. At this stage the other side is at least in DONE state. But
there is no real reason to wait. We are already in DONE state on its
side. So all the HTTP message was already forwarded or planned to be
forwarded. Depending on the scheduling if the mux already started to
transfer tunneled data, these data may block the switch in TUNNEL state and
thus block these data infinitly.
This bug exists since the early days of HTX. May it was mandatory but today
it seems useless. But I honestly don't remember why this prerequisite was
added. So be careful during the backports.
This patch should be backported with caution. At least as far as 2.4. For
2.2 and 2.0, it seems to be mandatory too. But a review must be performed.
BUG/MINOR: mux-h1: Be sure to commit htx changes in the demux buffer
When a buffer area is casted to an htx message, depending on the method
used, the underlying buffer may be updated or not. The htxbuf() function
does not change the buffer state. We assume the buffer was already prepared
to store an htx message. htx_from_buf() on its side, updates the
buffer. With the first function, we only need to commit changes to the
underlying buffer if the htx message is changed. With last one, we must
always commit the changes. The idea is to be sure to keep non-empty HTX
messages while an empty message must be lead to an empty buffer after
commit.
All that said because in h1_process_demux(), the changes is not always
committed as expected. When the demux is blocked, we just leave the
function. So it is possible to have an empty htx message stored in a buffer
that appears non-empty. It is not critical, but the buffer cannot be
released in this state. And we should always release an empty buffer.
MEDIUM: mworker/systemd: send STATUS over sd_notify
The sd_notify API is not able to change the "Active:" line in "systemcl
status". However a message can still be displayed on a "Status: " line,
even if the service is still green and "active (running)".
When startup succeed the Status will be set to "Ready.", upon a reload
it will be set to "Reloading Configuration." If the configuration
succeed "Ready." again. However if the reload failed, it will be set to
"Reload failed!".
Keep in mind that the "Active:" line won't change upon a reload failure,
and will still be green.
BUG/MEDIUM: http-fetch: Don't fetch the method if there is no stream
The "method" sample fetch does not perform any check on the stream existence
before using it. However, for errors triggered at the mux level, there is no
stream. When the log message is formatted, this sample fetch must fail. It
must also fail when it is called from a health-check.
BUG/MEDIUM: h1: Improve authority validation for CONNCET request
From time to time, users complain to get 400-Bad-request responses for
totally valid CONNECT requests. After analysis, it is due to the H1 parser
performs an exact match between the authority and the host header value. For
non-CONNECT requests, it is valid. But for CONNECT requests the authority
must contain a port while it is often omitted from the host header value
(for default ports).
So, to be sure to not reject valid CONNECT requests, a basic authority
validation is now performed during the message parsing. In addition, the
host header value is normalized. It means the default port is removed if
possible.
This patch should solve the issue #1761. It must be backported to 2.6 and
probably as far as 2.4.
http_is_default_port() can be used to test if a port is a default HTTP/HTTPS
port. A scheme may be specified. In this case, it is used to detect defaults
ports, 80 for "http://" and 443 for "https://". Otherwise, with no scheme, both
are considered as default ports.
MINOR: http: Add function to get port part of a host
http_get_host_port() function can be used to get the port part of a host. It
will be used to get the port of an uri authority or a host header
value. This function only look for a port starting from the end of the
host. It is the caller responsibility to call it with a valid host value. An
indirect string is returned.
BUG/MINOR: http-htx: Fix scheme based normalization for URIs wih userinfo
The scheme based normalization is not properly handled the URI's userinfo,
if any. First, the authority parser is not called with "no_userinfo"
parameter set. Then it is skipped from the URI normalization.
BUG/MINOR: peers: fix possible NULL dereferences at config parsing
Patch 49f6f4b ("BUG/MEDIUM: peers: fix segfault using multiple bind on
peers sections") introduced possible NULL dereferences when parsing the
peers configuration.
Fix the issue by checking the return value of bind_conf_uniq_alloc().