Break long store_client call chains with async calls (#1056)
The store_client class design created very long call chains spanning
Squid-client and Squid-server processing and multiple transactions.
These call chains also create ideal conditions for dangerous recursive
relationships between communicating classes (a.k.a. "reentrancy" among
Squid developers). For example, storeClientCopy() enters store_client
and triggers disk I/O that triggers invokeHandlers() that re-enters the
same store_client object and starts competing with the original
storeClientCopy() processing state.
The official code prevented the worst recursion cases with three(!)
boolean flags and time-based events abused to break some of the call
chains, but that approach did not solve all of the problems while also
losing transaction context information across time-based events.
This change effectively makes STCB storeClientCopy() callbacks
asynchronous, eliminating the need for time-based events and one of the
flags. It shortens many call chains and preserves transaction context.
The remaining problems can and should be eliminated by converting
store_client into AsyncJob, but those changes deserve a dedicated PR.
store_client orchestrates cooperation of multiple asynchronous players:
* Sink: A Store client requests a STCB callback via a
storeClientCopy()/copy() call. A set _callback.callback_handler
implies that the client is waiting for this callback.
* Source1: A Store disk reading subsystem activated by the storeRead()
call "spontaneously" delivers response bytes via storeClientRead*()
callbacks. The disk_io_pending flag implies waiting for them.
* Source2: Store memory subsystem activated by storeClientListAdd()
"spontaneously" delivers response bytes via invokeHandlers().
* Source3: Store disk subsystem activated by storeSwapInStart()
"spontaneously" notifies of EOF/error by calling noteSwapInDone().
* Source4: A store_client object owner may delete the object by
"spontaneously" calling storeUnregister(). The official code was
converting this event into an error-notifying callback.
We continue to answer each storeClientCopy() request with the first
available information even though several SourceN calls are possible
while we are waiting to complete the STCB callback. The StoreIOBuffer
API and STCB recipients do not support data+error/eof combinations, and
future code will move this wait to the main event loop anyway. This
first-available approach means that the creation of the notifier call
effectively ends answer processing -- store_client just waits for that
call to fire so that it can relay the answer to still-synchronous STCB.
When STCB itself becomes asynchronous, this logic will continue to work.
Also stopped calling STCB from storeUnregister(). Conceptually, the
storeUnregister() and storeClientCopy() callers ought to represent the
same get-content-from-Store task; there should be no need to notify that
task about what it is doing. Technically, analysis of STCB callbacks
showed that many such notifications would be dangerous (if they are or
become reachable). At the time of the storeUnregister() call, the STCB
callbacks are usually unset (e.g., when storeUnregister() is called from
the destructor, after that object has finished copying -- a very common
case) or do not do anything (useful).
Also removed callback_data from the Callback::pending() condition. It is
conceptually wrong to require non-nil callback parameter, and it is
never cleared separately from the callback_handler data member anyway.
Also hid copyInto into the private store_client section to make sure it
is not modified while we are waiting to complete the STCB callback. This
move required adding a couple of read-only wrapper methods like
bytesWanted() and noteSwapInDone().
Also simplified error/EOF/bytes handling on copy()-STCB path using
dedicated methods (e.g., store_client::callback() API is no longer
mixing EOF and error signals).