Fewer "busy shutdown" crashes; simpler shutdown cleanup (#1128)
Preparing for shutdown after 186724 requests
Waiting 1 seconds for active connections to finish
Closing HTTP(S) port [::]:3128
FATAL: assertion failed: Transients.cc:192:
"oldE->mem_obj && oldE->mem_obj->xitTable.index == index"
This change removes two steps at the end of the manual shutdown sequence
(FreeAllFs() and Store::FreeMemory()) and associated deadly Store hacks,
adjusting a few Store checks as needed. Also, StoreFileSystem hierarchy
declarations were slightly updated when half of its virtual methods were
removed. Key changes are detailed below.
### Do not disable Store [by dropping Store::Root()] during shutdown
Disabling Store (i.e. calling Store::FreeMemory()) resulted in "do not
use Root() during shutdown" assertion-avoiding hacks that bypassed code
maintaining key invariants. Those hacks were probably accommodating
Store-using destruction sequences triggered by SquidShutdown() or later
code, _but_ the hacks also ran during the graceful shutdown period!
Violating those invariants in MemObject destructor led to the quoted
assertions when CollapsedForwarding::HandleNotification() ran during
graceful shutdown.
We could add more state to narrow down hacks scope, or we could disable
IPC notifications that triggered the known deadly shutdown sequence, but
* We should not disable assertions during shutdown because many are
about essential (e.g., preventing data corruption) invariants and
because it is not practical to disable all assertions that should be
disabled (and keep disabling the new ones correctly). The idea leads
to unsafe code that is also more difficult to write and maintain.
* We should not disable kid-to-kid notifications because some are
required for finishing transactions (that can still be successfully
finished) during graceful shutdown. Increasing successful completion
chances is the whole idea behind that graceful shutdown feature!
Instead, Store should provide (essential) services during shutdown.
TODO: Update those unit test cases that use Store::FreeMemory() but
should not. That requires careful analysis because the next test cases
may rely on the previous case removing its Store object/statistics/etc.
The same dedicated project might (help) address commit
27685ef TODO
about unwanted Store unit test order dependencies.
### Do not disable StoreFileSystem service during shutdown
Similar to the Store service, manually killing FS service is a bad idea.
The underlying code was also essentially unused and buggy:
* Nobody called RegisterAllFsWithCacheManager(). A caller would assert.
* Nobody called Fs::Clean(). A call could crash Squid because deleted
objects could still be in use by others (they are not refcounted).
* SetupAllFs() was called but did essentially nothing.
* FreeAllFs() was called but had only harmful effects: It leaked memory
and removed usable FS registrations, increasing shutdown dangers.
See also: Commit
3a85184 documents why we should avoid explicit "delete
essentialService" calls (and why some crashes may still occur if we do).