git.ipfire.org Git - thirdparty/squid.git/log

Fixed StoreMap.cc "anchorAt(anchorId).reading()" assertions (#1117)

Squid asserted because the code could switch a transient entry from
writing to reading while the corresponding store entry was still being
(or could still be) written to the shared memory cache. For example:

1. We start as a collapsed writer.
2. We get a response and start writing to disk and shared memory caches.
3. Disk swapout fails (for any reason, including out-of-slots errors).
4. storeSwapOutFileClosed() calls transientsCompleteWriting().
5. transientsCompleteWriting() switches the entry into reading mode
... but we are still writing to the memory cache!

There was a somewhat related XXX in transientsCompleteWriting(), but
what that comment did not say is that if we are writing to two stores in
parallel, then the first transientsCompleteWriting() call (made when one
of the two swapouts ends) makes us a reader prematurely.

An incorrect reader status allows a Controller::syncCollapsed()
notification (to, say, a shared memory cache writer) slip past the
transients->isWriter() guard, eventually reaching the reading()
assertion.

Properly detecting the end of all store writing is difficult because the
two mostly independent writing "threads" may start/finish at seemingly
random times, in many code places, and even in different workers. To
simplify this task, Squid now limits cache writing to the worker
transaction that made the StoreEntry object public. That transaction
creates a writing Transients entry. Other transactions start as
Transients readers. The writer switches to reader when neither memory
nor disk caching can start or continue.

Also simplified Transients entry state. Squid relayed swapout errors via
the Transient entries themselves, but that is not necessary (and it
prevented the cache entry from recovering by writing to another store).
Each store entry can take care of its own swapout status/results. The
readers just need to know whether somebody may start (or is still)
writing, and we relay that info by keeping the Transients entry locked
for writing (appending, to be precise) while that condition is true.

Also fixed shared caches to recognize that no more data will be coming
in because the remote entry writer is gone. Readers still try to deliver
what they have, even if they know that the response will be truncated.

Also tried to follow the "broadcast after change, in the same context as
the change" principle in the modified code instead of relying on the
caller to broadcast after all changes. This approach may increase the
number of broadcasts, but it reduces the probability that we will miss
an important Broadcast() call. We can (and should) optimize repeated,
useless broadcasts, but that work is outside this project scope.

Also improved StoreEntry::swapOut() shutdown safety and
mayStartSwapOut() checks descriptions/order.

Also added an out-of-scope XXX.

Honor httpd_suppress_version_string in more contexts (#1121)

* HTML "signature" part of generated HTTP responses to URN requests
* Server header in generated HTTP redirect responses
* boundary strings in generated HTTP 206 multipart/byteranges responses

Also document known contexts outside the configuration directive reach.
We may assert that all of them are outside the current "httpd" scope
(implied by the directive name), but admins may not know how we define
that "httpd" scope; being more explicit may reduce surprises.

Match ./configure --help parameter names with their defaults (#1120)

It is customary[^1] for "./configure --help" to use "--enable-foo"
spelling for default-disabled features and "--disable-foo" spelling for
default-enabled features. This custom allows admins to quickly discover
defaults without developers painstakingly documenting that default for
every parameter.

These changes are limited to "./configure --help" output. They do not
alter any defaults and do not alter non-help ./configure run outcomes.

[^1]: Example 1.1 at https://autotools.info/autoconf/arguments.html

Make COMM_ORPHANED and COMM_REUSEPORT values distinct (#1116)

Broken since inception (commit 1c2b446). The exact runtime effects (if
any) of this bug are difficult to ascertain, but it is possible that
some COMM_REUSEPORT (i.e. worker-queues port) listening connections were
not reported as incorrectly orphaned when they should have been.

Fix typo in manager ACL (#1113)

AsyncJob-protect and reduce ICAP DNS lookup call chain (#1112)

Co-authored-by: Alex Rousskov <rousskov@measurement-factory.com>

Do not tell admins that we are replacing their .* with our .* (#1119)

    acl hasHeaderFoo req_header Foo .*

    2022/08/12 11:43:27| WARNING: regular expression '.*' has only
    wildcards and matches all strings. Using '.*' instead.

Broken since RE optimization inception (commit 6564dae), but the
excessive WARNING was "invisible" to many admins until 61be1d8 started
saving early level-0/1 messages to cache.log.

Bug 3193 pt2: NTLM decoder truncating strings (#1114)

The initial bug fix overlooked large 'offset' causing integer
wrap to extract a too-short length string.

Improve debugs and checks sequence to clarify cases and ensure
that all are handled correctly.

Remove send-announce (#1110)

This is another feature which is very much outdated and unused.

Do not "open" Comm::Connection on oldAccept() errors (#1103)

When Comm::TcpAcceptor::oldAccept() discovers an error, it should close
the socket it just accepted. We should use RAII to avoid leaking the
open socket, especially in the presence of multiple error detection
areas and C++ exceptions. Using the available Connection object for
controlling socket lifetime does not work well because that Connection
object often lingers well past oldAccept() -- the object delivers
various low-level error details (e.g., the remote HTTP client address of
the failed attempt) to TcpAcceptor users.

Instead of "opening" the Connection object ASAP to avoid FD leaks and
then struggling to find the right time to close it, we now delay that
opening until oldAccept() succeeds. Meanwhile, the socket lifetime and
Comm registration are controlled by a simple RAII Descriptor object.

Eventually, we will reuse Comm::Descriptor in Connection and other
descriptor-owning code. One known prerequisite for those improvements is
Optional<> support for non-trivial types (a work in progress).

Unify r.n.g. seed generation (#1109)

Using time-based expressions for seeding pseudo random number generators
often results in multiple SMP workers getting the same seed. That, in
turn, may result in unwanted correlation of supposed-to-be-independent
worker activities: making artificially similar decisions (that are
supposed to be "random") and/or concentrating rather than
spreading/sharing load.

The recently added RandomUuid code did not use time-based seeds in
anticipation of this unification.

This change converts all std::mt19937 users in src/ except the random
ACL code (that probably deserves a dedicated change).

C++ standard does not guarantee that std::random_device does not force
users to wait while more entropy is accumulated. Based on public
comments and source code analysis of popular STL implementations, we
should be getting the desired no-wait behavior from popular STLs, in
popular environments. If Squid encounters a special environment where
the default std::random_device waits for entropy, more code and/or
configuration options will be needed to accommodate that special case.
The biggest drawback with this approach is that it may be difficult for
the admins to discover that std::random_device is waiting in their
environment. This is mitigated by the low probability of that happening.

Maintenance: Improve stability of generated debug-sections.txt (#1101)

Pure "sort" may not produce stable results across different locales.
This may explain why the official order changed in commit 502bcc7 and
why current runs in some unofficial environments reorder sections.

Also fixed two minor problems:
* "sort -n" (i.e. numeric sort) was applied to non-numeric fields
* "sort -u" (i.e. unique sort) was pointlessly applied multiple times

Maintenance: Improve stability of gperf-generated sources (#1099)

Control gperf from source-maintenance.sh: Existing Makefile rule did not
use (and should not use) primary "make" functionality -- we do not want
to automatically trigger RegisteredHeadersHash.cci regeneration based on
file timestamps alone because most users will not have (the right
version of) gperf and (the right version of) astyle available. The rule
did use Makefile variables, but that resulted in unstable gperf output
because variables like $src change depending on the developer build
environment. This change finishes the move started by commit 3d50bea.

Simplify formater.pl, making its interface similar to other formatting
tools used by source-maintenance.sh: The script now sends its results to
stdout. Renamed the script to reduce surprises, fix the old naming typo,
and to improve consistency with format-makefile-am.pl. The script needs
more polishing; we avoided otherwise unmodified lines except removing an
unused buggy $pfx line that triggered warnings.

Move "astyle did not corrupt code" checks from source-maintenance.sh to
formater.pl, where they belong and are easier to implement. This move
simplifies reuse of the Perl script inside source-maintenance.sh. It
also removes dependency on md5sum/etc. $CppFormatter now contains the
exact command to (re)use.

Fix "Generated by.." marker. Its old location in .cci was hard to
notice. The marker was also present in hand-edited .gperf file!

Report RegisteredHeadersHash.cci changes.

Also stopped stripping "register" from gperf output. Gperf v3.1 stopped
adding "register" to C++ output. We (should) use that gperf version.
Removing "register" complicated gperf error handling because a pipe to
sed ate gperf exit code.

Fix build on arm32 (#1108)

On arm32, size_t and long are 32 bits in size, exposing these bugs.

Support non-trivial Optional values (#1106)

Commit 7224761 skipped this std::optional feature when adding initial
Optional implementation because that feature is difficult to support.
However, several in-progress projects now need to make non-trivial
objects Optional. This implementation maintains key Optional invariants,
matching those of std::optional:

* A valueless Optional object does not construct or destruct the value.
* A valued Optional object supports move/copy ops supported by value.

Maintaining the first invariant is tricky:

* Union prevents value construction/destruction in a valueless Optional.
* Explicit destructor call destructs the value in a _valued_ Optional.
* A dummy union member works around a C++ requirement for constexpr
  unions to have at least one active (i.e. initialized) member. Since a
  valueless Optional cannot initialize the value, we need another union
  member that we can initialize.
* We use an _anonymous_ union because C++ places more requirements on
  named unions, triggering a more complex implementation with
  placement-new and to-be-deprecated std::aligned_storage_t tricks!

XXX: This simplified implementation violates the following std::optional
invariant. We believe that this violation does not hurt the current and
foreseeable Squid code. In our tests, optimizing compilers still
optimize Optional<Trivial> destructors away.

* std::optional<Value> is trivially destructible if Value is.

Use ThisCache2 buffer size when filling ThisCache2 buffer (#1102)

Maintenance: Trigger -Wswitch in more build environments (#1104)

When a developer forgets to update RawSwapMetaTypeTop() after adding a
new SwapMetaType item, the function should produce a -Wswitch compiler
warning (at least). However, we discovered that many compilers remain
silent, apparently confused by the constant switch expression:

* GCC warns only starting with v9.1;
* clang does not warn at all (up to v14, inclusive).

Adding a non-constant variable triggers the expected -Wswitch warnings
in clang and earlier GCC versions. Optimizing compilers (-O1) remove the
otherwise unused variable so adding it has no runtime cost.

Declaring the variable outside the switch statement is required to avoid
an unwanted -Wunused-variable warning with GCC v4.8 and clang v3.0.0.

A post-switch return statement is required to avoid an unwanted
-Wreturn-type warning with GCC v4.9.

Tests: https://gcc.godbolt.org/z/d1czs4c8e

Remove support for Gopher protocol (#1092)

Gopher code quality remains too low for production use in most
environments. The code is a persistent source of vulnerabilities and
fixing it requires significant effort. We should not be spending scarce
Project resources on improving that code, especially given the lack of
strong demand for Gopher support.

With this change, Gopher requests will be handled like any other request
with an unknown (to Squid) protocol. For example, HTTP requests with
Gopher URI scheme result in ERR_UNSUP_REQ.

Default Squid configuration still considers TCP port 70 "safe". The
corresponding Safe_ports ACL rule has not been removed for consistency
sake: We consider WAIS port safe even though Squid refuses to forward
WAIS requests:

acl Safe_ports port 70 # gopher
acl Safe_ports port 210 # wais

Fix handling of bad SNMP variable type when parsing IP_* vars (#1097)

AFAICT, we never associate snmp_netIpFn() with the wrong variable type,
so we may be dealing with a bug in an effectively unreachable code here.

1999 commit 736eb6a correctly removed snmp_var_free() call from most
refactored SNMP functions but missed snmp_netIpFn() and
snmp_meshCtblFn() cases. The latter was fixed in 2000 commit d20b1cd.

Detected by static analyzer Svace v3.2.1.

RFC 9111: Stop treating Warning specially (#1072)

RFC 9111 obsoletes the Warning header, removing all specification
requirements about it. Likewise, this update changes Squid behaviour in
regards to that header:

1) Squid no longer adds Warning headers to generated or forwarded
   messages. Miss responses from servers/peers and hits cached by an
   older version of Squid may still have Warning headers.

2) On 304 revalidation, Warning header are treated the same as any
   other/generic header. They are added or replaced according to their
   presence in the 304 reply. Absent any Warning update by a 304, Squid
   may still deliver cached content with old Warning headers.

3) Squid no longer validates received Warning headers. RFC 7234 placed
   syntax requirements and limits on how old some Warning values could
   be (when dated). Those checks are no longer being performed. The
   header value is now treated as an opaque string.

4) Warning header usage and types are no longer tracked in message
   statistics available through cache manager.

Updated documentation references to reflect RFC 7234 becoming obsolete.

Refactored Ip::Intercept::Lookup() (#1062)

The TPROXY lookup part was misleading -- the method only checked that
TPROXY was enabled by the configuration and did not change the accepted
connection settings.

Also do not assume that a listening port can have both COMM_TRANSPARENT
and COMM_INTERCEPTION flags set. Squid treats TPROXY and NAT modes as
mutually exclusive and rejects such configurations.

Also forbid invalid port configurations with tproxy mode where TPROXY is
prohibited by ./configure.

Also removed unused Ip::Intercept::StopInterception().

Silence GnuRegex warnings on gcc 12.1 (#1096)

GnuRegex does pointer magic that gcc 12.1 doesn't like.

    compat/GnuRegex.c: error: array subscript -1 is outside array bounds
    of 'const char[]' [-Werror=array-bounds]

    compat/GnuRegex.c: error: pointer may be used after 'realloc'
    [-Werror=use-after-free]

A long term fix is to unship GnuRegex which is an old
snapshot and is not maintained.

Fix --disable-strict-error-checking and -W... options handling (#1091)

Commit 8b082ed (Activate extra compiler checks):

* started adding -Werror in --disable-strict-error-checking builds
* stopped adding -Wall to CXXFLAGS
* forgot that -Werror is spelled $squid_cv_cc_option_werror
* forgot that "=" is converted to "_" in $squid_cv_cc_arg... name
* forgot that "W" is present in $squid_cv_cc_arg... name
* checked squid_cv_cc_... name but set squid_cv_cc_arg... name
* used warning-generating SQUID_CC_CHECK_ARGUMENT code for testing
  compiler support for new -W options (while also using -Werror)

Correctly detecting support for a compiler warning option is difficult:

* Some compilers (e.g., clang) exit successfully when given an
  unsupported command line option unless we also give -Werror. With
  those compilers, we must use "-Werror -Wx" to test for -Wx support.
  However, that does not mean we should add -Werror to all Squid builds.

* Adding -Werror does not work for detecting supported GCC warnings: GCC
  needs -Werror=no-x to fail on unsupported -Wno-x warnings.

* We do not even know how to detect (via compiler exit code) supported
  warnings for compilers other than clang and GCC. We still add -Werror
  to all compilers other than GCC when testing -X, but, based on quick
  godbolt.org tests, -Werror does not make icc fail on unsupported -Wx.

Also delayed adding warnings until we are done adding such C++ compiler
options as -std=c++11 (and, for Irix, optimization level) because those
options may affect the set of warnings supported by the compiler. We
also (now) need SQUID_CC_GUESS_VARIANT and SQUID_CC_GUESS_OPTIONS. We
might need AC_PROG_CPP. All those things are settled after the official
code attempted to add warnings. The best code location probably needs
more work, but at least all warnings are handled in the same code now.

These changes also fix Bug 5167 (GCC -Wno-unused-private-field
miscategorized as supported).

Also fixed clang (-Wall => -Wmost) warnings exposed by the above fixes.

Avoid spurious translation warnings in default "make install" (#1095)

Squid defaults to --disable-translation, but the corresponding "make
install" produces dozens of alias-link.sh warnings like these:

    WARNING: az translations do not exist. Nothing to do for: az-az
    WARNING: bg translations do not exist. Nothing to do for: bg-bg
    ...
    WARNING: vi translations do not exist. Nothing to do for: vi-vn

With this change, --disable-translation builds do not attempt to
"locate" and install/link translation-related files. The following
warning is still shown, even in explicit --disable-translation builds:

    WARNING: Translation is disabled.

This change also stops exposing --disable-translation builds to
alias-link.sh bugs that generate warnings for boilerplate statements.

Discarded connections do not contribute to forward_max_tries (#1093)

The forward_max_tries directive is said to limit the number of
"high-level request forwarding attempts", but its documentation does not
specify whether each Happy Eyeballs connection opening attempt
constitutes such a forwarding attempt (because commit 3eebd26 that
clarified forward_max_tries meaning came before Happy Eyeballs code
started opening multiple concurrent connections in commit 5562295).

Official Squid counts each Happy Eyeballs connection attempt as a
request forwarding attempt. For example, if forward_max_tries is 2, then
Squid, to a likely admin surprise, may refuse to re-forward the request
after an HTTP failure because the second request forwarding attempt has
been used up by the (canceled!) spare TCP connection opening attempt.

This change stops counting discarded connections as request forwarding
attempts. For example, if forward_max_tries is 2, Squid will re-forward
the request (if needed and possible) even if Squid must open up to 3
connections to do that (discarding the second one in the process). In
this context, discarding the race-losing connection and going with the
winning one may be viewed as a low-level retry activity that
forward_max_tries is already documented to ignore.

Eventually, Squid may stop discarding connections that lost the Happy
Eyeballs race. When/if that complicated improvement is implemented,
those canceled connection opening attempts should be counted, but the
then-saved connections should be used for request re-forwarding,
preserving the same overall outcome: With forward_max_tries set to 2,
Squid will still re-forward the request (if needed and possible).

Before this change, setting forward_max_tries to 1 prevented all spare
connection attempts: The first DNS response received (A or AAAA) would
be used for the primary connection attempt, increasing n_tries, making
ranOutOfTimeOrAttempts() true, and blocking any spare attempt (both
concurrent and sequential). That low-level side effect is as wrong as
the blocked retries described in the "likely admin surprise" example
above. Now, admins that really want to disable spare connection attempts
may set forward_max_tries to 1 _and_ happy_eyeballs_connect_limit to 0.

Source Format Enforcement (#1089)

Bug 5133: OpenSSL 3.0 support  (#694)

This TLS update includes:

* Fix build with OpenSSL v3.
* Refactor RSA key generation to avoid deprecated RSA_*() APIs.
* Refactor DH parameter and key config to avoid deprecated DH_*() APIs.
* Refactor ECDH key creation to avoid deprecated EC_*() APIs.
* Deprecate ssl_engine support in builds with OpenSSL v1-.
* Disable ssl_engine support in builds OpenSSL v3+.

We deprecated/removed ssl_engine support (as summarized in the last two
bullets above) without providing an OpenSSL Providers-based alternative
because of the following factors:

1. We do not have the resources to update ssl_engine code to build
   (without deprecation warnings) with OpenSSL v3 when the feature is
   unused.

2. We do not have the resources to create an OpenSSL v3 Provider-based
   replacement for ssl_engine code that uses deprecated Engine APIs.

3. OpenSSL v3 deprecated Engine support (triggering deprecation warnings
   in applications that use Engine APIs with OpenSSL v3). Since Squid
   default builds use -Werror, doing nothing would break such builds.

4. Squid ssl_engine does not appear to be a popular feature.

Preserve caller context in TcpAcceptor (#1087)

Before this fix, acceptOne() was losing FTP DATA TcpAcceptor context
by unconditionally resetting the context to a nil listenPort_.

Also, TcpAcceptor methods should not (be expected to) explicitly set
their job code context. Instead, the job should be created in the
right code context, allowing job-created async calls to auto-manage code
context. The updated port-iterating clientHttpConnectionsOpen() and
Ftp::StartListening() loops now set the right code context when creating
async calls that result in TcpAcceptor job creation.

Streamline Store swap metadata handling (#1067)

Store swap metadata consists of a handful of Squid-specific TLV fields.
In 69c01cd, fields handling was converted to use lists of TLV objects.
In 528b2c6, the fields were promoted to a hierarchy of C++ classes, each
representing a specific T in TLV, but the transition was not complete:
While trying to add another class to the hierarchy, we discovered that
it would require violating hierarchy design again, adding another XXX.

While trying to address those hierarchy problems, we concluded that the
hierarchy and, in fact, the TLV lists themselves should be removed:

* A class hierarchy with polymorphic methods works well when the same
  method can be used regardless of the specific object type. In swap
  metadata case, the user code had to know the type (i.e. T in TLV) to
  handle each object correctly because only certain types could be
  processed at each metadata loading stage. Moreover, some processing
  can benefit from using type-specific APIs (e.g., extracting a URL).
  Several `switch (type)` statements (and XXXs about doing something a
  method should not be doing) illustrate that design conflict.

* A class hierarchy without polymorphic methods can still be useful if
  leaf objects can reuse a lot of parent code. However, StoreMeta class
  had virtually no code reusable by its derived classes -- it was a
  collection of static methods and their equivalents. Metadata leaf
  classes were essentially used as global functions.

* Creating a list is useful when the listed objects are traversed/used
  multiple times. In swap metadata case, each TLV object is effectively
  used once: Official code allocates each object, copies data into it,
  and then uses the object either to interpret the just-read value or to
  write just-copied bytes into another buffer. The only (minor)
  exception to that pattern was when the list was also scanned to
  calculate the total buffer size, but that extra scan is unnecessary
  and undesirable, especially given modern serialization techniques.

This change converts the problematic StoreMeta class hierarchy into a
few stand-alone functions. The SwapMetaView class implements a view of
the serialized metadata, significantly reducing allocations and copying.
Metadata parser creates a cheap view for the being-parsed field, with
view lifetime confined to that single parsing loop iteration.

Also removed unused STORE_META_* type IDs, addressing several related
XXXs and TODOs. This change was necessary to let the compiler tell us
when we forgot a swap metadata type in a `switch (type)` statement. No
changes to type IDs and their deficient assignment algorithm.

This change attempts to preserve arbitrary metadata validation decisions
that do not violate the overall design, except the following two:

* More problematic Store entries may now be rejected due to conversion
  of serious validation errors (e.g., framing problems) to exceptions.
  This change decreases the risks of infinite metadata parsing loops and
  invalid entry acceptance.

* Some error messages are now logged at level 1 instead of 0.

The Rock-specific ZeroedSlot() check was moved to Rock. Unlike Rock
fixed-size one-file storage, UFS-based cache_dir files are not zeroed.

Maintenance: Eliminate a (wrong) solution in a TODO comment (#1086)

Fix mingw-cross build (#1084)

compat/os/mswindows.h:640: ::inet_pton has not been declared

Maintenance: Removed most NULLs using modernize-use-nullptr (#1075)

Applying clang-tidy modernize-use-nullptr also converted many
"anonymous" 0s into nullptr.

Clang-tidy uses clang AST, so we can only modernize code what clang can
compile. Fortunately, it is fairly easy to compile most of Squid code,
eliminating ~90% of NULLs in actively maintained code (excluding
comments and Windows-specific code) and ~73% of all NULLs.

TODO: Consider eliminating all remaining C++ NULLs using a simple script
that does not understand C++ syntax, assuming that the remaining cases
are simple enough to avoid problems that a compiler cannot detect.

Maintenance: Remove unused data members (#1082)

These handlerSubscription fields blocked quite useful static assertions.
They are in the way of callback improvements; that work has discovered
that they are unused (since inception in merge commit 1c8f25b).

Maintenance: Improve stability of generated automake Makefiles (#1081)

Our source-maintenance.sh script generates five automake Makefiles using
"git ls-files <pattern>" output as the source of filename entries. These
entries were piped through "sort -u". That problematic step is now gone:

* "sort" may not produce stable results across different locales.

* "sort" does not use the order enforced by our format-makefile-am.pl.
  That Perl script itself did not fix the order because it only ordered
  entries in *_SOURCES groups that these generated files do not use.

* "sort -u" could silently drop duplicate Makefile entries (that should
  not exist) instead of letting that Perl script warn us about them.

The first two bullets made automake Makefile generation unstable (e.g.,
502bcc7 reversed the ASCII order of en.lang and en_AU.lang entries, but
current runs in unofficial environments could reorder those entries).

format-makefile-am.pl now recognizes the generated group name patterns,
properly formatting entries in those five generated automake Makefiles.

That fix alone created persistent misleading "File ... changed: by ...
format-makefile-am.pl" NOTICEs because the Perl script was (always)
changing the _generated_ file. We now pipeline the generation and
formatting steps, applying NOTICE-generation check to the final result.

(Positive) side effect: The generated automake Makefiles now include a
notice that they are generated.

Fix libntlmauth build on current Fedora Rawhide (#1077)

ntlmauth.cc:187: error: 'time' was not declared in this scope

Maintenance: Fix formater.pl application after a failure (#1078)

Our scripts/source-maintenance.sh relies on formater.pl to create
foo.astylebak -- a fresh copy of the being formatted file foo. The Perl
script is creating that copy, but since formater.pl could be executed
outside the "all source files are staged" protections offered by
source-maintenance.sh, formater.pl never overwrites old copies. Instead,
the Perl script numbers copies: foo.astylebak.1, foo.astylebak.2, ...

Unfortunately, the numbering code was copying into the highest available
index instead of shifting existing copies by one to always name the
fresh copy foo.astylebak, as expected by source-maintenance.sh! Thus, if
formatting failed (e.g., was interrupted by the user) leaving a backup
file on disk, then all subsequent attempts would start to fail after
a branch switch or local edits would make the backup copy stale:

    ERROR: File src/store.cc not formatting well

... with the following files on disk:

    src/store.cc <--- original file that should be backed up
    src/store.cc.astylebak.2 <--- fresh copy just made by formater.pl
    src/store.cc.astylebak.1
    src/store.cc.astylebak.0
    src/store.cc.astylebak <--- stale copy used by source-maintenance.sh

The FILO numbering order was also unusual/surprising to users; logrotate
and Squid itself use FIFO ordering, shifting numbered backup copies.

Implement FIFO ordering required by source-maintenance.sh and expected
by most users. TODO: Adjust source-maintenance.sh to stop relying on
formater.pl-created backup (and to stop guessing its name).

Maintenance: Fix cf_gen build warnings with clang v12 (#1076)

cf_gen.cc:... warning: comparison of array 'buff' not equal to a
null pointer is always true [-Wtautological-pointer-compare]

These conditions are not just always true but a bit odd or misleading
because they check the pointer after it was dereferenced. They were
added in c1f8bbd which misinterpreted the "else" conditions. The two
"regular DOC/CONFIG line" conditions are self-documented well enough.

Bug 5160: Test suite fails with -flto=auto (#897)

    FAIL: tests/testStore
    Test name: testStoreHashIndex::testSearch
        - check failed: setHow_ != optUnspecified
        exception location: ../src/base/YesNoNone.h(48) operator bool

Segmentation fault in tests/testStore is another symptom.

testStoreHashIndex calls Store::Root().init() which eventually gets to
MemStore::Requested() which correctly requires Config.memShared to be
configured. Hack the missing configuration replacement, following the
inconsistent pattern in three other test suites.

Similarly, Store::Disks::init() needs Config.store_dir_select_algorithm.

TODO: Reduce configuration code duplication and such bugs by providing
initialization method(s) for all Store::Root()-using tests to (re)use.

What does LTO have to do with any of it? CPPUnit [allegedly] randomizes
execution of test suites using raw C++ pointers, and a test suite in the
testStore collection probably contains the right initialization code.
Unfortunately, Config initialization is evidently shared across test
suites (XXX). LTO evidently changes those raw pointer values enough to
change the order in an unfortunate way.

[allegedly]: https://sourceforge.net/p/cppunit/bugs/182/

ext_session_acl: fix TDB key lookup (#1064)

When built with Samba TrivialDB, ext_session_acl would never
successfully look a session up due to an uninitialized key argument. As
a result, the helper would be unable to validate that an user had logged
in. Broken since TrivialDB support was added in acd207a.

Detected by Coverity. CID 1441979: Uninitialized scalar variable
(UNINIT).

RFC 9113 update documentation references (#1071)

RFC 7540 is now obsolete

Avoid reading tls-cert=bundle twice when loading certificates (#1073)

After opening and reading the certificate bundle to load the required
traffic-signing certificate, Squid was opening and reading the same file
again to load the optional intermediate certificates. Loading the
traffic-signing certificate twice triggered bugs and further
complicating naturally tricky code.

No functionality changes expected except minor cache.log message changes
and certificate reporting improvements in "squid -k parse" mode.

Maintenance: Improve CONTRIBUTORS and its updates (#980)

The Squid Project used a script to collect CONTRIBUTORS entries, but it
was not called from source-maintenance.sh, it did not really understand
the structure of those entries, and its results required significant
manual polishing efforts. CONTRIBUTORS file kept deteriorating.

This change consists of three major parts detailed further below:

* a major (semi-manual) CONTRIBUTORS update and cleanup;
* scripts/update-contributors.pl: Merges new entries into CONTRIBUTORS;
* collectAuthors() in source-maintenance.sh: Finds new entries.

Part 1: CONTRIBUTORS update: We collected (and then pruned/polished) all
contributors from the following (master and v3+ branches) sources:

* all commit authors;
* all commit co-authors (from Co-authored-by trailer and note entries);
* all CONTRIBUTORS file revisions (the latest one was missing entries).

Part 2: update-contributors.pl understands and enforces a more formal
CONTRIBUTORS structure. After a non-indented preamble text ending with
an empty line, indented CONTRIBUTORS entries now use these formats:

    name <email>
    name
    <email>

The entries are case-insensitively sorted by the two fields (name,
email), with several conflict-resolution rules aimed at achieving
natural and stable order. Non-ASCII entries are still banned (for now)
because properly supporting them requires a serious dedicated effort.

The program merges current CONTRIBUTORS and all well-formed contributor
entries (read from the standard input) except these:

* entries with an already represented email
* entries with an already represented name
* entries containing "squidadm" in name or email

The matching is case-insensitive. These filtering rules appear to work
better in Squid CONTRIBUTORS context than more accurate/strict rules do.

Part 3: collectAuthors() feeds update-contributors.pl with new
contributors collected from "git log" output. The function only looks at
the current git branch commits made after a "vetted" point. That point
is updated by certain CONTRIBUTORS commits, as detailed in the
collectAuthors() description inside source-maintenance.sh. It can also
be specified manually via the new --update-contributors-since option.

It is not critical (and is probably impossible) for CONTRIBUTORS
automation to properly detect and merge every single new contributor.
When the scripts get it wrong, a human can always update CONTRIBUTORS
manually. Rare mistakes are not a big deal in this context. For example,
if a past contributor now needs to be listed with another email (e.g.
representing a new employer), we manually add a second entry.

This change is a reference point for automated CONTRIBUTORS updates.

Do not send more than one self-signed certificate (#1058)

Configured chain L-M-R-R was sent as L-M-R-R instead of just L-M-R.

Configured chain R was sent as R-R instead of just R!

... where R is a self-signed (i.e. "Root CA") certificate that signed
(possibly indirectly, via an intermediate certificate M) the traffic
signing (i.e. "Leaf" or "end-entity") certificate L (if any).

Security::KeyData::loadX509ChainFromFile() has other problems, but
properly solving those problems needs significant code refactoring. It
is best done in a dedicated no-functionality-changes PR. The same
refactoring is likely to simplify the existing code logic, handling all
root certificates in one place.

Also removed disabled code so that we do not have to keep maintaining
it. Incorrectly disabling this code at the very end of commit 540e296
work created the bugs we are fixing in this branch. For some, it is
easier to comprehend active code without disabled code misdirection.
This disabled code removal does _not_ mean that Squid should keep
sending Root CA certificates (when they are not the signing/end-entity
certificates). However, stopping that behavior deserves a dedicated PR.

RFC 9218 Priority header registration (#1070)

This allows the Priority header to more efficiently be added and
removed by reverse-proxy administrators using the
reply_header_add, reply_header_access, reply_header_replace
directives.

Break long store_client call chains with async calls (#1056)

The store_client class design created very long call chains spanning
Squid-client and Squid-server processing and multiple transactions.
These call chains also create ideal conditions for dangerous recursive
relationships between communicating classes (a.k.a. "reentrancy" among
Squid developers). For example, storeClientCopy() enters store_client
and triggers disk I/O that triggers invokeHandlers() that re-enters the
same store_client object and starts competing with the original
storeClientCopy() processing state.

The official code prevented the worst recursion cases with three(!)
boolean flags and time-based events abused to break some of the call
chains, but that approach did not solve all of the problems while also
losing transaction context information across time-based events.

This change effectively makes STCB storeClientCopy() callbacks
asynchronous, eliminating the need for time-based events and one of the
flags. It shortens many call chains and preserves transaction context.
The remaining problems can and should be eliminated by converting
store_client into AsyncJob, but those changes deserve a dedicated PR.

store_client orchestrates cooperation of multiple asynchronous players:

* Sink: A Store client requests a STCB callback via a
  storeClientCopy()/copy() call. A set _callback.callback_handler
  implies that the client is waiting for this callback.

* Source1: A Store disk reading subsystem activated by the storeRead()
  call "spontaneously" delivers response bytes via storeClientRead*()
  callbacks. The disk_io_pending flag implies waiting for them.

* Source2: Store memory subsystem activated by storeClientListAdd()
  "spontaneously" delivers response bytes via invokeHandlers().

* Source3: Store disk subsystem activated by storeSwapInStart()
  "spontaneously" notifies of EOF/error by calling noteSwapInDone().

* Source4: A store_client object owner may delete the object by
  "spontaneously" calling storeUnregister(). The official code was
  converting this event into an error-notifying callback.

We continue to answer each storeClientCopy() request with the first
available information even though several SourceN calls are possible
while we are waiting to complete the STCB callback. The StoreIOBuffer
API and STCB recipients do not support data+error/eof combinations, and
future code will move this wait to the main event loop anyway. This
first-available approach means that the creation of the notifier call
effectively ends answer processing -- store_client just waits for that
call to fire so that it can relay the answer to still-synchronous STCB.
When STCB itself becomes asynchronous, this logic will continue to work.

Also stopped calling STCB from storeUnregister(). Conceptually, the
storeUnregister() and storeClientCopy() callers ought to represent the
same get-content-from-Store task; there should be no need to notify that
task about what it is doing. Technically, analysis of STCB callbacks
showed that many such notifications would be dangerous (if they are or
become reachable). At the time of the storeUnregister() call, the STCB
callbacks are usually unset (e.g., when storeUnregister() is called from
the destructor, after that object has finished copying -- a very common
case) or do not do anything (useful).

Also removed callback_data from the Callback::pending() condition. It is
conceptually wrong to require non-nil callback parameter, and it is
never cleared separately from the callback_handler data member anyway.

Also hid copyInto into the private store_client section to make sure it
is not modified while we are waiting to complete the STCB callback. This
move required adding a couple of read-only wrapper methods like
bytesWanted() and noteSwapInDone().

Also simplified error/EOF/bytes handling on copy()-STCB path using
dedicated methods (e.g., store_client::callback() API is no longer
mixing EOF and error signals).

Use SBuf for Fs::Ufs::RebuildState file paths (#1063)

In theory, a combination of a non-terminating-on-overflows snprintf()
and an incorrectly guessed MAXPATHLEN (or a cache_dir root path being at
the edge of a correctly guessed MAXPATHLEN) may result in non-terminated
fullpath. Use SBuf to avoid these particular headaches.

Detected by Coverity. CID 1461166: Copy of overlapping memory
(OVERLAPPING_COPY).

Remove use of deprecated std::iterator (#1069)

c++17 deprecates std::iterator.
Explicitly declare traits in our iterator classes instead of
using std::iterator

Remove reference to deprecated std::binary_function (#1068)

C++11 deprecates std::binary_function. C++17 removes it.
It only provides typedefs that are unused since C++11.

Prep for 5.6 (#1059)

Maintenance: Reduce HTTP/FTP code duplication (#1060)

Removed code duplication probably caused bug fixed in commit 049eeeb.

Fix uninitalised var in eDirectory digest helper (#1043)

In certain conditions, some string pointers in the eDirecotry digest
helper might be read before they are initialized. Initialize them.

Detected by Coverity, CID 1494358 (Uninitialized scalar variable).

Honor ftp_port worker-queues option (#1061)

Fix --enable-zph-qos build [-Wunused-parameter] (#1042)

Add runner to initialize NTLM auth (#1051)

Source Format Enforcement (#1046)

TLS library-agnostic X509 certificate interrogation functions (#1057)

Use added X509_check_issued() replacements. The only case left is in
src/ssl/gadgets.cc which is used by certificate helpers that cannot be
linked with libsecurity yet.

Use added X509_NAME_oneline() replacements, where feasible. This change
speeds up ssl_verify_cb() and other functions that used to extract and
copy certificate name into a buffer even when that name was unused
because debugging levels were not elevated enough, including by default.

Also fixes memory leak when debugging section 83 at level 3+ of an
OpenSSL-using Squid (missing name cleanup in clientNegotiateSSL()).

Also fixes a (usually symptom-free) sslcrtd bug: C strings allocated by
OpenSSL were freed by xfree() instead of OPENSSL_free().

Co-authored-by: Amos Jeffries <squid3@treenet.co.nz>

Fix PeerConnector::handleNegotiationResult() error debugging (#1055)

ERROR: failure while establishing TLS connection on FD: 70x123456*1

* FD value was malformed. Now report all Connection details instead.
* Error details were not reported (a pointer value was reported).

Also, for error message prefix, use a single/constant phrase instead of
error-dependent phrases. This principle is for level-0/1 messages, but
it is OK to apply it here as well, especially given the related TODO.

Also improved RawPointer reporting so that we can use it for this fix.
Some of these changes are not necessary (for the final code state), but
all of them except the last one were necessary at some point during the
work on this fix, and they all may become handy in the future:

* Print nothing if ptr is nil. Both existing callers benefit from this.
* Allow the caller to set the label/object delimiter. Also faster.
* Support reporting object values on a dedicated Debug::Extra line.
* Support (and do not attempt to print) nil RawPointer() labels.

Maintenance: Fix compiler warnings with --enable-ecap (#1054)

src/adaptation/ecap/XactionRep.cc:

`ACAMLI` locally defined but not used [-Wunused-local-typedefs]
unused parameter `bp` [-Wunused-parameter]

Bug 5211: support.cc:355: "!filledCheck->sslErrors" assertion (#1044)

One master transaction may encounter several certificate
validation-related errors because Squid may need to validate several
server certificates for one connection and may even need to open several
server connections. Since 2012 commit 4fb72cb, ssl_verify_cb()
incorrectly assumed that no past errors were possible. The callback
started to assert when recent code changes (commit e227da8?) improved
delivery of past errors to ACL checks.

While fixing this bug, we discovered that ServerBump::serverSession
ignored new TLS connection when Squid connected to another destination
after a failed TLS handshake. That stale value was then used to extract
stale information about past TLS errors. In some cases, that would
prevent Squid from hitting the now-fixed assertion. In other cases, it
could result in wrong sslproxy_cert_sign and sslproxy_cert_adapt
decisions. Now also fixed; see Ssl::ServerBump::attachServerSession().

A nil PeerConnector::check->sslErrors value may get stale because
PeerConnector::check is filled just once: sslError will remain nil
during the second validation pass (after PeerConnector fetches missing
intermediate certificates and revalidates), despite validation errors
discovered during the first pass. Today, ssl_verify_cb() does not really
use FilledChecklist::sslErrors preset by its PeerConnector, so a stale
value currently has no effect. Fixing this waiting-to-happen problem is
not trivial (we tried). A proper fix deserves a dedicated PR. Added XXX.

While working on these changes, we discovered that Squid implements two
mutually exclusive ssl_error (FilledChecklist::sslErrors) definitions:

* In sslproxy_cert_error context, ssl_error matches the last or current
  error being examined by the sslproxy_cert_error code.

* In sslproxy_cert_sign and sslproxy_cert_adapt context, ssl_error
  matches any previously observed validation error.

This change updates Squid documentation to reflect the above findings.
Unfortunately, it is not feasible to remove this unwanted directive
"sensitivity" without deprecating/replacing sslproxy_cert_error and
ssl_error: Selecting any one of the two ssl_error definitions above may
seriously (and silently) break existing configurations:

    # If all-errors semantics is adopted for sslproxy_cert_error, then
    # this configuration will start allowing certificates with _any_
    # error as long as that certificate also has a minorError:
    acl minorError ssl_error ...
    sslproxy_cert_error allow minorError

    # If last-error semantics is adopted for sslproxy_cert_sign, then
    # this configuration will stop properly signing certificates that
    # mimic self-signed real certificates if the very last error during
    # that real certificate validation was _not_ certSelfSigned:
    sslproxy_cert_sign signSelf ssl::certSelfSigned

Fixed (and renamed) Ssl::ReadX509Certificate() API (#1048)

This pointer-returning non-throwing function was used to load an unknown
number of certificates from a file. A nil result normally indicated the
end of the certificate sequence but also a certificate loading failure.
The caller could not distinguish the two nil-result outcomes without
OpenSSL error stack analysis, which is difficult to do correctly, and
which none of the callers bothered to do, resulting in some certificate
configuration errors silently ignored by Squid.

The adjusted API uses exceptions to report errors. A nil result is still
used to report the end of the certificate sequence. To help callers that
must have at least one certificate, one function is now dedicated to
loading required certificate(s) while a companion function is added to
load optional ones.

Avoid exponential growth in the number of changed functions by wrapping
callers inside the not-ready-to-throw code chains (that cannot be easily
converted) with try/catch-return-false blocks.

Also removed the need for Ssl::readCertFromMemory() by adding
Ssl::ReadOnlyBioTiedTo(). Functions reading things (from various input
sources) should be kept separately from functions creating input sources
(for various reading functions). Otherwise, we will end up creating a
lot more functions (at least readers*creators) with virtually no gain in
performance or convenience. OpenSSL BIO API exists specifically to
separate I/O mechanisms from I/O users. Use it to our advantage.

We now also clear old/stale OpenSSL error stack before performing the
certificate loading operation that may trigger one or more errors.

Also forget stale errno when forgetting OpenSSL errors. This reset
avoids misleading error reports based on stale errno values.

Also detail certificate loading errors, now that they are not ignored.

ReportSysError was added to avoid SysErrorDetail::NewIfAny() memory
allocations when reporting basic system call errors. Perhaps there is a
way to reuse SysErrorDetail here via (enhanced) Optional, but the
resulting complexity and the current SBuf-based reporting overheads do
not justify the reuse of "if errno then strerror(errno)" logic. We may
revisit this when Optional supports non-trivial classes.

Also moved most SysErrorDetail implementation details to the newly added
.cc file (to reduce user exposure), replacing strerror() with xstrerr().

ForgetErrors guts were moved from Security to Ssl because we cannot link
helpers with src/security (yet). We may have to refactor Security to
make such reuse possible in the future.

Add Universally Unique IDentifier (UUID) support (#1015)

These 128-bit UUIDs have a very high chance of being unique across
SMP Squid kids and even across independent Squid instances.

Our UUID creation algorithm uses random number generation as described
in RFC 4122 Section 4.4 (UUID version 4 variant 1).

Preserve caller context across (and improve) deferred reads (#1025)

The transaction context was not saved/restored when dealing with
deferred reads initiated by events like the DelayPools::Update() event.
To fix this, we refactored MemObject::delayRead() and its descendants to
use an AsyncCall, which automatically stores/restores code context.

Using explicit async callbacks highlighted the danger of passing
Connection object via CommRead that does not maintain a closure
callback. There was also a related "stuck transaction" suspicion
documented in DeferredReadManager::kickARead(). Fortunately, all these
problems could now be solved by removing DeferredRead and CommRead
classes! The delayed readers already store the Connection object,
maintain closure callbacks, and have to check stored Connection validity
before reading anyway. The general/centralized delayed reading logic is
not really about reading and Connections (those parts are handled by
transaction-specific code) but about triggering reading attempts.
Asynchronous calls are perfect (and sufficient) for doing that.

Also fixed Delay Pools for Gopher: delayAwareRead() was initiated only
once from gopherSendComplete() and the subsequent read calls were
delay-unaware (i.e. immediate) reads.

Also fixed a Delay Pools problem with active transactions: A transaction
started with Delay Pools on becomes stuck if a reconfiguration turns
Delay Pools off.

Also refactored the existing AsyncCall FIFO intrusive storage, making
its reuse possible (and marked one candidate with a TODO).

Ensure initClient MasterXactions have listening ports (#993)

MasterXaction creation code that forgets to supply a port for an
initClient transaction will now be rejected by the compiler.

This safety improvement hit two problems described below.

The TcpAcceptor class was incorrectly creating a MasterXaction object
for FTP DATA connections. We moved MasterXaction creation to TcpAcceptor
callbacks because only they know whether accepting a TCP connection
implies a new master transaction start (it does not for FTP DATA cases).

The Downloader class was implemented to spawn its own MasterXaction
using XactionInitiator info supplied by the download requestor. That
design is incompatible with the new static assertion because each
MasterXaction creator must now hard-code XactionInitiator value for the
assertion to work at _compile_ time. All other contexts naturally
hard-code that value.

We saw two ways to resolve this conflict:

a) Let the download requestor create the MasterXaction object for the
   Downloader. The primary advantage of this (implemented) approach is
   that it allows (future) client request processing code to satisfy
   client requests using Downloader. Such request satisfaction requires
   sharing the master transaction between the client transaction and the
   downloader transaction, and this change enables such sharing. This
   change moves the "Which master transaction does this download belong
   to?" question from the Downloader into download requestors.

b) Move initClient-has-port enforcement (and squidPort) from
   MasterXaction into XactionInitiator. The primary advantage of this
   (not implemented) approach is that it places that enforcement inside
   the type it is meant to police (essentially) -- XactionInitiator.

The two changes are complementary. We did not implement (b) because it
requires significant XactionInitiator redesign, moving _all_ originating
client information from MasterXaction to XactionInitiator (currently
squidPort and tcpClient).

Update status code 413 compliance (#1040)

The latest HTTP specification requires the data
transmitted in HTTP messages be called 'content'.

The official reason text for status code 413 has
been altered to comply.

Added a debugging tool: scripts/trace-context.pl (#1039)

This debugging script finds cache.log lines that belong to the given
CodeContext, identified by its gist (e.g., "conn123" or "master42").

Like other debugging scripts added in commit e83fd25, this one requires
maintenance as the cache.log format changes.

Cleanup ClientHttpRequest-related code (#1045)

No logic changes.

Ensure null-termination of string in negotiate_wrapper (#1031)

Coverity identified a theoretical chance that a buffer may not be
null-terminated in negotiate_wrapper. The code flow is clean, adding a
forced null termination to apply defensive programming practices.

Simplify gopherToHTML() (#1026)

No logic change.

Initialise default_keytab in negotiate_kerberos_auth (#1032)

Address a Coverity-identified issue, where default_keytab might be read
when uninitialised in negotiate_kerberos_auth.
Ensure it is initialised at declaration.

Detected by Coverity, CID 1503291 (Uninitialized scalar variable)

Defensive coding for ESI stackmember (#1033)

Add initializer for the value union in struct stackmember in ESI

Coverity CID 1494364: Uninitialized scalar variable (UNINIT).

Delete trailing whitespace (#1041)

Remove outdated replacement for EV_SET (#1038)

This removes a workaround for a macro that was missing on
FreeBSD over 20 years ago.

Remove unused libbsd (#1037)

Squid links against libbsd, but doesn't actually use any of its
functions.

I've build tested this on Alpine Linux.

Fix CID 1503308: Logically dead code (DEADCODE) (#1036)

The rsa variable has already been checked for nil and
cannot be unset before this extra check.

Detected by Coverity Scan

Bug 5208: Part 1: Restart kids killed by SIGKILL (#1035)

OOM killer uses SIGKILL. Squid did not restart kids killed by SIGKILL.
Kids are essential Squid components. Essential components should be
revived (by default) because providing a service without an essential
component would violate Squid functionality requirements.

Squid did not revive a kid killed by SIGKILL because we thought that
doing so will interfere with the "squid -k kill" feature that uses that
signal to kill the whole Squid instance. However, that feature does not
actually work[^1] -- the signal is sent to (and kills) the master
process only, the process which PID is in squid.pid file. This change is
orthogonal to fixing "squid -k kill" (a difficult out-of-scope project).

[^1]: Except in the special case of the no-daemon (squid -N) mode.

Fix CID 1461131 Invalid type in argument to printf (#1027)

in ext_edirecory_userip, persist_timeout is defined
as a time_t, wihich doesn't fly well with printf.
Cast it to int for printing; since it is set using atoi
it is guaranteed not to overflow anyway

Initialize all HttpStateData data members (#1029)

If fwd->serverConnection is null, _peer is read before being
initialized. We now make sure all HttpStateData data members are
initialized (using in-class member initialization).

Detected by Coverity. CID 1494356: Uninitialized pointer read (UNINIT).

Also removed HttpStateData::read_sz as unused.

Improve Ipc::Mem page limits accounting (#1030)

Undocumented Ipc::Mem::PageId::maxPurpose is used to mark freed pages.
Freed pages do not participate in estimating future memory needs, so the
Limits array does not need to store their needs. However, some of the
needs estimation code is written without that assumption in mind because
maxPurpose is a legitimate/used purpose value (rather than an enum end
marker). This change improves this code in case our assumption change.

Detected by Coverity:
* CID 1504262: Out-of-bounds write (OVERRUN)
* CID 1504263: Out-of-bounds read (OVERRUN)

sslcrtd_program: Use (more) Squid-wide APIs (#1021)

This switch allows us to use Squid-native debugging and error handling
code in ssl/libsslutil.la. Such use is necessary to fix and enhance
certificate handling code currently located in that library. For now, we
only did straightforward adjustments like fixing parseBytesOptionValue()
and Ssl::CrtdMessage::parseRequest() error detection/handling. More
serious changes deserve dedicated PRs.

This change also converts the remaining bare std::runtime_error uses in
runtime Squid code and almost all std::cerr uses in sslcrtd_program:

* std::runtime_error: TextException provides the source code location of
  the thrower and will be enhanced along with Squid improvements.

* std::cerr: debugs() provides much better runtime logging, on many
  levels. The usage() dump still uses std::cerr because debugs()
  decorations are not useful in that special case and because, IMO, that
  case should be using std::cout instead (it is not reporting an error).

Removing STL APIs (that have Squid-native alternatives) from the old
helper code also reduces the temptation to use wrong APIs in new code,
especially when authors are not familiar with Squid conventions/plans.

Also replaced sslcrtd_program's `Here` hack with SourceLocation.

Also report all started helpers (at debug level 2) because a successful
helper start is a significant event worth reporting (when level-2
debugging is enabled) across all helpers.

Fix SQUID-MIB smilint errors (#1020)

- Reorganized cachePeerTable and related objects to fix
  smilint errors which prevent importing the MIB in compilers
  with more strict validation (e.g. MGsoft).

- Import of Unsigned32 from SNMPv2-SMI was removed as it was
  not being used.

- Improved formatting consistency.

Improve handling of Gopher responses (#1022)

Add Assure() as a replacement for problematic Must() (#864)

The now-deprecated Must() has several interdependent flaws:

1. Must() logs at debug level 3, hiding some important bugs from humans.

2. Must() has been (ab)used for both checking code logic and validating
   input, making purpose-specific implementation changes impractical.

3. Must() does not honor the standard NDEBUG macro, complicating runtime
   cost assessment and surprising some developers that are used to that
   standard assert() semantics.

The new Assure() is a throwing version of a POSIX assert(3):

* Meant for detecting Squid code logic bugs (not input validation).
* Informs admins about bugs by logging their info to cache.log.
* Completely disabled in NDEBUG builds.
* Kills the current component (e.g., a Squid-origin HTTP transaction).

The killed component boundary is essentially defined by the location of
the handling try/catch statement. Some Assure() failures will kill a
Squid process, but the throwing code should not worry (or even know)
about the catcher location and handling logic.

This change also optimizes the compiled Must()/Assure() caller code
size, which may help a bit with runtime performance: With the new
Assure()/Must() implementation approach, the total stripped Squid
executable size in one reasonable configuration goes down by 5%. For
comparison, removing all Must()s completely gives 6% size reduction.

Automatically replacing Must() calls with Assure() is not practical due
to the second flaw itemized above.

SourceLayout: Move time related tools to time/libtime.la (#1001)

Just code moves and documentation.

Solves the occasional build issues of copying time.cc for linking to
pinger and tools/ binaries. Which now link to the library.

Bug 5186: noteDestinationsEnd check failed: transportWait (#985)

When the "no more destinations to try" notification comes after the last
forwarding/tunneling attempt has failed, the !destinationsFound block
does not run (because destinations were found), the usingDestination()
block does not run (because we are done with that last/failed
destination), but transportWait is false (for the same reason).

Also applied Bug 5090 (master/v6 commit 15bde30) FwdState protections to
tunnel.cc code. Tunnels may continue writing to the client while the
to-server connection is closing or closed, so TunnelStateData can be
potentially exposed to bug 5090 "no server connection but still
transporting" concerns. TunnelStateData never _retries_ successfully
established tunnels and, hence, can (and now does) stop receiving spare
destinations after committedToServer becomes true, but future tunnels
may start reforwarding in many cases, and most of the code does not need
to know about this (temporary?) simplification.

Also re-unified and polished related FwdState and TunnelStateData code,
including fixing lying source code comments and debug messages.

Improve debugs() handling in helpers (#1011)

This change also reduces libdebug dependency on globals.cc, improving
libdebug reusability.

Also do not default-reset debug sections after they were explicitly set:

* In most sbin/squid contexts, DebugModule constructor is called before
  Debug::parseOptions(). That call order results in the Levels array
  being reset to default values before it is reset to configured values.
  No problem.

* In sbin/squid -X context, DebugModule constructor is called after
  Debug::parseOptions(), but the override_X flag protected the Levels
  array from being reset to default values in this case. No problem.

* In helper contexts, DebugModule constructor may be called after
  Debug::parseOptions(), and the override_X flag stays false. This order
  results in the parseOptions() effects erased by ResetSections() called
  from the constructor.

This bug was detected while trying to understand why pinger's hard-coded
(and wrong) ALL,10 default has no effect on pinger's debugging. The two
bugs cancelled each other.

The following changes affect pinger (now) and other helpers that will
eventually use libdebug and its new NameThisHelper() API:

* Label helper debugs() lines with the helper name (e.g., "pinger"),
  similar to how we already label SMP debugs() lines with "kidN". This
  change improves cache.log readability, distinguishes output from
  different helpers, and distinguishes helper output from sbin/squid
  output in non-SMP logs.

* Make sure level-1 debugs() messages are logged.

* Stop pointless accumulation of cache.log and syslog channel messages.

* Automatically honor SQUID_DEBUG environment variable set by the parent
  Squid process. The pinger helper was already honoring it.

The following changes are specific to pinger:

* Removed the now-duplicated references to "pinger" in pinger debugs().

* Do not default pinger debugging levels to ALL,10. The default pinger
  debugging levels should be the same as the default Squid debugging
  levels (i.e. "ALL,1"). Bugs in the debugging module prevented the
  hard-coded "ALL,10" (or any other elevated setting) from having an
  effect, but we now fixed the last of those bugs. AFAICT, the change of
  default from ALL,1 to ALL,10 in commit cc192b5 was accidental.

Prep for 5.5 (#1014)

Remove failing-to-build and not-built-by-default ufsdump (#1013)

The program was not built by default since 2010 (see commit 2d94e2d).
Its build has been failing since before 2017 (see commit 4c2f8b7).

Judging by commit log, other ufsdump problems were discovered and often
left unaddressed throughout the years. Addressing some of them probably
requires serious work. The program source code and its dependencies have
been neglected for a long time and are a distraction. If we decide the
Project should have this tool, it should be rewritten from scratch.

The program does not support newer STORE_META_ TLVs and rock cache_dirs.

There are no signs that the program is in demand. No official bug
reports mention ufsdump since 2013. I could only find irrelevant
squid-dev references and no squid-users references at all since 2012.

comm/libminimal.la to facilitate helper use of convenience libs (#1009)

Helpers may not (want to) use fd_open() and fd_close() directly, but
they (want to) use libdebug which does use fd_open() and fd_close().

Currently, the Comm "importing" fd_open() and Comm "delisting"
fd_close() APIs are not implemented inside src/comm/ but they do belong
to the Comm module and use its internals. Moving fd.h to src/comm/ will
require a lot of noisy out-of-scope changes deserving a dedicated PR.

Despite their names, the minimally-implemented functions do not open(2)
and close(2) file descriptors in their full implementations either: That
full implementation just updates fd-associated Squid-specific metadata
that Squid helpers do not need/use.

Do not use stub_fd.cc in the deployed/non-test pinger program.

Avoid build-breaking copying of stub_fd.cc source file in test-suite/.

Maintenance: Removed a few known unused globals from src/tests (#1008)

Do not ApplyTcpKeepAlive() to PortCfg-unrelated traffic (#1006)

Honor assertions during shutdown (#1007)

From code correctness/guarantees point of view, it is much better to
assert than to exhibit undefined behavior, especially since the
asserting code is usually not shutdown-specific and the shutdown state
often lasts through hundreds of transactions.

We are not aware of any frequent assertions during shutdown, and we want
to fix the ones that do exist (instead of not knowing about them). Thus,
this change is unlikely to introduce a lot of problems and might trigger
other positive changes.

Bypassing assertion failures does not guarantee the code will keep
running: In many (most?) cases, the asserting code will still crash or
seriously misbehave. In those cases, this change clearly improves Squid.

We ignored assertion failures during shutdown since Squid started
customizing assert() in commit 54f742e. Back in 1998, Squid was known to
often crash while shutting down, the crashes were often "benign" (the
code was just mishandling disappearing modules), and Squid could not
always start after a crash, complicating startup scripts. With a focus
on adding new features, we probably felt it is best to ignore these
usually minor but often annoying failures.

This change reduces libdebug dependency on globals.cc, addressing commit
6249367 TODO, and improving libdebug reusability.

Logformat %lp expands to "-" in wildcard listening port configs (#997)

FindListeningPortAddress() and its helpers look for "local" address of
an accepted connection. When FindListeningPortAddress() is called in %la
context, we must skip Ip::Address objects with INADDR_ANY IP addresses
because we are looking for a specific IP address, not a wildcard.
However, when called in %lp context, skipping those Ip::Address objects
may, in some cases, result in skipping the only object that actually
contains the port information, resulting in %lp expanding to "-".

Similarly, zero-port Ip::Address objects (with non-any IPs) could be, in
theory, returned instead of continuing the search for an object with a
non-zero port number, although this case was not observed in tests.

Now we configure the address searching helpers with a "good Ip::Address
object" filter so that each %code code path can customize its search.

The problem was introduced in commit ea35939 that expanded INADDR_ANY
check scope from %la to %lp.

Remove SCO 3.2 support (#1005)

This OS is now obsolete. Users wishing to build for this OS
can use build time options:
./configure --disable-poll CFLAGS="-lintl"

also add to lib/util.c:
#define rint(X) floor((X) + 0.5)

mem/libminimal.la to facilitate helper reuse of convenience libs (#1004)

Squid helpers/tools cannot reuse features like SBuf because, in part,
the corresponding convenience libraries depend on libmem.la which drags
in heavy dependencies, including Store and time-based Event modules. The
new mem/libminimal convenience library implements enough of src/mem/
APIs to make the new memory library usable by helpers/tools without
causing a dependency explosion.

Do not use tests/stub_libmem.cc in the deployed/non-test squidclient,
cachemgr.cgi, and pinger programs.

Kid restart leads to persistent queue overflows, delays/timeouts (#706)

    WARNING: communication with ... may be too slow or disrupted...
    WARNING: abandoning ... I/Os
    ERROR: worker I/O push queue for ... overflow...
    ERROR: Collapsed forwarding queue overflow...

SMP queues rely on the shared memory QueueReader::popSignal flag to
reduce the number of UDS messages that queue readers and writers need to
send each other. If the flag is true but there is no corresponding "wake
up, you have new queued items to read" UDS message for the reader, the
reader may stall. This happens when the reader restarts (e.g., after
hitting an assertion) while the flag is true. A stalled queue reader
leads to delays and queue overflows:

* When the problem affects worker-disker queues, disk I/O delays under
  the hard-coded 7-second timeout are not reported to the admin but may
  affect user experience. Larger delays trigger level-1 WARNINGs. Push
  queue overflows trigger level-1 ERRORs.

* Transient worker-worker queue problems may stall concurrent
  transactions that are reading from the cache entry being written by
  another process. Overflows trigger level-1 ERRORs.

The restarted worker usually starts working just fine because it does
not expect any messages. A busy restarted worker may also appear to
continue working fine because workers always pop queued items before
pushing new ones -- as long as the worker queues new items, it will see
and pop responses to earlier requests, masking the problem. However, the
"stuck popSignal" problem never goes away: Squid only clears the flag
when receiving a notification, but sending new notifications is blocked
by that stuck flag.

Upon kid start, we now clear popSignal (to reflect the actual
communication state) and empty the queue (to reduce overflows). Since
commit 4c21861, any corresponding in-flight UDS queue notification is
ignored because it was sent to the previous process playing the same kid
role. The queue writer will see the false popSignal flag and send a new
notification when queuing a new item, preventing queue stalls.

Also properly ignore stale disker responses about cache_dirs we have not
opened yet, especially since we are now trying to empty the queues ASAP
after a restart, before Coordinator has a chance to inform us about
available diskers, populating the IpcIoFiles container. We already have
similar protection from stale UDS messages and from stale disker queue
messages about _opened_ cache_dirs ("LATE disker response to...").

Also report SMP queue flags in mgr:store_queues.

Detach libsbuf from StatHist to facilitate SBuf reuse (#1003)

The stated purpose of sbuf/DetailedStats was to "avoid adding external
dependencies to the SBuf code". DetailedStats failed to accomplish that
because it introduced a dependency on StatHist (which depends on Store).

To break the unwanted dependency, we outsource at-destruction-time size
statistics collection to external-to-SBuf code, configurable via the new
SBufStats fields. This also allows to avoid non-trivial SBuf statistics
collection in programs that do not need that statistics.

Since the new SBufStats fields are set at cache manager configuration
time in mainInitialize(), earlier SBuf and MemBlob destructions are not
accounted for. With more code changes/complications, we could initialize
the fields much earlier, but this delay may be considered a _positive_
change because it removes unusual SBuf stats from long-term histograms.

Also do not use the SBuf STUB in the deployed/non-test pinger program.
TODO: Convert more helpers to use SBuf. This change only adjusts pinger
because that adjustment did not require other significant changes.

Bug 4946: client_side_request.cc: "request != newRequest" (#1000)

... assertion when preceded by the following error message:

    ERROR: Inconsistent service method ... in dynamic adaptation chain

The assertion is triggered by the following chain of events. During
SslBump step1, a REQMOD adaptation service returns a dynamic
(X-Next-Services) plan containing a RESPMOD service P. Then, during
SslBump step2 (after obtaining TLS client SNI):

* Adaptation::AccessCheck::start() discovers P in the "future services"
  storage (Adaptation::History::theFutureServices) and returns it.

* The adaptation routing code correctly concludes that P is not
  applicable to the current vectoring point, logs the above ERROR, and
  returns the untouched virgin message object to the adaptation
  initiator. See thePlan.exhausted() in Adaptation::Iterator::step().

* ClientHttpRequest asserts because it expects a new message object.

Fixed Adaptation::AccessCheck code no longer assumes that it cannot be
activated twice for the same vectoring point. It leaves services
applicable to future vectoring points in theFutureServices instead of
always suggesting them for the current vectoring point.

TODO: We can and should optimize adaptation requesting code to stop
requiring a new message object when no adaptation is necessary, but that
change is difficult (we tried!) and independent from the bug fixed here.

Fix ignore-cc/act-as-origin in wildcard split-stack ports (#994)

The PortCfg::clone() hack (and clone_http_port_list() before it) forgot
to copy those two flags to the IPv4 port variant.

Compilers will now be able to warn us if copying misses future members.

Also prohibited other forms of copying, nearly restricting copying to
the parsing code with special needs.

Add RegisteredRunners::bootstrapConfig event hook (#992)

Remove --disable-loadable-modules build option (#990)

... in favour of libtool --enable/disable-shared option which
provides the same functionality.

This has the nice side effect of simplifying the LT_INIT and
related autoconf sequences which were being modified by
--disable-loadable-modules.