The order was reversed in commit cf48712, probably by accident. The wrong
order violates the TLS protocol and breaks TLS clients that are incapable of
reordering received intermediate CAs. Squid deployments that use
wrong-order bundles (to compensate for this bug) should reorder their
bundles when deploying this fix (or wait for Squid to order certificates
correctly, regardless of the bundle order -- a work in progress).
Bug 5187: Properly track (and mark) truncated store entries (#909)
Squid used an error-prone approach to identifying truncated responses:
The response is treated as whole[^1] unless somebody remembers to mark
it as truncated. This dangerous default naturally resulted in bugs where
truncated responses are treated as complete under various conditions.
This change reverses that approach: Responses not explicitly marked as
whole are treated as truncated. This change affects all Squid-server
FwdState-dispatched communications: HTTP, FTP, Gopher, and WHOIS. It
also affects responses received from the adaptation services.
Squid still tries to deliver responses with truncated bodies to clients
in most cases (no changes are expected/intended in that area).
[^1]: A better word to describe a "whole" response would be "complete",
but several key Squid APIs use "complete" to mean "no more content
is coming", misleading developers into thinking that a "completed"
response has all the expected bytes and may be cached/shared/etc.
Transactions that failed due to origin server or peer timeout (a common
source of truncation) are now logged with a _TIMEOUT %Ss suffix and
ERR_READ_TIMEOUT/WITH_SRV %err_code/%err_detail.
Transactions prematurely canceled by Squid during client-Squid
communication (usually due to various timeouts) now have WITH_CLT
default %err_detail. This detail helps distinguish otherwise
similarly-logged problems that may happen when talking to the client or
to the origin server/peer.
FwdState now (indirectly) complete()s truncated entries _after_
releasing/adjusting them so that invokeHandlers() and others do not get
a dangerous glimpse at a seemingly OK entry before its release().
Alex Rousskov [Sun, 9 Jan 2022 10:41:24 +0000 (10:41 +0000)]
Bug 5132: Close the tunnel if to-server conn closes after client (#957)
Since commit 25d2603, blind CONNECT tunnel "jobs" (and equivalent) were
not destroyed upon a "lonely" to-server connection closure, leading to
memory leaks. And when a from-client connection was still present at the
time of the to-server connection closure, we did not try to reforward,
violating the spirit of commit 25d2603 changes. Calling retryOrBail() is
sufficient to handle both cases.
Make sure the StoreMap anchor we open for reading has our key (and not
just happens to be at our hash position). Prior to this change, two
openForReadingAt() calls were missing a sameKey() post-call check due to
a buggy backport (v5 commit ec50061; mis-attributed to me).
Now the sameKey() check is integrated into the openForReadingAt() method
itself, as was already done in master/v6 since commit b2aca62.
Alex Rousskov [Sat, 30 Oct 2021 11:41:10 +0000 (11:41 +0000)]
Docs: %adapt::sum_trs entries may well exceed %icap::tt (#914)
%icap::tt documentation incorrectly implied that the measurement
includes the entire ICAP transaction(s) lifetime. In reality, individual
ICAP transaction contribution stops with
Adaptation::Icap::ModXactLauncher::swanSong(), which is normally
triggered by Adaptation::Icap::Launcher::noteAdaptationAnswer(). Here,
the "answer" does not include the entire ICAP response, but just enough
information to form adapted HTTP message headers (echoed or received).
Thus, a large or slow ICAP response body may result in %adapt::sum_trs
values that far exceed the corresponding %icap::tt "total".
This change does not imply that %icap::tt should (not) work differently.
Also fixed a typo in %adapt::all_trs and polished %adapt::sum_trs docs.
When an address is expected to be a string in the form "Host:Port", the
Host MUST be wrapped in square brackets iff it is a numeric IPv6
address. The ":Port" part is usually optional, e.g. "[2001:db8::1]:443"
or "[2001:db8::1]".
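The bracketing rule can be sketched with a small helper (a hypothetical
illustration, not Squid's actual toHostStr() implementation):

```python
def format_host_port(host, port=None):
    """Format "Host:Port", bracketing numeric IPv6 hosts.

    A numeric IPv6 address is the only host form that contains ':',
    so it must be wrapped in square brackets to keep the ":Port"
    separator unambiguous. The port is optional.
    """
    bracketed = f"[{host}]" if ":" in host else host
    return f"{bracketed}:{port}" if port is not None else bracketed
```

For example, format_host_port("2001:db8::1", 443) yields "[2001:db8::1]:443",
while a hostname or IPv4 address is left unbracketed.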
There are two bugs related to the handling of numeric IPv6 addresses:
* Bug 1: AnyP::Uri::host(const char *) is supposed to accept the host
part of a URI, but just copied it into hostAddr_ instead of using the
fromHost() method to set hostAddr_. Since the argument is the host
part of a URI, numeric IPv6 addresses are wrapped in square brackets,
which would never be stripped and hostIsNumeric was therefore not set.
We now use the fromHost() method to process the host name correctly.
* Bug 2: Conversely, AnyP::Uri::hostOrIp() is supposed to return the
host name, suitable for use in a URI or Host header, which means that
numeric IPv6 addresses need to be wrapped in square brackets.
However, that wrapping was never done. We now use the toHostStr()
method to correctly format the string.
As far as I know, neither of these bugs actually breaks anything at the
moment, but both have the potential to break future developments, so
applying this fix should have no operational impact.
Also removed wrong IP::Address::toStr() description from .cc file that
(poorly) duplicated correct description of that method in the header.
unlockSharedAndSwitchToExclusive() was not properly failing when future
readers violated preconditions for obtaining an exclusive lock:
- `if (!readers)` means no old readers
+ `if (!readLevel)` means no old readers and nobody is becoming a reader
That missing "becoming a reader" condition covers a lockShared() caller
that had passed their writeLevel test before we incremented writeLevel
to lock other lockShared() callers out. That caller increments `readers`
while unlockSharedAndSwitchToExclusive() makes `writing` true.
Introduced in commit 1af789e due to poor code duplication. This change
also removes that code duplication by adding finalizeExclusive().
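Under my reading of the commit message, the race can be modeled with a toy,
non-atomic lock. The field names mirror Squid's ReadWriteLock, but this is
only an illustrative sketch, not the real implementation:

```python
class ReadWriteLockSketch:
    """Toy model: readLevel counts readers *plus* callers still becoming
    readers; readers counts only fully established readers."""

    def __init__(self):
        self.readLevel = 0
        self.readers = 0
        self.writeLevel = 0
        self.writing = False

    def lock_shared_enter(self):
        # First phase of lockShared(): announce intent, test writeLevel.
        self.readLevel += 1
        return self.writeLevel == 0

    def lock_shared_finish(self):
        # Second phase: the caller becomes an established reader.
        self.readers += 1

    def unlock_shared_and_switch_to_exclusive(self):
        self.writeLevel += 1      # lock new lockShared() callers out
        self.readers -= 1         # release our own shared lock
        self.readLevel -= 1
        # Fixed precondition: nobody is a reader *or becoming one*.
        # The buggy `readers == 0` check missed a caller that had passed
        # its writeLevel test but had not yet incremented `readers`.
        if self.readLevel == 0:
            self.writing = True
            return True
        return False              # real code properly fails here
```

In the race, a second caller has passed its writeLevel test but has not yet
incremented readers; the buggy check would grant the exclusive lock anyway.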
Bug 5164: a copy-paste typo in HttpHdrCc::hasMinFresh() (#901)
This bug could result in unexpected hits/misses because Squid used the
cached CC:max-stale value (or -1) instead of CC:min-fresh when
calculating entry freshness.
Leaking since commit 4eac340 that migrated from statically-allocated
ACLStrategy<...>::Instance_ objects to dynamically allocated ones.
Most ACLStrategy objects have no data members, probably leaking just one
virtual table pointer (per named ACL), but strategies that support ACL
(data) --options, like ACLDestinationDomainStrategy and
ACLServerNameStrategy, leak memory used for options storage.
Busy "class helper" objects often outlive configuration that was used to
create them. Ideally, the supplied helper category name should be copied
and maintained by the helper object itself, but that long-term solution
requires a lot more work (TODO) due to out-of-scope bugs.
The helper.cc code (ab)uses two different Connection objects with the
same FD. Properly fixing that problem requires significant work, but the
official code almost gets away with it, in part, because helper.cc, with
one exception, uses heavily customized code to "safely" close its
connections (instead of just calling conn->close() as many Squid jobs
still do). This fix replaces that single exceptional case with a
closeWritePipeSafely() call which prevents orphaning readPipe and
triggering BUG 3329 warnings. It also restores the symmetry between the
corresponding (previously buggy) stateless and (OK) stateful code.
A code merge mistake in commit b038892 created this bug AFAICT.
Amos Jeffries [Fri, 11 Jun 2021 09:28:25 +0000 (09:28 +0000)]
Maintenance: Update astyle version to 3.1 (#841)
Version 2.04 is quite outdated now, and there are only minor formatting
differences from v3.1. All changes look to be good syntax.
Also, pass the astyle executable to formater.pl via a command-line option,
avoiding environment variables, which do not work on servers with sanitized
sub-shell environments such as our main server.
These custom operators did not cover all new/delete cases (e.g., array
allocations), were not declared according to C++ standards (triggering
compiler warnings), and were not enabled in clang builds.
These customizations enabled custom OOM handling (for covered cases),
but it is not clear whether that feature is desirable overall, and C++
has better ways to implement such handling (i.e. set_new_handler()).
These customizations participated in collection of optional statistics
(--enable-xmalloc-statistics), but it is not clear whether that feature
implementation is good enough, and, even if it is, providing these
partial stats does not outweigh recurring customization problems.
The removed code has not been actively used almost since it was added.
It is now widely accepted that NAT and TPROXY can only be done on the
machine running Squid. The corresponding address lookup errors are an
indication of either a system misconfiguration or an adverse external
event such as flushing of conntrack tables. Since these errors should be
fatal to the affected transactions and the admin usually has the power
to address them, Squid should report them at level 1.
Alex Rousskov [Fri, 21 May 2021 18:47:36 +0000 (18:47 +0000)]
Bug 4528: ICAP transactions quit on async DNS lookups (#795)
The bug directly affected some ICAP OPTIONS transactions and indirectly
affected some ICAP REQMOD/RESPMOD transactions:
* OPTIONS: When a transaction needed to look up an IP address of the
ICAP service, and that address was not cached by Squid, it ended
prematurely because Adaptation::Icap::Xaction::doneAll() was unaware
of ipcache_nbgethostbyname()'s async nature. This bug is fixed now.
* REQMOD/RESPMOD: Adaptation::Icap::ModXact masked the _direct_ effects
of the bug: ModXact::startWriting() sets state.writing before calling
openConnection() which schedules the DNS lookup. That "I am still
writing" state makes ModXact::doneAll() false while a REQMOD or
RESPMOD transaction waits for the DNS lookup.
However, REQMOD and RESPMOD transactions that require an OPTIONS
transaction (because the service options have never been fetched
before or have expired) could still fail because the OPTIONS
transaction they trigger could fail as described in the first bullet.
For example, the first few REQMOD and RESPMOD transactions for a given
service -- all those started before the DNS lookup completes and Squid
caches its result -- could fail this way. With the OPTIONS now fixed,
these REQMOD and RESPMOD transactions should work correctly.
Amos Jeffries [Wed, 19 May 2021 11:37:34 +0000 (11:37 +0000)]
Bug 4832: '!schemeAccess' assertion on exit (#819)
* Add test to detect the bug and prevent regressions
* delete the access list if not cleared by the time
Auth::Config instance is destructed. This is what the
free_AuthSchemes() code does when the function call
nesting and debugs are pruned away.
Amos Jeffries [Thu, 6 May 2021 08:38:29 +0000 (08:38 +0000)]
Fix GCC -Werror=range-loop-construct (#808)
This warning detects unnecessary object copying in C++ range loops, with
a focus on large objects and copies triggered by implicit type
conversions. Included in -Wall.
error: loop variable ... creates a copy from type ...
Alex Rousskov [Mon, 3 May 2021 21:40:14 +0000 (21:40 +0000)]
Stop processing a response if the Store entry is gone (#806)
HttpStateData::processReply() is usually called synchronously, after
checking the Store entry status, but there are other call chains.
StoreEntry::isAccepting() adds STORE_PENDING check to the ENTRY_ABORTED
check. An accepting entry is required for writing into Store. In theory,
an entry may stop accepting new writes (without being aborted) if
FwdState or another entry co-owner writes an error response due to a
timeout or some other problem that happens while we are waiting for an
I/O callback or some such.
N.B. HTTP and FTP code cannot use StoreEntry::isAccepting() directly
because their network readers may not be the ones writing into Store --
the content may go through the adaptation layer first and that layer
might complete the store entry before the entire peer response is
received. For example, imagine an adaptation service that wants to log
the whole response containing a virus but also replaces that (large)
response with a small error reply.
Use the already parsed request-target URL in the cache manager and
update CacheManager to Tokenizer-based URL parsing.
This removes the use of sscanf() and regex string processing, which have
proven to be problematic on many levels, most particularly with
regard to tolerance of normally harmless garbage syntax in received
URLs.
Support for generic URI schemes is added, possibly resolving some
issues reported with ftp:// URLs and manager access via ftp_port
sockets.
Truly generic support for the /squid-internal-mgr/ path prefix is
added, fixing some user confusion about its use on cache_object:
scheme URLs.
TODO: Support for single-name parameters and URL #fragments is left
to future updates, as is refactoring the QueryParams data storage to
avoid SBuf data copying.
Protect SMP kids from unsolicited IPC answers (#771)
When an SMP kid restarts, it recreates its IPC socket to flush any old
messages, but it might still receive IPC messages intended for its
previous OS process because the sender may write the message _after_
kid's IPC socket is flushed. Squid IPC communication is connectionless,
so the sender cannot easily detect the reopening of the recipient socket
to prevent this race condition. Some notifications are desirable across
kid restarts, so properly switching to connection-oriented IPC
communication would only complicate things further.
This change protects kids from, for lack of a better term, an
"unsolicited" answer: An answer to a question the recipient did not ask.
When allowed to reach regular message-handling code, unsolicited answers
result in misleading diagnostics, may trigger assertions, and might even
result in a bad recipient state. For example, a kid might think it has
been successfully registered with Coordinator while its registration
attempt was actually dropped due to a Coordinator restart.
Our protection targets one specific use case: Responses that were born
(or became) "unsolicited" due to a recipient restart. Other problematic
cases may not require any protection, may require a very different
protection mechanism (e.g. cryptography), may deal with requests rather
than responses, or even cannot be reliably detected. For example:
* messages sent by a malicious attacker
* requests sent by a misconfigured Squid instance
* requests sent by a previous Squid instance
* messages sent by a no-longer-running kid process
* messages sent by buggy Squid code
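The protection can be sketched as a pending-question registry: an answer is
handled only if it matches a question that this process generation actually
asked. This is an illustrative model; the names below are not Squid's IPC API:

```python
class QuestionerSketch:
    """Toy model of dropping unsolicited IPC answers."""

    def __init__(self):
        self.pending = set()  # IDs of questions we asked and still await
        self.next_id = 0

    def ask(self):
        self.next_id += 1
        self.pending.add(self.next_id)
        return self.next_id

    def note_answer(self, question_id):
        if question_id not in self.pending:
            # Do not let the answer reach regular message-handling code.
            return "ignored unsolicited answer"
        self.pending.remove(question_id)
        return "handled"
```

After a restart, a fresh registry is empty, so answers addressed to the
previous incarnation's questions are dropped instead of confusing handlers.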
----
Also marked a few out-of-scope problems/improvements, including a bug:
Improved handling of Coordinator and Strand exceptions exposed and
partially addressed an old problem: When configured to listen on an
inaccessible port, Squid used to kill the Coordinator job, resulting in
subsequent kid registration timeouts. Squid now correctly keeps the
Coordinator job running, logging a detailed report:
Still, the affected kids do not report the port opening problem and
continue running, listening on the problem-free ports (if any).
Depending on the exception timing, similar behavior was possible before
these changes. The correct action here is to send the port opening error
to the kid process without throwing the putFd() exception, but we should
decide how Squid should handle such inaccessible configured ports first.
Alex Rousskov [Tue, 16 Mar 2021 06:23:02 +0000 (06:23 +0000)]
Replace defective Must2(c, "not c") API (#785)
... with Must3(c, "c", Here())
* We should not make a Must*() API different from the standard C++
static_assert() API. It is just too confusing, especially since the
two calls are very closely related.
* We should not tell Must*() writers to specify one condition (i.e. what
must happen) but then describe the opposite condition (i.e. what went
wrong). When the text describes the opposite condition, it is easy for
a human writing or thinking about the text to type the wrong
condition: Must2(got < need, "incomplete message") looks correct!
We should not keep the same macro name when changing the meaning of the
second parameter. Fortunately, adding a third argument to the macro fits
nicely into how modern Squid code should pass the source code location.
Must3() does not support SBuf descriptions. I tried to support them, but
doing so without duplicating non-trivial code is too difficult, and
since the current code does not actually use them, wasteful.
If future (complex) code needs SBuf conditions, then it should probably
use an explicit throw statement instead, especially since Must*() is,
like assert(), supposed to quickly check source code sanity rather than
validate unsafe input. A compile-time condition description and the
source code location is usually enough for sanity checks, while proper
reporting of invalid input usually also requires parsing context info.
Traffic parsing errors should be reported at level 2 (or below) because
Squid admins can usually do nothing about them and a noisy cache.log
hides important problems that they can and should do something about.
TODO: Detail this and similar parsing errors for %err_detail logging.
Alex Rousskov [Mon, 15 Mar 2021 14:05:05 +0000 (14:05 +0000)]
Fix HttpHeaderStats definition to include hoErrorDetail (#787)
... when Squid is built --with-openssl.
We were "lucky" that the memory area after HttpHeaderStats was not,
apparently, used for anything important enough when HttpHeader::parse(),
indirectly called from errorInitialize() during initial Squid
configuration, was writing to it.
Detected by using AddressSanitizer.
The bug was created in commit 02259ff and cemented by commit 2673511.
Handling missing issuer certificates for TLSv1.3 (#766)
Prior to TLS v1.3 Squid could detect and fetch missing intermediate
server certificates by parsing TLS ServerHello. TLS v1.3 encrypts the
relevant part of the handshake, making such "prefetch" impossible.
Instead of looking for certificates in TLS ServerHello, Squid now waits
for the OpenSSL built-in certificate validation to fail with an
X509_V_ERR_UNABLE_TO_GET_ISSUER_CERT_LOCALLY error. Squid-supplied
verify_callback function tells OpenSSL to ignore that error. Squid
SSL_connect()-calling code detects that the error was ignored and, if
possible, fetches the missing certificates and orchestrates certificate
chain validation outside the SSL_connect() sequence. If that validation
is successful, Squid continues with SSL_connect(). See comments inside
Security::PeerConnector::negotiate() for low-level details.
In some cases, OpenSSL is able to complete SSL_connect() with an ignored
X509_V_ERR_UNABLE_TO_GET_ISSUER_CERT_LOCALLY error. If Squid validation
fails afterwards, the TLS connection is closed (before any payload is
exchanged). Ideally, the negotiation with an untrusted server should not
complete, but the complexity of the BIO changes required to prevent such
premature completion is probably not worth reaching that ideal, especially
since all of this is just a workaround.
The long-term solution is adding SSL_ERROR_WANT_RETRY_VERIFY to OpenSSL,
giving an application a chance to download the missing certificates
during SSL_connect() negotiations. We are assisting the OpenSSL team with
that change, but it will not be available until at least OpenSSL v3.0.
This description and changes are not specific to SslBump code paths.
Alex Rousskov [Fri, 19 Feb 2021 16:14:37 +0000 (16:14 +0000)]
Bug 3556: "FD ... is not an open socket" for accept() problems (#777)
Many things could go wrong after Squid successfully accept(2)ed a socket
and before that socket was registered with Comm. During that window, the
socket is stored in a refcounted Connection object. When that object was
auto-destroyed on the error handling path, its attempt to auto-close the
socket would trigger level-1 BUG 3556 errors because the socket was not
yet opened from Comm point of view. This change eliminates that "already
in Connection but not yet in Comm" window.
The fixed BUG 3556 errors stalled affected clients and leaked their FDs.
TODO: Keeping that window closed should not require a human effort, but
achieving that goal probably requires significant changes. We are
investigating.
Detail certificate validation errors during TLS handshake (#770)
Fix certificate validation error handling in Security::Connect/Accept().
The existing validation details were not propagated/copied to IoResult,
requiring the caller to extract them via ssl_ex_index_ssl_error_detail.
The clunky approach even required a special "ErrorDetail generations"
API to figure out which error detail is "primary": the one received in
IoResult or the just extracted one. That API is removed now.
This change is used by the upcoming improvements that fetch missing TLS
v1.3 server certificates, but it also has an immediate positive effect
on the existing reporting of the client certificate validation errors.
Currently, only a general TLS error is reported for those cases because
Security::Accept() code forgot to check ssl_ex_index_ssl_error_detail.
Alex Rousskov [Mon, 8 Feb 2021 22:49:42 +0000 (22:49 +0000)]
Allow 1xx and 204 responses with Transfer-Encoding headers (#769)
HTTP servers MUST NOT send those header fields in those responses, but
some do, possibly because they compute the same basic headers for all
responses, regardless of the status code. Item 1 in RFC 7230 Section
3.3.3 is very clear about message framing in these cases. We have been
ignoring Content-Length under the same conditions since at least 2018.
We should be consistent and apply the same logic to Transfer-Encoding.
I also polished the Transfer-Encoding handling comment for clarity's sake.
No functionality changes other than minor debugging improvements.
* replaced identical (except for the message kind value) HereIamMessage
and StrandSearchResponse classes with StrandMessage
* reduced code duplication with a new StrandMessage::NotifyCoordinator()
* split TypedMsgHdr::type() into unchecked rawType() and checked type()
* renamed and documented several Ipc::MessageType enum values
The above code improvements will help with adding more IPC messages.
Bug 5057: "Generated response lacks status code" (#752)
... for responses carrying a status-code with a numerical value of 0.
Upon receiving a response with such a status-code (e.g., 0 or 000),
Squid reported a (misleading) level-1 BUG message and sent a 500
"Internal Error" response to the client.
This BUG message exposed a different/bigger problem: Squid parser
declared such a response valid, while other Squid code could easily
misinterpret its 0 status-code value as scNone which has very special
internal meaning.
A similar problem existed for received responses with status-codes that
HTTP WG considers semantically invalid (0xx, 6xx, and higher values).
Various range-based status-code checks could misinterpret such a
received status-code as being cachable, as indicating a control message,
or as having special for-internal-use values scInvalidHeader and
scHeaderTooLarge.
Unfortunately, HTTP/1 does not explicitly define how a response with a
status-code having an invalid response class (e.g., 000 or 600)
should be handled, but there may be an HTTP WG consensus that such
status-codes are semantically invalid:
Since leaking semantically invalid response status-codes into Squid code
is dangerous for response retries, routing, caching, etc. logic, we now
reject such responses at response parsing time.
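The parse-time rejection described above can be sketched as follows (a
hypothetical helper, not Squid's actual parser, which works on the raw
status-line):

```python
def semantically_valid_status(field):
    """Accept only a 3-DIGIT status-code in a defined HTTP response
    class (1xx through 5xx); reject 0xx, 6xx, and higher values."""
    if len(field) != 3 or not field.isdecimal():
        return False  # not a syntactically valid 3-DIGIT status-code
    return 100 <= int(field) <= 599
```

With this check, "000" and "600" are rejected at parsing time rather than
leaking into retry, routing, and caching logic as scNone-like values.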
Also fixed logging of the (last) received status-code (%<Hs) when we
cannot parse the response status-line or headers: We now store the
received status-code (if we can parse it) in peer_reply_status, even if
it is too short or has a wrong response class. Prior to this change,
%<Hs was either not logged at all or, during retries, recorded a stale
value from the previous successfully parsed response.
Maintenance: Move sort-includes.pl to scripts/maintenance/ (#599)
We will no longer get `changed #include order` notices but will get
`changed: by maintenance/sort-includes.pl` notices, including for
whitespace-only changes.
Support plugin-style scripts for source format enforcement (#531)
Allow the source-maintenance script to run arbitrary code or sub-scripts
to perform enforcement of Squid code style and content.
Code placed in the scripts/maintenance/ sub-folder MUST meet the
following criteria:
* be self-executable,
* receive filename of the code file to be touched as one and only
command-line parameter,
* always dump the file contents to stdout (with or without edits),
* not depend on any other code in this sub-folder being run first.
Detail client closures of CONNECT tunnels during TLS handshake (#781) (#691)
... and improve detailing of other errors.
Many admins cannot triage TLS client failures, and even Squid developers
often cannot diagnose TLS problems without requiring detailed debugging
logs of failing transactions. The problem is especially bad for busy
proxies where debugging individual transactions is often impractical.
We enhance existing error detailing code so that more information is
logged via the existing %err_code/%err_detail logformat codes.
Propagating low-level error details required significant enhancements
and refactoring. We also built initial scaffolding for better error
detailing by GnuTLS-driven code and documented several key
error-handling APIs, exposing a few out-of-scope problems.
Also call checkLogging() once, after consuming unparsed input attributed to a
transaction: Due to fake CONNECT requests, from-client read errors, and
possibly other complications, we may have a transaction that did not
consume every input byte available to it. That transaction is still
responsible for reporting those unparsed bytes (e.g., by logging the
number of bytes read on a connection and the number of parsed bytes).
Also fixed passing wrong (errno vs. size) or stale (requested vs. read)
I/O size to connFinishedWithConn(); now shouldCloseOnEof(). The bad
value was "correct" (i.e. zero) in many cases, obscuring the bug.
uhliarik [Sun, 21 Feb 2021 11:15:37 +0000 (12:15 +0100)]
Fix build on Fedora Rawhide (#773)
* add SYSTEMD_LIBS to all binaries using client_side.cc, fixing linking
* add `<limits>` to all sources using std::numeric_limits, fixing gcc-11
builds
Ignore SMP queue responses made stale by worker restarts (#711)
When a worker restarts (for any reason), the disker-to-worker queue may
contain disker responses to I/O requests sent by the previous
incarnation of the restarted worker process (the "previous generation"
responses). Since the current response:request mapping mechanism relies
on a 32-bit integer counter, and a worker process always starts counting
from 0, there is a small chance that the restarted worker may see a
previous generation response that accidentally matches the current
generation request ID.
For writing transactions, accepting a previous generation response may
mean unlocking a cache entry too soon, making not-yet-written slots
available to other workers that might read wrong content. For reading
transactions, accepting a previous generation response may mean
immediately serving wrong response content (content that has already been
overwritten on disk with the information that the restarted worker is
now waiting for).
To avoid these problems, each disk I/O request now stores the worker
process ID. Workers ignore responses to requests originated by a
different/mismatching worker ID.
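The ID-plus-worker-PID matching can be sketched as follows (illustrative
names, not Squid's IpcIo code):

```python
class DiskIoQueueSketch:
    """Toy response matcher: a response is accepted only if both its
    request ID and the worker PID recorded in the request match the
    current worker generation."""

    def __init__(self, worker_pid):
        self.worker_pid = worker_pid
        self.next_id = 0   # every worker generation restarts counting at 0
        self.pending = {}  # request id -> request payload

    def send(self, payload):
        request_id = self.next_id
        self.next_id += 1
        self.pending[request_id] = payload
        return {"id": request_id, "pid": self.worker_pid, "payload": payload}

    def accept(self, response):
        if response["pid"] != self.worker_pid:
            return False  # stale: addressed to a previous worker generation
        return response["id"] in self.pending
```

Because both workers start counting request IDs from 0, the ID alone can
collide across restarts; the PID disambiguates the generations.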
The failure leads to a cache miss for one of the requests. The miss
transaction will overwrite the already cached entry, possibly creating
more misses due to long write locks used by the disk cache. The problem
does not affect collapsed requests. It affects regular cache hits.
The failure happens because two requests are trying to create a
Transients entry at about the same time. Only one can succeed. With the
current ReadWriteLock-based design, some contention is unavoidable, but
the contention window was rather large/long:
* The critical section essentially started when Controller::find() (in
each request worker) checked Transients and did not find a matching
entry there. That failed lookup put each request into "create a new
Transients entry" mode.
* A worker exited that critical section when openForWriting() in
Transients::addEntry() returned (with a write lock to the
ready-to-be-filled entry for one of the two requests and with a
failure for the other).
This change reduces the contention window to the bare minimum necessary
to create and fill a Transients entry inside openOrCreateForReading().
In the micro-tests that exposed this problem, the probability of a
failure is reduced from more than 70% to less than 3%. YMMV.
Also split addEntry() into two directional methods because, while they
are structured similarly, they actually have almost no common code now.
Alex Rousskov [Tue, 10 Nov 2020 21:42:18 +0000 (21:42 +0000)]
Transactions exceeding client_lifetime are logged as _ABORTED (#748)
... rather than timed out (_TIMEOUT).
To record the right cause of death, we have to call terminateAll()
rather than setting logType.err.timedout directly. Otherwise, when
ConnStateData::swanSong() calls terminateAll(0), it overwrites our
direct setting.
Since commit 5ef5e5c, a socket write timeout triggers two things:
* reporting of a write error to the socket writer (as designed/expected)
* reporting of a socket read timeout to the socket reader (unexpected).
The exact outcome probably depends on the transaction state, but one
known manifestation of this bug is the following level-1 message in
cache.log, combined with an access.log record showing a
much-shorter-than-client_lifetime transaction response time.
WARNING: Closing client connection due to lifetime timeout
Alex Rousskov [Wed, 4 Nov 2020 14:27:22 +0000 (14:27 +0000)]
Optimization: Avoid more SBuf::cow() reallocations (#744)
This optimization contains two parts:
1. A no-brainer part that allows SBuf to reuse MemBlob area previously
used by other SBufs sharing the same MemBlob. To see this change,
follow the "cowAvoided" code path modifications in SBuf::cow().
2. A part based on a rule of thumb: memmove is usually better than
malloc+memcpy. This part of the optimization (follow the "cowShift"
path) is only activated if somebody has consume()d from the buffer
earlier. The implementation is based on the heuristic that most
consuming callers follow the usual append-consume-append-... usage
pattern and want to preserve their buffer capacity.
MemBlob::consume() API mimics SBuf::consume() and std::string::erase(),
ignoring an excessive number of bytes rather than throwing an error.
Also detailed an old difference between an SBuf::cow() requiring just a
new buffer allocation and the one also requiring data copying.
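The "cowShift" heuristic can be sketched with a toy buffer that reclaims
consumed front space via a move instead of reallocating (illustrative only;
SBuf/MemBlob are considerably more involved):

```python
class ShiftingBufferSketch:
    """Toy append/consume buffer over a fixed-capacity byte array."""

    def __init__(self, capacity):
        self.buf = bytearray(capacity)
        self.start = 0  # offset of the first live byte
        self.end = 0    # offset past the last live byte

    def consume(self, n):
        # Like MemBlob::consume(): ignore excess bytes, do not throw.
        self.start = min(self.start + n, self.end)

    def append(self, data):
        if self.end + len(data) > len(self.buf) and self.start > 0:
            # "cowShift"-style path: memmove the live bytes to the front,
            # reusing space freed by earlier consume() calls.
            live = self.buf[self.start:self.end]
            self.buf[0:len(live)] = live
            self.end -= self.start
            self.start = 0
        if self.end + len(data) > len(self.buf):
            raise MemoryError("capacity exceeded in this sketch")
        self.buf[self.end:self.end + len(data)] = data
        self.end += len(data)

    def view(self):
        return bytes(self.buf[self.start:self.end])
```

This mirrors the rule of thumb from the commit message: for the usual
append-consume-append pattern, a move is usually cheaper than malloc+memcpy.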
Alex Rousskov [Tue, 27 Oct 2020 23:33:39 +0000 (23:33 +0000)]
Do not send keep-alive or close in HTTP Upgrade requests (#732)
The presence of a Connection:keep-alive or Connection:close header in an
Upgrade request sent by Squid breaks some Google Voice services. FWIW,
these headers are not present in RFC 7230 Upgrade example, and popular
client requests do not send them.
* In most cases, Squid was sending Connection:keep-alive which is
redundant in Upgrade requests (because they are HTTP/1.1).
* In rare cases (e.g., server_persistent_connections=off), Squid was
sending Connection:close. Since we cannot send that header and remain
compatible with popular Upgrade-dependent services, we now do not send
it but treat the connection as non-persistent if the server does not
upgrade and expects us to continue reusing the connection.
trapexit [Sun, 9 Aug 2020 06:14:51 +0000 (06:14 +0000)]
Add http_port sslflags=CONDITIONAL_AUTH (#510)
Enabling this flag removes SSL_VERIFY_FAIL_IF_NO_PEER_CERT from the mode
passed to SSL_CTX_set_verify(), meaning a client certificate is verified
iff one is provided.
Alex Rousskov [Mon, 10 Aug 2020 18:46:02 +0000 (18:46 +0000)]
Do not send keep-alive in 101 (Switching Protocols) responses (#709)
... because it breaks clients using websocket_client[1] library and is
redundant in our HTTP/1.1 control messages anyway.
I suspect that at least some buggy clients are confused by a multi-value
Connection field rather than the redundant keep-alive signal itself, but
let's try to follow RFC 7230 Upgrade example more closely this time and
send no keep-alive at all.
Amos Jeffries [Sun, 16 Aug 2020 02:21:22 +0000 (02:21 +0000)]
Improve Transfer-Encoding handling (#702)
Reject messages containing Transfer-Encoding header with coding other
than chunked or identity. Squid does not support other codings.
For simplicity and security sake, also reject messages where
Transfer-Encoding contains unnecessary complex values that are
technically equivalent to "chunked" or "identity" (e.g., ",,chunked" or
"identity, chunked").
RFC 7230 formally deprecated and removed identity coding, but it is
still used by some agents.
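The accepted values can be sketched as follows (a hypothetical checker, not
Squid's header parser):

```python
def acceptable_transfer_encoding(value):
    """Accept only a Transfer-Encoding value that is exactly one
    "chunked" or one "identity" coding. Empty list members (",,chunked"),
    redundant combinations ("identity, chunked"), and unsupported codings
    are all rejected."""
    codings = [c.strip().lower() for c in value.split(",")]
    return codings in (["chunked"], ["identity"])
```

This matches the simplicity-and-security stance above: technically
equivalent but unnecessarily complex values are rejected outright.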
Amos Jeffries [Fri, 14 Aug 2020 15:05:31 +0000 (15:05 +0000)]
Forbid obs-fold and bare CR whitespace in framing header fields (#701)
Header folding has been used for various attacks on HTTP proxies and
servers. RFC 7230 prohibits sending obs-fold (in any header field) and
allows the recipient to reject messages with folded headers. To reduce
the attack space, Squid now rejects messages with folded Content-Length
and Transfer-Encoding header field values. TODO: Follow RFC 7230 status
code recommendations when rejecting.
Bare CR is a CR character that is not followed by a LF character.
Similar to folding, bare CRs have been used for various attacks. HTTP
does not allow sending bare CRs in Content-Length and Transfer-Encoding
header field values. Squid now rejects messages with bare CR characters
in those framing-related field values.
When rejecting, Squid informs the admin with a level-1 WARNING such as
obs-fold seen in framing-sensitive Content-Length: ...
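The two rejection rules can be sketched together (a hypothetical check on a
raw field value; Squid's actual parser works on the full message buffer):

```python
def framing_field_is_safe(raw_value):
    """Reject a framing-sensitive field value (Content-Length or
    Transfer-Encoding) that contains obs-fold (CRLF followed by SP or
    HTAB) or a bare CR (a CR not followed by LF)."""
    if b"\r\n " in raw_value or b"\r\n\t" in raw_value:
        return False  # obs-fold
    # Any CR left after removing CRLF pairs is a bare CR.
    return b"\r" not in raw_value.replace(b"\r\n", b"")
```

A rejected value would trigger the level-1 WARNING quoted above, informing
the admin which framing-sensitive field was malformed.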