When FwdState decides to re-forward a request, it forgets the original
response[^1] but does not create an error object. Since commit e2bbd3b,
FwdState::noteDestinationsEnd() correctly assumed that we only idly wait
for additional destinations after encountering a problem, but
incorrectly asserted that past problems imply error object existence.
Now Squid generates an HTTP 502 (Bad Gateway) response while setting
%err_code/%err_detail to ERR_CANNOT_FORWARD/REFORWARD_TO_NONE.
TunnelStateData::noteDestinationsEnd() code is similar, but it probably
does not suffer from the same bug because an error object is created
before every retryOrBail() call, and there is no re-forwarding code that
forgets an HTTP error response without creating an error. Those
invariants are not well tracked, and this change mimics FwdState changes
in TunnelStateData just in case and to keep the two methods in sync. In
those hypothetical cases, ERR_CANNOT_FORWARD/RETRY_TO_NONE is logged.
[^1]: Long-term we probably want to preserve that original response so
that we do not have to replace it with a generated error, but doing so
requires significant refactoring and is outside this minimal fix scope.
Alex Rousskov [Wed, 5 Jul 2023 14:41:47 +0000 (14:41 +0000)]
Do not leak memory when handling cache manager requests (#1408)
Also adjusted Cache-Control APIs to prevent similar bugs. These changes
also speed up processing a bit and simplify most of the affected code.
The now-gone "just remove the old CC" putCc() misfeature was unused.
The leak was introduced by commit 92a5adb: PutCommonResponseHeaders()
incorrectly assumed that putCc(pointerToX) takes ownership of X.
Detected by Coverity. CID 1534779: Resource leak (RESOURCE_LEAK).
Alex Rousskov [Fri, 1 Mar 2024 22:20:20 +0000 (22:20 +0000)]
Bug 5069: Keep listening after getsockname() error (#1713)
ERROR: Stopped accepting connections:
error: getsockname() failed to locate local-IP on ...
In many cases, these failures are intermittent client-triggered errors
(e.g., client shut down the accepted socket); Squid will successfully
accept other connections and, hence, should keep listening for them.
POSIX.1-2001 marks vfork(2) OBSOLETE.
POSIX.1-2008 removes the specification of vfork(2).
MacOS system headers declare vfork(2) as deprecated.
We only use vfork(2) in negotiate_wrapper, where it is not necessary.
Alex Rousskov [Fri, 23 Feb 2024 04:26:38 +0000 (04:26 +0000)]
Fix marking of problematic cached IP addresses (#1691)
Since inception in 2017 commit fd9c47d, Dns::CachedIps::have() always
returned position zero after finding a matching IP address (at zero or
positive position). The bug affected two callers:
* markAsBad() always marked the first stored address (as bad);
* forgetMarking() always cleared the first stored address marking.
Buggy markings led to Squid sometimes not attempting to use a working
address (e.g., IPv4) while using a known problematic one (e.g., IPv6).
Alex Rousskov [Sun, 18 Feb 2024 00:45:41 +0000 (00:45 +0000)]
Fix debugging for responses that Expire at check time (#1683)
Since 2000 commit 65fa5c6, our level-3 debugging mislead about Expires
being less than the check time when the two times were identical. The
actual checked conditions are correct: Roughly speaking, the response
with Expires value T is considered expired at that time T.
Also dropped extra (and inconsistent) trailing space on debugs() lines.
This space was added by the same 2000 commit, probably accidentally.
Alex Rousskov [Fri, 16 Feb 2024 04:03:40 +0000 (04:03 +0000)]
Bug 5344: mgr:config segfaults without logformat (#1680)
Since 2011 commit 38e16f9, Log::LogConfig::dumpFormats() dereferenced a
nil `logformats` pointer while reporting a non-existent logformat
configuration (e.g., squid.conf.default): `logformats->dump(e, name)`.
In most environments, that code "worked" because the corresponding
Format::Format::dump() method happens to do nothing if "this" is nil.
However, in some environments, Squid segfaulted.
Alex Rousskov [Thu, 8 Feb 2024 22:03:44 +0000 (22:03 +0000)]
Fix max-stale in default refresh_pattern (#1664)
RefreshPattern constructor must set data fields to honor refresh_pattern
defaults promised in squid.conf.documented. For max-stale, that implies
making max_stale negative. A negative value allows refreshCheck() to use
a max_stale directive value (i.e. Config.maxStale that defaults to 1
week). The buggy constructor set max_stale to 0 instead and, hence,
refreshCheck() ignored max_stale directive when no refresh_pattern rules
were configured.
The fixed bug did not affect Squids configured using explicit
refresh_pattern rules because those rules are handled by
parse_refreshpattern() which sets max_stale to -1 by default. Our
squid.conf.default does have explicit refresh_pattern rules.
Alex Rousskov [Tue, 31 Oct 2023 11:35:02 +0000 (11:35 +0000)]
Fix infinite recursion when parsing HTTP chunks (#1553)
This change stops infinite HttpStateData recursion with at-max-capacity
inBuf. Such inBuf prevents progress in the following call chain:
* processReply()
* processReplyBody() and decodeAndWriteReplyBody()
* maybeReadVirginBody()
* maybeMakeSpaceAvailable() -- tries but fails to quit processing
* processReply()
HttpStateData::maybeMakeSpaceAvailable() no longer calls processReply(),
preventing recursion.
maybeReadVirginBody() now aborts transactions that would otherwise get
stalled due to full read buffer at its maximum capacity. This change
requires that all maybeReadVirginBody() callers do actually need more
response data to make progress. AFAICT, that (natural) invariant holds.
We moved transaction stalling check from maybeMakeSpaceAvailable() into
its previous callers. Without that move, maybeMakeSpaceAvailable() would
have to handle both abortTransaction() and delayRead() cases. Besides
increased code complexity, that would trigger some premature delayRead()
calls (at maybeReadVirginBody() time). Deciding whether to delay socket
reads is complicated, the delay mechanism is expensive, and delaying may
become unnecessary by the time the socket becomes readable, so it is
best to continue to only delayRead() at readReply() time, when there is
no other choice left.
maybeReadVirginBody() mishandled cases where progress was possible, but
not _immediately_ -- it did nothing in those cases, probably stalling
transactions when maybeMakeSpaceAvailable() returned false but did not
call processReply(). This is now fixed: maybeReadVirginBody() now starts
waiting for the socket to be ready for reading in those cases,
effectively passing control to readReply() that handles them.
maybeReadVirginBody() prematurely grew buffer for future socket reads.
As a (positive) side effect of the above refactoring, we now delay
buffer growth until the actual read(2) time, which is best for
performance. Most likely, this premature buffer growth was an accident:
maybeReadVirginBody() correctly called maybeMakeSpaceAvailable() with
doGrow set to false. However, maybeMakeSpaceAvailable() misinterpreted
doGrow as a "do not actually do it" parameter. That bug is now gone.
This recursion bug was discovered and detailed by Joshua Rogers at
https://megamansec.github.io/Squid-Security-Audit/
where it was filed as "Chunked Encoding Stack Overflow".
Alexey [Mon, 29 Jan 2024 19:47:41 +0000 (19:47 +0000)]
Fix memory leak in ssl/gadgets/mimicAuthorityKeyId() (#1651)
An unnecessary std::unique_ptr::release() call prevented temporary
extOct string from being automatically deallocated. The leak usually
happened when SslBump mimicked certificates with an Authority Key
Identifier extension. The leak was added in 2016 commit 5f1318b.
Alexey [Sun, 21 Jan 2024 16:24:57 +0000 (16:24 +0000)]
NTLM/Negotiate: Fix crash on bad helper TT responses (#1645)
Helper lookup may be made without a client HTTP Request,
(stored in lm_request->request). But in Helper::TT cases the
lm_request->request was dereferenced without any checks.
Alex Rousskov [Sat, 9 Dec 2023 04:46:55 +0000 (04:46 +0000)]
Bug 5274: Successful tunnels logged as TCP_TUNNEL/500 (#1608)
Stop calling retryOrBail() when the tunneled Squid-server connection
(that we have committed to use) closes. Our retryOrBail() is dedicated
to handling errors. Most[^1] serverClosed() calls are _not_ related to
errors because our tunneling code abuses asynchronous connection closure
callbacks for TunnelStateData work termination. Depending on the
transaction details (e.g., TLS interception vs. true CONNECT), calling
retryOrBail() on these no-error code paths may result in retryOrBail()
"catch all other errors" code creating bogus ERR_CANNOT_FORWARD errors.
Most tunneling errors are already detailed, and retryOrBail() does not
have enough information to correctly detail the remaining ones anyway.
Removing this retryOrBail() call selects the arguably lesser evil.
The client-Squid connection closure callback, clientClosed(), already
uses the same logic.
This change does not resurrect Bug 5132 fixed by commit 752fa20 that
added the now-replaced retryOrBail() call to serverClosed(). That commit
fixed the leak by calling deleteThis() (via retryOrBail()). Our
finishWritingAndDelete() call preserves that logic. That commit also
claimed to allow more retries, but that claim was a mistake: To-server
closure callback registration (e.g. commitToServer()) bans retries.
[^1]: The fact that severClosed() is called for both successful and
problematic outcomes prevents TunnelStateData from properly handling
certain (rare) errors. We tried to fix that as well, but the changes
quickly snowballed, so we left a few XXXs instead.
Avoid file name conflict with Windows WinSvc.h (#1637)
There is a conflict between our WinSvc.h header file and
Windows system include <winsvc.h>, which is one of the
factors preventing our ability to build native windows helpers.
Alex Rousskov [Wed, 6 Dec 2023 05:25:53 +0000 (05:25 +0000)]
Docs: Describe more ACL effects on (re)authentication (#1611)
Existing documentation was
* silent about %ul, max_user_ip, ident, and ident_regex side effects;
* silent about adapted_http_access context triggering authentication;
* vague about (re)authentication triggers.
Thomas Leroy [Tue, 28 Nov 2023 07:35:46 +0000 (07:35 +0000)]
Limit the number of allowed X-Forwarded-For hops (#1589)
Squid will ignore all X-Forwarded-For elements listed after the first 64
addresses allowed by the follow_x_forwarded_for directive. A different
limit can be specified by defining a C++ SQUID_X_FORWARDED_FOR_HOP_MAX
macro, but that macro is not a supported Squid configuration interface
and may change or disappear at any time.
Squid will log a cache.log ERROR if the hop limit has been reached.
This change works around problematic ACLChecklist and/or slow ACLs
implementation that results in immediate nonBlockingCheck() callbacks.
Such callbacks have caused many bugs and development complications. In
clientFollowXForwardedForCheck() context, they lead to indirect
recursion that was bound only by the number of allowed XFF entries,
which could reach thousands and exhaust Squid process call stack.
This recursion bug was discovered and detailed by Joshua Rogers at
https://megamansec.github.io/Squid-Security-Audit/xff-stackoverflow.html
where it was filed as "X-Forwarded-For Stack Overflow".
Andreas Weigel [Tue, 14 Nov 2023 15:17:09 +0000 (15:17 +0000)]
FTP: Ignore credentials with a NUL-prefixed username (#1557)
FATAL: FTP login parsing destroyed username info
This bug was discovered and detailed by Joshua Rogers at
https://megamansec.github.io/Squid-Security-Audit/ftp-fatal.html
where it was filed as "FTP Authentication Crash".
Squids receives IPv6 addresses from traffic, configuration, or
hard-coded constants even when ./configured with --disable-ipv6 or when
IPv6 support was automatically disabled at startup after failing IPv6
tests. To handle IPv6 correctly, such Squids must support basic IPv6
operations like recognizing an IPv6 address in a request-target or
reporting an unsolicited IPv6 DNS record. At least for now, such Squids
must also correctly parse configuration-related IPv6 addresses.
All those activities rely on various low-level operations like filling
addrinfo structure with IP address information. Since 2012 commit c5fbbc7, Ip::Address::getAddrInfo() was failing for IPv6 addresses when
Ip::EnableIpv6 was falsy. That change correctly recognized[^1] the need
for such Squids to handle IPv6, but to support basic operations, we need
to reject IPv6 addresses at a higher level and without asserting.
That high-level rejection work is ongoing, but initial attempts have
exposed difficult problems that will take time to address. For now, we
just avoid the assertion while protecting IPv6-disabled Squid from
listening on or opening connections to IPv6 addresses. Since Squid
already expects (and usually correctly handles) socket opening failures,
disabling those operations is better than failing in low-level IP
manipulation code.
The overall IPv6 posture of IPv6-disabled Squids that lack http_access
or other rules to deny IPv6 requests will change: This fix exposes more
of IPv6-disabled Squid code to IPv6 addresses. It is possible that such
exposure will make some IPv6 resources inside Squid (e.g., a previously
cached HTTP response) accessible to external requests. Squids will not
open or accept IPv6 connections but may forward requests with raw IPv6
targets to IPv4 cache_peers. Whether these and similar behavior changes
are going to be permanent is open for debate, but even if they are
temporary, they are arguably better than the corresponding assertions.
These changes do not effect IPv6-enabled Squids.
The assertion in IPv6-disabled Squid was reported by Joshua Rogers at
https://megamansec.github.io/Squid-Security-Audit/ipv6-assert.html where
it was filed as "Assertion on IPv6 Host Requests with --disable-ipv6".
Alex Rousskov [Tue, 14 Nov 2023 18:40:37 +0000 (18:40 +0000)]
Bug 5317: FATAL attempt to read data from memory (#1579)
FATAL: Squid has attempted to read data ... that is not present.
Recent commit 122a6e3 attempted to deliver in-memory response body bytes
to a Store-reading client that requested (at least) response headers.
That optimization relied on the old canReadFromMemory() logic, but that
logic results in false positives when the checked read offset falls into
a gap between stored headers and the first body byte of a Content-Range.
In that case, a false positive leads to a readFromMemory() call and a
FATAL mem_hdr::copy() error.
This workaround disables the above optimization without fixing
canReadFromMemory(). We believe that a readFromMemory() call that comes
right after response headers are delivered to the Store-reading client
will not suffer from the same problem because the client will supply the
read offset of the first body byte, eliminating the false positive.
Alex Rousskov [Sun, 12 Nov 2023 09:33:20 +0000 (09:33 +0000)]
Do not update StoreEntry expiration after errorAppendEntry() (#1580)
errorAppendEntry() is responsible for setting entry expiration times,
which it does by calling StoreEntry::storeErrorResponse() that calls
StoreEntry::negativeCache().
This change was triggered by a vulnerability report by Joshua Rogers at
https://megamansec.github.io/Squid-Security-Audit/cache-uaf.html where
it was filed as "Use-After-Free in Cache Manager Errors". The reported
"use after free" vulnerability was unknowingly addressed by 2022 commit 1fa761a that removed excessively long "reentrant" store_client calls
responsible for the disappearance of the properly locked StoreEntry in
this (and probably other) contexts.
2015 commit 21cd322 started to continue ClientStream processing after
socket write(2) failures. In most cases, the code still "worked". For
example, initiateClose() would close the client-Squid connection, and
connStateClosed() would be called before Store has a chance to deliver
response body data requested by pullData() in writeComplete().
However, that response body data could sometimes reach Server, and
handleReply() would assert because startOfOutput() says that we have not
written the headers, but ClientStream state (i.e. a nil `rep` parameter)
says that we have. These assertion can be triggered by disabling
initiateClose(), and they can probably be triggered by traffic as well.
Now, after a Comm::Write() error, we terminateAll() client transactions
on the failed connection[^1] and do not call afterClientWrite() that is
not equipped to handle I/O errors and would continue ClientStream
processing if called.
This bug was discovered and detailed by Joshua Rogers at
https://megamansec.github.io/Squid-Security-Audit/stream-assert.html
where it was filed as "Implicit Assertion in Stream Handling".
[^1]: We terminateAll() instead of potentially postponing closure with
initiateClose() because the failed client-Squid connection most likely
cannot be salvaged for, say, reading the remainder of the request body.
Recent commit 122a6e3 removed HTTP response headers from store_client
responses. That removal created the possibility of an empty
StoreIOBuffer at the beginning of the feeding sequence. Pending Bug 5317
fix will make such buffers even more frequent. Existing store_client
recipients have varying requirements with regard to empty response
buffers, as documented in store_client::finishCallback(). We missed this
requirement conflict in Cache Digest code. This fix adjusts Cache
Digests code to be compatible with empty StoreIOBuffer representation in
current store_client code.
Nicolai Moore [Wed, 15 Nov 2023 00:59:59 +0000 (00:59 +0000)]
Bug 5319: QOS Netfilter MARK preservation is always disabled (#1585)
Default ./configure options and explicit --enable-zph-qos enabled ZPH
QOS support (USE_QOS_TOS) as expected but did not enable QOS Netfilter
MARK preservation support (USE_LIBNETFILTERCONNTRACK). For example,
qos_flows directive became available, but clientside_mark and
client_connection_mark ACL types were not recognized.
The missing opening bracket before AS_IF() condition injected a trailing
closing bracket into that condition, resulting in an always-false
condition for setting USE_LIBNETFILTERCONNTRACK.
Alex Rousskov [Wed, 1 Nov 2023 03:16:12 +0000 (03:16 +0000)]
Remove mem_hdr::freeDataUpto() assertion (#1562)
stmem.cc:98: "lowestOffset () <= target_offset"
The assertion is conceptually wrong: The given target_offset parameter
may have any value; that value does not have to correlate with mem_hdr
state in any way. It is freeDataUpto() job to preserve nodes at or above
the given offset and (arguably optionally) remove nodes below it, but
the assertion does not actually validate that freeDataUpdo() did that.
The old mem_hdr::freeDataUpto() assertion incorrectly assumed that,
after zero or more unneeded memory nodes were freed, the remaining
memory area never starts after the given target_offset parameter. That
assumption fails in at least two use cases, both using target_offset
values that do not belong to any existing or future mem_hdr node:
1. target_offset is points to the left of the first node. freeDataUpto()
correctly keeps all memory nodes in such calls, but then asserts. For
example, calling freeDataUpto(0) when mem_hdr has bytes [100,199)
triggers this incorrect assertion.
2. target_offset is in the gap between two nodes. For example, calling
freeDataUpto(2000) when mem_hdr contains two nodes: [0,1000) and
[3000,3003) will trigger this assertion (as happened in Bug 5309).
Such gaps are very common for HTTP 206 responses with a Content-Range
header because such responses often specify a range that does not
start with zero and create a gap after the node(s) with HTTP headers.
Bugs notwithstanding, it is unlikely that relevant calls exist today,
but they certainly could be added, especially when freeDataUpto() stops
preserving the last unused node. The current "avoid change to [some
unidentified] part of code" hoarding excuse should not last forever.
Prior to commit 122a6e3, Squid did not (frequently) assert in gap cases:
Callers first give target_offset 0 (which results in freeDataUpto()
doing nothing, keeping the header node(s)) and then they give
target_offset matching the beginning of the first body node (which
results in freeDataUpto() freeing the header nodes(s) and increasing
lowerOffset() from zero to target_offset). A bug in commit 122a6e3
lowered target_offset a bit, placing target_offset in the gap and
triggering frequent (and incorrect) assertions (Bug 5309).
Recent commit 122a6e3 left store_client::readOffset() unchanged but
should have adjusted it to match changed copyInto.offset semantics:
Starting with that commit, storeClientCopy() callers supply HTTP
response _body_ offset rather than HTTP response offset.
This bug decreased readOffset() values (by the size of stored HTTP
response headers), effectively telling Store that we are not yet done
with some of the MemObject/mem_hdr bytes. This bug could cause slightly
higher transaction memory usage because the same response bytes are
trimmed later. This bug should not have caused any assertions.
However, the old mem_hdr::freeDataUpto() code that uses readOffset() is
also broken -- the assertion in that method only "works" when
readOffset() returns values matching a memory node boundary. The smaller
values returned by buggy readOffset() triggered buggy assertions.
This minimal fix removes the recent store_client::readOffset() bug
described above. We will address old mem_hdr problems separately.
Alex Rousskov [Fri, 27 Oct 2023 21:27:20 +0000 (21:27 +0000)]
Exit without asserting when helper process startup fails (#1543)
... to dup() after fork() and before execvp().
Assertions are for handling program logic errors. Helper initialization
code already handled system call errors correctly (i.e. by exiting the
newly created helper process with an error), except for a couple of
assert()s that could be triggered by dup(2) failures.
This bug was discovered and detailed by Joshua Rogers at
https://megamansec.github.io/Squid-Security-Audit/ipc-assert.html
where it was filed as 'Assertion in Squid "Helper" Process Creator'.
Alex Rousskov [Wed, 25 Oct 2023 19:41:45 +0000 (19:41 +0000)]
RFC 1123: Fix date parsing (#1538)
The bug was discovered and detailed by Joshua Rogers at
https://megamansec.github.io/Squid-Security-Audit/datetime-overflow.html
where it was filed as "1-Byte Buffer OverRead in RFC 1123 date/time
Handling".
Alex Rousskov [Wed, 25 Oct 2023 11:47:19 +0000 (11:47 +0000)]
Improve handling of expanding HTTP header values (#1536)
Squid manipulations often increase HTTP header value length compared to
the corresponding raw value received by Squid. Raw header length is
checked against request_header_max_size and reply_header_max_size that
default to 64KB, making the raw value safe to store in a String object
(by default). However, when the increased length of a manipulated value
exceeds String class limits, Squid leaks memory, asserts, or possibly
stalls affected transactions. The long-term fix for this problem is a
complete String elimination from Squid sources, but that takes time.
Known manipulations may effectively concatenate headers and/or increase
header value length by 50%. This workaround makes such known increases
safe by essentially tripling String class limits:
(64KB + 64KB) * 150% = 3 * 64KB
This bug was discovered and detailed by Joshua Rogers at
https://megamansec.github.io/Squid-Security-Audit/response-memleaks.html
where it was filed as "Memory Leak in HTTP Response Parsing".
Andreas Weigel [Wed, 18 Oct 2023 04:14:31 +0000 (04:14 +0000)]
Fix validation of certificates with CN=* (#1523)
The bug was discovered and detailed by Joshua Rogers at
https://megamansec.github.io/Squid-Security-Audit/
where it was filed as "Buffer UnderRead in SSL CN Parsing".
Alex Bason [Sun, 15 Oct 2023 13:04:47 +0000 (13:04 +0000)]
Fix stack buffer overflow when parsing Digest Authorization (#1517)
The bug was discovered and detailed by Joshua Rogers at
https://megamansec.github.io/Squid-Security-Audit/digest-overflow.html
where it was filed as "Stack Buffer Overflow in Digest Authentication".
Alex Rousskov [Thu, 25 May 2023 02:10:28 +0000 (02:10 +0000)]
Fix userinfo percent-encoding (#1367)
%X expects an unsigned int, and that is what we were giving it. However,
to get to the correct unsigned int value from a (signed) char, one has
to cast to an unsigned char (or equivalent) first.
Alex Rousskov [Tue, 27 Jun 2023 22:47:36 +0000 (22:47 +0000)]
Fix store_client caller memory leak on certain errors (#1347)
When a storeUnregister() code path destroys store_client before the
latter has a chance to deliver the answer, the cbdataReferenceDone()
call in store_client::finishCallback() is not reached, keeping the
callback data (e.g., clientReplyContext) alive forever. These
storeClientCopy() "cancellations" may happen, for example, when the
client-to-Squid connection is closed while store_client waits for Store.
Use CallbackData to guarantee cbdataReferenceDone() when store_client is
destructed before it can finishCallback(). These synchronous callbacks
will be replaced with AsyncCalls. For now, we use the "discouraged"
CallbackData API to accommodate the existing legacy callbacks.
Alex Rousskov [Wed, 2 Aug 2023 05:00:49 +0000 (05:00 +0000)]
CI: More HTTP caching and revalidation tests (#1440)
Use Daft cache-response test to monitor for bugs in basic caching code,
including bugs like the one fixed by commit c203754. Unlike the old
proxy-collapsed-forwarding test that uses concurrent requests, this test
varies basic response properties.
Use Daft accumulate-headers-after-304 test to monitor for bugs like the
one fixed by commit 55e1c6e. Unlike old proxy-update-headers-after-304,
this test focuses on certain _problematic_ HTTP 304 header updates.
Alex Rousskov [Wed, 5 Jul 2023 01:55:59 +0000 (01:55 +0000)]
CI: Remove unnecessary test-functionality test wrappers (#1393)
These workarounds are not needed for the current and future code in this
branch. Other branches get their own test-functionality.sh files that
can be used to maintain a branch-specific collection of test wrappers.
The (now unused) has_commit_by_message() function was left in the script
because that function is likely to be used by future workarounds. Unlike
specific test workarounds that only apply to a subset of old code, this
and similar functions can be viewed as a reusable code "library".
Alex Rousskov [Sun, 18 Jun 2023 00:30:35 +0000 (00:30 +0000)]
CI: Remove pass-through test-functionality test wrappers (#1383)
Instead of requiring a custom test wrapper for each test and, hence,
creating an ever-increasing number of pass-through wrappers that do
nothing useful, use a custom test wrapper if and only if it exists. By
default (i.e. when there is no custom wrapper), just run the named test.
As a positive side effect, this change also simplifies running tests
that are not on the $default_tests list hard-coded in main():
Miss if a 304 update would exceed reply_header_max_size (#1420)
Fetch the resource unconditionally when a 304 (Not Modified) response to
an internal cache revalidation request grows cached HTTP response
headers beyond the reply_header_max_size limit.
Alex Rousskov [Sat, 24 Jun 2023 08:18:55 +0000 (08:18 +0000)]
Remove serialized HTTP headers from storeClientCopy() (#1335)
Do not send serialized HTTP response header bytes in storeClientCopy()
answers. Ignore serialized header size when calling storeClientCopy().
This complex change adjusts storeClientCopy() API to addresses several
related problems with storeClientCopy() and its callers. The sections
below summarize storeClientCopy() changes and then move on to callers.
Squid incorrectly assumed that serialized HTTP response headers are read
from disk in a single storeRead() request. In reality, many situations
lead to store_client::readBody() receiving partial HTTP headers,
resulting in parseCharBuf() failure and a level-0 cache.log message:
Could not parse headers from on disk object
Inadequate handling of this failure resulted in a variety of problems.
Squid now accumulates storeRead() results to parse larger headers and
also handles parsing failures better, but we could not just stop there.
With the storeRead() accumulation in place, it is no longer possible to
send parsed serialized HTTP headers to storeClientCopy() callers because
those callers do not provide enough buffer space to fit larger headers.
Increasing caller buffer capacity does not work well because the actual
size of the serialized header is unknown in advance and may be quite
large. Always allocating large buffers "just in case" is bad for
performance. Finally, larger buffers may jeopardize hard-to-find code
that uses hard-coded 4KB buffers without using HTTP_REQBUF_SZ macro.
Fortunately, storeClientCopy() callers either do not care about
serialized HTTP response headers or should not care about them! The API
forced callers to deal with serialized headers, but callers could (and
some did) just use the parsed headers available in the corresponding
MemObject. With this API change, storeClientCopy() callers no longer
receive serialized headers and do not need to parse or skip them.
Consequently, callers also do not need to account for response headers
size when computing offsets for subsequent storeClientCopy() requests.
Restricting storeClientCopy() API to HTTP _body_ bytes removed a lot of
problematic caller code. Caller changes are summarized further below.
A similar HTTP response header parsing problem existed in shared memory
cache code. That code was actually aware that headers may span multiple
cache slices but incorrectly assumed that httpMsgParseStep() accumulates
input as needed (to make another parsing "step"). It does not. Large
response headers cached in shared memory triggered a level-1 message:
Corrupted mem-cached headers: e:...
Fixed MemStore code now accumulates serialized HTTP response headers as
needed to parse them, sharing high-level parsing code with store_client.
Old clientReplyContext methods worked hard to skip received serialized
HTTP headers. The code contained dangerous and often complex/unreadable
manipulation of various raw offsets and buffer pointers, aggravated by
the perceived need to save/restore those offsets across asynchronous
checks (see below). That header skipping code is gone now. Several stale
and misleading comments related to Store buffers management were also
removed or updated.
We replaced reqofs/reqsize with simpler/safer lastStreamBufferedBytes,
while becoming more consistent with that "cached" info invalidation. We
still need this info to resume HTTP body processing after asynchronous
http_reply_access checks and cache hit validations, but we no longer
save/restore this info for hit validation: No need to save/restore
information about the buffer that hit validation does not use and must
never touch!
The API change also moved from-Store StoreIOBuffer usage closer to
StoreIOBuffers manipulated by Clients Streams code. Buffers in both
categories now contain just the body bytes, and both now treat zero
length as EOF only _after_ processing the response headers.
These changes improve overall code quality, but this code path and these
changes still suffer from utterly unsafe legacy interfaces like
StoreIOBuffer and clientStreamNode. We cannot rely on the compiler to
check our work. The risk of these changes exposing/causing bugs is high.
asHandleReply() expected WHOIS response body bytes where serialized HTTP
headers were! The code also had multiple problems typical for manually
written C parsers dealing with raw input buffers. Now replaced with a
Tokenizer-based code.
To skip received HTTP response headers, peerDigestHandleReply() helper
functions called headersEnd() on the received buffer. Twice. We have now
merged those two parsing helper functions into one (that just checks the
already parsed headers). This merger preserved "304s must come with
fetch->pd->cd" logic that was hidden/spread across those two functions.
urnHandleReply() re-parsed received HTTP response headers. We left its
HTTP body parsing code unchanged except for polishing NUL-termination.
netdbExchangeHandleReply() re-parsed received HTTP response headers to
find where they end (via headersEnd()). We improved handing of corner
cases and replaced some "tricky bits" code, reusing the new
Store::ParsingBuffer class. The net_db record parsing code is unchanged.
Mgr::StoreToCommWriter::noteStoreCopied() is a very special case. It
actually worked OK because, unlike all other storeClientCopy() callers,
this code does not get serialized HTTP headers from Store: The code
adding bytes to the corresponding StoreEntry does not write serialized
HTTP headers at all. StoreToCommWriter is used to deliver kid-specific
pieces of an HTTP body of an SMP cache manager response. The HTTP
headers of that response are handled elsewhere. We left this code
unchanged, but the existence of the special no-headers case does
complicate storeClientCopy() API documentation, implementation, and
understanding.
Co-authored-by: Eduard Bagdasaryan <eduard.bagdasaryan@measurement-factory.com>
Alex Rousskov [Sat, 11 Mar 2023 05:48:14 +0000 (05:48 +0000)]
Replaced clientReplyContext::tempBuffer with old_reqofs (#1304)
The tempBuffer data member was not actually used as a buffer. We only
used its offset field, and only for saving reqofs (which has a different
type than tempBuffer.offset!). Replaced the buffer with old_reqofs,
consistent with the rest of the "saved stale entry state" code.
Also fixed old_reqsize type to match reqsize and grouped that member
with the other private "saved stale entry state" fields.
Bad old types probably did not trigger runtime failures because the
associated saved numbers are saved at the very beginning of fetching the
entry, when all these accumulation-related counters are still small.
The remaining reqofs and reqsize types are wrong for platforms where
size_t is not uint64_t, but fixing that deserves a dedicated change. For
now, we just made the types of "old_" and "current" members consistent.
Avoid truncation errors when printing time_t-based squid.conf directives
on platforms with 32-bit int and 64-bit time_t. Also avoid similar
errors when printing time_msec-based directives on platforms with 32-bit
int.
Detected by Coverity. CID 1529622: Use of 32-bit time_t (Y2K38_SAFETY).
Bug 5301: cachemgr.cgi not showing new manager interface URLs (#1479)
Also fix several related UI issues uncovered during testing:
* Prune the list of servers accessible via the CGI tool login.
Their responses would be badly mangled if accessed via
the old tools parse logic.
Also, hide the old login form if all servers use the new
manager interface.
* Ensure the 'menu' report is always used by default after
the CGI tool login. This prevents errors about MGR_INDEX
not being available on recent Squid releases. Restoring the
expected CGI tool behavior.
Extend cache_log_message to problematic from-helper annotations (#1481)
WARNING: Unsupported or unexpected from-helper annotation
with a name reserved for Squid use
The above message is emitted for every helper response containing
problematic annotations. Let admins control this reporting using
cache_log_message directive and message id 69.
Alex Rousskov [Sat, 16 Sep 2023 12:50:10 +0000 (08:50 -0400)]
Restore support for legacy cache_object cache manager requests (#1475)
Squid v6.3 commit 6695897 (i.e. a backport of master/v7 commit 3c383cc)
accidentally removed support of legacy cache_object URLs (that master/v7
does not support) from Squid v6. This fix restores that support in v6.
Alex Rousskov [Fri, 14 Jul 2023 17:46:30 +0000 (17:46 +0000)]
Do not use static initialization to register modules (#1422)
ERROR: ... Unknown authentication scheme 'ntlm'.
When a translation unit does not contain main() and its code is not used
by the rest of the executable, the linker may exclude that translation
unit from the executable. This exclusion by LTO may happen even if that
code _is_ used to initialize static variables in that translation unit:
"If no variable or function is odr-used from a given translation unit,
the non-local variables defined in that translation unit may never be
initialized"[^1].
For example, src/auth/ntlm/Scheme.o translation unit contains nothing
but NtlmAuthRr class definition and static initialization code. The
linker knows that the rest of Squid does not use NtlmAuthRr and excludes
that translation unit from the squid executable, effectively disabling
NTLM module registration required to parse "auth_param ntlm" directives.
The problem does affect existing NTLM module, and may affect any future
module code as we reduce module's external footprint. This change
converts all RegisteredRunner registrations from using side effects of
static initialization to explicit registration calls from SquidMain().
Relying on "life before main()" is a design bug. This PR fixes that bug
with respect to RegisteredRunner registrations.
Due to indeterminate C++ static initialization order, no module can
require registration before main() starts. Thus, moving registration
timing to the beginning of SquidMain() should have no negative effects.
The new registration API still does not expose main.cc to any module
details (other than the name of the registration function itself). We
still do not need to #include any module headers into main.cc. Compiler
or linker does catch most typos in RegisteredRunner names.
Unfortunately, explicit registration still cannot prevent bugs where we
forget to register a module or fail to register a module due to wrong
registration code guards. Eventually, CI will expose such bugs.
Due to poor code duplication, commit 92a5adb accidentally classified
URLs without a trailing slash in the magical prefix as valid cache
manager URLs, triggering the above ERRORs. We were denying such
"slashless" cache manager URLs (as invalid internal URLs) prior to that
commit. Since that commit, the ERRORs triggered by that commit
effectively denied them as well. Denying them properly results in
simpler/smaller code (than allowing them would), so we should avoid a UI
change and continue to deny them, at least for now.
This change also reduces duplication of magic prefix definitions. Other
pending work will completely eliminate that duplication in src/ code.
Alex Rousskov [Sat, 12 Aug 2023 15:40:12 +0000 (15:40 +0000)]
Bug 5294: ERR_CANNOT_FORWARD returned instead of ERR_DNS_FAIL (#1453)
Since 2017 commit fd9c47d, peer selection code stopped reporting
ERR_DNS_FAIL cases because PeerSelector::noteIps() treated DNS answers
without IP addresses as if at least one IP address was received. Without
seeing a DNS resolution error, the ultimate recipient of the DNS
resolution results (e.g., CONNECT tunneling or regular forwarding code)
used ERR_CANNOT_FORWARD to indicate a failure to find a forwarding path.
PeerSelector::noteIps() code mimicked legacy IPH code with regard to
handling of the addresses parameter. However, IPH caller had a special
emptyIsNil adjustment that was missing from the noteIps() call! We now
apply that adjustment to both noteIps() and IPH code paths.
Long-term, we should probably remove nil address container pointers.
Having two different ways to signal lack of IPs is dangerous. Currently,
there is only one known supplier of nil address container:
IpcacheStats.invalid code that validates ipcache_nbgethostbyname() name
parameter. Either the corresponding nil/empty name check should be
converted into an assertion (blaming the ipcache_nbgethostbyname()
caller for not validating the name) OR that checking code should supply
an empty address container to finalCallback().
Shmaya [Fri, 11 Aug 2023 17:41:30 +0000 (17:41 +0000)]
Bug 4981: Work around in-call job invalidation bugs (#1428)
Bug 4981 is one known case of such invalidation, but this workaround is
much broader than that bug context. We can speculate that architectural
problems described in commit e3b6f15 are behind (some of) these bugs.
Alex Rousskov [Sat, 29 Jul 2023 17:44:59 +0000 (17:44 +0000)]
Bug 5290: pure virtual call in Ftp::Client constructor (#1429)
FATAL: Dying from an exception handling failure;
exception: [no active exception]
Converting `this` to CbcPointer in a constructor of an abstract class
like Ftp::Client does not work because our virtual toCbdata() method
remains pure until the final/child class constructor runs.
Conceptually, the bug was probably introduced in 2013 commit 434a79b,
when FTP class hierarchy grew, making Ftp::Client an abstract class, but
the trigger was recent commit 337b9aa that removed CBDATA_CLASS() from
Ftp::Client class declaration. We discovered, described, and addressed
several such bugs in that commit, but we missed this case.
Andrew Novikov [Sun, 9 Jul 2023 02:05:56 +0000 (02:05 +0000)]
Bug 5187: Work around REQMOD satisfaction regression (#1400)
Commit ba3fe8d broke ICAP REQMOD satisfaction transactions. In some
cases, this workaround may resurrect Squid Bug 5187. Triage available at
https://bugs.squid-cache.org/show_bug.cgi?id=5187#c6
Alex Rousskov [Sun, 23 Jul 2023 01:38:19 +0000 (01:38 +0000)]
Fix memory leak when reconfiguring multiline all-of ACLs (#1425)
Normally, Acl::InnerNode::add() automatically registers stored ACL nodes
for future cleanup, but when we find the second all-of rule/line (with
the same ACL name), we do not add() the newly created OrNode and have to
explicitly register it to avoid memory leaks on reconfiguration.
Leaking since all-of ACL support was added in 2013 commit 6f58d7d.