Alex Rousskov [Thu, 25 May 2023 02:10:28 +0000 (02:10 +0000)]
Fix userinfo percent-encoding (#1367)
%X expects an unsigned int, and that is what we were giving it. However,
to get to the correct unsigned int value from a (signed) char, one has
to cast to an unsigned char (or equivalent) first.
Alex Rousskov [Tue, 27 Jun 2023 22:47:36 +0000 (22:47 +0000)]
Fix store_client caller memory leak on certain errors (#1347)
When a storeUnregister() code path destroys store_client before the
latter has a chance to deliver the answer, the cbdataReferenceDone()
call in store_client::finishCallback() is not reached, keeping the
callback data (e.g., clientReplyContext) alive forever. These
storeClientCopy() "cancellations" may happen, for example, when the
client-to-Squid connection is closed while store_client waits for Store.
Use CallbackData to guarantee cbdataReferenceDone() when store_client is
destructed before it can finishCallback(). These synchronous callbacks
will be replaced with AsyncCalls. For now, we use the "discouraged"
CallbackData API to accommodate the existing legacy callbacks.
Alex Rousskov [Wed, 2 Aug 2023 05:00:49 +0000 (05:00 +0000)]
CI: More HTTP caching and revalidation tests (#1440)
Use Daft cache-response test to monitor for bugs in basic caching code,
including bugs like the one fixed by commit c203754. Unlike the old
proxy-collapsed-forwarding test that uses concurrent requests, this test
varies basic response properties.
Use Daft accumulate-headers-after-304 test to monitor for bugs like the
one fixed by commit 55e1c6e. Unlike old proxy-update-headers-after-304,
this test focuses on certain _problematic_ HTTP 304 header updates.
Alex Rousskov [Wed, 5 Jul 2023 01:55:59 +0000 (01:55 +0000)]
CI: Remove unnecessary test-functionality test wrappers (#1393)
These workarounds are not needed for the current and future code in this
branch. Other branches get their own test-functionality.sh files that
can be used to maintain a branch-specific collection of test wrappers.
The (now unused) has_commit_by_message() function was left in the script
because that function is likely to be used by future workarounds. Unlike
specific test workarounds that only apply to a subset of old code, this
and similar functions can be viewed as a reusable code "library".
Alex Rousskov [Sun, 18 Jun 2023 00:30:35 +0000 (00:30 +0000)]
CI: Remove pass-through test-functionality test wrappers (#1383)
Instead of requiring a custom test wrapper for each test and, hence,
creating an ever-increasing number of pass-through wrappers that do
nothing useful, use a custom test wrapper if and only if it exists. By
default (i.e. when there is no custom wrapper), just run the named test.
As a positive side effect, this change also simplifies running tests
that are not on the $default_tests list hard-coded in main():
Miss if a 304 update would exceed reply_header_max_size (#1420)
Fetch the resource unconditionally when a 304 (Not Modified) response to
an internal cache revalidation request grows cached HTTP response
headers beyond the reply_header_max_size limit.
Alex Rousskov [Sat, 24 Jun 2023 08:18:55 +0000 (08:18 +0000)]
Remove serialized HTTP headers from storeClientCopy() (#1335)
Do not send serialized HTTP response header bytes in storeClientCopy()
answers. Ignore serialized header size when calling storeClientCopy().
This complex change adjusts storeClientCopy() API to addresses several
related problems with storeClientCopy() and its callers. The sections
below summarize storeClientCopy() changes and then move on to callers.
Squid incorrectly assumed that serialized HTTP response headers are read
from disk in a single storeRead() request. In reality, many situations
lead to store_client::readBody() receiving partial HTTP headers,
resulting in parseCharBuf() failure and a level-0 cache.log message:
Could not parse headers from on disk object
Inadequate handling of this failure resulted in a variety of problems.
Squid now accumulates storeRead() results to parse larger headers and
also handles parsing failures better, but we could not just stop there.
With the storeRead() accumulation in place, it is no longer possible to
send parsed serialized HTTP headers to storeClientCopy() callers because
those callers do not provide enough buffer space to fit larger headers.
Increasing caller buffer capacity does not work well because the actual
size of the serialized header is unknown in advance and may be quite
large. Always allocating large buffers "just in case" is bad for
performance. Finally, larger buffers may jeopardize hard-to-find code
that uses hard-coded 4KB buffers without using HTTP_REQBUF_SZ macro.
Fortunately, storeClientCopy() callers either do not care about
serialized HTTP response headers or should not care about them! The API
forced callers to deal with serialized headers, but callers could (and
some did) just use the parsed headers available in the corresponding
MemObject. With this API change, storeClientCopy() callers no longer
receive serialized headers and do not need to parse or skip them.
Consequently, callers also do not need to account for response headers
size when computing offsets for subsequent storeClientCopy() requests.
Restricting storeClientCopy() API to HTTP _body_ bytes removed a lot of
problematic caller code. Caller changes are summarized further below.
A similar HTTP response header parsing problem existed in shared memory
cache code. That code was actually aware that headers may span multiple
cache slices but incorrectly assumed that httpMsgParseStep() accumulates
input as needed (to make another parsing "step"). It does not. Large
response headers cached in shared memory triggered a level-1 message:
Corrupted mem-cached headers: e:...
Fixed MemStore code now accumulates serialized HTTP response headers as
needed to parse them, sharing high-level parsing code with store_client.
Old clientReplyContext methods worked hard to skip received serialized
HTTP headers. The code contained dangerous and often complex/unreadable
manipulation of various raw offsets and buffer pointers, aggravated by
the perceived need to save/restore those offsets across asynchronous
checks (see below). That header skipping code is gone now. Several stale
and misleading comments related to Store buffers management were also
removed or updated.
We replaced reqofs/reqsize with simpler/safer lastStreamBufferedBytes,
while becoming more consistent with that "cached" info invalidation. We
still need this info to resume HTTP body processing after asynchronous
http_reply_access checks and cache hit validations, but we no longer
save/restore this info for hit validation: No need to save/restore
information about the buffer that hit validation does not use and must
never touch!
The API change also moved from-Store StoreIOBuffer usage closer to
StoreIOBuffers manipulated by Clients Streams code. Buffers in both
categories now contain just the body bytes, and both now treat zero
length as EOF only _after_ processing the response headers.
These changes improve overall code quality, but this code path and these
changes still suffer from utterly unsafe legacy interfaces like
StoreIOBuffer and clientStreamNode. We cannot rely on the compiler to
check our work. The risk of these changes exposing/causing bugs is high.
asHandleReply() expected WHOIS response body bytes where serialized HTTP
headers were! The code also had multiple problems typical for manually
written C parsers dealing with raw input buffers. Now replaced with a
Tokenizer-based code.
To skip received HTTP response headers, peerDigestHandleReply() helper
functions called headersEnd() on the received buffer. Twice. We have now
merged those two parsing helper functions into one (that just checks the
already parsed headers). This merger preserved "304s must come with
fetch->pd->cd" logic that was hidden/spread across those two functions.
urnHandleReply() re-parsed received HTTP response headers. We left its
HTTP body parsing code unchanged except for polishing NUL-termination.
netdbExchangeHandleReply() re-parsed received HTTP response headers to
find where they end (via headersEnd()). We improved handing of corner
cases and replaced some "tricky bits" code, reusing the new
Store::ParsingBuffer class. The net_db record parsing code is unchanged.
Mgr::StoreToCommWriter::noteStoreCopied() is a very special case. It
actually worked OK because, unlike all other storeClientCopy() callers,
this code does not get serialized HTTP headers from Store: The code
adding bytes to the corresponding StoreEntry does not write serialized
HTTP headers at all. StoreToCommWriter is used to deliver kid-specific
pieces of an HTTP body of an SMP cache manager response. The HTTP
headers of that response are handled elsewhere. We left this code
unchanged, but the existence of the special no-headers case does
complicate storeClientCopy() API documentation, implementation, and
understanding.
Co-authored-by: Eduard Bagdasaryan <eduard.bagdasaryan@measurement-factory.com>
Alex Rousskov [Sat, 11 Mar 2023 05:48:14 +0000 (05:48 +0000)]
Replaced clientReplyContext::tempBuffer with old_reqofs (#1304)
The tempBuffer data member was not actually used as a buffer. We only
used its offset field, and only for saving reqofs (which has a different
type than tempBuffer.offset!). Replaced the buffer with old_reqofs,
consistent with the rest of the "saved stale entry state" code.
Also fixed old_reqsize type to match reqsize and grouped that member
with the other private "saved stale entry state" fields.
Bad old types probably did not trigger runtime failures because the
associated saved numbers are saved at the very beginning of fetching the
entry, when all these accumulation-related counters are still small.
The remaining reqofs and reqsize types are wrong for platforms where
size_t is not uint64_t, but fixing that deserves a dedicated change. For
now, we just made the types of "old_" and "current" members consistent.
Avoid truncation errors when printing time_t-based squid.conf directives
on platforms with 32-bit int and 64-bit time_t. Also avoid similar
errors when printing time_msec-based directives on platforms with 32-bit
int.
Detected by Coverity. CID 1529622: Use of 32-bit time_t (Y2K38_SAFETY).
Bug 5301: cachemgr.cgi not showing new manager interface URLs (#1479)
Also fix several related UI issues uncovered during testing:
* Prune the list of servers accessible via the CGI tool login.
Their responses would be badly mangled if accessed via
the old tools parse logic.
Also, hide the old login form if all servers use the new
manager interface.
* Ensure the 'menu' report is always used by default after
the CGI tool login. This prevents errors about MGR_INDEX
not being available on recent Squid releases. Restoring the
expected CGI tool behavior.
Extend cache_log_message to problematic from-helper annotations (#1481)
WARNING: Unsupported or unexpected from-helper annotation
with a name reserved for Squid use
The above message is emitted for every helper response containing
problematic annotations. Let admins control this reporting using
cache_log_message directive and message id 69.
Alex Rousskov [Sat, 16 Sep 2023 12:50:10 +0000 (08:50 -0400)]
Restore support for legacy cache_object cache manager requests (#1475)
Squid v6.3 commit 6695897 (i.e. a backport of master/v7 commit 3c383cc)
accidentally removed support of legacy cache_object URLs (that master/v7
does not support) from Squid v6. This fix restores that support in v6.
Alex Rousskov [Fri, 14 Jul 2023 17:46:30 +0000 (17:46 +0000)]
Do not use static initialization to register modules (#1422)
ERROR: ... Unknown authentication scheme 'ntlm'.
When a translation unit does not contain main() and its code is not used
by the rest of the executable, the linker may exclude that translation
unit from the executable. This exclusion by LTO may happen even if that
code _is_ used to initialize static variables in that translation unit:
"If no variable or function is odr-used from a given translation unit,
the non-local variables defined in that translation unit may never be
initialized"[^1].
For example, src/auth/ntlm/Scheme.o translation unit contains nothing
but NtlmAuthRr class definition and static initialization code. The
linker knows that the rest of Squid does not use NtlmAuthRr and excludes
that translation unit from the squid executable, effectively disabling
NTLM module registration required to parse "auth_param ntlm" directives.
The problem does affect existing NTLM module, and may affect any future
module code as we reduce module's external footprint. This change
converts all RegisteredRunner registrations from using side effects of
static initialization to explicit registration calls from SquidMain().
Relying on "life before main()" is a design bug. This PR fixes that bug
with respect to RegisteredRunner registrations.
Due to indeterminate C++ static initialization order, no module can
require registration before main() starts. Thus, moving registration
timing to the beginning of SquidMain() should have no negative effects.
The new registration API still does not expose main.cc to any module
details (other than the name of the registration function itself). We
still do not need to #include any module headers into main.cc. Compiler
or linker does catch most typos in RegisteredRunner names.
Unfortunately, explicit registration still cannot prevent bugs where we
forget to register a module or fail to register a module due to wrong
registration code guards. Eventually, CI will expose such bugs.
Due to poor code duplication, commit 92a5adb accidentally classified
URLs without a trailing slash in the magical prefix as valid cache
manager URLs, triggering the above ERRORs. We were denying such
"slashless" cache manager URLs (as invalid internal URLs) prior to that
commit. Since that commit, the ERRORs triggered by that commit
effectively denied them as well. Denying them properly results in
simpler/smaller code (than allowing them would), so we should avoid a UI
change and continue to deny them, at least for now.
This change also reduces duplication of magic prefix definitions. Other
pending work will completely eliminate that duplication in src/ code.
Alex Rousskov [Sat, 12 Aug 2023 15:40:12 +0000 (15:40 +0000)]
Bug 5294: ERR_CANNOT_FORWARD returned instead of ERR_DNS_FAIL (#1453)
Since 2017 commit fd9c47d, peer selection code stopped reporting
ERR_DNS_FAIL cases because PeerSelector::noteIps() treated DNS answers
without IP addresses as if at least one IP address was received. Without
seeing a DNS resolution error, the ultimate recipient of the DNS
resolution results (e.g., CONNECT tunneling or regular forwarding code)
used ERR_CANNOT_FORWARD to indicate a failure to find a forwarding path.
PeerSelector::noteIps() code mimicked legacy IPH code with regard to
handling of the addresses parameter. However, IPH caller had a special
emptyIsNil adjustment that was missing from the noteIps() call! We now
apply that adjustment to both noteIps() and IPH code paths.
Long-term, we should probably remove nil address container pointers.
Having two different ways to signal lack of IPs is dangerous. Currently,
there is only one known supplier of nil address container:
IpcacheStats.invalid code that validates ipcache_nbgethostbyname() name
parameter. Either the corresponding nil/empty name check should be
converted into an assertion (blaming the ipcache_nbgethostbyname()
caller for not validating the name) OR that checking code should supply
an empty address container to finalCallback().
Shmaya [Fri, 11 Aug 2023 17:41:30 +0000 (17:41 +0000)]
Bug 4981: Work around in-call job invalidation bugs (#1428)
Bug 4981 is one known case of such invalidation, but this workaround is
much broader than that bug context. We can speculate that architectural
problems described in commit e3b6f15 are behind (some of) these bugs.
Alex Rousskov [Sat, 29 Jul 2023 17:44:59 +0000 (17:44 +0000)]
Bug 5290: pure virtual call in Ftp::Client constructor (#1429)
FATAL: Dying from an exception handling failure;
exception: [no active exception]
Converting `this` to CbcPointer in a constructor of an abstract class
like Ftp::Client does not work because our virtual toCbdata() method
remains pure until the final/child class constructor runs.
Conceptually, the bug was probably introduced in 2013 commit 434a79b,
when FTP class hierarchy grew, making Ftp::Client an abstract class, but
the trigger was recent commit 337b9aa that removed CBDATA_CLASS() from
Ftp::Client class declaration. We discovered, described, and addressed
several such bugs in that commit, but we missed this case.
Andrew Novikov [Sun, 9 Jul 2023 02:05:56 +0000 (02:05 +0000)]
Bug 5187: Work around REQMOD satisfaction regression (#1400)
Commit ba3fe8d broke ICAP REQMOD satisfaction transactions. In some
cases, this workaround may resurrect Squid Bug 5187. Triage available at
https://bugs.squid-cache.org/show_bug.cgi?id=5187#c6
Alex Rousskov [Sun, 23 Jul 2023 01:38:19 +0000 (01:38 +0000)]
Fix memory leak when reconfiguring multiline all-of ACLs (#1425)
Normally, Acl::InnerNode::add() automatically registers stored ACL nodes
for future cleanup, but when we find the second all-of rule/line (with
the same ACL name), we do not add() the newly created OrNode and have to
explicitly register it to avoid memory leaks on reconfiguration.
Leaking since all-of ACL support was added in 2013 commit 6f58d7d.
Alex Rousskov [Tue, 27 Jun 2023 11:58:16 +0000 (11:58 +0000)]
Do not cache (and do not serve cached) cache manager responses (#1185)
The fixed bug affected cache manager transactions that were using
/squid-internal-mgr URL path prefix with http(s) URL scheme. It did not
affect transactions that were using legacy cache_object URL scheme.
Stale cache manager responses had their Age response header set to the
number of seconds since Unix epoch. If disk and memory caches were
disabled, then cache manager requests just triggered "found KEY_PRIVATE"
WARNINGs in cache.log (for reasons that remain unclear).
Probably broken since 2011 commit e37bd29 that did not expand
HttpRequest::maybeCacheable() (called cacheable() back then)
PROTO_CACHE_OBJECT check to include /squid-internal-mgr requests.
Also added missing Access-Control-* response headers to cache manager
responses in SMP mode and reduced code duplication related to sending
those headers (which led to them missing in SMP Squids).
Alex Rousskov [Wed, 14 Jun 2023 14:09:38 +0000 (14:09 +0000)]
Fix key equality comparison in LookupTable map (#1379)
An std::unordered_map with case-insensitive keys must use a
case-insensitive key equality comparison. lookupTable_t used (the
default) std::equal_to<SBuf> comparison which is case-sensitive.
The full extent of this bug effects is unknown, but Squid was
mishandling Cache-Control and Surrogate-Control directives with
non-canonical (i.e. other than all-lower-case) spelling. Similar
problems affected WWW-Authenticate Digest parameter names, but the
related code remains inconsistent, with both case-sensitive and
case-insensitive checks applied to some of the key names in
Auth::Digest::Config::decode().
Also removed a similarly buggy (and, technically, unused) "typical use"
example from the CaseInsensitiveSBufHash class description. C++
documentation and Squid code are better sources of usage examples when
it comes to STL-used concepts like hash function objects.
This minimal fix excludes LookupTable class polishing.
Alex Rousskov [Mon, 19 Jun 2023 01:48:38 +0000 (01:48 +0000)]
Honor DNS RR TTLs larger than negative_dns_ttl (#1380)
Since 2017 commit fd9c47d, Squid was effectively ignoring DNS RR TTLs
that exceeded negative_dns_ttl (i.e. 60 seconds by default) because the
"find the smallest TTL across the DNS records seen so far" code in
ipcache_entry::updateTtl() mistook the "default" ipcache_entry::expires
value as the one based on an earlier seen DNS record.
In most cases, this bug decreased IP cache hit ratio.
Existing fqdncache code does not suffer from the same bug because
fqdncacheParse() always resets fqdncache_entry::expires instead of
updating it incrementally. ipcacheParse() has to update incrementally
because it is called twice per entry, once with an A answer and once
with an AAAA answer.
Ideally, ipcache_entry::expires should be made optional to eliminate
awkward "first updateTtl() call" detection, but doing so well requires
significant code changes, so that entries without a known expiration
value are not cached forever _unless_ they were loaded from /etc/hosts.
And those changes should probably be propagated to fqdncache.cc.
Alex Rousskov [Sat, 17 Jun 2023 16:34:11 +0000 (16:34 +0000)]
Bug 5278: Log %err_code for "early" request handling errors (#1382)
Certain "early" request handling errors (e.g., ERR_TOO_BIG,
ERR_UNSUP_REQ, ERR_UNSUP_HTTPVERSION, ERR_PROTOCOL_UNKNOWN, and
ERR_INVALID_URL) are handled with nil ClientHttpRequest::request object.
ErrorState relied on that object to record error type and details, so
%err_code expanded to "-" in the corresponding access.log records.
Now, ErrorState also uses ALE (where available). ALE already handles the
combination of an early error and late HttpRequest (e.g., the fake one
clientReplyContext::createStoreEntry() creates when handling the error).
This minimal fix does not address known problems with ErrorState::detail
and ErrorState::BuildHttpReply(). It also duplicates related xerrno
detailing code. We will address some of these problems shortly,
including the increased code duplication problem.
Alex Rousskov [Fri, 23 Jun 2023 23:05:27 +0000 (23:05 +0000)]
Documentation: Update stale SMP cache_dir caveats (#1394)
The requirement to specify "workers" before "cache_dir" was added in
2010 commit acf69d7. It became obsolete since 2011 commit 095ec2b.
The "dedicated cache directory" hack for UFS-based stores has always led
to HTTP violations, but the increased complexity of worker-to-worker
synchronization code (required to improve HTTP support) also increased
the probability of crashes or worse outcomes when SMP conditionals are
used. Those hacks violate the "all processes see the same configuration"
and similar basic code assumptions. We do not test (and usually do not
even consider the needs of) such unsupported configurations.
* web server has not been the direct release destination for
many years. Remove mentions of $dst/changesets/
* web server references github repository for basic
information. Do not push CONTRIBUTORS, COPYING, README,
CREDITS and SPONSORS files to staging area.
* release notes no longer link to local ChangeLog file.
No need to adjust the hyperlink.
When Squid failed to open a TCP connection to an origin server (or
through a cache peer) while handling a CONNECT request, Squid logged %Ss
as NONE_NONE because TunnelStateData waited for a successful TCP
connection to update the log tag. That unnecessary wait (modeled after
the necessary HTTP status code wait) complicated code. Squid now logs
TCP_TUNNEL to indicate the request handling path chosen by Squid.
We already use the same "handling path" approach for most other request
status codes. For example, TCP_MISS does not mean the miss transaction
was successful. It only means that Squid satisfied the request using the
cache miss forwarding logic. Even the LOG_TCP_REFRESH_FAIL_ERR status
code does not imply that the error response was successfully forwarded
to the client; only that the request was satisfied on a particular
(albeit very detailed in this case!) handling path.
Apply the same logic to tunneling attempts blocked by miss_access. They
were also logged as NONE_NONE and are now logged as TCP_TUNNEL.
The actual outcome of a tunneling attempt can usually be determined
using %err_code/%err_detail and %>Hs logformat codes.
Alex Rousskov [Wed, 17 May 2023 19:33:51 +0000 (19:33 +0000)]
Do not erase aborted StoreMap entries that are still being read (#1344)
The old `!s.lock.readers` precondition for calling freeChain() in
StoreMap::abortWriting() was not broad enough: It covered "old" readers
that have incremented ReadWriteLock::readers already while missing those
lockShared() callers that have fully qualified for becoming a reader but
have not incremented the `readers` counter yet. A correct test is now
implemented by ReadWriteLock::stopAppendingAndRestoreExclusive().
This code was broken since appending mode was introduced in 2013 commit ce49546. Evidently, the mishandled race condition is short-lived enough
and abortWriting() calls are rare enough, that there are no (known)
relevant bug reports. This bug was accidentally discovered while reading
the code to explain another SMP bug.
Controller::markedForDeletionAndAbandoned() similarly misjudges
readers(), but that bug does not lead to serious problems, and we would
like to find a fix that identifies such entries without false positives
(that looking at ReadWriteLock::readLevel instead of readers creates).
Also polished Ipc::StoreMap::abortWriting() debugging a little.
Alex Rousskov [Wed, 10 May 2023 20:45:27 +0000 (20:45 +0000)]
Do not leak Security::CertErrors created in X509_verify_cert() (#1346)
ACLFilledChecklist was using a raw C pointer for handling cbdata-managed
Security::CertErrors. Some sslErrors users knew about hidden cbdata
requirements, some did not, resulting in inconsistent locking/unlocking
and associated memory leaks. Upgrading ACLFilledChecklist::sslErrors to
smart CbcPointer fixes those leaks (and simplifies code).
Fix default build with libgnutls but absent GnuTLS headers (#1332)
Ensure that when libgnutls is found but any of the required GnuTLS
headers cannot be used, GnuTLS support is not enabled by default and an
explicit request for GnuTLS support is fatally rejected during
./configure. Otherwise, the build fails later anyway, during "make".
ClientHttpRequest::out.offset is documented to only count body bytes,
and that is what receivedBodyTooLarge() needs. We should not subtract
the guessed header size, occasionally violating reply_body_max_size
configuration and often leading to a confusing negative body size in
receivedBodyTooLarge() debugging.
Possibly broken since 2003 commit b51aec6: out.offset was not documented
back then, but, AFAICT, it was already used just for the body bytes.
Alex Rousskov [Sun, 2 Apr 2023 07:42:10 +0000 (07:42 +0000)]
Fix GCC v13 build [-Warray-bounds] (#1318)
src/ipc/mem/FlexibleArray.h:34:52: error: array subscript -1 is
below array bounds of 'int [1]' [-Werror=array-bounds]
We suspect this warning is a GCC v13 regression bug because the callers
marked as problematic by GCC (e.g., Rock::LoadingEntry::LoadingEntry) do
not use "array subscript -1", and the Ipc::StoreMapItems::at() operator
they use even asserts that the subscript is not negative. It might be
GCC bug 107699: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107699
This change replaces the fake one-item array hack with a properly
aligned byte (still used as the "start of the real array" marker).
Also removed some unused and problematic code (instead of polishing it).
Alex Rousskov [Sat, 25 Mar 2023 03:40:27 +0000 (03:40 +0000)]
Improve detection of lonely StoreEntry clients (#1312)
The first storeClientType() call still sees zero nclients because
storeClientListAdd() creates a store_client object (which triggers
storeClientType()) first and only then adds it to the list of clients
(which increases nclients). This off-by-one bug was introduced in commit 528b2c6 that reordered nclients increase and the storeClientType() call.
The bug could lead to excessive disk I/O in some environments because
Squid could see more STORE_DISK_CLIENT store_client objects, and those
may trigger startSwapin() in store_client::doCopy(). However, the exact
effects of this bug/fix are difficult to predict because startSwapin()
may be separated by async calls and is guarded by additional conditions.
A full fix requires refactoring the caller, but there are so many
problems associated with storeClientType() and startSwapin() calls that
such refactoring deserves a dedicated effort.
Technically, this "first" client may be 10th client this StoreEntry has
seen, but it is the first one among the _current_ entry clients.
Alex Rousskov [Fri, 24 Mar 2023 23:58:42 +0000 (19:58 -0400)]
CI: Remove irrelevant branch from v6 GitHub Actions config (#1311)
The original GitHub Actions configuration (see commit 2ed8cc4)
incorrectly assumed that GitHub will use that configuration (stored in
the master branch) for testing _all_ repository branches. In reality,
GitHub uses configuration from the being-tested branch. Thus, we do not
need to (and, hence, should not) list any branches in the X branch
configuration, with the exception of the X branch itself and the "auto"
branch where all PRs (including all X-targeting PRs) are staged for the
final "as if merged" testing.
Alex Rousskov [Sun, 19 Feb 2023 00:23:31 +0000 (00:23 +0000)]
Ensure StoreEntry presence during store_client lifetime (#1281)
Most likely, all store_client users already lock their StoreEntry
objects, so this change is unlikely to affect any current use cases.
However, objects like store_client that store StoreEntry objects for
long-term/asynchronous reuse should lock their entries rather than rely
on others to do that for them. Being able to rely on StoreEntry locking
also clarifies existing logic in several store_client methods.
Alex Rousskov [Tue, 14 Feb 2023 21:31:11 +0000 (21:31 +0000)]
CI: Disable optimizations in "minimal" build tests (#1278)
Our "minimal" layer tests had to stop disabling optimizations in 2019
commit 4f3c41b due to problems specific to GCC v9 (v9.1 or earlier). GCC
v9.4 (and later) appear to work OK (at least on Ubuntu 22.04), so let's
try to resume testing --disable-optimizations ./configure option.
Enable compiler checks for printf-like SBuf methods (#1270)
After commit 0f17e25, these two are the only known printf()-like
functions not already covered by these compiler checks. Some helper
debugging hacks do not have PRINTF_FORMAT_ARG2 or similar attributes,
but they are covered because their `__GNUC__` variants use standard
fprintf().
Since inception (2017 commit e99fa72), FileOpeningConfig::RetryGapUsec
was declared but not defined in any translation unit (a bug). The build
probably "worked" because compilers simply inlined the value of the
constant data member when calling xusleep(). Commit 5eee0e4 removed code
that passed RetryGapUsec to that external "C" function. Now, the same
member is passed to a heavily-inlined and convoluted STL code, and at
least some compilers (e.g., GCC v10-v12 on Ubuntu 22.04) stopped
inlining RetryGapUsec unless optimization is enabled. Our CI tests
passed because we do not test --disable-optimizations builds (yet).
We do not need FileOpeningConfig::RetryGapUsec to be static, breaking
builds, requiring extra code, and triggering questions. We do not even
need it to be constant, but removing "const" is an arguably out-of-scope
change, as is improving its type.
File::InvalidHandle is missing a definition as well, except on Windows,
but we are leaving that code "as is" for now, until we can test whether
Windows is OK with removing that "static" keyword.
Alex Rousskov [Tue, 14 Feb 2023 10:03:34 +0000 (10:03 +0000)]
CI: Extend GitHub Actions build-tests to doc/release-notes/ (#1275)
"make dist[check]" does nothing in doc/release-notes unless a linuxdoc
tool has been detected during ./configure, missing errors like this:
make[6]: Entering directory /squid/doc/release-notes
linuxdoc -B html -T 2 --split=0 release-6.sgml
Processing file release-6.sgml
onsgmls: ...E: end tag for element "E" which is not open
Alex Rousskov [Tue, 14 Feb 2023 00:00:20 +0000 (00:00 +0000)]
Place more msgs under cache_log_message control, downgrade some (#1273)
Also support cache_log_message-controlled messages in ::Config-unaware
libraries (e.g., libip) used by tools (e.g., cachemgr.cgi).
Restricting the very first (i.e. "depth 0") "Processing Configuration
File" message does not work during startup (before that file is parsed),
and does not work during reconfiguration (because the old configuration
is reset shortly before logging that line). Future reconfiguration
support improvements may fix the reset problem. Restrictions do work as
expected for included files (i.e. positive "depth" levels).
Restricting "BCP 177 violation" WARNINGs does not work because the
warnings are printed _before_ Squid configuration is parsed. Future
initialization improvements may fix this.
Also downgraded the importance of a few (re)configuration
progress-reporting messages from level 1 to level 2.
Run cache_peer probes without transaction context (#1256)
Even when triggered by a master transaction, these background probes
should not be associated with it. That transaction just wants to know
whether the neighbor is "up" right now. The transaction does not wait
(and is not aware of) the background probe its question may or may not
have triggered.
Amos Jeffries [Fri, 10 Feb 2023 12:25:07 +0000 (12:25 +0000)]
Cleanup: remove xusleep() hacks (#613)
usleep() has been obsolete for a long time. C++11 provides better and
portable waiting mechanism we can use easily instead of these hacks
trying to replicate and extend usleep().
The original text contained a "last modification age" typo and was
misinterpreted by some admins as if Squid applied the configured percent
to the current object age.
The text also did not make it clear that the percent-based heuristic is
effectively only applied to responses with a Last-Modified header (in
addition to the usual "without an explicit expiry time" precondition).
squidclient: Drop cache_object protocol support (#1251)
Use `http` URL scheme for `mgr:command` cache manager requests instead.
Also fixed a bug: When `mgr:command` was used together with a `-w X`
command-line option, squidclient sent that proxy authentication password
`X` in the `cache_object` URL instead of sending the `Y` password from
the `-W Y` option (or no password at all if no `-W` option was given).
If no Authorization header had been sent, Squid's cachemgr_passwd
honored the `-w X` password sent in the cache_object URL. If
Authorization header had been sent, Squid's cachemgr_passwd honored the
corresponding `-W Y` password, ignoring any password sent in the
cache_object URL.
Now, squidclient uses Authorization HTTP header for sending cache
manager authentication credentials with `mgr:command` requests. Those
credentials are taken either from the `-U` and `-W` command-line options
(if `mgr:command` parameter lacks an embedded password) or from the `-U`
command line option and from the `mgr:command@password` parameter
(otherwise).
squidclient now sends Proxy authentication credentials if and only if
`-u` and `-w` command line options are given. These credentials may be
required to authenticate with the proxy, but they do not affect cache
manager authentication driven by the cachemgr_passwd directive.
Also prohibit specifying a command-line option with a password (`-w` or
`-W`) without specifying the corresponding user name (`-u` or `-U`).
Prior to this change, squidclient did not send the Proxy-Authorization
or Authorization header if the corresponding user name was missing but
did not complain about the problem.
Remove broken -sha1 option from server_cert_fingerprint (#1249)
server_cert_fingerprint support for the sha1 parameter has been broken
for years, probably since its inception (2012 commit 42d3334). The bug
was known since at least 2018 when it was mentioned in Bug 4847
discussion. The single-dash syntax violates the double-dash pattern
used for other --long ACL options. If fixed, using the option would not
change Squid behavior because SHA1 is the default (and the only
supported) fingerprinting algorithm.
The option was meant to allow admins to be explicit about that default
in case it changes in the future, but implementation bugs derailed that
plan. The fix is not straightforward, and we should be focusing on more
important things.
Improve error message storage in Dns::LookupDetails (#1241)
Removed one more deprecated String usage and optimized handling of a
common "no error, no lookup" case as well as portions of the "no-error
lookup" code paths. Further optimizations need similar ipcache_entry and
fqdncache_entry error storage changes (at least).
Co-authored-by: Alex Rousskov <rousskov@measurement-factory.com>
Fix logging of responses truncated by premature EOF (#1244)
When a transaction failed due to a premature EOF while receiving an HTTP
response, %Ss logformat code was missing the ABORTED suffix (e.g.,
logging TCP_MISS instead of TCP_MISS_ABORTED). When premature EOF
truncated the header, the default "squid" access log format contained a
502 sent status code (%>Hs), hinting at the problem. When the body was
truncated, even that weak hint was usually absent because the actual
status code returned by the server (usually 200) was usually logged.
Similarly, %err_code/%err_detail logformat codes for HTTP responses with
truncated by a premature EOF bodies carried no information. Now they
expand to ERR_READ_ERROR/SRV_PREMATURE_EOF in those cases.
No changes to requests truncated by Squid-server read timeouts. They
were and are still logged with TIMEDOUT %Ss suffix and
ERR_READ_TIMEOUT/WITH_SERVER %err_code/%err_detail.
Also adjusted WHOIS to mark zero-length responses with
ERR_ZERO_SIZE_OBJECT instead of default ERR_READ_ERROR. This affects
forwarded WHOIS transactions and AS number queries generated during
(re)configuration.
Reduce code duplication in ACLCertificateData::parse() (#1242)
ACLs ca_cert and user_cert already prohibit key changes. And
server_cert_fingerprint ACL only supports one option name spelling, so,
while "-sha1" is not, technically, a "key", it still cannot be
"changed". Adjust SetAclKey() (added in recent commit 4a3b853) and reuse
it to implement those existing "key change" checks.
Fix build with clang v15.0.6 [-Wdeprecated-non-prototype] (#1246)
clang v15.0.6 distributed with Fedora Rawhide complains about function
definitions using K&R style:
lib/snmplib/mib.c:229:1: error: a function definition without a
prototype is deprecated in all versions of C and is not supported in
C2x [-Werror,-Wdeprecated-non-prototype]
Amos Jeffries [Tue, 17 Jan 2023 02:19:01 +0000 (02:19 +0000)]
Removed tcp-banger* and pconn-banger tools (#1236)
We do not have enough resources/demand for maintaining these tools, they
do require maintenance, and there are better tools available.
* tcp-banger2 is not built by default and probably could not be built at
all since at least 2006 commit 5679212.
* tcp-banger3 lacked build rules since inception (1998 commit 2742510)
and probably could not be built manually (by mimicking tcp-banger2
build commands) without warnings since 2002 commit 88d8a2a.
* pconn-banger lacked build rules since inception (1997 commit eb5f55b)
and probably could not be built manually since at least 2007 commit cc192b5.
* tcp-banger.pl has portability and code quality issues; its basic
functionality is supported by squidclient, wget, curl, and others.
Co-authored-by: Alex Rousskov <rousskov@measurement-factory.com>
Alex Rousskov [Mon, 16 Jan 2023 23:24:41 +0000 (23:24 +0000)]
Maintenance: Check for expected squid-conf-tests notifications (#1231)
Sometimes, Squid must accept a configuration file while also logging a
WARNING or even an ERROR message. The new "expect-message" test case
instruction tells test-suite/test-squid-conf.sh to check for a given
expected message while still expecting configuration acceptance.
Two test cases (mgr_passwd.conf and time_units.conf) were enhanced. We
plan to use this feature in future tests already under development.