Alex Rousskov [Mon, 30 Jun 2025 20:32:16 +0000 (20:32 +0000)]
Simplify appending SBuf to String (#2108)
There is still a lot of code that uses legacy String class, and an
increasing amount of code that has to append modern SBufs to Strings.
SBuf handles NUL bytes better than String. With respect to handling NUL
bytes, the new append() method is as (un)safe as the existing explicit
size-based one (that callers tended to use prior to these changes anyway).
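A minimal sketch of the two call styles, assuming the new overload is
String::append(const SBuf &) next to the existing length-based
String::append(const char *, int); header names are illustrative:
```
// a minimal sketch, not the actual patch: the new overload forwards the
// length internally, so embedded NUL bytes are handled the same way as
// with the explicit size-based call
#include "sbuf/SBuf.h"
#include "SquidString.h"

void appendToLegacyString(String &header, const SBuf &value)
{
    // before: callers spelled out the length themselves
    header.append(value.rawContent(), value.length());

    // after: same effect, one call
    header.append(value);
}
```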
aafbsd [Thu, 26 Jun 2025 18:57:19 +0000 (18:57 +0000)]
Bug 5497: Fix detection of duped IPs returned by getaddrinfo() (#2100)
WARNING: Ignoring <IP X> because it is already covered by <IP X>
Affects `src`, `dst`, and `localip` ACLs, especially those that use
domain names with multiple DNS A or AAAA records.
IP addresses returned by getaddrinfo(3) may not be sorted (e.g., when a
host has multiple DNS A RRs on FreeBSD). Instead of comparing the
current address with just the previous one, we now check all previously
added addresses (while processing a single getaddrinfo() call output).
This surgical fix minimizes changes without improving surrounding code.
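A sketch of the broader duplicate check, assuming results accumulate
in a std::vector while walking the getaddrinfo(3) list; Ip::Address
construction from addrinfo is simplified for illustration:
```
// a minimal sketch, not the actual patch
#include <algorithm>
#include <netdb.h>
#include <vector>

void collectAddresses(const addrinfo *results, std::vector<Ip::Address> &out)
{
    for (auto ai = results; ai; ai = ai->ai_next) {
        const Ip::Address candidate(*ai->ai_addr); // illustrative conversion
        // getaddrinfo(3) output may be unsorted, so compare against all
        // previously added addresses, not just the last one
        if (std::find(out.begin(), out.end(), candidate) != out.end())
            continue; // the "already covered" WARNING case
        out.push_back(candidate);
    }
}
```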
Amos Jeffries [Sun, 22 Jun 2025 00:48:05 +0000 (00:48 +0000)]
Fix missing CONTRIBUTOR name (#2091)
Despite the CONTRIBUTORS file being UTF-8, Ture's name was dropped by
our conversion script, which does not handle non-ASCII characters
nicely. Re-add it manually.
Do not duplicate received Surrogate-Capability in sent requests (#2087)
When computing Surrogate-Capability header while forwarding an
accelerated request, Squid duplicated old (i.e. received) header entries
(if any). For example, this outgoing request shows an extra hop1 entry:
GET / HTTP/1.1
...
Surrogate-Capability: hop1="Surrogate/1.0"
Surrogate-Capability: hop1="Surrogate/1.0", hop2="Surrogate/1.0"
Amos Jeffries [Mon, 9 Jun 2025 04:49:04 +0000 (04:49 +0000)]
CI: Fix gperf 3.2 output filter (#2081)
gperf 3.2 now provides fallthrough attributes that are properly scoped
to the compiler and C++ version in use. Our filter, which converts
gperf 3.1 and older output to meet C++17 attribute requirements, is now
broken and produces compiler errors due to listing '[[fallthrough]];'
on two consecutive lines.
Fix OpenSSL build with GCC v15.1.1 [-Wformat-truncation=] (#2077)
On arm64 Fedora 42:
src/ssl/crtd_message.cc:132:39: error: '%zd' directive output may be
truncated writing between 1 and 19 bytes into a region
of size 10 [-Werror=format-truncation=]
snprintf(buffer, sizeof(buffer), "%zd", body.length());
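One way to avoid this class of warning is to size the buffer for the
widest possible decimal rendering; a minimal sketch, not necessarily
the approach taken in crtd_message.cc:
```
// a minimal sketch: a buffer wide enough for any 64-bit length
#include <cstdio>
#include <limits>

void encodeBodyLength(size_t length)
{
    char buffer[std::numeric_limits<size_t>::digits10 + 3]; // all digits plus NUL, with room to spare
    snprintf(buffer, sizeof(buffer), "%zu", length);
    // ... use buffer ...
}
```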
Carl Vuosalo [Wed, 28 May 2025 19:17:38 +0000 (19:17 +0000)]
Fix SNMP cacheNumObjCount -- number of cached objects (#2053)
SNMP counter cacheNumObjCount used StoreEntry::inUseCount() stats. For
Squid instances using a rock cache_dirs or a shared memory cache, the
number of StoreEntry objects in use is usually very different from the
number of cached objects because these caches do not use StoreEntry
objects as a part of their index. For all instances, inUseCount() also
includes ongoing transactions and internal tasks that are not related to
cached objects at all.
We now use the sum of the counters already reported on the "on-disk
objects" and "Hot Object Cache Items" lines in the "Internal Data
Structures" section of the `mgr:info` cache manager report. Due to
floating-point arithmetic,
these stats are approximate, but it is best to keep SNMP and cache
manager reports consistent.
This change does not fix SNMP Gauge32 overflow bug: Caches with 2^32 or
more objects continue to report wrong/smaller cacheNumObjCount values.
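A rough sketch of the new gauge computation, assuming
StoreInfoStats-style floating-point counters behind the two lines
quoted above; field names are illustrative:
```
// a rough sketch, not the actual SNMP handler
unsigned int cachedObjectCount(const StoreInfoStats &stats)
{
    // "on-disk objects" + "Hot Object Cache Items" from mgr:info
    const double total = stats.swap.count + stats.mem.count;
    return static_cast<unsigned int>(total); // still wraps past 2^32 (Gauge32)
}
```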
### On MemStore::getStats() and StoreInfoStats changes
To include the number of memory-cached objects while supporting SMP
configurations with shared memory caches, we had to change how cache
manager code aggregates StoreInfoStats::mem data collected from SMP
worker processes. Before these changes, `StoreInfoStats::operator +=()`
used the mem.shared data member to trigger a special aggregation code
hack, but
* SNMP-specific code cannot benefit from that StoreInfoStats aggregation
because SNMP code exchanges simple counters rather than StoreInfoStats
objects. `StoreInfoStats::operator +=()` is never called by SNMP code.
Instead, SNMP uses Snmp::Pdu::aggregate() and friends.
* We could not accommodate SNMP by simply adding special aggregation
hacks directly to MemStore::getStats() because that would break
critical "all workers report about the same stats" expectations of the
special hack in `StoreInfoStats::operator +=()`.
To make both SNMP and cache manager use cases work, we removed the hack
from StoreInfoStats::operator +=() and hacked MemStore::getStats()
instead, making the first worker responsible for shared memory cache
stats reporting (unlike SMP rock diskers, there is no single kid process
dedicated to managing a shared memory cache). StoreInfoStats operator
now uses natural aggregation logic without hacks.
TODO: After these changes, StoreInfoStats::mem.shared becomes
essentially unused because it was only used to enable the special
aggregation hack in StoreInfoStats that no longer exists. Remove?
Bug 5352: Do not get stuck in RESPMOD after pausing peer read(2) (#2065)
The transaction gets stuck if Squid, while sending virgin body bytes to
an ICAP RESPMOD service, temporarily stops reading additional virgin
body bytes from a cache_peer or origin server. Squid pauses reading
(with readSizeWanted becoming zero) if reading more virgin bytes is
temporarily prohibited by delay pools and/or read_ahead_gap limits:
readReply: avoid delayRead() to give adaptation a chance to drain...
HttpStateData::readReply() starts waiting for ModXact to drain the
BodyPipe buffer, but that draining may not happen, either because
ModXact::virginConsume() is not called at all[^1] or because it is
"postponing consumption" when BodyPipe still has some unused space[^2].
With HttpStateData not reading more virgin bytes, Squid may not write
more virgin body bytes to the ICAP service, and the ICAP service may not
start or continue responding to the RESPMOD request. Without that ICAP
activity, ModXact does not consume, the virgin BodyPipe buffer is not
drained, HttpStateData is not reading, and no progress is possible.
HttpStateData::readReply() should start waiting for adaptation to drain
BodyPipe only when the buffer becomes completely full (instead of when
it is not empty). This change may increase virgin response body bytes
accumulation but not the buffer capacity because existing buffer
space-increasing logic in maybeMakeSpaceAvailable() remains intact.
To prevent stalling, both BodyPipe ends (i.e. HttpStateData and
Icap::ModXact) must use matching "progress is possible" conditions, but
* HttpStateData used hasContent()
* Icap::ModXact used spaceSize()
* Ftp::Client used potentialSpaceSize()
Now, all three use matching potentialSpaceSize()-based conditions.
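A simplified sketch of the matching predicate, using the
spaceSize()/potentialSpaceSize() distinction described above; the real
HttpStateData/ModXact code carries more state:
```
// a simplified sketch, not the actual code
bool producerShouldWaitForDraining(const BodyPipe &pipe)
{
    // old HttpStateData condition: pipe.buf().hasContent() -- waited as
    // soon as any virgin bytes were buffered, enabling the deadlock above
    // new condition: wait only when no more space can possibly be made
    return pipe.buf().potentialSpaceSize() == 0;
}
```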
Squid eCAP code is unaffected by this bug, because it does not postpone
BodyPipe consumption. eCAP API does not expose virgin body buffer
capacity, so an eCAP adapter that postpones consumption risks filling
the virgin body buffer and stalling. This is an eCAP API limitation.
[^1]: Zero readSizeWanted is reachable without delay pools, but only if
Squid receives an adapted response (that makes readAheadPolicyCanRead()
false by filling StoreEntry). Ideally, receiving an adapted response
should result in virginConsume() calls (that would trigger BodyPipe
draining), but currently it may not. Reliably starting virgin data
consumption sooner is not trivial and deserves a dedicated change.
[^2]: ModXact postpones consumption to preserve virgin bytes for ICAP
retries and similar purposes. ModXact believes it is safe to postpone
because there is still space left in the buffer for HttpStateData to
continue to make progress. ModXact would normally start or resume
draining the buffer when sending more virgin bytes to the ICAP service.
Except for a trivial spaceSize() call and a never-executed (for the only
caller) `space` parameter adjustment, replyBodySpace() duplicates
Client::calcBufferSpaceToReserve(). This duplication complicates
refactoring aimed at addressing other known buffer management problems.
Amos Jeffries [Sun, 18 May 2025 06:39:04 +0000 (06:39 +0000)]
Maintenance: Remove shared LDADD (#2058)
Most built binaries have a distinct set of dependencies and already have
their own foo_LDADD variables. Add a few variables to cover the
remaining binaries and stop setting an (incomplete) LDADD global.
Also removed unnecessary EXTRA_PROGRAMS because mem_node_test and splay
binaries are built unconditionally.
Alex Rousskov [Wed, 14 May 2025 14:28:51 +0000 (14:28 +0000)]
CI: Upgrade Ubuntu runners from v22.04 to v24.04 (#2064)
* Ubuntu 24.04 uses different apt sources file location and format.
* Some Ubuntu 24.04 GitHub Actions runners lack libltdl-dev as detailed
at https://github.com/actions/runner-images/issues/11316.
We could explicitly install libltdl-dev for functionality tests, but
decided to go one step further and unify all prerequisites
installation steps for tests that build Squid on Ubuntu.
Also added a hack to fix /etc/hosts broken on some Ubuntu runners and
resulting in Squid startup errors during functionality tests:
kid1| ERROR: ipcacheAddEntryFromHosts: Bad IP address '-e'
Do not treat tiny virgin response buffer space specially (#2056)
A long series of commits sprinkled Squid code with special treatment of
virgin response buffer space equal to one or two bytes:
* 2005 commit 2afaba07 adds a 2-byte hack to HttpStateData. The hack
avoids stalling transactions by never asking
StoreEntry::delayAwareRead() to read 1 byte.
* 2006 commit 253cacc adds the same 2-byte hack to Ftp::Client.
* 2007 commit 478cfe99 adds a 2-byte hack to
Icap::ModXact::virginConsume() so that ICAP transactions do not stall
when HttpStateData hack prevents new virgin bytes from being read into
a BodyPipe buffer that has just one or two space bytes left. The ICAP
hack was probably one byte more aggressive than necessary...
* 2011 commit 0ad2b63b adds +1 byte hacks to ServerStateData and
ClientHttpRequest to work around the same StoreEntry::delayAwareRead()
problem leading to excessive buffering with slow clients.
* A followup 2012 commit 4dc2b072 adjusts StoreEntry::delayAwareRead()
to allow 1-byte reads, making _all_ of the above hacks unnecessary.
However, that commit only removed +1 byte hacks added in 2011, missing
HttpStateData and ICAP hacks added in 2005, 2006 and 2007, probably
because that 2012 work focused on fixing the bug introduced by 2011
commit (that was triggered by slow client-Squid communication).
We now remove the remaining known hacks that worked around
StoreEntry::delayAwareRead() inability to read 1 byte (i.e. the first
three hacks on the above list).
These changes also happen to fix transaction stalls with read_ahead_gap
set to 1 and probably some other rare use cases: Transactions could
stall because inBuf.spaceSize() may be 1 for an emptied inBuf, leading
to maybeMakeSpaceAvailable() returning zero due to `read_size < 2` being
true. That zero result prevents HttpStateData from making progress by
reading new virgin response bytes. Mistreatment of "emptied inBuf" cases
was exposed by 2023 commit 50c5af88 changes. More work is needed to
address other, more prominent "emptied inBuf" cases.
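A simplified illustration of the removed 2-byte hack and of why an
emptied inBuf with spaceSize() of 1 could stall; identifiers
approximate the HttpStateData logic described above:
```
// a simplified illustration, not the actual maybeMakeSpaceAvailable()
#include <algorithm>
#include "sbuf/SBuf.h"

size_t readSizeWanted(const SBuf &inBuf, const size_t limit)
{
    const auto read_size = std::min(inBuf.spaceSize(), limit);
    // removed hack: if (read_size < 2) return 0;  // stalled on 1-byte space
    return read_size; // 1-byte reads are now allowed
}
```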
This assertion may be triggered when RESPMOD adaptation is aborted
(e.g., when an essential ICAP service is down, or when it sends an ICAP
response status code unsupported by Squid).
Ftp::Relay::forwardReply() calls markParsedVirginReplyAsWhole() before
adaptOrFinalizeReply(). Thus, at that markParsedVirginReplyAsWhole()
call time, startedAdaptation remained false, markStoredReplyAsWhole()
was called, and storedWholeReply_ became true. If Squid then decided to
start adaptation, but that adaptation had failed, FwdState::completed()
detected a mismatch between true storedWholeReply_ and a still-empty
STORE_PENDING StoreEntry. This bug was added in 2021 commit ba3fe8d.
Squid does not write virgin response to Store while adaptation_access
check is pending or after RESPMOD adaptation has started. Code that does
not write to Store must not make storedWholeReply_ true. On the other
hand, after adaptation_access is denied, and Squid finishes storing all
virgin response bytes, Squid needs to know whether storedWholeReply_
should be set (i.e. whether Squid has received the whole virgin reply
and called markParsedVirginReplyAsWhole()). This fix adds
Client::markedParsedVirginReplyAsWhole to remember the latter event.
The FreeBSD project has promoted version 14.2 to stable.
Some packages we use are not compatible with version 14.1.
Upgrade the reference version we use; the action supports it.
Fixed missing includes and type mismatches in some local variables,
applied some AAA, extracted WIN32_maperror into its own header and
implementation files following the NoNewGlobals design, and used
modern C++ data types for it.
This change also mirrors changes introduced in aiops.cc by commit
91d1cfb. These changes require further refinement (in both files).
Examples of errors fixed:
```
aiops_win32.cc: In function
'void* squidaio_xmalloc(int)':
aiops_win32.cc:161:17: error:
invalid use of incomplete type 'class Mem::Allocator'
aiops_win32.cc: In function 'void squidaio_init()':
aiops_win32.cc:278:19: error:
comparison of integer expressions of different signedness:
'int' and 'size_t' {aka 'long long unsigned int'}
aiops_win32.cc: In function
'void squidaio_do_read(squidaio_request_t*)':
aiops_win32.cc:782:9: error:
'WIN32_maperror' was not declared in this scope
```
The pipe(2) function is not available on Windows and MinGW; use the
broader _pipe() call instead.
Fixes the following build error:
DiskThreads/CommIO.cc:
In static member function 'static void CommIO::Initialize()':
DiskThreads/CommIO.cc:26:9: error:
'pipe' was not declared in this scope; did you mean '_pipe'?
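A hedged sketch of the kind of shim this implies (not necessarily the
exact compat change); _pipe() takes an internal buffer size and a
text/binary mode flag:
```
// a minimal sketch of a Windows/MinGW shim; the actual compat layer may differ
#if defined(_WIN32)
#include <fcntl.h>
#include <io.h>

inline int pipe(int fds[2])
{
    return _pipe(fds, 4096, _O_BINARY); // buffer size and mode are illustrative
}
#endif
```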
ntlm_sspi_auth: Fix missing base64 symbol linkage (#2031)
Solve build error:
```
ld: ntlm_sspi_auth.o: in function `token_decode':
undefined reference to `nettle_base64_decode_init'
undefined reference to `nettle_base64_decode_update'
undefined reference to `nettle_base64_decode_final'
```
HttpRequest::peer_host was added in 2009 commit 9ca29d23 so that
httpFixupAuthentication() could pass copied raw CachePeer::host pointer
value to peer_proxy_negotiate_auth(). Unfortunately, raw peer_host
pointer (to CachePeer::host memory) becomes dangling when CachePeer is
reconfigured away. Instead of maintaining this problematic field, we can
safely obtain the same CachePeer::host value from HttpStateData::_peer.
Do not reuse an idle connection to a removed cache_peer (#2028)
Sending new requests to a removed cache_peer contradicts current Squid
configuration and even exposes Squid code that forgets to check
CachePeer validity to dangling pointers. We will address the latter
concern separately.
CI: Enable eCAP and Valgrind in layer build tests if possible (#1996)
Now these optional features are enabled during applicable layer tests if
their packages appear to be available on the build system. This should
help prevent regressions like the one fixed in recent commit 53ed1a9.
On Windows, mkdir only takes one argument.
compat/mswindows.h has an adapter, add it to
compat/mingw.h as well.
Solves error:
```
UFSSwapDir.cc:617:26: error: too many arguments
to function 'int mkdir(const char*)'
mingw/include/io.h:282:15: note: declared here
int __cdecl mkdir (const char *);
```
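A sketch of the one-argument mkdir adapter along the lines of what
compat/mswindows.h reportedly provides; the mode argument is accepted
and ignored:
```
// a minimal sketch; compat/mingw.h may differ in detail
#if defined(__MINGW32__)
#include <io.h>
#include <sys/types.h>

inline int mkdir(const char *path, mode_t)
{
    return ::mkdir(path); // MinGW's one-argument mkdir(); mode is ignored
}
#endif
```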
The AIO Windows compatibility layer is also necessary on MinGW.
Problems fixed:
```
DiskIO/AIO/async_io.h:58:18:
error: field 'aq_e_aiocb' has incomplete type 'aiocb'
DiskIO/AIO/async_io.h:58:12:
note: forward declaration of 'struct aiocb'
DiskIO/AIO/AIODiskFile.cc:
In member function
'virtual void AIODiskFile::read(ReadRequest*)':
src/DiskIO/AIO/AIODiskFile.cc:134:9:
error: 'aio_read' was not declared in this scope;
did you mean 'file_read' ?
```
Andreas Weigel [Thu, 13 Mar 2025 11:30:28 +0000 (11:30 +0000)]
Fix tls-dh support for DHE parameters with OpenSSL v3+ (#1949)
# When applying tls-dh=prime256v1:dhparams.pem configuration:
WARNING: Failed to decode EC parameters 'dhparams.pem'
# When forcing the use of FFDHE with something like
# openssl s_client -tls1_2 -cipher DHE-RSA-AES256-SHA256 -connect...
ERROR: failure while accepting a TLS connection on:
SQUID_TLS_ERR_ACCEPT+TLS_LIB_ERR=A0000C1+TLS_IO_ERR=1
Squid `https_port ... tls-dh=curve:dhparams.pem` configuration is
supposed to support _both_ ECDHE and FFDHE key exchange mechanisms (and
their cipher suites), depending on client-supported cipher suites. ECDHE
mechanism should use the named curve (e.g., `prime256v1`), and FFDHE
mechanism should use key exchange parameters loaded from the named PEM
file (e.g., `ffdhe4096` named group specified in RFC 7919).
When 2022 commit 742236c added support for OpenSSL v3 APIs, new
loadDhParams() code misinterpreted curve name presence in `tls-dh` value
as an indication that the named parameters file contains ECDHE
parameters, setting OSSL_DECODER_CTX_new_for_pkey() type parameter to
"EC", and (when parameter file specified FFDHE details) triggering the
WARNING message quoted above.
Squid should not expect additional ECDHE parameters when the elliptic
curve group is already fully specified by naming it at the start of
`tls-dh` value. Squid now reverts to earlier (v4) behavior, where
the two mechanisms can coexist and can be configured separately as
described above.
Furthermore, updateContextEecdh() code in commit 742236c continued to
load parsed parameters using the old SSL_CTX_set_tmp_dh() call but
should have used the SSL_CTX_set0_tmp_dh_pkey() API because the type
of parsed parameters (i.e. DhePointer) has changed from a DH to an
EVP_PKEY pointer.
This second bug affected configurations with and without an explicit
curve name in `tls-dh` value.
Also report a failure to load parsed parameters into TLS context.
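A condensed sketch of the corrected FFDHE path under OpenSSL 3; the
actual loadDhParams()/updateContextEecdh() changes are more involved,
and error reporting is reduced here to boolean returns:
```
// a condensed sketch, not the literal patch
#include <cstdio>
#include <openssl/decoder.h>
#include <openssl/evp.h>
#include <openssl/ssl.h>

bool setFfdheParams(SSL_CTX *ctx, FILE *pemFile)
{
    EVP_PKEY *dh = nullptr;
    // decode the file as "DH" (FFDHE) parameters rather than "EC": the
    // curve named at the start of tls-dh= needs no parameters from the file
    auto dctx = OSSL_DECODER_CTX_new_for_pkey(&dh, "PEM", nullptr, "DH",
                                              EVP_PKEY_KEY_PARAMETERS,
                                              nullptr, nullptr);
    const bool decoded = dctx && OSSL_DECODER_from_fp(dctx, pemFile);
    OSSL_DECODER_CTX_free(dctx);
    if (!decoded)
        return false;
    // EVP_PKEY-based parameters require set0_tmp_dh_pkey(), not the old
    // SSL_CTX_set_tmp_dh(); a loading failure is now reported, not ignored
    if (SSL_CTX_set0_tmp_dh_pkey(ctx, dh) != 1) {
        EVP_PKEY_free(dh);
        return false;
    }
    return true; // on success, ctx owns dh
}
```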
LAN Manager (LM) is an obsolete variant of the SMB protocol.
No product on the market has supported it for several years now; it
has been superseded by NTLMv1, NTLMv2, and Kerberos.
MinGW: use nameless unions in ext_ad_group_acl (#2004)
ext_ad_group_acl was written in 2008 in C, and
it used the C variant of the Win32 API.
It was then ported to C++, but the API callers were
not updated to the C++ version of the API.
With more modern compilers, and Squid enforcing stricter types and
error handling, it no longer compiles.
This is part 1 of 2 of the fixes to make the helper build again; the
scope is to update Win32 API callers so they use the C++ version of
the API.
Examples of fixed errors:
error: 'IADs' {aka 'struct IADs'} has no member named 'lpVtbl'
error: 'VARIANT' {aka 'struct tagVARIANT'} has no member named 'n1'
CI: Do not classify "no failures" stats as test-build errors (#2001)
CppUnit tests emit a lot of "FAIL: 0" and "XFAIL: 0" lines, which are
incorrectly classified as errors by test-builds.sh. Filter these
messages out as they are not indicative of problems.
CI: Revert recent "show build commands on terminal" change (#2000)
GitHub Actions UI does not handle large amounts of console output with
collapsible `::group::` sections well.
For example, the UI may truncate console output if a collapsible
`::group::` section gets too many log lines. In some cases, GitHub does
not report the truncation at all, resulting in misleading console
output. In other cases, the UI warns: "This step has been truncated due
to its large size. Download the full logs from the menu once the
workflow run has completed."
store/Disks.cc:690: error: argument 1 value 18446744073709551615
exceeds maximum object size 9223372036854775807
[-Werror=alloc-size-larger-than=]
const auto tmp = new SwapDir::Pointer[swap->n_allocated];
pconn.cc:43:53: error: argument 1 value 18446744073709551615 ...
theList_ = new Comm::ConnectionPointer[capacity_];
During shutdown, iocb_table global was deleted, but the corresponding
cleanup code ignored some IoCallback members, triggering misleading
Ipc::UdsSender memory leak reports from Valgrind.
This change uses NoNewGlobals design to address the above problem, but
this code needs a lot more refactoring to address other old problems
associated with Comm initialization.
test-builds.sh records build output in per-layer logs, and we are
already collecting those logs. Send build output to standard output as
well, to expose it in GitHub Actions user interface. This new detailed
output is enclosed in a "layer ... output" group to keep emphasizing
problematic output lines extracted by test-builds.sh.
Also enclose MacOS build setup output in a group, for similar reasons.
Level-2 "Tunnel Server RESPONSE:..." debugs() incorrectly assumed that
its readBuf parameter contained hdr_sz header bytes. In reality, by the
time code reached that debugs(), readBuf no longer had any header bytes
(and often had no bytes at all). Besides broken header dumps, this bug
could lead to problems that Valgrind reports as "Conditional jump or
move depends on uninitialised value" in DebugChannel::writeToStream().
This fix mimics HttpStateData::processReplyHeader() reporting code,
including its known problems. Future changes should address those
problems and reduce code duplication across at least ten functions
containing similar "decorated" level-2 message dumps.
Bug 5093: List http_port params that https_port/ftp_port lack (#1977)
To avoid documentation duplication, current https_port and ftp_port
directive descriptions reference http_port directive instead of
detailing their own supported parameters. For https_port, this solution
creates a false impression that the directive supports all http_port
options. Our ftp_port documentation is better but still leaves the
reader guessing which options are actually supported.
This change starts enumerating http_port configuration parameters that
ftp_port and https_port directives do _not_ support. Eventually, Squid
should reject configurations with unsupported listening port options.
Alex Rousskov [Tue, 31 Dec 2024 21:59:05 +0000 (21:59 +0000)]
Work around some mgr:forward accounting/reporting bugs (#1969)
In modern code, FwdReplyCodes[0][i] is usually zero because n_tries is
usually at least one at logReplyStatus() call time. This leads to the
mgr:forward report showing nothing but the table heading (i.e. no
stats).
Also improve `try#N` heading:data match by skipping FwdReplyCodes[0]
reporting (there is still no `try#0` heading) and adding a previously
missing `try#9` heading.
Alex Rousskov [Tue, 31 Dec 2024 20:40:46 +0000 (20:40 +0000)]
Clarify --enable-ecap failure on missing shared library support (#1968)
checking if libtool supports shared libraries... no
checking whether to build shared libraries... no
configure: error: eCAP support requires loadable modules.
Please do not use --disable-shared with --enable-ecap.
After 2022 commit 5a2409b7, our advice for handling the above error
became misleading in environments that do not use --disable-shared
explicitly but lack shared library support for other reasons.
Alex Rousskov [Tue, 31 Dec 2024 19:22:21 +0000 (19:22 +0000)]
Do not lookup IP addresses of X509 certificate subject CNs (#1967)
A true-vs-false `nodns` parameter value bug in a recent commit 22b2a7a0
caused, in some environments, significant startup delays and/or runtime
stalls because getaddrinfo(3) performed blocking DNS lookups when
parsing common names of X509 certificate subjects. Squid parses CNs when
loading configured and validating received certificates. Other side
effects may have included Squid-generated certificates having wrong
alternative subject names and/or wrong certificate validation results.
Negative names and context-disassociated boolean constants strike again!
Fortunately, associated problematic Ip::Address::lookupHostIP() will be
replaced when the existing Ip::Address::Parse() TODO is addressed.
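The underlying mechanism: a numeric-only getaddrinfo(3) query never
touches DNS. A hedged sketch of what a nodns-style parse amounts to
(not Squid's actual Ip::Address code):
```
// a minimal sketch of a DNS-free parse via AI_NUMERICHOST
#include <cstring>
#include <netdb.h>
#include <sys/socket.h>

bool parseNumericHost(const char *text, sockaddr_storage &out)
{
    addrinfo hints;
    memset(&hints, 0, sizeof(hints));
    hints.ai_flags = AI_NUMERICHOST; // fail instead of doing a blocking lookup
    addrinfo *res = nullptr;
    if (getaddrinfo(text, nullptr, &hints, &res) != 0)
        return false; // not a numeric IP; with nodns, do not fall back to DNS
    memcpy(&out, res->ai_addr, res->ai_addrlen);
    freeaddrinfo(res);
    return true;
}
```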
Alex Rousskov [Tue, 31 Dec 2024 17:27:40 +0000 (17:27 +0000)]
REQMOD stuck when adapted request (body) is not forwarded (#1966)
Squid forwards request bodies using BodyPipe objects. A BodyPipe object
has two associated agents: producer and consumer. Those agents are set
independently, at different processing stages. If BodyPipe consumer is
not set, the producer may get stuck waiting for BodyPipe buffer space.
When a producer creates a BodyPipe, it effectively relies on some code
somewhere to register a consumer (or declare a registration failure),
but automatically tracking that expectation fulfillment is impractical.
For REQMOD transactions involving adapted request bodies, including ICAP
204 transactions, Client::startRequestBodyFlow() sets body consumer. If
that method is not called, there will be no consumer, and REQMOD may get
stuck. Many `if` statements can block request forwarding altogether or
block a being-forwarded request from reaching that method. For example,
adapted_http_access and miss_access denials block request forwarding.
Without REQMOD, request processing can probably get stuck for similar
lack-of-consumer reasons, but regular requests ought to be killed by
various I/O or forwarding timeouts. There are no such timeouts for those
REQMOD transactions that are only waiting for request body consumer to
clear adapted BodyPipe space (e.g., after all ICAP 204 I/Os are over).
Relying on timeouts is also very inefficient.
For a `mgr:mem` observer, stuck REQMOD transactions look like a ModXact
memory leak. A `mgr:jobs` report shows ModXact jobs with RBS(1) status.
Report cache_peer context in probe and standby pool messages (#1960)
The absence of the usual "current master transaction:..." detail in
certain errors raises "Has Squid lost the current transaction context?"
red flags:
ERROR: Connection to peerXyz failed
In some cases, Squid may have lost that context, but for cache_peer TCP
probes, Squid has not because those probes are not associated with
master transactions. It is difficult to distinguish the two cases
because no processing context is reported. To address those concerns,
we now report current cache_peer probing context (instead of just not
reporting absent master transaction context):
ERROR: Connection to peerXyz failed
current cache_peer probe: peerXyzIP
When maintaining a cache_peer standby=N connection pool, Squid has
both contexts and now reports them, attributing messages to pool
maintenance:
ERROR: Connection to peerXyz failed
current cache_peer standby pool: peerXyz
current master transaction: master1234
The new PrecomputedCodeContext class handles both reporting cases and
can be reused whenever the cost of pre-computing detailCodeContext()
output is acceptable.
CI: Add OpenBSD build tests for staged commits (#1964)
Use a GitHub-hosted VM to create an OpenBSD test environment. Requires
GitHub repository configuration that permits the use of
`vmactions/openbsd-vm@v1` Action.
We have not enabled ccache optimization for OpenBSD tests because we
do not know how to copy updated ccache files from VM back into runner.
CI: Add GitHub Actions workflow for periodic Coverity Scan (#1958)
Implement a weekly scheduled GitHub Actions workflow to run Coverity
Scan (i.e. cov-build). Currently, we run Coverity Scan using Jenkins.
The new job uses the Squid Project pre-made docker image because
installing the tools required to use free Coverity Scan service cannot
be easily automated at the moment.
The job only runs for the official Squid Project repository.
Portability: remove explicit check for libdl (#1963)
OpenBSD does not have libdl, as it has dlopen() in libc.
It is not really needed, and force-requiring the presence of libdl
causes ./configure to fail on OpenBSD:
checking for dlopen in -ldl... no
configure: error: Required library 'dl' not found
stat5minClientRequests() was meant to return the number of recent
client requests. However, the function did not provide implied 5 minute
precision. It returned, roughly speaking, the number of requests during
the last 0-6 minutes. The new, less strict function name and boolean
type avoid this false precision implication.
Squid may crash when the SquidConfig global is auto-destructed after
main() ends. Since SquidConfig global is used by cleanup code, we should
keep its fields alive, essentially emulating "No New Globals" policy
effects. This surgical fix will be followed up with more changes to
address general OpenSSL cleanup problems exposed by this bug.
This bug fix facilitates backporting by using FuturePeerContext shim.
Alex Rousskov [Wed, 27 Nov 2024 02:32:02 +0000 (02:32 +0000)]
Treat responses to collapsed requests as fresh (#1927)
Squid transactions involving collapsed requests receive their responses
as Store cache hits. Cache hits undergo mandatory freshness validation
checks and, if those checks detect a stale response, may trigger
revalidation (e.g., an If-Modified-Since request to the origin server).
This logic results in a still-being-delivered-to-Squid response
triggering its own revalidation if that response is deemed stale on
arrival by collapsed request (e.g., has a Cache-Control: max-age=0
header field).
HTTP RFC 9111 Section 4.7 briefly mentions collapsed requests but is
ambiguous with regard to validation of responses to collapsed requests.
IETF HTTP Working Group chair believes that specs are unclear, and that
these responses should not be treated as cache hits (in primary cases):
https://lists.w3.org/Archives/Public/ietf-http-wg/2024JanMar/0095.html
This change follows the above advice and prevents arguably excessive
freshness checks for responses to collapsed requests. This change is
collapsed-forwarding specific: It does not prevent freshness checks for
responses that were, at the time of a hit request, either fully cached
or still receiving response body bytes.
After this change, clientReplyContext can no longer collapse twice, once
after initial Store lookup and then again during refresh, because the
first collapse now precludes refresh.
Simplified quick_abort_pct code and improved its docs (#1921)
Instead of ignoring quick_abort_pct settings that would, together with
other conditions, abort a pending download of a 99-byte or smaller
response, Squid now honors quick_abort_pct for all response sizes. Most
Squids are not going to be affected by this change because default
quick_abort_min settings (16KB) prevent aborts of 99-byte responses even
before quick_abort_pct is checked.
Due to conversion from integer to floating point math, this change may
affect responses larger than 99 bytes as well, but these effects ought
to be limited to cases where the decision is based on a tiny difference
(e.g., receiving 1% more bytes would have triggered full download). In
most such cases, the decision could probably go either way due to
response header size fluctuations anyway.
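A simplified model of the decision, assuming the usual quick_abort
ordering (min, then max, then pct) and the new floating-point
percentage math; names and units are illustrative:
```
// a simplified model, not the literal Squid code
#include <cstdint>

bool finishDownloadAfterClientAbort(const int64_t received, const int64_t expected,
                                    const int64_t abortMinKB, const int64_t abortMaxKB,
                                    const double abortPct)
{
    const auto remaining = expected - received;
    if (remaining <= abortMinKB * 1024)
        return true;  // nearly done: finish the download
    if (remaining > abortMaxKB * 1024)
        return false; // too much left: abort
    // the percentage check now applies to all sizes, including <100 bytes
    return expected > 0 && (100.0 * received / expected) >= abortPct;
}
```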
Also updated quick_abort_pct documentation, primarily to clarify a
misleading statement: Squid did not and does not treat 16KB or smaller
responses specially in this context. The original statement was probably
based on quick_abort_min _default_ setting of 16KB, but statement
phrasing and placement hid that connection.
Tony Walker [Sat, 16 Nov 2024 22:10:39 +0000 (22:10 +0000)]
Bug 5363: Handle IP-based X.509 SANs better (#1793)
Most X.509 Subject Alternate Name extensions encountered by Squid are
based on DNS domain names. However, real-world servers (including
publicly available servers that use vanity IP addresses) also use
IP-based SANs. Squid mishandled IP-based SANs in several ways:
* When generating certificates for servers targeted by their IP
addresses, addAltNameWithSubjectCn() used that target IP as a
DNS-based SAN, resulting in a frankenstein DNS:[ip] SAN value that
clients ignored when validating a Squid-generated certificate.
* When validating a received certificate, Squid was ignoring IP-based
SANs. When Subject CN did not match the requested IP target, Squid
only looked at DNS-based SANs, incorrectly failing validation.
* When checking certificate-related ACLs like ssl::server_name,
matchX509CommonNames() ignored IP-based SANs, not matching
certificates containing ACL-listed IP addresses.
Squid now recognizes and generates IP-based SANs.
Squid now attempts to match IP-based SANs with ACL-listed IP addresses,
but the success of that attempt depends on whether ACL IP parameters are
formatted the same way inet_ntop(3) formats those IP addresses: Matching
is still done using c-string/domain-based ::matchDomainName() (for
ssl::server_name) and string-based regexes (for ssl::server_name_regex).
Similar problems affect dstdomain and dstdomain_regex ACLs. A dedicated
fix is needed to stop treating IPs as domain names in those contexts.
This change introduces partial support for preserving IP-vs-domain
distinction in parsed/internal Squid state rather than converting both
to a string and then assuming that string is a DNS domain name.
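A hedged illustration of the SAN spelling distinction using OpenSSL's
conf-style extension API (the generated extension must say IP:, not
DNS:, for IP-targeted requests); this is not the mimicking code itself:
```
// an illustration only, not Squid's certificate-mimicking code
#include <string>
#include <openssl/x509v3.h>

X509_EXTENSION *makeSanForTarget(const char *target, const bool targetIsIp)
{
    // before the fix, an IP target became the frankenstein "DNS:192.0.2.1"
    std::string value = std::string(targetIsIp ? "IP:" : "DNS:") + target;
    return X509V3_EXT_conf_nid(nullptr, nullptr, NID_subject_alt_name, value.data());
}
```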
Alex Rousskov [Wed, 13 Nov 2024 18:16:08 +0000 (18:16 +0000)]
Annotate PoolMalloc memory in valgrind builds (#1946)
MemPoolMalloc code (i.e. memory_pools code used by default) was missing
VALGRIND_MAKE_MEM_*() calls. Similar calls do exist in MemPoolChunked
code (i.e. code enabled by setting MEMPOOLS environment variable to 1).
Even with these markings, "memory_pools on" configuration is still not
quite compatible with Valgrind leak reporting suppressions: In some
cases, Valgrind may still incorrectly suppress leak reporting (or report
leaks that should have been suppressed) because Valgrind associates leak
suppressions with memory _allocators_ while buggy code may leak memory
allocated by others. The long-term solution (if it exists) requires
upgrading these markings to VALGRIND_MEMPOOL_*() API targeting memory
pools, but that requires a serious effort, especially when dealing with
MemPoolChunked complexities. The added markings help detect non-leak
violations and improve PoolMalloc/MemPoolChunked code symmetry.
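A rough sketch of the kind of markings added, using the standard
memcheck client-request macros; the real MemPoolMalloc code keeps
freed objects for reuse and differs in detail:
```
// a rough sketch of allocator-level Valgrind markings
#include <cstdlib>
#include <valgrind/memcheck.h>

void *poolAlloc(const size_t bytes)
{
    void *p = malloc(bytes);
    // freshly handed-out pool memory: addressable but undefined
    VALGRIND_MAKE_MEM_UNDEFINED(p, bytes);
    return p;
}

void poolFree(void *p, const size_t bytes)
{
    // memory returned to the pool: further access should be flagged
    VALGRIND_MAKE_MEM_NOACCESS(p, bytes);
    free(p); // a real pool would keep p on its idle list instead
}
```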