Amos Jeffries [Sat, 11 Oct 2025 03:33:02 +0000 (16:33 +1300)]
Bug 3390: Proxy auth data visible to scripts (#2249)
Original changes to redact credentials from error page %R code
expansion output was incomplete. It missed the parse failure
case where ErrorState::request_hdrs raw buffer contained
sensitive information.
Also missed was the %W case where full request message headers
were generated in a mailto link. This case is especially
problematic as it may be delivered over insecure SMTP even if
the error was secured with HTTPS.
After this change:
* The HttpRequest message packing code for error pages is de-duplicated
and elides authentication headers for both %R and %W code outputs.
* The %R code output includes the CRLF request message terminator.
* The email_err_data directive causing advanced details to be added to
%W mailto links is disabled by default.
Also redact credentials from generated TRACE responses.
---------
Co-authored-by: Alex Rousskov <rousskov@measurement-factory.com>
Avoid putenv() problems by switching to setenv() (#2262)
Developers work around putenv() design flaws by writing more complex
code that still leaks memory (e.g., commit b1b2793 and commit 96b9d96).
We should follow putenv(3) manual page advice and use setenv() instead:
The setenv() function is strongly preferred to putenv().
Kerberos builds have been using setenv() since 2009 commit 9ca29d2.
Twenty years ago, setenv() code failed to build on Solaris (see 2005
commit cff61cb), but modern Solaris does have setenv(3).
Since setenv(3) is a standard C library _extension_ unavailable on
Windows, we provide a setenv(3) replacement for MS Windows builds. Our
replacement should work correctly in known current use cases, but acts
differently if given a variable with an empty value. This replacement
does not make things worse because the old macro trick had the same
flaw. We do not know of an easy way to support empty variable values on
Windows the way setenv(3) does.
Also fixed negotiate_kerberos_auth undefined behavior caused by freeing
memory returned by getenv() when the helper was run without `-k keytab`.
Lior Brown [Fri, 3 Oct 2025 08:18:34 +0000 (08:18 +0000)]
Bug 5510: False Cache Digests misses (#2254)
Since 2002 commit add2192d, code receiving fresh Cache Digests from
cache_peers corrupted peer digest bitmask, leading to misses for objects
that were supposed to be present in the digest[^1]. Memory overreads
were probably happening as well. The exact corruption conditions/effects
probably changed when 2023 commit 122a6e3c removed HTTP response headers
from storeClientCopy() API, but the underlying memmove() size
calculation bug predates that 2023 change.
[^1]: Bitmask corruption also ought to trigger some hits for objects
that were not present in peer's cache, although such hits were not
observed in triage, and some excessive hits are endemic to our Bloom
filters.
Flip the default of configure's `--enable-arch-native` option from
enabled to disabled.
GCC's and clang's `-march=native` argument causes issues in some
environments, most notably containers where the compilers' heuristics
about CPU feature set may fail. These issues manifest as
hard-to-reproduce SIGILL errors in binaries such as `squid` or unit test
programs. The new default is safer. Performance-minded administrators
still have a convenient option to optimize via `--enable-arch-native`.
Support no-digest X509 certificate keys like ML-DSA/EdDSA (#2165)
Recent OpenSSL releases (e.g., OpenSSL v3.5) support several private key
types[^1] for which supplying a message digest algorithm is prohibited
when signing a certificate. Prior to this enhancement, Squid was
rejecting https_port and http_port configurations using such key types
(with the above FATAL message) because OpenSSL X509_sign() call made
with a prohibited (for the given key type) non-nil digest algorithm was
failing.
Technically, only listening ports with generate-host-certificates (and
ssl-bump) parameters need to generate X509 certificates and, hence, call
X509_sign(). However, current Squid code generates so called "untrusted"
certificates even for ports that do not support dynamic host certificate
generation or SslBump (XXX). Thus, this enhancement is applicable to
both regular and SslBump configurations.
[^1]: Known no-message-digest key types are ML-DSA-44, ML-DSA-65,
ML-DSA-87, ED25519, and ED448, but others might exist or will be added.
This change was tested against known types, but should support others.
ML-DSA key types are used in post-quantum cryptography.
SNMP: Match Var allocation/deallocation methods (#2183)
Pdu::setVars() and Pdu::unpack() allocate variables with `new Var(...)`,
but clearVars() freed them using snmp_var_free(). That skipped the `Var`
destructor and mismatched the allocator.
We hope that all Pdu::variables are allocated via Pdu class methods
despite the presence of snmp_var_new() and snmp_var_clone() calls in
low-level snmplib code.
Fix parsing of malformed quoted squid.conf strings (#2239)
Iteration of a quoted token that ends with a backslash (escape with no
next char) kept going past the end of the token. That bug as well as
hypothetical 2KB-byte tokens (exceeding CONFIG_LINE_LIMIT) could also
result in a 1-byte NUL overrun.
ext_ldap_group_acl: Require base dn and user search filter (#2167)
Make ext_ldap_group_acl require -B (user base DN) and -F (user search
filter) so the helper finds the user before checking group membership.
Fix LDAP referrals handling and make %v always expand to the requested
group name. Improve logging and errors, and include release notes for
the breaking flag requirements.
HTCP: Check for too-small packed and too-large unpacked fields (#2164)
Harden HTCP parsing by checking HTCP fields
- Check packed field lengths and buffer space before reads.
- Guard CLR "reason" when sz < 2; log invalid messages.
- Support old minor==0 layout with safe prefix copy.
- Use early returns and unique_ptr for safer flows.
Fix off-by-one in helper args count assertion (#2212)
The `nargs` value should now be pointing past both the
`HELPER_MAX_ARGS` and the additional terminator. i.e. outside
the valid array space. This is okay because it is an absolute
counter (1-based) not an offset (0-based) despite how it is
used to fill the array.
Previously, nmasldap_get_simple_pwd() and nmasldap_get_password()
could overrun or return non-terminated strings at length
boundaries. This change adds strict bounds checks, copies at most
len - 1, and ensures explicit NUL termination, aligning both
helpers buffer/length semantics without altering call-site
behavior.
As of 2025 Microsoft has officially removed NTLM support from
Windows machines entirely. The SSPI mechanisms will no
longer provide NTLM authentication for this helper to use.
Fix UDP log module opening and closing code (#2214)
logfile_mod_udp_open() mistreated successful comm_connect_addr() result
as an "Unable to connect" failure (and vice versa), rendering UDP-based
logging unusable. Broken since at least 2010 commit d938215.
Also fixed logfile_mod_udp_close() closing FD 0 after "Invalid UDP
logging address" ERRORs during logfile_mod_udp_open().
ext_file_userip_acl: harden lookups and memory handling (#2198)
Stop mutating getgrnam(3) buffer; iterate gr_mem safely
Zero last node and NULL-check in dict_lookup() to prevent OOB read
Add free_dict() and free dictionary before exit
negotiate_kerberos_auth: Properly align NDR data (#2186)
Resolves sporadic Negotiate/Kerberos auth failures that manifested
as proxy 407 loops or helper errors when decoding PAC data, depending
on ticket layout.
Previously, the parser advanced bpos by the remainder:
(bpos += bpos %n)
instead of padding to the next multiple of n.
For example, n = 4:
bpos=5 (r=1): current: 6 (wrong), correct: 8
bpos=6 (r=2): current: 8 (accidentally right)
bpos=7 (r=3): current: 10 (wrong), correct: 8
This patch NULL-terminates the NMAS Universal Password values
array (values[1] = nullptr) to match ldap_get_values() semantics
and avoid potential out-of-bounds iteration.
ntlm_sspi_auth: memcmp not memcpy, send newline, no uninit mem (#2218)
Previously, memcpy was incorrectly used instead of memcmp. In addition
to this, uninitalized memory could be used, and responses to Squid were
missing a newline.
SNMP: Improve parsing of malformed ASN.1 object identifiers (#2185)
ASN.1 object identifiers are length-delimited, not null-terminated. If
the input encoding omits a terminating byte (MSB clear), then the parser
would walk past the buffer.
Also simplified expressions related to sub-identifier parsing.
Alex Rousskov [Mon, 8 Sep 2025 08:05:00 +0000 (08:05 +0000)]
SNMP: Do not send responses that we fail to encode (#2151)
When snmp_build() fails, its output buffer must not be used:
Syscall param sendto() points to uninitialised byte(s)
by xsendto() (socket.h:118)
by comm_udp_sendto() (comm.cc:923)
by snmpConstructReponse() (snmp_core.cc:449)
Also inform the admin about SNMP encoding failures.
On errors, the helper was reporting DsRoleGetPrimaryDomainInformation()
error info using the wrong/bogus error code and probably freeing an
uninitialized pointer.
FTP: Avoid null dereferences when handling ftp_port traffic (#2172)
`strchr` may return null if a deliminator is not found. Likewise,
if an `Http::HdrType::FTP_REASON` string is not found, nullptr would
be used in the %s formatter, leading to UB.
This drastic reduction in testHttpRange dependencies is made possible by
splitting HeaderTools code into several parts:
* Header-mangling code that pulled in many heavy dependencies but was
unused by testHttpRange code. This high-level code does not belong to
libhttp, so it was left in src/ (see HeaderMangling.cc).
* Simple header manipulation functions without heavy dependencies; some
used by testHttpRange code. This low-level code should eventually be
moved to libhttp, but it was left in HttpHeaderTools.cc for now
because libhttp itself is currently bloated with heavy dependencies --
linking with libhttp requires linking with or stubbing a lot of
code unrelated to testHttpRange.
* Definition of httpHeaderParseQuotedString() and a few other functions
declared in HttpHeader.h were moved to HttpHeader.cc to improve
declaration/definition/stubs affinity and avoid linking errors:
testHttpRange now depends on stub_HttpHeader.cc that has stubs for
these functions.
* httpHeaderMaskInit() and getStringPrefix() are only used by
HttpHeader.cc, so they were moved to HttpHeader.cc and made static.
Moved code was unchanged. A few comments were relocated to better match
Squid coding style. A few STUBs were added/adjusted as needed.
TODO: Check the remaining two tests that still include header mangling
code: testHttpRequest and testCacheManager.
Amos Jeffries [Sat, 30 Aug 2025 04:00:29 +0000 (04:00 +0000)]
Add origin-form and absolute-path to URI API (#2141)
Convert callers of Uri::path() as appropriate to their needs.
These methods provide %-encoded URI components for use in
display or normalized processing. The path() method is now only
guaranteed to provide the URL-path component in un-escaped form
(though query has not yet been moved).
Amos Jeffries [Tue, 26 Aug 2025 02:36:53 +0000 (02:36 +0000)]
Remove DEFAULT_ICP_PORT build variable (#2146)
Since 2008 commit df2eec10, this variable has only been used for
icp_port documentation and only in one location. Just stating the IANA
registered port (i.e. 3130) there is better.
Ben Kallus [Mon, 11 Aug 2025 13:02:17 +0000 (13:02 +0000)]
Fix type mismatch in new/delete of addrinfo::ai_addr (#2136)
new/delete type mismatches are UB. Fix an instance of this
problem that occurs when sockaddr_in6 is allocated, but
sockaddr is deallocated, by always allocating/deallocating
sockaddr_storage.
AddressSanitizer: new-delete-type-mismatch:
object passed to delete has wrong type:
size of the allocated type: 28 bytes;
size of the deallocated type: 16 bytes.
#0 0xaaaad1a8db54 in operator delete(void*, unsigned long)
#1 0xaaaad287a668 in Ip::Address::FreeAddr(addrinfo*&)
src/ip/Address.cc:710:22
Norman Ziert [Tue, 5 Aug 2025 08:31:40 +0000 (08:31 +0000)]
Bug 5407: Support at least 1000 groups per Kerberos user (#2047)
Increase MAX_PAC_GROUP_SIZE to a more reasonable value,
so negotiate_kerberos_auth can report more than approximately
200 groups an authenticated user is member of back to Squid.
Amos Jeffries [Mon, 4 Aug 2025 12:20:02 +0000 (12:20 +0000)]
Make spell-check.sh happy and remove stale spelling exceptions (#2122)
This change fixes a few misspellings identified by Codespell v2.4.1 and
removes all no-longer-used spelling exceptions that we could find. Most
of the fixed misspellings were present but ignored in 2020 commit 2f8abb64 because then-current Codespell v1.16.0 did not flag them.
N.B. GitHub Actions ubuntu-24.04 runner still uses Codespell v1.16.0.
Enhance and use POSIX socket compatibility layer (#2046)
Implement portable wrappers around most socket-related functions, named
x[function], and use them in all callsites.
winsock's socket-related functions are very similar but not
identical to the POSIX standard. Make the mswindows
compatibility layer available to MinGW, and modernize it.
Error highlighting the issue:
```
TcpAcceptor.cc: In member function
'bool Comm::TcpAcceptor::acceptInto(Comm::ConnectionPointer&)':
TcpAcceptor.cc:352:17: error:
comparison of unsigned expression in '< 0'
is always false [-Werror=type-limits]
352 | if (rawSock < 0) {
```
Checking PID to ignore stale responses became unnecessary after 2021
commit 4c21861 added Mine() calls that guarantee message freshness.
Also replaced the matching kidId check with an assertion because no IPC
messages, not even stale ones, may be sent to the kid with the wrong kid
identifier. This assertion cannot be easily generalized because most IPC
messages do not contain the kid identifier of the intended recipient.
CI: Support customizing CONTRIBUTORS in new contributor PRs (#2128)
When our source-maintenance.sh recognizes a magic phrase in a commit
message, it treats that commit as authoritative for CONTRIBUTORS file.
That feature allows us to customize CONTRIBUTORS by adding a magic
phrase to the commit message, preventing unwanted automated
collectAuthors() actions for earlier commits. Such commits work well
enough when CONTRIBUTORS is customized by an already known Squid
developer in a dedicated PR.
The same feature cannot work for PRs created by new developers because
when quick.yaml tests a PR, the first commit visible to collectAuthors()
is not a PR branch commit that the author could, with some considerable
trouble, customize, but a so called GitHub PR "merge commit" that GitHub
automatically creates with PR creator as the author. That merge commit
message does not contain PR description and never has our magic phrase.
That merge commit author details are wrong when PR creators want to use
information that differs from their public GitHub account info.
With this change, collectAuthors() consults PR description (when testing
PRs), allowing CONTRIBUTORS customization without any extra efforts:
Adding the magic phrase to PR description has to be done anyway.
Simplify DelayId composite position management (#2126)
Moved DelayId::compositeId initialization to the constructor and removed
its setter and getters as unused. Simplified DelayId methods by using
"pool and compositeId are either both set or both unset" invariant that
is now upheld by the two corresponding class constructors.
Possible unintended interpolation of @users in string
Bareword "true" not allowed
Bareword "false" not allowed
Execution of ... aborted due to compilation errors
Recent commit daa76f41 broke scripts/update-contributors.pl syntax (see
error messages quoted above) and its lower-case comparison logic,
effectively disabling CONTRIBUTORS checks.
Also do not hide update-contributors.pl execution failures. Buggy
failure detection contributed to the above problems ignored by CI tests.
DelayId::operator bool() should account for cache_peer no-delay (#2121)
Without this check, a DelayId::operator bool() caller could incorrectly
limit reading from a cache_peer that has been configured to ignore delay
pools. Luckily, all existing code paths that use this operator checked
the markedAsNoDelay flag, avoiding this problem:
* DelayId::bytesWanted() and DelayId::bytesIn() checked markedAsNoDelay.
* While MemObject::delayRead() itself did not check markedAsNoDelay, the
method is not called when markedAsNoDelay is true. The analysis of the
corresponding code paths is complex, especially for HttpStateData! We
also call expensive mostBytesAllowed() several times on these paths.
Refactoring this code is outside this API safety improvement scope.
This change helps avoid accidental/unwanted read delays during future
code changes.
Bug 5499: Remove support for src_as and dst_as ACLs (#2113)
Squid ACL initialization code calls asnCacheStart() and tries to connect
to an ASN server. If the configuration requires a cache_peer, that
connection fails because none of cache_peers are available an that time:
WARNING: AS ... whois request failed
Since ASN-based ACLs are essentially unused and properly fixing this
bug requires significant effort work, we drop Autonomous System Numbers
instead. Any rare use cases may implement an external ACL helper with
similar functionality and more features.
Also removed no longer necessary radix.{c,h} code, "asndb" cache manager
report, and "asn" initiator value in transaction_initiator ACLs.
Reduce UDS/segment name clashes across same-service instances (#2023)
Add a PID file name hash to the names of the shared memory segments and
Unix Domain Sockets. Since all instances running on the same host are
supposed to have unique PID files, this addition significantly reduces
the probability of name clashes when running multiple Squid instances
with the same service name (i.e. the same `squid -n` parameter value
that defaults to "squid").
A clash may still happen if two different PID file names have the same
hash or if multiple instances disable PID file management with
`pid_filename none`. Clashes may also happen in environments where Squid
does not even use service name for naming shared memory segments.
Examples of UDS and shared memory segment names (while using default
service name):
Alex Rousskov [Mon, 30 Jun 2025 20:32:16 +0000 (20:32 +0000)]
Simplify appending SBuf to String (#2108)
There is still a lot of code that uses legacy String class, and an
increasing amount of code that has to append modern SBufs to Strings.
SBuf handles NUL bytes better than String. With respect to handling NUL
bytes, the new append() method is as (un)safe as the existing explicit
size-based one (that callers tend to use prior to these changes anyway).
aafbsd [Thu, 26 Jun 2025 18:57:19 +0000 (18:57 +0000)]
Bug 5497: Fix detection of duped IPs returned by getaddrinfo() (#2100)
WARNING: Ignoring <IP X> because it is already covered by <IP X>
Affects `src`, `dst`, and `localip` ACLs, especially those that use
domain names with multiple DNS A or AAAA records.
IP addresses returned by getaddrinfo(3) may not be sorted (e.g., when a
host has multiple DNS A RRs on FreeBSD). Instead of comparing the
current address with just the previous one, we now check all previously
added addresses (while processing a single getaddrinfo() call output).
This surgical fix minimizes changes without improving surrounding code.
Amos Jeffries [Sun, 22 Jun 2025 00:48:05 +0000 (00:48 +0000)]
Fix missing CONTRIBUTOR name (#2091)
Despite CONTRIBUTORS file being UTF-8 Ture's name
was dropped by our conversion script which does not
handle non-ASCII characters nicely. Re-add it manually.
Do not duplicate received Surrogate-Capability in sent requests (#2087)
When computing Surrogate-Capability header while forwarding an
accelerated request, Squid duplicated old (i.e. received) header entries
(if any). For example, this outgoing request shows an extra hop1 entry:
GET / HTTP/1.1
...
Surrogate-Capability: hop1="Surrogate/1.0"
Surrogate-Capability: hop1="Surrogate/1.0", hop2="Surrogate/1.0"
Amos Jeffries [Mon, 9 Jun 2025 04:49:04 +0000 (04:49 +0000)]
CI: Fix gperf 3.2 output filter (#2081)
gperf 3.2 now provides properly compiler and C++ version scoped
fallthrough attributes. Our filter to convert the gperf 3.1 and
older output for C++17 attribute requirements is now broken and
produces compiler errors due to listing '[[fallback]];' on two
consecutive lines.
Fix OpenSSL build with GCC v15.1.1 [-Wformat-truncation=] (#2077)
On arm64 Fedora 42:
src/ssl/crtd_message.cc:132:39: error: '%zd' directive output may be
truncated writing between 1 and 19 bytes into a region
of size 10 [-Werror=format-truncation=]
snprintf(buffer, sizeof(buffer), "%zd", body.length());
Carl Vuosalo [Wed, 28 May 2025 19:17:38 +0000 (19:17 +0000)]
Fix SNMP cacheNumObjCount -- number of cached objects (#2053)
SNMP counter cacheNumObjCount used StoreEntry::inUseCount() stats. For
Squid instances using a rock cache_dirs or a shared memory cache, the
number of StoreEntry objects in use is usually very different from the
number of cached objects because these caches do not use StoreEntry
objects as a part of their index. For all instances, inUseCount() also
includes ongoing transactions and internal tasks that are not related to
cached objects at all.
We now use the sum of the counters already reported on "on-disk objects"
and "Hot Object Cache Items" lines in "Internal Data Structures" section
of `mgr:info` cache manager report. Due to floating-point arithmetic,
these stats are approximate, but it is best to keep SNMP and cache
manager reports consistent.
This change does not fix SNMP Gauge32 overflow bug: Caches with 2^32 or
more objects continue to report wrong/smaller cacheNumObjCount values.
### On MemStore::getStats() and StoreInfoStats changes
To include the number of memory-cached objects while supporting SMP
configurations with shared memory caches, we had to change how cache
manager code aggregates StoreInfoStats::mem data collected from SMP
worker processes. Before these changes, `StoreInfoStats::operator +=()`
used a mem.shared data member to trigger special aggregation code hack,
but
* SNMP-specific code cannot benefit from that StoreInfoStats aggregation
because SNMP code exchanges simple counters rather than StoreInfoStats
objects. `StoreInfoStats::operator +=()` is never called by SNMP code.
Instead, SNMP uses Snmp::Pdu::aggregate() and friends.
* We could not accommodate SNMP by simply adding special aggregation
hacks directly to MemStore::getStats() because that would break
critical "all workers report about the same stats" expectations of the
special hack in `StoreInfoStats::operator +=()`.
To make both SNMP and cache manager use cases work, we removed the hack
from StoreInfoStats::operator +=() and hacked MemStore::getStats()
instead, making the first worker responsible for shared memory cache
stats reporting (unlike SMP rock diskers, there is no single kid process
dedicated to managing a shared memory cache). StoreInfoStats operator
now uses natural aggregation logic without hacks.
TODO: After these changes, StoreInfoStats::mem.shared becomes
essentially unused because it was only used to enable special
aggregation hack in StoreInfoStats that no longer exists. Remove?
Bug 5352: Do not get stuck in RESPMOD after pausing peer read(2) (#2065)
The transaction gets stuck if Squid, while sending virgin body bytes to
an ICAP RESPMOD service, temporary stops reading additional virgin body
bytes from cache_peer or origin server. Squid pauses reading (with
readSizeWanted becoming zero) if reading more virgin bytes is temporary
prohibited by delay pools and/or read_ahead_gap limits:
readReply: avoid delayRead() to give adaptation a chance to drain...
HttpStateData::readReply() starts waiting for ModXact to drain the
BodyPipe buffer, but that draining may not happen, either because
ModXact::virginConsume() is not called at all[^1] or because it is
"postponing consumption" when BodyPipe still has some unused space[^2].
With HttpStateData not reading more virgin bytes, Squid may not write
more virgin body bytes to the ICAP service, and the ICAP service may not
start or continue responding to the RESPMOD request. Without that ICAP
activity, ModXact does not consume, the virgin BodyPipe buffer is not
drained, HttpStateData is not reading, and no progress is possible.
HttpStateData::readReply() should start waiting for adaptation to drain
BodyPipe only when the buffer becomes completely full (instead of when
it is not empty). This change may increase virgin response body bytes
accumulation but not the buffer capacity because existing buffer
space-increasing logic in maybeMakeSpaceAvailable() remains intact.
To prevent stalling, both BodyPipe ends (i.e. HttpStateData and
Icap::ModXact) must use matching "progress is possible" conditions, but
* HttpStateData used hasContent()
* Icap::ModXact used spaceSize()
* Ftp::Client used potentialSpaceSize()
Now, all three use matching potentialSpaceSize()-based conditions.
Squid eCAP code is unaffected by this bug, because it does not postpone
BodyPipe consumption. eCAP API does not expose virgin body buffer
capacity, so an eCAP adapter that postpones consumption risks filling
the virgin body buffer and stalling. This is an eCAP API limitation.
[^1]: Zero readSizeWanted is reachable without delay pools, but only if
Squid receives an adapted response (that makes readAheadPolicyCanRead()
false by filling StoreEntry). Ideally, receiving an adapted response
should result in a virginConsume() calls (that would trigger BodyPipe
draining), but currently it may not. Reliably starting virgin data
consumption sooner is not trivial and deserves a dedicated change.
[^2]: ModXact postpones consumption to preserve virgin bytes for ICAP
retries and similar purposes. ModXact believes it is safe to postpone
because there is still space left in the buffer for HttpStateData to
continue to make progress. ModXact would normally start or resume
draining the buffer when sending more virgin bytes to the ICAP service.
Except for a trivial spaceSize() call and a never-executed (for the only
caller) `space` parameter adjustment, replyBodySpace() duplicates
Client::calcBufferSpaceToReserve(). This duplication complicates
refactoring aimed at addressing other known buffer management problems.