Do not download remote certificate for issuer X if the received server
certificate (signed by X) can be validated using a locally available CA
certificate. According to our tests, a typical browser does not follow
'CA Issuers' references to download 'missing' certificates when the
browser can validate the origin server certificate (or its chain) using
a local CA certificate. Avoiding unnecessary validations and downloads
not only saves time, but can prevent validation failures as well!
Alex Rousskov [Fri, 17 Nov 2017 16:30:09 +0000 (09:30 -0700)]
Relay peer CONNECT error status line and headers to users (#80)
Automated agents and human users (or their support staff!) often benefit
from knowing what went wrong. Dropping such details is a bad default.
For example, automation may rely on receiving the original status code.
Our CVE-2015-5400 fix (74f35ca) was too aggressive -- it hid all peer
errors behind a generic 502 (Bad Gateway) response. Pass-through peer
authentication errors were later (971003b) exposed again, but our CVE
fix intent was _not_ to hide _any_ peer errors in the first place! The
intent was to close the connection after delivering the error response.
Hiding peer errors was an (unfortunate) implementation choice.
It could be argued that some peer errors should not be relayed, but
since Squid successfully relayed all peer errors prior to 74f35ca and
continues to relay all non-CONNECT peer errors today, discriminating
peer errors is a separate (and possibly unnecessary) feature.
Ideally, Squid should mangle and relay the whole error message (instead
of sending small original headers). Squid should also relay 1xx control
messages while waiting for the final response. Unfortunately, doing so
properly, without reopening CVE-2015-5400 or duplicating a lot of
complex code, is a huge project. This small change fixes the most acute
manifestation of the "hiding errors from users" problem. The rest is a
long-term TODO.
Bug 2821: Ignore Content-Range in non-206 responses (#77)
Squid used to honor the Content-Range header in HTTP 200 OK (and
possibly other non-206) responses, truncating (and possibly enlarging)
some response bodies. RFC 7233 declares Content-Range meaningless for
standard HTTP status codes other than 206 and 416. Squid now relays a
meaningless Content-Range header as-is, without using its value.
Why not just strip a meaningless Content-Range header? Squid does not
really know whether it is the status code or the header that is "wrong".
Let the client figure it out while the server remains responsible.
Also ignore Content-Range in 416 (Range Not Satisfiable) responses
because that header does not apply to the response body.
Also fixed body corruption of (unlikely) multipart 206 responses to
single-part Range requests. Valid multipart responses carry no
Content-Range (in the primary header), which confused Squid.
Amos Jeffries [Thu, 2 Nov 2017 08:14:54 +0000 (21:14 +1300)]
Move TLS/SSL http_port config values to libsecurity (#51)
This is most of the minor shuffling prerequisite to the proposal
allowing generate-host-certificates to set a CA filename. These values
must live in libsecurity to prevent circular dependencies between
libsecurity, libssl, and libanyp.
Also contains some improvements to how configuration errors are
displayed for the affected settings, and fixes some bugs where the
configured values were handled incorrectly.
Bug 4718: Support filling raw buffer space of shared SBufs (#64)
SBuf::forceSize() requires exclusive SBuf ownership but its precursor
SBuf::rawSpace() method does not guarantee exclusivity. The pair of
calls may result in SBuf::forceSize() throwing for no good reason.
A new pair of SBuf raw-buffer appending calls reduces the number of
these false negatives.
This change may alleviate bug 4718 symptoms but does not address its
core problem (which is still unconfirmed).
This bug was probably caused by Bug 2833 feature/fix (1a210de).
The primary fix here is limited to clientReplyContext::processExpired():
Collapsed forwarding code must ensure StoreEntry::mem_obj existence. It
was missing for cache hits purged from (or never admitted into) the
memory cache. Most storeClientListAdd() callers either have similar code
or call storeCreateEntry() which also creates StoreEntry::mem_obj.
Also avoided clobbering known StoreEntry URIs/method in some cases. The
known effect of this change is fixed store.log URI and method fields
when a hit transaction did not match the stored entry exactly (e.g., a
HEAD hit for a GET cached entry), but this improvement may have even
more important consequences: The original method is used by possibly
still-running entry filling code (e.g., determining the end of the
incoming response, validating the entry length, finding vary markers,
etc.). Changing the method affects those actions, essentially corrupting
the entry state. The same argument may apply to store ID and log URI.
We even tried to make URIs/method constant, but that is impractical w/o
addressing an XXX in MemStore::get(), which is outside this issue scope.
To facilitate that future fix, the code now distinguishes these cases:
* createMemObject(void): Buggy callers that create a new memory object
but do not know what URIs/method the hosting StoreEntry was based on.
Once these callers are fixed, we can make the URIs/method constant.
* createMemObject(trio): Callers that create a new memory object with
URIs/method that match the hosting StoreEntry.
* ensureMemObject(trio): Callers that are not sure whether StoreEntry
has a memory object but have URIs/method to create one if needed.
Fix SSL certificate cache refresh and collision handling (#40)
SslBump was ignoring some origin server certificate changes or differences,
incorrectly using the previously cached fake certificate (mimicking
now-stale properties or properties of a slightly different certificate).
Also, Squid was not detecting key collisions inside certificate caches.
On-disk certificate cache fixes:
Use the original certificate signature instead of the certificate
subject as part of the key. Using signatures reduces certificate key
collisions to deliberate attacks and woefully misconfigured origins,
and makes any mishandled attacks a lot less dangerous because the
attacking origin server certificate cannot be trusted by a properly
configured Squid and cannot be used for encryption by an attacker.
We have considered using certificate digests instead of signatures.
Digests would further reduce the attack surface to copies of public
certificates (as if the origin server was woefully misconfigured).
However, unlike the origin-supplied signatures, digests require
(expensive) computation in Squid, and implemented collision handling
should make any signature-based attacks unappealing. Signatures won
on performance grounds.
Other key components remain the same: NotValidAfter, NotValidBefore,
forced common name, non-default signing algorithm, and signing hash.
Store the original server certificate in the cache (together with
the generated certificate) for reliable key collision detection.
Upon detecting key collisions, ignore and replace the existing cache
entry with a freshly computed one. This change is required to
prevent an attacker from tricking Squid into hitting a cached
impersonating certificate when talking to a legitimate origin.
In-memory SSL context cache fixes:
Use the original server certificate (in ASN.1 form) as a part of the
cache key, to completely eliminate cache key collisions.
Other related improvements:
Make the LruMap keys template parameters.
Polish Ssl::CertificateDb class member names to match Squid coding
style. Rename some function parameters to better match their meaning.
Replace Ssl::CertificateProperties::dbKey() with:
Ssl::OnDiskCertificateDbKey() in ssl/gadgets.cc for
on-disk key generation by the ssl_crtd helper;
Ssl::InRamCertificateDbKey() in ssl/support.cc for
in-memory binary keys generation by the SSL context memory cache.
Optimization: Added Ssl::BIO_new_SBuf(SBuf*) for OpenSSL to write
directly into SBuf objects.
Alex Rousskov [Tue, 22 Aug 2017 01:09:23 +0000 (19:09 -0600)]
Do not die silently when dying early. (#43)
Report (to stderr) various problems (e.g., unhandled exceptions) that
may occur very early in Squid's lifetime, before stderr-logging is
forced by SquidMain() and way before proper logging is configured by
the first _db_init() call.
To enable such early reporting, we started with a trivial change:
-FILE *debug_log = NULL;
+FILE *debug_log = stderr;
... but realized that debug_log may not be assigned early enough! The
resulting (larger) changes ensure that we can log (to stderr if
necessary) as soon as stderr itself is initialized. They also polish
related logging code, including localization of stderr checks and
elimination of double-closure during log rotation on Windows.
These reporting changes do not bypass or eliminate any failures.
Alex Rousskov [Sun, 6 Aug 2017 00:20:40 +0000 (18:20 -0600)]
Fixed, changed addresses in README. Made README look better on Github.
Why not add README.md? Not enough reasons to warrant info duplication:
Markdown is not particularly helpful for rendering a trivial list of
references, and Github already renders HTTP links appropriately.
Why not move README to README.md? Many tools and console humans still
look for README rather than README.md.
Why not use Markdown in README? Github does not render such markup.
TODO: Consider removing detailed distribution terms at the bottom
because "everybody" knows what GPLv2 basically means, and we already
tell the reader where to find the exact licensing terms.
Garri Djavadyan [Tue, 1 Aug 2017 00:03:18 +0000 (18:03 -0600)]
Bug 4648: Squid ignores object revalidation for HTTPS scheme
Squid skips object revalidation for HTTPS scheme and, hence, does not
honor a reload_into_ims option (among other settings).
TODO: Add an httpLike() method or function to detect all HTTP-like
schemes instead of comparing with AnyP::PROTO_HTTP directly. There are
20+ candidates for similar bugs: git grep '[!=]= AnyP::PROTO_HTTP'.
Bug 1961 extra: Convert the URL::parse method API to take const URI strings
The input buffer is no longer truncated when overly long. All callers
have been checked to ensure they handle the bool false return value in
ways that do not rely on that truncation.
Callers that were making non-const copies of buffers specifically for
the parsing stage are altered not to do so. This allows a few data
copies and allocations to be removed entirely, or deferred out of the
error handling paths.
While checking all the callers of Http::FromUrl, several places were
found to be using the "raw" URL string before parsing and validation
were done. The simplest fix, in src/mime.cc, was already applied to v5
as r15234. A more complicated redesign in src/store_digest.cc is
included here for review. One other place is marked with an
"XXX: polluting ..." note.
Also, added several TODOs to mark code where class URL needs to be used
once the parser is a bit more efficient.
Also, removed a leftover definition of already removed urlParse() function.
Fixed reporting of validation errors for downloaded intermediate certs. (#73)
When Squid or its helper could not validate a downloaded intermediate
certificate (or the root certificate), the Squid error page contained
'[Not available]' instead of the broken certificate details, and logs
contained '-1' instead of the broken certificate's depth.
Security::HandshakeParser::parseServerCertificates builds cert list with nils (#42) (#69)
... if Squid is not compiled with OpenSSL support.
This patch fixes:
* HandshakeParser::ParseCertificate() to return a Security::Pointer
* HandshakeParser::parseServerCertificates() to be a no-op if OpenSSL is
not used
* a compile error when Squid is compiled without OpenSSL but with
GnuTLS enabled
squidadm [Sat, 19 Aug 2017 17:21:30 +0000 (05:21 +1200)]
Prep for 3.5.27 (#48)
* Maintenance: update snapshot script for git (#24)
* Update snapshot script after git migration
- Remove unused BZRROOT environment variable
- Replace tag with branch name
* Update source-maintenance script for git (#26)
* replace bzr calls with git equivalent
* remove obsolete ROOT and PWD variables (git does not support non-recursive file listing)
* add exceptions to ignore more files caught by git than bzr
* Protect Squid Client classes from new requests that compete with
ongoing pinned connection use and
* resume dealing with new requests when those Client classes are done
using the pinned connection.
Replaced primary ConnStateData::pinConnection() calls with a pair of
pinBusyConnection() and notePinnedConnectionBecameIdle() calls,
depending on the pinned connection state ("busy" or "idle").
Removed pinConnection() parameters that were no longer used or could be
computed from the remaining parameters.
Removed ConnStateData::httpsPeeked() code "hiding" the originating
request and connection peer details while entering the first "idle"
state. The old (trunk r11880.1.6) bump-server-first code used a pair of
NULLs because "Intercepted connections do not have requests at the
connection pinning stage", but that limitation is no longer applicable
because Squid now always fakes (when intercepting) or parses (a
CONNECT) request, even during SslBump step1.
The added XXX and TODOs are not directly related to this fix. They
were added to document problems discovered while working on this fix.
In v3.5 code, the same problems manifest as Read.cc
"fd_table[conn->fd].halfClosedReader != NULL" assertions.
Amos Jeffries [Fri, 30 Jun 2017 10:35:56 +0000 (22:35 +1200)]
Bug 1961 partial: move urlParse() and urlParseFinish() into class URL
* Move the urlParseFinish() logic into a class URL method and remove its
dependency on HttpRequest objects.
* Remove unnecessary urnParse() function.
* rename local variables in urlParse() to avoid symbol
clashes with class URL members and methods.
* move HttpRequest method assignment out to the single caller
which actually needed it. Others all passed in the method
which was already set on the HttpRequest object passed.
* removed now needless HttpRequest parameter of urlParse()
* rename urlParse as a class URL method
* make URL::parseFinish() private
* remove unnecessary CONNECT_PORT define
* add RFC documentation for 'CONNECT' URI handling
* fixed two XXX in URL-rewrite handling doing unnecessary
HttpRequest object creation and destruction cycles on
invalid URL-rewrite helper output.
Alex Rousskov [Fri, 30 Jun 2017 06:37:58 +0000 (18:37 +1200)]
Minimize direct comparisons with ACCESS_ALLOWED and ACCESS_DENIED.
No functionality changes expected.
Added allow_t API to avoid direct comparisons with ACCESS_ALLOWED and
ACCESS_DENIED. Developers using direct comparisons eventually mishandle
exceptional ACCESS_DUNNO and ACCESS_AUTH_REQUIRED cases where neither
"allow" nor "deny" rule matched. The new API cannot fully prevent such
bugs, but should either led the developer to the right choice (usually
.allowed()) or alert the reviewer about an unusual choice (i.e.,
denied()).
The vast majority of checks use allowed(), but we could not eliminate
the remaining denied() cases ("miss_access" and "cache" directives) for
backward compatibility reasons -- previously "working" deployments may
suddenly start blocking cache misses and/or stop caching:
http://lists.squid-cache.org/pipermail/squid-dev/2017-May/008576.html
Alex Rousskov [Fri, 30 Jun 2017 06:03:23 +0000 (18:03 +1200)]
Fix mgr query handoff from the original recipient to Coordinator.
This bug has already been fixed once, in trunk r11164.1.61, but that fix
was accidentally undone shortly after, during significant cross-branch
merging activity combined with the Forwarder class split. The final
merge importing the associated code (trunk r11730) was buggy.
The bug (explained in r11164.1.61) leads to a race condition between
* Store notifying Server classes about the entry completion (which might
trigger a bogus error message sent to the cache manager client while
Coordinator sends its own valid response on the same connection!) and
* post-cleanup() connection closure handlers of Server classes silently
closing everything (and leaving Coordinator the only responding
process on that shared connection).
The bug probably was not noticed for so long because, evidently, the
latter actions tend to win in the current code.
Andreas Weigel [Thu, 29 Jun 2017 10:53:05 +0000 (22:53 +1200)]
Fix option --foreground to implement expected behavior
... and allow usage of SMP mode with service supervisors that do not work
well with daemons.
Currently, --foreground behavior is counter-intuitive in that the launched
process, while staying in the foreground, forks another "master" process,
which will create additional children (kids), depending on the number of
configured workers/diskers.
Furthermore, sending a SIGINT/SIGTERM signal to this foreground process
terminates it, but leaves all the children running.
This behavior was introduced with v4 rev.14561.
From discussion on squid-dev, the following behavior is implemented:
* -N: The initial process is a master and a worker process.
No kids.
No daemonization.
* --foreground: The initial process is the master process.
One or more worker kids (depending on workers=N).
No daemonization.
* neither: The initial process double-forks the master process.
One or more worker kids (depending on workers=N).
Daemonization.
The Release Notes for v4 were updated to reflect the corrected behavior.
Add transaction_initiator ACL for detecting various unusual transactions
This ACL is essential in several use cases, including:
* After fetching a missing intermediate certificate, Squid uses the
regular cache (and regular caching rules) to store the response. Squid
deployments that do not want to cache regular traffic need to cache
fetched certificates and only them.
acl fetched_certificate transaction_initiator certificate-fetching
cache allow fetched_certificate
cache deny all
* Many traffic policies and tools assume the existence of an HTTP client
behind every transaction. Internal Squid requests violate that
assumption. Identifying internal requests protects external ACLs, log
analyzers, and other mechanisms from the transactions they mishandle.
The new transaction_initiator ACL classifies transactions based on their
initiator. Currently supported initiators are esi, certificate-fetching,
cache-digest, internal, client, and all. In the future, the same ACL
will be able to identify HTTP/2 push transactions using the "server"
initiator. See src/cf.data.pre for details.
Concurrent identical same-worker security_file_certgen (a.k.a. ssl_crtd)
requests are collapsed: The first such request goes through to one of
the helpers while others wait for that first request to complete,
successfully or otherwise. This optimization helps deal with flash
crowds that suddenly send a large number of HTTPS requests to a small
group of origin servers.
Two certificate generation requests are considered identical if their
on-the-wire images are identical. This simple and fast approach covers
all certificate generation parameters, including all mimicked
certificate properties, and avoids hash collisions and poisoning.
Compared to collision- or poisoning-sensitive approaches that store raw
certificates and compare their signatures or fingerprints, storing
helper queries costs a few extra KB per pending helper request. That
extra RAM cost is worth the advantages and will be eliminated when
helper code switches from c-strings to SBufs.
Add ssl::server_name options to control matching logic.
Many popular servers use certificates with several "alternative subject
names" (SubjectAltName). Many of those names are wildcards. For example,
a www.youtube.com certificate currently includes *.google.com and 50+
other subject names, most of which are wildcards.
Often, admins want server_name to match any of the subject names. This
is useful to match any server belonging to a large conglomerate of
companies, all including some *.example.com name in their certificates.
The existing server_name functionality addresses this use case well.
The new ACL options address several other important use cases:
--consensus identifies transactions with a particular server when the
server's subject name is also present in certificates used by many other
servers (e.g., matching transactions with a particular Google server but
not with all Youtube servers).
--client-requested allows both (a) SNI-based matching even after
Squid obtains the server certificate and (b) pinpointing a particular
server in a group of different servers all using the same wildcard
certificate (e.g., matching appengine.example.com but not
www.example.com when the certificate has a *.example.com subject).
--server-provided allows matching only after Squid obtains the server
certificate and matches any of the conglomerate parts.
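Put together, the options might be used like this in squid.conf (hypothetical ACL names and domains; option spelling as described above):

```
# match one particular Google server, not every server whose
# certificate also lists that subject name
acl oneGoogleServer ssl::server_name --consensus www.google.com

# match on the client-requested SNI, even after Squid has obtained
# the server certificate
acl appEngine ssl::server_name --client-requested appengine.example.com

# match any name the server-provided certificate carries
acl conglomerate ssl::server_name --server-provided .example.com
```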
This patch also fixes Squid to log the client SNI when the client-first
bumping mode is used.
The old single-letter ACL "flags" code was refactored to support long option
names (with option-specific value types) without significant
per-ACL-object performance/RAM overheads and without creating a global
registry for all possible options. This refactoring (unexpectedly)
resulted in removal of a lot of unreliable static initialization code.
Refactoring fixed ACL flags parsing code that was dangerously misinterpreting
-i and +i flags in several contexts. For example, each of the three cases
below was misinterpreted as if three domains were configured (e.g., "+i",
"-z", and "example.com") on each line instead of one domain ("example.com"):
Amos Jeffries [Tue, 20 Jun 2017 19:00:03 +0000 (07:00 +1200)]
TLS: Move the remaining SSL_SESSION cache callbacks to security/Session.*.
No GnuTLS additions here, nor significant code changes. Most of this is
a straight cut-and-paste, though it does change the slot lookup to auto
in the if-condition to simplify the callback code, and removes some no
longer necessary comments, as requested in audit.
Amos Jeffries [Mon, 19 Jun 2017 10:31:04 +0000 (22:31 +1200)]
Improve config parsing of logformat definitions
Squid has for some time ignored custom definitions using the same name
as internally defined formats, and overwritten custom formats when
there was a repeated definition.
* Detect logformat duplicates and produce an ERROR message indicating
the format name, config line, and action taken.
* Add some missing FATAL labels on parse abort when access_log has
multiple logformat= options configured.
* Add a missing FATAL error message when a logformat line has no name
parameter (and thus no tokens either).
* Omit the default "logformat=squid" option from cachemgr config dumps.
Alex Rousskov [Mon, 19 Jun 2017 10:07:40 +0000 (22:07 +1200)]
Bug 4492: Chunk extension parser is too pedantic.
Support HTTP/1 BWS when parsing HTTP and ICAP chunk extensions.
Per RFC 7230 Errata #4667, HTTP parsers MUST parse BWS in chunk-ext.
Per RFC 3507 and its extensions, ICAP agents generate BWS in chunk-ext.
Also discovered that our DelimiterCharacters() in pedantic mode is too
strict for many use cases: Most HTTP syntax rules allow both SP and HTAB
but pedantic DelimiterCharacters() only allows SP. Added
WhitespaceCharacters() to provide the more general set where it is
needed in new code (including BWS), but did not remove excessive
DelimiterCharacters() use.
Alex Rousskov [Sun, 18 Jun 2017 19:08:57 +0000 (07:08 +1200)]
Do not die silently when dying via std::terminate().
Report exception failures that call std::terminate(). Exceptions unwind
stack towards main() and sooner or later get handled/reported by Squid.
However, exception _failures_ just call std::terminate(), which aborts
Squid without the stack unwinding. By default, a std::terminate() call
usually results in a silent Squid process death because some default
std::terminate_handler implementations do not say anything at all while
others write to stderr which Squid redirects to /dev/null by default.
Many different problems trigger std::terminate() calls. Most of them are
rare, but, after the C++11 migration, one category became likely in
Squid: A throwing destructor. Destructors in C++11 are implicitly
"noexcept" by default, and many old Squid destructors might throw.
These reporting changes do not bypass or eliminate any failures.
Bug 2833 pt3: Do not respond with HTTP/304 to unconditional requests
... after internal revalidation. The original unconditional HttpRequest
was still marked (and processed) as conditional after internal
revalidation because the original (clear) Last-Modified and ETag values
were not restored (cleared) after the internal revalidation abused them.
TODO: Isolate the code converting the request into conditional one _and_
the code that undoes that conversion, to keep both actions in sync.
The security fix in v5 r14979 had a negative effect on collapsed
forwarding. All "private" entries were considered automatically
non-shareable among collapsed clients. However this is not true: there
are many situations when collapsed forwarding should work despite of
"private" entry status: 304/5xx responses are good examples of that.
This patch fixes that by means of a new StoreEntry::shareableWhenPrivate
flag.
The suggested fix is not complete: To cover all possible situations, we
need to decide whether StoreEntry::shareableWhenPrivate is true or not
for all contexts where StoreEntry::setPrivateKey() is used. This patch
fixes only a few important cases inside http.cc, making CF (as well as
collapsed revalidation) work for some [non-cacheable] response status
codes, including 3xx, 5xx, and some others.
The original support for internal revalidation requests collapsing
was in trunk r14755 and referred to Squid bugs 2833, 4311, and 4471.
Amos Jeffries [Mon, 29 May 2017 03:19:47 +0000 (15:19 +1200)]
Add OpenSSL library details to -v output
This is partially to meet the OpenSSL copyright requirement that
binaries mention when they are using the library, and partially for
admins to see which library their Squid is using when multiple are
present in the system.
Crashes when server-first bumping mode is used with the OpenSSL-1.1.0 release
When OpenSSL-1.1.0 or later is used:
- The SQUID_USE_SSLGETCERTIFICATE_HACK configure test is false
- The SQUID_SSLGETCERTIFICATE_BUGGY configure test is true
- Squid hits an assert(0) inside Ssl::verifySslCertificate when trying to
retrieve a generated certificate from cache.
Create PID file ASAP, before the shared memory segments.
The PID file is created right after configuration finalization, before
the allocation of any shared memory segments.
Late PID file creation allowed N+1 concurrent Squid instances to create
the same set of shared segments (overwriting each other's segments),
resulting in extremely confusing havoc because the N instances would
later lose the race for the PID file (or some other critical resource)
creation and remove the segments. If that removal happened before a kid
of the single surviving instance started, that kid would fail to start
with open() errors in Segment.cc because the shared segment it tries to
open would be gone. Otherwise, that kid would fail to _restart_ after
any unrelated failure (possibly many days after the conflict), with the
same errors, for the same reason.
Shared state corruption was also possible if different kids (of the
winning instance) opened (and started using) segments created (and
initialized) by different instances.
Situations with N+1 concurrent Squid instances are not uncommon because
many Squid service management scripts (or manual admin commands!)
* do not check whether another Squid is already running and/or
* incorrectly assume that "squid -z" does not daemonize.
This change finally makes starting N+1 Squid instances safe (AFAIK).
Also made daemonized and non-daemonized Squid create the PID file at the
same startup stage, reducing inconsistencies between the two modes.
Make PID file check/creation atomic to avoid associated race conditions.
After this change, if N Squid instances are concurrently started shortly
after time TS, then exactly one Squid instance (X) will run (and have
the corresponding PID file). If another Squid instance has already been
running (with the corresponding PID file) at TS, then X will be that
"old" Squid instance. If no Squid instances were running at TS, then X
will be one of those new N Squids started after TS.
Lack of atomic PID file operations caused unexpected Squid behavior:
* Mismatch between started Squid instance and stored PID file.
* Unexpected crashes due to failed allocation of shared resources,
such as listening TCP ports or shared memory segments.
A new File class guarantees atomic PID file operations using locks. We
tried to generalize/reuse Ssl::Lock from the certificate generation
helper, but that was a bad idea: Helpers cannot use a lot of Squid code
(e.g., debugs(), TextException, SBuf, and enter_suid()), and the old
Ssl::Lock class cannot support shared locking without a major rewrite.
File locks on Solaris cannot work well (see bug #4212 comment #14), but
those problems do not affect PID file management code. Solaris- and
Windows-specific File code has not been tested and may not build.
Failure to write a PID file is now fatal. It used to be fatal only when
Squid was started with the -C command line option. In the increasingly
SMP world, running without a PID file leads to difficult-to-triage
errors. An admin who does not care about PID files should disable them.
Squid now exits with a non-zero error code if another Squid is running.
Also removed PID file rewriting during reconfiguration in non-daemon
mode. Squid daemons do not support PID file reconfiguration since trunk
r13867, but that revision (accidentally?) left behind half-broken
reconfiguration code for non-daemon mode. Fixing that code is difficult,
and supporting PID reconfigure in non-daemons is probably unnecessary.
Also fixed "is Squid running?" check when kill(0) does not have
permissions to signal the other instance. This does happen when Squid is
started (e.g., on the command line) by a different user than the user
Squid normally runs as or, perhaps, when the other Squid instance enters
a privileged section at the time of the check (untested). The bug could
result in undelivered signals or multiple running Squid instances.
These changes do not alter partially broken enter/leave_suid() behavior
of main.cc. That old code will need to be fixed separately!
PID file-related cache.log messages have changed slightly to improve
consistency with other DBG_IMPORTANT messages and to simplify code.
Squid no longer lies about creating a non-configured PID file. TODO:
Consider lowering the importance of these benign/boring messages.
* Terminal errors should throw instead of calling exit()
Squid used to call exit() in many PID-related error cases. Using exit()
as an error handling mechanism creates several problems:
1. exit() does not unwind the stack, possibly executing atexit()
handlers in the wrong (e.g., privileged) context, possibly leaving
some RAII-controlled resources in a bad state, and complicating triage;
2. Using exit() complicates code by adding yet another error handling
mechanism to the (appropriate) exceptions and assertions.
3. Spreading exit() calls around the code obscures unreachable code
areas, complicates unifying exit codes, and confuses code checkers.
Long-term, it is best to use exceptions for nearly all error handling.
Reaching that goal will take time, but we can and should move in that
direction: The adjusted SquidMainSafe() treats exceptions as fatal
errors, without dumping core or assuming that no exception can reach
SquidMainSafe() on purpose. This trivial-looking change significantly
simplified (and otherwise improved) PID-file handling code!
The fatal()-related code suffers from similar (and other) problems, but
we did not need to touch it.
TODO: Audit catch(...) and exit() cases [in main.cc] to take advantage
of the new SquidMainSafe() code supporting the throw-on-errors approach.
Alex Rousskov [Mon, 29 May 2017 00:18:24 +0000 (12:18 +1200)]
Do not unconditionally revive dead peers after a DNS refresh.
Every hour, peerRefreshDNS() performs a DNS lookup of all cache_peer
addresses. Before this patch, even if the lookup results did not change,
the associated peerDNSConfigure() code silently cleared dead peer
marking (CachePeer::tcp_up counter), if any. Forcefully reviving dead
peers every hour can lead to transaction delays (and delays may lead to
failures) due to connection timeouts when using a still dead peer.
This patch starts standard TCP probing (instead of pointless dead peer
reviving), correctly refreshing peer state. The primary goal is to
cover a situation where a DNS refresh changes the peer address list.
However, TCP probing may be useful for other situations as well and has
low overhead (that is why it starts unconditionally). For example,
probing may be useful when the DNS refresh changes the order of IP
addresses. It also helps detect dead idle peers.
Also delay and later resume peer probing if peerDNSConfigure() is
invoked when peers are being probed. Squid should re-probe because the
current probes may use stale IP addresses and produce wrong results.
xstrndup() does not work like strndup(3), and some callers got confused:
1. When n is the str length or less, standard strndup(str,n) copies all
n bytes but our xstrndup(str,n) drops the last one. Thus, all callers
must add one to the desired result length when calling xstrndup().
Most already do, but it is often hard to see due to low code quality
(e.g., one must remember that MAX_URL is not the maximum URL length).
2. xstrndup() also assumes that the source string is 0-terminated. This
dangerous assumption does not contradict many official strndup(3)
descriptions, but that lack of contradiction is actually a recently
fixed POSIX documentation bug (i.e., correct implementations must not
assume 0-termination): http://austingroupbugs.net/view.php?id=1019
The OutOfBoundsException bug led to truncated exception messages.
The ESI bug led to truncated 'literal strings', but I do not know what
that means in terms of user impact. That ESI fix is untested.
The cachemgr.cc bug was masked by the fact that the buffer ends with a
\n that is unused and stripped by the custom xstrtok() implementation.
TODO. Fix xstrndup() implementation (and rename the function so that
fixed callers do not misbehave if carelessly ported to older Squids).
This ACL detects the presence of request, response, or ALE transaction
components. Since many ACLs require some of these components, their
absence in a transaction may spoil the check and confuse admins with
warnings like "... ACL is used in context without an HTTP request".
Using the 'has' ACL should help deal with these problems caused by
component-less transactions.
Also: addressed TODO in item #3 of v4 revision 14752.
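For example, a hypothetical squid.conf snippet that guards a request-dependent ACL with 'has' (the domain name is illustrative):

```
# only evaluate the dstdomain ACL when an HTTP request is present,
# avoiding "ACL is used in context without an HTTP request" warnings
acl hasRequest has request
acl toExample dstdomain .example.com
http_access deny hasRequest toExample
```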
Bug 4321: ssl_bump terminate does not terminate at step1
The following trivial configuration should terminate all connections that
are subject to SslBumping:
ssl_bump terminate all
but Squid either splices or bumps instead.
This patch fixes Squid to immediately close the connection.
Also this patch:
- fixes the wrong use of Ssl::bumpNone in cases where Ssl::bumpEnd
(do not bump) or Ssl::bumpSplice (splice after peek/stare at step1)
must be used.
- updates the %ssl::bump_mode documentation.
- fixes the %ssl::bump_mode formatting code to print the last bumping action.
Squid does not forward HTTP transactions to dead peers except when a
dead peer was idle for some time (ten peer connect timeouts or longer).
When the idle peer is still dead, this exception leads to transaction
delays (at best) or client disconnects/errors (at worst), depending on
Squid and client configurations/state. I am removing this exception.
The "use dead idle peer" heuristic was introduced as a small part of a
much bigger bug #14 fix (trunk r6631). AFAICT, the stated goal of the
feature was speeding up failure recovery: The heuristic may result in
HTTP transactions sent to a previously dead (but now alive) idle peer
earlier, before the peer is proven to be alive (using peer revival
mechanisms such as TCP probes). However, the negative side effects of
this heuristic outweigh its accidental benefits. If somebody needs Squid
to detect revived idle peers earlier, they need to add a different
probing mechanism that does not jeopardize HTTP transactions.
Nobody has spoken in defense of this feature on Squid mailing lists:
http://lists.squid-cache.org/pipermail/squid-users/2017-March/014785.html
http://lists.squid-cache.org/pipermail/squid-dev/2017-March/008308.html
The removed functionality was not used to detect revived peers. All peer
revival mechanisms (such as TCP probes) remain intact.
Bug 4711: SubjectAlternativeNames is missing in some generated certificates
Squid may generate certificates which have a Common Name but lack a
subjectAltName extension, for example, when Squid-generated certificates
do not mimic an origin certificate, or when the certificate adaptation
algorithm sslproxy_cert_adapt/setCommonName is used.
This causes problems with some browsers, which validate a certificate
using subjectAltName but ignore the CommonName field.
This patch fixes Squid to always add a subjectAltName extension to
generated certificates which do not mimic an origin certificate.
Squid still will not add a subjectAltName extension when mimicking an
origin server certificate, even if that origin server certificate does
not include the extension. Such origin servers may have problems when
talking directly to browsers; patched Squid does not try to fix those
problems.
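For reference, a hypothetical squid.conf fragment that triggers the adaptation path described above (the CN value is illustrative):

```
# the generated certificate gets CN=fake.example.com and, after this
# fix, a matching subjectAltName extension as well
sslproxy_cert_adapt setCommonName{fake.example.com} all
```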
Bug 4682: ignoring http_access deny when client-first bumping mode is used
Squid fails to identify HTTP requests tunneled inside an already
established client-first bumped tunnel, and this results in ignoring
http_access deny rules for these requests.
Squid does not send the CONNECT request to adaptation services
if the "ssl_bump splice" rule matched at step 2. This adaptation
is important because the CONNECT request gains SNI information during
the second SslBump step. This is a regression bug, possibly caused by
the Squid bug 4529 fix (trunk commits r14913 and r14914).
Count failures and use peer-specific connect timeouts when tunneling.
Fixed two bugs with tunneling CONNECT requests (or equivalent traffic)
through a cache_peer:
1. Not detecting dead cache_peers due to missing code to count peer
connect failures. TLS/SSL-level failures were detected (for "tls"
cache_peers) but TCP/IP connect(2) failures were not (for all peers).
2. Origin server connect_timeout used instead of peer_connect_timeout or
a peer-specific connect-timeout=N (where configured).
The regular forwarding code path does not have the above bugs. This
change reduces code duplication across the two code paths (that
duplication probably caused these bugs in the first place), but a lot
more work is needed in that direction.
The 5-second forwarding timeout hack has been in Squid since
forward_timeout inception (r6733). It is not without problems (now
marked with an XXX), but I left it as is to avoid opening another
Pandora's box. The hack now applies to the tunneling code path as well.
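A hypothetical squid.conf fragment showing the timeouts involved (host name and values are illustrative):

```
# a peer-specific timeout wins for this peer, now including tunneled
# CONNECT traffic
cache_peer parent.example.com parent 3128 0 connect-timeout=5
# fallback for peers without connect-timeout=N; used instead of the
# origin-server connect_timeout
peer_connect_timeout 30 seconds
```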
Bug 4659: sslproxy_foreign_intermediate_certs does not work
The sslproxy_foreign_intermediate_certs directive stopped working after r14769.
The bug was caused by incorrect use of the X509_check_issued() OpenSSL API call.
Amos Jeffries [Thu, 4 May 2017 10:12:39 +0000 (22:12 +1200)]
ext_session_acl: cope with new logformat inputs
Now that Squid sends an explicit '-' for the trailing %DATA parameter
when there were no acl parameters, this helper needs to cope with that
on 'active mode' session lookups when login/logout are not being performed.
Fixes the Squid documentation to correctly describe Squid's behavior when
the "bump" action is selected at step SslBump1. In this case, Squid
selects the client-first bumping mode.