Fix assertion comm.cc:759: "Comm::IsConnOpen(conn)" in ConnStateData::getSslContextDone
This is an ssertion inside ConnStateData::getSslContextDone while
setting timeout. The reason is that the ConnStateData::clientConnection
may closed while waiting response from ssl_crtd helper.
Amos Jeffries [Fri, 5 Jun 2015 23:38:34 +0000 (16:38 -0700)]
Bug 3875: bad mimeLoadIconFile error handling
Improve the MimeIcon reliability when filesystem I/O errors or others
cause the icon data to not be loadable.
The loading process is re-worked to guarantee that once the
MimeIon::created callback occurs it will result in a valid StoreEntry in
the cache representing the wanted icon.
* If the image can be loaded without any issues it will be placed in
the cache as a 200 response.
* If errors prevent the image being loaded or necessary parameters
(size and mtime) being known a 204 object will be placed into the cache.
NP: There is no clear agreement on 204 being 'the best' status for this
case. 500 Internal Error is also appropriate. I have use 204 since:
* the bug is not in the clients request (eliminating 400, 404, etc),
* a 500 would be revealing details about server internals unnecessarily
often and incur extra complexity creating the error page.
* 204 also avoids needing to send Content-Length, Cache-Control header
and body object (bandwidth saving over 500 status).
NP: This started with just correcting the errno usage, but other bugs
promptly started appearing once I got to seriously testing this load
process. So far it fixes:
* several assertions resulting from StoreEntry being left invalid in
cache limbo beween created hash entries and valid mem_obj data.
* repeated attempts on startup to load absent icons files which dont
exist in the filesystem.
* buffer overfow on misconfigured or corrupt mime.conf file entries
* incorrect debugs messages about file I/O errors
* large error pages delivered when icons not installed (when it does
not assert from the StoreEntry)
This patch allow user_cert and ca_cert ACLs to match arbitrary
stand-alone OIDs (not DN/C/O/CN/L/ST objects or their substrings).
For example, should be able to match certificates that have
1.3.6.1.4.1.1814.3.1.14 OID in the certificate Subject or Issuer field.
Squid configuration would look like this:
acl User_Cert-TrustedCustomerNum user_cert 1.3.6.1.4.1.1814.3.1.14 1001
Bug 3329: The server side pinned connection is not closed properly
... in ConnStateData::clientPinnedConnectionClosed CommClose handler.
Squid enters a buggy state when an idle connection pinned to a peer closes:
- The ConnStateData::clientPinnedConnectionRead, the pinned peer
connection read handler, is called with the io.flag set to
Comm::ERR_CLOSING. The read handler does not close the peer
Comm::Connection object. This is correct and expected -- the I/O
handler must exit on ERR_CLOSING without doing anything.
- The ConnStateData::clientPinnedConnectionClosed close handler is called,
but it does not close the peer Comm::Connection object either. Again,
this is correct and expected -- the close handler is not the place to
close a being-closed connection.
- The corresponding fde object is marked as closed (fde::flags.open
is false), but the peer Comm::Connection object is still open
(Comm::Connection.fd >= 0)! From this point on, we have an inconsistency
between the peer Comm::Connection object state and the real world.
- When the ConnStateData::pinning::serverConnection object is later
destroyed (by refcounting), it will try to close its fd. If that fd
is already in use (e.g., by another Comm::Connection), bad things
happen (crashes, segfaults, etc). Otherwise (i.e., if that fd is
not open), comm_close may cry about BUG 3556 (or worse).
To fix this problem, we must not allow Comm::Connections to get out
of sync with fd_table, even when a descriptor is closed without going
through Connection::close(). There are two ways to accomplished that:
* Change Comm to always store Comm::Connections and similar high-level
objects instead of fdes. This is a huge change that has been long on
the TODO list (those "other high-level objects" is on of the primary
obstacles there because not everything with a FD is a Connection).
* Notify Comm::Connections about closure in their closing handlers
(this change). This design relies on every Comm::Connection having
a close handler that notifies it. It may take us some time to reach
that goal, but this change is the first step providing the necessary
API, a known bug fix, and a few preventive changes.
This change:
- Adds a new Comm::Connection::noteClosure() method to inform the
Comm::Connection object that somebody is closing its FD.
- Uses the new method inside ConnStateData::clientPinnedConnectionClosed
handler to inform the ConnStateData::pinning::serverConnection object
that its FD is being closed.
- Replaces comm_close calls which may cause bug #3329 in other places with
Comm::Connection->close() calls.
Initially based on Nathan Hoad research for bug 3329.
The Adaptation::Icap::Xaction::swanSong may try to use an invalid
Icap::Xaction::cs object (Comm::ConnOpener object) if the Comm::ConnOpener
is already gone (because its job finished) but the Xaction::noteCommConnected
method is not called yet.
This patch makes the Adaptation::Icap::Xaction::cs object a CbcPointer instead
of a raw pointer and checks if the Xaction::cs object is still valid before
using it.
Fix "Not enough space to hold server hello message" error message
This patch merges the Ssl::ClientBio and Ssl::ServerBio read buffering code
to the Ssl::Bio::readAndBuffer method and uses the MemBuf::potentialSpaceSize
instead of MemBuf::spaceSize to check space size for SSL hello messages buffer,
to take in account available space after a possible buffer grow.
Amos Jeffries [Fri, 22 May 2015 04:55:35 +0000 (21:55 -0700)]
Prevent unused ssl_crtd helpers being run
The conditions for when to start ssl_crtd helpers was ignoring the
generate-host-certificates=off option. Meaning most ssl-bump installs
were running them needlessly.
Alex Dowad [Fri, 22 May 2015 04:47:22 +0000 (21:47 -0700)]
Fix incorrect use of errno in various libcomm.la places
Fix problems with 'errno' in TcpAcceptor::Listen, Comm::HandleRead, and
Comm::HandleWrite. 'errno' is only valid after a standard library function
returns an error. Also, we must avoid calling out to other functions before
recording the value of 'errno', since they might overwrite it.
Alex Dowad [Fri, 22 May 2015 04:33:41 +0000 (21:33 -0700)]
Fix signal.h usage to resolve compiler warning
When included, musl libc's sys/signal.h issues a compiler warning
stating that signal.h should be used directly instead. If gcc is
treating all warnings as errors, this breaks the build.
glibc's sys/signal.h does not contain any definitions; all it does
is include signal.h (indirectly). So directly including signal.h
doesn't break anything with glibc.
Nathan Hoad [Fri, 22 May 2015 04:26:17 +0000 (21:26 -0700)]
Fix missing external ACL helper notes
external ACL helper notes are only added onto the HTTP request that
kicked off the external ACL lookup, and not cached ACL responses.
This means if you set notes from an external ACL that are used for
some processing in other ACLs, or post-processing on logs, things
may be missed.
Inside IdleConnList::findUseable the IdleConnList::removeAt call can delete
"this" IdleConnList object. The IdleConnList::clearHandlers called imediatelly
after the removeAt method, will try to use the invalid "this" object in
a comm_read_cancel function call, causing this assertion or other similar.
This patch fixes the IdleConnList::findUseable, IdleConnList::pop and
IdleConnList::findAndClose methods to call IdleConnList::clearHandlers before
the IdleConnList::removeAt is called.
The comm_connect_addr on connect failures sets the xerrno to 0
and returns Comm::OK. This is causes problems on ConnOpener
class users which believes the connection is established and
it is ready for use.
Amos Jeffries [Sat, 9 May 2015 11:48:35 +0000 (04:48 -0700)]
Docs: shuffle SMP specific options to the top of squid.conf
The workers directive is required to be used before several other
directives. It makes little sense to documents it after the controls
which depend on it.
Make a new config section to contain the SMP specific options.
Amos Jeffries [Sat, 9 May 2015 11:43:14 +0000 (04:43 -0700)]
CacheMgr: display 'client_db off' instead of 0 clients accessing cache
... to clarify why there is no record of even the mgr request happening.
The client_db mechanism needs to be enabled and measuring traffic for
any useful client counter value to exist.
The maximum buffer size for holding Server and Client SSL hello messages is only
16k which is not enough hold a Hello message which includes some extensions and
1-2 or more Certificates.
This patch increases the maximum size to 65535 and also adds some checks to
avoid squid crashes in the case the hello messages buffer overflows.
comm_connect_addr() uses errno to determine whether library calls like connect()
are successful. Its callers also use errno for extra information on the cause
of any problem. However, after calling library functions like connect(),
comm_connect_addr() calls other library functions which can overwrite errno.
As the errno manpage explains, "a function that succeeds is allowed to change
errno". So even when nothing is wrong, comm_connect_addr() may return an error
flag if libc sets errno. And when something *is* wrong, incorrect error information
may be returned to the caller because errno was overwritten with a different code.
Correct this by using our own error code variable which is set only when a library
call fails. To avoid breaking callers, set errno before returning.
Fix 'access_log none' to prevent following logs being used
The documented behaviour of "access_log none" for preventing logging
using log lines following the directive has not been working in
Squid-3 for some time.
Since the 'none' type does not have a log module associated the entire
switch logic where its abort is checked for was being skipped.
Unexpected SQUID_X509_V_ERR_DOMAIN_MISMATCH errors while accessing sites with valid certificates
A "const char *" pointer retrieved using the SBuf::c_str() method may attached
to an SSL object using the SSL_set_ex_data method as server name used to
validate server certificates. This pointer may become invalid, causing
the SQUID_X509_V_ERR_DOMAIN_MISMATCH errors.
This patch changes the type of the ssl_ex_index_server index used with the
SSL_set_ex_data function to be an SBuf object.
Portability: Add hacks to define C++11 explicit N-bit type limits
Add cstdint and stdint.h to libcompat headers and ensure that type limits
used by Squid are always available. Mostly this involves shuffling
existing hacks into the compat headers but the UINT32_* limits are new.
The SSL_get_peer_certificate openSSL function increases the lock for X509
object it returns so X509 object retrieved using this function must be
released with X509_free after use.
This patch uses the Ssl::X509_Pointer TidyPointer to release X509 object
retrieved with the SSL_get_peer_certificate function inside the
Ssl::PeerConnector::handleNegotiateError method
Despite the "must match" comment, MAX_AUTHTOKEN_LEN in
auth/UserRequest.h got out of sync with similar constants in Negotiate helpers.
A 32KB buffer cannot fit some helper requests (e.g., those carrying Privilege
Account Certificate information in the client's Kerberos ticket). Each truncated
request blocks the negotiate helper channel, eventually causing helper queue
overflow and possibly killing Squid.
This patch increases MAX_AUTHTOKEN_LEN in UserRequest.h to 65535 which
is also the maximum used by the negotiate helpers. The patch also adds checks
to avoid sending truncated requests, treating them as helper errors instead.
This patch adds code in squid to control SslBump behavior when dealing with
"resuming SSL/TLS sessions". Without these changes, SslBump usually terminates
all resuming sessions with an error because such sessions do not include
server certificates, preventing Squid from successfully validating the server
identity.
After these changes, Squid splices resuming sessions. Splicing is the right
because Squid most likely has spliced the original connections that the client
and server are trying to resume now.
Without SslBump, session resumption would just work, and SslBump behaviour
should approach that ideal.
Future projects may add ACL checks for allowing resuming sessions and may
add more complex algorithms, including maintaining an SMP-shared
cache of sessions that may be resumed in the future and evaluating
client/server attempts to resume a session using that cache.
This patch also makes SSL client Hello message parsing more robust and
adds an SSL server Hello message parser.
Also add support for NPN (next protocol negotiation) and ALPN
(Application-Layer Protocol Negotiation) tls extensions, required to
correctly bump web clients support these extensions
Technical details
-----------------
In Peek mode, the old Squid code would forward the client Hello message to the
server. If the server tries to resume the previous (spliced) SSL session with
the client, then Squid SSL code gets an ssl/PeerConnector.cc "ccs received
early" error (or similar) because the Squid SSL object expects a server
certificate and does not know anything about the session being resumed.
With this patch, Squid detects session resumption attempts and splices
There are two mechanism in SSL/TLS for resuming sessions. The traditional
shared session IDs and the TLS ticket extensions:
* If Squid detects a shared ID in both client and server Hello messages, then
Squid decides whether the session is being resumed by comparing those client
and server shared IDs. If (and only if) the IDs are the same, then Squid
assumes that it is dealing with a resuming session (using session IDs).
* If Squid detects a TLS ticket in the client Hello message and TLS ticket
support in the server Hello message as well as a Change Cipher Spec or a New
TLS Ticket message (following the server Hello message), then (and only then)
Squid assumes that it is dealing with a resuming session (using TLS tickets).
The TLS tickets check is not performed if Squid detects a shared session ID
in both client and server Hello messages.
NPN and ALPN tls extensions
---------------------------
Even if squid has some SSL hello messages parsing code, we are relying to
OpenSSL for full parsing. The openSSL used in peek and splice mode to parse
server hello message, check for errors and verify server certificates.
If the openSSL, while parses the server hello message, find an extension enabled
in the server hello message, which is not enabled in its side, fails with an
error ("...parse tlsext...").
OpenSSL supports NPN tls extension and from 1.0.2 release supports also the
ALPN tls extensions. In peek mode we are forwading the client SSL hello message
as is, and if this message include support for NPN or ALPN tls extension is
possible that the SSL server support them and include related extensions
in its response. The openSSL will fail if support for these extensions is
not enabled in its side.
This patch handles the NPN (TLSEXT_TYPE_next_proto_neg) as follows:
Try to select the http/1.1 protocol from the server protocols list. If the
http/1.1 is not supported then the SSL bumping will fail. This is valid
because only http protocol we are supporting in squid.
Splicing is not affected.
Also add support for the ALPN TLS extension. This extension is a replacement
for the NPN extension. The client sends a list of supported protocols. In the
case of stare mode squid now sends only http as supported protocol. In the
case of server-first or client-first bumbing modes, squid does enable this
extension.
The NPN supported by chromium browser the ALPN supported by firefox.
Support for ALPN is added to openSSL 1.0.2 release.
These extensions are used to support SPDY and similar protocols.
The fix for Bug 3664 "ssl_crtd fails to build on OpenSolaris/OpenIndiana/Solaris 11"
introduced a regression on BSD and Linux where lockf() implementations appear not to
lock the entire file correctly or as reliably as flock().
Reverting the flock/lockf change for non-Solaris OS.
Add server_name ACL matching server name(s) obtained from various sources
... such as CONNECT request URI, client SNI, and SSL server certificate CN.
During each SslBump step, Squid improves its understanding of a "true server
name", with a bias towards server-provided (and Squid-validated) information.
The server-provided server names are retrieved from the server certificate CN
and Subject Alternate Names. The new server_name ACL matches any of alternate
names and CN. If the CN or an alternate name is a wildcard, then the new ACL
matches any domain that matches the domain with the wildcard.
Other than supporting many sources of server name information (including
sources that may supply Squid with multiple server name variants and
wildcards), the new ACL is similar to dstdomain.
Invalid request->clientConnectionManager object used by Ssl::PeerConnector::handleNegotiateError
This patch adds the Ssl::ServerBio::bumpMode() method to retrieve the configured
mode from a ServerBio object, and uses this method for checking the bumping
mode inside Ssl::PeerConnector::handleNegotiateError method
After a failed http_access acl check of an HTTP request, tunneled through a
SSL bumped connection, ssl bumping code try to re-setup the connection for a
client-first bumping mode to serve the error crashing squid.
It can be hard determining what simple operations (ie cow(), grow()) are
being done no what SBuf object. Add the SBuf::id to debugs() output on
many more operations.
When squid generated an error page which contains the "%m" formating code
but the authentication information is not available squid dies with
segfault.
Amos Jeffries [Sat, 21 Mar 2015 07:38:14 +0000 (00:38 -0700)]
Portability: check 64-bit GNU atomic operators are useable
Sometimes (namely 32-bit OpenBSD libstdc++) do not fully implement the
GNU atomic operators for both 32-bit and 64-bit. But Squid makes use of
both types if the compiler deems them required.
We need to check them all before declaring the atomics usable, or not.
Thanks to Stuart Henderson for identifying the issue.
Amos Jeffries [Tue, 3 Mar 2015 14:48:01 +0000 (06:48 -0800)]
Bug 2907: high CPU usage on CONNECT when using delay pools
When delay pools are active on a CONNECT tunnel and the pool is drained
the I/O loop cycles very often transferring 1 byte until the pool is
topped-up at the end of the second.
Instead of looping constantly trying to read 1 byte at a time, add an
asynchronous event to wait for a few I/O cycles or until more bytes can
be read.
To protect against infinite loops of waiting when many tunnels are
competing for the pool allowance we only delay for a limited number of
loops before allowing at least 1 byte through. Also, the amount of time
waited is an odd fraction of 1 second so re-tries naturally spread
across any given second fairly, with connections rotating closer or
further from the time when pool topup happens. That behaviour still
allows some variance in service times, but overall the CPU consumption
and (as a result) total proxy speed appears to be much improved.
NP: Initial production testing shows a 36% RPS speed increase,
with a 50% reduction in total CPU usage.
Squid closes the SSL client connection with "Failed to start fake CONNECT
request for ssl spliced connection". This happens especially often when
the pipeline_prefetch configuration parameter is set to "0" (i.e., default).
When a transparent SSL connection is peeked and then spliced in step2, we are
generating a fake CONNECT request. The fake CONNECT request is counted as a
new pipelined request and may exceed the configured limit. This patch solves
this problem by raising the limit for that request.
Needs more work to better identify the requests that need a different limit.
Joshua Root [Tue, 3 Mar 2015 14:41:07 +0000 (06:41 -0800)]
Bug 3805: support shared memory on MacOS X in Mem::IPC::Segment
MacOS X doesn't support the O_TRUNC flag to shm_open; it is redundant anyway
because the shared memory segment is truncated immediately after opening
as per best practices. With this support Squid can now be built and run
under MacOS X.
Amos Jeffries [Tue, 3 Mar 2015 14:37:28 +0000 (06:37 -0800)]
basic_nis_auth: fail authentication on crypt() failures
... instead of crashing the helper.
"
Starting with glibc 2.17 (eglibc 2.17), crypt() fails with EINVAL (w/
NULL return) if the salt violates specifications. Additionally, on
FIPS-140 enabled Linux systems, DES or MD5 encrypted passwords passed to
crypt() fail with EPERM (w/ NULL return).
"
Amos Jeffries [Tue, 3 Mar 2015 14:36:20 +0000 (06:36 -0800)]
basic_getpwnam_auth: fail authentication on crypt() failures
... instead of crashing the helper.
"
Starting with glibc 2.17 (eglibc 2.17), crypt() fails with EINVAL (w/
NULL return) if the salt violates specifications. Additionally, on
FIPS-140 enabled Linux systems, DES or MD5 encrypted passwords passed to
crypt() fail with EPERM (w/ NULL return).
"
Problem description:
- Squid sslproxy_options deny the use of TLSv1.2 SSL protocol:
sslproxy_options NO_TLSv1_2
- Squid uses peek mode for bumped connections.
- Web client sends a TLSv1.2 hello message and squid in peek mode,
forwards the client hello message to server
- Web server responds with a TLSv1.2 hello message
- Squid while parsing server hello message aborts with an error because
sslproxy_options denies the use of TLSv1.2 protocol.
This patch fixes squid to ignore sslproxy_options when peek or stare
bumping mode is selected on bumpStep2 bumping step.
The sslproxy_options are applied if bump (server-first or client-first)
mode is selected on bumpStep1 or bumpStep2 bumping step. Also for
proxied https:// scheme requests.
Eldar Akchurin [Tue, 10 Feb 2015 04:42:35 +0000 (20:42 -0800)]
Bug 4073: Cygwin compile errors
Remove the definition of _SQUID_WINDOWS_ for Cygwin builds. The blend
of win32 and Linux environments is sufficiently different to have major
build issues. We have a precedent in kFreeBSD blend of BSD and Linux to
consider Cygwin a blend and first-class OS.
Also, temporarily disable the Win32-specific libraries and objects until
they can be properly tested.
Fix some small remaining compile errors after the above.
Cygwin Windows build is sponsored by Diladele B.V.
Amos Jeffries [Tue, 10 Feb 2015 03:15:56 +0000 (19:15 -0800)]
Cygwin: Disable C++11 detection and default use
The flags to enable can still be presented explicitly by the user but
there are known clashes with POSIX / ANSI definitions that sound
remarkably sinmilar to the Clang issues with -std=c++0x
Amos Jeffries [Fri, 6 Feb 2015 15:59:58 +0000 (07:59 -0800)]
Fix some cbdataFree related memory leaks
The delete operator should have been called for these objects after
previous code changes converted them to CBDATA_CLASS. As a result any
member objects relying on their destructor to cleanup were being leaked.
Also, make generic_cbdata::data a private member. The unwrap() method is
easily used now.