Alex Rousskov [Fri, 23 Sep 2016 12:40:44 +0000 (00:40 +1200)]
Bug 3819: "fd >= 0" assertion in file_write() during reconfiguration
Other possible assertions include "NumberOfUFSDirs" in UFSSwapDir and
"fd >= 0" in file_close(). And Fs::Ufs::UFSStoreState::drainWriteQueue()
may segfault while dereferencing nil theFile, although I am not sure
that bug is caused specifically by reconfiguration.
Since trunk r9181.3.1, reconfiguration is done in at least two steps:
First, mainReconfigureStart() closes/cleans/disables all modules
supporting hot reconfiguration and then, at the end of the event loop
iteration, mainReconfigureFinish() opens/configures/enables them again.
UFS code hits assertions if it has to log entries or rebuild swap.state
between those two steps. The tiny assertion opportunity window hides
these bugs from most proxies that are not doing enough disk I/O or are
not being reconfigured frequently enough.
Asynchronous UFS cache_dirs such as diskd were the most exposed, but
even blocking UFS caching code could probably hit [rebuild] assertions.
The swap.state rebuilding (always initiated at startup) probably did not
work as intended if reconfiguration happened during the rebuild time
because reconfiguration closed the swap.state file being rebuilt. We now
protect that swap.state file and delay rebuilding progress until
reconfiguration is over.
TODO: To squash more similar bugs, we need to move hot reconfiguration
support into the modules so that each module can reconfigure instantly.
Alex Rousskov [Fri, 23 Sep 2016 11:11:48 +0000 (23:11 +1200)]
Do reset $HOME if needed after r13435. Minimize putenv() memory leaks.
While r13435 is attributed to me, the wrong condition was not mine. This
is a good example of why "cmp() op 0" pattern is usually safer because
the "==" or "!=" operator "tells" us whether the strings are equal,
unlike "!cmp()" that is often misread as "not equal".
Alex Rousskov [Wed, 21 Sep 2016 02:09:36 +0000 (14:09 +1200)]
Bug 4228: ./configure bug/typo in r14394.
The bug resulted in "test: too many arguments" error messages when
running ./configure and effectively replaced AC_MSG_ERROR() with
AC_MSG_WARN() for missing but required Heimdal and GNU GSS Kerberos
libraries.
Fix logged request size (%http::>st) and other size-related %codes.
The %[http::]>st logformat code should log the actual number of
[dechunked] bytes received from the client. However, for requests with
Content-Length, Squid was logging the sum of the request header size and
the Content-Length header field value instead. The logged value was
wrong when the client sent fewer than Content-Length body bytes.
Also:
* Fixed %http::>h and %http::<h format codes for icap_log. The old code
was focusing on preserving the "request header" (i.e. not "response
header") meaning of the %http::>h logformat code, but since ICAP
services deal with (and log) both HTTP requests and responses, that
focus on the HTTP message kind actually complicates ICAP logging
configuration and will eventually require introduction of new %http
logformat codes that would be meaningless in access.log context.
- In ICAP context, %http::>h should log to-be-adapted HTTP message
headers embedded in an ICAP request (HTTP request headers in REQMOD;
HTTP response headers in RESPMOD). Before this change, %http::>h
logged constant/"FYI" HTTP request headers in RESPMOD.
- Similarly, %http::<h should log adapted HTTP message headers
embedded in an ICAP response (HTTP request headers in regular
REQMOD; HTTP response headers in RESPMOD and during request
satisfaction in REQMOD). Before this change, %http::<h logged "-" in
REQMOD.
In other words, in ICAP logging context, the ">" and "<" characters
should indicate ICAP message kind (where the being-logged HTTP message
is embedded), not HTTP message kind, even for %http codes.
TODO: %http::>h code logs "-" in RESPMOD because AccessLogEntry does
not store virgin HTTP response headers.
* Correctly initialize ICAP ALE HTTP fields related to HTTP message
sizes for icap_log, including %http::>st, %http::<st, %http::>sh, and
%http::>sh format codes.
* Initialize HttpMsg::hdr_sz in HTTP header parsing code. Among other
uses, this field is needed to initialize HTTP fields inside ICAP ALE.
* Synced default icap_log format documentation with the current code.
HTTP: do not allow Proxy-Connection to override Connection header
Proxy-Connection header is never actually valid, it is relevant in
HTTP/1.0 messages only when Connection header is missing and not
relevant at all in HTTP/1.1 messages.
This fixes part of the behaviour, making Squid use Connection header
for persistence (keep-alive vs close) checking if one is present
instead of letting Proxy-Connection override it.
TODO: Proxy-Connection still needs to be ignored completely when
the message version is HTTP/1.1.
SSL CN wildcard must only match a single domain component [fragment].
When comparing the requested domain name with a certificate Common Name,
Squid expanded wildcard to cover more than one domain name label (a.k.a
component), violating RFC 2818 requirement[1]. For example, Squid
thought that wrong.host.example.com matched a *.example.com CN.
[1] "the wildcard character * ... is considered to match any single
domain name component or component fragment. E.g., *.a.com matches
foo.a.com but not bar.foo.a.com".
In other contexts (e.g., ACLs), wildcards expand to all components.
matchDomainName() now accepts a mdnRejectSubsubDomains flag that selects
the right behavior for CN match validation.
The old boolean honorWildcards parameter replaced with a flag, for clarity
and consistency sake.
This patch also handles the cases where the host name consists only from dots
(eg malformed Host header or SNI info). The old code has undefined behaniour
in these cases. Moreover it handles the cases a certificate contain zero length
string as CN or alternate name.
HTTP: MUST always revalidate Cache-Control:no-cache responses.
Squid MUST NOT use a CC:no-cache response for satisfying subsequent
requests without successful validation with the origin server. Squid
violated this MUST because Squid mapped both "no-cache" and
"must-revalidate"/etc. directives to the same ENTRY_REVALIDATE flag
which was interpreted as "revalidate when the cached response becomes
stale" rather than "revalidate on every request".
This patch splits ENTRY_REVALIDATE into:
- ENTRY_REVALIDATE_ALWAYS, for entries with CC:no-cache and
- ENTRY_REVALIDATE_STALE, for entries with other related CC directives
like CC:must-revalidate and CC:proxy-revalidate.
Reusing ENTRY_CACHABLE_RESERVED_FOR_FUTURE_USE value for the more
forgiving ENTRY_REVALIDATE_STALE (instead of the more aggressive
ENTRY_REVALIDATE_ALWAYS) fixes the bug for the already cached entries -
they will be revalidated and stored with the correct flag on the next
request.
Squid segfault via Ftp::Client::readControlReply().
Ftp::Client::scheduleReadControlReply(), which may called from the
asynchronous start() or readControlReply()/handleControlReply()
handlers, does not check whether the control connection is still usable
before using it.
The Ftp::Server::stopWaitingForOrigin() notification may come after
Ftp::Server (or an external force) has started closing the control
connection but before the Ftp::Server job became unreachable for
notifications. Writing a response in this state leads to assertions.
Other, currently unknown paths may lead to the same write-after-close
problems. This change protects all asynchronous notification methods
(except the connection closure handler itself) from exposing underlying
code to a closing control connection. This is very similar to checking
for ERR_CLOSING in Comm handlers.
Alex Rousskov [Thu, 30 Jun 2016 20:51:54 +0000 (08:51 +1200)]
Do not make bogus recvmsg(2) calls when closing UDS sockets.
comm_empty_os_read_buffers() assumes that all non-blocking
FD_READ_METHODs can read into an opaque buffer filled with random
characters. That assumption is wrong for UDS sockets that require an
initialized msghdr structure. Feeding random data to recvmsg(2) leads to
confusing errors, at best. Squid does not log those errors, but they
are visible in, for example, strace:
recvmsg(17, 0x7fffbb, MSG_DONTWAIT) = -1 EMSGSIZE (Message too long)
comm_empty_os_read_buffers() is meant to prevent TCP RST packets. The
function now ignores UDS sockets that are not used for TCP.
TODO: Useless reads may also exist for UDP and some TCP sockets.
This change fixes logic bugs that mostly affect performance: In micro-
tests, this change gives 10% performance improvement.
maybeMakeSpaceAvailable() is called with an essentially random in.buf.
The method must prepare in.buf for the next network read. The old code
was not doing that [well enough], leading to performance problems.
In some environments, in.buf often ends up having tiny space exceeding 2
bytes (e.g., 6 bytes). This happens, for example, when Squid creates and
parses a fake CONNECT request. The old code often left such tiny in.bufs
"as is" because we tried to ensure that we have at least 2 bytes to read
instead of trying to provide a reasonable number of buffer space for the
next network read. Tiny buffers naturally result in tiny network reads,
which are very inefficient, especially for non-incremental parsers.
I have removed the explicit "2 byte" space checks: Both the new and the
old code do not _guarantee_ that at least 2 bytes of buffer space are
always available, and the caller does not check that condition either.
If some other code relies on it, more fixes will be needed (but this
change is not breaking that guarantee -- either it was broken earlier or
was never fully enforced). In practice, only buffers approaching
Config.maxRequestBufferSize limit may violate this guarantee AFAICT, and
those buffers ought to be rare, so the bug, if any, remains unnoticed.
Another subtle maybeMakeSpaceAvailable() problem was that the code
contained its own buffer capacity increase algorithm (n^2 growth).
However, increasing buffer capacity exponentially does not make much
sense because network read sizes are not going to increase
exponentially. Also, memAllocStringmemAllocate() overwrites n^2 growth
with its own logic. Besides, it is buffer _space_, not the total
capacity that should be increased. More work is needed to better match
Squid buffer size for from-user network reads with the TCP stack buffers
and traffic patterns.
Both the old and the new code reallocate in.buf MemBlobs. However, the
new code leaves "reallocate or memmove" decision to the new
SBuf::reserve(), opening the possibility for future memmove
optimizations that SBuf/MemBlob do not currently support.
It is probably wrong that in.buf points to an essentially random MemBlob
outside ConnStateData control but this change does not attempt to fix that.
Alex Rousskov [Wed, 15 Jun 2016 22:08:16 +0000 (10:08 +1200)]
Do not allow low-level debugging to hide important/critical messages.
Removed debugs() side effects that inadvertently resulted in some
important/critical messages logged at the wrong debugging level and,
hence, becoming invisible to the admin. The removed side effects set the
"current" debugging level when a debugs() parameter called a function
that also called debugs(). The last nested debugs() called affected the
level of all debugs() still in progress!
Related changes:
* Reentrant debugging messages no longer clobber parent messages. Each
debugging message is logged separately, in the natural order of
debugs() calls that would have happened if debugs() were a function
(that gets already evaluated arguments) and not a macro (that
evaluates its arguments in the middle of the call). This order is
"natural" because good macros work like functions from the caller
point of view.
* Assertions hit while evaluating debugs() parameters are now logged
instead of being lost with the being-built debugs() log line.
* 10-20% faster debugs() performance because we no longer allocate a new
std::ostringstream buffer for the vast majority of debugs() calls.
Only reentrant calls get a new buffer.
* Removed old_debug(), addressing an old "needs to die" to-do.
* Removed do_debug() that changed debugging level while testing whether
debugging is needed. Use side-effect-free Debug::Enabled() instead.
Also removed the OutStream wrapper class. The wrapper was added in trunk
revision 13767 that promised to (but did not?) MemPool the debug output
buffers. We no longer "new" the buffer stream so a custom new() method
would be unused. Besides, the r13767 explanation implied that providing
a Child::new() method would somehow overwrite Parent::allocator_type,
which did not compute for me. Finally, Squid "new"s other allocator-
enabled STL objects without overriding their new methods so either the
same problem is still there or it did not exist (or was different?).
Also removed Debug::xassert() because the debugs() assertions now work
OK without that hack.
Alex Rousskov [Sat, 21 May 2016 15:52:02 +0000 (03:52 +1200)]
Fix icons loading speed.
Since trunk r14100 (Bug 3875: bad mimeLoadIconFile error handling), each
icon was read from disk and written to Store one character at a time. I
did not measure startup delays in production, but in debugging runs,
fixing this bug sped up icons loading from 1 minute to 4 seconds.
Steve Hill [Tue, 17 May 2016 14:58:50 +0000 (02:58 +1200)]
Support unified EUI format code in external_acl_type
Squid supports %>eui as a logformat specifier, which produces an EUI-48
for IPv4 clients and an EUI-64 for IPv6 clients. However, This is not
allowed as a format specifier for the external ACLs, and you have to use
%SRCEUI48 and %SRCEUI64 instead. %SRCEUI48 is only useful for IPv4
clients and %SRCEUI64 is only useful for IPv6 clients, so supporting
both v4 and v6 is a bit messy.
Adds the %>eui specifier for external ACLs and behaves in the same way
as the logformat specifier.
Amos Jeffries [Fri, 6 May 2016 09:39:48 +0000 (21:39 +1200)]
Fix SIGSEGV in ESIContext response handling
HttpReply pointer was being unlocked without heving been locked.
Resulting in a double-free. Make it use RefCount instead of
manual locking to ensure locked/unlock is always symmetrical.
Nathan Hoad [Fri, 6 May 2016 08:15:36 +0000 (20:15 +1200)]
Prevent Squid forcing -b 2048 into the arguments for sslcrtd_program
Previously Squid assumed it was running with the default sslcrtd_program, which
takes an argument for the FS block size. This causes issues for administrators
that use their own helpers that happen to take a -b argument that means
something else entirely, causing confusion and preventing them from removing
this argument.
A summary of the changes:
* Move the block size retrieval from Squid into security_file_certgen. It
does not use fsBlockSize as that introduces a lot of dependencies on
unrelated Squid code, e.g. fde, Debug, MemBuf.
* Make the -b argument mostly redundant, but leave it there so
administrators can overrule xstatvfs.
* Fix a small typo.
This work is submitted on behalf of Bloomberg L.P.
Squid currently does not handle SSL server responses that start with
an SSL Alert Record. Squid fails to parse the Server Hello message and
also fails to detect and handle resuming sessions.
Amos Jeffries [Mon, 2 May 2016 10:51:18 +0000 (22:51 +1200)]
Bug 4501: HTTP/1.1: normalize Host header
When absolute-URI is provided Host header should be ignored. However some
code still uses Host directly so normalize it using the URL authority
value before doing any further request processing.
For now preserve the case where Host is completely absent. That matters
to the CVE-2009-0801 protection.
This also has the desirable side effect of removing multiple or duplicate
Host header entries, and invalid port values.
cachemgr.cgi: use dynamic MemBuf for internal content generation
Using a fixed size buffer limits how big content lines can be. Modern
HTTP is fast reaching the point where such limits are problematic.
Also fixes incorrect uses of snprintf() by removing them.
Add chained certificates and signing certificate to peek-then-bumped connections.
The scenario this patch addresses is when Squid is configured with an
intermediate signing CA certificate, and clients have the root CA installed on
their machines. What happens is that the generated certificates come down with
an unknown issuer (the intermediate signing certificate), with no
intermediates, so they are rejected. By adding the configured certificate chain
as old client-first mode did, the intermediate and root certificates come down
as well, resulting in the issuer being identified and the connection being
established "securely".
This work is submitted on behalf of Bloomberg L.P.
Squid crashes on startup when the parent process exit()s after fork()ing
the kid process. Squid may also crash on shutdown after exiting main().
In both cases, the crashes are build- and environment-specific. Many
environments show no problems at all. Even disabling compiler
optimizations may prevent crashes. When crashes do happen, their
symptoms (e.g., backtrace) point to problems during destruction of
global objects, but the details point to innocent objects (e.g., PortCfg
or SSL_CTX).
In some environments, the following malloc error is printed on console
before the crash: "corrupted double-linked list".
This change replaces two StatHist globals used for SBuf statistics
collection with always-available singletons. The replaced globals could
be destructed before the last SBuf object using them, leading to memory
corruption (that would eventually crash Squid).
We should not expect the exact match because, as discovered during bug
3805 (r13947) fix, shared Segment::size() may exceed the originally
requested RAM amount due to stat() page rounding done by OSes like OS X.
Unfortunately, this rounding weakens the failed consistency check a lot.
TODO: Store the exact requested size and check against that as well.
Some servers cause an SSL handshake error with peek and splice.
The problem is related to the TLS Session Tickets extension handling. Squid
expects always a TLS Session Tickets extension, included in server hello
message, to assume that the ticket accepted and the session is a resumed
session, which is not always true.
Alex Rousskov [Tue, 12 Apr 2016 06:04:23 +0000 (18:04 +1200)]
Bug 4468: NotNode (!acl) naming: Terminate the name before strncat(name).
The fix may reduce or even eliminate garbage in logged ACL names (at
least). The bug was exposed by valgrind's "Conditional jump or move
depends on uninitialised value(s)" error.
However, pinger only drops setuid/setgid, and won't drop capabilities
after sockets are opened (when it is setuid-root, setuid(getuid()) also
drops capabilities, no code changes necessary; however, if it is only
setcap'ed, setuid() is no-op).
Fix is minimally tested, seems to work fine with both/either `setcap`
and `chmod u+s`; non-linux/non-libcap configurations should not be
affected).
Amos Jeffries [Wed, 23 Mar 2016 14:00:51 +0000 (03:00 +1300)]
RFC 7725: Add registry entry for 451 status text
While Squid does not generate these messages automatically we still have
to relay the status line text accurately, and admin may want to use it
for deny_info status.
Bug description:
- The client side and server side are finished
- On server side the Ftp::Relay::finalizeDataDownload() is called and
schedules the Ftp::Server::originDataCompletionCheckpoint
- On client side the "Ftp::Server::userDataCompletionCheckpoint" is
called. This is schedules a write to control connection and closes
data connection.
- The Ftp::Server::originDataCompletionCheckpoint is called which is
trying to write to control connection and the assertion triggered.
This bug is an corner case, where the client-side (FTP::Server) should
wait for the server side (Ftp::Client/Ftp::Relay) to finish its job before
respond to the FTP client. In this bug the existing mechanism, designed
to handle such problems, did not worked correctly and resulted to a double
write response to the client.
This patch try to fix the existing mechanism as follows:
- When Ftp::Server receives a "startWaitingForOrigin" callback, postpones
writting possible responses to the client and keeps waiting for the
stopWaitingForOrigin callback
- When the Ftp::Server receives a "stopWaitingForOrigin" callback,
resumes any postponed response.
- When the Ftp::Client starts working on a DATA-related transaction, calls the
Ftp::Server::startWaitingForOrigin callback
- When the Ftp::Client finishes its job or when its abort abnormaly, checks
whether it needs to call Ftp::Server::stopWaitingForOrigin callback.
- Also this patch try to fix the status code returned to the FTP client
taking in account the status code returned by FTP server. The
"Ftp::Server::stopWaitingForOrigin" is used to pass the returned status code
to the client side.
Squid crashes on shutdown while cleaning up idle ICAP connections.
The global Adaptation::Icap::TheConfig object is automatically
destroyed when Squid exits. Its destructor destroys Icap::ServiceRep
objects that, in turn, close all open connections in the idle
connections pool. Since this happens after comm_exit has destroyed all
Comm structures associated with those connections, Squid crases.
After certain failures, FwdState::retryOrBail() may be called twice,
once from FwdState::unregisterdServerEnd() [called from
HttpStateData::swanSong()] and once from the FwdState's own connection
close handler. This may result in two concurrent connections to the
remote server, followed by an assertion upon a connection closure.
This patch:
- After HttpStateData failures, instead of closing the squid-to-peer
connection directly (and, hence, triggering closure handlers), calls
HttpStateData::closeServer() and mustStop() for a cleaner exit with
fewer wasteful side effects and better debugging.
- Creates and remembers a FwdState close handler AsyncCall so that
comm_remove_close_handler() can cancel an already scheduled callback.
The conversion to the AsyncCall was necessary because legacy [close
handler callbacks] cannot be canceled once scheduled.
William Lima [Tue, 23 Feb 2016 15:47:10 +0000 (04:47 +1300)]
Bug 3870: assertion failed: String.cc: 'len_ + len <65536' in ESI::CustomParser
The custom ESI parser used in absence of libxml2 or libexpat parsers was
restricted to handling 64KB buffers but under some conditions could expand
to over 64KB during the parse process. Hitting this assertion.
TODO: the parser can now be redesigned to make use of Tokenizer and
CharacterSet parsing tools. But that is left for later work.
Alex Rousskov [Thu, 18 Feb 2016 04:15:33 +0000 (21:15 -0700)]
Better handling of huge response headers. Fewer "BUG 3279" messages.
When we failed to parse a response, do not store the fake half-baked
response (via a replaceHttpReply() call). Doing so leads to misleading
"BUG 3279: HTTP reply without Date" messages (at best). The fake
response is only meant for continueAfterParsingHeader().
Also removed a misleading XXX that may have caused Bug 4432 in v4.0
(trunk r14548).