HTTP: MUST ignore a [revalidation] response with an older Date header.
Before this patch, Squid violated the RFC 7234 section 4 MUST
requirement: "When more than one suitable response is stored, a cache
MUST use the most recent response (as determined by the Date header
field)." This problem may be damaging in cache hierarchies where
parent caches may have different responses. Trusting the older response
may lead to excessive IMS requests, cache thrashing and other problems.
The downloadFinished() method was responsible for the job clean up, but
that asynchronous method is not called when the Downloader job quits
before the call can be fired. This early termination happens when, for
example, the job finishes while still inside the start() method (e.g., a
memory hit with no async ACLs to check). It also happens if an exception
is thrown while the job is running under async call protections.
Ensure the cleanup happens regardless of the job path to finish.
Alex Rousskov [Thu, 6 Oct 2016 22:05:50 +0000 (16:05 -0600)]
Fix known "concurrent c_str()s" violations of SBuf API.
The second c_str() call destroys the buffer still being used by the
first c_str() result, leading to many "Invalid read of size N" errors.
IMO, we must instead fix SBuf to make similar violations unlikely, but
there is currently no squid-dev consensus on whether and how to do that.
See "[RFC] Support concurrent SBuf::c_str() calls" thread on squid-dev.
Amos Jeffries [Thu, 6 Oct 2016 21:02:32 +0000 (10:02 +1300)]
Fix header name mismatching after rev.14870
When a MIME header set contains two custom headers and one
name is a prefix of the other, the name lookup using a
fixed-length String comparison can wrongly match the
longer header as equal to the shorter one, since only
the identical prefix portion is compared.
To avoid this we must check that the lengths also match.
This also improves performance very slightly, as the common
case for custom headers is to have an "X-" prefix, which is
slower to compare than the total length. Headers having the
same length and the same prefix are quite rare.
Alex Rousskov [Thu, 6 Oct 2016 00:05:38 +0000 (18:05 -0600)]
Hide OpenSSL tricks from Valgrind far-reaching initialization errors.
This change has no effect unless Squid is ./configured --with-valgrind-debug.
OpenSSL, including its Assembly code, contains many optimizations and
timing defenses that Valgrind misinterprets as uninitialized value
usage. Most of those tricks can be disabled by #defining PURIFY when
building OpenSSL, but some are not protected with PURIFY and most
OpenSSL libraries are (and should be) built without that #define.
To make matters worse, once Valgrind misdetects uninitialized memory, it
will complain about every usage of that memory. Those complaints create
a lot of noise, complicate triage, and effectively mask true bugs.
AFAICT, they cannot be suppressed by listing the source of that memory.
For example, this OpenSSL Assembly trick:
Uninitialised value was created by a stack allocation
at 0x556C2F7: aesni_cbc_encrypt (aesni-x86_64.s:2081)
Triggers many false errors like this one:
Conditional jump or move depends on uninitialised value(s)
by 0x750838: Debug::Finish()
by 0x942E68: Http::One::ResponseParser::parse(SBuf const&)
...
This change marks OpenSSL-returned decrypted bytes as initialized. This
might miss some true OpenSSL bugs, but we should focus on Squid bugs.
Alex Rousskov [Wed, 5 Oct 2016 04:24:28 +0000 (22:24 -0600)]
Fixed "Invalid read of size 1" bug in non-standard HTTP header search.
Valgrind error report:
Invalid read of size 1
at strcasecmp
by String::caseCmp(char const*) const
by HttpHeader::getByNameIfPresent(char const*, int, String&)
by HttpHeader::getByNameIfPresent(SBuf const&, String&)
by HttpHeader::getByName(SBuf const&) const
by assembleVaryKey(String&, SBuf&, HttpRequest const&)
...
The bug is probably not specific to Vary assembly and may have been
present since r14285 (gperf perfect hash refactoring).
HTTP/1.1: handle syntactically valid requests with unsupported HTTP versions
Before this change, when a syntactically valid HTTP/1 request indicated
HTTP major version 2, Squid mangled and forwarded the request as an
HTTP/1 message. Since Squid does not and cannot correctly interpret an
HTTP/1 request using HTTP/2 rules, returning an error and closing the
connection appears to be the only correct action possible.
If a version label matches the "HTTP/" 1*DIGIT "." 1*DIGIT pattern from
RFC 2616, it should not be handled as HTTP/0.9 syntax. All unacceptable
versions that begin with "HTTP/" should get a 505 (HTTP Version Not
Supported) error.
To be compliant with RFC 7230 as well:
- versions 1.2 through 1.9 are accepted and handled normally. That is a
SHOULD requirement in RFC 7230 section 2.6, final paragraph.
- other single-digit versions should get the 505 error.
- versions with multiple digits should get the 505 error.
Amos Jeffries [Mon, 3 Oct 2016 04:33:08 +0000 (17:33 +1300)]
Convert DNS shutdown to use a registered runner
This patch adds a Runner to manage the DNS component state on shutdown
(and the matching reconfigure port closures). The actions taken on
reconfigure and shutdown are not changed, just their timing.
Visible differences are now that Squid logs when DNS ports are being
closed (and the reason), also that the still-open FD table report no
longer lists the DNS listening FD as being open on shutdown when
comm_exit() is called and forces all fd_table entries to close.
NP: I have not moved the Dns::Init() operations to the runner at this
stage because it needs some deeper analysis and a redesign of which
Squid processes have DNS enabled/initialized. Currently everything from
the coordinator to the diskers enables the internal-DNS state.
One BUG found but not fixed:
DNS sockets being closed during reconfigure produces a race between
any already active connections (or ones received between closing DNS
sockets and server listening sockets) and the reconfigure completing
(Runner syncConfig() being run). Transactions which lose this race
will produce DNS timeouts (or whatever the caller set) as the queries
never get queued to be re-tried after the DNS sockets are re-opened.
We now seem to have had several patches successfully use members
declared with default values (C++11 feature) and/or with the
"*this = Foo();" shortcut for a reset/clear method.
So I think we can start using these to replace old C-style
initialization and clear() functions.
This patch begins by replacing the Config2 use of memset(). I, for one,
am constantly mistaking which of the Config objects has memset() applied
to it at the global level when reconfigure happens. Now we do not need
to care: each object handles its own clearing one way or another.
Bug 4581: Secure ICAP segfault in checkForMissingCertificates
The Security::PeerConnector::request member is NULL in the case of secure
ICAP. This patch checks if this member is NULL before using it.
Also fixes Security::PeerConnector to send the correct domain name to
the certificate validator. Currently it sends the host name from the
HttpRequest object, which is wrong not only for secure ICAP cases (where
this object does not exist) but also for cases where Squid connects to a
remote proxy using TLS.
Alex Rousskov [Wed, 21 Sep 2016 18:54:45 +0000 (12:54 -0600)]
Do reset $HOME if needed after r13435. Minimize putenv() memory leaks.
While r13435 is attributed to me, the wrong condition was not mine. This
is a good example of why "cmp() op 0" pattern is usually safer because
the "==" or "!=" operator "tells" us whether the strings are equal,
unlike "!cmp()" that is often misread as "not equal".
Make the ContextPointer for server TLS contexts extend out of
libsecurity up the stack of callers to their main place of
medium/long-term storage.
This means code outside the locations where SSL contexts are created
mostly no longer needs to worry about (non-)locking details, just
about using a smart Pointer properly.
Alex Rousskov [Wed, 21 Sep 2016 16:26:39 +0000 (10:26 -0600)]
Fixed "Invalid read of size 1" squid.conf parsing error in r14719.
Valgrind found an off-by-one ProcessMacros() length argument usage:
Invalid read of size 1
by ReplaceSubstr(char*&, int&, ...) (cache_cf.cc:323)
by SubstituteMacro(char*&, int&, ...) (cache_cf.cc:338)
by ProcessMacros(char*&, int&) (cache_cf.cc:344)
by default_line(char const*) (cf_parser.cci:13)
by default_all() (cf_parser.cci:136)
by parseConfigFile(char const*) (cache_cf.cc:581)
ProcessMacros() assumes it is modifying a c-string and needs the number
of characters in that string (i.e., length), not length+1. Needless to
say, all this error-prone c-string manipulation code should be replaced!
Also replaced potential out-of-bounds config_input_line writing code.
XXX: Parsing should fail if config_input_line is not long enough.
Alex Rousskov [Wed, 21 Sep 2016 00:49:41 +0000 (18:49 -0600)]
Cleanup cleanup: Reconfigure crashes after r13895.
Reconfiguring crashes Squids that use the tcp_outgoing_tos,
tcp_outgoing_dscp, tcp_outgoing_ds, clientside_tos, tcp_outgoing_mark,
or clientside_mark configuration directives.
Alex Rousskov [Tue, 20 Sep 2016 22:15:08 +0000 (16:15 -0600)]
Cleanup cleanup: Do not silently violate The Big Three after r14743.
Fortunately, the broken compiler-generated copy methods are not yet
used, at least not in the GCC STL implementation I tested with. If there
is a compiler that needs them for std::queue, or if we start using them,
they will need to be implemented.
Alex Rousskov [Tue, 20 Sep 2016 22:11:17 +0000 (16:11 -0600)]
Fixed ./configure bug/typo in r14394.
The bug resulted in "test: too many arguments" error messages when
running ./configure and effectively replaced AC_MSG_ERROR() with
AC_MSG_WARN() for missing but required Heimdal and GNU GSS Kerberos
libraries.
Alex Rousskov [Tue, 20 Sep 2016 18:11:35 +0000 (12:11 -0600)]
Bug 3819: "fd >= 0" assertion in file_write() during reconfiguration
Other possible assertions include "NumberOfUFSDirs" in UFSSwapDir and
"fd >= 0" in file_close(). And Fs::Ufs::UFSStoreState::drainWriteQueue()
may segfault while dereferencing nil theFile, although I am not sure
that bug is caused specifically by reconfiguration.
Since trunk r9181.3.1, reconfiguration is done in at least two steps:
First, mainReconfigureStart() closes/cleans/disables all modules
supporting hot reconfiguration and then, at the end of the event loop
iteration, mainReconfigureFinish() opens/configures/enables them again.
UFS code hits assertions if it has to log entries or rebuild swap.state
between those two steps. The tiny assertion opportunity window hides
these bugs from most proxies that are not doing enough disk I/O or are
not being reconfigured frequently enough.
Asynchronous UFS cache_dirs such as diskd were the most exposed, but
even blocking UFS caching code could probably hit [rebuild] assertions.
The swap.state rebuilding (always initiated at startup) probably did not
work as intended if reconfiguration happened during the rebuild time
because reconfiguration closed the swap.state file being rebuilt. We now
protect that swap.state file and delay rebuilding progress until
reconfiguration is over.
TODO: To squash more similar bugs, we need to move hot reconfiguration
support into the modules so that each module can reconfigure instantly.
Make the ContextPointer for client TLS contexts extend out of
libsecurity up the stack of callers to their main place of
medium/long-term storage.
This means code outside the locations where SSL contexts are created
mostly no longer needs to worry about (non-)locking details, just
about using a smart Pointer properly.
Since class SquidConfig is one of those places, this involves linking
pinger with the OpenSSL libraries. That is likely a bug to fix later.
Bug 4471: revalidation doesn't work when expired cached object lacks Last-Modified.
The bug was caused by the fact that Squid used only the Last-Modified
header value for evaluating the entry's last modification time while
making an internal revalidation request. So, without Last-Modified it
was not possible to correctly fill the If-Modified-Since header value.
As a result, the revalidation request was not initiated when it should
have been.
To fix this problem, we should generate If-Modified-Since based on other
data, showing how "old" the cached response is, such as Date and Age
header values. Both Date and Age header values are utilized while
calculating entry's timestamp. Therefore, we should use the timestamp if
Last-Modified is unavailable.
TODO: HttpRequest::imslen support looks broken/deprecated. Using this
field inside StoreEntry::modifiedSince() leads to several useless checks
and probably may cause other [hidden] problems.
Fix logged request size (%http::>st) and other size-related %codes.
The %[http::]>st logformat code should log the actual number of
[dechunked] bytes received from the client. However, for requests with
Content-Length, Squid was logging the sum of the request header size and
the Content-Length header field value instead. The logged value was
wrong when the client sent fewer than Content-Length body bytes.
Also:
* Fixed %http::>h and %http::<h format codes for icap_log. The old code
was focusing on preserving the "request header" (i.e. not "response
header") meaning of the %http::>h logformat code, but since ICAP
services deal with (and log) both HTTP requests and responses, that
focus on the HTTP message kind actually complicates ICAP logging
configuration and will eventually require introduction of new %http
logformat codes that would be meaningless in access.log context.
- In ICAP context, %http::>h should log to-be-adapted HTTP message
headers embedded in an ICAP request (HTTP request headers in REQMOD;
HTTP response headers in RESPMOD). Before this change, %http::>h
logged constant/"FYI" HTTP request headers in RESPMOD.
- Similarly, %http::<h should log adapted HTTP message headers
embedded in an ICAP response (HTTP request headers in regular
REQMOD; HTTP response headers in RESPMOD and during request
satisfaction in REQMOD). Before this change, %http::<h logged "-" in
REQMOD.
In other words, in ICAP logging context, the ">" and "<" characters
should indicate ICAP message kind (where the being-logged HTTP message
is embedded), not HTTP message kind, even for %http codes.
TODO: %http::>h code logs "-" in RESPMOD because AccessLogEntry does
not store virgin HTTP response headers.
* Correctly initialize ICAP ALE HTTP fields related to HTTP message
sizes for icap_log, including %http::>st, %http::<st, %http::>sh, and
%http::>sh format codes.
* Initialize HttpMsg::hdr_sz in HTTP header parsing code. Among other
uses, this field is needed to initialize HTTP fields inside ICAP ALE.
* Synced default icap_log format documentation with the current code.
Squid crashes on shutdown while cleaning up idle ICAP connections, part2
Further polishing of the r14825 patch:
- Replace RegisterRunner(this) calls with Independent::registerRunner()
calls for IndependentRunner kid classes.
- RegisterRunner() now throws if called for an IndependentRunner.
- Fix Independent::registerRunner() to not use GetRidOfRunner, which
confuses Coverity checks and makes the code more complex.
- Declare the static FindRunners as an inlined function.
Convert to an STL set instead of CBDATA list:
* The list is not passed as a callback parameter, so CBDATA overheads are
unnecessary.
* STL set has built-in unique entry protection, so special
push_back_unique handling is not required. Just emplace() entries.
* STL unordered_set is used for its fast lookup property. This should
operate faster on medium or larger sized ACL lists than the CbDataList
type could.
- Remove the CertError copy constructor and assignment operator. The
compiler-generated versions are enough.
- Implement the equality operators as C++ inlined members.
Squid crashes on shutdown while cleaning up idle ICAP connections, part2
Fixes to allow 'make check' to work:
- Fix tests/stub_pconn.cc to correctly implement a stub IdleConnList.
- Replace reinterpret_cast with static_cast to avoid errors when clang
is used. IdleConnList is now a hash_link kid, and static_cast is the
correct cast.
Squid crashes on shutdown while cleaning up idle ICAP connections.
The global Adaptation::Icap::TheConfig object is automatically destroyed
when Squid exits. Its destructor destroys Icap::ServiceRep objects that,
in turn, close all open connections in the idle connections pool. Since
this happens after comm_exit has destroyed all Comm structures
associated with those connections, Squid crashes.
This patch uses updated RegisteredRunners API to close all connections
in existing connections pools before Squid cleans up the Comm subsystem.
Also added a new IndependentRunner class to the RegisteredRunners API,
which must be used for runners that are destroyed by external forces,
possibly while still being registered. IndependentRunner objects
unregister themselves from Runners registry when they are destroyed.
The finishShutdown() method is now virtual and may be overridden to
implement independent runner cleanup during the main loop (and/or to support
complex cleanup actions that require calling other virtual methods). The
method is still called for all registered runners but it no longer
destroys the runner. For dependent runners, the destruction happens soon
after the finishShutdown event ends but also during the main loop
(unless the runner has managed to register after the main loop ended).
HTTP: do not allow Proxy-Connection to override Connection header
The Proxy-Connection header is never actually valid; it is relevant in
HTTP/1.0 messages only when the Connection header is missing, and not
relevant at all in HTTP/1.1 messages.
This fixes part of the behaviour, making Squid use the Connection header
for persistence (keep-alive vs close) checking when one is present,
instead of letting Proxy-Connection override it.
TODO: Proxy-Connection still needs to be ignored completely when
the message version is HTTP/1.1.
Wrong error_depth value printed with %ssl::<cert_errors
This patch fixes the following problems:
- Squid stores the depth information only for the first certificate error.
- The Ssl::CertError assignment operator does not update the depth member.
- The Ssl::CertError equal/not-equal operators do not check the depth
information.
SSL CN wildcard must only match a single domain component [fragment].
When comparing the requested domain name with a certificate Common Name,
Squid expanded the wildcard to cover more than one domain name label
(a.k.a. component), violating an RFC 2818 requirement[1]. For example,
Squid thought that wrong.host.example.com matched a *.example.com CN.
[1] "the wildcard character * ... is considered to match any single
domain name component or component fragment. E.g., *.a.com matches
foo.a.com but not bar.foo.a.com".
In other contexts (e.g., ACLs), wildcards expand to all components.
matchDomainName() now accepts a mdnRejectSubsubDomains flag that selects
the right behavior for CN match validation.
The old boolean honorWildcards parameter is replaced with a flag, for
clarity and consistency's sake.
This patch also handles the cases where the host name consists only of
dots (e.g., a malformed Host header or SNI info); the old code had
undefined behaviour in these cases. It also handles the cases where a
certificate contains a zero-length string as the CN or an alternate name.
HTTP: MUST always revalidate Cache-Control:no-cache responses.
Squid MUST NOT use a CC:no-cache response for satisfying subsequent
requests without successful validation with the origin server. Squid
violated this MUST because Squid mapped both "no-cache" and
"must-revalidate"/etc. directives to the same ENTRY_REVALIDATE flag
which was interpreted as "revalidate when the cached response becomes
stale" rather than "revalidate on every request".
This patch splits ENTRY_REVALIDATE into:
- ENTRY_REVALIDATE_ALWAYS, for entries with CC:no-cache and
- ENTRY_REVALIDATE_STALE, for entries with other related CC directives
like CC:must-revalidate and CC:proxy-revalidate.
Reusing ENTRY_CACHABLE_RESERVED_FOR_FUTURE_USE value for the more
forgiving ENTRY_REVALIDATE_STALE (instead of the more aggressive
ENTRY_REVALIDATE_ALWAYS) fixes the bug for the already cached entries -
they will be revalidated and stored with the correct flag on the next
request.
Alex Rousskov [Tue, 6 Sep 2016 02:45:20 +0000 (14:45 +1200)]
HTTP: validate Content-Length header values
Squid is violating HTTP MUSTs by forwarding messages with
problematic Content-Length values. Some of those bugs were fixed in
trunk r14215. This change handles multiple Content-Length values inside
one header field, negative values, and trailing garbage. Handling the
former required a change in the overall Content-Length interpretation
approach (which is why it was previously left as a TODO).
Squid now passes almost all Co-Advisor tests devoted to this area. We
are not 100% done though: We still need to handle malformed values with
leading signs (e.g., "-0" or "+1"). However, I hope that the remaining
problems are relatively minor. I do not plan on addressing them in the
foreseeable future.
Also improved httpHeaderParseOffset(): Added detection of overflowing
and underflowing integer values; polished malformed value detection code
(Linux strtoll(3) manual page has a good example). The function no
longer considers empty strings valid and reports trailing characters.
The function still accepts leading whitespace and signs. It is still the
wrong approach to HTTP numbers parsing, but further improvements are out
of scope because they are complicated and would require significant
caller rewrites.
MUST respond with 414 (URI Too Long) when request target exceeds limits.
Before the fix, Squid simply closed client connection after receiving a
huge URI (or a huge request-line), violating the RFC 7230 MUST. This
happened because a high-level Must(have buffer space) check in
ConnStateData::clientParseRequests() would throw an exception. Now these
problems are detected inside the low-level RequestParser code, where we
can distinguish huge URIs from huge methods.
Alex Rousskov [Fri, 19 Aug 2016 02:32:01 +0000 (20:32 -0600)]
Do not log error:transaction-end-before-headers after invalid requests.
Squid was not consuming read leftovers after failing to parse a request.
Starting with r14752, those leftovers were misinterpreted as another
unparsed request, creating an extra error:transaction-end-before-headers
access.log line after every error:invalid-request line (and probably
after every error:request-too-large line).
To stop Squid from accidentally reading new bytes and misinterpreting
them as another request, I was tempted to also clear flags.readMore
after consuming unparsable leftovers. In my tests, the flag is cleared
in ConnStateData::quitAfterError() called from clientTunnelOnError(),
but that logic looks rather fragile. I resisted the temptation to
improve it because controlling reads is a complicated matter (especially
in on_unsupported_protocol context) outside this logging fix scope.
Alex Rousskov [Wed, 17 Aug 2016 23:11:56 +0000 (17:11 -0600)]
Do not access-log chunked non-persistent responses with _ABORTED suffix.
The lack of Content-Length response header and STREAM_FAILED are not
closely related. ClientStreams should be capable of delivering complete
[chunked] responses without a Content-Length header. If the peer
response was not chunked and lacked Content-Length, then we will return
STREAM_FAILED when a false checkTransferDone() result indicates that we
did not get everything yet but flags.complete forces us to stop sending.
Alex Rousskov [Wed, 17 Aug 2016 22:46:38 +0000 (16:46 -0600)]
Do not access-log SslBump-faked CONNECTs with _ABORTED suffixes.
All successful transactions, including fake CONNECTs, must be removed
from the pipeline (via a finished() call) to avoid _ABORTED suffix
added by Pipeline::terminateAll() which always calls noteIoError().
Also added a check that the remaining request in the pipeline is the
expected CONNECT. Old comments said as much but the code did not throw
if a different request was pipelined (and asserted if more than one
requests were). We now throw in all these problematic cases. I cannot
track all ConnStateData::getSslContextStart() callers deep enough to
be sure that only a single CONNECT can get through, but the change
reflects the expected behavior (and it may be useful to discover any
reality deviations from our expectations).
Amos Jeffries [Wed, 17 Aug 2016 00:38:25 +0000 (12:38 +1200)]
Better support for unknown URL schemes
Squid already contains AnyP::PROTO_UNKNOWN support for unknown protocols
but currently does not preserve the actual string value received for them.
This adds a textual representation ('image') to the UriScheme object to
fill that gap and ensure that all URL representations (ie cache keys,
logs and outgoing messages) are generated with the scheme string as it
was received rather than implicitly via a registered protocol type.
Future work:
* add ACL support for arbitrary scheme names
* support for comparisons of unknown schemes
Alex Rousskov [Sun, 14 Aug 2016 01:17:57 +0000 (19:17 -0600)]
Fixed build on systems needing explicit <utility> in RegexPattern.cc.
... after r14795.
Also removed bogus comments implying that std::move() does something to
its argument. That function is just a type cast; it does nothing else!
We use std::move() so that the compiler picks a move constructor or
assignment (where available). In case of RegexPattern::flags, using
std::move() has no effect because integers do not have those move
methods. However, it may be a good idea to use std::move() anyway
because the type of RegexPattern::flags might change in the future and
for consistency sake. Thus, I did not remove std::move(), only comments.
Amos Jeffries [Sat, 13 Aug 2016 10:12:06 +0000 (22:12 +1200)]
Bug 4561: Replace use of default move operators with explicit implementation
The '= default' syntax for the move assignment operator and constructor
can result in double-free crashes when the class type contains pointer
members, since the trivial move behaves like std::memmove() and does not
perform the required zeroing/nullptr-ing of the temporary source object
being moved.
Make Squid death due to overloaded helpers optional.
Added on-persistent-overload=action option to helpers. Helper overload
is defined as running with an overflowing queue. Persistent helper
overload is [still] defined as being overloaded for more than 3 minutes.
The default behavior is unchanged(*): the Squid worker dies with a fatal
error at the attempt to submit a new request to a persistently
overloaded helper. This default behavior can also be configured
explicitly using on-persistent-overload=die.
With on-persistent-overload=ERR, when dealing with a persistently
overloaded helper, Squid immediately skips the helper request and sends
an ERR response to the caller. Squid informs the admin when it starts
and when it stops skipping helper requests due to persistent overload.
The code had conflicting notions of an "overloaded helper". The external
ACL helper, the URL rewriter, and the store ID code used queueFull() to
test whether the new request would overflow the queue (and, hence,
overload the helper), but queueFull() itself did not check whether the
queue was full! It checked whether the queue was already overflowing.
This confusion resulted in that code scheduling one extra helper request
before enabling bypass. The code and its documentation are now more
consistent (and better match the "overload" terminology used by the new
configuration option, which also feels better than calling the helper
"full").
(*) Resolving the above confusion resulted in minor (one request)
differences in the number of helper requests queued by Squid for
external ACL, URL rewriting, and store ID helpers, with the adjusted
behavior [better] matching the documentation.