Alex Rousskov [Sun, 24 Feb 2019 03:28:47 +0000 (03:28 +0000)]
Fixed squidclient authentication after 4b19fa9 (Bug 4843 pt2) (#373)
* squidclient -U sent Proxy-Authorization instead of Authorization.
Code duplication bites again.
* squidclient -U and -u could send random garbage after the correct
[Proxy-]Authorization value as exposed by Coverity CID 1441999: Unused
value (UNUSED_VALUE). Coverity missed this deeper problem, but
analyzing its report led to the discovery of the two bugs fixed here.
Also reduced authentication-related code duplication.
Bug 4864: !Comm::MonitorsRead assertion in maybeReadVirginBody() (#351)
This assertion is probably triggered when Squid retries/reforwards
server-first or step2+ bumped connections (after they fail).
Retrying/reforwarding such pinned connections is wrong because the
corresponding client-to-Squid TLS connection was negotiated based on the
now-failed Squid-to-server TLS connection, and there is no mechanism to
ensure that the new Squid-to-server TLS connection will have exactly the
same properties. Squid should forward the error to client instead.
Also fixed peer selection code that could return more than one PINNED
path, with only the first path having the destination of the actual
pinned connection. To reduce the chances of similar future bugs, and to
polish the code, peer selection now returns a nil path to indicate a
PINNED decision. After all, the selection code decides to use a pinned
connection (whatever it is) rather than a specific pinned _destination_.
Added %proxy_protocol::>h logformat code for logging received PROXY
protocol TLV values and passing them to adaptation services and helpers.
For simplicity, this implementation treats TLV values as text and reuses
the existing HTTP header field logging interfaces. Support for binary
TLV value logging can be added later if needed.
Also support logging of metadata extracted from the fixed portion of a
PROXY protocol header and referenced as "pseudo headers": :command,
:version, :src_addr, :dst_addr, :src_port, and :dst_port.
Also fixed several bugs in the old PROXY protocol v1/v2 parsing code:
* Buffer overrun in ConnStateData::parseProxy2p0(). The available
local SBuf could be less than sizeof(pax), resulting in copying
excessive bytes with SBuf::rawContent().
* Incorrect processing of malformed v1 headers lacking CRLF in
ConnStateData::parseProxy1p0(), which waited for more data
even if the buffer size already exceeded the maximum v1 header
size.
* Incorrect processing of partial-buffered v1 headers, when only
the initial (magic) part of the header has been received. The old
code resulted in an error instead of waiting for more data in this
case.
* Incorrect v1 header framing for proto=UNKNOWN headers.
The code used only LF while the protocol requires CRLF for it.
* Do not use address information from v2 header if the header
proto is UNSPEC.
* Incorrect v1 magic expectations (a 6-character `PROXY ` instead of the
proper 5-character `PROXY` sequence) leading to mishandling of
non-PROXY input. For example, receiving `PROXY\r\n` would result in
"need more data" outcome (and long timeout) instead of an immediate
error.
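The v1 framing rules listed above can be sketched as a small classifier. This is a minimal illustration, not Squid's parser: the V1Outcome type and the classifyV1() function are hypothetical names, and the 107-byte limit is the maximum v1 line length (CRLF included) from the PROXY protocol specification.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical outcome type; Squid's real parser reports results differently.
enum class V1Outcome { NeedMoreData, Error, HeaderFound };

// The PROXY protocol spec limits a v1 header line to 107 bytes, CRLF included.
constexpr std::size_t MaxV1HeaderSize = 107;

// Classifies the bytes buffered so far as a PROXY v1 header candidate.
V1Outcome classifyV1(const std::string &buf)
{
    static const std::string magic = "PROXY"; // 5 characters, no trailing space

    // Compare only the bytes we have: a partial magic is not an error yet.
    const std::size_t n = buf.size() < magic.size() ? buf.size() : magic.size();
    if (buf.compare(0, n, magic, 0, n) != 0)
        return V1Outcome::Error; // not PROXY input at all
    if (buf.size() < magic.size())
        return V1Outcome::NeedMoreData;

    // v1 headers, including proto=UNKNOWN ones, end with CRLF (not bare LF).
    const std::size_t eol = buf.substr(0, MaxV1HeaderSize).find("\r\n");
    if (eol == std::string::npos)
        return buf.size() >= MaxV1HeaderSize ? V1Outcome::Error : V1Outcome::NeedMoreData;

    // A complete line must have a space-separated field after the magic.
    if (eol == magic.size() || buf[magic.size()] != ' ')
        return V1Outcome::Error;
    return V1Outcome::HeaderFound;
}
```

With this sketch, `PROXY\r\n` is rejected immediately, a partial magic such as `PROX` waits for more data, and an unterminated line that already fills the maximum header size is an error.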
Also eliminated code duplication in HttpHeader::getByNameListMember()
and HttpHeader::getListMember(), moving the common part into a
separate getListMember() method.
Also eliminated code duplication and probably fixed a bug with applying
client_netmask parameter in ConnStateData constructor. The mask should
not be applied to localhost and IPv6 addresses but was.
Also parse PROXY protocol v2 LOCAL addresses (for logging purposes). In
compliance with the PROXY protocol specs, LOCAL addresses are still
unused for connection routing.
Restored the natural order of the following two notifications:
* BodyConsumer::noteMoreBodyDataAvailable() and
* BodyConsumer::noteBodyProductionEnded() or noteBodyProducerAborted().
Commit b599471 unintentionally reordered those two notifications. Client
kids (and possibly other BodyConsumers) relied on the natural order to
end their work. If an HttpStateData job was done with the Squid-to-peer
connection and only waiting for the last adapted body bytes, it would
get stuck and leak many objects. This use case was not tested during
b599471 work.
Reuse reserved Negotiate and NTLM helpers after an idle timeout (#59)
Squid can be killed or maimed by enough clients that start multi-step
connection authentication but never follow up with the second HTTP
request while keeping their HTTP connection open. Affected helpers
remain in the "reserved" state and cannot be reused for other clients.
Observed helper exhaustion has happened without any malicious intent.
To address the problem, we add a helper reservation timeout. Timed out
reserved helpers may be reused by new clients/connections. To minimize
problems with slow-to-resume-authentication clients, timed out reserved
helpers are not reused until there are no unreserved running helpers
left. The reservations are tracked using unique integer IDs.
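The selection policy described above might look roughly like the following sketch. The Helper struct and pickHelper() function are hypothetical and do not model the unique reservation IDs; they only illustrate that timed out reservations are taken over as a last resort.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

using Clock = std::chrono::steady_clock;

// Hypothetical helper record; not Squid's actual helper class.
struct Helper {
    bool reserved = false;
    Clock::time_point reservedAt{};
};

// Picks a helper for a new client: any unreserved helper wins; a reserved
// helper is taken over only if its reservation timed out and no unreserved
// helper is left. Returns the helper index or -1 if none is available.
int pickHelper(const std::vector<Helper> &helpers,
               Clock::time_point now,
               std::chrono::seconds reservationTimeout)
{
    int expired = -1;
    for (std::size_t i = 0; i < helpers.size(); ++i) {
        const Helper &h = helpers[i];
        if (!h.reserved)
            return static_cast<int>(i); // preferred: leaves slow clients alone
        if (expired < 0 && now - h.reservedAt >= reservationTimeout)
            expired = static_cast<int>(i);
    }
    return expired; // -1 when every helper holds a live reservation
}
```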
Also fixed Squid crashes caused by unexpected helper termination -- the
raw UserRequest::authserver pointer could point to a deleted helper.
mahdi1001 [Sun, 10 Feb 2019 08:08:55 +0000 (08:08 +0000)]
Add support for buffer-size= to UDP logging (#359)
Allow admin control of buffering for log outputs written to UDP
receivers using the buffer-size= parameter.
buffer-size=0byte disables buffering and sends UDP packets
immediately regardless of line size.
When non-0 values are used, lines shorter than the buffer may be
delayed and aggregated into a later UDP packet.
Log lines larger than the buffer size will be sent immediately
and may trigger delivery of previously buffered content to
retain log order (at time of send, not UDP arrival).
To avoid truncation problems known with common recipients
the buffer size remains capped at 1400 bytes.
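A minimal sketch of the buffering rules above, assuming a hypothetical UdpLogBuffer type that records datagram payloads in memory instead of writing to a socket. Flushing before a line that no longer fits the remaining space is an assumption of this sketch, not something the text states.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Cap named in the text above.
constexpr std::size_t MaxUdpBufferSize = 1400;

// Hypothetical model of a buffered UDP log writer; "sent" records each
// datagram payload instead of writing to a socket.
struct UdpLogBuffer {
    explicit UdpLogBuffer(std::size_t requested):
        capacity(requested < MaxUdpBufferSize ? requested : MaxUdpBufferSize) {}

    void writeLine(const std::string &line) {
        if (capacity == 0 || line.size() > capacity) {
            flush();              // keep log order: older lines go out first
            sent.push_back(line); // too big to buffer (or buffering is off)
            return;
        }
        if (pending.size() + line.size() > capacity)
            flush();              // assumption: make room rather than split a line
        pending += line;
    }

    void flush() {
        if (!pending.empty()) {
            sent.push_back(pending);
            pending.clear();
        }
    }

    const std::size_t capacity;
    std::string pending;
    std::vector<std::string> sent;
};
```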
Amos Jeffries [Fri, 25 Jan 2019 16:21:38 +0000 (16:21 +0000)]
Cleanup and simplify unit tests (#336)
Polish unit tests to reduce their dependency lists in line with
the updated test documentation requirements.
Some further cleanup of unit tests documentation based on
experience pruning existing unit tests.
* Only the test logic files actually need to be distributed by
a test. All files being tested are supposed to be distributed
from elsewhere and the test should rely on that to prevent
future issues like the base/RefCount.h bug mentioned below.
* Starting to deprecate TESTSOURCES which is causing more
trouble than benefit by pulling in too many needless
dependencies. eg the globals.cc objects and symbols. Tools it
used to provide are largely superseded by the stub mechanisms.
Add missing stub files necessary for pruning dependencies. Also
fix some bugs in existing stub files.
Fix base/RefCount.h distribution. This file was not included in
the base/libbase.la dependency list but was being distributed
indirectly because a unit test listed it. In some builds, that
could cause it to be missing from generated minimal tarballs.
Systems which have been partially 'IPv6 disabled' may allow
sockets to be opened and used but lack the IPv6 loopback
address.
Implement the outstanding TODO to detect such failures and
disable IPv6 support properly within Squid when they are found.
This should fix bug 4915 auth_param helper startup and similar
external_acl_type helper issues. For security such helpers are
not permitted to use the machine default IP address which is
globally accessible.
Fail Rock swapout if the disk dropped some of the write requests (#352)
Detecting dropped writes earlier is more than a TODO: If the last entry
write was successful, the whole entry becomes available for hits
immediately. IpcIoFile::checkTimeouts() that runs every 7 seconds
(IpcIoFile::Timeout) would eventually notify Rock about the timeout,
allowing Rock to release the failed entry, but that notification may
be too late.
The precise outcome of hitting an entry with a missing on-disk slice is
unknown (because the bug was detected by temporary hit validation code
that turned such hits into misses), but SWAPFAIL is the best we could
hope for.
Initialize StoreMapSlice when reserving a new cache slot (#350)
Rock sets the StoreMapSlice::next field when sending a slice to disk. To
avoid writing slice A twice, Rock allocates a new slice B to prime
A.next right before writing A. Scheduling A's writing and, sometimes,
lack of data to fill B create a gap between B's allocation and B's
writing (which sets B.next). During that time, A.next points to B, but
B.next is untouched.
If writing slice A or swapout in general fails, the chain of failed
entry slices (now containing both A and B) is freed. If untouched B.next
contains garbage, then freeChainAt() adds "random" slices after B to the
free slice pool. Subsequent swapouts use those incorrectly freed slices,
effectively overwriting portions of random cache entries, corrupting the
cache.
How did B.next get dirty in the first place? freeChainAt() cleans the
slices it frees, but Rock also makes direct noteFreeMapSlice() calls.
Shared memory cache may have avoided this corruption because it makes no
such calls.
Ipc::StoreMap::prepFreeSlice() now clears allocated slices. Long-term,
we may be able to move free slice management into StoreMap to automate
this cleanup.
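The corruption scenario and the fix can be illustrated with a simplified, single-process model. The Slice and SliceMap types below are hypothetical; the real Ipc::StoreMap lives in shared memory and is more involved.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, single-process model of a slice map.
struct Slice {
    int next; // index of the next slice in the chain; -1 ends the chain
};

struct SliceMap {
    explicit SliceMap(int n): slices(n, Slice{-1}) {}

    // Called when a free slice is handed out: wipe stale chain links so
    // that a later freeChainAt() cannot follow garbage.
    void prepFreeSlice(int id) { slices[id].next = -1; }

    // Frees the chain starting at id and returns the freed slice IDs.
    std::vector<int> freeChainAt(int id) {
        std::vector<int> freed;
        while (id >= 0) {
            const int next = slices[id].next;
            slices[id].next = -1;
            freed.push_back(id);
            id = next;
        }
        return freed;
    }

    std::vector<Slice> slices;
};
```

If slice B is handed out with a stale next value, freeing the A-B chain also frees whatever slice that stale value names. Clearing the slice at allocation time stops the walk at B.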
Also simplified and polished slot allocation code a little, removing the
Rock::IoState::reserveSlotForWriting() middleman. This change also
improves the symmetry between Rock and shared memory cache code.
Before this fix, Squid sometimes logged the following error:
BUG: Worker I/O pop queue for ... overflow: ...
The bug could result in truncated hit responses, reduced hit ratio, and,
combined with buggy lost I/O handling code (GitHub PR #352), even cache
corruption.
The bug could be triggered by the following sequence of events:
* Disker dequeues one I/O request from the worker push queue.
* Worker pushes more I/O requests to that disker, reaching 1024 requests
in its push queue (QueueCapacity or just "N" below). No overflow here!
* Worker process is suspended (or is just too busy to pop I/O results).
* Disker satisfies all 1+N requests, adding each to the worker pop queue
and overflows that queue when adding the last processed request.
This fix limits worker push so that the sum of all pending requests
never exceeds (pop) queue capacity. This approach will continue to work
even if diskers are enhanced to dequeue multiple requests for seek
optimization and/or priority-based scheduling.
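The push limit can be sketched with plain counters. The IoQueues type is hypothetical and ignores the lock-free shared memory queues Squid actually uses; it only shows the invariant that the number of pending requests never exceeds the pop queue capacity.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical counters for one worker/disker pair.
struct IoQueues {
    explicit IoQueues(std::size_t n): capacity(n) {}

    std::size_t pending() const { return pushed + inDisker + popReady; }

    // Worker side: refuse a new request unless every pending request,
    // including this one, could later fit into the pop queue at once.
    bool push() {
        if (pending() >= capacity)
            return false;
        ++pushed;
        return true;
    }

    // Disker side: dequeue one request from the push queue.
    bool dequeue() {
        if (pushed == 0 || inDisker != 0)
            return false;
        --pushed;
        ++inDisker;
        return true;
    }

    // Disker side: finish the current request and queue its result.
    bool finish() {
        if (inDisker == 0)
            return false;
        --inDisker;
        ++popReady;
        return true;
    }

    const std::size_t capacity; // "N" in the text above
    std::size_t pushed = 0;     // requests waiting in the push queue
    std::size_t inDisker = 0;   // request being worked on by the disker
    std::size_t popReady = 0;   // results waiting in the pop queue
};
```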
Pop queue and push queue can still accommodate N requests each. The fix
appears to reduce supported disker "concurrency" levels from 2N down to
N pending I/O requests, reducing queue memory utilization. However, the
actual reduction is from N+1 to N: Since a worker pops all its satisfied
requests before queuing a new one, there could never be more than N+1
pending requests (N in the push queue and 1 worked on by the disker).
We left the BUG reporting and handling intact. There are no known bugs
in that code now. If the bug never surfaces again, it can be replaced
with code that translates low-level queue overflow exception into a
user-friendly TextException.
Alex Rousskov [Tue, 8 Jan 2019 15:14:18 +0000 (15:14 +0000)]
Fix BodyPipe/Sink memory leaks associated with auto-consumption (#348)
Auto-consumption happens (and could probably leak memory) in many cases,
but this leak was exposed by an eCAP service that blocked or replaced
virgin messages.
The BodySink job termination algorithm relies on body production
notifications. A BodySink job created after the body production had
ended can never stop and, hence, leaks (leaking the associated BodyPipe
object with it). Such a job is also useless: If production is over,
there is no need to free space for more body data! This change avoids
creating such leaking and useless jobs.
Amos Jeffries [Sun, 6 Jan 2019 13:22:19 +0000 (13:22 +0000)]
Bug 4875 pt2: GCC-8 compile errors with -O3 optimization (#288)
GCC-8 warnings exposed at -O3 optimization cause its
own static analyzer to detect that optimized code is eliding
initialization on paths that do not use the
configuration variables.
Refactor the parseTimeLine() API to return the parsed
values so that there is no need to initialize anything prior
to parsing.
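The general idea of returning the parsed value instead of filling a caller-initialized variable can be sketched as follows. The parseTimeMs() function, its input format, and its units are hypothetical; this is not the actual parseTimeLine() signature.

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a "number unit" time parser. It returns the
// parsed value (in milliseconds) and throws on errors, so callers never
// need to pre-initialize an output variable.
long parseTimeMs(const std::string &text)
{
    std::istringstream in(text);
    long number = 0;
    std::string unit;
    if (!(in >> number >> unit) || number < 0)
        throw std::invalid_argument("malformed time value: " + text);

    if (unit == "millisecond" || unit == "milliseconds")
        return number;
    if (unit == "second" || unit == "seconds")
        return number * 1000;
    if (unit == "minute" || unit == "minutes")
        return number * 60 * 1000;
    throw std::invalid_argument("unknown time unit: " + unit);
}
```

A caller can then write `const long timeout = parseTimeMs(text);` with no earlier dummy initialization for the compiler to question.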
Amish [Wed, 2 Jan 2019 11:51:45 +0000 (11:51 +0000)]
basic_ldap_auth: Return BH on internal errors; polished messages (#347)
Basic LDAP auth helper now returns BH instead of ERR in case of errors
other than LDAP_SECURITY_ERROR, per helper guidelines.
Motivation: I have a wrapper around Basic LDAP auth helper. If an LDAP
server is down, then the helper returns BH, and the wrapper uses
a fallback authentication source.
Also converted printf() to SEND_*() macros and reduced message
verbosity.
Daris A Nevil [Mon, 17 Dec 2018 17:38:01 +0000 (17:38 +0000)]
Add %ssl::<cert macro for logging server X.509 certificate (#316)
We have chosen the PEM format instead of, for example, raw DER format
because most programs exchange certificates using PEM format and because
logging raw binary values would be unusual for Squid logformat.
The current support is limited to SslBump step3 which parses and stores
the peer certificate. TODO: Support all from-Squid TLS connections.
Fixed forward_max_tries documentation and implementation (#277)
Before 1c8f25b, FwdState::n_tries counted the total number of forwarding
attempts, including pinned and persistent connection retries. Since that
revision, it started counting just those retries. What should n_tries
count? The counter is used to honor the forward_max_tries directive, but
that directive was documented to limit the number of _different_ paths
to try. Neither 1c8f25b~1 nor 1c8f25b code matched that documentation!
Continuing to count just pinned and persistent connection retries (as in
1c8f25b) would violate any reasonable forward_max_tries intent and admin
expectations. There are two ways to fix this problem, synchronizing code
and documentation:
* Count just the attempts to use a different forwarding path, matching
forward_max_tries documentation but not what Squid has ever done. This
approach makes it difficult for an admin to limit the total number of
forwarding attempts in environments where, say, the second attempt is
unlikely to succeed and will just incur wasteful delays (Squid bug
4788 report is probably about one such use case). Also,
implementing this approach may be more difficult because it requires
adding a new counter for retries and, for some interpretations of
"different", even a container of previously visited paths.
* Count all forwarding attempts (as before 1c8f25b) and adjust
forward_max_tries documentation to match this historical behavior.
This approach does not have known unique flaws.
Also fixed FwdState::n_tries off-by-one comparison bug discussed during
Squid bug 4788 triage.
Also fixed admin concern behind Squid bug 4788 "forward_max_tries 1 does
not prevent some retries": While the old forward_max_tries documentation
actually excluded pconn retries, technically invalidating the bug
report, the admin now has a knob to limit those retries.
Amos Jeffries [Fri, 16 Nov 2018 06:04:45 +0000 (06:04 +0000)]
SourceLayout: Cleanup unit tests (#324)
Polish up and add some structure to the unit tests in
src/Makefile.am. Documenting current best practice for new test
additions and shuffling of lines around to make the tests easily
managed in future, consistent with that practice.
There are two functional changes amongst the non-logic changes:
1) testUfs and testRock src/fs/ tests are updated to use
automake conditionals to wrap the entire set of automake lists,
not just the part adding the test binary to those built. This
reduces the compiler work creating dependency information for
them in builds where they are not going to be used.
2) testRefCount has missing LDFLAGS list added
Also, shuffling of most SOURCES lists into nodist_*_SOURCES to
comply with the documented practice is left to followup work
pruning those lists down.
chi-mf [Tue, 30 Oct 2018 04:48:40 +0000 (04:48 +0000)]
Fix netdb exchange with a TLS cache_peer (#307)
Squid uses http-scheme URLs when sending netdb exchange (and possibly
other) requests to a cache_peer. If a DIRECT path is selected for that
cache_peer URL, then Squid sends a clear text HTTP request to that
cache_peer. If that cache_peer expects a TLS connection, it will reject
that request (with, e.g., error:transaction-end-before-headers),
resulting in an HTTP 503 or 504 netdb fetch error.
Work around this by adding an internalRemoteUri() parameter to indicate
whether https or http URL scheme should be used. Netdb fetches from
CachePeer::secure peers now get an https scheme and, hence, a TLS
connection.
chi-mf [Thu, 25 Oct 2018 13:33:06 +0000 (13:33 +0000)]
Update netdb when tunneling requests (#314)
Updating netdb on tunneled transactions (e.g., CONNECT requests) is
especially important for origin servers that are only reached via
tunnels. Without updates, requests for such sites may always go through a
cache_peer, even if a direct connection to them is much faster.
flozilla [Wed, 24 Oct 2018 12:12:01 +0000 (14:12 +0200)]
Fix memory leak when parsing SNMP packet (#313)
SNMP queries denied by snmp_access rules and queries with certain
unsupported SNMPv2 commands were leaking a few hundred bytes each. Such
queries trigger "SNMP agent query DENIED from..." WARNINGs in cache.log.
Certificate fields injection via %D in ERR_SECURE_CONNECT_FAIL (#306)
%ssl_subject, %ssl_ca_name, and %ssl_cn values were not properly escaped
when %D code was expanded in HTML context of the ERR_SECURE_CONNECT_FAIL
template. This bug affects all ERR_SECURE_CONNECT_FAIL page templates
containing %D, including the default template.
Other error pages are not vulnerable because Squid does not populate %D
with certificate details in other contexts (yet).
Thanks to Nikolas Lohmann [eBlocker] for identifying the problem.
TODO: If those certificate details become needed for ACL checks or other
non-HTML purposes, make their HTML-escaping conditional.
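For illustration, HTML-escaping a certificate field before placing it into an error page could look like the sketch below. The htmlEscape() function is hypothetical; Squid has its own quoting helpers, which this does not reproduce.

```cpp
#include <cassert>
#include <string>

// Minimal HTML escaping sketch for untrusted text placed in HTML context.
std::string htmlEscape(const std::string &raw)
{
    std::string out;
    out.reserve(raw.size());
    for (const char c : raw) {
        switch (c) {
        case '&': out += "&amp;"; break;
        case '<': out += "&lt;"; break;
        case '>': out += "&gt;"; break;
        case '"': out += "&quot;"; break;
        case '\'': out += "&#39;"; break;
        default: out += c;
        }
    }
    return out;
}
```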
Eneas Queiroz [Wed, 10 Oct 2018 16:45:29 +0000 (16:45 +0000)]
Allow compilation with minimal OpenSSL (#281)
Updated use of OpenSSL deprecated API, so that Squid can be compiled
with OpenSSL built with the OPENSSL_NO_DEPRECATED option. Such OpenSSL
builds are useful for saving storage space on embedded systems.
Also added compat/openssl.h -- a centralized OpenSSL portability shim.
Including it is now required before #including openssl/*.h headers.
chi-mf [Wed, 10 Oct 2018 07:50:52 +0000 (07:50 +0000)]
Fixed %USER_CA_CERT_xx and %USER_CERT_xx crashes (#301)
The bug was introduced in 4e56d7f6 when the formatting code was moved
into Format::Format::assemble() where the old "format" loop variable is
a Format data member with the right type but (usually) the wrong value.
Amos Jeffries [Mon, 8 Oct 2018 00:11:14 +0000 (00:11 +0000)]
ntlm_fake_auth: add ability to test delayed responses (#294)
Add a -t parameter which sets a timeout to artificially delay
authentication responses by a fixed amount longer than their
normal delay.
This enables the fake authenticator to be used to test NTLM
client and Squid behaviour under various network latency and
stress conditions which delay ActiveDirectory responses.
Commit bec110e (a.k.a. v4 commit fbbd5cd5) broke CONNECT URI logging
because it incorrectly assumed that URI::absolute() supports all URIs.
As a result, Squid logged CONNECT URLs as "://host:port".
Also fixed a similar wrong assumption in ACLFilledChecklist::verifyAle()
which may affect URL-related ACL checks for CONNECT requests, albeit
only in already buggy cases where Squid warns about "ALE missing URL".
Bug 4885: Excessive memory usage when running out of descriptors (#291)
TcpAcceptor now stops listening when it cannot accept due to FD limits.
We also no longer defer/queue the same limited TcpAcceptor multiple
times. These changes prevent unbounded memory growth and improve
performance of Squids running out of file descriptors. They should have
no impact on other Squids.
cloneReply() "reply == NULL" assertion when denying replies (#292)
Commit e2cc8c0 lost argument nullification when converting old
HTTPMSGUNLOCK() macro into a function. This change restores that
important part of the HTTPMSGUNLOCK() API without sacrificing argument
type checks added during that conversion.
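The technique of keeping argument nullification in a typed function can be sketched as follows. The Message struct and unlockAndForget() are hypothetical simplifications, not the actual HTTPMSGUNLOCK() code.

```cpp
#include <cassert>

// Hypothetical lock-counted message; Squid's HTTP message classes are richer.
struct Message {
    int locks = 0;
};

// Typed replacement for an unlock macro: taking the pointer by reference
// lets the function null the caller's variable, as the macro used to do.
template <class M>
void unlockAndForget(M *&msg)
{
    if (!msg)
        return;
    if (--msg->locks <= 0)
        delete msg;
    msg = nullptr; // the caller must not use the pointer after unlocking
}
```

Because the parameter is a reference to a typed pointer, passing an unrelated pointer type fails to compile, while the caller's variable still ends up nil after the call.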
Bug 4875 pt1: GCC-8 compile errors with -O3 optimization (#287)
Use xstrncpy instead of strncat for String appending
Our xstrncpy() is safer, not assuming the existing char*
is nul-terminated and accounting explicitly for the
nul-terminator byte.
GCC-8 -O3 optimizations were exposing a strncat() output
truncation of the terminator when insufficient space was
available in the String buffer.
We suspect the GCC error to be a false-positive for -O3
builds and, even if it is accurate, these changes should
not affect builds with lower optimization levels.
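The following sketch shows the kind of bounded append that accounts explicitly for the terminator byte. The boundedAppend() function is hypothetical and is not Squid's xstrncpy() or String code; it only illustrates why tracking the length and reserving one byte avoids losing the terminator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical bounded append for a fixed-size char buffer. "used" is the
// number of bytes already stored (tracked by the caller, so the existing
// content need not be nul-terminated); "capacity" is the full buffer size.
// Returns the new length; the result is always nul-terminated.
std::size_t boundedAppend(char *buf, std::size_t used, std::size_t capacity,
                          const char *src, std::size_t srcLen)
{
    assert(capacity > 0 && used < capacity);
    const std::size_t room = capacity - used - 1; // reserve the terminator byte
    const std::size_t n = srcLen < room ? srcLen : room;
    std::memcpy(buf + used, src, n);
    buf[used + n] = '\0';
    return used + n;
}
```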
This change also fixes icc builds: Commit 39cca4e missed noexcept
specification for nothrow variants of new and delete operators,
and the icc compiler did not like that.
Furthermore, we can simplify the replacements because, according
to cppreference, with C++11, "replacing the throwing single object
allocation functions is sufficient to handle all [allocations and
deallocations]".
Bug 4877: Add missing text about external_acl_type %DATA changes (#276)
Conversion of external_acl_type to using logformat macros was
not quite seamless. The %DATA macro now expands to a dash '-' to
prevent helpers that use it explicitly from receiving an incorrect
number of fields (and misaligned input) on their input lines.
Unfortunately that also results in the implicit use of that
macro expanding to non-whitespace ('-'). That small fact was not
documented in the initial v4 release notes and config texts.
Bug 4716: Blank lines in cachemgr.conf are not skipped (#274)
The default cachemgr.conf contains three lines other than
comments. Two of them are blank, the third is "localhost".
These blank lines show up in the "Cache Server" list in the
CGI output.
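Skipping such lines while reading the file can be sketched as below. The serverEntries() function is a hypothetical illustration, not the cachemgr.cgi code.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Collects server entries from cachemgr.conf-like text, skipping comments
// and blank (or whitespace-only) lines.
std::vector<std::string> serverEntries(const std::string &conf)
{
    std::vector<std::string> entries;
    std::istringstream in(conf);
    std::string line;
    while (std::getline(in, line)) {
        const auto first = line.find_first_not_of(" \t\r");
        if (first == std::string::npos || line[first] == '#')
            continue; // nothing but whitespace, or a comment
        const auto last = line.find_last_not_of(" \t\r");
        entries.push_back(line.substr(first, last - first + 1));
    }
    return entries;
}
```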
Amos Jeffries [Tue, 7 Aug 2018 13:00:02 +0000 (13:00 +0000)]
Update systemd dependencies in squid.service (#264)
The network.target is not sufficient to guarantee network
interfaces and IPs are assigned and available. Particularly when
systemd is not in charge of the IP assignment itself.
Use network-online.target as well, which should ensure network
is properly configured and online before starting Squid.
Packing reply headers into StoreEntry/ShmWriter directly means numerous
tiny append() calls which involve expensive mem_node/slice searches. For
example, every two-byte ": " and CRLF delimiter is packed separately.
Allow use of Samba TrivialDB instead of outdated BerkeleyDB in
the session helper.
Require TrivialDB support for use of the time_quota helper.
libdb v1.85 is no longer supported by distributors and
upgrading to v5 only to deprecate use does not seem to be
worthwhile.
When dealing with an HTTP request header that Squid can parse but that
contains request URI length exceeding the 8K limit, Squid should log the
URL (prefix) instead of a dash. Logging the URL helps with triaging
these unusual requests. The older %ru (LFT_REQUEST_URI) was already
logging these huge URLs, but %>ru (LFT_CLIENT_REQ_URI) was logging a
dash. Now both log the URL (or its prefix).
As a side effect, %>ru now also logs error:request-too-large,
error:transaction-end-before-headers and other Squid-specific
pseudo-URLs, as appropriate.
Also refactored request- and URI-recording code to reduce chances of
similar inconsistencies reappearing in the future.
Also, honor strip_query_terms in %ru for large URLs. Not stripping query
string in %ru was a security problem.
Also fixed a bug with "redirected" flag calculation in
ClientHttpRequest::handleAdaptedHeader(). In general, http->url and
request->url should not be compared directly, because the latter always
passes through uri_whitespace cleanup, while the former does not.
Also fixed a bug with possibly wrong %ru after redirection:
ClientHttpRequest::log_uri was not updated in this case.
Also initialize AccessLogEntry::request and AccessLogEntry::notes ASAP.
Before this change, these fields were initialized in
ClientHttpRequest::doCallouts(). It is better to initialize them just
after the request object is created so that ACLs, running before
doCallouts(), could have them at hand. There are at least three such
ACLs: force_request_body_continuation, spoof_client_ip and
spoof_client_ip.
Also synced %ru and %>ru documentation with the current code.
A nil pointer is the proper way to indicate a missing heap-allocated
object in C++. Removing NullStoreEntry simplifies and optimizes code.
This removal also brings us one step closer to removing all virtual
methods from StoreEntry, further optimizing code and even saving 8 bytes
per non-shared memory cache entry on most platforms.
Also un-virtualized a few StoreEntry-only methods to optimize their
callers.
Optimization: Fewer memory (re)allocations for HTTP headers (#239)
Tests revealed multiple fresh memory allocations/deallocations while
storing small (few fields) HTTP headers. Many popular sites use larger
headers (15-30 fields). To avoid expensive memory operations:
1. Pool all std::vector<HttpHeaderEntries*> memory allocations.
2. Prevent reallocations (for HTTP headers with fewer than 32 fields).
This optimization deals with storing the header index. It does not
affect how individual header fields are stored.
Logging client "handshake" bytes is useful in at least two contexts:
* Runtime traffic bypass and bumping/splicing decisions. Identifying
popular clients like Skype for Business (that uses a TLS handshake but
then may not speak TLS) is critical for handling their traffic
correctly. Squid does not have enough ACLs to interrogate most TLS
handshake aspects. Adding more ACLs may still be a good idea, but
initial sketches for SfB handshakes showed rather complex
ACLs/configurations, _and_ no reasonable ACLs would be able to handle
non-TLS handshakes. An external ACL receiving the handshake is in a
much better position to analyze/fingerprint it according to custom
admin needs.
* A logged handshake can be used to analyze new/unusual traffic or even
trigger security-related alarms.
The current support is limited to cases where Squid was saving handshake
for other reasons. With enough demand, this initial support can be
extended to all protocols and port configurations.
Alex Rousskov [Fri, 6 Jul 2018 23:58:22 +0000 (23:58 +0000)]
Bug 4865: Unexpected exception on startup in TypedMsgHdr::sync() (#242)
Commit b56b37c broke Ipc::TypedMsgHdr copying by incorrectly assuming
that sync() sets name and ios members. The sync() method sets _other_
(low level) members based on name and ios.
Optimization: Fewer epoll(2) system calls when closing a socket (#235)
Squid was calling epoll(2) twice to clear a socket interest. One call is
more than enough: Technically, close(2) is supposed to clear epoll(2)
registration for us, but I did not risk relying on that.
In other environments, socket interest changes are pooled together
before being submitted to the OS, so Squid was doing a bit of extra
work, but not making (many) extra system calls AFAICT.
Also fixed (previously unused) Comm::ResetSelect() on these platforms:
* epoll(2): The old resetting code did not clear our interest AFAICT.
* kqueue(2): The old resetting code made no sense to me at all.
* poll(2): There was no code at all.
* select(Win32): There was no code at all.
Even though Comm::ResetSelect() implementation is now the same for all
platforms, I did not make that code platform-agnostic because it is
possible to optimize it further in platform-specific ways.
Alex Rousskov [Wed, 4 Jul 2018 15:59:26 +0000 (15:59 +0000)]
Documented when helper requests get queued (#238)
I had to change introductory paragraphs in several directives so that
the new documentation can refer to "numberofchildren". I fixed a few
spelling/grammar problems in changed paragraphs and edited them a bit
for consistency, but they need more work.
When an HTTPS or SSL-Bump port is configured without a cert=
parameter it results in a segmentation fault. Detect that
occurrence and add the required FATAL error message instead for
these configurations where cert= is a parameter rather than an
option.
Our project terminology for config settings is:
"parameter"
- a required setting. Print a FATAL error message if missing.
"option"
- an optional setting. Ignored or default value if missing.
GCC-8 enables a lot more warnings related to unsafe coding
practices. The old Squid code contains a lot of risky buffer
size assumptions and implicit assumptions about C-string strcat,
strncat and snprintf changes when operating on those buffers -
many can result in output truncation. Squid's use of -Werror
makes these many issues all go from warnings to outright
compile failures.
Rather than just extending the char* buffer sizes not to
truncate this work seeks to actually remove the issues
permanently by converting to SBuf and updated Squid coding
styles.
The C++1z compilers (GCC-8 and Clang 4.0) are beginning to warn
about C functions memset/memcpy/memmove being used on class
objects which lack "trivial copy" constructor or assignment
operator - their use is potentially unsafe where anything more
complex than trivial copy/blit is required. A number of classes
in Squid are safely copied or initialized with those functions
for now but again the -Werror makes these hard errors.
Completing affected objects conversion from C to C++ code avoids
any deeply hidden issues or adding compiler exceptions to
silence the warnings.
See individual commit messages for details on the particular
changes each does.
Optimization: Do not create/configure ACLFilledChecklist in vain (#232)
While client_db is required for client-side pools to work, it may be
enabled for other reasons, without any client-side pools configured. We
should not create and configure useless ACLFilledChecklist objects
because those operations are already not trivial today and have
a tendency of becoming more expensive with time.
This change disassociates Transients from collapsed forwarding, enabling
it for SMP caching configurations. Before this change, SMP Squid worker
could not read an entry being written by another worker. Besides
unexpected misses, there could be another (worse) negative effect: The
reader worker could get stuck because it did not get updates via
the Transients mechanism.
Also deprecate the collapsed_forwarding_shared_entries_limit directive
name in favor of shared_transient_entries_limit.
Also removed top-level Storage::smpAware() because memory cache SMP
awareness is determined by configuration and is now computed before we
create the memory cache Storage object. This ability to assess SMP
awareness earlier helps decide whether to create Transients segments.
Also eliminated code duplication in a couple of MemStoreRr methods.
Added a new CF tag to the Squid request status %Ss access log field.
This tag marks transactions that have waited for a CF initiator
transaction. This wait may happen in two cases (or their combination):
1. Classic collapsing: A client request gets collapsed on arrival
(e.g., TCP_CF_HIT or TCP_CF_MISS).
2. Collapsed revalidation: An internal revalidation request is collapsed
(e.g., TCP_CF_REFRESH_MODIFIED).
A CF tag approach is simple but the resulting access.log records cannot
distinguish some cases. For example, a pure collapsed revalidation
transaction (case 2) cannot be distinguished from these transactions:
* a collapsed client that got collapsed on revalidation (case 1+2);
* a collapsed client that initiated revalidation.
We may want to log more collapsing details in the future.
These changes do not affect CF initiating code.
In order to track collapsed transactions, a new CollapsingHistory class
was introduced. Since more and more non-logging code relies on ALE, this
history is kept in ALE. ClientHttpRequest uses its logType field instead
of the LogTags in ALE, so we also use logType for storing
ClientHttpRequest's CollapsedHistory. Eventually, ClientHttpRequest
should eliminate logType in favor of direct ALE use.
Also: ICP code fixing/refactoring:
* htcpSyncAle() and icpSyncAle() should not require the caller to supply
correct LogTags because callers like fillChecklist() do not have
access to that information (it is not stored in the transaction object
unlike the other pieces of info that these functions copy to ALE).
* Added icpUdpData::ale to preserve master transaction info when
messages are queued. Several icpUdpData improvements were triggered by
this change because ale is a (second!) non-POD member and icpUdpData
was mistreated as a POD. They include:
- Removed icpUdpData::start as unused.
- Removed icpUdpData::len as set but otherwise unused.
- Removed icpUdpData::logcode as essentially duplicating msg->opcode.
* Update ICP ALE, if any, as soon as the transaction tags become known
(instead of sometimes waiting for the ICP message to be logged). The
ICP message may be dropped and/or never be logged, but we should keep
ALE up to date because it is used in increasingly many contexts.
Also found and marked an ICP memory leak. It is best to fix that in a
dedicated commit.
Also supplied URN code with ALE. Full-featured Client-based classes
already use ALE. We have not tested with URNs, but these changes may
improve logging of transactions that involve URN resolution.
Also fixed problematic StoreEntry::collapsingInitiator(). It could
return true if the entry had transients but had nothing to do with
collapsing. It also incorrectly assumed that a collapsed entry is always
marked with ENTRY_FWD_HDR_WAIT. That assumption is wrong because
Controller::allowCollapsing() does not set this flag for the entry.
We did not find a better way to track StoreEntry objects associated with
CF initiators than to add a new StoreEntry flag. Hitting an entry
flagged with ENTRY_REQUIRES_COLLAPSING requires collapsing the request.