Amos Jeffries [Wed, 25 Nov 2015 04:21:40 +0000 (20:21 -0800)]
Cleanup: Refactor ConnStateData pipeline handling
This refactors the request pipeline management API to use std::list
instead of a custom linked-list with accessors spread over both
ConnStateData and ClientSocketContext.
To do this a new class Pipeline is created with methods wrapping
std::list API and extending it slightly to meet the HTTP/1.1 pipeline
behaviours and perform basic stats gathering. The pipeline management
methods and state variables are moved inside this class.
ClientSocketContext was performing several layering violations in
relation to ConnStateData when one transaction ended and the next needed
starting. Treating the pipeline properly as a std::list forced removal
of that violation.
* actions for starting or resuming a transaction on the connection are
now moved to ConnStateData::kick(), which gets called after each
transaction completes.
- with some further cleanup it can be called at any point the
ConnStateData needs to resume processing. However, that is left out of
scope for this patch.
* the ClientSocketContext scope now ends when the finished() method is
used to mark completion of this context's transaction. The context then
marks itself done and de-registers from the Pipeline queue. The
ConnStateData kick() method still needs to be called to resume
processing of other transactions.
* the queue now holds RefCounted Pointers, so the ClientSocketContext
destructor no longer needs to be careful of registrations, and the queue
entries are guaranteed to still exist while queued.
* The old freeAllContexts() and notifyAllContexts(int) members of
ConnStateData have been combined into Pipeline::terminateAll(int).
The ClientSocketContext and ConnStateData documentation is updated to
describe what they do in regards to connection and transaction processing.
Initial testing revealed CONNECT tunnels always being logged as ABORTED.
This turns out to be technically correct, since the only way a tunnel
can finish is for client or server to just close the connection.
However, it is not right to log these as abnormal aborts. Instead, I
have now made the context be finished() just prior to the
TunnelStateData being destroyed. That way normal closure should show up
only as TUNNEL, but timeouts and I/O errors should still be recorded as
abnormal.
Two potential bugs have been highlighted:
* The on_unsupported_protocol handling function appears to be a bit
broken. It pop()'s contexts off the pipeline directly without going
through the proper finished() process to release their state data. I
have highlighted that with an XXX and comment.
* The ssl-bump handling logic switching to TLS begins with a terminateAll(0)
run on all active contexts. It does not check whether there is any existing
pipeline of requests waiting to be processed. And the action prematurely
purges the bumped CONNECT message context, which should be closed properly
and logged as successful.
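A minimal sketch of the std::list wrapper idea described above, not the
actual Squid class. Context is a hypothetical stand-in for
ClientSocketContext, std::shared_ptr stands in for the RefCounted pointer,
and every method name except terminateAll() is an assumption made for
illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <memory>

// Hypothetical stand-in for ClientSocketContext; not Squid's real class.
struct Context {
    bool done = false;
    int abortCode = 0; // last error passed by terminateAll(), 0 if none
};
using ContextPointer = std::shared_ptr<Context>; // stand-in for a RefCount pointer

// Sketch of a pipeline that wraps std::list and counts requests seen.
class Pipeline {
public:
    // Queue a new transaction context at the back of the pipeline.
    void add(const ContextPointer &c) {
        assert(c);
        requests_.push_back(c);
        ++nrequests_;
    }

    // The context currently being serviced, or nullptr when idle.
    ContextPointer front() const {
        if (requests_.empty())
            return nullptr;
        return requests_.front();
    }

    bool empty() const { return requests_.empty(); }
    std::size_t count() const { return requests_.size(); }
    std::size_t totalSeen() const { return nrequests_; }

    // De-register a finished context; only the front entry may finish.
    void popMe(const ContextPointer &c) {
        assert(!requests_.empty() && requests_.front() == c);
        c->done = true;
        requests_.pop_front();
    }

    // Tell every queued context about the error, then drop them all.
    void terminateAll(int xerrno) {
        while (!requests_.empty()) {
            ContextPointer c = requests_.front();
            c->abortCode = xerrno;
            c->done = true;
            requests_.pop_front();
        }
    }

private:
    std::list<ContextPointer> requests_;
    std::size_t nrequests_ = 0;
};
```

Because the list holds counted pointers, a context removed by popMe() or
terminateAll() stays valid for any caller that still holds a pointer to it.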
Alex Rousskov [Thu, 19 Nov 2015 05:51:49 +0000 (22:51 -0700)]
Store API and layout polishing. No functionality changes intended.
Fixes "any Store is a Root" API that forced us to bloat the base
Store class with methods needed only in Store::Root() Controller.
Unblocks bug #7 (cached headers update) fixes.
The Store namespace hierarchy now looks like this:
* Storage: Any storage. Similar to the old Store class, but leaner.
* Controller: Combined memory/disks caches and transients. Root API.
* Controlled: Memory cache, disk(s) cache, or transient Storage.
* Disks: All disk caches combined.
* Disk: A single cache_dir Storage.
* Memory: A memory cache.
* Transients: Entries capable of being collapsed for CF.
Alex Rousskov [Wed, 18 Nov 2015 23:56:16 +0000 (15:56 -0800)]
Bug 4368: A simpler and more robust HTTP request line parser.
The primary changes are: Removed incremental parsing and revised parsing
sequence to accept virtually any URI (by default and also configurable
as before).
Also doubled hard-coded 16-character method length limit.
No changes to parsing HTTP header fields (a.k.a. the MIME block) were
intended.
Known side effects:
* Drastically simpler code.
* Some unit test case adjustments.
* The new parser no longer treats some request lines ending with
"HTTP/1.1" as HTTP/0.9 requests for URIs that end with "HTTP/1.1".
* The new parser no longer re-allocates character sets while parsing
each request.
Intentional Changes:
* Removal of incremental request line parsing.
Squid parsed the request line incrementally. That optimization was
unnecessary:
- most request lines are short enough to fit into one network I/O,
- the long lines contain only a single long field (the URI), and
- the user code must not use incomplete parsing results anyway.
Incremental parsing made code much more complex and possibly slower than
necessary.
The only place where incremental parsing of request lines potentially
makes sense is the URI field itself, and only if we want to accept URIs
exceeding request buffer capacity. Neither the old code, nor the
simplified one do that right now.
* Accept virtually any request-target (when allowed).
1. relaxed_header_parser allows whitespace in request-target.
2. relaxed_header_parser combined with USE_HTTP_VIOLATIONS now allows
any characters except non-whitespace CTL characters (see RFC 5234
appendix B.1) in the message request-target (aka URI).
#2 being the default build and configuration situation allows virtually
any URI that Squid can isolate by stripping method (prefix) and
HTTP/version (suffix) off the request line. This approach allows Squid to
forward slightly malformed (in numerous ways) URIs instead of misplacing
on the Squid admin the burden of explaining why something does not work
going through Squid but works fine when going directly or through another
popular proxy (or through an older version of Squid!).
URIs in what Squid considers an HTTP/0.9 request obey the same rules.
Whether the rules should differ for HTTP/0 is debatable, but the current
implementation is the simplest possible one, and the code makes it easy
to add complex rules.
* Code simplification.
RequestParser::parseRequestFirstLine() is now a simple sequence of
if statements. There is no longer a path dedicated for the
strict parser. The decisions about parsing individual fields and
delimiters are mostly isolated to the corresponding methods.
* Unit test cases adjustments.
Removal of incremental request line parsing means that we should not
check parsed fields when parsing fails or has not completed yet.
Some test cases made arguably weird decisions apparently to accommodate
the old parser. The expectations of those test cases are more natural now.
Also, added optional (and disabled by default) debugging, to help pin-point
failures to test sub-cases that CPPUNIT cannot see.
Changing request methods to "none" in test sub-cases with invalid input
was not technically necessary because the new code ignores the method
when parsing fails, but it may help whoever would decide to reduce test
code duplication (by replacing hand-written expected outcomes for failed
test cases with a constant assignment or function call).
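The "strip method (prefix) and HTTP/version (suffix)" approach from the
request-target notes above can be sketched roughly as follows. This is an
illustration under simplifying assumptions, not the RequestParser code: it
takes a complete line without CRLF, treats a single space as the only
delimiter, and does not validate the method or the version digits. The
names RequestLine and splitRequestLine() are hypothetical.

```cpp
#include <cassert>
#include <string>

struct RequestLine {
    std::string method;
    std::string target;  // request-target (aka URI), may contain inner spaces
    std::string version; // empty when no HTTP/ suffix is found (HTTP/0.9 style)
};

// Splits a complete request line. Returns false when no method or no
// request-target can be isolated.
bool splitRequestLine(const std::string &line, RequestLine &out) {
    const std::string::size_type firstSp = line.find(' ');
    if (firstSp == std::string::npos || firstSp == 0)
        return false; // missing delimiter or empty method
    out.method = line.substr(0, firstSp);
    std::string rest = line.substr(firstSp + 1);

    // Strip the version suffix, if the last token looks like one.
    out.version.clear();
    const std::string::size_type lastSp = rest.rfind(' ');
    if (lastSp != std::string::npos &&
            rest.compare(lastSp + 1, 5, "HTTP/") == 0) {
        out.version = rest.substr(lastSp + 1);
        rest.erase(lastSp);
    }

    if (rest.empty())
        return false; // nothing left to treat as the request-target
    out.target = rest;
    return true;
}

// Whatever remains between prefix and suffix is taken as the URI.
void exampleUsage() {
    RequestLine r;
    assert(splitRequestLine("GET /a b HTTP/1.1", r));
    assert(r.target == "/a b");
}
```

Working on the whole line at once is what makes the suffix search possible;
an incremental parser cannot know which token is the last one.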
Alex Rousskov [Wed, 18 Nov 2015 20:03:55 +0000 (13:03 -0700)]
Do not _require_ anchor/updateCollapsed() re-implementation.
Also do not override Controlled methods that Disk is not going to
provide because doing so will complicate changing or deleting those
methods later as we revise the APIs.
Amos Jeffries [Wed, 18 Nov 2015 13:28:57 +0000 (05:28 -0800)]
C++ convert the global C functions that operate on class CacheDigest
This is largely a symbol renaming change. But there are two relatively
small logic changes:
1) convert the class to MEMPROXY_CLASS.
Which alters the pool creation timing from general memory pool
initialization time, to whenever the CacheDigest object is first used.
A nice side effect is removal of the macro conditional within the old pool
type enumeration. Macros like that in enumeration lists such as this one
have been causing some builds to have run-time errors accessing memory
arrays out-of-bounds or at incorrect positions when build-time
dependency detection issues caused build objects to link with different
./configure'd versions.
2) Constructor logic sequence alteration.
The old *Create function used to set some members then call the *Init
function which would re-set some of them, and initialize most of the
rest (but not all).
The old *UpdateCap function would call a helper that emulated
safe_free(mask) then *Init to alter the object's mask-related members
whether they needed it or not.
The class constructor now initializes all members via initialization
list then calls updateCapacity(), which calls a simplified init(). This
altered sequence contains the same operational acts, while the new order
avoids repeatedly or unnecessarily setting members on create and update.
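The constructor sequence described above (initialization list, then
updateCapacity(), then a simplified init()) might look roughly like this.
The Digest class, its members, and the mask sizing rule are assumptions for
illustration; only the updateCapacity() and init() names come from the
description.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical digest; member names and the sizing rule are illustrative.
class Digest {
public:
    Digest(std::size_t capacity, unsigned bitsPerEntry)
        : bitsPerEntry_(bitsPerEntry), capacity_(0), count_(0), mask_() {
        updateCapacity(capacity); // sizes the mask exactly once on create
    }

    // Resize for a new capacity; existing bits are discarded.
    void updateCapacity(std::size_t newCapacity) {
        assert(newCapacity > 0);
        capacity_ = newCapacity;
        init();
    }

    std::size_t maskBytes() const { return mask_.size(); }
    std::size_t count() const { return count_; }

private:
    // Simplified init(): (re)allocate a zeroed mask and reset the counter.
    void init() {
        const std::size_t bits = capacity_ * bitsPerEntry_;
        mask_.assign((bits + 7) / 8, 0);
        count_ = 0;
    }

    unsigned bitsPerEntry_;
    std::size_t capacity_;
    std::size_t count_;
    std::vector<std::uint8_t> mask_;
};
```

Every member is set once in the initialization list, and only the
capacity-dependent members are touched again by init().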
Alex Rousskov [Wed, 18 Nov 2015 05:46:36 +0000 (22:46 -0700)]
Store API and layout polishing. No functionality changes intended.
This first step towards bug #7 fix focuses on fixing "any Store is a
Root" API that forced us to bloat the base Store class with methods
needed only in Store::Root() Controller.
We resolved about 15 XXXs and 10 TODOs (although these counts are
inflated by many duplicated/repeated problems). We added a few new
XXXs and TODOs as well, but they are just marking already problematic
code, not adding more problems or genuinely new work.
The code movement to files in parentheses is not tracked by bzr
because bzr cannot track file splits, and most of the moved code had
to be split across multiple files to untangle various messes. When
deciding what to tell "bzr mv", we picked file pairs that would allow
us to track the most complex, most voluminous code but there is
probably no single correct way to do that.
src/disk.* files were renamed to src/fs_io.* to avoid "src/foo
conflicts with src/store/Foo" problems expected on some case-
insensitive platforms.
The Store namespace hierarchy now looks like this:
* Storage: Any storage. Similar to the old Store class, but leaner.
* Controller: Combined memory/disks caches and transients. Root API.
* Controlled: Memory cache, disk(s) cache, or transient Storage.
* Disks: All disk caches combined.
* Disk: A single cache_dir Storage.
* Memory: A memory cache.
* Transients: Entries capable of being collapsed for CF.
The last two are not moved/finalized yet, but it should not be too
difficult to do that later because there are few direct references to
them from the high-level code.
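One possible reading of the hierarchy listed above, as a rough sketch. The
inheritance relations, the Sketch namespace, and all method names
(entryCount(), dirCount(), diskless()) are assumptions for illustration and
are not taken from the Squid sources. Transients is omitted for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

namespace Sketch {

// Any storage: the lean base interface.
class Storage {
public:
    virtual ~Storage() = default;
    virtual std::size_t entryCount() const = 0;
};

// A storage that is managed by the Controller.
class Controlled : public Storage {};

// A single cache_dir.
class Disk : public Controlled {
public:
    explicit Disk(std::size_t n) : n_(n) {}
    std::size_t entryCount() const override { return n_; }
private:
    std::size_t n_;
};

// All disk caches combined.
class Disks : public Controlled {
public:
    void add(std::unique_ptr<Disk> d) {
        assert(d);
        dirs_.push_back(std::move(d));
    }
    std::size_t dirCount() const { return dirs_.size(); }
    std::size_t entryCount() const override {
        std::size_t total = 0;
        for (const auto &d : dirs_)
            total += d->entryCount();
        return total;
    }
private:
    std::vector<std::unique_ptr<Disk>> dirs_;
};

// A memory cache.
class Memory : public Controlled {
public:
    explicit Memory(std::size_t n) : n_(n) {}
    std::size_t entryCount() const override { return n_; }
private:
    std::size_t n_;
};

// Root API: combines the memory and disk caches.
class Controller : public Storage {
public:
    Controller(std::unique_ptr<Memory> m, std::unique_ptr<Disks> d)
        : mem_(std::move(m)), disks_(std::move(d)) {}
    std::size_t entryCount() const override {
        return mem_->entryCount() + disks_->entryCount();
    }
    // Root-only helper (hypothetical): kept out of the base Storage API.
    bool diskless() const { return disks_->dirCount() == 0; }
private:
    std::unique_ptr<Memory> mem_;
    std::unique_ptr<Disks> disks_;
};

} // namespace Sketch
```

The point of the split is visible in diskless(): a method only the root
needs lives in Controller and does not bloat Storage.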
Related polishing touches:
Moved a lot of misplaced code into the right class and/or source file.
Simplified Store::search() interface to match the actual code that
does not support any search parameters. Removed the search API from
all other stores because the code did not really support store-
specific searches. Resisted the temptation to rename parameterless
search() to iterate() or similar because the actual future of this API
is murky. We may add search parameters or even remove the method
completely. This could quickly snowball into a separate project.
Removed Store::get(x,y,z) API as unused and unsupported.
Removed FreeObject() template as unused (and possibly technically
flawed).
Simplified default Store initialization/cleanup sequence. Removed
empty disk_init(). The non-default Store::Init() parameter is used by
the unit testing code only.
Simplified Store::dereference() API by moving the second parameter to
the dedicated Controller::dereferenceIdle() method, which is the only
one using that parameter.
Alex Rousskov [Wed, 18 Nov 2015 05:34:33 +0000 (22:34 -0700)]
Fixed STUB_RETREF() implementation to return the right type.
Removed bogus STUB_RETREF() comment about memory leaks in _unreachable_ code.
Deprecated STUB_RETSTATREF() as essentially duplicating STUB_RETREF().
Alex Rousskov [Wed, 18 Nov 2015 05:32:24 +0000 (22:32 -0700)]
Make RefCount pointers behave more like regular pointers.
Allow default (but safe, thanks to C++11) conversion of RefCount
pointers to bool. This helps keep the code succinct, minimizes changes
during conversion of reference counting pointers to/from other pointer
types, and avoids nullptr/NULL differences.
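A small sketch of why the C++11 conversion is safe. Ptr below is a minimal
stand-in, not Squid's RefCount template, and Counted, Job, and idOf() are
hypothetical names. The key is the explicit keyword: the pointer works in
boolean contexts but does not silently convert to an integer.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Base for counted objects; a stand-in for a reference-countable base.
struct Counted { int refs = 0; };

// Minimal intrusive counted pointer, for illustration only.
template <class T>
class Ptr {
public:
    Ptr(T *p = nullptr) : p_(p) { if (p_) ++p_->refs; }
    Ptr(const Ptr &o) : p_(o.p_) { if (p_) ++p_->refs; }
    ~Ptr() { if (p_ && --p_->refs == 0) delete p_; }
    Ptr &operator=(Ptr o) { std::swap(p_, o.p_); return *this; }

    // explicit: usable in if, while, ! and &&, but never as an int.
    explicit operator bool() const { return p_ != nullptr; }

    T *operator->() const { assert(p_); return p_; }

private:
    T *p_;
};

struct Job : Counted { int id = 7; };

static_assert(!std::is_convertible<Ptr<Job>, int>::value,
              "a counted pointer must not convert to an integer");

// Returns the job id, or -1 for a nil pointer.
int idOf(const Ptr<Job> &job) {
    if (!job) // contextual conversion to bool
        return -1;
    return job->id;
}
```

The same `if (!job)` test reads identically for a raw pointer, which is
what keeps conversions between pointer types small.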
Amos Jeffries [Wed, 18 Nov 2015 03:23:59 +0000 (19:23 -0800)]
Combine the https_port list internal state with http_port state.
These two lists have been near identical for some time now and we can
easily reduce code by simply merging the two and using either the
secure.encryptTransport flag or the transport.protocol type to select
the remaining non-identical code paths.
Amos Jeffries [Tue, 17 Nov 2015 10:14:15 +0000 (02:14 -0800)]
Prevent all TUNNELs being marked as ABORTED
TUNNEL transactions are naturally ended by one of the client or server
closing the connection. This is not an abort. So finish the CONNECT
message context cleanly when the tunnel is closed.
Amos Jeffries [Tue, 17 Nov 2015 03:50:31 +0000 (19:50 -0800)]
Rename ClientSocketContext::connIsFinished() to finished()
Removes some needless mentions of "conn" and clarifies that the method
handles the context object and transaction finishing, not the connection
it belongs to.
Amos Jeffries [Tue, 17 Nov 2015 03:26:01 +0000 (19:26 -0800)]
Use connIsFinished() when a transaction is completed successfully
initiateClose() may sound okay, but it actually is the error handling logic.
It will terminate the ConnStateData with an error message, leaving the completed
request in the pipeline which in turn will result in *_ABORTED being logged for
all requests with Connection:close headers even if they are cleanly finished.
connIsFinished() is (now) the clean way to finish a ClientSocketContext object's
lifetime regardless of whether keep-alive is needed. ConnStateData::kick()
will now handle that, so we do not even need to call keepaliveNextRequest().
Remove the now unused ClientSocketContext::keepaliveNextRequest().
Alex Rousskov [Sun, 15 Nov 2015 17:54:58 +0000 (10:54 -0700)]
Stop using dangling pointers for eCAP-set custom HTTP reason phrases.
Squid still does not support [external] custom reason phrases and,
hence, cannot reliably support eCAP API that sets the reason phrase to
the one supplied by the adapter. This and r14398 changes fix [known]
regression bugs introduced by r12728 ("SourceLayout").
Alex Rousskov [Sun, 15 Nov 2015 16:59:12 +0000 (09:59 -0700)]
Fixed status code-based HTTP reason phrase for eCAP-generated messages.
Calling .reason() on a not-yet-set theMessage.sline object resulted in
"Init" status reason phrase for all from-scratch (i.e., not cloned)
eCAP-made HTTP responses. This fix lets Squid compute the reason phrase
based on the status code, just like Squid does for forwarded responses
(IIRC).
The ERR_SECURE_ACCEPT_FAIL and ERR_REQUEST_START_TIMEOUT errors appear to
have missing templates on Squid startup.
These errors do not actually produce any error page. Move them under the
TCP_RESET error in err_type.h to mark them as optional.
- Squid receives TLS Hello from the client (TCP connection A).
- Squid successfully negotiates a TLS connection with the origin server
(TCP connection B).
- Squid successfully negotiates a TLS connection with the client
(TCP connection A).
- Squid marks connection B as "idle" and waits for an HTTP request from
connection A.
- The origin server continues talking to Squid (TCP connection B).
Squid detects a network read on an idle connection and closes TCP
connection B (and then the associated TCP connection A as well).
This patch:
- When Squid detects a network read on an idle server connection, it does
an SSL_read to:
a) check whether application data was received from the server, and abort
in that case;
b) detect a possible TLS error or a TLS shutdown message from the server;
c) ignore the read if only TLS protocol-related packets were received.
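The three outcomes above can be sketched as a small decision function. To
stay self-contained this does not call OpenSSL: TlsStatus is a hypothetical
stand-in for what an error query after SSL_read would report, and
IdleAction and onIdleServerRead() are illustrative names, not Squid code.

```cpp
#include <cassert>

// Stand-ins for the TLS layer's report after a read attempt; these are
// not OpenSSL's constants.
enum class TlsStatus { None, WantRead, ZeroReturn, Fatal };

enum class IdleAction { Abort, Close, Ignore };

// bytesRead: application bytes returned by the read attempt (<= 0 if none).
IdleAction onIdleServerRead(int bytesRead, TlsStatus status) {
    if (bytesRead > 0)
        return IdleAction::Abort;  // (a) unexpected application data
    switch (status) {
    case TlsStatus::WantRead:
        return IdleAction::Ignore; // (c) only TLS protocol records arrived
    case TlsStatus::ZeroReturn:
        return IdleAction::Close;  // (b) server sent a TLS shutdown
    default:
        return IdleAction::Close;  // (b) TLS error
    }
}

// A TLS-only record on an idle connection no longer closes it.
void exampleUsage() {
    assert(onIdleServerRead(0, TlsStatus::WantRead) == IdleAction::Ignore);
}
```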
Amos Jeffries [Sun, 8 Nov 2015 15:09:16 +0000 (07:09 -0800)]
Fix compile error on clang undefined reference to '__atomic_load_8'
Later versions of GCC on some architectures push atomic functions
out into a separate atomic library. Older versions of clang do not
handle that automatically and require the library to be linked
explicitly.
Add a check for when this is required and set ATOMICLIB if needed.
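A sketch of the kind of code such a check could try to link, first without
and then with the atomic library. The actual configure test is not shown in
this log; roundTrip() and probe() are hypothetical names.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Round-trips a value through a 64-bit atomic. On targets without native
// 64-bit atomics the compiler may emit library calls such as
// __atomic_load_8, which is where the link failure would show up.
std::uint64_t roundTrip(std::uint64_t v) {
    std::atomic<std::uint64_t> cell(0);
    cell.store(v);
    return cell.load();
}

// What the main() of a link probe might run.
void probe() {
    assert(roundTrip(0x1122334455667788ULL) == 0x1122334455667788ULL);
}
```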
Amos Jeffries [Sat, 7 Nov 2015 12:08:33 +0000 (04:08 -0800)]
Split core Server operations from ConnStateData
This improves the servers/libservers.la class hierarchy in
preparation for HTTP/2 and other non-HTTP/1.1 protocol support.
The basic I/O functionality of ConnStateData is moved to Server
class and a set of virtual methods designed to allow for child
class implementation of data processing operations.
No logic is changed in this patch, just symbol renaming and
moving of method logics as-is into libservers.la
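The split described above, with basic I/O in a base class and data
processing in virtual methods, could be pictured like this. It is a toy
model: receive(), handleReadData(), and LineServer are names assumed for
illustration and do not describe the real Server interface.

```cpp
#include <cassert>
#include <string>

// Hypothetical base class: owns the input buffer and the read loop.
class Server {
public:
    virtual ~Server() = default;

    // Called when bytes arrive; buffers them and lets the child parse.
    void receive(const std::string &bytes) {
        inBuf_ += bytes;
        while (!inBuf_.empty() && handleReadData()) {
            // keep going while the child makes progress
        }
    }

protected:
    // Child consumes from inBuf_; returns true if it made progress.
    virtual bool handleReadData() = 0;
    std::string inBuf_;
};

// Toy child: treats each LF-terminated line as one message.
class LineServer : public Server {
public:
    int messages = 0;

protected:
    bool handleReadData() override {
        const std::string::size_type eol = inBuf_.find('\n');
        if (eol == std::string::npos)
            return false; // need more data
        inBuf_.erase(0, eol + 1);
        ++messages;
        return true;
    }
};

// A partial message stays buffered until the rest arrives.
void exampleUsage() {
    LineServer s;
    s.receive("GET /\nGE");
    assert(s.messages == 1);
}
```

A child for another protocol would only replace handleReadData(); the
buffering loop stays in the base class.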
The autoconf check for SQUID_SSLGETCERTIFICATE_BUGGY fails on ssl library
builds which don't include SSLv3; as a result of the autoconf decision
this can end up triggering the assert(0) in Ssl::verifySslCertificate()
in ssl/support.cc (line 1712 in 3.5.11).
Allow unlimited LDAP search filter for ext_ldap_group_acl helper.
The LDAP search filter in ext_ldap_group_acl is limited to 256 characters.
In some environments the user DN or group filter can be larger than this
limitation.
This patch uses dynamically allocated buffers for LDAP search filters.
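To illustrate the idea of removing the fixed limit, here is a sketch that
builds a filter of any length. It uses std::string rather than the
hand-managed buffers the patch mentions, and buildFilter() with its %u and
%g placeholders is an assumption for illustration, not the helper's code.
It does no LDAP escaping of the substituted values.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Expands %u (user) and %g (group) in a filter template. The result
// grows as needed, so long DNs are not truncated at 256 characters.
std::string buildFilter(const std::string &tmpl,
                        const std::string &user,
                        const std::string &group) {
    std::string out;
    out.reserve(tmpl.size() + user.size() + group.size());
    for (std::size_t i = 0; i < tmpl.size(); ++i) {
        if (tmpl[i] == '%' && i + 1 < tmpl.size()) {
            const char code = tmpl[i + 1];
            if (code == 'u') { out += user; ++i; continue; }
            if (code == 'g') { out += group; ++i; continue; }
        }
        out += tmpl[i]; // ordinary character, or an unknown % code
    }
    return out;
}

// A 300-character DN yields a filter longer than the old limit.
void exampleUsage() {
    const std::string dn(300, 'x');
    assert(buildFilter("(member=%u)", dn, "").size() > 256);
}
```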