Alex Rousskov [Fri, 10 Sep 2010 20:56:24 +0000 (14:56 -0600)]
Compliance: Forward 1xx control messages to clients that support them.
Forward 1xx control messages to all HTTP/1.1 clients and to HTTP/1.0
clients that sent an Expect: 100-continue header, unless the 1xx message
fails the optional http_reply_access check described below.
RFC 2616 requires clients to accept 1xx control messages, even if they
did not send Expect headers. However, 1xx control messages prohibited by
http_reply_access are ignored and not forwarded. This can be used, for
example, to protect broken HTTP/1.1 clients or naive HTTP/1.0 clients
that unknowingly forward Expect: 100-continue headers. Only fast checks
are supported at this time.
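For illustration, a minimal squid.conf sketch of such a check; the ACL
names and the subnet are hypothetical, and this assumes the standard src
and http_status ACL types (both fast checks):

    # Hypothetical example: hide 100 (Continue) messages from clients
    # on a subnet known to contain broken HTTP/1.1 implementations.
    acl fragile_clients src 10.0.1.0/24
    acl continue_1xx http_status 100
    http_reply_access deny continue_1xx fragile_clients
    http_reply_access allow all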
The patch removes the ignore_expect_100 squid.conf option and the
corresponding code because
- the reasons to treat 100-continue specially have changed now that we
can forward 1xx responses;
- rejection of 100-continue requests can still be done using a
combination of the existing http_access and deny_info options;
- hiding the Expect: 100-continue header from next hops can still be
done using the existing request_header_access option;
- 100 (Continue) responses can be hidden from clients using the
http_reply_access check described above.
We still respond with 417 Expectation Failed to requests with
expectations other than 100-continue.
Implementation notes:
We forward control messages one at a time and stop processing the server
response while the 1xx message is being written to the client, to avoid
server-driven DoS attacks using a large number of 1xx messages.
1xx forwarding is done via async calls from HttpStateData to
ConnStateData/ClientSocketContext. The latter then calls back to notify
HttpStateData that the message was written out to the client. If either
of the two async calls does not fire, HttpStateData will get stuck unless
it is destroyed due to an external event or error. The code assumes such
an event or error will always happen, because when
ConnStateData/ClientSocketContext is gone, the HttpStateData job should
be terminated. This requires more testing and thought, but should still
be better than not forwarding 1xx messages at all.
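A standalone sketch of this handshake, with plain function objects
standing in for Squid's async calls (the class and method names here are
illustrative, not the actual Squid API):

    #include <functional>
    #include <iostream>
    #include <string>

    // Stand-in for ConnStateData/ClientSocketContext.
    struct ClientSide {
        // Write one 1xx message, then notify the server side. In Squid,
        // both this call and the notification are async calls.
        void sendControlMsg(const std::string &msg,
                            std::function<void()> wroteCb) {
            std::cout << "to client: " << msg;
            wroteCb();
        }
    };

    // Stand-in for HttpStateData.
    struct ServerSide {
        ClientSide *client;
        bool forwarding1xx = false;

        void handle1xx(const std::string &msg) {
            forwarding1xx = true; // stop processing the server response
            client->sendControlMsg(msg, [this] { wroteControlMsg(); });
        }
        void wroteControlMsg() {
            forwarding1xx = false; // safe to resume reading the response
        }
        bool mayReadResponse() const { return !forwarding1xx; }
    };

    int main() {
        ClientSide client;
        ServerSide server{&client};
        server.handle1xx("HTTP/1.1 100 Continue\r\n\r\n");
        std::cout << "may read response: " << server.mayReadResponse() << "\n";
    }

If the notification callback never fires, forwarding1xx stays set and the
server side stays stuck, which mirrors the caveat above.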
Alex Rousskov [Fri, 10 Sep 2010 20:06:41 +0000 (14:06 -0600)]
Dechunk incoming requests as they come in and chunk outgoing requests.
The server side chunks the request if and only if the original request
was chunked. No next-hop version checks are performed.
* Client-side changes:
Removed clientIsRequestBodyValid() as unused. It was called with a
content-length>0 precondition that made the function always return true.
Removed the old dechunking hack that tried to buffer the entire request
body while pretending that we were still reading the headers. Adjusted
related code. More work may be needed to identify client-side code that
assumes the request size is always known.
Removed ConnStateData::bodySizeLeft() because we do not always know how
much body is left to read -- chunked requests do not have a known size
until we read the last-chunk. Moreover, it was possibly misused, because
sometimes we want to know whether to comm_read more body bytes, and
sometimes we want to know whether to "produce" more body bytes (i.e.,
copy already-read bytes into the BodyPipe buffer, which can get full).
Added ConnStateData::mayNeedToReadMoreBody() to replace
conn->bodySizeLeft() with something more usable and precise; the
distinction is sketched below.
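A sketch of the two distinct questions the old method conflated (names
and parameters are illustrative, not the actual ConnStateData interface):

    #include <cstddef>
    #include <cstdint>

    // Do we need to comm_read more body bytes from the client?
    bool mayNeedToReadMoreBody(bool chunked, bool sawLastChunk,
                               int64_t contentLength, int64_t bodyBytesRead) {
        if (chunked)
            return !sawLastChunk;      // size unknown until the last-chunk
        return bodyBytesRead < contentLength; // known-size body
    }

    // Can we "produce" more body bytes, i.e., copy already-read bytes
    // into the BodyPipe buffer (which can get full)?
    bool canProduceMoreBody(size_t bufferedBodyBytes, size_t pipeSpaceLeft) {
        return bufferedBodyBytes > 0 && pipeSpaceLeft > 0;
    }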
XXX: If there is a chunk parsing error, the new code just resets the
connection. I tried to find a way to send an error page to the client but
failed to do so. It is easy to do while the headers and the body prefix
are being parsed, but if the error is sent later, the server side may
start sending us its response, and the two responses would clash, causing
assertions. I do not know how to fully avoid that. Perhaps we need to
unregister from Store before responding with an error? Search for
WE_KNOW_HOW_TO_SEND_ERRORS.
Tried to break deep recursion/iteration around clientParseRequest. When
the chunked parser fails during message prefix parsing, the rest of the
code may decide that the connection is no longer in use (and that there
is no pending transaction, even though the currentobject member is not
NULL!) and start parsing the second request. If that second parse also
fails, there will be two concurrent errors to send to the client, and the
client-side code cannot handle that. However, due to the XXX above, we
never send an error when the chunked parser fails, making most of the
related code polishing useless, at least for now.
Removed my wrong XXX related to closing after initiateClose.
Removed my(?) XXX related to endless chunked requests. There is nothing special
about them, I guess, as a non-chunked request can be virtually endless as
well if it has a huge Content-Length value.
Use commIsHalfClosed() instead of fd_table[fd].flags.socket_eof for
consistency with other client-side code and to improve readability. I think
these should return the same value in our context but I am not sure.
Correctly handle identity encoding. TODO: HTTPbis dropped it. Should we?
Polished request_header_max_size handling code, converting old
connKeepReadingIncompleteRequest and connCancelIncompleteRequests functions
into a new ConnStateData::checkHeaderLimits() method.
* Server-side changes:
Separated "received whole request body" state from "sent whole request
body". When we chunk requests, we need to add last-chunk. Thus, we may
receive (and written) the whole raw body but still need to write
last-chunk. This is not trivial because we should not write last-chunk
if the body producer aborted. XXX: check all pipe->exhausted() callers
to make sure all code has been adjusted.
Added a getMoreRequestBody() virtual method that Server uses to get
encoded body bytes from its kids. FTP does not encode and uses the
default implementation.
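For reference, the wire format such an encoding kid must produce is plain
HTTP chunked coding; a minimal sketch (not the actual
getMoreRequestBody() code):

    #include <cstdio>
    #include <string>

    // Encode one piece of raw body data as an HTTP chunk:
    // hex size, CRLF, data, CRLF.
    std::string encodeChunk(const std::string &raw) {
        char sizeLine[32];
        std::snprintf(sizeLine, sizeof(sizeLine), "%zx\r\n", raw.size());
        return sizeLine + raw + "\r\n";
    }

    // The last-chunk must be written even after the whole raw body was
    // received and written -- but not if the body producer aborted.
    const std::string lastChunk = "0\r\n\r\n";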
Fixed HTTP/FTP doneSendingRequestBody() to call its parent. I am not
sure it helps with correctly processing transactions, but the parent
method was designed to be called, and calling it makes the transaction
state clearer.
Moved "broken POSTS" handling code into its own method and polished it
(HttpStateData::finishingBrokenPost). We now skip the "broken POSTS" fix
if the request is chunked.
Resolved old XXX: HttpStateData::handleRequestBodyProducerAborted() was
indeed doing nothing useful despite all the pretense. Now it aborts the
transaction.
Author: Christian Wittmer <chris@computersalat.de>
Add --with-swapdir=PATH to override the default /var/cache/squid.
For use on systems where /var/cache is not the desired placement.
Adds the 'squid' part to the default path in accordance with the UNIX
standard filesystem layout definition. In squid-3 this path is no longer
used for creating an implicit active cache; only documentation and
potential core dumps now use the variable it sets.
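Example usage (the path shown is just an illustration):

    ./configure --with-swapdir=/srv/squid/cache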
Alex Rousskov [Mon, 6 Sep 2010 00:05:09 +0000 (18:05 -0600)]
Added a producedSize() method to report the total amount of data put in
by the producer. It is often very convenient to know this size, but one
must be careful not to use its value for the body size, the available
body size, or in exhaustion-calculating expressions.
Alex Rousskov [Sun, 5 Sep 2010 23:34:08 +0000 (17:34 -0600)]
Bug #2311 fix: crashes with ICAP RESPMOD for HTTP body size greater than 100kb
BodyPipe::undoCheckOut() must not assert that undo is possible, because
it is not (currently) possible if the pipe buffer was modified.
BodyPipe::undoCheckOut() must not throw if undo is not possible, because
it is often called when an exception has already been thrown and because
it is called from the BodyPipeCheckout destructor, and destructors should
not throw (this case illustrates one reason why they should not).
Currently, we only use an implicit undo, and only when an exception
is being thrown while the buffer is checked out.
Currently, the code that does checkout is probably safe because it should
terminate the transaction if a parser throws. However, this is not 100%
guaranteed, and the situation may change without us noticing.
TODO: consider implementing the long-term solution discussed at
http://www.mail-archive.com/squid-dev@squid-cache.org/msg07910.html
COW-buffers may help here as well.
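A generic sketch of the destructor constraint involved (illustrative,
not the real BodyPipeCheckout class):

    // The destructor may run while another exception is propagating, so
    // the undo path must neither throw nor assert.
    class Checkout {
    public:
        ~Checkout() { tryUndo(); }
        void commit() noexcept { done = true; } // normal, successful path
    private:
        void tryUndo() noexcept {
            if (done)
                return;
            // If the checked-out buffer was modified, undo is impossible;
            // give up quietly instead of asserting or throwing.
            done = true;
        }
        bool done = false;
    };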
An upgrade/fix to handling HTTP request-lines as specified by
section 5.1 of the RFCs. Specifically, to handle a sequence of
unknown bytes up to a terminating LF (\n) octet.
* The semantics as previously documented are retained. No changes
there, but the documentation is clarified a bit. Some inputs that
previously did not produce errors now do. The external code impact is
a reduction in the special cases to be handled, specifically raw-CR
weirdness in the request-line fields. Raw CR occurring in a URL was a
vulnerability at least once before.
* Prior updates to the HttpParser object for other parse stages open
the possibility of this parse action returning an HTTP status code
directly. Additions are made to use this (with the existing status
codes only).
* Input permutations where the unit-tests showed the old parser was
violating its own documentation have been fixed to produce expected
outputs.
* The old parser ran three distinct, potentially long parse loops.
Added several local variables to remember various octets seen while
searching for the terminal LF (see the sketch after this list). This
removed the need for two of the parse re-scans (length of method,
length of URI).
* relaxed_header_parser will enable it to also skip prefix whitespace
(space character only) and multiple-\r sequences at the end of the line.
* --enable-http-violations is still required before it will accept
non-HTTP version types 'downgraded' to HTTP/0.9.
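A sketch of the single-pass scan idea from the list above (the struct
and offsets are illustrative, not the actual HttpParser members):

    // Remember interesting offsets while searching for the terminating
    // LF, instead of re-scanning for method and URI lengths later.
    struct RequestLineOffsets {
        int methodEnd = -1; // first SP after the method
        int uriStart = -1;
        int uriEnd = -1;
        int lineEnd = -1;   // position of the LF octet
    };

    RequestLineOffsets scanRequestLine(const char *buf, int len) {
        RequestLineOffsets o;
        for (int i = 0; i < len; ++i) {
            const char c = buf[i];
            if (c == '\n') {
                o.lineEnd = i;
                break;
            }
            if (c == ' ') {
                if (o.methodEnd < 0) {
                    o.methodEnd = i;
                    o.uriStart = i + 1;
                } else if (o.uriEnd < 0) {
                    o.uriEnd = i;
                }
            }
        }
        return o;
    }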
Alex Rousskov [Wed, 1 Sep 2010 03:44:05 +0000 (21:44 -0600)]
Check for NULL and empty strings before calling str*cmp().
These checks are necessary to ensure consistent comparison results (important
for sorting and searching) and to avoid segfaults on NULL buffers (because
termedBuf() may return NULL instead of the expected "0-terminated buffer").
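A minimal sketch of the kind of guard this implies (the wrapper name is
illustrative):

    #include <cstring>

    // NULL-safe comparison: termedBuf() may return NULL instead of a
    // 0-terminated buffer, so guard before calling strcmp().
    int safeStrCmp(const char *a, const char *b) {
        if (a == b)
            return 0;  // covers two NULLs (and the same buffer)
        if (!a)
            return -1; // order NULLs before any string, consistently
        if (!b)
            return 1;
        return strcmp(a, b);
    }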
Alex Rousskov [Wed, 1 Sep 2010 00:08:35 +0000 (18:08 -0600)]
Possible bug 3020 fix: Segmentation fault: nameservers[vc->ns].vc = NULL
Prevent idnsVCClosed segfaults during shutdown or reconfiguration (at least).
idnsShutdown() schedules comm_close and then frees nameservers[] by
calling idnsFreeNameservers. The closing handler tried to access the
freed nameservers[] array. The patch prevents access to the freed
nameservers[] array in idnsVCClosed and other functions.
TODO: Nameservers[] array management should be rewritten. The array
should not be freed while there are nameservers using it. It should be
freed when the last entry is gone.
Removed "double CR" check from parseHttpRequest() for several reasons:
1) The check was most likely introduced as a short-term defense
against "HTTP request smuggling" attacks identified in an
influential 2004 paper. The paper documented certain
vulnerabilities related to requests with "double CR" sequences, and
Squid was quickly hacked to prohibit such requests as
malformed. However, a more careful reading of the paper indicates
that only LF CR CR LF (a.k.a. "CR header") sequences were
identified as dangerous (note the leading LF). The quick fix was
too aggressive and blocked _all_ requests with CR CR LF sequences,
including benign requests.
2) The check duplicated a HttpHeader::parse() check.
3) The check was slower than the code it duplicated.
Improved "double CR" handling in HttpHeader::parse() to detect
potentially dangerous "empty headers", that is header fields that
contain nothing but CR character(s). Requests with such headers are
rejected as malformed. We used to reject similar requests (and more)
in parseHttpRequest() as described above.
After the change, potentially malicious requests with CR+ headers are
still denied. Other, benign headers ending with CRs are now allowed.
If the HTTP header parser is not "relaxed", benign and valid requests
with extra CR characters are blocked as before.
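A sketch of the "empty header" test described above (illustrative, not
the actual HttpHeader::parse() code):

    #include <string>

    // A header field consisting of nothing but CR octets is the
    // dangerous case; CRs at the end of an otherwise normal field
    // are benign.
    bool isCrOnlyField(const std::string &field) {
        return !field.empty() &&
               field.find_first_not_of('\r') == std::string::npos;
    }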
Alex Rousskov [Tue, 31 Aug 2010 23:50:57 +0000 (17:50 -0600)]
Compliance: Ignore unused chunk-extensions to correctly handle large ones.
The chunk parser did not advance until it had a complete
chunk-extension. HttpStateData::maybeReadVirginBody() does not grow the
buffer if there is no space available for the [chunked] body, so a
transaction with a large chunk-extension would stall. The default
HttpStateData input buffer size is just 2KB, so it does not take a "very
large" extension to stall the transaction.
Somewhat ironically, we are not even interested in the HTTP chunk-extension
itself. After the change, Squid skips the chunk-extension data as soon as it
gets it (except for the last-chunk, see below). Incrementally ignoring data
requires handling quoted strings correctly, to avoid mis-detecting a quoted
CRLF. Thus, we now preserve the quoted string parsing state in
ChunkedCodingParser.
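A sketch of incremental skipping with preserved quoted-string state
(illustrative, not the actual ChunkedCodingParser):

    #include <cstddef>

    // Skip chunk-extension bytes as they arrive. The in-quote state must
    // survive across calls, or a CRLF inside a quoted string would be
    // mistaken for the end of the extension line.
    struct ExtSkipper {
        bool inQuote = false;
        bool inEscape = false;

        // Consume up to len bytes; set done when the extension line ends
        // (LF outside quotes). Returns the number of bytes consumed.
        size_t skip(const char *buf, size_t len, bool &done) {
            done = false;
            for (size_t i = 0; i < len; ++i) {
                const char c = buf[i];
                if (inEscape) {
                    inEscape = false;
                } else if (inQuote) {
                    if (c == '\\')
                        inEscape = true;
                    else if (c == '"')
                        inQuote = false;
                } else if (c == '"') {
                    inQuote = true;
                } else if (c == '\n') {
                    done = true;
                    return i + 1;
                }
            }
            return len; // consumed everything; state kept for next call
        }
    };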
The last-chunk chunk-extension is useful for ICAP, so we parse it
instead of ignoring it. This parsing is done as before and may still lead
to connection hanging, but a proper fix is outside this patch's scope.
Similarly, other stalling reasons are outside this patch's scope.
Co-Advisor test case:
test_case/rfc2616/chunked-1p0-longValExt-16385-toClt
Alex Rousskov [Tue, 31 Aug 2010 23:46:24 +0000 (17:46 -0600)]
Compliance: handle HTTP OPTIONS and TRACE requests with asterisk URIs.
Handle '*' URIs in urlParse(). This allows Squid to respond properly to
OPTIONS and TRACE requests with '*' URIs and a Max-Forwards value of
zero. Forwarding such requests is outside this change's scope and still
does not work because the upstream host and port are not set.
Co-Advisor test cases:
test_case/rfc2616/options-bodyless-asterisk
test_case/rfc2616/maxForwardsZero-OPTIONS-asterisk
test_case/rfc2616/maxForwardsZero-TRACE-asterisk
Alex Rousskov [Tue, 31 Aug 2010 23:34:10 +0000 (17:34 -0600)]
Compliance: reply with 400 (Bad Request) if request header is too big.
Reply with a standard 400 (Bad Request) instead of 601 (Unknown) status in
case of an ERR_TOO_BIG error. HTTP does not have a dedicated code for the
too-big header error. There is 414 (Request-URI Too Long), but Squid does not
distinguish too-large headers from too-large URIs.
Co-Advisor test case: test_case/rfc2616/longUri-65536
Alex Rousskov [Mon, 30 Aug 2010 22:55:02 +0000 (16:55 -0600)]
Added safe HttpMsg pointer wrapper that locks and unlocks the message.
This class will not be needed if we switch to refcounting HttpMsg.
Meanwhile, it lets us pass message pointers in AsyncCalls and avoid
auto_ptr<> in exception-sensitive code.
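A sketch of the wrapper shape, assuming lock()/unlock() members on the
message class (illustrative, not the actual wrapper):

    // Locks the message on construction and copy, unlocks on
    // destruction, so the unlock happens even during stack unwinding.
    template <class Msg>
    class MsgPointer {
    public:
        explicit MsgPointer(Msg *m = nullptr) : msg(m) { if (msg) msg->lock(); }
        MsgPointer(const MsgPointer &p) : msg(p.msg) { if (msg) msg->lock(); }
        ~MsgPointer() { if (msg) msg->unlock(); }
        MsgPointer &operator=(const MsgPointer &) = delete; // keep sketch small
        Msg *operator->() const { return msg; }
        Msg *get() const { return msg; }
    private:
        Msg *msg;
    };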
Amos Jeffries [Mon, 30 Aug 2010 03:41:04 +0000 (21:41 -0600)]
Remove LTDL hack from bootstrap.
The correct use of -I paths to let Squid build on systems with older
libtool versions is now confirmed and tested. The hack that forced local
headers to be used is no longer required.
(In reply to comment #5)
> Unfortunately GPLv2+ and GPLv3 do not mix entirely well, unless we consider
> bundling the helper mere aggregation and not integrated.
>
> also, why can't you use the LDAP for matching the attribute instead of having
> to retrieve all possible users and then compare the networkAddress "manually"?
I'll swap out the GPLv3 for GPLv2, no biggie. All I've done so far is add the
files and clauses. I'm just checking whether there is something I am
missing for that.
Alex Rousskov [Sun, 29 Aug 2010 21:50:09 +0000 (15:50 -0600)]
Made eCAP compile again after CbcPointer<> changes.
Old eCAP code tried to call stillProducing(this) and stillConsuming(this)
methods from a const status() method. Doing so produces compiler errors
because stillProducing() and stillConsuming() do not accept pointers to
constant jobs.
CBDATA_CLASSes and, hence, CbcPointer<>, do not support const-correctness
well: In order to create/destroy a cbdata-based smart pointer, one has to
lock/unlock cbdata, which requires modifying the object. Thus, the smart
pointer cannot point to a truly constant object. The core of the problem is
that CBDATA_CLASSes keep cbdata and object data together. When all raw/dumb
CBDATA_CLASS pointers are gone, we can separate the two "datas" and solve the
const-correctness problem for good. The "separate-datas" design would even be
consistent with the original cbdata design which we often violate, IMO.
There are other workarounds. We could declare toCbdata() constant, for
example. However, that would essentially disable checks where a
cbdata-protected object is being destroyed despite the caller's intent to keep
the object constant. This change is not as general but is also much less
intrusive.
Amos Jeffries [Sat, 28 Aug 2010 07:31:03 +0000 (19:31 +1200)]
Author: Stefan Fritsch <sf@sfritsch.de>
Bug 2872: leaking file descriptors
As I understand it, the leak happens this way: a client request kicks off
an asynchronous file open request. If the client request is aborted and
disappears before the file open has completed, the file is never closed.
This explains why this leak can only happen with aufs and not with ufs.
Alex Rousskov [Tue, 24 Aug 2010 22:05:52 +0000 (16:05 -0600)]
Possible fix for an OpenSolaris "pureparm" compilation error.
Our CallJobHere1() macro tried to help the compiler determine the right
JobMemFun() profile by explicitly specifying the first JobMemFun()
template argument type. There is a second template argument in the
CallJobHere1 case, but we cannot specify it explicitly. Apparently, the
OpenSolaris compiler got confused, perhaps justifiably so, by having only
one template argument specified.
This change removes the explicit first template parameter from the JobMemFun()
call inside the CallJobHere1(). It does not seem to be required for GCC on
RHEL5. Tests will show what other compilers think.
Henrik Nordstrom [Tue, 24 Aug 2010 21:04:22 +0000 (23:04 +0200)]
Strengthen the internal DNS client somewhat by making sure to keep
the receive queue drained. Also avoid parsing messages unless we
have a pending query.
Alex Rousskov [Tue, 24 Aug 2010 04:18:51 +0000 (22:18 -0600)]
Compliance: remove Content-Length header if Transfer-Encoding is present.
If after HTTP header parsing we have both "Transfer-Encoding: chunked"
and Content-Length headers, remove the Content-Length entry. The
adjusted behavior follows httpbis recommendations (ticket #95, part 2).
The old client-side code forwarded the original Content-Length header,
which did not match the [dechunked] response, resulting in a malformed
response.
An HttpHeader::chunked() method was added to check whether the HTTP
headers contain a chunked Transfer-Encoding header. This method is now
used in code that checks for chunked encoding.
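The rule itself is simple; a sketch using a plain map in place of the
real header storage (and ignoring header-name case for brevity):

    #include <map>
    #include <string>

    // httpbis recommendation (ticket #95, part 2): when both
    // Transfer-Encoding: chunked and Content-Length are present,
    // drop Content-Length.
    void sanitizeFraming(std::map<std::string, std::string> &headers) {
        const auto te = headers.find("Transfer-Encoding");
        if (te != headers.end() &&
            te->second.find("chunked") != std::string::npos)
            headers.erase("Content-Length");
    }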
Co-Advisor test cases: test_case/rfc2616/chunked-1p0-badClen-toClt
test_case/rfc2616/chunked-1p1-badClen-toClt
Alex Rousskov [Tue, 24 Aug 2010 04:07:00 +0000 (22:07 -0600)]
Compliance: respond to OPTIONS requests with a zero Max-Forwards value.
RFC 2616 section 9.2 says that a proxy MUST NOT forward requests with a
zero Max-Forwards value. RFC 2616 does not define any proper OPTIONS
responses, so we consider successful responses optional and reply with
501 Not Implemented.
No change in handling OPTIONS requests with positive Max-Forwards values.
While TRACE and OPTIONS are similar with regard to Max-Forwards, we
handle them in different places because OPTIONS responses do not need to
echo the request via Store.
Co-Advisor test case: test_case/rfc2616/maxForwardsZero-OPTIONS-absolute
Alex Rousskov [Tue, 24 Aug 2010 00:02:15 +0000 (18:02 -0600)]
Send chunked responses if body size is unknown.
Apply HTTP chunked transfer encoding to the response body sent to the
client if all of the following conditions are met (sketched in code
below):
* client claims HTTP version 1.1 or later support
* response does not have a Content-Length header already
* response does not use multipart/byteranges encoding
* connection is persistent
If we decide to send a chunked reply, the chunked_reply flag is set.
Chunked encoding is done in ClientSocketContext::packChunk(). The
last-chunk is sent only when the clientReplyContext complete flag is set.
This change helps keep client-side connections persistent.
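A sketch of the decision predicate combining the four conditions above
(parameter names are stand-ins for the actual reply and connection
state):

    bool shouldChunkReply(int clientMajor, int clientMinor,
                          bool haveContentLength, bool multipartByteranges,
                          bool connectionPersistent) {
        const bool http11Client =
            clientMajor > 1 || (clientMajor == 1 && clientMinor >= 1);
        return http11Client && !haveContentLength &&
               !multipartByteranges && connectionPersistent;
    }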
Alex Rousskov [Mon, 23 Aug 2010 23:15:26 +0000 (17:15 -0600)]
Bug #2583 fix: pure virtual method called
When a cbdata-protected class holds its own cbdata and has a virtual
toCbdata(), there is a catch-22: we need the cbdata to know whether the
pointer to the class object is valid, and we need to dereference that
pointer to get the cbdata.
Added CbcPointer class to hold both a pointer to a potentially freed
class object and the cbdata pointer protecting that object. Keeping the
cbdata pointer allows us to test whether the object is still there
without dereferencing the object pointer.
Use the CbcPointer class to hold safe pointers to AsyncJobs. This
prevents "pure virtual method called" failures because we no longer
dereference freed job pointers.
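The shape of the idea, in a self-contained sketch (in Squid the validity
flag lives in the cbdata lock; this is not the actual template):

    // Keep the (possibly freed) object pointer together with a separate
    // liveness flag, so callers can test validity without dereferencing
    // the object pointer.
    template <class Job>
    class JobPointer {
    public:
        JobPointer(Job *obj, const bool *aliveFlag)
            : raw(obj), alive(aliveFlag) {}
        bool valid() const { return alive && *alive; }
        Job *get() const { return valid() ? raw : nullptr; }
    private:
        Job *raw;          // may dangle; only returned when valid()
        const bool *alive; // outlives the job; cleared when the job dies
    };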
Removed the Initiator parameter from many initiatee constructors. The
Adaptation::Initiator::initiateAdaptation method now sets the initiator
of the job. This makes the constructor profiles simpler and removes the
need to propagate Initiator changes through all the [nested]
constructors.
Renamed AsyncJob::AsyncStart() to AsyncJob::Start(). I had to change the
callers code anyway and it was a good opportunity to remove the
redundant "Async".
Special thanks to Stefan Fritsch for updating and testing an earlier
version of this patch.
Amos Jeffries [Wed, 18 Aug 2010 23:43:22 +0000 (17:43 -0600)]
Author: Alex Rousskov <rousskov@measurement-factory.com>
Bug 3016: HTTP/1.1 compliance: default keep-alive for 1.0/1.1 clients.
a.k.a. the NTLM Authentication with Java UA + SSL problem.
Moved httpMsgIsPersistent(version, headers) to HttpMsg::persistent(void).
This move makes it clear that the logic applies only to the message being
examined and not to irrelevant information such as the HTTP version
supported by Squid.
Side-effects:
- In v3.2, Squid stops using persistent connections with HTTP/1.0 clients
that do not send "Connection: keep-alive".
- In v3.1, Squid starts using persistent connections with HTTP/1.1 clients
that do not send "Connection: close".
- HttpReply now sets the HttpMsg::http_ver member. It is not clear
whether that member was ever used for HttpReplies, though.
Fixed "ccb->active()" assertion related to the automatic write timeout.
We need to manually cancel writer's interest in select(2) before calling back
so that select() does not try to call the write handler when there is no
active callback anymore. Normally, select() cancels writer's interest when
calling the write handler, but in this case the call back is triggered not
from select() but from checkTimeouts().