Alex Rousskov [Thu, 25 Sep 2008 17:27:58 +0000 (11:27 -0600)]
Performance fix: Check half-closed descriptors at most once per second.
A few revisions back, comm checked half-closed descriptors once per second,
but the code was buggy. I replaced it with simpler code that checked each
half-closed descriptor whenever the OS would mark it as ready for reading.
That was a bad idea: The checks wasted a lot of CPU cycles because half-closed
descriptors are usually ready for reading all the time.
This revision resurrects the 1 check/sec limit, but hopefully with fewer bugs. In
my limited tests CPU usage seems to be back to normal.
All half-closed descriptors are now stored in TheHalfClosed set. When it is
time to check the corresponding connections, Comm schedules a read for
each descriptor that is not already reading. Conflicts with regular/user
reads are resolved as before -- we silently cancel the internal half-closed
read.
TODO: It is possible that we do not need to read at all and should call
getsockopt() instead to test the connection.
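The once-per-second check described above can be sketched roughly as follows.
This is only an illustration under stated assumptions, not the committed code:
HalfClosedMonitor, descriptorsToCheck, and the use of std::set are all
hypothetical stand-ins (the commit itself stores descriptors in TheHalfClosed).

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical sketch; the names below are not Squid's.
class HalfClosedMonitor
{
public:
    void add(int fd) { assert(fd >= 0); halfClosed_.insert(fd); }
    void del(int fd) { halfClosed_.erase(fd); }

    // Call from the main loop with the current time in seconds.
    // Returns the descriptors that need a monitoring read scheduled,
    // or nothing if less than one second passed since the last check.
    std::vector<int> descriptorsToCheck(double now, const std::set<int> &reading)
    {
        std::vector<int> due;
        if (checkedBefore_ && now - lastCheck_ < 1.0)
            return due;
        checkedBefore_ = true;
        lastCheck_ = now;
        for (const int fd : halfClosed_) {
            if (reading.count(fd) == 0) // skip descriptors that are already reading
                due.push_back(fd);
        }
        return due;
    }

private:
    std::set<int> halfClosed_;   // all half-closed descriptors
    double lastCheck_ = 0.0;     // time of the last completed check
    bool checkedBefore_ = false; // false until the first check runs
};
```

The point of the sketch is that the cost of monitoring no longer depends on how
often the OS reports the descriptor as ready for reading.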
Alex Rousskov [Thu, 25 Sep 2008 17:22:12 +0000 (11:22 -0600)]
Added a DescriptorSet class to manage an unordered collection of unique
descriptors.
DescriptorSet is used for half-closed descriptor monitoring. It might be
useful for deferred reads as well, but that remains to be seen.
DescriptorSet has O(1) complexity for search, insertion, and deletion. It uses
about 2*sizeof(int)*MaxFD bytes total. The splay tree that previously stored
half-closed descriptors uses less RAM for a small number of descriptors but
has O(log n) complexity. The same is true for std::set<int>, a potential
DescriptorSet replacement.
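One way to get O(1) search, insertion, and deletion with two int arrays of
MaxFD entries is the classic "sparse set" layout. The sketch below shows that
technique only; FdSet and its members are hypothetical names and the real
DescriptorSet may differ in detail.

```cpp
#include <cassert>
#include <vector>

// Hypothetical sketch, not Squid's actual DescriptorSet.
// Two int arrays of the same capacity give O(1) add/del/has:
//   descriptors_[0..size_) holds the stored descriptors in arbitrary order;
//   index_[fd] is the position of fd in descriptors_, or -1 if fd is absent.
class FdSet
{
public:
    explicit FdSet(int capacity):
        descriptors_(capacity, -1), index_(capacity, -1), size_(0) {}

    bool has(int fd) const { return valid(fd) && index_[fd] >= 0; }

    // Returns true if fd was added, false if it was already present.
    bool add(int fd)
    {
        assert(valid(fd));
        if (has(fd))
            return false;
        index_[fd] = size_;
        descriptors_[size_++] = fd;
        return true;
    }

    // Returns true if fd was removed, false if it was not stored.
    bool del(int fd)
    {
        if (!has(fd))
            return false;
        // Move the last stored descriptor into the freed slot.
        const int pos = index_[fd];
        const int last = descriptors_[--size_];
        descriptors_[pos] = last;
        index_[last] = pos;
        descriptors_[size_] = -1;
        index_[fd] = -1;
        return true;
    }

    int size() const { return size_; }

    // Unordered iteration over the stored descriptors.
    const int *begin() const { return descriptors_.data(); }
    const int *end() const { return descriptors_.data() + size_; }

private:
    bool valid(int fd) const
    {
        return fd >= 0 && fd < static_cast<int>(index_.size());
    }

    std::vector<int> descriptors_;
    std::vector<int> index_;
    int size_;
};
```

The memory cost is fixed at two ints per possible descriptor regardless of how
many are stored, which is the trade-off against a tree noted above.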
- Ability to send HTCP CLR requests when objects are invalidated or purged from
the cache.
- Config logic to allow the following:
- HTCP peers who ONLY receive CLR messages from us.
- HTCP peers who NEVER receive CLR messages from us.
- HTCP peers who NEVER receive CLR messages from us for PURGE requests.
- HTCP peers who are forwarded CLR messages we receive.
- Unterminated blocks in if () statements.
- Use of a struct to refer to an enum declared within the struct.
- Use of incorrect enum values after the originals were renamed.
- References to enum values from within the struct without the struct name.
Note that these changes have not been tested, but they do allow the tree to
build again.
Alex Rousskov [Tue, 23 Sep 2008 16:16:28 +0000 (10:16 -0600)]
Bug #2459 workaround: When dns_error_message value is lost, use "lost DNS
error" text and log at level 1 to inform the administrator about the internal
error.
This temporary hack does not fix the incorrect DNS error value problem, only
the lost one.
Alex Rousskov [Tue, 23 Sep 2008 15:05:36 +0000 (09:05 -0600)]
Do not call connect handler for closing descriptors because the handler
is unlikely to do something useful and is likely to hit Comm assertions
when working with a closing descriptor.
AFAIK, after adding close handlers to FtpStateData and peerProbe code,
all code that uses commConnectStart has a Comm close or I/O handler that
will be called when the descriptor is closing. This should prevent
connecting jobs from getting stuck waiting for the connection callback
to be called.
Alex Rousskov [Tue, 23 Sep 2008 14:49:50 +0000 (08:49 -0600)]
Added Comm close handler for the data channel of FtpStateData
transaction in preparation for officially dropping connect callbacks for
closing descriptors.
The data channel can be opened and closed a few times and the descriptor
must be kept in sync with the close handler. I factored out the
open/closing code into a simple FtpChannel class. That class is now used
for both FTP control and data channels.
The changes resolve one XXX discussion regarding FTP not having a close
handler for the data channel. On the other hand, adding a second close
handler attached to the same transaction is not a trivial change as the
side-effects of Squid cleanup code are often elusive.
For example, I suspect that FTP cleanup code does not close or even
check the control channel. I added a DBG_IMPORTANT statement to test
whether the control channel remains open. Or should that be an assert()?
I think that only one out of the two callbacks can be dialed because the
close handler executed first will invalidate the transaction object.
Bug 740: allow external acl's to use reply headers in format
Adds a small bit of token syntax to external_acl_type format.
%>{Header} HTTP request header
%>{Hdr:member}
HTTP request header list member
%>{Hdr:;member}
HTTP request header list member using ; as
list separator. ; can be any non-alphanumeric
character.
%<{Header} HTTP reply header
%<{Hdr:member}
HTTP reply header list member
%<{Hdr:;member}
HTTP reply header list member using ; as
list separator. ; can be any non-alphanumeric
character.
Basically the < and > are new following the existing meaning of their
direction in other tokens to match request/reply.
Old format of %{} is left as request header but with WARNING (1) level
noise at configure time indicating the new syntax.
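To make the token grammar above concrete, here is a rough sketch of how one
such token could be split into its parts. It is not the parser used by Squid:
HeaderToken and parseHeaderToken are hypothetical names, and the ',' default
separator is an assumption that the text above does not state.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical sketch of parsing one header token; not Squid's parser.
struct HeaderToken {
    bool valid = false;
    bool reply = false;      // true for %<{...}; false for %>{...} and %{...}
    bool oldSyntax = false;  // true for the old %{...} form
    std::string header;      // header name
    std::string member;      // list member; empty if the whole header is wanted
    char separator = ',';    // ASSUMED default list separator
};

HeaderToken parseHeaderToken(const std::string &tok)
{
    HeaderToken t;
    std::string::size_type i = 0;
    if (tok.size() < 4 || tok[i++] != '%') // shortest token is "%{H}"
        return t;
    if (tok[i] == '<' || tok[i] == '>')
        t.reply = (tok[i++] == '<');
    else
        t.oldSyntax = true;
    if (tok[i] != '{' || tok.back() != '}')
        return t;

    // Text between the braces: Header, Hdr:member, or Hdr:;member
    const std::string body = tok.substr(i + 1, tok.size() - i - 2);
    const std::string::size_type colon = body.find(':');
    t.header = body.substr(0, colon);
    if (colon != std::string::npos) {
        std::string rest = body.substr(colon + 1);
        // A leading non-alphanumeric character selects the list separator.
        if (!rest.empty() && !std::isalnum(static_cast<unsigned char>(rest[0]))) {
            t.separator = rest[0];
            rest.erase(0, 1);
        }
        if (rest.empty())
            return t; // "Hdr:" without a member is rejected in this sketch
        t.member = rest;
    }
    t.valid = !t.header.empty();
    return t;
}
```

With this sketch, "%<{Hdr:;member}" yields a reply-header token with header
"Hdr", member "member", and ';' as the separator, while "%{Host}" is accepted
as a request header and flagged as old syntax so a warning can be issued.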
Initial design was based on the false assumption that TPROXYv4 worked
like NAT lookups and returned the IPs on IP_TRANSPARENT.
It in fact returns the correct connection IPs on accept().
This patch makes TPROXYv4 work correctly and spoof client IP. Port needs
to be randomly assigned by the OS to prevent kernel clashes.
Regular traffic is no longer guaranteed when passed in a tproxy marked
port. It may work as expected but no guarantees yet.
Accelerated traffic and NAT intercepted traffic will certainly fail.
As such their flags are marked as mutually exclusive with the tproxy flag.
Multi-Modes will still operate, but only on separate ports.
Alex Rousskov [Mon, 22 Sep 2008 05:52:37 +0000 (23:52 -0600)]
Call failed(ERR_FTP_FAILURE, 0) when data channel is closed unexpectedly,
to force control channel closure. Apparently, FtpStateData does not close
that channel when cleaning up.
Alex Rousskov [Mon, 22 Sep 2008 05:14:39 +0000 (23:14 -0600)]
Added Comm close handler for the data channel of FtpStateData transaction in
preparation for officially dropping connect callbacks for closing descriptors.
The data channel can be opened and closed a few times and the descriptor must
be kept in sync with the close handler. I factored out the open/closing code
into a simple FtpChannel class. That class is now used for both FTP control
and data channels.
- Add some comments describing various function purposes.
- Remove some debugging debugs that had crept in.
- Use debugs() in preference to debug()().
- Adjust some debug levels.
Adds a "Content-Language" header properly if the error page language was
negotiated. The default templates are hard-coded as 'en'; when the soft
default language from error_default_language was used, its squid.conf value
is reported instead.
Sets "Vary: Accept-Language" if negotiation is configured to take place.
Alex Rousskov [Sun, 21 Sep 2008 05:08:44 +0000 (23:08 -0600)]
Added Comm close handler for peer probe to handle closing of a probe
descriptor while connect is pending. This was done in preparation for
officially dropping connect callbacks for closing descriptors.
I suspect that the old-code probe would get stuck if the descriptor were
closed during connect. On the other hand, nothing but a shutdown could
close that probe descriptor, I guess.
squid.conf cleanup: Modify several squid.conf defaults
Following the cleanup of squid.conf to a minimal config, this modifies the
remaining defaults to make their explicit configuration unnecessary.
icp_port was made a 0 default (for safety?),
but the port config line left uncommented. fixed that.
(most won't need it, those who do need to configure it anyway)
icp_access lines to allow local network now commented out,
background default 'deny all' untouched.
(ditto on above reason)
miss_access default moved from explicit configured, to
background default. Implicit absent default was documented
to be same as explicit config default anyway.
access_log config moved to a background default + documented.
rather than explicit config only.
cache_store_log moved to default none + commented out.
We've been recommending that for a while now anyway.
request_header_max_size boosted to 64KB from 20KB.
HTTP/1.1 needs big headers. I think that should be okay?
reply_header_max_size boosted to 64KB from 20KB.
HTTP/1.1 needs big headers. I think that should be okay?
cache_dir defaults to no disk cache, memory only cache.
maximum_object_size_in_memory - boosted to 512KB.
Update to at least 64KB was needed anyway to match modern web
traffic. Picked 512KB to maximize HIT with new default cache.
cache_mem boosted to 256 MB for caching at least 500 objects.
TODO Options remaining to consider for removal:
hierarchy_stoplist
coredump_dir
TODO all the default values probably still need to be checked.
Alex Rousskov [Sat, 20 Sep 2008 05:00:47 +0000 (23:00 -0600)]
Abort sendMoreData if our socket is being closed to avoid
comm.cc:2032: "!fd_table[fd].closing()" assertions in comm_write.
The assert happens when sendMoreData is triggered by an event on the other
(server) side. The server side does not know that the client side (or something
else) has decided to close the client socket. The close callback has not been
dialed yet, so the client state has not been invalidated and non-async
callbacks from Store can reach it.
A better fix for this would be to convert more store callbacks to AsyncCalls,
but that may have to wait for client_side rewrite.
Also assert that USE_ZPH_QOS code does not use a negative socket descriptor.
The "fd = conn != NULL ? conn->fd : -1" code seems to imply that our
descriptor may be negative. I do not know whether that can actually happen.
This fix was a part of the recent cleanup effort but got lost when I split
the changes into several MERGE requests :-(.
Alex Rousskov [Fri, 19 Sep 2008 04:12:32 +0000 (22:12 -0600)]
Resurrected "fde *F" in comm_close_start. I lost it when cleaning up
the code because it was only used inside USE_SSL. Need to build with
more things enabled...
* Requires 'tproxy' option be the only mode on a given port.
* Assumes all requests received there are TPROXY intercepted.
bind() errors may occur if external configuration passes normal
requests to the tproxy flagged Squid port.
* Spoofs client IP on all requests received at that port.
Based on new info, TPROXY once set on a port has to be assumed as always
set. There is nothing reasonably possible which Squid can do as a quick
lookup to retrieve the client's destination IP. BUT, the destination IP is
the one given on accept() in all these cases anyway.
This makes Squid handling code much simpler and faster, but also runs the
risk of breakage on non-tproxy requests to the port.
Finish forward-porting HTCP enhancements from squid 2.HEAD.
- Fix purgeEntriesByUrl call.
- Add logic needed to send HTCP CLR request.
- Add logic needed to send HTCP CLR requests to all configured neighbors.
- Wire in calls to send CLR requests to neighbors in appropriate places.
- Use a reference to the HttpRequestMethod rather than a pointer.
- Various cleanups.
formater.pl
Script to format source files with the astyle utility
NP: Squid code requires astyle version 1.22 or later
md5checker.sh
Script to validate the formater.pl script did not alter the code
syntax in any way. All non-whitespace alterations are reported
as bungled files for manual auditing.
Author: Christos Tsantilas <chtsanti@users.sourceforge.net>
Bug 2219: pconn status logic errors when getting chunks
The problem here is that chunk-decoding was wrongly using the eof flag to mark
the last chunk (look at method HttpStateData::decodeAndWriteReplyBody).
The eof flag is used in HttpStateData to mark that the connection has probably
closed, so another flag must be used to mark that the last chunk was received.
This patch uses the HttpStateData::lastChunk flag to mark that the last chunk
was received. Maybe the flag should be moved to the http_state_flags struct, or
a general "complete" flag should be used to mark that the whole body was received.
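The distinction between the two flags can be illustrated with a small sketch.
Only the eof/lastChunk split comes from the description above; ReplyFlags,
BodyState, and bodyState are hypothetical names, and the decision rules are a
simplified assumption rather than the actual HttpStateData logic.

```cpp
#include <cassert>

// Hypothetical names; only the eof/lastChunk split comes from the commit text.
struct ReplyFlags {
    bool chunked = false;    // reply body uses chunked transfer coding
    bool eof = false;        // server connection has (probably) closed
    bool lastChunk = false;  // decoder has seen the final chunk
};

enum class BodyState { needMore, doneReusable, doneNotReusable };

inline BodyState bodyState(const ReplyFlags &f)
{
    assert(f.chunked || !f.lastChunk); // lastChunk only makes sense when chunked
    if (f.eof)
        return BodyState::doneNotReusable; // nothing more can arrive; no pconn
    if (f.chunked && f.lastChunk)
        return BodyState::doneReusable;    // body complete, connection still open
    return BodyState::needMore;
}
```

Overloading eof to mean "last chunk seen" would make the second case look like
the first and would lose the persistent connection.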
Bug 2363: Install Error: comparison between signed and unsigned int
FreeBSD 7 now defines FD_SETSIZE as unsigned. However, squid needs to compare it
against signed values on occasion. This casts the squid internal FD limit macro
to ensure it is signed.
A better fix may be to audit the code and completely change all FD_SET*
handling codepaths to handle and pass values of a custom signedness
as determined by the OS at build time.
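A minimal sketch of the cast, assuming an unsigned platform limit. The macro
names PLATFORM_FD_SETSIZE and MY_MAXFD are hypothetical stand-ins for
FD_SETSIZE and the squid internal FD limit macro, which are not named above.

```cpp
#include <cassert>

// Hypothetical stand-ins: PLATFORM_FD_SETSIZE mimics an unsigned FD_SETSIZE;
// MY_MAXFD mimics the internal FD limit macro with the cast applied.
#define PLATFORM_FD_SETSIZE 1024U
#define MY_MAXFD (static_cast<int>(PLATFORM_FD_SETSIZE))

// Both operands are now int, so a negative fd stays negative instead of
// being converted to a huge unsigned value before the comparison.
inline bool belowLimit(int fd)
{
    return fd < MY_MAXFD;
}

inline bool fdInRange(int fd)
{
    return fd >= 0 && belowLimit(fd);
}

// Example guard before indexing a per-descriptor table.
inline int checkedFd(int fd)
{
    assert(fdInRange(fd));
    return fd;
}
```

Without the cast, a comparison such as fd < PLATFORM_FD_SETSIZE mixes signed
and unsigned operands, which is what the compiler complained about.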
prepareTransparentURL() was used in a strange way to catch requests received
on a transparent port but with NAT failures. Corrected the cases documentation
and added TPROXY flags to catch transparent failures.
Pass transparency flags properly between client and server sides. Include
the spoof client IP flag for TPROXY.
Alex Rousskov [Fri, 12 Sep 2008 03:57:47 +0000 (21:57 -0600)]
Aggregate commit after two --local commits:
- Cleaned up Comm: comm_close, comm_read_cancel, half-closed monitors, leaks.
- Cleaned up reconfiguration sequence.
Please see individual commit messages for details (bzr permitting).
Alex Rousskov [Fri, 12 Sep 2008 02:59:51 +0000 (20:59 -0600)]
Cleaned up reconfiguration sequence.
mainReconfigure() used to close and then open various sockets. Since
comm_close is now asynchronous, one cannot close and open in the same
function. Split mainReconfigure into mainReconfigureStart (that starts
the closing process for all relevant sockets) and mainReconfigureFinish
that opens the new sockets.
serverConnectionsClose is only used by main.cc and, hence, can be
static.
Polished comments and added an XXX comment on why SquidShutdown is
broken.
Also removed commCheckHalfClosed event scheduling. A separate cleanup
patch removes the associated half-closed monitoring loop.
Alex Rousskov [Fri, 12 Sep 2008 02:58:44 +0000 (20:58 -0600)]
Cleaned up Comm: comm_close, comm_read_cancel, half-closed monitors, leaks.
1) Comm_close now implements the following API:
Comm_close does not close the descriptor but initiates the following
closing sequence:
1) The descriptor is placed in a "closing" state.
2) The registered read, write, and accept callbacks (if any) are
scheduled (in an unspecified order).
3) The close callbacks are scheduled (in an unspecified order).
4) A call to the internal descriptor closing handler is
scheduled.
Details of the above steps are being documented separately and will
become a part of Comm API documentation.
Since all notifications are asynchronous, it is possible for a read or
write notification that was scheduled before comm_close was called to
arrive at its destination after comm_close was called. Such
notification will arrive with COMM_ERR_CLOSING flag even though that
flag was not set at the time of the I/O (and the I/O may have been
successful). CommIoCbParams::syncWithComm is used for this. The
credit for this trick goes to Christos Tsantilas.
Removed fde.flags.closing_ flag as unused.
2) Removed most of the half-closed monitoring code. Old code scheduled
monitoring reads every main loop iteration, I think. It is possible
that the assumption was that the handler will be activated and
cleared once per iteration so that the new read can be scheduled. The
design could result in conflicts between two monitoring reads and
possibly between a monitoring read and an active read. There were
also problems with handling closing descriptors.
I have removed the loop, AbortChecker, and the associated splay
tree. When user code marks the descriptor as half-closed, Comm now
simply schedules a monitoring read callback. If the user needs to
check whether the descriptor was marked, Comm checks whether the
callback is present. If a user schedules a read when there is
already a monitoring callback, the monitoring callback is removed.
Renamed user-facing monitoring functions but left compatibility
wrappers in place to minimize user code changes, for now.
It is possible that the whole half-closed monitoring code will be
eventually deleted. The above changes are meant to preserve the
intended functionality (but without coredumps) while the decision is
being made.
3) Removed _SQUID_LINUX_-only code that would avoid addrinfo destruction
on connect "errors". Squid seems to be working fine without this
code. With this code, we leak memory on many connect requests because
of EINPROGRESS. More work is probably needed to reproduce and fix the
true cause of the memory corruption observed earlier. Removing the
workaround will allow us to get more bug reports if the problem is
still there.
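The asynchronous closing sequence from item 1) above can be modelled with a toy
call queue. This is a sketch under stated assumptions, not Comm code: CallQueue,
Descriptor, IoParams, scheduleIo, and startClose are hypothetical names, and the
queue is FIFO only for simplicity (the description leaves the order within a
step unspecified).

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Toy model; all names are hypothetical and only mirror the description.
enum class IoFlag { ok, errClosing };

struct Descriptor {
    bool open = true;
    bool closing = false;
};

struct IoParams {
    IoFlag flag = IoFlag::ok;
    // Re-check the descriptor state right before the callback is dialed.
    void syncWithComm(const Descriptor &d)
    {
        if (d.closing)
            flag = IoFlag::errClosing;
    }
};

class CallQueue
{
public:
    void schedule(std::function<void()> call) { calls_.push_back(std::move(call)); }
    void fireAll()
    {
        while (!calls_.empty()) {
            std::function<void()> call = std::move(calls_.front());
            calls_.pop_front();
            call();
        }
    }

private:
    std::deque<std::function<void()>> calls_;
};

// Schedules an I/O notification; its flag is synced when the call is dialed.
void scheduleIo(CallQueue &q, Descriptor &d, std::vector<std::string> &log,
                const std::string &name)
{
    q.schedule([&d, &log, name] {
        IoParams params;
        params.syncWithComm(d);
        log.push_back(name + (params.flag == IoFlag::errClosing ? ":closing" : ":ok"));
    });
}

// Starts closing: nothing is closed until the queued calls are fired.
void startClose(CallQueue &q, Descriptor &d, std::vector<std::string> &log)
{
    assert(d.open && !d.closing);
    d.closing = true;                                          // step 1
    scheduleIo(q, d, log, "read");                             // step 2
    q.schedule([&log] { log.push_back("close-handler"); });    // step 3
    q.schedule([&d, &log] {                                    // step 4
        d.open = false;
        log.push_back("closed");
    });
}
```

In this model a read scheduled before startClose() is still delivered, but it
arrives marked as closing because the flag is synced at dial time.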
Alex Rousskov [Fri, 12 Sep 2008 02:52:50 +0000 (20:52 -0600)]
Cleaned up ConnStateData's closing and destruction.
1) Despite its name and the "if (open) close" use in ConnStateData
destructor, ConnStateData::close() was not closing anything. It was
called from the Comm close handler and from the destructor and would
attempt to immediately delete the ConnStateData object. Protecting code
in deleteThis() may have prevented the actual [double] delete from
happening, but it is difficult to say exactly what was going on when the
close() method was being called from the destructor.
I converted ConnStateData::close to swanSong, which is the standard
AsyncJob cleanup method. As before, the method does not close anything
(which may still be wrong). The swanSong method is never called directly
by the user code. It is called by lower layers just before the job is
destroyed. The updated close handler initiates job destruction by
calling deleteThis().
We may need to add Comm closing code to swanSong. For now, the updated
ConnStateData destructor will warn if ConnStateData forgot to close the
connection. The destructor will also warn if swanSong was not called,
which would mean that the job object is being deleted incorrectly.
2) Polished ClientSocketContext::writeComplete to distinguish
STREAM_UNPLANNED_COMPLETE from STREAM_FAILED closing state. This helps
when looking at stack traces.
3) Added an XXX comment about duplicated code.
4) Documented ClientSocketContext::initiateClose purpose and context.
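The swanSong arrangement from item 1) above can be sketched as follows. This is
an illustration only, not the AsyncJob API: Job, ConnJob, destroy, and the
warning texts are hypothetical, and a plain string log stands in for debugging
output.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical sketch of a job whose cleanup hook runs just before deletion.
class Job
{
public:
    explicit Job(std::vector<std::string> &log): log_(log) {}

    virtual ~Job()
    {
        if (!swanSang_)
            log("warning: swanSong was not called");
    }

    // The only intended way to destroy a job: the lower layer calls the
    // cleanup hook first and then deletes the object.
    static void destroy(Job *job)
    {
        job->swanSong();
        job->swanSang_ = true;
        delete job;
    }

protected:
    virtual void swanSong() {} // never called directly by user code
    void log(const std::string &msg) { log_.push_back(msg); }

private:
    std::vector<std::string> &log_;
    bool swanSang_ = false;
};

class ConnJob: public Job
{
public:
    using Job::Job;

    ~ConnJob() override
    {
        if (open_)
            log("warning: connection still open");
    }

    void closeConnection() { open_ = false; }

protected:
    // As in the description, the hook itself does not close anything.
    void swanSong() override { log("swanSong"); }

private:
    bool open_ = true;
};
```

Deleting a ConnJob directly, without going through destroy(), triggers both
warnings, which mirrors the two destructor checks described above.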
Bug 1628: follow_x_forwarded_for should not cause allow/deny behavior
clientFollowXForwardedForCheck() needs to always set the
request->indirect_client_addr properly at completion and call
calloutContext->clientAccessCheck(); unconditionally to begin actual
access ACL tests.
Calling clientAccessCheckDone(answer) is equivalent to processing an
http_access line with denial.
Alex Rousskov [Thu, 11 Sep 2008 06:32:57 +0000 (00:32 -0600)]
Cleaned up reconfiguration sequence.
mainReconfigure() used to close and then open various sockets. Since
comm_close is now asynchronous, one cannot close and open in the same
function. Split mainReconfigure into mainReconfigureStart (that starts
the closing process for all relevant sockets) and mainReconfigureFinish
that opens the new sockets.
serverConnectionsClose is only used by main.cc and, hence, can be static.
Polished comments and added an XXX comment on why SquidShutdown is broken.
Also removed commCheckHalfClosed event scheduling. A separate cleanup patch
removes the associated half-closed monitoring loop.
Alex Rousskov [Thu, 11 Sep 2008 05:58:32 +0000 (23:58 -0600)]
Cleaned up Comm: comm_close, comm_read_cancel, half-closed monitors,
leaks.
1) Comm_close now implements the following API:
Comm_close does not close the descriptor but initiates the following
closing sequence:
1) The descriptor is placed in a "closing" state.
2) The registered read, write, and accept callbacks (if any) are
scheduled (in an unspecified order).
3) The close callbacks are scheduled (in an unspecified order).
4) A call to the internal descriptor closing handler is
scheduled.
Details of the above steps are being documented separately and will
become a part of Comm API documentation.
Since all notifications are asynchronous, it is possible for a read or
write notification that was scheduled before comm_close was called to
arrive at its destination after comm_close was called. Such
notification will arrive with COMM_ERR_CLOSING flag even though that
flag was not set at the time of the I/O (and the I/O may have been
successful). CommIoCbParams::syncWithComm is used for this. The
credit for this trick goes to Christos Tsantilas.
Removed fde.flags.closing_ flag as unused.
2) Removed most of the half-closed monitoring code. Old code scheduled
monitoring reads every main loop iteration, I think. It is possible
that the assumption was that the handler will be activated and
cleared once per iteration so that the new read can be scheduled. The
design could result in conflicts between two monitoring reads and
possibly between a monitoring read and an active read. There were
also problems with handling closing descriptors.
I have removed the loop, AbortChecker, and the associated splay
tree. When user code marks the descriptor as half-closed, Comm now
simply schedules a monitoring read callback. If the user needs to
check whether the descriptor was marked, Comm checks whether the
callback is present. If a user schedules a read when there is
already a monitoring callback, the monitoring callback is removed.
Renamed user-facing monitoring functions but left compatibility
wrappers in place to minimize user code changes, for now.
It is possible that the whole half-closed monitoring code will be
eventually deleted. The above changes are meant to preserve the
intended functionality (but without coredumps) while the decision is
being made.
3) Removed _SQUID_LINUX_-only code that would avoid addrinfo destruction
on connect "errors". Squid seems to be working fine without this
code. With this code, we leak memory on many connect requests because
of EINPROGRESS. More work is probably needed to reproduce and fix the
true cause of the memory corruption observed earlier. Removing the
workaround will allow us to get more bug reports if the problem is
still there.
My braindead alteration to pass the base data and config directories to
the code for use at compile-time backfired with ./configure adding variable
names intended for automake into the autoconf.h file for build.
This approach drops any fancy definition/substitution attempts
and simply adds a compiler flag parameter to every object build.
Alex Rousskov [Thu, 11 Sep 2008 04:54:34 +0000 (22:54 -0600)]
Cleaned up ConnStateData's closing and destruction.
1) Despite its name and the "if (open) close" use in ConnStateData destructor,
ConnStateData::close() was not closing anything. It was called from the Comm
close handler and from the destructor and would attempt to immediately delete
the ConnStateData object. Protecting code in deleteThis() may have prevented
the actual [double] delete from happening, but it is difficult to say exactly
what was going on when close() was being called from the destructor.
I converted ConnStateData::close to swanSong, which is the standard AsyncJob
cleanup method. As before, the method does not close anything (which may be
wrong). The swanSong method is never called directly by the user code. It is
called by lower layers just before the job is destroyed.
We may need to add Comm closing code to swanSong. For now, the updated
ConnStateData destructor will warn if ConnStateData forgot to close
the connection. The destructor will also warn if swanSong was not called,
which would mean that the job object is being deleted incorrectly.
2) Polished ClientSocketContext::writeComplete to distinguish
STREAM_UNPLANNED_COMPLETE from STREAM_FAILED closing state. This helps when
looking at stack traces.
3) Added an XXX comment about duplicated code.
4) Documented ClientSocketContext::initiateClose purpose and context.