Author: Alex Rousskov <rousskov@measurement-factory.com>
Bug 2087: Support adaptation sets and chains, including dynamic ICAP chains
- Support adaptation service sets and chains
(adaptation_service_set and adaptation_service_chain)
- Dynamically form chains based on ICAP X-Next-Services header
(icap_service routing=on)
- Support cross-transactional ICAP header exchange
(adaptation_masterx_shared_names)
An adaptation service set contains similar, interchangeable services. No more
than one service is successfully applied. If one service is down or fails,
Squid can use another service. Think "hot standby" or "spare" ICAP servers.
Sets may seem similar to the existing "service bypass" feature, but they allow
the failed adaptation to be retried and succeed if a replacement service is
available. The services in a set may be all optional or all essential,
depending on whether ignoring the entire set is acceptable. The mixture of
optional and essential services in a set is supported, but yields results that
may be difficult for a human to anticipate or interpret. Squid warns when it
detects such a mixture.
When performing adaptations with a set, failures at a service (whether
optional or essential) are retried with a different service if possible.
If there are no more replacement services left to try, the failure is treated
depending on whether the last service tried was optional or essential: Squid
either tries to ignore the failure and proceed or terminates the master
transaction.
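The set behavior described above can be sketched roughly as follows. This is illustrative Python, not Squid's C++ code; the Service type, adapt_with_set(), and the use of RuntimeError to signal a service failure are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Service:
    name: str
    essential: bool               # False means the service is optional
    apply: Callable[[str], str]   # hypothetical: raises RuntimeError on failure


def adapt_with_set(services: List[Service], message: str) -> str:
    """Apply at most one service from the set; fall back to the next on failure."""
    last_tried = None
    for svc in services:
        last_tried = svc
        try:
            return svc.apply(message)   # first success ends the search
        except RuntimeError:
            continue                    # retry with a replacement service
    # No replacement left: the outcome depends on the last service tried.
    if last_tried is None or not last_tried.essential:
        return message                  # ignore the failure and proceed
    raise RuntimeError("essential adaptation failed")
```

In this sketch, a set of two essential services where the first one fails still yields the adapted message from the second one, which is the "hot standby" case.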
An adaptation chain is a list of different services applied one after another,
forming an adaptation pipeline. Services in a chain may be optional or
essential. When performing adaptations, failures at an optional service are
ignored as if the service did not exist in the chain.
Request satisfaction terminates the adaptation chain.
When forming a set or chain for a given transaction, optional down services
are ignored as if they did not exist.
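A rough sketch of the chain rules above, again as illustrative Python rather than Squid code. ChainService, form_chain(), and run_chain() are hypothetical names; a service is modeled as a callable that returns the adapted message plus a flag saying whether it satisfied the request.

```python
from typing import Callable, List, NamedTuple, Tuple


class ChainService(NamedTuple):
    name: str
    essential: bool
    up: bool
    # Hypothetical signature: returns (message, satisfied); raises RuntimeError on failure.
    apply: Callable[[str], Tuple[str, bool]]


def form_chain(configured: List[ChainService]) -> List[ChainService]:
    """Drop optional services that are down, as if they were not configured."""
    return [svc for svc in configured if svc.up or svc.essential]


def run_chain(configured: List[ChainService], message: str) -> str:
    """Apply the services one after another, forming a pipeline."""
    for svc in form_chain(configured):
        try:
            message, satisfied = svc.apply(message)
        except RuntimeError:
            if svc.essential:
                raise        # failure at an essential service is final
            continue         # failure at an optional service is ignored
        if satisfied:
            break            # request satisfaction terminates the chain
    return message
```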
ICAP and eCAP services can be mixed and matched in an adaptation set or chain.
Merged from 3p1-plus branch at r9513.
* Implementation notes
The notes below focus on _changes_. Adaptation terminology and current layers
are now being documented in src/adaptation/notes.dox
Service sets and chains are implemented as ServiceGroup class kids. They are
very similar in most code aspects. The primary external difference is that
ServiceSet can "replace" a service and ServiceChain can find the "next"
service. The internal search code is implemented in ServiceGroup parent and
is parametrized by the kids.
Before the adaptation starts, Squid calculates the adaptation "plan", which is
just an iterator into the ServiceGroup. The client- and server-side adaptation
initiators used to deal with Service pointers. They now deal with ServiceGroup
pointers. The only interesting difference is that a ServiceGroup does not have
a notion of being optional or essential. Thus, if adaptation start fails, we
do not know whether the failure can be bypassed. Fortunately, starting an
adaptation does not require anything that depends on the adaptation services,
so we now simply assert that the start succeeds.
If the entire adaptation fails, the callers are notified as before. They are
told whether they can ignore the failure as before. No changes there.
A new Adaptation::Iterator class has been added to execute the adaptation
plan. That class is responsible for iterating the services in a service group
until the plan is exhausted or cannot progress due to a final failure.
Dynamically form adaptation chains based on the ICAP X-Next-Services header.
If an ICAP service with the routing=1 option in squid.conf returns an ICAP
X-Next-Services response header during a successful REQMOD or RESPMOD
transaction, Squid abandons the original adaptation plan and forms a new
adaptation chain consisting of services identified in the X-Next-Services
header value (using a comma-separated list of adaptation service names from
squid.conf). The dynamically created chain is destroyed once the new plan is
completed or replaced.
This feature is useful when a custom adaptation service knows which other
services are applicable to the message being adapted.
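As a rough illustration of how an X-Next-Services value could be turned into a new plan, here is a minimal Python sketch. The function name and the decision to reject unknown service names with ValueError are assumptions made for this example; the text above does not say how Squid treats names it does not recognize.

```python
from typing import List, Set


def plan_from_next_services(header_value: str, configured: Set[str]) -> List[str]:
    """Parse a comma-separated X-Next-Services value into an ordered list of names."""
    names = [name.strip() for name in header_value.split(",") if name.strip()]
    unknown = [name for name in names if name not in configured]
    if unknown:
        # Assumption for this sketch only: unknown names are treated as an error.
        raise ValueError("unknown adaptation services: " + ", ".join(unknown))
    return names
```

The returned list plays the role of the dynamically created chain that replaces the original adaptation plan.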
Limit adaptation iterations to adaptation_service_iteration_limit to protect
Squid from infinite adaptation loops caused by ICAP services constantly
including themselves in the dynamic adaptation chain they request. When the
limit is exceeded, the master transaction fails. The default limit of 16
should be large enough to not require an explicit configuration in most
environments yet may be small enough to limit side-effects of loops.
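The loop protection can be pictured with the following Python sketch. run_plan() and its next_service callback are hypothetical; only the default of 16 comes from the text above, and whether Squid counts iterations in exactly this way is an assumption.

```python
from typing import Callable, List, Optional

DEFAULT_ITERATION_LIMIT = 16  # default of adaptation_service_iteration_limit


def run_plan(first: str,
             next_service: Callable[[str], Optional[str]],
             limit: int = DEFAULT_ITERATION_LIMIT) -> List[str]:
    """Visit services until the plan ends or the iteration limit is exceeded."""
    visited: List[str] = []
    current: Optional[str] = first
    while current is not None:
        if len(visited) >= limit:
            # Exceeding the limit fails the master transaction.
            raise RuntimeError("adaptation iteration limit exceeded")
        visited.append(current)
        current = next_service(current)  # e.g., derived from X-Next-Services
    return visited
```

A service that keeps naming itself as the next service makes this sketch raise after 16 iterations instead of looping forever.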
TODO: Add metadata support to eCAP API and honor X-Next-Services there as
well. Currently, only ICAP services can form dynamic chains but the formed
chains may contain eCAP services.
Other improvements:
Polished adaptation service configuration in squid.conf. Old format with an
anonymous bypass option is deprecated but still supported. Quit with a fatal
message if an adaptation service is misconfigured (debugging level-0 messages
do not seem to work at that stage, but that is probably another, general bug).
Polished HttpRequest::adaptHistory() interface so that the code that knows the
history is needed can force history creation without complex
configuration-time preparations and state. Currently, all adaptation history
users but the logging-related ones know at runtime whether the history must be
created (e.g., when a certain ICAP header is received).
Fixed "canonical" Request URL maintenance when ICAP clones requests.
TODO: The urlCanonical() must become HttpRequest::canonical(), hiding the
often out-of-sync canonical data member.
Fixed ICAP request parsing (for ICAP logging). We used to parse Request-Line
as if it were the first header. TODO: optimize by parsing only when needed.
Fixed AccessCheck case where a service group disappears during a non-blocking
ACL check.
Replaced "done" member with an existing AsyncJob mustStop mechanism. Removed
extra async call as unneeded because ACL callbacks are already async.
Author: Alex Rousskov <rousskov@measurement-factory.com>
Limit X-Forwarded-For growth.
X-Forwarded-For growth leads to String size limit assertions and probably
other problems.
We now replace huge XFF values with a string "error", warn the admin the
first 100 times, and hope that something will stop the loop (if it is a
loop). TODO: we should probably deny requests with huge XFF.
To make growth-associated problems visible during forwarding loops, the
loop breaking code must be disabled (no Via) or not applicable (direct
forwarding) and request_header_max_size has to be raised or disabled.
The X-Forwarded-For header value may also grow too large for reasons
unrelated to forwarding loops.
This change also prevents most cases of pointless computation of the
original X-Forwarded-For value list. That computation can be quite
expensive.
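A minimal Python sketch of the capping behavior described above. The XffLimiter class and its max_length threshold are hypothetical (the commit message does not state the actual size limit); the replacement string "error" and the 100-warning cap come from the text.

```python
import logging


class XffLimiter:
    """Replace oversized X-Forwarded-For values and rate-limit the warnings."""

    def __init__(self, max_length: int, max_warnings: int = 100):
        self.max_length = max_length      # hypothetical threshold, in characters
        self.max_warnings = max_warnings  # warn the admin only the first N times
        self.warnings_issued = 0

    def limit(self, value: str) -> str:
        if len(value) <= self.max_length:
            return value                  # normal case: keep the value as is
        if self.warnings_issued < self.max_warnings:
            self.warnings_issued += 1
            logging.warning("X-Forwarded-For too long (%d chars)", len(value))
        return "error"                    # huge values are replaced, not grown
```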
Author: Henrik Nordstrom <henrik@henriknordstrom.net>
http_port allow-direct option to allow direct forwarding in accelerator mode
Normally, direct forwarding is disabled in accelerator mode unless overridden
by always_direct, to avoid unintentional security loops. But there are setups
where it makes sense not to have this restriction, as it affects
peer selection as well.
Author: Alex Rousskov <rousskov@measurement-factory.com>
Truncate too-long HTTP response bodies to match their Content-Length header.
Sometimes a broken server sends more than Content-Length bytes in the
response. For example, a 302 redirect message with "Content-Length: 0" header
may include an HTML body. Squid used to send "everything" it read to the
client, even if it read more than the Content-Length bytes. That may have
helped in some cases, but we should be more conservative when dealing
with broken servers to combat message smuggling attacks and other bad
side-effects for clients.
We now do not forward more than the advertised content length and declare the
connection with a broken server non-persistent.
Chunked responses (that HTTP/1.0 Squid should not receive and that must not
have a Content-Length header) are not truncated because RFC 2616 says we
MUST ignore their Content-Length header.
TODO: Do not cache the truncated entry and purge the cached version, if any.
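The truncation rule can be summarized with a small Python sketch. forwardable_body() is a hypothetical helper that works on a fully buffered body for simplicity; real proxy code would apply the same limit incrementally while reading.

```python
from typing import Optional, Tuple


def forwardable_body(body: bytes,
                     content_length: Optional[int],
                     chunked: bool) -> Tuple[bytes, bool]:
    """Return (bytes to forward, whether the server connection may stay persistent)."""
    if chunked or content_length is None:
        # Content-Length of a chunked response is ignored, so nothing is truncated.
        return body, True
    if len(body) > content_length:
        # Broken server: forward only the advertised length and
        # declare the connection non-persistent.
        return body[:content_length], False
    return body, True
```

For the 302 example above, a "Content-Length: 0" response carrying an HTML body results in an empty forwarded body and a non-persistent server connection.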
Author: Alex Rousskov <rousskov@measurement-factory.com>
Break forwarding loops for "transparent" or "intercept" http_ports.
Squid detected forwarding loops in most configurations, but broke
them (using a customizable HTTP_FORBIDDEN response) only when working as
an accelerator. Squid now breaks loops when working as a transparent
proxy as well.
A persistent loop is going to be broken anyway, when the Via and
X-Forwarded-For headers exceed header size limit, but that wastes a lot of
resources and may also crash misconfigured Squids.
TODO: Consider breaking all loops, regardless of the http_port options.
TODO: Consider adding a specific and/or configurable error page for this case
instead of using hard-coded ACCESS_DENIED.
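One common way to detect such loops is to look for the proxy's own name in the Via header. The Python sketch below assumes that approach; the function names and the port mode labels ("accel", "transparent", "intercept") are hypothetical stand-ins for the http_port options, and comments inside Via entries are assumed not to contain commas.

```python
from typing import List

# Port modes for which a detected loop is broken after this change.
LOOP_BREAKING_MODES = {"accel", "transparent", "intercept"}


def via_hosts(via_value: str) -> List[str]:
    """Extract the received-by token from each comma-separated Via entry."""
    hosts = []
    for entry in via_value.split(","):
        parts = entry.split()          # e.g. ["1.1", "proxy.example.com", "(squid/3.1)"]
        if len(parts) >= 2:
            hosts.append(parts[1].lower())
    return hosts


def is_forwarding_loop(via_value: str, my_name: str) -> bool:
    """A request that already passed through this proxy is looping."""
    return my_name.lower() in via_hosts(via_value)


def should_break_loop(via_value: str, my_name: str, port_mode: str) -> bool:
    return port_mode in LOOP_BREAKING_MODES and is_forwarding_loop(via_value, my_name)
```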
The Date: header appears to already be implemented on all generated
pages and ICAP processed pages.
This tests and enforces Date: on all other outgoing replies as required.
I'm not certain this is the right place; it appears to be post-caching.
The RFC indicates the Date: should be enforced pre-caching, but I was
unable to find a place of input cloning/processing after initial parse.
The storeEntry timestamp is used to estimate the correct receiving date.
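The idea of filling in a missing Date: header from the stored entry's timestamp can be sketched in Python as follows. ensure_date() and the dict-based header model are hypothetical simplifications; formatdate() from the standard library produces the HTTP date format.

```python
from email.utils import formatdate
from typing import Dict


def ensure_date(headers: Dict[str, str], entry_timestamp: float) -> Dict[str, str]:
    """Return headers with a Date field, estimating it from the entry timestamp if missing."""
    if any(name.lower() == "date" for name in headers):
        return headers                       # the reply already carries a Date
    with_date = dict(headers)                # do not modify the caller's headers
    with_date["Date"] = formatdate(entry_timestamp, usegmt=True)
    return with_date
```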
Uses a tri-state setting on enable_purge and acl parsing to
detect PURGE method addition/removal instead of a complicated ACL
creation test post-configure.
This removes the annoying false errors about temp ACL and gives a minor
speed-up in all actions that parse squid.conf.
Author: Philip Allison <philip.allison@smoothwall.net>
Bug 2614 fix: Potential loss of adapted body data from eCAP adapters
It was possible for Squid to stop reading buffered adapted body data before it
had all been sent to the browser.
Squid treated a call to noteAbContentDone by an adapter as a signal to stop
consuming and sending adapted body data. The correct behaviour is to use
noteAbContentDone to record the fact that the adapter has stopped producing
new adapted body data, but continue to consume and send data until all
buffered ab content is consumed and sent (i.e., abContent returns an empty
Area).
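The corrected behavior amounts to separating "the producer is done" from "everything has been consumed". A minimal Python sketch, with a hypothetical AdaptedBodyPump class whose method names only loosely mirror the eCAP calls mentioned above:

```python
class AdaptedBodyPump:
    """Buffers adapted body data until the consumer has sent all of it."""

    def __init__(self) -> None:
        self.buffer = bytearray()
        self.producer_done = False

    def note_content_available(self, data: bytes) -> None:
        self.buffer += data

    def note_content_done(self) -> None:
        # Only records that no new data will be produced;
        # already buffered data must still be consumed and sent.
        self.producer_done = True

    def consume(self, max_bytes: int) -> bytes:
        chunk = bytes(self.buffer[:max_bytes])
        del self.buffer[:max_bytes]
        return chunk

    def finished(self) -> bool:
        # Done only when the producer stopped AND the buffer is empty.
        return self.producer_done and not self.buffer
```

In this sketch, calling note_content_done() while data is still buffered leaves finished() false until consume() has drained the buffer.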
Language Updates: Add aliases from live traffic info
Scanning the last 98 days of traffic and locating the country-code
Accept-Language headers used in that traffic to refer to the existing
languages gives a subset we can alias to further improve the coverage.
Also, country-specific Arabic, thanks to Alaa of the Translation Toolkit Project.
Amos Jeffries [Wed, 24 Jun 2009 03:37:51 +0000 (15:37 +1200)]
Fix alias linker dist/install
make requires ';' after a SHELL command apparently.
Make alias-link.sh handle the case where the DESTDIR is non-existent.
This occurs on some distro packaging systems (i.e., using langpack as a
separate package may not install errors).
Amos Jeffries [Sat, 20 Jun 2009 14:38:28 +0000 (02:38 +1200)]
Correct Licensing Credits
Several of the licenses mentioned in the CREDITS file are not relevant
to Squid-3.1 code any more. Several license disclaimers were found to be
missing.
Thanks to the Debian Project for identifying these incorrect entries.
Amos Jeffries [Sat, 20 Jun 2009 14:36:50 +0000 (02:36 +1200)]
Language alias linker/installer/upgrade scripts
alias-link.sh
This is a script set designed to be called via make/Makefile and to set up
language codes for those languages for which it would be impractical to
bundle duplicate translated files.
Relies on local environment tools to be detected by automake.
make install
- now also calls generation of aliases after existing install.
Provided in file aliases.
make upgrade
- cleans out legacy files from pre-3.1 and replaces with symlinks
to the new upgraded language codes.
Provided in file alias-upgrade.
NP: this is a destructive process and must be manually run.
Bundle aliasing scripts and Makefile to use them with the langpack.
Amos Jeffries [Sun, 14 Jun 2009 12:07:07 +0000 (00:07 +1200)]
Bug 2395: FTP auth errors not displayed
Round 2 for this bug. Now we handle missing auth as an expected result
rather than a failure. FTP operations are now well tested and this patch
does not affect code shared with other components.
Side-effect is that browser authentication popups now appear when the
FTP server needs authentication. This has been a long missed event.
The root cause of the issue has not been found, so other subsequent errors in
the FTP sub-protocol are still silently lost due to the same issue.
Amos Jeffries [Sat, 6 Jun 2009 00:37:26 +0000 (12:37 +1200)]
SourceLayout: Shuffle ident files into libident.la
* Moves files into ident/ for library
* Adds Ident:: namespace for interface.
* Moves ident config to Ident::TheConfig
* Reduces one avenue of memory leak on double-Init of ident objects.
* Makes ident ACL only relevant when ident is available
* Wraps Ident code in USE_IDENT for monolithic or empty library build
* Adds documentation for ident API
Amos Jeffries [Sat, 6 Jun 2009 00:11:44 +0000 (12:11 +1200)]
Author: Henrik Nordstrom <henrik@henriknordstrom.net>
Bug #2407: Spelling error in http_port tcpkeepalive option
One of the new parameters according to the docs is "keepalive". However, when
using this option you'll get a "Bungled squid.conf in line ...". That's because
when parsing the configuration Squid is looking for the keyword "tcpkeepalive"
instead of "keepalive" as stated in the docs.
Chose to fix the docs instead of the code as having it named keepalive is too
easily confused with HTTP keep-alive / persistent connections.
2009-05-25: Also fixed spelling mistakes in the config dump.
Amos Jeffries [Fri, 5 Jun 2009 23:47:35 +0000 (11:47 +1200)]
Wrap C++ headers. Fixes define clash with libcompat
ostream and family were including sys/types.h which causes
FD_SET redefinition with libcompat at times.
Current autoconf allows these headers to be wrapped and config.h
included before them to prevent this and other problems.
Amos Jeffries [Fri, 5 Jun 2009 23:21:59 +0000 (11:21 +1200)]
Detach debugs() from many of its dependencies
- makes cache.log independent of the other logging systems
- adds debug_options rotate=N setting to override logfile_rotate
- moves debug-specific globals and types into Debug::
TODO:
remove remaining dependency on shutdown flag
polish up namespace etc for libdebug
Amos Jeffries [Sat, 30 May 2009 13:40:23 +0000 (01:40 +1200)]
Add Translate: and Unless-Modified-Since: headers to known list.
They are custom Microsoft headers we may need to use header_access to
crop away. Translate: is needed for WebDAV so we must leave this up
to the individual admin.
Amos Jeffries [Sat, 30 May 2009 13:33:16 +0000 (01:33 +1200)]
Author: Henrik Nordstrom <henrik@henriknordstrom.net>
Bug 2481: Don't set expires: now in generated error responses
Sending Expires: "now" overrides any negative caching logic which may
be present in downstream caches and is a bad idea. Better to send
the responses without any explicit expiry information.
Amos Jeffries [Sat, 23 May 2009 02:59:52 +0000 (14:59 +1200)]
Author: Adrian Chadd <adrian@squid-cache.org>
Add in some better documentation for override-expire.
Attempt to clearly document exactly what it does - in this instance, it
enforces min age and doesn't allow the admin to enforce max-age -
i.e., truncate staleness.
Amos Jeffries [Sat, 23 May 2009 02:44:08 +0000 (14:44 +1200)]
Author: Guido Serassio <serassio@squid-cache.org>
Windows port: Fix improper access permissions to registry and DNS parsing from registry
- RegOpenKey() always tries to open registry keys in full control mode, even if not needed.
This could make Squid fail when running as a non-privileged user. RegOpenKeyEx() allows
specifying only the needed privilege and is now used instead.
- When parsing DNS settings from the registry, a fixed size loop was used. Now the loop count is
dynamic.
Amos Jeffries [Sat, 23 May 2009 02:09:53 +0000 (14:09 +1200)]
Replace assert with NOP action in hash free.
This resolves one small coverity itch.
When there is nothing to free we don't really need to care, but we do need to
act safely and not actually attempt the free.