Alex Rousskov [Wed, 29 Feb 2012 06:32:14 +0000 (23:32 -0700)]
Better helper-to-Squid buffer size management.
The minimum buffer size is reduced from 8KB to 4KB after a squid-dev
discussion, to avoid wasting "several hundred KB of unused permanent
memory on some installations".
We now increase the buffer if we cannot parse the helper response message.
The maximum buffer size is now 32KB. This should be enough for all known
helper responses.
We now warn if the read buffer reaches its capacity and kill the offending
helper explicitly. An increase in maximum buffer capacity to 32KB should make
such events rare.
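For illustration, a minimal sketch of the grow-on-parse-failure policy
(the 4KB/32KB sizes come from this change; the doubling strategy and all
names are made up, not the actual helper.cc code):

    // Illustrative only; not the actual helper.cc code.
    #include <cstddef>

    static const std::size_t MinBufSize = 4 * 1024;  // initial size, reduced from 8KB
    static const std::size_t MaxBufSize = 32 * 1024; // enough for all known replies

    // Called when the accumulated helper reply cannot be parsed yet.
    // Returns false when the buffer is already at capacity and the
    // offending helper should be warned about and killed.
    bool growReadBuffer(std::size_t &capacity)
    {
        if (capacity >= MaxBufSize)
            return false; // warn and kill the offending helper instead
        capacity *= 2;    // assumption: doubling growth on parse failure
        if (capacity > MaxBufSize)
            capacity = MaxBufSize;
        return true; // caller reallocates to the new capacity and reads more
    }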
Motivation: ssl_crtd helper may produce responses exceeding 9907 bytes in size
(and possibly much larger if multiple chained certificates need to be returned
to Squid). The old helper.cc code would fill the read buffer completely,
schedule a read for zero bytes, receive zero bytes, declare an EOF condition,
and close the stream (which kills ssl_crtd). Due to insufficient logging,
the observable symptoms were practically the same as if ssl_crtd had
closed the stream first, wrongly suggesting an ssl_crtd bug.
HONDA Hirofumi [Tue, 28 Feb 2012 17:52:21 +0000 (10:52 -0700)]
Bug 3502: client timeout uses server-side read_timeout, not request_timeout
I have also adjusted the request_timeout description in squid.conf to clarify
that request_timeout applies to receiving complete HTTP request headers and
not just the first header byte. We reset the connection timeout to
clientLifetimeTimeout after parsing request headers.
https_port was correctly using Config.Timeout.request already.
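A hypothetical sketch of the fixed timeout sequence; the directive names
mirror squid.conf, but the types and functions below are illustrative,
not Squid's client-side code:

    // Illustrative types; the real code uses Squid's comm API instead.
    struct TimeoutConfig { int request; int lifetime; };
    struct ClientConnection { int timeoutSeconds; };

    static TimeoutConfig Timeout = { 300, 86400 }; // request_timeout, client_lifetime

    void onClientConnectionAccepted(ClientConnection &conn)
    {
        // Use request_timeout while waiting for complete request headers,
        // not the server-side read_timeout (the bug fixed here):
        conn.timeoutSeconds = Timeout.request;
    }

    void onRequestHeadersParsed(ClientConnection &conn)
    {
        // Headers fully received and parsed; switch to the lifetime limit:
        conn.timeoutSeconds = Timeout.lifetime;
    }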
Guy Helmer [Tue, 28 Feb 2012 00:22:38 +0000 (17:22 -0700)]
Bug 3497: Bad ssl_crtd db size file causes infinite loop.
The db size file may become empty when Squid runs out of disk space. Ignoring
db size reading errors led to bogus db sizes being used as the looping
condition. This fix honors reading errors and also terminates the loop when no more
certificates can be removed. Both errors and removal failure are fatal to
ssl_crtd.
A positive side-effect of this fix is one less call to the relatively
expensive file-reading size()/readSize() methods under normal conditions.
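A sketch of the corrected loop logic, assuming hypothetical readSize()
and purgeOldestCertificate() helpers (declared but not defined here);
this is not the committed ssl_crtd code:

    #include <cstdio>
    #include <cstdlib>

    bool readSize(long &dbSize);   // assumed: false on a reading error
    bool purgeOldestCertificate(); // assumed: false if nothing can be removed

    static void fatal(const char *msg)
    {
        std::fprintf(stderr, "ssl_crtd: %s\n", msg);
        std::exit(EXIT_FAILURE);
    }

    // Frees enough db space for a new certificate or terminates ssl_crtd.
    void makeRoom(long maxDbSize, long neededBytes)
    {
        long dbSize = 0;
        if (!readSize(dbSize))
            fatal("cannot read db size"); // errors are honored, not ignored
        // Under normal conditions (enough room already) we return without
        // looping, saving one call to the expensive size-reading method:
        while (dbSize + neededBytes > maxDbSize) {
            if (!purgeOldestCertificate())
                fatal("cannot purge db entry"); // would loop forever otherwise
            if (!readSize(dbSize))
                fatal("cannot read db size");
        }
    }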
I also removed the "minimum db size" check because it did not seem to be in sync
with other ssl_crtd parameters such as fs block size and because its overall
purpose was unclear. The check was also removed by the original bug reporter.
TODO: Remaining problems include: ssl_crtd should not exit just because it
cannot write something to disk. A proper reporting/debugging API is missing.
Alex Rousskov [Mon, 20 Feb 2012 19:10:54 +0000 (12:10 -0700)]
Retry requests that failed due to a persistent connection race
instead of replying with ERR_ZERO_SIZE_OBJECT "Bad Gateway".
The ERR_ZERO_SIZE_OBJECT errors were visible to the client when the
destination had only one address because serverDestinations.shift()
emptied the list of destinations and startConnectionOrFail() failed.
When FwdState starts to use a pinned connection, the connection is treated as
an idle persistent connection as far as race detection is concerned.
Currently, pinned connections cannot be reopened, repinned, and retried after
a pconn race. This will change when server-side bumped connections become
pinned.
It felt wrong that a failed serverConn may remain set while we are opening a
new connection, so I set it to NULL after a squid-dev discussion indicating
that doing so should be safe.
We also now reset the local port number to zero in case it was set to the
actual source port by ConnOpener or other code working with the previous
connection to the same serverDestinations[0] address, although simple tests
worked (and showed a changing source port) without this reset.
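A rough sketch of the retry decision and the two resets described above;
the structure and names are illustrative, not the actual FwdState code:

    // Illustrative only; the real logic lives in FwdState.
    struct Destination { int localPort = 0; /* plus the remote address */ };

    struct ForwardStateSketch {
        Destination destination;    // serverDestinations[0] equivalent
        void *serverConn = nullptr; // the server connection in use
        bool usedIdlePconn = false; // true for idle pconns and, as far as
                                    // race detection goes, pinned conns too

        void noteZeroBytesRead() {
            if (usedIdlePconn) {
                // pconn race: retry instead of replying ERR_ZERO_SIZE_OBJECT
                serverConn = nullptr;      // do not keep the failed conn set
                destination.localPort = 0; // forget the stale source port
                retrySameDestination();    // same address; nothing is shifted
            } else {
                replyWithBadGateway();
            }
        }

        void retrySameDestination() {}
        void replyWithBadGateway() {}
    };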
Bug fix: sslpassword_program for ssl-bump http ports
Currently the sslpassword_program configuration parameter does not work
for encrypted certificate keys on ssl-bump enabled http ports, and the
user is always asked to supply the SSL key password.
Alex Rousskov [Fri, 10 Feb 2012 00:32:44 +0000 (17:32 -0700)]
Do not cache partially loaded entries in shared mem cache (and then serve them)
When handling a conditional request, Squid may load the beginning of a cached
object from disk, realize that the client has the same fresh copy, and respond
with 304 Not Modified. After that, Squid was checking whether the partially
loaded object should be kept in shared memory cache (if enabled). There were
no checks preventing memory caching of the partially loaded object.
Later, partially cached objects were served to clients, resulting in truncated
responses. I believe this happened because the shared memory cache does not
keep all the StoreEntry data (just as a disk cache does not), so the fact
that only a part of the object was available was lost.
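A minimal sketch of the kind of guard this change adds, assuming the
expected object size is known from the stored reply headers; the names
are illustrative, not the actual memory cache code:

    // Illustrative only; the real check lives in the memory cache code.
    struct CachedEntry {
        long expectedSize; // object size promised by the stored headers
        long loadedSize;   // bytes actually present in memory
    };

    bool mayCacheInSharedMemory(const CachedEntry &e)
    {
        // A partially loaded entry (e.g., just enough to answer a 304)
        // must not be admitted, or later hits would be truncated:
        return e.loadedSize >= e.expectedSize;
    }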
Alex Rousskov [Fri, 10 Feb 2012 00:01:17 +0000 (17:01 -0700)]
Do not swap out swapped out objects.
I noticed that sometimes Squid would start swapping out an entry that was
recently loaded from disk and was still on disk. That wastes disk
resources (at best).
The old StoreEntry::mayStartSwapOut() code assumed that when swap_status is
not SWAPOUT_NONE it is SWAPOUT_WRITING, but SWAPOUT_WRITING is impossible
after recent StoreEntry::swapOut() modifications because mayStartSwapOut() is
only called when we are not swappingOut() already. SWAPOUT_DONE is possible.
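The corrected assumption, as a sketch (the swap status names come from
this change; the function shape is simplified, not the committed code):

    enum SwapStatus { SWAPOUT_NONE, SWAPOUT_WRITING, SWAPOUT_DONE };

    // Simplified: callers already guarantee we are not swappingOut(),
    // so SWAPOUT_WRITING cannot occur here.
    bool mayStartSwapOut(SwapStatus swap_status)
    {
        if (swap_status == SWAPOUT_DONE)
            return false; // already on disk; re-swapping wastes resources
        return true;      // SWAPOUT_NONE: swapping out may begin
    }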
Amos Jeffries [Thu, 9 Feb 2012 13:27:51 +0000 (06:27 -0700)]
Drop dead code in reply parsing
This code has not been used/needed in some time. It can die.
It is also not clear why it existed in the first place. The RFC is not
mentioned by number, and RFCs 2068/2616 only talk about tolerance for
whitespace before request lines, not replies.
Amos Jeffries [Mon, 6 Feb 2012 01:01:23 +0000 (18:01 -0700)]
SourceLayout: shuffle CommIO into DiskThreads library
While investigating the Windows port problems it became clear that the
CommIO object is not actually related to the rest of comm systems. It is
instead a dedicated higher level disk I/O pipe manager for DiskThreads.
Given that it causes build errors in comm.cc on Windows and does not
even need to be there, this patch shuffles it into the DiskThreads
library. The build issues on Windows still exist but are now limited to
just that threads library and can be avoided temporarily with simple
./configure options.
This adds a bit more leniency to Host: header validation without re-opening
Squid to the cache poisoning risks involved, resolving most issues with
websites that use geo-based DNS results and/or short DNS TTLs for load
balancing.
It alters the host_verify_strict directive to allow requests which fail Host:
validation to continue through processing. The default remains OFF. For such
requests, Squid:
* blocks caching of the response, to protect all other network clients
against one compromised client spreading infections.
* forces the original (untrusted) destination IP to be used instead of
any alternative Squid might find, preventing Squid or peer DNS lookups
from being the point of vulnerability for the same-origin bypass. For any
client to be vulnerable, it must be vulnerable inside the browser agent
where the original TCP connection is established.
Also add a new error template ERR_CONFLICT_HOST to replace the confusing
"invalid request" message with a clear explanation of the problem and
some client workarounds.
FUTURE WORK:
* adapt processing to allow these requests to safely be passed to peers.
* adapt caching to permit safe sharing between clients making identical
requests to same sources.
Alex Rousskov [Mon, 30 Jan 2012 17:13:52 +0000 (10:13 -0700)]
Bug 3441: Part 1: Minimize cache size corruption by malformed swap.state.
If swap.state gets corrupted, a single entry with a bogus size value will screw
up Squid's idea of the current cache size. A newly added StoreSwapLogData::sane()
method attempts to minimize the chance of corruption by ignoring log entries
with obviously bogus values.
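A sketch of the cheap checks sane() can perform; the fields and the
thresholds below are illustrative, not the committed StoreSwapLogData
code:

    #include <cstdint>
    #include <ctime>

    struct SwapLogEntrySketch {
        std::uint64_t swap_file_sz; // on-disk object size recorded in the log
        std::time_t timestamp;      // when the entry was created
    };

    // Cheap sanity check: rejects obviously bogus entries without
    // opening the cache file itself.
    bool sane(const SwapLogEntrySketch &e, std::uint64_t maxObjectSize,
              std::time_t now)
    {
        if (e.swap_file_sz == 0 || e.swap_file_sz > maxObjectSize)
            return false; // bogus size would corrupt the cache size total
        if (e.timestamp > now)
            return false; // entries from the future are corrupt
        return true; // plausible, but not verified against the file
    }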
However, without expensive size checks (-S or "Does the log entry match the
actual cache file size?"), it is not possible to reliably detect all bogus log
entries.
If Squid gets a wrong idea of the current cache size, it may either cache too
much (and possibly run out of space) OR delete everything.
Properly initialize memory_cache_shared when not explicitly configured and
do not allocate shared memory segments if memory_cache_shared is off.
Prior to this change, shared memory was requested (during rrClaimMemoryNeeds)
before memory_cache_shared was initialized (during rrAfterConfig, unless
explicitly set in squid.conf). Moreover, MemStore::EntryLimit() calculation
ignored memory_cache_shared setting. All that resulted in an attempt to create
shared memory segments where none were needed or possible.
We now have a new Runners Registry (rrFinalizeConfig) that is dedicated to
finalizing complex configuration options before those configuration options
are used to initialize modules, features, and services. As far as shared
memory is concerned, the initialization order is now:
* rrFinalizeConfig: finalize memory_cache_shared if needed
* rrClaimMemoryNeeds: request shared memory pages for the caches
* rrAfterConfig: create shared memory segments
The same bug 3449 also needed a fix for shared segment paths (trunk r11961).
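A toy sketch of the resulting startup order, assuming three plain
functions in place of the actual runner registry hooks named above:

    #include <iostream>

    struct MemCacheConfig { bool explicitlySet = false; bool shared = false; };
    static MemCacheConfig MemCacheShared;

    void finalizeConfig()   // rrFinalizeConfig stand-in
    {
        if (!MemCacheShared.explicitlySet)
            MemCacheShared.shared = true; // assumed default for this sketch
    }

    void claimMemoryNeeds() // rrClaimMemoryNeeds stand-in
    {
        if (MemCacheShared.shared)
            std::cout << "requesting shared memory pages\n";
    }

    void afterConfig()      // rrAfterConfig stand-in
    {
        if (MemCacheShared.shared)
            std::cout << "creating shared memory segments\n";
    }

    int main()
    {
        finalizeConfig();   // now runs first, so the later steps see the
        claimMemoryNeeds(); // finalized memory_cache_shared value
        afterConfig();
    }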
Alex Rousskov [Thu, 26 Jan 2012 04:51:21 +0000 (21:51 -0700)]
Bug 3268: Squid cannot do anything else during ufs/diskd rebuild
Replaced 50-entries/step UFS rebuild logic with 50msec/step logic,
while also ensuring that other events can be processed between steps.
During a swap.state rebuild, progress reporting will now happen
every 4000 entries instead of every 0xFFF entries because humans do not
think in hex. There is still no progress reporting for dirty rebuilds, which
are based on cache directory scans.
The code is based on Rock Store rebuild code. Eventually, the common rebuild
pacing logic among all stores should be merged to avoid code duplication.
Also removed the RebuildState::next methods as unused (a copy-paste error?).
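A sketch of the 50msec/step pacing, assuming hypothetical entry-loading
and scheduling hooks (declared but not defined here); the real code
follows the Rock Store rebuild pattern:

    #include <chrono>

    bool loadOneSwapStateEntry(); // assumed: false when the log is exhausted
    void scheduleNextStep();      // assumed: re-queues rebuildStep() as an event
    void reportProgress(int loaded);

    // One rebuild step: work for at most ~50msec, then yield so that
    // other queued events can be processed between steps.
    void rebuildStep()
    {
        using Clock = std::chrono::steady_clock;
        const auto deadline = Clock::now() + std::chrono::milliseconds(50);
        static int loaded = 0; // persists across steps
        while (Clock::now() < deadline) {
            if (!loadOneSwapStateEntry())
                return; // rebuild complete; no more steps scheduled
            if (++loaded % 4000 == 0)
                reportProgress(loaded); // every 4000 entries, not 0xFFF
        }
        scheduleNextStep();
    }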