git.ipfire.org Git - thirdparty/squid.git/log

]> git.ipfire.org Git - thirdparty/squid.git/log

Alex Rousskov [Sun, 1 May 2016 21:37:52 +0000 (15:37 -0600)]

Accumulate fewer unknown-size responses to avoid overwhelming disks.

Start swapping out an unknown-size entry as soon as size-based cache_dir
selection is no longer affected by the entry growth. If the entry
eventually exceeds the selected cache_dir entry size limits, terminate
the swapout.

The following description assumes that Squid deals with a cachable
response that lacks a Content-Length header. These changes should not
affect other responses.

Prior to these changes, StoreEntry::mayStartSwapOut() delayed swapout
decision until the entire response was received or the already
accumulated portion of the response exceeded the [global] store entry
size limit, whichever came first. This logic protected Store from
entries with unknown sizes. AFAICT, that protection existed for two
reasons:

* Incorrect size-based cache_dir selection: When cache_dirs use
  different min-size and/or max-size settings, Squid cannot tell which
  cache_dir the unknown-size entry belongs to and, hence, may select the
  wrong cache_dir.

* Disk bandwidth/space waste: If the growing entry exceeds all cache_dir
  max-size limits, the swapout has to be aborted, resulting in waste of
  previously spent resources (primarily disk bandwidth and space).

The cost of those protections include RAM waste (up to maximum cachable
object size for each of the concurrent unknown-size entry downloads) and
sudden disk overload (when the entire delayed entry is written to disk
in a large burst of write requests initiated from a tight doPages() loop
at the end of the swapout sequence when the entry size becomes known).
The latter cost is especially high because swapping the entire large
object out in one function call can easily overflow disker queues and/or
block Squid while the OS drains disk write buffers in an emergency mode.

FWIW, AFAICT, cache_dir selection protection was the only reason for
introducing response accumulation (trunk r4446). The RAM cost was
realized a year later (r4954), and the disk costs were realized during
this project.

This change reduces those costs by starting to swap out an unknown-size
entry ASAP, usually immediately. In most caching environments, most
large cachable entries should be cached. It is usually better to spend
[disk] resources gradually than to deal with sudden bursts [of disk
write requests]. Occasional jolts make high performance unsustainable.

This change does not affect size-based cache_dir selection: Squid still
delays swapout until future entry growth cannot affect that selection.
Fortunately, in most configurations, the correct selection can happen
immediately because cache_dirs lack explicit min-size and/or max-size
settings and simply rely on the *same-for-all* minimum_object_size and
maximum_object_size values.

We could make the trade-off between costly protections (i.e., accumulate
until the entry size is known) and occasional gradual resource waste
(i.e., start swapping out ASAP) configurable. However, I think it is
best to wait for the use case that requires such configuration and can
guide the design of those new configuration options.

Side changes:

* Honor forgotten minimum_object_size for cache_dirs without min-size in
  Store::Disk::objectSizeIsAcceptable() and fix its initial value to
  correctly detect a manually configured cache_dir min-size (which may
  be zero). However, the fixed bug is probably hidden by another (yet
  unfixed) bug: checkTooSmall() forgets about cache_dirs with min-size!

* Allow unknown-size objects into the shared memory cache, which code
  could handle partial writes (since collapsed forwarding changes?).

* Fixed Rock::SwapDir::canStore() handling of unknown-size objects. I do
  not see how such objects could get that far before, but if they could,
  most would probably be cached because the bug would hide the unknown
  size from Store::Disk::canStore() that declares them unstorable.