Alex Rousskov [Thu, 15 Aug 2013 22:26:40 +0000 (16:26 -0600)]
Fixed swap_file_sz calculation when loading rock entries. Polished debugging.
Supply storeRebuildParseEntry() with known "swap file size" so that it can
adjust swap_file_sz after loading store entry meta info. Entries are often
stored with swap_file_sz in the meta header missing the swap_hdr_len
component. storeRebuildParseEntry() adds swap_hdr_len when needed, using known
entry size to detect that need.
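The size adjustment described above can be sketched as follows. This is a minimal illustration, not the real storeRebuildParseEntry() code: the helper name, its parameters, and the "return 0 on mismatch" convention are all assumptions.

```cpp
#include <cstdint>

// Hypothetical helper, not the real storeRebuildParseEntry() signature.
// storedSize: swap_file_sz found in the entry meta header.
// swapHdrLen: size of the swap meta header.
// knownSize:  entry size already known to the caller.
// Returns the adjusted swap_file_sz, or 0 if the sizes cannot be reconciled.
std::uint64_t adjustSwapFileSize(const std::uint64_t storedSize,
                                 const std::uint64_t swapHdrLen,
                                 const std::uint64_t knownSize)
{
    if (storedSize == knownSize)
        return storedSize; // header length is already included
    if (storedSize + swapHdrLen == knownSize)
        return knownSize; // header length was missing, so add it
    return 0; // mismatch; treating this as an error is an assumption
}
```

With a 100-byte header and a known size of 1100, a stored value of either 1000 or 1100 yields 1100 in this sketch.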
Alex Rousskov [Thu, 15 Aug 2013 22:21:16 +0000 (16:21 -0600)]
Do not use StoreEntry::swap_file_sz to write DbCellHeader::entrySize,
even during the last write.
StoreEntry::swap_file_sz is often set by storeSwapOutFileClosed, which is
called after the last write.
Also, I am not sure whether partial StoreEntry::swap_file_sz info might later
confuse store rebuild code into thinking that the whole entry is malformed.
That would be a [different] bug.
Alex Rousskov [Tue, 30 Jul 2013 17:10:57 +0000 (11:10 -0600)]
Prevented STORE_DISK_CLIENT assertions for aborted entries. Polished code.
To prevent store_client.cc:445: "STORE_DISK_CLIENT == getType()" assertions,
rewrote the storeClientNoMoreToSend() function so that it does not send a
STORE_MEM_CLIENT to read from disk. Used this opportunity to polish this
negative function code and convert it into a positive method.
Documented known (and mostly old) StoreEntry::storeClientType() problems.
Alex Rousskov [Mon, 29 Jul 2013 00:46:55 +0000 (18:46 -0600)]
Better support for things with shared locks that can be opened many times,
such as Ipc::StoreMap entries. Maintain a lock counter instead of boolean
opened flag.
Better support for things with multipart IDs
such as Ipc::StoreMap entries that have an anchor/inode ID and map name.
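A rough sketch of the lock counter idea, assuming a hypothetical SharedLockHolder class (the class and method names are illustrative and do not come from the Squid sources):

```cpp
#include <cassert>

// Hypothetical illustration; names are not taken from the Squid sources.
class SharedLockHolder
{
public:
    // Record one more successful open of the same shared-locked item.
    void lock() { ++locks_; }

    // Drop one open. Returns true when the last lock is gone and the
    // underlying item should really be closed.
    bool unlock()
    {
        assert(locks_ > 0);
        return --locks_ == 0;
    }

    bool locked() const { return locks_ > 0; }

private:
    int locks_ = 0; // replaces a boolean "opened" flag
};
```

A boolean flag would report the item as closed after the first unlock(); the counter keeps it open until every lock() has been matched.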
Alex Rousskov [Mon, 29 Jul 2013 00:43:55 +0000 (18:43 -0600)]
Re-enabled on-disk collapsing of entries after fixing related code.
Since we started writing partial entries, we cannot rely on negative sidNext
marking the end of the slice/write sequence. Added a WriteRequest::eof field
to signal that end explicitly.
Do not leak db slices when write fails or IoState is closed before the write
succeeds.
Handle store client requesting an offset we have not stored yet. This might
happen for collapsed hits (and also if the client is buggy). May need more
work to slow the reader down.
Do not update various shared stats until the corresponding slot is written.
Alex Rousskov [Mon, 29 Jul 2013 00:27:23 +0000 (18:27 -0600)]
Improved STORE_MEM_CLIENT detection.
IN_MEMORY mem_status does not guarantee that the entire object is in the
memory cache. We may be just loading it from a shared memory cache, and
loading may fail. We may have nibbled at the entry already (although that may
not be possible, not sure). The whole memory/disk store_client designation
probably needs more work, but the now-removed condition was causing
store_client.cc:445: "STORE_DISK_CLIENT == getType()" assertions.
Alex Rousskov [Sat, 27 Jul 2013 17:19:29 +0000 (11:19 -0600)]
Keep anchor.basics.swap_file_sz in sync with slice sizes.
The old code updated anchor.basics.swap_file_sz _after_ copying all of the
available data into shared memory. An exception in the copying loop (e.g., the
map is out of available slots) could prevent that update. For another worker,
the entry would then appear to be fully completed (no writer, last slice size
stable, and last slice pointer is -1) and that worker would assert due to
anchor.basics.swap_file_sz mismatching the sum of slice sizes.
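The ordering problem can be illustrated with a simplified model. The Anchor struct and copyToSharedMemory() below are hypothetical stand-ins, not the actual shared memory cache code. The point is only that the total is updated per slice, so an exception leaves it equal to the sum of slice sizes.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical, simplified model of a shared map entry.
struct Anchor {
    std::uint64_t swap_file_sz = 0;  // total size visible to other workers
    std::vector<std::size_t> slices; // sizes of the slices filled so far
};

// Copies `total` bytes into slices of at most `sliceCapacity` bytes each,
// throwing when more than `freeSlices` slices would be needed.
void copyToSharedMemory(Anchor &anchor, std::size_t total,
                        const std::size_t sliceCapacity, std::size_t freeSlices)
{
    while (total > 0) {
        if (freeSlices == 0)
            throw std::runtime_error("out of free slices");
        --freeSlices;
        const std::size_t chunk = std::min(total, sliceCapacity);
        anchor.slices.push_back(chunk);
        anchor.swap_file_sz += chunk; // update right away, not after the loop
        total -= chunk;
    }
}
```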
Alex Rousskov [Wed, 24 Jul 2013 21:48:45 +0000 (15:48 -0600)]
Disconnect StoreEntries before deleting their memory objects.
The new cleanup order helps identify the right Rock entry state (reading or
writing) and avoid assertions related to state identification bugs (such
as unlocking a writing entry for reading).
Similar to the memory cache code, we should not disconnect disk entries during
shutdown because Store::Root() may be missing by then.
Alex Rousskov [Wed, 24 Jul 2013 21:45:02 +0000 (15:45 -0600)]
Avoid !writeableAnchor_ assertions when Squid shuts down.
A shutting down Squid deletes locked StoreEntry objects, which may trigger
deletion of Rock::IoState that is still writing to disk. We should fix the
shutdown sequence. Meanwhile, the Rock::IoState code does not need to mislead
admins with an assert.
Alex Rousskov [Mon, 22 Jul 2013 17:04:00 +0000 (11:04 -0600)]
Fixed StoreEntry::mayStartSwapOut() logic to handle terminated swapouts.
StoreEntry::mayStartSwapOut() should return true if a swapout can start. If
swapout was started earlier but then terminated for some reason (setting sio
to nil), the method should not return true. Checking swap_status ==
SWAPOUT_DONE does not work reliably because the status may be reset to
SWAPOUT_NONE in some cases (and the check was too late anyway). Checking
decision == swPossible does not work at all because while swapout start was
possible at some point, it is no longer possible after we started swapping
out.
Added MemObject::SwapOut::swStarted to detect started swapouts reliably.
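A simplified sketch of why a dedicated "started" state helps. Only swPossible and swStarted are named above; the other enum values, the SwapOutState struct, and the free functions are assumptions made for illustration.

```cpp
// Hypothetical, simplified model; not the real MemObject::SwapOut layout.
enum class Decision { swNeedsCheck, swImpossible, swPossible, swStarted };

struct SwapOutState {
    Decision decision = Decision::swNeedsCheck;
    bool sioPresent = false; // stands in for a non-nil sio
};

bool mayStartSwapOut(const SwapOutState &s)
{
    if (s.decision == Decision::swStarted)
        return false; // started earlier, even if sio is nil by now
    if (s.decision == Decision::swImpossible)
        return false;
    return true;
}

void startSwapOut(SwapOutState &s)
{
    s.decision = Decision::swStarted;
    s.sioPresent = true;
}

// Termination resets sio but deliberately leaves the decision alone.
void terminateSwapOut(SwapOutState &s)
{
    s.sioPresent = false;
}
```

After startSwapOut() followed by terminateSwapOut(), sio is nil again, yet mayStartSwapOut() still returns false because the decision remembers the earlier start.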
This patch adds new logformat codes to log TOS/DSCP values and netfilter marks
for client and server connections. If multiple outgoing connections were used,
the value of the last used connection is logged.
The values are printed in hexadecimal form.
The logformat codes are:
%>tos Client connection tos mark set by Squid
%<tos Server connection tos mark set by Squid
%>nfmark Client connection netfilter mark set by Squid
%<nfmark Server connection netfilter mark set by Squid
This patch also modifies QoS-related code to set the Comm::Connection::nfmark and
Comm::Connection::tos members in the Ip::Qos::setSockNfmark and Ip::Qos::setSockTos
methods. The Comm::Connection members are now set only if the tos and nfmark
are set successfully.
This patch sends an If-None-Match request when we need to re-validate
whether a cached object that has a strong ETag is still valid.
This is also done in cases where an HTTP client request contains HTTP
headers prohibiting a from-cache response (i.e., a "reload" request).
The use of an If-None-Match request in this context violates RFC 2616 and
requires using the reload-into-ims option within the refresh_pattern squid.conf
directive.
The exact definition of a "reload request" and the adjustment/removal of
"reload" headers is the same as currently used for reload-into-ims
option support. This patch is not modifying that code/logic, just adding
an If-None-Match header in addition to the IMS header that Squid already
adds.
Fix external ACL user:pass detail logging after adaptation
When a request is successfully adapted, the external ACL username and
password are now inherited with this patch. This means the
LFT_USER_NAME log token can display the username from an external ACL
if available, for adapted requests.
The HttpRequest will inherit the password for good measure as well -
while none too useful, it seems strange to inherit the username but
not the password.
We can do better than just producing errors about invalid port details
and treating it as port-0.
We can instead undo the port separation and pass it through as part of
the host name to be verified with the default port number properly
assumed.
Protect against buffer overrun in DNS query generation
See SQUID-2013:2.
This bug has been present as long as the internal DNS component. However,
most code reaching this point passes through URL validation first.
With Squid-3.2 Host header verification using DNS directly, we may have
problems.
Alex Rousskov [Wed, 10 Jul 2013 00:41:01 +0000 (18:41 -0600)]
Use Rock::IoState::writeableAnchor_ to detect rock entries open for writing.
Just e.mem_obj->swapout.sio presence is not reliable enough because we
may switch from writing to reading while the [writing] sio is still around.
More explicitly disabled on-disk collapsing of entries. The relevant code is
unstable under load [at least when combined with memory caching]. We were not
calling Ipc::StoreMap::startAppending() before so we probably did not fully
disk-collapse entries before these temporary changes.
Added an XXX to mark an assert() that may fail if we allow on-disk collapsing.
Alan Mizrahi [Tue, 9 Jul 2013 11:15:51 +0000 (05:15 -0600)]
Add storeid_file_rewrite helper
Based on work by Eliezer Croitoru <eliezer@ngtech.co.il>
This program acts as a Store-ID helper program, rewriting URLs passed
by Squid into storage-ids that can be used to achieve better caching
for websites that use different URLs for the same content.
It takes a text file with two tab separated columns.
Column 1: Regular expression to match against the URL
Column 2: Rewrite rule to generate a Store-ID
Rewrite rules are matched in the same order as they appear in the file.
So for best performance, sort it in order of frequency of occurrence.
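The first-match-wins rule lookup can be sketched in C++ as below. This is not the helper itself: the Rule struct and both functions are hypothetical, and the "$1" capture syntax used for substitution is std::regex's ECMAScript format, which is an assumption about how rewrite rules reference captures.

```cpp
#include <regex>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical rule representation: one line of the two-column file.
struct Rule {
    std::regex pattern;  // column 1
    std::string rewrite; // column 2
};

// Parses "regex<TAB>rewrite" lines, preserving file order.
std::vector<Rule> parseRules(const std::string &text)
{
    std::vector<Rule> rules;
    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line)) {
        const auto tab = line.find('\t');
        if (tab == std::string::npos)
            continue; // skip lines that lack two columns
        rules.push_back({std::regex(line.substr(0, tab)), line.substr(tab + 1)});
    }
    return rules;
}

// Returns the Store-ID produced by the first matching rule,
// or an empty string if no rule matches.
std::string storeIdFor(const std::vector<Rule> &rules, const std::string &url)
{
    for (const auto &rule : rules) {
        std::smatch m;
        if (std::regex_search(url, m, rule.pattern))
            return m.format(rule.rewrite); // expands $1, $2, ...
    }
    return std::string();
}
```

Because the loop stops at the first match, placing the most frequently hit rules first reduces the number of regular expressions evaluated per URL.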
Alexis Robert [Tue, 9 Jul 2013 10:04:39 +0000 (22:04 +1200)]
Support IPv6 NAT interception on Linux
NAT support has been included for IPv6 in Linux 3.7 (along with
REDIRECT/DNAT rules), as well as IP6T_SO_ORIGINAL_DST in Linux 3.8.
Add support for transparent proxies over IPv6.
There is a bug in linux/netfilter_ipv6/ip6_tables.h on C++ compilers,
the bug report and patch to fix it can be found at
https://lkml.org/lkml/2012/9/30/146.
It is only used for the constant IP6T_SO_ORIGINAL_DST. We attempt to use
the official header whenever possible but if it is detected missing or
broken we define our own version of the option.
IPv6 is now permitted on any http_port or https_port in squid.conf
however on older Linux systems and Unix systems without the required NAT
support Squid will fail when accepting the traffic.
Also, this removes the blocker checks preventing BSD systems from using NAT
interception on IPv6 ports. Several versions of PF have long since
supported IPv6 NAT operations, although it was discouraged. Such support
is not easily detected, though, so results WILL vary by operating system.
Bug 3876: mDNS support segfault when using --disable-ipv6
When IPv6 is disabled the mDNS IPv6 multicast group gets rejected by
idnsAddnameserver() resulting in invalid pointers for the remaining
mDNS NS setup operations.
Convert the hard-coded mDNS nameserver count to a dynamic global count and
elide the relevant NS when IPv6 support is disabled.
Alex Rousskov [Tue, 2 Jul 2013 19:23:49 +0000 (13:23 -0600)]
Broadcast mem-cache writer departure to transient readers (in more/all cases).
Moved transientsAbandon() call to MemStore::disconnect() to make sure we
catch all cases where a mem-cache writer stops updating the cache entry.
Transient readers need to know so that they do not get stuck when a writer
disappears.
transientsAbandon() needs StoreEntry so MemStore::disconnect requires one now.
Alex Rousskov [Mon, 1 Jul 2013 19:59:32 +0000 (13:59 -0600)]
Do not become a store_client for entries that are not backed by Store.
If we ignore cache backing when becoming a store client, then
StoreEntry::storeClientType() is going to make us a DISK_CLIENT by default.
If there is no disk cache or it cannot be used for our entry, we will assert
in store_client constructor. Prevent those assertions by checking earlier in
StoreEntry::validToSend().
Alex Rousskov [Mon, 1 Jul 2013 02:25:50 +0000 (20:25 -0600)]
Several fixes and improvements to help collapsed forwarding work reliably:
Removed ENTRY_CACHABLE. AFAICT, it was just negating RELEASE_REQUEST.
Broadcast transients index instead of key because key may become private.
Squid uses private keys to resolve store_table collisions (among other
things). Thus, a public entry may become private at any time, at any worker.
Using keys results in collapsed entries getting stuck waiting for an update.
The transients index remains constant and can be used for reliable
synchronization.
Using transient index, however, requires storing a pointer to the transient
entry corresponding to that index. Otherwise, there is no API to find the
entry object when a notification comes: Store::Root().get() needs a key.
Mark an entry for release when setting its key from public to private. The old
code was only logging SWAP_LOG_DEL, but we now need to prevent requests in
other workers from collapsing on top of a now-private cache entry. In many
cases, such an entry is in trouble (but not all cases because private keys are
also used for store_table collision resolution).
Fixed syncing of abandoned entries.
Prevent new requests from collapsing on writer-less transient entries.
- The SSL_CTX_new in newer OpenSSL releases requires a const
'SSL_METHOD *' argument and in older releases requires a non-const
'SSL_METHOD *' argument. Currently we are trying to identify the OpenSSL
version using the OPENSSL_VERSION_NUMBER macro define but we are failing
to correctly identify all cases.
- sk_OPENSSL_PSTRING_value is buggy in early OpenSSL-1.0.0? releases,
causing compile errors in Squid.
Amos Jeffries [Sat, 29 Jun 2013 14:43:23 +0000 (08:43 -0600)]
Bug 3762: remove bogus WARNING in cache.log
The warning is bogus for several reasons:
* it appears with memory-only cache configurations
* it only checks the size of first SwapDir (as seen in bug 3762)
* very large memory spaces are now possible which may make disk appear
small by comparison.
Its usefulness in detecting memory and disk misconfigurations has long
been almost nil. Removing this entirely to resolve the bogus noise in
the above mentioned legitimate configurations.
Alex Rousskov [Thu, 27 Jun 2013 21:26:57 +0000 (15:26 -0600)]
Tightened StoreEntry locking. Fixed entry touching and synchronization code:
Tightened StoreEntry locking code to use accessors instead of manipulating the
locking counter directly. Helps with locking bugs detection. Do not consider
STORE_PENDING and SWAPOUT_WRITING entries locked by default because it is
confusing and might even leave zero lock_count but locked() entries in the
global table. Entry users should lock them instead.
StoreController::get() is now the only place where we touch() a store entry.
We used to touch entries every time they were locked, which possibly did not
touch some entries often enough (e.g. during Vary mismatches and such where
the get() entry is discarded) and definitely touched some entries too often
(every time the entry was locked multiple times during the same master
transaction). This addresses a design bug marked RBC 20050104.
Fixed interpretation of IN_MEMORY status. The status means that the store
entry was, at some point, fully loaded into memory. And since we prohibit
trimming of IN_MEMORY entries, it should still be fully loaded. Collapsing
changes started to use IN_MEMORY for partially loaded entries, which helps
detecting entries associated with the [shared] memory cache, but goes against
old Squid code assumptions, triggering assertions.
Handle synchronization of entries the worker is writing. Normally, the writing
worker will not receive synchronization notifications (it will send them) but
a stale notification is possible and should not lead to asserts. The worker
writing an entry will see a false mem_obj->smpCollapsed.
Do not re-anchor entries that were already anchored, fully loaded (ioDone),
and are now disassociated from the [shared] memory cache.
For shared caching to work reliably, StoreEntry::setReleaseFlag() should mark
cache entries for future release. We should not wait for release() time.
Waiting creates stuck entries because Squid sometimes changes the key from
public to private and collapsed forwarding broadcasts are incapable of
tracking such key changes (but they are capable of detecting entries abandoned
by their writers via the deletion mark in the transients table).
Alex Rousskov [Tue, 25 Jun 2013 17:51:30 +0000 (11:51 -0600)]
Avoid "STORE_DISK_CLIENT == getType()" assertions for ENTRY_ABORTED clients
and no disk cache configured.
StoreEntry::abort() makes entry STORE_OK, which makes
storeClientNoMoreToSend() return false for entries with unknown objectLen(),
triggering a disk read for some of them (when store_client::doCopy() cannot
schedule a memory read). If the entry is not really on disk, we hit an
assertion in store_client::scheduleDiskRead().
Alex Rousskov [Tue, 25 Jun 2013 16:06:37 +0000 (10:06 -0600)]
Various fixes related to overlapping and collapsed entry caching.
Wrote Transients description, replacing an irrelevant copy-pasted comment.
Maintain proper transient entry locks, distinguishing reading and writing
cases.
Fixed transients synchronization logic. Store::get() must not return
incomplete from-cache entries, except for local or transient ones. Otherwise,
the returned entry will not be updated when its remote writer makes changes.
Marked entries fully loaded from the shared memory cache as STORE_OK.
Avoid caching ENTRY_SPECIAL in the shared memory cache for now. This is not
strictly necessary, I think, but it simplifies shared caching log when
triaging start-test-analyze test cases. The restriction can be removed
when ENTRY_SPECIAL generation code becomes shared cache-aware, for example.
Fixed copy-paste error in Transients::disconnect().
Changed CollapsedForwarding::Broadcast() profile in preparation for excluding
broadcasts for entries without remote readers.
Do not purge entire cache entries just because we have to trim their RAM
footprint. The old code assumed that non-swappable entries may not have any
other stored content (which is no longer correct because they may still reside
in the shared memory cache) so it almost made sense to purge them, but it is
possible for clients to use partial in-RAM data when serving range requests,
so we should not be purging unless there are other reasons to do that. This
may expose client-side bugs if the hit validation code is not checking for RAM
entries being incomplete.
Allow MemObject::trimUnSwappable() to be called when there is nothing to trim.
This used to be a special case in StoreEntry::trimMemory(), but we do not need
it anymore after the above change.
Added transient and shared memory indexes to StoreEntry debugging summaries.
Alex Rousskov [Tue, 25 Jun 2013 15:39:10 +0000 (09:39 -0600)]
Mark client streams that sent everything as STREAM_COMPLETE.
The old code used STREAM_UNPLANNED_COMPLETE if the completed stream was
associated with a non-persistent connection, which did not make sense to me
and, IIRC, led to store entry aborts even though the entries were not damaged
in any way.
This change may expose other subtle bugs, but none are known at this time.
See also:
http://www.squid-cache.org/mail-archive/squid-dev/200702/0017.html
http://www.squid-cache.org/mail-archive/squid-dev/201102/0210.html
Alex Rousskov [Mon, 24 Jun 2013 17:05:13 +0000 (11:05 -0600)]
Removed StoreEntry::hidden_mem_obj.
Replaced MemObject::url with MemObject::urlXXX() and storeId().
* Replace StoreEntry::hidden_mem_obj hack with explicit MemObject::setUris().
We need MemObject to tie Store::get() results to locked memory cache entries
and such but Store::get() does not know the entry URIs so we had to use fake
"TBD" URIs instead. The hidden_mem_obj hack was added to minimize chances
that those temporary "TBD" URIs are going to be logged or forwarded.
However, new code uses MemObject cache ties a lot more, and it became too
cumbersome and error prone to always check whether there is a hidden object
holding indexes of locked StoreMap entries. It should be easier to ensure
that true URIs are set after Store::get() instead.
* Provide accessors for MemObject::url (which is actually a store ID these
days) and MemObject::log_url (which is usually the same as the url so we now
do not allocate it when it is the same). These accessors allow us to verify
that the caller is not going to use an undefined URI or Store ID because some
code forgot to set them explicitly.
* Add urlXXX() to mark old callers that appear to assume that MemObject::url
still holds a URI (instead of StoreID). Fixing those callers is outside this
project scope, but this was a good opportunity to identify/mark them because
we needed to hide raw Store ID field name ("url") anyway.
Alexis Robert [Mon, 24 Jun 2013 07:42:35 +0000 (01:42 -0600)]
Fix Ip::Address::operator =(sockaddr_storage)
The memcpy() for AF_INET6 is using a length of sizeof(sockaddr_in) instead
of sizeof(sockaddr_in6), so Squid was trying to connect to truncated IPv6
addresses with strange ports.
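A small sketch of the size mismatch, using standalone helpers rather than the real Ip::Address code (both function names are hypothetical). sockaddr_in is smaller than sockaddr_in6, so copying only sizeof(sockaddr_in) bytes cuts the IPv6 address short.

```cpp
#include <cstddef>
#include <cstring>
#include <netinet/in.h>
#include <sys/socket.h>

// Hypothetical helper: bytes to copy for the address family stored in `s`.
std::size_t sockaddrLength(const sockaddr_storage &s)
{
    switch (s.ss_family) {
    case AF_INET:
        return sizeof(sockaddr_in);
    case AF_INET6:
        return sizeof(sockaddr_in6);
    default:
        return 0; // unsupported family
    }
}

// Hypothetical helper: extracts an IPv6 socket address, copying the full
// sockaddr_in6 structure so that the address is not truncated.
bool extractIpv6(const sockaddr_storage &from, sockaddr_in6 &to)
{
    if (from.ss_family != AF_INET6)
        return false;
    std::memcpy(&to, &from, sizeof(sockaddr_in6)); // not sizeof(sockaddr_in)
    return true;
}
```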
Alex Rousskov [Sat, 22 Jun 2013 15:24:34 +0000 (09:24 -0600)]
Various shared memory-based collapsed forwarding improvements and fixes.
Lock transient entries while in use. Transient entry presence is
used to detect collapsed entry aborts for not-yet-cached entries.
Store current transient locks and memory cache entry state in MemObject. Why
not in StoreEntry like the disk cache does? To avoid penalizing those Stores
that keep idle StoreEntries in RAM.
Mark collapsing entries specially (in MemObject) so that we can stop updating
(un-tie) local entries that tried to collapse but did not like the collapsed
hit object that they started to get from another worker. When this happens,
the client side creates a new StoreEntry, but without a flag Store cannot tell
whether that entry needs to be kept in sync with the collapsed writer because
both the old entry and the new one have the same key. We may eventually find
a better way to distinguish the two cases.
Do not require MemObjects to be disassociated from various caches during
shutdown because Squid is currently incapable of maintaining Store::Root()
during shutdown.
Support incremental shared memory caching. Maintain and honor the
ENTRY_FWD_HDR_WAIT flag. Maintain shared memory cache reading/writing states.
Better updates of collapsed entries. Detect aborted entries. Do not release
entries that are not yet cached anywhere at the update time.
Alex Rousskov [Sat, 22 Jun 2013 15:11:30 +0000 (09:11 -0600)]
Properly reinitialize reused anchor.start and slice.size.
Since we allowed readers and [appending] writers to share an entry, it is
no longer possible to implement abortIo(). The caller must either close
the reading entry or abort the writing one, depending on the caller's lock.
Alex Rousskov [Fri, 21 Jun 2013 22:04:04 +0000 (16:04 -0600)]
Make !lock.readers and !lock.writers assertions safe.
The lock class used readers level counter to count both attempts to read and
current readers. The attempts part made assertions declaring that there should
be no readers unsafe because even a writing entry may have a reading attempt.
Same for writers counter: A reading entry may have a writing attempt.
We now segregate the attempts level, which is internal information required
for shared lock to work, from counting the number of successful attempts
(i.e., actual readers and writers), which is public information useful for
assertions, stats, etc.
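The separation of attempt levels from success counts can be sketched roughly as follows. This is a simplified, hypothetical lock written for illustration; the member names and the exact locking protocol are assumptions, not the real lock class.

```cpp
#include <atomic>

// Simplified, hypothetical lock; member names are illustrative.
class ReadWriteLock
{
public:
    bool lockShared()
    {
        ++readLevel_; // announce the attempt; keeps new writers out
        if (writeLevel_ == 0) {
            ++readers; // the attempt succeeded
            return true;
        }
        --readLevel_; // the attempt failed; withdraw it
        return false;
    }

    bool lockExclusive()
    {
        if (writeLevel_++ == 0 && readLevel_ == 0) {
            ++writers; // the attempt succeeded
            return true;
        }
        --writeLevel_; // the attempt failed; withdraw it
        return false;
    }

    void unlockShared() { --readers; --readLevel_; }
    void unlockExclusive() { --writers; --writeLevel_; }

    std::atomic<int> readers{0}; // successful readers only; safe to assert on
    std::atomic<int> writers{0}; // successful writers only; safe to assert on

private:
    std::atomic<int> readLevel_{0};  // readers plus in-progress read attempts
    std::atomic<int> writeLevel_{0}; // writers plus in-progress write attempts
};
```

A failed write attempt briefly raises writeLevel_ but never touches writers, so an assertion such as "writers == 0" on a reading entry stays valid.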
Alex Rousskov [Fri, 21 Jun 2013 00:50:35 +0000 (18:50 -0600)]
Fixed ipc/Queue notification race leading to stuck, overflowing queues.
The writer calling OneToOneUniQueue::push() must tell readers if it places the
first item into a previously empty queue. We used to determine emptiness prior
to incrementing queue size. That created a window between wasEmpty calculation
and queuing the new item (by incrementing the queue size). During that window,
the readers could pop() all previously queued items (resulting in an empty
queue) but since that happened after wasEmpty was computed to be false, the
writer would not notify them about the new item it just placed, and they would
get stuck, eventually resulting in queue overflow errors.
The fix attempts to increment the queue size and extract the previous size
value atomically.
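The atomic "increment and read the previous size" step can be sketched with std::atomic. QueueSize is a hypothetical stand-in that models only the size counter, not the real OneToOneUniQueue.

```cpp
#include <atomic>

// Hypothetical stand-in that tracks only the queue size.
class QueueSize
{
public:
    // Accounts for one pushed item. Returns true if the queue was empty
    // just before this push, meaning the reader must be notified.
    bool push()
    {
        // fetch_add increments and returns the old value in one atomic
        // step, leaving no window between the emptiness check and the
        // size update.
        const int previous = size_.fetch_add(1);
        return previous == 0;
    }

    // Accounts for one popped item. Returns false if the queue is empty.
    bool pop()
    {
        int current = size_.load();
        while (current > 0) {
            // On failure, compare_exchange_weak reloads `current`.
            if (size_.compare_exchange_weak(current, current - 1))
                return true;
        }
        return false;
    }

private:
    std::atomic<int> size_{0};
};
```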
- The redirectStateData handlers require the HelperReply::Okay helper reply
result code and otherwise drop the helper reply, but we always pass them
the HelperReply::Unknown reply result code.
- NotePairs does not support the "=" operator. This patch replaces one such
assignment with a NotePairs::append call, and also adds an unimplemented private
= operator and copy constructor to prevent developers from using them.
Amos Jeffries [Tue, 18 Jun 2013 23:26:17 +0000 (17:26 -0600)]
Add Master Transaction class
... to store and propagate the shared state used end-to-end through Squid
for logging or server-side component input. This excludes Job and Call
pointers, but does include any 'factual' data regarding the transaction.
Alex Rousskov [Tue, 18 Jun 2013 22:30:39 +0000 (16:30 -0600)]
Make sure %<tt includes all [failed] connection attempts.
The old code was using zero n_tries to detect the first connection attempt,
but n_tries is not incremented when we are opening a new connection rather
than reusing an old one. Perhaps n_tries should be updated differently as
well, but this change simply makes %<tt (hier.total_response_time) management
independent from that [complex] counter.
This patch modifies the Squid cert validation subsystem to send the cert validator
helper the complete certificate chain, not only the certificates sent by the
web server. This may not be possible in all cases, for example in cases
where the root certificate is not stored locally.
This patch also includes a small optimization: it checks for a domain mismatch
error only when the checked (current) certificate is the server certificate.
Deprecate log_icap and log_access configuration directives
The log_icap and log_access directives are not really needed to control request
logging. One can use ACLs with the access_log and icap_log configuration
directives for this purpose.
Also, requests currently denied logging by the log_access access list
are not accounted for in performance counters.
This patch:
- removes log_icap and log_access options from configuration file.
- adds the "stats_collection" access list to control performance counters
accounting.
Alex Rousskov [Mon, 10 Jun 2013 20:46:08 +0000 (14:46 -0600)]
Support forwarding intercepted but not bumped connections to cache_peers.
When talking to a cache_peer (i.e., sending a CONNECT request before tunneling
the transaction), tunnel code is using a clever hack: Squid does not parse
the CONNECT response from peer but blindly forwards it to the client. This
works great and simplifies code a lot, except when the client connection
was intercepted and, hence, the client did not send a CONNECT request and
is not expecting a CONNECT response.
In those situations, we now accumulate, parse, and strip the peer CONNECT
response (or close connection on errors).
The existing tunnel I/O code is too simple to accommodate that task -- it
cannot accumulate read data (its I/O buffers work in lockstep fashion, writing
everything it reads before reading again). Instead of rewriting the entire
tunnel code to use more complex buffers, I added a temporary accumulation
buffer for the CONNECT response. That buffer is not allocated unless it is
needed and does not grow beyond SQUID_TCP_SO_RCVBUF size, just like the
simple buffers.
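The accumulate, parse, and strip steps might look roughly like the sketch below. ConnectReplyAccumulator and PeerReply are hypothetical names, the status check is deliberately crude, and the size limit is passed in rather than taken from SQUID_TCP_SO_RCVBUF. The real tunnel code is not shown here.

```cpp
#include <cstddef>
#include <string>

enum class PeerReply { needMore, established, error };

// Hypothetical accumulator for the peer CONNECT response.
class ConnectReplyAccumulator
{
public:
    explicit ConnectReplyAccumulator(const std::size_t limit): limit_(limit) {}

    // Feeds newly read bytes. On `established`, `payload` receives any bytes
    // that followed the stripped CONNECT response header.
    PeerReply feed(const std::string &bytes, std::string &payload)
    {
        buf_ += bytes;
        const std::size_t end = buf_.find("\r\n\r\n");
        if (end == std::string::npos) // header is still incomplete
            return buf_.size() >= limit_ ? PeerReply::error : PeerReply::needMore;

        // Crude status check: "HTTP/" prefix and " 200" at the first space.
        const std::size_t space = buf_.find(' ');
        if (space == std::string::npos ||
                buf_.compare(0, 5, "HTTP/") != 0 ||
                buf_.compare(space, 4, " 200") != 0)
            return PeerReply::error;

        payload = buf_.substr(end + 4); // strip the response header
        return PeerReply::established;
    }

private:
    std::size_t limit_; // give up once this many bytes hold no full header
    std::string buf_;   // grows only while the header is incomplete
};
```

On `error`, the caller would close the connection; on `established`, only `payload` is forwarded to the intercepted client, which never sees the CONNECT response.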
Alex Rousskov [Sat, 8 Jun 2013 23:21:23 +0000 (17:21 -0600)]
Fix detection of concurrent ACLChecklist checks, avoiding !accessList asserts.
Concurrent checks are not supported, but it is possible for the same
ACLChecklist to be used for a sequence of checks, alternating fastCheck(void)
and fastCheck(list) calls. We needed a different/dedicated mechanism to detect
check concurrency (added ACLChecklist::occupied_), and we needed to preserve
(and then restore) pre-set accessList during fastCheck(list) checks.
Alex Rousskov [Sat, 8 Jun 2013 00:56:36 +0000 (18:56 -0600)]
Simplified MemObject::write() API.
The API required a callback, but the call was always synchronous and the
required callback mechanism could not reliably support an async call anyway.
The method adjusted the buffer offset to become relative to headers rather
than body. While the intent to separate headers from body is noble, none of
the existing caches support that separation, and a different API will be
needed to support it correctly anyway. For now, let's reduce the number of
special cases and offset manipulations.
Alex Rousskov [Fri, 7 Jun 2013 23:34:36 +0000 (17:34 -0600)]
Support "appending" read/write lock state that can be shared by readers
and writer. Writer promises not to update key metadata (except growing
object size and next pointers) and readers promise to be careful when
reading growing slices.
Support copying of partially cached entries from the shared memory cache to
local RAM. This is required for collapsed shared memory hits to receive new
data during broadcasted updates.
Properly unlock objects in the shared memory cache when their entries are
abandoned by a worker. This was not necessary before because we never locked
memory cache entries for more than a single method call. Now, with partially
cached entries support, the locks may persist much longer.
Properly delete objects from the shared memory cache when they are purged by a
worker. Before this change, locally purged objects may have stayed in the
shared memory cache.
Update disk cache index _after_ the changes are written to disk. Another
worker may be using that index and will expect to find the indexed slices on
disk. Disk queues are not FIFOs across workers.
Made CollapsedForwarding work better in non-SMP mode.
Polished broadcasting code. We need to broadcast entry key because the entry
may not have any other information (it may no longer be cached by the sender,
for example).
Implemented "anchoring" in-transit entries when the writer caches the
corresponding object. This allows the reader's entry object to reflect its
cached status and, hence, be able to ask for cached data during broadcasted
entry updates. Still need to handle the case where the writer does not cache
the object (by aborting collapsed hit).