Timo Sirainen [Thu, 16 Apr 2020 12:59:24 +0000 (15:59 +0300)]
lib-index: Fix cache purging when index is being rebuilt
All the messages until trans->first_new_seq no longer exist after reset.
At best they cause confusion and are ignored. They could also point to
high UIDs that the rebuilding removes, causing further corruption errors.
Timo Sirainen [Sun, 5 Apr 2020 15:36:44 +0000 (18:36 +0300)]
lib-index: Cache purging shouldn't always change YES decisions to TEMP
Previously every purge changed YES decisions to TEMP.
This behaved rather badly if a cache file was purged twice within a short
time period, because the clients might not have had time to access the
mailbox and change the decision back to YES. That in turn could have
dropped old mails' cached fields even though the client might still want
to use them. (A workaround for this has been to list all useful fields in
mail_always_cache_fields setting.)
The new behavior is to update the last_used field for a YES decision only
when the YES decision has been confirmed. If it's not confirmed for 30
days (mail_cache_unaccessed_field_drop) then its decision is changed to
TEMP on the next purge.
The new behavior also doubles the time after which an unaccessed field is
dropped from 30 days to 60 days (2*mail_cache_unaccessed_field_drop). This
is needed so that the field isn't dropped too early after its decision is
changed from YES to TEMP.
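The purge-time rules described above can be sketched roughly as follows.
This is only an illustration: the enum, the struct, and purge_check_field()
are hypothetical names rather than lib-index code, and whether the real
comparisons are strict or inclusive at exactly 30/60 days is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

#define SECS_PER_DAY (24 * 60 * 60)

/* Hypothetical types for illustration; not Dovecot's actual structs. */
enum cache_decision { DECISION_NO, DECISION_TEMP, DECISION_YES };

struct cache_field {
	enum cache_decision decision;
	time_t last_used; /* for YES: bumped only when the decision is confirmed */
};

/* Apply the purge-time rules to one field. drop_secs plays the role of
   mail_cache_unaccessed_field_drop. Returns false if the field should be
   dropped. The boundary comparisons (>= rather than >) are assumptions. */
static bool purge_check_field(struct cache_field *field, time_t now,
			      time_t drop_secs)
{
	assert(drop_secs > 0);
	time_t idle = now - field->last_used;

	if (idle >= 2 * drop_secs)
		return false; /* unaccessed for twice the limit: drop it */
	if (field->decision == DECISION_YES && idle >= drop_secs)
		field->decision = DECISION_TEMP; /* YES was not confirmed in time */
	return true;
}
```

With a 30 day limit, a YES field last confirmed 10 days ago is left alone,
one last confirmed 31 days ago becomes TEMP, and one untouched for 61 days
is reported as droppable.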
Timo Sirainen [Sun, 5 Apr 2020 20:30:41 +0000 (23:30 +0300)]
lib-storage: Initialize new cache hdr.* fields with NO decision
Calling mail_cache_add() afterwards will change their decision to TEMP.
If it's not called, the field probably wasn't meant to be cached anyway.
Most importantly, this change causes the mail_cache_decision event to be
triggered for these newly cached headers.
It can only happen on I/O errors, which most likely means that there's no
disk space to write to transaction log. There's no reason to delete the
entire cache, since it might be expensive to recreate.
Timo Sirainen [Thu, 2 Apr 2020 21:41:55 +0000 (00:41 +0300)]
lib-storage: Use mailbox.event as parent to mail_index
The index lives longer than the struct mailbox, which is a bit confusing.
In some cases the index could even be used for a different mailbox name
(symlink/alias), in which case the event's mailbox name wouldn't be exactly
correct. However, these downsides are still preferable to not inheriting
from the mailbox event, since then there is no mailbox name.
Timo Sirainen [Tue, 31 Mar 2020 11:59:25 +0000 (14:59 +0300)]
lib-index: Rename cache compress -> cache purge
The term "purge" is less confusing than "compress". This was already
decided earlier for "doveadm mailbox cache purge" name, and it's similar
to mdbox purging.
Timo Sirainen [Wed, 1 Apr 2020 21:29:07 +0000 (00:29 +0300)]
lib-index: Make sure purging clears saved bitmask offsets state between mails
This is just to make sure it doesn't try to use the previous mail's offset
when merging bitmasks. This shouldn't have been possible anyway, because the
field_seen array guaranteed that the bitmask_pos was always set before
reading it. This just makes extra sure of it.
Timo Sirainen [Tue, 31 Mar 2020 15:53:33 +0000 (18:53 +0300)]
lib-index: Wait for lock in cache compression, not just try once
Sometimes there's an important reason for the cache to be compressed, and it
shouldn't give up just because there's a race condition with another process
that just happens to be writing to it.
Timo Sirainen [Wed, 1 Apr 2020 19:19:52 +0000 (22:19 +0300)]
lib-index: Don't compress cache file if there are cache transactions with changes
The cache compression could have lost changes in the uncommitted
transactions if they had already written some changes to the file.
This seems to have happened sometimes at least with dsync.
Timo Sirainen [Wed, 1 Apr 2020 13:31:32 +0000 (16:31 +0300)]
lib-index: Cache compression could have lost data for recently added mails
mail_index_sync_ctx.view may not have seen all the newly added mails,
which caused compression to skip those mails. Use a whole new view
instead, which guarantees that all mails are seen.
Timo Sirainen [Tue, 31 Mar 2020 09:08:43 +0000 (12:08 +0300)]
lib-index: Fix cache lookup for a newly added field in non-committed transaction
The cache lookup didn't return anything unless the field already existed in
the current cache file. It could still have already been added to the
transaction and found from there.
Timo Sirainen [Tue, 31 Mar 2020 08:40:58 +0000 (11:40 +0300)]
lib-index: Don't even try to add fields to cache that exceed record_max_size
Previously these were temporarily added to memory and later on the whole
cache record was just dropped. But usually not all the cache fields are
huge, so it's better to add some fields to cache than none.
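A minimal sketch of the per-field check, assuming each field's size is
compared against record_max_size on its own. cache_add_fields() and its
parameters are hypothetical, and the sketch does not model what happens
when the accumulated record itself grows too large.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns the total size of the fields that would be
   added to the cache record and stores their count in *added_count_r.
   Fields larger than record_max_size are skipped individually instead of
   causing the whole record to be dropped later. */
static size_t cache_add_fields(const size_t *field_sizes, size_t count,
			       size_t record_max_size, size_t *added_count_r)
{
	assert(field_sizes != NULL && added_count_r != NULL);
	size_t total = 0;

	*added_count_r = 0;
	for (size_t i = 0; i < count; i++) {
		if (field_sizes[i] > record_max_size)
			continue; /* skip only this oversized field */
		total += field_sizes[i];
		(*added_count_r)++;
	}
	return total;
}
```

Given field sizes of 10, 5000 and 20 bytes and a 1000 byte limit, the two
small fields are still cached and only the 5000 byte field is left out.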
Timo Sirainen [Wed, 1 Apr 2020 19:11:21 +0000 (22:11 +0300)]
lib-index: Cache transaction rollback used too small value to increase deleted_record_count
Rollback used records_written to update it, but that was actually counting
the number of transaction flushes, not the number of individual mail cache
records that were written.
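The difference between the two counters can be illustrated as below. This
is a sketch under assumptions: the struct and function names are invented
for the example, and only the idea of counting flushes versus counting
individual records comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical counters kept by a cache transaction. */
struct cache_trans_counters {
	unsigned int flush_count;     /* how many times the transaction flushed */
	unsigned int records_written; /* individual cache records written */
};

/* One flush may write many records, so the two counters diverge. */
static void trans_flush(struct cache_trans_counters *t,
			unsigned int records_in_flush)
{
	assert(t != NULL);
	t->flush_count++;
	t->records_written += records_in_flush;
}

/* On rollback every record already written becomes garbage, so the
   deleted record count must grow by the number of records, not flushes.
   Returns the updated deleted record count. */
static unsigned int trans_rollback(const struct cache_trans_counters *t,
				   unsigned int deleted_record_count)
{
	assert(t != NULL);
	return deleted_record_count + t->records_written;
}
```

After two flushes of 3 and 5 records, a rollback should add 8 to the
deleted record count; adding the flush count would have added only 2.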
Timo Sirainen [Mon, 30 Mar 2020 13:48:21 +0000 (16:48 +0300)]
lib-index: mail_cache_*lock() - On reset_id mismatch wait for cache compression to finish
It was possible that the reset_id mismatch happened because cache
compression had already recreated the .cache file, but hadn't yet written
the new reset_id to the .log file. By waiting for the .log lock, we can be
sure that this won't happen.
If reading a cache file failed because of some temporary syscall error, it
was treated the same as if the cache was corrupted. This could have
caused compression to lose cached data.
Timo Sirainen [Tue, 31 Mar 2020 13:43:07 +0000 (16:43 +0300)]
lib-index: Fix checking if cache should be compressed because it's getting too large
Most importantly, fixed updating last_stat_size whenever the cache is locked.
This now guarantees that it's always fully up-to-date when it's being
checked. Most of the other places updating last_stat_size are now
unnecessary, but I left them in case they'll be useful in the future.
Also fixed an off-by-one where purging was done even when the cache was
exactly at its maximum size.
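The off-by-one amounts to using a strict comparison. A minimal sketch,
where cache_size_needs_purge() is a hypothetical name and file_size stands
for the freshly updated last_stat_size:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical check: purge only when the cache has grown past its
   maximum size. A file exactly at the maximum is still acceptable. */
static bool cache_size_needs_purge(uint64_t file_size, uint64_t max_size)
{
	assert(max_size > 0);
	return file_size > max_size; /* ">=" would be the off-by-one */
}
```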
Now that transaction log is always locked during cache compression, it
doesn't matter if the cache is unlocked before or after the transaction log
commit.
Timo Sirainen [Fri, 27 Mar 2020 10:49:40 +0000 (12:49 +0200)]
doveadm: Rewrite "doveadm mailbox cache purge" to use mail_cache_compress()
Also removed checks to verify whether the cache file existed or was usable.
Compression will create/fix the file, which is more likely to be the wanted
behavior than failing.
This can be used to specify when the cache file should be compressed.
This was previously hidden in the cache->compress_file_seq field, but
now it's more explicit.
This commit also makes sure that cache is compressed when index rebuilding
is done.
Timo Sirainen [Fri, 27 Mar 2020 16:07:53 +0000 (18:07 +0200)]
lib-index: If cache transaction lock fails because reset_id mismatch, compress cache
This is also done in mail_cache_transaction_open_if_needed(), so it usually
worked anyway. But that function is otherwise unnecessary, so it's going
away in the next commit.