From: Timo Sirainen
Date: Wed, 30 Jun 2004 12:18:35 +0000 (+0300)
Subject: Cache decision explanation comment.
X-Git-Tag: 1.1.alpha1~3847
X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=3c57664b9dce82cd3e43347394b92ef3591b8901;p=thirdparty%2Fdovecot%2Fcore.git

Cache decision explanation comment.

--HG--
branch : HEAD
---

diff --git a/src/lib-index/mail-cache-decisions.c b/src/lib-index/mail-cache-decisions.c
index 0864620090..d26532b95a 100644
--- a/src/lib-index/mail-cache-decisions.c
+++ b/src/lib-index/mail-cache-decisions.c
@@ -1,5 +1,71 @@
 /* Copyright (C) 2004 Timo Sirainen */
 
+/*
+   Users can be divided to three groups:
+
+   1. Most users will use only a single IMAP client which caches everything
+      locally. For these users it's quite pointless to do any kind of caching
+      as it only wastes disk space. That might also mean more disk I/O.
+
+   2. Some users use multiple IMAP clients which cache everything locally.
+      These could benefit from caching until all clients have fetched the
+      data. After that it's useless.
+
+   3. Some clients don't do permanent local caching at all. For example
+      Pine and webmails. These clients would benefit from caching everything.
+      Some locally caching clients might also access some data from server
+      again, such as when searching messages. They could benefit from caching
+      only these fields.
+
+   After thinking about these a while, I figured out that people who care
+   about performance most will be using Dovecot optimized LDA anyway
+   which updates the indexes/cache immediately. In that case even the first
+   user group would benefit from caching the same way as second group. LDA
+   reads the mail anyway, so it might as well extract some information
+   about it and store them into cache.
+
+   So, group 1. and 2. could be optimally implemented by keeping things
+   cached only for a while. I thought a week would be good. When cache file
+   is compressed, everything older than week will be dropped.
+
+   But how to figure out if user is in group 3? One quite easy rule would
+   be to see if client is accessing messages older than a week. But with
+   only that rule we might have already dropped useful cached data. It's
+   not very nice if we have to read and cache it twice.
+
+   Most locally caching clients always fetch new messages (all but body)
+   when they see them. They fetch them in ascending order. Noncaching
+   clients might fetch messages in pretty much any order, as they usually
+   don't fetch everything they can, only what's visible in screen. Some
+   will use server side sorting/threading which also makes messages to be
+   fetched in random order. Second rule would then be that if a session
+   doesn't fetch messages in ascending order, the fetched field type will
+   be permanently cached.
+
+   So, we have three caching decisions:
+
+   1. Don't cache: Clients have never wanted the field
+   2. Cache temporarily: Clients want this only once
+   3. Cache permanently: Clients want this more than once
+
+   Different mailboxes have different decisions. Different fields have
+   different decisions.
+
+   There are some problems, such as if a client accesses message older than
+   a week, we can't know if user just started using a new client which is
+   just filling it's local cache for the first time. Or it might be a
+   client user hasn't just used for over a week. In these cases we
+   shouldn't have marked the field to be permanently cached. User might
+   also switch clients from non-caching to caching.
+
+   So we should re-evaluate our caching decisions from time to time. This
+   is done by checking the above rules constantly and marking when was the
+   last time the decision was right. If decision hasn't matched for two
+   months, it's changed. I picked two months because people go to at least
+   one month vacations where they might still be reading mails, but with
+   different clients.
+*/
+
 #include "lib.h"
 #include "write-full.h"
 #include "mail-cache-private.h"
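
The rules in the comment above could be sketched roughly as follows. This is only an illustration and is not taken from mail-cache-decisions.c: the names field_state, session_state and field_accessed are hypothetical, "two months" is assumed to mean 60 days, and the comment does not say what a stale decision is changed to, so the sketch assumes it falls back to whatever the rules currently suggest.

```c
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

#define TEMP_CACHE_SECS (7 * 24 * 60 * 60)   /* "a week" */
#define REEVALUATE_SECS (60 * 24 * 60 * 60)  /* "two months", assumed to be 60 days */

enum cache_decision {
	DECISION_NO,   /* 1. Don't cache: clients have never wanted the field */
	DECISION_TEMP, /* 2. Cache temporarily: clients want this only once */
	DECISION_YES   /* 3. Cache permanently: clients want this more than once */
};

/* Hypothetical state kept per mailbox and per field. */
struct field_state {
	enum cache_decision decision;
	time_t last_confirmed; /* last time the rules agreed with the decision */
};

/* Hypothetical state kept per client session. */
struct session_state {
	uint32_t highest_uid; /* highest message UID fetched so far */
};

/* Called whenever a client asks for this field of a message. */
static void field_accessed(struct field_state *field,
			   struct session_state *session,
			   uint32_t uid, time_t msg_date, time_t now)
{
	/* Rule 1: the client is accessing a message older than a week. */
	bool old_message = now - msg_date > TEMP_CACHE_SECS;
	/* Rule 2: the session is not fetching in ascending order. */
	bool out_of_order = uid < session->highest_uid;
	enum cache_decision wanted =
		(old_message || out_of_order) ? DECISION_YES : DECISION_TEMP;

	if (uid > session->highest_uid)
		session->highest_uid = uid;

	if (wanted >= field->decision) {
		/* The rules confirm or strengthen the decision: apply at once. */
		field->decision = wanted;
		field->last_confirmed = now;
	} else if (now - field->last_confirmed > REEVALUATE_SECS) {
		/* The decision has not matched for two months: change it. */
		field->decision = wanted;
		field->last_confirmed = now;
	}
}
```

A field never requested by any client simply stays at DECISION_NO, since field_accessed() is never called for it. Dropping temporarily cached data older than a week when the cache file is compressed would be a separate step that this sketch does not cover.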