From: Timo Sirainen
Date: Wed, 16 Jul 2003 01:46:20 +0000 (+0300)
Subject: Updated, added ideas how to create better NFS indexes.
X-Git-Tag: 1.1.alpha1~4479
X-Git-Url: http://git.ipfire.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=00ed7b8c0d0842c0b8fd1ab16fb206786dff321a;p=thirdparty%2Fdovecot%2Fcore.git

Updated, added ideas how to create better NFS indexes.

--HG--
branch : HEAD
---

diff --git a/doc/nfs.txt b/doc/nfs.txt
index 2ef24f87cf..0d7805b0aa 100644
--- a/doc/nfs.txt
+++ b/doc/nfs.txt
@@ -1,32 +1,68 @@
-Index files aren't NFS safe, and they likely won't be made. You can access
-the actual mailboxes via NFS, but place the indexes into local hard disk
-(see mail-storage.txt).
+Index files can be stored over NFS by setting index_mmap_invalidate = yes.
+This may not be such a good idea however since they are not cached locally
+at all. Rereading them constantly over network can be more inefficient than
+simply accessing the mail files directly.
 
- - .customflags and .subscriptions files require fcntl() locking currently.
-   Many NFS servers don't support it, at least without a separate lockd
-   daemon. These will be fixed later to be NFS-safe by default.
+Better way is to store indexes locally while still keeping the mailboxes
+stored over NFS. The local index doesn't have to be fully synced all the
+time so it doesn't matter if you use a couple of machines to access the
+mailbox. In clustered environment it'd be better to try to keep the users
+directed into same machine whenever possible.
 
- - .lock files currently rely on O_EXCL which doesn't work with NFSv2
-   servers. Also Linux's NFSv3 client still doesn't work with this either.
-   This will be fixed later to use temporary files and link().
+So append either :INDEX=MEMORY or :INDEX=/local/var/indexes/%u to
+default_mail_env.
+
+ - .customflags files requires fcntl() locking currently. Many NFS servers
+   don't support it, at least without a separate lockd daemon. This will be
+   fixed later to be NFS-safe by default. .customflags belongs to indexes
+   with mbox, so this is a real problem only with Maildir.
 
  - gethostname() must return different name for each IMAP server accessing
    a user's mailboxes
 
  - Clocks should be synchronized or things can start to fail.
 
+
 If you _really_ wish to try using indexes via NFS:
 
- - Indexes are shared mmap()ed and we rely on noticing changes made by others.
-   If your OS doesn't perform some magical mmap() updates (likely won't),
-   you'll need to modify the code so that each index update will update
-   sync_id in the header, and each time when index is accessed, the sync_id
-   should be read(). If it has changed, the file has to be re-mmap()ed.
+ - Set index_mmap_invalidate = yes
 
  - Indexes require fcntl() locking. .lock files would be pretty slow, but
-   possible.
+   possible except for modify log.
+
+
+Ideas how to make indexes work pretty well with NFS:
+
+Reading shouldn't require locks, so modifying should be done using only
+atomic operations. These include:
+
+ - Replacing the file completely using rename()
+ - We probably can't assume that writing more than one byte at a time is
+   atomic. If we have to modify a larger dataset, we could do:
+    - struct { bit_t use_first; dataset_t first; dataset_t second; }
+    - when reading, we use first if use_first is set, second if it's unset
+    - when writing, we first write to the non-used variable, only then we
+      update the flag.
+    - This of course requires twice the amount of space for dataset plus
+      one extra bit, so it shouldn't be used too much
+    - If data can be only set, but never changed, we need only one extra bit
+      to specify if the data is set.
+ - Appending new data to end of file. We'd have to have used_file_size
+   variable in header, done like described above.
+    - Each cached message record would have a pointer to next part, so more
+      cached data could be appended to file.
+
+Message flags are the most commonly modified data. If we just modify them
+directly, a simultaneous read might catch the change only partially. But
+luckily for us, this is accepted behaviour so we can do it.
+
+Another commonly modified data is maildir filenames. We probably want to
+store only the base name in index and keep the full name synchronized only
+locally.
+
+Compressing unused data from index files would have to be done by rewriting
+the index into index.lock file and renaming it over the index file.
 
- - Modifylog uses fcntl() for figuring out when to delete the log file, and
-   assumes that changing file locking between F_RDLCK / F_WRLCK is atomic
-   (not sure if this is the case with all operating systems, I hope so).
-   This anyway would be more difficult to change not to use fcntl().
+All file operations should probably be done with lseek(), read() and
+write() to avoid extra network traffic. There should be some clever
+read-ahead caching however.
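
Note on the "use_first" idea in the patch text: the following is a minimal in-memory sketch in C of the two-copy scheme it describes, not code from Dovecot. The names dual_slot_t, slot_read() and slot_write(), and the fields inside dataset_t, are hypothetical; the patch only gives the struct outline. A whole byte stands in for the patch's bit_t so that flipping the selector is a single one-byte store.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical payload; the patch only calls it dataset_t. */
typedef struct {
    uint32_t uid;
    uint32_t flags;
} dataset_t;

/* Layout from the patch text: a selector plus two copies of the data. */
typedef struct {
    uint8_t use_first;  /* one byte, so changing it is a single 1-byte write */
    dataset_t first;
    dataset_t second;
} dual_slot_t;

/* Reader: use first if use_first is set, second if it is unset. */
static dataset_t slot_read(const dual_slot_t *s)
{
    assert(s != NULL);
    return s->use_first ? s->first : s->second;
}

/* Writer: fill the copy that is NOT in use, and only then flip the flag. */
static void slot_write(dual_slot_t *s, const dataset_t *value)
{
    assert(s != NULL && value != NULL);
    if (s->use_first) {
        s->second = *value;   /* readers still see first */
        s->use_first = 0;     /* switch readers over to second */
    } else {
        s->first = *value;    /* readers still see second */
        s->use_first = 1;     /* switch readers over to first */
    }
}
```

Starting from a zeroed dual_slot_t, the first slot_write() fills first and sets use_first, the next one fills second and clears it again, and slot_read() always returns the copy written last. The sketch only shows the ordering of the two steps. On a file shared over NFS the data write would also have to reach the server before the flag write, which this in-memory version does not address.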