While we attempt to use a lockfile to grant exclusive access to the
mapfile, our implementation is buggy. Specifically, we create a lockfile,
then lock it, at which point new instances can open the lockfile and
attempt to lock it, which will cause them to block. However, when we are
ready to unlock it, we unlink the file. This causes existing lock waiters
to get a lock on an unlinked inode while a different instance may now
create a new lockfile and get an exclusive lock on it.
There are several possible fixes. The chosen one is to test if
->s_nlink is zero after we get the lock and to retry if it isn't.
This means:
- failing to unlink a file doesn't leave a stale lock
- we can block waiting to get a lock rather than busy-waiting
- we don't need to leave a lock file permanently in place.
Reported-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
static int lwhich = 0;
int map_lock(struct map_ent **melp)
{
- if (lf == NULL) {
+ while (lf == NULL) {
+ struct stat buf;
lf = open_map(2, &lwhich);
if (lf == NULL)
return -1;
lf = NULL;
return -1;
}
+ if (fstat(fileno(lf), &buf) != 0 ||
+ buf.st_nlink == 0) {
+ /* The owner of the lock unlinked it,
+ * so we have a lock on a stale file,
+ * try again
+ */
+ fclose(lf);
+ lf = NULL;
+ }
}
if (*melp)
map_free(*melp);
void map_unlock(struct map_ent **melp)
{
if (lf) {
- flock(fileno(lf), LOCK_UN);
+ /* must unlink before closing the file,
+ * as only the owner of the lock may
+ * unlink the file
+ */
+ unlink(mapname[lwhich][2]);
fclose(lf);
}
- unlink(mapname[lwhich][2]);
lf = NULL;
}