A couple of people have reported xfs_repair hanging after it prints
"Traversing filesystem ...". This happens when all slots in the cache
are full and referenced: the loop in cache_node_get() which tries to
shake unused entries out of the cache never finds any, so it just
keeps upping the priority and spins forever.
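For reference, the pre-patch allocation loop has roughly this shape
(paraphrased and trimmed from libxfs/cache.c, so the allocate call
below is approximate rather than verbatim):

	priority = 0;
	for (;;) {
		node = cache_node_allocate(cache, key);
		if (node)
			break;
		/*
		 * With every slot referenced, cache_shake() can never
		 * free anything, so it keeps handing back priority + 1
		 * and this loop never terminates.
		 */
		priority = cache_shake(cache, priority, 0);
	}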
This can be worked around by restarting xfs_repair with -P, and/or
with "-o bhash=<largersize>" on older xfs_repair versions.
I started down the path of increasing the number of hash buckets
on the fly, but Barry suggested simply increasing the max allowed
depth instead, which is much simpler (thanks!).
Growing the cache without resizing the hash table does mean that
chains get long, and cache_report ends up with most buckets in the
"greater-than" category:
...
Hash buckets with  23 entries      3 (  3%)
Hash buckets with  24 entries      3 (  3%)
Hash buckets with >24 entries     50 ( 85%)
but I think I'll save that fix for another patch unless there's
real concern right now.
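For anyone who does pick up the bucket-resizing approach later, a
rough standalone sketch of what it involves is below. This is
entirely hypothetical and not part of this patch; the types and
names (struct node, cache_rehash, etc.) are illustrative stand-ins,
not the real libxfs/cache.c structures:

	#include <stdlib.h>

	/* Illustrative types only -- not the real cache structures. */
	struct node {
		unsigned long	key;
		struct node	*next;
	};

	struct cache {
		struct node	**buckets;
		unsigned int	nbuckets;
	};

	/*
	 * Hypothetical: double the bucket array and re-link every
	 * node into its new chain.  The real cache would also have
	 * to hold all of the per-bucket locks across this, which is
	 * what makes it messier than just raising the allowed depth.
	 */
	static int
	cache_rehash(struct cache *c)
	{
		unsigned int	newsize = c->nbuckets * 2;
		struct node	**newtab;
		unsigned int	i;

		newtab = calloc(newsize, sizeof(*newtab));
		if (!newtab)
			return -1;

		for (i = 0; i < c->nbuckets; i++) {
			struct node	*n = c->buckets[i];

			while (n) {
				struct node	*next = n->next;
				unsigned int	idx = n->key % newsize;

				n->next = newtab[idx];
				newtab[idx] = n;
				n = next;
			}
		}

		free(c->buckets);
		c->buckets = newtab;
		c->nbuckets = newsize;
		return 0;
	}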
I tested this on the metadump image provided by Tomek.
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reported-by: Tomek Kruszona <bloodyscarion@gmail.com>
Reported-by: Riku Paananen <riku.paananen@helsinki.fi>
Reviewed-by: Christoph Hellwig <hch@lst.de>
return cache;
}
+void
+cache_expand(
+ struct cache * cache)
+{
+ pthread_mutex_lock(&cache->c_mutex);
+#ifdef CACHE_DEBUG
+ fprintf(stderr, "doubling cache size to %d\n", 2 * cache->c_maxcount);
+#endif
+ cache->c_maxcount *= 2;
+ pthread_mutex_unlock(&cache->c_mutex);
+}
+
void
cache_walk(
struct cache * cache,
if (node)
break;
priority = cache_shake(cache, priority, 0);
+ /*
+ * We start at 0; if we free CACHE_SHAKE_COUNT we get
+ * back the same priority, if not we get back priority+1.
+ * If we exceed CACHE_MAX_PRIORITY all slots are full; grow it.
+ */
+ if (priority > CACHE_MAX_PRIORITY) {
+ priority = 0;
+ cache_expand(cache);
+ }
}
node->cn_hashidx = hashidx;
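For completeness, here is a tiny standalone model of the fixed retry
logic. It is illustrative only: shake() stands in for cache_shake()
in the everything-referenced worst case, and the constants are made
up rather than taken from the headers:

	#include <stdio.h>

	#define CACHE_MAX_PRIORITY	15	/* illustrative value */

	/*
	 * Stand-in for cache_shake() when nothing can be freed:
	 * the caller just gets priority + 1 back every time.
	 */
	static unsigned int
	shake(unsigned int priority)
	{
		return priority + 1;
	}

	int
	main(void)
	{
		unsigned int	priority = 0;
		unsigned int	maxcount = 1024;	/* c_maxcount stand-in */

		for (;;) {
			/* allocation fails: all slots full and referenced */
			priority = shake(priority);
			if (priority > CACHE_MAX_PRIORITY) {
				/* what used to spin forever now grows the cache */
				priority = 0;
				maxcount *= 2;
				break;	/* next allocation attempt can succeed */
			}
		}
		printf("cache grown to %u\n", maxcount);
		return 0;
	}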