simpe_xattr: use per-sb cache
Move the hash table to the super block to remove excessive overhead in case
of small number of xattrs per inode.
Add linked list to the inode, used for listxattr and eviction. Listxattr
uses rcu protection to iterate the list of xattrs.
Before being made per-sb, lazy allocation was protected by inode lock. Now
inode lock no longer provides sufficient exclusion, so use cmpxchg() to
ensure atomicity.
Though I haven't found a description of this pattern, after some research
it seems that cmpxchg_release() and READ_ONCE() should provide the
necessary memory barriers.
Use simple_xattr_free_rcu() in simple_xattrs_free(). This is needed because
the hash table is now shared between inodes and lookup on a different inode
might be running the compare function on the just freed element within the
RCU grace period.
Following stats are based on slabinfo diff, after creating 100k empty
files, then adding a "user.test=foo" xattr to each:
v7.0 (no rhashtable):
File creation: 993.40 bytes/file
Xattr addition: 79.99 bytes/file
v7.1-rc2 (per-inode rhashtable):
File creation: 939.73 bytes/file
Xattr addition: 1296.08 bytes/file
v7.1-rc2 + this patch (per-sb rhashtable)
File creation: 946.84 bytes/file
Xattr addition: 111.86 bytes/file
The overhead of a single xattr is reduced to nearly v7.0 levels. The per
xattr overhead is slightly larger due to the addition of three pointers to
struct simple_xattr.
Fixes: b32c4a213698 ("xattr: add rhashtable-based simple_xattr infrastructure")
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Link: https://patch.msgid.link/20260605135322.2632068-5-mszeredi@redhat.com
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>