xattrs: maintain a hashtable in order to speed up find_matching_xattr()
As a testcase I've used one directory on gpfs with
1000000 files,
each with an xattr called 'name$i' having a value of 'value$i'.
So we also have
1000000 unique xattrs. The source and dest directories
are already in sync before. So the rsync command is basically a noop,
just verifying that everything is already in sync.
The results before this patchset are:
[gpfs]# time rsync -a -P -X -q source-xattr/ dest-with-xattr/
real 8m46.191s
user 6m29.016s
sys 0m24.883s
[gpfs]# time rsync -a -P -q source-xattr/ dest-without-xattr/
real 1m58.462s
user 0m0.957s
sys 0m11.801s
With the patchset I got:
[gpfs]# time /gpfs/rsync.install/bin/rsync -a -P -X -q source-xattr/ dest-with-xattr/
real 2m4.150s
user 0m1.917s
sys 0m17.077s
[gpfs]# time /gpfs/rsync.install/bin/rsync -a -P -q source-xattr/ dest-without-xattr/
real 1m59.534s
user 0m0.924s
sys 0m11.599s
It means the time in userspace dropped from 6m29.016s down to 0m1.917s!
Without -X we get ~ 0m0.9s with or without the patch.
Part of a patchset for bug 5324.