nilfs2: reject CLEAN_SEGMENTS ioctl with out-of-range segment numbers

Syzbot reported a hung task in nilfs_transaction_begin() where multiple
tasks performing chmod() on a nilfs2 mount blocked for over 143 seconds
waiting to acquire ns_segctor_sem for read:

 INFO: task syz.0.17:5918 blocked for more than 143 seconds.
 Call Trace:
  schedule+0x164/0x360
  rwsem_down_read_slowpath+0x6d9/0x940
  down_read+0x99/0x2e0
  nilfs_transaction_begin+0x364/0x710 fs/nilfs2/segment.c:221
  nilfs_setattr+0x124/0x2c0 fs/nilfs2/inode.c:921
  notify_change+0xc1a/0xf40
  chmod_common+0x273/0x4a0
  do_fchmodat+0x12d/0x230

The writer holding ns_segctor_sem was a concurrent
NILFS_IOCTL_CLEAN_SEGMENTS caller, stuck inside printk while emitting
per-element warnings from nilfs_sufile_updatev().

The root cause is that user-supplied segment numbers are not validated
before nilfs_clean_segments() begins doing work; the range check on
each segnum is performed deep inside the call chain by
nilfs_sufile_updatev(), which emits a nilfs_warn() per invalid entry
while still holding the segctor lock and the sufile mi_sem. Under load
(repeated invocations across multiple mounts saturating the global
printk path), the cumulative printk latency keeps ns_segctor_sem held
long enough to trip the hung_task watchdog, blocking concurrent
operations such as chmod() that need ns_segctor_sem for read.

Fix by validating the contents of kbufs[4] in nilfs_clean_segments()
immediately after acquiring ns_segctor_sem via nilfs_transaction_lock().
Holding ns_segctor_sem serializes the check against
nilfs_ioctl_resize(), which can modify ns_nsegments, so the validation
uses a consistent value. Out-of-range segment numbers are rejected
with -EINVAL before any segment-cleaning work begins, so the bad
entries never reach the per-element diagnostic path inside
nilfs_sufile_updatev().
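
The check described above can be sketched in user-space C as follows.
This is only an illustration, not the code of the patch itself:
check_segnums() and its parameters are hypothetical names, with segnums
standing in for the contents of kbufs[4] and nsegments for the value of
ns_nsegments read while ns_segctor_sem is held.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: return 0 if every segment number is below
 * nsegments, or -EINVAL as soon as an out-of-range entry is found.
 * It does no logging, so a bad array costs a single pass and no
 * per-element warnings.
 */
static int check_segnums(const uint64_t *segnums, size_t nsegs,
			 uint64_t nsegments)
{
	size_t i;

	for (i = 0; i < nsegs; i++) {
		/* Valid segment numbers are 0 .. nsegments - 1. */
		if (segnums[i] >= nsegments)
			return -EINVAL;
	}
	return 0;
}
```

In this sketch the caller is assumed to run the check right after taking
the lock and to unlock and return the error before any cleaning work is
started.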