ntfs: skip extent mft records in writeback to prevent deadlock
Fix the ABBA deadlock between extent_lock and the extent mrec_lock that
is triggered by xfstests generic/113. The deadlock has been possible
since commit 6994acf33bae ("ntfs: use base mft_no when looking up base
inode for extent record").
Path A (inode writeback):
  VFS writeback
  -> ntfs_write_inode()
  -> __ntfs_write_inode()
  -> mutex_lock(&ni->extent_lock)
  -> mutex_lock(&tni->mrec_lock)
Path B (MFT folio writeback):
  VFS writeback of $MFT dirty folios
  -> ntfs_mft_writepages()
  -> ntfs_write_mft_block()
  -> ntfs_may_write_mft_record()
  -> holds one extent mrec_lock from a previous iteration
  -> tries to acquire another base inode's extent_lock
By removing all extent_lock and extent mrec_lock acquisition from the MFT
folio writeback path, the ABBA lock ordering is eliminated:
Path A: __ntfs_write_inode(): extent_lock -> mrec_lock
Path B (removed): ntfs_write_mft_block(): mrec_lock -> extent_lock
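The inversion above can be illustrated with a small userspace model. This
is only a sketch and not kernel code: enum lock_class, seen_order,
reset_orders() and record_order() are hypothetical names, and the model
merely remembers which lock was taken while another was held, loosely in
the spirit of a lock-order validator.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical lock classes standing in for the two kernel mutexes. */
enum lock_class { EXTENT_LOCK, MREC_LOCK, NR_LOCK_CLASSES };

/* seen_order[a][b] is true once some path took b while holding a. */
static bool seen_order[NR_LOCK_CLASSES][NR_LOCK_CLASSES];

/* Forget every recorded ordering. */
static void reset_orders(void)
{
	memset(seen_order, 0, sizeof(seen_order));
}

/*
 * Record that 'next' is acquired while 'held' is held.
 * Returns false if the opposite order was recorded earlier,
 * which is exactly the ABBA pattern.
 */
static bool record_order(enum lock_class held, enum lock_class next)
{
	assert(held < NR_LOCK_CLASSES && next < NR_LOCK_CLASSES);
	if (seen_order[next][held])
		return false;
	seen_order[held][next] = true;
	return true;
}
```

In this model, recording Path A (EXTENT_LOCK, then MREC_LOCK) followed by
the removed Path B (MREC_LOCK, then EXTENT_LOCK) makes the second call
return false. With Path B gone, only the Path A order is ever recorded.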
Path B is always redundant for extent records because:
1. mark_mft_record_dirty(ext_ni) does NOT dirty the MFT folio.
   It only sets NInoDirty(ext_ni) and marks the base VFS inode dirty
   via __mark_inode_dirty(I_DIRTY_DATASYNC), which triggers Path A.
   Therefore, normal extent modifications never create a situation where
   the MFT folio is dirty and Path A is not scheduled.
2. The MFT folio is dirtied only via ntfs_mft_mark_dirty() inside
   ntfs_mft_record_alloc(). All identified callers in attrib.c
   (ntfs_attr_add, ntfs_attr_record_move_away,
   ntfs_attr_make_non_resident, ntfs_attr_record_resize) follow through
   with mark_mft_record_dirty(), which triggers Path A to write the
   complete record.
3. ntfs_evict_big_inode() calls ntfs_commit_inode() before freeing extent
   inodes, ensuring all dirty extents are flushed via Path A before the
   base inode leaves the icache.
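The reasoning in points 1 to 3 can be modelled in a few lines of
userspace C. This is a simplified sketch under stated assumptions, not
the driver's code: struct model_inode and the model_*() helpers are
hypothetical, and the real structures and the real skip condition in the
MFT writeback path are more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for an in-memory ntfs inode. */
struct model_inode {
	struct model_inode *base;	/* NULL for a base inode */
	bool ni_dirty;			/* models NInoDirty() */
	bool vfs_dirty;			/* models the base VFS inode dirty state */
};

/* Models mark_mft_record_dirty(): flag the record, dirty the base VFS inode. */
static void model_mark_record_dirty(struct model_inode *ni)
{
	struct model_inode *base = ni->base ? ni->base : ni;

	ni->ni_dirty = true;
	base->vfs_dirty = true;	/* this is what schedules Path A */
}

/* Models the Path B decision assumed here: extent records are skipped. */
static bool model_mft_writeback_skips(const struct model_inode *ni)
{
	return ni->base != NULL;
}

/* Models Path A: write the base record and every dirty extent record. */
static size_t model_write_inode(struct model_inode *base,
				struct model_inode **ext, size_t nr_ext)
{
	size_t written = 0;

	assert(base->base == NULL);
	if (base->ni_dirty) {
		base->ni_dirty = false;
		written++;
	}
	for (size_t i = 0; i < nr_ext; i++) {
		if (ext[i]->ni_dirty) {
			ext[i]->ni_dirty = false;
			written++;
		}
	}
	base->vfs_dirty = false;
	return written;
}
```

In the model, dirtying an extent record always dirties the base inode, so
the record is written by model_write_inode() even though
model_mft_writeback_skips() reports that Path B would skip it.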
Cc: stable@vger.kernel.org # v7.1
Signed-off-by: Hyunchul Lee <hyc.lee@gmail.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>