From adef9255e51300b477380c41bf0377bbe81bc47e Mon Sep 17 00:00:00 2001
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Date: Fri, 26 Jan 2024 16:19:34 -0800
Subject: [PATCH] 5.10-stable patches

added patches:
	rename-fix-the-locking-of-subdirectories.patch
---
 ...me-fix-the-locking-of-subdirectories.patch | 276 ++++++++++++++++++
 queue-5.10/series                             |   1 +
 2 files changed, 277 insertions(+)
 create mode 100644 queue-5.10/rename-fix-the-locking-of-subdirectories.patch

diff --git a/queue-5.10/rename-fix-the-locking-of-subdirectories.patch b/queue-5.10/rename-fix-the-locking-of-subdirectories.patch
new file mode 100644
index 00000000000..cfee0238feb
--- /dev/null
+++ b/queue-5.10/rename-fix-the-locking-of-subdirectories.patch
@@ -0,0 +1,276 @@
+From 22e111ed6c83dcde3037fc81176012721bc34c0b Mon Sep 17 00:00:00 2001
+From: Al Viro <viro@zeniv.linux.org.uk>
+Date: Sun, 19 Nov 2023 20:25:58 -0500
+Subject: rename(): fix the locking of subdirectories
+
+From: Al Viro <viro@zeniv.linux.org.uk>
+
+commit 22e111ed6c83dcde3037fc81176012721bc34c0b upstream.
+
+	We should never lock two subdirectories without having taken
+->s_vfs_rename_mutex; inode pointer order or not, the "order" proposed
+in 28eceeda130f "fs: Lock moved directories" is not transitive, with
+the usual consequences.
+
+	The rationale for locking renamed subdirectory in all cases was
+the possibility of race between rename modifying .. in a subdirectory to
+reflect the new parent and another thread modifying the same subdirectory.
+For a lot of filesystems that's not a problem, but for some it can lead
+to trouble (e.g. the case when short directory contents is kept in the
+inode, but creating a file in it might push it across the size limit
+and copy its contents into separate data block(s)).
+
+	However, we need that only in case when the parent does change -
+otherwise ->rename() doesn't need to do anything with .. entry in the
+first place.  Some instances are lazy and do a tautological update anyway,
+but it's really not hard to avoid.
+
+Amended locking rules for rename():
+	find the parent(s) of source and target
+	if source and target have the same parent
+		lock the common parent
+	else
+		lock ->s_vfs_rename_mutex
+		lock both parents, in ancestor-first order; if neither
+		is an ancestor of another, lock the parent of source
+		first.
+	find the source and target.
+	if source and target have the same parent
+		if operation is an overwriting rename of a subdirectory
+			lock the target subdirectory
+	else
+		if source is a subdirectory
+			lock the source
+		if target is a subdirectory
+			lock the target
+	lock non-directories involved, in inode pointer order if both
+	source and target are such.
+
+That way we are guaranteed that parents are locked (for obvious reasons),
+that any renamed non-directory is locked (nfsd relies upon that),
+that any victim is locked (emptiness check needs that, among other things)
+and subdirectory that changes parent is locked (needed to protect the update
+of .. entries).  We are also guaranteed that any operation locking more
+than one directory either takes ->s_vfs_rename_mutex or locks a parent
+followed by its child.
+
+Cc: stable@vger.kernel.org
+Fixes: 28eceeda130f "fs: Lock moved directories"
+Reviewed-by: Jan Kara <jack@suse.cz>
+Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ Documentation/filesystems/directory-locking.rst |   29 ++++++-----
+ Documentation/filesystems/locking.rst           |    5 +-
+ Documentation/filesystems/porting.rst           |   18 +++++++
+ fs/namei.c                                      |   60 ++++++++++++++----------
+ 4 files changed, 74 insertions(+), 38 deletions(-)
+
+--- a/Documentation/filesystems/directory-locking.rst
++++ b/Documentation/filesystems/directory-locking.rst
+@@ -22,13 +22,16 @@ exclusive.
+ 3) object removal.  Locking rules: caller locks parent, finds victim,
+ locks victim and calls the method.  Locks are exclusive.
+ 
+-4) rename() that is _not_ cross-directory.  Locking rules: caller locks the
+-parent and finds source and target.  We lock both (provided they exist).  If we
+-need to lock two inodes of different type (dir vs non-dir), we lock directory
+-first.  If we need to lock two inodes of the same type, lock them in inode
+-pointer order.  Then call the method.  All locks are exclusive.
+-NB: we might get away with locking the source (and target in exchange
+-case) shared.
++4) rename() that is _not_ cross-directory.  Locking rules: caller locks
++the parent and finds source and target.  Then we decide which of the
++source and target need to be locked.  Source needs to be locked if it's a
++non-directory; target - if it's a non-directory or about to be removed.
++Take the locks that need to be taken, in inode pointer order if need
++to take both (that can happen only when both source and target are
++non-directories - the source because it wouldn't be locked otherwise
++and the target because mixing directory and non-directory is allowed
++only with RENAME_EXCHANGE, and that won't be removing the target).
++After the locks had been taken, call the method.  All locks are exclusive.
+ 
+ 5) link creation.  Locking rules:
+ 
+@@ -44,20 +47,17 @@ rules:
+ 
+ 	* lock the filesystem
+ 	* lock parents in "ancestors first" order. If one is not ancestor of
+-	  the other, lock them in inode pointer order.
++	  the other, lock the parent of source first.
+ 	* find source and target.
+ 	* if old parent is equal to or is a descendent of target
+ 	  fail with -ENOTEMPTY
+ 	* if new parent is equal to or is a descendent of source
+ 	  fail with -ELOOP
+-	* Lock both the source and the target provided they exist. If we
+-	  need to lock two inodes of different type (dir vs non-dir), we lock
+-	  the directory first. If we need to lock two inodes of the same type,
+-	  lock them in inode pointer order.
++	* Lock subdirectories involved (source before target).
++	* Lock non-directories involved, in inode pointer order.
+ 	* call the method.
+ 
+-All ->i_rwsem are taken exclusive.  Again, we might get away with locking
+-the source (and target in exchange case) shared.
++All ->i_rwsem are taken exclusive.
+ 
+ The rules above obviously guarantee that all directories that are going to be
+ read, modified or removed by method will be locked by caller.
+@@ -67,6 +67,7 @@ If no directory is its own ancestor, the
+ 
+ Proof:
+ 
++[XXX: will be updated once we are done massaging the lock_rename()]
+ 	First of all, at any moment we have a linear ordering of the
+ 	objects - A < B iff (A is an ancestor of B) or (B is not an ancestor
+         of A and ptr(A) < ptr(B)).
+--- a/Documentation/filesystems/locking.rst
++++ b/Documentation/filesystems/locking.rst
+@@ -95,7 +95,7 @@ symlink:	exclusive
+ mkdir:		exclusive
+ unlink:		exclusive (both)
+ rmdir:		exclusive (both)(see below)
+-rename:		exclusive (all)	(see below)
++rename:		exclusive (both parents, some children)	(see below)
+ readlink:	no
+ get_link:	no
+ setattr:	exclusive
+@@ -113,6 +113,9 @@ tmpfile:	no
+ 	Additionally, ->rmdir(), ->unlink() and ->rename() have ->i_rwsem
+ 	exclusive on victim.
+ 	cross-directory ->rename() has (per-superblock) ->s_vfs_rename_sem.
++	->unlink() and ->rename() have ->i_rwsem exclusive on all non-directories
++	involved.
++	->rename() has ->i_rwsem exclusive on any subdirectory that changes parent.
+ 
+ See Documentation/filesystems/directory-locking.rst for more detailed discussion
+ of the locking scheme for directory operations.
+--- a/Documentation/filesystems/porting.rst
++++ b/Documentation/filesystems/porting.rst
+@@ -865,3 +865,21 @@ no matter what.  Everything is handled b
+ 
+ clone_private_mount() returns a longterm mount now, so the proper destructor of
+ its result is kern_unmount() or kern_unmount_array().
++
++---
++
++**mandatory**
++
++If ->rename() update of .. on cross-directory move needs an exclusion with
++directory modifications, do *not* lock the subdirectory in question in your
++->rename() - it's done by the caller now [that item should've been added in
++28eceeda130f "fs: Lock moved directories"].
++
++---
++
++**mandatory**
++
++On same-directory ->rename() the (tautological) update of .. is not protected
++by any locks; just don't do it if the old parent is the same as the new one.
++We really can't lock two subdirectories in same-directory rename - not without
++deadlocks.
+--- a/fs/namei.c
++++ b/fs/namei.c
+@@ -2771,20 +2771,14 @@ struct dentry *lock_rename(struct dentry
+ 	p = d_ancestor(p2, p1);
+ 	if (p) {
+ 		inode_lock_nested(p2->d_inode, I_MUTEX_PARENT);
+-		inode_lock_nested(p1->d_inode, I_MUTEX_CHILD);
++		inode_lock_nested(p1->d_inode, I_MUTEX_PARENT2);
+ 		return p;
+ 	}
+ 
+ 	p = d_ancestor(p1, p2);
+-	if (p) {
+-		inode_lock_nested(p1->d_inode, I_MUTEX_PARENT);
+-		inode_lock_nested(p2->d_inode, I_MUTEX_CHILD);
+-		return p;
+-	}
+-
+-	lock_two_inodes(p1->d_inode, p2->d_inode,
+-			I_MUTEX_PARENT, I_MUTEX_PARENT2);
+-	return NULL;
++	inode_lock_nested(p1->d_inode, I_MUTEX_PARENT);
++	inode_lock_nested(p2->d_inode, I_MUTEX_PARENT2);
++	return p;
+ }
+ EXPORT_SYMBOL(lock_rename);
+ 
+@@ -4260,11 +4254,12 @@ SYSCALL_DEFINE2(link, const char __user
+  *
+  *	a) we can get into loop creation.
+  *	b) race potential - two innocent renames can create a loop together.
+- *	   That's where 4.4 screws up. Current fix: serialization on
++ *	   That's where 4.4BSD screws up. Current fix: serialization on
+  *	   sb->s_vfs_rename_mutex. We might be more accurate, but that's another
+  *	   story.
+- *	c) we have to lock _four_ objects - parents and victim (if it exists),
+- *	   and source.
++ *	c) we may have to lock up to _four_ objects - parents and victim (if it exists),
++ *	   and source (if it's a non-directory or a subdirectory that moves to
++ *	   different parent).
+  *	   And that - after we got ->i_mutex on parents (until then we don't know
+  *	   whether the target exists).  Solution: try to be smart with locking
+  *	   order for inodes.  We rely on the fact that tree topology may change
+@@ -4293,6 +4288,7 @@ int vfs_rename(struct inode *old_dir, st
+ 	bool new_is_dir = false;
+ 	unsigned max_links = new_dir->i_sb->s_max_links;
+ 	struct name_snapshot old_name;
++	bool lock_old_subdir, lock_new_subdir;
+ 
+ 	if (source == target)
+ 		return 0;
+@@ -4342,15 +4338,32 @@ int vfs_rename(struct inode *old_dir, st
+ 	take_dentry_name_snapshot(&old_name, old_dentry);
+ 	dget(new_dentry);
+ 	/*
+-	 * Lock all moved children. Moved directories may need to change parent
+-	 * pointer so they need the lock to prevent against concurrent
+-	 * directory changes moving parent pointer. For regular files we've
+-	 * historically always done this. The lockdep locking subclasses are
+-	 * somewhat arbitrary but RENAME_EXCHANGE in particular can swap
+-	 * regular files and directories so it's difficult to tell which
+-	 * subclasses to use.
++	 * Lock children.
++	 * The source subdirectory needs to be locked on cross-directory
++	 * rename or cross-directory exchange since its parent changes.
++	 * The target subdirectory needs to be locked on cross-directory
++	 * exchange due to parent change and on any rename due to becoming
++	 * a victim.
++	 * Non-directories need locking in all cases (for NFS reasons);
++	 * they get locked after any subdirectories (in inode address order).
++	 *
++	 * NOTE: WE ONLY LOCK UNRELATED DIRECTORIES IN CROSS-DIRECTORY CASE.
++	 * NEVER, EVER DO THAT WITHOUT ->s_vfs_rename_mutex.
+ 	 */
+-	lock_two_inodes(source, target, I_MUTEX_NORMAL, I_MUTEX_NONDIR2);
++	lock_old_subdir = new_dir != old_dir;
++	lock_new_subdir = new_dir != old_dir || !(flags & RENAME_EXCHANGE);
++	if (is_dir) {
++		if (lock_old_subdir)
++			inode_lock_nested(source, I_MUTEX_CHILD);
++		if (target && (!new_is_dir || lock_new_subdir))
++			inode_lock(target);
++	} else if (new_is_dir) {
++		if (lock_new_subdir)
++			inode_lock_nested(target, I_MUTEX_CHILD);
++		inode_lock(source);
++	} else {
++		lock_two_nondirectories(source, target);
++	}
+ 
+ 	error = -EBUSY;
+ 	if (is_local_mountpoint(old_dentry) || is_local_mountpoint(new_dentry))
+@@ -4394,8 +4407,9 @@ int vfs_rename(struct inode *old_dir, st
+ 			d_exchange(old_dentry, new_dentry);
+ 	}
+ out:
+-	inode_unlock(source);
+-	if (target)
++	if (!is_dir || lock_old_subdir)
++		inode_unlock(source);
++	if (target && (!new_is_dir || lock_new_subdir))
+ 		inode_unlock(target);
+ 	dput(new_dentry);
+ 	if (!error) {
diff --git a/queue-5.10/series b/queue-5.10/series
index 7dde75a01d5..ec12295ee3d 100644
--- a/queue-5.10/series
+++ b/queue-5.10/series
@@ -34,3 +34,4 @@ mmc-mmc_spi-remove-custom-dma-mapped-buffers.patch
 rtc-adjust-failure-return-code-for-cmos_set_alarm.patch
 nouveau-vmm-don-t-set-addr-on-the-fail-path-to-avoid-warning.patch
 ubifs-ubifs_symlink-fix-memleak-of-inode-i_link-in-error-path.patch
+rename-fix-the-locking-of-subdirectories.patch
-- 
2.47.3