From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Date: Tue, 28 Mar 2023 13:54:30 +0000 (+0200)
Subject: 5.10-stable patches
X-Git-Tag: v5.15.105~7
X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=1fd2c84d271789329ac140474df8a2decc4c3e07;p=thirdparty%2Fkernel%2Fstable-queue.git

5.10-stable patches

added patches:
	ocfs2-fix-data-corruption-after-failed-write.patch
	xfs-don-t-reuse-busy-extents-on-extent-trim.patch
	xfs-shut-down-the-filesystem-if-we-screw-up-quota-reservation.patch
---

diff --git a/queue-5.10/ocfs2-fix-data-corruption-after-failed-write.patch b/queue-5.10/ocfs2-fix-data-corruption-after-failed-write.patch
new file mode 100644
index 00000000000..632a13594c6
--- /dev/null
+++ b/queue-5.10/ocfs2-fix-data-corruption-after-failed-write.patch
@@ -0,0 +1,67 @@
+From 90410bcf873cf05f54a32183afff0161f44f9715 Mon Sep 17 00:00:00 2001
+From: Jan Kara via Ocfs2-devel <ocfs2-devel@oss.oracle.com>
+Date: Thu, 2 Mar 2023 16:38:43 +0100
+Subject: ocfs2: fix data corruption after failed write
+
+From: Jan Kara via Ocfs2-devel <ocfs2-devel@oss.oracle.com>
+
+commit 90410bcf873cf05f54a32183afff0161f44f9715 upstream.
+
+When buffered write fails to copy data into underlying page cache page,
+ocfs2_write_end_nolock() just zeroes out and dirties the page.  This can
+leave dirty page beyond EOF and if page writeback tries to write this page
+before write succeeds and expands i_size, page gets into inconsistent
+state where page dirty bit is clear but buffer dirty bits stay set
+resulting in page data never getting written and so data copied to the
+page is lost.  Fix the problem by invalidating page beyond EOF after
+failed write.
+
+Link: https://lkml.kernel.org/r/20230302153843.18499-1-jack@suse.cz
+Fixes: 6dbf7bb55598 ("fs: Don't invalidate page buffers in block_write_full_page()")
+Signed-off-by: Jan Kara <jack@suse.cz>
+Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
+Cc: Mark Fasheh <mark@fasheh.com>
+Cc: Joel Becker <jlbec@evilplan.org>
+Cc: Junxiao Bi <junxiao.bi@oracle.com>
+Cc: Changwei Ge <gechangwei@live.cn>
+Cc: Gang He <ghe@suse.com>
+Cc: Jun Piao <piaojun@huawei.com>
+Cc: <stable@vger.kernel.org>
+Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
+[ replace block_invalidate_folio with block_invalidatepage ]
+Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ fs/ocfs2/aops.c |   18 ++++++++++++++++--
+ 1 file changed, 16 insertions(+), 2 deletions(-)
+
+--- a/fs/ocfs2/aops.c
++++ b/fs/ocfs2/aops.c
+@@ -1981,11 +1981,25 @@ int ocfs2_write_end_nolock(struct addres
+ 	}
+ 
+ 	if (unlikely(copied < len) && wc->w_target_page) {
++		loff_t new_isize;
++
+ 		if (!PageUptodate(wc->w_target_page))
+ 			copied = 0;
+ 
+-		ocfs2_zero_new_buffers(wc->w_target_page, start+copied,
+-				       start+len);
++		new_isize = max_t(loff_t, i_size_read(inode), pos + copied);
++		if (new_isize > page_offset(wc->w_target_page))
++			ocfs2_zero_new_buffers(wc->w_target_page, start+copied,
++					       start+len);
++		else {
++			/*
++			 * When page is fully beyond new isize (data copy
++			 * failed), do not bother zeroing the page. Invalidate
++			 * it instead so that writeback does not get confused
+			 * and put page & buffer dirty bits into inconsistent
++			 * state.
++			 */
++			block_invalidatepage(wc->w_target_page, 0, PAGE_SIZE);
++		}
+ 	}
+ 	if (wc->w_target_page)
+ 		flush_dcache_page(wc->w_target_page);
diff --git a/queue-5.10/series b/queue-5.10/series
index 5c0b3abf17b..de2dbccc373 100644
--- a/queue-5.10/series
+++ b/queue-5.10/series
@@ -97,3 +97,6 @@ dm-stats-check-for-and-propagate-alloc_percpu-failure.patch
 dm-crypt-add-cond_resched-to-dmcrypt_write.patch
 sched-fair-sanitize-vruntime-of-entity-being-placed.patch
 sched-fair-sanitize-vruntime-of-entity-being-migrated.patch
+ocfs2-fix-data-corruption-after-failed-write.patch
+xfs-shut-down-the-filesystem-if-we-screw-up-quota-reservation.patch
+xfs-don-t-reuse-busy-extents-on-extent-trim.patch
diff --git a/queue-5.10/xfs-don-t-reuse-busy-extents-on-extent-trim.patch b/queue-5.10/xfs-don-t-reuse-busy-extents-on-extent-trim.patch
new file mode 100644
index 00000000000..8fb40e916ea
--- /dev/null
+++ b/queue-5.10/xfs-don-t-reuse-busy-extents-on-extent-trim.patch
@@ -0,0 +1,95 @@
+From stable-owner@vger.kernel.org Tue Mar 28 09:36:25 2023
+From: Amir Goldstein <amir73il@gmail.com>
+Date: Tue, 28 Mar 2023 10:35:12 +0300
+Subject: xfs: don't reuse busy extents on extent trim
+To: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+Cc: Sasha Levin <sashal@kernel.org>, Chandan Babu R <chandan.babu@oracle.com>, "Darrick J . Wong" <djwong@kernel.org>, Leah Rumancik <leah.rumancik@gmail.com>, linux-xfs@vger.kernel.org, stable@vger.kernel.org, Brian Foster <bfoster@redhat.com>, Chandan Babu R <chandanrlinux@gmail.com>, Christoph Hellwig <hch@lst.de>
+Message-ID: <20230328073512.460533-3-amir73il@gmail.com>
+
+From: Brian Foster <bfoster@redhat.com>
+
+commit 06058bc40534530e617e5623775c53bb24f032cb upstream.
+
+Freed extents are marked busy from the point the freeing transaction
+commits until the associated CIL context is checkpointed to the log.
+This prevents reuse and overwrite of recently freed blocks before
+the changes are committed to disk, which can lead to corruption
+after a crash. The exception to this rule is that metadata
+allocation is allowed to reuse busy extents because metadata changes
+are also logged.
+
+As of commit 97d3ac75e5e0 ("xfs: exact busy extent tracking"), XFS
+has allowed modification or complete invalidation of outstanding
+busy extents for metadata allocations. This implementation assumes
+that use of the associated extent is imminent, which is not always
+the case. For example, the trimmed extent might not satisfy the
+minimum length of the allocation request, or the allocation
+algorithm might be involved in a search for the optimal result based
+on locality.
+
+generic/019 reproduces a corruption caused by this scenario. First,
+a metadata block (usually a bmbt or symlink block) is freed from an
+inode. A subsequent bmbt split on an unrelated inode attempts a near
+mode allocation request that invalidates the busy block during the
+search, but does not ultimately allocate it. Due to the busy state
+invalidation, the block is no longer considered busy to subsequent
+allocation. A direct I/O write request immediately allocates the
+block and writes to it. Finally, the filesystem crashes while in a
+state where the initial metadata block free had not committed to the
+on-disk log. After recovery, the original metadata block is in its
+original location as expected, but has been corrupted by the
+aforementioned dio.
+
+This demonstrates that it is fundamentally unsafe to modify busy
+extent state for extents that are not guaranteed to be allocated.
+This applies to pretty much all of the code paths that currently
+trim busy extents for one reason or another. Therefore to address
+this problem, drop the reuse mechanism from the busy extent trim
+path. This code already knows how to return partial non-busy ranges
+of the targeted free extent and higher level code tracks the busy
+state of the allocation attempt. If a block allocation fails where
+one or more candidate extents is busy, we force the log and retry
+the allocation.
+
+Signed-off-by: Brian Foster <bfoster@redhat.com>
+Reviewed-by: Darrick J. Wong <djwong@kernel.org>
+Signed-off-by: Darrick J. Wong <djwong@kernel.org>
+Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
+Reviewed-by: Christoph Hellwig <hch@lst.de>
+Signed-off-by: Amir Goldstein <amir73il@gmail.com>
+Acked-by: Darrick J. Wong <djwong@kernel.org>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ fs/xfs/xfs_extent_busy.c |   14 --------------
+ 1 file changed, 14 deletions(-)
+
+--- a/fs/xfs/xfs_extent_busy.c
++++ b/fs/xfs/xfs_extent_busy.c
+@@ -344,7 +344,6 @@ xfs_extent_busy_trim(
+ 	ASSERT(*len > 0);
+ 
+ 	spin_lock(&args->pag->pagb_lock);
+-restart:
+ 	fbno = *bno;
+ 	flen = *len;
+ 	rbp = args->pag->pagb_tree.rb_node;
+@@ -363,19 +362,6 @@ restart:
+ 			continue;
+ 		}
+ 
+-		/*
+-		 * If this is a metadata allocation, try to reuse the busy
+-		 * extent instead of trimming the allocation.
+-		 */
+-		if (!(args->datatype & XFS_ALLOC_USERDATA) &&
+-		    !(busyp->flags & XFS_EXTENT_BUSY_DISCARDED)) {
+-			if (!xfs_extent_busy_update_extent(args->mp, args->pag,
+-							  busyp, fbno, flen,
+-							  false))
+-				goto restart;
+-			continue;
+-		}
+-
+ 		if (bbno <= fbno) {
+ 			/* start overlap */
+ 
diff --git a/queue-5.10/xfs-shut-down-the-filesystem-if-we-screw-up-quota-reservation.patch b/queue-5.10/xfs-shut-down-the-filesystem-if-we-screw-up-quota-reservation.patch
new file mode 100644
index 00000000000..d8041c5504a
--- /dev/null
+++ b/queue-5.10/xfs-shut-down-the-filesystem-if-we-screw-up-quota-reservation.patch
@@ -0,0 +1,62 @@
+From stable-owner@vger.kernel.org Tue Mar 28 09:36:26 2023
+From: Amir Goldstein <amir73il@gmail.com>
+Date: Tue, 28 Mar 2023 10:35:11 +0300
+Subject: xfs: shut down the filesystem if we screw up quota reservation
+To: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+Cc: Sasha Levin <sashal@kernel.org>, Chandan Babu R <chandan.babu@oracle.com>, "Darrick J . Wong" <djwong@kernel.org>, Leah Rumancik <leah.rumancik@gmail.com>, linux-xfs@vger.kernel.org, stable@vger.kernel.org, Christoph Hellwig <hch@lst.de>, Brian Foster <bfoster@redhat.com>
+Message-ID: <20230328073512.460533-2-amir73il@gmail.com>
+
+From: "Darrick J. Wong" <djwong@kernel.org>
+
+commit 2a4bdfa8558ca2904dc17b83497dc82aa7fc05e9 upstream.
+
+If we ever screw up the quota reservations enough to trip the
+assertions, something's wrong with the quota code.  Shut down the
+filesystem when this happens, because this is corruption.
+
+Signed-off-by: Darrick J. Wong <djwong@kernel.org>
+Reviewed-by: Christoph Hellwig <hch@lst.de>
+Reviewed-by: Brian Foster <bfoster@redhat.com>
+Signed-off-by: Amir Goldstein <amir73il@gmail.com>
+Acked-by: Darrick J. Wong <djwong@kernel.org>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ fs/xfs/xfs_trans_dquot.c |   13 ++++++++++---
+ 1 file changed, 10 insertions(+), 3 deletions(-)
+
+--- a/fs/xfs/xfs_trans_dquot.c
++++ b/fs/xfs/xfs_trans_dquot.c
+@@ -16,6 +16,7 @@
+ #include "xfs_quota.h"
+ #include "xfs_qm.h"
+ #include "xfs_trace.h"
++#include "xfs_error.h"
+ 
+ STATIC void	xfs_trans_alloc_dqinfo(xfs_trans_t *);
+ 
+@@ -708,9 +709,11 @@ xfs_trans_dqresv(
+ 					    XFS_TRANS_DQ_RES_INOS,
+ 					    ninos);
+ 	}
+-	ASSERT(dqp->q_blk.reserved >= dqp->q_blk.count);
+-	ASSERT(dqp->q_rtb.reserved >= dqp->q_rtb.count);
+-	ASSERT(dqp->q_ino.reserved >= dqp->q_ino.count);
++
++	if (XFS_IS_CORRUPT(mp, dqp->q_blk.reserved < dqp->q_blk.count) ||
++	    XFS_IS_CORRUPT(mp, dqp->q_rtb.reserved < dqp->q_rtb.count) ||
++	    XFS_IS_CORRUPT(mp, dqp->q_ino.reserved < dqp->q_ino.count))
++		goto error_corrupt;
+ 
+ 	xfs_dqunlock(dqp);
+ 	return 0;
+@@ -720,6 +723,10 @@ error_return:
+ 	if (xfs_dquot_type(dqp) == XFS_DQTYPE_PROJ)
+ 		return -ENOSPC;
+ 	return -EDQUOT;
++error_corrupt:
++	xfs_dqunlock(dqp);
++	xfs_force_shutdown(mp, SHUTDOWN_CORRUPT_INCORE);
++	return -EFSCORRUPTED;
+ }
+ 
+