From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Date: Sat, 8 Aug 2015 22:05:27 +0000 (-0700)
Subject: 4.1-stable patches
X-Git-Tag: v4.1.5~2
X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=b1ebf425cfeb410c6527f385194a57ba8c978704;p=thirdparty%2Fkernel%2Fstable-queue.git

4.1-stable patches

added patches:
	xfs-remote-attribute-headers-contain-an-invalid-lsn.patch
	xfs-remote-attributes-need-to-be-considered-data.patch
---

diff --git a/queue-4.1/series b/queue-4.1/series
index 627d6d91f5d..20d3d351220 100644
--- a/queue-4.1/series
+++ b/queue-4.1/series
@@ -119,3 +119,5 @@ drm-nouveau-fbcon-nv11-correctly-account-for-ring-space-usage.patch
 drm-nouveau-kms-nv50-guard-against-enabling-cursor-on-disabled-heads.patch
 drm-nouveau-hold-mutex-when-calling-nouveau_abi16_fini.patch
 drm-nouveau-drm-nv04-nv40-instmem-protect-access-to-priv-heap-by-mutex.patch
+xfs-remote-attribute-headers-contain-an-invalid-lsn.patch
+xfs-remote-attributes-need-to-be-considered-data.patch
diff --git a/queue-4.1/xfs-remote-attribute-headers-contain-an-invalid-lsn.patch b/queue-4.1/xfs-remote-attribute-headers-contain-an-invalid-lsn.patch
new file mode 100644
index 00000000000..ce750b2cea7
--- /dev/null
+++ b/queue-4.1/xfs-remote-attribute-headers-contain-an-invalid-lsn.patch
@@ -0,0 +1,162 @@
+From e3c32ee9e3e747fec01eb38e6610a9157d44c3ea Mon Sep 17 00:00:00 2001
+From: Dave Chinner <dchinner@redhat.com>
+Date: Wed, 29 Jul 2015 11:48:01 +1000
+Subject: xfs: remote attribute headers contain an invalid LSN
+
+From: Dave Chinner <dchinner@redhat.com>
+
+commit e3c32ee9e3e747fec01eb38e6610a9157d44c3ea upstream.
+
+In recent testing, a system that crashed failed log recovery on
+restart with a bad symlink buffer magic number:
+
+XFS (vda): Starting recovery (logdev: internal)
+XFS (vda): Bad symlink block magic!
+XFS: Assertion failed: 0, file: fs/xfs/xfs_log_recover.c, line: 2060
+
+On examination of the log via xfs_logprint, none of the symlink
+buffers in the log had a bad magic number, nor were any other types
+of buffer log format headers mis-identified as symlink buffers.
+Tracing was used to find the buffer the kernel was tripping over,
+and xfs_db identified it's contents as:
+
+000: 5841524d 00000000 00000346 64d82b48 8983e692 d71e4680 a5f49e2c b317576e
+020: 00000000 00602038 00000000 006034ce d0020000 00000000 4d4d4d4d 4d4d4d4d
+040: 4d4d4d4d 4d4d4d4d 4d4d4d4d 4d4d4d4d 4d4d4d4d 4d4d4d4d 4d4d4d4d 4d4d4d4d
+060: 4d4d4d4d 4d4d4d4d 4d4d4d4d 4d4d4d4d 4d4d4d4d 4d4d4d4d 4d4d4d4d 4d4d4d4d
+.....
+
+This is a remote attribute buffer, which are notable in that they
+are not logged but are instead written synchronously by the remote
+attribute code so that they exist on disk before the attribute
+transactions are committed to the journal.
+
+The above remote attribute block has an invalid LSN in it - cycle
+0xd002000, block 0 - which means when log recovery comes along to
+determine if the transaction that writes to the underlying block
+should be replayed, it sees a block that has a future LSN and so
+does not replay the buffer data in the transaction. Instead, it
+validates the buffer magic number and attaches the buffer verifier
+to it.  It is this buffer magic number check that is failing in the
+above assert, indicating that we skipped replay due to the LSN of
+the underlying buffer.
+
+The problem here is that the remote attribute buffers cannot have a
+valid LSN placed into them, because the transaction that contains
+the attribute tree pointer changes and the block allocation that the
+attribute data is being written to hasn't yet been committed. Hence
+the LSN field in the attribute block is completely unwritten,
+thereby leaving the underlying contents of the block in the LSN
+field. It could have any value, and hence a future overwrite of the
+block by log recovery may or may not work correctly.
+
+Fix this by always writing an invalid LSN to the remote attribute
+block, as any buffer in log recovery that needs to write over the
+remote attribute should occur. We are protected from having old data
+written over the attribute by the fact that freeing the block before
+the remote attribute is written will result in the buffer being
+marked stale in the log and so all changes prior to the buffer stale
+transaction will be cancelled by log recovery.
+
+Hence it is safe to ignore the LSN in the case or synchronously
+written, unlogged metadata such as remote attribute blocks, and to
+ensure we do that correctly, we need to write an invalid LSN to all
+remote attribute blocks to trigger immediate recovery of metadata
+that is written over the top.
+
+As a further protection for filesystems that may already have remote
+attribute blocks with bad LSNs on disk, change the log recovery code
+to always trigger immediate recovery of metadata over remote
+attribute blocks.
+
+Signed-off-by: Dave Chinner <dchinner@redhat.com>
+Reviewed-by: Brian Foster <bfoster@redhat.com>
+Signed-off-by: Dave Chinner <david@fromorbit.com>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+
+---
+ fs/xfs/libxfs/xfs_attr_remote.c |   29 +++++++++++++++++++++++------
+ fs/xfs/xfs_log_recover.c        |   11 ++++++++---
+ 2 files changed, 31 insertions(+), 9 deletions(-)
+
+--- a/fs/xfs/libxfs/xfs_attr_remote.c
++++ b/fs/xfs/libxfs/xfs_attr_remote.c
+@@ -159,11 +159,10 @@ xfs_attr3_rmt_write_verify(
+ 	struct xfs_buf	*bp)
+ {
+ 	struct xfs_mount *mp = bp->b_target->bt_mount;
+-	struct xfs_buf_log_item	*bip = bp->b_fspriv;
++	int		blksize = mp->m_attr_geo->blksize;
+ 	char		*ptr;
+ 	int		len;
+ 	xfs_daddr_t	bno;
+-	int		blksize = mp->m_attr_geo->blksize;
+ 
+ 	/* no verification of non-crc buffers */
+ 	if (!xfs_sb_version_hascrc(&mp->m_sb))
+@@ -175,16 +174,22 @@ xfs_attr3_rmt_write_verify(
+ 	ASSERT(len >= blksize);
+ 
+ 	while (len > 0) {
++		struct xfs_attr3_rmt_hdr *rmt = (struct xfs_attr3_rmt_hdr *)ptr;
++
+ 		if (!xfs_attr3_rmt_verify(mp, ptr, blksize, bno)) {
+ 			xfs_buf_ioerror(bp, -EFSCORRUPTED);
+ 			xfs_verifier_error(bp);
+ 			return;
+ 		}
+-		if (bip) {
+-			struct xfs_attr3_rmt_hdr *rmt;
+ 
+-			rmt = (struct xfs_attr3_rmt_hdr *)ptr;
+-			rmt->rm_lsn = cpu_to_be64(bip->bli_item.li_lsn);
++		/*
++		 * Ensure we aren't writing bogus LSNs to disk. See
++		 * xfs_attr3_rmt_hdr_set() for the explanation.
++		 */
++		if (rmt->rm_lsn != cpu_to_be64(NULLCOMMITLSN)) {
++			xfs_buf_ioerror(bp, -EFSCORRUPTED);
++			xfs_verifier_error(bp);
++			return;
+ 		}
+ 		xfs_update_cksum(ptr, blksize, XFS_ATTR3_RMT_CRC_OFF);
+ 
+@@ -221,6 +226,18 @@ xfs_attr3_rmt_hdr_set(
+ 	rmt->rm_owner = cpu_to_be64(ino);
+ 	rmt->rm_blkno = cpu_to_be64(bno);
+ 
++	/*
++	 * Remote attribute blocks are written synchronously, so we don't
++	 * have an LSN that we can stamp in them that makes any sense to log
++	 * recovery. To ensure that log recovery handles overwrites of these
++	 * blocks sanely (i.e. once they've been freed and reallocated as some
++	 * other type of metadata) we need to ensure that the LSN has a value
++	 * that tells log recovery to ignore the LSN and overwrite the buffer
++	 * with whatever is in it's log. To do this, we use the magic
++	 * NULLCOMMITLSN to indicate that the LSN is invalid.
++	 */
++	rmt->rm_lsn = cpu_to_be64(NULLCOMMITLSN);
++
+ 	return sizeof(struct xfs_attr3_rmt_hdr);
+ }
+ 
+--- a/fs/xfs/xfs_log_recover.c
++++ b/fs/xfs/xfs_log_recover.c
+@@ -1887,9 +1887,14 @@ xlog_recover_get_buf_lsn(
+ 		uuid = &((struct xfs_dir3_blk_hdr *)blk)->uuid;
+ 		break;
+ 	case XFS_ATTR3_RMT_MAGIC:
+-		lsn = be64_to_cpu(((struct xfs_attr3_rmt_hdr *)blk)->rm_lsn);
+-		uuid = &((struct xfs_attr3_rmt_hdr *)blk)->rm_uuid;
+-		break;
++		/*
++		 * Remote attr blocks are written synchronously, rather than
++		 * being logged. That means they do not contain a valid LSN
++		 * (i.e. transactionally ordered) in them, and hence any time we
++		 * see a buffer to replay over the top of a remote attribute
++		 * block we should simply do so.
++		 */
++		goto recover_immediately;
+ 	case XFS_SB_MAGIC:
+ 		lsn = be64_to_cpu(((struct xfs_dsb *)blk)->sb_lsn);
+ 		uuid = &((struct xfs_dsb *)blk)->sb_uuid;
diff --git a/queue-4.1/xfs-remote-attributes-need-to-be-considered-data.patch b/queue-4.1/xfs-remote-attributes-need-to-be-considered-data.patch
new file mode 100644
index 00000000000..c82621cea75
--- /dev/null
+++ b/queue-4.1/xfs-remote-attributes-need-to-be-considered-data.patch
@@ -0,0 +1,68 @@
+From df150ed102baa0e78c06e08e975dfb47147dd677 Mon Sep 17 00:00:00 2001
+From: Dave Chinner <dchinner@redhat.com>
+Date: Wed, 29 Jul 2015 11:48:02 +1000
+Subject: xfs: remote attributes need to be considered data
+
+From: Dave Chinner <dchinner@redhat.com>
+
+commit df150ed102baa0e78c06e08e975dfb47147dd677 upstream.
+
+We don't log remote attribute contents, and instead write them
+synchronously before we commit the block allocation and attribute
+tree update transaction. As a result we are writing to the allocated
+space before the allcoation has been made permanent.
+
+As a result, we cannot consider this allocation to be a metadata
+allocation. Metadata allocation can take blocks from the free list
+and so reuse them before the transaction that freed the block is
+committed to disk. This behaviour is perfectly fine for journalled
+metadata changes as log recovery will ensure the free operation is
+replayed before the overwrite, but for remote attribute writes this
+is not the case.
+
+Hence we have to consider the remote attribute blocks to contain
+data and allocate accordingly. We do this by dropping the
+XFS_BMAPI_METADATA flag from the block allocation. This means the
+allocation will not use blocks that are on the busy list without
+first ensuring that the freeing transaction has been committed to
+disk and the blocks removed from the busy list. This ensures we will
+never overwrite a freed block without first ensuring that it is
+really free.
+
+Signed-off-by: Dave Chinner <dchinner@redhat.com>
+Reviewed-by: Brian Foster <bfoster@redhat.com>
+Signed-off-by: Dave Chinner <david@fromorbit.com>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+
+---
+ fs/xfs/libxfs/xfs_attr_remote.c |   15 +++++++++++----
+ 1 file changed, 11 insertions(+), 4 deletions(-)
+
+--- a/fs/xfs/libxfs/xfs_attr_remote.c
++++ b/fs/xfs/libxfs/xfs_attr_remote.c
+@@ -451,14 +451,21 @@ xfs_attr_rmtval_set(
+ 
+ 		/*
+ 		 * Allocate a single extent, up to the size of the value.
++		 *
++		 * Note that we have to consider this a data allocation as we
++		 * write the remote attribute without logging the contents.
++		 * Hence we must ensure that we aren't using blocks that are on
++		 * the busy list so that we don't overwrite blocks which have
++		 * recently been freed but their transactions are not yet
++		 * committed to disk. If we overwrite the contents of a busy
++		 * extent and then crash then the block may not contain the
++		 * correct metadata after log recovery occurs.
+ 		 */
+ 		xfs_bmap_init(args->flist, args->firstblock);
+ 		nmap = 1;
+ 		error = xfs_bmapi_write(args->trans, dp, (xfs_fileoff_t)lblkno,
+-				  blkcnt,
+-				  XFS_BMAPI_ATTRFORK | XFS_BMAPI_METADATA,
+-				  args->firstblock, args->total, &map, &nmap,
+-				  args->flist);
++				  blkcnt, XFS_BMAPI_ATTRFORK, args->firstblock,
++				  args->total, &map, &nmap, args->flist);
+ 		if (!error) {
+ 			error = xfs_bmap_finish(&args->trans, args->flist,
+ 						&committed);