From: Greg Kroah-Hartman Date: Sat, 8 Aug 2015 22:05:27 +0000 (-0700) Subject: 4.1-stable patches X-Git-Tag: v4.1.5~2 X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=b1ebf425cfeb410c6527f385194a57ba8c978704;p=thirdparty%2Fkernel%2Fstable-queue.git 4.1-stable patches added patches: xfs-remote-attribute-headers-contain-an-invalid-lsn.patch xfs-remote-attributes-need-to-be-considered-data.patch --- diff --git a/queue-4.1/series b/queue-4.1/series index 627d6d91f5d..20d3d351220 100644 --- a/queue-4.1/series +++ b/queue-4.1/series @@ -119,3 +119,5 @@ drm-nouveau-fbcon-nv11-correctly-account-for-ring-space-usage.patch drm-nouveau-kms-nv50-guard-against-enabling-cursor-on-disabled-heads.patch drm-nouveau-hold-mutex-when-calling-nouveau_abi16_fini.patch drm-nouveau-drm-nv04-nv40-instmem-protect-access-to-priv-heap-by-mutex.patch +xfs-remote-attribute-headers-contain-an-invalid-lsn.patch +xfs-remote-attributes-need-to-be-considered-data.patch diff --git a/queue-4.1/xfs-remote-attribute-headers-contain-an-invalid-lsn.patch b/queue-4.1/xfs-remote-attribute-headers-contain-an-invalid-lsn.patch new file mode 100644 index 00000000000..ce750b2cea7 --- /dev/null +++ b/queue-4.1/xfs-remote-attribute-headers-contain-an-invalid-lsn.patch @@ -0,0 +1,162 @@ +From e3c32ee9e3e747fec01eb38e6610a9157d44c3ea Mon Sep 17 00:00:00 2001 +From: Dave Chinner +Date: Wed, 29 Jul 2015 11:48:01 +1000 +Subject: xfs: remote attribute headers contain an invalid LSN + +From: Dave Chinner + +commit e3c32ee9e3e747fec01eb38e6610a9157d44c3ea upstream. + +In recent testing, a system that crashed failed log recovery on +restart with a bad symlink buffer magic number: + +XFS (vda): Starting recovery (logdev: internal) +XFS (vda): Bad symlink block magic! +XFS: Assertion failed: 0, file: fs/xfs/xfs_log_recover.c, line: 2060 + +On examination of the log via xfs_logprint, none of the symlink +buffers in the log had a bad magic number, nor were any other types +of buffer log format headers mis-identified as symlink buffers. +Tracing was used to find the buffer the kernel was tripping over, +and xfs_db identified it's contents as: + +000: 5841524d 00000000 00000346 64d82b48 8983e692 d71e4680 a5f49e2c b317576e +020: 00000000 00602038 00000000 006034ce d0020000 00000000 4d4d4d4d 4d4d4d4d +040: 4d4d4d4d 4d4d4d4d 4d4d4d4d 4d4d4d4d 4d4d4d4d 4d4d4d4d 4d4d4d4d 4d4d4d4d +060: 4d4d4d4d 4d4d4d4d 4d4d4d4d 4d4d4d4d 4d4d4d4d 4d4d4d4d 4d4d4d4d 4d4d4d4d +..... + +This is a remote attribute buffer, which are notable in that they +are not logged but are instead written synchronously by the remote +attribute code so that they exist on disk before the attribute +transactions are committed to the journal. + +The above remote attribute block has an invalid LSN in it - cycle +0xd002000, block 0 - which means when log recovery comes along to +determine if the transaction that writes to the underlying block +should be replayed, it sees a block that has a future LSN and so +does not replay the buffer data in the transaction. Instead, it +validates the buffer magic number and attaches the buffer verifier +to it. It is this buffer magic number check that is failing in the +above assert, indicating that we skipped replay due to the LSN of +the underlying buffer. + +The problem here is that the remote attribute buffers cannot have a +valid LSN placed into them, because the transaction that contains +the attribute tree pointer changes and the block allocation that the +attribute data is being written to hasn't yet been committed. Hence +the LSN field in the attribute block is completely unwritten, +thereby leaving the underlying contents of the block in the LSN +field. It could have any value, and hence a future overwrite of the +block by log recovery may or may not work correctly. + +Fix this by always writing an invalid LSN to the remote attribute +block, as any buffer in log recovery that needs to write over the +remote attribute should occur. We are protected from having old data +written over the attribute by the fact that freeing the block before +the remote attribute is written will result in the buffer being +marked stale in the log and so all changes prior to the buffer stale +transaction will be cancelled by log recovery. + +Hence it is safe to ignore the LSN in the case or synchronously +written, unlogged metadata such as remote attribute blocks, and to +ensure we do that correctly, we need to write an invalid LSN to all +remote attribute blocks to trigger immediate recovery of metadata +that is written over the top. + +As a further protection for filesystems that may already have remote +attribute blocks with bad LSNs on disk, change the log recovery code +to always trigger immediate recovery of metadata over remote +attribute blocks. + +Signed-off-by: Dave Chinner +Reviewed-by: Brian Foster +Signed-off-by: Dave Chinner +Signed-off-by: Greg Kroah-Hartman + +--- + fs/xfs/libxfs/xfs_attr_remote.c | 29 +++++++++++++++++++++++------ + fs/xfs/xfs_log_recover.c | 11 ++++++++--- + 2 files changed, 31 insertions(+), 9 deletions(-) + +--- a/fs/xfs/libxfs/xfs_attr_remote.c ++++ b/fs/xfs/libxfs/xfs_attr_remote.c +@@ -159,11 +159,10 @@ xfs_attr3_rmt_write_verify( + struct xfs_buf *bp) + { + struct xfs_mount *mp = bp->b_target->bt_mount; +- struct xfs_buf_log_item *bip = bp->b_fspriv; ++ int blksize = mp->m_attr_geo->blksize; + char *ptr; + int len; + xfs_daddr_t bno; +- int blksize = mp->m_attr_geo->blksize; + + /* no verification of non-crc buffers */ + if (!xfs_sb_version_hascrc(&mp->m_sb)) +@@ -175,16 +174,22 @@ xfs_attr3_rmt_write_verify( + ASSERT(len >= blksize); + + while (len > 0) { ++ struct xfs_attr3_rmt_hdr *rmt = (struct xfs_attr3_rmt_hdr *)ptr; ++ + if (!xfs_attr3_rmt_verify(mp, ptr, blksize, bno)) { + xfs_buf_ioerror(bp, -EFSCORRUPTED); + xfs_verifier_error(bp); + return; + } +- if (bip) { +- struct xfs_attr3_rmt_hdr *rmt; + +- rmt = (struct xfs_attr3_rmt_hdr *)ptr; +- rmt->rm_lsn = cpu_to_be64(bip->bli_item.li_lsn); ++ /* ++ * Ensure we aren't writing bogus LSNs to disk. See ++ * xfs_attr3_rmt_hdr_set() for the explanation. ++ */ ++ if (rmt->rm_lsn != cpu_to_be64(NULLCOMMITLSN)) { ++ xfs_buf_ioerror(bp, -EFSCORRUPTED); ++ xfs_verifier_error(bp); ++ return; + } + xfs_update_cksum(ptr, blksize, XFS_ATTR3_RMT_CRC_OFF); + +@@ -221,6 +226,18 @@ xfs_attr3_rmt_hdr_set( + rmt->rm_owner = cpu_to_be64(ino); + rmt->rm_blkno = cpu_to_be64(bno); + ++ /* ++ * Remote attribute blocks are written synchronously, so we don't ++ * have an LSN that we can stamp in them that makes any sense to log ++ * recovery. To ensure that log recovery handles overwrites of these ++ * blocks sanely (i.e. once they've been freed and reallocated as some ++ * other type of metadata) we need to ensure that the LSN has a value ++ * that tells log recovery to ignore the LSN and overwrite the buffer ++ * with whatever is in it's log. To do this, we use the magic ++ * NULLCOMMITLSN to indicate that the LSN is invalid. ++ */ ++ rmt->rm_lsn = cpu_to_be64(NULLCOMMITLSN); ++ + return sizeof(struct xfs_attr3_rmt_hdr); + } + +--- a/fs/xfs/xfs_log_recover.c ++++ b/fs/xfs/xfs_log_recover.c +@@ -1887,9 +1887,14 @@ xlog_recover_get_buf_lsn( + uuid = &((struct xfs_dir3_blk_hdr *)blk)->uuid; + break; + case XFS_ATTR3_RMT_MAGIC: +- lsn = be64_to_cpu(((struct xfs_attr3_rmt_hdr *)blk)->rm_lsn); +- uuid = &((struct xfs_attr3_rmt_hdr *)blk)->rm_uuid; +- break; ++ /* ++ * Remote attr blocks are written synchronously, rather than ++ * being logged. That means they do not contain a valid LSN ++ * (i.e. transactionally ordered) in them, and hence any time we ++ * see a buffer to replay over the top of a remote attribute ++ * block we should simply do so. ++ */ ++ goto recover_immediately; + case XFS_SB_MAGIC: + lsn = be64_to_cpu(((struct xfs_dsb *)blk)->sb_lsn); + uuid = &((struct xfs_dsb *)blk)->sb_uuid; diff --git a/queue-4.1/xfs-remote-attributes-need-to-be-considered-data.patch b/queue-4.1/xfs-remote-attributes-need-to-be-considered-data.patch new file mode 100644 index 00000000000..c82621cea75 --- /dev/null +++ b/queue-4.1/xfs-remote-attributes-need-to-be-considered-data.patch @@ -0,0 +1,68 @@ +From df150ed102baa0e78c06e08e975dfb47147dd677 Mon Sep 17 00:00:00 2001 +From: Dave Chinner +Date: Wed, 29 Jul 2015 11:48:02 +1000 +Subject: xfs: remote attributes need to be considered data + +From: Dave Chinner + +commit df150ed102baa0e78c06e08e975dfb47147dd677 upstream. + +We don't log remote attribute contents, and instead write them +synchronously before we commit the block allocation and attribute +tree update transaction. As a result we are writing to the allocated +space before the allcoation has been made permanent. + +As a result, we cannot consider this allocation to be a metadata +allocation. Metadata allocation can take blocks from the free list +and so reuse them before the transaction that freed the block is +committed to disk. This behaviour is perfectly fine for journalled +metadata changes as log recovery will ensure the free operation is +replayed before the overwrite, but for remote attribute writes this +is not the case. + +Hence we have to consider the remote attribute blocks to contain +data and allocate accordingly. We do this by dropping the +XFS_BMAPI_METADATA flag from the block allocation. This means the +allocation will not use blocks that are on the busy list without +first ensuring that the freeing transaction has been committed to +disk and the blocks removed from the busy list. This ensures we will +never overwrite a freed block without first ensuring that it is +really free. + +Signed-off-by: Dave Chinner +Reviewed-by: Brian Foster +Signed-off-by: Dave Chinner +Signed-off-by: Greg Kroah-Hartman + +--- + fs/xfs/libxfs/xfs_attr_remote.c | 15 +++++++++++---- + 1 file changed, 11 insertions(+), 4 deletions(-) + +--- a/fs/xfs/libxfs/xfs_attr_remote.c ++++ b/fs/xfs/libxfs/xfs_attr_remote.c +@@ -451,14 +451,21 @@ xfs_attr_rmtval_set( + + /* + * Allocate a single extent, up to the size of the value. ++ * ++ * Note that we have to consider this a data allocation as we ++ * write the remote attribute without logging the contents. ++ * Hence we must ensure that we aren't using blocks that are on ++ * the busy list so that we don't overwrite blocks which have ++ * recently been freed but their transactions are not yet ++ * committed to disk. If we overwrite the contents of a busy ++ * extent and then crash then the block may not contain the ++ * correct metadata after log recovery occurs. + */ + xfs_bmap_init(args->flist, args->firstblock); + nmap = 1; + error = xfs_bmapi_write(args->trans, dp, (xfs_fileoff_t)lblkno, +- blkcnt, +- XFS_BMAPI_ATTRFORK | XFS_BMAPI_METADATA, +- args->firstblock, args->total, &map, &nmap, +- args->flist); ++ blkcnt, XFS_BMAPI_ATTRFORK, args->firstblock, ++ args->total, &map, &nmap, args->flist); + if (!error) { + error = xfs_bmap_finish(&args->trans, args->flist, + &committed);