releases/4.19.31/btrfs-fix-corruption-reading-shared-and-compressed-extents-after-hole-punching.patch

From 8e928218780e2f1cf2f5891c7575e8f0b284fcce Mon Sep 17 00:00:00 2001
From: Filipe Manana <fdmanana@suse.com>
Date: Thu, 14 Feb 2019 15:17:20 +0000
Subject: Btrfs: fix corruption reading shared and compressed extents after hole punching

From: Filipe Manana <fdmanana@suse.com>

commit 8e928218780e2f1cf2f5891c7575e8f0b284fcce upstream.

In the past we had data corruption when reading compressed extents that
are shared within the same file and are consecutive. This got fixed by
commit 005efedf2c7d0 ("Btrfs: fix read corruption of compressed and
shared extents") and by commit 808f80b46790f ("Btrfs: update fix for read
corruption of compressed and shared extents"). However, those fixes
missed one case: shared and compressed extents that are referenced with
a non-zero offset. The following shell script reproduces the issue:

  #!/bin/bash

  mkfs.btrfs -f /dev/sdc &> /dev/null
  mount -o compress /dev/sdc /mnt/sdc

  # Create a file with 3 consecutive compressed extents, each has an
  # uncompressed size of 128Kb and a compressed size of 4Kb.
  for ((i = 1; i <= 3; i++)); do
      head -c 4096 /dev/zero
      for ((j = 1; j <= 31; j++)); do
          head -c 4096 /dev/zero | tr '\0' "\377"
      done
  done > /mnt/sdc/foobar
  sync

  echo "Digest after file creation:   $(md5sum /mnt/sdc/foobar)"

  # Clone the first extent into offsets 128K and 256K.
  xfs_io -c "reflink /mnt/sdc/foobar 0 128K 128K" /mnt/sdc/foobar
  xfs_io -c "reflink /mnt/sdc/foobar 0 256K 128K" /mnt/sdc/foobar
  sync

  echo "Digest after cloning:         $(md5sum /mnt/sdc/foobar)"

  # Punch holes into the regions that are already full of zeroes.
  xfs_io -c "fpunch 0 4K" /mnt/sdc/foobar
  xfs_io -c "fpunch 128K 4K" /mnt/sdc/foobar
  xfs_io -c "fpunch 256K 4K" /mnt/sdc/foobar
  sync

  echo "Digest after hole punching:   $(md5sum /mnt/sdc/foobar)"

  echo "Dropping page cache..."
  sysctl -q vm.drop_caches=1
  echo "Digest after hole punching:   $(md5sum /mnt/sdc/foobar)"

  umount /dev/sdc

When running the script we get the following output:

  Digest after file creation:   5a0888d80d7ab1fd31c229f83a3bbcc8  /mnt/sdc/foobar
  linked 131072/131072 bytes at offset 131072
  128 KiB, 1 ops; 0.0033 sec (36.960 MiB/sec and 295.6830 ops/sec)
  linked 131072/131072 bytes at offset 262144
  128 KiB, 1 ops; 0.0015 sec (78.567 MiB/sec and 628.5355 ops/sec)
  Digest after cloning:         5a0888d80d7ab1fd31c229f83a3bbcc8  /mnt/sdc/foobar
  Digest after hole punching:   5a0888d80d7ab1fd31c229f83a3bbcc8  /mnt/sdc/foobar
  Dropping page cache...
  Digest after hole punching:   fba694ae8664ed0c2e9ff8937e7f1484  /mnt/sdc/foobar

This happens because, after reading all the pages of the extent in the
range from 128K to 256K for example, we read the hole at offset 256K,
and then when reading the page at offset 260K we don't submit the
existing bio, which is responsible for filling only the pages in the
range 128K to 256K. Instead, the pages from the range 260K to 384K are
added to the existing bio, which is submitted after iterating over the
entire range. Once the bio completes, the uncompressed data fills only
the pages in the range 128K to 256K because there's no more data read
from disk, leaving the pages in the range 260K to 384K unfilled. It is
just a slightly different variant of what was solved by commit
005efedf2c7d0 ("Btrfs: fix read corruption of compressed and shared
extents").

Fix this by forcing a bio submit, during readpages(), whenever we find a
compressed extent map for a page that is different from the extent map
for the previous page or has a different starting offset (in case it's
the same compressed extent). To do that, compare and track the extent
map's start offset instead of its original start offset.

A test case for fstests follows soon.

Reported-by: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
Fixes: 808f80b46790f ("Btrfs: update fix for read corruption of compressed and shared extents")
Fixes: 005efedf2c7d0 ("Btrfs: fix read corruption of compressed and shared extents")
Cc: stable@vger.kernel.org # 4.3+
Tested-by: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/btrfs/extent_io.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -3002,11 +3002,11 @@ static int __do_readpage(struct extent_i
                 */
                if (test_bit(EXTENT_FLAG_COMPRESSED, &em->flags) &&
                    prev_em_start && *prev_em_start != (u64)-1 &&
-                   *prev_em_start != em->orig_start)
+                   *prev_em_start != em->start)
                        force_bio_submit = true;

                if (prev_em_start)
-                       *prev_em_start = em->orig_start;
+                       *prev_em_start = em->start;

                free_extent_map(em);
                em = NULL;