From 31ccb1f7ba3cfe29631587d451cf5bb8ab593550 Mon Sep 17 00:00:00 2001
From: Andreas Rohner <andreas.rohner@gmx.net>
Date: Fri, 17 Nov 2017 15:29:35 -0800
Subject: nilfs2: fix race condition that causes file system corruption

From: Andreas Rohner <andreas.rohner@gmx.net>

commit 31ccb1f7ba3cfe29631587d451cf5bb8ab593550 upstream.

There is a race condition between nilfs_dirty_inode() and
nilfs_set_file_dirty().

When a file is opened, nilfs_dirty_inode() is called to update the
access timestamp in the inode. It calls __nilfs_mark_inode_dirty() in a
separate transaction. __nilfs_mark_inode_dirty() caches the ifile
buffer_head in the i_bh field of the inode info structure and marks it
as dirty.

After some data was written to the file in another transaction, the
function nilfs_set_file_dirty() is called, which adds the inode to the
ns_dirty_files list.

Then the segment construction calls nilfs_segctor_collect_dirty_files(),
which goes through the ns_dirty_files list and checks the i_bh field.
If there is a cached buffer_head in i_bh it is not marked as dirty
again.

Since nilfs_dirty_inode() and nilfs_set_file_dirty() use separate
transactions, it is possible that a segment construction that writes out
the ifile occurs in-between the two. If this happens the inode is not
on the ns_dirty_files list, but its ifile block is still marked as dirty
and written out.

In the next segment construction, the data for the file is written out
and nilfs_bmap_propagate() updates the b-tree. Eventually the bmap root
is written into the i_bh block, which is not dirty, because it was
written out in another segment construction.

As a result the bmap update can be lost, which leads to file system
corruption. Either the virtual block address points to an unallocated
DAT block, or the DAT entry will be reused for something different.

The error can remain undetected for a long time. A typical error
message would be one of the "bad btree" errors or a warning that a DAT
entry could not be found.

This bug can be reproduced reliably by a simple benchmark that creates
and overwrites millions of 4k files.

Link: http://lkml.kernel.org/r/1509367935-3086-2-git-send-email-konishi.ryusuke@lab.ntt.co.jp
Signed-off-by: Andreas Rohner <andreas.rohner@gmx.net>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Tested-by: Andreas Rohner <andreas.rohner@gmx.net>
Tested-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/nilfs2/segment.c |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

--- a/fs/nilfs2/segment.c
+++ b/fs/nilfs2/segment.c
@@ -1884,8 +1884,6 @@ static int nilfs_segctor_collect_dirty_f
 					"failed to get inode block.\n");
 				return err;
 			}
-			mark_buffer_dirty(ibh);
-			nilfs_mdt_mark_dirty(ifile);
 			spin_lock(&nilfs->ns_inode_lock);
 			if (likely(!ii->i_bh))
 				ii->i_bh = ibh;
@@ -1894,6 +1892,10 @@ static int nilfs_segctor_collect_dirty_f
 			goto retry;
 		}
 
+		// Always redirty the buffer to avoid race condition
+		mark_buffer_dirty(ii->i_bh);
+		nilfs_mdt_mark_dirty(ifile);
+
 		clear_bit(NILFS_I_QUEUED, &ii->i_state);
 		set_bit(NILFS_I_BUSY, &ii->i_state);
 		list_move_tail(&ii->i_dirty, &sci->sc_dirty_files);