From 55635ba76ef91f26b418702ace5e6287eb727f6a Mon Sep 17 00:00:00 2001
From: Andrey Ryabinin <aryabinin@virtuozzo.com>
Date: Wed, 3 May 2017 14:55:59 -0700
Subject: fs: fix data invalidation in the cleancache during direct IO

From: Andrey Ryabinin <aryabinin@virtuozzo.com>

commit 55635ba76ef91f26b418702ace5e6287eb727f6a upstream.

Patch series "Properly invalidate data in the cleancache", v2.

We've noticed that after a direct IO write, a buffered read sometimes
gets stale data coming from the cleancache. The reason for this is that
some direct write hooks call invalidate_inode_pages2[_range]()
conditionally iff mapping->nrpages is not zero, so we may not invalidate
data in the cleancache.

Another odd thing is that we check only for ->nrpages and don't check
for ->nrexceptional, but invalidate_inode_pages2[_range] also
invalidates exceptional entries. So we invalidate exceptional entries
only if ->nrpages != 0? This doesn't feel right.

 - Patch 1 fixes direct IO writes by removing the ->nrpages check.
 - Patch 2 fixes a similar case in invalidate_bdev().
   Note: I only fixed the conditional cleancache_invalidate_inode() here.
   Do we also need to add a ->nrexceptional check to invalidate_bdev()?

 - Patches 3-4: some optimizations.

This patch (of 4):

Some direct IO write fs hooks call invalidate_inode_pages2[_range]()
conditionally iff mapping->nrpages is not zero. This can't be right,
because invalidate_inode_pages2[_range]() also invalidates data in the
cleancache via a cleancache_invalidate_inode() call. So if the page
cache is empty but there is some data in the cleancache, a buffered read
after a direct IO write would get stale data from the cleancache.

Also it doesn't feel right to check only for ->nrpages because
invalidate_inode_pages2[_range] invalidates exceptional entries as well.

Fix this by calling invalidate_inode_pages2[_range]() regardless of
nrpages state.

Note: nfs, cifs and 9p don't need a similar fix because they never call
cleancache_get_page() (neither directly nor via mpage_readpage[s]()), so
they are not affected by this bug.

Fixes: c515e1fd361c ("mm/fs: add hooks to support cleancache")
Link: http://lkml.kernel.org/r/20170424164135.22350-2-aryabinin@virtuozzo.com
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Alexey Kuznetsov <kuznet@virtuozzo.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Nikolay Borisov <n.borisov.lkml@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
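Illustration only, not part of the change: the failure mode described
above can be sketched with a toy userspace model. Everything here
(struct toy_mapping, the toy_*() helpers, PAGE_BYTES, the check_nrpages
flag) is hypothetical and exists only for this sketch; none of it is a
kernel API. It assumes a single page whose clean copy is kept in a
second-level store on eviction and dropped again by invalidation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Toy model of one page; all names are hypothetical, not kernel APIs. */
enum { PAGE_BYTES = 16 };

struct toy_mapping {
	unsigned long nrpages;		/* pages in the modelled page cache */
	bool in_cleancache;		/* clean copy kept after eviction */
	char page_cache[PAGE_BYTES];
	char cleancache[PAGE_BYTES];
	char disk[PAGE_BYTES];
};

void toy_init(struct toy_mapping *m, const char *data)
{
	memset(m, 0, sizeof(*m));
	snprintf(m->disk, PAGE_BYTES, "%s", data);
}

/* Buffered read: page cache first, then cleancache, then disk. */
void toy_buffered_read(struct toy_mapping *m, char *out)
{
	if (!m->nrpages) {
		const char *src = m->in_cleancache ? m->cleancache : m->disk;

		memcpy(m->page_cache, src, PAGE_BYTES);
		m->nrpages = 1;
	}
	memcpy(out, m->page_cache, PAGE_BYTES);
}

/* Reclaim of a clean page: the copy moves into the cleancache. */
void toy_evict(struct toy_mapping *m)
{
	if (!m->nrpages)
		return;
	memcpy(m->cleancache, m->page_cache, PAGE_BYTES);
	m->in_cleancache = true;
	m->nrpages = 0;
}

/* Invalidation drops the page cache copy and the cleancache copy. */
void toy_invalidate(struct toy_mapping *m)
{
	m->nrpages = 0;
	m->in_cleancache = false;
}

/*
 * Direct write: check_nrpages == true models the old conditional
 * invalidation, false models the unconditional call.
 */
void toy_direct_write(struct toy_mapping *m, const char *data,
		      bool check_nrpages)
{
	if (!check_nrpages || m->nrpages)
		toy_invalidate(m);
	snprintf(m->disk, PAGE_BYTES, "%s", data);
}
```

In this model, read, evict, direct write with check_nrpages set, then
read again returns the old contents, because the invalidation is skipped
while nrpages is zero and the cleancache copy survives. With
check_nrpages cleared, the same sequence returns the new contents.
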
 fs/iomap.c   |   20 +++++++++-----------
 mm/filemap.c |   26 +++++++++++---------------
 2 files changed, 20 insertions(+), 26 deletions(-)

--- a/fs/iomap.c
+++ b/fs/iomap.c
@@ -887,16 +887,14 @@ iomap_dio_rw(struct kiocb *iocb, struct
 		flags |= IOMAP_WRITE;
 	}

-	if (mapping->nrpages) {
-		ret = filemap_write_and_wait_range(mapping, start, end);
-		if (ret)
-			goto out_free_dio;
-
-		ret = invalidate_inode_pages2_range(mapping,
-				start >> PAGE_SHIFT, end >> PAGE_SHIFT);
-		WARN_ON_ONCE(ret);
-		ret = 0;
-	}
+	ret = filemap_write_and_wait_range(mapping, start, end);
+	if (ret)
+		goto out_free_dio;
+
+	ret = invalidate_inode_pages2_range(mapping,
+			start >> PAGE_SHIFT, end >> PAGE_SHIFT);
+	WARN_ON_ONCE(ret);
+	ret = 0;

 	inode_dio_begin(inode);

@@ -951,7 +949,7 @@ iomap_dio_rw(struct kiocb *iocb, struct
 	 * one is a pretty crazy thing to do, so we don't support it 100%. If
 	 * this invalidation fails, tough, the write still worked...
 	 */
-	if (iov_iter_rw(iter) == WRITE && mapping->nrpages) {
+	if (iov_iter_rw(iter) == WRITE) {
 		int err = invalidate_inode_pages2_range(mapping,
 				start >> PAGE_SHIFT, end >> PAGE_SHIFT);
 		WARN_ON_ONCE(err);
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2719,18 +2719,16 @@ generic_file_direct_write(struct kiocb *
 	 * about to write. We do this *before* the write so that we can return
 	 * without clobbering -EIOCBQUEUED from ->direct_IO().
 	 */
-	if (mapping->nrpages) {
-		written = invalidate_inode_pages2_range(mapping,
+	written = invalidate_inode_pages2_range(mapping,
 					pos >> PAGE_SHIFT, end);
-		/*
-		 * If a page can not be invalidated, return 0 to fall back
-		 * to buffered write.
-		 */
-		if (written) {
-			if (written == -EBUSY)
-				return 0;
-			goto out;
-		}
+	/*
+	 * If a page can not be invalidated, return 0 to fall back
+	 * to buffered write.
+	 */
+	if (written) {
+		if (written == -EBUSY)
+			return 0;
+		goto out;
 	}

 	data = *from;
@@ -2744,10 +2742,8 @@ generic_file_direct_write(struct kiocb *
 	 * so we don't support it 100%. If this invalidation
 	 * fails, tough, the write still worked...
 	 */
-	if (mapping->nrpages) {
-		invalidate_inode_pages2_range(mapping,
-					pos >> PAGE_SHIFT, end);
-	}
+	invalidate_inode_pages2_range(mapping,
+				pos >> PAGE_SHIFT, end);

 	if (written > 0) {
 		pos += written;
146
147 if (written > 0) {
148 pos += written;