git.ipfire.org Git - thirdparty/kernel/stable-queue.git/blob - releases/2.6.36.2/mm-vfs-revalidate-page-mapping-in-do_generic_file_read.patch
From 8d056cb965b8fb7c53c564abf28b1962d1061cd3 Mon Sep 17 00:00:00 2001
From: Dave Hansen <dave@linux.vnet.ibm.com>
Date: Thu, 11 Nov 2010 14:05:15 -0800
Subject: mm/vfs: revalidate page->mapping in do_generic_file_read()

From: Dave Hansen <dave@linux.vnet.ibm.com>

commit 8d056cb965b8fb7c53c564abf28b1962d1061cd3 upstream.

70 hours into some stress tests of a 2.6.32-based enterprise kernel, we
ran into a NULL dereference in here:

  int block_is_partially_uptodate(struct page *page, read_descriptor_t *desc,
                                  unsigned long from)
  {
  ---->	struct inode *inode = page->mapping->host;

It looks like page->mapping was the culprit.  (xmon trace is below).
After closer examination, I realized that do_generic_file_read() does a
find_get_page(), and eventually locks the page before calling
block_is_partially_uptodate().  However, it doesn't revalidate the
page->mapping after the page is locked.  So, there's a small window
between the find_get_page() and ->is_partially_uptodate() where the page
could get truncated and page->mapping cleared.

We _have_ a reference, so it can't get reclaimed, but it certainly
can be truncated.

I think the correct thing is to check page->mapping after the
trylock_page(), and jump out if it got truncated.  This patch has been
running in the test environment for a month or so now, and we have not
seen this bug pop up again.
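
The lock-then-revalidate pattern described above can be sketched in
userspace C.  This is only an illustrative model, not kernel code: every
model_* name below is hypothetical, an atomic_flag stands in for the page
lock, and the model assumes (as this patch relies on) that truncation
clears the mapping pointer while holding that lock.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the kernel's types. */
struct model_mapping {
	int host_id;			/* plays the role of mapping->host */
};

struct model_page {
	atomic_flag locked;		/* plays the role of the page lock */
	struct model_mapping *mapping;	/* cleared by "truncation" */
};

enum model_result { MODEL_LOCK_BUSY, MODEL_TRUNCATED, MODEL_OK };

static void model_page_init(struct model_page *page, struct model_mapping *m)
{
	atomic_flag_clear(&page->locked);
	page->mapping = m;
}

/* Returns true if the lock was taken, false if someone else holds it. */
static bool model_trylock_page(struct model_page *page)
{
	return !atomic_flag_test_and_set_explicit(&page->locked,
						  memory_order_acquire);
}

static void model_unlock_page(struct model_page *page)
{
	atomic_flag_clear_explicit(&page->locked, memory_order_release);
}

/* Model of truncation: detach the page from its mapping under the lock. */
static void model_truncate(struct model_page *page)
{
	while (!model_trylock_page(page))
		;	/* spin; a real kernel would sleep instead */
	page->mapping = NULL;
	model_unlock_page(page);
}

/*
 * Model of the fixed read path: take the lock, then re-check the mapping
 * before dereferencing it.  Without the re-check, a page truncated between
 * lookup and lock would be dereferenced through a NULL mapping.
 */
static enum model_result model_read_host(struct model_page *page, int *host_id)
{
	enum model_result res = MODEL_TRUNCATED;

	if (!model_trylock_page(page))
		return MODEL_LOCK_BUSY;
	/* Did it get truncated before we got the lock? */
	if (page->mapping) {
		*host_id = page->mapping->host_id;
		res = MODEL_OK;
	}
	model_unlock_page(page);
	return res;
}
```

Calling model_read_host() on a freshly initialised page yields MODEL_OK;
calling it again after model_truncate() yields MODEL_TRUNCATED instead of
dereferencing a NULL pointer.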

xmon info:

1f:mon> e
cpu 0x1f: Vector: 300 (Data Access) at [c0000002ae36f770]
    pc: c0000000001e7a6c: .block_is_partially_uptodate+0xc/0x100
    lr: c000000000142944: .generic_file_aio_read+0x1e4/0x770
    sp: c0000002ae36f9f0
   msr: 8000000000009032
   dar: 0
 dsisr: 40000000
  current = 0xc000000378f99e30
  paca    = 0xc000000000f66300
    pid   = 21946, comm = bash
1f:mon> r
R00 = 0025c0500000006d   R16 = 0000000000000000
R01 = c0000002ae36f9f0   R17 = c000000362cd3af0
R02 = c000000000e8cd80   R18 = ffffffffffffffff
R03 = c0000000031d0f88   R19 = 0000000000000001
R04 = c0000002ae36fa68   R20 = c0000003bb97b8a0
R05 = 0000000000000000   R21 = c0000002ae36fa68
R06 = 0000000000000000   R22 = 0000000000000000
R07 = 0000000000000001   R23 = c0000002ae36fbb0
R08 = 0000000000000002   R24 = 0000000000000000
R09 = 0000000000000000   R25 = c000000362cd3a80
R10 = 0000000000000000   R26 = 0000000000000002
R11 = c0000000001e7b60   R27 = 0000000000000000
R12 = 0000000042000484   R28 = 0000000000000001
R13 = c000000000f66300   R29 = c0000003bb97b9b8
R14 = 0000000000000001   R30 = c000000000e28a08
R15 = 000000000000ffff   R31 = c0000000031d0f88
pc  = c0000000001e7a6c .block_is_partially_uptodate+0xc/0x100
lr  = c000000000142944 .generic_file_aio_read+0x1e4/0x770
msr = 8000000000009032   cr  = 22000488
ctr = c0000000001e7a60   xer = 0000000020000000   trap =  300
dar = 0000000000000000   dsisr = 40000000
1f:mon> t
[link register   ] c000000000142944 .generic_file_aio_read+0x1e4/0x770
[c0000002ae36f9f0] c000000000142a14 .generic_file_aio_read+0x2b4/0x770 (unreliable)
[c0000002ae36fb40] c0000000001b03e4 .do_sync_read+0xd4/0x160
[c0000002ae36fce0] c0000000001b153c .vfs_read+0xec/0x1f0
[c0000002ae36fd80] c0000000001b1768 .SyS_read+0x58/0xb0
[c0000002ae36fe30] c00000000000852c syscall_exit+0x0/0x40
--- Exception: c00 (System Call) at 00000080a840bc54
SP (fffca15df30) is in userspace
1f:mon> di c0000000001e7a6c
c0000000001e7a6c  e9290000	ld	r9,0(r9)
c0000000001e7a70  418200c0	beq	c0000000001e7b30	# .block_is_partially_uptodate+0xd0/0x100
c0000000001e7a74  e9440008	ld	r10,8(r4)
c0000000001e7a78  78a80020	clrldi	r8,r5,32
c0000000001e7a7c  3c000001	lis	r0,1
c0000000001e7a80  812900a8	lwz	r9,168(r9)
c0000000001e7a84  39600001	li	r11,1
c0000000001e7a88  7c080050	subf	r0,r8,r0
c0000000001e7a8c  7f805040	cmplw	cr7,r0,r10
c0000000001e7a90  7d6b4830	slw	r11,r11,r9
c0000000001e7a94  796b0020	clrldi	r11,r11,32
c0000000001e7a98  419d00a8	bgt	cr7,c0000000001e7b40	# .block_is_partially_uptodate+0xe0/0x100
c0000000001e7a9c  7fa55840	cmpld	cr7,r5,r11
c0000000001e7aa0  7d004214	add	r8,r0,r8
c0000000001e7aa4  79080020	clrldi	r8,r8,32
c0000000001e7aa8  419c0078	blt	cr7,c0000000001e7b20	# .block_is_partially_uptodate+0xc0/0x100
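
One way to read the trace above: the faulting instruction is
"ld r9,0(r9)", R09 is zero in the register dump, and dar is 0.  That is
consistent with loading mapping->host through a NULL page->mapping,
because host is the first member of struct address_space and so sits at
offset 0.  The sketch below only illustrates that arithmetic; the
model_* types are hypothetical stand-ins, and the nrpages filler member
is an assumption made to give the struct a second field.

```c
#include <stddef.h>
#include <stdint.h>

struct model_inode;			/* opaque; only pointed to */

/* Hypothetical stand-in mirroring only the position of ->host. */
struct model_address_space {
	struct model_inode *host;	/* first member, hence offset 0 */
	unsigned long nrpages;		/* illustrative filler member */
};

/*
 * Address that a load of mapping->host would touch, given the raw value
 * of the mapping pointer.  With mapping == 0 the result is 0, which
 * matches the dar value reported by xmon.
 */
static uintptr_t model_fault_address(uintptr_t mapping)
{
	return mapping + offsetof(struct model_address_space, host);
}
```

With a mapping value of 0, model_fault_address() returns 0, the same
address the data access exception reports.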

Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
Reviewed-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: <arunabal@in.ibm.com>
Cc: <sbest@us.ibm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Minchan Kim <minchan.kim@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

---
 mm/filemap.c |    3 +++
 1 file changed, 3 insertions(+)

--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1010,6 +1010,9 @@ find_page:
 				goto page_not_up_to_date;
 			if (!trylock_page(page))
 				goto page_not_up_to_date;
+			/* Did it get truncated before we got the lock? */
+			if (!page->mapping)
+				goto page_not_up_to_date_locked;
 			if (!mapping->a_ops->is_partially_uptodate(page,
 								desc, offset))
 				goto page_not_up_to_date_locked;