]> git.ipfire.org Git - thirdparty/kernel/stable-queue.git/blame - releases/4.4.133/mm-filemap-avoid-unnecessary-calls-to-lock_page-when-waiting-for-io-to-complete-during-a-read.patch
Fixes for 5.10
[thirdparty/kernel/stable-queue.git] / releases / 4.4.133 / mm-filemap-avoid-unnecessary-calls-to-lock_page-when-waiting-for-io-to-complete-during-a-read.patch
CommitLineData
bdae2828
GKH
1From ebded02788b5d7c7600f8cff26ae07896d568649 Mon Sep 17 00:00:00 2001
2From: Mel Gorman <mgorman@techsingularity.net>
3Date: Tue, 15 Mar 2016 14:55:39 -0700
4Subject: mm: filemap: avoid unnecessary calls to lock_page when waiting for IO to complete during a read
5
6From: Mel Gorman <mgorman@techsingularity.net>
7
8commit ebded02788b5d7c7600f8cff26ae07896d568649 upstream.
9
10In the generic read paths the kernel looks up a page in the page cache
11and if it's up to date, it is used. If not, the page lock is acquired
12to wait for IO to complete and then check the page. If multiple
13processes are waiting on IO, they all serialise against the lock and
14duplicate the checks. This is unnecessary.
15
16The page lock in itself does not give any guarantees to the callers
17about the page state as it can be immediately truncated or reclaimed
18after the page is unlocked. It's sufficient to wait_on_page_locked and
19then continue if the page is up to date on wakeup.
20
21It is possible that a truncated but up-to-date page is returned but the
22reference taken during read prevents it disappearing underneath the
23caller and the data is still valid if PageUptodate.
24
25The overall impact is small as even if processes serialise on the lock,
26the lock section is tiny once the IO is complete. Profiles indicated
27that unlock_page and friends are generally a tiny portion of a
28read-intensive workload. An artificial test was created that had
29instances of dd access a cache-cold file on an ext4 filesystem and
30measure how long the read took.
31
32paralleldd
33 4.4.0 4.4.0
34 vanilla avoidlock
35Amean Elapsd-1 5.28 ( 0.00%) 5.15 ( 2.50%)
36Amean Elapsd-4 5.29 ( 0.00%) 5.17 ( 2.12%)
37Amean Elapsd-7 5.28 ( 0.00%) 5.18 ( 1.78%)
38Amean Elapsd-12 5.20 ( 0.00%) 5.33 ( -2.50%)
39Amean Elapsd-21 5.14 ( 0.00%) 5.21 ( -1.41%)
40Amean Elapsd-30 5.30 ( 0.00%) 5.12 ( 3.38%)
41Amean Elapsd-48 5.78 ( 0.00%) 5.42 ( 6.21%)
42Amean Elapsd-79 6.78 ( 0.00%) 6.62 ( 2.46%)
43Amean Elapsd-110 9.09 ( 0.00%) 8.99 ( 1.15%)
44Amean Elapsd-128 10.60 ( 0.00%) 10.43 ( 1.66%)
45
46The impact is small but intuitively, it makes sense to avoid unnecessary
47calls to lock_page.
48
49Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
50Reviewed-by: Jan Kara <jack@suse.cz>
51Cc: Hugh Dickins <hughd@google.com>
52Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
53Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
54Signed-off-by: Mel Gorman <mgorman@suse.de>
55Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
56
57---
58 mm/filemap.c | 49 +++++++++++++++++++++++++++++++++++++++++++++++++
59 1 file changed, 49 insertions(+)
60
61--- a/mm/filemap.c
62+++ b/mm/filemap.c
63@@ -1581,6 +1581,15 @@ find_page:
64 index, last_index - index);
65 }
66 if (!PageUptodate(page)) {
67+ /*
68+ * See comment in do_read_cache_page on why
69+ * wait_on_page_locked is used to avoid unnecessarily
70+ * serialisations and why it's safe.
71+ */
72+ wait_on_page_locked_killable(page);
73+ if (PageUptodate(page))
74+ goto page_ok;
75+
76 if (inode->i_blkbits == PAGE_CACHE_SHIFT ||
77 !mapping->a_ops->is_partially_uptodate)
78 goto page_not_up_to_date;
79@@ -2253,12 +2262,52 @@ filler:
80 if (PageUptodate(page))
81 goto out;
82
83+ /*
84+ * Page is not up to date and may be locked due one of the following
85+ * case a: Page is being filled and the page lock is held
86+ * case b: Read/write error clearing the page uptodate status
87+ * case c: Truncation in progress (page locked)
88+ * case d: Reclaim in progress
89+ *
90+ * Case a, the page will be up to date when the page is unlocked.
91+ * There is no need to serialise on the page lock here as the page
92+ * is pinned so the lock gives no additional protection. Even if the
93+ * the page is truncated, the data is still valid if PageUptodate as
94+ * it's a race vs truncate race.
95+ * Case b, the page will not be up to date
96+ * Case c, the page may be truncated but in itself, the data may still
97+ * be valid after IO completes as it's a read vs truncate race. The
98+ * operation must restart if the page is not uptodate on unlock but
99+ * otherwise serialising on page lock to stabilise the mapping gives
100+ * no additional guarantees to the caller as the page lock is
101+ * released before return.
102+ * Case d, similar to truncation. If reclaim holds the page lock, it
103+ * will be a race with remove_mapping that determines if the mapping
104+ * is valid on unlock but otherwise the data is valid and there is
105+ * no need to serialise with page lock.
106+ *
107+ * As the page lock gives no additional guarantee, we optimistically
108+ * wait on the page to be unlocked and check if it's up to date and
109+ * use the page if it is. Otherwise, the page lock is required to
110+ * distinguish between the different cases. The motivation is that we
111+ * avoid spurious serialisations and wakeups when multiple processes
112+ * wait on the same page for IO to complete.
113+ */
114+ wait_on_page_locked(page);
115+ if (PageUptodate(page))
116+ goto out;
117+
118+ /* Distinguish between all the cases under the safety of the lock */
119 lock_page(page);
120+
121+ /* Case c or d, restart the operation */
122 if (!page->mapping) {
123 unlock_page(page);
124 page_cache_release(page);
125 goto repeat;
126 }
127+
128+ /* Someone else locked and filled the page in a very small window */
129 if (PageUptodate(page)) {
130 unlock_page(page);
131 goto out;