From: Greg Kroah-Hartman Date: Mon, 26 Jun 2023 15:26:42 +0000 (+0200) Subject: 5.4-stable patches X-Git-Tag: v4.14.320~4 X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=456f9c6afca98e3766073879f163a6d9bf66ee17;p=thirdparty%2Fkernel%2Fstable-queue.git 5.4-stable patches added patches: mm-fix-vm_bug_on-pagetail-and-bug_on-pagewriteback.patch mm-make-wait_on_page_writeback-wait-for-multiple-pending-writebacks.patch --- diff --git a/queue-5.4/mm-fix-vm_bug_on-pagetail-and-bug_on-pagewriteback.patch b/queue-5.4/mm-fix-vm_bug_on-pagetail-and-bug_on-pagewriteback.patch new file mode 100644 index 00000000000..ed717c9c098 --- /dev/null +++ b/queue-5.4/mm-fix-vm_bug_on-pagetail-and-bug_on-pagewriteback.patch @@ -0,0 +1,122 @@ +From 073861ed77b6b957c3c8d54a11dc503f7d986ceb Mon Sep 17 00:00:00 2001 +From: Hugh Dickins +Date: Tue, 24 Nov 2020 08:46:43 -0800 +Subject: mm: fix VM_BUG_ON(PageTail) and BUG_ON(PageWriteback) + +From: Hugh Dickins + +commit 073861ed77b6b957c3c8d54a11dc503f7d986ceb upstream. + +Twice now, when exercising ext4 looped on shmem huge pages, I have crashed +on the PF_ONLY_HEAD check inside PageWaiters(): ext4_finish_bio() calling +end_page_writeback() calling wake_up_page() on tail of a shmem huge page, +no longer an ext4 page at all. + +The problem is that PageWriteback is not accompanied by a page reference +(as the NOTE at the end of test_clear_page_writeback() acknowledges): as +soon as TestClearPageWriteback has been done, that page could be removed +from page cache, freed, and reused for something else by the time that +wake_up_page() is reached. + +https://lore.kernel.org/linux-mm/20200827122019.GC14765@casper.infradead.org/ +Matthew Wilcox suggested avoiding or weakening the PageWaiters() tail +check; but I'm paranoid about even looking at an unreferenced struct page, +lest its memory might itself have already been reused or hotremoved (and +wake_up_page_bit() may modify that memory with its ClearPageWaiters()). + +Then on crashing a second time, realized there's a stronger reason against +that approach. If my testing just occasionally crashes on that check, +when the page is reused for part of a compound page, wouldn't it be much +more common for the page to get reused as an order-0 page before reaching +wake_up_page()? And on rare occasions, might that reused page already be +marked PageWriteback by its new user, and already be waited upon? What +would that look like? + +It would look like BUG_ON(PageWriteback) after wait_on_page_writeback() +in write_cache_pages() (though I have never seen that crash myself). + +Matthew Wilcox explaining this to himself: + "page is allocated, added to page cache, dirtied, writeback starts, + + --- thread A --- + filesystem calls end_page_writeback() + test_clear_page_writeback() + --- context switch to thread B --- + truncate_inode_pages_range() finds the page, it doesn't have writeback set, + we delete it from the page cache. Page gets reallocated, dirtied, writeback + starts again. Then we call write_cache_pages(), see + PageWriteback() set, call wait_on_page_writeback() + --- context switch back to thread A --- + wake_up_page(page, PG_writeback); + ... thread B is woken, but because the wakeup was for the old use of + the page, PageWriteback is still set. + + Devious" + +And prior to 2a9127fcf229 ("mm: rewrite wait_on_page_bit_common() logic") +this would have been much less likely: before that, wake_page_function()'s +non-exclusive case would stop walking and not wake if it found Writeback +already set again; whereas now the non-exclusive case proceeds to wake. + +I have not thought of a fix that does not add a little overhead: the +simplest fix is for end_page_writeback() to get_page() before calling +test_clear_page_writeback(), then put_page() after wake_up_page(). + +Was there a chance of missed wakeups before, since a page freed before +reaching wake_up_page() would have PageWaiters cleared? I think not, +because each waiter does hold a reference on the page. This bug comes +when the old use of the page, the one we do TestClearPageWriteback on, +had *no* waiters, so no additional page reference beyond the page cache +(and whoever racily freed it). The reuse of the page has a waiter +holding a reference, and its own PageWriteback set; but the belated +wake_up_page() has woken the reuse to hit that BUG_ON(PageWriteback). + +Reported-by: syzbot+3622cea378100f45d59f@syzkaller.appspotmail.com +Reported-by: Qian Cai +Fixes: 2a9127fcf229 ("mm: rewrite wait_on_page_bit_common() logic") +Signed-off-by: Hugh Dickins +Cc: stable@vger.kernel.org # v5.8+ +Signed-off-by: Linus Torvalds +Signed-off-by: Greg Kroah-Hartman +--- + mm/filemap.c | 8 ++++++++ + mm/page-writeback.c | 6 ------ + 2 files changed, 8 insertions(+), 6 deletions(-) + +--- a/mm/filemap.c ++++ b/mm/filemap.c +@@ -1388,11 +1388,19 @@ void end_page_writeback(struct page *pag + rotate_reclaimable_page(page); + } + ++ /* ++ * Writeback does not hold a page reference of its own, relying ++ * on truncation to wait for the clearing of PG_writeback. ++ * But here we must make sure that the page is not freed and ++ * reused before the wake_up_page(). ++ */ ++ get_page(page); + if (!test_clear_page_writeback(page)) + BUG(); + + smp_mb__after_atomic(); + wake_up_page(page, PG_writeback); ++ put_page(page); + } + EXPORT_SYMBOL(end_page_writeback); + +--- a/mm/page-writeback.c ++++ b/mm/page-writeback.c +@@ -2746,12 +2746,6 @@ int test_clear_page_writeback(struct pag + } else { + ret = TestClearPageWriteback(page); + } +- /* +- * NOTE: Page might be free now! Writeback doesn't hold a page +- * reference on its own, it relies on truncation to wait for +- * the clearing of PG_writeback. The below can only access +- * page state that is static across allocation cycles. +- */ + if (ret) { + dec_lruvec_state(lruvec, NR_WRITEBACK); + dec_zone_page_state(page, NR_ZONE_WRITE_PENDING); diff --git a/queue-5.4/mm-make-wait_on_page_writeback-wait-for-multiple-pending-writebacks.patch b/queue-5.4/mm-make-wait_on_page_writeback-wait-for-multiple-pending-writebacks.patch new file mode 100644 index 00000000000..cf7e8154555 --- /dev/null +++ b/queue-5.4/mm-make-wait_on_page_writeback-wait-for-multiple-pending-writebacks.patch @@ -0,0 +1,102 @@ +From c2407cf7d22d0c0d94cf20342b3b8f06f1d904e7 Mon Sep 17 00:00:00 2001 +From: Linus Torvalds +Date: Tue, 5 Jan 2021 11:33:00 -0800 +Subject: mm: make wait_on_page_writeback() wait for multiple pending writebacks + +From: Linus Torvalds + +commit c2407cf7d22d0c0d94cf20342b3b8f06f1d904e7 upstream. + +Ever since commit 2a9127fcf229 ("mm: rewrite wait_on_page_bit_common() +logic") we've had some very occasional reports of BUG_ON(PageWriteback) +in write_cache_pages(), which we thought we already fixed in commit +073861ed77b6 ("mm: fix VM_BUG_ON(PageTail) and BUG_ON(PageWriteback)"). + +But syzbot just reported another one, even with that commit in place. + +And it turns out that there's a simpler way to trigger the BUG_ON() than +the one Hugh found with page re-use. It all boils down to the fact that +the page writeback is ostensibly serialized by the page lock, but that +isn't actually really true. + +Yes, the people _setting_ writeback all do so under the page lock, but +the actual clearing of the bit - and waking up any waiters - happens +without any page lock. + +This gives us this fairly simple race condition: + + CPU1 = end previous writeback + CPU2 = start new writeback under page lock + CPU3 = write_cache_pages() + + CPU1 CPU2 CPU3 + ---- ---- ---- + + end_page_writeback() + test_clear_page_writeback(page) + ... delayed... + + lock_page(); + set_page_writeback() + unlock_page() + + lock_page() + wait_on_page_writeback(); + + wake_up_page(page, PG_writeback); + .. wakes up CPU3 .. + + BUG_ON(PageWriteback(page)); + +where the BUG_ON() happens because we woke up the PG_writeback bit +becasue of the _previous_ writeback, but a new one had already been +started because the clearing of the bit wasn't actually atomic wrt the +actual wakeup or serialized by the page lock. + +The reason this didn't use to happen was that the old logic in waiting +on a page bit would just loop if it ever saw the bit set again. + +The nice proper fix would probably be to get rid of the whole "wait for +writeback to clear, and then set it" logic in the writeback path, and +replace it with an atomic "wait-to-set" (ie the same as we have for page +locking: we set the page lock bit with a single "lock_page()", not with +"wait for lock bit to clear and then set it"). + +However, out current model for writeback is that the waiting for the +writeback bit is done by the generic VFS code (ie write_cache_pages()), +but the actual setting of the writeback bit is done much later by the +filesystem ".writepages()" function. + +IOW, to make the writeback bit have that same kind of "wait-to-set" +behavior as we have for page locking, we'd have to change our roughly +~50 different writeback functions. Painful. + +Instead, just make "wait_on_page_writeback()" loop on the very unlikely +situation that the PG_writeback bit is still set, basically re-instating +the old behavior. This is very non-optimal in case of contention, but +since we only ever set the bit under the page lock, that situation is +controlled. + +Reported-by: syzbot+2fc0712f8f8b8b8fa0ef@syzkaller.appspotmail.com +Fixes: 2a9127fcf229 ("mm: rewrite wait_on_page_bit_common() logic") +Acked-by: Hugh Dickins +Cc: Andrew Morton +Cc: Matthew Wilcox +Cc: stable@kernel.org +Signed-off-by: Linus Torvalds +Signed-off-by: Greg Kroah-Hartman +--- + mm/page-writeback.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +--- a/mm/page-writeback.c ++++ b/mm/page-writeback.c +@@ -2811,7 +2811,7 @@ EXPORT_SYMBOL(__test_set_page_writeback) + */ + void wait_on_page_writeback(struct page *page) + { +- if (PageWriteback(page)) { ++ while (PageWriteback(page)) { + trace_wait_on_page_writeback(page, page_mapping(page)); + wait_on_page_bit(page, PG_writeback); + } diff --git a/queue-5.4/series b/queue-5.4/series index b2f30634880..59ff235412c 100644 --- a/queue-5.4/series +++ b/queue-5.4/series @@ -55,3 +55,5 @@ drm-exynos-fix-race-condition-uaf-in-exynos_g2d_exec.patch drm-radeon-fix-race-condition-uaf-in-radeon_gem_set_.patch x86-apic-fix-kernel-panic-when-booting-with-intremap.patch i2c-imx-lpi2c-fix-type-char-overflow-issue-when-calc.patch +mm-fix-vm_bug_on-pagetail-and-bug_on-pagewriteback.patch +mm-make-wait_on_page_writeback-wait-for-multiple-pending-writebacks.patch