From: Greg Kroah-Hartman
Date: Sun, 14 Sep 2025 12:52:45 +0000 (+0200)
Subject: 6.1-stable patches
X-Git-Tag: v6.1.153~40
X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=052c026d4bf0dc617093efdc91fd71e8fd95ae56;p=thirdparty%2Fkernel%2Fstable-queue.git

6.1-stable patches

added patches:
	mm-memory-failure-fix-vm_bug_on_page-pagepoisoned-page-when-unpoison-memory.patch
---

diff --git a/queue-6.1/mm-memory-failure-fix-vm_bug_on_page-pagepoisoned-page-when-unpoison-memory.patch b/queue-6.1/mm-memory-failure-fix-vm_bug_on_page-pagepoisoned-page-when-unpoison-memory.patch
new file mode 100644
index 0000000000..6e7ba0ebea
--- /dev/null
+++ b/queue-6.1/mm-memory-failure-fix-vm_bug_on_page-pagepoisoned-page-when-unpoison-memory.patch
@@ -0,0 +1,110 @@
+From d613f53c83ec47089c4e25859d5e8e0359f6f8da Mon Sep 17 00:00:00 2001
+From: Miaohe Lin
+Date: Thu, 28 Aug 2025 10:46:18 +0800
+Subject: mm/memory-failure: fix VM_BUG_ON_PAGE(PagePoisoned(page)) when unpoison memory
+
+From: Miaohe Lin
+
+commit d613f53c83ec47089c4e25859d5e8e0359f6f8da upstream.
+
+When I did memory failure tests, below panic occurs:
+
+page dumped because: VM_BUG_ON_PAGE(PagePoisoned(page))
+kernel BUG at include/linux/page-flags.h:616!
+Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
+CPU: 3 PID: 720 Comm: bash Not tainted 6.10.0-rc1-00195-g148743902568 #40
+RIP: 0010:unpoison_memory+0x2f3/0x590
+RSP: 0018:ffffa57fc8787d60 EFLAGS: 00000246
+RAX: 0000000000000037 RBX: 0000000000000009 RCX: ffff9be25fcdc9c8
+RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff9be25fcdc9c0
+RBP: 0000000000300000 R08: ffffffffb4956f88 R09: 0000000000009ffb
+R10: 0000000000000284 R11: ffffffffb4926fa0 R12: ffffe6b00c000000
+R13: ffff9bdb453dfd00 R14: 0000000000000000 R15: fffffffffffffffe
+FS: 00007f08f04e4740(0000) GS:ffff9be25fcc0000(0000) knlGS:0000000000000000
+CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
+CR2: 0000564787a30410 CR3: 000000010d4e2000 CR4: 00000000000006f0
+Call Trace:
+ 
+ unpoison_memory+0x2f3/0x590
+ simple_attr_write_xsigned.constprop.0.isra.0+0xb3/0x110
+ debugfs_attr_write+0x42/0x60
+ full_proxy_write+0x5b/0x80
+ vfs_write+0xd5/0x540
+ ksys_write+0x64/0xe0
+ do_syscall_64+0xb9/0x1d0
+ entry_SYSCALL_64_after_hwframe+0x77/0x7f
+RIP: 0033:0x7f08f0314887
+RSP: 002b:00007ffece710078 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
+RAX: ffffffffffffffda RBX: 0000000000000009 RCX: 00007f08f0314887
+RDX: 0000000000000009 RSI: 0000564787a30410 RDI: 0000000000000001
+RBP: 0000564787a30410 R08: 000000000000fefe R09: 000000007fffffff
+R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000009
+R13: 00007f08f041b780 R14: 00007f08f0417600 R15: 00007f08f0416a00
+ 
+Modules linked in: hwpoison_inject
+---[ end trace 0000000000000000 ]---
+RIP: 0010:unpoison_memory+0x2f3/0x590
+RSP: 0018:ffffa57fc8787d60 EFLAGS: 00000246
+RAX: 0000000000000037 RBX: 0000000000000009 RCX: ffff9be25fcdc9c8
+RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff9be25fcdc9c0
+RBP: 0000000000300000 R08: ffffffffb4956f88 R09: 0000000000009ffb
+R10: 0000000000000284 R11: ffffffffb4926fa0 R12: ffffe6b00c000000
+R13: ffff9bdb453dfd00 R14: 0000000000000000 R15: fffffffffffffffe
+FS: 00007f08f04e4740(0000) GS:ffff9be25fcc0000(0000) knlGS:0000000000000000
+CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
+CR2: 0000564787a30410 CR3: 000000010d4e2000 CR4: 00000000000006f0
+Kernel panic - not syncing: Fatal exception
+Kernel Offset: 0x31c00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
+---[ end Kernel panic - not syncing: Fatal exception ]---
+
+The root cause is that unpoison_memory() tries to check the PG_HWPoison
+flags of an uninitialized page. So VM_BUG_ON_PAGE(PagePoisoned(page)) is
+triggered. This can be reproduced by below steps:
+
+1.Offline memory block:
+
+ echo offline > /sys/devices/system/memory/memory12/state
+
+2.Get offlined memory pfn:
+
+ page-types -b n -rlN
+
+3.Write pfn to unpoison-pfn
+
+ echo > /sys/kernel/debug/hwpoison/unpoison-pfn
+
+This scenario can be identified by pfn_to_online_page() returning NULL.
+And ZONE_DEVICE pages are never expected, so we can simply fail if
+pfn_to_online_page() == NULL to fix the bug.
+
+Link: https://lkml.kernel.org/r/20250828024618.1744895-1-linmiaohe@huawei.com
+Fixes: f1dd2cd13c4b ("mm, memory_hotplug: do not associate hotadded memory to zones until online")
+Signed-off-by: Miaohe Lin
+Suggested-by: David Hildenbrand
+Acked-by: David Hildenbrand
+Cc: Naoya Horiguchi
+Cc:
+Signed-off-by: Andrew Morton
+[ Adjust context ]
+Signed-off-by: Sasha Levin
+Signed-off-by: Greg Kroah-Hartman
+---
+ mm/memory-failure.c | 7 +++----
+ 1 file changed, 3 insertions(+), 4 deletions(-)
+
+--- a/mm/memory-failure.c
++++ b/mm/memory-failure.c
+@@ -2346,10 +2346,9 @@ int unpoison_memory(unsigned long pfn)
+ 	static DEFINE_RATELIMIT_STATE(unpoison_rs, DEFAULT_RATELIMIT_INTERVAL,
+ 					DEFAULT_RATELIMIT_BURST);
+ 
+-	if (!pfn_valid(pfn))
+-		return -ENXIO;
+-
+-	p = pfn_to_page(pfn);
++	p = pfn_to_online_page(pfn);
++	if (!p)
++		return -EIO;
+ 	page = compound_head(p);
+ 
+ 	mutex_lock(&mf_mutex);
diff --git a/queue-6.1/series b/queue-6.1/series
index 0b61671619..c670003b2d 100644
--- a/queue-6.1/series
+++ b/queue-6.1/series
@@ -39,3 +39,4 @@ mtd-nand-raw-atmel-respect-tar-tclr-in-read-setup-timing.patch
 mm-khugepaged-convert-hpage_collapse_scan_pmd-to-use-folios.patch
 mm-khugepaged-fix-the-address-passed-to-notifier-on-testing-young.patch
 kernfs-fix-uaf-in-polling-when-open-file-is-released.patch
+mm-memory-failure-fix-vm_bug_on_page-pagepoisoned-page-when-unpoison-memory.patch