From: Greg Kroah-Hartman
Date: Wed, 17 Sep 2025 09:07:38 +0000 (+0200)
Subject: 5.4-stable patches
X-Git-Tag: v6.1.153~8
X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=5920da01d68fcc3f31e9afa0109f5d1aeb6f9930;p=thirdparty%2Fkernel%2Fstable-queue.git

5.4-stable patches

added patches:
	mm-memory-failure-fix-vm_bug_on_page-pagepoisoned-page-when-unpoison-memory.patch
---

diff --git a/queue-5.4/mm-memory-failure-fix-vm_bug_on_page-pagepoisoned-page-when-unpoison-memory.patch b/queue-5.4/mm-memory-failure-fix-vm_bug_on_page-pagepoisoned-page-when-unpoison-memory.patch
new file mode 100644
index 0000000000..9d0b5e27bf
--- /dev/null
+++ b/queue-5.4/mm-memory-failure-fix-vm_bug_on_page-pagepoisoned-page-when-unpoison-memory.patch
@@ -0,0 +1,113 @@
+From stable+bounces-179574-greg=kroah.com@vger.kernel.org Sun Sep 14 15:30:50 2025
+From: Sasha Levin
+Date: Sun, 14 Sep 2025 09:30:42 -0400
+Subject: mm/memory-failure: fix VM_BUG_ON_PAGE(PagePoisoned(page)) when unpoison memory
+To: stable@vger.kernel.org
+Cc: Miaohe Lin , David Hildenbrand , Naoya Horiguchi , Andrew Morton , Sasha Levin
+Message-ID: <20250914133042.87228-1-sashal@kernel.org>
+
+From: Miaohe Lin
+
+[ Upstream commit d613f53c83ec47089c4e25859d5e8e0359f6f8da ]
+
+When I did memory failure tests, below panic occurs:
+
+page dumped because: VM_BUG_ON_PAGE(PagePoisoned(page))
+kernel BUG at include/linux/page-flags.h:616!
+Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
+CPU: 3 PID: 720 Comm: bash Not tainted 6.10.0-rc1-00195-g148743902568 #40
+RIP: 0010:unpoison_memory+0x2f3/0x590
+RSP: 0018:ffffa57fc8787d60 EFLAGS: 00000246
+RAX: 0000000000000037 RBX: 0000000000000009 RCX: ffff9be25fcdc9c8
+RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff9be25fcdc9c0
+RBP: 0000000000300000 R08: ffffffffb4956f88 R09: 0000000000009ffb
+R10: 0000000000000284 R11: ffffffffb4926fa0 R12: ffffe6b00c000000
+R13: ffff9bdb453dfd00 R14: 0000000000000000 R15: fffffffffffffffe
+FS: 00007f08f04e4740(0000) GS:ffff9be25fcc0000(0000) knlGS:0000000000000000
+CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
+CR2: 0000564787a30410 CR3: 000000010d4e2000 CR4: 00000000000006f0
+Call Trace:
+
+ unpoison_memory+0x2f3/0x590
+ simple_attr_write_xsigned.constprop.0.isra.0+0xb3/0x110
+ debugfs_attr_write+0x42/0x60
+ full_proxy_write+0x5b/0x80
+ vfs_write+0xd5/0x540
+ ksys_write+0x64/0xe0
+ do_syscall_64+0xb9/0x1d0
+ entry_SYSCALL_64_after_hwframe+0x77/0x7f
+RIP: 0033:0x7f08f0314887
+RSP: 002b:00007ffece710078 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
+RAX: ffffffffffffffda RBX: 0000000000000009 RCX: 00007f08f0314887
+RDX: 0000000000000009 RSI: 0000564787a30410 RDI: 0000000000000001
+RBP: 0000564787a30410 R08: 000000000000fefe R09: 000000007fffffff
+R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000009
+R13: 00007f08f041b780 R14: 00007f08f0417600 R15: 00007f08f0416a00
+
+Modules linked in: hwpoison_inject
+---[ end trace 0000000000000000 ]---
+RIP: 0010:unpoison_memory+0x2f3/0x590
+RSP: 0018:ffffa57fc8787d60 EFLAGS: 00000246
+RAX: 0000000000000037 RBX: 0000000000000009 RCX: ffff9be25fcdc9c8
+RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff9be25fcdc9c0
+RBP: 0000000000300000 R08: ffffffffb4956f88 R09: 0000000000009ffb
+R10: 0000000000000284 R11: ffffffffb4926fa0 R12: ffffe6b00c000000
+R13: ffff9bdb453dfd00 R14: 0000000000000000 R15: fffffffffffffffe
+FS: 00007f08f04e4740(0000) GS:ffff9be25fcc0000(0000) knlGS:0000000000000000
+CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
+CR2: 0000564787a30410 CR3: 000000010d4e2000 CR4: 00000000000006f0
+Kernel panic - not syncing: Fatal exception
+Kernel Offset: 0x31c00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
+---[ end Kernel panic - not syncing: Fatal exception ]---
+
+The root cause is that unpoison_memory() tries to check the PG_HWPoison
+flags of an uninitialized page. So VM_BUG_ON_PAGE(PagePoisoned(page)) is
+triggered. This can be reproduced by below steps:
+
+1.Offline memory block:
+
+ echo offline > /sys/devices/system/memory/memory12/state
+
+2.Get offlined memory pfn:
+
+ page-types -b n -rlN
+
+3.Write pfn to unpoison-pfn
+
+ echo > /sys/kernel/debug/hwpoison/unpoison-pfn
+
+This scenario can be identified by pfn_to_online_page() returning NULL.
+And ZONE_DEVICE pages are never expected, so we can simply fail if
+pfn_to_online_page() == NULL to fix the bug.
+
+Link: https://lkml.kernel.org/r/20250828024618.1744895-1-linmiaohe@huawei.com
+Fixes: f1dd2cd13c4b ("mm, memory_hotplug: do not associate hotadded memory to zones until online")
+Signed-off-by: Miaohe Lin
+Suggested-by: David Hildenbrand
+Acked-by: David Hildenbrand
+Cc: Naoya Horiguchi
+Cc:
+Signed-off-by: Andrew Morton
+[ Adjust context ]
+Signed-off-by: Sasha Levin
+Signed-off-by: Greg Kroah-Hartman
+---
+ mm/memory-failure.c | 7 +++----
+ 1 file changed, 3 insertions(+), 4 deletions(-)
+
+--- a/mm/memory-failure.c
++++ b/mm/memory-failure.c
+@@ -1543,10 +1543,9 @@ int unpoison_memory(unsigned long pfn)
+ 	static DEFINE_RATELIMIT_STATE(unpoison_rs, DEFAULT_RATELIMIT_INTERVAL,
+ 					DEFAULT_RATELIMIT_BURST);
+ 
+-	if (!pfn_valid(pfn))
+-		return -ENXIO;
+-
+-	p = pfn_to_page(pfn);
++	p = pfn_to_online_page(pfn);
++	if (!p)
++		return -EIO;
+ 	page = compound_head(p);
+ 
+ 	if (!PageHWPoison(p)) {
diff --git a/queue-5.4/series b/queue-5.4/series
index 879729a0dd..d0777847da 100644
--- a/queue-5.4/series
+++ b/queue-5.4/series
@@ -28,3 +28,4 @@ dmaengine-ti-edma-fix-memory-allocation-size-for-que.patch
 dmaengine-qcom-bam_dma-fix-dt-error-handling-for-num-channels-ees.patch
 phy-ti-pipe3-fix-device-leak-at-unbind.patch
 soc-qcom-mdt_loader-deal-with-zero-e_shentsize.patch
+mm-memory-failure-fix-vm_bug_on_page-pagepoisoned-page-when-unpoison-memory.patch