mm/memory-failure: use zone_pcp_disable() for poison handling
author Kaitao Cheng <chengkaitao@kylinos.cn>
Thu, 14 May 2026 08:57:54 +0000 (16:57 +0800)
committer Andrew Morton <akpm@linux-foundation.org>
Tue, 2 Jun 2026 22:22:33 +0000 (15:22 -0700)
__page_handle_poison() used drain_all_pages() instead of
zone_pcp_disable() because dissolve_free_hugetlb_folio() could restore HVO
vmemmap pages and decrement hugetlb_optimize_vmemmap_key.  That static key
update took cpu_hotplug_lock through static_key_slow_dec(), while
zone_pcp_disable() holds pcp_batch_high_lock.  CPU hotplug takes the locks
in the opposite order through page_alloc_cpu_online/dead(), so the
combination could deadlock.
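
The lock inversion described above can be illustrated with a small
userspace sketch. It is not kernel code: the enum values only stand in for
pcp_batch_high_lock and cpu_hotplug_lock, and record_order() is a
hypothetical helper that remembers which lock was taken while another was
held, loosely in the spirit of a lock-order checker.

```c
#include <stdbool.h>

/* Hypothetical ids standing in for the two kernel locks named above. */
enum lock_id { PCP_BATCH_HIGH_LOCK, CPU_HOTPLUG_LOCK, NR_LOCKS };

/* seen[a][b] is true once some path took lock b while holding lock a. */
static bool seen[NR_LOCKS][NR_LOCKS];

/*
 * Record "outer is held, then inner is taken".
 * Returns true if the opposite order was recorded earlier, meaning the
 * two paths form an ABBA pattern and can deadlock against each other.
 */
static bool record_order(enum lock_id outer, enum lock_id inner)
{
	seen[outer][inner] = true;
	return seen[inner][outer];
}
```

Recording the old poison-handling order first,
record_order(PCP_BATCH_HIGH_LOCK, CPU_HOTPLUG_LOCK), returns false.
Recording the CPU hotplug order afterwards,
record_order(CPU_HOTPLUG_LOCK, PCP_BATCH_HIGH_LOCK), returns true, which
flags the inversion.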

That dependency no longer exists.  Commit da3e2d1ca43d ("mm/hugetlb:
remove hugetlb_optimize_vmemmap_key static key") removed the HVO static
key and the static_branch_dec() from hugetlb_vmemmap_restore_folio().  The
dissolve_free_hugetlb_folio() path no longer reaches
static_key_slow_dec().

Use zone_pcp_disable() again while dissolving the hugetlb folio and taking
the target page off the buddy allocator.  This prevents the drained PCP
lists from being refilled before take_page_off_buddy() runs, making the
page isolation deterministic.
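
As a rough illustration of why disabling the PCP lists is deterministic
while draining alone is not, here is a toy userspace model. Everything in
it is an assumption made for illustration: struct toy_zone tracks a single
page, and the toy_*() helpers, old_flow() and new_flow() are hypothetical
names that only mimic the ordering of the real calls. In particular, the
model simplifies a disabled PCP to "never refills".

```c
#include <stdbool.h>

/* Where the single tracked page currently lives in the toy model. */
enum where { IN_USE, ON_PCP, ON_BUDDY };

struct toy_zone {
	bool pcp_enabled;	/* false models a zone with PCP disabled */
	enum where page;
};

/* Draining moves a page cached on a PCP list back to the buddy lists. */
static void toy_drain(struct toy_zone *z)
{
	if (z->page == ON_PCP)
		z->page = ON_BUDDY;
}

/* A PCP refill may pull the page from buddy, but only if PCP is enabled. */
static void toy_refill(struct toy_zone *z)
{
	if (z->pcp_enabled && z->page == ON_BUDDY)
		z->page = ON_PCP;
}

/* Taking the page off buddy succeeds only if it is on the buddy lists. */
static bool toy_take_off_buddy(struct toy_zone *z)
{
	if (z->page != ON_BUDDY)
		return false;
	z->page = IN_USE;
	return true;
}

/* Old ordering: drain, then take; a refill can slip in between. */
static bool old_flow(bool refill_races)
{
	struct toy_zone z = { .pcp_enabled = true, .page = ON_BUDDY };

	toy_drain(&z);
	if (refill_races)
		toy_refill(&z);
	return toy_take_off_buddy(&z);
}

/* New ordering: disable PCP (which also drains), take, then re-enable. */
static bool new_flow(bool refill_races)
{
	struct toy_zone z = { .pcp_enabled = true, .page = ON_BUDDY };
	bool ok;

	z.pcp_enabled = false;
	toy_drain(&z);
	if (refill_races)
		toy_refill(&z);		/* no effect while PCP is disabled */
	ok = toy_take_off_buddy(&z);
	z.pcp_enabled = true;
	return ok;
}
```

In this model old_flow(true) fails because the refill moves the page back
to a PCP list before it can be taken, while new_flow(true) succeeds
regardless of the racing refill.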

Link: https://lore.kernel.org/20260514085754.84097-1-kaitao.cheng@linux.dev
Signed-off-by: Kaitao Cheng <chengkaitao@kylinos.cn>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Acked-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
mm/memory-failure.c

index 1b8d0bade04a716f2ad4ed844cc2495bd8e2085f..51508a55c4055e67be20173cfd6fe5ceb205b92e 100644
@@ -172,23 +172,11 @@ static int __page_handle_poison(struct page *page)
 {
        int ret;
 
-       /*
-        * zone_pcp_disable() can't be used here. It will
-        * hold pcp_batch_high_lock and dissolve_free_hugetlb_folio() might hold
-        * cpu_hotplug_lock via static_key_slow_dec() when hugetlb vmemmap
-        * optimization is enabled. This will break current lock dependency
-        * chain and leads to deadlock.
-        * Disabling pcp before dissolving the page was a deterministic
-        * approach because we made sure that those pages cannot end up in any
-        * PCP list. Draining PCP lists expels those pages to the buddy system,
-        * but nothing guarantees that those pages do not get back to a PCP
-        * queue if we need to refill those.
-        */
+       zone_pcp_disable(page_zone(page));
        ret = dissolve_free_hugetlb_folio(page_folio(page));
-       if (!ret) {
-               drain_all_pages(page_zone(page));
+       if (!ret)
                ret = take_page_off_buddy(page);
-       }
+       zone_pcp_enable(page_zone(page));
 
        return ret;
 }