Patch series "mm/memory_hotplug: use PageOffline() instead of
PageReserved() for !ZONE_DEVICE".
This can be a considered a long-overdue follow-up to some parts of [1].
The patches are based on [2], but they are not strictly required -- just
makes it clearer why we can use adjust_managed_page_count() for memory
hotplug without going into details about highmem.
We stop initializing pages with PageReserved() in memory hotplug code --
except when dealing with ZONE_DEVICE for now. Instead, we use
PageOffline(): all pages are initialized to PageOffline() when onlining a
memory section, and only the ones actually getting exposed to the
system/page allocator will get PageOffline cleared.
This way, we enlighten memory hotplug more about PageOffline() pages and
can cleanup some hacks we have in virtio-mem code.
What about ZONE_DEVICE? PageOffline() is wrong, but we might just stop
using PageReserved() for them later by simply checking for
is_zone_device_page() at suitable places. That will be a separate patch
set / proposal.
This primarily affects virtio-mem, HV-balloon and XEN balloon. I only
briefly tested with virtio-mem, which benefits most from these cleanups.
[1] https://lore.kernel.org/all/
20191024120938.11237-1-david@redhat.com/
[2] https://lkml.kernel.org/r/
20240607083711.62833-1-david@redhat.com
This patch (of 3):
In preparation for further changes, let's teach __free_pages_core() about
the differences of memory hotplug handling.
Move the memory hotplug specific handling from generic_online_page() to
__free_pages_core(), use adjust_managed_page_count() on the memory hotplug
path, and spell out why memory freed via memblock cannot currently use
adjust_managed_page_count().
[david@redhat.com: add missed CONFIG_DEFERRED_STRUCT_PAGE_INIT]
Link: https://lkml.kernel.org/r/b72e6efd-fb0a-459c-b1a0-88a98e5b19e2@redhat.com
[david@redhat.com: fix up the memblock comment, per Oscar]
Link: https://lkml.kernel.org/r/2ed64218-7f3b-4302-a5dc-27f060654fe2@redhat.com
[david@redhat.com: add the parameter name also in the declaration]
Link: https://lkml.kernel.org/r/ca575956-f0dd-4fb9-a307-6b7621681ed9@redhat.com
Link: https://lkml.kernel.org/r/20240607090939.89524-1-david@redhat.com
Link: https://lkml.kernel.org/r/20240607090939.89524-2-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dexuan Cui <decui@microsoft.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Eugenio Pérez <eperezma@redhat.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Marco Elver <elver@google.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Wei Liu <wei.liu@kernel.org>
Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
int mt);
extern void memblock_free_pages(struct page *page, unsigned long pfn,
unsigned int order);
-extern void __free_pages_core(struct page *page, unsigned int order);
+extern void __free_pages_core(struct page *page, unsigned int order,
+ enum meminit_context context);
/*
* This will have no effect, other than possibly generating a warning, if the
shadow = smallstack_pop(&collect);
origin = smallstack_pop(&collect);
kmsan_setup_meta(page, shadow, origin, collect.order);
- __free_pages_core(page, collect.order);
+ __free_pages_core(page, collect.order, MEMINIT_EARLY);
}
}
void generic_online_page(struct page *page, unsigned int order)
{
- /*
- * Freeing the page with debug_pagealloc enabled will try to unmap it,
- * so we should map it first. This is better than introducing a special
- * case in page freeing fast path.
- */
- debug_pagealloc_map_pages(page, 1 << order);
- __free_pages_core(page, order);
- totalram_pages_add(1UL << order);
+ __free_pages_core(page, order, MEMINIT_HOTPLUG);
}
EXPORT_SYMBOL_GPL(generic_online_page);
if (nr_pages == MAX_ORDER_NR_PAGES && IS_MAX_ORDER_ALIGNED(pfn)) {
for (i = 0; i < nr_pages; i += pageblock_nr_pages)
set_pageblock_migratetype(page + i, MIGRATE_MOVABLE);
- __free_pages_core(page, MAX_PAGE_ORDER);
+ __free_pages_core(page, MAX_PAGE_ORDER, MEMINIT_EARLY);
return;
}
for (i = 0; i < nr_pages; i++, page++, pfn++) {
if (pageblock_aligned(pfn))
set_pageblock_migratetype(page, MIGRATE_MOVABLE);
- __free_pages_core(page, 0);
+ __free_pages_core(page, 0, MEMINIT_EARLY);
}
}
}
}
- __free_pages_core(page, order);
+ __free_pages_core(page, order, MEMINIT_EARLY);
}
DEFINE_STATIC_KEY_MAYBE(CONFIG_INIT_ON_ALLOC_DEFAULT_ON, init_on_alloc);
__count_vm_events(PGFREE, 1 << order);
}
-void __free_pages_core(struct page *page, unsigned int order)
+void __free_pages_core(struct page *page, unsigned int order,
+ enum meminit_context context)
{
unsigned int nr_pages = 1 << order;
struct page *p = page;
__ClearPageReserved(p);
set_page_count(p, 0);
- atomic_long_add(nr_pages, &page_zone(page)->managed_pages);
+ if (IS_ENABLED(CONFIG_MEMORY_HOTPLUG) &&
+ unlikely(context == MEMINIT_HOTPLUG)) {
+ /*
+ * Freeing the page with debug_pagealloc enabled will try to
+ * unmap it; some archs don't like double-unmappings, so
+ * map it first.
+ */
+ debug_pagealloc_map_pages(page, nr_pages);
+ adjust_managed_page_count(page, nr_pages);
+ } else {
+ /* memblock adjusts totalram_pages() manually. */
+ atomic_long_add(nr_pages, &page_zone(page)->managed_pages);
+ }
if (page_contains_unaccepted(page, order)) {
if (order == MAX_PAGE_ORDER && __free_unaccepted(page))