From: Lorenzo Stoakes Date: Mon, 21 Jul 2025 15:55:30 +0000 (+0100) Subject: docs: update THP documentation to clarify sysfs "never" setting X-Git-Url: http://git.ipfire.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=a27848a03504f140ea14b6b0f711c1a7c722f698;p=thirdparty%2Flinux.git docs: update THP documentation to clarify sysfs "never" setting Rather confusingly, setting all Transparent Huge Page sysfs settings to "never" does not in fact result in THP being globally disabled. Rather, it results in khugepaged being disabled, but one can still obtain THP pages using madvise(..., MADV_COLLAPSE). This is something that has remained poorly documented for some time, and it is likely the received wisdom of most users of THP that never does, in fact, mean never. It is therefore important to highlight, very clearly, that this is not the case. [lorenzo.stoakes@oracle.com: update transhuge page to mention MADV_COLLAPSE] Link: https://lkml.kernel.org/r/d54d1dfb-f06d-4979-983b-73998f05867e@lucifer.local Link: https://lkml.kernel.org/r/20250721155530.75944-1-lorenzo.stoakes@oracle.com Signed-off-by: Lorenzo Stoakes Acked-by: SeongJae Park Reviewed-by: Zi Yan Reviewed-by: Baolin Wang Reviewed-by: Barry Song Acked-by: David Hildenbrand Cc: Dev Jain Cc: Jonathan Corbet Cc: Liam Howlett Cc: Mariano Pache Cc: Ryan Roberts Signed-off-by: Andrew Morton --- diff --git a/Documentation/admin-guide/mm/transhuge.rst b/Documentation/admin-guide/mm/transhuge.rst index dff8d5985f0f2..370fba1134606 100644 --- a/Documentation/admin-guide/mm/transhuge.rst +++ b/Documentation/admin-guide/mm/transhuge.rst @@ -107,7 +107,7 @@ sysfs Global THP controls ------------------- -Transparent Hugepage Support for anonymous memory can be entirely disabled +Transparent Hugepage Support for anonymous memory can be disabled (mostly for debugging purposes) or only enabled inside MADV_HUGEPAGE regions (to avoid the risk of consuming more memory resources) or enabled system wide. This can be achieved per-supported-THP-size with one of:: @@ -119,6 +119,11 @@ system wide. This can be achieved per-supported-THP-size with one of:: where is the hugepage size being addressed, the available sizes for which vary by system. +.. note:: Setting "never" in all sysfs THP controls does **not** disable + Transparent Huge Pages globally. This is because ``madvise(..., + MADV_COLLAPSE)`` ignores these settings and collapses ranges to + PMD-sized huge pages unconditionally. + For example:: echo always >/sys/kernel/mm/transparent_hugepage/hugepages-2048kB/enabled @@ -187,7 +192,9 @@ madvise behaviour. never - should be self-explanatory. + should be self-explanatory. Note that ``madvise(..., + MADV_COLLAPSE)`` can still cause transparent huge pages to be + obtained even if this mode is specified everywhere. By default kernel tries to use huge, PMD-mappable zero page on read page fault to anonymous mapping. It's possible to disable huge zero @@ -378,7 +385,9 @@ always Attempt to allocate huge pages every time we need a new page; never - Do not allocate huge pages; + Do not allocate huge pages. Note that ``madvise(..., MADV_COLLAPSE)`` + can still cause transparent huge pages to be obtained even if this mode + is specified everywhere; within_size Only allocate huge page if it will be fully within i_size. @@ -434,7 +443,9 @@ inherit have enabled="inherit" and all other hugepage sizes have enabled="never"; never - Do not allocate huge pages; + Do not allocate huge pages. Note that ``madvise(..., + MADV_COLLAPSE)`` can still cause transparent huge pages to be obtained + even if this mode is specified everywhere; within_size Only allocate huge page if it will be fully within i_size.