From: Barry Song Date: Thu, 26 Sep 2024 21:19:36 +0000 (+1200) Subject: mm: avoid unconditional one-tick sleep when swapcache_prepare fails X-Git-Tag: v6.12-rc6~41^2 X-Git-Url: http://git.ipfire.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=01626a18230246efdcea322aa8f067e60ffe5ccd;p=thirdparty%2Fkernel%2Flinux.git mm: avoid unconditional one-tick sleep when swapcache_prepare fails Commit 13ddaf26be32 ("mm/swap: fix race when skipping swapcache") introduced an unconditional one-tick sleep when `swapcache_prepare()` fails, which has led to reports of UI stuttering on latency-sensitive Android devices. To address this, we can use a waitqueue to wake up tasks that fail `swapcache_prepare()` sooner, instead of always sleeping for a full tick. While tasks may occasionally be woken by an unrelated `do_swap_page()`, this method is preferable to two scenarios: rapid re-entry into page faults, which can cause livelocks, and multiple millisecond sleeps, which visibly degrade user experience. Oven's testing shows that a single waitqueue resolves the UI stuttering issue. If a 'thundering herd' problem becomes apparent later, a waitqueue hash similar to `folio_wait_table[PAGE_WAIT_TABLE_SIZE]` for page bit locks can be introduced. [v-songbaohua@oppo.com: wake_up only when swapcache_wq waitqueue is active] Link: https://lkml.kernel.org/r/20241008130807.40833-1-21cnbao@gmail.com Link: https://lkml.kernel.org/r/20240926211936.75373-1-21cnbao@gmail.com Fixes: 13ddaf26be32 ("mm/swap: fix race when skipping swapcache") Signed-off-by: Barry Song Reported-by: Oven Liyang Tested-by: Oven Liyang Cc: Kairui Song Cc: "Huang, Ying" Cc: Yu Zhao Cc: David Hildenbrand Cc: Chris Li Cc: Hugh Dickins Cc: Johannes Weiner Cc: Matthew Wilcox (Oracle) Cc: Michal Hocko Cc: Minchan Kim Cc: Yosry Ahmed Cc: SeongJae Park Cc: Kalesh Singh Cc: Suren Baghdasaryan Cc: Signed-off-by: Andrew Morton --- diff --git a/mm/memory.c b/mm/memory.c index 3ccee51adfbbd..bdf77a3ec47bc 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -4187,6 +4187,8 @@ static struct folio *alloc_swap_folio(struct vm_fault *vmf) } #endif /* CONFIG_TRANSPARENT_HUGEPAGE */ +static DECLARE_WAIT_QUEUE_HEAD(swapcache_wq); + /* * We enter with non-exclusive mmap_lock (to exclude vma changes, * but allow concurrent faults), and pte mapped but not yet locked. @@ -4199,6 +4201,7 @@ vm_fault_t do_swap_page(struct vm_fault *vmf) { struct vm_area_struct *vma = vmf->vma; struct folio *swapcache, *folio = NULL; + DECLARE_WAITQUEUE(wait, current); struct page *page; struct swap_info_struct *si = NULL; rmap_t rmap_flags = RMAP_NONE; @@ -4297,7 +4300,9 @@ vm_fault_t do_swap_page(struct vm_fault *vmf) * Relax a bit to prevent rapid * repeated page faults. */ + add_wait_queue(&swapcache_wq, &wait); schedule_timeout_uninterruptible(1); + remove_wait_queue(&swapcache_wq, &wait); goto out_page; } need_clear_cache = true; @@ -4604,8 +4609,11 @@ unlock: pte_unmap_unlock(vmf->pte, vmf->ptl); out: /* Clear the swap cache pin for direct swapin after PTL unlock */ - if (need_clear_cache) + if (need_clear_cache) { swapcache_clear(si, entry, nr_pages); + if (waitqueue_active(&swapcache_wq)) + wake_up(&swapcache_wq); + } if (si) put_swap_device(si); return ret; @@ -4620,8 +4628,11 @@ out_release: folio_unlock(swapcache); folio_put(swapcache); } - if (need_clear_cache) + if (need_clear_cache) { swapcache_clear(si, entry, nr_pages); + if (waitqueue_active(&swapcache_wq)) + wake_up(&swapcache_wq); + } if (si) put_swap_device(si); return ret;