From: Greg Kroah-Hartman
Date: Sat, 2 Dec 2017 09:43:53 +0000 (+0000)
Subject: 4.14-stable patches
X-Git-Tag: v3.18.86~25
X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=1d671f69d5f8d8d0c9d2114b3d41cb302bdd3369;p=thirdparty%2Fkernel%2Fstable-queue.git

4.14-stable patches

added patches:
      mm-memory_hotplug-do-not-back-off-draining-pcp-free-pages-from-kworker-context.patch
      mm-oom_reaper-gather-each-vma-to-prevent-leaking-tlb-entry.patch
---

diff --git a/queue-4.14/mm-memory_hotplug-do-not-back-off-draining-pcp-free-pages-from-kworker-context.patch b/queue-4.14/mm-memory_hotplug-do-not-back-off-draining-pcp-free-pages-from-kworker-context.patch
new file mode 100644
index 00000000000..5d248a43687
--- /dev/null
+++ b/queue-4.14/mm-memory_hotplug-do-not-back-off-draining-pcp-free-pages-from-kworker-context.patch
@@ -0,0 +1,64 @@
+From 4b81cb2ff69c8a8e297a147d2eb4d9b5e8d7c435 Mon Sep 17 00:00:00 2001
+From: Michal Hocko
+Date: Wed, 29 Nov 2017 16:09:54 -0800
+Subject: mm, memory_hotplug: do not back off draining pcp free pages from kworker context
+
+From: Michal Hocko
+
+commit 4b81cb2ff69c8a8e297a147d2eb4d9b5e8d7c435 upstream.
+
+drain_all_pages backs off when called from a kworker context since
+commit 0ccce3b92421 ("mm, page_alloc: drain per-cpu pages from workqueue
+context") because the original IPI based pcp draining has been replaced
+by a WQ based one and the check was meant to prevent recursion and
+inter-worker dependencies.  This made some sense at the time because
+the system WQ was used, and one worker holding the lock could be
+blocked while waiting for new workers to emerge, which can be a problem
+under OOM conditions.
+
+Since then commit ce612879ddc7 ("mm: move pcp and lru-pcp draining into
+single wq") has moved draining to a dedicated WQ (mm_percpu_wq) with a
+rescuer, so we no longer depend on any other WQ activity to make
+forward progress.  Calling drain_all_pages from a worker context is
+therefore safe as long as this doesn't happen from mm_percpu_wq itself,
+which is not the case because all its workers are required to _not_
+depend on any MM locks.
+
+Why is this a problem in the first place?  ACPI driven memory
+hot-remove (acpi_device_hotplug) is executed from a worker context.  We
+end up calling __offline_pages to free all the pages and that requires
+both lru_add_drain_all_cpuslocked and drain_all_pages to do their job;
+otherwise we can have dangling pages on pcp lists and fail the offline
+operation (__test_page_isolated_in_pageblock would see a page with a 0
+ref count but without PageBuddy set).
+
+Fix the issue by removing the worker check in drain_all_pages.
+lru_add_drain_all_cpuslocked doesn't have this restriction, so it works
+as expected.
+
+Link: http://lkml.kernel.org/r/20170828093341.26341-1-mhocko@kernel.org
+Fixes: 0ccce3b924212 ("mm, page_alloc: drain per-cpu pages from workqueue context")
+Signed-off-by: Michal Hocko
+Cc: Mel Gorman
+Cc: Tejun Heo
+Signed-off-by: Andrew Morton
+Signed-off-by: Linus Torvalds
+Signed-off-by: Greg Kroah-Hartman
+
+---
+ mm/page_alloc.c | 4 ----
+ 1 file changed, 4 deletions(-)
+
+--- a/mm/page_alloc.c
++++ b/mm/page_alloc.c
+@@ -2487,10 +2487,6 @@ void drain_all_pages(struct zone *zone)
+ 	if (WARN_ON_ONCE(!mm_percpu_wq))
+ 		return;
+ 
+-	/* Workqueues cannot recurse */
+-	if (current->flags & PF_WQ_WORKER)
+-		return;
+-
+ 	/*
+ 	 * Do not drain if one is already in progress unless it's specific to
+ 	 * a zone.  Such callers are primarily CMA and memory hotplug and need
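
[ Context, not part of the queued patch: the argument above rests on
  mm_percpu_wq having a rescuer thread.  Below is a minimal sketch of
  that allocation pattern; the example_* names are hypothetical, while
  alloc_workqueue() and WQ_MEM_RECLAIM are the real workqueue API, and
  mm_percpu_wq itself is allocated this way since commit ce612879ddc7. ]

	#include <linux/workqueue.h>

	static struct workqueue_struct *example_pcp_wq;

	static int __init example_pcp_wq_init(void)
	{
		/*
		 * WQ_MEM_RECLAIM guarantees a rescuer thread, so work
		 * queued here can always make forward progress even when
		 * no new kworkers can be created (e.g. under OOM), which
		 * is what makes the removed PF_WQ_WORKER check in
		 * drain_all_pages unnecessary.
		 */
		example_pcp_wq = alloc_workqueue("example_pcp_wq",
						 WQ_MEM_RECLAIM, 0);
		return example_pcp_wq ? 0 : -ENOMEM;
	}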
diff --git a/queue-4.14/mm-oom_reaper-gather-each-vma-to-prevent-leaking-tlb-entry.patch b/queue-4.14/mm-oom_reaper-gather-each-vma-to-prevent-leaking-tlb-entry.patch
new file mode 100644
index 00000000000..0610c22fffe
--- /dev/null
+++ b/queue-4.14/mm-oom_reaper-gather-each-vma-to-prevent-leaking-tlb-entry.patch
@@ -0,0 +1,86 @@
+From 687cb0884a714ff484d038e9190edc874edcf146 Mon Sep 17 00:00:00 2001
+From: Wang Nan
+Date: Wed, 29 Nov 2017 16:09:58 -0800
+Subject: mm, oom_reaper: gather each vma to prevent leaking TLB entry
+
+From: Wang Nan
+
+commit 687cb0884a714ff484d038e9190edc874edcf146 upstream.
+
+tlb_gather_mmu(&tlb, mm, 0, -1) means gathering the whole virtual
+memory space.  In this case, tlb->fullmm is true.  Some archs, such as
+arm64, don't flush the TLB when tlb->fullmm is true:
+
+  commit 5a7862e83000 ("arm64: tlbflush: avoid flushing when fullmm == 1"),
+
+which causes TLB entries to be leaked.
+
+Will clarified his patch:
+ "Basically, we tag each address space with an ASID (PCID on x86) which
+  is resident in the TLB. This means we can elide TLB invalidation when
+  pulling down a full mm because we won't ever assign that ASID to
+  another mm without doing TLB invalidation elsewhere (which actually
+  just nukes the whole TLB).
+
+  I think that means that we could potentially not fault on a kernel
+  uaccess, because we could hit in the TLB"
+
+There can be a window between complete_signal() sending an IPI to other
+cores and the moment when all threads sharing this mm have actually
+been kicked off their cores.  In this window, the oom reaper may call
+tlb_flush_mmu_tlbonly() to flush the TLB and then free pages.  However,
+due to the above problem, the TLB entries are not really flushed on
+arm64, so other threads can still access these pages through stale TLB
+entries.  Moreover, a copy_to_user() can also write to these pages
+without generating a page fault, causing use-after-free bugs.
+
+This patch gathers each vma instead of gathering the full vm space, so
+tlb->fullmm is not true.  The behavior of the oom reaper becomes
+similar to munmapping before do_exit, which should be safe for all
+archs.
+
+Link: http://lkml.kernel.org/r/20171107095453.179940-1-wangnan0@huawei.com
+Fixes: aac453635549 ("mm, oom: introduce oom reaper")
+Signed-off-by: Wang Nan
+Acked-by: Michal Hocko
+Acked-by: David Rientjes
+Cc: Minchan Kim
+Cc: Will Deacon
+Cc: Bob Liu
+Cc: Ingo Molnar
+Cc: Roman Gushchin
+Cc: Konstantin Khlebnikov
+Cc: Andrea Arcangeli
+Signed-off-by: Andrew Morton
+Signed-off-by: Linus Torvalds
+Signed-off-by: Greg Kroah-Hartman
+
+---
+ mm/oom_kill.c | 7 ++++---
+ 1 file changed, 4 insertions(+), 3 deletions(-)
+
+--- a/mm/oom_kill.c
++++ b/mm/oom_kill.c
+@@ -532,7 +532,6 @@ static bool __oom_reap_task_mm(struct ta
+ 	 */
+ 	set_bit(MMF_UNSTABLE, &mm->flags);
+ 
+-	tlb_gather_mmu(&tlb, mm, 0, -1);
+ 	for (vma = mm->mmap ; vma; vma = vma->vm_next) {
+ 		if (!can_madv_dontneed_vma(vma))
+ 			continue;
+@@ -547,11 +546,13 @@ static bool __oom_reap_task_mm(struct ta
+ 		 * we do not want to block exit_mmap by keeping mm ref
+ 		 * count elevated without a good reason.
+ 		 */
+-		if (vma_is_anonymous(vma) || !(vma->vm_flags & VM_SHARED))
++		if (vma_is_anonymous(vma) || !(vma->vm_flags & VM_SHARED)) {
++			tlb_gather_mmu(&tlb, mm, vma->vm_start, vma->vm_end);
+ 			unmap_page_range(&tlb, vma, vma->vm_start, vma->vm_end,
+ 					 NULL);
++			tlb_finish_mmu(&tlb, vma->vm_start, vma->vm_end);
++		}
+ 	}
+-	tlb_finish_mmu(&tlb, 0, -1);
+ 	pr_info("oom_reaper: reaped process %d (%s), now anon-rss:%lukB, file-rss:%lukB, shmem-rss:%lukB\n",
+ 			task_pid_nr(tsk), tsk->comm,
+ 			K(get_mm_counter(mm, MM_ANONPAGES)),
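
[ Context, not part of the queued patch: a minimal sketch of the two
  mmu_gather patterns contrasted above, using the 4.14-era
  tlb_gather_mmu()/tlb_finish_mmu() signatures.  The reap_sketch()
  wrapper is hypothetical; the real logic lives in __oom_reap_task_mm()
  as shown in the diff. ]

	#include <linux/mm.h>
	#include <asm/tlb.h>

	static void reap_sketch(struct mm_struct *mm)
	{
		struct mmu_gather tlb;
		struct vm_area_struct *vma;

		/*
		 * Old pattern: one gather over the whole address space.
		 * start == 0, end == -1 makes tlb->fullmm true, so arm64
		 * (commit 5a7862e83000) elides the flush in
		 * tlb_finish_mmu() and stale TLB entries survive.
		 */
		tlb_gather_mmu(&tlb, mm, 0, -1);
		/* ... unmap_page_range() calls ... */
		tlb_finish_mmu(&tlb, 0, -1);

		/*
		 * New pattern: one bounded gather per VMA.  tlb->fullmm
		 * stays false, so each tlb_finish_mmu() really flushes
		 * the range it covers.
		 */
		for (vma = mm->mmap; vma; vma = vma->vm_next) {
			tlb_gather_mmu(&tlb, mm, vma->vm_start, vma->vm_end);
			/* ... unmap_page_range(&tlb, vma, ...) ... */
			tlb_finish_mmu(&tlb, vma->vm_start, vma->vm_end);
		}
	}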
diff --git a/queue-4.14/series b/queue-4.14/series
index e67efbcfa26..51d61ddee04 100644
--- a/queue-4.14/series
+++ b/queue-4.14/series
@@ -1 +1,3 @@
 platform-x86-hp-wmi-fix-tablet-mode-detection-for-convertibles.patch
+mm-memory_hotplug-do-not-back-off-draining-pcp-free-pages-from-kworker-context.patch
+mm-oom_reaper-gather-each-vma-to-prevent-leaking-tlb-entry.patch