From: Greg Kroah-Hartman
Date: Sat, 3 May 2014 18:30:12 +0000 (-0400)
Subject: 3.10-stable patches
X-Git-Tag: v3.4.89~15
X-Git-Url: http://git.ipfire.org/gitweb.cgi?a=commitdiff_plain;h=6df70ace0c522098103456b7b6804c4fbf8c7912;p=thirdparty%2Fkernel%2Fstable-queue.git

3.10-stable patches

added patches:
    hung_task-check-the-value-of-sysctl_hung_task_timeout_sec.patch
    mm-hugetlb-fix-softlockup-when-a-large-number-of-hugepages-are-freed.patch
    mm-try_to_unmap_cluster-should-lock_page-before-mlocking.patch
    mm-vmscan-do-not-swap-anon-pages-just-because-free-file-is-low.patch
    sh-fix-format-string-bug-in-stack-tracer.patch
---

diff --git a/queue-3.10/hung_task-check-the-value-of-sysctl_hung_task_timeout_sec.patch b/queue-3.10/hung_task-check-the-value-of-sysctl_hung_task_timeout_sec.patch
new file mode 100644
index 00000000000..ff3a4508a90
--- /dev/null
+++ b/queue-3.10/hung_task-check-the-value-of-sysctl_hung_task_timeout_sec.patch
@@ -0,0 +1,55 @@
+From 80df28476505ed4e6701c3448c63c9229a50c655 Mon Sep 17 00:00:00 2001
+From: Liu Hua
+Date: Mon, 7 Apr 2014 15:38:57 -0700
+Subject: hung_task: check the value of "sysctl_hung_task_timeout_sec"
+
+From: Liu Hua
+
+commit 80df28476505ed4e6701c3448c63c9229a50c655 upstream.
+
+As sysctl_hung_task_timeout_sec is unsigned long, when this value is
+larger than LONG_MAX/HZ, the function schedule_timeout_interruptible in
+watchdog will return immediately without sleeping and will print:
+
+    schedule_timeout: wrong timeout value ffffffffffffff83
+
+and then the function watchdog will call schedule_timeout_interruptible
+again and again. The screen will be filled with
+
+    "schedule_timeout: wrong timeout value ffffffffffffff83"
+
+This patch adds a check and correction in sysctl so that the function
+schedule_timeout_interruptible always gets a valid parameter.
+
+Signed-off-by: Liu Hua
+Tested-by: Satoru Takeuchi
+Signed-off-by: Andrew Morton
+Signed-off-by: Linus Torvalds
+Signed-off-by: Greg Kroah-Hartman
+
+---
+ kernel/sysctl.c | 6 ++++++
+ 1 file changed, 6 insertions(+)
+
+--- a/kernel/sysctl.c
++++ b/kernel/sysctl.c
+@@ -144,6 +144,11 @@ static int min_percpu_pagelist_fract = 8
+ static int ngroups_max = NGROUPS_MAX;
+ static const int cap_last_cap = CAP_LAST_CAP;
+
++/*this is needed for proc_doulongvec_minmax of sysctl_hung_task_timeout_secs */
++#ifdef CONFIG_DETECT_HUNG_TASK
++static unsigned long hung_task_timeout_max = (LONG_MAX/HZ);
++#endif
++
+ #ifdef CONFIG_INOTIFY_USER
+ #include <linux/inotify.h>
+ #endif
+@@ -966,6 +971,7 @@ static struct ctl_table kern_table[] = {
+         .maxlen = sizeof(unsigned long),
+         .mode = 0644,
+         .proc_handler = proc_dohung_task_timeout_secs,
++        .extra2 = &hung_task_timeout_max,
+     },
+     {
+         .procname = "hung_task_warnings",
diff --git a/queue-3.10/mm-hugetlb-fix-softlockup-when-a-large-number-of-hugepages-are-freed.patch b/queue-3.10/mm-hugetlb-fix-softlockup-when-a-large-number-of-hugepages-are-freed.patch
new file mode 100644
index 00000000000..e7dfcbf4765
--- /dev/null
+++ b/queue-3.10/mm-hugetlb-fix-softlockup-when-a-large-number-of-hugepages-are-freed.patch
@@ -0,0 +1,89 @@
+From 55f67141a8927b2be3e51840da37b8a2320143ed Mon Sep 17 00:00:00 2001
+From: "Mizuma, Masayoshi"
+Date: Mon, 7 Apr 2014 15:37:54 -0700
+Subject: mm: hugetlb: fix softlockup when a large number of hugepages are freed.
+
+From: "Mizuma, Masayoshi"
+
+commit 55f67141a8927b2be3e51840da37b8a2320143ed upstream.
+
+When I decrease the value of nr_hugepages in procfs a lot, a softlockup
+happens. This is because there is no chance of a context switch during
+this process.
+
+On the other hand, when I allocate a large number of hugepages, there
+is some chance of a context switch, so a softlockup doesn't happen
+during that process. It is therefore necessary to add a context switch
+to the freeing process, as in the allocation process, to avoid the
+softlockup.
+
+When I freed 12 TB of hugepages with kernel-2.6.32-358.el6, the freeing
+process occupied a CPU for over 150 seconds and the following softlockup
+message appeared twice or more.
+
+$ echo 6000000 > /proc/sys/vm/nr_hugepages
+$ cat /proc/sys/vm/nr_hugepages
+6000000
+$ grep ^Huge /proc/meminfo
+HugePages_Total:   6000000
+HugePages_Free:    6000000
+HugePages_Rsvd:          0
+HugePages_Surp:          0
+Hugepagesize:      2048 kB
+$ echo 0 > /proc/sys/vm/nr_hugepages
+
+BUG: soft lockup - CPU#16 stuck for 67s! [sh:12883] ...
+Pid: 12883, comm: sh Not tainted 2.6.32-358.el6.x86_64 #1
+Call Trace:
+  free_pool_huge_page+0xb8/0xd0
+  set_max_huge_pages+0x128/0x190
+  hugetlb_sysctl_handler_common+0x113/0x140
+  hugetlb_sysctl_handler+0x1e/0x20
+  proc_sys_call_handler+0x97/0xd0
+  proc_sys_write+0x14/0x20
+  vfs_write+0xb8/0x1a0
+  sys_write+0x51/0x90
+  __audit_syscall_exit+0x265/0x290
+  system_call_fastpath+0x16/0x1b
+
+I have not confirmed this problem with upstream kernels because I am
+not able to prepare a machine equipped with 12TB of memory right now.
+However, I confirmed that the required time was directly proportional
+to the number of hugepages being freed.
+
+I measured the required time on a smaller machine. It showed that
+130-145 hugepages were freed per millisecond.
+
+  Amount of decreasing     Required time     Decreasing rate
+  hugepages                (msec)            (pages/msec)
+  ------------------------------------------------------------
+  10,000 pages == 20GB     70 - 74           135-142
+  30,000 pages == 60GB     208 - 229         131-144
+
+At this rate, freeing 6TB of hugepages would trigger a softlockup with
+the default threshold of 20sec.
+
+Signed-off-by: Masayoshi Mizuma
+Cc: Joonsoo Kim
+Cc: Michal Hocko
+Cc: Wanpeng Li
+Cc: Aneesh Kumar
+Cc: KOSAKI Motohiro
+Cc: Naoya Horiguchi
+Signed-off-by: Andrew Morton
+Signed-off-by: Linus Torvalds
+Signed-off-by: Greg Kroah-Hartman
+
+---
+ mm/hugetlb.c | 1 +
+ 1 file changed, 1 insertion(+)
+
+--- a/mm/hugetlb.c
++++ b/mm/hugetlb.c
+@@ -1487,6 +1487,7 @@ static unsigned long set_max_huge_pages(
+     while (min_count < persistent_huge_pages(h)) {
+         if (!free_pool_huge_page(h, nodes_allowed, 0))
+             break;
++        cond_resched_lock(&hugetlb_lock);
+     }
+     while (count < persistent_huge_pages(h)) {
+         if (!adjust_pool_surplus(h, nodes_allowed, 1))
diff --git a/queue-3.10/mm-try_to_unmap_cluster-should-lock_page-before-mlocking.patch b/queue-3.10/mm-try_to_unmap_cluster-should-lock_page-before-mlocking.patch
new file mode 100644
index 00000000000..f518b14cb30
--- /dev/null
+++ b/queue-3.10/mm-try_to_unmap_cluster-should-lock_page-before-mlocking.patch
@@ -0,0 +1,90 @@
+From 57e68e9cd65b4b8eb4045a1e0d0746458502554c Mon Sep 17 00:00:00 2001
+From: Vlastimil Babka
+Date: Mon, 7 Apr 2014 15:37:50 -0700
+Subject: mm: try_to_unmap_cluster() should lock_page() before mlocking
+
+From: Vlastimil Babka
+
+commit 57e68e9cd65b4b8eb4045a1e0d0746458502554c upstream.
+
+A BUG_ON(!PageLocked) was triggered in mlock_vma_page() by Sasha Levin
+fuzzing with trinity. The call site try_to_unmap_cluster() does not
+lock any pages other than its check_page parameter (which is already
+locked).
+
+The BUG_ON in mlock_vma_page() is not documented and its purpose is
+somewhat unclear, but apparently it serializes against page migration,
+which could otherwise fail to transfer the PG_mlocked flag. This would
+not be fatal, as the page would be eventually encountered again, but
+NR_MLOCK accounting would become distorted nevertheless. This patch
+adds a comment to the BUG_ON in mlock_vma_page() and munlock_vma_page()
+to that effect.
+
+The call site try_to_unmap_cluster() is fixed so that for page !=
+check_page, trylock_page() is attempted (to avoid possible deadlocks as
+we already have check_page locked) and mlock_vma_page() is performed
+only upon success. If the page lock cannot be obtained, the page is
+left without PG_mlocked, which is again not a problem in the whole
+unevictable memory design.
+
+Signed-off-by: Vlastimil Babka
+Signed-off-by: Bob Liu
+Reported-by: Sasha Levin
+Cc: Wanpeng Li
+Cc: Michel Lespinasse
+Cc: KOSAKI Motohiro
+Acked-by: Rik van Riel
+Cc: David Rientjes
+Cc: Mel Gorman
+Cc: Hugh Dickins
+Cc: Joonsoo Kim
+Signed-off-by: Andrew Morton
+Signed-off-by: Linus Torvalds
+Signed-off-by: Greg Kroah-Hartman
+
+---
+ mm/mlock.c | 2 ++
+ mm/rmap.c  | 14 ++++++++++++--
+ 2 files changed, 14 insertions(+), 2 deletions(-)
+
+--- a/mm/mlock.c
++++ b/mm/mlock.c
+@@ -76,6 +76,7 @@ void clear_page_mlock(struct page *page)
+  */
+ void mlock_vma_page(struct page *page)
+ {
++    /* Serialize with page migration */
+     BUG_ON(!PageLocked(page));
+
+     if (!TestSetPageMlocked(page)) {
+@@ -106,6 +107,7 @@ unsigned int munlock_vma_page(struct pag
+ {
+     unsigned int page_mask = 0;
+
++    /* For try_to_munlock() and to serialize with page migration */
+     BUG_ON(!PageLocked(page));
+
+     if (TestClearPageMlocked(page)) {
+--- a/mm/rmap.c
++++ b/mm/rmap.c
+@@ -1390,9 +1390,19 @@ static int try_to_unmap_cluster(unsigned
+         BUG_ON(!page || PageAnon(page));
+
+         if (locked_vma) {
+-            mlock_vma_page(page);   /* no-op if already mlocked */
+-            if (page == check_page)
++            if (page == check_page) {
++                /* we know we have check_page locked */
++                mlock_vma_page(page);
+                 ret = SWAP_MLOCK;
++            } else if (trylock_page(page)) {
++                /*
++                 * If we can lock the page, perform mlock.
++                 * Otherwise leave the page alone, it will be
++                 * eventually encountered again later.
++                 */
++                mlock_vma_page(page);
++                unlock_page(page);
++            }
+             continue;   /* don't unmap */
+         }
+
diff --git a/queue-3.10/mm-vmscan-do-not-swap-anon-pages-just-because-free-file-is-low.patch b/queue-3.10/mm-vmscan-do-not-swap-anon-pages-just-because-free-file-is-low.patch
new file mode 100644
index 00000000000..509e46524d5
--- /dev/null
+++ b/queue-3.10/mm-vmscan-do-not-swap-anon-pages-just-because-free-file-is-low.patch
@@ -0,0 +1,73 @@
+From 0bf1457f0cfca7bc026a82323ad34bcf58ad035d Mon Sep 17 00:00:00 2001
+From: Johannes Weiner
+Date: Tue, 8 Apr 2014 16:04:10 -0700
+Subject: mm: vmscan: do not swap anon pages just because free+file is low
+
+From: Johannes Weiner
+
+commit 0bf1457f0cfca7bc026a82323ad34bcf58ad035d upstream.
+
+Page reclaim force-scans / swaps anonymous pages when file cache drops
+below the high watermark of a zone in order to prevent what little
+cache remains from thrashing.
+
+However, on bigger machines the high watermark value can be quite large
+and when the workload is dominated by a static anonymous/shmem set, the
+file set might just be a small window of used-once cache. In such
+situations, the VM starts swapping heavily when instead it should be
+recycling the no longer used cache.
+
+This is a long-standing problem, but it's more likely to trigger after
+commit 81c0a2bb515f ("mm: page_alloc: fair zone allocator policy")
+because file pages can no longer accumulate in a single zone and are
+dispersed into smaller fractions among the available zones.
+
+To resolve this, do not force-scan anon pages when file pages are low,
+but instead rely on the scan/rotation ratios to make the right
+prediction.
+
+Signed-off-by: Johannes Weiner
+Acked-by: Rafael Aquini
+Cc: Rik van Riel
+Cc: Mel Gorman
+Cc: Hugh Dickins
+Cc: Suleiman Souhlal
+Signed-off-by: Andrew Morton
+Signed-off-by: Linus Torvalds
+Signed-off-by: Greg Kroah-Hartman
+
+---
+ mm/vmscan.c | 16 +---------------
+ 1 file changed, 1 insertion(+), 15 deletions(-)
+
+--- a/mm/vmscan.c
++++ b/mm/vmscan.c
+@@ -1659,7 +1659,7 @@ static void get_scan_count(struct lruvec
+     struct zone *zone = lruvec_zone(lruvec);
+     unsigned long anon_prio, file_prio;
+     enum scan_balance scan_balance;
+-    unsigned long anon, file, free;
++    unsigned long anon, file;
+     bool force_scan = false;
+     unsigned long ap, fp;
+     enum lru_list lru;
+@@ -1713,20 +1713,6 @@ static void get_scan_count(struct lruvec
+         get_lru_size(lruvec, LRU_INACTIVE_FILE);
+
+     /*
+-     * If it's foreseeable that reclaiming the file cache won't be
+-     * enough to get the zone back into a desirable shape, we have
+-     * to swap. Better start now and leave the - probably heavily
+-     * thrashing - remaining file pages alone.
+-     */
+-    if (global_reclaim(sc)) {
+-        free = zone_page_state(zone, NR_FREE_PAGES);
+-        if (unlikely(file + free <= high_wmark_pages(zone))) {
+-            scan_balance = SCAN_ANON;
+-            goto out;
+-        }
+-    }
+-
+-    /*
+      * There is enough inactive page cache, do not reclaim
+      * anything from the anonymous working set right now.
+      */
diff --git a/queue-3.10/series b/queue-3.10/series
index 636b6dbff4c..fa931cc89fc 100644
--- a/queue-3.10/series
+++ b/queue-3.10/series
@@ -75,3 +75,8 @@ hvc-ensure-hvc_init-is-only-ever-called-once-in-hvc_console.c.patch
 usb-phy-add-ulpi-ids-for-smsc-usb3320-and-ti-tusb1210.patch
 usb-unbind-all-interfaces-before-rebinding-any.patch
 mtip32xx-set-queue-bounce-limit.patch
+sh-fix-format-string-bug-in-stack-tracer.patch
+mm-try_to_unmap_cluster-should-lock_page-before-mlocking.patch
+mm-hugetlb-fix-softlockup-when-a-large-number-of-hugepages-are-freed.patch
+mm-vmscan-do-not-swap-anon-pages-just-because-free-file-is-low.patch
+hung_task-check-the-value-of-sysctl_hung_task_timeout_sec.patch
diff --git a/queue-3.10/sh-fix-format-string-bug-in-stack-tracer.patch b/queue-3.10/sh-fix-format-string-bug-in-stack-tracer.patch
new file mode 100644
index 00000000000..142ff0b7463
--- /dev/null
+++ b/queue-3.10/sh-fix-format-string-bug-in-stack-tracer.patch
@@ -0,0 +1,40 @@
+From a0c32761e73c9999cbf592b702f284221fea8040 Mon Sep 17 00:00:00 2001
+From: Matt Fleming
+Date: Thu, 3 Apr 2014 14:46:20 -0700
+Subject: sh: fix format string bug in stack tracer
+
+From: Matt Fleming
+
+commit a0c32761e73c9999cbf592b702f284221fea8040 upstream.
+
+Kees reported the following error:
+
+  arch/sh/kernel/dumpstack.c: In function 'print_trace_address':
+  arch/sh/kernel/dumpstack.c:118:2: error: format not a string literal and no format arguments [-Werror=format-security]
+
+Use the "%s" format so that it's impossible to interpret 'data' as a
+format string.
+
+Signed-off-by: Matt Fleming
+Reported-by: Kees Cook
+Acked-by: Kees Cook
+Cc: Paul Mundt
+Signed-off-by: Andrew Morton
+Signed-off-by: Linus Torvalds
+Signed-off-by: Greg Kroah-Hartman
+
+---
+ arch/sh/kernel/dumpstack.c | 2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+
+--- a/arch/sh/kernel/dumpstack.c
++++ b/arch/sh/kernel/dumpstack.c
+@@ -115,7 +115,7 @@ static int print_trace_stack(void *data,
+  */
+ static void print_trace_address(void *data, unsigned long addr, int reliable)
+ {
+-    printk(data);
++    printk("%s", (char *)data);
+     printk_address(addr, reliable);
+ }
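
As background for the dumpstack.c change above, the following is a
minimal, hypothetical userspace sketch of the bug class that
-Werror=format-security catches. It is not part of the patch, and
printf() merely stands in for printk(): a non-literal format argument
lets any '%' sequence in the data be parsed as a conversion
specification that consumes varargs which were never passed, while the
constant "%s" format prints the data verbatim.

/* format_string_demo.c - hypothetical illustration, not kernel code */
#include <stdio.h>

/* BAD: 'data' itself is the format string. If it contains e.g. "%s",
 * printf() reads a vararg that was never passed. This is the pattern
 * the compiler rejects under -Werror=format-security. */
void print_trace_address_bad(const char *data)
{
    printf(data);
}

/* GOOD: a constant "%s" format guarantees 'data' is printed verbatim. */
void print_trace_address_good(const char *data)
{
    printf("%s", data);
}

int main(void)
{
    const char *data = "100%s done";  /* caller-supplied text */

    print_trace_address_good(data);   /* prints: 100%s done */
    putchar('\n');
    /* print_trace_address_bad(data) would interpret the embedded "%s"
     * and read garbage from the stack, possibly crashing. */
    return 0;
}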
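
The hung_task patch at the top of the series guards against a related
arithmetic trap. A hypothetical userspace sketch, not taken from the
kernel, may make the clamp value clearer: the hung-task watchdog sleeps
for timeout * HZ jiffies, computed in unsigned long arithmetic, so any
timeout above LONG_MAX/HZ wraps to a negative signed long that
schedule_timeout() rejects with "wrong timeout value". That is exactly
the maximum the patch installs via extra2 = &hung_task_timeout_max. The
HZ value below is an assumption for illustration; real kernels are
built with values such as 100, 250, or 1000.

/* timeout_overflow_demo.c - hypothetical illustration, not kernel code */
#include <limits.h>
#include <stdio.h>

#define HZ 250  /* assumed tick rate for this sketch */

int main(void)
{
    unsigned long max_secs = LONG_MAX / HZ; /* the cap the patch enforces */
    unsigned long too_big = max_secs + 1;

    /* The product wraps modulo the word size; schedule_timeout() would
     * see it as a signed long and reject negative values. */
    long ok = (long)(max_secs * HZ);
    long bad = (long)(too_big * HZ);

    printf("largest safe timeout: %lu s\n", max_secs);
    printf("max_secs * HZ = %ld (valid, positive)\n", ok);
    printf("too_big  * HZ = %ld (invalid, negative)\n", bad);
    return 0;
}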