From: Greg Kroah-Hartman
Date: Sat, 1 Mar 2014 01:02:38 +0000 (-0800)
Subject: 3.10-stable patches
X-Git-Tag: v3.10.33~41
X-Git-Url: http://git.ipfire.org/gitweb.cgi?a=commitdiff_plain;h=81ee9259cbc52e00e4a9c4afbeb2de55cdf62300;p=thirdparty%2Fkernel%2Fstable-queue.git

3.10-stable patches

added patches:
	cgroup-use-an-ordered-workqueue-for-cgroup-destruction.patch
	memcg-fix-endless-loop-caused-by-mem_cgroup_iter.patch
---

diff --git a/queue-3.10/cgroup-use-an-ordered-workqueue-for-cgroup-destruction.patch b/queue-3.10/cgroup-use-an-ordered-workqueue-for-cgroup-destruction.patch
new file mode 100644
index 00000000000..0ec6333948d
--- /dev/null
+++ b/queue-3.10/cgroup-use-an-ordered-workqueue-for-cgroup-destruction.patch
@@ -0,0 +1,54 @@
+From ab3f5faa6255a0eb4f832675507d9e295ca7e9ba Mon Sep 17 00:00:00 2001
+From: Hugh Dickins
+Date: Thu, 6 Feb 2014 15:56:01 -0800
+Subject: cgroup: use an ordered workqueue for cgroup destruction
+
+From: Hugh Dickins
+
+commit ab3f5faa6255a0eb4f832675507d9e295ca7e9ba upstream.
+
+Sometimes the cleanup after memcg hierarchy testing gets stuck in
+mem_cgroup_reparent_charges(), unable to bring non-kmem usage down to 0.
+
+There may turn out to be several causes, but a major cause is this: the
+workitem to offline parent can get run before workitem to offline child;
+parent's mem_cgroup_reparent_charges() circles around waiting for the
+child's pages to be reparented to its lrus, but it's holding cgroup_mutex
+which prevents the child from reaching its mem_cgroup_reparent_charges().
+
+Just use an ordered workqueue for cgroup_destroy_wq.
+
+tj: Committing as the temporary fix until the reverse dependency can
+    be removed from memcg. Comment updated accordingly.
+
+Fixes: e5fca243abae ("cgroup: use a dedicated workqueue for cgroup destruction")
+Suggested-by: Filipe Brandenburger
+Signed-off-by: Hugh Dickins
+Signed-off-by: Tejun Heo
+Signed-off-by: Greg Kroah-Hartman
+
+---
+ kernel/cgroup.c |    8 ++++++--
+ 1 file changed, 6 insertions(+), 2 deletions(-)
+
+--- a/kernel/cgroup.c
++++ b/kernel/cgroup.c
+@@ -4699,12 +4699,16 @@ static int __init cgroup_wq_init(void)
+ 	/*
+ 	 * There isn't much point in executing destruction path in
+ 	 * parallel. Good chunk is serialized with cgroup_mutex anyway.
+-	 * Use 1 for @max_active.
++	 *
++	 * XXX: Must be ordered to make sure parent is offlined after
++	 * children. The ordering requirement is for memcg where a
++	 * parent's offline may wait for a child's leading to deadlock. In
++	 * the long term, this should be fixed from memcg side.
+ 	 *
+ 	 * We would prefer to do this in cgroup_init() above, but that
+ 	 * is called before init_workqueues(): so leave this until after.
+ 	 */
+-	cgroup_destroy_wq = alloc_workqueue("cgroup_destroy", 0, 1);
++	cgroup_destroy_wq = alloc_ordered_workqueue("cgroup_destroy", 0);
+ 	BUG_ON(!cgroup_destroy_wq);
+ 	return 0;
+ }
diff --git a/queue-3.10/memcg-fix-endless-loop-caused-by-mem_cgroup_iter.patch b/queue-3.10/memcg-fix-endless-loop-caused-by-mem_cgroup_iter.patch
new file mode 100644
index 00000000000..33613cc7214
--- /dev/null
+++ b/queue-3.10/memcg-fix-endless-loop-caused-by-mem_cgroup_iter.patch
@@ -0,0 +1,77 @@
+From ecc736fc3c71c411a9d201d8588c9e7e049e5d8c Mon Sep 17 00:00:00 2001
+From: Michal Hocko
+Date: Thu, 23 Jan 2014 15:53:35 -0800
+Subject: memcg: fix endless loop caused by mem_cgroup_iter
+
+From: Michal Hocko
+
+commit ecc736fc3c71c411a9d201d8588c9e7e049e5d8c upstream.
+
+Hugh has reported an endless loop when the hardlimit reclaim sees the
+same group all the time. This might happen when the reclaim races with
+the memcg removal.
+
+shrink_zone
+                                                [rmdir root]
+  mem_cgroup_iter(root, NULL, reclaim)
+    // prev = NULL
+    rcu_read_lock()
+    mem_cgroup_iter_load
+      last_visited = iter->last_visited   // gets root || NULL
+      css_tryget(last_visited)            // failed
+      last_visited = NULL                 [1]
+    memcg = root = __mem_cgroup_iter_next(root, NULL)
+    mem_cgroup_iter_update
+      iter->last_visited = root;
+    reclaim->generation = iter->generation
+
+  mem_cgroup_iter(root, root, reclaim)
+    // prev = root
+    rcu_read_lock
+    mem_cgroup_iter_load
+      last_visited = iter->last_visited   // gets root
+      css_tryget(last_visited)            // failed
+    [1]
+
+The issue seemed to be introduced by commit 5f5781619718 ("memcg: relax
+memcg iter caching") which has replaced unconditional css_get/css_put by
+css_tryget/css_put for the cached iterator.
+
+This patch fixes the issue by skipping css_tryget on the root of the
+tree walk in mem_cgroup_iter_load and symmetrically doesn't release it
+in mem_cgroup_iter_update.
+
+Signed-off-by: Michal Hocko
+Reported-by: Hugh Dickins
+Tested-by: Hugh Dickins
+Cc: Johannes Weiner
+Cc: Greg Thelen
+Cc: [3.10+]
+Signed-off-by: Andrew Morton
+Signed-off-by: Linus Torvalds
+Signed-off-by: Greg Kroah-Hartman
+
+---
+ mm/memcontrol.c |    4 ++--
+ 1 file changed, 2 insertions(+), 2 deletions(-)
+
+--- a/mm/memcontrol.c
++++ b/mm/memcontrol.c
+@@ -1220,7 +1220,7 @@ struct mem_cgroup *mem_cgroup_iter(struc
+ 			if (dead_count == iter->last_dead_count) {
+ 				smp_rmb();
+ 				last_visited = iter->last_visited;
+-				if (last_visited &&
++				if (last_visited && last_visited != root &&
+ 				    !css_tryget(&last_visited->css))
+ 					last_visited = NULL;
+ 			}
+@@ -1229,7 +1229,7 @@ struct mem_cgroup *mem_cgroup_iter(struc
+ 		memcg = __mem_cgroup_iter_next(root, last_visited);
+ 
+ 		if (reclaim) {
+-			if (last_visited)
++			if (last_visited && last_visited != root)
+ 				css_put(&last_visited->css);
+ 
+ 			iter->last_visited = memcg;
diff --git a/queue-3.10/series b/queue-3.10/series
index 824b7d8153a..4cebd2f6220 100644
--- a/queue-3.10/series
+++ b/queue-3.10/series
@@ -35,3 +35,5 @@ net-add-and-use-skb_gso_transport_seglen.patch
 net-core-introduce-netif_skb_dev_features.patch
 net-ip-ipv6-handle-gso-skbs-in-forwarding-path.patch
 net-use-__gfp_noretry-for-high-order-allocations.patch
+memcg-fix-endless-loop-caused-by-mem_cgroup_iter.patch
+cgroup-use-an-ordered-workqueue-for-cgroup-destruction.patch
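
Illustration (not part of the queued patches): the endless loop described in the memcg commit message can be reproduced in a small userspace model. This is only a sketch under simplifying assumptions. struct node, struct iter_cache, node_tryget(), iter_next(), iter_step() and count_steps() are hypothetical stand-ins rather than kernel APIs, the hierarchy is a flat pre-order array, and RCU, dead_count and generation tracking are left out. The skip_root_tryget flag selects between the pre-patch and the patched condition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mem_cgroup; not the kernel type. */
struct node {
	bool dying;	/* true models css_tryget() failing on this group */
};

/* Hypothetical stand-in for the cached per-zone iterator position. */
struct iter_cache {
	struct node *last_visited;
};

static bool node_tryget(const struct node *n)
{
	return !n->dying;
}

/* Pre-order successor in a flat array; nodes[0] is the root of the walk. */
static struct node *iter_next(struct node *nodes, size_t count,
			      struct node *last)
{
	size_t next;

	if (!last)
		return &nodes[0];
	next = (size_t)(last - nodes) + 1;
	return next < count ? &nodes[next] : NULL;
}

/*
 * One iterator step. With skip_root_tryget == false this models the
 * pre-patch check: a failed tryget on the cached position resets it to
 * NULL, which restarts the walk at the root. With skip_root_tryget ==
 * true the root is exempt from the tryget, as in the patched condition.
 */
static struct node *iter_step(struct node *nodes, size_t count,
			      struct iter_cache *cache, bool skip_root_tryget)
{
	struct node *root = &nodes[0];
	struct node *last = cache->last_visited;
	struct node *next;

	assert(count > 0);
	if (last && !(skip_root_tryget && last == root) && !node_tryget(last))
		last = NULL;

	next = iter_next(nodes, count, last);
	cache->last_visited = next;
	return next;
}

/*
 * Count the steps until the walk returns NULL. Returns -1 if the walk
 * has not finished after max_steps, which stands in for "endless loop".
 */
static int count_steps(struct node *nodes, size_t count,
		       bool skip_root_tryget, int max_steps)
{
	struct iter_cache cache = { .last_visited = NULL };
	int steps;

	for (steps = 1; steps <= max_steps; steps++) {
		if (!iter_step(nodes, count, &cache, skip_root_tryget))
			return steps;
	}
	return -1;
}
```

For a hierarchy consisting of a single dying root, count_steps() with skip_root_tryget set to false never sees NULL and reports -1 once the step cap is hit, while the patched condition ends the walk on the second step. For a hierarchy with no dying groups both variants behave identically.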