From 77f999a460d1455d7f0513e20c01c52605b5577a Mon Sep 17 00:00:00 2001
From: Sasha Levin <sashal@kernel.org>
Date: Sun, 1 Nov 2020 17:07:34 -0800
Subject: mm: memcg: link page counters to root if use_hierarchy is false
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

From: Roman Gushchin <guro@fb.com>

[ Upstream commit 8de15e920dc85d1705ab9c202c95d56845bc2d48 ]

Richard reported a warning which can be reproduced by running the LTP
madvise6 test (cgroup v1 in the non-hierarchical mode should be used):

  WARNING: CPU: 0 PID: 12 at mm/page_counter.c:57 page_counter_uncharge (mm/page_counter.c:57 mm/page_counter.c:50 mm/page_counter.c:156)
  CPU: 0 PID: 12 Comm: kworker/0:1 Not tainted 5.9.0-rc7-22-default #77
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-48-gd9c812d-rebuilt.opensuse.org 04/01/2014
  Workqueue: events drain_local_stock
  RIP: 0010:page_counter_uncharge (mm/page_counter.c:57 mm/page_counter.c:50 mm/page_counter.c:156)
    __memcg_kmem_uncharge (mm/memcontrol.c:3022)
    drain_obj_stock (./include/linux/rcupdate.h:689 mm/memcontrol.c:3114)
    drain_local_stock (mm/memcontrol.c:2255)
    process_one_work (./arch/x86/include/asm/jump_label.h:25 ./include/linux/jump_label.h:200 ./include/trace/events/workqueue.h:108 kernel/workqueue.c:2274)
    worker_thread (./include/linux/list.h:282 kernel/workqueue.c:2416)
    kthread (kernel/kthread.c:292)
    ret_from_fork (arch/x86/entry/entry_64.S:300)

The problem occurs because in the non-hierarchical mode non-root page
counters are not linked to root page counters, so the charge is not
propagated to the root memory cgroup.

After the removal of the original memory cgroup and reparenting of the
object cgroup, the root cgroup might be uncharged by draining an objcg
stock, for example. It leads to an eventual underflow of the charge and
triggers a warning.

Fix it by linking all page counters to the corresponding root page
counters in the non-hierarchical mode.
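
The underflow and the fix described above can be illustrated with a
small user-space model. This is only a sketch under simplifying
assumptions: struct counter and the counter_*() helpers are
hypothetical stand-ins for the kernel's page counters (no atomics, no
limits, no warning), and root_usage_after_reparent() is an
illustrative helper, not a kernel function.

```c
#include <stddef.h>

/* Simplified user-space model; NOT the kernel's struct page_counter. */
struct counter {
	long usage;             /* pages currently charged */
	struct counter *parent; /* NULL: nothing to propagate to */
};

static void counter_init(struct counter *c, struct counter *parent)
{
	c->usage = 0;
	c->parent = parent;
}

/* Charge this counter and every ancestor reachable through ->parent. */
static void counter_charge(struct counter *c, long nr_pages)
{
	for (; c; c = c->parent)
		c->usage += nr_pages;
}

/* Uncharge this counter and every ancestor reachable through ->parent. */
static void counter_uncharge(struct counter *c, long nr_pages)
{
	for (; c; c = c->parent)
		c->usage -= nr_pages;
}

/*
 * Charge nr_pages to a child, then uncharge them directly from the
 * root, as happens once the child's objects have been reparented to
 * the root. Returns the root usage afterwards; a negative value
 * corresponds to the underflow.
 */
static long root_usage_after_reparent(int link_to_root, long nr_pages)
{
	struct counter root, child;

	counter_init(&root, NULL);
	counter_init(&child, link_to_root ? &root : NULL);

	counter_charge(&child, nr_pages);  /* charged while the child exists */
	counter_uncharge(&root, nr_pages); /* uncharged through the root */

	return root.usage;
}
```

In this model, an unlinked child (link_to_root == 0) leaves the root
at a negative usage after the uncharge, while a child linked to the
root leaves it balanced at zero.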

Please note that in the non-hierarchical mode all objcgs are always
reparented to the root memory cgroup, even if the hierarchy has more
than one level. This patch doesn't change that.

The patch also doesn't affect how the hierarchical mode works, which
is the only sane and truly supported mode now.

Thanks to Richard for reporting, debugging and providing an alternative
version of the fix!

Fixes: bf4f059954dc ("mm: memcg/slab: obj_cgroup API")
Reported-by: <ltp@lists.linux.it>
Signed-off-by: Roman Gushchin <guro@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Reviewed-by: Michal Koutný <mkoutny@suse.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20201026231326.3212225-1-guro@fb.com
Debugged-by: Richard Palethorpe <rpalethorpe@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 mm/memcontrol.c | 18 ++++++++++++------
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 9eefdb9cc2303..de51787831728 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -5298,7 +5298,13 @@ mem_cgroup_css_alloc(struct cgroup_subsys_state *parent_css)
 		memcg->swappiness = mem_cgroup_swappiness(parent);
 		memcg->oom_kill_disable = parent->oom_kill_disable;
 	}
-	if (parent && parent->use_hierarchy) {
+	if (!parent) {
+		page_counter_init(&memcg->memory, NULL);
+		page_counter_init(&memcg->swap, NULL);
+		page_counter_init(&memcg->memsw, NULL);
+		page_counter_init(&memcg->kmem, NULL);
+		page_counter_init(&memcg->tcpmem, NULL);
+	} else if (parent->use_hierarchy) {
 		memcg->use_hierarchy = true;
 		page_counter_init(&memcg->memory, &parent->memory);
 		page_counter_init(&memcg->swap, &parent->swap);
@@ -5306,11 +5312,11 @@ mem_cgroup_css_alloc(struct cgroup_subsys_state *parent_css)
 		page_counter_init(&memcg->kmem, &parent->kmem);
 		page_counter_init(&memcg->tcpmem, &parent->tcpmem);
 	} else {
-		page_counter_init(&memcg->memory, NULL);
-		page_counter_init(&memcg->swap, NULL);
-		page_counter_init(&memcg->memsw, NULL);
-		page_counter_init(&memcg->kmem, NULL);
-		page_counter_init(&memcg->tcpmem, NULL);
+		page_counter_init(&memcg->memory, &root_mem_cgroup->memory);
+		page_counter_init(&memcg->swap, &root_mem_cgroup->swap);
+		page_counter_init(&memcg->memsw, &root_mem_cgroup->memsw);
+		page_counter_init(&memcg->kmem, &root_mem_cgroup->kmem);
+		page_counter_init(&memcg->tcpmem, &root_mem_cgroup->tcpmem);
 		/*
 		 * Deeper hierachy with use_hierarchy == false doesn't make
 		 * much sense so let cgroup subsystem know about this