git.ipfire.org Git - thirdparty/kernel/stable-queue.git/blame - releases/3.10.30/mm-oom-base-root-bonus-on-current-usage.patch
From 778c14affaf94a9e4953179d3e13a544ccce7707 Mon Sep 17 00:00:00 2001
From: David Rientjes <rientjes@google.com>
Date: Thu, 30 Jan 2014 15:46:11 -0800
Subject: mm, oom: base root bonus on current usage

From: David Rientjes <rientjes@google.com>

commit 778c14affaf94a9e4953179d3e13a544ccce7707 upstream.

A 3% of system memory bonus is sometimes too excessive in comparison to
other processes.

With commit a63d83f427fb ("oom: badness heuristic rewrite"), the OOM
killer tries to avoid killing privileged tasks by subtracting 3% of
overall memory (system or cgroup) from their per-task consumption.  But
as a result, all root tasks that consume less than 3% of overall memory
are considered equal, and so it only takes 33+ privileged tasks pushing
the system out of memory for the OOM killer to do something stupid and
kill dhclient or other root-owned processes.  For example, on a 32G
machine it can't tell the difference between the 1M agetty and the 10G
fork bomb member.

The changelog describes this 3% boost as the equivalent to the global
overcommit limit being 3% higher for privileged tasks, but this is not
the same as discounting 3% of overall memory from _every privileged task
individually_ during OOM selection.

Replace the 3% of system memory bonus with a 3% of current memory usage
bonus.

By giving root tasks a bonus that is proportional to their actual size,
they remain comparable even when relatively small.  In the example
above, the OOM killer will discount the 1M agetty's 256 badness points
down to 179, and the 10G fork bomb's 262144 points down to 183500 points
and make the right choice, instead of discounting both to 0 and killing
agetty because it's first in the task list.
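
The difference between the two bonuses can be sketched in userspace.
This is a simplified model, not the kernel code: the helper names are
hypothetical, oom_score_adj is assumed to be 0, and "points" stands for
the task's usage in pages. Note that the integer formula in the diff
below turns 256 points into 249 and 262144 points into 254280; the
figures 179 and 183500 quoted above correspond to a larger discount
than 3%.

```c
/*
 * Simplified userspace model of the root-bonus step in oom_badness().
 * Helper names are hypothetical; oom_score_adj is assumed to be 0.
 * "points" is the task's usage in pages, "totalpages" the memory
 * available in the OOM context.
 */

/* Old behavior: subtract 3% of *total* memory (adj -= 30, in 1/1000 units). */
static unsigned long root_points_old(unsigned long points,
				     unsigned long totalpages)
{
	long adj = -30;
	long result;

	adj *= (long)(totalpages / 1000);	/* normalize to pages */
	result = (long)points + adj;

	/* Scores never drop below 1, so small root tasks all collapse here. */
	return result > 0 ? (unsigned long)result : 1;
}

/* New behavior: subtract 3% of the task's *own* usage. */
static unsigned long root_points_new(unsigned long points)
{
	return points - (points * 3) / 100;
}
```

With totalpages = 8388608 (32G of 4K pages), the old helper maps both a
256-page and a 131072-page root task to the same floor value, while the
new helper keeps them apart (249 versus 127140).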

Signed-off-by: David Rientjes <rientjes@google.com>
Reported-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 Documentation/filesystems/proc.txt |    4 ++--
 mm/oom_kill.c                      |    2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -1372,8 +1372,8 @@ may allocate from based on an estimation
 For example, if a task is using all allowed memory, its badness score will be
 1000.  If it is using half of its allowed memory, its score will be 500.
 
-There is an additional factor included in the badness score: root
-processes are given 3% extra memory over other tasks.
+There is an additional factor included in the badness score: the current memory
+and swap usage is discounted by 3% for root processes.
 
 The amount of "allowed" memory depends on the context in which the oom killer
 was called.  If it is due to the memory assigned to the allocating task's cpuset
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -170,7 +170,7 @@ unsigned long oom_badness(struct task_st
 	 * implementation used by LSMs.
 	 */
 	if (has_capability_noaudit(p, CAP_SYS_ADMIN))
-		adj -= 30;
+		points -= (points * 3) / 100;
 
 	/* Normalize to oom_score_adj units */
 	adj *= totalpages / 1000;