From: Greg Kroah-Hartman
Date: Thu, 20 Feb 2014 21:23:36 +0000 (-0800)
Subject: 3.4-stable patches
X-Git-Tag: v3.4.82~42
X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=c7ceed3c78074f9de01ace18a6f443ae7970e71f;p=thirdparty%2Fkernel%2Fstable-queue.git

3.4-stable patches

added patches:
      fs-file.c-fdtable-avoid-triggering-ooms-from-alloc_fdmem.patch
---

diff --git a/queue-3.4/fs-file.c-fdtable-avoid-triggering-ooms-from-alloc_fdmem.patch b/queue-3.4/fs-file.c-fdtable-avoid-triggering-ooms-from-alloc_fdmem.patch
new file mode 100644
index 00000000000..c507a360b99
--- /dev/null
+++ b/queue-3.4/fs-file.c-fdtable-avoid-triggering-ooms-from-alloc_fdmem.patch
@@ -0,0 +1,64 @@
+From 96c7a2ff21501691587e1ae969b83cbec8b78e08 Mon Sep 17 00:00:00 2001
+From: "Eric W. Biederman"
+Date: Mon, 10 Feb 2014 14:25:41 -0800
+Subject: fs/file.c:fdtable: avoid triggering OOMs from alloc_fdmem
+
+From: "Eric W. Biederman"
+
+commit 96c7a2ff21501691587e1ae969b83cbec8b78e08 upstream.
+
+Recently due to a spike in connections per second memcached on 3
+separate boxes triggered the OOM killer from accept. At the time the
+OOM killer was triggered there was 4GB out of 36GB free in zone 1. The
+problem was that alloc_fdtable was allocating an order 3 page (32KiB) to
+hold a bitmap, and there was sufficient fragmentation that the largest
+page available was 8KiB.
+
+I find the logic that PAGE_ALLOC_COSTLY_ORDER can't fail pretty dubious
+but I do agree that order 3 allocations are very likely to succeed.
+
+There are always pathologies where order > 0 allocations can fail when
+there are copious amounts of free memory available. Using the pigeon
+hole principle it is easy to show that it requires 1 page more than 50%
+of the pages being free to guarantee an order 1 (8KiB) allocation will
+succeed, 1 page more than 75% of the pages being free to guarantee an
+order 2 (16KiB) allocation will succeed and 1 page more than 87.5% of
+the pages being free to guarantee an order 3 allocate will succeed.
+
+A server churning memory with a lot of small requests and replies like
+memcached is a common case that if anything can will skew the odds
+against large pages being available.
+
+Therefore let's not give external applications a practical way to kill
+linux server applications, and specify __GFP_NORETRY to the kmalloc in
+alloc_fdmem. Unless I am misreading the code and by the time the code
+reaches should_alloc_retry in __alloc_pages_slowpath (where
+__GFP_NORETRY becomes signification). We have already tried everything
+reasonable to allocate a page and the only thing left to do is wait. So
+not waiting and falling back to vmalloc immediately seems like the
+reasonable thing to do even if there wasn't a chance of triggering the
+OOM killer.
+
+Signed-off-by: "Eric W. Biederman"
+Cc: Eric Dumazet
+Acked-by: David Rientjes
+Cc: Cong Wang
+Signed-off-by: Andrew Morton
+Signed-off-by: Linus Torvalds
+Signed-off-by: Greg Kroah-Hartman
+
+---
+ fs/file.c | 2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+
+--- a/fs/file.c
++++ b/fs/file.c
+@@ -47,7 +47,7 @@ static void *alloc_fdmem(size_t size)
+ 	 * vmalloc() if the allocation size will be considered "large" by the VM.
+ 	 */
+ 	if (size <= (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER)) {
+-		void *data = kmalloc(size, GFP_KERNEL|__GFP_NOWARN);
++		void *data = kmalloc(size, GFP_KERNEL|__GFP_NOWARN|__GFP_NORETRY);
+ 		if (data != NULL)
+ 			return data;
+ 	}
diff --git a/queue-3.4/series b/queue-3.4/series
index 0e02a08abc9..f473d4b431b 100644
--- a/queue-3.4/series
+++ b/queue-3.4/series
@@ -1 +1,2 @@
 xen-blkfront-handle-backend-closed-without-closing.patch
+fs-file.c-fdtable-avoid-triggering-ooms-from-alloc_fdmem.patch
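
For reference, a rough sketch of the whole alloc_fdmem() helper as it reads once the fs/file.c hunk above is applied to a 3.4 tree. Only the lines shown in the hunk come from the patch itself; the function signature, the first comment line, and the final vmalloc() fallback are reconstructed from the hunk context and from the commit message's remark about "falling back to vmalloc immediately", so treat them as an approximation of the surrounding kernel code rather than a verbatim copy:

#include <linux/mm.h>       /* PAGE_SIZE, PAGE_ALLOC_COSTLY_ORDER */
#include <linux/slab.h>     /* kmalloc(), GFP_KERNEL and friends */
#include <linux/vmalloc.h>  /* vmalloc() */

static void *alloc_fdmem(size_t size)
{
	/*
	 * Very large allocations can stress page reclaim, so fall back to
	 * vmalloc() if the allocation size will be considered "large" by the VM.
	 */
	if (size <= (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER)) {
		/*
		 * __GFP_NOWARN keeps a failed attempt quiet; __GFP_NORETRY
		 * (the flag this patch adds) makes the allocator give up
		 * instead of retrying reclaim and waking the OOM killer.
		 */
		void *data = kmalloc(size, GFP_KERNEL|__GFP_NOWARN|__GFP_NORETRY);
		if (data != NULL)
			return data;
	}
	/* Size above the "costly" threshold, or kmalloc() failed: use vmalloc(). */
	return vmalloc(size);
}

With __GFP_NORETRY set, a fragmented zone simply pushes the fdtable allocation onto the vmalloc() path instead of letting accept() trip the OOM killer.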