--- /dev/null
+From 7ed695e069c3cbea5e1fd08f84a04536da91f584 Mon Sep 17 00:00:00 2001
+From: Vlastimil Babka <vbabka@suse.cz>
+Date: Tue, 21 Jan 2014 15:51:09 -0800
+Subject: mm: compaction: detect when scanners meet in isolate_freepages
+
+From: Vlastimil Babka <vbabka@suse.cz>
+
+commit 7ed695e069c3cbea5e1fd08f84a04536da91f584 upstream.
+
+Compaction of a zone is finished when the migrate scanner (which begins
+at the zone's lowest pfn) meets the free page scanner (which begins at
+the zone's highest pfn). This is detected in compact_zone(), and in the
+case of direct compaction the compact_blockskip_flush flag is set so
+that kswapd later resets the cached scanner pfn's, and a new compaction
+may again start at the zone's borders.
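+
+For reference, the detection looks roughly like the following sketch of
+compact_finished() (simplified; the exact code differs between kernel
+versions):
+
+	/* Compaction run completes if the migrate and free scanner meet */
+	if (cc->free_pfn <= cc->migrate_pfn) {
+		/*
+		 * Mark that the PG_migrate_skip information should be
+		 * cleared by kswapd when it goes to sleep, so that the
+		 * cached scanner pfn's are reset and a new compaction
+		 * can start at the zone's borders again.
+		 */
+		if (!current_is_kswapd())
+			zone->compact_blockskip_flush = true;
+
+		return COMPACT_COMPLETE;
+	}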
+
+The meeting of the scanners can happen during either scanner's activity.
+However, it may currently fail to be detected when it occurs in the free
+page scanner, due to two problems. First, isolate_freepages() keeps
+free_pfn at the highest block from which it isolated pages, so as not
+to miss pages that are returned to the allocator when migration fails.
+Second, failing to isolate enough free pages because the scanners have
+met results in -ENOMEM being returned by migrate_pages(), which makes
+compact_zone() bail out immediately without calling compact_finished(),
+the function that would detect the scanners meeting.
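+
+The second problem corresponds to the pre-patch error path in
+compact_zone(), sketched below with the migrate_pages() arguments
+omitted:
+
+	err = migrate_pages(&cc->migratepages, ...);
+	if (err) {
+		putback_movable_pages(&cc->migratepages);
+		cc->nr_migratepages = 0;
+		if (err == -ENOMEM) {
+			/* bails out before compact_finished() can run */
+			ret = COMPACT_PARTIAL;
+			goto out;
+		}
+	}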
+
+This failure to detect the scanners meeting can result in repeated
+attempts at compacting a zone that start over and over from the cached
+pfn's close to the meeting point and quickly fail through the -ENOMEM
+path, without the cached pfn's ever being reset. This
+has been observed (through additional tracepoints) in the third phase of
+the mmtests stress-highalloc benchmark, where the allocator runs on an
+otherwise idle system. The problem was observed in the DMA32 zone,
+which was used as a fallback to the preferred Normal zone, but on the
+4GB system it was actually the largest zone. The problem is even
+amplified for such a fallback zone - the deferred compaction logic, which
+could (after being fixed by a previous patch) reset the cached scanner
+pfn's, is only applied to the preferred zone and not to the fallbacks.
+
+The problem in the third phase of the benchmark was further amplified by
+commit 81c0a2bb515f ("mm: page_alloc: fair zone allocator policy") which
+resulted in a non-deterministic regression of the allocation success
+rate from ~85% to ~65%. The regression occurs in about half of the
+benchmark runs, which makes bisection problematic. The commit itself is
+unlikely to be buggy, but it puts more pressure on the DMA32 zone during
+phases 1 and 2, which may leave the zone more fragmented in phase 3 and
+expose the bugs that this patch fixes.
+
+The fix is to make scanners that meet in isolate_freepages() stay that way,
+and to check in compact_zone() for scanners meeting when migrate_pages()
+returns -ENOMEM. The result is that compact_finished() also detects
+scanners meeting and sets the compact_blockskip_flush flag to make
+kswapd reset the scanner pfn's.
+
+The results of the stress-highalloc benchmark show that the phase 3
+"regression" introduced by commit 81c0a2bb515f no longer occurs, and the
+phase 1 and 2 allocation success rates are also significantly improved.
+
+Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
+Cc: Mel Gorman <mgorman@suse.de>
+Cc: Rik van Riel <riel@redhat.com>
+Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
+Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
+Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+
+---
+ mm/compaction.c | 19 +++++++++++++++----
+ 1 file changed, 15 insertions(+), 4 deletions(-)
+
+--- a/mm/compaction.c
++++ b/mm/compaction.c
+@@ -667,7 +667,7 @@ static void isolate_freepages(struct zon
+ * is the end of the pageblock the migration scanner is using.
+ */
+ pfn = cc->free_pfn;
+- low_pfn = cc->migrate_pfn + pageblock_nr_pages;
++ low_pfn = ALIGN(cc->migrate_pfn + 1, pageblock_nr_pages);
+
+ /*
+ * Take care that if the migration scanner is at the end of the zone
+@@ -683,7 +683,7 @@ static void isolate_freepages(struct zon
+ * pages on cc->migratepages. We stop searching if the migrate
+ * and free page scanners meet or enough free pages are isolated.
+ */
+- for (; pfn > low_pfn && cc->nr_migratepages > nr_freepages;
++ for (; pfn >= low_pfn && cc->nr_migratepages > nr_freepages;
+ pfn -= pageblock_nr_pages) {
+ unsigned long isolated;
+
+@@ -738,7 +738,14 @@ static void isolate_freepages(struct zon
+ /* split_free_page does not map the pages */
+ map_pages(freelist);
+
+- cc->free_pfn = high_pfn;
++ /*
++ * If we crossed the migrate scanner, we want to keep it that way
++ * so that compact_finished() may detect this
++ */
++ if (pfn < low_pfn)
++ cc->free_pfn = max(pfn, zone->zone_start_pfn);
++ else
++ cc->free_pfn = high_pfn;
+ cc->nr_freepages = nr_freepages;
+ }
+
+@@ -1003,7 +1010,11 @@ static int compact_zone(struct zone *zon
+ if (err) {
+ putback_movable_pages(&cc->migratepages);
+ cc->nr_migratepages = 0;
+- if (err == -ENOMEM) {
++ /*
++ * migrate_pages() may return -ENOMEM when scanners meet
++ * and we want compact_finished() to detect it
++ */
++ if (err == -ENOMEM && cc->free_pfn > cc->migrate_pfn) {
+ ret = COMPACT_PARTIAL;
+ goto out;
+ }
--- /dev/null
+From d3132e4b83e6bd383c74d716f7281d7c3136089c Mon Sep 17 00:00:00 2001
+From: Vlastimil Babka <vbabka@suse.cz>
+Date: Tue, 21 Jan 2014 15:51:08 -0800
+Subject: mm: compaction: reset cached scanner pfn's before reading them
+
+From: Vlastimil Babka <vbabka@suse.cz>
+
+commit d3132e4b83e6bd383c74d716f7281d7c3136089c upstream.
+
+Compaction caches pfn's for its migrate and free scanners to avoid
+scanning the whole zone each time. In compact_zone(), the cached values
+are read to set up initial values for the scanners. There are several
+situations when these cached pfn's are reset to the first and last pfn
+of the zone, respectively. One of these situations is when a compaction
+has been deferred for a zone and is now being restarted during a direct
+compaction, which is also done in compact_zone().
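+
+For illustration, the read of the cached values near the start of
+compact_zone() looks roughly like this (simplified sketch; the
+boundary checks appear in the hunk below):
+
+	cc->migrate_pfn = zone->compact_cached_migrate_pfn;
+	cc->free_pfn = zone->compact_cached_free_pfn;
+	if (cc->migrate_pfn < start_pfn || cc->migrate_pfn > end_pfn) {
+		cc->migrate_pfn = start_pfn;
+		zone->compact_cached_migrate_pfn = cc->migrate_pfn;
+	}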
+
+However, compact_zone() currently reads the cached pfn's *before*
+resetting them. This means the reset doesn't affect the compaction that
+performs it, and quite possibly subsequent compactions as well, since
+update_pageblock_skip() is likely to be called and to update the cached
+pfn's to those being processed. Another chance for a successful reset
+is when a direct compaction detects that the migration and free
+scanners meet (which has its own problems, addressed by another patch)
+and sets the compact_blockskip_flush flag, which kswapd uses to do the
+reset when it goes to sleep.
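+
+For illustration, update_pageblock_skip() drags the cached pfn's along
+with the scanners, roughly as follows (simplified sketch; the actual
+code has additional conditions):
+
+	/* Update where compaction should restart */
+	if (migrate_scanner) {
+		if (pfn > zone->compact_cached_migrate_pfn)
+			zone->compact_cached_migrate_pfn = pfn;
+	} else {
+		if (pfn < zone->compact_cached_free_pfn)
+			zone->compact_cached_free_pfn = pfn;
+	}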
+
+This is clearly a bug that results in non-deterministic behavior, so
+this patch moves the cached pfn reset to be performed *before* the
+values are read.
+
+Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
+Acked-by: Mel Gorman <mgorman@suse.de>
+Acked-by: Rik van Riel <riel@redhat.com>
+Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
+Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
+Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+
+---
+ mm/compaction.c | 16 ++++++++--------
+ 1 file changed, 8 insertions(+), 8 deletions(-)
+
+--- a/mm/compaction.c
++++ b/mm/compaction.c
+@@ -947,6 +947,14 @@ static int compact_zone(struct zone *zon
+ }
+
+ /*
++ * Clear pageblock skip if there were failures recently and compaction
++ * is about to be retried after being deferred. kswapd does not do
++ * this reset as it'll reset the cached information when going to sleep.
++ */
++ if (compaction_restarting(zone, cc->order) && !current_is_kswapd())
++ __reset_isolation_suitable(zone);
++
++ /*
+ * Setup to move all movable pages to the end of the zone. Used cached
+ * information on where the scanners should start but check that it
+ * is initialised by ensuring the values are within zone boundaries.
+@@ -962,14 +970,6 @@ static int compact_zone(struct zone *zon
+ zone->compact_cached_migrate_pfn = cc->migrate_pfn;
+ }
+
+- /*
+- * Clear pageblock skip if there were failures recently and compaction
+- * is about to be retried after being deferred. kswapd does not do
+- * this reset as it'll reset the cached information when going to sleep.
+- */
+- if (compaction_restarting(zone, cc->order) && !current_is_kswapd())
+- __reset_isolation_suitable(zone);
+-
+ migrate_prep_local();
+
+ while ((ret = compact_finished(zone, cc)) == COMPACT_CONTINUE) {