--- /dev/null
+From 804333ac935cfec53e96f1fc2264b373bdd7c923 Mon Sep 17 00:00:00 2001
+From: Stephane Eranian <eranian@google.com>
+Date: Thu, 3 Mar 2016 20:50:41 +0100
+Subject: perf/x86/pebs: Add workaround for broken OVFL status on HSW+
+
+From: Stephane Eranian <eranian@google.com>
+
+commit 8077eca079a212f26419c57226f28696b7100683 upstream.
+
+This patch fixes an issue with the GLOBAL_OVERFLOW_STATUS bits on
+Haswell, Broadwell and Skylake processors when using PEBS.
+
+The SDM stipulates that when the PEBS interrupt threshold is crossed,
+an interrupt is posted and the kernel is interrupted. The kernel will
+find GLOBAL_OVF_STATUS bit 62 set, indicating there are PEBS records
+to drain. But the bits corresponding to the actual counters should NOT
+be set. The kernel follows the SDM and assumes that all PEBS events are
+processed in the drain_pebs() callback. The kernel then checks for
+remaining overflows on any other (non-PEBS) events and processes these
+in the for_each_bit_set(&status) loop.
+
+As it turns out, under certain conditions on HSW and later processors,
+on PEBS buffer interrupt, bit 62 is set but the counter bits may be
+set as well. In that case, the kernel drains PEBS and generates
+SAMPLES with the EXACT tag, then it processes the counter bits, and
+generates normal (non-EXACT) SAMPLES.
+
+I ran into this problem while trying to understand why, on HSW,
+sampling on a PEBS event was sometimes returning SAMPLES without the
+EXACT tag. This should not happen for user level code because HSW has
+the eventing_ip, which always points to the instruction that caused
+the event.
+
+The workaround in this patch simply ensures that the bits for the
+counters used for PEBS events are cleared after the PEBS buffer has
+been drained. With this fix 100% of the PEBS samples on my user code
+report the EXACT tag.
+
+Before:
+ $ perf record -e cpu/event=0xd0,umask=0x81/upp ./multichase
+ $ perf report -D | fgrep SAMPLES
+ PERF_RECORD_SAMPLE(IP, 0x2): 11775/11775: 0x406de5 period: 73469 addr: 0
+ \--- EXACT tag is missing
+
+After:
+ $ perf record -e cpu/event=0xd0,umask=0x81/upp ./multichase
+ $ perf report -D | fgrep SAMPLES
+ PERF_RECORD_SAMPLE(IP, 0x4002): 11775/11775: 0x406de5 period: 73469 addr: 0 exact=Y
+ \--- EXACT tag is set
+
+The problem tends to appear more often when multiple PEBS events are used.
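+
+For illustration only, here is a stand-alone sketch of the masking the
+workaround applies; the variables and values are invented, only the two
+mask operations mirror the actual fix below:
+
+  #include <stdint.h>
+  #include <stdio.h>
+
+  int main(void)
+  {
+      /* bit 62 (PEBS buffer ovfl) plus counters 0 and 1, as on HSW+ */
+      uint64_t status       = (1ULL << 62) | 0x3;
+      uint64_t pebs_enabled = 0x1;  /* counter 0 is the PEBS counter */
+      uint64_t intel_ctrl   = 0xf;  /* counters 0-3 are enabled      */
+
+      if (status & (1ULL << 62)) {
+          status &= ~(1ULL << 62);
+          /* drain_pebs() would consume the PEBS records here,
+           * emitting EXACT samples for counter 0 */
+
+          /* the workaround: counter 0 was already reported as an
+           * exact sample, so it must not be handled again below */
+          status &= ~pebs_enabled;
+          status &= intel_ctrl; /* the real fix also keeps the PT bit */
+      }
+
+      /* only counter 1 (non-PEBS) is left for the regular sample loop */
+      printf("remaining overflow bits: %#llx\n",
+             (unsigned long long)status);
+      return 0;
+  }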
+
+Signed-off-by: Stephane Eranian <eranian@google.com>
+Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
+Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
+Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
+Cc: Jiri Olsa <jolsa@redhat.com>
+Cc: Linus Torvalds <torvalds@linux-foundation.org>
+Cc: Peter Zijlstra <peterz@infradead.org>
+Cc: Thomas Gleixner <tglx@linutronix.de>
+Cc: Vince Weaver <vincent.weaver@maine.edu>
+Cc: adrian.hunter@intel.com
+Cc: kan.liang@intel.com
+Cc: namhyung@kernel.org
+Link: http://lkml.kernel.org/r/1457034642-21837-3-git-send-email-eranian@google.com
+Signed-off-by: Ingo Molnar <mingo@kernel.org>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ arch/x86/kernel/cpu/perf_event_intel.c | 10 ++++++++++
+ 1 file changed, 10 insertions(+)
+
+--- a/arch/x86/kernel/cpu/perf_event_intel.c
++++ b/arch/x86/kernel/cpu/perf_event_intel.c
+@@ -1884,6 +1884,16 @@ again:
+ if (__test_and_clear_bit(62, (unsigned long *)&status)) {
+ handled++;
+ x86_pmu.drain_pebs(regs);
++ /*
++ * There are cases where, even though, the PEBS ovfl bit is set
++ * in GLOBAL_OVF_STATUS, the PEBS events may also have their
++ * overflow bits set for their counters. We must clear them
++ * here because they have been processed as exact samples in
++ * the drain_pebs() routine. They must not be processed again
++ * in the for_each_bit_set() loop for regular samples below.
++ */
++ status &= ~cpuc->pebs_enabled;
++ status &= x86_pmu.intel_ctrl | GLOBAL_STATUS_TRACE_TOPAPMI;
+ }
+
+ /*
--- /dev/null
+From e178b147e530c12a95871e78569554666fe01c0f Mon Sep 17 00:00:00 2001
+From: Kan Liang <kan.liang@intel.com>
+Date: Wed, 24 Feb 2016 05:07:43 -0500
+Subject: perf/x86/intel/uncore: Remove SBOX support for BDX-DE
+
+From: Kan Liang <kan.liang@intel.com>
+
+commit e178b147e530c12a95871e78569554666fe01c0f upstream.
+
+BDX-DE and BDX-EP share the same uncore code path, but there is no SBOX
+on BDX-DE. This patch removes SBOX support for BDX-DE.
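+
+The bdx_msr_uncores[] list is NULL-terminated, so the fix moves the
+SBOX entry to the end and clears it on BDX-DE (model 86). A rough
+user-space sketch of the same pattern, with invented names:
+
+  #include <stdio.h>
+
+  struct uncore_type { const char *name; };
+
+  static struct uncore_type ubox = { "ubox" }, cbox = { "cbox" },
+                            pcu  = { "pcu"  }, sbox = { "sbox" };
+
+  #define SBOX_IDX 3  /* sbox kept last so clearing it ends the list */
+
+  static struct uncore_type *bdx_uncores[] = {
+      &ubox, &cbox, &pcu, &sbox, NULL,
+  };
+
+  int main(void)
+  {
+      int model = 86;               /* pretend we are on BDX-DE */
+
+      if (model == 86)              /* BDX-DE has no SBOX       */
+          bdx_uncores[SBOX_IDX] = NULL;
+
+      for (struct uncore_type **t = bdx_uncores; *t; t++)
+          printf("%s\n", (*t)->name);
+      return 0;
+  }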
+
+Signed-off-by: Kan Liang <kan.liang@intel.com>
+Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
+Cc: <tonyb@cybernetics.com>
+Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
+Cc: Jiri Olsa <jolsa@redhat.com>
+Cc: Linus Torvalds <torvalds@linux-foundation.org>
+Cc: Peter Zijlstra <peterz@infradead.org>
+Cc: Stephane Eranian <eranian@google.com>
+Cc: Thomas Gleixner <tglx@linutronix.de>
+Cc: Tony Battersby <tonyb@cybernetics.com>
+Cc: Vince Weaver <vincent.weaver@maine.edu>
+Link: http://lkml.kernel.org/r/37D7C6CF3E00A74B8858931C1DB2F0770589D336@SHSMSX103.ccr.corp.intel.com
+Signed-off-by: Ingo Molnar <mingo@kernel.org>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ arch/x86/kernel/cpu/perf_event_intel_uncore_snbep.c | 8 +++++++-
+ 1 file changed, 7 insertions(+), 1 deletion(-)
+
+--- a/arch/x86/kernel/cpu/perf_event_intel_uncore_snbep.c
++++ b/arch/x86/kernel/cpu/perf_event_intel_uncore_snbep.c
+@@ -2875,11 +2875,13 @@ static struct intel_uncore_type bdx_unco
+ .format_group = &hswep_uncore_sbox_format_group,
+ };
+
++#define BDX_MSR_UNCORE_SBOX 3
++
+ static struct intel_uncore_type *bdx_msr_uncores[] = {
+ &bdx_uncore_ubox,
+ &bdx_uncore_cbox,
+- &bdx_uncore_sbox,
+ &hswep_uncore_pcu,
++ &bdx_uncore_sbox,
+ NULL,
+ };
+
+@@ -2888,6 +2890,10 @@ void bdx_uncore_cpu_init(void)
+ if (bdx_uncore_cbox.num_boxes > boot_cpu_data.x86_max_cores)
+ bdx_uncore_cbox.num_boxes = boot_cpu_data.x86_max_cores;
+ uncore_msr_uncores = bdx_msr_uncores;
++
++ /* BDX-DE doesn't have SBOX */
++ if (boot_cpu_data.x86_model == 86)
++ uncore_msr_uncores[BDX_MSR_UNCORE_SBOX] = NULL;
+ }
+
+ static struct intel_uncore_type bdx_uncore_ha = {
--- /dev/null
+From 48253050e1a4b8b82c775185c1bc066a2a826f14 Mon Sep 17 00:00:00 2001
+From: Kan Liang <kan.liang@intel.com>
+Date: Thu, 3 Mar 2016 18:07:28 -0500
+Subject: perf/x86/intel: Fix PEBS warning by only restoring active PMU in pmi
+
+From: Kan Liang <kan.liang@intel.com>
+
+commit 48253050e1a4b8b82c775185c1bc066a2a826f14 upstream.
+
+This patch tries to fix a PEBS warning found in my stress test. The
+following perf command can easily trigger the pebs warning or spurious
+NMI error on Skylake/Broadwell/Haswell platforms:
+
+ sudo perf record -e 'cpu/umask=0x04,event=0xc4/pp,cycles,branches,ref-cycles,cache-misses,cache-references' --call-graph fp -b -c1000 -a
+
+Also the NMI watchdog must be enabled.
+
+For this case, the number of events is larger than the number of
+counters, so perf has to multiplex them.
+
+perf_mux_hrtimer_handler() calls perf_pmu_disable(), schedules out the
+old events, rotates the context, schedules in the new events and
+finally calls perf_pmu_enable().
+
+If the old events include a precise event, MSR_IA32_PEBS_ENABLE must be
+cleared by perf_pmu_disable(), and it must stay 0 until
+perf_pmu_enable() is called and the new event is a precise event too.
+
+However, there is a corner case that can restore PEBS_ENABLE to a stale
+value during that window. In perf_pmu_disable(), GLOBAL_CTRL is set to
+0 to stop overflows and the PMIs that follow. But there may be a
+pending PMI from an earlier overflow, which cannot be stopped. So even
+with GLOBAL_CTRL cleared, the kernel can still take a PMI. At the end
+of the PMI handler, __intel_pmu_enable_all() is called, which restores
+the stale values if the old events have not been scheduled out yet.
+
+Once the stale PEBS value has been written, it cannot be corrected if
+the new events are non-precise: pebs_enabled will be 0, so
+x86_pmu.enable_all() ignores the MSR_IA32_PEBS_ENABLE setting. As a
+result, the next NMI taken with the stale PEBS_ENABLE triggers the PEBS
+warning.
+
+The pending PMI after enabled=0 becomes harmless as long as the NMI
+handler does not change the state. This patch checks cpuc->enabled in
+the PMI handler and only restores the state when the PMU is active.
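+
+As a stand-alone sketch of the idea (not the kernel code itself; names
+and values are invented), the PMI path now only writes the MSR back
+when the software 'enabled' flag says the PMU is active:
+
+  #include <stdio.h>
+
+  struct cpu_hw_events { int enabled; unsigned long pebs_enabled; };
+
+  /* stands in for MSR_IA32_PEBS_ENABLE */
+  static unsigned long msr_pebs_enable;
+
+  static void pmi_handler(struct cpu_hw_events *cpuc)
+  {
+      /* ... drain counters, push out samples ... */
+
+      /* old behaviour: restore unconditionally, which may write a
+       * stale value while perf_pmu_disable()..perf_pmu_enable() is
+       * still in progress:
+       *     msr_pebs_enable = cpuc->pebs_enabled;                  */
+
+      /* new behaviour: only restore when the PMU is logically on   */
+      if (cpuc->enabled)
+          msr_pebs_enable = cpuc->pebs_enabled;
+  }
+
+  int main(void)
+  {
+      struct cpu_hw_events cpuc = { .enabled = 0, .pebs_enabled = 0x1 };
+
+      msr_pebs_enable = 0;   /* perf_pmu_disable() cleared the MSR   */
+      /* cpuc.pebs_enabled still holds the old, now stale, value     */
+
+      pmi_handler(&cpuc);    /* late PMI from an earlier overflow    */
+
+      printf("PEBS_ENABLE after PMI: %#lx (0 = no stale bits)\n",
+             msr_pebs_enable);
+      return 0;
+  }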
+
+Here is the dump:
+
+ Call Trace:
+ <NMI> [<ffffffff813c3a2e>] dump_stack+0x63/0x85
+ [<ffffffff810a46f2>] warn_slowpath_common+0x82/0xc0
+ [<ffffffff810a483a>] warn_slowpath_null+0x1a/0x20
+ [<ffffffff8100fe2e>] intel_pmu_drain_pebs_nhm+0x2be/0x320
+ [<ffffffff8100caa9>] intel_pmu_handle_irq+0x279/0x460
+ [<ffffffff810639b6>] ? native_write_msr_safe+0x6/0x40
+ [<ffffffff811f290d>] ? vunmap_page_range+0x20d/0x330
+ [<ffffffff811f2f11>] ? unmap_kernel_range_noflush+0x11/0x20
+ [<ffffffff8148379f>] ? ghes_copy_tofrom_phys+0x10f/0x2a0
+ [<ffffffff814839c8>] ? ghes_read_estatus+0x98/0x170
+ [<ffffffff81005a7d>] perf_event_nmi_handler+0x2d/0x50
+ [<ffffffff810310b9>] nmi_handle+0x69/0x120
+ [<ffffffff810316f6>] default_do_nmi+0xe6/0x100
+ [<ffffffff810317f2>] do_nmi+0xe2/0x130
+ [<ffffffff817aea71>] end_repeat_nmi+0x1a/0x1e
+ [<ffffffff810639b6>] ? native_write_msr_safe+0x6/0x40
+ [<ffffffff810639b6>] ? native_write_msr_safe+0x6/0x40
+ [<ffffffff810639b6>] ? native_write_msr_safe+0x6/0x40
+ <<EOE>> <IRQ> [<ffffffff81006df8>] ? x86_perf_event_set_period+0xd8/0x180
+ [<ffffffff81006eec>] x86_pmu_start+0x4c/0x100
+ [<ffffffff8100722d>] x86_pmu_enable+0x28d/0x300
+ [<ffffffff811994d7>] perf_pmu_enable.part.81+0x7/0x10
+ [<ffffffff8119cb70>] perf_mux_hrtimer_handler+0x200/0x280
+ [<ffffffff8119c970>] ? __perf_install_in_context+0xc0/0xc0
+ [<ffffffff8110f92d>] __hrtimer_run_queues+0xfd/0x280
+ [<ffffffff811100d8>] hrtimer_interrupt+0xa8/0x190
+ [<ffffffff81199080>] ? __perf_read_group_add.part.61+0x1a0/0x1a0
+ [<ffffffff81051bd8>] local_apic_timer_interrupt+0x38/0x60
+ [<ffffffff817af01d>] smp_apic_timer_interrupt+0x3d/0x50
+ [<ffffffff817ad15c>] apic_timer_interrupt+0x8c/0xa0
+ <EOI> [<ffffffff81199080>] ? __perf_read_group_add.part.61+0x1a0/0x1a0
+ [<ffffffff81123de5>] ? smp_call_function_single+0xd5/0x130
+ [<ffffffff81123ddb>] ? smp_call_function_single+0xcb/0x130
+ [<ffffffff81199080>] ? __perf_read_group_add.part.61+0x1a0/0x1a0
+ [<ffffffff8119765a>] event_function_call+0x10a/0x120
+ [<ffffffff8119c660>] ? ctx_resched+0x90/0x90
+ [<ffffffff811971e0>] ? cpu_clock_event_read+0x30/0x30
+ [<ffffffff811976d0>] ? _perf_event_disable+0x60/0x60
+ [<ffffffff8119772b>] _perf_event_enable+0x5b/0x70
+ [<ffffffff81197388>] perf_event_for_each_child+0x38/0xa0
+ [<ffffffff811976d0>] ? _perf_event_disable+0x60/0x60
+ [<ffffffff811a0ffd>] perf_ioctl+0x12d/0x3c0
+ [<ffffffff8134d855>] ? selinux_file_ioctl+0x95/0x1e0
+ [<ffffffff8124a3a1>] do_vfs_ioctl+0xa1/0x5a0
+ [<ffffffff81036d29>] ? sched_clock+0x9/0x10
+ [<ffffffff8124a919>] SyS_ioctl+0x79/0x90
+ [<ffffffff817ac4b2>] entry_SYSCALL_64_fastpath+0x1a/0xa4
+ ---[ end trace aef202839fe9a71d ]---
+ Uhhuh. NMI received for unknown reason 2d on CPU 2.
+ Do you have a strange power saving mode enabled?
+
+Signed-off-by: Kan Liang <kan.liang@intel.com>
+Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
+Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
+Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
+Cc: Jiri Olsa <jolsa@redhat.com>
+Cc: Linus Torvalds <torvalds@linux-foundation.org>
+Cc: Peter Zijlstra <peterz@infradead.org>
+Cc: Stephane Eranian <eranian@google.com>
+Cc: Thomas Gleixner <tglx@linutronix.de>
+Cc: Vince Weaver <vincent.weaver@maine.edu>
+Link: http://lkml.kernel.org/r/1457046448-6184-1-git-send-email-kan.liang@intel.com
+[ Fixed various typos and other small details. ]
+Signed-off-by: Ingo Molnar <mingo@kernel.org>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ arch/x86/kernel/cpu/perf_event.c | 13 +++++++++++++
+ arch/x86/kernel/cpu/perf_event_intel.c | 15 +++++++++++++--
+ arch/x86/kernel/cpu/perf_event_knc.c | 4 +++-
+ 3 files changed, 29 insertions(+), 3 deletions(-)
+
+--- a/arch/x86/kernel/cpu/perf_event.c
++++ b/arch/x86/kernel/cpu/perf_event.c
+@@ -596,6 +596,19 @@ void x86_pmu_disable_all(void)
+ }
+ }
+
++/*
++ * There may be PMI landing after enabled=0. The PMI hitting could be before or
++ * after disable_all.
++ *
++ * If PMI hits before disable_all, the PMU will be disabled in the NMI handler.
++ * It will not be re-enabled in the NMI handler again, because enabled=0. After
++ * handling the NMI, disable_all will be called, which will not change the
++ * state either. If PMI hits after disable_all, the PMU is already disabled
++ * before entering NMI handler. The NMI handler will not change the state
++ * either.
++ *
++ * So either situation is harmless.
++ */
+ static void x86_pmu_disable(struct pmu *pmu)
+ {
+ struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+--- a/arch/x86/kernel/cpu/perf_event_intel.c
++++ b/arch/x86/kernel/cpu/perf_event_intel.c
+@@ -1502,7 +1502,15 @@ static __initconst const u64 knl_hw_cach
+ };
+
+ /*
+- * Use from PMIs where the LBRs are already disabled.
++ * Used from PMIs where the LBRs are already disabled.
++ *
++ * This function could be called consecutively. It is required to remain in
++ * disabled state if called consecutively.
++ *
++ * During consecutive calls, the same disable value will be written to related
++ * registers, so the PMU state remains unchanged. hw.state in
++ * intel_bts_disable_local will remain PERF_HES_STOPPED too in consecutive
++ * calls.
+ */
+ static void __intel_pmu_disable_all(void)
+ {
+@@ -1939,7 +1947,10 @@ again:
+ goto again;
+
+ done:
+- __intel_pmu_enable_all(0, true);
++ /* Only restore PMU state when it's active. See x86_pmu_disable(). */
++ if (cpuc->enabled)
++ __intel_pmu_enable_all(0, true);
++
+ /*
+ * Only unmask the NMI after the overflow counters
+ * have been reset. This avoids spurious NMIs on
+--- a/arch/x86/kernel/cpu/perf_event_knc.c
++++ b/arch/x86/kernel/cpu/perf_event_knc.c
+@@ -263,7 +263,9 @@ again:
+ goto again;
+
+ done:
+- knc_pmu_enable_all(0);
++ /* Only restore PMU state when it's active. See x86_pmu_disable(). */
++ if (cpuc->enabled)
++ knc_pmu_enable_all(0);
+
+ return handled;
+ }
--- /dev/null
+From 3135a66b768c5ee84c8a98b21d0330dc1c1234b4 Mon Sep 17 00:00:00 2001
+From: Jiri Olsa <jolsa@redhat.com>
+Date: Tue, 1 Mar 2016 20:03:52 +0100
+Subject: perf/x86/intel: Use PAGE_SIZE for PEBS buffer size on Core2
+
+From: Jiri Olsa <jolsa@redhat.com>
+
+commit 3135a66b768c5ee84c8a98b21d0330dc1c1234b4 upstream.
+
+Using buffers larger than PAGE_SIZE makes the WRMSR to PERF_GLOBAL_CTRL
+in intel_pmu_enable_all() mysteriously hang on Core2. As a workaround,
+we don't do this and limit the PEBS buffer to a single page on Core2.
+
+The hard lockup is easily triggered by running 'perf test attr'
+repeatedly. Most of the time it gets stuck in a sampling session with
+small periods.
+
+ # perf test attr -vv
+ 14: struct perf_event_attr setup :
+ --- start ---
+ ...
+ 'PERF_TEST_ATTR=/tmp/tmpuEKz3B /usr/bin/perf record -o /tmp/tmpuEKz3B/perf.data -c 123 kill >/dev/null 2>&1' ret 1
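+
+The only functional effect is a smaller PEBS buffer on PEBS-v0 parts; a
+small sketch of the resulting capacity (assuming the 64 KiB default
+PEBS_BUFFER_SIZE and the 144-byte pebs_record_core layout used by this
+kernel):
+
+  #include <stdio.h>
+
+  #define PAGE_SIZE            4096
+  #define PEBS_BUFFER_SIZE     (64 * 1024)  /* assumed default          */
+  #define PEBS_V0_RECORD_SIZE  144          /* sizeof(pebs_record_core) */
+
+  int main(void)
+  {
+      printf("default buffer: %d records, Core2 workaround: %d records\n",
+             PEBS_BUFFER_SIZE / PEBS_V0_RECORD_SIZE,
+             PAGE_SIZE / PEBS_V0_RECORD_SIZE);
+      return 0;
+  }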
+
+Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
+Signed-off-by: Jiri Olsa <jolsa@kernel.org>
+Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
+Reviewed-by: Andi Kleen <ak@linux.intel.com>
+Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
+Cc: Jiri Olsa <jolsa@redhat.com>
+Cc: Kan Liang <kan.liang@intel.com>
+Cc: Linus Torvalds <torvalds@linux-foundation.org>
+Cc: Peter Zijlstra <peterz@infradead.org>
+Cc: Stephane Eranian <eranian@google.com>
+Cc: Thomas Gleixner <tglx@linutronix.de>
+Cc: Vince Weaver <vincent.weaver@maine.edu>
+Cc: Wang Nan <wangnan0@huawei.com>
+Link: http://lkml.kernel.org/r/20160301190352.GA8355@krava.redhat.com
+Signed-off-by: Ingo Molnar <mingo@kernel.org>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ arch/x86/kernel/cpu/perf_event.h | 1 +
+ arch/x86/kernel/cpu/perf_event_intel_ds.c | 13 +++++++++++--
+ 2 files changed, 12 insertions(+), 2 deletions(-)
+
+--- a/arch/x86/kernel/cpu/perf_event.h
++++ b/arch/x86/kernel/cpu/perf_event.h
+@@ -586,6 +586,7 @@ struct x86_pmu {
+ pebs_broken :1,
+ pebs_prec_dist :1;
+ int pebs_record_size;
++ int pebs_buffer_size;
+ void (*drain_pebs)(struct pt_regs *regs);
+ struct event_constraint *pebs_constraints;
+ void (*pebs_aliases)(struct perf_event *event);
+--- a/arch/x86/kernel/cpu/perf_event_intel_ds.c
++++ b/arch/x86/kernel/cpu/perf_event_intel_ds.c
+@@ -269,7 +269,7 @@ static int alloc_pebs_buffer(int cpu)
+ if (!x86_pmu.pebs)
+ return 0;
+
+- buffer = kzalloc_node(PEBS_BUFFER_SIZE, GFP_KERNEL, node);
++ buffer = kzalloc_node(x86_pmu.pebs_buffer_size, GFP_KERNEL, node);
+ if (unlikely(!buffer))
+ return -ENOMEM;
+
+@@ -286,7 +286,7 @@ static int alloc_pebs_buffer(int cpu)
+ per_cpu(insn_buffer, cpu) = ibuffer;
+ }
+
+- max = PEBS_BUFFER_SIZE / x86_pmu.pebs_record_size;
++ max = x86_pmu.pebs_buffer_size / x86_pmu.pebs_record_size;
+
+ ds->pebs_buffer_base = (u64)(unsigned long)buffer;
+ ds->pebs_index = ds->pebs_buffer_base;
+@@ -1319,6 +1319,7 @@ void __init intel_ds_init(void)
+
+ x86_pmu.bts = boot_cpu_has(X86_FEATURE_BTS);
+ x86_pmu.pebs = boot_cpu_has(X86_FEATURE_PEBS);
++ x86_pmu.pebs_buffer_size = PEBS_BUFFER_SIZE;
+ if (x86_pmu.pebs) {
+ char pebs_type = x86_pmu.intel_cap.pebs_trap ? '+' : '-';
+ int format = x86_pmu.intel_cap.pebs_format;
+@@ -1327,6 +1328,14 @@ void __init intel_ds_init(void)
+ case 0:
+ printk(KERN_CONT "PEBS fmt0%c, ", pebs_type);
+ x86_pmu.pebs_record_size = sizeof(struct pebs_record_core);
++ /*
++ * Using >PAGE_SIZE buffers makes the WRMSR to
++ * PERF_GLOBAL_CTRL in intel_pmu_enable_all()
++ * mysteriously hang on Core2.
++ *
++ * As a workaround, we don't do this.
++ */
++ x86_pmu.pebs_buffer_size = PAGE_SIZE;
+ x86_pmu.drain_pebs = intel_pmu_drain_pebs_core;
+ break;
+
--- /dev/null
+From 5e3f4cbd906c178510dccfed1131b007c96255ff Mon Sep 17 00:00:00 2001
+From: Andi Kleen <ak@linux.intel.com>
+Date: Tue, 1 Mar 2016 14:25:24 -0800
+Subject: perf/x86/intel: Fix PEBS data source interpretation on Nehalem/Westmere
+
+From: Andi Kleen <ak@linux.intel.com>
+
+commit 5e3f4cbd906c178510dccfed1131b007c96255ff upstream.
+
+Jiri reported some time ago that some entries in the PEBS data source
+table in perf do not agree with the SDM. We investigated and found that
+the bit encodings changed with Sandy Bridge, but the SDM was not
+updated.
+
+perf already implements the bits correctly for Sandy Bridge and later.
+This patch fixes them up for Nehalem and Westmere.
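+
+The table entries are built from the perf_event.h memory data source
+encoding. As a user-space sketch (P() and OP_LH are repeated here the
+way the kernel file defines them, the rest comes from the UAPI header),
+entry 0x05 after this change decodes as a load that hit in L3 with a
+clean snoop:
+
+  #include <stdio.h>
+  #include <linux/perf_event.h>
+
+  #define P(a, b)  PERF_MEM_S(a, b)
+  #define OP_LH    (P(OP, LOAD) | P(LVL, HIT))
+
+  int main(void)
+  {
+      /* pebs_data_source[0x05] on NHM/WSM after the patch */
+      unsigned long long dse = OP_LH | P(LVL, L3) | P(SNOOP, HIT);
+
+      printf("data_src for encoding 0x05: %#llx\n", dse);
+      return 0;
+  }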
+
+Signed-off-by: Andi Kleen <ak@linux.intel.com>
+Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
+Cc: Linus Torvalds <torvalds@linux-foundation.org>
+Cc: Peter Zijlstra <peterz@infradead.org>
+Cc: Thomas Gleixner <tglx@linutronix.de>
+Cc: jolsa@kernel.org
+Link: http://lkml.kernel.org/r/1456871124-15985-1-git-send-email-andi@firstfloor.org
+Signed-off-by: Ingo Molnar <mingo@kernel.org>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ arch/x86/kernel/cpu/perf_event.h | 2 ++
+ arch/x86/kernel/cpu/perf_event_intel.c | 2 ++
+ arch/x86/kernel/cpu/perf_event_intel_ds.c | 11 ++++++++++-
+ 3 files changed, 14 insertions(+), 1 deletion(-)
+
+--- a/arch/x86/kernel/cpu/perf_event.h
++++ b/arch/x86/kernel/cpu/perf_event.h
+@@ -905,6 +905,8 @@ void intel_pmu_lbr_init_skl(void);
+
+ void intel_pmu_lbr_init_knl(void);
+
++void intel_pmu_pebs_data_source_nhm(void);
++
+ int intel_pmu_setup_lbr_filter(struct perf_event *event);
+
+ void intel_pt_interrupt(void);
+--- a/arch/x86/kernel/cpu/perf_event_intel.c
++++ b/arch/x86/kernel/cpu/perf_event_intel.c
+@@ -3417,6 +3417,7 @@ __init int intel_pmu_init(void)
+ intel_perfmon_event_map[PERF_COUNT_HW_STALLED_CYCLES_BACKEND] =
+ X86_CONFIG(.event=0xb1, .umask=0x3f, .inv=1, .cmask=1);
+
++ intel_pmu_pebs_data_source_nhm();
+ x86_add_quirk(intel_nehalem_quirk);
+
+ pr_cont("Nehalem events, ");
+@@ -3480,6 +3481,7 @@ __init int intel_pmu_init(void)
+ intel_perfmon_event_map[PERF_COUNT_HW_STALLED_CYCLES_BACKEND] =
+ X86_CONFIG(.event=0xb1, .umask=0x3f, .inv=1, .cmask=1);
+
++ intel_pmu_pebs_data_source_nhm();
+ pr_cont("Westmere events, ");
+ break;
+
+--- a/arch/x86/kernel/cpu/perf_event_intel_ds.c
++++ b/arch/x86/kernel/cpu/perf_event_intel_ds.c
+@@ -51,7 +51,8 @@ union intel_x86_pebs_dse {
+ #define OP_LH (P(OP, LOAD) | P(LVL, HIT))
+ #define SNOOP_NONE_MISS (P(SNOOP, NONE) | P(SNOOP, MISS))
+
+-static const u64 pebs_data_source[] = {
++/* Version for Sandy Bridge and later */
++static u64 pebs_data_source[] = {
+ P(OP, LOAD) | P(LVL, MISS) | P(LVL, L3) | P(SNOOP, NA),/* 0x00:ukn L3 */
+ OP_LH | P(LVL, L1) | P(SNOOP, NONE), /* 0x01: L1 local */
+ OP_LH | P(LVL, LFB) | P(SNOOP, NONE), /* 0x02: LFB hit */
+@@ -70,6 +71,14 @@ static const u64 pebs_data_source[] = {
+ OP_LH | P(LVL, UNC) | P(SNOOP, NONE), /* 0x0f: uncached */
+ };
+
++/* Patch up minor differences in the bits */
++void __init intel_pmu_pebs_data_source_nhm(void)
++{
++ pebs_data_source[0x05] = OP_LH | P(LVL, L3) | P(SNOOP, HIT);
++ pebs_data_source[0x06] = OP_LH | P(LVL, L3) | P(SNOOP, HITM);
++ pebs_data_source[0x07] = OP_LH | P(LVL, L3) | P(SNOOP, HITM);
++}
++
+ static u64 precise_store_data(u64 status)
+ {
+ union intel_x86_pebs_dse dse;
pm-sleep-clear-pm_suspend_global_flags-upon-hibernate.patch
scsi_common-do-not-clobber-fixed-sense-information.patch
sched-cputime-fix-steal-time-accounting-vs.-cpu-hotplug.patch
+0001-perf-x86-pebs-Add-workaround-for-broken-OVFL-status-.patch
+0002-perf-x86-intel-uncore-Remove-SBOX-support-for-BDX-DE.patch
+0003-perf-x86-intel-Fix-PEBS-warning-by-only-restoring-ac.patch
+0004-perf-x86-intel-Use-PAGE_SIZE-for-PEBS-buffer-size-on.patch
+0005-perf-x86-intel-Fix-PEBS-data-source-interpretation-o.patch