]> git.ipfire.org Git - people/arne_f/kernel.git/log
people/arne_f/kernel.git
8 years agotools lib traceevent: Support %ps/%pS
Scott Wood [Mon, 31 Aug 2015 21:16:37 +0000 (16:16 -0500)] 
tools lib traceevent: Support %ps/%pS

Commits such as 65dd297ac25565 ("xfs: %pF is only for function
pointers") caused a regression because pretty_print() didn't support
%ps/%pS.  The current %pf/%pF implementation in pretty_print() is what
%ps/%pS is supposed to do, so use the same code for %ps/%pS.

Addressing the incorrect %pf/%pF implementation is beyond the scope of
this patch.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Dave Chinner <david@fromorbit.com>
Link: http://lkml.kernel.org/r/20150831211637.GA12848@home.buserror.net
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoMerge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git...
Ingo Molnar [Thu, 22 Oct 2015 07:33:46 +0000 (09:33 +0200)] 
Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core

Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

User visible changes:

  - Print branch filter state in verbose mode. (Andi Kleen)

  - Fix core dump caused by per-socket/core system-wide stat. (Kan Liang)

  - Update libtraceevent KVM plugin. (Paolo Bonzini)

Infrastructure changes:

  - Add fixdep to 'tools/build' .gitignore. (Yunlong Song)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agoperf annotate: Add debug message for out of bounds sample
Arnaldo Carvalho de Melo [Wed, 21 Oct 2015 18:45:13 +0000 (15:45 -0300)] 
perf annotate: Add debug message for out of bounds sample

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-q0lde9ajs84oi38nlyjcqbwg@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf evsel: Print branch filter state with -vv
Andi Kleen [Tue, 20 Oct 2015 18:46:36 +0000 (11:46 -0700)] 
perf evsel: Print branch filter state with -vv

Add a missing field to the perf_event_attr debug output.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/1445366797-30894-4-git-send-email-andi@firstfloor.org
[ Print it between config2 and sample_regs_user (peterz)]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf cpu_map: Fix core dump caused by per-socket/core system-wide stat
Kan Liang [Fri, 9 Oct 2015 10:59:23 +0000 (06:59 -0400)] 
perf cpu_map: Fix core dump caused by per-socket/core system-wide stat

Perf will core dump if --per-socket/core -a are applied for perf stat.

The root cause is that cpu_map__build_map set refcnt of evlist's cpu_map
to 1.  It should set refcnt for the newly created cpu_map, not evlist's
cpu_map.

Here is the example:

  # perf stat -e cycles --per-socket -a sleep 1

   Performance counter stats for 'system wide':

  S0       36         30,196,257      cycles
  S1       28         15,823,536      cycles

       1.001126828 seconds time elapsed

  *** Error in `./perf': corrupted double-linked list: 0x00000000021f9090 ***
  ======= Backtrace: =========
  /lib64/libc.so.6[0x3002e7bbe7]
  /lib64/libc.so.6[0x3002e7d2b5]
  ./perf(perf_evsel__delete+0x28)[0x485bdd]
  ./perf[0x4800e8]
  ./perf(perf_evlist__delete+0x5e)[0x482cd5]
  ./perf(cmd_stat+0xf25)[0x432328]
  ./perf[0x4768e0]
  ./perf[0x476ad6]
  ./perf[0x476b41]
  ./perf(main+0x1d0)[0x476db2]
  /lib64/libc.so.6(__libc_start_main+0xf5)[0x3002e21b45]
  ./perf[0x4202c5]

Signed-off-by: Kan Liang <kan.liang@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1444388363-35936-1-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agotools lib traceevent: update KVM plugin
Paolo Bonzini [Thu, 1 Oct 2015 10:28:11 +0000 (12:28 +0200)] 
tools lib traceevent: update KVM plugin

The format of the role word has changed through the years and the plugin
was never updated; some VMX exit reasons were missing too.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: http://lkml.kernel.org/r/1443695293-31127-1-git-send-email-pbonzini@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf build: Add fixdep to .gitignore
Yunlong Song [Thu, 15 Oct 2015 08:51:56 +0000 (16:51 +0800)] 
perf build: Add fixdep to .gitignore

Commit 7c422f5572667fef0db38d2046ecce69dcf0afc8 ("tools build: Build
fixdep helper from perf and basic libs") dynamically creates fixdep
during the perf building. Add it to .gitignore.

Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: 7c422f557266 ("tools build: Build fixdep helper from perf and basic libs")
Link: http://lkml.kernel.org/r/1444899116-8220-1-git-send-email-yunlong.song@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf record: Add ability to sample call branches
Stephane Eranian [Tue, 13 Oct 2015 07:09:11 +0000 (09:09 +0200)] 
perf record: Add ability to sample call branches

This patch add a new branch type sampling filter to perf record.
It is named 'call' and maps to PERF_SAMPLE_BRANCH_CALL. It samples
direct call branches only, unlike 'any_call' which includes indirect
calls as well.

 $ perf record -j call -e cycles .....

The man page is updated accordingly.

Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: khandual@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/1444720151-10275-5-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agoperf/powerpc: Add support for PERF_SAMPLE_BRANCH_CALL
Stephane Eranian [Tue, 13 Oct 2015 07:09:10 +0000 (09:09 +0200)] 
perf/powerpc: Add support for PERF_SAMPLE_BRANCH_CALL

The patch catches PERF_SAMPLE_BRANCH_CALL because it is not clear whether
this is actually supported by the hardware.

Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: khandual@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/1444720151-10275-4-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agoperf/x86: Add support for PERF_SAMPLE_BRANCH_CALL
Stephane Eranian [Tue, 13 Oct 2015 07:09:09 +0000 (09:09 +0200)] 
perf/x86: Add support for PERF_SAMPLE_BRANCH_CALL

This patch enables the suport for the PERF_SAMPLE_BRANCH_CALL
for Intel x86 processors. When the processor support LBR filtering
this the selection is done in hardware. Otherwise, the filter is
applied by software. Note that we chose to include zero length calls
because they also represent calls.

Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: khandual@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/1444720151-10275-3-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agoperf: Add PERF_SAMPLE_BRANCH_CALL
Stephane Eranian [Tue, 13 Oct 2015 07:09:08 +0000 (09:09 +0200)] 
perf: Add PERF_SAMPLE_BRANCH_CALL

Add a new branch sample type to cover only call branches (function calls).
The current ANY_CALL included direct, indirect calls and far jumps.

We want to be able to differentiate indirect from direct calls. Therefore
we introduce PERF_SAMPLE_BRANCH_CALL. The implementation is up to each
architecture.

Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: khandual@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/1444720151-10275-2-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agoperf/x86: Fix time_shift in perf_event_mmap_page
Adrian Hunter [Fri, 16 Oct 2015 13:24:05 +0000 (16:24 +0300)] 
perf/x86: Fix time_shift in perf_event_mmap_page

Commit:

  b20112edeadf ("perf/x86: Improve accuracy of perf/sched clock")

allowed the time_shift value in perf_event_mmap_page to be as much
as 32.  Unfortunately the documented algorithms for using time_shift
have it shifting an integer, whereas to work correctly with the value
32, the type must be u64.

In the case of perf tools, Intel PT decodes correctly but the timestamps
that are output (for example by perf script) have lost 32-bits of
granularity so they look like they are not changing at all.

Fix by limiting the shift to 31 and adjusting the multiplier accordingly.

Also update the documentation of perf_event_mmap_page so that new code
based on it will be more future-proof.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Fixes: b20112edeadf ("perf/x86: Improve accuracy of perf/sched clock")
Link: http://lkml.kernel.org/r/1445001845-13688-2-git-send-email-adrian.hunter@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agoMerge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git...
Ingo Molnar [Tue, 20 Oct 2015 07:31:22 +0000 (09:31 +0200)] 
Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core

Pull perf/core improvements and fixes:

User visible changes:

  - 'perf bench mem' now prefaults unconditionally, no sense in
    providing modes where page faults are measured. (Ingo Molnar)

  - Harmonize -l/--nr_loops accross 'perf bench'. (Ingo Molnar)

  - Various 'perf bench' consistency improvements. (Ingo Molnar)

  - Suppress libtraceevent warnings in non-verbose 'perf test' mode.
    (Namhyung Kim)

  - Move some tracepoint event test error messages to the verbose mode
    of 'perf test'. (Namhyung Kim)

  - Make 'perf help' usage message consistent with other tools. (Yunlong Song)

Build fixes:

  - Fix 'perf bench' build with gcc 4.4.7. (Arnaldo Carvalho de Melo)

Infrastructure changes:

  - 'perf stat' prep work for the 'perf stat scripting' patchkit. (Jiri Olsa)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agoperf bench: Use named initializers in the trailer too
Arnaldo Carvalho de Melo [Mon, 19 Oct 2015 21:17:25 +0000 (18:17 -0300)] 
perf bench: Use named initializers in the trailer too

To avoid this splat with gcc 4.4.7:

  cc1: warnings being treated as errors
  bench/mem-functions.c:273: error: missing initializer
  bench/mem-functions.c:273: error: (near initialization for ‘memcpy_functions[4].desc’)
  bench/mem-functions.c:366: error: missing initializer
  bench/mem-functions.c:366: error: (near initialization for ‘memset_functions[4].desc’)

Cc: David Ahern <dsahern@gmail.com>
Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/n/tip-0s8o6tgw1pdwvdv02llb9tkd@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf script: Check output fields only for samples
Jiri Olsa [Fri, 16 Oct 2015 10:41:25 +0000 (12:41 +0200)] 
perf script: Check output fields only for samples

There's no need to check sampling output fields for events without
perf_event_attr::sample_type field set.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Kan Liang <kan.liang@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1444992092-17897-51-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf cpu_map: Add data arg to cpu_map__build_map callback
Jiri Olsa [Fri, 16 Oct 2015 10:41:15 +0000 (12:41 +0200)] 
perf cpu_map: Add data arg to cpu_map__build_map callback

Adding data arg to cpu_map__build_map callback, so we could pass data
along to the callback. It'll be needed in following patches to retrieve
topology info from perf.data.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Kan Liang <kan.liang@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1444992092-17897-41-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf cpu_map: Make cpu_map__build_map global
Jiri Olsa [Fri, 16 Oct 2015 10:41:14 +0000 (12:41 +0200)] 
perf cpu_map: Make cpu_map__build_map global

We'll need to call it from perf stat in the stat_script patchkit

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Kan Liang <kan.liang@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1444992092-17897-40-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf stat: Add AGGR_UNSET mode
Jiri Olsa [Fri, 16 Oct 2015 10:41:04 +0000 (12:41 +0200)] 
perf stat: Add AGGR_UNSET mode

Adding AGGR_UNSET mode, so we could distinguish unset aggr_mode in
following patches.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Kan Liang <kan.liang@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1444992092-17897-30-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf stat: Rename perf_stat struct into perf_stat_evsel
Jiri Olsa [Fri, 16 Oct 2015 10:41:03 +0000 (12:41 +0200)] 
perf stat: Rename perf_stat struct into perf_stat_evsel

It's used as the perf_evsel::priv data, so the name suits better. Also
we'll need the perf_stat name free for more generic struct.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Kan Liang <kan.liang@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1444992092-17897-29-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf help: Change 'usage' to 'Usage' for consistency
Yunlong Song [Thu, 15 Oct 2015 07:39:51 +0000 (15:39 +0800)] 
perf help: Change 'usage' to 'Usage' for consistency

Capitalize 'usage' to make it consistent with all the other 'Usage' in
the codes, e.g., usage_builtin.

Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ramkumar Ramachandra <artagnon@gmail.com>
Cc: Sriram Raghunathan <sriram.r@nokia.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1444894792-2338-3-git-send-email-yunlong.song@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf bench: Run benchmarks, don't test them
Ingo Molnar [Mon, 19 Oct 2015 08:04:30 +0000 (10:04 +0200)] 
perf bench: Run benchmarks, don't test them

So right now we output this text:

        memcpy: Benchmark for memcpy() functions
        memset: Benchmark for memset() functions
           all: Test all memory access benchmarks

But the right verb to use with benchmarks is to 'run' them, not 'test'
them.

So change this (and all similar texts) to:

        memcpy: Benchmark for memcpy() functions
        memset: Benchmark for memset() functions
           all: Run all memory access benchmarks

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1445241870-24854-15-git-send-email-mingo@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf bench mem: Rename 'routine' to 'function'
Ingo Molnar [Mon, 19 Oct 2015 08:04:29 +0000 (10:04 +0200)] 
perf bench mem: Rename 'routine' to 'function'

So right now there's a somewhat inconsistent mess of the benchmarking
code and options sometimes calling benchmarked functions 'functions',
sometimes calling them 'routines'.

Name them 'functions' consistently.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1445241870-24854-14-git-send-email-mingo@kernel.org
[ Updated perf-bench man page, pointed out by David Ahern ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf bench: Harmonize all the -l/--nr_loops options
Ingo Molnar [Mon, 19 Oct 2015 08:04:28 +0000 (10:04 +0200)] 
perf bench: Harmonize all the -l/--nr_loops options

We have three benchmarking subsystems that specify some sort of 'number
of loops' parameter - but all of them do it inconsistently:

 numa:              -l/--nr_loops
 sched messaging:   -l/--loops
 mem memset/memcpy: -i/--iterations

Harmonize them to -l/--nr_loops by picking the numa variant - which is
also the most likely one to have existing scripting which we don't want
to break.

Plus improve the parameter help texts to indicate the default value for
the nr_loops variable to keep users from guessing ...

Also propagate the naming to internal variables.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1445241870-24854-13-git-send-email-mingo@kernel.org
[ Let the harmonisation reach the perf-bench man page as well ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf bench mem: Reorganize the code a bit
Ingo Molnar [Mon, 19 Oct 2015 08:04:27 +0000 (10:04 +0200)] 
perf bench mem: Reorganize the code a bit

Reorder functions a bit, so that we synchronize the layout of the
memcpy() and memset() portions of the code.

This improves the code, especially after we'll add an strlcpy() variant
as well.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1445241870-24854-12-git-send-email-mingo@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf bench mem: Improve user visible strings
Ingo Molnar [Mon, 19 Oct 2015 08:04:26 +0000 (10:04 +0200)] 
perf bench mem: Improve user visible strings

 - fix various typos in user visible output strings
 - make the output consistent (wrt. capitalization and spelling)
 - offer the list of routines to benchmark on '-r help'.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1445241870-24854-11-git-send-email-mingo@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf bench mem: Fix 'length' vs. 'size' naming confusion
Ingo Molnar [Mon, 19 Oct 2015 08:04:25 +0000 (10:04 +0200)] 
perf bench mem: Fix 'length' vs. 'size' naming confusion

So 'perf bench mem memcpy/memset' consistently uses 'len' and 'length'
for buffer sizes - while it's really a memory buffer size. (strings have
length.)

Rename all affected variables.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1445241870-24854-10-git-send-email-mingo@kernel.org
[ Update perf-bench man page ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf bench mem: Rename 'routine' to 'routine_str'
Ingo Molnar [Mon, 19 Oct 2015 08:04:24 +0000 (10:04 +0200)] 
perf bench mem: Rename 'routine' to 'routine_str'

So bench/mem-functions.c has a 'routine' name for the routines parameter
string, but a 'length_str' name for the length parameter string.

We also have another entity named 'routine': 'struct routine'.

This is inconsistent and confusing: rename 'routine' to 'routine_str'.

Also fix typos in the --routine help text.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1445241870-24854-9-git-send-email-mingo@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf bench mem: Change 'cycle' to 'cycles'
Ingo Molnar [Mon, 19 Oct 2015 08:04:23 +0000 (10:04 +0200)] 
perf bench mem: Change 'cycle' to 'cycles'

So 'perf bench mem memset/memcpy' has a CPU cycles measurement method,
but calls it 'cycle' (singular) throughout the code, which makes it
harder to read.

Rename all related functions, variables and options to a plural 'cycles'
nomenclature.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1445241870-24854-8-git-send-email-mingo@kernel.org
[ s/--cycle/--cycles/g in perf-bench man page ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf bench: List output formatting options on 'perf bench -h'
Ingo Molnar [Mon, 19 Oct 2015 08:04:22 +0000 (10:04 +0200)] 
perf bench: List output formatting options on 'perf bench -h'

So 'perf bench -h' is not very helpful when printing the help line
about the output formatting options:

    -f, --format <default>
                              Specify format style

There are two output format styles, 'default' and 'simple', so improve
the help text to:

    -f, --format <default|simple>
                              Specify the output formatting style

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1445241870-24854-7-git-send-email-mingo@kernel.org
[ Removed leftovers from the mem-functions.c rename ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf bench: Remove the prefaulting complication from 'perf bench mem mem*'
Ingo Molnar [Mon, 19 Oct 2015 08:04:21 +0000 (10:04 +0200)] 
perf bench: Remove the prefaulting complication from 'perf bench mem mem*'

So 'perf bench mem memcpy/memset' has elaborate code to measure
memcpy()/memset() performance both with freshly allocated buffers (which
includes initial page fault overhead) and with preallocated buffers.

But the thing is, the resulting bandwidth results are mostly
meaningless, because page faults dominate so much of the cost.

It might make sense to measure cache cold vs. cache hot performance, but
the code does not do this.

So remove this complication, and always prefault the ranges before using
them.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1445241870-24854-6-git-send-email-mingo@kernel.org
[ Remove --no-prefault, --only-prefault from docs, noticed by David Ahern ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf bench: Rename 'mem-memcpy.c' => 'mem-functions.c'
Ingo Molnar [Mon, 19 Oct 2015 08:04:20 +0000 (10:04 +0200)] 
perf bench: Rename 'mem-memcpy.c' => 'mem-functions.c'

So mem-memcpy.c started out as a simple memcpy() benchmark, then it grew
memset() functionality and now I plan to add string copy benchmarks as
well.

This makes the file name a misnomer: rename it to the more generic
mem-functions.c name.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1445241870-24854-5-git-send-email-mingo@kernel.org
[ The "rename" was introducing __unused, wasn't removing the old file,
  and didn't update tools/perf/bench/Build, fix it ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf bench: Eliminate unused argument from bench_mem_common()
Ingo Molnar [Mon, 19 Oct 2015 08:04:19 +0000 (10:04 +0200)] 
perf bench: Eliminate unused argument from bench_mem_common()

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1445241870-24854-4-git-send-email-mingo@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf bench: Default to all routines in 'perf bench mem'
Ingo Molnar [Mon, 19 Oct 2015 08:04:18 +0000 (10:04 +0200)] 
perf bench: Default to all routines in 'perf bench mem'

So few people know that the --routine option to 'perf bench memcpy/memset'
exists, and would not know that it's capable of testing the kernel's
memcpy/memset implementations.

Furthermore, 'perf bench mem all' will not run all routines:

vega:~> perf bench mem all
# Running mem/memcpy benchmark...
Routine default (Default memcpy() provided by glibc)
# Copying 1MB Bytes ...

     894.454383 MB/Sec
       3.844734 GB/Sec (with prefault)

# Running mem/memset benchmark...
Routine default (Default memset() provided by glibc)
# Copying 1MB Bytes ...

       1.220703 GB/Sec
       9.042245 GB/Sec (with prefault)

Because misleadingly the 'all' refers to 'all sub-benchmarks', not 'all
sub-benchmarks and routines'.

Fix all this by making the memcpy/memset routine to default to 'all',
which results in all the benchmarks being run:

triton:~> perf bench mem all
# Running mem/memcpy benchmark...
Routine default (Default memcpy() provided by glibc)
# Copying 1MB Bytes ...

       1.448906 GB/Sec
       4.957170 GB/Sec (with prefault)
Routine x86-64-unrolled (unrolled memcpy() in arch/x86/lib/memcpy_64.S)
# Copying 1MB Bytes ...

       1.614153 GB/Sec
       4.379204 GB/Sec (with prefault)
Routine x86-64-movsq (movsq-based memcpy() in arch/x86/lib/memcpy_64.S)
# Copying 1MB Bytes ...

       1.570036 GB/Sec
       4.264465 GB/Sec (with prefault)
Routine x86-64-movsb (movsb-based memcpy() in arch/x86/lib/memcpy_64.S)
# Copying 1MB Bytes ...

       1.788576 GB/Sec
       6.554111 GB/Sec (with prefault)

# Running mem/memset benchmark...
Routine default (Default memset() provided by glibc)
# Copying 1MB Bytes ...

       2.082223 GB/Sec
       9.126752 GB/Sec (with prefault)
Routine x86-64-unrolled (unrolled memset() in arch/x86/lib/memset_64.S)
# Copying 1MB Bytes ...

       5.710892 GB/Sec
       8.346688 GB/Sec (with prefault)
Routine x86-64-stosq (movsq-based memset() in arch/x86/lib/memset_64.S)
# Copying 1MB Bytes ...

       9.765625 GB/Sec
      12.520032 GB/Sec (with prefault)
Routine x86-64-stosb (movsb-based memset() in arch/x86/lib/memset_64.S)
# Copying 1MB Bytes ...

       9.668936 GB/Sec
      12.682630 GB/Sec (with prefault)

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1445241870-24854-3-git-send-email-mingo@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf bench: Improve the 'perf bench mem memcpy' code readability
Ingo Molnar [Mon, 19 Oct 2015 08:04:17 +0000 (10:04 +0200)] 
perf bench: Improve the 'perf bench mem memcpy' code readability

 - improve the readability of initializations
 - fix unnecessary double negations
 - fix ugly line breaks
 - fix other small details

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1445241870-24854-2-git-send-email-mingo@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf test: Suppress libtraceevent warnings
Namhyung Kim [Mon, 19 Oct 2015 15:23:49 +0000 (00:23 +0900)] 
perf test: Suppress libtraceevent warnings

Currently libtraceevent emits warning on unsupported event formats.
However it'd be better to see them only -v option is given.  To do that,
it needs to override the warning() function which is used in the
libtracevent.  Thus add set_warning_routine() same as set_die_routine()
and check the verbose flag in our warning routine.

Before:
  # perf test 5
   5: parse events tests                                       :
    Warning: [kvmmmu:kvm_mmu_get_page] bad op token {
    Warning: [kvmmmu:kvm_mmu_sync_page] bad op token {
    Warning: [kvmmmu:kvm_mmu_unsync_page] bad op token {
    Warning: [kvmmmu:kvm_mmu_prepare_zap_page] bad op token {
    Warning: [kvmmmu:fast_page_fault] function is_writable_pte not defined
    ...
   Ok

After:
  # perf test 5
   5: parse events tests                                       : Ok

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1445268229-1601-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf test: Silence tracepoint event failures
Namhyung Kim [Mon, 19 Oct 2015 15:23:48 +0000 (00:23 +0900)] 
perf test: Silence tracepoint event failures

Currently, when 'perf test' is run by a normal user, it'll fail to
access tracepoint events.  The output becomes somewhat messy because it
tries to be nice with long error messages and hints.

IMHO this is not needed for 'perf test' by default and AFAIK 'perf test'
uses pr_debug() rather than pr_err() for such messages so that one can
use -v option to see further details on failed testcases if needed.

Before:
  $ perf test
   1: vmlinux symtab matches kallsyms                          : FAILED!
   2: detect openat syscall event                              :Error:
  No permissions to read
  /sys/kernel/debug/tracing/events/syscalls/sys_enter_openat
  Hint: Try 'sudo mount -o remount,mode=755 /sys/kernel/debug/tracing'
  FAILED!
   3: detect openat syscall event on all cpus                  :Error:
  No permissions to read
  /sys/kernel/debug/tracing/events/syscalls/sys_enter_openat
  Hint: Try 'sudo mount -o remount,mode=755 /sys/kernel/debug/tracing'
  FAILED!
   ...

After:
  $ perf test
   1: vmlinux symtab matches kallsyms                          : FAILED!
   2: detect openat syscall event                              : FAILED!
   3: detect openat syscall event on all cpus                  : FAILED!
   ...

  $ perf test -v 2
   2: detect openat syscall event                              :
  --- start ---
  test child forked, pid 30575
  Error:     No permissions to read
  /sys/kernel/debug/tracing/events/syscalls/sys_enter_openat
  Hint:  Try 'sudo mount -o remount,mode=755 /sys/kernel/debug/tracing'

  test child finished with -1
  ---- end ----
  detect openat syscall event: FAILED!

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1445268229-1601-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoMerge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git...
Ingo Molnar [Wed, 14 Oct 2015 13:06:33 +0000 (15:06 +0200)] 
Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core

Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

User visible changes:

  - Use the alternative with the most descriptive filename containing
    a vmlinux file for a given build-id, providing a better title line
    for tools such as 'annotate'. (Arnaldo Carvalho de Melo)

  - Remove help messages about previous right and left arrow keybidings, that
    were repurposed for horizontal scrolling. (Arnaldo Carvalho de Melo)

  - Inform how to reset the symbol filter in the hists browser. (top & report)
    (Arnaldo Carvalho de Melo)

  - Add 'm' key for context menu display in the hists browser, that became
    inacessible with the repurposing of the right arrow key for horizontal
    scrolling. (Namhyung Kim)

  - Use debug_frame for callchains if eh_frame is unusable. (Rabin Vicent)

Build fixes:

  - Fix strict-aliasing breakage with gcc 4.4 in the READ_ONCE/WRITE_ONCE code
    adopted from the kernel tree, that builds with -fno-strict-aliasing while
    tools/perf/ uses -Wstrict-aliasing=3. (Jiri Olsa)

  - Fix unw_word_t pointer casts in code using libunwind for callchains,
    fixing the build in at least 32-bit MIPS systems. (Rabin Vicent)

  - Work around cross compile build problems related to fixdep. (Jiri Olsa)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agotools build: Fix cross compile build
Jiri Olsa [Tue, 13 Oct 2015 12:43:58 +0000 (14:43 +0200)] 
tools build: Fix cross compile build

He Kuang the new fixdep tool breaks cross compiling. The reason is it
wouldn't get compiled under host arch, but under cross arch and failed
to run.

We need to add support for host side tools build, meanwhile disabling
fixdep usage for cross arch builds.

Reported-by: He Kuang <hekuang@huawei.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20151013124358.GB9467@krava.brq.redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agotools include: Fix strict-aliasing rules breakage
Jiri Olsa [Tue, 13 Oct 2015 08:52:14 +0000 (10:52 +0200)] 
tools include: Fix strict-aliasing rules breakage

Vinson reported build breakage with gcc 4.4 due to strict-aliasing.

   CC       util/annotate.o
 cc1: warnings being treated as errors
 util/annotate.c: In function ‘disasm__purge’:
 linux-next/tools/include/linux/compiler.h:66: error: dereferencing
 pointer ‘res.41’ does break strict-aliasing rules

The reason is READ_ONCE/WRITE_ONCE code we took from kernel sources.  They
intentionaly break aliasing rules. While this is ok for kernel because it's
built with -fno-strict-aliasing, it breaks perf which is build with
-Wstrict-aliasing=3.

Using extra __may_alias__ type to allow aliasing in this case.

Reported-and-tested-by: Vinson Lee <vlee@twopensource.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Martin Liska <mliska@suse.cz>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Rabin Vincent <rabin@rab.in>
Cc: linux-next@vger.kernel.org
Link: http://lkml.kernel.org/r/20151013085214.GB2705@krava.brq.redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf hists browser: Add 'm' key for context menu display
Namhyung Kim [Tue, 13 Oct 2015 00:02:01 +0000 (09:02 +0900)] 
perf hists browser: Add 'm' key for context menu display

With horizontal scrolling, the left/right arrow keys are used to scroll
columns and ENTER/ESC keys are used to enter/exit menu.  However if
callchain is recorded, the ENTER key is used to toggle callchain
expansion so there's no way to display menu.  Use 'm' key to display the
menu for this case.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1444694521-8136-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf callchains: Fix unw_word_t pointer casts
Rabin Vincent [Sun, 27 Sep 2015 18:37:55 +0000 (20:37 +0200)] 
perf callchains: Fix unw_word_t pointer casts

unw_word_t is uint64_t even on 32-bit MIPS.  Cast it to uintptr_t before
the cast to void *p to get rid of the following errors:

  util/unwind-libunwind.c: In function 'access_mem':
  util/unwind-libunwind.c:464:4: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
  util/unwind-libunwind.c:475:2: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
  cc1: all warnings being treated as errors
  make[3]: *** [util/unwind-libunwind.o] Error 1

Signed-off-by: Rabin Vincent <rabin.vincent@axis.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Rabin Vincent <rabinv@axis.com>
Link: http://lkml.kernel.org/r/1443379079-29133-1-git-send-email-rabin.vincent@axis.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf callchain: Use debug_frame if eh_frame is unusable
Rabin Vincent [Sun, 27 Sep 2015 18:37:57 +0000 (20:37 +0200)] 
perf callchain: Use debug_frame if eh_frame is unusable

When NO_LIBUNWIND_DEBUG_FRAME=0, use the .debug_frame if the .eh_frame
doesn't contain the approprate unwind tables.

Signed-off-by: Rabin Vincent <rabin.vincent@axis.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Rabin Vincent <rabinv@axis.com>
Link: http://lkml.kernel.org/r/1443379079-29133-3-git-send-email-rabin.vincent@axis.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf hists browser: Inform how to reset the symbol filter
Arnaldo Carvalho de Melo [Mon, 12 Oct 2015 17:02:29 +0000 (14:02 -0300)] 
perf hists browser: Inform how to reset the symbol filter

When in the hists browser, i.e. in 'perf report' or in 'perf top', it is
possible to press '/' and specify a substring to filter by symbol name.

Clarify how to remove a filter by making the prompt be:

   Please enter the name of symbol you want to see.
   To remove the filter later, press / + ENTER

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-vbq2b0kyufwy6p0ctkfswcoe@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf ui browsers: Remove help messages about use of right and arrow keys
Arnaldo Carvalho de Melo [Mon, 12 Oct 2015 16:56:50 +0000 (13:56 -0300)] 
perf ui browsers: Remove help messages about use of right and arrow keys

They were repurposed for horizontal scrolling, so use just ENTER/ESC in
the help messages.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: c6c3c02dea40 ("perf hists browser: Implement horizontal scrolling")
Link: http://lkml.kernel.org/n/tip-n5ar4qg8fs12ax4vhr3rxhxj@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf symbols: Try the .debug/ DSO cache as a last resort
Arnaldo Carvalho de Melo [Sun, 11 Oct 2015 20:17:24 +0000 (17:17 -0300)] 
perf symbols: Try the .debug/ DSO cache as a last resort

Not as the first attempt at finding a vmlinux for the running kernel,
this way we get a more informative filename to present in tools, it will
check that the build-id is the same as the one previously loaded in the
DSO in dso->build_id, reading from /sys/kernel/notes, for instance.

E.g. in the annotation TUI, going from 'perf top', for the scsi_sg_alloc
kernel function, in the first line:

Before:

scsi_sg_alloc  /root/.debug/.build-id/28/2777c262e6b3c0451375163c9a81c893218ab1

After:

scsi_sg_alloc  /lib/modules/4.3.0-rc1+/build/vmlinux

And:

  # ls -la /root/.debug/.build-id/28/2777c262e6b3c0451375163c9a81c893218ab1
lrwxrwxrwx. 1 root root 81 Sep 22 16:11 /root/.debug/.build-id/28/2777c262e6b3c0451375163c9a81c893218ab1 -> ../../home/git/build/v4.3.0-rc1+/vmlinux/282777c262e6b3c0451375163c9a81c893218ab1
  # file ~/.debug/home/git/build/v4.3.0-rc1+/vmlinux/282777c262e6b3c0451375163c9a81c893218ab1
/root/.debug/home/git/build/v4.3.0-rc1+/vmlinux/282777c262e6b3c0451375163c9a81c893218ab1: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, BuildID[sha1]=282777c262e6b3c0451375163c9a81c893218ab1, not stripped
  #

The same as:

  # file /lib/modules/4.3.0-rc1+/build/vmlinux
/lib/modules/4.3.0-rc1+/build/vmlinux: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, BuildID[sha1]=282777c262e6b3c0451375163c9a81c893218ab1, not stripped

Furthermore:

  # sha256sum /lib/modules/4.3.0-rc1+/build/vmlinux
  e7a789bbdc61029ec09140c228e1dd651271f38ef0b8416c0b7d5ff727b98be2  /lib/modules/4.3.0-rc1+/build/vmlinux
  # sha256sum ~/.debug/home/git/build/v4.3.0-rc1+/vmlinux/282777c262e6b3c0451375163c9a81c893218ab1
  e7a789bbdc61029ec09140c228e1dd651271f38ef0b8416c0b7d5ff727b98be2  /root/.debug/home/git/build/v4.3.0-rc1+/vmlinux/282777c262e6b3c0451375163c9a81c893218ab1
  [root@zoo new]#

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-9y42ikzq3jisiddoi6f07n8z@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoMerge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git...
Ingo Molnar [Thu, 8 Oct 2015 08:52:44 +0000 (10:52 +0200)] 
Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core

Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

User visible changes:

  - Adding a field via 'perf report -F' that already is enabled makes
    the tool get stuck in a loop, fix it. (Jiri Olsa)

Infrastructure changes:

  - Support PERF_RECORD_SWITCH in the python binding. (Arnaldo Carvalho de Melo)

  - Fix handling read() result using a signed variable, found with Coccinelle.
    (Andrzej Hajda)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agoMerge branch 'perf/urgent' into perf/core, before pulling new changes
Ingo Molnar [Thu, 8 Oct 2015 08:52:18 +0000 (10:52 +0200)] 
Merge branch 'perf/urgent' into perf/core, before pulling new changes

Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agoperf python: Support the PERF_RECORD_SWITCH event
Arnaldo Carvalho de Melo [Tue, 6 Oct 2015 20:46:46 +0000 (17:46 -0300)] 
perf python: Support the PERF_RECORD_SWITCH event

To test it check tools/perf/python/twatch.py, after following the
instructions there to enable context_switch, output looks like:

  [root@zoo linux]# tools/perf/python/twatch.py
  cpu: 1, pid: 31463, tid: 31463 { type: context_switch, next_prev_pid: 31463, next_prev_tid: 31463, switch_out: 0 }
  cpu: 2, pid: 31463, tid: 31496 { type: context_switch, next_prev_pid: 31463, next_prev_tid: 31496, switch_out: 0 }
  cpu: 2, pid: 31463, tid: 31496 { type: context_switch, next_prev_pid: 31463, next_prev_tid: 31496, switch_out: 1 }
  cpu: 3, pid: 31463, tid: 31527 { type: context_switch, next_prev_pid: 31463, next_prev_tid: 31527, switch_out: 0 }
  cpu: 1, pid: 31463, tid: 31463 { type: context_switch, next_prev_pid: 31463, next_prev_tid: 31463, switch_out: 1 }
  cpu: 3, pid: 31463, tid: 31527 { type: context_switch, next_prev_pid: 31463, next_prev_tid: 31527, switch_out: 1 }
  cpu: 1, pid: 31463, tid: 31463 { type: context_switch, next_prev_pid: 31463, next_prev_tid: 31463, switch_out: 0 }
  ^CTraceback (most recent call last):
    File "tools/perf/python/twatch.py", line 67, in <module>
      main(context_switch = 1, thread = 31463)
    File "tools/perf/python/twatch.py", line 40, in main
      evlist.poll(timeout = -1)
  KeyboardInterrupt
  [root@zoo linux]#

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Guy Streeter <streeter@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-1ukistmpamc5z717k80ctcp2@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoMerge tag 'perf-urgent-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git...
Ingo Molnar [Wed, 7 Oct 2015 16:33:10 +0000 (18:33 +0200)] 
Merge tag 'perf-urgent-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent

Pull perf/urgent fix from Arnaldo Carvalho de Melo:

  - Fix build break on (at least) powerpc due to sample_reg_masks, not being
    available for linking. (Sukadev Bhattiprolu)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agoperf tools: Fix build break on powerpc due to sample_reg_masks
Sukadev Bhattiprolu [Thu, 24 Sep 2015 21:53:49 +0000 (17:53 -0400)] 
perf tools: Fix build break on powerpc due to sample_reg_masks

perf_regs.c does not get built on Powerpc as CONFIG_PERF_REGS is false.
So the weak definition for 'sample_regs_masks' doesn't get picked up.

Adding perf_regs.o to util/Build unconditionally, exposes a redefinition
error for 'perf_reg_value()' function (due to the static inline version
in util/perf_regs.h). So use #ifdef HAVE_PERF_REGS_SUPPORT' around that
function.

Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: linuxppc-dev@ozlabs.org
Link: http://lkml.kernel.org/r/20150930182836.GA27858@us.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoMerge tag 'nfs-for-4.3-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Linus Torvalds [Wed, 7 Oct 2015 07:54:22 +0000 (08:54 +0100)] 
Merge tag 'nfs-for-4.3-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client bugfixes from Trond Myklebust:
 "Highlights include:

  Bugfixes:
   - Fix a use-after-free bug in the RPC/RDMA client
   - Fix a write performance regression
   - Fix up page writeback accounting
   - Don't try to reclaim unused state owners
   - Fix a NFSv4 nograce recovery hang
   - reset states to use open_stateid when returning delegation
     voluntarily
   - Fix a tracepoint NULL-pointer dereference"

* tag 'nfs-for-4.3-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  NFS: Fix a tracepoint NULL-pointer dereference
  nfs4: reset states to use open_stateid when returning delegation voluntarily
  NFSv4: Fix a nograce recovery hang
  NFSv4.1: nfs4_opendata_check_deleg needs to handle NFS4_OPEN_CLAIM_DELEG_CUR_FH
  NFSv4: Don't try to reclaim unused state owners
  NFS: Fix a write performance regression
  NFS: Fix up page writeback accounting
  xprtrdma: disconnect and flush cqs before freeing buffers

8 years agoRevert "fs: do not prefault sys_write() user buffer pages"
Linus Torvalds [Wed, 7 Oct 2015 07:32:38 +0000 (08:32 +0100)] 
Revert "fs: do not prefault sys_write() user buffer pages"

This reverts commit 998ef75ddb5709bbea0bf1506cd2717348a3c647.

The commit itself does not appear to be buggy per se, but it is exposing
a bug in ext4 (and Ted thinks ext3 too, but we solved that by getting
rid of it).  It's too late in the release cycle to really worry about
this, even if Dave Hansen has a patch that may actually fix the
underlying ext4 problem.  We can (and should) revisit this for the next
release.

The problem is that moving the prefaulting later now exposes a special
case with partially successful writes that isn't handled correctly.  And
the prefaulting likely isn't normally even that much of a performance
issue - it looks like at least one reason Dave saw this in his
performance tests is that he also ran them on Skylake that now supports
the new SMAP code, which makes the normally very cheap user space
prefaulting noticeably more expensive.

Bisected-and-acked-by: Ted Ts'o <tytso@mit.edu>
Analyzed-and-acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoNFS: Fix a tracepoint NULL-pointer dereference
Anna Schumaker [Mon, 5 Oct 2015 20:43:26 +0000 (16:43 -0400)] 
NFS: Fix a tracepoint NULL-pointer dereference

Running xfstest generic/013 with the tracepoint nfs:nfs4_open_file
enabled produces a NULL-pointer dereference when calculating fileid and
filehandle of the opened file.  Fix this by checking if state is NULL
before trying to use the inode pointer.

Reported-by: Olga Kornievskaia <aglo@umich.edu>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
8 years agoperf tools: Fix handling read result using a signed variable
Andrzej Hajda [Tue, 6 Oct 2015 09:00:17 +0000 (11:00 +0200)] 
perf tools: Fix handling read result using a signed variable

The function can return negative value, assigning it to unsigned
variable can cause memory corruption.

The problem has been detected using proposed semantic patch
scripts/coccinelle/tests/unsigned_lesser_than_zero.cocci [1].

[1]: http://permalink.gmane.org/gmane.linux.kernel/2038576

Signed-off-by: Andrzej Hajda <a.hajda@samsung.com>
Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: kernel-janitors@vger.kernel.org
Link: http://lkml.kernel.org/r/1444122017-16856-1-git-send-email-a.hajda@samsung.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf tools: Use hpp_dimension__add_output to register hpp columns
Jiri Olsa [Tue, 6 Oct 2015 12:25:12 +0000 (14:25 +0200)] 
perf tools: Use hpp_dimension__add_output to register hpp columns

The perf_hpp__init currently does not respect sorting dimensions and the
setup_sorting function could endup queueing same format twice. That
screwed up the perf_hpp__list and got stuck in loop within
perf_hpp__setup_output_field function.

  $ perf report -F +overhead

  0x00000000004c1355 in perf_hpp__is_sort_entry (format=format@entry=0x880440 <perf_hpp.format>) at util/sort.c:1506
  1506    {

     #0  0x00000000004c1355 in perf_hpp__is_sort_entry (format=format@entry=0x880440 <perf_hpp.format>) at util/sort.c:1506
     #1  0x00000000004c139d in perf_hpp__same_sort_entry (a=a@entry=0x880440 <perf_hpp.format>, b=b@entry=0x2bb2fe0) at util/sort.c:1380
     #2  0x00000000004f8d3c in perf_hpp__setup_output_field () at ui/hist.c:554
     #3  0x00000000004c1d1e in setup_sorting () at util/sort.c:1984
     #4  0x000000000042efbf in cmd_report (argc=0, argv=0x7ffea5a0e790, prefix=<optimized out>) at builtin-report.c:874
     #5  0x0000000000476f13 in run_builtin (p=p@entry=0x875628 <commands+168>, argc=argc@entry=3, argv=argv@entry=0x7ffea5a0e790) at perf.c:385
     #6  0x000000000047710b in handle_internal_command (argc=3, argv=0x7ffea5a0e790) at perf.c:445
     #7  0x0000000000477176 in run_argv (argcp=argcp@entry=0x7ffea5a0e5fc, argv=argv@entry=0x7ffea5a0e5f0) at perf.c:489
     #8  0x00000000004773e7 in main (argc=3, argv=0x7ffea5a0e790) at perf.c:606

Using hpp_dimension__add_output function to register the output column.
It will also mark the dimension as taken and omit above stuck.

Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1444134312-29136-4-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf tools: Introduce hpp_dimension__add_output function
Jiri Olsa [Tue, 6 Oct 2015 12:25:11 +0000 (14:25 +0200)] 
perf tools: Introduce hpp_dimension__add_output function

This function will allow to register output column from ui code and
respect taken sort/output dimensions.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1444134312-29136-3-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf tools: Get rid of superfluos call to reset_dimensions
Jiri Olsa [Tue, 6 Oct 2015 12:25:10 +0000 (14:25 +0200)] 
perf tools: Get rid of superfluos call to reset_dimensions

There's no need to call reset_dimensions within __setup_output_field
function. It's already called in its caller setup_sorting right before
perf_hpp__init, which will be changed in following patch to respect
taken dimension.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1444134312-29136-2-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf/x86/intel/uncore: Fix multi-segment problem of perf_event_intel_uncore
Taku Izumi [Thu, 24 Sep 2015 12:10:21 +0000 (21:10 +0900)] 
perf/x86/intel/uncore: Fix multi-segment problem of perf_event_intel_uncore

In multi-segment system, uncore devices may belong to buses whose segment
number is other than 0:

  ....
  0000:ff:10.5 System peripheral: Intel Corporation Xeon E5 v3/Core i7 Scratchpad & Semaphore Registers (rev 03)
  ...
  0001:7f:10.5 System peripheral: Intel Corporation Xeon E5 v3/Core i7 Scratchpad & Semaphore Registers (rev 03)
  ...
  0001:bf:10.5 System peripheral: Intel Corporation Xeon E5 v3/Core i7 Scratchpad & Semaphore Registers (rev 03)
  ...
  0001:ff:10.5 System peripheral: Intel Corporation Xeon E5 v3/Core i7 Scratchpad & Semaphore Registers (rev 03
  ...

In that case, relation of bus number and physical id may be broken
because "uncore_pcibus_to_physid" doesn't take account of PCI segment.
For example, bus 0000:ff and 0001:ff uses the same entry of
"uncore_pcibus_to_physid" array.

This patch fixes this problem by introducing the segment-aware pci2phy_map instead.

Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: acme@kernel.org
Cc: hpa@zytor.com
Link: http://lkml.kernel.org/r/1443096621-4119-1-git-send-email-izumi.taku@jp.fujitsu.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agoperf/x86: Add Intel cstate PMUs support
Kan Liang [Mon, 28 Sep 2015 12:30:04 +0000 (08:30 -0400)] 
perf/x86: Add Intel cstate PMUs support

This patch adds new PMUs to support cstate related free running
(read-only) counters. These counters may be used simultaneously by other
tools, such as turbostat. However, it still make sense to implement them
in perf. Because we can conveniently collect them together with other
events, and allow to use them from tools without special MSR access
code.

These counters include CORE_C*_RESIDENCY and PKG_C*_RESIDENCY.
According to counters' scope and category, two PMUs are registered with
the perf_event core subsystem.

 - 'cstate_core': The counter is available for each physical core. The
                  counters include CORE_C*_RESIDENCY.

 - 'cstate_pkg':  The counter is available for each physical package. The
                  counters include PKG_C*_RESIDENCY.

The events are exposed in sysfs for use by perf stat and other tools.
The files are:

  /sys/devices/cstate_core/events/c*-residency
  /sys/devices/cstate_pkg/events/c*-residency

These events only support system-wide mode counting.
The /sys/devices/cstate_*/cpumask file can be used by tools to figure
out which CPUs to monitor by default.

The PMU type (attr->type) is dynamically allocated and is available from
/sys/devices/core_misc/type and /sys/device/cstate_*/type.

Sampling is not supported.

Here is an example.

 - To caculate the fraction of time when the core is running in C6 state
   CORE_C6_time% = CORE_C6_RESIDENCY / TSC

 # perf stat -x, -e"cstate_core/c6-residency/,msr/tsc/" -C0 -- taskset -c 0 sleep 5

   11838820015,,cstate_core/c6-residency/,5175919658,100.00
   11877130740,,msr/tsc/,5175922010,100.00

 For sleep, 99.7% of time we ran in C6 state.

 # perf stat -x, -e"cstate_core/c6-residency/,msr/tsc/" -C0 -- taskset -c 0 busyloop

   1253316,,cstate_core/c6-residency/,4360969154,100.00
   10012635248,,msr/tsc/,4360972366,100.00

 For busyloop, 0.01% of time we ran in C6 state.

Signed-off-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: acme@kernel.org
Cc: eranian@google.com
Link: http://lkml.kernel.org/r/1443443404-8581-1-git-send-email-kan.liang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agoMerge tag 'for-linus-4.3b-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Tue, 6 Oct 2015 14:05:02 +0000 (15:05 +0100)] 
Merge tag 'for-linus-4.3b-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen bug fixes from David Vrabel:

 - Fix VM save performance regression with x86 PV guests

 - Make kexec work in x86 PVHVM guests (if Xen has the soft-reset ABI)

 - Other minor fixes.

* tag 'for-linus-4.3b-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  x86/xen/p2m: hint at the last populated P2M entry
  x86/xen: Do not clip xen_e820_map to xen_e820_map_entries when sanitizing map
  x86/xen: Support kexec/kdump in HVM guests by doing a soft reset
  xen/x86: Don't try to write syscall-related MSRs for PV guests
  xen: use correct type for HYPERVISOR_memory_op()

8 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Linus Torvalds [Tue, 6 Oct 2015 13:59:36 +0000 (14:59 +0100)] 
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 fixes from Martin Schwidefsky:
 "Three bug fixes and an update to the default configuration"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/defconfig: set SCSI_DH=y
  s390/vtime: correct scaled cputime of partially idle CPUs
  s390/boot/decompression: disable floating point in decompressor
  s390/numa: use correct type for node_to_cpumask_map

8 years agoMerge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6
Linus Torvalds [Tue, 6 Oct 2015 13:30:21 +0000 (14:30 +0100)] 
Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6

Pull CIFS fixes from Steve French:
 "Two fixes for problems pointed out by automated tools.

  Thanks PaX/grsecurity team and Dan Carpenter (and the Smatch tool)"

* 'for-next' of git://git.samba.org/sfrench/cifs-2.6:
  [CIFS] Update cifs version number
  [SMB3] Do not fall back to SMBWriteX in set_file_size error cases
  [SMB3] Missing null tcon check

8 years agox86/xen/p2m: hint at the last populated P2M entry
David Vrabel [Mon, 7 Sep 2015 16:14:08 +0000 (17:14 +0100)] 
x86/xen/p2m: hint at the last populated P2M entry

With commit 633d6f17cd91ad5bf2370265946f716e42d388c6 (x86/xen: prepare
p2m list for memory hotplug) the P2M may be sized to accomdate a much
larger amount of memory than the domain currently has.

When saving a domain, the toolstack must scan all the P2M looking for
populated pages.  This results in a performance regression due to the
unnecessary scanning.

Instead of reporting (via shared_info) the maximum possible size of
the P2M, hint at the last PFN which might be populated.  This hint is
increased as new leaves are added to the P2M (in the expectation that
they will be used for populated entries).

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Cc: <stable@vger.kernel.org> # 4.0+
8 years agoMerge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git...
Ingo Molnar [Tue, 6 Oct 2015 07:04:10 +0000 (09:04 +0200)] 
Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core

Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

User visible changes:

  - Switch the default callchain output mode to 'graph,0.5,caller', to make it
    look like the default for other tools, reducing the learning curve for
    people used to 'caller' based viewing. (Arnaldo Carvalho de Melo)

  - Implement column based horizontal scrolling in the hists browser (top, report),
    making it possible to use the TUI for things like 'perf mem report' where
    there are many more columns than can fit in a terminal. (Arnaldo Carvalho de Melo)

  - Support sorting by symbol_iaddr with perf.data files produced by
    'perf mem record'. (Don Zickus)

  - Display DATA_SRC sample type bit, i.e. when running 'perf evlist -v' the
    "DATA_SRC" wasn't appearing when set, fix it to look like: (Jiri Olsa)

      cpu/mem-loads/pp: ...SNIP... sample_type: IP|TID|TIME|ADDR|CPU|PERIOD|DATA_SRC

  - Introduce the 'P' event modifier, meaning 'max precision level, please', i.e.:

     $ perf record -e cycles:P usleep 1

    Is now similar to:

     $ perf record usleep 1

    Useful, for instance, when specifying multiple events. (Jiri Olsa)

  - Make 'perf -v' and 'perf -h' work. (Jiri Olsa)

  - Fail properly when pattern matching fails to find a tracepoint, i.e.
    '-e non:existent' was being correctly handled, with a proper error message
    about that not being a valid event, but '-e non:existent*' wasn't,
    fix it. (Jiri Olsa)

Infrastructure changes:

  - Separate arch specific entries in 'perf test' and add an 'Intel CQM' one
    to be fun on x86 only. (Matt Fleming)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agoperf tools: Fail properly in case pattern matching fails to find tracepoint
Jiri Olsa [Mon, 5 Oct 2015 19:31:17 +0000 (21:31 +0200)] 
perf tools: Fail properly in case pattern matching fails to find tracepoint

Currently we dont fail properly when pattern matching fails to find any
tracepoint.

Current behaviour:

  $ perf record -e 'sched:krava*' sleep 1
  WARNING: event parser found nothinginvalid or unsupported event: 'sched:krava*'
  Run 'perf list' for a list of valid events

  usage: perf record [<options>] [<command>]
     or: perf record [<options>] -- <command> [<options>]

This patch change:

  $ perf record -e 'sched:krava*' sleep 1
  event syntax error: 'sched:krava*'
                       \___ unknown tracepoint

  Error:  File /sys/kernel/debug/tracing/events/sched/krava* not found.
  Hint:   Perhaps this kernel misses some CONFIG_ setting to enable this feature?.

  Run 'perf list' for a list of valid events

   usage: perf record [<options>] [<command>]
      or: perf record [<options>] -- <command> [<options>]

Reported-by: Daniel Bristot de Oliveira <bristot@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1444073477-3181-1-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf hists browser: Implement horizontal scrolling
Arnaldo Carvalho de Melo [Tue, 11 Aug 2015 20:22:43 +0000 (17:22 -0300)] 
perf hists browser: Implement horizontal scrolling

Do it using the recently introduced ui_brower scrolling mode, setting
ui_browser.columns to the number of sort columns and then, when
rendering each line, skipping as many initial columns as the user
pressed the right arrow.

As the user presses the left arrow, the ui_browser code will remove the
scrolling counter and the left scrolling takes place.

The right arrow key was an alias for ENTER, so people used to press it
may get a bit annoyed at first, sorry! Ditto for ESC and the left key.

Callchains can be left as is or we can, when rendering the Symbol
column, store the at what position on the screen it is and then
using ui_browser__gotorc() to print it from there, i.e. the callchain
would move around with the symbol.

Leaving it as is, i.e. at a fixed position, close to the left, saves
precious screen real state for it, so I'm inclined to leave it as is
now.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Chandler Carruth <chandlerc@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-ccqq9sabgfge5dwbqjwh71ij@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf ui browser: Optional horizontal scrolling key binding
Arnaldo Carvalho de Melo [Tue, 11 Aug 2015 20:14:40 +0000 (17:14 -0300)] 
perf ui browser: Optional horizontal scrolling key binding

If the classes derived from ui_browser want to do some sort of
horizontal scrolling, they have just to set ui_browser->columns to
the number of columns available.

Those columns can be the number of characters on the screen, if what is
desired is to scroll character by character, or the number of columns in
a spreadsheet like table.

This is what the hist_browser will do, skipping ui_browser->horiz_scroll
columns when rendering each of its lines.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-q6a22bpmpgcr1awgzrmd4jrs@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf callchain: Switch default to 'graph,0.5,caller'
Arnaldo Carvalho de Melo [Mon, 5 Oct 2015 20:05:35 +0000 (17:05 -0300)] 
perf callchain: Switch default to 'graph,0.5,caller'

Which is the most common default found in other similar tools.

Requested-by: Ingo Molnar <mingo@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Chandler Carruth <chandlerc@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://www.youtube.com/watch?v=nXaxk27zwlk
Link: http://lkml.kernel.org/n/tip-v8lq36aispvdwgxdmt9p9jd9@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf tests: Add Intel CQM test
Matt Fleming [Mon, 5 Oct 2015 14:40:21 +0000 (15:40 +0100)] 
perf tests: Add Intel CQM test

Peter reports that it's possible to trigger a WARN_ON_ONCE() in the
Intel CQM code by combining a hardware event and an Intel CQM
(software) event into a group. Unfortunately, the perf tools are not
able to create this bundle and we need to manually construct a test
case.

For posterity, record Peter's proof of concept test case in tools/perf
so that it presents a model for how we can perform architecture
specific tests, or "arch tests", in perf in the future.

The particular issue triggered in the test case is that when the
counter for the hardware event overflows and triggers a PMI we'll read
both the hardware event and the software event counters.
Unfortunately, for CQM that involves performing an IPI to read the CQM
event counters on all sockets, which in NMI context triggers the
WARN_ON_ONCE().

Reported-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kanaka Juvva <kanaka.d.juvva@intel.com>
Cc: Vikas Shivappa <vikas.shivappa@intel.com>
Cc: Vince Weaver <vince@deater.net>
Link: http://lkml.kernel.org/r/1437490509-15373-1-git-send-email-matt@codeblueprint.co.uk
Link: http://lkml.kernel.org/n/tip-3p4ra0u8vzm7m289a1m799kf@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf tests: Move x86 tests into arch directory
Matt Fleming [Mon, 5 Oct 2015 14:40:20 +0000 (15:40 +0100)] 
perf tests: Move x86 tests into arch directory

Move out the x86-specific tests into tools/perf/arch/x86/tests and
define an 'arch_tests' array, which is the list of tests that only apply
to the build architecture.

We can also now begin to get rid of some of the #ifdef code that is
present in the generic perf tests.

Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kanaka Juvva <kanaka.d.juvva@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Vikas Shivappa <vikas.shivappa@intel.com>
Cc: Vince Weaver <vince@deater.net>
Link: http://lkml.kernel.org/n/tip-9s68h4ptg06ah0lgnjz55mqn@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf tests: Add arch tests
Matt Fleming [Mon, 5 Oct 2015 14:40:19 +0000 (15:40 +0100)] 
perf tests: Add arch tests

Tests that only make sense for some architectures currently live in
the same place as the generic tests. Move out the x86-specific tests
into tools/perf/arch/x86/tests and define an 'arch_tests' array, which
is the list of tests that only apply to the build architecture.

The main idea is to encourage developers to add arch tests to build
out perf's test coverage, without dumping everything in
tools/perf/tests.

Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kanaka Juvva <kanaka.d.juvva@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Vikas Shivappa <vikas.shivappa@intel.com>
Cc: Vince Weaver <vince@deater.net>
Link: http://lkml.kernel.org/n/tip-p4uc1c15ssbj8xj7ku5slpa6@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf tools: Handle -h and -v options
Jiri Olsa [Mon, 5 Oct 2015 18:06:09 +0000 (20:06 +0200)] 
perf tools: Handle -h and -v options

Adding handling for '-h' and '-v' options to invoke help and version
command respectively.

Current behaviour is:

   $ perf -v
   Unknown option: -v

    Usage: perf [--version] [--help] [OPTIONS] COMMAND [ARGS]
   $ perf -h
   Unknown option: -h

    Usage: perf [--version] [--help] [OPTIONS] COMMAND [ARGS]

New behaviour:

  $ perf -h

   usage: perf [--version] [--help] [OPTIONS] COMMAND [ARGS]

   The most commonly used perf commands are:
     annotate        Read perf.data (created by perf record) and display annotated code
     archive         Create archive with object files with build-ids found in perf.data file
     bench           General framework for benchmark suites
   ...

  $ perf -v
  perf version 4.3.rc3.gc99e32

Updated man page.

Requested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1444068369-20978-10-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf tools: Setup proper width for symbol_iaddr field
Jiri Olsa [Mon, 5 Oct 2015 18:06:08 +0000 (20:06 +0200)] 
perf tools: Setup proper width for symbol_iaddr field

We need to properly initialize column width for symbol_iaddr field, so
all symbols could fit in the column.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1444068369-20978-9-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf tools: Add support for sorting on the iaddr
Don Zickus [Mon, 5 Oct 2015 18:06:07 +0000 (20:06 +0200)] 
perf tools: Add support for sorting on the iaddr

Sorting on 'symbol' gives to broad a resolution as it can cover a range
of IP address.  Use the iaddr instead to get proper sorting on IP
addresses.  Need to use the 'mem_sort' feature of perf record.

New sort option is: symbol_iaddr, header label is 'Code Symbol'.

  $ perf mem report --stdio -F +symbol_iaddr
  # Overhead       Samples  Code Symbol              Local Weight
  # ........  ............  ........................ ............
  #
      54.08%             1  [k] nmi_handle           192
       4.51%             1  [k] finish_task_switch   16
       3.66%             1  [.] malloc               13
       3.10%             1  [.] __strcoll_l          11

Signed-off-by: Don Zickus <dzickus@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1444068369-20978-8-git-send-email-jolsa@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf tests: Add parsing test for 'P' modifier
Jiri Olsa [Mon, 5 Oct 2015 18:06:06 +0000 (20:06 +0200)] 
perf tests: Add parsing test for 'P' modifier

We cant test 'P' modifier gets properly parsed, the functionality test
itself is beyond this suite.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1444068369-20978-7-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf tools: Introduce 'P' modifier to request max precision
Jiri Olsa [Mon, 5 Oct 2015 18:06:05 +0000 (20:06 +0200)] 
perf tools: Introduce 'P' modifier to request max precision

The 'P' will cause the event to get maximum possible detected precise
level.

Following record:
  $ perf record -e cycles:P ...

will detect maximum precise level for 'cycles' event and use it.

Commiter note:

Testing it:

  $ perf record -e cycles:P usleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.013 MB perf.data (9 samples) ]
  $ perf evlist
  cycles:P
  $ perf evlist -v
  cycles:P: size: 112, { sample_period, sample_freq }: 4000, sample_type:
  IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1,
  enable_on_exec: 1, task: 1, precise_ip: 2, sample_id_all: 1, mmap2: 1,
  comm_exec: 1
  $

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1444068369-20978-6-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf tools: Export perf_event_attr__set_max_precise_ip()
Jiri Olsa [Mon, 5 Oct 2015 18:06:04 +0000 (20:06 +0200)] 
perf tools: Export perf_event_attr__set_max_precise_ip()

It'll be used in following patch.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1444068369-20978-5-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf annotate: Fix sizeof_sym_hist overflow issue
Jiri Olsa [Mon, 5 Oct 2015 18:06:03 +0000 (20:06 +0200)] 
perf annotate: Fix sizeof_sym_hist overflow issue

The annotated_source::sizeof_sym_hist could easily overflow int size,
resulting in crash in __symbol__inc_addr_samples.

Changing its type int size_t as was probably intended from beginning
based on the initialization code in symbol__alloc_hist.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1444068369-20978-4-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf evlist: Display DATA_SRC sample type bit
Jiri Olsa [Mon, 5 Oct 2015 18:06:02 +0000 (20:06 +0200)] 
perf evlist: Display DATA_SRC sample type bit

Adding DATA_SRC bit_name call to display sample_type properly.

   $ perf evlist -v
   cpu/mem-loads/pp: ...SNIP... sample_type: IP|TID|TIME|ADDR|CPU|PERIOD|DATA_SRC, ...

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1444068369-20978-3-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agotools lib api fs: No need to use PATH_MAX + 1
Jiri Olsa [Mon, 5 Oct 2015 18:06:01 +0000 (20:06 +0200)] 
tools lib api fs: No need to use PATH_MAX + 1

Because there's no point, PATH_MAX is big enough.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1444068369-20978-2-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoLinux 4.3-rc4 v4.3-rc4
Linus Torvalds [Sun, 4 Oct 2015 15:57:17 +0000 (16:57 +0100)] 
Linux 4.3-rc4

8 years agoMerge branch 'strscpy' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf...
Linus Torvalds [Sun, 4 Oct 2015 15:31:13 +0000 (16:31 +0100)] 
Merge branch 'strscpy' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile

Pull strscpy string copy function implementation from Chris Metcalf.

Chris sent this during the merge window, but I waffled back and forth on
the pull request, which is why it's going in only now.

The new "strscpy()" function is definitely easier to use and more secure
than either strncpy() or strlcpy(), both of which are horrible nasty
interfaces that have serious and irredeemable problems.

strncpy() has a useless return value, and doesn't NUL-terminate an
overlong result.  To make matters worse, it pads a short result with
zeroes, which is a performance disaster if you have big buffers.

strlcpy(), by contrast, is a mis-designed "fix" for strlcpy(), lacking
the insane NUL padding, but having a differently broken return value
which returns the original length of the source string.  Which means
that it will read characters past the count from the source buffer, and
you have to trust the source to be properly terminated.  It also makes
error handling fragile, since the test for overflow is unnecessarily
subtle.

strscpy() avoids both these problems, guaranteeing the NUL termination
(but not excessive padding) if the destination size wasn't zero, and
making the overflow condition very obvious by returning -E2BIG.  It also
doesn't read past the size of the source, and can thus be used for
untrusted source data too.

So why did I waffle about this for so long?

Every time we introduce a new-and-improved interface, people start doing
these interminable series of trivial conversion patches.

And every time that happens, somebody does some silly mistake, and the
conversion patch to the improved interface actually makes things worse.
Because the patch is mindnumbing and trivial, nobody has the attention
span to look at it carefully, and it's usually done over large swatches
of source code which means that not every conversion gets tested.

So I'm pulling the strscpy() support because it *is* a better interface.
But I will refuse to pull mindless conversion patches.  Use this in
places where it makes sense, but don't do trivial patches to fix things
that aren't actually known to be broken.

* 'strscpy' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
  tile: use global strscpy() rather than private copy
  string: provide strscpy()
  Make asm/word-at-a-time.h available on all architectures

8 years agoMerge tag 'md/4.3-fixes' of git://neil.brown.name/md
Linus Torvalds [Sun, 4 Oct 2015 10:47:28 +0000 (11:47 +0100)] 
Merge tag 'md/4.3-fixes' of git://neil.brown.name/md

Pull md fixes from Neil Brown:
 "Assorted fixes for md in 4.3-rc.

  Two tagged for -stable, and one is really a cleanup to match and
  improve kmemcache interface.

* tag 'md/4.3-fixes' of git://neil.brown.name/md:
  md/bitmap: don't pass -1 to bitmap_storage_alloc.
  md/raid1: Avoid raid1 resync getting stuck
  md: drop null test before destroy functions
  md: clear CHANGE_PENDING in readonly array
  md/raid0: apply base queue limits *before* disk_stack_limits
  md/raid5: don't index beyond end of array in need_this_block().
  raid5: update analysis state for failed stripe
  md: wait for pending superblock updates before switching to read-only

8 years agoMerge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus
Linus Torvalds [Sun, 4 Oct 2015 10:41:58 +0000 (11:41 +0100)] 
Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus

Pull MIPS updates from Ralf Baechle:
 "This week's round of MIPS fixes:
   - Fix JZ4740 build
   - Fix fallback to GFP_DMA
   - FP seccomp in case of ENOSYS
   - Fix bootmem panic
   - A number of FP and CPS fixes
   - Wire up new syscalls
   - Make sure BPF assembler objects can properly be disassembled
   - Fix BPF assembler code for MIPS I"

* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
  MIPS: scall: Always run the seccomp syscall filters
  MIPS: Octeon: Fix kernel panic on startup from memory corruption
  MIPS: Fix R2300 FP context switch handling
  MIPS: Fix octeon FP context switch handling
  MIPS: BPF: Fix load delay slots.
  MIPS: BPF: Do all exports of symbols with FEXPORT().
  MIPS: Fix the build on jz4740 after removing the custom gpio.h
  MIPS: CPS: #ifdef on CONFIG_MIPS_MT_SMP rather than CONFIG_MIPS_MT
  MIPS: CPS: Don't include MT code in non-MT kernels.
  MIPS: CPS: Stop dangling delay slot from has_mt.
  MIPS: dma-default: Fix 32-bit fall back to GFP_DMA
  MIPS: Wire up userfaultfd and membarrier syscalls.

8 years agoMerge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 4 Oct 2015 10:40:09 +0000 (11:40 +0100)] 
Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq fixes from Thomas Gleixner:
 "This update contains:

   - Fix for a long standing race affecting /proc/irq/NNN

   - One line fix for ARM GICV3-ITS counting the wrong data

   - Warning silencing in ARM GICV3-ITS.  Another GCC trying to be
     overly clever issue"

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip/gic-v3-its: Count additional LPIs for the aliased devices
  irqchip/gic-v3-its: Silence warning when its_lpi_alloc_chunks gets inlined
  genirq: Fix race in register_irq_proc()

8 years agoMIPS: scall: Always run the seccomp syscall filters
Markos Chandras [Fri, 25 Sep 2015 07:17:42 +0000 (08:17 +0100)] 
MIPS: scall: Always run the seccomp syscall filters

The MIPS syscall handler code used to return -ENOSYS on invalid
syscalls. Whilst this is expected, it caused problems for seccomp
filters because the said filters never had the change to run since
the code returned -ENOSYS before triggering them. This caused
problems on the chromium testsuite for filters looking for invalid
syscalls. This has now changed and the seccomp filters are always
run even if the syscall is invalid. We return -ENOSYS once we
return from the seccomp filters. Moreover, similar codepaths have
been merged in the process which simplifies somewhat the overall
syscall code.

Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/11236/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
8 years ago[CIFS] Update cifs version number
Steve French [Sat, 3 Oct 2015 21:54:17 +0000 (16:54 -0500)] 
[CIFS] Update cifs version number

Update modinfo cifs.ko version number to 2.08

Signed-off-by: Steve French <steve.french@primarydata.com>
8 years agoMerge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sat, 3 Oct 2015 14:53:05 +0000 (10:53 -0400)] 
Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Ingo Molnar:
 "Fixes all around the map: W+X kernel mapping fix, WCHAN fixes, two
  build failure fixes for corner case configs, x32 header fix and a
  speling fix"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/headers/uapi: Fix __BITS_PER_LONG value for x32 builds
  x86/mm: Set NX on gap between __ex_table and rodata
  x86/kexec: Fix kexec crash in syscall kexec_file_load()
  x86/process: Unify 32bit and 64bit implementations of get_wchan()
  x86/process: Add proper bound checks in 64bit get_wchan()
  x86, efi, kasan: Fix build failure on !KASAN && KMEMCHECK=y kernels
  x86/hyperv: Fix the build in the !CONFIG_KEXEC_CORE case
  x86/cpufeatures: Correct spelling of the HWP_NOTIFY flag

8 years agoMerge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sat, 3 Oct 2015 14:51:41 +0000 (10:51 -0400)] 
Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull timer fixes from Ingo Molnar:
 "An abs64() fix in the watchdog driver, and two clocksource driver
  NO_IRQ assumption fixes"

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  clocksource: Fix abs() usage w/ 64bit values
  clocksource/drivers/keystone: Fix bad NO_IRQ usage
  clocksource/drivers/rockchip: Fix bad NO_IRQ usage

8 years agoMerge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sat, 3 Oct 2015 14:46:41 +0000 (10:46 -0400)] 
Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull EFI fixes from Ingo Molnar:
 "Two EFI fixes: one for x86, one for ARM, fixing a boot crash bug that
  can trigger under newer EFI firmware"

* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  arm64/efi: Fix boot crash by not padding between EFI_MEMORY_RUNTIME regions
  x86/efi: Fix boot crash by mapping EFI memmap entries bottom-up at runtime, instead of top-down

8 years agoMerge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Linus Torvalds [Sat, 3 Oct 2015 14:39:31 +0000 (10:39 -0400)] 
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux

Pull drm fixes from Dave Airlie:
 "Bunch of fixes all over the place, all pretty small: amdgpu, i915,
  exynos, one qxl and one vmwgfx.

  There is also a bunch of mst fixes, I left some cleanups in the series
  as I didn't think it was worth splitting up the tested series"

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (37 commits)
  drm/dp/mst: add some defines for logical/physical ports
  drm/dp/mst: drop cancel work sync in the mstb destroy path (v2)
  drm/dp/mst: split connector registration into two parts (v2)
  drm/dp/mst: update the link_address_sent before sending the link address (v3)
  drm/dp/mst: fixup handling hotplug on port removal.
  drm/dp/mst: don't pass port into the path builder function
  drm/radeon: drop radeon_fb_helper_set_par
  drm: handle cursor_set2 in restore_fbdev_mode
  drm/exynos: Staticize local function in exynos_drm_gem.c
  drm/exynos: fimd: actually disable dp clock
  drm/exynos: dp: remove suspend/resume functions
  drm/qxl: recreate the primary surface when the bo is not primary
  drm/amdgpu: only print meaningful VM faults
  drm/amdgpu/cgs: remove import_gpu_mem
  drm/i915: Call non-locking version of drm_kms_helper_poll_enable(), v2
  drm: Add a non-locking version of drm_kms_helper_poll_enable(), v2
  drm/vmwgfx: Fix a command submission hang regression
  drm/exynos: remove unused mode_fixup() code
  drm/exynos: remove decon_mode_fixup()
  drm/exynos: remove fimd_mode_fixup()
  ...

8 years agoMerge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git...
Ingo Molnar [Sat, 3 Oct 2015 06:20:14 +0000 (08:20 +0200)] 
Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core

Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

User visible changes:

 - Do event name substring search as last resort in 'perf list'.
   (Arnaldo Carvalho de Melo)

   E.g.:

    # perf list clock

    List of pre-defined events (to be used in -e):

     cpu-clock                                          [Software event]
     task-clock                                         [Software event]

     uncore_cbox_0/clockticks/                          [Kernel PMU event]
     uncore_cbox_1/clockticks/                          [Kernel PMU event]

     kvm:kvm_pvclock_update                             [Tracepoint event]
     kvm:kvm_update_master_clock                        [Tracepoint event]
     power:clock_disable                                [Tracepoint event]
     power:clock_enable                                 [Tracepoint event]
     power:clock_set_rate                               [Tracepoint event]
     syscalls:sys_enter_clock_adjtime                   [Tracepoint event]
     syscalls:sys_enter_clock_getres                    [Tracepoint event]
     syscalls:sys_enter_clock_gettime                   [Tracepoint event]
     syscalls:sys_enter_clock_nanosleep                 [Tracepoint event]
     syscalls:sys_enter_clock_settime                   [Tracepoint event]
     syscalls:sys_exit_clock_adjtime                    [Tracepoint event]
     syscalls:sys_exit_clock_getres                     [Tracepoint event]
     syscalls:sys_exit_clock_gettime                    [Tracepoint event]
     syscalls:sys_exit_clock_nanosleep                  [Tracepoint event]
     syscalls:sys_exit_clock_settime                    [Tracepoint event]

 - Reduce min 'perf stat --interval-print/-I' to 10ms. (Kan Liang)

   perf stat --interval in action:

   # perf stat -e cycles -I 50 -a usleep $((200 * 1000))
   print interval < 100ms. The overhead percentage could be high in some cases. Please proceed with caution.
   #   time                    counts unit events
      0.050233636         48,240,396      cycles
      0.100557098         35,492,594      cycles
      0.150804687         39,295,112      cycles
      0.201032269         33,101,961      cycles
      0.201980732            786,379      cycles
  #

 - Allow for max_stack greater than PERF_MAX_STACK_DEPTH, as when
   synthesizing callchains from Intel PT data. (Adrian Hunter)

 - Allow probing on kmodules without DWARF. (Masami Hiramatsu)

 - Fix a segfault when processing a perf.data file with callchains using
   "perf report --call-graph none". (Namhyung Kim)

 - Fix unresolved COMMs in 'perf top' when -s comm is used. (Namhyung Kim)

 - Register idle thread in 'perf top'. (Namhyung Kim)

 - Change 'record.samples' type to unsigned long long, fixing output of
   number of samples in 32-bit architectures. (Yang Shi)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Linus Torvalds [Fri, 2 Oct 2015 21:53:25 +0000 (17:53 -0400)] 
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input

Pull input layer fixes from Dmitry Torokhov:
 "Fixes for two recent regressions (in Synaptics PS/2 and uinput
  drivers) and some more driver fixups"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Revert "Input: synaptics - fix handling of disabling gesture mode"
  Input: psmouse - fix data race in __ps2_command
  Input: elan_i2c - add all valid ic type for i2c/smbus
  Input: zhenhua - ensure we have BITREVERSE
  Input: omap4-keypad - fix memory leak
  Input: serio - fix blocking of parport
  Input: uinput - fix crash when using ABS events
  Input: elan_i2c - expand maximum product_id form 0xFF to 0xFFFF
  Input: elan_i2c - add ic type 0x03
  Input: elan_i2c - don't require known iap version
  Input: imx6ul_tsc - fix controller name
  Input: imx6ul_tsc - use the preferred method for kzalloc()
  Input: imx6ul_tsc - check for negative return value
  Input: imx6ul_tsc - propagate the errors
  Input: walkera0701 - fix abs() calculations on 64 bit values
  Input: mms114 - remove unneded semicolons
  Input: pm8941-pwrkey - remove unneded semicolon
  Input: fix typo in MT documentation
  Input: cyapa - fix address of Gen3 devices in device tree documentation

8 years agoclocksource: Fix abs() usage w/ 64bit values
John Stultz [Tue, 15 Sep 2015 01:05:20 +0000 (18:05 -0700)] 
clocksource: Fix abs() usage w/ 64bit values

This patch fixes one cases where abs() was being used with 64-bit
nanosecond values, where the result may be capped at 32-bits.

This potentially could cause watchdog false negatives on 32-bit
systems, so this patch addresses the issue by using abs64().

Signed-off-by: John Stultz <john.stultz@linaro.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1442279124-7309-2-git-send-email-john.stultz@linaro.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
8 years agoperf stat: Reduce min --interval-print to 10ms
Kan Liang [Fri, 2 Oct 2015 09:04:34 +0000 (05:04 -0400)] 
perf stat: Reduce min --interval-print to 10ms

The --interval-print parameter was limited to 100ms. However, for
example, 10ms is required to do sophisticated bandwidth analysis using
uncore events.

The test shows that the overhead of the system-wide uncore monitoring
with 10ms interval is only ~2%. So this patch reduces the minimal
interval-print allowd to 10ms.

But 10ms may not work well for all cases. For example, when the
cpus/threads number is very large, for system-wide core event monitoring
the overhead could be high.

To handle this issue, a warning will be displayed when the
interval-print is set between 10ms to 100ms. So users can make a
decision according to their specific cases.

 # perf stat -e uncore_imc_1/cas_count_read/ -a --interval-print 10 -- sleep 1

 print interval < 100ms. The overhead percentage could be high in some
 cases. Please proceed with caution.
 #           time             counts unit events
      0.010200451               0.10 MiB  uncore_imc_1/cas_count_read/
      0.020475117               0.02 MiB  uncore_imc_1/cas_count_read/
      0.030692800               0.01 MiB  uncore_imc_1/cas_count_read/
      0.040948161               0.02 MiB  uncore_imc_1/cas_count_read/
      0.051159564               0.00 MiB  uncore_imc_1/cas_count_read/

Signed-off-by: Kan Liang <kan.liang@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1443776674-42511-1-git-send-email-kan.liang@intel.com
[ Added warning about overhead when using sub 100ms intervals to the man page ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoMerge tag 'nfs-rdma-for-4.3-2' of git://git.linux-nfs.org/projects/anna/nfs-rdma
Trond Myklebust [Fri, 2 Oct 2015 19:49:33 +0000 (15:49 -0400)] 
Merge tag 'nfs-rdma-for-4.3-2' of git://git.linux-nfs.org/projects/anna/nfs-rdma

NFS: NFSoRDMA bugfix

Fixes a use-after-free bug.

Signed-off-by: Anna Schumaker <Anna.Schumaker@netapp.com>
8 years agonfs4: reset states to use open_stateid when returning delegation voluntarily
Jeff Layton [Fri, 2 Oct 2015 17:14:37 +0000 (13:14 -0400)] 
nfs4: reset states to use open_stateid when returning delegation voluntarily

When the client goes to return a delegation, it should always update any
nfs4_state currently set up to use that delegation stateid to instead
use the open stateid. It already does do this in some cases,
particularly in the state recovery code, but not currently when the
delegation is voluntarily returned (e.g. in advance of a RENAME).  This
causes the client to try to continue using the delegation stateid after
the DELEGRETURN, e.g. in LAYOUTGET.

Set the nfs4_state back to using the open stateid in
nfs4_open_delegation_recall, just before clearing the
NFS_DELEGATED_STATE bit.

Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
8 years agoNFSv4: Fix a nograce recovery hang
Benjamin Coddington [Thu, 1 Oct 2015 13:17:33 +0000 (09:17 -0400)] 
NFSv4: Fix a nograce recovery hang

Since commit 5cae02f42793130e1387f4ec09c4d07056ce9fa5 an OPEN_CONFIRM should
have a privileged sequence in the recovery case to allow nograce recovery to
proceed for NFSv4.0.

Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
8 years agoNFSv4.1: nfs4_opendata_check_deleg needs to handle NFS4_OPEN_CLAIM_DELEG_CUR_FH
Trond Myklebust [Fri, 2 Oct 2015 15:44:54 +0000 (11:44 -0400)] 
NFSv4.1: nfs4_opendata_check_deleg needs to handle NFS4_OPEN_CLAIM_DELEG_CUR_FH

We need to warn against broken NFSv4.1 servers that try to hand out
delegations in response to NFS4_OPEN_CLAIM_DELEG_CUR_FH.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
8 years agoNFSv4: Don't try to reclaim unused state owners
Trond Myklebust [Fri, 2 Oct 2015 15:11:16 +0000 (11:11 -0400)] 
NFSv4: Don't try to reclaim unused state owners

Currently, we don't test if the state owner is in use before we try to
recover it. The problem is that if the refcount is zero, then the
state owner will be waiting on the lru list for garbage collection.
The expectation in that case is that if you bump the refcount, then
you must also remove the state owner from the lru list. Otherwise
the call to nfs4_put_state_owner will corrupt that list by trying
to add our state owner a second time.

Avoid the whole problem by just skipping state owners that hold no
state.

Reported-by: Andrew W Elble <aweits@rit.edu>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>