From: Greg Kroah-Hartman
Date: Mon, 1 Apr 2019 10:29:26 +0000 (+0200)
Subject: 4.19-stable patches
X-Git-Tag: v3.18.138~17
X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=9e63ff13d6bd66a6e297a2c1fa84274e09d8d8b6;p=thirdparty%2Fkernel%2Fstable-queue.git

4.19-stable patches

added patches:
	cpu-hotplug-prevent-crash-when-cpu-bringup-fails-on-config_hotplug_cpu-n.patch
	kvm-reject-device-ioctls-from-processes-other-than-the-vm-s-creator.patch
	kvm-x86-update-rip-after-emulating-io.patch
	objtool-query-pkg-config-for-libelf-location.patch
	perf-intel-pt-fix-tsc-slip.patch
	perf-pmu-fix-parser-error-for-uncore-event-alias.patch
	powerpc-64-fix-memcmp-reading-past-the-end-of-src-dest.patch
	powerpc-pseries-energy-use-of-accessor-functions-to-read-ibm-drc-indexes.patch
	watchdog-respect-watchdog-cpumask-on-cpu-hotplug.patch
	x86-smp-enforce-config_hotplug_cpu-when-smp-y.patch
---

diff --git a/queue-4.19/cpu-hotplug-prevent-crash-when-cpu-bringup-fails-on-config_hotplug_cpu-n.patch b/queue-4.19/cpu-hotplug-prevent-crash-when-cpu-bringup-fails-on-config_hotplug_cpu-n.patch
new file mode 100644
index 00000000000..7a6939b56ae
--- /dev/null
+++ b/queue-4.19/cpu-hotplug-prevent-crash-when-cpu-bringup-fails-on-config_hotplug_cpu-n.patch
@@ -0,0 +1,142 @@
+From 206b92353c839c0b27a0b9bec24195f93fd6cf7a Mon Sep 17 00:00:00 2001
+From: Thomas Gleixner
+Date: Tue, 26 Mar 2019 17:36:05 +0100
+Subject: cpu/hotplug: Prevent crash when CPU bringup fails on CONFIG_HOTPLUG_CPU=n
+
+From: Thomas Gleixner
+
+commit 206b92353c839c0b27a0b9bec24195f93fd6cf7a upstream.
+
+Tianyu reported a crash in a CPU hotplug teardown callback when booting a
+kernel which has CONFIG_HOTPLUG_CPU disabled with the 'nosmt' boot
+parameter.
+
+It turns out that the SMP=y CONFIG_HOTPLUG_CPU=n case has been broken
+forever in the case where a bringup callback fails. Unfortunately this
+issue was not recognized when the CPU hotplug code was reworked, so the
+shortcoming just stayed in place.
+
+When a bringup callback fails, the CPU hotplug code rolls back the
+operation and takes the CPU offline.
+
+The 'nosmt' command line argument uses a bringup failure to abort the
+bringup of SMT sibling CPUs. This partial bringup is required due to the
+MCE misdesign on Intel CPUs.
+
+With CONFIG_HOTPLUG_CPU=y the rollback works perfectly fine, but
+CONFIG_HOTPLUG_CPU=n lacks essential mechanisms to exercise the low level
+teardown of a CPU including the synchronizations in various facilities like
+RCU, NOHZ and others.
+
+As a consequence the teardown callbacks which must be executed on the
+outgoing CPU within stop machine with interrupts disabled are executed on
+the control CPU in interrupt-enabled, preemptible context, causing the
+kernel to crash and burn. The pre state machine code has a different
+failure mode which is more subtle, resulting in a less obvious
+use-after-free crash, because the control side frees resources which are
+still in use by the undead CPU.
+
+But this is not an x86-only problem. Any architecture which supports the
+SMP=y HOTPLUG_CPU=n combination suffers from the same issue. It's just less
+likely to be triggered because in 99.99999% of the cases all bringup
+callbacks succeed.
+
+The easy solution of making HOTPLUG_CPU mandatory for SMP does not work on
+all architectures, as the following architectures have either no hotplug
+support at all or not all subarchitectures support it:
+
+ alpha, arc, hexagon, openrisc, riscv, sparc (32bit), mips (partial).
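+
+As a hedged illustration (simplified pseudo-kernel C with hypothetical
+helper names, not the upstream code), the bringup walk and its rollback
+look roughly like this; the rollback step is the part which has no safe
+implementation when CONFIG_HOTPLUG_CPU=n:
+
+  /* Illustrative sketch only -- the real logic lives in kernel/cpu.c. */
+  static int bringup_cpu_states(unsigned int cpu, int target)
+  {
+          int state, ret;
+
+          for (state = 0; state <= target; state++) {
+                  ret = invoke_bringup_callback(cpu, state); /* hypothetical */
+                  if (ret) {
+                          /*
+                           * Roll back by running the teardown callbacks.
+                           * With CONFIG_HOTPLUG_CPU=n the stop_machine()
+                           * based takedown is unavailable, so this used to
+                           * run on the control CPU with interrupts enabled.
+                           */
+                          rollback_cpu(cpu, state); /* hypothetical */
+                          return ret;
+                  }
+          }
+          return 0;
+  }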
+
+Crashing the kernel in such a situation is not an acceptable state
+either.
+
+Implement a minimal rollback variant by limiting the teardown to the point
+where all regular teardown callbacks have been invoked and leave the CPU in
+the 'dead' idle state. This has the following consequences:
+
+ - the CPU is brought down to the point where the stop_machine takedown
+   would happen.
+
+ - the CPU stays there forever and is idle
+
+ - The CPU is cleared in the CPU active mask, but not in the CPU online
+   mask, which is a legit state.
+
+ - Interrupts are not forced away from the CPU
+
+ - All facilities which only look at the online mask would still see it,
+   but that is the case during normal hotplug/unplug operations as well.
+   It's just a (way) longer time frame.
+
+This will expose issues which haven't been exposed before, or only seldom,
+because now the normally transient state of being non active but online is
+a permanent state. In testing this already exposed an issue vs. work queues
+where the vmstat code schedules work on the almost dead CPU, which ends up
+in an unbound workqueue and triggers 'preemptible context' warnings. This
+is not a problem of this change; it merely exposes an already existing
+issue. Still, this is better than crashing fully without a chance to debug
+it.
+
+This is mainly intended as a workaround for those architectures which do
+not support HOTPLUG_CPU. All others should enforce HOTPLUG_CPU for SMP.
+
+Fixes: 2e1a3483ce74 ("cpu/hotplug: Split out the state walk into functions")
+Reported-by: Tianyu Lan
+Signed-off-by: Thomas Gleixner
+Tested-by: Tianyu Lan
+Acked-by: Greg Kroah-Hartman
+Cc: Konrad Wilk
+Cc: Josh Poimboeuf
+Cc: Mukesh Ojha
+Cc: Peter Zijlstra
+Cc: Jiri Kosina
+Cc: Rik van Riel
+Cc: Andy Lutomirski
+Cc: Micheal Kelley
+Cc: "K. Y. Srinivasan"
+Cc: Linus Torvalds
+Cc: Borislav Petkov
+Cc: K. Y. Srinivasan
+Cc: stable@vger.kernel.org
+Link: https://lkml.kernel.org/r/20190326163811.503390616@linutronix.de
+Signed-off-by: Greg Kroah-Hartman
+
+---
+ kernel/cpu.c |   20 ++++++++++++++++++--
+ 1 file changed, 18 insertions(+), 2 deletions(-)
+
+--- a/kernel/cpu.c
++++ b/kernel/cpu.c
+@@ -533,6 +533,20 @@ static void undo_cpu_up(unsigned int cpu
+ 		cpuhp_invoke_callback(cpu, st->state, false, NULL, NULL);
+ }
+ 
++static inline bool can_rollback_cpu(struct cpuhp_cpu_state *st)
++{
++	if (IS_ENABLED(CONFIG_HOTPLUG_CPU))
++		return true;
++	/*
++	 * When CPU hotplug is disabled, then taking the CPU down is not
++	 * possible because takedown_cpu() and the architecture and
++	 * subsystem specific mechanisms are not available. So the CPU
++	 * which would be completely unplugged again needs to stay around
++	 * in the current state.
++	 */
++	return st->state <= CPUHP_BRINGUP_CPU;
++}
++
+ static int cpuhp_up_callbacks(unsigned int cpu, struct cpuhp_cpu_state *st,
+ 			      enum cpuhp_state target)
+ {
+@@ -543,8 +557,10 @@ static int cpuhp_up_callbacks(unsigned i
+ 		st->state++;
+ 		ret = cpuhp_invoke_callback(cpu, st->state, true, NULL, NULL);
+ 		if (ret) {
+-			st->target = prev_state;
+-			undo_cpu_up(cpu, st);
++			if (can_rollback_cpu(st)) {
++				st->target = prev_state;
++				undo_cpu_up(cpu, st);
++			}
+ 			break;
+ 		}
+ 	}
diff --git a/queue-4.19/kvm-reject-device-ioctls-from-processes-other-than-the-vm-s-creator.patch b/queue-4.19/kvm-reject-device-ioctls-from-processes-other-than-the-vm-s-creator.patch
new file mode 100644
index 00000000000..90884281f54
--- /dev/null
+++ b/queue-4.19/kvm-reject-device-ioctls-from-processes-other-than-the-vm-s-creator.patch
@@ -0,0 +1,78 @@
+From ddba91801aeb5c160b660caed1800eb3aef403f8 Mon Sep 17 00:00:00 2001
+From: Sean Christopherson
+Date: Fri, 15 Feb 2019 12:48:39 -0800
+Subject: KVM: Reject device ioctls from processes other than the VM's creator
+
+From: Sean Christopherson
+
+commit ddba91801aeb5c160b660caed1800eb3aef403f8 upstream.
+
+KVM's API requires that ioctls be issued from the same process
+that created the VM. In other words, userspace can play games with a
+VM's file descriptors, e.g. fork(), SCM_RIGHTS, etc..., but only the
+creator can do anything useful. Explicitly reject device ioctls that
+are issued by a process other than the VM's creator, and update KVM's
+API documentation to extend its requirements to device ioctls.
+
+Fixes: 852b6d57dc7f ("kvm: add device control API")
+Cc:
+Signed-off-by: Sean Christopherson
+Signed-off-by: Paolo Bonzini
+Signed-off-by: Greg Kroah-Hartman
+
+---
+ Documentation/virtual/kvm/api.txt |   16 +++++++++++-----
+ virt/kvm/kvm_main.c               |    3 +++
+ 2 files changed, 14 insertions(+), 5 deletions(-)
+
+--- a/Documentation/virtual/kvm/api.txt
++++ b/Documentation/virtual/kvm/api.txt
+@@ -13,7 +13,7 @@ of a virtual machine. The ioctls belong
+ 
+  - VM ioctls: These query and set attributes that affect an entire virtual
+    machine, for example memory layout. In addition a VM ioctl is used to
+-   create virtual cpus (vcpus).
++   create virtual cpus (vcpus) and devices.
+ 
+    Only run VM ioctls from the same process (address space) that was used
+    to create the VM.
+@@ -24,6 +24,11 @@ of a virtual machine. The ioctls belong
+    Only run vcpu ioctls from the same thread that was used to create the
+    vcpu.
+ 
++ - device ioctls: These query and set attributes that control the operation
++   of a single device.
++
++   device ioctls must be issued from the same process (address space) that
++   was used to create the VM.
+ 
+ 2. File descriptors
+ -------------------
+@@ -32,10 +37,11 @@ The kvm API is centered around file desc
+ 
+ open("/dev/kvm") obtains a handle to the kvm subsystem; this handle
+ can be used to issue system ioctls. A KVM_CREATE_VM ioctl on this
+ handle will create a VM file descriptor which can be used to issue VM
+-ioctls. A KVM_CREATE_VCPU ioctl on a VM fd will create a virtual cpu
+-and return a file descriptor pointing to it. Finally, ioctls on a vcpu
+-fd can be used to control the vcpu, including the important task of
+-actually running guest code.
++ioctls. A KVM_CREATE_VCPU or KVM_CREATE_DEVICE ioctl on a VM fd will
++create a virtual cpu or device and return a file descriptor pointing to
++the new resource. Finally, ioctls on a vcpu or device fd can be used
++to control the vcpu or device. For vcpus, this includes the important
++task of actually running guest code.
+ 
+ In general file descriptors can be migrated among processes by means
+ of fork() and the SCM_RIGHTS facility of unix domain socket. These
+--- a/virt/kvm/kvm_main.c
++++ b/virt/kvm/kvm_main.c
+@@ -2815,6 +2815,9 @@ static long kvm_device_ioctl(struct file
+ {
+ 	struct kvm_device *dev = filp->private_data;
+ 
++	if (dev->kvm->mm != current->mm)
++		return -EIO;
++
+ 	switch (ioctl) {
+ 	case KVM_SET_DEVICE_ATTR:
+ 		return kvm_device_ioctl_attr(dev, dev->ops->set_attr, arg);
diff --git a/queue-4.19/kvm-x86-update-rip-after-emulating-io.patch b/queue-4.19/kvm-x86-update-rip-after-emulating-io.patch
new file mode 100644
index 00000000000..99305c04607
--- /dev/null
+++ b/queue-4.19/kvm-x86-update-rip-after-emulating-io.patch
@@ -0,0 +1,156 @@
+From 45def77ebf79e2e8942b89ed79294d97ce914fa0 Mon Sep 17 00:00:00 2001
+From: Sean Christopherson
+Date: Mon, 11 Mar 2019 20:01:05 -0700
+Subject: KVM: x86: update %rip after emulating IO
+
+From: Sean Christopherson
+
+commit 45def77ebf79e2e8942b89ed79294d97ce914fa0 upstream.
+
+Most (all?) x86 platforms provide a port IO based reset mechanism, e.g.
+OUT 92h or CF9h. Userspace may emulate said mechanism, i.e. reset a
+vCPU in response to KVM_EXIT_IO, without explicitly announcing to KVM
+that it is doing a reset, e.g. Qemu jams vCPU state and resumes running.
+
+To avoid corrupting %rip after such a reset, commit 0967b7bf1c22 ("KVM:
+Skip pio instruction when it is emulated, not executed") changed the
+behavior of PIO handlers, i.e. today's "fast" PIO handling, to skip the
+instruction prior to exiting to userspace. Full emulation doesn't need
+such tricks because re-emulating the instruction will naturally handle
+%rip being changed to point at the reset vector.
+
+Updating %rip prior to exiting to userspace has several drawbacks:
+
+  - Userspace sees the wrong %rip on the exit, e.g. if PIO emulation
+    fails it will likely yell about the wrong address.
+  - Single step exits to userspace are effectively dropped as
+    KVM_EXIT_DEBUG is overwritten with KVM_EXIT_IO.
+  - Behavior of PIO emulation is different depending on whether it
+    goes down the fast path or the slow path.
+
+Rather than skip the PIO instruction before exiting to userspace,
+snapshot the linear %rip and cancel PIO completion if the current
+value does not match the snapshot. For a 64-bit vCPU, i.e. the most
+common scenario, the snapshot and comparison has negligible overhead
+as VMCS.GUEST_RIP will be cached regardless, i.e. there is no extra
+VMREAD in this case.
+
+All other alternatives to snapshotting the linear %rip that don't
+rely on an explicit reset announcement suffer from one corner case
+or another. For example, canceling PIO completion on any write to
+%rip fails if userspace does a save/restore of %rip, and attempting to
+avoid that issue by canceling PIO only if %rip changed then fails if PIO
+collides with the reset %rip. Attempting to zero in on the exact reset
+vector won't work for APs, which means adding more hooks such as the
+vCPU's MP_STATE, and so on and so forth.
+
+Checking for a linear %rip match technically suffers from corner cases,
+e.g. userspace could theoretically rewrite the underlying code page and
+expect a different instruction to execute, or the guest hardcodes a PIO
+reset at 0xfffffff0, but those are far, far outside of what can be
+considered normal operation.
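+
+As a hedged sketch (hypothetical type and helper names, not the exact
+upstream functions; the real code is in the diff below), the
+completion-side check amounts to:
+
+  /*
+   * Illustrative only: cancel the deferred instruction skip when the
+   * guest %rip no longer matches the snapshot taken at PIO-exit time,
+   * i.e. when userspace reset the vCPU in the meantime.
+   */
+  static int complete_pio(struct vcpu *v)          /* hypothetical */
+  {
+          if (!rip_matches_snapshot(v))            /* hypothetical */
+                  return 1;  /* vCPU was reset; do not skip the insn */
+          return skip_emulated_instruction(v);     /* hypothetical */
+  }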
+
+Fixes: 432baf60eee3 ("KVM: VMX: use kvm_fast_pio_in for handling IN I/O")
+Cc:
+Reported-by: Jim Mattson
+Signed-off-by: Sean Christopherson
+Signed-off-by: Paolo Bonzini
+Signed-off-by: Greg Kroah-Hartman
+
+---
+ arch/x86/include/asm/kvm_host.h |    1 +
+ arch/x86/kvm/x86.c              |   36 ++++++++++++++++++++++++++----------
+ 2 files changed, 27 insertions(+), 10 deletions(-)
+
+--- a/arch/x86/include/asm/kvm_host.h
++++ b/arch/x86/include/asm/kvm_host.h
+@@ -315,6 +315,7 @@ struct kvm_mmu_page {
+ };
+ 
+ struct kvm_pio_request {
++	unsigned long linear_rip;
+ 	unsigned long count;
+ 	int in;
+ 	int port;
+--- a/arch/x86/kvm/x86.c
++++ b/arch/x86/kvm/x86.c
+@@ -6317,14 +6317,27 @@ int kvm_emulate_instruction_from_buffer(
+ }
+ EXPORT_SYMBOL_GPL(kvm_emulate_instruction_from_buffer);
+ 
++static int complete_fast_pio_out(struct kvm_vcpu *vcpu)
++{
++	vcpu->arch.pio.count = 0;
++
++	if (unlikely(!kvm_is_linear_rip(vcpu, vcpu->arch.pio.linear_rip)))
++		return 1;
++
++	return kvm_skip_emulated_instruction(vcpu);
++}
++
+ static int kvm_fast_pio_out(struct kvm_vcpu *vcpu, int size,
+ 			    unsigned short port)
+ {
+ 	unsigned long val = kvm_register_read(vcpu, VCPU_REGS_RAX);
+ 	int ret = emulator_pio_out_emulated(&vcpu->arch.emulate_ctxt,
+ 					    size, port, &val, 1);
+-	/* do not return to emulator after return from userspace */
+-	vcpu->arch.pio.count = 0;
++
++	if (!ret) {
++		vcpu->arch.pio.linear_rip = kvm_get_linear_rip(vcpu);
++		vcpu->arch.complete_userspace_io = complete_fast_pio_out;
++	}
+ 	return ret;
+ }
+ 
+@@ -6335,6 +6348,11 @@ static int complete_fast_pio_in(struct k
+ 	/* We should only ever be called with arch.pio.count equal to 1 */
+ 	BUG_ON(vcpu->arch.pio.count != 1);
+ 
++	if (unlikely(!kvm_is_linear_rip(vcpu, vcpu->arch.pio.linear_rip))) {
++		vcpu->arch.pio.count = 0;
++		return 1;
++	}
++
+ 	/* For size less than 4 we merge, else we zero extend */
+ 	val = (vcpu->arch.pio.size < 4) ? kvm_register_read(vcpu, VCPU_REGS_RAX)
+ 					: 0;
+@@ -6347,7 +6365,7 @@ static int complete_fast_pio_in(struct k
+ 			 vcpu->arch.pio.port, &val, 1);
+ 	kvm_register_write(vcpu, VCPU_REGS_RAX, val);
+ 
+-	return 1;
++	return kvm_skip_emulated_instruction(vcpu);
+ }
+ 
+ static int kvm_fast_pio_in(struct kvm_vcpu *vcpu, int size,
+@@ -6366,6 +6384,7 @@ static int kvm_fast_pio_in(struct kvm_vc
+ 		return ret;
+ 	}
+ 
++	vcpu->arch.pio.linear_rip = kvm_get_linear_rip(vcpu);
+ 	vcpu->arch.complete_userspace_io = complete_fast_pio_in;
+ 
+ 	return 0;
+@@ -6373,16 +6392,13 @@ static int kvm_fast_pio_in(struct kvm_vc
+ 
+ int kvm_fast_pio(struct kvm_vcpu *vcpu, int size, unsigned short port, int in)
+ {
+-	int ret = kvm_skip_emulated_instruction(vcpu);
+ 
+-	/*
+-	 * TODO: we might be squashing a KVM_GUESTDBG_SINGLESTEP-triggered
+-	 * KVM_EXIT_DEBUG here.
+-	 */
++	int ret;
+ 
+ 	if (in)
+-		return kvm_fast_pio_in(vcpu, size, port) && ret;
++		ret = kvm_fast_pio_in(vcpu, size, port);
+ 	else
+-		return kvm_fast_pio_out(vcpu, size, port) && ret;
++		ret = kvm_fast_pio_out(vcpu, size, port);
++	return ret && kvm_skip_emulated_instruction(vcpu);
+ }
+ EXPORT_SYMBOL_GPL(kvm_fast_pio);
+ 
diff --git a/queue-4.19/objtool-query-pkg-config-for-libelf-location.patch b/queue-4.19/objtool-query-pkg-config-for-libelf-location.patch
new file mode 100644
index 00000000000..c03ab64736f
--- /dev/null
+++ b/queue-4.19/objtool-query-pkg-config-for-libelf-location.patch
@@ -0,0 +1,60 @@
+From 056d28d135bca0b1d0908990338e00e9dadaf057 Mon Sep 17 00:00:00 2001
+From: Rolf Eike Beer
+Date: Tue, 26 Mar 2019 12:48:39 -0500
+Subject: objtool: Query pkg-config for libelf location
+
+From: Rolf Eike Beer
+
+commit 056d28d135bca0b1d0908990338e00e9dadaf057 upstream.
+
+If it is not in the default location, compilation fails at several points.
+
+Signed-off-by: Rolf Eike Beer
+Signed-off-by: Josh Poimboeuf
+Signed-off-by: Thomas Gleixner
+Cc: stable@vger.kernel.org
+Link: https://lkml.kernel.org/r/91a25e992566a7968fedc89ec80e7f4c83ad0548.1553622500.git.jpoimboe@redhat.com
+Signed-off-by: Greg Kroah-Hartman
+
+---
+ Makefile               |    4 +++-
+ tools/objtool/Makefile |    7 +++++--
+ 2 files changed, 8 insertions(+), 3 deletions(-)
+
+--- a/Makefile
++++ b/Makefile
+@@ -948,9 +948,11 @@ mod_sign_cmd = true
+ endif
+ export mod_sign_cmd
+ 
++HOST_LIBELF_LIBS = $(shell pkg-config libelf --libs 2>/dev/null || echo -lelf)
++
+ ifdef CONFIG_STACK_VALIDATION
+   has_libelf := $(call try-run,\
+-		echo "int main() {}" | $(HOSTCC) -xc -o /dev/null -lelf -,1,0)
++		echo "int main() {}" | $(HOSTCC) -xc -o /dev/null $(HOST_LIBELF_LIBS) -,1,0)
+   ifeq ($(has_libelf),1)
+     objtool_target := tools/objtool FORCE
+   else
+--- a/tools/objtool/Makefile
++++ b/tools/objtool/Makefile
+@@ -25,14 +25,17 @@ LIBSUBCMD = $(LIBSUBCMD_OUTPUT)libsubcm
+ OBJTOOL    := $(OUTPUT)objtool
+ OBJTOOL_IN := $(OBJTOOL)-in.o
+ 
++LIBELF_FLAGS := $(shell pkg-config libelf --cflags 2>/dev/null)
++LIBELF_LIBS  := $(shell pkg-config libelf --libs 2>/dev/null || echo -lelf)
++
+ all: $(OBJTOOL)
+ 
+ INCLUDES := -I$(srctree)/tools/include \
+ 	    -I$(srctree)/tools/arch/$(HOSTARCH)/include/uapi \
+ 	    -I$(srctree)/tools/objtool/arch/$(ARCH)/include
+ WARNINGS := $(EXTRA_WARNINGS) -Wno-switch-default -Wno-switch-enum -Wno-packed
+-CFLAGS   += -Werror $(WARNINGS) $(KBUILD_HOSTCFLAGS) -g $(INCLUDES)
+-LDFLAGS  += -lelf $(LIBSUBCMD) $(KBUILD_HOSTLDFLAGS)
++CFLAGS   += -Werror $(WARNINGS) $(KBUILD_HOSTCFLAGS) -g $(INCLUDES) $(LIBELF_FLAGS)
++LDFLAGS  += $(LIBELF_LIBS) $(LIBSUBCMD) $(KBUILD_HOSTLDFLAGS)
+ 
+ # Allow old libelf to be used:
+ elfshdr := $(shell echo '$(pound)include <libelf.h>' | $(CC) $(CFLAGS) -x c -E - | grep elf_getshdr)
diff --git a/queue-4.19/perf-intel-pt-fix-tsc-slip.patch b/queue-4.19/perf-intel-pt-fix-tsc-slip.patch
new file mode 100644
index 00000000000..923ef39d601
--- /dev/null
+++ b/queue-4.19/perf-intel-pt-fix-tsc-slip.patch
@@ -0,0 +1,56 @@
+From f3b4e06b3bda759afd042d3d5fa86bea8f1fe278 Mon Sep 17 00:00:00 2001
+From: Adrian Hunter
+Date: Mon, 25 Mar 2019 15:51:35 +0200
+Subject: perf intel-pt: Fix TSC slip
+
+From: Adrian Hunter
+
+commit f3b4e06b3bda759afd042d3d5fa86bea8f1fe278 upstream.
+
+A TSC packet can slip past MTC packets so that the timestamp appears to
+go backwards. One estimate is that it can be up to about 40 CPU cycles,
+which is certainly less than 0x1000 TSC ticks, but accept slippage an
+order of magnitude more to be on the safe side.
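+
+As a quick sanity check (plain C, illustrative only): the "order of
+magnitude" is hexadecimal here, i.e. the new bound is 16x the
+0x1000-tick estimate:
+
+  #include <stdio.h>
+
+  int main(void)
+  {
+          /* 0x10000 = 65536 ticks, 0x1000 = 4096 ticks */
+          printf("slip bound: %#x ticks (%dx the 0x1000 estimate)\n",
+                 0x10000, 0x10000 / 0x1000);
+          return 0;
+  }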
+
+Signed-off-by: Adrian Hunter
+Cc: Jiri Olsa
+Cc: stable@vger.kernel.org
+Fixes: 79b58424b821c ("perf tools: Add Intel PT support for decoding MTC packets")
+Link: http://lkml.kernel.org/r/20190325135135.18348-1-adrian.hunter@intel.com
+Signed-off-by: Arnaldo Carvalho de Melo
+Signed-off-by: Greg Kroah-Hartman
+
+---
+ tools/perf/util/intel-pt-decoder/intel-pt-decoder.c |   20 ++++++++------------
+ 1 file changed, 8 insertions(+), 12 deletions(-)
+
+--- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
++++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
+@@ -251,19 +251,15 @@ struct intel_pt_decoder *intel_pt_decode
+ 		if (!(decoder->tsc_ctc_ratio_n % decoder->tsc_ctc_ratio_d))
+ 			decoder->tsc_ctc_mult = decoder->tsc_ctc_ratio_n /
+ 						decoder->tsc_ctc_ratio_d;
+-
+-		/*
+-		 * Allow for timestamps appearing to backwards because a TSC
+-		 * packet has slipped past a MTC packet, so allow 2 MTC ticks
+-		 * or ...
+-		 */
+-		decoder->tsc_slip = multdiv(2 << decoder->mtc_shift,
+-					decoder->tsc_ctc_ratio_n,
+-					decoder->tsc_ctc_ratio_d);
+ 	}
+-	/* ... or 0x100 paranoia */
+-	if (decoder->tsc_slip < 0x100)
+-		decoder->tsc_slip = 0x100;
++
++	/*
++	 * A TSC packet can slip past MTC packets so that the timestamp appears
++	 * to go backwards. One estimate is that can be up to about 40 CPU
++	 * cycles, which is certainly less than 0x1000 TSC ticks, but accept
++	 * slippage an order of magnitude more to be on the safe side.
++	 */
++	decoder->tsc_slip = 0x10000;
+ 
+ 	intel_pt_log("timestamp: mtc_shift %u\n", decoder->mtc_shift);
+ 	intel_pt_log("timestamp: tsc_ctc_ratio_n %u\n", decoder->tsc_ctc_ratio_n);
diff --git a/queue-4.19/perf-pmu-fix-parser-error-for-uncore-event-alias.patch b/queue-4.19/perf-pmu-fix-parser-error-for-uncore-event-alias.patch
new file mode 100644
index 00000000000..372e9a787b8
--- /dev/null
+++ b/queue-4.19/perf-pmu-fix-parser-error-for-uncore-event-alias.patch
@@ -0,0 +1,83 @@
+From e94d6b7f615e6dfbaf9fba7db6011db561461d0c Mon Sep 17 00:00:00 2001
+From: Kan Liang
+Date: Fri, 15 Mar 2019 11:00:14 -0700
+Subject: perf pmu: Fix parser error for uncore event alias
+
+From: Kan Liang
+
+commit e94d6b7f615e6dfbaf9fba7db6011db561461d0c upstream.
+
+Perf fails to parse an uncore event alias, for example:
+
+  # perf stat -e unc_m_clockticks -a --no-merge sleep 1
+  event syntax error: 'unc_m_clockticks'
+                       \___ parser error
+
+Current code assumes that the event alias is from one specific PMU.
+
+To find the PMU, perf strcmps the PMU name of the event alias with the
+real PMU name on the system.
+
+However, the uncore event alias may be from multiple PMUs with a common
+prefix. The PMU name of the uncore event alias is the common prefix.
+
+For example, UNC_M_CLOCKTICKS is the clock event for the iMC, which
+includes 6 PMUs with the same prefix "uncore_imc" on a Skylake server.
+
+The real PMU names on the system for the iMC are uncore_imc_0 ...
+uncore_imc_5.
+
+strncmp is used to check only the common prefix for the uncore event
+alias.
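+
+As a hedged illustration of the matching rule (standalone C, not the
+perf source; the real check is in the diff below):
+
+  #include <stdio.h>
+  #include <string.h>
+
+  /* An alias PMU name such as "uncore_imc" must prefix-match every
+   * real PMU instance name, e.g. "uncore_imc_0" ... "uncore_imc_5". */
+  static int alias_matches(const char *alias_pmu, const char *real_pmu)
+  {
+          return strncmp(alias_pmu, real_pmu, strlen(alias_pmu)) == 0;
+  }
+
+  int main(void)
+  {
+          printf("%d\n", alias_matches("uncore_imc", "uncore_imc_3")); /* 1 */
+          printf("%d\n", alias_matches("uncore_imc", "uncore_cha_0")); /* 0 */
+          return 0;
+  }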
+
+With the patch:
+
+  # perf stat -e unc_m_clockticks -a --no-merge sleep 1
+  Performance counter stats for 'system wide':
+
+  723,594,722      unc_m_clockticks [uncore_imc_5]
+  724,001,954      unc_m_clockticks [uncore_imc_3]
+  724,042,655      unc_m_clockticks [uncore_imc_1]
+  724,161,001      unc_m_clockticks [uncore_imc_4]
+  724,293,713      unc_m_clockticks [uncore_imc_2]
+  724,340,901      unc_m_clockticks [uncore_imc_0]
+
+  1.002090060 seconds time elapsed
+
+Signed-off-by: Kan Liang
+Acked-by: Jiri Olsa
+Cc: Andi Kleen
+Cc: Thomas Richter
+Cc: stable@vger.kernel.org
+Fixes: ea1fa48c055f ("perf stat: Handle different PMU names with common prefix")
+Link: http://lkml.kernel.org/r/1552672814-156173-1-git-send-email-kan.liang@linux.intel.com
+Signed-off-by: Arnaldo Carvalho de Melo
+Signed-off-by: Greg Kroah-Hartman
+
+---
+ tools/perf/util/pmu.c |   10 ++++++++++
+ 1 file changed, 10 insertions(+)
+
+--- a/tools/perf/util/pmu.c
++++ b/tools/perf/util/pmu.c
+@@ -773,10 +773,20 @@ static void pmu_add_cpu_aliases(struct l
+ 
+ 		if (!is_arm_pmu_core(name)) {
+ 			pname = pe->pmu ? pe->pmu : "cpu";
++
++			/*
++			 * uncore alias may be from different PMU
++			 * with common prefix
++			 */
++			if (pmu_is_uncore(name) &&
++			    !strncmp(pname, name, strlen(pname)))
++				goto new_alias;
++
+ 			if (strcmp(pname, name))
+ 				continue;
+ 		}
+ 
++new_alias:
+ 		/* need type casts to override 'const' */
+ 		__perf_pmu__new_alias(head, NULL, (char *)pe->name,
+ 				(char *)pe->desc, (char *)pe->event,
diff --git a/queue-4.19/powerpc-64-fix-memcmp-reading-past-the-end-of-src-dest.patch b/queue-4.19/powerpc-64-fix-memcmp-reading-past-the-end-of-src-dest.patch
new file mode 100644
index 00000000000..f97f68570b3
--- /dev/null
+++ b/queue-4.19/powerpc-64-fix-memcmp-reading-past-the-end-of-src-dest.patch
@@ -0,0 +1,115 @@
+From d9470757398a700d9450a43508000bcfd010c7a4 Mon Sep 17 00:00:00 2001
+From: Michael Ellerman
+Date: Fri, 22 Mar 2019 23:37:24 +1100
+Subject: powerpc/64: Fix memcmp reading past the end of src/dest
+
+From: Michael Ellerman
+
+commit d9470757398a700d9450a43508000bcfd010c7a4 upstream.
+
+Chandan reported that fstests' generic/026 test hit a crash:
+
+  BUG: Unable to handle kernel data access at 0xc00000062ac40000
+  Faulting instruction address: 0xc000000000092240
+  Oops: Kernel access of bad area, sig: 11 [#1]
+  LE SMP NR_CPUS=2048 DEBUG_PAGEALLOC NUMA pSeries
+  CPU: 0 PID: 27828 Comm: chacl Not tainted 5.0.0-rc2-next-20190115-00001-g6de6dba64dda #1
+  NIP:  c000000000092240 LR:  c00000000066a55c CTR: 0000000000000000
+  REGS: c00000062c0c3430 TRAP: 0300 Not tainted (5.0.0-rc2-next-20190115-00001-g6de6dba64dda)
+  MSR:  8000000002009033 CR: 44000842 XER: 20000000
+  CFAR: 00007fff7f3108ac DAR: c00000062ac40000 DSISR: 40000000 IRQMASK: 0
+  GPR00: 0000000000000000 c00000062c0c36c0 c0000000017f4c00 c00000000121a660
+  GPR04: c00000062ac3fff9 0000000000000004 0000000000000020 00000000275b19c4
+  GPR08: 000000000000000c 46494c4500000000 5347495f41434c5f c0000000026073a0
+  GPR12: 0000000000000000 c0000000027a0000 0000000000000000 0000000000000000
+  GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
+  GPR20: c00000062ea70020 c00000062c0c38d0 0000000000000002 0000000000000002
+  GPR24: c00000062ac3ffe8 00000000275b19c4 0000000000000001 c00000062ac30000
+  GPR28: c00000062c0c38d0 c00000062ac30050 c00000062ac30058 0000000000000000
+  NIP memcmp+0x120/0x690
+  LR  xfs_attr3_leaf_lookup_int+0x53c/0x5b0
+  Call Trace:
+    xfs_attr3_leaf_lookup_int+0x78/0x5b0 (unreliable)
+    xfs_da3_node_lookup_int+0x32c/0x5a0
+    xfs_attr_node_addname+0x170/0x6b0
+    xfs_attr_set+0x2ac/0x340
+    __xfs_set_acl+0xf0/0x230
+    xfs_set_acl+0xd0/0x160
+    set_posix_acl+0xc0/0x130
+    posix_acl_xattr_set+0x68/0x110
+    __vfs_setxattr+0xa4/0x110
+    __vfs_setxattr_noperm+0xac/0x240
+    vfs_setxattr+0x128/0x130
+    setxattr+0x248/0x600
+    path_setxattr+0x108/0x120
+    sys_setxattr+0x28/0x40
+    system_call+0x5c/0x70
+  Instruction dump:
+  7d201c28 7d402428 7c295040 38630008 38840008 408201f0 4200ffe8 2c050000
+  4182ff6c 20c50008 54c61838 7d201c28 <7d402428> 7d293436 7d4a3436 7c295040
+
+The instruction dump decodes as:
+  subfic  r6,r5,8
+  rlwinm  r6,r6,3,0,28
+  ldbrx   r9,0,r3
+  ldbrx   r10,0,r4        <-
+
+Which shows us doing an 8 byte load from c00000062ac3fff9, which
+crosses the page boundary at c00000062ac40000 and faults.
+
+It's not OK for memcmp to read past the end of the source or
+destination buffers if that would cross a page boundary, because we
+don't know that the next page is mapped.
+
+As pointed out by Segher, we can read past the end of the source or
+destination as long as we don't cross a 4K boundary, because that's
+our minimum page size on all platforms.
+
+The bug is in the code at the .Lcmp_rest_lt8bytes label. When we get
+there we know that s1 is 8-byte aligned and we have at least 1 byte to
+read, so a single 8-byte load won't read past the end of s1 and cross
+a page boundary.
+
+But we have to be more careful with s2. So check if it's within 8
+bytes of a 4K boundary and if so go to the byte-by-byte loop.
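+
+As a hedged C rendering of that guard (illustrative only; the real fix
+below is powerpc assembly):
+
+  #include <stdbool.h>
+  #include <stdint.h>
+
+  /* Take the low 12 bits of the address; if they are above 0xff8, an
+   * 8-byte load starting there would cross a 4K page boundary. */
+  static bool load8_may_cross_4k(const void *p)
+  {
+          return ((uintptr_t)p & 0xfff) > 0xff8;
+  }
+
+  int main(void)
+  {
+          /* ...fff9 is the low part of the faulting address above. */
+          return load8_may_cross_4k((const void *)0xfff9) ? 0 : 1;
+  }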
+
+Fixes: 2d9ee327adce ("powerpc/64: Align bytes before fall back to .Lshort in powerpc64 memcmp()")
+Cc: stable@vger.kernel.org # v4.19+
+Reported-by: Chandan Rajendra
+Signed-off-by: Michael Ellerman
+Reviewed-by: Segher Boessenkool
+Tested-by: Chandan Rajendra
+Signed-off-by: Michael Ellerman
+Signed-off-by: Greg Kroah-Hartman
+
+---
+ arch/powerpc/lib/memcmp_64.S |   17 +++++++++++++----
+ 1 file changed, 13 insertions(+), 4 deletions(-)
+
+--- a/arch/powerpc/lib/memcmp_64.S
++++ b/arch/powerpc/lib/memcmp_64.S
+@@ -215,11 +215,20 @@ _GLOBAL_TOC(memcmp)
+ 	beq	.Lzero
+ 
+ .Lcmp_rest_lt8bytes:
+-	/* Here we have only less than 8 bytes to compare with. at least s1
+-	 * Address is aligned with 8 bytes.
+-	 * The next double words are load and shift right with appropriate
+-	 * bits.
++	/*
++	 * Here we have less than 8 bytes to compare. At least s1 is aligned to
++	 * 8 bytes, but s2 may not be. We must make sure s2 + 7 doesn't cross a
++	 * page boundary, otherwise we might read past the end of the buffer and
++	 * trigger a page fault. We use 4K as the conservative minimum page
++	 * size. If we detect that case we go to the byte-by-byte loop.
++	 *
++	 * Otherwise the next double word is loaded from s1 and s2, and shifted
++	 * right to compare the appropriate bits.
+ 	 */
++	clrldi	r6,r4,(64-12)	// r6 = r4 & 0xfff
++	cmpdi	r6,0xff8
++	bgt	.Lshort
++
+ 	subfic	r6,r5,8
+ 	slwi	r6,r6,3
+ 	LD	rA,0,r3
diff --git a/queue-4.19/powerpc-pseries-energy-use-of-accessor-functions-to-read-ibm-drc-indexes.patch b/queue-4.19/powerpc-pseries-energy-use-of-accessor-functions-to-read-ibm-drc-indexes.patch
new file mode 100644
index 00000000000..8b5c98534ef
--- /dev/null
+++ b/queue-4.19/powerpc-pseries-energy-use-of-accessor-functions-to-read-ibm-drc-indexes.patch
@@ -0,0 +1,70 @@
+From ce9afe08e71e3f7d64f337a6e932e50849230fc2 Mon Sep 17 00:00:00 2001
+From: "Gautham R. Shenoy"
+Date: Fri, 8 Mar 2019 21:03:24 +0530
+Subject: powerpc/pseries/energy: Use OF accessor functions to read ibm,drc-indexes
+
+From: Gautham R. Shenoy
+
+commit ce9afe08e71e3f7d64f337a6e932e50849230fc2 upstream.
+
+In cpu_to_drc_index(), in the case when FW_FEATURE_DRC_INFO is absent,
+we currently use of_get_property() to obtain the pointer to the array
+corresponding to the property "ibm,drc-indexes". The elements of this
+array are of type __be32, but are accessed without any conversion to
+the OS endianness, which is buggy on a Little Endian OS.
+
+Fix this by using the of_property_read_u32_index() accessor function to
+safely read the elements of the array.
+
+Fixes: e83636ac3334 ("pseries/drc-info: Search DRC properties for CPU indexes")
+Cc: stable@vger.kernel.org # v4.16+
+Reported-by: Pavithra R. Prakash
+Signed-off-by: Gautham R. Shenoy
+Reviewed-by: Vaidyanathan Srinivasan
+[mpe: Make the WARN_ON a WARN_ON_ONCE so it's not retriggerable]
+Signed-off-by: Michael Ellerman
+Signed-off-by: Greg Kroah-Hartman
+
+---
+ arch/powerpc/platforms/pseries/pseries_energy.c |   27 ++++++++++++++++--------
+ 1 file changed, 18 insertions(+), 9 deletions(-)
+
+--- a/arch/powerpc/platforms/pseries/pseries_energy.c
++++ b/arch/powerpc/platforms/pseries/pseries_energy.c
+@@ -77,18 +77,27 @@ static u32 cpu_to_drc_index(int cpu)
+ 
+ 		ret = drc.drc_index_start + (thread_index * drc.sequential_inc);
+ 	} else {
+-		const __be32 *indexes;
+-
+-		indexes = of_get_property(dn, "ibm,drc-indexes", NULL);
+-		if (indexes == NULL)
+-			goto err_of_node_put;
++		u32 nr_drc_indexes, thread_drc_index;
+ 
+ 		/*
+-		 * The first element indexes[0] is the number of drc_indexes
+-		 * returned in the list. Hence thread_index+1 will get the
+-		 * drc_index corresponding to core number thread_index.
++		 * The first element of ibm,drc-indexes array is the
++		 * number of drc_indexes returned in the list. Hence
++		 * thread_index+1 will get the drc_index corresponding
++		 * to core number thread_index.
+ 		 */
+-		ret = indexes[thread_index + 1];
++		rc = of_property_read_u32_index(dn, "ibm,drc-indexes",
++						0, &nr_drc_indexes);
++		if (rc)
++			goto err_of_node_put;
++
++		WARN_ON_ONCE(thread_index > nr_drc_indexes);
++		rc = of_property_read_u32_index(dn, "ibm,drc-indexes",
++						thread_index + 1,
++						&thread_drc_index);
++		if (rc)
++			goto err_of_node_put;
++
++		ret = thread_drc_index;
+ 	}
+ 
+ 	rc = 0;
diff --git a/queue-4.19/series b/queue-4.19/series
index 037482aad07..f22854241a1 100644
--- a/queue-4.19/series
+++ b/queue-4.19/series
@@ -112,3 +112,13 @@ mm-add-support-for-kmem-caches-in-dma32-zone.patch
 iommu-io-pgtable-arm-v7s-request-dma32-memory-and-improve-debugging.patch
 mm-mempolicy-make-mbind-return-eio-when-mpol_mf_strict-is-specified.patch
 mm-migrate.c-add-missing-flush_dcache_page-for-non-mapped-page-migrate.patch
+perf-pmu-fix-parser-error-for-uncore-event-alias.patch
+perf-intel-pt-fix-tsc-slip.patch
+objtool-query-pkg-config-for-libelf-location.patch
+powerpc-pseries-energy-use-of-accessor-functions-to-read-ibm-drc-indexes.patch
+powerpc-64-fix-memcmp-reading-past-the-end-of-src-dest.patch
+watchdog-respect-watchdog-cpumask-on-cpu-hotplug.patch
+cpu-hotplug-prevent-crash-when-cpu-bringup-fails-on-config_hotplug_cpu-n.patch
+x86-smp-enforce-config_hotplug_cpu-when-smp-y.patch
+kvm-reject-device-ioctls-from-processes-other-than-the-vm-s-creator.patch
+kvm-x86-update-rip-after-emulating-io.patch
diff --git a/queue-4.19/watchdog-respect-watchdog-cpumask-on-cpu-hotplug.patch b/queue-4.19/watchdog-respect-watchdog-cpumask-on-cpu-hotplug.patch
new file mode 100644
index 00000000000..94742e30e6a
--- /dev/null
+++ b/queue-4.19/watchdog-respect-watchdog-cpumask-on-cpu-hotplug.patch
@@ -0,0 +1,56 @@
+From 7dd47617114921fdd8c095509e5e7b4373cc44a1 Mon Sep 17 00:00:00 2001
+From: Thomas Gleixner
+Date: Tue, 26 Mar 2019 22:51:02 +0100
+Subject: watchdog: Respect watchdog cpumask on CPU hotplug
+
+From: Thomas Gleixner
+
+commit 7dd47617114921fdd8c095509e5e7b4373cc44a1 upstream.
+
+The rework of the watchdog core to use cpu_stop_work broke the watchdog
+cpumask on CPU hotplug.
+
+The watchdog_enable/disable() functions are now called unconditionally from
+the hotplug callback, i.e. even on CPUs which are not in the watchdog
+cpumask. As a consequence the watchdog can become unstoppable.
+
+Only invoke them when the plugged CPU is in the watchdog cpumask.
+
+Fixes: 9cf57731b63e ("watchdog/softlockup: Replace "watchdog/%u" threads with cpu_stop_work")
+Reported-by: Maxime Coquelin
+Signed-off-by: Thomas Gleixner
+Tested-by: Maxime Coquelin
+Cc: Peter Zijlstra
+Cc: Oleg Nesterov
+Cc: Michael Ellerman
+Cc: Nicholas Piggin
+Cc: Don Zickus
+Cc: Ricardo Neri
+Cc: stable@vger.kernel.org
+Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1903262245490.1789@nanos.tec.linutronix.de
+Signed-off-by: Greg Kroah-Hartman
+
+---
+ kernel/watchdog.c |    6 ++++--
+ 1 file changed, 4 insertions(+), 2 deletions(-)
+
+--- a/kernel/watchdog.c
++++ b/kernel/watchdog.c
+@@ -547,13 +547,15 @@ static void softlockup_start_all(void)
+ 
+ int lockup_detector_online_cpu(unsigned int cpu)
+ {
+-	watchdog_enable(cpu);
++	if (cpumask_test_cpu(cpu, &watchdog_allowed_mask))
++		watchdog_enable(cpu);
+ 	return 0;
+ }
+ 
+ int lockup_detector_offline_cpu(unsigned int cpu)
+ {
+-	watchdog_disable(cpu);
++	if (cpumask_test_cpu(cpu, &watchdog_allowed_mask))
++		watchdog_disable(cpu);
+ 	return 0;
+ }
+ 
diff --git a/queue-4.19/x86-smp-enforce-config_hotplug_cpu-when-smp-y.patch b/queue-4.19/x86-smp-enforce-config_hotplug_cpu-when-smp-y.patch
new file mode 100644
index 00000000000..2bf757c8d54
--- /dev/null
+++ b/queue-4.19/x86-smp-enforce-config_hotplug_cpu-when-smp-y.patch
@@ -0,0 +1,62 @@
+From bebd024e4815b1a170fcd21ead9c2222b23ce9e6 Mon Sep 17 00:00:00 2001
+From: Thomas Gleixner
+Date: Tue, 26 Mar 2019 17:36:06 +0100
+Subject: x86/smp: Enforce CONFIG_HOTPLUG_CPU when SMP=y
+
+From: Thomas Gleixner
+
+commit bebd024e4815b1a170fcd21ead9c2222b23ce9e6 upstream.
+
+The SMT disable 'nosmt' command line argument is not working properly when
+CONFIG_HOTPLUG_CPU is disabled. The teardown of the sibling CPUs, which are
+required to be brought up due to the MCE issues, cannot work. The CPUs are
+then kept in a half-dead state.
+
+As the 'nosmt' functionality has become popular due to the speculative
+hardware vulnerabilities, the half-torn-down state is not a proper solution
+to the problem.
+
+Enforce CONFIG_HOTPLUG_CPU=y when SMP is enabled so the full operation is
+possible.
+
+Reported-by: Tianyu Lan
+Signed-off-by: Thomas Gleixner
+Acked-by: Greg Kroah-Hartman
+Cc: Konrad Wilk
+Cc: Josh Poimboeuf
+Cc: Mukesh Ojha
+Cc: Peter Zijlstra
+Cc: Jiri Kosina
+Cc: Rik van Riel
+Cc: Andy Lutomirski
+Cc: Micheal Kelley
+Cc: "K. Y. Srinivasan"
+Cc: Linus Torvalds
+Cc: Borislav Petkov
+Cc: K. Y. Srinivasan
+Cc: stable@vger.kernel.org
+Link: https://lkml.kernel.org/r/20190326163811.598166056@linutronix.de
+Signed-off-by: Greg Kroah-Hartman
+
+---
+ arch/x86/Kconfig |    8 +-------
+ 1 file changed, 1 insertion(+), 7 deletions(-)
+
+--- a/arch/x86/Kconfig
++++ b/arch/x86/Kconfig
+@@ -2199,14 +2199,8 @@ config RANDOMIZE_MEMORY_PHYSICAL_PADDING
+ 	  If unsure, leave at the default value.
+ 
+ config HOTPLUG_CPU
+-	bool "Support for hot-pluggable CPUs"
++	def_bool y
+ 	depends on SMP
+-	---help---
+-	  Say Y here to allow turning CPUs off and on. CPUs can be
+-	  controlled through /sys/devices/system/cpu.
+-	  ( Note: power management support will enable this option
+-	    automatically on SMP systems. )
+-	  Say N if you want to disable CPU hotplug.
+ 
+ config BOOTPARAM_HOTPLUG_CPU0
+ 	bool "Set default setting of cpu0_hotpluggable"