git.ipfire.org Git - thirdparty/kernel/stable-queue.git/blobdiff - queue-6.8/kvm-x86-pmu-disable-support-for-adaptive-pebs.patch
6.8-stable patches
diff --git a/queue-6.8/kvm-x86-pmu-disable-support-for-adaptive-pebs.patch b/queue-6.8/kvm-x86-pmu-disable-support-for-adaptive-pebs.patch
new file mode 100644
index 0000000..7cff685
--- /dev/null
@@ -0,0 +1,104 @@
+From 9e985cbf2942a1bb8fcef9adc2a17d90fd7ca8ee Mon Sep 17 00:00:00 2001
+From: Sean Christopherson <seanjc@google.com>
+Date: Wed, 6 Mar 2024 16:58:33 -0800
+Subject: KVM: x86/pmu: Disable support for adaptive PEBS
+
+From: Sean Christopherson <seanjc@google.com>
+
+commit 9e985cbf2942a1bb8fcef9adc2a17d90fd7ca8ee upstream.
+
+Drop support for virtualizing adaptive PEBS, as KVM's implementation is
+architecturally broken without an obvious/easy path forward, and because
+exposing adaptive PEBS can leak host LBRs to the guest, i.e. can leak
+host kernel addresses to the guest.
+
+Bug #1 is that KVM doesn't account for the upper 32 bits of
+IA32_FIXED_CTR_CTRL when (re)programming fixed counters, e.g.
+fixed_ctrl_field() drops the upper bits, reprogram_fixed_counters()
+stores local variables as u8s and truncates the upper bits too, etc.
+
+Bug #2 is that, because KVM _always_ sets precise_ip to a non-zero value
+for PEBS events, perf will _always_ generate an adaptive record, even if
+the guest requested a basic record.  Note, KVM will also enable adaptive
+PEBS in individual *counters*, even if adaptive PEBS isn't exposed to the
+guest, but this is benign as MSR_PEBS_DATA_CFG is guaranteed to be zero,
+i.e. the guest will only ever see Basic records.
+
+Bug #3 is in perf.  intel_pmu_disable_fixed() doesn't clear the upper
+bits either, i.e. leaves ICL_FIXED_0_ADAPTIVE set, and
+intel_pmu_enable_fixed() effectively doesn't clear ICL_FIXED_0_ADAPTIVE
+either.  I.e. perf _always_ enables ADAPTIVE counters, regardless of what
+KVM requests.
+
+Bug #4 is that adaptive PEBS *might* effectively bypass event filters set
+by the host, as "Updated Memory Access Info Group" records information
+that might be disallowed by userspace via KVM_SET_PMU_EVENT_FILTER.
+
+Bug #5 is that KVM doesn't ensure LBR MSRs hold guest values (or at least
+zeros) when entering a vCPU with adaptive PEBS, which allows the guest
+to read host LBRs, i.e. host RIPs/addresses, by enabling "LBR Entries"
+records.
+
+Disable adaptive PEBS support as an immediate fix due to the severity of
+the LBR leak in particular, and because fixing all of the bugs will be
+non-trivial, e.g. not suitable for backporting to stable kernels.
+
+Note!  This will break live migration, but trying to make KVM play nice
+with live migration would be quite complicated, wouldn't be guaranteed to
+work (i.e. KVM might still kill/confuse the guest), and it's not clear
+that there are any publicly available VMMs that support adaptive PEBS,
+let alone live migrate VMs that support adaptive PEBS, e.g. QEMU doesn't
+support PEBS in any capacity.
+
+Link: https://lore.kernel.org/all/20240306230153.786365-1-seanjc@google.com
+Link: https://lore.kernel.org/all/ZeepGjHCeSfadANM@google.com
+Fixes: c59a1f106f5c ("KVM: x86/pmu: Add IA32_PEBS_ENABLE MSR emulation for extended PEBS")
+Cc: stable@vger.kernel.org
+Cc: Like Xu <like.xu.linux@gmail.com>
+Cc: Mingwei Zhang <mizhang@google.com>
+Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
+Cc: Zhang Xiong <xiong.y.zhang@intel.com>
+Cc: Lv Zhiyuan <zhiyuan.lv@intel.com>
+Cc: Dapeng Mi <dapeng1.mi@intel.com>
+Cc: Jim Mattson <jmattson@google.com>
+Acked-by: Like Xu <likexu@tencent.com>
+Link: https://lore.kernel.org/r/20240307005833.827147-1-seanjc@google.com
+Signed-off-by: Sean Christopherson <seanjc@google.com>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ arch/x86/kvm/vmx/vmx.c |   24 ++++++++++++++++++++++--
+ 1 file changed, 22 insertions(+), 2 deletions(-)
+
+--- a/arch/x86/kvm/vmx/vmx.c
++++ b/arch/x86/kvm/vmx/vmx.c
+@@ -7857,8 +7857,28 @@ static u64 vmx_get_perf_capabilities(voi
+       if (vmx_pebs_supported()) {
+               perf_cap |= host_perf_cap & PERF_CAP_PEBS_MASK;
+-              if ((perf_cap & PERF_CAP_PEBS_FORMAT) < 4)
+-                      perf_cap &= ~PERF_CAP_PEBS_BASELINE;
++
++              /*
++               * Disallow adaptive PEBS as it is functionally broken, can be
++               * used by the guest to read *host* LBRs, and can be used to
++               * bypass userspace event filters.  To correctly and safely
++               * support adaptive PEBS, KVM needs to:
++               *
++               * 1. Account for the ADAPTIVE flag when (re)programming fixed
++               *    counters.
++               *
++               * 2. Gain support from perf (or take direct control of counter
++               *    programming) to support events without adaptive PEBS
++               *    enabled for the hardware counter.
++               *
++               * 3. Ensure LBR MSRs cannot hold host data on VM-Entry with
++               *    adaptive PEBS enabled and MSR_PEBS_DATA_CFG.LBRS=1.
++               *
++               * 4. Document which PMU events are effectively exposed to the
++               *    guest via adaptive PEBS, and make adaptive PEBS mutually
++               *    exclusive with KVM_SET_PMU_EVENT_FILTER if necessary.
++               */
++              perf_cap &= ~PERF_CAP_PEBS_BASELINE;
+       }
+       return perf_cap;