From: Greg Kroah-Hartman Date: Fri, 29 Mar 2024 12:41:12 +0000 (+0100) Subject: 5.15-stable patches X-Git-Tag: v6.7.12~157 X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=466f027b70dc250907b5272af15b0afade655d79;p=thirdparty%2Fkernel%2Fstable-queue.git 5.15-stable patches added patches: documentation-hw-vuln-add-documentation-for-rfds.patch kvm-vmx-move-verw-closer-to-vmentry-for-mds-mitigation.patch kvm-vmx-use-bt-jnc-i.e.-eflags.cf-to-select-vmresume-vs.-vmlaunch.patch kvm-x86-export-rfds_no-and-rfds_clear-to-guests.patch x86-asm-add-_asm_rip-macro-for-x86-64-rip-suffix.patch x86-bugs-add-asm-helpers-for-executing-verw.patch x86-bugs-use-alternative-instead-of-mds_user_clear-static-key.patch x86-entry_32-add-verw-just-before-userspace-transition.patch x86-entry_64-add-verw-just-before-userspace-transition.patch x86-mmio-disable-kvm-mitigation-when-x86_feature_clear_cpu_buf-is-set.patch x86-rfds-mitigate-register-file-data-sampling-rfds.patch --- diff --git a/queue-5.15/documentation-hw-vuln-add-documentation-for-rfds.patch b/queue-5.15/documentation-hw-vuln-add-documentation-for-rfds.patch new file mode 100644 index 00000000000..5f849c69e3e --- /dev/null +++ b/queue-5.15/documentation-hw-vuln-add-documentation-for-rfds.patch @@ -0,0 +1,142 @@ +From stable+bounces-27531-greg=kroah.com@vger.kernel.org Tue Mar 12 22:11:35 2024 +From: Pawan Gupta +Date: Tue, 12 Mar 2024 14:11:25 -0700 +Subject: Documentation/hw-vuln: Add documentation for RFDS +To: stable@vger.kernel.org +Cc: Dave Hansen , Thomas Gleixner , Josh Poimboeuf +Message-ID: <20240312-delay-verw-backport-5-15-y-v2-9-e0f71d17ed1b@linux.intel.com> +Content-Disposition: inline + +From: Pawan Gupta + +commit 4e42765d1be01111df0c0275bbaf1db1acef346e upstream. + +Add the documentation for transient execution vulnerability Register +File Data Sampling (RFDS) that affects Intel Atom CPUs. + + [ pawan: s/ATOM_GRACEMONT/ALDERLAKE_N/ ] + +Signed-off-by: Pawan Gupta +Signed-off-by: Dave Hansen +Reviewed-by: Thomas Gleixner +Acked-by: Josh Poimboeuf +Signed-off-by: Greg Kroah-Hartman +--- + Documentation/admin-guide/hw-vuln/index.rst | 1 + Documentation/admin-guide/hw-vuln/reg-file-data-sampling.rst | 104 +++++++++++ + 2 files changed, 105 insertions(+) + +--- a/Documentation/admin-guide/hw-vuln/index.rst ++++ b/Documentation/admin-guide/hw-vuln/index.rst +@@ -21,3 +21,4 @@ are configurable at compile, boot or run + cross-thread-rsb.rst + gather_data_sampling.rst + srso ++ reg-file-data-sampling +--- /dev/null ++++ b/Documentation/admin-guide/hw-vuln/reg-file-data-sampling.rst +@@ -0,0 +1,104 @@ ++================================== ++Register File Data Sampling (RFDS) ++================================== ++ ++Register File Data Sampling (RFDS) is a microarchitectural vulnerability that ++only affects Intel Atom parts(also branded as E-cores). RFDS may allow ++a malicious actor to infer data values previously used in floating point ++registers, vector registers, or integer registers. RFDS does not provide the ++ability to choose which data is inferred. CVE-2023-28746 is assigned to RFDS. 
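++
++For example, user space can query the kernel's assessment of RFDS through
++the sysfs interface described under "Mitigation status information" below.
++A minimal illustrative sketch (not part of the kernel sources)::
++
++	#include <stdio.h>
++
++	int main(void)
++	{
++		char buf[128];
++		FILE *f = fopen("/sys/devices/system/cpu/vulnerabilities"
++				"/reg_file_data_sampling", "r");
++
++		if (!f)
++			return 1;	/* file absent on kernels without this series */
++		if (fgets(buf, sizeof(buf), f))
++			printf("RFDS: %s", buf);	/* e.g. "Mitigation: Clear Register File" */
++		fclose(f);
++		return 0;
++	}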
++ ++Affected Processors ++=================== ++Below is the list of affected Intel processors [#f1]_: ++ ++ =================== ============ ++ Common name Family_Model ++ =================== ============ ++ ATOM_GOLDMONT 06_5CH ++ ATOM_GOLDMONT_D 06_5FH ++ ATOM_GOLDMONT_PLUS 06_7AH ++ ATOM_TREMONT_D 06_86H ++ ATOM_TREMONT 06_96H ++ ALDERLAKE 06_97H ++ ALDERLAKE_L 06_9AH ++ ATOM_TREMONT_L 06_9CH ++ RAPTORLAKE 06_B7H ++ RAPTORLAKE_P 06_BAH ++ ALDERLAKE_N 06_BEH ++ RAPTORLAKE_S 06_BFH ++ =================== ============ ++ ++As an exception to this table, Intel Xeon E family parts ALDERLAKE(06_97H) and ++RAPTORLAKE(06_B7H) codenamed Catlow are not affected. They are reported as ++vulnerable in Linux because they share the same family/model with an affected ++part. Unlike their affected counterparts, they do not enumerate RFDS_CLEAR or ++CPUID.HYBRID. This information could be used to distinguish between the ++affected and unaffected parts, but it is deemed not worth adding complexity as ++the reporting is fixed automatically when these parts enumerate RFDS_NO. ++ ++Mitigation ++========== ++Intel released a microcode update that enables software to clear sensitive ++information using the VERW instruction. Like MDS, RFDS deploys the same ++mitigation strategy to force the CPU to clear the affected buffers before an ++attacker can extract the secrets. This is achieved by using the otherwise ++unused and obsolete VERW instruction in combination with a microcode update. ++The microcode clears the affected CPU buffers when the VERW instruction is ++executed. ++ ++Mitigation points ++----------------- ++VERW is executed by the kernel before returning to user space, and by KVM ++before VMentry. None of the affected cores support SMT, so VERW is not required ++at C-state transitions. ++ ++New bits in IA32_ARCH_CAPABILITIES ++---------------------------------- ++Newer processors and microcode update on existing affected processors added new ++bits to IA32_ARCH_CAPABILITIES MSR. These bits can be used to enumerate ++vulnerability and mitigation capability: ++ ++- Bit 27 - RFDS_NO - When set, processor is not affected by RFDS. ++- Bit 28 - RFDS_CLEAR - When set, processor is affected by RFDS, and has the ++ microcode that clears the affected buffers on VERW execution. ++ ++Mitigation control on the kernel command line ++--------------------------------------------- ++The kernel command line allows to control RFDS mitigation at boot time with the ++parameter "reg_file_data_sampling=". The valid arguments are: ++ ++ ========== ================================================================= ++ on If the CPU is vulnerable, enable mitigation; CPU buffer clearing ++ on exit to userspace and before entering a VM. ++ off Disables mitigation. ++ ========== ================================================================= ++ ++Mitigation default is selected by CONFIG_MITIGATION_RFDS. ++ ++Mitigation status information ++----------------------------- ++The Linux kernel provides a sysfs interface to enumerate the current ++vulnerability status of the system: whether the system is vulnerable, and ++which mitigations are active. The relevant sysfs file is: ++ ++ /sys/devices/system/cpu/vulnerabilities/reg_file_data_sampling ++ ++The possible values in this file are: ++ ++ .. 
list-table:: ++ ++ * - 'Not affected' ++ - The processor is not vulnerable ++ * - 'Vulnerable' ++ - The processor is vulnerable, but no mitigation enabled ++ * - 'Vulnerable: No microcode' ++ - The processor is vulnerable but microcode is not updated. ++ * - 'Mitigation: Clear Register File' ++ - The processor is vulnerable and the CPU buffer clearing mitigation is ++ enabled. ++ ++References ++---------- ++.. [#f1] Affected Processors ++ https://www.intel.com/content/www/us/en/developer/topic-technology/software-security-guidance/processors-affected-consolidated-product-cpu-model.html diff --git a/queue-5.15/kvm-vmx-move-verw-closer-to-vmentry-for-mds-mitigation.patch b/queue-5.15/kvm-vmx-move-verw-closer-to-vmentry-for-mds-mitigation.patch new file mode 100644 index 00000000000..6e70228d520 --- /dev/null +++ b/queue-5.15/kvm-vmx-move-verw-closer-to-vmentry-for-mds-mitigation.patch @@ -0,0 +1,81 @@ +From stable+bounces-27529-greg=kroah.com@vger.kernel.org Tue Mar 12 22:11:23 2024 +From: Pawan Gupta +Date: Tue, 12 Mar 2024 14:11:14 -0700 +Subject: KVM/VMX: Move VERW closer to VMentry for MDS mitigation +To: stable@vger.kernel.org +Cc: Dave Hansen , Sean Christopherson +Message-ID: <20240312-delay-verw-backport-5-15-y-v2-7-e0f71d17ed1b@linux.intel.com> +Content-Disposition: inline + +From: Pawan Gupta + +commit 43fb862de8f628c5db5e96831c915b9aebf62d33 upstream. + +During VMentry VERW is executed to mitigate MDS. After VERW, any memory +access like register push onto stack may put host data in MDS affected +CPU buffers. A guest can then use MDS to sample host data. + +Although likelihood of secrets surviving in registers at current VERW +callsite is less, but it can't be ruled out. Harden the MDS mitigation +by moving the VERW mitigation late in VMentry path. + +Note that VERW for MMIO Stale Data mitigation is unchanged because of +the complexity of per-guest conditional VERW which is not easy to handle +that late in asm with no GPRs available. If the CPU is also affected by +MDS, VERW is unconditionally executed late in asm regardless of guest +having MMIO access. + + [ pawan: conflict resolved in backport ] + +Signed-off-by: Pawan Gupta +Signed-off-by: Dave Hansen +Acked-by: Sean Christopherson +Link: https://lore.kernel.org/all/20240213-delay-verw-v8-6-a6216d83edb7%40linux.intel.com +Signed-off-by: Greg Kroah-Hartman +--- + arch/x86/kvm/vmx/vmenter.S | 3 +++ + arch/x86/kvm/vmx/vmx.c | 12 ++++++++---- + 2 files changed, 11 insertions(+), 4 deletions(-) + +--- a/arch/x86/kvm/vmx/vmenter.S ++++ b/arch/x86/kvm/vmx/vmenter.S +@@ -99,6 +99,9 @@ SYM_FUNC_START(__vmx_vcpu_run) + /* Load guest RAX. This kills the @regs pointer! */ + mov VCPU_RAX(%_ASM_AX), %_ASM_AX + ++ /* Clobbers EFLAGS.ZF */ ++ CLEAR_CPU_BUFFERS ++ + /* Check EFLAGS.CF from the VMX_RUN_VMRESUME bit test above. 
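+	 * CF is used rather than ZF because the VERW issued by
+	 * CLEAR_CPU_BUFFERS above clobbers ZF; CF survives it.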
*/ + jnc .Lvmlaunch + +--- a/arch/x86/kvm/vmx/vmx.c ++++ b/arch/x86/kvm/vmx/vmx.c +@@ -398,7 +398,8 @@ static __always_inline void vmx_enable_f + + static void vmx_update_fb_clear_dis(struct kvm_vcpu *vcpu, struct vcpu_vmx *vmx) + { +- vmx->disable_fb_clear = vmx_fb_clear_ctrl_available; ++ vmx->disable_fb_clear = !cpu_feature_enabled(X86_FEATURE_CLEAR_CPU_BUF) && ++ vmx_fb_clear_ctrl_available; + + /* + * If guest will not execute VERW, there is no need to set FB_CLEAR_DIS +@@ -6747,11 +6748,14 @@ static noinstr void vmx_vcpu_enter_exit( + { + kvm_guest_enter_irqoff(); + +- /* L1D Flush includes CPU buffer clear to mitigate MDS */ ++ /* ++ * L1D Flush includes CPU buffer clear to mitigate MDS, but VERW ++ * mitigation for MDS is done late in VMentry and is still ++ * executed in spite of L1D Flush. This is because an extra VERW ++ * should not matter much after the big hammer L1D Flush. ++ */ + if (static_branch_unlikely(&vmx_l1d_should_flush)) + vmx_l1d_flush(vcpu); +- else if (cpu_feature_enabled(X86_FEATURE_CLEAR_CPU_BUF)) +- mds_clear_cpu_buffers(); + else if (static_branch_unlikely(&mmio_stale_data_clear) && + kvm_arch_has_assigned_device(vcpu->kvm)) + mds_clear_cpu_buffers(); diff --git a/queue-5.15/kvm-vmx-use-bt-jnc-i.e.-eflags.cf-to-select-vmresume-vs.-vmlaunch.patch b/queue-5.15/kvm-vmx-use-bt-jnc-i.e.-eflags.cf-to-select-vmresume-vs.-vmlaunch.patch new file mode 100644 index 00000000000..55a1512b2b0 --- /dev/null +++ b/queue-5.15/kvm-vmx-use-bt-jnc-i.e.-eflags.cf-to-select-vmresume-vs.-vmlaunch.patch @@ -0,0 +1,68 @@ +From stable+bounces-27528-greg=kroah.com@vger.kernel.org Tue Mar 12 22:11:13 2024 +From: Pawan Gupta +Date: Tue, 12 Mar 2024 14:11:08 -0700 +Subject: KVM/VMX: Use BT+JNC, i.e. EFLAGS.CF to select VMRESUME vs. VMLAUNCH +To: stable@vger.kernel.org +Cc: Sean Christopherson , Dave Hansen , Nikolay Borisov +Message-ID: <20240312-delay-verw-backport-5-15-y-v2-6-e0f71d17ed1b@linux.intel.com> +Content-Disposition: inline + +From: Sean Christopherson + +commit 706a189dcf74d3b3f955e9384785e726ed6c7c80 upstream. + +Use EFLAGS.CF instead of EFLAGS.ZF to track whether to use VMRESUME versus +VMLAUNCH. Freeing up EFLAGS.ZF will allow doing VERW, which clobbers ZF, +for MDS mitigations as late as possible without needing to duplicate VERW +for both paths. + + [ pawan: resolved merge conflict in __vmx_vcpu_run in backport. ] + +Signed-off-by: Sean Christopherson +Signed-off-by: Pawan Gupta +Signed-off-by: Dave Hansen +Reviewed-by: Nikolay Borisov +Link: https://lore.kernel.org/all/20240213-delay-verw-v8-5-a6216d83edb7%40linux.intel.com +Signed-off-by: Greg Kroah-Hartman +--- + arch/x86/kvm/vmx/run_flags.h | 7 +++++-- + arch/x86/kvm/vmx/vmenter.S | 6 +++--- + 2 files changed, 8 insertions(+), 5 deletions(-) + +--- a/arch/x86/kvm/vmx/run_flags.h ++++ b/arch/x86/kvm/vmx/run_flags.h +@@ -2,7 +2,10 @@ + #ifndef __KVM_X86_VMX_RUN_FLAGS_H + #define __KVM_X86_VMX_RUN_FLAGS_H + +-#define VMX_RUN_VMRESUME (1 << 0) +-#define VMX_RUN_SAVE_SPEC_CTRL (1 << 1) ++#define VMX_RUN_VMRESUME_SHIFT 0 ++#define VMX_RUN_SAVE_SPEC_CTRL_SHIFT 1 ++ ++#define VMX_RUN_VMRESUME BIT(VMX_RUN_VMRESUME_SHIFT) ++#define VMX_RUN_SAVE_SPEC_CTRL BIT(VMX_RUN_SAVE_SPEC_CTRL_SHIFT) + + #endif /* __KVM_X86_VMX_RUN_FLAGS_H */ +--- a/arch/x86/kvm/vmx/vmenter.S ++++ b/arch/x86/kvm/vmx/vmenter.S +@@ -77,7 +77,7 @@ SYM_FUNC_START(__vmx_vcpu_run) + mov (%_ASM_SP), %_ASM_AX + + /* Check if vmlaunch or vmresume is needed */ +- testb $VMX_RUN_VMRESUME, %bl ++ bt $VMX_RUN_VMRESUME_SHIFT, %bx + + /* Load guest registers. 
Don't clobber flags. */ + mov VCPU_RCX(%_ASM_AX), %_ASM_CX +@@ -99,8 +99,8 @@ SYM_FUNC_START(__vmx_vcpu_run) + /* Load guest RAX. This kills the @regs pointer! */ + mov VCPU_RAX(%_ASM_AX), %_ASM_AX + +- /* Check EFLAGS.ZF from 'testb' above */ +- jz .Lvmlaunch ++ /* Check EFLAGS.CF from the VMX_RUN_VMRESUME bit test above. */ ++ jnc .Lvmlaunch + + /* + * After a successful VMRESUME/VMLAUNCH, control flow "magically" diff --git a/queue-5.15/kvm-x86-export-rfds_no-and-rfds_clear-to-guests.patch b/queue-5.15/kvm-x86-export-rfds_no-and-rfds_clear-to-guests.patch new file mode 100644 index 00000000000..7a45327d07b --- /dev/null +++ b/queue-5.15/kvm-x86-export-rfds_no-and-rfds_clear-to-guests.patch @@ -0,0 +1,52 @@ +From stable+bounces-27533-greg=kroah.com@vger.kernel.org Tue Mar 12 22:11:46 2024 +From: Pawan Gupta +Date: Tue, 12 Mar 2024 14:11:36 -0700 +Subject: KVM/x86: Export RFDS_NO and RFDS_CLEAR to guests +To: stable@vger.kernel.org +Cc: Dave Hansen , Thomas Gleixner , Josh Poimboeuf +Message-ID: <20240312-delay-verw-backport-5-15-y-v2-11-e0f71d17ed1b@linux.intel.com> +Content-Disposition: inline + +From: Pawan Gupta + +commit 2a0180129d726a4b953232175857d442651b55a0 upstream. + +Mitigation for RFDS requires RFDS_CLEAR capability which is enumerated +by MSR_IA32_ARCH_CAPABILITIES bit 27. If the host has it set, export it +to guests so that they can deploy the mitigation. + +RFDS_NO indicates that the system is not vulnerable to RFDS, export it +to guests so that they don't deploy the mitigation unnecessarily. When +the host is not affected by X86_BUG_RFDS, but has RFDS_NO=0, synthesize +RFDS_NO to the guest. + +Signed-off-by: Pawan Gupta +Signed-off-by: Dave Hansen +Reviewed-by: Thomas Gleixner +Acked-by: Josh Poimboeuf +Signed-off-by: Greg Kroah-Hartman +--- + arch/x86/kvm/x86.c | 5 ++++- + 1 file changed, 4 insertions(+), 1 deletion(-) + +--- a/arch/x86/kvm/x86.c ++++ b/arch/x86/kvm/x86.c +@@ -1498,7 +1498,8 @@ static unsigned int num_msr_based_featur + ARCH_CAP_SKIP_VMENTRY_L1DFLUSH | ARCH_CAP_SSB_NO | ARCH_CAP_MDS_NO | \ + ARCH_CAP_PSCHANGE_MC_NO | ARCH_CAP_TSX_CTRL_MSR | ARCH_CAP_TAA_NO | \ + ARCH_CAP_SBDR_SSDP_NO | ARCH_CAP_FBSDP_NO | ARCH_CAP_PSDP_NO | \ +- ARCH_CAP_FB_CLEAR | ARCH_CAP_RRSBA | ARCH_CAP_PBRSB_NO | ARCH_CAP_GDS_NO) ++ ARCH_CAP_FB_CLEAR | ARCH_CAP_RRSBA | ARCH_CAP_PBRSB_NO | ARCH_CAP_GDS_NO | \ ++ ARCH_CAP_RFDS_NO | ARCH_CAP_RFDS_CLEAR) + + static u64 kvm_get_arch_capabilities(void) + { +@@ -1535,6 +1536,8 @@ static u64 kvm_get_arch_capabilities(voi + data |= ARCH_CAP_SSB_NO; + if (!boot_cpu_has_bug(X86_BUG_MDS)) + data |= ARCH_CAP_MDS_NO; ++ if (!boot_cpu_has_bug(X86_BUG_RFDS)) ++ data |= ARCH_CAP_RFDS_NO; + + if (!boot_cpu_has(X86_FEATURE_RTM)) { + /* diff --git a/queue-5.15/series b/queue-5.15/series index 4990e9ddbea..a73bf3f4d9b 100644 --- a/queue-5.15/series +++ b/queue-5.15/series @@ -149,3 +149,14 @@ printk-update-console_may_schedule-in-console_tryloc.patch tty-serial-imx-fix-broken-rs485.patch kvm-arm64-work-out-supported-block-level-at-compile-time.patch kvm-arm64-limit-stage2_apply_range-batch-size-to-largest-block.patch +x86-asm-add-_asm_rip-macro-for-x86-64-rip-suffix.patch +x86-bugs-add-asm-helpers-for-executing-verw.patch +x86-entry_64-add-verw-just-before-userspace-transition.patch +x86-entry_32-add-verw-just-before-userspace-transition.patch +x86-bugs-use-alternative-instead-of-mds_user_clear-static-key.patch +kvm-vmx-use-bt-jnc-i.e.-eflags.cf-to-select-vmresume-vs.-vmlaunch.patch +kvm-vmx-move-verw-closer-to-vmentry-for-mds-mitigation.patch 
+x86-mmio-disable-kvm-mitigation-when-x86_feature_clear_cpu_buf-is-set.patch +documentation-hw-vuln-add-documentation-for-rfds.patch +x86-rfds-mitigate-register-file-data-sampling-rfds.patch +kvm-x86-export-rfds_no-and-rfds_clear-to-guests.patch diff --git a/queue-5.15/x86-asm-add-_asm_rip-macro-for-x86-64-rip-suffix.patch b/queue-5.15/x86-asm-add-_asm_rip-macro-for-x86-64-rip-suffix.patch new file mode 100644 index 00000000000..94292221980 --- /dev/null +++ b/queue-5.15/x86-asm-add-_asm_rip-macro-for-x86-64-rip-suffix.patch @@ -0,0 +1,55 @@ +From stable+bounces-27524-greg=kroah.com@vger.kernel.org Tue Mar 12 22:11:01 2024 +From: Pawan Gupta +Date: Tue, 12 Mar 2024 14:10:40 -0700 +Subject: x86/asm: Add _ASM_RIP() macro for x86-64 (%rip) suffix +To: stable@vger.kernel.org +Cc: "H. Peter Anvin (Intel)" , Borislav Petkov +Message-ID: <20240312-delay-verw-backport-5-15-y-v2-1-e0f71d17ed1b@linux.intel.com> +Content-Disposition: inline + +From: "H. Peter Anvin (Intel)" + +commit f87bc8dc7a7c438c70f97b4e51c76a183313272e upstream. + +Add a macro _ASM_RIP() to add a (%rip) suffix on 64 bits only. This is +useful for immediate memory references where one doesn't want gcc +to possibly use a register indirection as it may in the case of an "m" +constraint. + + [ pawan: resolved merged conflict for __ASM_REGPFX ] + +Signed-off-by: H. Peter Anvin (Intel) +Signed-off-by: Borislav Petkov +Signed-off-by: Pawan Gupta +Link: https://lkml.kernel.org/r/20210910195910.2542662-3-hpa@zytor.com +Signed-off-by: Greg Kroah-Hartman +--- + arch/x86/include/asm/asm.h | 5 +++++ + 1 file changed, 5 insertions(+) + +--- a/arch/x86/include/asm/asm.h ++++ b/arch/x86/include/asm/asm.h +@@ -6,11 +6,13 @@ + # define __ASM_FORM(x, ...) x,## __VA_ARGS__ + # define __ASM_FORM_RAW(x, ...) x,## __VA_ARGS__ + # define __ASM_FORM_COMMA(x, ...) x,## __VA_ARGS__, ++# define __ASM_REGPFX % + #else + #include + # define __ASM_FORM(x, ...) " " __stringify(x,##__VA_ARGS__) " " + # define __ASM_FORM_RAW(x, ...) __stringify(x,##__VA_ARGS__) + # define __ASM_FORM_COMMA(x, ...) " " __stringify(x,##__VA_ARGS__) "," ++# define __ASM_REGPFX %% + #endif + + #define _ASM_BYTES(x, ...) __ASM_FORM(.byte x,##__VA_ARGS__ ;) +@@ -49,6 +51,9 @@ + #define _ASM_SI __ASM_REG(si) + #define _ASM_DI __ASM_REG(di) + ++/* Adds a (%rip) suffix on 64 bits only; for immediate memory references */ ++#define _ASM_RIP(x) __ASM_SEL_RAW(x, x (__ASM_REGPFX rip)) ++ + #ifndef __x86_64__ + /* 32 bit */ + diff --git a/queue-5.15/x86-bugs-add-asm-helpers-for-executing-verw.patch b/queue-5.15/x86-bugs-add-asm-helpers-for-executing-verw.patch new file mode 100644 index 00000000000..f625499dd6c --- /dev/null +++ b/queue-5.15/x86-bugs-add-asm-helpers-for-executing-verw.patch @@ -0,0 +1,132 @@ +From stable+bounces-27523-greg=kroah.com@vger.kernel.org Tue Mar 12 22:10:59 2024 +From: Pawan Gupta +Date: Tue, 12 Mar 2024 14:10:46 -0700 +Subject: x86/bugs: Add asm helpers for executing VERW +To: stable@vger.kernel.org +Cc: Alyssa Milburn , Andrew Cooper , Peter Zijlstra , Dave Hansen +Message-ID: <20240312-delay-verw-backport-5-15-y-v2-2-e0f71d17ed1b@linux.intel.com> +Content-Disposition: inline + +From: Pawan Gupta + +commit baf8361e54550a48a7087b603313ad013cc13386 upstream. + +MDS mitigation requires clearing the CPU buffers before returning to +user. This needs to be done late in the exit-to-user path. Current +location of VERW leaves a possibility of kernel data ending up in CPU +buffers for memory accesses done after VERW such as: + + 1. 
Kernel data accessed by an NMI between VERW and return-to-user can + remain in CPU buffers since NMI returning to kernel does not + execute VERW to clear CPU buffers. + 2. Alyssa reported that after VERW is executed, + CONFIG_GCC_PLUGIN_STACKLEAK=y scrubs the stack used by a system + call. Memory accesses during stack scrubbing can move kernel stack + contents into CPU buffers. + 3. When caller saved registers are restored after a return from + function executing VERW, the kernel stack accesses can remain in + CPU buffers(since they occur after VERW). + +To fix this VERW needs to be moved very late in exit-to-user path. + +In preparation for moving VERW to entry/exit asm code, create macros +that can be used in asm. Also make VERW patching depend on a new feature +flag X86_FEATURE_CLEAR_CPU_BUF. + + [pawan: - Runtime patch jmp instead of verw in macro CLEAR_CPU_BUFFERS + due to lack of relative addressing support for relocations + in kernels < v6.5. + - Add UNWIND_HINT_EMPTY to avoid warning: + arch/x86/entry/entry.o: warning: objtool: mds_verw_sel+0x0: unreachable instruction] + +Reported-by: Alyssa Milburn +Suggested-by: Andrew Cooper +Suggested-by: Peter Zijlstra +Signed-off-by: Pawan Gupta +Signed-off-by: Dave Hansen +Link: https://lore.kernel.org/all/20240213-delay-verw-v8-1-a6216d83edb7%40linux.intel.com +Signed-off-by: Greg Kroah-Hartman +--- + arch/x86/entry/entry.S | 23 +++++++++++++++++++++++ + arch/x86/include/asm/cpufeatures.h | 2 +- + arch/x86/include/asm/nospec-branch.h | 15 +++++++++++++++ + 3 files changed, 39 insertions(+), 1 deletion(-) + +--- a/arch/x86/entry/entry.S ++++ b/arch/x86/entry/entry.S +@@ -6,6 +6,9 @@ + #include + #include + #include ++#include ++#include ++#include + + .pushsection .noinstr.text, "ax" + +@@ -20,3 +23,23 @@ SYM_FUNC_END(entry_ibpb) + EXPORT_SYMBOL_GPL(entry_ibpb); + + .popsection ++ ++/* ++ * Define the VERW operand that is disguised as entry code so that ++ * it can be referenced with KPTI enabled. This ensure VERW can be ++ * used late in exit-to-user path after page tables are switched. ++ */ ++.pushsection .entry.text, "ax" ++ ++.align L1_CACHE_BYTES, 0xcc ++SYM_CODE_START_NOALIGN(mds_verw_sel) ++ UNWIND_HINT_EMPTY ++ ANNOTATE_NOENDBR ++ .word __KERNEL_DS ++.align L1_CACHE_BYTES, 0xcc ++SYM_CODE_END(mds_verw_sel); ++/* For KVM */ ++EXPORT_SYMBOL_GPL(mds_verw_sel); ++ ++.popsection ++ +--- a/arch/x86/include/asm/cpufeatures.h ++++ b/arch/x86/include/asm/cpufeatures.h +@@ -302,7 +302,7 @@ + #define X86_FEATURE_UNRET (11*32+15) /* "" AMD BTB untrain return */ + #define X86_FEATURE_USE_IBPB_FW (11*32+16) /* "" Use IBPB during runtime firmware calls */ + #define X86_FEATURE_RSB_VMEXIT_LITE (11*32+17) /* "" Fill RSB on VM exit when EIBRS is enabled */ +- ++#define X86_FEATURE_CLEAR_CPU_BUF (11*32+18) /* "" Clear CPU buffers using VERW */ + + #define X86_FEATURE_MSR_TSX_CTRL (11*32+20) /* "" MSR IA32_TSX_CTRL (Intel) implemented */ + +--- a/arch/x86/include/asm/nospec-branch.h ++++ b/arch/x86/include/asm/nospec-branch.h +@@ -182,6 +182,19 @@ + #endif + .endm + ++/* ++ * Macro to execute VERW instruction that mitigate transient data sampling ++ * attacks such as MDS. On affected systems a microcode update overloaded VERW ++ * instruction to also clear the CPU buffers. VERW clobbers CFLAGS.ZF. ++ * ++ * Note: Only the memory operand variant of VERW clears the CPU buffers. 
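++ * The register-operand form is not overloaded by the microcode and does
++ * not clear them. The operand used below, mds_verw_sel, is a __KERNEL_DS
++ * selector word kept in .entry.text (see entry.S) so it remains mapped
++ * after the KPTI page-table switch.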
++ */ ++.macro CLEAR_CPU_BUFFERS ++ ALTERNATIVE "jmp .Lskip_verw_\@", "", X86_FEATURE_CLEAR_CPU_BUF ++ verw _ASM_RIP(mds_verw_sel) ++.Lskip_verw_\@: ++.endm ++ + #else /* __ASSEMBLY__ */ + + #define ANNOTATE_RETPOLINE_SAFE \ +@@ -364,6 +377,8 @@ DECLARE_STATIC_KEY_FALSE(switch_mm_cond_ + + DECLARE_STATIC_KEY_FALSE(mmio_stale_data_clear); + ++extern u16 mds_verw_sel; ++ + #include + + /** diff --git a/queue-5.15/x86-bugs-use-alternative-instead-of-mds_user_clear-static-key.patch b/queue-5.15/x86-bugs-use-alternative-instead-of-mds_user_clear-static-key.patch new file mode 100644 index 00000000000..bcbc7e93970 --- /dev/null +++ b/queue-5.15/x86-bugs-use-alternative-instead-of-mds_user_clear-static-key.patch @@ -0,0 +1,204 @@ +From stable+bounces-27527-greg=kroah.com@vger.kernel.org Tue Mar 12 22:11:10 2024 +From: Pawan Gupta +Date: Tue, 12 Mar 2024 14:11:02 -0700 +Subject: x86/bugs: Use ALTERNATIVE() instead of mds_user_clear static key +To: stable@vger.kernel.org +Cc: Dave Hansen +Message-ID: <20240312-delay-verw-backport-5-15-y-v2-5-e0f71d17ed1b@linux.intel.com> +Content-Disposition: inline + +From: Pawan Gupta + +commit 6613d82e617dd7eb8b0c40b2fe3acea655b1d611 upstream. + +The VERW mitigation at exit-to-user is enabled via a static branch +mds_user_clear. This static branch is never toggled after boot, and can +be safely replaced with an ALTERNATIVE() which is convenient to use in +asm. + +Switch to ALTERNATIVE() to use the VERW mitigation late in exit-to-user +path. Also remove the now redundant VERW in exc_nmi() and +arch_exit_to_user_mode(). + +Signed-off-by: Pawan Gupta +Signed-off-by: Dave Hansen +Link: https://lore.kernel.org/all/20240213-delay-verw-v8-4-a6216d83edb7%40linux.intel.com +Signed-off-by: Greg Kroah-Hartman +--- + Documentation/x86/mds.rst | 36 +++++++++++++++++++++++++---------- + arch/x86/include/asm/entry-common.h | 1 + arch/x86/include/asm/nospec-branch.h | 12 ----------- + arch/x86/kernel/cpu/bugs.c | 15 +++++--------- + arch/x86/kernel/nmi.c | 3 -- + arch/x86/kvm/vmx/vmx.c | 2 - + 6 files changed, 33 insertions(+), 36 deletions(-) + +--- a/Documentation/x86/mds.rst ++++ b/Documentation/x86/mds.rst +@@ -95,6 +95,9 @@ The kernel provides a function to invoke + + mds_clear_cpu_buffers() + ++Also macro CLEAR_CPU_BUFFERS can be used in ASM late in exit-to-user path. ++Other than CFLAGS.ZF, this macro doesn't clobber any registers. ++ + The mitigation is invoked on kernel/userspace, hypervisor/guest and C-state + (idle) transitions. + +@@ -138,17 +141,30 @@ Mitigation points + + When transitioning from kernel to user space the CPU buffers are flushed + on affected CPUs when the mitigation is not disabled on the kernel +- command line. The migitation is enabled through the static key +- mds_user_clear. ++ command line. The mitigation is enabled through the feature flag ++ X86_FEATURE_CLEAR_CPU_BUF. + +- The mitigation is invoked in prepare_exit_to_usermode() which covers +- all but one of the kernel to user space transitions. The exception +- is when we return from a Non Maskable Interrupt (NMI), which is +- handled directly in do_nmi(). +- +- (The reason that NMI is special is that prepare_exit_to_usermode() can +- enable IRQs. In NMI context, NMIs are blocked, and we don't want to +- enable IRQs with NMIs blocked.) ++ The mitigation is invoked just before transitioning to userspace after ++ user registers are restored. This is done to minimize the window in ++ which kernel data could be accessed after VERW e.g. via an NMI after ++ VERW. 
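++
++  For example, the 64-bit SYSCALL return path in this series executes it
++  as the very last step before SYSRET (sketch mirroring the entry_64.S
++  change elsewhere in this series)::
++
++	popq	%rdi
++	popq	%rsp
++	swapgs
++	CLEAR_CPU_BUFFERS
++	sysretq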
++ ++ **Corner case not handled** ++ Interrupts returning to kernel don't clear CPUs buffers since the ++ exit-to-user path is expected to do that anyways. But, there could be ++ a case when an NMI is generated in kernel after the exit-to-user path ++ has cleared the buffers. This case is not handled and NMI returning to ++ kernel don't clear CPU buffers because: ++ ++ 1. It is rare to get an NMI after VERW, but before returning to userspace. ++ 2. For an unprivileged user, there is no known way to make that NMI ++ less rare or target it. ++ 3. It would take a large number of these precisely-timed NMIs to mount ++ an actual attack. There's presumably not enough bandwidth. ++ 4. The NMI in question occurs after a VERW, i.e. when user state is ++ restored and most interesting data is already scrubbed. Whats left ++ is only the data that NMI touches, and that may or may not be of ++ any interest. + + + 2. C-State transition +--- a/arch/x86/include/asm/entry-common.h ++++ b/arch/x86/include/asm/entry-common.h +@@ -91,7 +91,6 @@ static inline void arch_exit_to_user_mod + + static __always_inline void arch_exit_to_user_mode(void) + { +- mds_user_clear_cpu_buffers(); + amd_clear_divider(); + } + #define arch_exit_to_user_mode arch_exit_to_user_mode +--- a/arch/x86/include/asm/nospec-branch.h ++++ b/arch/x86/include/asm/nospec-branch.h +@@ -370,7 +370,6 @@ DECLARE_STATIC_KEY_FALSE(switch_to_cond_ + DECLARE_STATIC_KEY_FALSE(switch_mm_cond_ibpb); + DECLARE_STATIC_KEY_FALSE(switch_mm_always_ibpb); + +-DECLARE_STATIC_KEY_FALSE(mds_user_clear); + DECLARE_STATIC_KEY_FALSE(mds_idle_clear); + + DECLARE_STATIC_KEY_FALSE(switch_mm_cond_l1d_flush); +@@ -405,17 +404,6 @@ static __always_inline void mds_clear_cp + } + + /** +- * mds_user_clear_cpu_buffers - Mitigation for MDS and TAA vulnerability +- * +- * Clear CPU buffers if the corresponding static key is enabled +- */ +-static __always_inline void mds_user_clear_cpu_buffers(void) +-{ +- if (static_branch_likely(&mds_user_clear)) +- mds_clear_cpu_buffers(); +-} +- +-/** + * mds_idle_clear_cpu_buffers - Mitigation for MDS vulnerability + * + * Clear CPU buffers if the corresponding static key is enabled +--- a/arch/x86/kernel/cpu/bugs.c ++++ b/arch/x86/kernel/cpu/bugs.c +@@ -110,9 +110,6 @@ DEFINE_STATIC_KEY_FALSE(switch_mm_cond_i + /* Control unconditional IBPB in switch_mm() */ + DEFINE_STATIC_KEY_FALSE(switch_mm_always_ibpb); + +-/* Control MDS CPU buffer clear before returning to user space */ +-DEFINE_STATIC_KEY_FALSE(mds_user_clear); +-EXPORT_SYMBOL_GPL(mds_user_clear); + /* Control MDS CPU buffer clear before idling (halt, mwait) */ + DEFINE_STATIC_KEY_FALSE(mds_idle_clear); + EXPORT_SYMBOL_GPL(mds_idle_clear); +@@ -258,7 +255,7 @@ static void __init mds_select_mitigation + if (!boot_cpu_has(X86_FEATURE_MD_CLEAR)) + mds_mitigation = MDS_MITIGATION_VMWERV; + +- static_branch_enable(&mds_user_clear); ++ setup_force_cpu_cap(X86_FEATURE_CLEAR_CPU_BUF); + + if (!boot_cpu_has(X86_BUG_MSBDS_ONLY) && + (mds_nosmt || cpu_mitigations_auto_nosmt())) +@@ -362,7 +359,7 @@ static void __init taa_select_mitigation + * For guests that can't determine whether the correct microcode is + * present on host, enable the mitigation for UCODE_NEEDED as well. 
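+	 * The worst case with old host microcode is a VERW that performs
+	 * only its legacy segment check and clears nothing.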
+ */ +- static_branch_enable(&mds_user_clear); ++ setup_force_cpu_cap(X86_FEATURE_CLEAR_CPU_BUF); + + if (taa_nosmt || cpu_mitigations_auto_nosmt()) + cpu_smt_disable(false); +@@ -430,7 +427,7 @@ static void __init mmio_select_mitigatio + */ + if (boot_cpu_has_bug(X86_BUG_MDS) || (boot_cpu_has_bug(X86_BUG_TAA) && + boot_cpu_has(X86_FEATURE_RTM))) +- static_branch_enable(&mds_user_clear); ++ setup_force_cpu_cap(X86_FEATURE_CLEAR_CPU_BUF); + else + static_branch_enable(&mmio_stale_data_clear); + +@@ -490,12 +487,12 @@ static void __init md_clear_update_mitig + if (cpu_mitigations_off()) + return; + +- if (!static_key_enabled(&mds_user_clear)) ++ if (!boot_cpu_has(X86_FEATURE_CLEAR_CPU_BUF)) + goto out; + + /* +- * mds_user_clear is now enabled. Update MDS, TAA and MMIO Stale Data +- * mitigation, if necessary. ++ * X86_FEATURE_CLEAR_CPU_BUF is now enabled. Update MDS, TAA and MMIO ++ * Stale Data mitigation, if necessary. + */ + if (mds_mitigation == MDS_MITIGATION_OFF && + boot_cpu_has_bug(X86_BUG_MDS)) { +--- a/arch/x86/kernel/nmi.c ++++ b/arch/x86/kernel/nmi.c +@@ -519,9 +519,6 @@ nmi_restart: + write_cr2(this_cpu_read(nmi_cr2)); + if (this_cpu_dec_return(nmi_state)) + goto nmi_restart; +- +- if (user_mode(regs)) +- mds_user_clear_cpu_buffers(); + } + + #if defined(CONFIG_X86_64) && IS_ENABLED(CONFIG_KVM_INTEL) +--- a/arch/x86/kvm/vmx/vmx.c ++++ b/arch/x86/kvm/vmx/vmx.c +@@ -6750,7 +6750,7 @@ static noinstr void vmx_vcpu_enter_exit( + /* L1D Flush includes CPU buffer clear to mitigate MDS */ + if (static_branch_unlikely(&vmx_l1d_should_flush)) + vmx_l1d_flush(vcpu); +- else if (static_branch_unlikely(&mds_user_clear)) ++ else if (cpu_feature_enabled(X86_FEATURE_CLEAR_CPU_BUF)) + mds_clear_cpu_buffers(); + else if (static_branch_unlikely(&mmio_stale_data_clear) && + kvm_arch_has_assigned_device(vcpu->kvm)) diff --git a/queue-5.15/x86-entry_32-add-verw-just-before-userspace-transition.patch b/queue-5.15/x86-entry_32-add-verw-just-before-userspace-transition.patch new file mode 100644 index 00000000000..b17ca8c692a --- /dev/null +++ b/queue-5.15/x86-entry_32-add-verw-just-before-userspace-transition.patch @@ -0,0 +1,50 @@ +From stable+bounces-27526-greg=kroah.com@vger.kernel.org Tue Mar 12 22:11:07 2024 +From: Pawan Gupta +Date: Tue, 12 Mar 2024 14:10:57 -0700 +Subject: x86/entry_32: Add VERW just before userspace transition +To: stable@vger.kernel.org +Cc: Dave Hansen +Message-ID: <20240312-delay-verw-backport-5-15-y-v2-4-e0f71d17ed1b@linux.intel.com> +Content-Disposition: inline + +From: Pawan Gupta + +commit a0e2dab44d22b913b4c228c8b52b2a104434b0b3 upstream. + +As done for entry_64, add support for executing VERW late in exit to +user path for 32-bit mode. + +Signed-off-by: Pawan Gupta +Signed-off-by: Dave Hansen +Link: https://lore.kernel.org/all/20240213-delay-verw-v8-3-a6216d83edb7%40linux.intel.com +Signed-off-by: Greg Kroah-Hartman +--- + arch/x86/entry/entry_32.S | 3 +++ + 1 file changed, 3 insertions(+) + +--- a/arch/x86/entry/entry_32.S ++++ b/arch/x86/entry/entry_32.S +@@ -912,6 +912,7 @@ SYM_FUNC_START(entry_SYSENTER_32) + BUG_IF_WRONG_CR3 no_user_check=1 + popfl + popl %eax ++ CLEAR_CPU_BUFFERS + + /* + * Return back to the vDSO, which will pop ecx and edx. +@@ -981,6 +982,7 @@ restore_all_switch_stack: + + /* Restore user state */ + RESTORE_REGS pop=4 # skip orig_eax/error_code ++ CLEAR_CPU_BUFFERS + .Lirq_return: + /* + * ARCH_HAS_MEMBARRIER_SYNC_CORE rely on IRET core serialization +@@ -1173,6 +1175,7 @@ SYM_CODE_START(asm_exc_nmi) + + /* Not on SYSENTER stack. 
*/ + call exc_nmi ++ CLEAR_CPU_BUFFERS + jmp .Lnmi_return + + .Lnmi_from_sysenter_stack: diff --git a/queue-5.15/x86-entry_64-add-verw-just-before-userspace-transition.patch b/queue-5.15/x86-entry_64-add-verw-just-before-userspace-transition.patch new file mode 100644 index 00000000000..b647b455283 --- /dev/null +++ b/queue-5.15/x86-entry_64-add-verw-just-before-userspace-transition.patch @@ -0,0 +1,113 @@ +From stable+bounces-27525-greg=kroah.com@vger.kernel.org Tue Mar 12 22:11:02 2024 +From: Pawan Gupta +Date: Tue, 12 Mar 2024 14:10:51 -0700 +Subject: x86/entry_64: Add VERW just before userspace transition +To: stable@vger.kernel.org +Cc: Dave Hansen , Dave Hansen +Message-ID: <20240312-delay-verw-backport-5-15-y-v2-3-e0f71d17ed1b@linux.intel.com> +Content-Disposition: inline + +From: Pawan Gupta + +commit 3c7501722e6b31a6e56edd23cea5e77dbb9ffd1a upstream. + +Mitigation for MDS is to use VERW instruction to clear any secrets in +CPU Buffers. Any memory accesses after VERW execution can still remain +in CPU buffers. It is safer to execute VERW late in return to user path +to minimize the window in which kernel data can end up in CPU buffers. +There are not many kernel secrets to be had after SWITCH_TO_USER_CR3. + +Add support for deploying VERW mitigation after user register state is +restored. This helps minimize the chances of kernel data ending up into +CPU buffers after executing VERW. + +Note that the mitigation at the new location is not yet enabled. + + Corner case not handled + ======================= + Interrupts returning to kernel don't clear CPUs buffers since the + exit-to-user path is expected to do that anyways. But, there could be + a case when an NMI is generated in kernel after the exit-to-user path + has cleared the buffers. This case is not handled and NMI returning to + kernel don't clear CPU buffers because: + + 1. It is rare to get an NMI after VERW, but before returning to user. + 2. For an unprivileged user, there is no known way to make that NMI + less rare or target it. + 3. It would take a large number of these precisely-timed NMIs to mount + an actual attack. There's presumably not enough bandwidth. + 4. The NMI in question occurs after a VERW, i.e. when user state is + restored and most interesting data is already scrubbed. Whats left + is only the data that NMI touches, and that may or may not be of + any interest. + + [ pawan: resolved conflict for hunk swapgs_restore_regs_and_return_to_usermode ] + +Suggested-by: Dave Hansen +Signed-off-by: Pawan Gupta +Signed-off-by: Dave Hansen +Link: https://lore.kernel.org/all/20240213-delay-verw-v8-2-a6216d83edb7%40linux.intel.com +Signed-off-by: Greg Kroah-Hartman +--- + arch/x86/entry/entry_64.S | 11 +++++++++++ + arch/x86/entry/entry_64_compat.S | 1 + + 2 files changed, 12 insertions(+) + +--- a/arch/x86/entry/entry_64.S ++++ b/arch/x86/entry/entry_64.S +@@ -219,6 +219,7 @@ syscall_return_via_sysret: + popq %rdi + popq %rsp + swapgs ++ CLEAR_CPU_BUFFERS + sysretq + SYM_CODE_END(entry_SYSCALL_64) + +@@ -637,6 +638,7 @@ SYM_INNER_LABEL(swapgs_restore_regs_and_ + /* Restore RDI. 
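+	 * CLEAR_CPU_BUFFERS below runs after every user register has been
+	 * restored, minimizing the window for kernel data to reach the CPU
+	 * buffers before the return to user space.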
*/ + popq %rdi + SWAPGS ++ CLEAR_CPU_BUFFERS + INTERRUPT_RETURN + + +@@ -743,6 +745,8 @@ native_irq_return_ldt: + */ + popq %rax /* Restore user RAX */ + ++ CLEAR_CPU_BUFFERS ++ + /* + * RSP now points to an ordinary IRET frame, except that the page + * is read-only and RSP[31:16] are preloaded with the userspace +@@ -1466,6 +1470,12 @@ nmi_restore: + movq $0, 5*8(%rsp) /* clear "NMI executing" */ + + /* ++ * Skip CLEAR_CPU_BUFFERS here, since it only helps in rare cases like ++ * NMI in kernel after user state is restored. For an unprivileged user ++ * these conditions are hard to meet. ++ */ ++ ++ /* + * iretq reads the "iret" frame and exits the NMI stack in a + * single instruction. We are returning to kernel mode, so this + * cannot result in a fault. Similarly, we don't need to worry +@@ -1482,6 +1492,7 @@ SYM_CODE_END(asm_exc_nmi) + SYM_CODE_START(ignore_sysret) + UNWIND_HINT_EMPTY + mov $-ENOSYS, %eax ++ CLEAR_CPU_BUFFERS + sysretl + SYM_CODE_END(ignore_sysret) + #endif +--- a/arch/x86/entry/entry_64_compat.S ++++ b/arch/x86/entry/entry_64_compat.S +@@ -319,6 +319,7 @@ sysret32_from_system_call: + xorl %r9d, %r9d + xorl %r10d, %r10d + swapgs ++ CLEAR_CPU_BUFFERS + sysretl + SYM_CODE_END(entry_SYSCALL_compat) + diff --git a/queue-5.15/x86-mmio-disable-kvm-mitigation-when-x86_feature_clear_cpu_buf-is-set.patch b/queue-5.15/x86-mmio-disable-kvm-mitigation-when-x86_feature_clear_cpu_buf-is-set.patch new file mode 100644 index 00000000000..72eb3162a20 --- /dev/null +++ b/queue-5.15/x86-mmio-disable-kvm-mitigation-when-x86_feature_clear_cpu_buf-is-set.patch @@ -0,0 +1,61 @@ +From stable+bounces-27530-greg=kroah.com@vger.kernel.org Tue Mar 12 22:11:26 2024 +From: Pawan Gupta +Date: Tue, 12 Mar 2024 14:11:19 -0700 +Subject: x86/mmio: Disable KVM mitigation when X86_FEATURE_CLEAR_CPU_BUF is set +To: stable@vger.kernel.org +Cc: Dave Hansen +Message-ID: <20240312-delay-verw-backport-5-15-y-v2-8-e0f71d17ed1b@linux.intel.com> +Content-Disposition: inline + +From: Pawan Gupta + +commit e95df4ec0c0c9791941f112db699fae794b9862a upstream. + +Currently MMIO Stale Data mitigation for CPUs not affected by MDS/TAA is +to only deploy VERW at VMentry by enabling mmio_stale_data_clear static +branch. No mitigation is needed for kernel->user transitions. If such +CPUs are also affected by RFDS, its mitigation may set +X86_FEATURE_CLEAR_CPU_BUF to deploy VERW at kernel->user and VMentry. +This could result in duplicate VERW at VMentry. + +Fix this by disabling mmio_stale_data_clear static branch when +X86_FEATURE_CLEAR_CPU_BUF is enabled. + +Signed-off-by: Pawan Gupta +Signed-off-by: Dave Hansen +Reviewed-by: Dave Hansen +Signed-off-by: Greg Kroah-Hartman +--- + arch/x86/kernel/cpu/bugs.c | 14 ++++++++++++-- + 1 file changed, 12 insertions(+), 2 deletions(-) + +--- a/arch/x86/kernel/cpu/bugs.c ++++ b/arch/x86/kernel/cpu/bugs.c +@@ -428,6 +428,13 @@ static void __init mmio_select_mitigatio + if (boot_cpu_has_bug(X86_BUG_MDS) || (boot_cpu_has_bug(X86_BUG_TAA) && + boot_cpu_has(X86_FEATURE_RTM))) + setup_force_cpu_cap(X86_FEATURE_CLEAR_CPU_BUF); ++ ++ /* ++ * X86_FEATURE_CLEAR_CPU_BUF could be enabled by other VERW based ++ * mitigations, disable KVM-only mitigation in that case. 
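++	 * Otherwise VERW could run twice on VMentry: once via
++	 * CLEAR_CPU_BUFFERS in the VMentry asm and once via
++	 * mds_clear_cpu_buffers() from the static branch.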
++ */ ++ if (boot_cpu_has(X86_FEATURE_CLEAR_CPU_BUF)) ++ static_branch_disable(&mmio_stale_data_clear); + else + static_branch_enable(&mmio_stale_data_clear); + +@@ -504,8 +511,11 @@ static void __init md_clear_update_mitig + taa_mitigation = TAA_MITIGATION_VERW; + taa_select_mitigation(); + } +- if (mmio_mitigation == MMIO_MITIGATION_OFF && +- boot_cpu_has_bug(X86_BUG_MMIO_STALE_DATA)) { ++ /* ++ * MMIO_MITIGATION_OFF is not checked here so that mmio_stale_data_clear ++ * gets updated correctly as per X86_FEATURE_CLEAR_CPU_BUF state. ++ */ ++ if (boot_cpu_has_bug(X86_BUG_MMIO_STALE_DATA)) { + mmio_mitigation = MMIO_MITIGATION_VERW; + mmio_select_mitigation(); + } diff --git a/queue-5.15/x86-rfds-mitigate-register-file-data-sampling-rfds.patch b/queue-5.15/x86-rfds-mitigate-register-file-data-sampling-rfds.patch new file mode 100644 index 00000000000..2a4da33b490 --- /dev/null +++ b/queue-5.15/x86-rfds-mitigate-register-file-data-sampling-rfds.patch @@ -0,0 +1,381 @@ +From stable+bounces-27532-greg=kroah.com@vger.kernel.org Tue Mar 12 22:11:38 2024 +From: Pawan Gupta +Date: Tue, 12 Mar 2024 14:11:30 -0700 +Subject: x86/rfds: Mitigate Register File Data Sampling (RFDS) +To: stable@vger.kernel.org +Cc: Dave Hansen , Thomas Gleixner , Josh Poimboeuf +Message-ID: <20240312-delay-verw-backport-5-15-y-v2-10-e0f71d17ed1b@linux.intel.com> +Content-Disposition: inline + +From: Pawan Gupta + +commit 8076fcde016c9c0e0660543e67bff86cb48a7c9c upstream. + +RFDS is a CPU vulnerability that may allow userspace to infer kernel +stale data previously used in floating point registers, vector registers +and integer registers. RFDS only affects certain Intel Atom processors. + +Intel released a microcode update that uses VERW instruction to clear +the affected CPU buffers. Unlike MDS, none of the affected cores support +SMT. + +Add RFDS bug infrastructure and enable the VERW based mitigation by +default, that clears the affected buffers just before exiting to +userspace. Also add sysfs reporting and cmdline parameter +"reg_file_data_sampling" to control the mitigation. + +For details see: +Documentation/admin-guide/hw-vuln/reg-file-data-sampling.rst + + [ pawan: - Resolved conflicts in sysfs reporting. + - s/ATOM_GRACEMONT/ALDERLAKE_N/ATOM_GRACEMONT is called + ALDERLAKE_N in 6.6. 
] + +Signed-off-by: Pawan Gupta +Signed-off-by: Dave Hansen +Reviewed-by: Thomas Gleixner +Acked-by: Josh Poimboeuf +Signed-off-by: Greg Kroah-Hartman +--- + Documentation/ABI/testing/sysfs-devices-system-cpu | 1 + Documentation/admin-guide/kernel-parameters.txt | 21 +++++ + arch/x86/Kconfig | 11 ++ + arch/x86/include/asm/cpufeatures.h | 1 + arch/x86/include/asm/msr-index.h | 8 ++ + arch/x86/kernel/cpu/bugs.c | 78 ++++++++++++++++++++- + arch/x86/kernel/cpu/common.c | 38 +++++++++- + drivers/base/cpu.c | 8 ++ + include/linux/cpu.h | 2 + 9 files changed, 162 insertions(+), 6 deletions(-) + +--- a/Documentation/ABI/testing/sysfs-devices-system-cpu ++++ b/Documentation/ABI/testing/sysfs-devices-system-cpu +@@ -517,6 +517,7 @@ What: /sys/devices/system/cpu/vulnerabi + /sys/devices/system/cpu/vulnerabilities/mds + /sys/devices/system/cpu/vulnerabilities/meltdown + /sys/devices/system/cpu/vulnerabilities/mmio_stale_data ++ /sys/devices/system/cpu/vulnerabilities/reg_file_data_sampling + /sys/devices/system/cpu/vulnerabilities/retbleed + /sys/devices/system/cpu/vulnerabilities/spec_store_bypass + /sys/devices/system/cpu/vulnerabilities/spectre_v1 +--- a/Documentation/admin-guide/kernel-parameters.txt ++++ b/Documentation/admin-guide/kernel-parameters.txt +@@ -1037,6 +1037,26 @@ + The filter can be disabled or changed to another + driver later using sysfs. + ++ reg_file_data_sampling= ++ [X86] Controls mitigation for Register File Data ++ Sampling (RFDS) vulnerability. RFDS is a CPU ++ vulnerability which may allow userspace to infer ++ kernel data values previously stored in floating point ++ registers, vector registers, or integer registers. ++ RFDS only affects Intel Atom processors. ++ ++ on: Turns ON the mitigation. ++ off: Turns OFF the mitigation. ++ ++ This parameter overrides the compile time default set ++ by CONFIG_MITIGATION_RFDS. Mitigation cannot be ++ disabled when other VERW based mitigations (like MDS) ++ are enabled. In order to disable RFDS mitigation all ++ VERW based mitigations need to be disabled. ++ ++ For details see: ++ Documentation/admin-guide/hw-vuln/reg-file-data-sampling.rst ++ + driver_async_probe= [KNL] + List of driver names to be probed asynchronously. + Format: ,... +@@ -3070,6 +3090,7 @@ + nopti [X86,PPC] + nospectre_v1 [X86,PPC] + nospectre_v2 [X86,PPC,S390,ARM64] ++ reg_file_data_sampling=off [X86] + retbleed=off [X86] + spec_store_bypass_disable=off [X86,PPC] + spectre_v2_user=off [X86] +--- a/arch/x86/Kconfig ++++ b/arch/x86/Kconfig +@@ -2492,6 +2492,17 @@ config GDS_FORCE_MITIGATION + + If in doubt, say N. + ++config MITIGATION_RFDS ++ bool "RFDS Mitigation" ++ depends on CPU_SUP_INTEL ++ default y ++ help ++ Enable mitigation for Register File Data Sampling (RFDS) by default. ++ RFDS is a hardware vulnerability which affects Intel Atom CPUs. It ++ allows unprivileged speculative access to stale data previously ++ stored in floating point, vector and integer registers. 
++ See also ++ + endif + + config ARCH_HAS_ADD_PAGES +--- a/arch/x86/include/asm/cpufeatures.h ++++ b/arch/x86/include/asm/cpufeatures.h +@@ -467,4 +467,5 @@ + /* BUG word 2 */ + #define X86_BUG_SRSO X86_BUG(1*32 + 0) /* AMD SRSO bug */ + #define X86_BUG_DIV0 X86_BUG(1*32 + 1) /* AMD DIV0 speculation bug */ ++#define X86_BUG_RFDS X86_BUG(1*32 + 2) /* CPU is vulnerable to Register File Data Sampling */ + #endif /* _ASM_X86_CPUFEATURES_H */ +--- a/arch/x86/include/asm/msr-index.h ++++ b/arch/x86/include/asm/msr-index.h +@@ -168,6 +168,14 @@ + * CPU is not vulnerable to Gather + * Data Sampling (GDS). + */ ++#define ARCH_CAP_RFDS_NO BIT(27) /* ++ * Not susceptible to Register ++ * File Data Sampling. ++ */ ++#define ARCH_CAP_RFDS_CLEAR BIT(28) /* ++ * VERW clears CPU Register ++ * File. ++ */ + + #define MSR_IA32_FLUSH_CMD 0x0000010b + #define L1D_FLUSH BIT(0) /* +--- a/arch/x86/kernel/cpu/bugs.c ++++ b/arch/x86/kernel/cpu/bugs.c +@@ -487,6 +487,57 @@ static int __init mmio_stale_data_parse_ + early_param("mmio_stale_data", mmio_stale_data_parse_cmdline); + + #undef pr_fmt ++#define pr_fmt(fmt) "Register File Data Sampling: " fmt ++ ++enum rfds_mitigations { ++ RFDS_MITIGATION_OFF, ++ RFDS_MITIGATION_VERW, ++ RFDS_MITIGATION_UCODE_NEEDED, ++}; ++ ++/* Default mitigation for Register File Data Sampling */ ++static enum rfds_mitigations rfds_mitigation __ro_after_init = ++ IS_ENABLED(CONFIG_MITIGATION_RFDS) ? RFDS_MITIGATION_VERW : RFDS_MITIGATION_OFF; ++ ++static const char * const rfds_strings[] = { ++ [RFDS_MITIGATION_OFF] = "Vulnerable", ++ [RFDS_MITIGATION_VERW] = "Mitigation: Clear Register File", ++ [RFDS_MITIGATION_UCODE_NEEDED] = "Vulnerable: No microcode", ++}; ++ ++static void __init rfds_select_mitigation(void) ++{ ++ if (!boot_cpu_has_bug(X86_BUG_RFDS) || cpu_mitigations_off()) { ++ rfds_mitigation = RFDS_MITIGATION_OFF; ++ return; ++ } ++ if (rfds_mitigation == RFDS_MITIGATION_OFF) ++ return; ++ ++ if (x86_read_arch_cap_msr() & ARCH_CAP_RFDS_CLEAR) ++ setup_force_cpu_cap(X86_FEATURE_CLEAR_CPU_BUF); ++ else ++ rfds_mitigation = RFDS_MITIGATION_UCODE_NEEDED; ++} ++ ++static __init int rfds_parse_cmdline(char *str) ++{ ++ if (!str) ++ return -EINVAL; ++ ++ if (!boot_cpu_has_bug(X86_BUG_RFDS)) ++ return 0; ++ ++ if (!strcmp(str, "off")) ++ rfds_mitigation = RFDS_MITIGATION_OFF; ++ else if (!strcmp(str, "on")) ++ rfds_mitigation = RFDS_MITIGATION_VERW; ++ ++ return 0; ++} ++early_param("reg_file_data_sampling", rfds_parse_cmdline); ++ ++#undef pr_fmt + #define pr_fmt(fmt) "" fmt + + static void __init md_clear_update_mitigation(void) +@@ -519,6 +570,11 @@ static void __init md_clear_update_mitig + mmio_mitigation = MMIO_MITIGATION_VERW; + mmio_select_mitigation(); + } ++ if (rfds_mitigation == RFDS_MITIGATION_OFF && ++ boot_cpu_has_bug(X86_BUG_RFDS)) { ++ rfds_mitigation = RFDS_MITIGATION_VERW; ++ rfds_select_mitigation(); ++ } + out: + if (boot_cpu_has_bug(X86_BUG_MDS)) + pr_info("MDS: %s\n", mds_strings[mds_mitigation]); +@@ -528,6 +584,8 @@ out: + pr_info("MMIO Stale Data: %s\n", mmio_strings[mmio_mitigation]); + else if (boot_cpu_has_bug(X86_BUG_MMIO_UNKNOWN)) + pr_info("MMIO Stale Data: Unknown: No mitigations\n"); ++ if (boot_cpu_has_bug(X86_BUG_RFDS)) ++ pr_info("Register File Data Sampling: %s\n", rfds_strings[rfds_mitigation]); + } + + static void __init md_clear_select_mitigation(void) +@@ -535,11 +593,12 @@ static void __init md_clear_select_mitig + mds_select_mitigation(); + taa_select_mitigation(); + mmio_select_mitigation(); ++ rfds_select_mitigation(); + + /* +- * As 
MDS, TAA and MMIO Stale Data mitigations are inter-related, update +- * and print their mitigation after MDS, TAA and MMIO Stale Data +- * mitigation selection is done. ++ * As these mitigations are inter-related and rely on VERW instruction ++ * to clear the microarchitural buffers, update and print their status ++ * after mitigation selection is done for each of these vulnerabilities. + */ + md_clear_update_mitigation(); + } +@@ -2600,6 +2659,11 @@ static ssize_t mmio_stale_data_show_stat + sched_smt_active() ? "vulnerable" : "disabled"); + } + ++static ssize_t rfds_show_state(char *buf) ++{ ++ return sysfs_emit(buf, "%s\n", rfds_strings[rfds_mitigation]); ++} ++ + static char *stibp_state(void) + { + if (spectre_v2_in_eibrs_mode(spectre_v2_enabled)) +@@ -2760,6 +2824,9 @@ static ssize_t cpu_show_common(struct de + case X86_BUG_SRSO: + return srso_show_state(buf); + ++ case X86_BUG_RFDS: ++ return rfds_show_state(buf); ++ + default: + break; + } +@@ -2834,4 +2901,9 @@ ssize_t cpu_show_spec_rstack_overflow(st + { + return cpu_show_common(dev, attr, buf, X86_BUG_SRSO); + } ++ ++ssize_t cpu_show_reg_file_data_sampling(struct device *dev, struct device_attribute *attr, char *buf) ++{ ++ return cpu_show_common(dev, attr, buf, X86_BUG_RFDS); ++} + #endif +--- a/arch/x86/kernel/cpu/common.c ++++ b/arch/x86/kernel/cpu/common.c +@@ -1138,6 +1138,8 @@ static const __initconst struct x86_cpu_ + #define SRSO BIT(5) + /* CPU is affected by GDS */ + #define GDS BIT(6) ++/* CPU is affected by Register File Data Sampling */ ++#define RFDS BIT(7) + + static const struct x86_cpu_id cpu_vuln_blacklist[] __initconst = { + VULNBL_INTEL_STEPPINGS(IVYBRIDGE, X86_STEPPING_ANY, SRBDS), +@@ -1165,9 +1167,18 @@ static const struct x86_cpu_id cpu_vuln_ + VULNBL_INTEL_STEPPINGS(TIGERLAKE, X86_STEPPING_ANY, GDS), + VULNBL_INTEL_STEPPINGS(LAKEFIELD, X86_STEPPING_ANY, MMIO | MMIO_SBDS | RETBLEED), + VULNBL_INTEL_STEPPINGS(ROCKETLAKE, X86_STEPPING_ANY, MMIO | RETBLEED | GDS), +- VULNBL_INTEL_STEPPINGS(ATOM_TREMONT, X86_STEPPING_ANY, MMIO | MMIO_SBDS), +- VULNBL_INTEL_STEPPINGS(ATOM_TREMONT_D, X86_STEPPING_ANY, MMIO), +- VULNBL_INTEL_STEPPINGS(ATOM_TREMONT_L, X86_STEPPING_ANY, MMIO | MMIO_SBDS), ++ VULNBL_INTEL_STEPPINGS(ALDERLAKE, X86_STEPPING_ANY, RFDS), ++ VULNBL_INTEL_STEPPINGS(ALDERLAKE_L, X86_STEPPING_ANY, RFDS), ++ VULNBL_INTEL_STEPPINGS(RAPTORLAKE, X86_STEPPING_ANY, RFDS), ++ VULNBL_INTEL_STEPPINGS(RAPTORLAKE_P, X86_STEPPING_ANY, RFDS), ++ VULNBL_INTEL_STEPPINGS(RAPTORLAKE_S, X86_STEPPING_ANY, RFDS), ++ VULNBL_INTEL_STEPPINGS(ALDERLAKE_N, X86_STEPPING_ANY, RFDS), ++ VULNBL_INTEL_STEPPINGS(ATOM_TREMONT, X86_STEPPING_ANY, MMIO | MMIO_SBDS | RFDS), ++ VULNBL_INTEL_STEPPINGS(ATOM_TREMONT_D, X86_STEPPING_ANY, MMIO | RFDS), ++ VULNBL_INTEL_STEPPINGS(ATOM_TREMONT_L, X86_STEPPING_ANY, MMIO | MMIO_SBDS | RFDS), ++ VULNBL_INTEL_STEPPINGS(ATOM_GOLDMONT, X86_STEPPING_ANY, RFDS), ++ VULNBL_INTEL_STEPPINGS(ATOM_GOLDMONT_D, X86_STEPPING_ANY, RFDS), ++ VULNBL_INTEL_STEPPINGS(ATOM_GOLDMONT_PLUS, X86_STEPPING_ANY, RFDS), + + VULNBL_AMD(0x15, RETBLEED), + VULNBL_AMD(0x16, RETBLEED), +@@ -1201,6 +1212,24 @@ static bool arch_cap_mmio_immune(u64 ia3 + ia32_cap & ARCH_CAP_SBDR_SSDP_NO); + } + ++static bool __init vulnerable_to_rfds(u64 ia32_cap) ++{ ++ /* The "immunity" bit trumps everything else: */ ++ if (ia32_cap & ARCH_CAP_RFDS_NO) ++ return false; ++ ++ /* ++ * VMMs set ARCH_CAP_RFDS_CLEAR for processors not in the blacklist to ++ * indicate that mitigation is needed because guest is running on a ++ * vulnerable hardware or may 
migrate to such hardware: ++ */ ++ if (ia32_cap & ARCH_CAP_RFDS_CLEAR) ++ return true; ++ ++ /* Only consult the blacklist when there is no enumeration: */ ++ return cpu_matches(cpu_vuln_blacklist, RFDS); ++} ++ + static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c) + { + u64 ia32_cap = x86_read_arch_cap_msr(); +@@ -1312,6 +1341,9 @@ static void __init cpu_set_bug_bits(stru + setup_force_cpu_bug(X86_BUG_SRSO); + } + ++ if (vulnerable_to_rfds(ia32_cap)) ++ setup_force_cpu_bug(X86_BUG_RFDS); ++ + if (cpu_matches(cpu_vuln_whitelist, NO_MELTDOWN)) + return; + +--- a/drivers/base/cpu.c ++++ b/drivers/base/cpu.c +@@ -589,6 +589,12 @@ ssize_t __weak cpu_show_spec_rstack_over + return sysfs_emit(buf, "Not affected\n"); + } + ++ssize_t __weak cpu_show_reg_file_data_sampling(struct device *dev, ++ struct device_attribute *attr, char *buf) ++{ ++ return sysfs_emit(buf, "Not affected\n"); ++} ++ + static DEVICE_ATTR(meltdown, 0444, cpu_show_meltdown, NULL); + static DEVICE_ATTR(spectre_v1, 0444, cpu_show_spectre_v1, NULL); + static DEVICE_ATTR(spectre_v2, 0444, cpu_show_spectre_v2, NULL); +@@ -602,6 +608,7 @@ static DEVICE_ATTR(mmio_stale_data, 0444 + static DEVICE_ATTR(retbleed, 0444, cpu_show_retbleed, NULL); + static DEVICE_ATTR(gather_data_sampling, 0444, cpu_show_gds, NULL); + static DEVICE_ATTR(spec_rstack_overflow, 0444, cpu_show_spec_rstack_overflow, NULL); ++static DEVICE_ATTR(reg_file_data_sampling, 0444, cpu_show_reg_file_data_sampling, NULL); + + static struct attribute *cpu_root_vulnerabilities_attrs[] = { + &dev_attr_meltdown.attr, +@@ -617,6 +624,7 @@ static struct attribute *cpu_root_vulner + &dev_attr_retbleed.attr, + &dev_attr_gather_data_sampling.attr, + &dev_attr_spec_rstack_overflow.attr, ++ &dev_attr_reg_file_data_sampling.attr, + NULL + }; + +--- a/include/linux/cpu.h ++++ b/include/linux/cpu.h +@@ -74,6 +74,8 @@ extern ssize_t cpu_show_spec_rstack_over + struct device_attribute *attr, char *buf); + extern ssize_t cpu_show_gds(struct device *dev, + struct device_attribute *attr, char *buf); ++extern ssize_t cpu_show_reg_file_data_sampling(struct device *dev, ++ struct device_attribute *attr, char *buf); + + extern __printf(4, 5) + struct device *cpu_device_create(struct device *parent, void *drvdata,