From: Greg Kroah-Hartman
Date: Mon, 7 Mar 2022 07:46:34 +0000 (+0100)
Subject: 5.16-stable patches
X-Git-Tag: v4.9.305~20
X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=4217e301bb6111007670f3a668ed35d205b5189e;p=thirdparty%2Fkernel%2Fstable-queue.git

5.16-stable patches

added patches:
	proc-fix-documentation-and-description-of-pagemap.patch
	revert-xfrm-xfrm_state_mtu-should-return-at-least-1280-for-ipv6.patch
	s390-ftrace-fix-arch_ftrace_get_regs-implementation.patch
	s390-ftrace-fix-ftrace_caller-ftrace_regs_caller-generation.patch
	x86-kvmclock-fix-hyper-v-isolated-vm-s-boot-issue-when-vcpus-64.patch
---

diff --git a/queue-5.16/proc-fix-documentation-and-description-of-pagemap.patch b/queue-5.16/proc-fix-documentation-and-description-of-pagemap.patch
new file mode 100644
index 00000000000..93000688f05
--- /dev/null
+++ b/queue-5.16/proc-fix-documentation-and-description-of-pagemap.patch
@@ -0,0 +1,60 @@
+From dd21bfa425c098b95ca86845f8e7d1ec1ddf6e4a Mon Sep 17 00:00:00 2001
+From: Yun Zhou
+Date: Fri, 4 Mar 2022 20:29:07 -0800
+Subject: proc: fix documentation and description of pagemap
+
+From: Yun Zhou
+
+commit dd21bfa425c098b95ca86845f8e7d1ec1ddf6e4a upstream.
+
+Since bit 57 was exported for uffd-wp write-protected (commit
+fb8e37f35a2f: "mm/pagemap: export uffd-wp protection information"),
+fixing it can reduce some unnecessary confusion.
+
+Link: https://lkml.kernel.org/r/20220301044538.3042713-1-yun.zhou@windriver.com
+Fixes: fb8e37f35a2fe1 ("mm/pagemap: export uffd-wp protection information")
+Signed-off-by: Yun Zhou
+Reviewed-by: Peter Xu
+Cc: Jonathan Corbet
+Cc: Tiberiu A Georgescu
+Cc: Florian Schmidt
+Cc: Ivan Teterevkov
+Cc: SeongJae Park
+Cc: Yang Shi
+Cc: David Hildenbrand
+Cc: Axel Rasmussen
+Cc: Miaohe Lin
+Cc: Andrea Arcangeli
+Cc: Colin Cross
+Cc: Alistair Popple
+Signed-off-by: Andrew Morton
+Signed-off-by: Linus Torvalds
+Signed-off-by: Greg Kroah-Hartman
+---
+ Documentation/admin-guide/mm/pagemap.rst | 2 +-
+ fs/proc/task_mmu.c                       | 3 ++-
+ 2 files changed, 3 insertions(+), 2 deletions(-)
+
+--- a/Documentation/admin-guide/mm/pagemap.rst
++++ b/Documentation/admin-guide/mm/pagemap.rst
+@@ -23,7 +23,7 @@ There are four components to pagemap:
+ * Bit 56 page exclusively mapped (since 4.2)
+ * Bit 57 pte is uffd-wp write-protected (since 5.13) (see
+ :ref:`Documentation/admin-guide/mm/userfaultfd.rst `)
+- * Bits 57-60 zero
++ * Bits 58-60 zero
+ * Bit 61 page is file-page or shared-anon (since 3.5)
+ * Bit 62 page swapped
+ * Bit 63 page present
+--- a/fs/proc/task_mmu.c
++++ b/fs/proc/task_mmu.c
+@@ -1586,7 +1586,8 @@ static const struct mm_walk_ops pagemap_
+  * Bits 5-54 swap offset if swapped
+  * Bit 55 pte is soft-dirty (see Documentation/admin-guide/mm/soft-dirty.rst)
+  * Bit 56 page exclusively mapped
+- * Bits 57-60 zero
++ * Bit 57 pte is uffd-wp write-protected
++ * Bits 58-60 zero
+  * Bit 61 page is file-page or shared-anon
+  * Bit 62 page swapped
+  * Bit 63 page present
diff --git a/queue-5.16/revert-xfrm-xfrm_state_mtu-should-return-at-least-1280-for-ipv6.patch b/queue-5.16/revert-xfrm-xfrm_state_mtu-should-return-at-least-1280-for-ipv6.patch
new file mode 100644
index 00000000000..3245332e28f
--- /dev/null
+++ b/queue-5.16/revert-xfrm-xfrm_state_mtu-should-return-at-least-1280-for-ipv6.patch
@@ -0,0 +1,107 @@
+From a6d95c5a628a09be129f25d5663a7e9db8261f51 Mon Sep 17 00:00:00 2001
+From: Jiri Bohac
+Date: Wed, 26 Jan 2022 16:00:18 +0100
+Subject: Revert "xfrm: xfrm_state_mtu should return at least 1280 for ipv6"
+
+From: Jiri Bohac
+
+commit a6d95c5a628a09be129f25d5663a7e9db8261f51 upstream.
+
+This reverts commit b515d2637276a3810d6595e10ab02c13bfd0b63a.
+
+Commit b515d2637276a3810d6595e10ab02c13bfd0b63a ("xfrm: xfrm_state_mtu
+should return at least 1280 for ipv6") in v5.14 breaks the TCP MSS
+calculation in ipsec transport mode, resulting in complete stalls of TCP
+connections. This happens when the (P)MTU is 1280 or slightly larger.
+
+The desired formula for the MSS is:
+MSS = (MTU - ESP_overhead) - IP header - TCP header
+
+However, the above commit clamps the (MTU - ESP_overhead) to a
+minimum of 1280, turning the formula into
+MSS = max(MTU - ESP overhead, 1280) - IP header - TCP header
+
+With the (P)MTU near 1280, the calculated MSS is too large and the
+resulting TCP packets never make it to the destination because they
+are over the actual PMTU.
+
+The above commit also causes suboptimal double fragmentation in
+xfrm tunnel mode, as described in
+https://lore.kernel.org/netdev/20210429202529.codhwpc7w6kbudug@dwarf.suse.cz/
+
+The original problem the above commit was trying to fix is now fixed
+by commit 6596a0229541270fb8d38d989f91b78838e5e9da ("xfrm: fix MTU
+regression").
+
+Signed-off-by: Jiri Bohac
+Signed-off-by: Steffen Klassert
+Signed-off-by: Greg Kroah-Hartman
+---
+ include/net/xfrm.h    |  1 -
+ net/ipv4/esp4.c       |  2 +-
+ net/ipv6/esp6.c       |  2 +-
+ net/xfrm/xfrm_state.c | 14 ++------------
+ 4 files changed, 4 insertions(+), 15 deletions(-)
+
+--- a/include/net/xfrm.h
++++ b/include/net/xfrm.h
+@@ -1567,7 +1567,6 @@ void xfrm_sad_getinfo(struct net *net, s
+ void xfrm_spd_getinfo(struct net *net, struct xfrmk_spdinfo *si);
+ u32 xfrm_replay_seqhi(struct xfrm_state *x, __be32 net_seq);
+ int xfrm_init_replay(struct xfrm_state *x);
+-u32 __xfrm_state_mtu(struct xfrm_state *x, int mtu);
+ u32 xfrm_state_mtu(struct xfrm_state *x, int mtu);
+ int __xfrm_init_state(struct xfrm_state *x, bool init_replay, bool offload);
+ int xfrm_init_state(struct xfrm_state *x);
+--- a/net/ipv4/esp4.c
++++ b/net/ipv4/esp4.c
+@@ -671,7 +671,7 @@ static int esp_output(struct xfrm_state
+ 		struct xfrm_dst *dst = (struct xfrm_dst *)skb_dst(skb);
+ 		u32 padto;
+
+-		padto = min(x->tfcpad, __xfrm_state_mtu(x, dst->child_mtu_cached));
++		padto = min(x->tfcpad, xfrm_state_mtu(x, dst->child_mtu_cached));
+ 		if (skb->len < padto)
+ 			esp.tfclen = padto - skb->len;
+ 	}
+--- a/net/ipv6/esp6.c
++++ b/net/ipv6/esp6.c
+@@ -708,7 +708,7 @@ static int esp6_output(struct xfrm_state
+ 		struct xfrm_dst *dst = (struct xfrm_dst *)skb_dst(skb);
+ 		u32 padto;
+
+-		padto = min(x->tfcpad, __xfrm_state_mtu(x, dst->child_mtu_cached));
++		padto = min(x->tfcpad, xfrm_state_mtu(x, dst->child_mtu_cached));
+ 		if (skb->len < padto)
+ 			esp.tfclen = padto - skb->len;
+ 	}
+--- a/net/xfrm/xfrm_state.c
++++ b/net/xfrm/xfrm_state.c
+@@ -2571,7 +2571,7 @@ void xfrm_state_delete_tunnel(struct xfr
+ }
+ EXPORT_SYMBOL(xfrm_state_delete_tunnel);
+
+-u32 __xfrm_state_mtu(struct xfrm_state *x, int mtu)
++u32 xfrm_state_mtu(struct xfrm_state *x, int mtu)
+ {
+ 	const struct xfrm_type *type = READ_ONCE(x->type);
+ 	struct crypto_aead *aead;
+@@ -2602,17 +2602,7 @@ u32 __xfrm_state_mtu(struct xfrm_state *
+ 	return ((mtu - x->props.header_len - crypto_aead_authsize(aead) -
+ 		 net_adj) & ~(blksize - 1)) + net_adj - 2;
+ }
+-EXPORT_SYMBOL_GPL(__xfrm_state_mtu);
+-
+-u32 xfrm_state_mtu(struct xfrm_state *x, int mtu)
+-{
+-	mtu = __xfrm_state_mtu(x, mtu);
+-
+-	if (x->props.family == AF_INET6 && mtu < IPV6_MIN_MTU)
+-		return IPV6_MIN_MTU;
+-
+-	return mtu;
+-}
++EXPORT_SYMBOL_GPL(xfrm_state_mtu);
+
+ int __xfrm_init_state(struct xfrm_state *x, bool init_replay, bool offload)
+ {
diff --git a/queue-5.16/s390-ftrace-fix-arch_ftrace_get_regs-implementation.patch b/queue-5.16/s390-ftrace-fix-arch_ftrace_get_regs-implementation.patch
new file mode 100644
index 00000000000..4de996ff44d
--- /dev/null
+++ b/queue-5.16/s390-ftrace-fix-arch_ftrace_get_regs-implementation.patch
@@ -0,0 +1,113 @@
+From 1389f17937a03fe4ec71b094e1aa6530a901963e Mon Sep 17 00:00:00 2001
+From: Heiko Carstens
+Date: Tue, 22 Feb 2022 14:53:47 +0100
+Subject: s390/ftrace: fix arch_ftrace_get_regs implementation
+
+From: Heiko Carstens
+
+commit 1389f17937a03fe4ec71b094e1aa6530a901963e upstream.
+
+arch_ftrace_get_regs is supposed to return a struct pt_regs pointer
+only if the pt_regs structure contains all register contents, which
+means it must have been populated when created via ftrace_regs_caller.
+
+If it was populated via ftrace_caller the contents are not complete
+(the psw mask part is missing), and therefore a NULL pointer needs to be
+returned.
+
+The current code incorrectly always returns a struct pt_regs pointer.
+
+Fix this by adding another pt_regs flag which indicates if the
+contents are complete, and fix arch_ftrace_get_regs accordingly.
+
+Fixes: 894979689d3a ("s390/ftrace: provide separate ftrace_caller/ftrace_regs_caller implementations")
+Reported-by: Christophe Leroy
+Reported-by: Naveen N. Rao
+Reviewed-by: Sven Schnelle
+Acked-by: Ilya Leoshkevich
+Signed-off-by: Heiko Carstens
+Signed-off-by: Vasily Gorbik
+Signed-off-by: Greg Kroah-Hartman
+---
+ arch/s390/include/asm/ftrace.h | 10 ++++++----
+ arch/s390/include/asm/ptrace.h |  2 ++
+ arch/s390/kernel/ftrace.c      |  2 +-
+ arch/s390/kernel/mcount.S      |  9 +++++++++
+ 4 files changed, 18 insertions(+), 5 deletions(-)
+
+--- a/arch/s390/include/asm/ftrace.h
++++ b/arch/s390/include/asm/ftrace.h
+@@ -47,15 +47,17 @@ struct ftrace_regs {
+
+ static __always_inline struct pt_regs *arch_ftrace_get_regs(struct ftrace_regs *fregs)
+ {
+-	return &fregs->regs;
++	struct pt_regs *regs = &fregs->regs;
++
++	if (test_pt_regs_flag(regs, PIF_FTRACE_FULL_REGS))
++		return regs;
++	return NULL;
+ }
+
+ static __always_inline void ftrace_instruction_pointer_set(struct ftrace_regs *fregs,
+ 							   unsigned long ip)
+ {
+-	struct pt_regs *regs = arch_ftrace_get_regs(fregs);
+-
+-	regs->psw.addr = ip;
++	fregs->regs.psw.addr = ip;
+ }
+
+ /*
+--- a/arch/s390/include/asm/ptrace.h
++++ b/arch/s390/include/asm/ptrace.h
+@@ -15,11 +15,13 @@
+ #define PIF_EXECVE_PGSTE_RESTART 1 /* restart execve for PGSTE binaries */
+ #define PIF_SYSCALL_RET_SET 2 /* return value was set via ptrace */
+ #define PIF_GUEST_FAULT 3 /* indicates program check in sie64a */
++#define PIF_FTRACE_FULL_REGS 4 /* all register contents valid (ftrace) */
+
+ #define _PIF_SYSCALL BIT(PIF_SYSCALL)
+ #define _PIF_EXECVE_PGSTE_RESTART BIT(PIF_EXECVE_PGSTE_RESTART)
+ #define _PIF_SYSCALL_RET_SET BIT(PIF_SYSCALL_RET_SET)
+ #define _PIF_GUEST_FAULT BIT(PIF_GUEST_FAULT)
++#define _PIF_FTRACE_FULL_REGS BIT(PIF_FTRACE_FULL_REGS)
+
+ #ifndef __ASSEMBLY__
+
+--- a/arch/s390/kernel/ftrace.c
++++ b/arch/s390/kernel/ftrace.c
+@@ -291,7 +291,7 @@ void kprobe_ftrace_handler(unsigned long
+
+ 	regs = ftrace_get_regs(fregs);
+ 	p = get_kprobe((kprobe_opcode_t *)ip);
+-	if (unlikely(!p) || kprobe_disabled(p))
++	if (!regs || unlikely(!p) || kprobe_disabled(p))
+ 		goto out;
+
+ 	if (kprobe_running()) {
+--- a/arch/s390/kernel/mcount.S
++++ b/arch/s390/kernel/mcount.S
+@@ -27,6 +27,7 @@ ENDPROC(ftrace_stub)
+ #define STACK_PTREGS_GPRS (STACK_PTREGS + __PT_GPRS)
+ #define STACK_PTREGS_PSW (STACK_PTREGS + __PT_PSW)
+ #define STACK_PTREGS_ORIG_GPR2 (STACK_PTREGS + __PT_ORIG_GPR2)
++#define STACK_PTREGS_FLAGS (STACK_PTREGS + __PT_FLAGS)
+ #ifdef __PACK_STACK
+ /* allocate just enough for r14, r15 and backchain */
+ #define TRACED_FUNC_FRAME_SIZE 24
+@@ -57,6 +58,14 @@ ENDPROC(ftrace_stub)
+ 	.if \allregs == 1
+ 	stg	%r14,(STACK_PTREGS_PSW)(%r15)
+ 	stosm	(STACK_PTREGS_PSW)(%r15),0
++#ifdef CONFIG_HAVE_MARCH_Z10_FEATURES
++	mvghi	STACK_PTREGS_FLAGS(%r15),_PIF_FTRACE_FULL_REGS
++#else
++	lghi	%r14,_PIF_FTRACE_FULL_REGS
++	stg	%r14,STACK_PTREGS_FLAGS(%r15)
++#endif
++	.else
++	xc	STACK_PTREGS_FLAGS(8,%r15),STACK_PTREGS_FLAGS(%r15)
+ 	.endif
+
+ 	lg	%r14,(__SF_GPRS+8*8)(%r1)	# restore original return address
diff --git a/queue-5.16/s390-ftrace-fix-ftrace_caller-ftrace_regs_caller-generation.patch b/queue-5.16/s390-ftrace-fix-ftrace_caller-ftrace_regs_caller-generation.patch
new file mode 100644
index 00000000000..c5d0fe3fa5b
--- /dev/null
+++ b/queue-5.16/s390-ftrace-fix-ftrace_caller-ftrace_regs_caller-generation.patch
@@ -0,0 +1,86 @@
+From 9fa881f7e3c74ce6626d166bca9397e5d925937f Mon Sep 17 00:00:00 2001
+From: Heiko Carstens
+Date: Wed, 23 Feb 2022 13:02:59 +0100
+Subject: s390/ftrace: fix ftrace_caller/ftrace_regs_caller generation
+
+From: Heiko Carstens
+
+commit 9fa881f7e3c74ce6626d166bca9397e5d925937f upstream.
+
+ftrace_caller was used for both ftrace_caller and ftrace_regs_caller,
+which means that the target address of the hotpatch trampoline was
+never updated.
+
+With commit 894979689d3a ("s390/ftrace: provide separate
+ftrace_caller/ftrace_regs_caller implementations") a separate
+ftrace_regs_caller entry point was implemented, however it was
+forgotten to implement the necessary changes for ftrace_modify_call
+and ftrace_make_call, where the branch target has to be modified
+accordingly.
+
+Therefore add the missing code now.
+
+Fixes: 894979689d3a ("s390/ftrace: provide separate ftrace_caller/ftrace_regs_caller implementations")
+Reviewed-by: Sven Schnelle
+Acked-by: Ilya Leoshkevich
+Signed-off-by: Heiko Carstens
+Signed-off-by: Vasily Gorbik
+Signed-off-by: Greg Kroah-Hartman
+---
+ arch/s390/kernel/ftrace.c | 35 +++++++++++++++++++++++++++++++++++
+ 1 file changed, 35 insertions(+)
+
+--- a/arch/s390/kernel/ftrace.c
++++ b/arch/s390/kernel/ftrace.c
+@@ -159,9 +159,38 @@ int ftrace_init_nop(struct module *mod,
+ 	return 0;
+ }
+
++static struct ftrace_hotpatch_trampoline *ftrace_get_trampoline(struct dyn_ftrace *rec)
++{
++	struct ftrace_hotpatch_trampoline *trampoline;
++	struct ftrace_insn insn;
++	s64 disp;
++	u16 opc;
++
++	if (copy_from_kernel_nofault(&insn, (void *)rec->ip, sizeof(insn)))
++		return ERR_PTR(-EFAULT);
++	disp = (s64)insn.disp * 2;
++	trampoline = (void *)(rec->ip + disp);
++	if (get_kernel_nofault(opc, &trampoline->brasl_opc))
++		return ERR_PTR(-EFAULT);
++	if (opc != 0xc015)
++		return ERR_PTR(-EINVAL);
++	return trampoline;
++}
++
+ int ftrace_modify_call(struct dyn_ftrace *rec, unsigned long old_addr,
+ 		       unsigned long addr)
+ {
++	struct ftrace_hotpatch_trampoline *trampoline;
++	u64 old;
++
++	trampoline = ftrace_get_trampoline(rec);
++	if (IS_ERR(trampoline))
++		return PTR_ERR(trampoline);
++	if (get_kernel_nofault(old, &trampoline->interceptor))
++		return -EFAULT;
++	if (old != old_addr)
++		return -EINVAL;
++	s390_kernel_write(&trampoline->interceptor, &addr, sizeof(addr));
+ 	return 0;
+ }
+
+@@ -188,6 +217,12 @@ static void brcl_enable(void *brcl)
+
+ int ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr)
+ {
++	struct ftrace_hotpatch_trampoline *trampoline;
++
++	trampoline = ftrace_get_trampoline(rec);
++	if (IS_ERR(trampoline))
++		return PTR_ERR(trampoline);
++	s390_kernel_write(&trampoline->interceptor, &addr, sizeof(addr));
+ 	brcl_enable((void *)rec->ip);
+ 	return 0;
+ }
diff --git a/queue-5.16/series b/queue-5.16/series
index 2e1c5cd39a8..d929147d0ef 100644
--- a/queue-5.16/series
+++ b/queue-5.16/series
@@ -178,3 +178,8 @@ btrfs-qgroup-fix-deadlock-between-rescan-worker-and-remove-qgroup.patch
 btrfs-add-missing-run-of-delayed-items-after-unlink-during-log-replay.patch
 btrfs-fallback-to-blocking-mode-when-doing-async-dio-over-multiple-extents.patch
 btrfs-do-not-start-relocation-until-in-progress-drops-are-done.patch
+revert-xfrm-xfrm_state_mtu-should-return-at-least-1280-for-ipv6.patch
+proc-fix-documentation-and-description-of-pagemap.patch
+x86-kvmclock-fix-hyper-v-isolated-vm-s-boot-issue-when-vcpus-64.patch
+s390-ftrace-fix-arch_ftrace_get_regs-implementation.patch
+s390-ftrace-fix-ftrace_caller-ftrace_regs_caller-generation.patch
diff --git a/queue-5.16/x86-kvmclock-fix-hyper-v-isolated-vm-s-boot-issue-when-vcpus-64.patch b/queue-5.16/x86-kvmclock-fix-hyper-v-isolated-vm-s-boot-issue-when-vcpus-64.patch
new file mode 100644
index 00000000000..e9933fd77f6
--- /dev/null
+++ b/queue-5.16/x86-kvmclock-fix-hyper-v-isolated-vm-s-boot-issue-when-vcpus-64.patch
@@ -0,0 +1,57 @@
+From 92e68cc558774de01024c18e8b35cdce4731c910 Mon Sep 17 00:00:00 2001
+From: Dexuan Cui
+Date: Fri, 25 Feb 2022 00:46:00 -0800
+Subject: x86/kvmclock: Fix Hyper-V Isolated VM's boot issue when vCPUs > 64
+
+From: Dexuan Cui
+
+commit 92e68cc558774de01024c18e8b35cdce4731c910 upstream.
+
+When Linux runs as an Isolated VM on Hyper-V, it supports AMD SEV-SNP
+but it's partially enlightened, i.e. cc_platform_has(
+CC_ATTR_GUEST_MEM_ENCRYPT) is true but sev_active() is false.
+
+Commit 4d96f9109109 per se is good, but with it now
+kvm_setup_vsyscall_timeinfo() -> kvmclock_init_mem() calls
+set_memory_decrypted(), and later gets stuck when trying to zero out
+the pages pointed by 'hvclock_mem', if Linux runs as an Isolated VM on
+Hyper-V. The cause is that here now the Linux VM should no longer access
+the original guest physical address (GPA); instead the VM should do
+memremap() and access the original GPA + ms_hyperv.shared_gpa_boundary:
+see the example code in drivers/hv/connection.c: vmbus_connect() or
+drivers/hv/ring_buffer.c: hv_ringbuffer_init(). If the VM tries to
+access the original GPA, it keeps getting injected a fault by Hyper-V
+and gets stuck there.
+
+Here the issue happens only when the VM has >=65 vCPUs, because the
+global static array hv_clock_boot[] can hold 64 "struct
+pvclock_vsyscall_time_info" (the sizeof of the struct is 64 bytes), so
+kvmclock_init_mem() only allocates memory in the case of vCPUs > 64.
+
+Since the 'hvclock_mem' pages are only useful when the kvm clock is
+supported by the underlying hypervisor, fix the issue by returning
+early when Linux VM runs on Hyper-V, which doesn't support kvm clock.
+
+Fixes: 4d96f9109109 ("x86/sev: Replace occurrences of sev_active() with cc_platform_has()")
+Tested-by: Andrea Parri (Microsoft)
+Signed-off-by: Andrea Parri (Microsoft)
+Signed-off-by: Dexuan Cui
+Message-Id: <20220225084600.17817-1-decui@microsoft.com>
+Signed-off-by: Paolo Bonzini
+Signed-off-by: Greg Kroah-Hartman
+---
+ arch/x86/kernel/kvmclock.c | 3 +++
+ 1 file changed, 3 insertions(+)
+
+--- a/arch/x86/kernel/kvmclock.c
++++ b/arch/x86/kernel/kvmclock.c
+@@ -239,6 +239,9 @@ static void __init kvmclock_init_mem(voi
+
+ static int __init kvm_setup_vsyscall_timeinfo(void)
+ {
++	if (!kvm_para_available())
++		return 0;
++
+ 	kvmclock_init_mem();
+
+ #ifdef CONFIG_X86_64
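
Note on the xfrm revert in this queue: the two MSS formulas quoted in its commit message can be compared with a few lines of user-space C. This is only a minimal sketch under stated assumptions, not kernel code. The function names (mss_desired, mss_clamped, wire_size) and the EX_* constants are hypothetical and exist only for this illustration; the IP header is taken to be a 40-byte IPv6 header and the TCP header a 20-byte header without options.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative constants; these names are not taken from the kernel. */
#define EX_IPV6_HDR_LEN 40u   /* fixed IPv6 header, assumed */
#define EX_TCP_HDR_LEN  20u   /* TCP header without options, assumed */
#define EX_IPV6_MIN_MTU 1280u /* minimum IPv6 MTU used by the clamp */

/* Desired formula: MSS = (MTU - ESP_overhead) - IP header - TCP header */
uint32_t mss_desired(uint32_t mtu, uint32_t esp_overhead)
{
	/* Guard against unsigned underflow on nonsensical inputs. */
	assert(mtu > esp_overhead + EX_IPV6_HDR_LEN + EX_TCP_HDR_LEN);
	return (mtu - esp_overhead) - EX_IPV6_HDR_LEN - EX_TCP_HDR_LEN;
}

/* Reverted behaviour: MSS = max(MTU - ESP_overhead, 1280) - IP header - TCP header */
uint32_t mss_clamped(uint32_t mtu, uint32_t esp_overhead)
{
	uint32_t payload_mtu;

	assert(mtu > esp_overhead);
	payload_mtu = mtu - esp_overhead;
	if (payload_mtu < EX_IPV6_MIN_MTU)
		payload_mtu = EX_IPV6_MIN_MTU;
	return payload_mtu - EX_IPV6_HDR_LEN - EX_TCP_HDR_LEN;
}

/* Size of the resulting transport-mode packet for a full-sized segment. */
uint32_t wire_size(uint32_t mss, uint32_t esp_overhead)
{
	return mss + EX_TCP_HDR_LEN + EX_IPV6_HDR_LEN + esp_overhead;
}
```

With an MTU of 1280 and an assumed ESP overhead of 36 bytes (a made-up figure; the real overhead depends on the algorithms in use), mss_desired() gives 1184 and the full-sized packet is exactly 1280 bytes, while mss_clamped() gives 1220 and the packet grows to 1316 bytes, which is over the PMTU. For an MTU of 1500 both functions agree, which matches the commit message's statement that the stall only shows up when the (P)MTU is at or slightly above 1280.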