From: Sasha Levin Date: Fri, 12 Jun 2026 14:46:49 +0000 (-0400) Subject: Fixes for all trees X-Git-Url: http://git.ipfire.org/gitweb.cgi?a=commitdiff_plain;h=91bdac0cad95adbecb576652632b0e235b1e4d36;p=thirdparty%2Fkernel%2Fstable-queue.git Fixes for all trees Signed-off-by: Sasha Levin --- diff --git a/queue-5.10/arm64-tlb-allow-xzr-argument-to-tlbi-ops.patch b/queue-5.10/arm64-tlb-allow-xzr-argument-to-tlbi-ops.patch new file mode 100644 index 0000000000..267853e811 --- /dev/null +++ b/queue-5.10/arm64-tlb-allow-xzr-argument-to-tlbi-ops.patch @@ -0,0 +1,62 @@ +From 7ef2f162cbc7a568944cc1c5967edcd65c534154 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 11 Jun 2026 14:49:02 +0100 +Subject: arm64: tlb: Allow XZR argument to TLBI ops + +From: Mark Rutland + +commit bfd9c931d19aa59fb8371d557774fa169b15db9a upstream. + +The TLBI instruction accepts XZR as a register argument, and for TLBI +operations with a register argument, there is no functional difference +between using XZR or another GPR which contains zeroes. Operations +without a register argument are encoded as if XZR were used. + +Allow the __TLBI_1() macro to use XZR when a register argument is all +zeroes. + +Today this only results in a trivial code saving in +__do_compat_cache_op()'s workaround for Neoverse-N1 erratum #1542419. In +subsequent patches this pattern will be used more generally. + +There should be no functional change as a result of this patch. 
+ +Signed-off-by: Mark Rutland +Cc: Catalin Marinas +Cc: Marc Zyngier +Cc: Oliver Upton +Cc: Ryan Roberts +Cc: Will Deacon +Signed-off-by: Will Deacon +Signed-off-by: Catalin Marinas +Signed-off-by: Greg Kroah-Hartman +[Mark: Backport to v5.10.y] +Signed-off-by: Mark Rutland +Signed-off-by: Sasha Levin +--- + arch/arm64/include/asm/tlbflush.h | 6 +++--- + 1 file changed, 3 insertions(+), 3 deletions(-) + +diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h +index 36f02892e1df80..b17d8b049d258b 100644 +--- a/arch/arm64/include/asm/tlbflush.h ++++ b/arch/arm64/include/asm/tlbflush.h +@@ -37,12 +37,12 @@ + : : ) + + #define __TLBI_1(op, arg) asm (ARM64_ASM_PREAMBLE \ +- "tlbi " #op ", %0\n" \ ++ "tlbi " #op ", %x0\n" \ + ALTERNATIVE("nop\n nop", \ +- "dsb ish\n tlbi " #op ", %0", \ ++ "dsb ish\n tlbi " #op ", %x0", \ + ARM64_WORKAROUND_REPEAT_TLBI, \ + CONFIG_ARM64_WORKAROUND_REPEAT_TLBI) \ +- : : "r" (arg)) ++ : : "rZ" (arg)) + + #define __TLBI_N(op, arg, n, ...) __TLBI_##n(op, arg) + +-- +2.53.0 + diff --git a/queue-5.10/arm64-tlb-optimize-arm64_workaround_repeat_tlbi.patch b/queue-5.10/arm64-tlb-optimize-arm64_workaround_repeat_tlbi.patch new file mode 100644 index 0000000000..97d0a7bec2 --- /dev/null +++ b/queue-5.10/arm64-tlb-optimize-arm64_workaround_repeat_tlbi.patch @@ -0,0 +1,380 @@ +From 41aa334fdf902b9b3c0e6aca88531b533165862d Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 11 Jun 2026 14:49:03 +0100 +Subject: arm64: tlb: Optimize ARM64_WORKAROUND_REPEAT_TLBI + +From: Mark Rutland + +commit a8f78680ee6bf795086384e8aea159a52814f827 upstream. + +The ARM64_WORKAROUND_REPEAT_TLBI workaround is used to mitigate several +errata where broadcast TLBI;DSB sequences don't provide all the +architecturally required synchronization. The workaround performs more +work than necessary, and can have significant overhead. This patch +optimizes the workaround, as explained below. 
+ +The workaround was originally added for Qualcomm Falkor erratum 1009 in +commit: + + d9ff80f83ecb ("arm64: Work around Falkor erratum 1009") + +As noted in the message for that commit, the workaround is applied even +in cases where it is not strictly necessary. + +The workaround was later reused without changes for: + +* Arm Cortex-A76 erratum #1286807 + SDEN v33: https://developer.arm.com/documentation/SDEN-885749/33-0/ + +* Arm Cortex-A55 erratum #2441007 + SDEN v16: https://developer.arm.com/documentation/SDEN-859338/1600/ + +* Arm Cortex-A510 erratum #2441009 + SDEN v19: https://developer.arm.com/documentation/SDEN-1873351/1900/ + +The important details to note are as follows: + +1. All relevant errata only affect the ordering and/or completion of + memory accesses which have been translated by an invalidated TLB + entry. The actual invalidation of TLB entries is unaffected. + +2. The existing workaround is applied to both broadcast and local TLB + invalidation, whereas for all relevant errata it is only necessary to + apply a workaround for broadcast invalidation. + +3. The existing workaround replaces every TLBI with a TLBI;DSB;TLBI + sequence, whereas for all relevant errata it is only necessary to + execute a single additional TLBI;DSB sequence after any number of + TLBIs are completed by a DSB. + + For example, for a sequence of batched TLBIs: + + TLBI [, ] + TLBI [, ] + TLBI [, ] + DSB ISH + + ... the existing workaround will expand this to: + + TLBI [, ] + DSB ISH // additional + TLBI [, ] // additional + TLBI [, ] + DSB ISH // additional + TLBI [, ] // additional + TLBI [, ] + DSB ISH // additional + TLBI [, ] // additional + DSB ISH + + ... 
whereas it is sufficient to have: + + TLBI [, ] + TLBI [, ] + TLBI [, ] + DSB ISH + TLBI [, ] // additional + DSB ISH // additional + + Using a single additional TLBI and DSB at the end of the sequence can + have significantly lower overhead as each DSB which completes a TLBI + must synchronize with other PEs in the system, with potential + performance effects both locally and system-wide. + +4. The existing workaround repeats each specific TLBI operation, whereas + for all relevant errata it is sufficient for the additional TLBI to + use *any* operation which will be broadcast, regardless of which + translation regime or stage of translation the operation applies to. + + For example, for a single TLBI: + + TLBI ALLE2IS + DSB ISH + + ... the existing workaround will expand this to: + + TLBI ALLE2IS + DSB ISH + TLBI ALLE2IS // additional + DSB ISH // additional + + ... whereas it is sufficient to have: + + TLBI ALLE2IS + DSB ISH + TLBI VALE1IS, XZR // additional + DSB ISH // additional + + As the additional TLBI doesn't have to match a specific earlier TLBI, + the additional TLBI can be implemented in separate code, with no + memory of the earlier TLBIs. The additional TLBI can also use a + cheaper TLBI operation. + +5. The existing workaround is applied to both Stage-1 and Stage-2 TLB + invalidation, whereas for all relevant errata it is only necessary to + apply a workaround for Stage-1 invalidation. + + Architecturally, TLBI operations which invalidate only Stage-2 + information (e.g. IPAS2E1IS) are not required to invalidate TLB + entries which combine information from Stage-1 and Stage-2 + translation table entries, and consequently may not complete memory + accesses translated by those combined entries. In these cases, + completion of memory accesses is only guaranteed after subsequent + invalidation of Stage-1 information (e.g. VMALLE1IS). 
+ +Taking the above points into account, this patch reworks the workaround +logic to reduce overhead: + +* New __tlbi_sync_s1ish() and __tlbi_sync_s1ish_hyp() functions are + added and used in place of any dsb(ish) which is used to complete + broadcast Stage-1 TLB maintenance. When the + ARM64_WORKAROUND_REPEAT_TLBI workaround is enabled, these helpers will + execute an additional TLBI;DSB sequence. + + For consistency, it might make sense to add __tlbi_sync_*() helpers + for local and stage 2 maintenance. For now I've left those with + open-coded dsb() to keep the diff small. + +* The duplication of TLBIs in __TLBI_0() and __TLBI_1() is removed. This + is no longer needed as the necessary synchronization will happen in + __tlbi_sync_s1ish() or __tlbi_sync_s1ish_hyp(). + +* The additional TLBI operation is chosen to have minimal impact: + + - __tlbi_sync_s1ish() uses "TLBI VALE1IS, XZR". This is only used at + EL1 or at EL2 with {E2H,TGE}=={1,1}, where it will target an unused + entry for the reserved ASID in the kernel's own translation regime, + and have no adverse effect. + + - __tlbi_sync_s1ish_hyp() uses "TLBI VALE2IS, XZR". This is only used + in hyp code, where it will target an unused entry in the hyp code's + TTBR0 mapping, and should have no adverse effect. + +* As __TLBI_0() and __TLBI_1() no longer replace each TLBI with a + TLBI;DSB;TLBI sequence, batching TLBIs is worthwhile, and there's no + need for arch_tlbbatch_should_defer() to consider + ARM64_WORKAROUND_REPEAT_TLBI. 
+ +When building defconfig with GCC 15.1.0, compared to v6.19-rc1, this +patch saves ~1KiB of text, makes the vmlinux ~42KiB smaller, and makes +the resulting Image 64KiB smaller: + +| [mark@lakrids:~/src/linux]% size vmlinux-* +| text data bss dec hex filename +| 21179831 19660919 708216 41548966 279fca6 vmlinux-after +| 21181075 19660903 708216 41550194 27a0172 vmlinux-before +| [mark@lakrids:~/src/linux]% ls -l vmlinux-* +| -rwxr-xr-x 1 mark mark 157771472 Feb 4 12:05 vmlinux-after +| -rwxr-xr-x 1 mark mark 157815432 Feb 4 12:05 vmlinux-before +| [mark@lakrids:~/src/linux]% ls -l Image-* +| -rw-r--r-- 1 mark mark 41007616 Feb 4 12:05 Image-after +| -rw-r--r-- 1 mark mark 41073152 Feb 4 12:05 Image-before + +Signed-off-by: Mark Rutland +Cc: Catalin Marinas +Cc: Marc Zyngier +Cc: Oliver Upton +Cc: Ryan Roberts +Cc: Will Deacon +Signed-off-by: Will Deacon +Signed-off-by: Catalin Marinas +Signed-off-by: Greg Kroah-Hartman +[Mark: Backport to v5.10.y; use inline ALTERNATIVE() sequence] +Signed-off-by: Mark Rutland +Signed-off-by: Sasha Levin +--- + arch/arm64/include/asm/tlbflush.h | 51 ++++++++++++++++++++++--------- + arch/arm64/kernel/sys_compat.c | 2 +- + arch/arm64/kvm/hyp/nvhe/tlb.c | 6 ++-- + arch/arm64/kvm/hyp/vhe/tlb.c | 6 ++-- + 4 files changed, 44 insertions(+), 21 deletions(-) + +diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h +index b17d8b049d258b..0fd1bb180561c2 100644 +--- a/arch/arm64/include/asm/tlbflush.h ++++ b/arch/arm64/include/asm/tlbflush.h +@@ -30,18 +30,10 @@ + */ + #define __TLBI_0(op, arg) asm (ARM64_ASM_PREAMBLE \ + "tlbi " #op "\n" \ +- ALTERNATIVE("nop\n nop", \ +- "dsb ish\n tlbi " #op, \ +- ARM64_WORKAROUND_REPEAT_TLBI, \ +- CONFIG_ARM64_WORKAROUND_REPEAT_TLBI) \ + : : ) + + #define __TLBI_1(op, arg) asm (ARM64_ASM_PREAMBLE \ + "tlbi " #op ", %x0\n" \ +- ALTERNATIVE("nop\n nop", \ +- "dsb ish\n tlbi " #op ", %x0", \ +- ARM64_WORKAROUND_REPEAT_TLBI, \ +- CONFIG_ARM64_WORKAROUND_REPEAT_TLBI) \ + : : 
"rZ" (arg)) + + #define __TLBI_N(op, arg, n, ...) __TLBI_##n(op, arg) +@@ -158,6 +150,37 @@ static inline unsigned long get_trans_granule(void) + #define __TLBI_RANGE_NUM(pages, scale) \ + ((((pages) >> (5 * (scale) + 1)) & TLBI_RANGE_MASK) - 1) + ++#define __repeat_tlbi_sync(op, arg) \ ++do { \ ++ asm volatile( \ ++ ALTERNATIVE("nop\n nop", \ ++ "tlbi " #op ", %x0\n dsb ish", \ ++ ARM64_WORKAROUND_REPEAT_TLBI, \ ++ CONFIG_ARM64_WORKAROUND_REPEAT_TLBI) \ ++ : \ ++ : "rZ" (arg)); \ ++} while (0) ++ ++/* ++ * Complete broadcast TLB maintenance issued by the host which invalidates ++ * stage 1 information in the host's own translation regime. ++ */ ++static inline void __tlbi_sync_s1ish(void) ++{ ++ dsb(ish); ++ __repeat_tlbi_sync(vale1is, 0); ++} ++ ++/* ++ * Complete broadcast TLB maintenance issued by hyp code which invalidates ++ * stage 1 translation information in any translation regime. ++ */ ++static inline void __tlbi_sync_s1ish_hyp(void) ++{ ++ dsb(ish); ++ __repeat_tlbi_sync(vale2is, 0); ++} ++ + /* + * TLB Invalidation + * ================ +@@ -239,7 +262,7 @@ static inline void flush_tlb_all(void) + { + dsb(ishst); + __tlbi(vmalle1is); +- dsb(ish); ++ __tlbi_sync_s1ish(); + isb(); + } + +@@ -251,7 +274,7 @@ static inline void flush_tlb_mm(struct mm_struct *mm) + asid = __TLBI_VADDR(0, ASID(mm)); + __tlbi(aside1is, asid); + __tlbi_user(aside1is, asid); +- dsb(ish); ++ __tlbi_sync_s1ish(); + } + + static inline void flush_tlb_page_nosync(struct vm_area_struct *vma, +@@ -269,7 +292,7 @@ static inline void flush_tlb_page(struct vm_area_struct *vma, + unsigned long uaddr) + { + flush_tlb_page_nosync(vma, uaddr); +- dsb(ish); ++ __tlbi_sync_s1ish(); + } + + /* +@@ -357,7 +380,7 @@ static inline void __flush_tlb_range(struct vm_area_struct *vma, + } + scale++; + } +- dsb(ish); ++ __tlbi_sync_s1ish(); + } + + static inline void flush_tlb_range(struct vm_area_struct *vma, +@@ -386,7 +409,7 @@ static inline void flush_tlb_kernel_range(unsigned long start, unsigned 
long end + dsb(ishst); + for (addr = start; addr < end; addr += 1 << (PAGE_SHIFT - 12)) + __tlbi(vaale1is, addr); +- dsb(ish); ++ __tlbi_sync_s1ish(); + isb(); + } + +@@ -400,7 +423,7 @@ static inline void __flush_tlb_kernel_pgtable(unsigned long kaddr) + + dsb(ishst); + __tlbi(vaae1is, addr); +- dsb(ish); ++ __tlbi_sync_s1ish(); + isb(); + } + #endif +diff --git a/arch/arm64/kernel/sys_compat.c b/arch/arm64/kernel/sys_compat.c +index 51274bab25653f..a42266f495d463 100644 +--- a/arch/arm64/kernel/sys_compat.c ++++ b/arch/arm64/kernel/sys_compat.c +@@ -38,7 +38,7 @@ __do_compat_cache_op(unsigned long start, unsigned long end) + * We pick the reserved-ASID to minimise the impact. + */ + __tlbi(aside1is, __TLBI_VADDR(0, 0)); +- dsb(ish); ++ __tlbi_sync_s1ish(); + } + + ret = __flush_cache_user_range(start, start + chunk); +diff --git a/arch/arm64/kvm/hyp/nvhe/tlb.c b/arch/arm64/kvm/hyp/nvhe/tlb.c +index 435d0a54ab9a25..deeb4bc943d89c 100644 +--- a/arch/arm64/kvm/hyp/nvhe/tlb.c ++++ b/arch/arm64/kvm/hyp/nvhe/tlb.c +@@ -79,7 +79,7 @@ void __kvm_tlb_flush_vmid_ipa(struct kvm_s2_mmu *mmu, + */ + dsb(ish); + __tlbi(vmalle1is); +- dsb(ish); ++ __tlbi_sync_s1ish_hyp(); + isb(); + + __tlb_switch_to_host(&cxt); +@@ -95,7 +95,7 @@ void __kvm_tlb_flush_vmid(struct kvm_s2_mmu *mmu) + __tlb_switch_to_guest(mmu, &cxt); + + __tlbi(vmalls12e1is); +- dsb(ish); ++ __tlbi_sync_s1ish_hyp(); + isb(); + + __tlb_switch_to_host(&cxt); +@@ -120,5 +120,5 @@ void __kvm_flush_vm_context(void) + { + dsb(ishst); + __tlbi(alle1is); +- dsb(ish); ++ __tlbi_sync_s1ish_hyp(); + } +diff --git a/arch/arm64/kvm/hyp/vhe/tlb.c b/arch/arm64/kvm/hyp/vhe/tlb.c +index 67047feb306876..ac695f43f651fc 100644 +--- a/arch/arm64/kvm/hyp/vhe/tlb.c ++++ b/arch/arm64/kvm/hyp/vhe/tlb.c +@@ -105,7 +105,7 @@ void __kvm_tlb_flush_vmid_ipa(struct kvm_s2_mmu *mmu, + */ + dsb(ish); + __tlbi(vmalle1is); +- dsb(ish); ++ __tlbi_sync_s1ish_hyp(); + isb(); + + __tlb_switch_to_host(&cxt); +@@ -121,7 +121,7 @@ void 
__kvm_tlb_flush_vmid(struct kvm_s2_mmu *mmu) + __tlb_switch_to_guest(mmu, &cxt); + + __tlbi(vmalls12e1is); +- dsb(ish); ++ __tlbi_sync_s1ish_hyp(); + isb(); + + __tlb_switch_to_host(&cxt); +@@ -146,5 +146,5 @@ void __kvm_flush_vm_context(void) + { + dsb(ishst); + __tlbi(alle1is); +- dsb(ish); ++ __tlbi_sync_s1ish_hyp(); + } +-- +2.53.0 + diff --git a/queue-5.10/io_uring-prevent-opcode-speculation.patch b/queue-5.10/io_uring-prevent-opcode-speculation.patch new file mode 100644 index 0000000000..96694264f2 --- /dev/null +++ b/queue-5.10/io_uring-prevent-opcode-speculation.patch @@ -0,0 +1,42 @@ +From e244745e99fab815a6ef9b83c59fd7fc5fd84200 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Wed, 10 Jun 2026 20:22:03 +0300 +Subject: io_uring: prevent opcode speculation + +From: Pavel Begunkov + +commit 1e988c3fe1264708f4f92109203ac5b1d65de50b upstream. + +sqe->opcode is used for different tables, make sure we sanitise it +against speculations. + +Cc: stable@vger.kernel.org +Fixes: d3656344fea03 ("io_uring: add lookup table for various opcode needs") +Signed-off-by: Pavel Begunkov +Reviewed-by: Li Zetao +Link: https://lore.kernel.org/r/7eddbf31c8ca0a3947f8ed98271acc2b4349c016.1739568408.git.asml.silence@gmail.com +Signed-off-by: Jens Axboe +[ Alexey: Sanitize req->opcode directly because io_init_req() in + linux-5.10.y has no local opcode variable and subsequent lookups use it. 
] +Signed-off-by: Alexey Panov +Signed-off-by: Sasha Levin +--- + io_uring/io_uring.c | 2 ++ + 1 file changed, 2 insertions(+) + +diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c +index 2ca09e2dbd3d4a..51262d48a4a11b 100644 +--- a/io_uring/io_uring.c ++++ b/io_uring/io_uring.c +@@ -7193,6 +7193,8 @@ static int io_init_req(struct io_ring_ctx *ctx, struct io_kiocb *req, + return -EINVAL; + if (unlikely(req->opcode >= IORING_OP_LAST)) + return -EINVAL; ++ req->opcode = array_index_nospec(req->opcode, IORING_OP_LAST); ++ + if (!io_check_restriction(ctx, req, sqe_flags)) + return -EACCES; + +-- +2.53.0 + diff --git a/queue-5.10/kvm-arm64-remove-vpipt-i-cache-handling.patch b/queue-5.10/kvm-arm64-remove-vpipt-i-cache-handling.patch new file mode 100644 index 0000000000..f9aa36c047 --- /dev/null +++ b/queue-5.10/kvm-arm64-remove-vpipt-i-cache-handling.patch @@ -0,0 +1,120 @@ +From 5e62b5e382ab852f334a898c6f9e873731c68409 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 11 Jun 2026 14:49:01 +0100 +Subject: KVM: arm64: Remove VPIPT I-cache handling + +From: Marc Zyngier + +commit ced242ba9d7cb3571f6e0f165f643cb832d52148 upstream. + +We have some special handling for VPIPT I-cache in critical parts +of the cache and TLB maintenance. Remove it. + +Reviewed-by: Zenghui Yu +Reviewed-by: Anshuman Khandual +Signed-off-by: Marc Zyngier +Acked-by: Mark Rutland +Link: https://lore.kernel.org/r/20231204143606.1806432-2-maz@kernel.org +Signed-off-by: Will Deacon +[Mark: Backport to v5.10.y. 
VPIPT HW was never built; this is all dead code] +Signed-off-by: Mark Rutland +Signed-off-by: Sasha Levin +--- + arch/arm64/include/asm/kvm_mmu.h | 4 ++-- + arch/arm64/kvm/hyp/nvhe/tlb.c | 35 -------------------------------- + arch/arm64/kvm/hyp/vhe/tlb.c | 13 ------------ + 3 files changed, 2 insertions(+), 50 deletions(-) + +diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h +index 47dafd6ab3a30a..c700bf9241fce3 100644 +--- a/arch/arm64/include/asm/kvm_mmu.h ++++ b/arch/arm64/include/asm/kvm_mmu.h +@@ -162,8 +162,8 @@ static inline void __invalidate_icache_guest_page(kvm_pfn_t pfn, + if (icache_is_aliasing()) { + /* any kind of VIPT cache */ + __flush_icache_all(); +- } else if (is_kernel_in_hyp_mode() || !icache_is_vpipt()) { +- /* PIPT or VPIPT at EL2 (see comment in __kvm_tlb_flush_vmid_ipa) */ ++ } else { ++ /* PIPT */ + void *va = page_address(pfn_to_page(pfn)); + + invalidate_icache_range((unsigned long)va, +diff --git a/arch/arm64/kvm/hyp/nvhe/tlb.c b/arch/arm64/kvm/hyp/nvhe/tlb.c +index 229b06748c2084..435d0a54ab9a25 100644 +--- a/arch/arm64/kvm/hyp/nvhe/tlb.c ++++ b/arch/arm64/kvm/hyp/nvhe/tlb.c +@@ -82,28 +82,6 @@ void __kvm_tlb_flush_vmid_ipa(struct kvm_s2_mmu *mmu, + dsb(ish); + isb(); + +- /* +- * If the host is running at EL1 and we have a VPIPT I-cache, +- * then we must perform I-cache maintenance at EL2 in order for +- * it to have an effect on the guest. Since the guest cannot hit +- * I-cache lines allocated with a different VMID, we don't need +- * to worry about junk out of guest reset (we nuke the I-cache on +- * VMID rollover), but we do need to be careful when remapping +- * executable pages for the same guest. This can happen when KSM +- * takes a CoW fault on an executable page, copies the page into +- * a page that was previously mapped in the guest and then needs +- * to invalidate the guest view of the I-cache for that page +- * from EL1. 
To solve this, we invalidate the entire I-cache when +- * unmapping a page from a guest if we have a VPIPT I-cache but +- * the host is running at EL1. As above, we could do better if +- * we had the VA. +- * +- * The moral of this story is: if you have a VPIPT I-cache, then +- * you should be running with VHE enabled. +- */ +- if (icache_is_vpipt()) +- __flush_icache_all(); +- + __tlb_switch_to_host(&cxt); + } + +@@ -142,18 +120,5 @@ void __kvm_flush_vm_context(void) + { + dsb(ishst); + __tlbi(alle1is); +- +- /* +- * VIPT and PIPT caches are not affected by VMID, so no maintenance +- * is necessary across a VMID rollover. +- * +- * VPIPT caches constrain lookup and maintenance to the active VMID, +- * so we need to invalidate lines with a stale VMID to avoid an ABA +- * race after multiple rollovers. +- * +- */ +- if (icache_is_vpipt()) +- asm volatile("ic ialluis"); +- + dsb(ish); + } +diff --git a/arch/arm64/kvm/hyp/vhe/tlb.c b/arch/arm64/kvm/hyp/vhe/tlb.c +index 66f17349f0c369..67047feb306876 100644 +--- a/arch/arm64/kvm/hyp/vhe/tlb.c ++++ b/arch/arm64/kvm/hyp/vhe/tlb.c +@@ -146,18 +146,5 @@ void __kvm_flush_vm_context(void) + { + dsb(ishst); + __tlbi(alle1is); +- +- /* +- * VIPT and PIPT caches are not affected by VMID, so no maintenance +- * is necessary across a VMID rollover. +- * +- * VPIPT caches constrain lookup and maintenance to the active VMID, +- * so we need to invalidate lines with a stale VMID to avoid an ABA +- * race after multiple rollovers. 
+- * +- */ +- if (icache_is_vpipt()) +- asm volatile("ic ialluis"); +- + dsb(ish); + } +-- +2.53.0 + diff --git a/queue-5.10/series b/queue-5.10/series index d186b74b33..35ec2ba6e0 100644 --- a/queue-5.10/series +++ b/queue-5.10/series @@ -156,3 +156,9 @@ usbnet-fix-using-smp_processor_id-in-preemptible-cod.patch nfsd-don-t-ignore-the-return-code-of-svc_proc_regist.patch wifi-mac80211-check-tdls-flag-in-ieee80211_tdls_oper.patch spi-meson-spicc-fix-double-put-in-remove-path.patch +io_uring-prevent-opcode-speculation.patch +tap-free-page-on-error-paths-in-tap_get_user_xdp.patch +tun-free-page-on-build_skb-failure-in-tun_xdp_one.patch +kvm-arm64-remove-vpipt-i-cache-handling.patch +arm64-tlb-allow-xzr-argument-to-tlbi-ops.patch +arm64-tlb-optimize-arm64_workaround_repeat_tlbi.patch diff --git a/queue-5.10/tap-free-page-on-error-paths-in-tap_get_user_xdp.patch b/queue-5.10/tap-free-page-on-error-paths-in-tap_get_user_xdp.patch new file mode 100644 index 0000000000..536ff9ef87 --- /dev/null +++ b/queue-5.10/tap-free-page-on-error-paths-in-tap_get_user_xdp.patch @@ -0,0 +1,57 @@ +From acba7a9546166e019db32d71367ca8b388eb2fd4 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 11 Jun 2026 06:21:06 -0700 +Subject: tap: free page on error paths in tap_get_user_xdp() + +From: Weiming Shi + +[ Upstream commit 3bcf7aec6a9d16438f2cec29f5d7c8d5b8edf9b2 ] + +tap_get_user_xdp() rejects a frame shorter than ETH_HLEN with -EINVAL, +and returns -ENOMEM when build_skb() fails. Both paths jump to the err +label without freeing the page that vhost_net_build_xdp() allocated for +the frame. tap_sendmsg() discards the per-buffer return value and always +returns 0, so vhost_tx_batch() takes the success path and never frees +the page; each rejected frame in a batch leaks one page-frag chunk. + +Free the page on both error paths, before the skb is built. This is the +tap counterpart of the same leak in tun_xdp_one(). 
+ +Fixes: 0efac27791ee ("tap: accept an array of XDP buffs through sendmsg()") +Fixes: ed7f2afdd0e0 ("tap: add missing verification for short frame") +Reported-by: Xiang Mei +Signed-off-by: Weiming Shi +Reviewed-by: Dongli Zhang +Reviewed-by: Willem de Bruijn +Link: https://patch.msgid.link/20260521163230.1478627-2-bestswngs@gmail.com +Signed-off-by: Jakub Kicinski +(cherry picked from commit 3bcf7aec6a9d16438f2cec29f5d7c8d5b8edf9b2) +Signed-off-by: Harshit Mogalapalli +Signed-off-by: Sasha Levin +--- + drivers/net/tap.c | 2 ++ + 1 file changed, 2 insertions(+) + +diff --git a/drivers/net/tap.c b/drivers/net/tap.c +index 18f19fc66c64fa..6f5c996d3ed234 100644 +--- a/drivers/net/tap.c ++++ b/drivers/net/tap.c +@@ -1145,6 +1145,7 @@ static int tap_get_user_xdp(struct tap_queue *q, struct xdp_buff *xdp) + int err, depth; + + if (unlikely(xdp->data_end - xdp->data < ETH_HLEN)) { ++ put_page(virt_to_head_page(xdp->data)); + err = -EINVAL; + goto err; + } +@@ -1154,6 +1155,7 @@ static int tap_get_user_xdp(struct tap_queue *q, struct xdp_buff *xdp) + + skb = build_skb(xdp->data_hard_start, buflen); + if (!skb) { ++ put_page(virt_to_head_page(xdp->data)); + err = -ENOMEM; + goto err; + } +-- +2.53.0 + diff --git a/queue-5.10/tun-free-page-on-build_skb-failure-in-tun_xdp_one.patch b/queue-5.10/tun-free-page-on-build_skb-failure-in-tun_xdp_one.patch new file mode 100644 index 0000000000..c49c525897 --- /dev/null +++ b/queue-5.10/tun-free-page-on-build_skb-failure-in-tun_xdp_one.patch @@ -0,0 +1,51 @@ +From de01d2486f3a93a389084ad4dec108860cbb7dad Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 11 Jun 2026 06:28:53 -0700 +Subject: tun: free page on build_skb failure in tun_xdp_one() + +From: Weiming Shi + +[ Upstream commit aa8963fdce667a42fb7f0bdd2909fadcab02f9a8 ] + +When build_skb() fails in tun_xdp_one(), the function sets ret to +-ENOMEM and jumps to the out label, which returns without freeing the +page that vhost_net_build_xdp() allocated for the frame. 
As with the +short-frame rejection path, tun_sendmsg() discards the per-buffer error +and still returns total_len, so vhost_tx_batch() takes the success path +and never frees the page. Each build_skb() failure in a batch leaks one +page-frag chunk. + +Free the page before taking the error path, matching the put_page() the +other error exits of tun_xdp_one() already perform. + +Fixes: 043d222f93ab ("tuntap: accept an array of XDP buffs through sendmsg()") +Reported-by: Xiang Mei +Signed-off-by: Weiming Shi +Reviewed-by: Dongli Zhang +Reviewed-by: Willem de Bruijn +Link: https://patch.msgid.link/20260521163312.1479805-2-bestswngs@gmail.com +Signed-off-by: Jakub Kicinski +(cherry picked from commit aa8963fdce667a42fb7f0bdd2909fadcab02f9a8) +[Harshit: Backport to 5.15.y/5.10.y, use err instead of ret, no change +needed] +Signed-off-by: Harshit Mogalapalli +Signed-off-by: Sasha Levin +--- + drivers/net/tun.c | 1 + + 1 file changed, 1 insertion(+) + +diff --git a/drivers/net/tun.c b/drivers/net/tun.c +index 930086d79f97c8..d960b261dbe4f6 100644 +--- a/drivers/net/tun.c ++++ b/drivers/net/tun.c +@@ -2518,6 +2518,7 @@ static int tun_xdp_one(struct tun_struct *tun, + build: + skb = build_skb(xdp->data_hard_start, buflen); + if (!skb) { ++ put_page(virt_to_head_page(xdp->data)); + err = -ENOMEM; + goto out; + } +-- +2.53.0 + diff --git a/queue-5.15/arm64-tlb-allow-xzr-argument-to-tlbi-ops.patch b/queue-5.15/arm64-tlb-allow-xzr-argument-to-tlbi-ops.patch new file mode 100644 index 0000000000..9545de440d --- /dev/null +++ b/queue-5.15/arm64-tlb-allow-xzr-argument-to-tlbi-ops.patch @@ -0,0 +1,62 @@ +From 37b0e9bc17faee182310f527351a872c78eeb9b4 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 11 Jun 2026 14:47:05 +0100 +Subject: arm64: tlb: Allow XZR argument to TLBI ops + +From: Mark Rutland + +commit bfd9c931d19aa59fb8371d557774fa169b15db9a upstream. 
+ +The TLBI instruction accepts XZR as a register argument, and for TLBI +operations with a register argument, there is no functional difference +between using XZR or another GPR which contains zeroes. Operations +without a register argument are encoded as if XZR were used. + +Allow the __TLBI_1() macro to use XZR when a register argument is all +zeroes. + +Today this only results in a trivial code saving in +__do_compat_cache_op()'s workaround for Neoverse-N1 erratum #1542419. In +subsequent patches this pattern will be used more generally. + +There should be no functional change as a result of this patch. + +Signed-off-by: Mark Rutland +Cc: Catalin Marinas +Cc: Marc Zyngier +Cc: Oliver Upton +Cc: Ryan Roberts +Cc: Will Deacon +Signed-off-by: Will Deacon +Signed-off-by: Catalin Marinas +Signed-off-by: Greg Kroah-Hartman +[Mark: Backport to v5.15.y] +Signed-off-by: Mark Rutland +Signed-off-by: Sasha Levin +--- + arch/arm64/include/asm/tlbflush.h | 6 +++--- + 1 file changed, 3 insertions(+), 3 deletions(-) + +diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h +index 412a3b9a3c25dc..2626a45849c241 100644 +--- a/arch/arm64/include/asm/tlbflush.h ++++ b/arch/arm64/include/asm/tlbflush.h +@@ -37,12 +37,12 @@ + : : ) + + #define __TLBI_1(op, arg) asm (ARM64_ASM_PREAMBLE \ +- "tlbi " #op ", %0\n" \ ++ "tlbi " #op ", %x0\n" \ + ALTERNATIVE("nop\n nop", \ +- "dsb ish\n tlbi " #op ", %0", \ ++ "dsb ish\n tlbi " #op ", %x0", \ + ARM64_WORKAROUND_REPEAT_TLBI, \ + CONFIG_ARM64_WORKAROUND_REPEAT_TLBI) \ +- : : "r" (arg)) ++ : : "rZ" (arg)) + + #define __TLBI_N(op, arg, n, ...) 
__TLBI_##n(op, arg) + +-- +2.53.0 + diff --git a/queue-5.15/arm64-tlb-optimize-arm64_workaround_repeat_tlbi.patch b/queue-5.15/arm64-tlb-optimize-arm64_workaround_repeat_tlbi.patch new file mode 100644 index 0000000000..c55ae07507 --- /dev/null +++ b/queue-5.15/arm64-tlb-optimize-arm64_workaround_repeat_tlbi.patch @@ -0,0 +1,380 @@ +From efb74232b2fda57351e04185bf8126663f40329d Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 11 Jun 2026 14:47:06 +0100 +Subject: arm64: tlb: Optimize ARM64_WORKAROUND_REPEAT_TLBI + +From: Mark Rutland + +commit a8f78680ee6bf795086384e8aea159a52814f827 upstream. + +The ARM64_WORKAROUND_REPEAT_TLBI workaround is used to mitigate several +errata where broadcast TLBI;DSB sequences don't provide all the +architecturally required synchronization. The workaround performs more +work than necessary, and can have significant overhead. This patch +optimizes the workaround, as explained below. + +The workaround was originally added for Qualcomm Falkor erratum 1009 in +commit: + + d9ff80f83ecb ("arm64: Work around Falkor erratum 1009") + +As noted in the message for that commit, the workaround is applied even +in cases where it is not strictly necessary. + +The workaround was later reused without changes for: + +* Arm Cortex-A76 erratum #1286807 + SDEN v33: https://developer.arm.com/documentation/SDEN-885749/33-0/ + +* Arm Cortex-A55 erratum #2441007 + SDEN v16: https://developer.arm.com/documentation/SDEN-859338/1600/ + +* Arm Cortex-A510 erratum #2441009 + SDEN v19: https://developer.arm.com/documentation/SDEN-1873351/1900/ + +The important details to note are as follows: + +1. All relevant errata only affect the ordering and/or completion of + memory accesses which have been translated by an invalidated TLB + entry. The actual invalidation of TLB entries is unaffected. + +2. 
The existing workaround is applied to both broadcast and local TLB + invalidation, whereas for all relevant errata it is only necessary to + apply a workaround for broadcast invalidation. + +3. The existing workaround replaces every TLBI with a TLBI;DSB;TLBI + sequence, whereas for all relevant errata it is only necessary to + execute a single additional TLBI;DSB sequence after any number of + TLBIs are completed by a DSB. + + For example, for a sequence of batched TLBIs: + + TLBI [, ] + TLBI [, ] + TLBI [, ] + DSB ISH + + ... the existing workaround will expand this to: + + TLBI [, ] + DSB ISH // additional + TLBI [, ] // additional + TLBI [, ] + DSB ISH // additional + TLBI [, ] // additional + TLBI [, ] + DSB ISH // additional + TLBI [, ] // additional + DSB ISH + + ... whereas it is sufficient to have: + + TLBI [, ] + TLBI [, ] + TLBI [, ] + DSB ISH + TLBI [, ] // additional + DSB ISH // additional + + Using a single additional TLBI and DSB at the end of the sequence can + have significantly lower overhead as each DSB which completes a TLBI + must synchronize with other PEs in the system, with potential + performance effects both locally and system-wide. + +4. The existing workaround repeats each specific TLBI operation, whereas + for all relevant errata it is sufficient for the additional TLBI to + use *any* operation which will be broadcast, regardless of which + translation regime or stage of translation the operation applies to. + + For example, for a single TLBI: + + TLBI ALLE2IS + DSB ISH + + ... the existing workaround will expand this to: + + TLBI ALLE2IS + DSB ISH + TLBI ALLE2IS // additional + DSB ISH // additional + + ... whereas it is sufficient to have: + + TLBI ALLE2IS + DSB ISH + TLBI VALE1IS, XZR // additional + DSB ISH // additional + + As the additional TLBI doesn't have to match a specific earlier TLBI, + the additional TLBI can be implemented in separate code, with no + memory of the earlier TLBIs. 
The additional TLBI can also use a + cheaper TLBI operation. + +5. The existing workaround is applied to both Stage-1 and Stage-2 TLB + invalidation, whereas for all relevant errata it is only necessary to + apply a workaround for Stage-1 invalidation. + + Architecturally, TLBI operations which invalidate only Stage-2 + information (e.g. IPAS2E1IS) are not required to invalidate TLB + entries which combine information from Stage-1 and Stage-2 + translation table entries, and consequently may not complete memory + accesses translated by those combined entries. In these cases, + completion of memory accesses is only guaranteed after subsequent + invalidation of Stage-1 information (e.g. VMALLE1IS). + +Taking the above points into account, this patch reworks the workaround +logic to reduce overhead: + +* New __tlbi_sync_s1ish() and __tlbi_sync_s1ish_hyp() functions are + added and used in place of any dsb(ish) which is used to complete + broadcast Stage-1 TLB maintenance. When the + ARM64_WORKAROUND_REPEAT_TLBI workaround is enabled, these helpers will + execute an additional TLBI;DSB sequence. + + For consistency, it might make sense to add __tlbi_sync_*() helpers + for local and stage 2 maintenance. For now I've left those with + open-coded dsb() to keep the diff small. + +* The duplication of TLBIs in __TLBI_0() and __TLBI_1() is removed. This + is no longer needed as the necessary synchronization will happen in + __tlbi_sync_s1ish() or __tlbi_sync_s1ish_hyp(). + +* The additional TLBI operation is chosen to have minimal impact: + + - __tlbi_sync_s1ish() uses "TLBI VALE1IS, XZR". This is only used at + EL1 or at EL2 with {E2H,TGE}=={1,1}, where it will target an unused + entry for the reserved ASID in the kernel's own translation regime, + and have no adverse effect. + + - __tlbi_sync_s1ish_hyp() uses "TLBI VALE2IS, XZR". This is only used + in hyp code, where it will target an unused entry in the hyp code's + TTBR0 mapping, and should have no adverse effect. 
+ +* As __TLBI_0() and __TLBI_1() no longer replace each TLBI with a + TLBI;DSB;TLBI sequence, batching TLBIs is worthwhile, and there's no + need for arch_tlbbatch_should_defer() to consider + ARM64_WORKAROUND_REPEAT_TLBI. + +When building defconfig with GCC 15.1.0, compared to v6.19-rc1, this +patch saves ~1KiB of text, makes the vmlinux ~42KiB smaller, and makes +the resulting Image 64KiB smaller: + +| [mark@lakrids:~/src/linux]% size vmlinux-* +| text data bss dec hex filename +| 21179831 19660919 708216 41548966 279fca6 vmlinux-after +| 21181075 19660903 708216 41550194 27a0172 vmlinux-before +| [mark@lakrids:~/src/linux]% ls -l vmlinux-* +| -rwxr-xr-x 1 mark mark 157771472 Feb 4 12:05 vmlinux-after +| -rwxr-xr-x 1 mark mark 157815432 Feb 4 12:05 vmlinux-before +| [mark@lakrids:~/src/linux]% ls -l Image-* +| -rw-r--r-- 1 mark mark 41007616 Feb 4 12:05 Image-after +| -rw-r--r-- 1 mark mark 41073152 Feb 4 12:05 Image-before + +Signed-off-by: Mark Rutland +Cc: Catalin Marinas +Cc: Marc Zyngier +Cc: Oliver Upton +Cc: Ryan Roberts +Cc: Will Deacon +Signed-off-by: Will Deacon +Signed-off-by: Catalin Marinas +Signed-off-by: Greg Kroah-Hartman +[Mark: Backport to v5.15.y; use inline ALTERNATIVE() sequence] +Signed-off-by: Mark Rutland +Signed-off-by: Sasha Levin +--- + arch/arm64/include/asm/tlbflush.h | 51 ++++++++++++++++++++++--------- + arch/arm64/kernel/sys_compat.c | 2 +- + arch/arm64/kvm/hyp/nvhe/tlb.c | 6 ++-- + arch/arm64/kvm/hyp/vhe/tlb.c | 6 ++-- + 4 files changed, 44 insertions(+), 21 deletions(-) + +diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h +index 2626a45849c241..cc6e47172f8fa7 100644 +--- a/arch/arm64/include/asm/tlbflush.h ++++ b/arch/arm64/include/asm/tlbflush.h +@@ -30,18 +30,10 @@ + */ + #define __TLBI_0(op, arg) asm (ARM64_ASM_PREAMBLE \ + "tlbi " #op "\n" \ +- ALTERNATIVE("nop\n nop", \ +- "dsb ish\n tlbi " #op, \ +- ARM64_WORKAROUND_REPEAT_TLBI, \ +- CONFIG_ARM64_WORKAROUND_REPEAT_TLBI) \ + : : ) + + 
#define __TLBI_1(op, arg) asm (ARM64_ASM_PREAMBLE \ + "tlbi " #op ", %x0\n" \ +- ALTERNATIVE("nop\n nop", \ +- "dsb ish\n tlbi " #op ", %x0", \ +- ARM64_WORKAROUND_REPEAT_TLBI, \ +- CONFIG_ARM64_WORKAROUND_REPEAT_TLBI) \ + : : "rZ" (arg)) + + #define __TLBI_N(op, arg, n, ...) __TLBI_##n(op, arg) +@@ -158,6 +150,37 @@ static inline unsigned long get_trans_granule(void) + #define __TLBI_RANGE_NUM(pages, scale) \ + ((((pages) >> (5 * (scale) + 1)) & TLBI_RANGE_MASK) - 1) + ++#define __repeat_tlbi_sync(op, arg) \ ++do { \ ++ asm volatile( \ ++ ALTERNATIVE("nop\n nop", \ ++ "tlbi " #op ", %x0\n dsb ish", \ ++ ARM64_WORKAROUND_REPEAT_TLBI, \ ++ CONFIG_ARM64_WORKAROUND_REPEAT_TLBI) \ ++ : \ ++ : "rZ" (arg)); \ ++} while (0) ++ ++/* ++ * Complete broadcast TLB maintenance issued by the host which invalidates ++ * stage 1 information in the host's own translation regime. ++ */ ++static inline void __tlbi_sync_s1ish(void) ++{ ++ dsb(ish); ++ __repeat_tlbi_sync(vale1is, 0); ++} ++ ++/* ++ * Complete broadcast TLB maintenance issued by hyp code which invalidates ++ * stage 1 translation information in any translation regime. 
++ */ ++static inline void __tlbi_sync_s1ish_hyp(void) ++{ ++ dsb(ish); ++ __repeat_tlbi_sync(vale2is, 0); ++} ++ + /* + * TLB Invalidation + * ================ +@@ -239,7 +262,7 @@ static inline void flush_tlb_all(void) + { + dsb(ishst); + __tlbi(vmalle1is); +- dsb(ish); ++ __tlbi_sync_s1ish(); + isb(); + } + +@@ -251,7 +274,7 @@ static inline void flush_tlb_mm(struct mm_struct *mm) + asid = __TLBI_VADDR(0, ASID(mm)); + __tlbi(aside1is, asid); + __tlbi_user(aside1is, asid); +- dsb(ish); ++ __tlbi_sync_s1ish(); + } + + static inline void flush_tlb_page_nosync(struct vm_area_struct *vma, +@@ -269,7 +292,7 @@ static inline void flush_tlb_page(struct vm_area_struct *vma, + unsigned long uaddr) + { + flush_tlb_page_nosync(vma, uaddr); +- dsb(ish); ++ __tlbi_sync_s1ish(); + } + + /* +@@ -357,7 +380,7 @@ static inline void __flush_tlb_range(struct vm_area_struct *vma, + } + scale++; + } +- dsb(ish); ++ __tlbi_sync_s1ish(); + } + + static inline void flush_tlb_range(struct vm_area_struct *vma, +@@ -386,7 +409,7 @@ static inline void flush_tlb_kernel_range(unsigned long start, unsigned long end + dsb(ishst); + for (addr = start; addr < end; addr += 1 << (PAGE_SHIFT - 12)) + __tlbi(vaale1is, addr); +- dsb(ish); ++ __tlbi_sync_s1ish(); + isb(); + } + +@@ -400,7 +423,7 @@ static inline void __flush_tlb_kernel_pgtable(unsigned long kaddr) + + dsb(ishst); + __tlbi(vaae1is, addr); +- dsb(ish); ++ __tlbi_sync_s1ish(); + isb(); + } + #endif +diff --git a/arch/arm64/kernel/sys_compat.c b/arch/arm64/kernel/sys_compat.c +index b88a52f7188fcc..416195f376816b 100644 +--- a/arch/arm64/kernel/sys_compat.c ++++ b/arch/arm64/kernel/sys_compat.c +@@ -38,7 +38,7 @@ __do_compat_cache_op(unsigned long start, unsigned long end) + * We pick the reserved-ASID to minimise the impact. 
+ */ + __tlbi(aside1is, __TLBI_VADDR(0, 0)); +- dsb(ish); ++ __tlbi_sync_s1ish(); + } + + ret = caches_clean_inval_user_pou(start, start + chunk); +diff --git a/arch/arm64/kvm/hyp/nvhe/tlb.c b/arch/arm64/kvm/hyp/nvhe/tlb.c +index 291789df24e3ee..76973e3b48a076 100644 +--- a/arch/arm64/kvm/hyp/nvhe/tlb.c ++++ b/arch/arm64/kvm/hyp/nvhe/tlb.c +@@ -81,7 +81,7 @@ void __kvm_tlb_flush_vmid_ipa(struct kvm_s2_mmu *mmu, + */ + dsb(ish); + __tlbi(vmalle1is); +- dsb(ish); ++ __tlbi_sync_s1ish_hyp(); + isb(); + + __tlb_switch_to_host(&cxt); +@@ -97,7 +97,7 @@ void __kvm_tlb_flush_vmid(struct kvm_s2_mmu *mmu) + __tlb_switch_to_guest(mmu, &cxt); + + __tlbi(vmalls12e1is); +- dsb(ish); ++ __tlbi_sync_s1ish_hyp(); + isb(); + + __tlb_switch_to_host(&cxt); +@@ -122,5 +122,5 @@ void __kvm_flush_vm_context(void) + { + dsb(ishst); + __tlbi(alle1is); +- dsb(ish); ++ __tlbi_sync_s1ish_hyp(); + } +diff --git a/arch/arm64/kvm/hyp/vhe/tlb.c b/arch/arm64/kvm/hyp/vhe/tlb.c +index fc3fcd29ccc306..59aa22b48e9538 100644 +--- a/arch/arm64/kvm/hyp/vhe/tlb.c ++++ b/arch/arm64/kvm/hyp/vhe/tlb.c +@@ -105,7 +105,7 @@ void __kvm_tlb_flush_vmid_ipa(struct kvm_s2_mmu *mmu, + */ + dsb(ish); + __tlbi(vmalle1is); +- dsb(ish); ++ __tlbi_sync_s1ish_hyp(); + isb(); + + __tlb_switch_to_host(&cxt); +@@ -121,7 +121,7 @@ void __kvm_tlb_flush_vmid(struct kvm_s2_mmu *mmu) + __tlb_switch_to_guest(mmu, &cxt); + + __tlbi(vmalls12e1is); +- dsb(ish); ++ __tlbi_sync_s1ish_hyp(); + isb(); + + __tlb_switch_to_host(&cxt); +@@ -146,5 +146,5 @@ void __kvm_flush_vm_context(void) + { + dsb(ishst); + __tlbi(alle1is); +- dsb(ish); ++ __tlbi_sync_s1ish_hyp(); + } +-- +2.53.0 + diff --git a/queue-5.15/kvm-arm64-remove-vpipt-i-cache-handling.patch b/queue-5.15/kvm-arm64-remove-vpipt-i-cache-handling.patch new file mode 100644 index 0000000000..f4710e1231 --- /dev/null +++ b/queue-5.15/kvm-arm64-remove-vpipt-i-cache-handling.patch @@ -0,0 +1,120 @@ +From 1e70ed7fdcf4977c02fdf405beabfdea2fb7e110 Mon Sep 17 00:00:00 2001 +From: Sasha 
Levin +Date: Thu, 11 Jun 2026 14:47:04 +0100 +Subject: KVM: arm64: Remove VPIPT I-cache handling + +From: Marc Zyngier + +commit ced242ba9d7cb3571f6e0f165f643cb832d52148 upstream. + +We have some special handling for VPIPT I-cache in critical parts +of the cache and TLB maintenance. Remove it. + +Reviewed-by: Zenghui Yu +Reviewed-by: Anshuman Khandual +Signed-off-by: Marc Zyngier +Acked-by: Mark Rutland +Link: https://lore.kernel.org/r/20231204143606.1806432-2-maz@kernel.org +Signed-off-by: Will Deacon +[Mark: Backport to v5.15.y. VPIPT HW was never built; this is all dead code] +Signed-off-by: Mark Rutland +Signed-off-by: Sasha Levin +--- + arch/arm64/include/asm/kvm_mmu.h | 4 ++-- + arch/arm64/kvm/hyp/nvhe/tlb.c | 35 -------------------------------- + arch/arm64/kvm/hyp/vhe/tlb.c | 13 ------------ + 3 files changed, 2 insertions(+), 50 deletions(-) + +diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h +index 02d37888774383..1733eb87c29fbb 100644 +--- a/arch/arm64/include/asm/kvm_mmu.h ++++ b/arch/arm64/include/asm/kvm_mmu.h +@@ -207,8 +207,8 @@ static inline void __invalidate_icache_guest_page(void *va, size_t size) + if (icache_is_aliasing()) { + /* any kind of VIPT cache */ + icache_inval_all_pou(); +- } else if (is_kernel_in_hyp_mode() || !icache_is_vpipt()) { +- /* PIPT or VPIPT at EL2 (see comment in __kvm_tlb_flush_vmid_ipa) */ ++ } else { ++ /* PIPT */ + icache_inval_pou((unsigned long)va, (unsigned long)va + size); + } + } +diff --git a/arch/arm64/kvm/hyp/nvhe/tlb.c b/arch/arm64/kvm/hyp/nvhe/tlb.c +index d296d617f58963..291789df24e3ee 100644 +--- a/arch/arm64/kvm/hyp/nvhe/tlb.c ++++ b/arch/arm64/kvm/hyp/nvhe/tlb.c +@@ -84,28 +84,6 @@ void __kvm_tlb_flush_vmid_ipa(struct kvm_s2_mmu *mmu, + dsb(ish); + isb(); + +- /* +- * If the host is running at EL1 and we have a VPIPT I-cache, +- * then we must perform I-cache maintenance at EL2 in order for +- * it to have an effect on the guest. 
Since the guest cannot hit +- * I-cache lines allocated with a different VMID, we don't need +- * to worry about junk out of guest reset (we nuke the I-cache on +- * VMID rollover), but we do need to be careful when remapping +- * executable pages for the same guest. This can happen when KSM +- * takes a CoW fault on an executable page, copies the page into +- * a page that was previously mapped in the guest and then needs +- * to invalidate the guest view of the I-cache for that page +- * from EL1. To solve this, we invalidate the entire I-cache when +- * unmapping a page from a guest if we have a VPIPT I-cache but +- * the host is running at EL1. As above, we could do better if +- * we had the VA. +- * +- * The moral of this story is: if you have a VPIPT I-cache, then +- * you should be running with VHE enabled. +- */ +- if (icache_is_vpipt()) +- icache_inval_all_pou(); +- + __tlb_switch_to_host(&cxt); + } + +@@ -144,18 +122,5 @@ void __kvm_flush_vm_context(void) + { + dsb(ishst); + __tlbi(alle1is); +- +- /* +- * VIPT and PIPT caches are not affected by VMID, so no maintenance +- * is necessary across a VMID rollover. +- * +- * VPIPT caches constrain lookup and maintenance to the active VMID, +- * so we need to invalidate lines with a stale VMID to avoid an ABA +- * race after multiple rollovers. +- * +- */ +- if (icache_is_vpipt()) +- asm volatile("ic ialluis"); +- + dsb(ish); + } +diff --git a/arch/arm64/kvm/hyp/vhe/tlb.c b/arch/arm64/kvm/hyp/vhe/tlb.c +index 24cef9b87f9e9c..fc3fcd29ccc306 100644 +--- a/arch/arm64/kvm/hyp/vhe/tlb.c ++++ b/arch/arm64/kvm/hyp/vhe/tlb.c +@@ -146,18 +146,5 @@ void __kvm_flush_vm_context(void) + { + dsb(ishst); + __tlbi(alle1is); +- +- /* +- * VIPT and PIPT caches are not affected by VMID, so no maintenance +- * is necessary across a VMID rollover. +- * +- * VPIPT caches constrain lookup and maintenance to the active VMID, +- * so we need to invalidate lines with a stale VMID to avoid an ABA +- * race after multiple rollovers. 
+- * +- */ +- if (icache_is_vpipt()) +- asm volatile("ic ialluis"); +- + dsb(ish); + } +-- +2.53.0 + diff --git a/queue-5.15/series b/queue-5.15/series index c438df797b..d4012ed00d 100644 --- a/queue-5.15/series +++ b/queue-5.15/series @@ -174,3 +174,8 @@ time-fix-off-by-one-in-settimeofday-usec-validation.patch ext4-validate-p_idx-bounds-in-ext4_ext_correct_index.patch fs-ntfs3-return-error-for-inconsistent-extended-attr.patch nfsd-don-t-ignore-the-return-code-of-svc_proc_regist.patch +tap-free-page-on-error-paths-in-tap_get_user_xdp.patch +tun-free-page-on-build_skb-failure-in-tun_xdp_one.patch +kvm-arm64-remove-vpipt-i-cache-handling.patch +arm64-tlb-allow-xzr-argument-to-tlbi-ops.patch +arm64-tlb-optimize-arm64_workaround_repeat_tlbi.patch diff --git a/queue-5.15/tap-free-page-on-error-paths-in-tap_get_user_xdp.patch b/queue-5.15/tap-free-page-on-error-paths-in-tap_get_user_xdp.patch new file mode 100644 index 0000000000..37aa092834 --- /dev/null +++ b/queue-5.15/tap-free-page-on-error-paths-in-tap_get_user_xdp.patch @@ -0,0 +1,57 @@ +From d8b319de2dcbcae9b9da8f41f4f38c1e7c3435e5 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 11 Jun 2026 06:21:06 -0700 +Subject: tap: free page on error paths in tap_get_user_xdp() + +From: Weiming Shi + +[ Upstream commit 3bcf7aec6a9d16438f2cec29f5d7c8d5b8edf9b2 ] + +tap_get_user_xdp() rejects a frame shorter than ETH_HLEN with -EINVAL, +and returns -ENOMEM when build_skb() fails. Both paths jump to the err +label without freeing the page that vhost_net_build_xdp() allocated for +the frame. tap_sendmsg() discards the per-buffer return value and always +returns 0, so vhost_tx_batch() takes the success path and never frees +the page; each rejected frame in a batch leaks one page-frag chunk. + +Free the page on both error paths, before the skb is built. This is the +tap counterpart of the same leak in tun_xdp_one(). 
+ +Fixes: 0efac27791ee ("tap: accept an array of XDP buffs through sendmsg()") +Fixes: ed7f2afdd0e0 ("tap: add missing verification for short frame") +Reported-by: Xiang Mei +Signed-off-by: Weiming Shi +Reviewed-by: Dongli Zhang +Reviewed-by: Willem de Bruijn +Link: https://patch.msgid.link/20260521163230.1478627-2-bestswngs@gmail.com +Signed-off-by: Jakub Kicinski +(cherry picked from commit 3bcf7aec6a9d16438f2cec29f5d7c8d5b8edf9b2) +Signed-off-by: Harshit Mogalapalli +Signed-off-by: Sasha Levin +--- + drivers/net/tap.c | 2 ++ + 1 file changed, 2 insertions(+) + +diff --git a/drivers/net/tap.c b/drivers/net/tap.c +index a08adca412b41a..3a91972485cc36 100644 +--- a/drivers/net/tap.c ++++ b/drivers/net/tap.c +@@ -1143,6 +1143,7 @@ static int tap_get_user_xdp(struct tap_queue *q, struct xdp_buff *xdp) + int err, depth; + + if (unlikely(xdp->data_end - xdp->data < ETH_HLEN)) { ++ put_page(virt_to_head_page(xdp->data)); + err = -EINVAL; + goto err; + } +@@ -1152,6 +1153,7 @@ static int tap_get_user_xdp(struct tap_queue *q, struct xdp_buff *xdp) + + skb = build_skb(xdp->data_hard_start, buflen); + if (!skb) { ++ put_page(virt_to_head_page(xdp->data)); + err = -ENOMEM; + goto err; + } +-- +2.53.0 + diff --git a/queue-5.15/tun-free-page-on-build_skb-failure-in-tun_xdp_one.patch b/queue-5.15/tun-free-page-on-build_skb-failure-in-tun_xdp_one.patch new file mode 100644 index 0000000000..96adc3a236 --- /dev/null +++ b/queue-5.15/tun-free-page-on-build_skb-failure-in-tun_xdp_one.patch @@ -0,0 +1,51 @@ +From 66f41f99118179e2e9f3ffb6c5c27e84bee8319e Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 11 Jun 2026 06:28:53 -0700 +Subject: tun: free page on build_skb failure in tun_xdp_one() + +From: Weiming Shi + +[ Upstream commit aa8963fdce667a42fb7f0bdd2909fadcab02f9a8 ] + +When build_skb() fails in tun_xdp_one(), the function sets ret to +-ENOMEM and jumps to the out label, which returns without freeing the +page that vhost_net_build_xdp() allocated for the frame. 
As with the +short-frame rejection path, tun_sendmsg() discards the per-buffer error +and still returns total_len, so vhost_tx_batch() takes the success path +and never frees the page. Each build_skb() failure in a batch leaks one +page-frag chunk. + +Free the page before taking the error path, matching the put_page() the +other error exits of tun_xdp_one() already perform. + +Fixes: 043d222f93ab ("tuntap: accept an array of XDP buffs through sendmsg()") +Reported-by: Xiang Mei +Signed-off-by: Weiming Shi +Reviewed-by: Dongli Zhang +Reviewed-by: Willem de Bruijn +Link: https://patch.msgid.link/20260521163312.1479805-2-bestswngs@gmail.com +Signed-off-by: Jakub Kicinski +(cherry picked from commit aa8963fdce667a42fb7f0bdd2909fadcab02f9a8) +[Harshit: Backport to 5.15.y/5.10.y, use err instead of ret, no change +needed] +Signed-off-by: Harshit Mogalapalli +Signed-off-by: Sasha Levin +--- + drivers/net/tun.c | 1 + + 1 file changed, 1 insertion(+) + +diff --git a/drivers/net/tun.c b/drivers/net/tun.c +index 803cb4722dbf4a..aad0760c8d92b7 100644 +--- a/drivers/net/tun.c ++++ b/drivers/net/tun.c +@@ -2468,6 +2468,7 @@ static int tun_xdp_one(struct tun_struct *tun, + build: + skb = build_skb(xdp->data_hard_start, buflen); + if (!skb) { ++ put_page(virt_to_head_page(xdp->data)); + err = -ENOMEM; + goto out; + } +-- +2.53.0 + diff --git a/queue-6.1/arm64-tlb-allow-xzr-argument-to-tlbi-ops.patch b/queue-6.1/arm64-tlb-allow-xzr-argument-to-tlbi-ops.patch new file mode 100644 index 0000000000..d5731a2337 --- /dev/null +++ b/queue-6.1/arm64-tlb-allow-xzr-argument-to-tlbi-ops.patch @@ -0,0 +1,62 @@ +From 466eb0f0987d894af36ef6ed03fe1f73b3448bea Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 11 Jun 2026 14:44:50 +0100 +Subject: arm64: tlb: Allow XZR argument to TLBI ops + +From: Mark Rutland + +commit bfd9c931d19aa59fb8371d557774fa169b15db9a upstream. 
+ +The TLBI instruction accepts XZR as a register argument, and for TLBI +operations with a register argument, there is no functional difference +between using XZR or another GPR which contains zeroes. Operations +without a register argument are encoded as if XZR were used. + +Allow the __TLBI_1() macro to use XZR when a register argument is all +zeroes. + +Today this only results in a trivial code saving in +__do_compat_cache_op()'s workaround for Neoverse-N1 erratum #1542419. In +subsequent patches this pattern will be used more generally. + +There should be no functional change as a result of this patch. + +Signed-off-by: Mark Rutland +Cc: Catalin Marinas +Cc: Marc Zyngier +Cc: Oliver Upton +Cc: Ryan Roberts +Cc: Will Deacon +Signed-off-by: Will Deacon +Signed-off-by: Catalin Marinas +Signed-off-by: Greg Kroah-Hartman +[Mark: Backport to v6.1.y] +Signed-off-by: Mark Rutland +Signed-off-by: Sasha Levin +--- + arch/arm64/include/asm/tlbflush.h | 6 +++--- + 1 file changed, 3 insertions(+), 3 deletions(-) + +diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h +index 412a3b9a3c25dc..2626a45849c241 100644 +--- a/arch/arm64/include/asm/tlbflush.h ++++ b/arch/arm64/include/asm/tlbflush.h +@@ -37,12 +37,12 @@ + : : ) + + #define __TLBI_1(op, arg) asm (ARM64_ASM_PREAMBLE \ +- "tlbi " #op ", %0\n" \ ++ "tlbi " #op ", %x0\n" \ + ALTERNATIVE("nop\n nop", \ +- "dsb ish\n tlbi " #op ", %0", \ ++ "dsb ish\n tlbi " #op ", %x0", \ + ARM64_WORKAROUND_REPEAT_TLBI, \ + CONFIG_ARM64_WORKAROUND_REPEAT_TLBI) \ +- : : "r" (arg)) ++ : : "rZ" (arg)) + + #define __TLBI_N(op, arg, n, ...) 
__TLBI_##n(op, arg) + +-- +2.53.0 + diff --git a/queue-6.1/arm64-tlb-optimize-arm64_workaround_repeat_tlbi.patch b/queue-6.1/arm64-tlb-optimize-arm64_workaround_repeat_tlbi.patch new file mode 100644 index 0000000000..e1cd052aff --- /dev/null +++ b/queue-6.1/arm64-tlb-optimize-arm64_workaround_repeat_tlbi.patch @@ -0,0 +1,391 @@ +From abeb0cd888fa1d94cd041af9dc8cdf78986cd652 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 11 Jun 2026 14:44:51 +0100 +Subject: arm64: tlb: Optimize ARM64_WORKAROUND_REPEAT_TLBI + +From: Mark Rutland + +commit a8f78680ee6bf795086384e8aea159a52814f827 upstream. + +The ARM64_WORKAROUND_REPEAT_TLBI workaround is used to mitigate several +errata where broadcast TLBI;DSB sequences don't provide all the +architecturally required synchronization. The workaround performs more +work than necessary, and can have significant overhead. This patch +optimizes the workaround, as explained below. + +The workaround was originally added for Qualcomm Falkor erratum 1009 in +commit: + + d9ff80f83ecb ("arm64: Work around Falkor erratum 1009") + +As noted in the message for that commit, the workaround is applied even +in cases where it is not strictly necessary. + +The workaround was later reused without changes for: + +* Arm Cortex-A76 erratum #1286807 + SDEN v33: https://developer.arm.com/documentation/SDEN-885749/33-0/ + +* Arm Cortex-A55 erratum #2441007 + SDEN v16: https://developer.arm.com/documentation/SDEN-859338/1600/ + +* Arm Cortex-A510 erratum #2441009 + SDEN v19: https://developer.arm.com/documentation/SDEN-1873351/1900/ + +The important details to note are as follows: + +1. All relevant errata only affect the ordering and/or completion of + memory accesses which have been translated by an invalidated TLB + entry. The actual invalidation of TLB entries is unaffected. + +2. 
The existing workaround is applied to both broadcast and local TLB + invalidation, whereas for all relevant errata it is only necessary to + apply a workaround for broadcast invalidation. + +3. The existing workaround replaces every TLBI with a TLBI;DSB;TLBI + sequence, whereas for all relevant errata it is only necessary to + execute a single additional TLBI;DSB sequence after any number of + TLBIs are completed by a DSB. + + For example, for a sequence of batched TLBIs: + + TLBI [, ] + TLBI [, ] + TLBI [, ] + DSB ISH + + ... the existing workaround will expand this to: + + TLBI [, ] + DSB ISH // additional + TLBI [, ] // additional + TLBI [, ] + DSB ISH // additional + TLBI [, ] // additional + TLBI [, ] + DSB ISH // additional + TLBI [, ] // additional + DSB ISH + + ... whereas it is sufficient to have: + + TLBI [, ] + TLBI [, ] + TLBI [, ] + DSB ISH + TLBI [, ] // additional + DSB ISH // additional + + Using a single additional TLBI and DSB at the end of the sequence can + have significantly lower overhead as each DSB which completes a TLBI + must synchronize with other PEs in the system, with potential + performance effects both locally and system-wide. + +4. The existing workaround repeats each specific TLBI operation, whereas + for all relevant errata it is sufficient for the additional TLBI to + use *any* operation which will be broadcast, regardless of which + translation regime or stage of translation the operation applies to. + + For example, for a single TLBI: + + TLBI ALLE2IS + DSB ISH + + ... the existing workaround will expand this to: + + TLBI ALLE2IS + DSB ISH + TLBI ALLE2IS // additional + DSB ISH // additional + + ... whereas it is sufficient to have: + + TLBI ALLE2IS + DSB ISH + TLBI VALE1IS, XZR // additional + DSB ISH // additional + + As the additional TLBI doesn't have to match a specific earlier TLBI, + the additional TLBI can be implemented in separate code, with no + memory of the earlier TLBIs.
The additional TLBI can also use a + cheaper TLBI operation. + +5. The existing workaround is applied to both Stage-1 and Stage-2 TLB + invalidation, whereas for all relevant errata it is only necessary to + apply a workaround for Stage-1 invalidation. + + Architecturally, TLBI operations which invalidate only Stage-2 + information (e.g. IPAS2E1IS) are not required to invalidate TLB + entries which combine information from Stage-1 and Stage-2 + translation table entries, and consequently may not complete memory + accesses translated by those combined entries. In these cases, + completion of memory accesses is only guaranteed after subsequent + invalidation of Stage-1 information (e.g. VMALLE1IS). + +Taking the above points into account, this patch reworks the workaround +logic to reduce overhead: + +* New __tlbi_sync_s1ish() and __tlbi_sync_s1ish_hyp() functions are + added and used in place of any dsb(ish) which is used to complete + broadcast Stage-1 TLB maintenance. When the + ARM64_WORKAROUND_REPEAT_TLBI workaround is enabled, these helpers will + execute an additional TLBI;DSB sequence. + + For consistency, it might make sense to add __tlbi_sync_*() helpers + for local and stage 2 maintenance. For now I've left those with + open-coded dsb() to keep the diff small. + +* The duplication of TLBIs in __TLBI_0() and __TLBI_1() is removed. This + is no longer needed as the necessary synchronization will happen in + __tlbi_sync_s1ish() or __tlbi_sync_s1ish_hyp(). + +* The additional TLBI operation is chosen to have minimal impact: + + - __tlbi_sync_s1ish() uses "TLBI VALE1IS, XZR". This is only used at + EL1 or at EL2 with {E2H,TGE}=={1,1}, where it will target an unused + entry for the reserved ASID in the kernel's own translation regime, + and have no adverse effect. + + - __tlbi_sync_s1ish_hyp() uses "TLBI VALE2IS, XZR". This is only used + in hyp code, where it will target an unused entry in the hyp code's + TTBR0 mapping, and should have no adverse effect.
+ +* As __TLBI_0() and __TLBI_1() no longer replace each TLBI with a + TLBI;DSB;TLBI sequence, batching TLBIs is worthwhile, and there's no + need for arch_tlbbatch_should_defer() to consider + ARM64_WORKAROUND_REPEAT_TLBI. + +When building defconfig with GCC 15.1.0, compared to v6.19-rc1, this +patch saves ~1KiB of text, makes the vmlinux ~42KiB smaller, and makes +the resulting Image 64KiB smaller: + +| [mark@lakrids:~/src/linux]% size vmlinux-* +| text data bss dec hex filename +| 21179831 19660919 708216 41548966 279fca6 vmlinux-after +| 21181075 19660903 708216 41550194 27a0172 vmlinux-before +| [mark@lakrids:~/src/linux]% ls -l vmlinux-* +| -rwxr-xr-x 1 mark mark 157771472 Feb 4 12:05 vmlinux-after +| -rwxr-xr-x 1 mark mark 157815432 Feb 4 12:05 vmlinux-before +| [mark@lakrids:~/src/linux]% ls -l Image-* +| -rw-r--r-- 1 mark mark 41007616 Feb 4 12:05 Image-after +| -rw-r--r-- 1 mark mark 41073152 Feb 4 12:05 Image-before + +Signed-off-by: Mark Rutland +Cc: Catalin Marinas +Cc: Marc Zyngier +Cc: Oliver Upton +Cc: Ryan Roberts +Cc: Will Deacon +Signed-off-by: Will Deacon +Signed-off-by: Catalin Marinas +Signed-off-by: Greg Kroah-Hartman +[Mark: Backport to v6.1.y] +Signed-off-by: Mark Rutland +Signed-off-by: Sasha Levin +--- + arch/arm64/include/asm/tlbflush.h | 48 ++++++++++++++++++++++--------- + arch/arm64/kernel/sys_compat.c | 2 +- + arch/arm64/kvm/hyp/nvhe/tlb.c | 6 ++-- + arch/arm64/kvm/hyp/pgtable.c | 2 +- + arch/arm64/kvm/hyp/vhe/tlb.c | 6 ++-- + 5 files changed, 42 insertions(+), 22 deletions(-) + +diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h +index 2626a45849c241..289c3948d5b08a 100644 +--- a/arch/arm64/include/asm/tlbflush.h ++++ b/arch/arm64/include/asm/tlbflush.h +@@ -30,18 +30,10 @@ + */ + #define __TLBI_0(op, arg) asm (ARM64_ASM_PREAMBLE \ + "tlbi " #op "\n" \ +- ALTERNATIVE("nop\n nop", \ +- "dsb ish\n tlbi " #op, \ +- ARM64_WORKAROUND_REPEAT_TLBI, \ +- CONFIG_ARM64_WORKAROUND_REPEAT_TLBI) \ + : : ) + + 
#define __TLBI_1(op, arg) asm (ARM64_ASM_PREAMBLE \ + "tlbi " #op ", %x0\n" \ +- ALTERNATIVE("nop\n nop", \ +- "dsb ish\n tlbi " #op ", %x0", \ +- ARM64_WORKAROUND_REPEAT_TLBI, \ +- CONFIG_ARM64_WORKAROUND_REPEAT_TLBI) \ + : : "rZ" (arg)) + + #define __TLBI_N(op, arg, n, ...) __TLBI_##n(op, arg) +@@ -158,6 +150,34 @@ static inline unsigned long get_trans_granule(void) + #define __TLBI_RANGE_NUM(pages, scale) \ + ((((pages) >> (5 * (scale) + 1)) & TLBI_RANGE_MASK) - 1) + ++#define __repeat_tlbi_sync(op, arg...) \ ++do { \ ++ if (!alternative_has_feature_unlikely(ARM64_WORKAROUND_REPEAT_TLBI)) \ ++ break; \ ++ __tlbi(op, ##arg); \ ++ dsb(ish); \ ++} while (0) ++ ++/* ++ * Complete broadcast TLB maintenance issued by the host which invalidates ++ * stage 1 information in the host's own translation regime. ++ */ ++static inline void __tlbi_sync_s1ish(void) ++{ ++ dsb(ish); ++ __repeat_tlbi_sync(vale1is, 0); ++} ++ ++/* ++ * Complete broadcast TLB maintenance issued by hyp code which invalidates ++ * stage 1 translation information in any translation regime. 
++ */ ++static inline void __tlbi_sync_s1ish_hyp(void) ++{ ++ dsb(ish); ++ __repeat_tlbi_sync(vale2is, 0); ++} ++ + /* + * TLB Invalidation + * ================ +@@ -239,7 +259,7 @@ static inline void flush_tlb_all(void) + { + dsb(ishst); + __tlbi(vmalle1is); +- dsb(ish); ++ __tlbi_sync_s1ish(); + isb(); + } + +@@ -251,7 +271,7 @@ static inline void flush_tlb_mm(struct mm_struct *mm) + asid = __TLBI_VADDR(0, ASID(mm)); + __tlbi(aside1is, asid); + __tlbi_user(aside1is, asid); +- dsb(ish); ++ __tlbi_sync_s1ish(); + } + + static inline void flush_tlb_page_nosync(struct vm_area_struct *vma, +@@ -269,7 +289,7 @@ static inline void flush_tlb_page(struct vm_area_struct *vma, + unsigned long uaddr) + { + flush_tlb_page_nosync(vma, uaddr); +- dsb(ish); ++ __tlbi_sync_s1ish(); + } + + /* +@@ -357,7 +377,7 @@ static inline void __flush_tlb_range(struct vm_area_struct *vma, + } + scale++; + } +- dsb(ish); ++ __tlbi_sync_s1ish(); + } + + static inline void flush_tlb_range(struct vm_area_struct *vma, +@@ -386,7 +406,7 @@ static inline void flush_tlb_kernel_range(unsigned long start, unsigned long end + dsb(ishst); + for (addr = start; addr < end; addr += 1 << (PAGE_SHIFT - 12)) + __tlbi(vaale1is, addr); +- dsb(ish); ++ __tlbi_sync_s1ish(); + isb(); + } + +@@ -400,7 +420,7 @@ static inline void __flush_tlb_kernel_pgtable(unsigned long kaddr) + + dsb(ishst); + __tlbi(vaae1is, addr); +- dsb(ish); ++ __tlbi_sync_s1ish(); + isb(); + } + #endif +diff --git a/arch/arm64/kernel/sys_compat.c b/arch/arm64/kernel/sys_compat.c +index df14336c3a29cf..2bc2ac91d79e39 100644 +--- a/arch/arm64/kernel/sys_compat.c ++++ b/arch/arm64/kernel/sys_compat.c +@@ -37,7 +37,7 @@ __do_compat_cache_op(unsigned long start, unsigned long end) + * We pick the reserved-ASID to minimise the impact. 
+ */ + __tlbi(aside1is, __TLBI_VADDR(0, 0)); +- dsb(ish); ++ __tlbi_sync_s1ish(); + } + + ret = caches_clean_inval_user_pou(start, start + chunk); +diff --git a/arch/arm64/kvm/hyp/nvhe/tlb.c b/arch/arm64/kvm/hyp/nvhe/tlb.c +index 291789df24e3ee..76973e3b48a076 100644 +--- a/arch/arm64/kvm/hyp/nvhe/tlb.c ++++ b/arch/arm64/kvm/hyp/nvhe/tlb.c +@@ -81,7 +81,7 @@ void __kvm_tlb_flush_vmid_ipa(struct kvm_s2_mmu *mmu, + */ + dsb(ish); + __tlbi(vmalle1is); +- dsb(ish); ++ __tlbi_sync_s1ish_hyp(); + isb(); + + __tlb_switch_to_host(&cxt); +@@ -97,7 +97,7 @@ void __kvm_tlb_flush_vmid(struct kvm_s2_mmu *mmu) + __tlb_switch_to_guest(mmu, &cxt); + + __tlbi(vmalls12e1is); +- dsb(ish); ++ __tlbi_sync_s1ish_hyp(); + isb(); + + __tlb_switch_to_host(&cxt); +@@ -122,5 +122,5 @@ void __kvm_flush_vm_context(void) + { + dsb(ishst); + __tlbi(alle1is); +- dsb(ish); ++ __tlbi_sync_s1ish_hyp(); + } +diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c +index f0167dc7438f8a..d2838de92b4796 100644 +--- a/arch/arm64/kvm/hyp/pgtable.c ++++ b/arch/arm64/kvm/hyp/pgtable.c +@@ -486,7 +486,7 @@ static int hyp_unmap_walker(u64 addr, u64 end, u32 level, kvm_pte_t *ptep, + data->unmapped += granule; + } + +- dsb(ish); ++ __tlbi_sync_s1ish_hyp(); + isb(); + mm_ops->put_page(ptep); + +diff --git a/arch/arm64/kvm/hyp/vhe/tlb.c b/arch/arm64/kvm/hyp/vhe/tlb.c +index fc3fcd29ccc306..59aa22b48e9538 100644 +--- a/arch/arm64/kvm/hyp/vhe/tlb.c ++++ b/arch/arm64/kvm/hyp/vhe/tlb.c +@@ -105,7 +105,7 @@ void __kvm_tlb_flush_vmid_ipa(struct kvm_s2_mmu *mmu, + */ + dsb(ish); + __tlbi(vmalle1is); +- dsb(ish); ++ __tlbi_sync_s1ish_hyp(); + isb(); + + __tlb_switch_to_host(&cxt); +@@ -121,7 +121,7 @@ void __kvm_tlb_flush_vmid(struct kvm_s2_mmu *mmu) + __tlb_switch_to_guest(mmu, &cxt); + + __tlbi(vmalls12e1is); +- dsb(ish); ++ __tlbi_sync_s1ish_hyp(); + isb(); + + __tlb_switch_to_host(&cxt); +@@ -146,5 +146,5 @@ void __kvm_flush_vm_context(void) + { + dsb(ishst); + __tlbi(alle1is); +- dsb(ish); ++ 
__tlbi_sync_s1ish_hyp(); + } +-- +2.53.0 + diff --git a/queue-6.1/kvm-arm64-remove-vpipt-i-cache-handling.patch b/queue-6.1/kvm-arm64-remove-vpipt-i-cache-handling.patch new file mode 100644 index 0000000000..a10dc5872d --- /dev/null +++ b/queue-6.1/kvm-arm64-remove-vpipt-i-cache-handling.patch @@ -0,0 +1,120 @@ +From a3414e8d7a4b00a5e9cd3714a8a0064999776995 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 11 Jun 2026 14:44:49 +0100 +Subject: KVM: arm64: Remove VPIPT I-cache handling + +From: Marc Zyngier + +commit ced242ba9d7cb3571f6e0f165f643cb832d52148 upstream. + +We have some special handling for VPIPT I-cache in critical parts +of the cache and TLB maintenance. Remove it. + +Reviewed-by: Zenghui Yu +Reviewed-by: Anshuman Khandual +Signed-off-by: Marc Zyngier +Acked-by: Mark Rutland +Link: https://lore.kernel.org/r/20231204143606.1806432-2-maz@kernel.org +Signed-off-by: Will Deacon +[Mark: Backport to v6.1.y. VPIPT HW was never built; this is all dead code] +Signed-off-by: Mark Rutland +Signed-off-by: Sasha Levin +--- + arch/arm64/include/asm/kvm_mmu.h | 4 ++-- + arch/arm64/kvm/hyp/nvhe/tlb.c | 35 -------------------------------- + arch/arm64/kvm/hyp/vhe/tlb.c | 13 ------------ + 3 files changed, 2 insertions(+), 50 deletions(-) + +diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h +index 7784081088e78f..1495fcddd98e58 100644 +--- a/arch/arm64/include/asm/kvm_mmu.h ++++ b/arch/arm64/include/asm/kvm_mmu.h +@@ -214,8 +214,8 @@ static inline void __invalidate_icache_guest_page(void *va, size_t size) + if (icache_is_aliasing()) { + /* any kind of VIPT cache */ + icache_inval_all_pou(); +- } else if (is_kernel_in_hyp_mode() || !icache_is_vpipt()) { +- /* PIPT or VPIPT at EL2 (see comment in __kvm_tlb_flush_vmid_ipa) */ ++ } else { ++ /* PIPT */ + icache_inval_pou((unsigned long)va, (unsigned long)va + size); + } + } +diff --git a/arch/arm64/kvm/hyp/nvhe/tlb.c b/arch/arm64/kvm/hyp/nvhe/tlb.c +index d296d617f58963..291789df24e3ee 
100644 +--- a/arch/arm64/kvm/hyp/nvhe/tlb.c ++++ b/arch/arm64/kvm/hyp/nvhe/tlb.c +@@ -84,28 +84,6 @@ void __kvm_tlb_flush_vmid_ipa(struct kvm_s2_mmu *mmu, + dsb(ish); + isb(); + +- /* +- * If the host is running at EL1 and we have a VPIPT I-cache, +- * then we must perform I-cache maintenance at EL2 in order for +- * it to have an effect on the guest. Since the guest cannot hit +- * I-cache lines allocated with a different VMID, we don't need +- * to worry about junk out of guest reset (we nuke the I-cache on +- * VMID rollover), but we do need to be careful when remapping +- * executable pages for the same guest. This can happen when KSM +- * takes a CoW fault on an executable page, copies the page into +- * a page that was previously mapped in the guest and then needs +- * to invalidate the guest view of the I-cache for that page +- * from EL1. To solve this, we invalidate the entire I-cache when +- * unmapping a page from a guest if we have a VPIPT I-cache but +- * the host is running at EL1. As above, we could do better if +- * we had the VA. +- * +- * The moral of this story is: if you have a VPIPT I-cache, then +- * you should be running with VHE enabled. +- */ +- if (icache_is_vpipt()) +- icache_inval_all_pou(); +- + __tlb_switch_to_host(&cxt); + } + +@@ -144,18 +122,5 @@ void __kvm_flush_vm_context(void) + { + dsb(ishst); + __tlbi(alle1is); +- +- /* +- * VIPT and PIPT caches are not affected by VMID, so no maintenance +- * is necessary across a VMID rollover. +- * +- * VPIPT caches constrain lookup and maintenance to the active VMID, +- * so we need to invalidate lines with a stale VMID to avoid an ABA +- * race after multiple rollovers. 
+- * +- */ +- if (icache_is_vpipt()) +- asm volatile("ic ialluis"); +- + dsb(ish); + } +diff --git a/arch/arm64/kvm/hyp/vhe/tlb.c b/arch/arm64/kvm/hyp/vhe/tlb.c +index 24cef9b87f9e9c..fc3fcd29ccc306 100644 +--- a/arch/arm64/kvm/hyp/vhe/tlb.c ++++ b/arch/arm64/kvm/hyp/vhe/tlb.c +@@ -146,18 +146,5 @@ void __kvm_flush_vm_context(void) + { + dsb(ishst); + __tlbi(alle1is); +- +- /* +- * VIPT and PIPT caches are not affected by VMID, so no maintenance +- * is necessary across a VMID rollover. +- * +- * VPIPT caches constrain lookup and maintenance to the active VMID, +- * so we need to invalidate lines with a stale VMID to avoid an ABA +- * race after multiple rollovers. +- * +- */ +- if (icache_is_vpipt()) +- asm volatile("ic ialluis"); +- + dsb(ish); + } +-- +2.53.0 + diff --git a/queue-6.1/net-skbuff-fix-missing-zerocopy-reference-in-pskb_ca.patch b/queue-6.1/net-skbuff-fix-missing-zerocopy-reference-in-pskb_ca.patch new file mode 100644 index 0000000000..958c31acb4 --- /dev/null +++ b/queue-6.1/net-skbuff-fix-missing-zerocopy-reference-in-pskb_ca.patch @@ -0,0 +1,97 @@ +From f026c847954efa7c68cb8aac0df03c27476eef44 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Tue, 26 May 2026 11:12:39 +0700 +Subject: net: skbuff: fix missing zerocopy reference in pskb_carve helpers + +From: Minh Nguyen + +commit 98d0912e9f841e5529a5b89a972805f34cb1c69d upstream. + +pskb_carve_inside_header() and pskb_carve_inside_nonlinear() both copy +the old skb_shared_info header into a new buffer via memcpy(), which +includes the destructor_arg pointer (uarg) for MSG_ZEROCOPY skbs. +Neither function calls net_zcopy_get() for the new shinfo, creating an +unaccounted holder: every skb_shared_info with destructor_arg set will +call skb_zcopy_clear() once when freed, but the corresponding +net_zcopy_get() was never called for the new copy. Repeated calls +drive uarg->refcnt to zero prematurely, freeing ubuf_info_msgzc while +TX skbs still hold live destructor_arg pointers. 
+ +KASAN reports use-after-free on a freed ubuf_info_msgzc: + + BUG: KASAN: slab-use-after-free in skb_release_data+0x77b/0x810 + Read of size 8 at addr ffff88801574d3e8 by task poc/220 + + Call Trace: + skb_release_data+0x77b/0x810 + kfree_skb_list_reason+0x13e/0x610 + skb_release_data+0x4cd/0x810 + sk_skb_reason_drop+0xf3/0x340 + skb_queue_purge_reason+0x282/0x440 + rds_tcp_inc_free+0x1e/0x30 + rds_recvmsg+0x354/0x1780 + __sys_recvmsg+0xdf/0x180 + + Allocated by task 219: + msg_zerocopy_realloc+0x157/0x7b0 + tcp_sendmsg_locked+0x2892/0x3ba0 + + Freed by task 219: + ip_recv_error+0x74a/0xb10 + tcp_recvmsg+0x475/0x530 + +The skb consuming the late access still referenced the same uarg via +shinfo->destructor_arg copied by pskb_carve_inside_nonlinear() without +a refcount bump. This has been verified to be reliably exploitable: a +working proof-of-concept achieves full root privilege escalation from +an unprivileged local user on a default kernel configuration. + +The fix follows the pattern of pskb_expand_head() which has the same +memcpy/cloned structure. For pskb_carve_inside_header(), net_zcopy_get() +is placed after skb_orphan_frags() succeeds, so the orphan error path +needs no cleanup. For pskb_carve_inside_nonlinear(), net_zcopy_get() is +placed after all failure points and just before skb_release_data(), so +no error path needs cleanup at all -- matching pskb_expand_head() more +closely and avoiding the need for a balancing net_zcopy_put(). + +Fixes: 6fa01ccd8830 ("skbuff: Add pskb_extract() helper function") +Cc: stable@vger.kernel.org +Assisted-by: Claude:claude-sonnet-4-6 +Signed-off-by: Minh Nguyen +Reviewed-by: Willem de Bruijn +Link: https://patch.msgid.link/20260526041240.329462-1-minhnguyen.080505@gmail.com +Signed-off-by: Paolo Abeni +[Salvatore Bonaccorso: Backport for context changes, as 6.1.y has not +511a3eda2f8d ("net: dropreason: propagate drop_reason to +skb_release_data()")]. 
+Signed-off-by: Salvatore Bonaccorso +Signed-off-by: Sasha Levin +--- + net/core/skbuff.c | 4 ++++ + 1 file changed, 4 insertions(+) + +diff --git a/net/core/skbuff.c b/net/core/skbuff.c +index 41b2aaed7a14aa..f1f5b2b25f8522 100644 +--- a/net/core/skbuff.c ++++ b/net/core/skbuff.c +@@ -6247,6 +6247,8 @@ static int pskb_carve_inside_header(struct sk_buff *skb, const u32 off, + kfree(data); + return -ENOMEM; + } ++ if (skb_zcopy(skb)) ++ net_zcopy_get(skb_zcopy(skb)); + for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) + skb_frag_ref(skb, i); + if (skb_has_frag_list(skb)) +@@ -6396,6 +6398,8 @@ static int pskb_carve_inside_nonlinear(struct sk_buff *skb, const u32 off, + kfree(data); + return -ENOMEM; + } ++ if (skb_zcopy(skb)) ++ net_zcopy_get(skb_zcopy(skb)); + skb_release_data(skb); + + skb->head = data; +-- +2.53.0 + diff --git a/queue-6.1/series b/queue-6.1/series index 59f06c3bb8..07b90fdede 100644 --- a/queue-6.1/series +++ b/queue-6.1/series @@ -230,3 +230,8 @@ alsa-pcm-fix-wait-queue-list-corruption-in-snd_pcm_d.patch fs-ntfs3-return-error-for-inconsistent-extended-attr.patch usb-gadget-f_ncm-fix-net_device-lifecycle-with-devic.patch usb-gadget-u_ether-fix-null-pointer-deref-in-eth_get.patch +net-skbuff-fix-missing-zerocopy-reference-in-pskb_ca.patch +tap-free-page-on-error-paths-in-tap_get_user_xdp.patch +kvm-arm64-remove-vpipt-i-cache-handling.patch +arm64-tlb-allow-xzr-argument-to-tlbi-ops.patch +arm64-tlb-optimize-arm64_workaround_repeat_tlbi.patch diff --git a/queue-6.1/tap-free-page-on-error-paths-in-tap_get_user_xdp.patch b/queue-6.1/tap-free-page-on-error-paths-in-tap_get_user_xdp.patch new file mode 100644 index 0000000000..c005443666 --- /dev/null +++ b/queue-6.1/tap-free-page-on-error-paths-in-tap_get_user_xdp.patch @@ -0,0 +1,57 @@ +From 33edbfe76b534b501b60c6a6a99f16685ee6228e Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 11 Jun 2026 06:21:06 -0700 +Subject: tap: free page on error paths in tap_get_user_xdp() + +From: Weiming Shi + +[ 
Upstream commit 3bcf7aec6a9d16438f2cec29f5d7c8d5b8edf9b2 ] + +tap_get_user_xdp() rejects a frame shorter than ETH_HLEN with -EINVAL, +and returns -ENOMEM when build_skb() fails. Both paths jump to the err +label without freeing the page that vhost_net_build_xdp() allocated for +the frame. tap_sendmsg() discards the per-buffer return value and always +returns 0, so vhost_tx_batch() takes the success path and never frees +the page; each rejected frame in a batch leaks one page-frag chunk. + +Free the page on both error paths, before the skb is built. This is the +tap counterpart of the same leak in tun_xdp_one(). + +Fixes: 0efac27791ee ("tap: accept an array of XDP buffs through sendmsg()") +Fixes: ed7f2afdd0e0 ("tap: add missing verification for short frame") +Reported-by: Xiang Mei +Signed-off-by: Weiming Shi +Reviewed-by: Dongli Zhang +Reviewed-by: Willem de Bruijn +Link: https://patch.msgid.link/20260521163230.1478627-2-bestswngs@gmail.com +Signed-off-by: Jakub Kicinski +(cherry picked from commit 3bcf7aec6a9d16438f2cec29f5d7c8d5b8edf9b2) +Signed-off-by: Harshit Mogalapalli +Signed-off-by: Sasha Levin +--- + drivers/net/tap.c | 2 ++ + 1 file changed, 2 insertions(+) + +diff --git a/drivers/net/tap.c b/drivers/net/tap.c +index f8e7b163810de6..15ab71f5288ac3 100644 +--- a/drivers/net/tap.c ++++ b/drivers/net/tap.c +@@ -1157,6 +1157,7 @@ static int tap_get_user_xdp(struct tap_queue *q, struct xdp_buff *xdp) + int err, depth; + + if (unlikely(xdp->data_end - xdp->data < ETH_HLEN)) { ++ put_page(virt_to_head_page(xdp->data)); + err = -EINVAL; + goto err; + } +@@ -1166,6 +1167,7 @@ static int tap_get_user_xdp(struct tap_queue *q, struct xdp_buff *xdp) + + skb = build_skb(xdp->data_hard_start, buflen); + if (!skb) { ++ put_page(virt_to_head_page(xdp->data)); + err = -ENOMEM; + goto err; + } +-- +2.53.0 + diff --git a/queue-6.12/arm64-tlb-allow-xzr-argument-to-tlbi-ops.patch b/queue-6.12/arm64-tlb-allow-xzr-argument-to-tlbi-ops.patch new file mode 100644 index 
0000000000..63e8c2cac8 --- /dev/null +++ b/queue-6.12/arm64-tlb-allow-xzr-argument-to-tlbi-ops.patch @@ -0,0 +1,62 @@ +From 5cc6a38c30753a8f05a6be51fa4944b02fb59424 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 11 Jun 2026 14:40:23 +0100 +Subject: arm64: tlb: Allow XZR argument to TLBI ops + +From: Mark Rutland + +commit bfd9c931d19aa59fb8371d557774fa169b15db9a upstream. + +The TLBI instruction accepts XZR as a register argument, and for TLBI +operations with a register argument, there is no functional difference +between using XZR or another GPR which contains zeroes. Operations +without a register argument are encoded as if XZR were used. + +Allow the __TLBI_1() macro to use XZR when a register argument is all +zeroes. + +Today this only results in a trivial code saving in +__do_compat_cache_op()'s workaround for Neoverse-N1 erratum #1542419. In +subsequent patches this pattern will be used more generally. + +There should be no functional change as a result of this patch. + +Signed-off-by: Mark Rutland +Cc: Catalin Marinas +Cc: Marc Zyngier +Cc: Oliver Upton +Cc: Ryan Roberts +Cc: Will Deacon +Signed-off-by: Will Deacon +Signed-off-by: Catalin Marinas +Signed-off-by: Greg Kroah-Hartman +[Mark: Backport to v6.12.y] +Signed-off-by: Mark Rutland +Signed-off-by: Sasha Levin +--- + arch/arm64/include/asm/tlbflush.h | 6 +++--- + 1 file changed, 3 insertions(+), 3 deletions(-) + +diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h +index 5f12cdc2b9671a..dd802d58b39436 100644 +--- a/arch/arm64/include/asm/tlbflush.h ++++ b/arch/arm64/include/asm/tlbflush.h +@@ -38,12 +38,12 @@ + : : ) + + #define __TLBI_1(op, arg) asm (ARM64_ASM_PREAMBLE \ +- "tlbi " #op ", %0\n" \ ++ "tlbi " #op ", %x0\n" \ + ALTERNATIVE("nop\n nop", \ +- "dsb ish\n tlbi " #op ", %0", \ ++ "dsb ish\n tlbi " #op ", %x0", \ + ARM64_WORKAROUND_REPEAT_TLBI, \ + CONFIG_ARM64_WORKAROUND_REPEAT_TLBI) \ +- : : "r" (arg)) ++ : : "rZ" (arg)) + + #define __TLBI_N(op, arg, 
n, ...) __TLBI_##n(op, arg) + +-- +2.53.0 + diff --git a/queue-6.12/arm64-tlb-optimize-arm64_workaround_repeat_tlbi.patch b/queue-6.12/arm64-tlb-optimize-arm64_workaround_repeat_tlbi.patch new file mode 100644 index 0000000000..f928aa2fec --- /dev/null +++ b/queue-6.12/arm64-tlb-optimize-arm64_workaround_repeat_tlbi.patch @@ -0,0 +1,456 @@ +From aa4de9abf6780db4b481bab6b76912fe2118e061 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 11 Jun 2026 14:40:24 +0100 +Subject: arm64: tlb: Optimize ARM64_WORKAROUND_REPEAT_TLBI + +From: Mark Rutland + +commit a8f78680ee6bf795086384e8aea159a52814f827 upstream. + +The ARM64_WORKAROUND_REPEAT_TLBI workaround is used to mitigate several +errata where broadcast TLBI;DSB sequences don't provide all the +architecturally required synchronization. The workaround performs more +work than necessary, and can have significant overhead. This patch +optimizes the workaround, as explained below. + +The workaround was originally added for Qualcomm Falkor erratum 1009 in +commit: + + d9ff80f83ecb ("arm64: Work around Falkor erratum 1009") + +As noted in the message for that commit, the workaround is applied even +in cases where it is not strictly necessary. + +The workaround was later reused without changes for: + +* Arm Cortex-A76 erratum #1286807 + SDEN v33: https://developer.arm.com/documentation/SDEN-885749/33-0/ + +* Arm Cortex-A55 erratum #2441007 + SDEN v16: https://developer.arm.com/documentation/SDEN-859338/1600/ + +* Arm Cortex-A510 erratum #2441009 + SDEN v19: https://developer.arm.com/documentation/SDEN-1873351/1900/ + +The important details to note are as follows: + +1. All relevant errata only affect the ordering and/or completion of + memory accesses which have been translated by an invalidated TLB + entry. The actual invalidation of TLB entries is unaffected. + +2. 
The existing workaround is applied to both broadcast and local TLB + invalidation, whereas for all relevant errata it is only necessary to + apply a workaround for broadcast invalidation. + +3. The existing workaround replaces every TLBI with a TLBI;DSB;TLBI + sequence, whereas for all relevant errata it is only necessary to + execute a single additional TLBI;DSB sequence after any number of + TLBIs are completed by a DSB. + + For example, for a sequence of batched TLBIs: + + TLBI [, ] + TLBI [, ] + TLBI [, ] + DSB ISH + + ... the existing workaround will expand this to: + + TLBI [, ] + DSB ISH // additional + TLBI [, ] // additional + TLBI [, ] + DSB ISH // additional + TLBI [, ] // additional + TLBI [, ] + DSB ISH // additional + TLBI [, ] // additional + DSB ISH + + ... whereas it is sufficient to have: + + TLBI [, ] + TLBI [, ] + TLBI [, ] + DSB ISH + TLBI [, ] // additional + DSB ISH // additional + + Using a single additional TLBI and DSB at the end of the sequence can + have significantly lower overhead as each DSB which completes a TLBI + must synchronize with other PEs in the system, with potential + performance effects both locally and system-wide. + +4. The existing workaround repeats each specific TLBI operation, whereas + for all relevant errata it is sufficient for the additional TLBI to + use *any* operation which will be broadcast, regardless of which + translation regime or stage of translation the operation applies to. + + For example, for a single TLBI: + + TLBI ALLE2IS + DSB ISH + + ... the existing workaround will expand this to: + + TLBI ALLE2IS + DSB ISH + TLBI ALLE2IS // additional + DSB ISH // additional + + ... whereas it is sufficient to have: + + TLBI ALLE2IS + DSB ISH + TLBI VALE1IS, XZR // additional + DSB ISH // additional + + As the additional TLBI doesn't have to match a specific earlier TLBI, + the additional TLBI can be implemented in separate code, with no + memory of the earlier TLBIs.
The additional TLBI can also use a + cheaper TLBI operation. + +5. The existing workaround is applied to both Stage-1 and Stage-2 TLB + invalidation, whereas for all relevant errata it is only necessary to + apply a workaround for Stage-1 invalidation. + + Architecturally, TLBI operations which invalidate only Stage-2 + information (e.g. IPAS2E1IS) are not required to invalidate TLB + entries which combine information from Stage-1 and Stage-2 + translation table entries, and consequently may not complete memory + accesses translated by those combined entries. In these cases, + completion of memory accesses is only guaranteed after subsequent + invalidation of Stage-1 information (e.g. VMALLE1IS). + +Taking the above points into account, this patch reworks the workaround +logic to reduce overhead: + +* New __tlbi_sync_s1ish() and __tlbi_sync_s1ish_hyp() functions are + added and used in place of any dsb(ish) which is used to complete + broadcast Stage-1 TLB maintenance. When the + ARM64_WORKAROUND_REPEAT_TLBI workaround is enabled, these helpers will + execute an additional TLBI;DSB sequence. + + For consistency, it might make sense to add __tlbi_sync_*() helpers + for local and stage 2 maintenance. For now I've left those with + open-coded dsb() to keep the diff small. + +* The duplication of TLBIs in __TLBI_0() and __TLBI_1() is removed. This + is no longer needed as the necessary synchronization will happen in + __tlbi_sync_s1ish() or __tlbi_sync_s1ish_hyp(). + +* The additional TLBI operation is chosen to have minimal impact: + + - __tlbi_sync_s1ish() uses "TLBI VALE1IS, XZR". This is only used at + EL1 or at EL2 with {E2H,TGE}=={1,1}, where it will target an unused + entry for the reserved ASID in the kernel's own translation regime, + and have no adverse effect. + + - __tlbi_sync_s1ish_hyp() uses "TLBI VALE2IS, XZR". This is only used + in hyp code, where it will target an unused entry in the hyp code's + TTBR0 mapping, and should have no adverse effect.
+ +* As __TLBI_0() and __TLBI_1() no longer replace each TLBI with a + TLBI;DSB;TLBI sequence, batching TLBIs is worthwhile, and there's no + need for arch_tlbbatch_should_defer() to consider + ARM64_WORKAROUND_REPEAT_TLBI. + +When building defconfig with GCC 15.1.0, compared to v6.19-rc1, this +patch saves ~1KiB of text, makes the vmlinux ~42KiB smaller, and makes +the resulting Image 64KiB smaller: + +| [mark@lakrids:~/src/linux]% size vmlinux-* +| text data bss dec hex filename +| 21179831 19660919 708216 41548966 279fca6 vmlinux-after +| 21181075 19660903 708216 41550194 27a0172 vmlinux-before +| [mark@lakrids:~/src/linux]% ls -l vmlinux-* +| -rwxr-xr-x 1 mark mark 157771472 Feb 4 12:05 vmlinux-after +| -rwxr-xr-x 1 mark mark 157815432 Feb 4 12:05 vmlinux-before +| [mark@lakrids:~/src/linux]% ls -l Image-* +| -rw-r--r-- 1 mark mark 41007616 Feb 4 12:05 Image-after +| -rw-r--r-- 1 mark mark 41073152 Feb 4 12:05 Image-before + +Signed-off-by: Mark Rutland +Cc: Catalin Marinas +Cc: Marc Zyngier +Cc: Oliver Upton +Cc: Ryan Roberts +Cc: Will Deacon +Signed-off-by: Will Deacon +Signed-off-by: Catalin Marinas +Signed-off-by: Greg Kroah-Hartman +[Mark: Backport to v6.12.y] +Signed-off-by: Mark Rutland +Signed-off-by: Sasha Levin +--- + arch/arm64/include/asm/tlbflush.h | 59 ++++++++++++++++++------------- + arch/arm64/kernel/sys_compat.c | 2 +- + arch/arm64/kvm/hyp/nvhe/mm.c | 2 +- + arch/arm64/kvm/hyp/nvhe/tlb.c | 8 ++--- + arch/arm64/kvm/hyp/pgtable.c | 2 +- + arch/arm64/kvm/hyp/vhe/tlb.c | 10 +++--- + 6 files changed, 47 insertions(+), 36 deletions(-) + +diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h +index dd802d58b39436..2c59b71b99e8ad 100644 +--- a/arch/arm64/include/asm/tlbflush.h ++++ b/arch/arm64/include/asm/tlbflush.h +@@ -31,18 +31,10 @@ + */ + #define __TLBI_0(op, arg) asm (ARM64_ASM_PREAMBLE \ + "tlbi " #op "\n" \ +- ALTERNATIVE("nop\n nop", \ +- "dsb ish\n tlbi " #op, \ +- ARM64_WORKAROUND_REPEAT_TLBI, \ +- 
CONFIG_ARM64_WORKAROUND_REPEAT_TLBI) \ + : : ) + + #define __TLBI_1(op, arg) asm (ARM64_ASM_PREAMBLE \ + "tlbi " #op ", %x0\n" \ +- ALTERNATIVE("nop\n nop", \ +- "dsb ish\n tlbi " #op ", %x0", \ +- ARM64_WORKAROUND_REPEAT_TLBI, \ +- CONFIG_ARM64_WORKAROUND_REPEAT_TLBI) \ + : : "rZ" (arg)) + + #define __TLBI_N(op, arg, n, ...) __TLBI_##n(op, arg) +@@ -181,6 +173,34 @@ static inline unsigned long get_trans_granule(void) + (__pages >> (5 * (scale) + 1)) - 1; \ + }) + ++#define __repeat_tlbi_sync(op, arg...) \ ++do { \ ++ if (!alternative_has_cap_unlikely(ARM64_WORKAROUND_REPEAT_TLBI)) \ ++ break; \ ++ __tlbi(op, ##arg); \ ++ dsb(ish); \ ++} while (0) ++ ++/* ++ * Complete broadcast TLB maintenance issued by the host which invalidates ++ * stage 1 information in the host's own translation regime. ++ */ ++static inline void __tlbi_sync_s1ish(void) ++{ ++ dsb(ish); ++ __repeat_tlbi_sync(vale1is, 0); ++} ++ ++/* ++ * Complete broadcast TLB maintenance issued by hyp code which invalidates ++ * stage 1 translation information in any translation regime. 
++ */ ++static inline void __tlbi_sync_s1ish_hyp(void) ++{ ++ dsb(ish); ++ __repeat_tlbi_sync(vale2is, 0); ++} ++ + /* + * TLB Invalidation + * ================ +@@ -266,7 +286,7 @@ static inline void flush_tlb_all(void) + { + dsb(ishst); + __tlbi(vmalle1is); +- dsb(ish); ++ __tlbi_sync_s1ish(); + isb(); + } + +@@ -278,7 +298,7 @@ static inline void flush_tlb_mm(struct mm_struct *mm) + asid = __TLBI_VADDR(0, ASID(mm)); + __tlbi(aside1is, asid); + __tlbi_user(aside1is, asid); +- dsb(ish); ++ __tlbi_sync_s1ish(); + mmu_notifier_arch_invalidate_secondary_tlbs(mm, 0, -1UL); + } + +@@ -305,20 +325,11 @@ static inline void flush_tlb_page(struct vm_area_struct *vma, + unsigned long uaddr) + { + flush_tlb_page_nosync(vma, uaddr); +- dsb(ish); ++ __tlbi_sync_s1ish(); + } + + static inline bool arch_tlbbatch_should_defer(struct mm_struct *mm) + { +- /* +- * TLB flush deferral is not required on systems which are affected by +- * ARM64_WORKAROUND_REPEAT_TLBI, as __tlbi()/__tlbi_user() implementation +- * will have two consecutive TLBI instructions with a dsb(ish) in between +- * defeating the purpose (i.e save overall 'dsb ish' cost). 
+- */ +- if (alternative_has_cap_unlikely(ARM64_WORKAROUND_REPEAT_TLBI)) +- return false; +- + return true; + } + +@@ -352,7 +363,7 @@ static inline void arch_flush_tlb_batched_pending(struct mm_struct *mm) + */ + static inline void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch) + { +- dsb(ish); ++ __tlbi_sync_s1ish(); + } + + /* +@@ -478,7 +489,7 @@ static inline void __flush_tlb_range(struct vm_area_struct *vma, + { + __flush_tlb_range_nosync(vma, start, end, stride, + last_level, tlb_level); +- dsb(ish); ++ __tlbi_sync_s1ish(); + } + + static inline void flush_tlb_range(struct vm_area_struct *vma, +@@ -508,7 +519,7 @@ static inline void flush_tlb_kernel_range(unsigned long start, unsigned long end + dsb(ishst); + for (addr = start; addr < end; addr += 1 << (PAGE_SHIFT - 12)) + __tlbi(vaale1is, addr); +- dsb(ish); ++ __tlbi_sync_s1ish(); + isb(); + } + +@@ -522,7 +533,7 @@ static inline void __flush_tlb_kernel_pgtable(unsigned long kaddr) + + dsb(ishst); + __tlbi(vaae1is, addr); +- dsb(ish); ++ __tlbi_sync_s1ish(); + isb(); + } + #endif +diff --git a/arch/arm64/kernel/sys_compat.c b/arch/arm64/kernel/sys_compat.c +index 4a609e9b65de03..b9d4998c97efac 100644 +--- a/arch/arm64/kernel/sys_compat.c ++++ b/arch/arm64/kernel/sys_compat.c +@@ -37,7 +37,7 @@ __do_compat_cache_op(unsigned long start, unsigned long end) + * We pick the reserved-ASID to minimise the impact. 
+ */ + __tlbi(aside1is, __TLBI_VADDR(0, 0)); +- dsb(ish); ++ __tlbi_sync_s1ish(); + } + + ret = caches_clean_inval_user_pou(start, start + chunk); +diff --git a/arch/arm64/kvm/hyp/nvhe/mm.c b/arch/arm64/kvm/hyp/nvhe/mm.c +index 8850b591d77518..cd58fbebd07393 100644 +--- a/arch/arm64/kvm/hyp/nvhe/mm.c ++++ b/arch/arm64/kvm/hyp/nvhe/mm.c +@@ -261,7 +261,7 @@ static void fixmap_clear_slot(struct hyp_fixmap_slot *slot) + */ + dsb(ishst); + __tlbi_level(vale2is, __TLBI_VADDR(addr, 0), KVM_PGTABLE_LAST_LEVEL); +- dsb(ish); ++ __tlbi_sync_s1ish_hyp(); + isb(); + } + +diff --git a/arch/arm64/kvm/hyp/nvhe/tlb.c b/arch/arm64/kvm/hyp/nvhe/tlb.c +index 48da9ca9763f6e..3dc1ce0d27fe66 100644 +--- a/arch/arm64/kvm/hyp/nvhe/tlb.c ++++ b/arch/arm64/kvm/hyp/nvhe/tlb.c +@@ -169,7 +169,7 @@ void __kvm_tlb_flush_vmid_ipa(struct kvm_s2_mmu *mmu, + */ + dsb(ish); + __tlbi(vmalle1is); +- dsb(ish); ++ __tlbi_sync_s1ish_hyp(); + isb(); + + exit_vmid_context(&cxt); +@@ -226,7 +226,7 @@ void __kvm_tlb_flush_vmid_range(struct kvm_s2_mmu *mmu, + + dsb(ish); + __tlbi(vmalle1is); +- dsb(ish); ++ __tlbi_sync_s1ish_hyp(); + isb(); + + exit_vmid_context(&cxt); +@@ -240,7 +240,7 @@ void __kvm_tlb_flush_vmid(struct kvm_s2_mmu *mmu) + enter_vmid_context(mmu, &cxt, false); + + __tlbi(vmalls12e1is); +- dsb(ish); ++ __tlbi_sync_s1ish_hyp(); + isb(); + + exit_vmid_context(&cxt); +@@ -266,5 +266,5 @@ void __kvm_flush_vm_context(void) + /* Same remark as in enter_vmid_context() */ + dsb(ish); + __tlbi(alle1is); +- dsb(ish); ++ __tlbi_sync_s1ish_hyp(); + } +diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c +index b11bcebac908a7..deabc21caae370 100644 +--- a/arch/arm64/kvm/hyp/pgtable.c ++++ b/arch/arm64/kvm/hyp/pgtable.c +@@ -497,7 +497,7 @@ static int hyp_unmap_walker(const struct kvm_pgtable_visit_ctx *ctx, + *unmapped += granule; + } + +- dsb(ish); ++ __tlbi_sync_s1ish_hyp(); + isb(); + mm_ops->put_page(ctx->ptep); + +diff --git a/arch/arm64/kvm/hyp/vhe/tlb.c 
b/arch/arm64/kvm/hyp/vhe/tlb.c +index 3d50a1bd2bdbcb..0f2aea1b42888a 100644 +--- a/arch/arm64/kvm/hyp/vhe/tlb.c ++++ b/arch/arm64/kvm/hyp/vhe/tlb.c +@@ -115,7 +115,7 @@ void __kvm_tlb_flush_vmid_ipa(struct kvm_s2_mmu *mmu, + */ + dsb(ish); + __tlbi(vmalle1is); +- dsb(ish); ++ __tlbi_sync_s1ish_hyp(); + isb(); + + exit_vmid_context(&cxt); +@@ -176,7 +176,7 @@ void __kvm_tlb_flush_vmid_range(struct kvm_s2_mmu *mmu, + + dsb(ish); + __tlbi(vmalle1is); +- dsb(ish); ++ __tlbi_sync_s1ish_hyp(); + isb(); + + exit_vmid_context(&cxt); +@@ -192,7 +192,7 @@ void __kvm_tlb_flush_vmid(struct kvm_s2_mmu *mmu) + enter_vmid_context(mmu, &cxt); + + __tlbi(vmalls12e1is); +- dsb(ish); ++ __tlbi_sync_s1ish_hyp(); + isb(); + + exit_vmid_context(&cxt); +@@ -217,7 +217,7 @@ void __kvm_flush_vm_context(void) + { + dsb(ishst); + __tlbi(alle1is); +- dsb(ish); ++ __tlbi_sync_s1ish_hyp(); + } + + /* +@@ -358,7 +358,7 @@ int __kvm_tlbi_s1e2(struct kvm_s2_mmu *mmu, u64 va, u64 sys_encoding) + default: + ret = -EINVAL; + } +- dsb(ish); ++ __tlbi_sync_s1ish_hyp(); + isb(); + + if (mmu) +-- +2.53.0 + diff --git a/queue-6.12/series b/queue-6.12/series index e029c24f50..fe63823fa8 100644 --- a/queue-6.12/series +++ b/queue-6.12/series @@ -65,3 +65,6 @@ ima-kexec-skip-ima-segment-validation-after-kexec-so.patch ima-kexec-move-ima-log-copy-from-kexec-load-to-execu.patch spi-cadence-quadspi-fix-unclocked-access-on-unbind.patch tools-rv-fix-cleanup-after-failed-trace-setup.patch +tap-free-page-on-error-paths-in-tap_get_user_xdp.patch +arm64-tlb-allow-xzr-argument-to-tlbi-ops.patch +arm64-tlb-optimize-arm64_workaround_repeat_tlbi.patch diff --git a/queue-6.12/tap-free-page-on-error-paths-in-tap_get_user_xdp.patch b/queue-6.12/tap-free-page-on-error-paths-in-tap_get_user_xdp.patch new file mode 100644 index 0000000000..01b44995d0 --- /dev/null +++ b/queue-6.12/tap-free-page-on-error-paths-in-tap_get_user_xdp.patch @@ -0,0 +1,57 @@ +From c612ff567c6fa151b7fe76ebd8028151950eb3c7 Mon Sep 17 00:00:00 2001 
+From: Sasha Levin +Date: Thu, 11 Jun 2026 06:21:06 -0700 +Subject: tap: free page on error paths in tap_get_user_xdp() + +From: Weiming Shi + +[ Upstream commit 3bcf7aec6a9d16438f2cec29f5d7c8d5b8edf9b2 ] + +tap_get_user_xdp() rejects a frame shorter than ETH_HLEN with -EINVAL, +and returns -ENOMEM when build_skb() fails. Both paths jump to the err +label without freeing the page that vhost_net_build_xdp() allocated for +the frame. tap_sendmsg() discards the per-buffer return value and always +returns 0, so vhost_tx_batch() takes the success path and never frees +the page; each rejected frame in a batch leaks one page-frag chunk. + +Free the page on both error paths, before the skb is built. This is the +tap counterpart of the same leak in tun_xdp_one(). + +Fixes: 0efac27791ee ("tap: accept an array of XDP buffs through sendmsg()") +Fixes: ed7f2afdd0e0 ("tap: add missing verification for short frame") +Reported-by: Xiang Mei +Signed-off-by: Weiming Shi +Reviewed-by: Dongli Zhang +Reviewed-by: Willem de Bruijn +Link: https://patch.msgid.link/20260521163230.1478627-2-bestswngs@gmail.com +Signed-off-by: Jakub Kicinski +(cherry picked from commit 3bcf7aec6a9d16438f2cec29f5d7c8d5b8edf9b2) +Signed-off-by: Harshit Mogalapalli +Signed-off-by: Sasha Levin +--- + drivers/net/tap.c | 2 ++ + 1 file changed, 2 insertions(+) + +diff --git a/drivers/net/tap.c b/drivers/net/tap.c +index 5ca6ecf0ce5fbc..c460b1f39136a5 100644 +--- a/drivers/net/tap.c ++++ b/drivers/net/tap.c +@@ -1177,6 +1177,7 @@ static int tap_get_user_xdp(struct tap_queue *q, struct xdp_buff *xdp) + int err, depth; + + if (unlikely(xdp->data_end - xdp->data < ETH_HLEN)) { ++ put_page(virt_to_head_page(xdp->data)); + err = -EINVAL; + goto err; + } +@@ -1186,6 +1187,7 @@ static int tap_get_user_xdp(struct tap_queue *q, struct xdp_buff *xdp) + + skb = build_skb(xdp->data_hard_start, buflen); + if (!skb) { ++ put_page(virt_to_head_page(xdp->data)); + err = -ENOMEM; + goto err; + } +-- +2.53.0 + diff --git 
a/queue-6.18/series b/queue-6.18/series index 03d57d5344..6178051f58 100644 --- a/queue-6.18/series +++ b/queue-6.18/series @@ -81,3 +81,4 @@ tools-rv-fix-substring-match-when-listing-container-.patch tools-rv-fix-cleanup-after-failed-trace-setup.patch verification-rvgen-fix-options-shared-among-commands.patch verification-rvgen-fix-ltl2k-writing-true-as-a-liter.patch +tap-free-page-on-error-paths-in-tap_get_user_xdp.patch diff --git a/queue-6.18/tap-free-page-on-error-paths-in-tap_get_user_xdp.patch b/queue-6.18/tap-free-page-on-error-paths-in-tap_get_user_xdp.patch new file mode 100644 index 0000000000..0233a52e33 --- /dev/null +++ b/queue-6.18/tap-free-page-on-error-paths-in-tap_get_user_xdp.patch @@ -0,0 +1,57 @@ +From c9e0360a32be156dffec09ff965815a0e7f597fe Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 11 Jun 2026 06:21:06 -0700 +Subject: tap: free page on error paths in tap_get_user_xdp() + +From: Weiming Shi + +[ Upstream commit 3bcf7aec6a9d16438f2cec29f5d7c8d5b8edf9b2 ] + +tap_get_user_xdp() rejects a frame shorter than ETH_HLEN with -EINVAL, +and returns -ENOMEM when build_skb() fails. Both paths jump to the err +label without freeing the page that vhost_net_build_xdp() allocated for +the frame. tap_sendmsg() discards the per-buffer return value and always +returns 0, so vhost_tx_batch() takes the success path and never frees +the page; each rejected frame in a batch leaks one page-frag chunk. + +Free the page on both error paths, before the skb is built. This is the +tap counterpart of the same leak in tun_xdp_one(). 
+ +Fixes: 0efac27791ee ("tap: accept an array of XDP buffs through sendmsg()") +Fixes: ed7f2afdd0e0 ("tap: add missing verification for short frame") +Reported-by: Xiang Mei +Signed-off-by: Weiming Shi +Reviewed-by: Dongli Zhang +Reviewed-by: Willem de Bruijn +Link: https://patch.msgid.link/20260521163230.1478627-2-bestswngs@gmail.com +Signed-off-by: Jakub Kicinski +(cherry picked from commit 3bcf7aec6a9d16438f2cec29f5d7c8d5b8edf9b2) +Signed-off-by: Harshit Mogalapalli +Signed-off-by: Sasha Levin +--- + drivers/net/tap.c | 2 ++ + 1 file changed, 2 insertions(+) + +diff --git a/drivers/net/tap.c b/drivers/net/tap.c +index 6fd3b14273b374..b51ce7af1b20f9 100644 +--- a/drivers/net/tap.c ++++ b/drivers/net/tap.c +@@ -1052,6 +1052,7 @@ static int tap_get_user_xdp(struct tap_queue *q, struct xdp_buff *xdp) + int err, depth; + + if (unlikely(xdp->data_end - xdp->data < ETH_HLEN)) { ++ put_page(virt_to_head_page(xdp->data)); + err = -EINVAL; + goto err; + } +@@ -1061,6 +1062,7 @@ static int tap_get_user_xdp(struct tap_queue *q, struct xdp_buff *xdp) + + skb = build_skb(xdp->data_hard_start, buflen); + if (!skb) { ++ put_page(virt_to_head_page(xdp->data)); + err = -ENOMEM; + goto err; + } +-- +2.53.0 + diff --git a/queue-6.6/arm64-tlb-allow-xzr-argument-to-tlbi-ops.patch b/queue-6.6/arm64-tlb-allow-xzr-argument-to-tlbi-ops.patch new file mode 100644 index 0000000000..2ad95cb6e2 --- /dev/null +++ b/queue-6.6/arm64-tlb-allow-xzr-argument-to-tlbi-ops.patch @@ -0,0 +1,62 @@ +From 335a2178bfb71e2b34a326aa5f8ddeb989671b26 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 11 Jun 2026 14:42:47 +0100 +Subject: arm64: tlb: Allow XZR argument to TLBI ops + +From: Mark Rutland + +commit bfd9c931d19aa59fb8371d557774fa169b15db9a upstream. + +The TLBI instruction accepts XZR as a register argument, and for TLBI +operations with a register argument, there is no functional difference +between using XZR or another GPR which contains zeroes. 
Operations +without a register argument are encoded as if XZR were used. + +Allow the __TLBI_1() macro to use XZR when a register argument is all +zeroes. + +Today this only results in a trivial code saving in +__do_compat_cache_op()'s workaround for Neoverse-N1 erratum #1542419. In +subsequent patches this pattern will be used more generally. + +There should be no functional change as a result of this patch. + +Signed-off-by: Mark Rutland +Cc: Catalin Marinas +Cc: Marc Zyngier +Cc: Oliver Upton +Cc: Ryan Roberts +Cc: Will Deacon +Signed-off-by: Will Deacon +Signed-off-by: Catalin Marinas +Signed-off-by: Greg Kroah-Hartman +[Mark: Backport to v6.6.y] +Signed-off-by: Mark Rutland +Signed-off-by: Sasha Levin +--- + arch/arm64/include/asm/tlbflush.h | 6 +++--- + 1 file changed, 3 insertions(+), 3 deletions(-) + +diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h +index 6eeb56b6fac13e..c8d8b9622369f0 100644 +--- a/arch/arm64/include/asm/tlbflush.h ++++ b/arch/arm64/include/asm/tlbflush.h +@@ -38,12 +38,12 @@ + : : ) + + #define __TLBI_1(op, arg) asm (ARM64_ASM_PREAMBLE \ +- "tlbi " #op ", %0\n" \ ++ "tlbi " #op ", %x0\n" \ + ALTERNATIVE("nop\n nop", \ +- "dsb ish\n tlbi " #op ", %0", \ ++ "dsb ish\n tlbi " #op ", %x0", \ + ARM64_WORKAROUND_REPEAT_TLBI, \ + CONFIG_ARM64_WORKAROUND_REPEAT_TLBI) \ +- : : "r" (arg)) ++ : : "rZ" (arg)) + + #define __TLBI_N(op, arg, n, ...) 
__TLBI_##n(op, arg) + +-- +2.53.0 + diff --git a/queue-6.6/arm64-tlb-optimize-arm64_workaround_repeat_tlbi.patch b/queue-6.6/arm64-tlb-optimize-arm64_workaround_repeat_tlbi.patch new file mode 100644 index 0000000000..503b97868e --- /dev/null +++ b/queue-6.6/arm64-tlb-optimize-arm64_workaround_repeat_tlbi.patch @@ -0,0 +1,446 @@ +From c4e2a8ccfa34ae3c554f5b665298833726d95c3e Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 11 Jun 2026 14:42:48 +0100 +Subject: arm64: tlb: Optimize ARM64_WORKAROUND_REPEAT_TLBI + +From: Mark Rutland + +commit a8f78680ee6bf795086384e8aea159a52814f827 upstream. + +The ARM64_WORKAROUND_REPEAT_TLBI workaround is used to mitigate several +errata where broadcast TLBI;DSB sequences don't provide all the +architecturally required synchronization. The workaround performs more +work than necessary, and can have significant overhead. This patch +optimizes the workaround, as explained below. + +The workaround was originally added for Qualcomm Falkor erratum 1009 in +commit: + + d9ff80f83ecb ("arm64: Work around Falkor erratum 1009") + +As noted in the message for that commit, the workaround is applied even +in cases where it is not strictly necessary. + +The workaround was later reused without changes for: + +* Arm Cortex-A76 erratum #1286807 + SDEN v33: https://developer.arm.com/documentation/SDEN-885749/33-0/ + +* Arm Cortex-A55 erratum #2441007 + SDEN v16: https://developer.arm.com/documentation/SDEN-859338/1600/ + +* Arm Cortex-A510 erratum #2441009 + SDEN v19: https://developer.arm.com/documentation/SDEN-1873351/1900/ + +The important details to note are as follows: + +1. All relevant errata only affect the ordering and/or completion of + memory accesses which have been translated by an invalidated TLB + entry. The actual invalidation of TLB entries is unaffected. + +2. 
The existing workaround is applied to both broadcast and local TLB + invalidation, whereas for all relevant errata it is only necessary to + apply a workaround for broadcast invalidation. + +3. The existing workaround replaces every TLBI with a TLBI;DSB;TLBI + sequence, whereas for all relevant errata it is only necessary to + execute a single additional TLBI;DSB sequence after any number of + TLBIs are completed by a DSB. + + For example, for a sequence of batched TLBIs: + + TLBI [, ] + TLBI [, ] + TLBI [, ] + DSB ISH + + ... the existing workaround will expand this to: + + TLBI [, ] + DSB ISH // additional + TLBI [, ] // additional + TLBI [, ] + DSB ISH // additional + TLBI [, ] // additional + TLBI [, ] + DSB ISH // additional + TLBI [, ] // additional + DSB ISH + + ... whereas it is sufficient to have: + + TLBI [, ] + TLBI [, ] + TLBI [, ] + DSB ISH + TLBI [, ] // additional + DSB ISH // additional + + Using a single additional TLBI and DSB at the end of the sequence can + have significantly lower overhead as each DSB which completes a TLBI + must synchronize with other PEs in the system, with potential + performance effects both locally and system-wide. + +4. The existing workaround repeats each specific TLBI operation, whereas + for all relevant errata it is sufficient for the additional TLBI to + use *any* operation which will be broadcast, regardless of which + translation regime or stage of translation the operation applies to. + + For example, for a single TLBI: + + TLBI ALLE2IS + DSB ISH + + ... the existing workaround will expand this to: + + TLBI ALLE2IS + DSB ISH + TLBI ALLE2IS // additional + DSB ISH // additional + + ... whereas it is sufficient to have: + + TLBI ALLE2IS + DSB ISH + TLBI VALE1IS, XZR // additional + DSB ISH // additional + + As the additional TLBI doesn't have to match a specific earlier TLBI, + the additional TLBI can be implemented in separate code, with no + memory of the earlier TLBIs.
The additional TLBI can also use a + cheaper TLBI operation. + +5. The existing workaround is applied to both Stage-1 and Stage-2 TLB + invalidation, whereas for all relevant errata it is only necessary to + apply a workaround for Stage-1 invalidation. + + Architecturally, TLBI operations which invalidate only Stage-2 + information (e.g. IPAS2E1IS) are not required to invalidate TLB + entries which combine information from Stage-1 and Stage-2 + translation table entries, and consequently may not complete memory + accesses translated by those combined entries. In these cases, + completion of memory accesses is only guaranteed after subsequent + invalidation of Stage-1 information (e.g. VMALLE1IS). + +Taking the above points into account, this patch reworks the workaround +logic to reduce overhead: + +* New __tlbi_sync_s1ish() and __tlbi_sync_s1ish_hyp() functions are + added and used in place of any dsb(ish) which is used to complete + broadcast Stage-1 TLB maintenance. When the + ARM64_WORKAROUND_REPEAT_TLBI workaround is enabled, these helpers will + execute an additional TLBI;DSB sequence. + + For consistency, it might make sense to add __tlbi_sync_*() helpers + for local and stage 2 maintenance. For now I've left those with + open-coded dsb() to keep the diff small. + +* The duplication of TLBIs in __TLBI_0() and __TLBI_1() is removed. This + is no longer needed as the necessary synchronization will happen in + __tlbi_sync_s1ish() or __tlbi_sync_s1ish_hyp(). + +* The additional TLBI operation is chosen to have minimal impact: + + - __tlbi_sync_s1ish() uses "TLBI VALE1IS, XZR". This is only used at + EL1 or at EL2 with {E2H,TGE}=={1,1}, where it will target an unused + entry for the reserved ASID in the kernel's own translation regime, + and have no adverse effect. + + - __tlbi_sync_s1ish_hyp() uses "TLBI VALE2IS, XZR". This is only used + in hyp code, where it will target an unused entry in the hyp code's + TTBR0 mapping, and should have no adverse effect.
+ +* As __TLBI_0() and __TLBI_1() no longer replace each TLBI with a + TLBI;DSB;TLBI sequence, batching TLBIs is worthwhile, and there's no + need for arch_tlbbatch_should_defer() to consider + ARM64_WORKAROUND_REPEAT_TLBI. + +When building defconfig with GCC 15.1.0, compared to v6.19-rc1, this +patch saves ~1KiB of text, makes the vmlinux ~42KiB smaller, and makes +the resulting Image 64KiB smaller: + +| [mark@lakrids:~/src/linux]% size vmlinux-* +| text data bss dec hex filename +| 21179831 19660919 708216 41548966 279fca6 vmlinux-after +| 21181075 19660903 708216 41550194 27a0172 vmlinux-before +| [mark@lakrids:~/src/linux]% ls -l vmlinux-* +| -rwxr-xr-x 1 mark mark 157771472 Feb 4 12:05 vmlinux-after +| -rwxr-xr-x 1 mark mark 157815432 Feb 4 12:05 vmlinux-before +| [mark@lakrids:~/src/linux]% ls -l Image-* +| -rw-r--r-- 1 mark mark 41007616 Feb 4 12:05 Image-after +| -rw-r--r-- 1 mark mark 41073152 Feb 4 12:05 Image-before + +Signed-off-by: Mark Rutland +Cc: Catalin Marinas +Cc: Marc Zyngier +Cc: Oliver Upton +Cc: Ryan Roberts +Cc: Will Deacon +Signed-off-by: Will Deacon +Signed-off-by: Catalin Marinas +Signed-off-by: Greg Kroah-Hartman +[Mark: Backport to v6.6.y] +Signed-off-by: Mark Rutland +Signed-off-by: Sasha Levin +--- + arch/arm64/include/asm/tlbflush.h | 60 ++++++++++++++++++------------- + arch/arm64/kernel/sys_compat.c | 2 +- + arch/arm64/kvm/hyp/nvhe/mm.c | 2 +- + arch/arm64/kvm/hyp/nvhe/tlb.c | 8 ++--- + arch/arm64/kvm/hyp/pgtable.c | 2 +- + arch/arm64/kvm/hyp/vhe/tlb.c | 8 ++--- + 6 files changed, 46 insertions(+), 36 deletions(-) + +diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h +index c8d8b9622369f0..d96342d455a68a 100644 +--- a/arch/arm64/include/asm/tlbflush.h ++++ b/arch/arm64/include/asm/tlbflush.h +@@ -31,18 +31,10 @@ + */ + #define __TLBI_0(op, arg) asm (ARM64_ASM_PREAMBLE \ + "tlbi " #op "\n" \ +- ALTERNATIVE("nop\n nop", \ +- "dsb ish\n tlbi " #op, \ +- ARM64_WORKAROUND_REPEAT_TLBI, \ +- 
CONFIG_ARM64_WORKAROUND_REPEAT_TLBI) \ + : : ) + + #define __TLBI_1(op, arg) asm (ARM64_ASM_PREAMBLE \ + "tlbi " #op ", %x0\n" \ +- ALTERNATIVE("nop\n nop", \ +- "dsb ish\n tlbi " #op ", %x0", \ +- ARM64_WORKAROUND_REPEAT_TLBI, \ +- CONFIG_ARM64_WORKAROUND_REPEAT_TLBI) \ + : : "rZ" (arg)) + + #define __TLBI_N(op, arg, n, ...) __TLBI_##n(op, arg) +@@ -165,6 +157,34 @@ static inline unsigned long get_trans_granule(void) + (__pages >> (5 * (scale) + 1)) - 1; \ + }) + ++#define __repeat_tlbi_sync(op, arg...) \ ++do { \ ++ if (!alternative_has_cap_unlikely(ARM64_WORKAROUND_REPEAT_TLBI)) \ ++ break; \ ++ __tlbi(op, ##arg); \ ++ dsb(ish); \ ++} while (0) ++ ++/* ++ * Complete broadcast TLB maintenance issued by the host which invalidates ++ * stage 1 information in the host's own translation regime. ++ */ ++static inline void __tlbi_sync_s1ish(void) ++{ ++ dsb(ish); ++ __repeat_tlbi_sync(vale1is, 0); ++} ++ ++/* ++ * Complete broadcast TLB maintenance issued by hyp code which invalidates ++ * stage 1 translation information in any translation regime. 
++ */ ++static inline void __tlbi_sync_s1ish_hyp(void) ++{ ++ dsb(ish); ++ __repeat_tlbi_sync(vale2is, 0); ++} ++ + /* + * TLB Invalidation + * ================ +@@ -246,7 +266,7 @@ static inline void flush_tlb_all(void) + { + dsb(ishst); + __tlbi(vmalle1is); +- dsb(ish); ++ __tlbi_sync_s1ish(); + isb(); + } + +@@ -258,7 +278,7 @@ static inline void flush_tlb_mm(struct mm_struct *mm) + asid = __TLBI_VADDR(0, ASID(mm)); + __tlbi(aside1is, asid); + __tlbi_user(aside1is, asid); +- dsb(ish); ++ __tlbi_sync_s1ish(); + mmu_notifier_arch_invalidate_secondary_tlbs(mm, 0, -1UL); + } + +@@ -285,21 +305,11 @@ static inline void flush_tlb_page(struct vm_area_struct *vma, + unsigned long uaddr) + { + flush_tlb_page_nosync(vma, uaddr); +- dsb(ish); ++ __tlbi_sync_s1ish(); + } + + static inline bool arch_tlbbatch_should_defer(struct mm_struct *mm) + { +-#ifdef CONFIG_ARM64_WORKAROUND_REPEAT_TLBI +- /* +- * TLB flush deferral is not required on systems which are affected by +- * ARM64_WORKAROUND_REPEAT_TLBI, as __tlbi()/__tlbi_user() implementation +- * will have two consecutive TLBI instructions with a dsb(ish) in between +- * defeating the purpose (i.e save overall 'dsb ish' cost). 
+- */ +- if (unlikely(cpus_have_const_cap(ARM64_WORKAROUND_REPEAT_TLBI))) +- return false; +-#endif + return true; + } + +@@ -333,7 +343,7 @@ static inline void arch_flush_tlb_batched_pending(struct mm_struct *mm) + */ + static inline void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch) + { +- dsb(ish); ++ __tlbi_sync_s1ish(); + } + + /* +@@ -437,7 +447,7 @@ static inline void __flush_tlb_range(struct vm_area_struct *vma, + else + __flush_tlb_range_op(vae1is, start, pages, stride, asid, tlb_level, true); + +- dsb(ish); ++ __tlbi_sync_s1ish(); + mmu_notifier_arch_invalidate_secondary_tlbs(vma->vm_mm, start, end); + } + +@@ -467,7 +477,7 @@ static inline void flush_tlb_kernel_range(unsigned long start, unsigned long end + dsb(ishst); + for (addr = start; addr < end; addr += 1 << (PAGE_SHIFT - 12)) + __tlbi(vaale1is, addr); +- dsb(ish); ++ __tlbi_sync_s1ish(); + isb(); + } + +@@ -481,7 +491,7 @@ static inline void __flush_tlb_kernel_pgtable(unsigned long kaddr) + + dsb(ishst); + __tlbi(vaae1is, addr); +- dsb(ish); ++ __tlbi_sync_s1ish(); + isb(); + } + #endif +diff --git a/arch/arm64/kernel/sys_compat.c b/arch/arm64/kernel/sys_compat.c +index df14336c3a29cf..2bc2ac91d79e39 100644 +--- a/arch/arm64/kernel/sys_compat.c ++++ b/arch/arm64/kernel/sys_compat.c +@@ -37,7 +37,7 @@ __do_compat_cache_op(unsigned long start, unsigned long end) + * We pick the reserved-ASID to minimise the impact. 
+ */ + __tlbi(aside1is, __TLBI_VADDR(0, 0)); +- dsb(ish); ++ __tlbi_sync_s1ish(); + } + + ret = caches_clean_inval_user_pou(start, start + chunk); +diff --git a/arch/arm64/kvm/hyp/nvhe/mm.c b/arch/arm64/kvm/hyp/nvhe/mm.c +index 65a7a186d7b217..4bb9da3381eaf4 100644 +--- a/arch/arm64/kvm/hyp/nvhe/mm.c ++++ b/arch/arm64/kvm/hyp/nvhe/mm.c +@@ -261,7 +261,7 @@ static void fixmap_clear_slot(struct hyp_fixmap_slot *slot) + */ + dsb(ishst); + __tlbi_level(vale2is, __TLBI_VADDR(addr, 0), (KVM_PGTABLE_MAX_LEVELS - 1)); +- dsb(ish); ++ __tlbi_sync_s1ish_hyp(); + isb(); + } + +diff --git a/arch/arm64/kvm/hyp/nvhe/tlb.c b/arch/arm64/kvm/hyp/nvhe/tlb.c +index a60fb13e21924f..f03d4f7dbf443d 100644 +--- a/arch/arm64/kvm/hyp/nvhe/tlb.c ++++ b/arch/arm64/kvm/hyp/nvhe/tlb.c +@@ -102,7 +102,7 @@ void __kvm_tlb_flush_vmid_ipa(struct kvm_s2_mmu *mmu, + */ + dsb(ish); + __tlbi(vmalle1is); +- dsb(ish); ++ __tlbi_sync_s1ish_hyp(); + isb(); + + __tlb_switch_to_host(&cxt); +@@ -158,7 +158,7 @@ void __kvm_tlb_flush_vmid_range(struct kvm_s2_mmu *mmu, + + dsb(ish); + __tlbi(vmalle1is); +- dsb(ish); ++ __tlbi_sync_s1ish_hyp(); + isb(); + + __tlb_switch_to_host(&cxt); +@@ -172,7 +172,7 @@ void __kvm_tlb_flush_vmid(struct kvm_s2_mmu *mmu) + __tlb_switch_to_guest(mmu, &cxt, false); + + __tlbi(vmalls12e1is); +- dsb(ish); ++ __tlbi_sync_s1ish_hyp(); + isb(); + + __tlb_switch_to_host(&cxt); +@@ -198,5 +198,5 @@ void __kvm_flush_vm_context(void) + /* Same remark as in __tlb_switch_to_guest() */ + dsb(ish); + __tlbi(alle1is); +- dsb(ish); ++ __tlbi_sync_s1ish_hyp(); + } +diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c +index ca0bf0b92ca09e..4ec07236a68d21 100644 +--- a/arch/arm64/kvm/hyp/pgtable.c ++++ b/arch/arm64/kvm/hyp/pgtable.c +@@ -534,7 +534,7 @@ static int hyp_unmap_walker(const struct kvm_pgtable_visit_ctx *ctx, + *unmapped += granule; + } + +- dsb(ish); ++ __tlbi_sync_s1ish_hyp(); + isb(); + mm_ops->put_page(ctx->ptep); + +diff --git a/arch/arm64/kvm/hyp/vhe/tlb.c 
b/arch/arm64/kvm/hyp/vhe/tlb.c +index 23325e9f3cc388..af3a02b6b48b32 100644 +--- a/arch/arm64/kvm/hyp/vhe/tlb.c ++++ b/arch/arm64/kvm/hyp/vhe/tlb.c +@@ -105,7 +105,7 @@ void __kvm_tlb_flush_vmid_ipa(struct kvm_s2_mmu *mmu, + */ + dsb(ish); + __tlbi(vmalle1is); +- dsb(ish); ++ __tlbi_sync_s1ish_hyp(); + isb(); + + __tlb_switch_to_host(&cxt); +@@ -165,7 +165,7 @@ void __kvm_tlb_flush_vmid_range(struct kvm_s2_mmu *mmu, + + dsb(ish); + __tlbi(vmalle1is); +- dsb(ish); ++ __tlbi_sync_s1ish_hyp(); + isb(); + + __tlb_switch_to_host(&cxt); +@@ -181,7 +181,7 @@ void __kvm_tlb_flush_vmid(struct kvm_s2_mmu *mmu) + __tlb_switch_to_guest(mmu, &cxt); + + __tlbi(vmalls12e1is); +- dsb(ish); ++ __tlbi_sync_s1ish_hyp(); + isb(); + + __tlb_switch_to_host(&cxt); +@@ -206,5 +206,5 @@ void __kvm_flush_vm_context(void) + { + dsb(ishst); + __tlbi(alle1is); +- dsb(ish); ++ __tlbi_sync_s1ish_hyp(); + } +-- +2.53.0 + diff --git a/queue-6.6/kvm-arm64-remove-vpipt-i-cache-handling.patch b/queue-6.6/kvm-arm64-remove-vpipt-i-cache-handling.patch new file mode 100644 index 0000000000..b7c8806e0e --- /dev/null +++ b/queue-6.6/kvm-arm64-remove-vpipt-i-cache-handling.patch @@ -0,0 +1,175 @@ +From 26cde0f9cc45bcc7262ab2df5170d297a75638bb Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 11 Jun 2026 14:42:46 +0100 +Subject: KVM: arm64: Remove VPIPT I-cache handling + +From: Marc Zyngier + +commit ced242ba9d7cb3571f6e0f165f643cb832d52148 upstream. + +We have some special handling for VPIPT I-cache in critical parts +of the cache and TLB maintenance. Remove it. + +Reviewed-by: Zenghui Yu +Reviewed-by: Anshuman Khandual +Signed-off-by: Marc Zyngier +Acked-by: Mark Rutland +Link: https://lore.kernel.org/r/20231204143606.1806432-2-maz@kernel.org +Signed-off-by: Will Deacon +[Mark: Backport to v6.6.y. 
VPIPT HW was never built; this is all dead code] +Signed-off-by: Mark Rutland +Signed-off-by: Sasha Levin +--- + arch/arm64/include/asm/kvm_mmu.h | 5 ++- + arch/arm64/kvm/hyp/nvhe/pkvm.c | 2 +- + arch/arm64/kvm/hyp/nvhe/tlb.c | 61 -------------------------------- + arch/arm64/kvm/hyp/vhe/tlb.c | 13 ------- + 4 files changed, 3 insertions(+), 78 deletions(-) + +diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h +index 96a80e8f62263e..888c5f90201073 100644 +--- a/arch/arm64/include/asm/kvm_mmu.h ++++ b/arch/arm64/include/asm/kvm_mmu.h +@@ -229,9 +229,8 @@ static inline void __invalidate_icache_guest_page(void *va, size_t size) + if (icache_is_aliasing()) { + /* any kind of VIPT cache */ + icache_inval_all_pou(); +- } else if (read_sysreg(CurrentEL) != CurrentEL_EL1 || +- !icache_is_vpipt()) { +- /* PIPT or VPIPT at EL2 (see comment in __kvm_tlb_flush_vmid_ipa) */ ++ } else { ++ /* PIPT */ + icache_inval_pou((unsigned long)va, (unsigned long)va + size); + } + } +diff --git a/arch/arm64/kvm/hyp/nvhe/pkvm.c b/arch/arm64/kvm/hyp/nvhe/pkvm.c +index 03acc8343c5d1b..fd3e0b2891c604 100644 +--- a/arch/arm64/kvm/hyp/nvhe/pkvm.c ++++ b/arch/arm64/kvm/hyp/nvhe/pkvm.c +@@ -12,7 +12,7 @@ + #include + #include + +-/* Used by icache_is_vpipt(). */ ++/* Used by icache_is_aliasing(). */ + unsigned long __icache_flags; + + /* Used by kvm_get_vttbr(). */ +diff --git a/arch/arm64/kvm/hyp/nvhe/tlb.c b/arch/arm64/kvm/hyp/nvhe/tlb.c +index 1b265713d6bede..a60fb13e21924f 100644 +--- a/arch/arm64/kvm/hyp/nvhe/tlb.c ++++ b/arch/arm64/kvm/hyp/nvhe/tlb.c +@@ -105,28 +105,6 @@ void __kvm_tlb_flush_vmid_ipa(struct kvm_s2_mmu *mmu, + dsb(ish); + isb(); + +- /* +- * If the host is running at EL1 and we have a VPIPT I-cache, +- * then we must perform I-cache maintenance at EL2 in order for +- * it to have an effect on the guest. 
Since the guest cannot hit +- * I-cache lines allocated with a different VMID, we don't need +- * to worry about junk out of guest reset (we nuke the I-cache on +- * VMID rollover), but we do need to be careful when remapping +- * executable pages for the same guest. This can happen when KSM +- * takes a CoW fault on an executable page, copies the page into +- * a page that was previously mapped in the guest and then needs +- * to invalidate the guest view of the I-cache for that page +- * from EL1. To solve this, we invalidate the entire I-cache when +- * unmapping a page from a guest if we have a VPIPT I-cache but +- * the host is running at EL1. As above, we could do better if +- * we had the VA. +- * +- * The moral of this story is: if you have a VPIPT I-cache, then +- * you should be running with VHE enabled. +- */ +- if (icache_is_vpipt()) +- icache_inval_all_pou(); +- + __tlb_switch_to_host(&cxt); + } + +@@ -157,28 +135,6 @@ void __kvm_tlb_flush_vmid_ipa_nsh(struct kvm_s2_mmu *mmu, + dsb(nsh); + isb(); + +- /* +- * If the host is running at EL1 and we have a VPIPT I-cache, +- * then we must perform I-cache maintenance at EL2 in order for +- * it to have an effect on the guest. Since the guest cannot hit +- * I-cache lines allocated with a different VMID, we don't need +- * to worry about junk out of guest reset (we nuke the I-cache on +- * VMID rollover), but we do need to be careful when remapping +- * executable pages for the same guest. This can happen when KSM +- * takes a CoW fault on an executable page, copies the page into +- * a page that was previously mapped in the guest and then needs +- * to invalidate the guest view of the I-cache for that page +- * from EL1. To solve this, we invalidate the entire I-cache when +- * unmapping a page from a guest if we have a VPIPT I-cache but +- * the host is running at EL1. As above, we could do better if +- * we had the VA. 
+- * +- * The moral of this story is: if you have a VPIPT I-cache, then +- * you should be running with VHE enabled. +- */ +- if (icache_is_vpipt()) +- icache_inval_all_pou(); +- + __tlb_switch_to_host(&cxt); + } + +@@ -205,10 +161,6 @@ void __kvm_tlb_flush_vmid_range(struct kvm_s2_mmu *mmu, + dsb(ish); + isb(); + +- /* See the comment in __kvm_tlb_flush_vmid_ipa() */ +- if (icache_is_vpipt()) +- icache_inval_all_pou(); +- + __tlb_switch_to_host(&cxt); + } + +@@ -246,18 +198,5 @@ void __kvm_flush_vm_context(void) + /* Same remark as in __tlb_switch_to_guest() */ + dsb(ish); + __tlbi(alle1is); +- +- /* +- * VIPT and PIPT caches are not affected by VMID, so no maintenance +- * is necessary across a VMID rollover. +- * +- * VPIPT caches constrain lookup and maintenance to the active VMID, +- * so we need to invalidate lines with a stale VMID to avoid an ABA +- * race after multiple rollovers. +- * +- */ +- if (icache_is_vpipt()) +- asm volatile("ic ialluis"); +- + dsb(ish); + } +diff --git a/arch/arm64/kvm/hyp/vhe/tlb.c b/arch/arm64/kvm/hyp/vhe/tlb.c +index 46bd43f61d76f5..23325e9f3cc388 100644 +--- a/arch/arm64/kvm/hyp/vhe/tlb.c ++++ b/arch/arm64/kvm/hyp/vhe/tlb.c +@@ -206,18 +206,5 @@ void __kvm_flush_vm_context(void) + { + dsb(ishst); + __tlbi(alle1is); +- +- /* +- * VIPT and PIPT caches are not affected by VMID, so no maintenance +- * is necessary across a VMID rollover. +- * +- * VPIPT caches constrain lookup and maintenance to the active VMID, +- * so we need to invalidate lines with a stale VMID to avoid an ABA +- * race after multiple rollovers. 
+- * +- */ +- if (icache_is_vpipt()) +- asm volatile("ic ialluis"); +- + dsb(ish); + } +-- +2.53.0 + diff --git a/queue-6.6/net-skbuff-fix-missing-zerocopy-reference-in-pskb_ca.patch b/queue-6.6/net-skbuff-fix-missing-zerocopy-reference-in-pskb_ca.patch new file mode 100644 index 0000000000..fc070d1ded --- /dev/null +++ b/queue-6.6/net-skbuff-fix-missing-zerocopy-reference-in-pskb_ca.patch @@ -0,0 +1,95 @@ +From c0c6df7edac424f2708b5e1749238e0fd31aae7b Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Tue, 26 May 2026 11:12:39 +0700 +Subject: net: skbuff: fix missing zerocopy reference in pskb_carve helpers + +From: Minh Nguyen + +commit 98d0912e9f841e5529a5b89a972805f34cb1c69d upstream. + +pskb_carve_inside_header() and pskb_carve_inside_nonlinear() both copy +the old skb_shared_info header into a new buffer via memcpy(), which +includes the destructor_arg pointer (uarg) for MSG_ZEROCOPY skbs. +Neither function calls net_zcopy_get() for the new shinfo, creating an +unaccounted holder: every skb_shared_info with destructor_arg set will +call skb_zcopy_clear() once when freed, but the corresponding +net_zcopy_get() was never called for the new copy. Repeated calls +drive uarg->refcnt to zero prematurely, freeing ubuf_info_msgzc while +TX skbs still hold live destructor_arg pointers. 
+ +KASAN reports use-after-free on a freed ubuf_info_msgzc: + + BUG: KASAN: slab-use-after-free in skb_release_data+0x77b/0x810 + Read of size 8 at addr ffff88801574d3e8 by task poc/220 + + Call Trace: + skb_release_data+0x77b/0x810 + kfree_skb_list_reason+0x13e/0x610 + skb_release_data+0x4cd/0x810 + sk_skb_reason_drop+0xf3/0x340 + skb_queue_purge_reason+0x282/0x440 + rds_tcp_inc_free+0x1e/0x30 + rds_recvmsg+0x354/0x1780 + __sys_recvmsg+0xdf/0x180 + + Allocated by task 219: + msg_zerocopy_realloc+0x157/0x7b0 + tcp_sendmsg_locked+0x2892/0x3ba0 + + Freed by task 219: + ip_recv_error+0x74a/0xb10 + tcp_recvmsg+0x475/0x530 + +The skb consuming the late access still referenced the same uarg via +shinfo->destructor_arg copied by pskb_carve_inside_nonlinear() without +a refcount bump. This has been verified to be reliably exploitable: a +working proof-of-concept achieves full root privilege escalation from +an unprivileged local user on a default kernel configuration. + +The fix follows the pattern of pskb_expand_head() which has the same +memcpy/cloned structure. For pskb_carve_inside_header(), net_zcopy_get() +is placed after skb_orphan_frags() succeeds, so the orphan error path +needs no cleanup. For pskb_carve_inside_nonlinear(), net_zcopy_get() is +placed after all failure points and just before skb_release_data(), so +no error path needs cleanup at all -- matching pskb_expand_head() more +closely and avoiding the need for a balancing net_zcopy_put(). 
+ +Fixes: 6fa01ccd8830 ("skbuff: Add pskb_extract() helper function") +Cc: stable@vger.kernel.org +Assisted-by: Claude:claude-sonnet-4-6 +Signed-off-by: Minh Nguyen +Reviewed-by: Willem de Bruijn +Link: https://patch.msgid.link/20260526041240.329462-1-minhnguyen.080505@gmail.com +Signed-off-by: Paolo Abeni +[Salvatore Bonaccorso: Adjust for context changes in v6.6.y] +Signed-off-by: Salvatore Bonaccorso +Signed-off-by: Sasha Levin +--- + net/core/skbuff.c | 4 ++++ + 1 file changed, 4 insertions(+) + +diff --git a/net/core/skbuff.c b/net/core/skbuff.c +index 2282b6ad4be21a..5f45a52cc8ca66 100644 +--- a/net/core/skbuff.c ++++ b/net/core/skbuff.c +@@ -6412,6 +6412,8 @@ static int pskb_carve_inside_header(struct sk_buff *skb, const u32 off, + skb_kfree_head(data, size); + return -ENOMEM; + } ++ if (skb_zcopy(skb)) ++ net_zcopy_get(skb_zcopy(skb)); + for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) + skb_frag_ref(skb, i); + if (skb_has_frag_list(skb)) +@@ -6561,6 +6563,8 @@ static int pskb_carve_inside_nonlinear(struct sk_buff *skb, const u32 off, + skb_kfree_head(data, size); + return -ENOMEM; + } ++ if (skb_zcopy(skb)) ++ net_zcopy_get(skb_zcopy(skb)); + skb_release_data(skb, SKB_CONSUMED, false); + + skb->head = data; +-- +2.53.0 + diff --git a/queue-6.6/series b/queue-6.6/series index 9a3242aa37..ca3de67c27 100644 --- a/queue-6.6/series +++ b/queue-6.6/series @@ -250,3 +250,8 @@ alsa-pcm-fix-wait-queue-list-corruption-in-snd_pcm_d.patch usb-gadget-f_ncm-fix-net_device-lifecycle-with-devic.patch usb-gadget-u_ether-fix-null-pointer-deref-in-eth_get.patch tools-rv-fix-cleanup-after-failed-trace-setup.patch +net-skbuff-fix-missing-zerocopy-reference-in-pskb_ca.patch +tap-free-page-on-error-paths-in-tap_get_user_xdp.patch +kvm-arm64-remove-vpipt-i-cache-handling.patch +arm64-tlb-allow-xzr-argument-to-tlbi-ops.patch +arm64-tlb-optimize-arm64_workaround_repeat_tlbi.patch diff --git a/queue-6.6/tap-free-page-on-error-paths-in-tap_get_user_xdp.patch 
b/queue-6.6/tap-free-page-on-error-paths-in-tap_get_user_xdp.patch new file mode 100644 index 0000000000..2bd793a556 --- /dev/null +++ b/queue-6.6/tap-free-page-on-error-paths-in-tap_get_user_xdp.patch @@ -0,0 +1,57 @@ +From c39e7defc92471c24a90c889fdd0690bb0a88ad5 Mon Sep 17 00:00:00 2001 +From: Sasha Levin +Date: Thu, 11 Jun 2026 06:21:06 -0700 +Subject: tap: free page on error paths in tap_get_user_xdp() + +From: Weiming Shi + +[ Upstream commit 3bcf7aec6a9d16438f2cec29f5d7c8d5b8edf9b2 ] + +tap_get_user_xdp() rejects a frame shorter than ETH_HLEN with -EINVAL, +and returns -ENOMEM when build_skb() fails. Both paths jump to the err +label without freeing the page that vhost_net_build_xdp() allocated for +the frame. tap_sendmsg() discards the per-buffer return value and always +returns 0, so vhost_tx_batch() takes the success path and never frees +the page; each rejected frame in a batch leaks one page-frag chunk. + +Free the page on both error paths, before the skb is built. This is the +tap counterpart of the same leak in tun_xdp_one(). 
+ +Fixes: 0efac27791ee ("tap: accept an array of XDP buffs through sendmsg()") +Fixes: ed7f2afdd0e0 ("tap: add missing verification for short frame") +Reported-by: Xiang Mei +Signed-off-by: Weiming Shi +Reviewed-by: Dongli Zhang +Reviewed-by: Willem de Bruijn +Link: https://patch.msgid.link/20260521163230.1478627-2-bestswngs@gmail.com +Signed-off-by: Jakub Kicinski +(cherry picked from commit 3bcf7aec6a9d16438f2cec29f5d7c8d5b8edf9b2) +Signed-off-by: Harshit Mogalapalli +Signed-off-by: Sasha Levin +--- + drivers/net/tap.c | 2 ++ + 1 file changed, 2 insertions(+) + +diff --git a/drivers/net/tap.c b/drivers/net/tap.c +index 2c4f9d19827f38..5f01c875e49ee4 100644 +--- a/drivers/net/tap.c ++++ b/drivers/net/tap.c +@@ -1178,6 +1178,7 @@ static int tap_get_user_xdp(struct tap_queue *q, struct xdp_buff *xdp) + int err, depth; + + if (unlikely(xdp->data_end - xdp->data < ETH_HLEN)) { ++ put_page(virt_to_head_page(xdp->data)); + err = -EINVAL; + goto err; + } +@@ -1187,6 +1188,7 @@ static int tap_get_user_xdp(struct tap_queue *q, struct xdp_buff *xdp) + + skb = build_skb(xdp->data_hard_start, buflen); + if (!skb) { ++ put_page(virt_to_head_page(xdp->data)); + err = -ENOMEM; + goto err; + } +-- +2.53.0 +