Fixes for all trees

author Sasha Levin <sashal@kernel.org>

Fri, 12 Jun 2026 14:46:49 +0000 (10:46 -0400)

committer Sasha Levin <sashal@kernel.org>

Fri, 12 Jun 2026 14:46:49 +0000 (10:46 -0400)
diff --git a/queue-5.10/arm64-tlb-allow-xzr-argument-to-tlbi-ops.patch b/queue-5.10/arm64-tlb-allow-xzr-argument-to-tlbi-ops.patch
new file mode 100644
index 0000000..267853e
--- /dev/null
+++ b/queue-5.10/arm64-tlb-allow-xzr-argument-to-tlbi-ops.patch
@@ -0,0 +1,62 @@
+From 7ef2f162cbc7a568944cc1c5967edcd65c534154 Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Thu, 11 Jun 2026 14:49:02 +0100
+Subject: arm64: tlb: Allow XZR argument to TLBI ops
+
+From: Mark Rutland <mark.rutland@arm.com>
+
+commit bfd9c931d19aa59fb8371d557774fa169b15db9a upstream.
+
+The TLBI instruction accepts XZR as a register argument, and for TLBI
+operations with a register argument, there is no functional difference
+between using XZR or another GPR which contains zeroes. Operations
+without a register argument are encoded as if XZR were used.
+
+Allow the __TLBI_1() macro to use XZR when a register argument is all
+zeroes.
+
+Today this only results in a trivial code saving in
+__do_compat_cache_op()'s workaround for Neoverse-N1 erratum #1542419. In
+subsequent patches this pattern will be used more generally.
+
+There should be no functional change as a result of this patch.
+
+Signed-off-by: Mark Rutland <mark.rutland@arm.com>
+Cc: Catalin Marinas <catalin.marinas@arm.com>
+Cc: Marc Zyngier <maz@kernel.org>
+Cc: Oliver Upton <oupton@kernel.org>
+Cc: Ryan Roberts <ryan.roberts@arm.com>
+Cc: Will Deacon <will@kernel.org>
+Signed-off-by: Will Deacon <will@kernel.org>
+Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+[Mark: Backport to v5.10.y]
+Signed-off-by: Mark Rutland <mark.rutland@arm.com>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ arch/arm64/include/asm/tlbflush.h | 6 +++---
+ 1 file changed, 3 insertions(+), 3 deletions(-)
+
+diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
+index 36f02892e1df80..b17d8b049d258b 100644
+--- a/arch/arm64/include/asm/tlbflush.h
++++ b/arch/arm64/include/asm/tlbflush.h
+@@ -37,12 +37,12 @@
+                           : : )
+ 
+ #define __TLBI_1(op, arg) asm (ARM64_ASM_PREAMBLE                            \
+-                             "tlbi " #op ", %0\n"                            \
++                             "tlbi " #op ", %x0\n"                           \
+                  ALTERNATIVE("nop\n                   nop",                  \
+-                             "dsb ish\n               tlbi " #op ", %0",     \
++                             "dsb ish\n               tlbi " #op ", %x0",    \
+                              ARM64_WORKAROUND_REPEAT_TLBI,                   \
+                              CONFIG_ARM64_WORKAROUND_REPEAT_TLBI)            \
+-                          : : "r" (arg))
++                          : : "rZ" (arg))
+ 
+ #define __TLBI_N(op, arg, n, ...) __TLBI_##n(op, arg)
+ 
+-- 
+2.53.0
+
diff --git a/queue-5.10/arm64-tlb-optimize-arm64_workaround_repeat_tlbi.patch b/queue-5.10/arm64-tlb-optimize-arm64_workaround_repeat_tlbi.patch
new file mode 100644
index 0000000..97d0a7b
--- /dev/null
+++ b/queue-5.10/arm64-tlb-optimize-arm64_workaround_repeat_tlbi.patch
@@ -0,0 +1,380 @@
+From 41aa334fdf902b9b3c0e6aca88531b533165862d Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Thu, 11 Jun 2026 14:49:03 +0100
+Subject: arm64: tlb: Optimize ARM64_WORKAROUND_REPEAT_TLBI
+
+From: Mark Rutland <mark.rutland@arm.com>
+
+commit a8f78680ee6bf795086384e8aea159a52814f827 upstream.
+
+The ARM64_WORKAROUND_REPEAT_TLBI workaround is used to mitigate several
+errata where broadcast TLBI;DSB sequences don't provide all the
+architecturally required synchronization. The workaround performs more
+work than necessary, and can have significant overhead. This patch
+optimizes the workaround, as explained below.
+
+The workaround was originally added for Qualcomm Falkor erratum 1009 in
+commit:
+
+  d9ff80f83ecb ("arm64: Work around Falkor erratum 1009")
+
+As noted in the message for that commit, the workaround is applied even
+in cases where it is not strictly necessary.
+
+The workaround was later reused without changes for:
+
+* Arm Cortex-A76 erratum #1286807
+  SDEN v33: https://developer.arm.com/documentation/SDEN-885749/33-0/
+
+* Arm Cortex-A55 erratum #2441007
+  SDEN v16: https://developer.arm.com/documentation/SDEN-859338/1600/
+
+* Arm Cortex-A510 erratum #2441009
+  SDEN v19: https://developer.arm.com/documentation/SDEN-1873351/1900/
+
+The important details to note are as follows:
+
+1. All relevant errata only affect the ordering and/or completion of
+   memory accesses which have been translated by an invalidated TLB
+   entry. The actual invalidation of TLB entries is unaffected.
+
+2. The existing workaround is applied to both broadcast and local TLB
+   invalidation, whereas for all relevant errata it is only necessary to
+   apply a workaround for broadcast invalidation.
+
+3. The existing workaround replaces every TLBI with a TLBI;DSB;TLBI
+   sequence, whereas for all relevant errata it is only necessary to
+   execute a single additional TLBI;DSB sequence after any number of
+   TLBIs are completed by a DSB.
+
+   For example, for a sequence of batched TLBIs:
+
+       TLBI <op1>[, <arg1>]
+       TLBI <op2>[, <arg2>]
+       TLBI <op3>[, <arg3>]
+       DSB ISH
+
+   ... the existing workaround will expand this to:
+
+       TLBI <op1>[, <arg1>]
+       DSB ISH                  // additional
+       TLBI <op1>[, <arg1>]     // additional
+       TLBI <op2>[, <arg2>]
+       DSB ISH                  // additional
+       TLBI <op2>[, <arg2>]     // additional
+       TLBI <op3>[, <arg3>]
+       DSB ISH                  // additional
+       TLBI <op3>[, <arg3>]     // additional
+       DSB ISH
+
+   ... whereas it is sufficient to have:
+
+       TLBI <op1>[, <arg1>]
+       TLBI <op2>[, <arg2>]
+       TLBI <op3>[, <arg3>]
+       DSB ISH
+       TLBI <opX>[, <argX>]     // additional
+       DSB ISH                  // additional
+
+   Using a single additional TLBI and DSB at the end of the sequence can
+   have significantly lower overhead as each DSB which completes a TLBI
+   must synchronize with other PEs in the system, with potential
+   performance effects both locally and system-wide.
+
+4. The existing workaround repeats each specific TLBI operation, whereas
+   for all relevant errata it is sufficient for the additional TLBI to
+   use *any* operation which will be broadcast, regardless of which
+   translation regime or stage of translation the operation applies to.
+
+   For example, for a single TLBI:
+
+       TLBI ALLE2IS
+       DSB ISH
+
+   ... the existing workaround will expand this to:
+
+       TLBI ALLE2IS
+       DSB ISH
+       TLBI ALLE2IS             // additional
+       DSB ISH                  // additional
+
+   ... whereas it is sufficient to have:
+
+       TLBI ALLE2IS
+       DSB ISH
+       TLBI VALE1IS, XZR        // additional
+       DSB ISH                  // additional
+
+   As the additional TLBI doesn't have to match a specific earlier TLBI,
+   the additional TLBI can be implemented in separate code, with no
+   memory of the earlier TLBIs. The additional TLBI can also use a
+   cheaper TLBI operation.
+
+5. The existing workaround is applied to both Stage-1 and Stage-2 TLB
+   invalidation, whereas for all relevant errata it is only necessary to
+   apply a workaround for Stage-1 invalidation.
+
+   Architecturally, TLBI operations which invalidate only Stage-2
+   information (e.g. IPAS2E1IS) are not required to invalidate TLB
+   entries which combine information from Stage-1 and Stage-2
+   translation table entries, and consequently may not complete memory
+   accesses translated by those combined entries. In these cases,
+   completion of memory accesses is only guaranteed after subsequent
+   invalidation of Stage-1 information (e.g. VMALLE1IS).
+
+Taking the above points into account, this patch reworks the workaround
+logic to reduce overhead:
+
+* New __tlbi_sync_s1ish() and __tlbi_sync_s1ish_hyp() functions are
+  added and used in place of any dsb(ish) which is used to complete
+  broadcast Stage-1 TLB maintenance. When the
+  ARM64_WORKAROUND_REPEAT_TLBI workaround is enabled, these helpers will
+  execute an additional TLBI;DSB sequence.
+
+  For consistency, it might make sense to add __tlbi_sync_*() helpers
+  for local and stage 2 maintenance. For now I've left those with
+  open-coded dsb() to keep the diff small.
+
+* The duplication of TLBIs in __TLBI_0() and __TLBI_1() is removed. This
+  is no longer needed as the necessary synchronization will happen in
+  __tlbi_sync_s1ish() or __tlbi_sync_s1ish_hyp().
+
+* The additional TLBI operation is chosen to have minimal impact:
+
+  - __tlbi_sync_s1ish() uses "TLBI VALE1IS, XZR". This is only used at
+    EL1 or at EL2 with {E2H,TGE}=={1,1}, where it will target an unused
+    entry for the reserved ASID in the kernel's own translation regime,
+    and have no adverse effect.
+
+  - __tlbi_sync_s1ish_hyp() uses "TLBI VALE2IS, XZR". This is only used
+    in hyp code, where it will target an unused entry in the hyp code's
+    TTBR0 mapping, and should have no adverse effect.
+
+* As __TLBI_0() and __TLBI_1() no longer replace each TLBI with a
+  TLBI;DSB;TLBI sequence, batching TLBIs is worthwhile, and there's no
+  need for arch_tlbbatch_should_defer() to consider
+  ARM64_WORKAROUND_REPEAT_TLBI.
+
+When building defconfig with GCC 15.1.0, compared to v6.19-rc1, this
+patch saves ~1KiB of text, makes the vmlinux ~42KiB smaller, and makes
+the resulting Image 64KiB smaller:
+
+| [mark@lakrids:~/src/linux]% size vmlinux-*
+|    text    data     bss     dec     hex filename
+| 21179831        19660919         708216 41548966        279fca6 vmlinux-after
+| 21181075        19660903         708216 41550194        27a0172 vmlinux-before
+| [mark@lakrids:~/src/linux]% ls -l vmlinux-*
+| -rwxr-xr-x 1 mark mark 157771472 Feb  4 12:05 vmlinux-after
+| -rwxr-xr-x 1 mark mark 157815432 Feb  4 12:05 vmlinux-before
+| [mark@lakrids:~/src/linux]% ls -l Image-*
+| -rw-r--r-- 1 mark mark 41007616 Feb  4 12:05 Image-after
+| -rw-r--r-- 1 mark mark 41073152 Feb  4 12:05 Image-before
+
+Signed-off-by: Mark Rutland <mark.rutland@arm.com>
+Cc: Catalin Marinas <catalin.marinas@arm.com>
+Cc: Marc Zyngier <maz@kernel.org>
+Cc: Oliver Upton <oupton@kernel.org>
+Cc: Ryan Roberts <ryan.roberts@arm.com>
+Cc: Will Deacon <will@kernel.org>
+Signed-off-by: Will Deacon <will@kernel.org>
+Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+[Mark: Backport to v5.10.y; use inline ALTERNATIVE() sequence]
+Signed-off-by: Mark Rutland <mark.rutland@arm.com>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ arch/arm64/include/asm/tlbflush.h | 51 ++++++++++++++++++++++---------
+ arch/arm64/kernel/sys_compat.c    |  2 +-
+ arch/arm64/kvm/hyp/nvhe/tlb.c     |  6 ++--
+ arch/arm64/kvm/hyp/vhe/tlb.c      |  6 ++--
+ 4 files changed, 44 insertions(+), 21 deletions(-)
+
+diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
+index b17d8b049d258b..0fd1bb180561c2 100644
+--- a/arch/arm64/include/asm/tlbflush.h
++++ b/arch/arm64/include/asm/tlbflush.h
+@@ -30,18 +30,10 @@
+  */
+ #define __TLBI_0(op, arg) asm (ARM64_ASM_PREAMBLE                            \
+                              "tlbi " #op "\n"                                \
+-                 ALTERNATIVE("nop\n                   nop",                  \
+-                             "dsb ish\n               tlbi " #op,            \
+-                             ARM64_WORKAROUND_REPEAT_TLBI,                   \
+-                             CONFIG_ARM64_WORKAROUND_REPEAT_TLBI)            \
+                           : : )
+ 
+ #define __TLBI_1(op, arg) asm (ARM64_ASM_PREAMBLE                            \
+                              "tlbi " #op ", %x0\n"                           \
+-                 ALTERNATIVE("nop\n                   nop",                  \
+-                             "dsb ish\n               tlbi " #op ", %x0",    \
+-                             ARM64_WORKAROUND_REPEAT_TLBI,                   \
+-                             CONFIG_ARM64_WORKAROUND_REPEAT_TLBI)            \
+                           : : "rZ" (arg))
+ 
+ #define __TLBI_N(op, arg, n, ...) __TLBI_##n(op, arg)
+@@ -158,6 +150,37 @@ static inline unsigned long get_trans_granule(void)
+ #define __TLBI_RANGE_NUM(pages, scale)        \
+       ((((pages) >> (5 * (scale) + 1)) & TLBI_RANGE_MASK) - 1)
+ 
++#define __repeat_tlbi_sync(op, arg)                                           \
++do {                                                                          \
++      asm volatile(                                                           \
++      ALTERNATIVE("nop\n                      nop",                           \
++                  "tlbi " #op ", %x0\n        dsb ish",                       \
++                  ARM64_WORKAROUND_REPEAT_TLBI,                               \
++                  CONFIG_ARM64_WORKAROUND_REPEAT_TLBI)                        \
++      :                                                                       \
++      : "rZ" (arg));                                                          \
++} while (0)
++
++/*
++ * Complete broadcast TLB maintenance issued by the host which invalidates
++ * stage 1 information in the host's own translation regime.
++ */
++static inline void __tlbi_sync_s1ish(void)
++{
++      dsb(ish);
++      __repeat_tlbi_sync(vale1is, 0);
++}
++
++/*
++ * Complete broadcast TLB maintenance issued by hyp code which invalidates
++ * stage 1 translation information in any translation regime.
++ */
++static inline void __tlbi_sync_s1ish_hyp(void)
++{
++      dsb(ish);
++      __repeat_tlbi_sync(vale2is, 0);
++}
++
+ /*
+  *    TLB Invalidation
+  *    ================
+@@ -239,7 +262,7 @@ static inline void flush_tlb_all(void)
+ {
+       dsb(ishst);
+       __tlbi(vmalle1is);
+-      dsb(ish);
++      __tlbi_sync_s1ish();
+       isb();
+ }
+ 
+@@ -251,7 +274,7 @@ static inline void flush_tlb_mm(struct mm_struct *mm)
+       asid = __TLBI_VADDR(0, ASID(mm));
+       __tlbi(aside1is, asid);
+       __tlbi_user(aside1is, asid);
+-      dsb(ish);
++      __tlbi_sync_s1ish();
+ }
+ 
+ static inline void flush_tlb_page_nosync(struct vm_area_struct *vma,
+@@ -269,7 +292,7 @@ static inline void flush_tlb_page(struct vm_area_struct *vma,
+                                 unsigned long uaddr)
+ {
+       flush_tlb_page_nosync(vma, uaddr);
+-      dsb(ish);
++      __tlbi_sync_s1ish();
+ }
+ 
+ /*
+@@ -357,7 +380,7 @@ static inline void __flush_tlb_range(struct vm_area_struct *vma,
+               }
+               scale++;
+       }
+-      dsb(ish);
++      __tlbi_sync_s1ish();
+ }
+ 
+ static inline void flush_tlb_range(struct vm_area_struct *vma,
+@@ -386,7 +409,7 @@ static inline void flush_tlb_kernel_range(unsigned long start, unsigned long end
+       dsb(ishst);
+       for (addr = start; addr < end; addr += 1 << (PAGE_SHIFT - 12))
+               __tlbi(vaale1is, addr);
+-      dsb(ish);
++      __tlbi_sync_s1ish();
+       isb();
+ }
+ 
+@@ -400,7 +423,7 @@ static inline void __flush_tlb_kernel_pgtable(unsigned long kaddr)
+ 
+       dsb(ishst);
+       __tlbi(vaae1is, addr);
+-      dsb(ish);
++      __tlbi_sync_s1ish();
+       isb();
+ }
+ #endif
+diff --git a/arch/arm64/kernel/sys_compat.c b/arch/arm64/kernel/sys_compat.c
+index 51274bab25653f..a42266f495d463 100644
+--- a/arch/arm64/kernel/sys_compat.c
++++ b/arch/arm64/kernel/sys_compat.c
+@@ -38,7 +38,7 @@ __do_compat_cache_op(unsigned long start, unsigned long end)
+                        * We pick the reserved-ASID to minimise the impact.
+                        */
+                       __tlbi(aside1is, __TLBI_VADDR(0, 0));
+-                      dsb(ish);
++                      __tlbi_sync_s1ish();
+               }
+ 
+               ret = __flush_cache_user_range(start, start + chunk);
+diff --git a/arch/arm64/kvm/hyp/nvhe/tlb.c b/arch/arm64/kvm/hyp/nvhe/tlb.c
+index 435d0a54ab9a25..deeb4bc943d89c 100644
+--- a/arch/arm64/kvm/hyp/nvhe/tlb.c
++++ b/arch/arm64/kvm/hyp/nvhe/tlb.c
+@@ -79,7 +79,7 @@ void __kvm_tlb_flush_vmid_ipa(struct kvm_s2_mmu *mmu,
+        */
+       dsb(ish);
+       __tlbi(vmalle1is);
+-      dsb(ish);
++      __tlbi_sync_s1ish_hyp();
+       isb();
+ 
+       __tlb_switch_to_host(&cxt);
+@@ -95,7 +95,7 @@ void __kvm_tlb_flush_vmid(struct kvm_s2_mmu *mmu)
+       __tlb_switch_to_guest(mmu, &cxt);
+ 
+       __tlbi(vmalls12e1is);
+-      dsb(ish);
++      __tlbi_sync_s1ish_hyp();
+       isb();
+ 
+       __tlb_switch_to_host(&cxt);
+@@ -120,5 +120,5 @@ void __kvm_flush_vm_context(void)
+ {
+       dsb(ishst);
+       __tlbi(alle1is);
+-      dsb(ish);
++      __tlbi_sync_s1ish_hyp();
+ }
+diff --git a/arch/arm64/kvm/hyp/vhe/tlb.c b/arch/arm64/kvm/hyp/vhe/tlb.c
+index 67047feb306876..ac695f43f651fc 100644
+--- a/arch/arm64/kvm/hyp/vhe/tlb.c
++++ b/arch/arm64/kvm/hyp/vhe/tlb.c
+@@ -105,7 +105,7 @@ void __kvm_tlb_flush_vmid_ipa(struct kvm_s2_mmu *mmu,
+        */
+       dsb(ish);
+       __tlbi(vmalle1is);
+-      dsb(ish);
++      __tlbi_sync_s1ish_hyp();
+       isb();
+ 
+       __tlb_switch_to_host(&cxt);
+@@ -121,7 +121,7 @@ void __kvm_tlb_flush_vmid(struct kvm_s2_mmu *mmu)
+       __tlb_switch_to_guest(mmu, &cxt);
+ 
+       __tlbi(vmalls12e1is);
+-      dsb(ish);
++      __tlbi_sync_s1ish_hyp();
+       isb();
+ 
+       __tlb_switch_to_host(&cxt);
+@@ -146,5 +146,5 @@ void __kvm_flush_vm_context(void)
+ {
+       dsb(ishst);
+       __tlbi(alle1is);
+-      dsb(ish);
++      __tlbi_sync_s1ish_hyp();
+ }
+-- 
+2.53.0
+
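The batching argument in point 3 of the commit message above can be checked with a small counting model. This is only a sketch of the arithmetic, not kernel code: the struct and function names are hypothetical, and it assumes a batch of n broadcast TLBIs completed by a single DSB ISH, as in the commit message's example.

```c
#include <assert.h>

/* Instruction counts for one batch of broadcast TLB invalidations. */
struct tlbi_cost {
	unsigned int tlbi;	/* number of TLBI instructions issued */
	unsigned int dsb;	/* number of DSB ISH barriers issued */
};

/* No erratum workaround: n TLBIs completed by one DSB. */
static struct tlbi_cost cost_baseline(unsigned int n)
{
	struct tlbi_cost c = { .tlbi = n, .dsb = 1 };
	return c;
}

/* Previous workaround: every TLBI becomes TLBI;DSB;TLBI, then the final DSB. */
static struct tlbi_cost cost_old_workaround(unsigned int n)
{
	struct tlbi_cost c = { .tlbi = 2 * n, .dsb = n + 1 };
	return c;
}

/* Reworked workaround: the batch, its DSB, then one extra TLBI;DSB pair. */
static struct tlbi_cost cost_new_workaround(unsigned int n)
{
	struct tlbi_cost c = { .tlbi = n + 1, .dsb = 2 };
	return c;
}

/* The three-TLBI example from the commit message. */
static void check_three_tlbi_example(void)
{
	assert(cost_baseline(3).tlbi == 3 && cost_baseline(3).dsb == 1);
	assert(cost_old_workaround(3).tlbi == 6 && cost_old_workaround(3).dsb == 4);
	assert(cost_new_workaround(3).tlbi == 4 && cost_new_workaround(3).dsb == 2);
}
```

In this model the DSB count of the reworked sequence stays at two regardless of batch size, whereas the previous sequence grows by one DSB per TLBI, which is why batching becomes worthwhile again.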
diff --git a/queue-5.10/io_uring-prevent-opcode-speculation.patch b/queue-5.10/io_uring-prevent-opcode-speculation.patch
new file mode 100644
index 0000000..9669426
--- /dev/null
+++ b/queue-5.10/io_uring-prevent-opcode-speculation.patch
@@ -0,0 +1,42 @@
+From e244745e99fab815a6ef9b83c59fd7fc5fd84200 Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Wed, 10 Jun 2026 20:22:03 +0300
+Subject: io_uring: prevent opcode speculation
+
+From: Pavel Begunkov <asml.silence@gmail.com>
+
+commit 1e988c3fe1264708f4f92109203ac5b1d65de50b upstream.
+
+sqe->opcode is used to index several tables, so make sure we sanitise it
+against speculation.
+
+Cc: stable@vger.kernel.org
+Fixes: d3656344fea03 ("io_uring: add lookup table for various opcode needs")
+Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
+Reviewed-by: Li Zetao <lizetao1@huawei.com>
+Link: https://lore.kernel.org/r/7eddbf31c8ca0a3947f8ed98271acc2b4349c016.1739568408.git.asml.silence@gmail.com
+Signed-off-by: Jens Axboe <axboe@kernel.dk>
+[ Alexey: Sanitize req->opcode directly because io_init_req() in
+  linux-5.10.y has no local opcode variable and subsequent lookups use it. ]
+Signed-off-by: Alexey Panov <apanov@astralinux.ru>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ io_uring/io_uring.c | 2 ++
+ 1 file changed, 2 insertions(+)
+
+diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
+index 2ca09e2dbd3d4a..51262d48a4a11b 100644
+--- a/io_uring/io_uring.c
++++ b/io_uring/io_uring.c
+@@ -7193,6 +7193,8 @@ static int io_init_req(struct io_ring_ctx *ctx, struct io_kiocb *req,
+               return -EINVAL;
+       if (unlikely(req->opcode >= IORING_OP_LAST))
+               return -EINVAL;
++      req->opcode = array_index_nospec(req->opcode, IORING_OP_LAST);
++
+       if (!io_check_restriction(ctx, req, sqe_flags))
+               return -EACCES;
+ 
+-- 
+2.53.0
+
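The clamping step added by the io_uring patch above can be illustrated in userspace. The sketch below follows the branch-free mask used by the kernel's generic array_index_nospec() fallback, but the function names and OP_LAST_MODEL are hypothetical stand-ins, and it assumes the compiler implements right shift of a negative long as an arithmetic shift. It shows the arithmetic only: without the kernel's compiler barriers it should not be treated as a working Spectre-v1 mitigation.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical table size, standing in for IORING_OP_LAST. */
#define OP_LAST_MODEL 8UL

/*
 * Return all ones when index < size and zero otherwise, without a branch.
 * If index >= size, either index or (size - 1 - index) has its top bit set,
 * so the complement is non-negative and the arithmetic shift yields 0.
 */
static unsigned long index_mask_nospec_model(unsigned long index,
					     unsigned long size)
{
	return (unsigned long)(~(long)(index | (size - 1UL - index)) >>
			       (sizeof(long) * CHAR_BIT - 1));
}

/* Clamp index to [0, size): in-range values pass through, others become 0. */
static unsigned long clamp_index_model(unsigned long index, unsigned long size)
{
	return index & index_mask_nospec_model(index, size);
}

/* Bounds check first, then clamp, mirroring the order used in the patch. */
static int lookup_model(const int *table, unsigned long opcode)
{
	if (opcode >= OP_LAST_MODEL)
		return -1;
	opcode = clamp_index_model(opcode, OP_LAST_MODEL);
	return table[opcode];
}
```

The point of clamping after the architectural bounds check is that a mispredicted branch can still reach the table access, but only with an index the mask has already forced into range.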
diff --git a/queue-5.10/kvm-arm64-remove-vpipt-i-cache-handling.patch b/queue-5.10/kvm-arm64-remove-vpipt-i-cache-handling.patch
new file mode 100644
index 0000000..f9aa36c
--- /dev/null
+++ b/queue-5.10/kvm-arm64-remove-vpipt-i-cache-handling.patch
@@ -0,0 +1,120 @@
+From 5e62b5e382ab852f334a898c6f9e873731c68409 Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Thu, 11 Jun 2026 14:49:01 +0100
+Subject: KVM: arm64: Remove VPIPT I-cache handling
+
+From: Marc Zyngier <maz@kernel.org>
+
+commit ced242ba9d7cb3571f6e0f165f643cb832d52148 upstream.
+
+We have some special handling for VPIPT I-cache in critical parts
+of the cache and TLB maintenance. Remove it.
+
+Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
+Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
+Signed-off-by: Marc Zyngier <maz@kernel.org>
+Acked-by: Mark Rutland <mark.rutland@arm.com>
+Link: https://lore.kernel.org/r/20231204143606.1806432-2-maz@kernel.org
+Signed-off-by: Will Deacon <will@kernel.org>
+[Mark: Backport to v5.10.y. VPIPT HW was never built; this is all dead code]
+Signed-off-by: Mark Rutland <mark.rutland@arm.com>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ arch/arm64/include/asm/kvm_mmu.h |  4 ++--
+ arch/arm64/kvm/hyp/nvhe/tlb.c    | 35 --------------------------------
+ arch/arm64/kvm/hyp/vhe/tlb.c     | 13 ------------
+ 3 files changed, 2 insertions(+), 50 deletions(-)
+
+diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
+index 47dafd6ab3a30a..c700bf9241fce3 100644
+--- a/arch/arm64/include/asm/kvm_mmu.h
++++ b/arch/arm64/include/asm/kvm_mmu.h
+@@ -162,8 +162,8 @@ static inline void __invalidate_icache_guest_page(kvm_pfn_t pfn,
+       if (icache_is_aliasing()) {
+               /* any kind of VIPT cache */
+               __flush_icache_all();
+-      } else if (is_kernel_in_hyp_mode() || !icache_is_vpipt()) {
+-              /* PIPT or VPIPT at EL2 (see comment in __kvm_tlb_flush_vmid_ipa) */
++      } else {
++              /* PIPT */
+               void *va = page_address(pfn_to_page(pfn));
+ 
+               invalidate_icache_range((unsigned long)va,
+diff --git a/arch/arm64/kvm/hyp/nvhe/tlb.c b/arch/arm64/kvm/hyp/nvhe/tlb.c
+index 229b06748c2084..435d0a54ab9a25 100644
+--- a/arch/arm64/kvm/hyp/nvhe/tlb.c
++++ b/arch/arm64/kvm/hyp/nvhe/tlb.c
+@@ -82,28 +82,6 @@ void __kvm_tlb_flush_vmid_ipa(struct kvm_s2_mmu *mmu,
+       dsb(ish);
+       isb();
+ 
+-      /*
+-       * If the host is running at EL1 and we have a VPIPT I-cache,
+-       * then we must perform I-cache maintenance at EL2 in order for
+-       * it to have an effect on the guest. Since the guest cannot hit
+-       * I-cache lines allocated with a different VMID, we don't need
+-       * to worry about junk out of guest reset (we nuke the I-cache on
+-       * VMID rollover), but we do need to be careful when remapping
+-       * executable pages for the same guest. This can happen when KSM
+-       * takes a CoW fault on an executable page, copies the page into
+-       * a page that was previously mapped in the guest and then needs
+-       * to invalidate the guest view of the I-cache for that page
+-       * from EL1. To solve this, we invalidate the entire I-cache when
+-       * unmapping a page from a guest if we have a VPIPT I-cache but
+-       * the host is running at EL1. As above, we could do better if
+-       * we had the VA.
+-       *
+-       * The moral of this story is: if you have a VPIPT I-cache, then
+-       * you should be running with VHE enabled.
+-       */
+-      if (icache_is_vpipt())
+-              __flush_icache_all();
+-
+       __tlb_switch_to_host(&cxt);
+ }
+ 
+@@ -142,18 +120,5 @@ void __kvm_flush_vm_context(void)
+ {
+       dsb(ishst);
+       __tlbi(alle1is);
+-
+-      /*
+-       * VIPT and PIPT caches are not affected by VMID, so no maintenance
+-       * is necessary across a VMID rollover.
+-       *
+-       * VPIPT caches constrain lookup and maintenance to the active VMID,
+-       * so we need to invalidate lines with a stale VMID to avoid an ABA
+-       * race after multiple rollovers.
+-       *
+-       */
+-      if (icache_is_vpipt())
+-              asm volatile("ic ialluis");
+-
+       dsb(ish);
+ }
+diff --git a/arch/arm64/kvm/hyp/vhe/tlb.c b/arch/arm64/kvm/hyp/vhe/tlb.c
+index 66f17349f0c369..67047feb306876 100644
+--- a/arch/arm64/kvm/hyp/vhe/tlb.c
++++ b/arch/arm64/kvm/hyp/vhe/tlb.c
+@@ -146,18 +146,5 @@ void __kvm_flush_vm_context(void)
+ {
+       dsb(ishst);
+       __tlbi(alle1is);
+-
+-      /*
+-       * VIPT and PIPT caches are not affected by VMID, so no maintenance
+-       * is necessary across a VMID rollover.
+-       *
+-       * VPIPT caches constrain lookup and maintenance to the active VMID,
+-       * so we need to invalidate lines with a stale VMID to avoid an ABA
+-       * race after multiple rollovers.
+-       *
+-       */
+-      if (icache_is_vpipt())
+-              asm volatile("ic ialluis");
+-
+       dsb(ish);
+ }
+-- 
+2.53.0
+
diff --git a/queue-5.10/series b/queue-5.10/series
index d186b74b338a5229317f3a4eedcd8a9ade11cc16..35ec2ba6e0bc1aeab0718bae87ba8e4f5aedb760 100644
--- a/queue-5.10/series
+++ b/queue-5.10/series
@@ -156,3 +156,9 @@ usbnet-fix-using-smp_processor_id-in-preemptible-cod.patch
  nfsd-don-t-ignore-the-return-code-of-svc_proc_regist.patch
  wifi-mac80211-check-tdls-flag-in-ieee80211_tdls_oper.patch
  spi-meson-spicc-fix-double-put-in-remove-path.patch
+io_uring-prevent-opcode-speculation.patch
+tap-free-page-on-error-paths-in-tap_get_user_xdp.patch
+tun-free-page-on-build_skb-failure-in-tun_xdp_one.patch
+kvm-arm64-remove-vpipt-i-cache-handling.patch
+arm64-tlb-allow-xzr-argument-to-tlbi-ops.patch
+arm64-tlb-optimize-arm64_workaround_repeat_tlbi.patch
diff --git a/queue-5.10/tap-free-page-on-error-paths-in-tap_get_user_xdp.patch b/queue-5.10/tap-free-page-on-error-paths-in-tap_get_user_xdp.patch
new file mode 100644
index 0000000..536ff9e
--- /dev/null
+++ b/queue-5.10/tap-free-page-on-error-paths-in-tap_get_user_xdp.patch
@@ -0,0 +1,57 @@
+From acba7a9546166e019db32d71367ca8b388eb2fd4 Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Thu, 11 Jun 2026 06:21:06 -0700
+Subject: tap: free page on error paths in tap_get_user_xdp()
+
+From: Weiming Shi <bestswngs@gmail.com>
+
+[ Upstream commit 3bcf7aec6a9d16438f2cec29f5d7c8d5b8edf9b2 ]
+
+tap_get_user_xdp() rejects a frame shorter than ETH_HLEN with -EINVAL,
+and returns -ENOMEM when build_skb() fails. Both paths jump to the err
+label without freeing the page that vhost_net_build_xdp() allocated for
+the frame. tap_sendmsg() discards the per-buffer return value and always
+returns 0, so vhost_tx_batch() takes the success path and never frees
+the page; each rejected frame in a batch leaks one page-frag chunk.
+
+Free the page on both error paths, before the skb is built. This is the
+tap counterpart of the same leak in tun_xdp_one().
+
+Fixes: 0efac27791ee ("tap: accept an array of XDP buffs through sendmsg()")
+Fixes: ed7f2afdd0e0 ("tap: add missing verification for short frame")
+Reported-by: Xiang Mei <xmei5@asu.edu>
+Signed-off-by: Weiming Shi <bestswngs@gmail.com>
+Reviewed-by: Dongli Zhang <dongli.zhang@oracle.com>
+Reviewed-by: Willem de Bruijn <willemb@google.com>
+Link: https://patch.msgid.link/20260521163230.1478627-2-bestswngs@gmail.com
+Signed-off-by: Jakub Kicinski <kuba@kernel.org>
+(cherry picked from commit 3bcf7aec6a9d16438f2cec29f5d7c8d5b8edf9b2)
+Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ drivers/net/tap.c | 2 ++
+ 1 file changed, 2 insertions(+)
+
+diff --git a/drivers/net/tap.c b/drivers/net/tap.c
+index 18f19fc66c64fa..6f5c996d3ed234 100644
+--- a/drivers/net/tap.c
++++ b/drivers/net/tap.c
+@@ -1145,6 +1145,7 @@ static int tap_get_user_xdp(struct tap_queue *q, struct xdp_buff *xdp)
+       int err, depth;
+ 
+       if (unlikely(xdp->data_end - xdp->data < ETH_HLEN)) {
++              put_page(virt_to_head_page(xdp->data));
+               err = -EINVAL;
+               goto err;
+       }
+@@ -1154,6 +1155,7 @@ static int tap_get_user_xdp(struct tap_queue *q, struct xdp_buff *xdp)
+ 
+       skb = build_skb(xdp->data_hard_start, buflen);
+       if (!skb) {
++              put_page(virt_to_head_page(xdp->data));
+               err = -ENOMEM;
+               goto err;
+       }
+-- 
+2.53.0
+
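The ownership rule behind the tap fix above can be modelled with a plain reference count. This is a hedged userspace sketch, not the driver code: page_model, put_page_model() and get_frame_model() are hypothetical names, and the model assumes, as the commit message describes, that the caller ignores the per-buffer return value and therefore never frees the page itself.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Ethernet header length; stands in for the kernel's ETH_HLEN. */
#define MIN_FRAME_LEN 14

/* Hypothetical stand-in for a page holding one received frame. */
struct page_model {
	int refs;
};

static void put_page_model(struct page_model *pg)
{
	pg->refs--;
}

/*
 * Consume one frame. On success the reference moves to the built packet,
 * which drops it later. On failure nobody else will drop it, so each
 * error path must release the page before returning.
 */
static int get_frame_model(struct page_model *pg, size_t len, int build_ok)
{
	if (len < MIN_FRAME_LEN) {
		put_page_model(pg);	/* short frame: release before bailing out */
		return -EINVAL;
	}
	if (!build_ok) {
		put_page_model(pg);	/* packet build failed: release here too */
		return -ENOMEM;
	}
	return 0;
}

/* Return the references left after a caller that ignores the result. */
static int refs_after_model(size_t len, int build_ok)
{
	struct page_model pg = { .refs = 1 };

	(void)get_frame_model(&pg, len, build_ok);
	return pg.refs;
}
```

In the model, both failure cases leave zero references behind, while the success case keeps the single reference alive for the packet that now owns it.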
diff --git a/queue-5.10/tun-free-page-on-build_skb-failure-in-tun_xdp_one.patch b/queue-5.10/tun-free-page-on-build_skb-failure-in-tun_xdp_one.patch
new file mode 100644
index 0000000..c49c525
--- /dev/null
+++ b/queue-5.10/tun-free-page-on-build_skb-failure-in-tun_xdp_one.patch
@@ -0,0 +1,51 @@
+From de01d2486f3a93a389084ad4dec108860cbb7dad Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Thu, 11 Jun 2026 06:28:53 -0700
+Subject: tun: free page on build_skb failure in tun_xdp_one()
+
+From: Weiming Shi <bestswngs@gmail.com>
+
+[ Upstream commit aa8963fdce667a42fb7f0bdd2909fadcab02f9a8 ]
+
+When build_skb() fails in tun_xdp_one(), the function sets ret to
+-ENOMEM and jumps to the out label, which returns without freeing the
+page that vhost_net_build_xdp() allocated for the frame. As with the
+short-frame rejection path, tun_sendmsg() discards the per-buffer error
+and still returns total_len, so vhost_tx_batch() takes the success path
+and never frees the page. Each build_skb() failure in a batch leaks one
+page-frag chunk.
+
+Free the page before taking the error path, matching the put_page() the
+other error exits of tun_xdp_one() already perform.
+
+Fixes: 043d222f93ab ("tuntap: accept an array of XDP buffs through sendmsg()")
+Reported-by: Xiang Mei <xmei5@asu.edu>
+Signed-off-by: Weiming Shi <bestswngs@gmail.com>
+Reviewed-by: Dongli Zhang <dongli.zhang@oracle.com>
+Reviewed-by: Willem de Bruijn <willemb@google.com>
+Link: https://patch.msgid.link/20260521163312.1479805-2-bestswngs@gmail.com
+Signed-off-by: Jakub Kicinski <kuba@kernel.org>
+(cherry picked from commit aa8963fdce667a42fb7f0bdd2909fadcab02f9a8)
+[Harshit: Backport to 5.15.y/5.10.y, use err instead of ret, no change
+needed]
+Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ drivers/net/tun.c | 1 +
+ 1 file changed, 1 insertion(+)
+
+diff --git a/drivers/net/tun.c b/drivers/net/tun.c
+index 930086d79f97c8..d960b261dbe4f6 100644
+--- a/drivers/net/tun.c
++++ b/drivers/net/tun.c
+@@ -2518,6 +2518,7 @@ static int tun_xdp_one(struct tun_struct *tun,
+ build:
+       skb = build_skb(xdp->data_hard_start, buflen);
+       if (!skb) {
++              put_page(virt_to_head_page(xdp->data));
+               err = -ENOMEM;
+               goto out;
+       }
+-- 
+2.53.0
+
diff --git a/queue-5.15/arm64-tlb-allow-xzr-argument-to-tlbi-ops.patch b/queue-5.15/arm64-tlb-allow-xzr-argument-to-tlbi-ops.patch
new file mode 100644
index 0000000..9545de4
--- /dev/null
+++ b/queue-5.15/arm64-tlb-allow-xzr-argument-to-tlbi-ops.patch
@@ -0,0 +1,62 @@
+From 37b0e9bc17faee182310f527351a872c78eeb9b4 Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Thu, 11 Jun 2026 14:47:05 +0100
+Subject: arm64: tlb: Allow XZR argument to TLBI ops
+
+From: Mark Rutland <mark.rutland@arm.com>
+
+commit bfd9c931d19aa59fb8371d557774fa169b15db9a upstream.
+
+The TLBI instruction accepts XZR as a register argument, and for TLBI
+operations with a register argument, there is no functional difference
+between using XZR or another GPR which contains zeroes. Operations
+without a register argument are encoded as if XZR were used.
+
+Allow the __TLBI_1() macro to use XZR when a register argument is all
+zeroes.
+
+Today this only results in a trivial code saving in
+__do_compat_cache_op()'s workaround for Neoverse-N1 erratum #1542419. In
+subsequent patches this pattern will be used more generally.
+
+There should be no functional change as a result of this patch.
+
+Signed-off-by: Mark Rutland <mark.rutland@arm.com>
+Cc: Catalin Marinas <catalin.marinas@arm.com>
+Cc: Marc Zyngier <maz@kernel.org>
+Cc: Oliver Upton <oupton@kernel.org>
+Cc: Ryan Roberts <ryan.roberts@arm.com>
+Cc: Will Deacon <will@kernel.org>
+Signed-off-by: Will Deacon <will@kernel.org>
+Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+[Mark: Backport to v5.15.y]
+Signed-off-by: Mark Rutland <mark.rutland@arm.com>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ arch/arm64/include/asm/tlbflush.h | 6 +++---
+ 1 file changed, 3 insertions(+), 3 deletions(-)
+
+diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
+index 412a3b9a3c25dc..2626a45849c241 100644
+--- a/arch/arm64/include/asm/tlbflush.h
++++ b/arch/arm64/include/asm/tlbflush.h
+@@ -37,12 +37,12 @@
+                           : : )
+ 
+ #define __TLBI_1(op, arg) asm (ARM64_ASM_PREAMBLE                            \
+-                             "tlbi " #op ", %0\n"                            \
++                             "tlbi " #op ", %x0\n"                           \
+                  ALTERNATIVE("nop\n                   nop",                  \
+-                             "dsb ish\n               tlbi " #op ", %0",     \
++                             "dsb ish\n               tlbi " #op ", %x0",    \
+                              ARM64_WORKAROUND_REPEAT_TLBI,                   \
+                              CONFIG_ARM64_WORKAROUND_REPEAT_TLBI)            \
+-                          : : "r" (arg))
++                          : : "rZ" (arg))
+ 
+ #define __TLBI_N(op, arg, n, ...) __TLBI_##n(op, arg)
+ 
+-- 
+2.53.0
+
diff --git a/queue-5.15/arm64-tlb-optimize-arm64_workaround_repeat_tlbi.patch b/queue-5.15/arm64-tlb-optimize-arm64_workaround_repeat_tlbi.patch

new file mode 100644

index 0000000..c55ae07
--- /dev/null
+++ b/queue-5.15/arm64-tlb-optimize-arm64_workaround_repeat_tlbi.patch
@@ -0,0 +1,380 @@
+From efb74232b2fda57351e04185bf8126663f40329d Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Thu, 11 Jun 2026 14:47:06 +0100
+Subject: arm64: tlb: Optimize ARM64_WORKAROUND_REPEAT_TLBI
+
+From: Mark Rutland <mark.rutland@arm.com>
+
+commit a8f78680ee6bf795086384e8aea159a52814f827 upstream.
+
+The ARM64_WORKAROUND_REPEAT_TLBI workaround is used to mitigate several
+errata where broadcast TLBI;DSB sequences don't provide all the
+architecturally required synchronization. The workaround performs more
+work than necessary, and can have significant overhead. This patch
+optimizes the workaround, as explained below.
+
+The workaround was originally added for Qualcomm Falkor erratum 1009 in
+commit:
+
+  d9ff80f83ecb ("arm64: Work around Falkor erratum 1009")
+
+As noted in the message for that commit, the workaround is applied even
+in cases where it is not strictly necessary.
+
+The workaround was later reused without changes for:
+
+* Arm Cortex-A76 erratum #1286807
+  SDEN v33: https://developer.arm.com/documentation/SDEN-885749/33-0/
+
+* Arm Cortex-A55 erratum #2441007
+  SDEN v16: https://developer.arm.com/documentation/SDEN-859338/1600/
+
+* Arm Cortex-A510 erratum #2441009
+  SDEN v19: https://developer.arm.com/documentation/SDEN-1873351/1900/
+
+The important details to note are as follows:
+
+1. All relevant errata only affect the ordering and/or completion of
+   memory accesses which have been translated by an invalidated TLB
+   entry. The actual invalidation of TLB entries is unaffected.
+
+2. The existing workaround is applied to both broadcast and local TLB
+   invalidation, whereas for all relevant errata it is only necessary to
+   apply a workaround for broadcast invalidation.
+
+3. The existing workaround replaces every TLBI with a TLBI;DSB;TLBI
+   sequence, whereas for all relevant errata it is only necessary to
+   execute a single additional TLBI;DSB sequence after any number of
+   TLBIs are completed by a DSB.
+
+   For example, for a sequence of batched TLBIs:
+
+       TLBI <op1>[, <arg1>]
+       TLBI <op2>[, <arg2>]
+       TLBI <op3>[, <arg3>]
+       DSB ISH
+
+   ... the existing workaround will expand this to:
+
+       TLBI <op1>[, <arg1>]
+       DSB ISH                  // additional
+       TLBI <op1>[, <arg1>]     // additional
+       TLBI <op2>[, <arg2>]
+       DSB ISH                  // additional
+       TLBI <op2>[, <arg2>]     // additional
+       TLBI <op3>[, <arg3>]
+       DSB ISH                  // additional
+       TLBI <op3>[, <arg3>]     // additional
+       DSB ISH
+
+   ... whereas it is sufficient to have:
+
+       TLBI <op1>[, <arg1>]
+       TLBI <op2>[, <arg2>]
+       TLBI <op3>[, <arg3>]
+       DSB ISH
+       TLBI <opX>[, <argX>]     // additional
+       DSB ISH                  // additional
+
+   Using a single additional TLBI and DSB at the end of the sequence can
+   have significantly lower overhead as each DSB which completes a TLBI
+   must synchronize with other PEs in the system, with potential
+   performance effects both locally and system-wide.
+
+4. The existing workaround repeats each specific TLBI operation, whereas
+   for all relevant errata it is sufficient for the additional TLBI to
+   use *any* operation which will be broadcast, regardless of which
+   translation regime or stage of translation the operation applies to.
+
+   For example, for a single TLBI:
+
+       TLBI ALLE2IS
+       DSB ISH
+
+   ... the existing workaround will expand this to:
+
+       TLBI ALLE2IS
+       DSB ISH
+       TLBI ALLE2IS             // additional
+       DSB ISH                  // additional
+
+   ... whereas it is sufficient to have:
+
+       TLBI ALLE2IS
+       DSB ISH
+       TLBI VALE1IS, XZR        // additional
+       DSB ISH                  // additional
+
+   As the additional TLBI doesn't have to match a specific earlier TLBI,
+   the additional TLBI can be implemented in separate code, with no
+   memory of the earlier TLBIs. The additional TLBI can also use a
+   cheaper TLBI operation.
+
+5. The existing workaround is applied to both Stage-1 and Stage-2 TLB
+   invalidation, whereas for all relevant errata it is only necessary to
+   apply a workaround for Stage-1 invalidation.
+
+   Architecturally, TLBI operations which invalidate only Stage-2
+   information (e.g. IPAS2E1IS) are not required to invalidate TLB
+   entries which combine information from Stage-1 and Stage-2
+   translation table entries, and consequently may not complete memory
+   accesses translated by those combined entries. In these cases,
+   completion of memory accesses is only guaranteed after subsequent
+   invalidation of Stage-1 information (e.g. VMALLE1IS).
+
+Taking the above points into account, this patch reworks the workaround
+logic to reduce overhead:
+
+* New __tlbi_sync_s1ish() and __tlbi_sync_s1ish_hyp() functions are
+  added and used in place of any dsb(ish) which is used to complete
+  broadcast Stage-1 TLB maintenance. When the
+  ARM64_WORKAROUND_REPEAT_TLBI workaround is enabled, these helpers will
+  execute an additional TLBI;DSB sequence.
+
+  For consistency, it might make sense to add __tlbi_sync_*() helpers
+  for local and stage 2 maintenance. For now I've left those with
+  open-coded dsb() to keep the diff small.
+
+* The duplication of TLBIs in __TLBI_0() and __TLBI_1() is removed. This
+  is no longer needed as the necessary synchronization will happen in
+  __tlbi_sync_s1ish() or __tlbi_sync_s1ish_hyp().
+
+* The additional TLBI operation is chosen to have minimal impact:
+
+  - __tlbi_sync_s1ish() uses "TLBI VALE1IS, XZR". This is only used at
+    EL1 or at EL2 with {E2H,TGE}=={1,1}, where it will target an unused
+    entry for the reserved ASID in the kernel's own translation regime,
+    and have no adverse effect.
+
+  - __tlbi_sync_s1ish_hyp() uses "TLBI VALE2IS, XZR". This is only used
+    in hyp code, where it will target an unused entry in the hyp code's
+    TTBR0 mapping, and should have no adverse effect.
+
+* As __TLBI_0() and __TLBI_1() no longer replace each TLBI with a
+  TLBI;DSB;TLBI sequence, batching TLBIs is worthwhile, and there's no
+  need for arch_tlbbatch_should_defer() to consider
+  ARM64_WORKAROUND_REPEAT_TLBI.
+
+When building defconfig with GCC 15.1.0, compared to v6.19-rc1, this
+patch saves ~1KiB of text, makes the vmlinux ~42KiB smaller, and makes
+the resulting Image 64KiB smaller:
+
+| [mark@lakrids:~/src/linux]% size vmlinux-*
+|    text    data     bss     dec     hex filename
+| 21179831        19660919         708216 41548966        279fca6 vmlinux-after
+| 21181075        19660903         708216 41550194        27a0172 vmlinux-before
+| [mark@lakrids:~/src/linux]% ls -l vmlinux-*
+| -rwxr-xr-x 1 mark mark 157771472 Feb  4 12:05 vmlinux-after
+| -rwxr-xr-x 1 mark mark 157815432 Feb  4 12:05 vmlinux-before
+| [mark@lakrids:~/src/linux]% ls -l Image-*
+| -rw-r--r-- 1 mark mark 41007616 Feb  4 12:05 Image-after
+| -rw-r--r-- 1 mark mark 41073152 Feb  4 12:05 Image-before
+
+Signed-off-by: Mark Rutland <mark.rutland@arm.com>
+Cc: Catalin Marinas <catalin.marinas@arm.com>
+Cc: Marc Zyngier <maz@kernel.org>
+Cc: Oliver Upton <oupton@kernel.org>
+Cc: Ryan Roberts <ryan.roberts@arm.com>
+Cc: Will Deacon <will@kernel.org>
+Signed-off-by: Will Deacon <will@kernel.org>
+Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+[Mark: Backport to v5.15.y; use inline ALTERNATIVE() sequence]
+Signed-off-by: Mark Rutland <mark.rutland@arm.com>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ arch/arm64/include/asm/tlbflush.h | 51 ++++++++++++++++++++++---------
+ arch/arm64/kernel/sys_compat.c    |  2 +-
+ arch/arm64/kvm/hyp/nvhe/tlb.c     |  6 ++--
+ arch/arm64/kvm/hyp/vhe/tlb.c      |  6 ++--
+ 4 files changed, 44 insertions(+), 21 deletions(-)
+
+diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
+index 2626a45849c241..cc6e47172f8fa7 100644
+--- a/arch/arm64/include/asm/tlbflush.h
++++ b/arch/arm64/include/asm/tlbflush.h
+@@ -30,18 +30,10 @@
+  */
+ #define __TLBI_0(op, arg) asm (ARM64_ASM_PREAMBLE                            \
+                              "tlbi " #op "\n"                                \
+-                 ALTERNATIVE("nop\n                   nop",                  \
+-                             "dsb ish\n               tlbi " #op,            \
+-                             ARM64_WORKAROUND_REPEAT_TLBI,                   \
+-                             CONFIG_ARM64_WORKAROUND_REPEAT_TLBI)            \
+                           : : )
+ 
+ #define __TLBI_1(op, arg) asm (ARM64_ASM_PREAMBLE                            \
+                              "tlbi " #op ", %x0\n"                           \
+-                 ALTERNATIVE("nop\n                   nop",                  \
+-                             "dsb ish\n               tlbi " #op ", %x0",    \
+-                             ARM64_WORKAROUND_REPEAT_TLBI,                   \
+-                             CONFIG_ARM64_WORKAROUND_REPEAT_TLBI)            \
+                           : : "rZ" (arg))
+ 
+ #define __TLBI_N(op, arg, n, ...) __TLBI_##n(op, arg)
+@@ -158,6 +150,37 @@ static inline unsigned long get_trans_granule(void)
+ #define __TLBI_RANGE_NUM(pages, scale)        \
+       ((((pages) >> (5 * (scale) + 1)) & TLBI_RANGE_MASK) - 1)
+ 
++#define __repeat_tlbi_sync(op, arg)                                           \
++do {                                                                          \
++      asm volatile(                                                           \
++      ALTERNATIVE("nop\n                      nop",                           \
++                  "tlbi " #op ", %x0\n        dsb ish",                       \
++                  ARM64_WORKAROUND_REPEAT_TLBI,                               \
++                  CONFIG_ARM64_WORKAROUND_REPEAT_TLBI)                        \
++      :                                                                       \
++      : "rZ" (arg));                                                          \
++} while (0)
++
++/*
++ * Complete broadcast TLB maintenance issued by the host which invalidates
++ * stage 1 information in the host's own translation regime.
++ */
++static inline void __tlbi_sync_s1ish(void)
++{
++      dsb(ish);
++      __repeat_tlbi_sync(vale1is, 0);
++}
++
++/*
++ * Complete broadcast TLB maintenance issued by hyp code which invalidates
++ * stage 1 translation information in any translation regime.
++ */
++static inline void __tlbi_sync_s1ish_hyp(void)
++{
++      dsb(ish);
++      __repeat_tlbi_sync(vale2is, 0);
++}
++
+ /*
+  *    TLB Invalidation
+  *    ================
+@@ -239,7 +262,7 @@ static inline void flush_tlb_all(void)
+ {
+       dsb(ishst);
+       __tlbi(vmalle1is);
+-      dsb(ish);
++      __tlbi_sync_s1ish();
+       isb();
+ }
+ 
+@@ -251,7 +274,7 @@ static inline void flush_tlb_mm(struct mm_struct *mm)
+       asid = __TLBI_VADDR(0, ASID(mm));
+       __tlbi(aside1is, asid);
+       __tlbi_user(aside1is, asid);
+-      dsb(ish);
++      __tlbi_sync_s1ish();
+ }
+ 
+ static inline void flush_tlb_page_nosync(struct vm_area_struct *vma,
+@@ -269,7 +292,7 @@ static inline void flush_tlb_page(struct vm_area_struct *vma,
+                                 unsigned long uaddr)
+ {
+       flush_tlb_page_nosync(vma, uaddr);
+-      dsb(ish);
++      __tlbi_sync_s1ish();
+ }
+ 
+ /*
+@@ -357,7 +380,7 @@ static inline void __flush_tlb_range(struct vm_area_struct *vma,
+               }
+               scale++;
+       }
+-      dsb(ish);
++      __tlbi_sync_s1ish();
+ }
+ 
+ static inline void flush_tlb_range(struct vm_area_struct *vma,
+@@ -386,7 +409,7 @@ static inline void flush_tlb_kernel_range(unsigned long start, unsigned long end
+       dsb(ishst);
+       for (addr = start; addr < end; addr += 1 << (PAGE_SHIFT - 12))
+               __tlbi(vaale1is, addr);
+-      dsb(ish);
++      __tlbi_sync_s1ish();
+       isb();
+ }
+ 
+@@ -400,7 +423,7 @@ static inline void __flush_tlb_kernel_pgtable(unsigned long kaddr)
+ 
+       dsb(ishst);
+       __tlbi(vaae1is, addr);
+-      dsb(ish);
++      __tlbi_sync_s1ish();
+       isb();
+ }
+ #endif
+diff --git a/arch/arm64/kernel/sys_compat.c b/arch/arm64/kernel/sys_compat.c
+index b88a52f7188fcc..416195f376816b 100644
+--- a/arch/arm64/kernel/sys_compat.c
++++ b/arch/arm64/kernel/sys_compat.c
+@@ -38,7 +38,7 @@ __do_compat_cache_op(unsigned long start, unsigned long end)
+                        * We pick the reserved-ASID to minimise the impact.
+                        */
+                       __tlbi(aside1is, __TLBI_VADDR(0, 0));
+-                      dsb(ish);
++                      __tlbi_sync_s1ish();
+               }
+ 
+               ret = caches_clean_inval_user_pou(start, start + chunk);
+diff --git a/arch/arm64/kvm/hyp/nvhe/tlb.c b/arch/arm64/kvm/hyp/nvhe/tlb.c
+index 291789df24e3ee..76973e3b48a076 100644
+--- a/arch/arm64/kvm/hyp/nvhe/tlb.c
++++ b/arch/arm64/kvm/hyp/nvhe/tlb.c
+@@ -81,7 +81,7 @@ void __kvm_tlb_flush_vmid_ipa(struct kvm_s2_mmu *mmu,
+        */
+       dsb(ish);
+       __tlbi(vmalle1is);
+-      dsb(ish);
++      __tlbi_sync_s1ish_hyp();
+       isb();
+ 
+       __tlb_switch_to_host(&cxt);
+@@ -97,7 +97,7 @@ void __kvm_tlb_flush_vmid(struct kvm_s2_mmu *mmu)
+       __tlb_switch_to_guest(mmu, &cxt);
+ 
+       __tlbi(vmalls12e1is);
+-      dsb(ish);
++      __tlbi_sync_s1ish_hyp();
+       isb();
+ 
+       __tlb_switch_to_host(&cxt);
+@@ -122,5 +122,5 @@ void __kvm_flush_vm_context(void)
+ {
+       dsb(ishst);
+       __tlbi(alle1is);
+-      dsb(ish);
++      __tlbi_sync_s1ish_hyp();
+ }
+diff --git a/arch/arm64/kvm/hyp/vhe/tlb.c b/arch/arm64/kvm/hyp/vhe/tlb.c
+index fc3fcd29ccc306..59aa22b48e9538 100644
+--- a/arch/arm64/kvm/hyp/vhe/tlb.c
++++ b/arch/arm64/kvm/hyp/vhe/tlb.c
+@@ -105,7 +105,7 @@ void __kvm_tlb_flush_vmid_ipa(struct kvm_s2_mmu *mmu,
+        */
+       dsb(ish);
+       __tlbi(vmalle1is);
+-      dsb(ish);
++      __tlbi_sync_s1ish_hyp();
+       isb();
+ 
+       __tlb_switch_to_host(&cxt);
+@@ -121,7 +121,7 @@ void __kvm_tlb_flush_vmid(struct kvm_s2_mmu *mmu)
+       __tlb_switch_to_guest(mmu, &cxt);
+ 
+       __tlbi(vmalls12e1is);
+-      dsb(ish);
++      __tlbi_sync_s1ish_hyp();
+       isb();
+ 
+       __tlb_switch_to_host(&cxt);
+@@ -146,5 +146,5 @@ void __kvm_flush_vm_context(void)
+ {
+       dsb(ishst);
+       __tlbi(alle1is);
+-      dsb(ish);
++      __tlbi_sync_s1ish_hyp();
+ }
+-- 
+2.53.0
+
diff --git a/queue-5.15/kvm-arm64-remove-vpipt-i-cache-handling.patch b/queue-5.15/kvm-arm64-remove-vpipt-i-cache-handling.patch

new file mode 100644

index 0000000..f4710e1
--- /dev/null
+++ b/queue-5.15/kvm-arm64-remove-vpipt-i-cache-handling.patch
@@ -0,0 +1,120 @@
+From 1e70ed7fdcf4977c02fdf405beabfdea2fb7e110 Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Thu, 11 Jun 2026 14:47:04 +0100
+Subject: KVM: arm64: Remove VPIPT I-cache handling
+
+From: Marc Zyngier <maz@kernel.org>
+
+commit ced242ba9d7cb3571f6e0f165f643cb832d52148 upstream.
+
+We have some special handling for VPIPT I-cache in critical parts
+of the cache and TLB maintenance. Remove it.
+
+Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
+Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
+Signed-off-by: Marc Zyngier <maz@kernel.org>
+Acked-by: Mark Rutland <mark.rutland@arm.com>
+Link: https://lore.kernel.org/r/20231204143606.1806432-2-maz@kernel.org
+Signed-off-by: Will Deacon <will@kernel.org>
+[Mark: Backport to v5.15.y. VPIPT HW was never built; this is all dead code]
+Signed-off-by: Mark Rutland <mark.rutland@arm.com>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ arch/arm64/include/asm/kvm_mmu.h |  4 ++--
+ arch/arm64/kvm/hyp/nvhe/tlb.c    | 35 --------------------------------
+ arch/arm64/kvm/hyp/vhe/tlb.c     | 13 ------------
+ 3 files changed, 2 insertions(+), 50 deletions(-)
+
+diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
+index 02d37888774383..1733eb87c29fbb 100644
+--- a/arch/arm64/include/asm/kvm_mmu.h
++++ b/arch/arm64/include/asm/kvm_mmu.h
+@@ -207,8 +207,8 @@ static inline void __invalidate_icache_guest_page(void *va, size_t size)
+       if (icache_is_aliasing()) {
+               /* any kind of VIPT cache */
+               icache_inval_all_pou();
+-      } else if (is_kernel_in_hyp_mode() || !icache_is_vpipt()) {
+-              /* PIPT or VPIPT at EL2 (see comment in __kvm_tlb_flush_vmid_ipa) */
++      } else {
++              /* PIPT */
+               icache_inval_pou((unsigned long)va, (unsigned long)va + size);
+       }
+ }
+diff --git a/arch/arm64/kvm/hyp/nvhe/tlb.c b/arch/arm64/kvm/hyp/nvhe/tlb.c
+index d296d617f58963..291789df24e3ee 100644
+--- a/arch/arm64/kvm/hyp/nvhe/tlb.c
++++ b/arch/arm64/kvm/hyp/nvhe/tlb.c
+@@ -84,28 +84,6 @@ void __kvm_tlb_flush_vmid_ipa(struct kvm_s2_mmu *mmu,
+       dsb(ish);
+       isb();
+ 
+-      /*
+-       * If the host is running at EL1 and we have a VPIPT I-cache,
+-       * then we must perform I-cache maintenance at EL2 in order for
+-       * it to have an effect on the guest. Since the guest cannot hit
+-       * I-cache lines allocated with a different VMID, we don't need
+-       * to worry about junk out of guest reset (we nuke the I-cache on
+-       * VMID rollover), but we do need to be careful when remapping
+-       * executable pages for the same guest. This can happen when KSM
+-       * takes a CoW fault on an executable page, copies the page into
+-       * a page that was previously mapped in the guest and then needs
+-       * to invalidate the guest view of the I-cache for that page
+-       * from EL1. To solve this, we invalidate the entire I-cache when
+-       * unmapping a page from a guest if we have a VPIPT I-cache but
+-       * the host is running at EL1. As above, we could do better if
+-       * we had the VA.
+-       *
+-       * The moral of this story is: if you have a VPIPT I-cache, then
+-       * you should be running with VHE enabled.
+-       */
+-      if (icache_is_vpipt())
+-              icache_inval_all_pou();
+-
+       __tlb_switch_to_host(&cxt);
+ }
+ 
+@@ -144,18 +122,5 @@ void __kvm_flush_vm_context(void)
+ {
+       dsb(ishst);
+       __tlbi(alle1is);
+-
+-      /*
+-       * VIPT and PIPT caches are not affected by VMID, so no maintenance
+-       * is necessary across a VMID rollover.
+-       *
+-       * VPIPT caches constrain lookup and maintenance to the active VMID,
+-       * so we need to invalidate lines with a stale VMID to avoid an ABA
+-       * race after multiple rollovers.
+-       *
+-       */
+-      if (icache_is_vpipt())
+-              asm volatile("ic ialluis");
+-
+       dsb(ish);
+ }
+diff --git a/arch/arm64/kvm/hyp/vhe/tlb.c b/arch/arm64/kvm/hyp/vhe/tlb.c
+index 24cef9b87f9e9c..fc3fcd29ccc306 100644
+--- a/arch/arm64/kvm/hyp/vhe/tlb.c
++++ b/arch/arm64/kvm/hyp/vhe/tlb.c
+@@ -146,18 +146,5 @@ void __kvm_flush_vm_context(void)
+ {
+       dsb(ishst);
+       __tlbi(alle1is);
+-
+-      /*
+-       * VIPT and PIPT caches are not affected by VMID, so no maintenance
+-       * is necessary across a VMID rollover.
+-       *
+-       * VPIPT caches constrain lookup and maintenance to the active VMID,
+-       * so we need to invalidate lines with a stale VMID to avoid an ABA
+-       * race after multiple rollovers.
+-       *
+-       */
+-      if (icache_is_vpipt())
+-              asm volatile("ic ialluis");
+-
+       dsb(ish);
+ }
+-- 
+2.53.0
+
diff --git a/queue-5.15/series b/queue-5.15/series

index c438df797be62a42c137248ed4705ba7d00797a3..d4012ed00db189ec9d21a568e76a8c4a9192ec82 100644
--- a/queue-5.15/series
+++ b/queue-5.15/series
@@ -174,3 +174,8 @@ time-fix-off-by-one-in-settimeofday-usec-validation.patch
  ext4-validate-p_idx-bounds-in-ext4_ext_correct_index.patch
  fs-ntfs3-return-error-for-inconsistent-extended-attr.patch
  nfsd-don-t-ignore-the-return-code-of-svc_proc_regist.patch
+tap-free-page-on-error-paths-in-tap_get_user_xdp.patch
+tun-free-page-on-build_skb-failure-in-tun_xdp_one.patch
+kvm-arm64-remove-vpipt-i-cache-handling.patch
+arm64-tlb-allow-xzr-argument-to-tlbi-ops.patch
+arm64-tlb-optimize-arm64_workaround_repeat_tlbi.patch
diff --git a/queue-5.15/tap-free-page-on-error-paths-in-tap_get_user_xdp.patch b/queue-5.15/tap-free-page-on-error-paths-in-tap_get_user_xdp.patch

new file mode 100644

index 0000000..37aa092
--- /dev/null
+++ b/queue-5.15/tap-free-page-on-error-paths-in-tap_get_user_xdp.patch
@@ -0,0 +1,57 @@
+From d8b319de2dcbcae9b9da8f41f4f38c1e7c3435e5 Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Thu, 11 Jun 2026 06:21:06 -0700
+Subject: tap: free page on error paths in tap_get_user_xdp()
+
+From: Weiming Shi <bestswngs@gmail.com>
+
+[ Upstream commit 3bcf7aec6a9d16438f2cec29f5d7c8d5b8edf9b2 ]
+
+tap_get_user_xdp() rejects a frame shorter than ETH_HLEN with -EINVAL,
+and returns -ENOMEM when build_skb() fails. Both paths jump to the err
+label without freeing the page that vhost_net_build_xdp() allocated for
+the frame. tap_sendmsg() discards the per-buffer return value and always
+returns 0, so vhost_tx_batch() takes the success path and never frees
+the page; each rejected frame in a batch leaks one page-frag chunk.
+
+Free the page on both error paths, before the skb is built. This is the
+tap counterpart of the same leak in tun_xdp_one().
+
+Fixes: 0efac27791ee ("tap: accept an array of XDP buffs through sendmsg()")
+Fixes: ed7f2afdd0e0 ("tap: add missing verification for short frame")
+Reported-by: Xiang Mei <xmei5@asu.edu>
+Signed-off-by: Weiming Shi <bestswngs@gmail.com>
+Reviewed-by: Dongli Zhang <dongli.zhang@oracle.com>
+Reviewed-by: Willem de Bruijn <willemb@google.com>
+Link: https://patch.msgid.link/20260521163230.1478627-2-bestswngs@gmail.com
+Signed-off-by: Jakub Kicinski <kuba@kernel.org>
+(cherry picked from commit 3bcf7aec6a9d16438f2cec29f5d7c8d5b8edf9b2)
+Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ drivers/net/tap.c | 2 ++
+ 1 file changed, 2 insertions(+)
+
+diff --git a/drivers/net/tap.c b/drivers/net/tap.c
+index a08adca412b41a..3a91972485cc36 100644
+--- a/drivers/net/tap.c
++++ b/drivers/net/tap.c
+@@ -1143,6 +1143,7 @@ static int tap_get_user_xdp(struct tap_queue *q, struct xdp_buff *xdp)
+       int err, depth;
+ 
+       if (unlikely(xdp->data_end - xdp->data < ETH_HLEN)) {
++              put_page(virt_to_head_page(xdp->data));
+               err = -EINVAL;
+               goto err;
+       }
+@@ -1152,6 +1153,7 @@ static int tap_get_user_xdp(struct tap_queue *q, struct xdp_buff *xdp)
+ 
+       skb = build_skb(xdp->data_hard_start, buflen);
+       if (!skb) {
++              put_page(virt_to_head_page(xdp->data));
+               err = -ENOMEM;
+               goto err;
+       }
+-- 
+2.53.0
+
diff --git a/queue-5.15/tun-free-page-on-build_skb-failure-in-tun_xdp_one.patch b/queue-5.15/tun-free-page-on-build_skb-failure-in-tun_xdp_one.patch

new file mode 100644

index 0000000..96adc3a
--- /dev/null
+++ b/queue-5.15/tun-free-page-on-build_skb-failure-in-tun_xdp_one.patch
@@ -0,0 +1,51 @@
+From 66f41f99118179e2e9f3ffb6c5c27e84bee8319e Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Thu, 11 Jun 2026 06:28:53 -0700
+Subject: tun: free page on build_skb failure in tun_xdp_one()
+
+From: Weiming Shi <bestswngs@gmail.com>
+
+[ Upstream commit aa8963fdce667a42fb7f0bdd2909fadcab02f9a8 ]
+
+When build_skb() fails in tun_xdp_one(), the function sets ret to
+-ENOMEM and jumps to the out label, which returns without freeing the
+page that vhost_net_build_xdp() allocated for the frame. As with the
+short-frame rejection path, tun_sendmsg() discards the per-buffer error
+and still returns total_len, so vhost_tx_batch() takes the success path
+and never frees the page. Each build_skb() failure in a batch leaks one
+page-frag chunk.
+
+Free the page before taking the error path, matching the put_page() the
+other error exits of tun_xdp_one() already perform.
+
+Fixes: 043d222f93ab ("tuntap: accept an array of XDP buffs through sendmsg()")
+Reported-by: Xiang Mei <xmei5@asu.edu>
+Signed-off-by: Weiming Shi <bestswngs@gmail.com>
+Reviewed-by: Dongli Zhang <dongli.zhang@oracle.com>
+Reviewed-by: Willem de Bruijn <willemb@google.com>
+Link: https://patch.msgid.link/20260521163312.1479805-2-bestswngs@gmail.com
+Signed-off-by: Jakub Kicinski <kuba@kernel.org>
+(cherry picked from commit aa8963fdce667a42fb7f0bdd2909fadcab02f9a8)
+[Harshit: Backport to 5.15.y/5.10.y, use err instead of ret, no change
+needed]
+Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ drivers/net/tun.c | 1 +
+ 1 file changed, 1 insertion(+)
+
+diff --git a/drivers/net/tun.c b/drivers/net/tun.c
+index 803cb4722dbf4a..aad0760c8d92b7 100644
+--- a/drivers/net/tun.c
++++ b/drivers/net/tun.c
+@@ -2468,6 +2468,7 @@ static int tun_xdp_one(struct tun_struct *tun,
+ build:
+       skb = build_skb(xdp->data_hard_start, buflen);
+       if (!skb) {
++              put_page(virt_to_head_page(xdp->data));
+               err = -ENOMEM;
+               goto out;
+       }
+-- 
+2.53.0
+
diff --git a/queue-6.1/arm64-tlb-allow-xzr-argument-to-tlbi-ops.patch b/queue-6.1/arm64-tlb-allow-xzr-argument-to-tlbi-ops.patch

new file mode 100644

index 0000000..d5731a2
--- /dev/null
+++ b/queue-6.1/arm64-tlb-allow-xzr-argument-to-tlbi-ops.patch
@@ -0,0 +1,62 @@
+From 466eb0f0987d894af36ef6ed03fe1f73b3448bea Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Thu, 11 Jun 2026 14:44:50 +0100
+Subject: arm64: tlb: Allow XZR argument to TLBI ops
+
+From: Mark Rutland <mark.rutland@arm.com>
+
+commit bfd9c931d19aa59fb8371d557774fa169b15db9a upstream.
+
+The TLBI instruction accepts XZR as a register argument, and for TLBI
+operations with a register argument, there is no functional difference
+between using XZR or another GPR which contains zeroes. Operations
+without a register argument are encoded as if XZR were used.
+
+Allow the __TLBI_1() macro to use XZR when a register argument is all
+zeroes.
+
+Today this only results in a trivial code saving in
+__do_compat_cache_op()'s workaround for Neoverse-N1 erratum #1542419. In
+subsequent patches this pattern will be used more generally.
+
+There should be no functional change as a result of this patch.
+
+Signed-off-by: Mark Rutland <mark.rutland@arm.com>
+Cc: Catalin Marinas <catalin.marinas@arm.com>
+Cc: Marc Zyngier <maz@kernel.org>
+Cc: Oliver Upton <oupton@kernel.org>
+Cc: Ryan Roberts <ryan.roberts@arm.com>
+Cc: Will Deacon <will@kernel.org>
+Signed-off-by: Will Deacon <will@kernel.org>
+Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+[Mark: Backport to v6.1.y]
+Signed-off-by: Mark Rutland <mark.rutland@arm.com>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ arch/arm64/include/asm/tlbflush.h | 6 +++---
+ 1 file changed, 3 insertions(+), 3 deletions(-)
+
+diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
+index 412a3b9a3c25dc..2626a45849c241 100644
+--- a/arch/arm64/include/asm/tlbflush.h
++++ b/arch/arm64/include/asm/tlbflush.h
+@@ -37,12 +37,12 @@
+                           : : )
+ 
+ #define __TLBI_1(op, arg) asm (ARM64_ASM_PREAMBLE                            \
+-                             "tlbi " #op ", %0\n"                            \
++                             "tlbi " #op ", %x0\n"                           \
+                  ALTERNATIVE("nop\n                   nop",                  \
+-                             "dsb ish\n               tlbi " #op ", %0",     \
++                             "dsb ish\n               tlbi " #op ", %x0",    \
+                              ARM64_WORKAROUND_REPEAT_TLBI,                   \
+                              CONFIG_ARM64_WORKAROUND_REPEAT_TLBI)            \
+-                          : : "r" (arg))
++                          : : "rZ" (arg))
+ 
+ #define __TLBI_N(op, arg, n, ...) __TLBI_##n(op, arg)
+ 
+-- 
+2.53.0
+
diff --git a/queue-6.1/arm64-tlb-optimize-arm64_workaround_repeat_tlbi.patch b/queue-6.1/arm64-tlb-optimize-arm64_workaround_repeat_tlbi.patch

new file mode 100644

index 0000000..e1cd052
--- /dev/null
+++ b/queue-6.1/arm64-tlb-optimize-arm64_workaround_repeat_tlbi.patch
@@ -0,0 +1,391 @@
+From abeb0cd888fa1d94cd041af9dc8cdf78986cd652 Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Thu, 11 Jun 2026 14:44:51 +0100
+Subject: arm64: tlb: Optimize ARM64_WORKAROUND_REPEAT_TLBI
+
+From: Mark Rutland <mark.rutland@arm.com>
+
+commit a8f78680ee6bf795086384e8aea159a52814f827 upstream.
+
+The ARM64_WORKAROUND_REPEAT_TLBI workaround is used to mitigate several
+errata where broadcast TLBI;DSB sequences don't provide all the
+architecturally required synchronization. The workaround performs more
+work than necessary, and can have significant overhead. This patch
+optimizes the workaround, as explained below.
+
+The workaround was originally added for Qualcomm Falkor erratum 1009 in
+commit:
+
+  d9ff80f83ecb ("arm64: Work around Falkor erratum 1009")
+
+As noted in the message for that commit, the workaround is applied even
+in cases where it is not strictly necessary.
+
+The workaround was later reused without changes for:
+
+* Arm Cortex-A76 erratum #1286807
+  SDEN v33: https://developer.arm.com/documentation/SDEN-885749/33-0/
+
+* Arm Cortex-A55 erratum #2441007
+  SDEN v16: https://developer.arm.com/documentation/SDEN-859338/1600/
+
+* Arm Cortex-A510 erratum #2441009
+  SDEN v19: https://developer.arm.com/documentation/SDEN-1873351/1900/
+
+The important details to note are as follows:
+
+1. All relevant errata only affect the ordering and/or completion of
+   memory accesses which have been translated by an invalidated TLB
+   entry. The actual invalidation of TLB entries is unaffected.
+
+2. The existing workaround is applied to both broadcast and local TLB
+   invalidation, whereas for all relevant errata it is only necessary to
+   apply a workaround for broadcast invalidation.
+
+3. The existing workaround replaces every TLBI with a TLBI;DSB;TLBI
+   sequence, whereas for all relevant errata it is only necessary to
+   execute a single additional TLBI;DSB sequence after any number of
+   TLBIs are completed by a DSB.
+
+   For example, for a sequence of batched TLBIs:
+
+       TLBI <op1>[, <arg1>]
+       TLBI <op2>[, <arg2>]
+       TLBI <op3>[, <arg3>]
+       DSB ISH
+
+   ... the existing workaround will expand this to:
+
+       TLBI <op1>[, <arg1>]
+       DSB ISH                  // additional
+       TLBI <op1>[, <arg1>]     // additional
+       TLBI <op2>[, <arg2>]
+       DSB ISH                  // additional
+       TLBI <op2>[, <arg2>]     // additional
+       TLBI <op3>[, <arg3>]
+       DSB ISH                  // additional
+       TLBI <op3>[, <arg3>]     // additional
+       DSB ISH
+
+   ... whereas it is sufficient to have:
+
+       TLBI <op1>[, <arg1>]
+       TLBI <op2>[, <arg2>]
+       TLBI <op3>[, <arg3>]
+       DSB ISH
+       TLBI <opX>[, <argX>]     // additional
+       DSB ISH                  // additional
+
+   Using a single additional TLBI and DSB at the end of the sequence can
+   have significantly lower overhead as each DSB which completes a TLBI
+   must synchronize with other PEs in the system, with potential
+   performance effects both locally and system-wide.
+
+4. The existing workaround repeats each specific TLBI operation, whereas
+   for all relevant errata it is sufficient for the additional TLBI to
+   use *any* operation which will be broadcast, regardless of which
+   translation regime or stage of translation the operation applies to.
+
+   For example, for a single TLBI:
+
+       TLBI ALLE2IS
+       DSB ISH
+
+   ... the existing workaround will expand this to:
+
+       TLBI ALLE2IS
+       DSB ISH
+       TLBI ALLE2IS             // additional
+       DSB ISH                  // additional
+
+   ... whereas it is sufficient to have:
+
+       TLBI ALLE2IS
+       DSB ISH
+       TLBI VALE1IS, XZR        // additional
+       DSB ISH                  // additional
+
+   As the additional TLBI doesn't have to match a specific earlier TLBI,
+   the additional TLBI can be implemented in separate code, with no
+   memory of the earlier TLBIs. The additional TLBI can also use a
+   cheaper TLBI operation.
+
+5. The existing workaround is applied to both Stage-1 and Stage-2 TLB
+   invalidation, whereas for all relevant errata it is only necessary to
+   apply a workaround for Stage-1 invalidation.
+
+   Architecturally, TLBI operations which invalidate only Stage-2
+   information (e.g. IPAS2E1IS) are not required to invalidate TLB
+   entries which combine information from Stage-1 and Stage-2
+   translation table entries, and consequently may not complete memory
+   accesses translated by those combined entries. In these cases,
+   completion of memory accesses is only guaranteed after subsequent
+   invalidation of Stage-1 information (e.g. VMALLE1IS).
+
+Taking the above points into account, this patch reworks the workaround
+logic to reduce overhead:
+
+* New __tlbi_sync_s1ish() and __tlbi_sync_s1ish_hyp() functions are
+  added and used in place of any dsb(ish) which is used to complete
+  broadcast Stage-1 TLB maintenance. When the
+  ARM64_WORKAROUND_REPEAT_TLBI workaround is enabled, these helpers will
+  execute an additional TLBI;DSB sequence.
+
+  For consistency, it might make sense to add __tlbi_sync_*() helpers
+  for local and stage 2 maintenance. For now I've left those with
+  open-coded dsb() to keep the diff small.
+
+* The duplication of TLBIs in __TLBI_0() and __TLBI_1() is removed. This
+  is no longer needed as the necessary synchronization will happen in
+  __tlbi_sync_s1ish() or __tlbi_sync_s1ish_hyp().
+
+* The additional TLBI operation is chosen to have minimal impact:
+
+  - __tlbi_sync_s1ish() uses "TLBI VALE1IS, XZR". This is only used at
+    EL1 or at EL2 with {E2H,TGE}=={1,1}, where it will target an unused
+    entry for the reserved ASID in the kernel's own translation regime,
+    and have no adverse effect.
+
+  - __tlbi_sync_s1ish_hyp() uses "TLBI VALE2IS, XZR". This is only used
+    in hyp code, where it will target an unused entry in the hyp code's
+    TTBR0 mapping, and should have no adverse effect.
+
+* As __TLBI_0() and __TLBI_1() no longer replace each TLBI with a
+  TLBI;DSB;TLBI sequence, batching TLBIs is worthwhile, and there's no
+  need for arch_tlbbatch_should_defer() to consider
+  ARM64_WORKAROUND_REPEAT_TLBI.
+
+When building defconfig with GCC 15.1.0, compared to v6.19-rc1, this
+patch saves ~1KiB of text, makes the vmlinux ~42KiB smaller, and makes
+the resulting Image 64KiB smaller:
+
+| [mark@lakrids:~/src/linux]% size vmlinux-*
+|     text     data    bss      dec     hex filename
+| 21179831 19660919 708216 41548966 279fca6 vmlinux-after
+| 21181075 19660903 708216 41550194 27a0172 vmlinux-before
+| [mark@lakrids:~/src/linux]% ls -l vmlinux-*
+| -rwxr-xr-x 1 mark mark 157771472 Feb  4 12:05 vmlinux-after
+| -rwxr-xr-x 1 mark mark 157815432 Feb  4 12:05 vmlinux-before
+| [mark@lakrids:~/src/linux]% ls -l Image-*
+| -rw-r--r-- 1 mark mark 41007616 Feb  4 12:05 Image-after
+| -rw-r--r-- 1 mark mark 41073152 Feb  4 12:05 Image-before
+
+Signed-off-by: Mark Rutland <mark.rutland@arm.com>
+Cc: Catalin Marinas <catalin.marinas@arm.com>
+Cc: Marc Zyngier <maz@kernel.org>
+Cc: Oliver Upton <oupton@kernel.org>
+Cc: Ryan Roberts <ryan.roberts@arm.com>
+Cc: Will Deacon <will@kernel.org>
+Signed-off-by: Will Deacon <will@kernel.org>
+Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+[Mark: Backport to v6.1.y]
+Signed-off-by: Mark Rutland <mark.rutland@arm.com>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ arch/arm64/include/asm/tlbflush.h | 48 ++++++++++++++++++++++---------
+ arch/arm64/kernel/sys_compat.c    |  2 +-
+ arch/arm64/kvm/hyp/nvhe/tlb.c     |  6 ++--
+ arch/arm64/kvm/hyp/pgtable.c      |  2 +-
+ arch/arm64/kvm/hyp/vhe/tlb.c      |  6 ++--
+ 5 files changed, 42 insertions(+), 22 deletions(-)
+
+diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
+index 2626a45849c241..289c3948d5b08a 100644
+--- a/arch/arm64/include/asm/tlbflush.h
++++ b/arch/arm64/include/asm/tlbflush.h
+@@ -30,18 +30,10 @@
+  */
+ #define __TLBI_0(op, arg) asm (ARM64_ASM_PREAMBLE                            \
+                              "tlbi " #op "\n"                                \
+-                 ALTERNATIVE("nop\n                   nop",                  \
+-                             "dsb ish\n               tlbi " #op,            \
+-                             ARM64_WORKAROUND_REPEAT_TLBI,                   \
+-                             CONFIG_ARM64_WORKAROUND_REPEAT_TLBI)            \
+                           : : )
+ 
+ #define __TLBI_1(op, arg) asm (ARM64_ASM_PREAMBLE                            \
+                              "tlbi " #op ", %x0\n"                           \
+-                 ALTERNATIVE("nop\n                   nop",                  \
+-                             "dsb ish\n               tlbi " #op ", %x0",    \
+-                             ARM64_WORKAROUND_REPEAT_TLBI,                   \
+-                             CONFIG_ARM64_WORKAROUND_REPEAT_TLBI)            \
+                           : : "rZ" (arg))
+ 
+ #define __TLBI_N(op, arg, n, ...) __TLBI_##n(op, arg)
+@@ -158,6 +150,34 @@ static inline unsigned long get_trans_granule(void)
+ #define __TLBI_RANGE_NUM(pages, scale)        \
+       ((((pages) >> (5 * (scale) + 1)) & TLBI_RANGE_MASK) - 1)
+ 
++#define __repeat_tlbi_sync(op, arg...)                                                \
++do {                                                                          \
++      if (!alternative_has_feature_unlikely(ARM64_WORKAROUND_REPEAT_TLBI))    \
++              break;                                                          \
++      __tlbi(op, ##arg);                                                      \
++      dsb(ish);                                                               \
++} while (0)
++
++/*
++ * Complete broadcast TLB maintenance issued by the host which invalidates
++ * stage 1 information in the host's own translation regime.
++ */
++static inline void __tlbi_sync_s1ish(void)
++{
++      dsb(ish);
++      __repeat_tlbi_sync(vale1is, 0);
++}
++
++/*
++ * Complete broadcast TLB maintenance issued by hyp code which invalidates
++ * stage 1 translation information in any translation regime.
++ */
++static inline void __tlbi_sync_s1ish_hyp(void)
++{
++      dsb(ish);
++      __repeat_tlbi_sync(vale2is, 0);
++}
++
+ /*
+  *    TLB Invalidation
+  *    ================
+@@ -239,7 +259,7 @@ static inline void flush_tlb_all(void)
+ {
+       dsb(ishst);
+       __tlbi(vmalle1is);
+-      dsb(ish);
++      __tlbi_sync_s1ish();
+       isb();
+ }
+ 
+@@ -251,7 +271,7 @@ static inline void flush_tlb_mm(struct mm_struct *mm)
+       asid = __TLBI_VADDR(0, ASID(mm));
+       __tlbi(aside1is, asid);
+       __tlbi_user(aside1is, asid);
+-      dsb(ish);
++      __tlbi_sync_s1ish();
+ }
+ 
+ static inline void flush_tlb_page_nosync(struct vm_area_struct *vma,
+@@ -269,7 +289,7 @@ static inline void flush_tlb_page(struct vm_area_struct *vma,
+                                 unsigned long uaddr)
+ {
+       flush_tlb_page_nosync(vma, uaddr);
+-      dsb(ish);
++      __tlbi_sync_s1ish();
+ }
+ 
+ /*
+@@ -357,7 +377,7 @@ static inline void __flush_tlb_range(struct vm_area_struct *vma,
+               }
+               scale++;
+       }
+-      dsb(ish);
++      __tlbi_sync_s1ish();
+ }
+ 
+ static inline void flush_tlb_range(struct vm_area_struct *vma,
+@@ -386,7 +406,7 @@ static inline void flush_tlb_kernel_range(unsigned long start, unsigned long end
+       dsb(ishst);
+       for (addr = start; addr < end; addr += 1 << (PAGE_SHIFT - 12))
+               __tlbi(vaale1is, addr);
+-      dsb(ish);
++      __tlbi_sync_s1ish();
+       isb();
+ }
+ 
+@@ -400,7 +420,7 @@ static inline void __flush_tlb_kernel_pgtable(unsigned long kaddr)
+ 
+       dsb(ishst);
+       __tlbi(vaae1is, addr);
+-      dsb(ish);
++      __tlbi_sync_s1ish();
+       isb();
+ }
+ #endif
+diff --git a/arch/arm64/kernel/sys_compat.c b/arch/arm64/kernel/sys_compat.c
+index df14336c3a29cf..2bc2ac91d79e39 100644
+--- a/arch/arm64/kernel/sys_compat.c
++++ b/arch/arm64/kernel/sys_compat.c
+@@ -37,7 +37,7 @@ __do_compat_cache_op(unsigned long start, unsigned long end)
+                        * We pick the reserved-ASID to minimise the impact.
+                        */
+                       __tlbi(aside1is, __TLBI_VADDR(0, 0));
+-                      dsb(ish);
++                      __tlbi_sync_s1ish();
+               }
+ 
+               ret = caches_clean_inval_user_pou(start, start + chunk);
+diff --git a/arch/arm64/kvm/hyp/nvhe/tlb.c b/arch/arm64/kvm/hyp/nvhe/tlb.c
+index 291789df24e3ee..76973e3b48a076 100644
+--- a/arch/arm64/kvm/hyp/nvhe/tlb.c
++++ b/arch/arm64/kvm/hyp/nvhe/tlb.c
+@@ -81,7 +81,7 @@ void __kvm_tlb_flush_vmid_ipa(struct kvm_s2_mmu *mmu,
+        */
+       dsb(ish);
+       __tlbi(vmalle1is);
+-      dsb(ish);
++      __tlbi_sync_s1ish_hyp();
+       isb();
+ 
+       __tlb_switch_to_host(&cxt);
+@@ -97,7 +97,7 @@ void __kvm_tlb_flush_vmid(struct kvm_s2_mmu *mmu)
+       __tlb_switch_to_guest(mmu, &cxt);
+ 
+       __tlbi(vmalls12e1is);
+-      dsb(ish);
++      __tlbi_sync_s1ish_hyp();
+       isb();
+ 
+       __tlb_switch_to_host(&cxt);
+@@ -122,5 +122,5 @@ void __kvm_flush_vm_context(void)
+ {
+       dsb(ishst);
+       __tlbi(alle1is);
+-      dsb(ish);
++      __tlbi_sync_s1ish_hyp();
+ }
+diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
+index f0167dc7438f8a..d2838de92b4796 100644
+--- a/arch/arm64/kvm/hyp/pgtable.c
++++ b/arch/arm64/kvm/hyp/pgtable.c
+@@ -486,7 +486,7 @@ static int hyp_unmap_walker(u64 addr, u64 end, u32 level, kvm_pte_t *ptep,
+               data->unmapped += granule;
+       }
+ 
+-      dsb(ish);
++      __tlbi_sync_s1ish_hyp();
+       isb();
+       mm_ops->put_page(ptep);
+ 
+diff --git a/arch/arm64/kvm/hyp/vhe/tlb.c b/arch/arm64/kvm/hyp/vhe/tlb.c
+index fc3fcd29ccc306..59aa22b48e9538 100644
+--- a/arch/arm64/kvm/hyp/vhe/tlb.c
++++ b/arch/arm64/kvm/hyp/vhe/tlb.c
+@@ -105,7 +105,7 @@ void __kvm_tlb_flush_vmid_ipa(struct kvm_s2_mmu *mmu,
+        */
+       dsb(ish);
+       __tlbi(vmalle1is);
+-      dsb(ish);
++      __tlbi_sync_s1ish_hyp();
+       isb();
+ 
+       __tlb_switch_to_host(&cxt);
+@@ -121,7 +121,7 @@ void __kvm_tlb_flush_vmid(struct kvm_s2_mmu *mmu)
+       __tlb_switch_to_guest(mmu, &cxt);
+ 
+       __tlbi(vmalls12e1is);
+-      dsb(ish);
++      __tlbi_sync_s1ish_hyp();
+       isb();
+ 
+       __tlb_switch_to_host(&cxt);
+@@ -146,5 +146,5 @@ void __kvm_flush_vm_context(void)
+ {
+       dsb(ishst);
+       __tlbi(alle1is);
+-      dsb(ish);
++      __tlbi_sync_s1ish_hyp();
+ }
+-- 
+2.53.0
+
diff --git a/queue-6.1/kvm-arm64-remove-vpipt-i-cache-handling.patch b/queue-6.1/kvm-arm64-remove-vpipt-i-cache-handling.patch
new file mode 100644
index 0000000..a10dc58
--- /dev/null
+++ b/queue-6.1/kvm-arm64-remove-vpipt-i-cache-handling.patch
@@ -0,0 +1,120 @@
+From a3414e8d7a4b00a5e9cd3714a8a0064999776995 Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Thu, 11 Jun 2026 14:44:49 +0100
+Subject: KVM: arm64: Remove VPIPT I-cache handling
+
+From: Marc Zyngier <maz@kernel.org>
+
+commit ced242ba9d7cb3571f6e0f165f643cb832d52148 upstream.
+
+We have some special handling for VPIPT I-cache in critical parts
+of the cache and TLB maintenance. Remove it.
+
+Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
+Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
+Signed-off-by: Marc Zyngier <maz@kernel.org>
+Acked-by: Mark Rutland <mark.rutland@arm.com>
+Link: https://lore.kernel.org/r/20231204143606.1806432-2-maz@kernel.org
+Signed-off-by: Will Deacon <will@kernel.org>
+[Mark: Backport to v6.1.y. VPIPT HW was never built; this is all dead code]
+Signed-off-by: Mark Rutland <mark.rutland@arm.com>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ arch/arm64/include/asm/kvm_mmu.h |  4 ++--
+ arch/arm64/kvm/hyp/nvhe/tlb.c    | 35 --------------------------------
+ arch/arm64/kvm/hyp/vhe/tlb.c     | 13 ------------
+ 3 files changed, 2 insertions(+), 50 deletions(-)
+
+diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
+index 7784081088e78f..1495fcddd98e58 100644
+--- a/arch/arm64/include/asm/kvm_mmu.h
++++ b/arch/arm64/include/asm/kvm_mmu.h
+@@ -214,8 +214,8 @@ static inline void __invalidate_icache_guest_page(void *va, size_t size)
+       if (icache_is_aliasing()) {
+               /* any kind of VIPT cache */
+               icache_inval_all_pou();
+-      } else if (is_kernel_in_hyp_mode() || !icache_is_vpipt()) {
+-              /* PIPT or VPIPT at EL2 (see comment in __kvm_tlb_flush_vmid_ipa) */
++      } else {
++              /* PIPT */
+               icache_inval_pou((unsigned long)va, (unsigned long)va + size);
+       }
+ }
+diff --git a/arch/arm64/kvm/hyp/nvhe/tlb.c b/arch/arm64/kvm/hyp/nvhe/tlb.c
+index d296d617f58963..291789df24e3ee 100644
+--- a/arch/arm64/kvm/hyp/nvhe/tlb.c
++++ b/arch/arm64/kvm/hyp/nvhe/tlb.c
+@@ -84,28 +84,6 @@ void __kvm_tlb_flush_vmid_ipa(struct kvm_s2_mmu *mmu,
+       dsb(ish);
+       isb();
+ 
+-      /*
+-       * If the host is running at EL1 and we have a VPIPT I-cache,
+-       * then we must perform I-cache maintenance at EL2 in order for
+-       * it to have an effect on the guest. Since the guest cannot hit
+-       * I-cache lines allocated with a different VMID, we don't need
+-       * to worry about junk out of guest reset (we nuke the I-cache on
+-       * VMID rollover), but we do need to be careful when remapping
+-       * executable pages for the same guest. This can happen when KSM
+-       * takes a CoW fault on an executable page, copies the page into
+-       * a page that was previously mapped in the guest and then needs
+-       * to invalidate the guest view of the I-cache for that page
+-       * from EL1. To solve this, we invalidate the entire I-cache when
+-       * unmapping a page from a guest if we have a VPIPT I-cache but
+-       * the host is running at EL1. As above, we could do better if
+-       * we had the VA.
+-       *
+-       * The moral of this story is: if you have a VPIPT I-cache, then
+-       * you should be running with VHE enabled.
+-       */
+-      if (icache_is_vpipt())
+-              icache_inval_all_pou();
+-
+       __tlb_switch_to_host(&cxt);
+ }
+ 
+@@ -144,18 +122,5 @@ void __kvm_flush_vm_context(void)
+ {
+       dsb(ishst);
+       __tlbi(alle1is);
+-
+-      /*
+-       * VIPT and PIPT caches are not affected by VMID, so no maintenance
+-       * is necessary across a VMID rollover.
+-       *
+-       * VPIPT caches constrain lookup and maintenance to the active VMID,
+-       * so we need to invalidate lines with a stale VMID to avoid an ABA
+-       * race after multiple rollovers.
+-       *
+-       */
+-      if (icache_is_vpipt())
+-              asm volatile("ic ialluis");
+-
+       dsb(ish);
+ }
+diff --git a/arch/arm64/kvm/hyp/vhe/tlb.c b/arch/arm64/kvm/hyp/vhe/tlb.c
+index 24cef9b87f9e9c..fc3fcd29ccc306 100644
+--- a/arch/arm64/kvm/hyp/vhe/tlb.c
++++ b/arch/arm64/kvm/hyp/vhe/tlb.c
+@@ -146,18 +146,5 @@ void __kvm_flush_vm_context(void)
+ {
+       dsb(ishst);
+       __tlbi(alle1is);
+-
+-      /*
+-       * VIPT and PIPT caches are not affected by VMID, so no maintenance
+-       * is necessary across a VMID rollover.
+-       *
+-       * VPIPT caches constrain lookup and maintenance to the active VMID,
+-       * so we need to invalidate lines with a stale VMID to avoid an ABA
+-       * race after multiple rollovers.
+-       *
+-       */
+-      if (icache_is_vpipt())
+-              asm volatile("ic ialluis");
+-
+       dsb(ish);
+ }
+-- 
+2.53.0
+
diff --git a/queue-6.1/net-skbuff-fix-missing-zerocopy-reference-in-pskb_ca.patch b/queue-6.1/net-skbuff-fix-missing-zerocopy-reference-in-pskb_ca.patch
new file mode 100644
index 0000000..958c31a
--- /dev/null
+++ b/queue-6.1/net-skbuff-fix-missing-zerocopy-reference-in-pskb_ca.patch
@@ -0,0 +1,97 @@
+From f026c847954efa7c68cb8aac0df03c27476eef44 Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Tue, 26 May 2026 11:12:39 +0700
+Subject: net: skbuff: fix missing zerocopy reference in pskb_carve helpers
+
+From: Minh Nguyen <minhnguyen.080505@gmail.com>
+
+commit 98d0912e9f841e5529a5b89a972805f34cb1c69d upstream.
+
+pskb_carve_inside_header() and pskb_carve_inside_nonlinear() both copy
+the old skb_shared_info header into a new buffer via memcpy(), which
+includes the destructor_arg pointer (uarg) for MSG_ZEROCOPY skbs.
+Neither function calls net_zcopy_get() for the new shinfo, creating an
+unaccounted holder: every skb_shared_info with destructor_arg set will
+call skb_zcopy_clear() once when freed, but the corresponding
+net_zcopy_get() was never called for the new copy. Repeated calls
+drive uarg->refcnt to zero prematurely, freeing ubuf_info_msgzc while
+TX skbs still hold live destructor_arg pointers.
+
+KASAN reports use-after-free on a freed ubuf_info_msgzc:
+
+  BUG: KASAN: slab-use-after-free in skb_release_data+0x77b/0x810
+  Read of size 8 at addr ffff88801574d3e8 by task poc/220
+
+  Call Trace:
+   skb_release_data+0x77b/0x810
+   kfree_skb_list_reason+0x13e/0x610
+   skb_release_data+0x4cd/0x810
+   sk_skb_reason_drop+0xf3/0x340
+   skb_queue_purge_reason+0x282/0x440
+   rds_tcp_inc_free+0x1e/0x30
+   rds_recvmsg+0x354/0x1780
+   __sys_recvmsg+0xdf/0x180
+
+  Allocated by task 219:
+   msg_zerocopy_realloc+0x157/0x7b0
+   tcp_sendmsg_locked+0x2892/0x3ba0
+
+  Freed by task 219:
+   ip_recv_error+0x74a/0xb10
+   tcp_recvmsg+0x475/0x530
+
+The skb consuming the late access still referenced the same uarg via
+shinfo->destructor_arg copied by pskb_carve_inside_nonlinear() without
+a refcount bump. This has been verified to be reliably exploitable: a
+working proof-of-concept achieves full root privilege escalation from
+an unprivileged local user on a default kernel configuration.
+
+The fix follows the pattern of pskb_expand_head() which has the same
+memcpy/cloned structure. For pskb_carve_inside_header(), net_zcopy_get()
+is placed after skb_orphan_frags() succeeds, so the orphan error path
+needs no cleanup. For pskb_carve_inside_nonlinear(), net_zcopy_get() is
+placed after all failure points and just before skb_release_data(), so
+no error path needs cleanup at all -- matching pskb_expand_head() more
+closely and avoiding the need for a balancing net_zcopy_put().
+
+Fixes: 6fa01ccd8830 ("skbuff: Add pskb_extract() helper function")
+Cc: stable@vger.kernel.org
+Assisted-by: Claude:claude-sonnet-4-6
+Signed-off-by: Minh Nguyen <minhnguyen.080505@gmail.com>
+Reviewed-by: Willem de Bruijn <willemb@google.com>
+Link: https://patch.msgid.link/20260526041240.329462-1-minhnguyen.080505@gmail.com
+Signed-off-by: Paolo Abeni <pabeni@redhat.com>
+[Salvatore Bonaccorso: Backport for context changes, as 6.1.y does not have
+511a3eda2f8d ("net: dropreason: propagate drop_reason to
+skb_release_data()")]
+Signed-off-by: Salvatore Bonaccorso <carnil@debian.org>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ net/core/skbuff.c | 4 ++++
+ 1 file changed, 4 insertions(+)
+
+diff --git a/net/core/skbuff.c b/net/core/skbuff.c
+index 41b2aaed7a14aa..f1f5b2b25f8522 100644
+--- a/net/core/skbuff.c
++++ b/net/core/skbuff.c
+@@ -6247,6 +6247,8 @@ static int pskb_carve_inside_header(struct sk_buff *skb, const u32 off,
+                       kfree(data);
+                       return -ENOMEM;
+               }
++              if (skb_zcopy(skb))
++                      net_zcopy_get(skb_zcopy(skb));
+               for (i = 0; i < skb_shinfo(skb)->nr_frags; i++)
+                       skb_frag_ref(skb, i);
+               if (skb_has_frag_list(skb))
+@@ -6396,6 +6398,8 @@ static int pskb_carve_inside_nonlinear(struct sk_buff *skb, const u32 off,
+               kfree(data);
+               return -ENOMEM;
+       }
++      if (skb_zcopy(skb))
++              net_zcopy_get(skb_zcopy(skb));
+       skb_release_data(skb);
+ 
+       skb->head = data;
+-- 
+2.53.0
+
diff --git a/queue-6.1/series b/queue-6.1/series
index 59f06c3bb80c735a612eefc5f9f6c7c2fe6daf12..07b90fdede94bbe4facdd857b52c18aff20933c4 100644
--- a/queue-6.1/series
+++ b/queue-6.1/series
@@ -230,3 +230,8 @@ alsa-pcm-fix-wait-queue-list-corruption-in-snd_pcm_d.patch
  fs-ntfs3-return-error-for-inconsistent-extended-attr.patch
  usb-gadget-f_ncm-fix-net_device-lifecycle-with-devic.patch
  usb-gadget-u_ether-fix-null-pointer-deref-in-eth_get.patch
+net-skbuff-fix-missing-zerocopy-reference-in-pskb_ca.patch
+tap-free-page-on-error-paths-in-tap_get_user_xdp.patch
+kvm-arm64-remove-vpipt-i-cache-handling.patch
+arm64-tlb-allow-xzr-argument-to-tlbi-ops.patch
+arm64-tlb-optimize-arm64_workaround_repeat_tlbi.patch
diff --git a/queue-6.1/tap-free-page-on-error-paths-in-tap_get_user_xdp.patch b/queue-6.1/tap-free-page-on-error-paths-in-tap_get_user_xdp.patch
new file mode 100644
index 0000000..c005443
--- /dev/null
+++ b/queue-6.1/tap-free-page-on-error-paths-in-tap_get_user_xdp.patch
@@ -0,0 +1,57 @@
+From 33edbfe76b534b501b60c6a6a99f16685ee6228e Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Thu, 11 Jun 2026 06:21:06 -0700
+Subject: tap: free page on error paths in tap_get_user_xdp()
+
+From: Weiming Shi <bestswngs@gmail.com>
+
+[ Upstream commit 3bcf7aec6a9d16438f2cec29f5d7c8d5b8edf9b2 ]
+
+tap_get_user_xdp() rejects a frame shorter than ETH_HLEN with -EINVAL,
+and returns -ENOMEM when build_skb() fails. Both paths jump to the err
+label without freeing the page that vhost_net_build_xdp() allocated for
+the frame. tap_sendmsg() discards the per-buffer return value and always
+returns 0, so vhost_tx_batch() takes the success path and never frees
+the page; each rejected frame in a batch leaks one page-frag chunk.
+
+Free the page on both error paths, before the skb is built. This is the
+tap counterpart of the same leak in tun_xdp_one().
+
+Fixes: 0efac27791ee ("tap: accept an array of XDP buffs through sendmsg()")
+Fixes: ed7f2afdd0e0 ("tap: add missing verification for short frame")
+Reported-by: Xiang Mei <xmei5@asu.edu>
+Signed-off-by: Weiming Shi <bestswngs@gmail.com>
+Reviewed-by: Dongli Zhang <dongli.zhang@oracle.com>
+Reviewed-by: Willem de Bruijn <willemb@google.com>
+Link: https://patch.msgid.link/20260521163230.1478627-2-bestswngs@gmail.com
+Signed-off-by: Jakub Kicinski <kuba@kernel.org>
+(cherry picked from commit 3bcf7aec6a9d16438f2cec29f5d7c8d5b8edf9b2)
+Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ drivers/net/tap.c | 2 ++
+ 1 file changed, 2 insertions(+)
+
+diff --git a/drivers/net/tap.c b/drivers/net/tap.c
+index f8e7b163810de6..15ab71f5288ac3 100644
+--- a/drivers/net/tap.c
++++ b/drivers/net/tap.c
+@@ -1157,6 +1157,7 @@ static int tap_get_user_xdp(struct tap_queue *q, struct xdp_buff *xdp)
+       int err, depth;
+ 
+       if (unlikely(xdp->data_end - xdp->data < ETH_HLEN)) {
++              put_page(virt_to_head_page(xdp->data));
+               err = -EINVAL;
+               goto err;
+       }
+@@ -1166,6 +1167,7 @@ static int tap_get_user_xdp(struct tap_queue *q, struct xdp_buff *xdp)
+ 
+       skb = build_skb(xdp->data_hard_start, buflen);
+       if (!skb) {
++              put_page(virt_to_head_page(xdp->data));
+               err = -ENOMEM;
+               goto err;
+       }
+-- 
+2.53.0
+
diff --git a/queue-6.12/arm64-tlb-allow-xzr-argument-to-tlbi-ops.patch b/queue-6.12/arm64-tlb-allow-xzr-argument-to-tlbi-ops.patch
new file mode 100644
index 0000000..63e8c2c
--- /dev/null
+++ b/queue-6.12/arm64-tlb-allow-xzr-argument-to-tlbi-ops.patch
@@ -0,0 +1,62 @@
+From 5cc6a38c30753a8f05a6be51fa4944b02fb59424 Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Thu, 11 Jun 2026 14:40:23 +0100
+Subject: arm64: tlb: Allow XZR argument to TLBI ops
+
+From: Mark Rutland <mark.rutland@arm.com>
+
+commit bfd9c931d19aa59fb8371d557774fa169b15db9a upstream.
+
+The TLBI instruction accepts XZR as a register argument, and for TLBI
+operations with a register argument, there is no functional difference
+between using XZR or another GPR which contains zeroes. Operations
+without a register argument are encoded as if XZR were used.
+
+Allow the __TLBI_1() macro to use XZR when a register argument is all
+zeroes.
+
+Today this only results in a trivial code saving in
+__do_compat_cache_op()'s workaround for Neoverse-N1 erratum #1542419. In
+subsequent patches this pattern will be used more generally.
+
+There should be no functional change as a result of this patch.
+
+Signed-off-by: Mark Rutland <mark.rutland@arm.com>
+Cc: Catalin Marinas <catalin.marinas@arm.com>
+Cc: Marc Zyngier <maz@kernel.org>
+Cc: Oliver Upton <oupton@kernel.org>
+Cc: Ryan Roberts <ryan.roberts@arm.com>
+Cc: Will Deacon <will@kernel.org>
+Signed-off-by: Will Deacon <will@kernel.org>
+Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+[Mark: Backport to v6.12.y]
+Signed-off-by: Mark Rutland <mark.rutland@arm.com>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ arch/arm64/include/asm/tlbflush.h | 6 +++---
+ 1 file changed, 3 insertions(+), 3 deletions(-)
+
+diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
+index 5f12cdc2b9671a..dd802d58b39436 100644
+--- a/arch/arm64/include/asm/tlbflush.h
++++ b/arch/arm64/include/asm/tlbflush.h
+@@ -38,12 +38,12 @@
+                           : : )
+ 
+ #define __TLBI_1(op, arg) asm (ARM64_ASM_PREAMBLE                            \
+-                             "tlbi " #op ", %0\n"                            \
++                             "tlbi " #op ", %x0\n"                           \
+                  ALTERNATIVE("nop\n                   nop",                  \
+-                             "dsb ish\n               tlbi " #op ", %0",     \
++                             "dsb ish\n               tlbi " #op ", %x0",    \
+                              ARM64_WORKAROUND_REPEAT_TLBI,                   \
+                              CONFIG_ARM64_WORKAROUND_REPEAT_TLBI)            \
+-                          : : "r" (arg))
++                          : : "rZ" (arg))
+ 
+ #define __TLBI_N(op, arg, n, ...) __TLBI_##n(op, arg)
+ 
+-- 
+2.53.0
+
diff --git a/queue-6.12/arm64-tlb-optimize-arm64_workaround_repeat_tlbi.patch b/queue-6.12/arm64-tlb-optimize-arm64_workaround_repeat_tlbi.patch
new file mode 100644
index 0000000..f928aa2
--- /dev/null
+++ b/queue-6.12/arm64-tlb-optimize-arm64_workaround_repeat_tlbi.patch
@@ -0,0 +1,456 @@
+From aa4de9abf6780db4b481bab6b76912fe2118e061 Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Thu, 11 Jun 2026 14:40:24 +0100
+Subject: arm64: tlb: Optimize ARM64_WORKAROUND_REPEAT_TLBI
+
+From: Mark Rutland <mark.rutland@arm.com>
+
+commit a8f78680ee6bf795086384e8aea159a52814f827 upstream.
+
+The ARM64_WORKAROUND_REPEAT_TLBI workaround is used to mitigate several
+errata where broadcast TLBI;DSB sequences don't provide all the
+architecturally required synchronization. The workaround performs more
+work than necessary, and can have significant overhead. This patch
+optimizes the workaround, as explained below.
+
+The workaround was originally added for Qualcomm Falkor erratum 1009 in
+commit:
+
+  d9ff80f83ecb ("arm64: Work around Falkor erratum 1009")
+
+As noted in the message for that commit, the workaround is applied even
+in cases where it is not strictly necessary.
+
+The workaround was later reused without changes for:
+
+* Arm Cortex-A76 erratum #1286807
+  SDEN v33: https://developer.arm.com/documentation/SDEN-885749/33-0/
+
+* Arm Cortex-A55 erratum #2441007
+  SDEN v16: https://developer.arm.com/documentation/SDEN-859338/1600/
+
+* Arm Cortex-A510 erratum #2441009
+  SDEN v19: https://developer.arm.com/documentation/SDEN-1873351/1900/
+
+The important details to note are as follows:
+
+1. All relevant errata only affect the ordering and/or completion of
+   memory accesses which have been translated by an invalidated TLB
+   entry. The actual invalidation of TLB entries is unaffected.
+
+2. The existing workaround is applied to both broadcast and local TLB
+   invalidation, whereas for all relevant errata it is only necessary to
+   apply a workaround for broadcast invalidation.
+
+3. The existing workaround replaces every TLBI with a TLBI;DSB;TLBI
+   sequence, whereas for all relevant errata it is only necessary to
+   execute a single additional TLBI;DSB sequence after any number of
+   TLBIs are completed by a DSB.
+
+   For example, for a sequence of batched TLBIs:
+
+       TLBI <op1>[, <arg1>]
+       TLBI <op2>[, <arg2>]
+       TLBI <op3>[, <arg3>]
+       DSB ISH
+
+   ... the existing workaround will expand this to:
+
+       TLBI <op1>[, <arg1>]
+       DSB ISH                  // additional
+       TLBI <op1>[, <arg1>]     // additional
+       TLBI <op2>[, <arg2>]
+       DSB ISH                  // additional
+       TLBI <op2>[, <arg2>]     // additional
+       TLBI <op3>[, <arg3>]
+       DSB ISH                  // additional
+       TLBI <op3>[, <arg3>]     // additional
+       DSB ISH
+
+   ... whereas it is sufficient to have:
+
+       TLBI <op1>[, <arg1>]
+       TLBI <op2>[, <arg2>]
+       TLBI <op3>[, <arg3>]
+       DSB ISH
+       TLBI <opX>[, <argX>]     // additional
+       DSB ISH                  // additional
+
+   Using a single additional TLBI and DSB at the end of the sequence can
+   have significantly lower overhead as each DSB which completes a TLBI
+   must synchronize with other PEs in the system, with potential
+   performance effects both locally and system-wide.
+
+4. The existing workaround repeats each specific TLBI operation, whereas
+   for all relevant errata it is sufficient for the additional TLBI to
+   use *any* operation which will be broadcast, regardless of which
+   translation regime or stage of translation the operation applies to.
+
+   For example, for a single TLBI:
+
+       TLBI ALLE2IS
+       DSB ISH
+
+   ... the existing workaround will expand this to:
+
+       TLBI ALLE2IS
+       DSB ISH
+       TLBI ALLE2IS             // additional
+       DSB ISH                  // additional
+
+   ... whereas it is sufficient to have:
+
+       TLBI ALLE2IS
+       DSB ISH
+       TLBI VALE1IS, XZR        // additional
+       DSB ISH                  // additional
+
+   As the additional TLBI doesn't have to match a specific earlier TLBI,
+   the additional TLBI can be implemented in separate code, with no
+   memory of the earlier TLBIs. The additional TLBI can also use a
+   cheaper TLBI operation.
+
+5. The existing workaround is applied to both Stage-1 and Stage-2 TLB
+   invalidation, whereas for all relevant errata it is only necessary to
+   apply a workaround for Stage-1 invalidation.
+
+   Architecturally, TLBI operations which invalidate only Stage-2
+   information (e.g. IPAS2E1IS) are not required to invalidate TLB
+   entries which combine information from Stage-1 and Stage-2
+   translation table entries, and consequently may not complete memory
+   accesses translated by those combined entries. In these cases,
+   completion of memory accesses is only guaranteed after subsequent
+   invalidation of Stage-1 information (e.g. VMALLE1IS).
+
+Taking the above points into account, this patch reworks the workaround
+logic to reduce overhead:
+
+* New __tlbi_sync_s1ish() and __tlbi_sync_s1ish_hyp() functions are
+  added and used in place of any dsb(ish) which is used to complete
+  broadcast Stage-1 TLB maintenance. When the
+  ARM64_WORKAROUND_REPEAT_TLBI workaround is enabled, these helpers will
+  execute an additional TLBI;DSB sequence.
+
+  For consistency, it might make sense to add __tlbi_sync_*() helpers
+  for local and stage 2 maintenance. For now I've left those with
+  open-coded dsb() to keep the diff small.
+
+* The duplication of TLBIs in __TLBI_0() and __TLBI_1() is removed. This
+  is no longer needed as the necessary synchronization will happen in
+  __tlbi_sync_s1ish() or __tlbi_sync_s1ish_hyp().
+
+* The additional TLBI operation is chosen to have minimal impact:
+
+  - __tlbi_sync_s1ish() uses "TLBI VALE1IS, XZR". This is only used at
+    EL1 or at EL2 with {E2H,TGE}=={1,1}, where it will target an unused
+    entry for the reserved ASID in the kernel's own translation regime,
+    and have no adverse effect.
+
+  - __tlbi_sync_s1ish_hyp() uses "TLBI VALE2IS, XZR". This is only used
+    in hyp code, where it will target an unused entry in the hyp code's
+    TTBR0 mapping, and should have no adverse effect.
+
+* As __TLBI_0() and __TLBI_1() no longer replace each TLBI with a
+  TLBI;DSB;TLBI sequence, batching TLBIs is worthwhile, and there's no
+  need for arch_tlbbatch_should_defer() to consider
+  ARM64_WORKAROUND_REPEAT_TLBI.
+
+When building defconfig with GCC 15.1.0, compared to v6.19-rc1, this
+patch saves ~1KiB of text, makes the vmlinux ~42KiB smaller, and makes
+the resulting Image 64KiB smaller:
+
+| [mark@lakrids:~/src/linux]% size vmlinux-*
+|     text     data    bss      dec     hex filename
+| 21179831 19660919 708216 41548966 279fca6 vmlinux-after
+| 21181075 19660903 708216 41550194 27a0172 vmlinux-before
+| [mark@lakrids:~/src/linux]% ls -l vmlinux-*
+| -rwxr-xr-x 1 mark mark 157771472 Feb  4 12:05 vmlinux-after
+| -rwxr-xr-x 1 mark mark 157815432 Feb  4 12:05 vmlinux-before
+| [mark@lakrids:~/src/linux]% ls -l Image-*
+| -rw-r--r-- 1 mark mark 41007616 Feb  4 12:05 Image-after
+| -rw-r--r-- 1 mark mark 41073152 Feb  4 12:05 Image-before
+
+Signed-off-by: Mark Rutland <mark.rutland@arm.com>
+Cc: Catalin Marinas <catalin.marinas@arm.com>
+Cc: Marc Zyngier <maz@kernel.org>
+Cc: Oliver Upton <oupton@kernel.org>
+Cc: Ryan Roberts <ryan.roberts@arm.com>
+Cc: Will Deacon <will@kernel.org>
+Signed-off-by: Will Deacon <will@kernel.org>
+Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+[Mark: Backport to v6.12.y]
+Signed-off-by: Mark Rutland <mark.rutland@arm.com>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ arch/arm64/include/asm/tlbflush.h | 59 ++++++++++++++++++-------------
+ arch/arm64/kernel/sys_compat.c    |  2 +-
+ arch/arm64/kvm/hyp/nvhe/mm.c      |  2 +-
+ arch/arm64/kvm/hyp/nvhe/tlb.c     |  8 ++---
+ arch/arm64/kvm/hyp/pgtable.c      |  2 +-
+ arch/arm64/kvm/hyp/vhe/tlb.c      | 10 +++---
+ 6 files changed, 47 insertions(+), 36 deletions(-)
+
+diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
+index dd802d58b39436..2c59b71b99e8ad 100644
+--- a/arch/arm64/include/asm/tlbflush.h
++++ b/arch/arm64/include/asm/tlbflush.h
+@@ -31,18 +31,10 @@
+  */
+ #define __TLBI_0(op, arg) asm (ARM64_ASM_PREAMBLE                            \
+                              "tlbi " #op "\n"                                \
+-                 ALTERNATIVE("nop\n                   nop",                  \
+-                             "dsb ish\n               tlbi " #op,            \
+-                             ARM64_WORKAROUND_REPEAT_TLBI,                   \
+-                             CONFIG_ARM64_WORKAROUND_REPEAT_TLBI)            \
+                           : : )
+ 
+ #define __TLBI_1(op, arg) asm (ARM64_ASM_PREAMBLE                            \
+                              "tlbi " #op ", %x0\n"                           \
+-                 ALTERNATIVE("nop\n                   nop",                  \
+-                             "dsb ish\n               tlbi " #op ", %x0",    \
+-                             ARM64_WORKAROUND_REPEAT_TLBI,                   \
+-                             CONFIG_ARM64_WORKAROUND_REPEAT_TLBI)            \
+                           : : "rZ" (arg))
+ 
+ #define __TLBI_N(op, arg, n, ...) __TLBI_##n(op, arg)
+@@ -181,6 +173,34 @@ static inline unsigned long get_trans_granule(void)
+               (__pages >> (5 * (scale) + 1)) - 1;                     \
+       })
+ 
++#define __repeat_tlbi_sync(op, arg...)                                                \
++do {                                                                          \
++      if (!alternative_has_cap_unlikely(ARM64_WORKAROUND_REPEAT_TLBI))        \
++              break;                                                          \
++      __tlbi(op, ##arg);                                                      \
++      dsb(ish);                                                               \
++} while (0)
++
++/*
++ * Complete broadcast TLB maintenance issued by the host which invalidates
++ * stage 1 information in the host's own translation regime.
++ */
++static inline void __tlbi_sync_s1ish(void)
++{
++      dsb(ish);
++      __repeat_tlbi_sync(vale1is, 0);
++}
++
++/*
++ * Complete broadcast TLB maintenance issued by hyp code which invalidates
++ * stage 1 translation information in any translation regime.
++ */
++static inline void __tlbi_sync_s1ish_hyp(void)
++{
++      dsb(ish);
++      __repeat_tlbi_sync(vale2is, 0);
++}
++
+ /*
+  *    TLB Invalidation
+  *    ================
+@@ -266,7 +286,7 @@ static inline void flush_tlb_all(void)
+ {
+       dsb(ishst);
+       __tlbi(vmalle1is);
+-      dsb(ish);
++      __tlbi_sync_s1ish();
+       isb();
+ }
+ 
+@@ -278,7 +298,7 @@ static inline void flush_tlb_mm(struct mm_struct *mm)
+       asid = __TLBI_VADDR(0, ASID(mm));
+       __tlbi(aside1is, asid);
+       __tlbi_user(aside1is, asid);
+-      dsb(ish);
++      __tlbi_sync_s1ish();
+       mmu_notifier_arch_invalidate_secondary_tlbs(mm, 0, -1UL);
+ }
+ 
+@@ -305,20 +325,11 @@ static inline void flush_tlb_page(struct vm_area_struct *vma,
+                                 unsigned long uaddr)
+ {
+       flush_tlb_page_nosync(vma, uaddr);
+-      dsb(ish);
++      __tlbi_sync_s1ish();
+ }
+ 
+ static inline bool arch_tlbbatch_should_defer(struct mm_struct *mm)
+ {
+-      /*
+-       * TLB flush deferral is not required on systems which are affected by
+-       * ARM64_WORKAROUND_REPEAT_TLBI, as __tlbi()/__tlbi_user() implementation
+-       * will have two consecutive TLBI instructions with a dsb(ish) in between
+-       * defeating the purpose (i.e save overall 'dsb ish' cost).
+-       */
+-      if (alternative_has_cap_unlikely(ARM64_WORKAROUND_REPEAT_TLBI))
+-              return false;
+-
+       return true;
+ }
+ 
+@@ -352,7 +363,7 @@ static inline void arch_flush_tlb_batched_pending(struct mm_struct *mm)
+  */
+ static inline void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
+ {
+-      dsb(ish);
++      __tlbi_sync_s1ish();
+ }
+ 
+ /*
+@@ -478,7 +489,7 @@ static inline void __flush_tlb_range(struct vm_area_struct *vma,
+ {
+       __flush_tlb_range_nosync(vma, start, end, stride,
+                                last_level, tlb_level);
+-      dsb(ish);
++      __tlbi_sync_s1ish();
+ }
+ 
+ static inline void flush_tlb_range(struct vm_area_struct *vma,
+@@ -508,7 +519,7 @@ static inline void flush_tlb_kernel_range(unsigned long start, unsigned long end
+       dsb(ishst);
+       for (addr = start; addr < end; addr += 1 << (PAGE_SHIFT - 12))
+               __tlbi(vaale1is, addr);
+-      dsb(ish);
++      __tlbi_sync_s1ish();
+       isb();
+ }
+ 
+@@ -522,7 +533,7 @@ static inline void __flush_tlb_kernel_pgtable(unsigned long kaddr)
+ 
+       dsb(ishst);
+       __tlbi(vaae1is, addr);
+-      dsb(ish);
++      __tlbi_sync_s1ish();
+       isb();
+ }
+ #endif
+diff --git a/arch/arm64/kernel/sys_compat.c b/arch/arm64/kernel/sys_compat.c
+index 4a609e9b65de03..b9d4998c97efac 100644
+--- a/arch/arm64/kernel/sys_compat.c
++++ b/arch/arm64/kernel/sys_compat.c
+@@ -37,7 +37,7 @@ __do_compat_cache_op(unsigned long start, unsigned long end)
+                        * We pick the reserved-ASID to minimise the impact.
+                        */
+                       __tlbi(aside1is, __TLBI_VADDR(0, 0));
+-                      dsb(ish);
++                      __tlbi_sync_s1ish();
+               }
+ 
+               ret = caches_clean_inval_user_pou(start, start + chunk);
+diff --git a/arch/arm64/kvm/hyp/nvhe/mm.c b/arch/arm64/kvm/hyp/nvhe/mm.c
+index 8850b591d77518..cd58fbebd07393 100644
+--- a/arch/arm64/kvm/hyp/nvhe/mm.c
++++ b/arch/arm64/kvm/hyp/nvhe/mm.c
+@@ -261,7 +261,7 @@ static void fixmap_clear_slot(struct hyp_fixmap_slot *slot)
+        */
+       dsb(ishst);
+       __tlbi_level(vale2is, __TLBI_VADDR(addr, 0), KVM_PGTABLE_LAST_LEVEL);
+-      dsb(ish);
++      __tlbi_sync_s1ish_hyp();
+       isb();
+ }
+ 
+diff --git a/arch/arm64/kvm/hyp/nvhe/tlb.c b/arch/arm64/kvm/hyp/nvhe/tlb.c
+index 48da9ca9763f6e..3dc1ce0d27fe66 100644
+--- a/arch/arm64/kvm/hyp/nvhe/tlb.c
++++ b/arch/arm64/kvm/hyp/nvhe/tlb.c
+@@ -169,7 +169,7 @@ void __kvm_tlb_flush_vmid_ipa(struct kvm_s2_mmu *mmu,
+        */
+       dsb(ish);
+       __tlbi(vmalle1is);
+-      dsb(ish);
++      __tlbi_sync_s1ish_hyp();
+       isb();
+ 
+       exit_vmid_context(&cxt);
+@@ -226,7 +226,7 @@ void __kvm_tlb_flush_vmid_range(struct kvm_s2_mmu *mmu,
+ 
+       dsb(ish);
+       __tlbi(vmalle1is);
+-      dsb(ish);
++      __tlbi_sync_s1ish_hyp();
+       isb();
+ 
+       exit_vmid_context(&cxt);
+@@ -240,7 +240,7 @@ void __kvm_tlb_flush_vmid(struct kvm_s2_mmu *mmu)
+       enter_vmid_context(mmu, &cxt, false);
+ 
+       __tlbi(vmalls12e1is);
+-      dsb(ish);
++      __tlbi_sync_s1ish_hyp();
+       isb();
+ 
+       exit_vmid_context(&cxt);
+@@ -266,5 +266,5 @@ void __kvm_flush_vm_context(void)
+       /* Same remark as in enter_vmid_context() */
+       dsb(ish);
+       __tlbi(alle1is);
+-      dsb(ish);
++      __tlbi_sync_s1ish_hyp();
+ }
+diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
+index b11bcebac908a7..deabc21caae370 100644
+--- a/arch/arm64/kvm/hyp/pgtable.c
++++ b/arch/arm64/kvm/hyp/pgtable.c
+@@ -497,7 +497,7 @@ static int hyp_unmap_walker(const struct kvm_pgtable_visit_ctx *ctx,
+               *unmapped += granule;
+       }
+ 
+-      dsb(ish);
++      __tlbi_sync_s1ish_hyp();
+       isb();
+       mm_ops->put_page(ctx->ptep);
+ 
+diff --git a/arch/arm64/kvm/hyp/vhe/tlb.c b/arch/arm64/kvm/hyp/vhe/tlb.c
+index 3d50a1bd2bdbcb..0f2aea1b42888a 100644
+--- a/arch/arm64/kvm/hyp/vhe/tlb.c
++++ b/arch/arm64/kvm/hyp/vhe/tlb.c
+@@ -115,7 +115,7 @@ void __kvm_tlb_flush_vmid_ipa(struct kvm_s2_mmu *mmu,
+        */
+       dsb(ish);
+       __tlbi(vmalle1is);
+-      dsb(ish);
++      __tlbi_sync_s1ish_hyp();
+       isb();
+ 
+       exit_vmid_context(&cxt);
+@@ -176,7 +176,7 @@ void __kvm_tlb_flush_vmid_range(struct kvm_s2_mmu *mmu,
+ 
+       dsb(ish);
+       __tlbi(vmalle1is);
+-      dsb(ish);
++      __tlbi_sync_s1ish_hyp();
+       isb();
+ 
+       exit_vmid_context(&cxt);
+@@ -192,7 +192,7 @@ void __kvm_tlb_flush_vmid(struct kvm_s2_mmu *mmu)
+       enter_vmid_context(mmu, &cxt);
+ 
+       __tlbi(vmalls12e1is);
+-      dsb(ish);
++      __tlbi_sync_s1ish_hyp();
+       isb();
+ 
+       exit_vmid_context(&cxt);
+@@ -217,7 +217,7 @@ void __kvm_flush_vm_context(void)
+ {
+       dsb(ishst);
+       __tlbi(alle1is);
+-      dsb(ish);
++      __tlbi_sync_s1ish_hyp();
+ }
+ 
+ /*
+@@ -358,7 +358,7 @@ int __kvm_tlbi_s1e2(struct kvm_s2_mmu *mmu, u64 va, u64 sys_encoding)
+       default:
+               ret = -EINVAL;
+       }
+-      dsb(ish);
++      __tlbi_sync_s1ish_hyp();
+       isb();
+ 
+       if (mmu)
+-- 
+2.53.0
+
diff --git a/queue-6.12/series b/queue-6.12/series
index e029c24f509a6e033a9c05711cdbceda350e6063..fe63823fa878ff9ab78302320d906ff35d722e3a 100644
--- a/queue-6.12/series
+++ b/queue-6.12/series
@@ -65,3 +65,6 @@ ima-kexec-skip-ima-segment-validation-after-kexec-so.patch
  ima-kexec-move-ima-log-copy-from-kexec-load-to-execu.patch
  spi-cadence-quadspi-fix-unclocked-access-on-unbind.patch
  tools-rv-fix-cleanup-after-failed-trace-setup.patch
+tap-free-page-on-error-paths-in-tap_get_user_xdp.patch
+arm64-tlb-allow-xzr-argument-to-tlbi-ops.patch
+arm64-tlb-optimize-arm64_workaround_repeat_tlbi.patch
diff --git a/queue-6.12/tap-free-page-on-error-paths-in-tap_get_user_xdp.patch b/queue-6.12/tap-free-page-on-error-paths-in-tap_get_user_xdp.patch
new file mode 100644
index 0000000..01b4499
--- /dev/null
+++ b/queue-6.12/tap-free-page-on-error-paths-in-tap_get_user_xdp.patch
@@ -0,0 +1,57 @@
+From c612ff567c6fa151b7fe76ebd8028151950eb3c7 Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Thu, 11 Jun 2026 06:21:06 -0700
+Subject: tap: free page on error paths in tap_get_user_xdp()
+
+From: Weiming Shi <bestswngs@gmail.com>
+
+[ Upstream commit 3bcf7aec6a9d16438f2cec29f5d7c8d5b8edf9b2 ]
+
+tap_get_user_xdp() rejects a frame shorter than ETH_HLEN with -EINVAL,
+and returns -ENOMEM when build_skb() fails. Both paths jump to the err
+label without freeing the page that vhost_net_build_xdp() allocated for
+the frame. tap_sendmsg() discards the per-buffer return value and always
+returns 0, so vhost_tx_batch() takes the success path and never frees
+the page; each rejected frame in a batch leaks one page-frag chunk.
+
+Free the page on both error paths, before the skb is built. This is the
+tap counterpart of the same leak in tun_xdp_one().
+
+Fixes: 0efac27791ee ("tap: accept an array of XDP buffs through sendmsg()")
+Fixes: ed7f2afdd0e0 ("tap: add missing verification for short frame")
+Reported-by: Xiang Mei <xmei5@asu.edu>
+Signed-off-by: Weiming Shi <bestswngs@gmail.com>
+Reviewed-by: Dongli Zhang <dongli.zhang@oracle.com>
+Reviewed-by: Willem de Bruijn <willemb@google.com>
+Link: https://patch.msgid.link/20260521163230.1478627-2-bestswngs@gmail.com
+Signed-off-by: Jakub Kicinski <kuba@kernel.org>
+(cherry picked from commit 3bcf7aec6a9d16438f2cec29f5d7c8d5b8edf9b2)
+Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ drivers/net/tap.c | 2 ++
+ 1 file changed, 2 insertions(+)
+
+diff --git a/drivers/net/tap.c b/drivers/net/tap.c
+index 5ca6ecf0ce5fbc..c460b1f39136a5 100644
+--- a/drivers/net/tap.c
++++ b/drivers/net/tap.c
+@@ -1177,6 +1177,7 @@ static int tap_get_user_xdp(struct tap_queue *q, struct xdp_buff *xdp)
+       int err, depth;
+ 
+       if (unlikely(xdp->data_end - xdp->data < ETH_HLEN)) {
++              put_page(virt_to_head_page(xdp->data));
+               err = -EINVAL;
+               goto err;
+       }
+@@ -1186,6 +1187,7 @@ static int tap_get_user_xdp(struct tap_queue *q, struct xdp_buff *xdp)
+ 
+       skb = build_skb(xdp->data_hard_start, buflen);
+       if (!skb) {
++              put_page(virt_to_head_page(xdp->data));
+               err = -ENOMEM;
+               goto err;
+       }
+-- 
+2.53.0
+
diff --git a/queue-6.18/series b/queue-6.18/series
index 03d57d534411c7e2c7f9282c5460bd87e3bf02fd..6178051f58204c762c507c317756d33cfe9a2690 100644
--- a/queue-6.18/series
+++ b/queue-6.18/series
@@ -81,3 +81,4 @@ tools-rv-fix-substring-match-when-listing-container-.patch
  tools-rv-fix-cleanup-after-failed-trace-setup.patch
  verification-rvgen-fix-options-shared-among-commands.patch
  verification-rvgen-fix-ltl2k-writing-true-as-a-liter.patch
+tap-free-page-on-error-paths-in-tap_get_user_xdp.patch
diff --git a/queue-6.18/tap-free-page-on-error-paths-in-tap_get_user_xdp.patch b/queue-6.18/tap-free-page-on-error-paths-in-tap_get_user_xdp.patch
new file mode 100644
index 0000000..0233a52
--- /dev/null
+++ b/queue-6.18/tap-free-page-on-error-paths-in-tap_get_user_xdp.patch
@@ -0,0 +1,57 @@
+From c9e0360a32be156dffec09ff965815a0e7f597fe Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Thu, 11 Jun 2026 06:21:06 -0700
+Subject: tap: free page on error paths in tap_get_user_xdp()
+
+From: Weiming Shi <bestswngs@gmail.com>
+
+[ Upstream commit 3bcf7aec6a9d16438f2cec29f5d7c8d5b8edf9b2 ]
+
+tap_get_user_xdp() rejects a frame shorter than ETH_HLEN with -EINVAL,
+and returns -ENOMEM when build_skb() fails. Both paths jump to the err
+label without freeing the page that vhost_net_build_xdp() allocated for
+the frame. tap_sendmsg() discards the per-buffer return value and always
+returns 0, so vhost_tx_batch() takes the success path and never frees
+the page; each rejected frame in a batch leaks one page-frag chunk.
+
+Free the page on both error paths, before the skb is built. This is the
+tap counterpart of the same leak in tun_xdp_one().
+
+Fixes: 0efac27791ee ("tap: accept an array of XDP buffs through sendmsg()")
+Fixes: ed7f2afdd0e0 ("tap: add missing verification for short frame")
+Reported-by: Xiang Mei <xmei5@asu.edu>
+Signed-off-by: Weiming Shi <bestswngs@gmail.com>
+Reviewed-by: Dongli Zhang <dongli.zhang@oracle.com>
+Reviewed-by: Willem de Bruijn <willemb@google.com>
+Link: https://patch.msgid.link/20260521163230.1478627-2-bestswngs@gmail.com
+Signed-off-by: Jakub Kicinski <kuba@kernel.org>
+(cherry picked from commit 3bcf7aec6a9d16438f2cec29f5d7c8d5b8edf9b2)
+Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ drivers/net/tap.c | 2 ++
+ 1 file changed, 2 insertions(+)
+
+diff --git a/drivers/net/tap.c b/drivers/net/tap.c
+index 6fd3b14273b374..b51ce7af1b20f9 100644
+--- a/drivers/net/tap.c
++++ b/drivers/net/tap.c
+@@ -1052,6 +1052,7 @@ static int tap_get_user_xdp(struct tap_queue *q, struct xdp_buff *xdp)
+       int err, depth;
+ 
+       if (unlikely(xdp->data_end - xdp->data < ETH_HLEN)) {
++              put_page(virt_to_head_page(xdp->data));
+               err = -EINVAL;
+               goto err;
+       }
+@@ -1061,6 +1062,7 @@ static int tap_get_user_xdp(struct tap_queue *q, struct xdp_buff *xdp)
+ 
+       skb = build_skb(xdp->data_hard_start, buflen);
+       if (!skb) {
++              put_page(virt_to_head_page(xdp->data));
+               err = -ENOMEM;
+               goto err;
+       }
+-- 
+2.53.0
+
diff --git a/queue-6.6/arm64-tlb-allow-xzr-argument-to-tlbi-ops.patch b/queue-6.6/arm64-tlb-allow-xzr-argument-to-tlbi-ops.patch
new file mode 100644
index 0000000..2ad95cb
--- /dev/null
+++ b/queue-6.6/arm64-tlb-allow-xzr-argument-to-tlbi-ops.patch
@@ -0,0 +1,62 @@
+From 335a2178bfb71e2b34a326aa5f8ddeb989671b26 Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Thu, 11 Jun 2026 14:42:47 +0100
+Subject: arm64: tlb: Allow XZR argument to TLBI ops
+
+From: Mark Rutland <mark.rutland@arm.com>
+
+commit bfd9c931d19aa59fb8371d557774fa169b15db9a upstream.
+
+The TLBI instruction accepts XZR as a register argument, and for TLBI
+operations with a register argument, there is no functional difference
+between using XZR or another GPR which contains zeroes. Operations
+without a register argument are encoded as if XZR were used.
+
+Allow the __TLBI_1() macro to use XZR when a register argument is all
+zeroes.
+
+Today this only results in a trivial code saving in
+__do_compat_cache_op()'s workaround for Neoverse-N1 erratum #1542419. In
+subsequent patches this pattern will be used more generally.
+
+There should be no functional change as a result of this patch.
+
+Signed-off-by: Mark Rutland <mark.rutland@arm.com>
+Cc: Catalin Marinas <catalin.marinas@arm.com>
+Cc: Marc Zyngier <maz@kernel.org>
+Cc: Oliver Upton <oupton@kernel.org>
+Cc: Ryan Roberts <ryan.roberts@arm.com>
+Cc: Will Deacon <will@kernel.org>
+Signed-off-by: Will Deacon <will@kernel.org>
+Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+[Mark: Backport to v6.6.y]
+Signed-off-by: Mark Rutland <mark.rutland@arm.com>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ arch/arm64/include/asm/tlbflush.h | 6 +++---
+ 1 file changed, 3 insertions(+), 3 deletions(-)
+
+diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
+index 6eeb56b6fac13e..c8d8b9622369f0 100644
+--- a/arch/arm64/include/asm/tlbflush.h
++++ b/arch/arm64/include/asm/tlbflush.h
+@@ -38,12 +38,12 @@
+                           : : )
+ 
+ #define __TLBI_1(op, arg) asm (ARM64_ASM_PREAMBLE                            \
+-                             "tlbi " #op ", %0\n"                            \
++                             "tlbi " #op ", %x0\n"                           \
+                  ALTERNATIVE("nop\n                   nop",                  \
+-                             "dsb ish\n               tlbi " #op ", %0",     \
++                             "dsb ish\n               tlbi " #op ", %x0",    \
+                              ARM64_WORKAROUND_REPEAT_TLBI,                   \
+                              CONFIG_ARM64_WORKAROUND_REPEAT_TLBI)            \
+-                          : : "r" (arg))
++                          : : "rZ" (arg))
+ 
+ #define __TLBI_N(op, arg, n, ...) __TLBI_##n(op, arg)
+ 
+-- 
+2.53.0
+
diff --git a/queue-6.6/arm64-tlb-optimize-arm64_workaround_repeat_tlbi.patch b/queue-6.6/arm64-tlb-optimize-arm64_workaround_repeat_tlbi.patch
new file mode 100644
index 0000000..503b978
--- /dev/null
+++ b/queue-6.6/arm64-tlb-optimize-arm64_workaround_repeat_tlbi.patch
@@ -0,0 +1,446 @@
+From c4e2a8ccfa34ae3c554f5b665298833726d95c3e Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Thu, 11 Jun 2026 14:42:48 +0100
+Subject: arm64: tlb: Optimize ARM64_WORKAROUND_REPEAT_TLBI
+
+From: Mark Rutland <mark.rutland@arm.com>
+
+commit a8f78680ee6bf795086384e8aea159a52814f827 upstream.
+
+The ARM64_WORKAROUND_REPEAT_TLBI workaround is used to mitigate several
+errata where broadcast TLBI;DSB sequences don't provide all the
+architecturally required synchronization. The workaround performs more
+work than necessary, and can have significant overhead. This patch
+optimizes the workaround, as explained below.
+
+The workaround was originally added for Qualcomm Falkor erratum 1009 in
+commit:
+
+  d9ff80f83ecb ("arm64: Work around Falkor erratum 1009")
+
+As noted in the message for that commit, the workaround is applied even
+in cases where it is not strictly necessary.
+
+The workaround was later reused without changes for:
+
+* Arm Cortex-A76 erratum #1286807
+  SDEN v33: https://developer.arm.com/documentation/SDEN-885749/33-0/
+
+* Arm Cortex-A55 erratum #2441007
+  SDEN v16: https://developer.arm.com/documentation/SDEN-859338/1600/
+
+* Arm Cortex-A510 erratum #2441009
+  SDEN v19: https://developer.arm.com/documentation/SDEN-1873351/1900/
+
+The important details to note are as follows:
+
+1. All relevant errata only affect the ordering and/or completion of
+   memory accesses which have been translated by an invalidated TLB
+   entry. The actual invalidation of TLB entries is unaffected.
+
+2. The existing workaround is applied to both broadcast and local TLB
+   invalidation, whereas for all relevant errata it is only necessary to
+   apply a workaround for broadcast invalidation.
+
+3. The existing workaround replaces every TLBI with a TLBI;DSB;TLBI
+   sequence, whereas for all relevant errata it is only necessary to
+   execute a single additional TLBI;DSB sequence after any number of
+   TLBIs are completed by a DSB.
+
+   For example, for a sequence of batched TLBIs:
+
+       TLBI <op1>[, <arg1>]
+       TLBI <op2>[, <arg2>]
+       TLBI <op3>[, <arg3>]
+       DSB ISH
+
+   ... the existing workaround will expand this to:
+
+       TLBI <op1>[, <arg1>]
+       DSB ISH                  // additional
+       TLBI <op1>[, <arg1>]     // additional
+       TLBI <op2>[, <arg2>]
+       DSB ISH                  // additional
+       TLBI <op2>[, <arg2>]     // additional
+       TLBI <op3>[, <arg3>]
+       DSB ISH                  // additional
+       TLBI <op3>[, <arg3>]     // additional
+       DSB ISH
+
+   ... whereas it is sufficient to have:
+
+       TLBI <op1>[, <arg1>]
+       TLBI <op2>[, <arg2>]
+       TLBI <op3>[, <arg3>]
+       DSB ISH
+       TLBI <opX>[, <argX>]     // additional
+       DSB ISH                  // additional
+
+   Using a single additional TLBI and DSB at the end of the sequence can
+   have significantly lower overhead as each DSB which completes a TLBI
+   must synchronize with other PEs in the system, with potential
+   performance effects both locally and system-wide.
+
+4. The existing workaround repeats each specific TLBI operation, whereas
+   for all relevant errata it is sufficient for the additional TLBI to
+   use *any* operation which will be broadcast, regardless of which
+   translation regime or stage of translation the operation applies to.
+
+   For example, for a single TLBI:
+
+       TLBI ALLE2IS
+       DSB ISH
+
+   ... the existing workaround will expand this to:
+
+       TLBI ALLE2IS
+       DSB ISH
+       TLBI ALLE2IS             // additional
+       DSB ISH                  // additional
+
+   ... whereas it is sufficient to have:
+
+       TLBI ALLE2IS
+       DSB ISH
+       TLBI VALE1IS, XZR        // additional
+       DSB ISH                  // additional
+
+   As the additional TLBI doesn't have to match a specific earlier TLBI,
+   the additional TLBI can be implemented in separate code, with no
+   memory of the earlier TLBIs. The additional TLBI can also use a
+   cheaper TLBI operation.
+
+5. The existing workaround is applied to both Stage-1 and Stage-2 TLB
+   invalidation, whereas for all relevant errata it is only necessary to
+   apply a workaround for Stage-1 invalidation.
+
+   Architecturally, TLBI operations which invalidate only Stage-2
+   information (e.g. IPAS2E1IS) are not required to invalidate TLB
+   entries which combine information from Stage-1 and Stage-2
+   translation table entries, and consequently may not complete memory
+   accesses translated by those combined entries. In these cases,
+   completion of memory accesses is only guaranteed after subsequent
+   invalidation of Stage-1 information (e.g. VMALLE1IS).
+
+Taking the above points into account, this patch reworks the workaround
+logic to reduce overhead:
+
+* New __tlbi_sync_s1ish() and __tlbi_sync_s1ish_hyp() functions are
+  added and used in place of any dsb(ish) which is used to complete
+  broadcast Stage-1 TLB maintenance. When the
+  ARM64_WORKAROUND_REPEAT_TLBI workaround is enabled, these helpers will
+  execute an additional TLBI;DSB sequence.
+
+  For consistency, it might make sense to add __tlbi_sync_*() helpers
+  for local and stage 2 maintenance. For now I've left those with
+  open-coded dsb() to keep the diff small.
+
+* The duplication of TLBIs in __TLBI_0() and __TLBI_1() is removed. This
+  is no longer needed as the necessary synchronization will happen in
+  __tlbi_sync_s1ish() or __tlbi_sync_s1ish_hyp().
+
+* The additional TLBI operation is chosen to have minimal impact:
+
+  - __tlbi_sync_s1ish() uses "TLBI VALE1IS, XZR". This is only used at
+    EL1 or at EL2 with {E2H,TGE}=={1,1}, where it will target an unused
+    entry for the reserved ASID in the kernel's own translation regime,
+    and have no adverse effect.
+
+  - __tlbi_sync_s1ish_hyp() uses "TLBI VALE2IS, XZR". This is only used
+    in hyp code, where it will target an unused entry in the hyp code's
+    TTBR0 mapping, and should have no adverse effect.
+
+* As __TLBI_0() and __TLBI_1() no longer replace each TLBI with a
+  TLBI;DSB;TLBI sequence, batching TLBIs is worthwhile, and there's no
+  need for arch_tlbbatch_should_defer() to consider
+  ARM64_WORKAROUND_REPEAT_TLBI.
+
+When building defconfig with GCC 15.1.0, compared to v6.19-rc1, this
+patch saves ~1KiB of text, makes the vmlinux ~42KiB smaller, and makes
+the resulting Image 64KiB smaller:
+
+| [mark@lakrids:~/src/linux]% size vmlinux-*
+|     text     data    bss      dec     hex filename
+| 21179831 19660919 708216 41548966 279fca6 vmlinux-after
+| 21181075 19660903 708216 41550194 27a0172 vmlinux-before
+| [mark@lakrids:~/src/linux]% ls -l vmlinux-*
+| -rwxr-xr-x 1 mark mark 157771472 Feb  4 12:05 vmlinux-after
+| -rwxr-xr-x 1 mark mark 157815432 Feb  4 12:05 vmlinux-before
+| [mark@lakrids:~/src/linux]% ls -l Image-*
+| -rw-r--r-- 1 mark mark 41007616 Feb  4 12:05 Image-after
+| -rw-r--r-- 1 mark mark 41073152 Feb  4 12:05 Image-before
+
+Signed-off-by: Mark Rutland <mark.rutland@arm.com>
+Cc: Catalin Marinas <catalin.marinas@arm.com>
+Cc: Marc Zyngier <maz@kernel.org>
+Cc: Oliver Upton <oupton@kernel.org>
+Cc: Ryan Roberts <ryan.roberts@arm.com>
+Cc: Will Deacon <will@kernel.org>
+Signed-off-by: Will Deacon <will@kernel.org>
+Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+[Mark: Backport to v6.6.y]
+Signed-off-by: Mark Rutland <mark.rutland@arm.com>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ arch/arm64/include/asm/tlbflush.h | 60 ++++++++++++++++++-------------
+ arch/arm64/kernel/sys_compat.c    |  2 +-
+ arch/arm64/kvm/hyp/nvhe/mm.c      |  2 +-
+ arch/arm64/kvm/hyp/nvhe/tlb.c     |  8 ++---
+ arch/arm64/kvm/hyp/pgtable.c      |  2 +-
+ arch/arm64/kvm/hyp/vhe/tlb.c      |  8 ++---
+ 6 files changed, 46 insertions(+), 36 deletions(-)
+
+diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
+index c8d8b9622369f0..d96342d455a68a 100644
+--- a/arch/arm64/include/asm/tlbflush.h
++++ b/arch/arm64/include/asm/tlbflush.h
+@@ -31,18 +31,10 @@
+  */
+ #define __TLBI_0(op, arg) asm (ARM64_ASM_PREAMBLE                            \
+                              "tlbi " #op "\n"                                \
+-                 ALTERNATIVE("nop\n                   nop",                  \
+-                             "dsb ish\n               tlbi " #op,            \
+-                             ARM64_WORKAROUND_REPEAT_TLBI,                   \
+-                             CONFIG_ARM64_WORKAROUND_REPEAT_TLBI)            \
+                           : : )
+ 
+ #define __TLBI_1(op, arg) asm (ARM64_ASM_PREAMBLE                            \
+                              "tlbi " #op ", %x0\n"                           \
+-                 ALTERNATIVE("nop\n                   nop",                  \
+-                             "dsb ish\n               tlbi " #op ", %x0",    \
+-                             ARM64_WORKAROUND_REPEAT_TLBI,                   \
+-                             CONFIG_ARM64_WORKAROUND_REPEAT_TLBI)            \
+                           : : "rZ" (arg))
+ 
+ #define __TLBI_N(op, arg, n, ...) __TLBI_##n(op, arg)
+@@ -165,6 +157,34 @@ static inline unsigned long get_trans_granule(void)
+               (__pages >> (5 * (scale) + 1)) - 1;                     \
+       })
+ 
++#define __repeat_tlbi_sync(op, arg...)                                                \
++do {                                                                          \
++      if (!alternative_has_cap_unlikely(ARM64_WORKAROUND_REPEAT_TLBI))        \
++              break;                                                          \
++      __tlbi(op, ##arg);                                                      \
++      dsb(ish);                                                               \
++} while (0)
++
++/*
++ * Complete broadcast TLB maintenance issued by the host which invalidates
++ * stage 1 information in the host's own translation regime.
++ */
++static inline void __tlbi_sync_s1ish(void)
++{
++      dsb(ish);
++      __repeat_tlbi_sync(vale1is, 0);
++}
++
++/*
++ * Complete broadcast TLB maintenance issued by hyp code which invalidates
++ * stage 1 translation information in any translation regime.
++ */
++static inline void __tlbi_sync_s1ish_hyp(void)
++{
++      dsb(ish);
++      __repeat_tlbi_sync(vale2is, 0);
++}
++
+ /*
+  *    TLB Invalidation
+  *    ================
+@@ -246,7 +266,7 @@ static inline void flush_tlb_all(void)
+ {
+       dsb(ishst);
+       __tlbi(vmalle1is);
+-      dsb(ish);
++      __tlbi_sync_s1ish();
+       isb();
+ }
+ 
+@@ -258,7 +278,7 @@ static inline void flush_tlb_mm(struct mm_struct *mm)
+       asid = __TLBI_VADDR(0, ASID(mm));
+       __tlbi(aside1is, asid);
+       __tlbi_user(aside1is, asid);
+-      dsb(ish);
++      __tlbi_sync_s1ish();
+       mmu_notifier_arch_invalidate_secondary_tlbs(mm, 0, -1UL);
+ }
+ 
+@@ -285,21 +305,11 @@ static inline void flush_tlb_page(struct vm_area_struct *vma,
+                                 unsigned long uaddr)
+ {
+       flush_tlb_page_nosync(vma, uaddr);
+-      dsb(ish);
++      __tlbi_sync_s1ish();
+ }
+ 
+ static inline bool arch_tlbbatch_should_defer(struct mm_struct *mm)
+ {
+-#ifdef CONFIG_ARM64_WORKAROUND_REPEAT_TLBI
+-      /*
+-       * TLB flush deferral is not required on systems which are affected by
+-       * ARM64_WORKAROUND_REPEAT_TLBI, as __tlbi()/__tlbi_user() implementation
+-       * will have two consecutive TLBI instructions with a dsb(ish) in between
+-       * defeating the purpose (i.e save overall 'dsb ish' cost).
+-       */
+-      if (unlikely(cpus_have_const_cap(ARM64_WORKAROUND_REPEAT_TLBI)))
+-              return false;
+-#endif
+       return true;
+ }
+ 
+@@ -333,7 +343,7 @@ static inline void arch_flush_tlb_batched_pending(struct mm_struct *mm)
+  */
+ static inline void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
+ {
+-      dsb(ish);
++      __tlbi_sync_s1ish();
+ }
+ 
+ /*
+@@ -437,7 +447,7 @@ static inline void __flush_tlb_range(struct vm_area_struct *vma,
+       else
+               __flush_tlb_range_op(vae1is, start, pages, stride, asid, tlb_level, true);
+ 
+-      dsb(ish);
++      __tlbi_sync_s1ish();
+       mmu_notifier_arch_invalidate_secondary_tlbs(vma->vm_mm, start, end);
+ }
+ 
+@@ -467,7 +477,7 @@ static inline void flush_tlb_kernel_range(unsigned long start, unsigned long end
+       dsb(ishst);
+       for (addr = start; addr < end; addr += 1 << (PAGE_SHIFT - 12))
+               __tlbi(vaale1is, addr);
+-      dsb(ish);
++      __tlbi_sync_s1ish();
+       isb();
+ }
+ 
+@@ -481,7 +491,7 @@ static inline void __flush_tlb_kernel_pgtable(unsigned long kaddr)
+ 
+       dsb(ishst);
+       __tlbi(vaae1is, addr);
+-      dsb(ish);
++      __tlbi_sync_s1ish();
+       isb();
+ }
+ #endif
+diff --git a/arch/arm64/kernel/sys_compat.c b/arch/arm64/kernel/sys_compat.c
+index df14336c3a29cf..2bc2ac91d79e39 100644
+--- a/arch/arm64/kernel/sys_compat.c
++++ b/arch/arm64/kernel/sys_compat.c
+@@ -37,7 +37,7 @@ __do_compat_cache_op(unsigned long start, unsigned long end)
+                        * We pick the reserved-ASID to minimise the impact.
+                        */
+                       __tlbi(aside1is, __TLBI_VADDR(0, 0));
+-                      dsb(ish);
++                      __tlbi_sync_s1ish();
+               }
+ 
+               ret = caches_clean_inval_user_pou(start, start + chunk);
+diff --git a/arch/arm64/kvm/hyp/nvhe/mm.c b/arch/arm64/kvm/hyp/nvhe/mm.c
+index 65a7a186d7b217..4bb9da3381eaf4 100644
+--- a/arch/arm64/kvm/hyp/nvhe/mm.c
++++ b/arch/arm64/kvm/hyp/nvhe/mm.c
+@@ -261,7 +261,7 @@ static void fixmap_clear_slot(struct hyp_fixmap_slot *slot)
+        */
+       dsb(ishst);
+       __tlbi_level(vale2is, __TLBI_VADDR(addr, 0), (KVM_PGTABLE_MAX_LEVELS - 1));
+-      dsb(ish);
++      __tlbi_sync_s1ish_hyp();
+       isb();
+ }
+ 
+diff --git a/arch/arm64/kvm/hyp/nvhe/tlb.c b/arch/arm64/kvm/hyp/nvhe/tlb.c
+index a60fb13e21924f..f03d4f7dbf443d 100644
+--- a/arch/arm64/kvm/hyp/nvhe/tlb.c
++++ b/arch/arm64/kvm/hyp/nvhe/tlb.c
+@@ -102,7 +102,7 @@ void __kvm_tlb_flush_vmid_ipa(struct kvm_s2_mmu *mmu,
+        */
+       dsb(ish);
+       __tlbi(vmalle1is);
+-      dsb(ish);
++      __tlbi_sync_s1ish_hyp();
+       isb();
+ 
+       __tlb_switch_to_host(&cxt);
+@@ -158,7 +158,7 @@ void __kvm_tlb_flush_vmid_range(struct kvm_s2_mmu *mmu,
+ 
+       dsb(ish);
+       __tlbi(vmalle1is);
+-      dsb(ish);
++      __tlbi_sync_s1ish_hyp();
+       isb();
+ 
+       __tlb_switch_to_host(&cxt);
+@@ -172,7 +172,7 @@ void __kvm_tlb_flush_vmid(struct kvm_s2_mmu *mmu)
+       __tlb_switch_to_guest(mmu, &cxt, false);
+ 
+       __tlbi(vmalls12e1is);
+-      dsb(ish);
++      __tlbi_sync_s1ish_hyp();
+       isb();
+ 
+       __tlb_switch_to_host(&cxt);
+@@ -198,5 +198,5 @@ void __kvm_flush_vm_context(void)
+       /* Same remark as in __tlb_switch_to_guest() */
+       dsb(ish);
+       __tlbi(alle1is);
+-      dsb(ish);
++      __tlbi_sync_s1ish_hyp();
+ }
+diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
+index ca0bf0b92ca09e..4ec07236a68d21 100644
+--- a/arch/arm64/kvm/hyp/pgtable.c
++++ b/arch/arm64/kvm/hyp/pgtable.c
+@@ -534,7 +534,7 @@ static int hyp_unmap_walker(const struct kvm_pgtable_visit_ctx *ctx,
+               *unmapped += granule;
+       }
+ 
+-      dsb(ish);
++      __tlbi_sync_s1ish_hyp();
+       isb();
+       mm_ops->put_page(ctx->ptep);
+ 
+diff --git a/arch/arm64/kvm/hyp/vhe/tlb.c b/arch/arm64/kvm/hyp/vhe/tlb.c
+index 23325e9f3cc388..af3a02b6b48b32 100644
+--- a/arch/arm64/kvm/hyp/vhe/tlb.c
++++ b/arch/arm64/kvm/hyp/vhe/tlb.c
+@@ -105,7 +105,7 @@ void __kvm_tlb_flush_vmid_ipa(struct kvm_s2_mmu *mmu,
+        */
+       dsb(ish);
+       __tlbi(vmalle1is);
+-      dsb(ish);
++      __tlbi_sync_s1ish_hyp();
+       isb();
+ 
+       __tlb_switch_to_host(&cxt);
+@@ -165,7 +165,7 @@ void __kvm_tlb_flush_vmid_range(struct kvm_s2_mmu *mmu,
+ 
+       dsb(ish);
+       __tlbi(vmalle1is);
+-      dsb(ish);
++      __tlbi_sync_s1ish_hyp();
+       isb();
+ 
+       __tlb_switch_to_host(&cxt);
+@@ -181,7 +181,7 @@ void __kvm_tlb_flush_vmid(struct kvm_s2_mmu *mmu)
+       __tlb_switch_to_guest(mmu, &cxt);
+ 
+       __tlbi(vmalls12e1is);
+-      dsb(ish);
++      __tlbi_sync_s1ish_hyp();
+       isb();
+ 
+       __tlb_switch_to_host(&cxt);
+@@ -206,5 +206,5 @@ void __kvm_flush_vm_context(void)
+ {
+       dsb(ishst);
+       __tlbi(alle1is);
+-      dsb(ish);
++      __tlbi_sync_s1ish_hyp();
+ }
+-- 
+2.53.0
+
diff --git a/queue-6.6/kvm-arm64-remove-vpipt-i-cache-handling.patch b/queue-6.6/kvm-arm64-remove-vpipt-i-cache-handling.patch
new file mode 100644
index 0000000..b7c8806
--- /dev/null
+++ b/queue-6.6/kvm-arm64-remove-vpipt-i-cache-handling.patch
@@ -0,0 +1,175 @@
+From 26cde0f9cc45bcc7262ab2df5170d297a75638bb Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Thu, 11 Jun 2026 14:42:46 +0100
+Subject: KVM: arm64: Remove VPIPT I-cache handling
+
+From: Marc Zyngier <maz@kernel.org>
+
+commit ced242ba9d7cb3571f6e0f165f643cb832d52148 upstream.
+
+We have some special handling for VPIPT I-cache in critical parts
+of the cache and TLB maintenance. Remove it.
+
+Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
+Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
+Signed-off-by: Marc Zyngier <maz@kernel.org>
+Acked-by: Mark Rutland <mark.rutland@arm.com>
+Link: https://lore.kernel.org/r/20231204143606.1806432-2-maz@kernel.org
+Signed-off-by: Will Deacon <will@kernel.org>
+[Mark: Backport to v6.6.y. VPIPT HW was never built; this is all dead code]
+Signed-off-by: Mark Rutland <mark.rutland@arm.com>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ arch/arm64/include/asm/kvm_mmu.h |  5 ++-
+ arch/arm64/kvm/hyp/nvhe/pkvm.c   |  2 +-
+ arch/arm64/kvm/hyp/nvhe/tlb.c    | 61 --------------------------------
+ arch/arm64/kvm/hyp/vhe/tlb.c     | 13 -------
+ 4 files changed, 3 insertions(+), 78 deletions(-)
+
+diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
+index 96a80e8f62263e..888c5f90201073 100644
+--- a/arch/arm64/include/asm/kvm_mmu.h
++++ b/arch/arm64/include/asm/kvm_mmu.h
+@@ -229,9 +229,8 @@ static inline void __invalidate_icache_guest_page(void *va, size_t size)
+       if (icache_is_aliasing()) {
+               /* any kind of VIPT cache */
+               icache_inval_all_pou();
+-      } else if (read_sysreg(CurrentEL) != CurrentEL_EL1 ||
+-                 !icache_is_vpipt()) {
+-              /* PIPT or VPIPT at EL2 (see comment in __kvm_tlb_flush_vmid_ipa) */
++      } else {
++              /* PIPT */
+               icache_inval_pou((unsigned long)va, (unsigned long)va + size);
+       }
+ }
+diff --git a/arch/arm64/kvm/hyp/nvhe/pkvm.c b/arch/arm64/kvm/hyp/nvhe/pkvm.c
+index 03acc8343c5d1b..fd3e0b2891c604 100644
+--- a/arch/arm64/kvm/hyp/nvhe/pkvm.c
++++ b/arch/arm64/kvm/hyp/nvhe/pkvm.c
+@@ -12,7 +12,7 @@
+ #include <nvhe/pkvm.h>
+ #include <nvhe/trap_handler.h>
+ 
+-/* Used by icache_is_vpipt(). */
++/* Used by icache_is_aliasing(). */
+ unsigned long __icache_flags;
+ 
+ /* Used by kvm_get_vttbr(). */
+diff --git a/arch/arm64/kvm/hyp/nvhe/tlb.c b/arch/arm64/kvm/hyp/nvhe/tlb.c
+index 1b265713d6bede..a60fb13e21924f 100644
+--- a/arch/arm64/kvm/hyp/nvhe/tlb.c
++++ b/arch/arm64/kvm/hyp/nvhe/tlb.c
+@@ -105,28 +105,6 @@ void __kvm_tlb_flush_vmid_ipa(struct kvm_s2_mmu *mmu,
+       dsb(ish);
+       isb();
+ 
+-      /*
+-       * If the host is running at EL1 and we have a VPIPT I-cache,
+-       * then we must perform I-cache maintenance at EL2 in order for
+-       * it to have an effect on the guest. Since the guest cannot hit
+-       * I-cache lines allocated with a different VMID, we don't need
+-       * to worry about junk out of guest reset (we nuke the I-cache on
+-       * VMID rollover), but we do need to be careful when remapping
+-       * executable pages for the same guest. This can happen when KSM
+-       * takes a CoW fault on an executable page, copies the page into
+-       * a page that was previously mapped in the guest and then needs
+-       * to invalidate the guest view of the I-cache for that page
+-       * from EL1. To solve this, we invalidate the entire I-cache when
+-       * unmapping a page from a guest if we have a VPIPT I-cache but
+-       * the host is running at EL1. As above, we could do better if
+-       * we had the VA.
+-       *
+-       * The moral of this story is: if you have a VPIPT I-cache, then
+-       * you should be running with VHE enabled.
+-       */
+-      if (icache_is_vpipt())
+-              icache_inval_all_pou();
+-
+       __tlb_switch_to_host(&cxt);
+ }
+ 
+@@ -157,28 +135,6 @@ void __kvm_tlb_flush_vmid_ipa_nsh(struct kvm_s2_mmu *mmu,
+       dsb(nsh);
+       isb();
+ 
+-      /*
+-       * If the host is running at EL1 and we have a VPIPT I-cache,
+-       * then we must perform I-cache maintenance at EL2 in order for
+-       * it to have an effect on the guest. Since the guest cannot hit
+-       * I-cache lines allocated with a different VMID, we don't need
+-       * to worry about junk out of guest reset (we nuke the I-cache on
+-       * VMID rollover), but we do need to be careful when remapping
+-       * executable pages for the same guest. This can happen when KSM
+-       * takes a CoW fault on an executable page, copies the page into
+-       * a page that was previously mapped in the guest and then needs
+-       * to invalidate the guest view of the I-cache for that page
+-       * from EL1. To solve this, we invalidate the entire I-cache when
+-       * unmapping a page from a guest if we have a VPIPT I-cache but
+-       * the host is running at EL1. As above, we could do better if
+-       * we had the VA.
+-       *
+-       * The moral of this story is: if you have a VPIPT I-cache, then
+-       * you should be running with VHE enabled.
+-       */
+-      if (icache_is_vpipt())
+-              icache_inval_all_pou();
+-
+       __tlb_switch_to_host(&cxt);
+ }
+ 
+@@ -205,10 +161,6 @@ void __kvm_tlb_flush_vmid_range(struct kvm_s2_mmu *mmu,
+       dsb(ish);
+       isb();
+ 
+-      /* See the comment in __kvm_tlb_flush_vmid_ipa() */
+-      if (icache_is_vpipt())
+-              icache_inval_all_pou();
+-
+       __tlb_switch_to_host(&cxt);
+ }
+ 
+@@ -246,18 +198,5 @@ void __kvm_flush_vm_context(void)
+       /* Same remark as in __tlb_switch_to_guest() */
+       dsb(ish);
+       __tlbi(alle1is);
+-
+-      /*
+-       * VIPT and PIPT caches are not affected by VMID, so no maintenance
+-       * is necessary across a VMID rollover.
+-       *
+-       * VPIPT caches constrain lookup and maintenance to the active VMID,
+-       * so we need to invalidate lines with a stale VMID to avoid an ABA
+-       * race after multiple rollovers.
+-       *
+-       */
+-      if (icache_is_vpipt())
+-              asm volatile("ic ialluis");
+-
+       dsb(ish);
+ }
+diff --git a/arch/arm64/kvm/hyp/vhe/tlb.c b/arch/arm64/kvm/hyp/vhe/tlb.c
+index 46bd43f61d76f5..23325e9f3cc388 100644
+--- a/arch/arm64/kvm/hyp/vhe/tlb.c
++++ b/arch/arm64/kvm/hyp/vhe/tlb.c
+@@ -206,18 +206,5 @@ void __kvm_flush_vm_context(void)
+ {
+       dsb(ishst);
+       __tlbi(alle1is);
+-
+-      /*
+-       * VIPT and PIPT caches are not affected by VMID, so no maintenance
+-       * is necessary across a VMID rollover.
+-       *
+-       * VPIPT caches constrain lookup and maintenance to the active VMID,
+-       * so we need to invalidate lines with a stale VMID to avoid an ABA
+-       * race after multiple rollovers.
+-       *
+-       */
+-      if (icache_is_vpipt())
+-              asm volatile("ic ialluis");
+-
+       dsb(ish);
+ }
+-- 
+2.53.0
+
diff --git a/queue-6.6/net-skbuff-fix-missing-zerocopy-reference-in-pskb_ca.patch b/queue-6.6/net-skbuff-fix-missing-zerocopy-reference-in-pskb_ca.patch
new file mode 100644
index 0000000..fc070d1
--- /dev/null
+++ b/queue-6.6/net-skbuff-fix-missing-zerocopy-reference-in-pskb_ca.patch
@@ -0,0 +1,95 @@
+From c0c6df7edac424f2708b5e1749238e0fd31aae7b Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Tue, 26 May 2026 11:12:39 +0700
+Subject: net: skbuff: fix missing zerocopy reference in pskb_carve helpers
+
+From: Minh Nguyen <minhnguyen.080505@gmail.com>
+
+commit 98d0912e9f841e5529a5b89a972805f34cb1c69d upstream.
+
+pskb_carve_inside_header() and pskb_carve_inside_nonlinear() both copy
+the old skb_shared_info header into a new buffer via memcpy(), which
+includes the destructor_arg pointer (uarg) for MSG_ZEROCOPY skbs.
+Neither function calls net_zcopy_get() for the new shinfo, creating an
+unaccounted holder: every skb_shared_info with destructor_arg set will
+call skb_zcopy_clear() once when freed, but the corresponding
+net_zcopy_get() was never called for the new copy. Repeated calls
+drive uarg->refcnt to zero prematurely, freeing ubuf_info_msgzc while
+TX skbs still hold live destructor_arg pointers.
+
+KASAN reports use-after-free on a freed ubuf_info_msgzc:
+
+  BUG: KASAN: slab-use-after-free in skb_release_data+0x77b/0x810
+  Read of size 8 at addr ffff88801574d3e8 by task poc/220
+
+  Call Trace:
+   skb_release_data+0x77b/0x810
+   kfree_skb_list_reason+0x13e/0x610
+   skb_release_data+0x4cd/0x810
+   sk_skb_reason_drop+0xf3/0x340
+   skb_queue_purge_reason+0x282/0x440
+   rds_tcp_inc_free+0x1e/0x30
+   rds_recvmsg+0x354/0x1780
+   __sys_recvmsg+0xdf/0x180
+
+  Allocated by task 219:
+   msg_zerocopy_realloc+0x157/0x7b0
+   tcp_sendmsg_locked+0x2892/0x3ba0
+
+  Freed by task 219:
+   ip_recv_error+0x74a/0xb10
+   tcp_recvmsg+0x475/0x530
+
+The skb consuming the late access still referenced the same uarg via
+shinfo->destructor_arg copied by pskb_carve_inside_nonlinear() without
+a refcount bump. This has been verified to be reliably exploitable: a
+working proof-of-concept achieves full root privilege escalation from
+an unprivileged local user on a default kernel configuration.
+
+The fix follows the pattern of pskb_expand_head() which has the same
+memcpy/cloned structure. For pskb_carve_inside_header(), net_zcopy_get()
+is placed after skb_orphan_frags() succeeds, so the orphan error path
+needs no cleanup. For pskb_carve_inside_nonlinear(), net_zcopy_get() is
+placed after all failure points and just before skb_release_data(), so
+no error path needs cleanup at all -- matching pskb_expand_head() more
+closely and avoiding the need for a balancing net_zcopy_put().
+
+Fixes: 6fa01ccd8830 ("skbuff: Add pskb_extract() helper function")
+Cc: stable@vger.kernel.org
+Assisted-by: Claude:claude-sonnet-4-6
+Signed-off-by: Minh Nguyen <minhnguyen.080505@gmail.com>
+Reviewed-by: Willem de Bruijn <willemb@google.com>
+Link: https://patch.msgid.link/20260526041240.329462-1-minhnguyen.080505@gmail.com
+Signed-off-by: Paolo Abeni <pabeni@redhat.com>
+[Salvatore Bonaccorso: Adjust for context changes in v6.6.y]
+Signed-off-by: Salvatore Bonaccorso <carnil@debian.org>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ net/core/skbuff.c | 4 ++++
+ 1 file changed, 4 insertions(+)
+
+diff --git a/net/core/skbuff.c b/net/core/skbuff.c
+index 2282b6ad4be21a..5f45a52cc8ca66 100644
+--- a/net/core/skbuff.c
++++ b/net/core/skbuff.c
+@@ -6412,6 +6412,8 @@ static int pskb_carve_inside_header(struct sk_buff *skb, const u32 off,
+                       skb_kfree_head(data, size);
+                       return -ENOMEM;
+               }
++              if (skb_zcopy(skb))
++                      net_zcopy_get(skb_zcopy(skb));
+               for (i = 0; i < skb_shinfo(skb)->nr_frags; i++)
+                       skb_frag_ref(skb, i);
+               if (skb_has_frag_list(skb))
+@@ -6561,6 +6563,8 @@ static int pskb_carve_inside_nonlinear(struct sk_buff *skb, const u32 off,
+               skb_kfree_head(data, size);
+               return -ENOMEM;
+       }
++      if (skb_zcopy(skb))
++              net_zcopy_get(skb_zcopy(skb));
+       skb_release_data(skb, SKB_CONSUMED, false);
+ 
+       skb->head = data;
+-- 
+2.53.0
+
diff --git a/queue-6.6/series b/queue-6.6/series
index 9a3242aa372f1d25d3f09f2bc20c30bf0a0e91d5..ca3de67c27c70ccb0b1259e4260e6684abdc21b7 100644
--- a/queue-6.6/series
+++ b/queue-6.6/series
@@ -250,3 +250,8 @@ alsa-pcm-fix-wait-queue-list-corruption-in-snd_pcm_d.patch
  usb-gadget-f_ncm-fix-net_device-lifecycle-with-devic.patch
  usb-gadget-u_ether-fix-null-pointer-deref-in-eth_get.patch
  tools-rv-fix-cleanup-after-failed-trace-setup.patch
+net-skbuff-fix-missing-zerocopy-reference-in-pskb_ca.patch
+tap-free-page-on-error-paths-in-tap_get_user_xdp.patch
+kvm-arm64-remove-vpipt-i-cache-handling.patch
+arm64-tlb-allow-xzr-argument-to-tlbi-ops.patch
+arm64-tlb-optimize-arm64_workaround_repeat_tlbi.patch
diff --git a/queue-6.6/tap-free-page-on-error-paths-in-tap_get_user_xdp.patch b/queue-6.6/tap-free-page-on-error-paths-in-tap_get_user_xdp.patch
new file mode 100644
index 0000000..2bd793a
--- /dev/null
+++ b/queue-6.6/tap-free-page-on-error-paths-in-tap_get_user_xdp.patch
@@ -0,0 +1,57 @@
+From c39e7defc92471c24a90c889fdd0690bb0a88ad5 Mon Sep 17 00:00:00 2001
+From: Sasha Levin <sashal@kernel.org>
+Date: Thu, 11 Jun 2026 06:21:06 -0700
+Subject: tap: free page on error paths in tap_get_user_xdp()
+
+From: Weiming Shi <bestswngs@gmail.com>
+
+[ Upstream commit 3bcf7aec6a9d16438f2cec29f5d7c8d5b8edf9b2 ]
+
+tap_get_user_xdp() rejects a frame shorter than ETH_HLEN with -EINVAL,
+and returns -ENOMEM when build_skb() fails. Both paths jump to the err
+label without freeing the page that vhost_net_build_xdp() allocated for
+the frame. tap_sendmsg() discards the per-buffer return value and always
+returns 0, so vhost_tx_batch() takes the success path and never frees
+the page; each rejected frame in a batch leaks one page-frag chunk.
+
+Free the page on both error paths, before the skb is built. This is the
+tap counterpart of the same leak in tun_xdp_one().
+
+Fixes: 0efac27791ee ("tap: accept an array of XDP buffs through sendmsg()")
+Fixes: ed7f2afdd0e0 ("tap: add missing verification for short frame")
+Reported-by: Xiang Mei <xmei5@asu.edu>
+Signed-off-by: Weiming Shi <bestswngs@gmail.com>
+Reviewed-by: Dongli Zhang <dongli.zhang@oracle.com>
+Reviewed-by: Willem de Bruijn <willemb@google.com>
+Link: https://patch.msgid.link/20260521163230.1478627-2-bestswngs@gmail.com
+Signed-off-by: Jakub Kicinski <kuba@kernel.org>
+(cherry picked from commit 3bcf7aec6a9d16438f2cec29f5d7c8d5b8edf9b2)
+Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+---
+ drivers/net/tap.c | 2 ++
+ 1 file changed, 2 insertions(+)
+
+diff --git a/drivers/net/tap.c b/drivers/net/tap.c
+index 2c4f9d19827f38..5f01c875e49ee4 100644
+--- a/drivers/net/tap.c
++++ b/drivers/net/tap.c
+@@ -1178,6 +1178,7 @@ static int tap_get_user_xdp(struct tap_queue *q, struct xdp_buff *xdp)
+       int err, depth;
+ 
+       if (unlikely(xdp->data_end - xdp->data < ETH_HLEN)) {
++              put_page(virt_to_head_page(xdp->data));
+               err = -EINVAL;
+               goto err;
+       }
+@@ -1187,6 +1188,7 @@ static int tap_get_user_xdp(struct tap_queue *q, struct xdp_buff *xdp)
+ 
+       skb = build_skb(xdp->data_hard_start, buflen);
+       if (!skb) {
++              put_page(virt_to_head_page(xdp->data));
+               err = -ENOMEM;
+               goto err;
+       }
+-- 
+2.53.0
+
queue-5.10/arm64-tlb-allow-xzr-argument-to-tlbi-ops.patch	[new file with mode: 0644]
queue-5.10/arm64-tlb-optimize-arm64_workaround_repeat_tlbi.patch	[new file with mode: 0644]
queue-5.10/io_uring-prevent-opcode-speculation.patch	[new file with mode: 0644]
queue-5.10/kvm-arm64-remove-vpipt-i-cache-handling.patch	[new file with mode: 0644]
queue-5.10/series
queue-5.10/tap-free-page-on-error-paths-in-tap_get_user_xdp.patch	[new file with mode: 0644]
queue-5.10/tun-free-page-on-build_skb-failure-in-tun_xdp_one.patch	[new file with mode: 0644]
queue-5.15/arm64-tlb-allow-xzr-argument-to-tlbi-ops.patch	[new file with mode: 0644]
queue-5.15/arm64-tlb-optimize-arm64_workaround_repeat_tlbi.patch	[new file with mode: 0644]
queue-5.15/kvm-arm64-remove-vpipt-i-cache-handling.patch	[new file with mode: 0644]
queue-5.15/series
queue-5.15/tap-free-page-on-error-paths-in-tap_get_user_xdp.patch	[new file with mode: 0644]
queue-5.15/tun-free-page-on-build_skb-failure-in-tun_xdp_one.patch	[new file with mode: 0644]
queue-6.1/arm64-tlb-allow-xzr-argument-to-tlbi-ops.patch	[new file with mode: 0644]
queue-6.1/arm64-tlb-optimize-arm64_workaround_repeat_tlbi.patch	[new file with mode: 0644]
queue-6.1/kvm-arm64-remove-vpipt-i-cache-handling.patch	[new file with mode: 0644]
queue-6.1/net-skbuff-fix-missing-zerocopy-reference-in-pskb_ca.patch	[new file with mode: 0644]
queue-6.1/series
queue-6.1/tap-free-page-on-error-paths-in-tap_get_user_xdp.patch	[new file with mode: 0644]
queue-6.12/arm64-tlb-allow-xzr-argument-to-tlbi-ops.patch	[new file with mode: 0644]
queue-6.12/arm64-tlb-optimize-arm64_workaround_repeat_tlbi.patch	[new file with mode: 0644]
queue-6.12/series
queue-6.12/tap-free-page-on-error-paths-in-tap_get_user_xdp.patch	[new file with mode: 0644]
queue-6.18/series
queue-6.18/tap-free-page-on-error-paths-in-tap_get_user_xdp.patch	[new file with mode: 0644]
queue-6.6/arm64-tlb-allow-xzr-argument-to-tlbi-ops.patch	[new file with mode: 0644]
queue-6.6/arm64-tlb-optimize-arm64_workaround_repeat_tlbi.patch	[new file with mode: 0644]
queue-6.6/kvm-arm64-remove-vpipt-i-cache-handling.patch	[new file with mode: 0644]
queue-6.6/net-skbuff-fix-missing-zerocopy-reference-in-pskb_ca.patch	[new file with mode: 0644]
queue-6.6/series
queue-6.6/tap-free-page-on-error-paths-in-tap_get_user_xdp.patch	[new file with mode: 0644]