Marc Zyngier [Thu, 5 Feb 2026 09:17:30 +0000 (09:17 +0000)]
Merge branch kvm-arm64/gicv5-prologue into kvmarm-master/next
* kvm-arm64/gicv5-prologue:
: .
: Prologue to GICv5 support, courtesy of Sascha Bischoff.
:
: This is preliminary work that sets the scene for the full-blow
: support.
: .
irqchip/gic-v5: Check if impl is virt capable
KVM: arm64: gic: Set vgic_model before initing private IRQs
arm64/sysreg: Drop ICH_HFGRTR_EL2.ICC_HAPR_EL1 and make RES1
KVM: arm64: gic-v3: Switch vGIC-v3 to use generated ICH_VMCR_EL2
Marc Zyngier [Thu, 5 Feb 2026 09:17:04 +0000 (09:17 +0000)]
Merge branch kvm-arm64/gicv3-tdir-fixes into kvmarm-master/next
* kvm-arm64/gicv3-tdir-fixes:
: .
: Address two trapping-related issues when running legacy (i.e. GICv3)
: guests on GICv5 hosts, courtesy of Sascha Bischoff.
: .
KVM: arm64: Correct test for ICH_HCR_EL2_TDIR cap for GICv5 hosts
KVM: arm64: gic: Enable GICv3 CPUIF trapping on GICv5 hosts if required
Marc Zyngier [Thu, 5 Feb 2026 09:16:42 +0000 (09:16 +0000)]
Merge branch kvm-arm64/fwb-for-all into kvmarm-master/next
* kvm-arm64/fwb-for-all:
: .
: Allow pKVM's host stage-2 mappings to use the Force Write Back version
: of the memory attributes by using the "pass-through' encoding.
:
: This avoids having two separate encodings for S2 on a given platform.
: .
KVM: arm64: Simplify PAGE_S2_MEMATTR
KVM: arm64: Kill KVM_PGTABLE_S2_NOFWB
KVM: arm64: Switch pKVM host S2 over to KVM_PGTABLE_S2_AS_S1
KVM: arm64: Add KVM_PGTABLE_S2_AS_S1 flag
arm64: Add MT_S2{,_FWB}_AS_S1 encodings
Marc Zyngier [Thu, 5 Feb 2026 09:16:31 +0000 (09:16 +0000)]
Merge branch kvm-arm64/pkvm-no-mte into kvmarm-master/next
* kvm-arm64/pkvm-no-mte:
: .
: pKVM updates preventing the host from using MTE-related system
: sysrem registers when the feature is disabled from the kernel
: command-line (arm64.nomte), courtesy of Fuad Taba.
:
: From the cover letter:
:
: "If MTE is supported by the hardware (and is enabled at EL3), it remains
: available to lower exception levels by default. Disabling it in the host
: kernel (e.g., via 'arm64.nomte') only stops the kernel from advertising
: the feature; it does not physically disable MTE in the hardware.
:
: The ability to disable MTE in the host kernel is used by some systems,
: such as Android, so that the physical memory otherwise used as tag
: storage can be used for other things (i.e. treated just like the rest of
: memory). In this scenario, a malicious host could still access tags in
: pages donated to a guest using MTE instructions (e.g., STG and LDG),
: bypassing the kernel's configuration."
: .
KVM: arm64: Use kvm_has_mte() in pKVM trap initialization
KVM: arm64: Inject UNDEF when accessing MTE sysregs with MTE disabled
KVM: arm64: Trap MTE access and discovery when MTE is disabled
KVM: arm64: Remove dead code resetting HCR_EL2 for pKVM
Sascha Bischoff [Wed, 28 Jan 2026 18:07:33 +0000 (18:07 +0000)]
irqchip/gic-v5: Check if impl is virt capable
Now that there is support for creating a GICv5-based guest with KVM,
check that the hardware itself supports virtualisation, skipping the
setting of struct gic_kvm_info if not.
Note: If native GICv5 virt is not supported, then nor is
FEAT_GCIE_LEGACY, so we are able to skip altogether.
Signed-off-by: Sascha Bischoff <sascha.bischoff@arm.com> Reviewed-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Link: https://patch.msgid.link/20260128175919.3828384-33-sascha.bischoff@arm.com
[maz: cosmetic changes] Signed-off-by: Marc Zyngier <maz@kernel.org>
Sascha Bischoff [Wed, 28 Jan 2026 18:00:52 +0000 (18:00 +0000)]
KVM: arm64: gic: Set vgic_model before initing private IRQs
Different GIC types require the private IRQs to be initialised
differently. GICv5 is the culprit as it supports both a different
number of private IRQs, and all of these are PPIs (there are no
SGIs). Moreover, as GICv5 uses the top bits of the interrupt ID to
encode the type, the intid also needs to computed differently.
Up until now, the GIC model has been set after initialising the
private IRQs for a VCPU. Move this earlier to ensure that the GIC
model is available when configuring the private IRQs. While we're at
it, also move the setting of the in_kernel flag and implementation
revision to keep them grouped together as before.
Sascha Bischoff [Wed, 28 Jan 2026 18:00:06 +0000 (18:00 +0000)]
arm64/sysreg: Drop ICH_HFGRTR_EL2.ICC_HAPR_EL1 and make RES1
The GICv5 architecture is dropping the ICC_HAPR_EL1 and ICV_HAPR_EL1
system registers. These registers were never added to the sysregs, but
the traps for them were.
Drop the trap bit from the ICH_HFGRTR_EL2 and make it Res1 as per the
upcoming GICv5 spec change. Additionally, update the EL2 setup code to
not attempt to set that bit.
Sascha Bischoff [Wed, 28 Jan 2026 17:59:50 +0000 (17:59 +0000)]
KVM: arm64: gic-v3: Switch vGIC-v3 to use generated ICH_VMCR_EL2
The VGIC-v3 code relied on hand-written definitions for the
ICH_VMCR_EL2 register. This register, and the associated fields, is
now generated as part of the sysreg framework. Move to using the
generated definitions instead of the hand-written ones.
There are no functional changes as part of this change.
Sascha Bischoff [Mon, 8 Dec 2025 15:28:23 +0000 (15:28 +0000)]
KVM: arm64: Correct test for ICH_HCR_EL2_TDIR cap for GICv5 hosts
The original order of checks in the ICH_HCR_EL2_TDIR test returned
with false early in the case where the native GICv3 CPUIF was not
present. The result was that on GICv5 hosts with legacy support -
which do not have the GICv3 CPUIF - the test always returned false.
Reshuffle the checks such that support for GICv5 legacy is checked
prior to checking for the native GICv3 CPUIF.
Sascha Bischoff [Mon, 8 Dec 2025 15:28:22 +0000 (15:28 +0000)]
KVM: arm64: gic: Enable GICv3 CPUIF trapping on GICv5 hosts if required
Factor out the enable (and printing of) the GICv3 CPUIF traps from the
main GICv3 probe into a separate function. Call said function from the
GICv5 probe for legacy support, ensuring that any required GICv3 CPUIF
traps on GICv5 hosts will be correctly handled, rather than injecting
an undef into the guest.
Marc Zyngier [Fri, 23 Jan 2026 19:16:36 +0000 (19:16 +0000)]
KVM: arm64: Kill KVM_PGTABLE_S2_NOFWB
Nobody is using this flag anymore, so remove it. This allows
some cleanup by removing stage2_has_fwb(), which is can be replaced
by a direct check on the capability.
Marc Zyngier [Fri, 23 Jan 2026 19:16:33 +0000 (19:16 +0000)]
arm64: Add MT_S2{,_FWB}_AS_S1 encodings
pKVM usage of S2 translation on the host is purely for isolation
purposes, not translation. To that effect, the memory attributes
being used must be that of S1.
With FWB=0, this is easily achieved by using the Normal Cacheable
type (which is the weakest possible memory type) at S2, and let S1
pick something stronger as required.
With FWB=1, the attributes are combined in a different way, and we
cannot arbitrarily use Normal Cacheable. We can, however, use a
memattr encoding that indicates that the final attributes are that
of Stage-1.
Add these encoding and a few pointers to the relevant parts of the
specification. It might come handy some day.
Fuad Tabba [Thu, 22 Jan 2026 11:22:18 +0000 (11:22 +0000)]
KVM: arm64: Use kvm_has_mte() in pKVM trap initialization
When initializing HCR traps in protected mode, use kvm_has_mte() to
check for MTE support rather than kvm_has_feat(kvm, ID_AA64PFR1_EL1,
MTE, IMP).
kvm_has_mte() provides a more comprehensive check:
- kvm_has_feat() only checks if MTE is in the guest's ID register view
(i.e., what we advertise to the guest)
- kvm_has_mte() checks both system_supports_mte() AND whether
KVM_ARCH_FLAG_MTE_ENABLED is set for this VM instance
Fuad Tabba [Thu, 22 Jan 2026 11:22:17 +0000 (11:22 +0000)]
KVM: arm64: Inject UNDEF when accessing MTE sysregs with MTE disabled
When MTE hardware is present but disabled via software (`arm64.nomte` or
`CONFIG_ARM64_MTE=n`), the kernel clears `HCR_EL2.ATA` and sets
`HCR_EL2.TID5`, to prevent the use of MTE instructions.
Additionally, accesses to certain MTE system registers trap to EL2 with
exception class ESR_ELx_EC_SYS64. To emulate hardware without MTE (where
such accesses would cause an Undefined Instruction exception), inject
UNDEF into the host.
Fuad Tabba [Thu, 22 Jan 2026 11:22:16 +0000 (11:22 +0000)]
KVM: arm64: Trap MTE access and discovery when MTE is disabled
If MTE is not supported by the hardware, or is disabled in the kernel
configuration (`CONFIG_ARM64_MTE=n`) or command line (`arm64.nomte`),
the kernel stops advertising MTE to userspace and avoids using MTE
instructions. However, this is a software-level disable only.
When MTE hardware is present and enabled by EL3 firmware, leaving
`HCR_EL2.ATA` set allows the host to execute MTE instructions (STG, LDG,
etc.) and access allocation tags in physical memory.
Prevent this by clearing `HCR_EL2.ATA` when MTE is disabled. Remove it
from the `HCR_HOST_NVHE_FLAGS` default, and conditionally set it in
`cpu_prepare_hyp_mode()` only when `system_supports_mte()` returns true.
This causes MTE instructions to trap to EL2 when `HCR_EL2.ATA` is
cleared.
Additionally, set `HCR_EL2.TID5` when MTE is disabled. This traps reads
of `GMID_EL1` (Multiple tag transfer ID register) to EL2, preventing the
discovery of MTE parameters (such as tag block size) when the feature is
suppressed.
Early boot code in `head.S` temporarily keeps `HCR_ATA` set to avoid
special-casing initialization paths. This is safe because this code
executes before untrusted code runs and will clear `HCR_ATA` if MTE is
disabled.
Fuad Tabba [Thu, 22 Jan 2026 11:22:15 +0000 (11:22 +0000)]
KVM: arm64: Remove dead code resetting HCR_EL2 for pKVM
The pKVM lifecycle does not support tearing down the hypervisor and
returning to the hyp stub once initialized. The transition to protected
mode is one-way.
Consequently, the code path in hyp-init.S responsible for resetting
EL2 registers (triggered by kexec or hibernation) is unreachable in
protected mode.
Remove the dead code handling HCR_EL2 reset for
ARM64_KVM_PROTECTED_MODE.
Marc Zyngier [Fri, 23 Jan 2026 10:04:47 +0000 (10:04 +0000)]
Merge branch kvm-arm64/pkvm-features-6.20 into kvmarm-master/next
* kvm-arm64/pkvm-features-6.20:
: .
: pKVM guest feature trapping fixes, courtesy of Fuad Tabba.
: .
KVM: arm64: Prevent host from managing timer offsets for protected VMs
KVM: arm64: Check whether a VM IOCTL is allowed in pKVM
KVM: arm64: Track KVM IOCTLs and their associated KVM caps
KVM: arm64: Do not allow KVM_CAP_ARM_MTE for any guest in pKVM
KVM: arm64: Include VM type when checking VM capabilities in pKVM
KVM: arm64: Introduce helper to calculate fault IPA offset
KVM: arm64: Fix MTE flag initialization for protected VMs
KVM: arm64: Fix Trace Buffer trap polarity for protected VMs
KVM: arm64: Fix Trace Buffer trapping for protected VMs
Marc Zyngier [Fri, 23 Jan 2026 10:04:35 +0000 (10:04 +0000)]
Merge branch kvm-arm64/feat_idst into kvmarm-master/next
* kvm-arm64/feat_idst:
: .
: Add support for FEAT_IDST, allowing ID registers that are not implemented
: to be reported as a normal trap rather than as an UNDEF exception.
: .
KVM: arm64: selftests: Add a test for FEAT_IDST
KVM: arm64: pkvm: Report optional ID register traps with a 0x18 syndrome
KVM: arm64: pkvm: Add a generic synchronous exception injection primitive
KVM: arm64: Force trap of GMID_EL1 when the guest doesn't have MTE
KVM: arm64: Handle CSSIDR2_EL1 and SMIDR_EL1 in a generic way
KVM: arm64: Handle FEAT_IDST for sysregs without specific handlers
KVM: arm64: Add a generic synchronous exception injection primitive
KVM: arm64: Add trap routing for GMID_EL1
arm64: Repaint ID_AA64MMFR2_EL1.IDS description
Marc Zyngier [Fri, 23 Jan 2026 10:03:03 +0000 (10:03 +0000)]
Merge branch kvm-arm64/vtcr into kvmarm-master/next
* kvm-arm64/vtcr:
: .
: VTCR_EL2 conversion to the configuration-driven RESx framework,
: fixing a couple of UXN/PXN/XN bugs in the process.
: .
KVM: arm64: nv: Return correct RES0 bits for FGT registers
KVM: arm64: Always populate FGT masks at boot time
KVM: arm64: Honor UX/PX attributes for EL2 S1 mappings
KVM: arm64: Convert VTCR_EL2 to config-driven sanitisation
KVM: arm64: Account for RES1 bits in DECLARE_FEAT_MAP() and co
arm64: Convert VTCR_EL2 to sysreg infratructure
arm64: Convert ID_AA64MMFR0_EL1.TGRAN{4,16,64}_2 to UnsignedEnum
KVM: arm64: Invert KVM_PGTABLE_WALK_HANDLE_FAULT to fix pKVM walkers
KVM: arm64: Don't blindly set set PSTATE.PAN on guest exit
KVM: arm64: nv: Respect stage-2 write permssion when setting stage-1 AF
KVM: arm64: Remove unused vcpu_{clear,set}_wfx_traps()
KVM: arm64: Remove unused parameter in synchronize_vcpu_pstate()
KVM: arm64: Remove extra argument for __pvkm_host_{share,unshare}_hyp()
KVM: arm64: Inject UNDEF for a register trap without accessor
KVM: arm64: Copy FGT traps to unprotected pKVM VCPU on VCPU load
KVM: arm64: Fix EL2 S1 XN handling for hVHE setups
KVM: arm64: gic: Check for vGICv3 when clearing TWI
Marc Zyngier [Fri, 23 Jan 2026 09:51:57 +0000 (09:51 +0000)]
Merge branch arm64/for-next/cpufeature into kvmarm-master/next
Merge arm64/for-next/cpufeature in to resolve conflicts resulting from
the removal of CONFIG_PAN.
* arm64/for-next/cpufeature:
arm64: Add support for FEAT_{LS64, LS64_V}
KVM: arm64: Enable FEAT_{LS64, LS64_V} in the supported guest
arm64: Provide basic EL2 setup for FEAT_{LS64, LS64_V} usage at EL0/1
KVM: arm64: Handle DABT caused by LS64* instructions on unsupported memory
KVM: arm64: Add documentation for KVM_EXIT_ARM_LDST64B
KVM: arm64: Add exit to userspace on {LD,ST}64B* outside of memslots
arm64: Unconditionally enable PAN support
arm64: Unconditionally enable LSE support
arm64: Add support for TSV110 Spectre-BHB mitigation
Yicong Yang [Mon, 19 Jan 2026 02:29:27 +0000 (10:29 +0800)]
arm64: Add support for FEAT_{LS64, LS64_V}
Armv8.7 introduces single-copy atomic 64-byte loads and stores
instructions and its variants named under FEAT_{LS64, LS64_V}.
These features are identified by ID_AA64ISAR1_EL1.LS64 and the
use of such instructions in userspace (EL0) can be trapped.
As st64bv (FEAT_LS64_V) and st64bv0 (FEAT_LS64_ACCDATA) can not be tell
apart, FEAT_LS64 and FEAT_LS64_ACCDATA which will be supported in later
patch will be exported to userspace, FEAT_LS64_V will be enabled only
in kernel.
In order to support the use of corresponding instructions in userspace:
- Make ID_AA64ISAR1_EL1.LS64 visbile to userspace
- Add identifying and enabling in the cpufeature list
- Expose these support of these features to userspace through HWCAP3
and cpuinfo
ld64b/st64b (FEAT_LS64) and st64bv (FEAT_LS64_V) is intended for
special memory (device memory) so requires support by the CPU, system
and target memory location (device that support these instructions).
The HWCAP3_LS64, implies the support of CPU and system (since no
identification method from system, so SoC vendors should advertise
support in the CPU if system also support them).
Otherwise for ld64b/st64b the atomicity may not be guaranteed or a
DABT will be generated, so users (probably userspace driver developer)
should make sure the target memory (device) also have the support.
For st64bv 0xffffffffffffffff will be returned as status result for
unsupported memory so user should check it.
Document the restrictions along with HWCAP3_LS64.
Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Oliver Upton <oupton@kernel.org> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com> Signed-off-by: Will Deacon <will@kernel.org>
Yicong Yang [Mon, 19 Jan 2026 02:29:26 +0000 (10:29 +0800)]
KVM: arm64: Enable FEAT_{LS64, LS64_V} in the supported guest
Using FEAT_{LS64, LS64_V} instructions in a guest is also controlled
by HCRX_EL2.{EnALS, EnASR}. Enable it if guest has related feature.
Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Oliver Upton <oupton@kernel.org> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com> Signed-off-by: Will Deacon <will@kernel.org>
Yicong Yang [Mon, 19 Jan 2026 02:29:25 +0000 (10:29 +0800)]
arm64: Provide basic EL2 setup for FEAT_{LS64, LS64_V} usage at EL0/1
Instructions introduced by FEAT_{LS64, LS64_V} is controlled by
HCRX_EL2.{EnALS, EnASR}. Configure all of these to allow usage
at EL0/1.
This doesn't mean these instructions are always available in
EL0/1 if provided. The hypervisor still have the control at
runtime.
Acked-by: Will Deacon <will@kernel.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Oliver Upton <oupton@kernel.org> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com> Signed-off-by: Will Deacon <will@kernel.org>
Yicong Yang [Mon, 19 Jan 2026 02:29:24 +0000 (10:29 +0800)]
KVM: arm64: Handle DABT caused by LS64* instructions on unsupported memory
If FEAT_LS64WB not supported, FEAT_LS64* instructions only support
to access Device/Uncacheable memory, otherwise a data abort for
unsupported Exclusive or atomic access (0x35, UAoEF) is generated
per spec. It's implementation defined whether the target exception
level is routed and is possible to implemented as route to EL2 on a
VHE VM according to DDI0487L.b Section C3.2.6 Single-copy atomic
64-byte load/store.
If it's implemented as generate the DABT to the final enabled stage
(stage-2), inject the UAoEF back to the guest after checking the
memslot is valid.
Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Oliver Upton <oupton@kernel.org> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com> Signed-off-by: Will Deacon <will@kernel.org>
Marc Zyngier [Mon, 19 Jan 2026 02:29:23 +0000 (10:29 +0800)]
KVM: arm64: Add documentation for KVM_EXIT_ARM_LDST64B
Add a bit of documentation for KVM_EXIT_ARM_LDST64B so that userspace
knows what to expect.
Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Oliver Upton <oupton@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com> Signed-off-by: Will Deacon <will@kernel.org>
Marc Zyngier [Mon, 19 Jan 2026 02:29:22 +0000 (10:29 +0800)]
KVM: arm64: Add exit to userspace on {LD,ST}64B* outside of memslots
The main use of {LD,ST}64B* is to talk to a device, which is hopefully
directly assigned to the guest and requires no additional handling.
However, this does not preclude a VMM from exposing a virtual device
to the guest, and to allow 64 byte accesses as part of the programming
interface. A direct consequence of this is that we need to be able
to forward such access to userspace.
Given that such a contraption is very unlikely to ever exist, we choose
to offer a limited service: userspace gets (as part of a new exit reason)
the ESR, the IPA, and that's it. It is fully expected to handle the full
semantics of the instructions, deal with ACCDATA, the return values and
increment PC. Much fun.
A canonical implementation can also simply inject an abort and be done
with it. Frankly, don't try to do anything else unless you have time
to waste.
Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Oliver Upton <oupton@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com> Signed-off-by: Will Deacon <will@kernel.org>
Marc Zyngier [Wed, 7 Jan 2026 18:06:59 +0000 (18:06 +0000)]
arm64: Unconditionally enable LSE support
LSE atomics have been in the architecture since ARMv8.1 (released in
2014), and are hopefully supported by all modern toolchains.
Drop the optional nature of LSE support in the kernel, and always
compile the support in, as this really is very little code. LL/SC
still is the default, and the switch to LSE is done dynamically.
Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Will Deacon <will@kernel.org>
KVM: arm64: nv: Return correct RES0 bits for FGT registers
We had extended the sysreg masking infrastructure to more general
registers, instead of restricting it to VNCR-backed registers, since
commit a0162020095e ("KVM: arm64: Extend masking facility to arbitrary
registers"). Fix kvm_get_sysreg_res0() to reflect this fact.
Note that we're sure that we only deal with FGT registers in
kvm_get_sysreg_res0(), the
if (sr < __VNCR_START__)
is actually a never false, which should probably be removed later.
Fixes: 69c19e047dfe ("KVM: arm64: Add TCR2_EL2 to the sysreg arrays") Signed-off-by: Zenghui Yu (Huawei) <zenghui.yu@linux.dev> Link: https://patch.msgid.link/20260121101631.41037-1-zenghui.yu@linux.dev Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: stable@vger.kernel.org
Marc Zyngier [Thu, 22 Jan 2026 08:51:53 +0000 (08:51 +0000)]
KVM: arm64: Always populate FGT masks at boot time
We currently only populate the FGT masks if the underlying HW does
support FEAT_FGT. However, with the addition of the RES1 support for
system registers, this results in a lot of noise at boot time, as
reported by Nathan.
That's because even if FGT isn't supported, we still check for the
attribution of the bits to particular features, and not keeping the
masks up-to-date leads to (fairly harmess) warnings.
Given that we want these checks to be enforced even if the HW doesn't
support FGT, enable the generation of FGT masks unconditionally (this
is rather cheap anyway). Only the storage of the FGT configuration is
avoided, which will save a tiny bit of memory on these machines.
Fuad Tabba [Thu, 11 Dec 2025 10:47:09 +0000 (10:47 +0000)]
KVM: arm64: Prevent host from managing timer offsets for protected VMs
For protected VMs, the guest's timer offset state should not be
controlled by the host and must always run with a virtual counter offset
of 0. The existing timer logic allowed the host to set and manage the
timer counter offsets for protected VMs in certain cases.
Disable all host-side management of timer offsets for protected VMs by
adding checks in the relevant code paths.
Fuad Tabba [Thu, 11 Dec 2025 10:47:08 +0000 (10:47 +0000)]
KVM: arm64: Check whether a VM IOCTL is allowed in pKVM
Certain VM IOCTLs are tied to specific VM features. Since pKVM does not
support all features, restrict which IOCTLs are allowed depending on
whether the associated feature is supported.
Use the existing VM capability check as the source of truth to whether
an IOCTL is allowed for a particular VM by mapping the IOCTLs with their
associated capabilities.
Fuad Tabba [Thu, 11 Dec 2025 10:47:07 +0000 (10:47 +0000)]
KVM: arm64: Track KVM IOCTLs and their associated KVM caps
Track KVM IOCTLs (VM IOCTLs for now), and the associated KVM capability
that enables that IOCTL. Add a function that performs the lookup.
This will be used by CoCo VM Hypervisors (e.g., pKVM) to determine
whether a particular KVM IOCTL is allowed for its VMs.
Suggested-by: Oliver Upton <oupton@kernel.org> Signed-off-by: Fuad Tabba <tabba@google.com>
[maz: don't expose KVM_CAP_BASIC to userspace, and rely on NR_VCPUS
as a proxy for this] Link: https://patch.msgid.link/20251211104710.151771-8-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
Fuad Tabba [Thu, 11 Dec 2025 10:47:06 +0000 (10:47 +0000)]
KVM: arm64: Do not allow KVM_CAP_ARM_MTE for any guest in pKVM
Supporting MTE in pKVM introduces significant complexity to the
hypervisor at EL2, even for non-protected VMs, since it would require
EL2 to handle tag management.
For now, do not allow KVM_CAP_ARM_MTE for any VM type in protected mode.
Fuad Tabba [Thu, 11 Dec 2025 10:47:05 +0000 (10:47 +0000)]
KVM: arm64: Include VM type when checking VM capabilities in pKVM
Certain features and capabilities are restricted in protected mode. Most
of these features are restricted only for protected VMs, but some
are restricted for ALL VMs in protected mode.
Extend the pKVM capability check to pass the VM (kvm), and use that when
determining supported features.
Fuad Tabba [Thu, 11 Dec 2025 10:47:03 +0000 (10:47 +0000)]
KVM: arm64: Fix MTE flag initialization for protected VMs
The function pkvm_init_features_from_host() initializes guest
features, propagating them from the host. The logic to propagate
KVM_ARCH_FLAG_MTE_ENABLED (Memory Tagging Extension)
has a couple of issues.
First, the check was in the common path, before the divergence for
protected and non-protected VMs. For non-protected VMs, this was
unnecessary, as 'kvm->arch.flags' is completely overwritten by
host_arch_flags immediately after, which already contains the MTE flag.
For protected VMs, this was setting the flag even if the feature is not
allowed.
Second, the check was reading 'host_kvm->arch.flags' instead of using
the local 'host_arch_flags', which is read once from the host flags.
Fix these by moving the MTE flag check inside the protected-VM-only
path, checking if the feature is allowed, and changing it to use the
correct host_arch_flags local variable. This ensures non-protected VMs
get the flag via the bulk copy, and protected VMs get it via an explicit
check.
Fixes: b7f345fbc32a ("KVM: arm64: Fix FEAT_MTE in pKVM") Reviewed-by: Ben Horgan <ben.horgan@arm.com> Signed-off-by: Fuad Tabba <tabba@google.com> Link: https://patch.msgid.link/20251211104710.151771-4-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
Fuad Tabba [Thu, 11 Dec 2025 10:47:02 +0000 (10:47 +0000)]
KVM: arm64: Fix Trace Buffer trap polarity for protected VMs
The E2TB bits in MDCR_EL2 control trapping of Trace Buffer system
register accesses. These accesses are trapped to EL2 when the bits are
clear.
The trap initialization logic for protected VMs in pvm_init_traps_mdcr()
had the polarity inverted. When a guest did not support the Trace Buffer
feature, the code was setting E2TB. This incorrectly disabled the trap,
potentially allowing a protected guest to access registers for a feature
it was not given.
Fix this by inverting the operation.
Fixes: f50758260bff ("KVM: arm64: Group setting traps for protected VMs by control register") Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Fuad Tabba <tabba@google.com> Link: https://patch.msgid.link/20251211104710.151771-3-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
Fuad Tabba [Thu, 11 Dec 2025 10:47:01 +0000 (10:47 +0000)]
KVM: arm64: Fix Trace Buffer trapping for protected VMs
For protected VMs in pKVM, the hypervisor should trap accesses to trace
buffer system registers if Trace Buffer isn't supported by the VM.
However, the current code only traps if Trace Buffer External Mode isn't
supported.
Fix this by checking for FEAT_TRBE (Trace Buffer) rather than
FEAT_TRBE_EXT.
Fixes: 9d5261269098 ("KVM: arm64: Trap external trace for protected VMs") Reported-by: Suzuki K Poulose <suzuki.poulose@arm.com> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Fuad Tabba <tabba@google.com> Link: https://patch.msgid.link/20251211104710.151771-2-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
Fuad Tabba [Fri, 9 Jan 2026 08:22:18 +0000 (08:22 +0000)]
KVM: selftests: Fix typos and stale comments in kvm_util
Fix minor documentation errors in `kvm_util.h` and `kvm_util.c`.
- Correct the argument description for `vcpu_args_set` in `kvm_util.h`,
which incorrectly listed `vm` instead of `vcpu`.
- Fix a typo in the comment for `kvm_selftest_arch_init` ("exeucting" ->
"executing").
- Correct the return value description for `vm_vaddr_unused_gap` in
`kvm_util.c` to match the implementation, which returns an address "at
or above" `vaddr_min`, not "at or below".
Fuad Tabba [Fri, 9 Jan 2026 08:22:17 +0000 (08:22 +0000)]
KVM: selftests: Move page_align() to shared header
To avoid code duplication, move page_align() to the shared `kvm_util.h`
header file. Rename it to vm_page_align(), to make it clear that the
alignment is done with respect to the guest's base page size.
Fuad Tabba [Fri, 9 Jan 2026 08:22:16 +0000 (08:22 +0000)]
KVM: riscv: selftests: Fix incorrect rounding in page_align()
The implementation of `page_align()` in `processor.c` calculates
alignment incorrectly for values that are already aligned. Specifically,
`(v + vm->page_size) & ~(vm->page_size - 1)` aligns to the *next* page
boundary even if `v` is already page-aligned, potentially wasting a page
of memory.
Fix the calculation to use standard alignment logic: `(v + vm->page_size
- 1) & ~(vm->page_size - 1)`.
Fixes: 3e06cdf10520 ("KVM: selftests: Add initial support for RISC-V 64-bit") Reviewed-by: Andrew Jones <andrew.jones@linux.dev> Signed-off-by: Fuad Tabba <tabba@google.com> Link: https://patch.msgid.link/20260109082218.3236580-4-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
Fuad Tabba [Fri, 9 Jan 2026 08:22:15 +0000 (08:22 +0000)]
KVM: arm64: selftests: Fix incorrect rounding in page_align()
The implementation of `page_align()` in `processor.c` calculates
alignment incorrectly for values that are already aligned. Specifically,
`(v + vm->page_size) & ~(vm->page_size - 1)` aligns to the *next* page
boundary even if `v` is already page-aligned, potentially wasting a page
of memory.
Fix the calculation to use standard alignment logic: `(v + vm->page_size
- 1) & ~(vm->page_size - 1)`.
Fixes: 7a6629ef746d ("kvm: selftests: add virt mem support for aarch64") Reviewed-by: Andrew Jones <andrew.jones@linux.dev> Signed-off-by: Fuad Tabba <tabba@google.com> Link: https://patch.msgid.link/20260109082218.3236580-3-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
KVM selftests map all guest code and data into the lower virtual address
range (0x0000...) managed by TTBR0_EL1. The upper range (0xFFFF...)
managed by TTBR1_EL1 is unused and uninitialized.
If a guest accesses the upper range, the MMU attempts a translation
table walk using uninitialized registers, leading to unpredictable
behavior.
Set `TCR_EL1.EPD1` to disable translation table walks for TTBR1_EL1,
ensuring that any access to the upper range generates an immediate
Translation Fault. Additionally, set `TCR_EL1.TBI1` (Top Byte Ignore) to
ensure that tagged pointers in the upper range also deterministically
trigger a Translation Fault via EPD1.
Define `TCR_EPD1_MASK`, `TCR_EPD1_SHIFT`, and `TCR_TBI1` in
`processor.h` to support this configuration. These are based on their
definitions in `arch/arm64/include/asm/pgtable-hwdef.h`.
Marc Zyngier [Thu, 8 Jan 2026 17:32:32 +0000 (17:32 +0000)]
KVM: arm64: pkvm: Report optional ID register traps with a 0x18 syndrome
With FEAT_IDST, unimplemented system registers in the feature ID space
must be reported using EC=0x18 at the closest handling EL, rather than
with an UNDEF.
Most of these system registers are always implemented thanks to their
dependency on FEAT_AA64, except for a set of (currently) three registers:
GMID_EL1 (depending on MTE2), CCSIDR2_EL1 (depending on FEAT_CCIDX),
and SMIDR_EL1 (depending on SME).
For these three registers, report their trap as EC=0x18 if they
end-up trapping into KVM and that FEAT_IDST is implemented in the guest.
Otherwise, just make them UNDEF.
Marc Zyngier [Thu, 8 Jan 2026 17:32:30 +0000 (17:32 +0000)]
KVM: arm64: Force trap of GMID_EL1 when the guest doesn't have MTE
If our host has MTE, but the guest doesn't, make sure we set HCR_EL2.TID5
to force GMID_EL1 being trapped. Such trap will be handled by the
FEAT_IDST handling.
Marc Zyngier [Thu, 8 Jan 2026 17:32:28 +0000 (17:32 +0000)]
KVM: arm64: Handle FEAT_IDST for sysregs without specific handlers
Add a bit of infrastrtcture to triage_sysreg_trap() to handle the
case of registers falling into the Feature ID space that do not
have a local handler.
For these, we can directly apply the FEAT_IDST semantics and inject
an EC=0x18 exception. Otherwise, an UNDEF will do.
Marc Zyngier [Thu, 8 Jan 2026 17:32:26 +0000 (17:32 +0000)]
KVM: arm64: Add trap routing for GMID_EL1
HCR_EL2.TID5 is currently ignored by the trap routing infrastructure.
Wire it in the routing table so that GMID_EL1, the sole register
trapped by this bit, is correctly handled in the NV case.
Marc Zyngier [Thu, 8 Jan 2026 17:32:25 +0000 (17:32 +0000)]
arm64: Repaint ID_AA64MMFR2_EL1.IDS description
ID_AA64MMFR2_EL1.IDS, as described in the sysreg file, is pretty horrible
as it diesctly give the ESR value. Repaint it using the usual NI/IMP
identifiers to describe the absence/presence of FEAT_IDST.
Also add the new EL3 routing feature, even if we really don't care about it.
Marc Zyngier [Wed, 10 Dec 2025 17:30:24 +0000 (17:30 +0000)]
KVM: arm64: Honor UX/PX attributes for EL2 S1 mappings
Now that we potentially have two bits to deal with when setting
execution permissions, make sure we correctly handle them when both
when building the page tables and when reading back from them.
Marc Zyngier [Wed, 10 Dec 2025 17:30:23 +0000 (17:30 +0000)]
KVM: arm64: Convert VTCR_EL2 to config-driven sanitisation
Describe all the VTCR_EL2 fields and their respective configurations,
making sure that we correctly ignore the bits that are not defined
for a given guest configuration.
Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com> Reviewed-by: Fuad Tabba <tabba@google.com> Tested-by: Fuad Tabba <tabba@google.com> Link: https://patch.msgid.link/20251210173024.561160-6-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
Marc Zyngier [Wed, 10 Dec 2025 17:30:22 +0000 (17:30 +0000)]
KVM: arm64: Account for RES1 bits in DECLARE_FEAT_MAP() and co
None of the registers we manage in the feature dependency infrastructure
so far has any RES1 bit. This is about to change, as VTCR_EL2 has
its bit 31 being RES1.
In order to not fail the consistency checks by not describing a bit,
add RES1 bits to the set of immutable bits. This requires some extra
surgery for the FGT handling, as we now need to track RES1 bits there
as well.
There are no RES1 FGT bits *yet*. Watch this space.
Marc Zyngier [Wed, 10 Dec 2025 17:30:21 +0000 (17:30 +0000)]
arm64: Convert VTCR_EL2 to sysreg infratructure
Our definition of VTCR_EL2 is both partial (tons of fields are
missing) and totally inconsistent (some constants are shifted,
some are not). They are also expressed in terms of TCR, which is
rather inconvenient.
Replace the ad-hoc definitions with the the generated version.
This results in a bunch of additional changes to make the code
with the unshifted nature of generated enumerations.
The register data was extracted from the BSD licenced AARCHMRS
(AARCHMRS_OPENSOURCE_A_profile_FAT-2025-09_ASL0).
Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com> Reviewed-by: Fuad Tabba <tabba@google.com> Tested-by: Fuad Tabba <tabba@google.com> Link: https://patch.msgid.link/20251210173024.561160-4-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
Marc Zyngier [Wed, 10 Dec 2025 17:30:20 +0000 (17:30 +0000)]
arm64: Convert ID_AA64MMFR0_EL1.TGRAN{4,16,64}_2 to UnsignedEnum
ID_AA64MMFR0_EL1.TGRAN{4,16,64}_2 are currently represented as unordered
enumerations. However, the architecture treats them as Unsigned,
as hinted to by the MRS data:
Will Deacon [Mon, 5 Jan 2026 15:49:09 +0000 (15:49 +0000)]
KVM: arm64: Invert KVM_PGTABLE_WALK_HANDLE_FAULT to fix pKVM walkers
Commit ddcadb297ce5 ("KVM: arm64: Ignore EAGAIN for walks outside of a
fault") introduced a new walker flag ('KVM_PGTABLE_WALK_HANDLE_FAULT')
to KVM's page-table code. When set, the walk logic maintains its
previous behaviour of terminating a walk as soon as the visitor callback
returns an error. However, when the flag is clear, the walk will
continue if the visitor returns -EAGAIN and the error is then suppressed
and returned as zero to the caller.
Clearing the flag is beneficial when write-protecting a range of IPAs
with kvm_pgtable_stage2_wrprotect() but is not useful in any other
cases, either because we are operating on a single page (e.g.
kvm_pgtable_stage2_mkyoung() or kvm_phys_addr_ioremap()) or because the
early termination is desirable (e.g. when mapping pages from a fault in
user_mem_abort()).
Subsequently, commit e912efed485a ("KVM: arm64: Introduce the EL1 pKVM
MMU") hooked up pKVM's hypercall interface to the MMU code at EL1 but
failed to propagate any of the walker flags. As a result, page-table
walks at EL2 fail to set KVM_PGTABLE_WALK_HANDLE_FAULT even when the
early termination semantics are desirable on the fault handling path.
Rather than complicate the pKVM hypercall interface, invert the flag so
that the whole thing can be simplified and only pass the new flag
('KVM_PGTABLE_WALK_IGNORE_EAGAIN') from the wrprotect code.
Cc: Fuad Tabba <tabba@google.com> Cc: Quentin Perret <qperret@google.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Oliver Upton <oupton@kernel.org> Reviewed-by: Marc Zyngier <maz@kernel.org> Fixes: fce886a60207 ("KVM: arm64: Plumb the pKVM MMU in KVM") Signed-off-by: Will Deacon <will@kernel.org> Reviewed-by: Quentin Perret <qperret@google.com> Link: https://msgid.link/20260105154939.11041-2-will@kernel.org Signed-off-by: Oliver Upton <oupton@kernel.org>
Marc Zyngier [Wed, 7 Jan 2026 12:46:00 +0000 (12:46 +0000)]
KVM: arm64: Don't blindly set set PSTATE.PAN on guest exit
We set PSTATE.PAN to 1 on exiting from a guest if PAN support has
been compiled in and that it exists on the HW. However, this is not
necessarily correct.
In a nVHE configuration, there is no notion of PAN at EL2, so setting
PSTATE.PAN to anything is pointless.
Furthermore, not setting PAN to 0 when CONFIG_ARM64_PAN isn't set
means we run with the *guest's* PSTATE.PAN (which might be set to 1),
and we will explode on the next userspace access. Yes, the architecture
is delightful in that particular corner.
Fix the whole thing by always setting PAN to something when running
VHE (which implies PAN support), and only ignore it when running nVHE.
Oliver Upton [Thu, 8 Jan 2026 20:42:30 +0000 (12:42 -0800)]
KVM: arm64: nv: Respect stage-2 write permssion when setting stage-1 AF
Naturally, updating the Access Flag in a stage-1 descriptor requires
write permission at stage-2, although this isn't actually enforced in
KVM's software PTW.
Generate a stage-2 permission fault if the stage-1 walk attempts to
update the descriptor and its corresponding stage-2 translation lacks
write permission.
Fixes: bff8aa213dee ("KVM: arm64: Implement HW access flag management in stage-1 SW PTW") Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://msgid.link/20260108204230.677172-1-oupton@kernel.org Signed-off-by: Oliver Upton <oupton@kernel.org>
Alexandru Elisei [Tue, 16 Dec 2025 10:30:52 +0000 (10:30 +0000)]
KVM: arm64: Remove extra argument for __pvkm_host_{share,unshare}_hyp()
__pvkm_host_share_hyp() and __pkvm_host_unshare_hyp() both have one
parameter, the pfn, not two. Even though correctness isn't impacted because
the SMCCC handlers pass the first argument and ignore the second one, let's
call the functions with the proper number of arguments.
Reviewed-by: Fuad Tabba <tabba@google.com> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Tested-by: Fuad Tabba <tabba@google.com> Link: https://msgid.link/20251216103053.47224-4-alexandru.elisei@arm.com Signed-off-by: Oliver Upton <oupton@kernel.org>
Alexandru Elisei [Tue, 16 Dec 2025 10:30:51 +0000 (10:30 +0000)]
KVM: arm64: Inject UNDEF for a register trap without accessor
Configuring a register trap without specifying an accessor function is
abviously a bug. Instead of calling die() when that happens, let's be a
bit more helpful and print the register encoding. Also inject an
undefined instruction exception in the guest, similar to other unhandled
register accesses.
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Tested-by: Fuad Tabba <tabba@google.com> Reviewed-by: Fuad Tabba <tabba@google.com> Link: https://msgid.link/20251216103053.47224-3-alexandru.elisei@arm.com Signed-off-by: Oliver Upton <oupton@kernel.org>
Alexandru Elisei [Tue, 16 Dec 2025 10:30:50 +0000 (10:30 +0000)]
KVM: arm64: Copy FGT traps to unprotected pKVM VCPU on VCPU load
Commit fb10ddf35c1c ("KVM: arm64: Compute per-vCPU FGTs at vcpu_load()")
introduced per-VCPU FGT traps. For an unprotected pKVM VCPU, the untrusted
host FGT configuration is copied in pkvm_vcpu_init_traps(), which is called
from __pkvm_init_vcpu(). __pkvm_init_vcpu() is called once per VCPU (when
the VCPU is first run) which means that the uninitialized, zero, values for
the FGT registers end up being used for the entire lifetime of the VCPU.
This causes both unwanted traps (for the inverse polarity trap bits) and
the guest being allowed to access registers it shouldn't.
Fix it by copying the FGT traps for unprotected pKVM VCPUs when the
untrusted host loads the VCPU.
Fixes: fb10ddf35c1c ("KVM: arm64: Compute per-vCPU FGTs at vcpu_load()") Acked-by: Will Deacon <will@kernel.org> Tested-by: Fuad Tabba <tabba@google.com> Reviewed-by: Fuad Tabba <tabba@google.com> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://msgid.link/20251216103053.47224-2-alexandru.elisei@arm.com Signed-off-by: Oliver Upton <oupton@kernel.org>
Marc Zyngier [Wed, 10 Dec 2025 17:30:19 +0000 (17:30 +0000)]
KVM: arm64: Fix EL2 S1 XN handling for hVHE setups
The current XN implementation is tied to the EL2 translation regime,
and fall flat on its face with the EL2&0 one that is used for hVHE,
as the permission bit for privileged execution is a different one.
Fixes: 6537565fd9b7f ("KVM: arm64: Adjust EL2 stage-1 leaf AP bits when ARM64_KVM_HVHE is set") Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Fuad Tabba <tabba@google.com> Link: https://msgid.link/20251210173024.561160-2-maz@kernel.org Signed-off-by: Oliver Upton <oupton@kernel.org>
Sascha Bischoff [Tue, 6 Jan 2026 16:52:10 +0000 (16:52 +0000)]
KVM: arm64: gic: Check for vGICv3 when clearing TWI
Explicitly check for the vgic being v3 when disabling TWI. Failure to
check this can result in using the wrong view of the vgic CPU IF union
causing undesirable/unexpected behaviour.
Jinqian Yang [Sat, 27 Dec 2025 09:24:48 +0000 (17:24 +0800)]
arm64: Add support for TSV110 Spectre-BHB mitigation
The TSV110 processor is vulnerable to the Spectre-BHB (Branch History
Buffer) attack, which can be exploited to leak information through
branch prediction side channels. This commit adds the MIDR of TSV110
to the list for software mitigation.
Signed-off-by: Jinqian Yang <yangjinqian1@huawei.com> Reviewed-by: Zenghui Yu <zenghui.yu@linux.dev> Signed-off-by: Will Deacon <will@kernel.org>
Linus Torvalds [Sun, 4 Jan 2026 15:21:18 +0000 (07:21 -0800)]
Merge tag 'core_urgent_for_v6.19_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull core entry fix from Borislav Petkov:
- Make sure clang inlines trivial local_irq_* helpers
* tag 'core_urgent_for_v6.19_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
entry: Always inline local_irq_{enable,disable}_exit_to_user()
Linus Torvalds [Fri, 2 Jan 2026 22:24:09 +0000 (14:24 -0800)]
Merge tag 'perf-tools-fixes-for-v6.19-2026-01-02' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools
Pull perf tool fixes and from Namhyung Kim:
- skip building BPF skeletons if libopenssl is missing
- a couple of test updates
- handle error cases of filename__read_build_id()
- support NVIDIA Olympus for ARM SPE profiling
- update tool headers to sync with the kernel
* tag 'perf-tools-fixes-for-v6.19-2026-01-02' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools:
tools build: Fix the common set of features test wrt libopenssl
tools headers: Sync syscall table with kernel sources
tools headers: Sync linux/socket.h with kernel sources
tools headers: Sync linux/gfp_types.h with kernel sources
tools headers: Sync arm64 headers with kernel sources
tools headers: Sync x86 headers with kernel sources
tools headers: Sync UAPI sound/asound.h with kernel sources
tools headers: Sync UAPI linux/mount.h with kernel sources
tools headers: Sync UAPI linux/fs.h with kernel sources
tools headers: Sync UAPI linux/fcntl.h with kernel sources
tools headers: Sync UAPI KVM headers with kernel sources
tools headers: Sync UAPI drm/drm.h with kernel sources
perf arm-spe: Add NVIDIA Olympus to neoverse list
tools headers arm64: Add NVIDIA Olympus part
perf tests top: Make the test exclusive
perf tests kvm: Avoid leaving perf.data.guest file around
perf symbol: Fix ENOENT case for filename__read_build_id
perf tools: Disable BPF skeleton if no libopenssl found
tools/build: Add a feature test for libopenssl
Linus Torvalds [Fri, 2 Jan 2026 20:28:24 +0000 (12:28 -0800)]
Merge tag 'libcrypto-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull crypto library fix from Eric Biggers:
"Fix the kunit_run_irq_test() function (which I recently added for the
CRC and crypto tests) to be less timing-dependent.
This fixes flakiness in the polyval kunit test suite"
* tag 'libcrypto-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux:
kunit: Enforce task execution in {soft,hard}irq contexts
Linus Torvalds [Fri, 2 Jan 2026 20:25:47 +0000 (12:25 -0800)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull rdma fixes from Jason Gunthorpe:
- Fix several syzkaller found bugs:
- Poor parsing of the RDMA_NL_LS_OP_IP_RESOLVE netlink
- GID entry refcount leaking when CM destruction races with
multicast establishment
- Missing refcount put in ib_del_sub_device_and_put()
- Fixup recently introduced uABI padding for 32 bit consistency
- Avoid user triggered math overflow in MANA and AFA
- Reading invalid netdev data during an event
- kdoc fixes
- Fix never-working gid copying in ib_get_gids_from_rdma_hdr
- Typo in bnxt when validating the BAR
- bnxt mis-parsed IB_SEND_IP_CSUM so it didn't work always
- bnxt out of bounds access in bnxt related to the counters on new
devices
- Allocate the bnxt PDE table with the right sizing
- Use dma_free_coherent() correctly in bnxt
- Allow rxe to be unloadable when CONFIG_PROVE_LOCKING by adjusting the
tracking of the global sockets it uses
- Missing unlocking on error path in rxe
- Compute the right number of pages in a MR in rtrs
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
RDMA/bnxt_re: fix dma_free_coherent() pointer
RDMA/rtrs: Fix clt_path::max_pages_per_mr calculation
IB/rxe: Fix missing umem_odp->umem_mutex unlock on error path
RDMA/bnxt_re: Fix to use correct page size for PDE table
RDMA/bnxt_re: Fix OOB write in bnxt_re_copy_err_stats()
RDMA/bnxt_re: Fix IB_SEND_IP_CSUM handling in post_send
RDMA/core: always drop device refcount in ib_del_sub_device_and_put()
RDMA/rxe: let rxe_reclassify_recv_socket() call sk_owner_put()
RDMA/bnxt_re: Fix incorrect BAR check in bnxt_qplib_map_creq_db()
RDMA/core: Fix logic error in ib_get_gids_from_rdma_hdr()
RDMA/efa: Remove possible negative shift
RTRS/rtrs: clean up rtrs headers kernel-doc
RDMA/irdma: avoid invalid read in irdma_net_event
RDMA/mana_ib: check cqe length for kernel CQs
RDMA/irdma: Fix irdma_alloc_ucontext_resp padding
RDMA/ucma: Fix rdma_ucm_query_ib_service_resp struct padding
RDMA/cm: Fix leaking the multicast GID table reference
RDMA/core: Check for the presence of LS_NLA_TYPE_DGID correctly
Linus Torvalds [Fri, 2 Jan 2026 20:21:34 +0000 (12:21 -0800)]
Merge tag 'linux_kselftest-fixes-6.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull kselftest fixes from Shuah Khan:
- Fix for build failures in tests that use an empty FIXTURE() seen in
Android's build environment, which uses -D_FORTIFY_SOURCE=3, a build
failure occurs in tests that use an empty FIXTURE()
- Fix func_traceonoff_triggers.tc sometimes failures on Kunpeng-920
board resulting from including transient trace file name in checksum
compare
- Fix to remove available_events requirement from toplevel-enable for
instance as it isn't a valid requirement for this test
* tag 'linux_kselftest-fixes-6.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
kselftest/harness: Use helper to avoid zero-size memset warning
selftests/ftrace: Test toplevel-enable for instance
selftests/ftrace: traceonoff_triggers: strip off names
Linus Torvalds [Fri, 2 Jan 2026 20:15:59 +0000 (12:15 -0800)]
Merge tag 'block-6.19-20260102' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux
Pull block fixes from Jens Axboe:
- Scan partition tables asynchronously for ublk, similarly to how nvme
does it. This avoids potential deadlocks, which is why nvme does it
that way too. Includes a set of selftests as well.
- MD pull request via Yu:
- Fix null-pointer dereference in raid5 sysfs group_thread_cnt
store (Tuo Li)
- Fix possible mempool corruption during raid1 raid_disks update
via sysfs (FengWei Shih)
- Fix logical_block_size configuration being overwritten during
super_1_validate() (Li Nan)
- Fix forward incompatibility with configurable logical block size:
arrays assembled on new kernels could not be assembled on older
kernels (v6.18 and before) due to non-zero reserved pad rejection
(Li Nan)
- Fix static checker warning about iterator not incremented (Li Nan)
- Skip CPU offlining notifications on unmapped hardware queues
- bfq-iosched block stats fix
- Fix outdated comment in bfq-iosched
* tag 'block-6.19-20260102' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
block, bfq: update outdated comment
blk-mq: skip CPU offline notify on unmapped hctx
selftests/ublk: fix Makefile to rebuild on header changes
selftests/ublk: add test for async partition scan
ublk: scan partition in async way
block,bfq: fix aux stat accumulation destination
md: Fix forward incompatibility from configurable logical block size
md: Fix logical_block_size configuration being overwritten
md: suspend array while updating raid_disks via sysfs
md/raid5: fix possible null-pointer dereferences in raid5_store_group_thread_cnt()
md: Fix static checker warning in analyze_sbs
Linus Torvalds [Fri, 2 Jan 2026 20:07:55 +0000 (12:07 -0800)]
Merge tag 'io_uring-6.19-20260102' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux
Pull io_uring fixes from Jens Axboe:
- Removed dead argument length for io_uring_validate_mmap_request()
- Use GFP_NOWAIT for overflow CQEs on legacy ring setups rather than
GFP_ATOMIC, which makes it play nicer with memcg limits
- Fix a potential circular locking issue with tctx node removal and
exec based cancelations
* tag 'io_uring-6.19-20260102' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
io_uring/memmap: drop unused sz param in io_uring_validate_mmap_request()
io_uring/tctx: add separate lock for list of tctx's in ctx
io_uring: use GFP_NOWAIT for overflow CQEs on legacy rings
Linus Torvalds [Fri, 2 Jan 2026 20:04:51 +0000 (12:04 -0800)]
Merge tag 'x86-urgent-2026-01-02' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fix from Ingo Molnar:
"Fix the AMD microcode Entrysign signature checking code to include
more models"
* tag 'x86-urgent-2026-01-02' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/microcode/AMD: Fix Entrysign revision check for Zen5/Strix Halo
Linus Torvalds [Fri, 2 Jan 2026 19:33:33 +0000 (11:33 -0800)]
Merge tag 'loongarch-fixes-6.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson
Pull LoongArch fixes from Huacai Chen:
"Complete CPUCFG registers definition, set correct protection_map[] for
VM_NONE/VM_SHARED, fix some bugs in the orc stack unwinder, ftrace and
BPF JIT"
* tag 'loongarch-fixes-6.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
samples/ftrace: Adjust LoongArch register restore order in direct calls
LoongArch: BPF: Enhance the bpf_arch_text_poke() function
LoongArch: BPF: Enable trampoline-based tracing for module functions
LoongArch: BPF: Adjust the jump offset of tail calls
LoongArch: BPF: Save return address register ra to t0 before trampoline
LoongArch: BPF: Zero-extend bpf_tail_call() index
LoongArch: BPF: Sign extend kfunc call arguments
LoongArch: Refactor register restoration in ftrace_common_return
LoongArch: Enable exception fixup for specific ADE subcode
LoongArch: Remove unnecessary checks for ORC unwinder
LoongArch: Remove is_entry_func() and kernel_entry_end
LoongArch: Use UNWIND_HINT_END_OF_STACK for entry points
LoongArch: Set correct protection_map[] for VM_NONE/VM_SHARED
LoongArch: Complete CPUCFG registers definition
Linus Torvalds [Fri, 2 Jan 2026 17:53:45 +0000 (09:53 -0800)]
Merge tag 'drm-fixes-2026-01-02' of https://gitlab.freedesktop.org/drm/kernel
Pull drm fixes from Dave Airlie:
"Happy New Year, jetlagged fixes from me, still pretty quiet, xe is
most of this, with i915/nouveau/imagination fixes and some shmem
cleanups.
shmem:
- docs and MODULE_LICENSE fix
xe:
- Ensure svm device memory is idle before migration completes
- Fix a SVM debug printout
- Use READ_ONCE() / WRITE_ONCE() for g2h_fence
i915:
- Fix eb_lookup_vmas() failure path
nouveau:
- fix prepare_fb warnings
imagination:
- prevent export of protected objects"
* tag 'drm-fixes-2026-01-02' of https://gitlab.freedesktop.org/drm/kernel:
drm/i915/gem: Zero-initialize the eb.vma array in i915_gem_do_execbuffer
drm/xe/guc: READ/WRITE_ONCE g2h_fence->done
drm/pagemap, drm/xe: Ensure that the devmem allocation is idle before use
drm/xe/svm: Fix a debug printout
drm/gem-shmem: Fix the MODULE_LICENSE() string
drm/gem-shmem: Fix typos in documentation
drm/nouveau/dispnv50: Don't call drm_atomic_get_crtc_state() in prepare_fb
drm/imagination: Disallow exporting of PM/FW protected objects
Linus Torvalds [Fri, 2 Jan 2026 17:24:43 +0000 (09:24 -0800)]
Merge tag 'v6.19-rc3-smb3-server-fixes' of git://git.samba.org/ksmbd
Pull smb server fixes from Steve French:
- Fix memory leak
- Fix two refcount leaks
- Fix error path in create_smb2_pipe
* tag 'v6.19-rc3-smb3-server-fixes' of git://git.samba.org/ksmbd:
smb/server: fix refcount leak in smb2_open()
smb/server: fix refcount leak in parse_durable_handle_context()
smb/server: call ksmbd_session_rpc_close() on error path in create_smb2_pipe()
ksmbd: Fix memory leak in get_file_all_info()
Linus Torvalds [Fri, 2 Jan 2026 17:14:13 +0000 (09:14 -0800)]
Merge tag 'v6.19-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull smb client fixes from Steve French:
- Fix array out of bounds error in copy_file_range
- Add tracepoint to help debug ioctl failures
* tag 'v6.19-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
smb: client: fix UBSAN array-index-out-of-bounds in smb2_copychunk_range
smb3 client: add missing tracepoint for unsupported ioctls
Julia Lawall [Wed, 31 Dec 2025 17:22:07 +0000 (18:22 +0100)]
block, bfq: update outdated comment
The function bfq_bfqq_may_idle() was renamed as bfq_better_to_idle()
in commit 277a4a9b56cd ("block, bfq: give a better name to
bfq_bfqq_may_idle"). Update the comment accordingly.
Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr> Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Wed, 31 Dec 2025 15:12:46 +0000 (08:12 -0700)]
io_uring/tctx: add separate lock for list of tctx's in ctx
ctx->tcxt_list holds the tasks using this ring, and it's currently
protected by the normal ctx->uring_lock. However, this can cause a
circular locking issue, as reported by syzbot, where cancelations off
exec end up needing to remove an entry from this list:
======================================================
WARNING: possible circular locking dependency detected
syzkaller #0 Tainted: G L
------------------------------------------------------
syz.0.9999/12287 is trying to acquire lock: ffff88805851c0a8 (&ctx->uring_lock){+.+.}-{4:4}, at: io_uring_del_tctx_node+0xf0/0x2c0 io_uring/tctx.c:179
but task is already holding lock: ffff88802db5a2e0 (&sig->cred_guard_mutex){+.+.}-{4:4}, at: prepare_bprm_creds fs/exec.c:1360 [inline] ffff88802db5a2e0 (&sig->cred_guard_mutex){+.+.}-{4:4}, at: bprm_execve+0xb9/0x1400 fs/exec.c:1733
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
Add a separate lock just for the tctx_list, tctx_lock. This can nest
under ->uring_lock, where necessary, and be used separately for list
manipulation. For the cancelation off exec side, this removes the
need to grab ->uring_lock, hence fixing the circular locking
dependency.
Dave Airlie [Thu, 1 Jan 2026 06:51:30 +0000 (16:51 +1000)]
Merge tag 'drm-misc-fixes-2025-12-29' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes
drm-misc-fixes for v6.19-rc4:
- Documentation fixes and MODULE_LICENSE fix for shmem helper.
- Fix warnings in nouveau prepare_fb().
- Prevent export of protected objects in imagination driver.
Shuah Khan [Wed, 31 Dec 2025 23:46:26 +0000 (16:46 -0700)]
wifi: mt76: Remove blank line after mt792x firmware version dmesg
An extra blank line gets printed after printing firmware version
because the build date is null terminated. Remove the "\n" from
dev_info() calls to print firmware version and build date to fix
the problem.
Wake Liu [Wed, 24 Dec 2025 08:41:20 +0000 (16:41 +0800)]
kselftest/harness: Use helper to avoid zero-size memset warning
When building kselftests with a toolchain that enables source
fortification (e.g., Android's build environment, which uses
-D_FORTIFY_SOURCE=3), a build failure occurs in tests that use an
empty FIXTURE().
The root cause is that an empty fixture struct results in
`sizeof(self_private)` evaluating to 0. The compiler's fortification
checks then detect the `memset()` call with a compile-time constant size
of 0, issuing a `-Wuser-defined-warnings` which is promoted to an error
by `-Werror`.
An initial attempt to guard the call with `if (sizeof(self_private) > 0)`
was insufficient. The compiler's static analysis is aggressive enough
to flag the `memset(..., 0)` pattern before evaluating the conditional,
thus still triggering the error.
To resolve this robustly, this change introduces a `static inline`
helper function, `__kselftest_memset_safe()`. This function wraps the
size check and the `memset()` call. By replacing the direct `memset()`
in the `__TEST_F_IMPL` macro with a call to this helper, we create an
abstraction boundary. This prevents the compiler's static analyzer from
"seeing" the problematic pattern at the macro expansion site, resolving
the build failure.
Build Context:
Compiler: Android (14488419, +pgo, +bolt, +lto, +mlgo, based on r584948) clang version 22.0.0 (https://android.googlesource.com/toolchain/llvm-project 2d65e4108033380e6fe8e08b1f1826cd2bfb0c99)
Relevant Options: -O2 -Wall -Werror -D_FORTIFY_SOURCE=3 -target i686-linux-android10000
Zheng Yejian [Tue, 9 May 2023 20:36:59 +0000 (04:36 +0800)]
selftests/ftrace: Test toplevel-enable for instance
'available_events' is actually not required by
'test.d/event/toplevel-enable.tc' and its Existence has been tested in
'test.d/00basic/basic4.tc'.
So the require of 'available_events' can be dropped and then we can add
'instance' flag to test 'test.d/event/toplevel-enable.tc' for instance.
Test result show as below:
# ./ftracetest test.d/event/toplevel-enable.tc
=== Ftrace unit tests ===
[1] event tracing - enable/disable with top level files [PASS]
[2] (instance) event tracing - enable/disable with top level files [PASS]
# of passed: 2
# of failed: 0
# of unresolved: 0
# of untested: 0
# of unsupported: 0
# of xfailed: 0
# of undefined(test bug): 0