Marc Zyngier [Wed, 24 Sep 2025 18:35:50 +0000 (19:35 +0100)]
Merge branch kvm-arm64/selftests-6.18 into kvmarm-master/next
* kvm-arm64/selftests-6.18:
: .
: KVM/arm64 selftest updates for 6.18:
:
: - Large update to run EL1 selftests at EL2 when possible
: (20250917212044.294760-1-oliver.upton@linux.dev)
:
: - Work around lack of ID_AA64MMFR4_EL1 trapping on CPUs
: without FEAT_FGT
: (20250923173006.467455-1-oliver.upton@linux.dev)
:
: - Additional fixes and cleanups
: (20250920-kvm-arm64-id-aa64isar3-el1-v1-0-1764c1c1c96d@kernel.org)
: .
KVM: arm64: selftests: Cover ID_AA64ISAR3_EL1 in set_id_regs
KVM: arm64: selftests: Remove a duplicate register listing in set_id_regs
KVM: arm64: selftests: Cope with arch silliness in EL2 selftest
KVM: arm64: selftests: Add basic test for running in VHE EL2
KVM: arm64: selftests: Enable EL2 by default
KVM: arm64: selftests: Initialize HCR_EL2
KVM: arm64: selftests: Use the vCPU attr for setting nr of PMU counters
KVM: arm64: selftests: Use hyp timer IRQs when test runs at EL2
KVM: arm64: selftests: Select SMCCC conduit based on current EL
KVM: arm64: selftests: Provide helper for getting default vCPU target
KVM: arm64: selftests: Alias EL1 registers to EL2 counterparts
KVM: arm64: selftests: Create a VGICv3 for 'default' VMs
KVM: arm64: selftests: Add unsanitised helpers for VGICv3 creation
KVM: arm64: selftests: Add helper to check for VGICv3 support
KVM: arm64: selftests: Initialize VGICv3 only once
KVM: arm64: selftests: Provide kvm_arch_vm_post_create() in library code
Mark Brown [Sat, 20 Sep 2025 19:51:59 +0000 (20:51 +0100)]
KVM: arm64: selftests: Remove a duplicate register listing in set_id_regs
Currently we list the main set of registers with bits we test three
times, once in the test_regs array which is used at runtime, once in the
guest code and once in a list of ARRAY_SIZE() operations we use to tell
kselftest how many tests we plan to execute. This is needlessly fiddly,
when adding new registers as the test_cnt calculation is formatted with
two registers per line. Instead count the number of bitfields in the
register arrays at runtime.
The existing code subtracts ARRAY_SIZE(test_regs) from the number of
tests to account for the terminating FTR_REG_END entries in the per
register arrays, the new code accounts for this when enumerating.
Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org>
Oliver Upton [Tue, 23 Sep 2025 17:30:06 +0000 (10:30 -0700)]
KVM: arm64: selftests: Cope with arch silliness in EL2 selftest
Implementations without FEAT_FGT aren't required to trap the entire ID
register space when HCR_EL2.TID3 is set. This is a terrible idea, as the
hypervisor may need to advertise the absence of a feature to the VM
using a negative value in a signed field, FEAT_E2H0 being a great
example of this.
Cope with uncooperative implementations in the EL2 selftest by accepting
a zero value when FEAT_FGT is absent and otherwise only tolerating the
expected nonzero value.
Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org>
Oliver Upton [Wed, 17 Sep 2025 21:20:42 +0000 (14:20 -0700)]
KVM: arm64: selftests: Enable EL2 by default
Take advantage of VHE to implicitly promote KVM selftests to run at EL2
with only slight modification. Update the smccc_filter test to account
for this now that the EL2-ness of a VM is visible to tests.
Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org>
Oliver Upton [Wed, 17 Sep 2025 21:20:40 +0000 (14:20 -0700)]
KVM: arm64: selftests: Use the vCPU attr for setting nr of PMU counters
Configuring the number of implemented counters via PMCR_EL0.N was a bad
idea in retrospect as it interacts poorly with nested. Migrate the
selftest to use the vCPU attribute instead of the KVM_SET_ONE_REG
mechanism.
Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org>
Oliver Upton [Wed, 17 Sep 2025 21:20:39 +0000 (14:20 -0700)]
KVM: arm64: selftests: Use hyp timer IRQs when test runs at EL2
Arch timer registers are redirected to their hypervisor counterparts
when running in VHE EL2. This is great, except for the fact that the
hypervisor timers use different PPIs. Use the correct INTIDs when that
is the case.
Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org>
Oliver Upton [Wed, 17 Sep 2025 21:20:37 +0000 (14:20 -0700)]
KVM: arm64: selftests: Provide helper for getting default vCPU target
The default vCPU target in KVM selftests is pretty boring in that it
doesn't enable any vCPU features. Expose a helper for getting the
default target to prepare for cramming in more features. Call
KVM_ARM_PREFERRED_TARGET directly from get-reg-list as it needs
fine-grained control over feature flags.
Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Reviewed-by: Itaru Kitayama <itaru.kitayama@fujitsu.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
Oliver Upton [Wed, 17 Sep 2025 21:20:36 +0000 (14:20 -0700)]
KVM: arm64: selftests: Alias EL1 registers to EL2 counterparts
FEAT_VHE has the somewhat nice property of implicitly redirecting EL1
register aliases to their corresponding EL2 representations when E2H=1.
Unfortunately, there's no such abstraction for userspace and EL2
registers are always accessed by their canonical encoding.
Introduce a helper that applies EL2 redirections to sysregs and use
aggressive inlining to catch misuse at compile time. Go a little past
the architectural definition for ease of use for test authors (e.g. the
stack pointer).
Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org>
Oliver Upton [Wed, 17 Sep 2025 21:20:35 +0000 (14:20 -0700)]
KVM: arm64: selftests: Create a VGICv3 for 'default' VMs
Start creating a VGICv3 by default unless explicitly opted-out by the
test. While having an interrupt controller is nice, the real benefit
here is clearing a hurdle for EL2 VMs which mandate the presence of a
VGIC.
Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org>
Oliver Upton [Wed, 17 Sep 2025 21:20:34 +0000 (14:20 -0700)]
KVM: arm64: selftests: Add unsanitised helpers for VGICv3 creation
vgic_v3_setup() has a good bit of sanity checking internally to ensure
that vCPUs have actually been created and match the dimensioning of the
vgic itself. Spin off an unsanitised setup and initialization helper so
vgic initialization can be wired in around a 'default' VM's vCPU
creation.
Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org>
Oliver Upton [Wed, 17 Sep 2025 21:20:31 +0000 (14:20 -0700)]
KVM: arm64: selftests: Provide kvm_arch_vm_post_create() in library code
In order to compel the default usage of EL2 in selftests, move
kvm_arch_vm_post_create() to library code and expose an opt-in for using
MTE by default.
Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org>
Marc Zyngier [Sat, 20 Sep 2025 11:26:29 +0000 (12:26 +0100)]
Merge branch kvm-arm64/misc-6.18 into kvmarm-master/next
* kvm-arm64/misc-6.18:
: .
: .
: Misc improvements and bug fixes:
:
: - Fix XN handling in the S2 page table dumper
: (20250809135356.1003520-1-r09922117@csie.ntu.edu.tw)
:
: - Fix sanitity checks for huge mapping with pKVM running np guests
: (20250815162655.121108-1-ben.horgan@arm.com)
:
: - Fix use of TRBE when KVM is disabled, and Linux running under
: a lesser hypervisor (20250902-etm_crash-v2-1-aa9713a7306b@oss.qualcomm.com)
:
: - Fix out of date MTE-related comments (20250915155234.196288-1-alexandru.elisei@arm.com)
:
: - Fix PSCI BE support when running a NV guest (20250916161103.1040727-1-maz@kernel.org)
:
: - Fix page reference leak when refusing to map a page due to mismatched attributes
: (20250917130737.2139403-1-tabba@google.com)
:
: - Add trap handling for PMSDSFR_EL1
: (20250901-james-perf-feat_spe_eft-v8-7-2e2738f24559@linaro.org)
:
: - Add advertisement from FEAT_LSFE (Large System Float Extension)
: (20250918-arm64-lsfe-v4-1-0abc712101c7@kernel.org)
: .
KVM: arm64: Expose FEAT_LSFE to guests
KVM: arm64: Add trap configs for PMSDSFR_EL1
KVM: arm64: Fix page leak in user_mem_abort()
KVM: arm64: Fix kvm_vcpu_{set,is}_be() to deal with EL2 state
KVM: arm64: Update stale comment for sanitise_mte_tags()
KVM: arm64: Return early from trace helpers when KVM isn't available
KVM: arm64: Fix debug checking for np-guests using huge mappings
KVM: arm64: ptdump: Don't test PTE_VALID alongside other attributes
Marc Zyngier [Sat, 20 Sep 2025 11:26:24 +0000 (12:26 +0100)]
Merge branch kvm-arm64/nv-misc-6.18 into kvmarm-master/next
* kvm-arm64/nv-misc-6.18:
: .
: Various NV-related fixes:
:
: - Relax KVM's SError injection to consider that HCR_EL2.AMO's
: effective value is 1 when HCR_EL2.{E2H,TGE)=={1,0}.
: (20250918164632.410404-1-oliver.upton@linux.dev)
:
: - Allow userspace to disable some S2 base granule sizes
: (20250918165505.415017-1-oliver.upton@linux.dev)
: .
KVM: arm64: nv: Allow userspace to de-feature stage-2 TGRANs
KVM: arm64: nv: Treat AMO as 1 when at EL2 and {E2H,TGE} = {1, 0}
Marc Zyngier [Sat, 20 Sep 2025 11:26:18 +0000 (12:26 +0100)]
Merge branch kvm-arm64/el2-feature-control into kvmarm-master/next
* kvm-arm64/el2-feature-control: (23 commits)
: .
: General rework of EL2 features that can be disabled to satisfy
: the requirement of migration between heterogeneous hosts:
:
: - Handle effective RES0 behaviour of undefined registers, making sure
: that disabling a feature affects full registeres, and not just
: individual control bits. (20250918151402.1665315-1-maz@kernel.org)
:
: - Allow ID_AA64MMFR1_EL1.{TWED,HCX} to be disabled from userspace.
: (20250911114621.3724469-1-yangjinqian1@huawei.com)
:
: - Turn the NV feature management into a deny-list, and expose
: missing features to EL2 guests.
: (20250912212258.407350-1-oliver.upton@linux.dev)
: .
KVM: arm64: nv: Expose up to FEAT_Debugv8p8 to NV-enabled VMs
KVM: arm64: nv: Advertise FEAT_TIDCP1 to NV-enabled VMs
KVM: arm64: nv: Advertise FEAT_SpecSEI to NV-enabled VMs
KVM: arm64: nv: Expose FEAT_TWED to NV-enabled VMs
KVM: arm64: nv: Exclude guest's TWED configuration when TWE isn't set
KVM: arm64: nv: Expose FEAT_AFP to NV-enabled VMs
KVM: arm64: nv: Expose FEAT_ECBHB to NV-enabled VMs
KVM: arm64: nv: Expose FEAT_RASv1p1 via RAS_frac
KVM: arm64: nv: Expose FEAT_DF2 to NV-enabled VMs
KVM: arm64: nv: Don't erroneously claim FEAT_DoubleLock for NV VMs
KVM: arm64: nv: Convert masks to denylists in limit_nv_id_reg()
KVM: arm64: selftests: Test writes to ID_AA64MMFR1_EL1.{HCX, TWED}
KVM: arm64: Make ID_AA64MMFR1_EL1.{HCX, TWED} writable from userspace
KVM: arm64: Convert MDCR_EL2 RES0 handling to compute_reg_res0_bits()
KVM: arm64: Convert SCTLR_EL1 RES0 handling to compute_reg_res0_bits()
KVM: arm64: Enforce absence of FEAT_TCR2 on TCR2_EL2
KVM: arm64: Enforce absence of FEAT_SCTLR2 on SCTLR2_EL{1,2}
KVM: arm64: Convert HCR_EL2 RES0 handling to compute_reg_res0_bits()
KVM: arm64: Enforce absence of FEAT_HCX on HCRX_EL2
KVM: arm64: Enforce absence of FEAT_FGT2 on FGT2 registers
...
Marc Zyngier [Sat, 20 Sep 2025 11:26:11 +0000 (12:26 +0100)]
Merge branch kvm-arm64/nv-debug into kvmarm-master/next
* kvm-arm64/nv-debug:
: .
: Fix handling of MDSCR_EL1 in NV context, which is unfortunately
: mishandled by the architecture. Patches courtesy of Oliver Upton
: (20250917203125.283116-2-oliver.upton@linux.dev)
: .
KVM: arm64: nv: Apply guest's MDCR traps in nested context
KVM: arm64: nv: Trap debug registers when in hyp context
Marc Zyngier [Sat, 20 Sep 2025 11:26:05 +0000 (12:26 +0100)]
Merge branch kvm-arm64/gic-v5-nv into kvmarm-master/next
* kvm-arm64/gic-v5-nv:
: .
: Add NV support to GICv5 in GICv3 emulation mode, ensuring that the v3
: guest support is identical to that of a pure v3 platform.
:
: Patches courtesy of Sascha Bischoff (20250828105925.3865158-1-sascha.bischoff@arm.com)
: .
irqchip/gic-v5: Drop has_gcie_v3_compat from gic_kvm_info
KVM: arm64: Use ARM64_HAS_GICV5_LEGACY for GICv5 probing
arm64: cpucaps: Add GICv5 Legacy vCPU interface (GCIE_LEGACY) capability
KVM: arm64: Enable nested for GICv5 host with FEAT_GCIE_LEGACY
KVM: arm64: Don't access ICC_SRE_EL2 if GICv3 doesn't support v2 compatibility
Marc Zyngier [Sat, 20 Sep 2025 11:25:57 +0000 (12:25 +0100)]
Merge branch kvm-arm64/52bit-at into kvmarm-master/next
* kvm-arm64/52bit-at:
: .
: Upgrade the S1 page table walker to support 52bit PA, and use it to
: report the fault level when taking a S2 fault on S1PTW, which is required
: by the architecture (20250915114451.660351-1-maz@kernel.org).
: .
KVM: arm64: selftest: Expand external_aborts test to look for TTW levels
KVM: arm64: Populate level on S1PTW SEA injection
KVM: arm64: Add S1 IPA to page table level walker
KVM: arm64: Add filtering hook to S1 page table walk
KVM: arm64: Don't switch MMU on translation from non-NV context
KVM: arm64: Allow EL1 control registers to be accessed from the CPU state
KVM: arm64: Allow use of S1 PTW for non-NV vcpus
KVM: arm64: Report faults from S1 walk setup at the expected start level
KVM: arm64: Expand valid block mappings to FEAT_LPA/LPA2 support
KVM: arm64: Populate PAR_EL1 with 52bit addresses
KVM: arm64: Compute shareability for LPA2
KVM: arm64: Pass the walk_info structure to compute_par_s1()
KVM: arm64: Decouple output address from the PT descriptor
KVM: arm64: Compute 52bit TTBR address and alignment
KVM: arm64: Account for 52bit when computing maximum OA
KVM: arm64: Add helper computing the state of 52bit PA support
Marc Zyngier [Mon, 25 Aug 2025 12:13:56 +0000 (13:13 +0100)]
KVM: arm64: Populate level on S1PTW SEA injection
Our fault injection mechanism is mildly primitive, and doesn't
really implement the architecture when it comes to reporting
the level of a failing S1 PTW (we blindly report a SEA outside
of a PTW).
Now that we can walk the S1 page tables and look for a particular
IPA in the descriptors, it is pretty easy to improve the SEA
injection code.
Note that we only do it for AArch64 guests, and that 32bit guests
are left to their own device (oddly enough, I don't fancy writing
a 32bit PTW...).
Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org>
Marc Zyngier [Mon, 25 Aug 2025 10:31:33 +0000 (11:31 +0100)]
KVM: arm64: Add S1 IPA to page table level walker
Use the filtering hook infrastructure to implement a new walker
that, for a given VA and an IPA, returns the level of the first
occurence of this IPA in the walk from that VA.
This will be used to improve our SEA syndrome reporting.
Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org>
Marc Zyngier [Mon, 25 Aug 2025 14:20:06 +0000 (15:20 +0100)]
KVM: arm64: Allow EL1 control registers to be accessed from the CPU state
As we are about to plug the SW PTW into the EL1-only code, we can
no longer assume that the EL1 state is not resident on the CPU,
as we don't necessarily get there from EL2 traps.
Turn the __vcpu_sys_reg() access on the EL1 state into calls to
the vcpu_read_sys_reg() helper, which is guaranteed to do the
right thing.
Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org>
Marc Zyngier [Tue, 29 Jul 2025 11:06:14 +0000 (12:06 +0100)]
KVM: arm64: Allow use of S1 PTW for non-NV vcpus
As we are about to use the S1 PTW in non-NV contexts, we must make
sure that we don't evaluate the EL2 state when dealing with the EL1&0
translation regime.
Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org>
Marc Zyngier [Mon, 28 Jul 2025 16:20:29 +0000 (17:20 +0100)]
KVM: arm64: Report faults from S1 walk setup at the expected start level
Translation faults from TTBR must be reported on the start level,
and not level-0. Enforcing this requires moving quite a lot of
code around so that the start level can be computed early enough
that it is usable.
Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org>
Marc Zyngier [Sun, 27 Jul 2025 18:37:01 +0000 (19:37 +0100)]
KVM: arm64: Compute shareability for LPA2
LPA2 gets the memory access shareability from TCR_ELx instead of
getting it form the descriptors. Store it in the walk info struct
so that it is passed around and evaluated as required.
Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org>
Marc Zyngier [Mon, 25 Aug 2025 13:48:32 +0000 (14:48 +0100)]
KVM: arm64: Pass the walk_info structure to compute_par_s1()
Instead of just passing the translation regime, pass the full
walk_info structure to compute_par_s1(). This will help further
chamges that will require it.
Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org>
Marc Zyngier [Sat, 26 Jul 2025 10:38:09 +0000 (11:38 +0100)]
KVM: arm64: Add helper computing the state of 52bit PA support
Track whether the guest is using 52bit PAs, either LPA or LPA2.
This further simplifies the handling of LVA for 4k and 16k pages,
as LPA2 implies LVA in this case.
Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org>
Mark Brown [Thu, 18 Sep 2025 19:42:06 +0000 (20:42 +0100)]
KVM: arm64: Expose FEAT_LSFE to guests
FEAT_LSFE (Large System Float Extension), providing atomic floating point
memory operations, is optional from v9.5. This feature adds no new
architectural state, expose the relevant ID register field to guests so
they can discover it.
Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org>
James Clark [Mon, 1 Sep 2025 12:40:36 +0000 (13:40 +0100)]
KVM: arm64: Add trap configs for PMSDSFR_EL1
SPE data source filtering (SPE_FEAT_FDS) adds a new register
PMSDSFR_EL1, add the trap configs for it. PMSNEVFR_EL1 was also missing
its VNCR offset so add it along with PMSDSFR_EL1.
Tested-by: Leo Yan <leo.yan@arm.com> Signed-off-by: James Clark <james.clark@linaro.org> Reviewed-by: Joey Gouly <joey.gouly@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
Oliver Upton [Fri, 12 Sep 2025 21:22:58 +0000 (14:22 -0700)]
KVM: arm64: nv: Expose up to FEAT_Debugv8p8 to NV-enabled VMs
The changes to the debug architecture up to v8.8 are concerned with
external debug, which of course has no direct impact on VMs. Raise the
feature limit and document what's preventing us from raising it further.
Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org>
Oliver Upton [Fri, 12 Sep 2025 21:22:57 +0000 (14:22 -0700)]
KVM: arm64: nv: Advertise FEAT_TIDCP1 to NV-enabled VMs
While KVM does not expose IMPDEF features to VMs, FEAT_TIDCP1 is an
architecturally-defined EL1 trap of a particular sysreg encoding range.
Furthermore, KVM already advertises this feature to non-NV VMs.
As there is no interaction with EL2 traps, expose the feature.
Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org>
Oliver Upton [Fri, 12 Sep 2025 21:22:56 +0000 (14:22 -0700)]
KVM: arm64: nv: Advertise FEAT_SpecSEI to NV-enabled VMs
FEAT_SpecSEI is an informational feature describing whether speculative
loads may generate SErrors. Since there are already cases where KVM
reinjects an SError into the VM it is already possible this may happen
due to a speculative load within the VM.
Stop hiding the feature from NV-enabled VMs.
Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org>
Oliver Upton [Fri, 12 Sep 2025 21:22:54 +0000 (14:22 -0700)]
KVM: arm64: nv: Exclude guest's TWED configuration when TWE isn't set
Ignore the guest hypervisor's configured TWE delay if it hasn't actually
requested WFE traps. Otherwise, OR'ing these fields into the effective
HCR when the guest sets TWE is safe as KVM doesn't use FEAT_TWED and
leaves the fields initialized to 0.
Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org>
Oliver Upton [Fri, 12 Sep 2025 21:22:52 +0000 (14:22 -0700)]
KVM: arm64: nv: Expose FEAT_ECBHB to NV-enabled VMs
The exact wording of the restrictions on branch prediction due to
FEAT_ECBHB in DDI0487L.b is as follows:
When FEAT_ECBHB is implemented, the branch history information created
in a context before an exception to a higher Exception level using
AArch64 cannot be used by code before that exception to exploitatively
control the execution of any indirect branches in code in a different
context after the exception.
While vEL2 and EL1 are multiplexed at EL1, they exist in different
hardware-described contexts as KVM uses different stage-2 MMUs to
represent the corresponding translation regimes. Additionally, exception
entries into vEL2 always imply a hardware exception entry into literal EL2
for the emulated regime change.
Given all of this, and the fact that FEAT_ECBHB places no limitation on
the EL of the protected context after the exception, we can claim
FEAT_ECBHB on supporting hardware.
Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org>
Oliver Upton [Fri, 12 Sep 2025 21:22:51 +0000 (14:22 -0700)]
KVM: arm64: nv: Expose FEAT_RASv1p1 via RAS_frac
KVM already supports FEAT_RASv1p1 for NV-enabled VMs but only when
advertised through the canonical field. Stop masking the silly frac
field to expose the feature on systems without FEAT_DF.
Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org>
Oliver Upton [Fri, 12 Sep 2025 21:22:50 +0000 (14:22 -0700)]
KVM: arm64: nv: Expose FEAT_DF2 to NV-enabled VMs
The supporting infrastructure in KVM's abort injection code was merged a
while ago, but the author (me!) forgot to relax the NV limitation when
FEAT_DF2 got exposed to non-NV VMs. Fix it.
Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org>
Oliver Upton [Fri, 12 Sep 2025 21:22:49 +0000 (14:22 -0700)]
KVM: arm64: nv: Don't erroneously claim FEAT_DoubleLock for NV VMs
ID_AA64DFR0_EL1.DoubleLock is one of those annoying signed feature
fields where a non-negative value implies that a feature is implemented
and a negative value implies that it is not. While the intention of
masking this field was likely to hide the feature, KVM actually
advertises it, even on unsupporting hardware.
Remove FEAT_DoubleLock from the mask, making the NI value visible to the
VM. Take care to accept the old, incorrect values for this field as
we've lied to userspace.
Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org>
Marc Zyngier [Thu, 18 Sep 2025 15:13:58 +0000 (16:13 +0100)]
KVM: arm64: Convert HCR_EL2 RES0 handling to compute_reg_res0_bits()
While HCR_EL2 is unlikely to ever be RES0 (at least when NV is on),
but consistency doesn't hurt, and it can be described in the same
way as the other registers.
Convert it over to the new RES0-computing infrastructure.
Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org>
Marc Zyngier [Thu, 18 Sep 2025 15:13:55 +0000 (16:13 +0100)]
KVM: arm64: Enforce absence of FEAT_FGT on FGT registers
As we want to enforce FGT registers behaving as RES0 when FEAT_FGT
is not exposed to the guest, We move a bumch of things that are
so far passed as parameter into a structure that points to the
bit description.
Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org>
Marc Zyngier [Thu, 18 Sep 2025 15:13:54 +0000 (16:13 +0100)]
KVM: arm64: Add reg_feat_map_desc to describe full register dependency
struct reg_bits_to_feat_map is great to describe bit-to-feature
dependency, but not so much to describe register-to-feature
dependency. Yet both need to exist.
Add a new reg_feat_map_desc structure to describe this.
Extra complexity is added by the need to source the RES0 bits from
the runtime-computed FGT masks, for which we need an extra flag
and extra complexity. Oh well.
Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org>
Oliver Upton [Thu, 18 Sep 2025 16:55:05 +0000 (09:55 -0700)]
KVM: arm64: nv: Allow userspace to de-feature stage-2 TGRANs
KVM advertises the stage-2 TGRAN fields as writable to userspace but
prevents any modification for NV-enabled VMs. Update the special-cased
sanitization to permit de-featuring a particular TGRAN without allowing
the legacy value which refers to the stage-1 field for support.
Reported-by: Itaru Kitayama <itaru.kitayama@linux.dev> Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
Oliver Upton [Thu, 18 Sep 2025 16:46:31 +0000 (09:46 -0700)]
KVM: arm64: nv: Treat AMO as 1 when at EL2 and {E2H,TGE} = {1, 0}
SErrors are not deliverable at EL2 when the effective value of
HCR_EL2.{TGE,AMO} = {0, 0}. This is bothersome to deal with in nested
as we need to use auxiliary pending state to track the pending vSError
since HCR_EL2.VSE has no mechanism for honoring the guest HCR. On top of
that, we have no way of making that auxiliary pending state visible in
ISR_EL1.
A defect against the architecture now allows an implementation to treat
HCR_EL2.AMO as 1 when HCR_EL2.{E2H,TGE} = {1, 0}. Let's do exactly that,
meaning SErrors are always deliverable at EL2 for the typical E2H=RES1
VM.
Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org>
Oliver Upton [Wed, 17 Sep 2025 20:31:25 +0000 (13:31 -0700)]
KVM: arm64: nv: Apply guest's MDCR traps in nested context
KVM needs to ensure the guest hypervisor's traps take effect when the
vCPU is in a nested context. While supporting infrastructure is in place
for most of the EL2 trap registers, MDCR_EL2 is not.
Fold the guest's trap configuration into the effective MDCR_EL2. Apply
it directly to the in-memory representation as it gets recomputed on
every vcpu_load() anyway.
Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org>
Oliver Upton [Wed, 17 Sep 2025 20:31:24 +0000 (13:31 -0700)]
KVM: arm64: nv: Trap debug registers when in hyp context
In case you haven't realized it yet, the architecture is _slightly_
broken in the context of nested virt. Here we have another example of
FEAT_NV2 redirecting a sysreg (MDSCR_EL1) to memory that actually
affects execution at vEL2.
Fortunately, MDCR_EL2.TDA provides the necessary traps to hide this
mess at the expense of unnecessarily trapping the breakpoint/watchpoint
registers. Yes, FEAT_FGT gives us a precise trap but let's just opt for
obvious correctness to start.
Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org>
Sascha Bischoff [Thu, 28 Aug 2025 10:59:43 +0000 (10:59 +0000)]
irqchip/gic-v5: Drop has_gcie_v3_compat from gic_kvm_info
The presence of FEAT_GCIE_LEGACY is now handled as a CPU
feature. Therefore, drop the check and flag from the GIC driver and
gic_kvm_info as it is no longer required or used by KVM.
Signed-off-by: Sascha Bischoff <sascha.bischoff@arm.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org>
Sascha Bischoff [Thu, 28 Aug 2025 10:59:42 +0000 (10:59 +0000)]
KVM: arm64: Use ARM64_HAS_GICV5_LEGACY for GICv5 probing
The previous implementation of the probing function had the flaw that
it wouldn't catch mismatched CPU features. Specifically, GICv5 legacy
support (support for GICv3 VMs on a GICv5 host) was being enabled as
long as the initial boot CPU had support for the feature. This allowed
the support to become enabled on mismatched configurations.
Move to using cpus_have_final_cap(ARM64_HAS_GICV5_LEGACY) instead,
which only returns true when all booted CPUs support
FEAT_GCIE_LEGACY. A byproduct of this is that it ensures that late
onlining of CPUs is blocked on feature mismatch.
Signed-off-by: Sascha Bischoff <sascha.bischoff@arm.com> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org>
Implement the GCIE_LEGACY capability as a system feature to be able to
check for support from KVM. The type is explicitly
ARM64_CPUCAP_EARLY_LOCAL_CPU_FEATURE, which means that the capability
is enabled early if all boot CPUs support it. Additionally, if this
capability is enabled during boot, it prevents late onlining of CPUs
that lack it, thereby avoiding potential mismatched configurations
which would break KVM.
Signed-off-by: Sascha Bischoff <sascha.bischoff@arm.com> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org>
Sascha Bischoff [Thu, 28 Aug 2025 10:59:42 +0000 (10:59 +0000)]
KVM: arm64: Enable nested for GICv5 host with FEAT_GCIE_LEGACY
Extend the NV check to pass for a GICv5 host that has
FEAT_GCIE_LEGACY. The has_gcie_v3_compat flag is only set on GICv5
hosts (that explicitly support FEAT_GCIE_LEGACY), and hence the
explicit check for a VGIC_V5 is omitted.
As of this change, vGICv3-based VMs can run with nested on a
compatible GICv5 host.
Signed-off-by: Sascha Bischoff <sascha.bischoff@arm.com> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org>
Marc Zyngier [Wed, 17 Sep 2025 09:11:28 +0000 (10:11 +0100)]
KVM: arm64: Don't access ICC_SRE_EL2 if GICv3 doesn't support v2 compatibility
We currently access ICC_SRE_EL2 at each load/put on VHE, and on each
entry/exit on nVHE. Both are quite onerous on NV, as this register
always traps.
We do this to make sure the EL1 guest doesn't flip between v2 and v3
behind our back. But all modern implementations have dropped v2,
and this is just overhead.
At the same time, the GICv5 spec has been fixed to allow access to
ICC_SRE_EL2 in legacy mode. Use this opportunity to replace the
GICv5 checks for v2 compat checks, with an ad-hoc static key.
Co-developed-by: Sascha Bischoff <sascha.bischoff@arm.com> Signed-off-by: Sascha Bischoff <sascha.bischoff@arm.com> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org>
The user_mem_abort() function acquires a page reference via
__kvm_faultin_pfn() early in its execution. However, the subsequent
checks for mismatched attributes between stage 1 and stage 2 mappings
would return an error code directly, bypassing the corresponding page
release.
Fix this by storing the error and releasing the unused page before
returning the error.
Marc Zyngier [Wed, 17 Sep 2025 16:30:44 +0000 (17:30 +0100)]
Merge branch kvm-arm64/dump-instr into kvmarm-master/next
* kvm-arm64/dump-instr:
: .
: Dump the isntruction stream on panic, just like the rest of the kernel
: already does.
:
: Patches courtesy of Mostafa Saleh (20250909133631.3844423-1-smostafa@google.com)
: .
KVM: arm64: Map hyp text as RO and dump instr on panic
KVM: arm64: Dump instruction on hyp panic
Marc Zyngier [Tue, 16 Sep 2025 16:11:03 +0000 (17:11 +0100)]
KVM: arm64: Fix kvm_vcpu_{set,is}_be() to deal with EL2 state
Nobody really cares about BE, but KVM currently only deals with
SCTLR_EL1 when evaluating or setting the endianness in PSCI,
meaning that we evaluate whatever the L2 state has been at some point.
Teach these primitives about SCTLR_EL2, and forget about BE...
Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org>
Alexandru Elisei [Mon, 15 Sep 2025 15:52:34 +0000 (16:52 +0100)]
KVM: arm64: Update stale comment for sanitise_mte_tags()
Commit c911f0d46879 ("KVM: arm64: permit all VM_MTE_ALLOWED mappings
with MTE enabled") allowed VM_SHARED VMAs in a VM with MTE enabled, so
remove the comment to the contrary.
Commit d77e59a8fccd ("arm64: mte: Lock a page for MTE tag initialisation")
removed the race that can lead to tags being zeroed more than once when
multiple threads attempt initialisation at the same time, so remove the
comment about mmap_lock too. Note that sanitise_mte_tags() was never called
with the mmap_lock held from user_mem_abort() and the race was prevented by
kvm->mmu_lock.
However, the function still requires to have the kvm->mmu_lock held to
ensure that the memory remains mapped in the userspace process while the
tags are zeroed. Document this in a comment.
CC: Peter Collingbourne <pcc@google.com> CC: Catalin Marinas <catalin.marinas@arm.com> CC: Steven Price <steven.price@arm.com> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
Yingchao Deng [Tue, 2 Sep 2025 03:48:25 +0000 (11:48 +0800)]
KVM: arm64: Return early from trace helpers when KVM isn't available
When Linux is booted at EL1, host_data_ptr() resolves to the nVHE
hypervisor's copy of host data. When hyp mode isn't available for
KVM the nVHE percpu bases remain uninitialized. Consequently, any usage
of host_data_ptr() will result in a NULL dereference which has been
observed in KVM's trace filtering helpers.
Add an early return to the trace filtering helpers if KVM isn't
initialized, avoiding the NULL dereference. Take this opportunity
to move the TRBE-skipping checks to a common helper.
Fixes: 054b88391bbe2 ("KVM: arm64: Support trace filtering for guests") Signed-off-by: Yingchao Deng <yingchao.deng@oss.qualcomm.com> Reviewed-by: James Clark <james.clark@linaro.org>
[maz: repainted the helpers to be readable, and the commit message
with Oliver's suggestion] Signed-off-by: Marc Zyngier <maz@kernel.org>
KVM: Avoid synchronize_srcu() in kvm_io_bus_register_dev()
Device MMIO registration may happen quite frequently during VM boot,
and the SRCU synchronization each time has a measurable effect
on VM startup time. In our experiments it can account for around 25%
of a VM's startup time.
Replace the synchronization with a deferred free of the old kvm_io_bus
structure.
Tested-by: Li RongQing <lirongqing@baidu.com> Signed-off-by: Keir Fraser <keirf@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
KVM: Implement barriers before accessing kvm->buses[] on SRCU read paths
This ensures that, if a VCPU has "observed" that an IO registration has
occurred, the instruction currently being trapped or emulated will also
observe the IO registration.
At the same time, enforce that kvm_get_bus() is used only on the
update side, ensuring that a long-term reference cannot be obtained by
an SRCU reader.
Signed-off-by: Keir Fraser <keirf@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
In preparation to remove synchronize_srcu() from MMIO registration,
remove the distributor's dependency on this implicit barrier by
direct acquire-release synchronization on the flag write and its
lock-free check.
Signed-off-by: Keir Fraser <keirf@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
It is now used only within kvm_vgic_map_resources(). vgic_dist::ready
is already written directly by this function, so it is clearer to
bypass the macro for reads as well.
Signed-off-by: Keir Fraser <keirf@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
Marc Zyngier [Mon, 15 Sep 2025 09:49:04 +0000 (10:49 +0100)]
Merge branch kvm-arm64/pkvm_vm_handle into kvmarm-master/next
* kvm-arm64/pkvm_vm_handle:
: pKVM VM handle allocation fixes, courtesy of Fuad Tabba.
:
: From the cover letter (20250909072437.4110547-1-tabba@google.com):
:
: "In pKVM, this handle is allocated when the VM is initialized at the
: hypervisor, which is on the first vCPU run. However, the host starts
: initializing the VM and setting up its data structures earlier. MMU
: notifiers for the VMs are also registered before VM initialization at
: the hypervisor, and rely on the handle to identify the VM.
:
: Therefore, there is a potential gap between when the VM is (partially)
: setup at the host, but still without a valid pKVM handle to identify it
: when communicating with the hypervisor."
KVM: arm64: Reserve pKVM handle during pkvm_init_host_vm()
KVM: arm64: Introduce separate hypercalls for pKVM VM reservation and initialization
KVM: arm64: Consolidate pKVM hypervisor VM initialization logic
KVM: arm64: Separate allocation and insertion of pKVM VM table entries
KVM: arm64: Decouple hyp VM creation state from its handle
KVM: arm64: Clarify comments to distinguish pKVM mode from protected VMs
KVM: arm64: Rename 'host_kvm' to 'kvm' in pKVM host code
KVM: arm64: Rename pkvm.enabled to pkvm.is_protected
KVM: arm64: Add build-time check for duplicate DECLARE_REG use
KVM: arm64: Reserve pKVM handle during pkvm_init_host_vm()
When a pKVM guest is active, TLB invalidations triggered by host MMU
notifiers require a valid hypervisor handle. Currently, this handle is
only allocated when the first vCPU is run.
However, the guest's memory is associated with the host MMU much
earlier, during kvm_arch_init_vm(). This creates a window where an MMU
invalidation could occur after the kvm_pgtable pointer checked by the
notifiers is set but before the pKVM handle has been created.
Fix this by reserving the pKVM handle when the host VM is first set up.
Move the call to the __pkvm_reserve_vm hypercall from the first-vCPU-run
path into pkvm_init_host_vm(), which is called during initial VM setup.
This ensures the handle is available before any subsystem can trigger an
MMU notification for the VM.
The VM destruction path is updated to call __pkvm_unreserve_vm for cases
where a VM was reserved but never fully created at the hypervisor,
ensuring the handle is properly released.
This fix leverages the two-stage reservation/initialization hypercall
interface introduced in preceding patches.
Signed-off-by: Fuad Tabba <tabba@google.com> Tested-by: Mark Brown <broonie@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org>
KVM: arm64: Introduce separate hypercalls for pKVM VM reservation and initialization
The existing __pkvm_init_vm hypercall performs both the reservation of a
VM table entry and the initialization of the hypervisor VM state in a
single operation. This design prevents the host from obtaining a VM
handle from the hypervisor until all preparation for the creation and
the initialization of the VM is done, which is on the first vCPU run
operation.
To support more flexible VM lifecycle management, the host needs the
ability to reserve a handle early, before the first vCPU run.
Refactor the hypercall interface to enable this, splitting the single
hypercall into a two-stage process:
- __pkvm_reserve_vm: A new hypercall that allocates a slot in the
hypervisor's vm_table, marks it as reserved, and returns a unique
handle to the host.
- __pkvm_unreserve_vm: A corresponding cleanup hypercall to safely
release the reservation if the host fails to proceed with full
initialization.
- __pkvm_init_vm: The existing hypercall is modified to no longer
allocate a slot. It now expects a pre-reserved handle and commits the
donated VM memory to that slot.
For now, the host-side code in __pkvm_create_hyp_vm calls the new
reserve and init hypercalls back-to-back to maintain existing behavior.
This paves the way for subsequent patches to separate the reservation
and initialization steps in the VM's lifecycle.
Signed-off-by: Fuad Tabba <tabba@google.com> Tested-by: Mark Brown <broonie@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org>
KVM: arm64: Consolidate pKVM hypervisor VM initialization logic
The insert_vm_table_entry() function was performing tasks beyond its
primary responsibility. In addition to inserting a VM pointer into the
vm_table, it was also initializing several fields within 'struct
pkvm_hyp_vm', such as the VMID and stage-2 MMU pointers. This mixing of
concerns made the code harder to follow.
As another preparatory step towards allowing a VM table entry to be
reserved before the VM is fully created, this logic must be cleaned up.
By separating table insertion from state initialization, we can control
the timing of the initialization step more precisely in subsequent
patches.
Refactor the code to consolidate all initialization logic into
init_pkvm_hyp_vm():
- Move the initialization of the handle, VMID, and MMU fields from
insert_vm_table_entry() to init_pkvm_hyp_vm().
- Simplify insert_vm_table_entry() to perform only one action: placing
the provided pkvm_hyp_vm pointer into the vm_table.
- Update the calling sequence in __pkvm_init_vm() to first allocate an
entry in the VM table, initialize the VM, and then insert the VM into
the VM table. This is all protected by the vm_table_lock for now.
Subsequent patches will adjust the sequence and not hold the
vm_table_lock while initializing the VM at the hypervisor
(init_pkvm_hyp_vm()).
Signed-off-by: Fuad Tabba <tabba@google.com> Tested-by: Mark Brown <broonie@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org>
KVM: arm64: Separate allocation and insertion of pKVM VM table entries
The current insert_vm_table_entry() function performs two actions at
once: it finds a free slot in the pKVM VM table and populates it with
the pkvm_hyp_vm pointer.
Refactor this function as a preparatory step for future work that will
require reserving a VM slot and its corresponding handle earlier in the
VM lifecycle, before the pkvm_hyp_vm structure is initialized and ready
to be inserted.
Split the function into a two-phase process:
- A new allocate_vm_table_entry() function finds an empty slot, marks it
as reserved with a RESERVED_ENTRY placeholder, and returns a handle
derived from the slot's index.
- The insert_vm_table_entry() function is repurposed to take the handle,
validate that the corresponding slot is in the reserved state, and
then populate it with the pkvm_hyp_vm pointer.
Signed-off-by: Fuad Tabba <tabba@google.com> Tested-by: Mark Brown <broonie@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org>
KVM: arm64: Decouple hyp VM creation state from its handle
Currently, the presence of a pKVM handle (pkvm.handle != 0) is used to
determine if the corresponding hypervisor (EL2) VM has been created and
initialized. This couples the handle's lifecycle with the VM's creation
state.
This coupling will become problematic with upcoming changes that will
allocate the pKVM handle earlier in the VM's life, before the VM is
instantiated at the hypervisor.
To prepare for this and make the state tracking explicit, decouple the
two concepts. Introduce a new boolean flag, 'pkvm.is_created', to track
whether the hypervisor-side VM has been created and initialized.
A new helper, pkvm_hyp_vm_is_created(), is added to check this flag. All
call sites that previously checked for the handle's existence are
converted to use the new, explicit check. The 'is_created' flag is set
to true upon successful creation in the hypervisor (EL2) and cleared
upon destruction.
Signed-off-by: Fuad Tabba <tabba@google.com> Tested-by: Mark Brown <broonie@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org>
KVM: arm64: Clarify comments to distinguish pKVM mode from protected VMs
The hypervisor code for protected KVM contains comments that are
imprecise and at times flat-out wrong. They often refer to a "protected
VM" in contexts where the code or data structure applies to _any_ VM
managed by the hypervisor when pKVM is enabled.
For instance, the 'vm_table' holds handles for all VMs known to the
hypervisor, not exclusively for those that are configured as protected.
This inaccurate terminology can make the code scope harder to understand
for future (and current) developers.
Clarify the comments throughout the pKVM hypervisor code to make a clear
distinction between the pKVM feature itself (i.e., "protected mode") and
the VMs that are specifically configured to be protected. This involves
replacing ambiguous uses of "protected VM" with more accurate phrasing.
No functional change intended.
Signed-off-by: Fuad Tabba <tabba@google.com> Tested-by: Mark Brown <broonie@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org>
KVM: arm64: Rename 'host_kvm' to 'kvm' in pKVM host code
In hypervisor (EL2) code, it is important to distinguish between the
host's 'struct kvm' and a protected VM's 'struct kvm'. Using 'host_kvm'
as variable name in that context makes this distinction clear.
However, in the host kernel code (EL1), there is no such ambiguity. The
code is only ever concerned with the host's own 'struct kvm' instance.
The 'host_' prefix is therefore redundant and adds unnecessary
verbosity.
Simplify the code by renaming the 'host_kvm' parameter to 'kvm' in all
functions within host-side kernel code (EL1). This improves readability
and makes the naming consistent with other host-side kernel code.
No functional change intended.
Signed-off-by: Fuad Tabba <tabba@google.com> Tested-by: Mark Brown <broonie@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org>
KVM: arm64: Rename pkvm.enabled to pkvm.is_protected
The 'pkvm.enabled' field in struct kvm_protected_vm is confusingly
named. Its purpose is to indicate whether a VM is a _protected_ VM under
pKVM, and not whether the VM itself is enabled or running.
For a non-protected VM, the VM can be fully active and running, yet this
field would be false. This ambiguity can lead to incorrect assumptions
about the VM's operational state and makes the code harder to reason
about.
Rename the field to 'is_protected' to make it unambiguous that the flag
tracks the protected status of the VM.
No functional change intended.
Reviewed-by: Kunwu Chan <kunwu.chan@linux.dev> Signed-off-by: Fuad Tabba <tabba@google.com> Reviewed-by: Kunwu Chan <chentao@kylinos.cn> Tested-by: Mark Brown <broonie@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org>
KVM: arm64: Add build-time check for duplicate DECLARE_REG use
The DECLARE_REG() macro provides a convenient way to create a local
variable initialized from a cpu context in the hyp trap handlers.
However, a common error is to use the macro multiple times in the same
scope with the same register index, but for different logical purposes.
This results in valid C code that compiles without error, but introduces
subtle bugs where a developer expects two different variables to hold
values from two different registers, when in fact they are both sourced
from the same one.
To prevent this entire class of bugs, modify the DECLARE_REG() macro
to declare a dummy variable whose name is derived from the register
index. If the macro is used again with the same index in the same
scope, the compiler will fail with a "redeclaration of variable"
error, turning a subtle runtime bug into an obvious build-time failure.
Signed-off-by: Fuad Tabba <tabba@google.com> Tested-by: Mark Brown <broonie@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org>
Ben Horgan [Fri, 15 Aug 2025 16:26:55 +0000 (17:26 +0100)]
KVM: arm64: Fix debug checking for np-guests using huge mappings
When running with transparent huge pages and CONFIG_NVHE_EL2_DEBUG then
the debug checking in assert_host_shared_guest() fails on the launch of an
np-guest. This WARN_ON() causes a panic and generates the stack below.
In __pkvm_host_relax_perms_guest() the debug checking assumes the mapping
is a single page but it may be a block map. Update the checking so that
the size is not checked and just assumes the correct size.
While we're here make the same fix in __pkvm_host_mkyoung_guest().
Signed-off-by: Ben Horgan <ben.horgan@arm.com> Fixes: f28f1d02f4eaa ("KVM: arm64: Add a range to __pkvm_host_unshare_guest()") Cc: Vincent Donnefort <vdonnefort@google.com> Cc: Quentin Perret <qperret@google.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: stable@vger.kernel.org Reviewed-by: Vincent Donnefort <vdonnefort@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
Marc Zyngier [Mon, 15 Sep 2025 09:27:28 +0000 (10:27 +0100)]
Merge branch kvm-arm64/ffa-1.2 into kvmarm-master/next
* kvm-arm64/ffa-1.2:
: .
: FFA 1.2 support for pKVM, courtesy of Per Larsen.
:
: From the cover letter at [1]:
:
: "The FF-A 1.2 specification introduces a new SEND_DIRECT2 ABI which
: allows registers x4-x17 to be used for the message payload. This patch
: set prevents the host from using a lower FF-A version than what has
: already been negotiated with the hypervisor. This is necessary because
: the hypervisor does not have the necessary compatibility paths to
: translate from the hypervisor FF-A version to a previous version."
:
: [1] https://lore.kernel.org/r/20250820-virtio-msg-ffa-v11-0-497ef43550a3@google.com
: .
KVM: arm64: Bump the supported version of FF-A to 1.2
KVM: arm64: Mask response to FFA_FEATURE call
KVM: arm64: Mark optional FF-A 1.2 interfaces as unsupported
KVM: arm64: Mark FFA_NOTIFICATION_* calls as unsupported
KVM: arm64: Use SMCCC 1.2 for FF-A initialization and in host handler
KVM: arm64: Correct return value on host version downgrade attempt
Wei-Lin Chang [Sat, 9 Aug 2025 13:53:56 +0000 (21:53 +0800)]
KVM: arm64: ptdump: Don't test PTE_VALID alongside other attributes
The attribute masks and test values in the ptdump code are meant for
individual attributes, however for stage-2 ptdump we included PTE_VALID
while testing for R, W, X, and AF. This led to some confusion and the
flipped output for the executable attribute.
Remove PTE_VALID from all attribute masks and values so that each test
matches only the relevant bits.
Additionally, the executable attribute printing is updated to align with
stage-1 ptdump, printing "NX" for non-executable regions and "x " for
executable ones.
Suggested-by: Anshuman Khandual <anshuman.khandual@arm.com> Suggested-by: Mark Rutland <mark.rutland@arm.com> Suggested-by: Sebastian Ene <sebastianene@google.com> Signed-off-by: Wei-Lin Chang <r09922117@csie.ntu.edu.tw> Acked-by: Will Deacon <will@kernel.org> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
Per Larsen [Wed, 20 Aug 2025 01:10:10 +0000 (01:10 +0000)]
KVM: arm64: Bump the supported version of FF-A to 1.2
FF-A version 1.2 introduces the DIRECT_REQ2 ABI. Bump the FF-A version
preferred by the hypervisor to enable implementation of the 1.2-only
FFA_MSG_SEND_DIRECT_REQ2 and FFA_MSG_SEND_RESP2 messaging interfaces.
Co-developed-by: Ayrton Munoz <ayrton@google.com> Signed-off-by: Ayrton Munoz <ayrton@google.com> Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Per Larsen <perlarsen@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
Per Larsen [Wed, 20 Aug 2025 01:10:09 +0000 (01:10 +0000)]
KVM: arm64: Mask response to FFA_FEATURE call
The minimum size and alignment boundary for FFA_RXTX_MAP is returned in
bit[1:0]. Mask off any other bits in w2 when reading the minimum buffer
size in hyp_ffa_post_init.
Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Per Larsen <perlarsen@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org>