git.ipfire.org Git - thirdparty/kernel/linux.git/log

KVM: arm64: Deduplicate ASID retrieval code

We currently have three versions of the ASID retrieval code, one
in the S1 walker, and two in the VNCR handling (although the last
two are limited to the EL2&0 translation regime).

Make this code common, and take this opportunity to also simplify
the code a bit while switching over to the TTBRx_EL1_ASID macro.

Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/20260225104718.14209-1-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>

irqchip/gic-v5: Fix inversion of IRS_IDR0.virt flag

It appears that a !! became ! during a cleanup, resulting in inverted
logic when detecting if a host GICv5 implementation is capable of
virtualization.

Re-add the missing !, fixing the behaviour.

Fixes: 3227c3a89d65f ("irqchip/gic-v5: Check if impl is virt capable")
Signed-off-by: Sascha Bischoff <sascha.bischoff@arm.com>
Link: https://patch.msgid.link/20260225083130.3378490-1-sascha.bischoff@arm.com
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm64: Revert accidental drop of kvm_uninit_stage2_mmu() for non-NV VMs

Commit 0c4762e26879 ("KVM: arm64: nv: Avoid NV stage-2 code when NV is
not supported") added an early return to several functions in
arch/arm64/kvm/nested.c to prevent a UBSAN shift-out-of-bounds error
when accessing the pgt union for non-nested VMs.

However, this early return was inadvertently applied to
kvm_arch_flush_shadow_all() as well, causing it to skip the call to
kvm_uninit_stage2_mmu(kvm) for all non-nested VMs.

For pKVM, skipping this teardown means the host never unshares the
guest's memory with the EL2 hypervisor. When the host kernel later
recycles these leaked pages for a new VM, it attempts to re-share them.
The hypervisor correctly rejects this with -EPERM, triggering a host
WARN_ON and hanging the guest.

Fix this by dropping the early return from kvm_arch_flush_shadow_all().
The for-loop guarding the nested MMU cleanup already bounds itself when
nested_mmus_size == 0, allowing execution to proceed to
kvm_uninit_stage2_mmu() as intended.

Reported-by: Mark Brown <broonie@kernel.org>
Closes: https://lore.kernel.org/all/60916cb6-f460-4751-b910-f63c58700ad0@sirena.org.uk/
Fixes: 0c4762e26879 ("KVM: arm64: nv: Avoid NV stage-2 code when NV is not supported")
Signed-off-by: Fuad Tabba <tabba@google.com>
Tested-by: Mark Brown <broonie@kernel.org>
Link: https://patch.msgid.link/20260222083352.89503-1-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm64: Fix protected mode handling of pages larger than 4kB

Since 3669ddd8fa8b5 ("KVM: arm64: Add a range to pkvm_mappings"),
pKVM tracks the memory that has been mapped into a guest in a
side data structure. Crucially, it uses it to find out whether
a page has already been mapped, and therefore refuses to map it
twice. So far, so good.

However, this very patch completely breaks non-4kB page support,
with guests being unable to boot. The most obvious symptom is that
we take the same fault repeatedly, and not making forward progress.
A quick investigation shows that this is because of the above
rejection code.

As it turns out, there are multiple issues at play:

- while the HPFAR_EL2 register gives you the faulting IPA minus
  the bottom 12 bits, it will still give you the extra bits that
  are part of the page offset for anything larger than 4kB,
  even for a level-3 mapping

- pkvm_pgtable_stage2_map() assumes that the address passed as
  a parameter is aligned to the size of the intended mapping

- the faulting address is only aligned for a non-page mapping

When the planets are suitably aligned (pun intended), the guest
faults on a page by accessing it past the bottom 4kB, and extra bits
get set in the HPFAR_EL2 register. If this results in a page mapping
(which is likely with large granule sizes), nothing aligns it further
down, and pkvm_mapping_iter_first() finds an intersection that
doesn't really exist. We assume this is a spurious fault and return
-EAGAIN. And again...

This doesn't hit outside of the protected code, as the page table
code always aligns the IPA down to a page boundary, hiding the issue
for everyone else.

Fix it by always forcing the alignment on vma_pagesize, irrespective
of the value of vma_pagesize.

Fixes: 3669ddd8fa8b5 ("KVM: arm64: Add a range to pkvm_mappings")
Reviewed-by: Fuad Tabba <tabba@google.com>
Tested-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://https://patch.msgid.link/20260222141000.3084258-1-maz@kernel.org
Cc: stable@vger.kernel.org

KVM: arm64: vgic: Handle const qualifier from gic_kvm_info allocation type

In preparation for making the kmalloc family of allocators type aware,
we need to make sure that the returned type from the allocation matches
the type of the variable being assigned. (Before, the allocator would
always return "void *", which can be implicitly cast to any pointer type.)

The assigned type is "struct gic_kvm_info", but the returned type,
while matching, is const qualified. To get them exactly matching, just
use the dereferenced pointer for the sizeof().

Signed-off-by: Kees Cook <kees@kernel.org>
Link: https://patch.msgid.link/20260206223022.it.052-kees@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm64: Remove redundant kern_hyp_va() in unpin_host_sve_state()

The `sve_state` pointer in `hyp_vcpu->vcpu.arch` is initialized as a
hypervisor virtual address during vCPU initialization in
`pkvm_vcpu_init_sve()`.

`unpin_host_sve_state()` calls `kern_hyp_va()` on this address. Since
`kern_hyp_va()` is idempotent, it's not a bug. However, it is
unnecessary and potentially confusing. Remove the redundant conversion.

Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260213143815.1732675-5-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm64: Fix ID register initialization for non-protected pKVM guests

In protected mode, the hypervisor maintains a separate instance of
the `kvm` structure for each VM. For non-protected VMs, this structure is
initialized from the host's `kvm` state.

Currently, `pkvm_init_features_from_host()` copies the
`KVM_ARCH_FLAG_ID_REGS_INITIALIZED` flag from the host without the
underlying `id_regs` data being initialized. This results in the
hypervisor seeing the flag as set while the ID registers remain zeroed.

Consequently, `kvm_has_feat()` checks at EL2 fail (return 0) for
non-protected VMs. This breaks logic that relies on feature detection,
such as `ctxt_has_tcrx()` for TCR2_EL1 support. As a result, certain
system registers (e.g., TCR2_EL1, PIR_EL1, POR_EL1) are not
saved/restored during the world switch, which could lead to state
corruption.

Fix this by explicitly copying the ID registers from the host `kvm` to
the hypervisor `kvm` for non-protected VMs during initialization, since
we trust the host with its non-protected guests' features. Also ensure
`KVM_ARCH_FLAG_ID_REGS_INITIALIZED` is cleared initially in
`pkvm_init_features_from_host` so that `vm_copy_id_regs` can properly
initialize them and set the flag once done.

Fixes: 41d6028e28bd ("KVM: arm64: Convert the SVE guest vcpu flag to a vm flag")
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260213143815.1732675-4-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm64: Optimise away S1POE handling when not supported by host

Although ID register sanitisation prevents guests from seeing the
feature, adding this check to the helper allows the compiler to entirely
eliminate S1POE-specific code paths (such as context switching POR_EL1)
when the host kernel is compiled without support (CONFIG_ARM64_POE is
disabled).

This aligns with the pattern used for other optional features like SVE
(kvm_has_sve()) and FPMR (kvm_has_fpmr()), ensuring no POE logic if the
host lacks support, regardless of the guest configuration state.

Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260213143815.1732675-3-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm64: Hide S1POE from guests when not supported by the host

When CONFIG_ARM64_POE is disabled, KVM does not save/restore POR_EL1.
However, ID_AA64MMFR3_EL1 sanitisation currently exposes the feature to
guests whenever the hardware supports it, ignoring the host kernel
configuration.

If a guest detects this feature and attempts to use it, the host will
fail to context-switch POR_EL1, potentially leading to state corruption.

Fix this by masking ID_AA64MMFR3_EL1.S1POE in the sanitised system
registers, preventing KVM from advertising the feature when the host
does not support it (i.e. system_supports_poe() is false).

Fixes: 70ed7238297f ("KVM: arm64: Sanitise ID_AA64MMFR3_EL1")
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260213143815.1732675-2-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>

Merge branch kvm-arm64/misc-6.20 into kvmarm-master/next

* kvm-arm64/misc-6.20:
  : .
  : Misc KVM/arm64 changes for 6.20
  :
  : - Trivial FPSIMD cleanups
  :
  : - Calculate hyp VA size only once, avoiding potential mapping issues when
  :   VA bits is smaller than expected
  :
  : - Silence sparse warning for the HYP stack base
  :
  : - Fix error checking when handling FFA_VERSION
  :
  : - Add missing trap configuration for DBGWCR15_EL1
  :
  : - Don't try to deal with nested S2 when NV isn't enabled for a guest
  :
  : - Various spelling fixes
  : .
  KVM: arm64: nv: Avoid NV stage-2 code when NV is not supported
  KVM: arm64: Fix various comments
  KVM: arm64: nv: Add trap config for DBGWCR<15>_EL1
  KVM: arm64: Fix error checking for FFA_VERSION
  KVM: arm64: Fix missing <asm/stackpage/nvhe.h> include
  KVM: arm64: Calculate hyp VA size only once
  KVM: arm64: Remove ISB after writing FPEXC32_EL2
  KVM: arm64: Shuffle KVM_HOST_DATA_FLAG_* indices
  KVM: arm64: Fix comment in fpsimd_lazy_switch_to_host()

Signed-off-by: Marc Zyngier <maz@kernel.org>

Merge branch kvm-arm64/resx into kvmarm-master/next

* kvm-arm64/resx:
  : .
  : Add infrastructure to deal with the full gamut of RESx bits
  : for NV. As a result, it is now possible to have the expected
  : semantics for some bits such as SCTLR_EL2.SPAN.
  : .
  KVM: arm64: Add debugfs file dumping computed RESx values
  KVM: arm64: Add sanitisation to SCTLR_EL2
  KVM: arm64: Remove all traces of HCR_EL2.MIOCNCE
  KVM: arm64: Remove all traces of FEAT_TME
  KVM: arm64: Simplify handling of full register invalid constraint
  KVM: arm64: Get rid of FIXED_VALUE altogether
  KVM: arm64: Simplify handling of HCR_EL2.E2H RESx
  KVM: arm64: Move RESx into individual register descriptors
  KVM: arm64: Add RES1_WHEN_E2Hx constraints as configuration flags
  KVM: arm64: Add REQUIRES_E2H1 constraint as configuration flags
  KVM: arm64: Simplify FIXED_VALUE handling
  KVM: arm64: Convert HCR_EL2.RW to AS_RES1
  KVM: arm64: Correctly handle SCTLR_EL1 RES1 bits for unsupported features
  KVM: arm64: Allow RES1 bits to be inferred from configuration
  KVM: arm64: Inherit RESx bits from FGT register descriptors
  KVM: arm64: Extend unified RESx handling to runtime sanitisation
  KVM: arm64: Introduce data structure tracking both RES0 and RES1 bits
  KVM: arm64: Introduce standalone FGU computing primitive
  KVM: arm64: Remove duplicate configuration for SCTLR_EL1.{EE,E0E}
  arm64: Convert SCTLR_EL2 to sysreg infrastructure

Signed-off-by: Marc Zyngier <maz@kernel.org>

Merge branch kvm-arm64/debugfs-fixes into kvmarm-master/next

* kvm-arm64/debugfs-fixes:
  : .
  : Cleanup of the debugfs iterator, which are way more complicated
  : than they ought to be, courtesy of Fuad Tabba. From the cover letter:
  :
  : "This series refactors the debugfs implementations for `idregs` and
  :  `vgic-state` to use standard `seq_file` iterator patterns.
  :
  :  The existing implementations relied on storing iterator state within
  :  global VM structures (`kvm_arch` and `vgic_dist`). This approach
  :  prevented concurrent reads of the debugfs files (returning -EBUSY) and
  :  created improper dependencies between transient file operations and
  :  long-lived VM state."
  : .
  KVM: arm64: Use standard seq_file iterator for vgic-debug debugfs
  KVM: arm64: Reimplement vgic-debug XArray iteration
  KVM: arm64: Use standard seq_file iterator for idregs debugfs

Signed-off-by: Marc Zyngier <maz@kernel.org>

Merge branch kvm-arm64/gicv5-prologue into kvmarm-master/next

* kvm-arm64/gicv5-prologue:
  : .
  : Prologue to GICv5 support, courtesy of Sascha Bischoff.
  :
  : This is preliminary work that sets the scene for the full-blow
  : support.
  : .
  irqchip/gic-v5: Check if impl is virt capable
  KVM: arm64: gic: Set vgic_model before initing private IRQs
  arm64/sysreg: Drop ICH_HFGRTR_EL2.ICC_HAPR_EL1 and make RES1
  KVM: arm64: gic-v3: Switch vGIC-v3 to use generated ICH_VMCR_EL2

Signed-off-by: Marc Zyngier <maz@kernel.org>

Merge branch kvm-arm64/gicv3-tdir-fixes into kvmarm-master/next

* kvm-arm64/gicv3-tdir-fixes:
  : .
  : Address two trapping-related issues when running legacy (i.e. GICv3)
  : guests on GICv5 hosts, courtesy of Sascha Bischoff.
  : .
  KVM: arm64: Correct test for ICH_HCR_EL2_TDIR cap for GICv5 hosts
  KVM: arm64: gic: Enable GICv3 CPUIF trapping on GICv5 hosts if required

Signed-off-by: Marc Zyngier <maz@kernel.org>

Merge branch kvm-arm64/fwb-for-all into kvmarm-master/next

* kvm-arm64/fwb-for-all:
  : .
  : Allow pKVM's host stage-2 mappings to use the Force Write Back version
  : of the memory attributes by using the "pass-through' encoding.
  :
  : This avoids having two separate encodings for S2 on a given platform.
  : .
  KVM: arm64: Simplify PAGE_S2_MEMATTR
  KVM: arm64: Kill KVM_PGTABLE_S2_NOFWB
  KVM: arm64: Switch pKVM host S2 over to KVM_PGTABLE_S2_AS_S1
  KVM: arm64: Add KVM_PGTABLE_S2_AS_S1 flag
  arm64: Add MT_S2{,_FWB}_AS_S1 encodings

Signed-off-by: Marc Zyngier <maz@kernel.org>

Merge branch kvm-arm64/pkvm-no-mte into kvmarm-master/next

* kvm-arm64/pkvm-no-mte:
  : .
  : pKVM updates preventing the host from using MTE-related system
  : sysrem registers when the feature is disabled from the kernel
  : command-line (arm64.nomte), courtesy of Fuad Taba.
  :
  : From the cover letter:
  :
  : "If MTE is supported by the hardware (and is enabled at EL3), it remains
  : available to lower exception levels by default. Disabling it in the host
  : kernel (e.g., via 'arm64.nomte') only stops the kernel from advertising
  : the feature; it does not physically disable MTE in the hardware.
  :
  : The ability to disable MTE in the host kernel is used by some systems,
  : such as Android, so that the physical memory otherwise used as tag
  : storage can be used for other things (i.e. treated just like the rest of
  : memory). In this scenario, a malicious host could still access tags in
  : pages donated to a guest using MTE instructions (e.g., STG and LDG),
  : bypassing the kernel's configuration."
  : .
  KVM: arm64: Use kvm_has_mte() in pKVM trap initialization
  KVM: arm64: Inject UNDEF when accessing MTE sysregs with MTE disabled
  KVM: arm64: Trap MTE access and discovery when MTE is disabled
  KVM: arm64: Remove dead code resetting HCR_EL2 for pKVM

Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm64: Add debugfs file dumping computed RESx values

Computing RESx values is hard. Verifying that they are correct is
harder. Add a debugfs file called "resx" that will dump all the RESx
values for a given VM.

I found it useful, maybe you will too.

Co-developed-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Fuad Tabba <tabba@google.com>
Tested-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260202184329.2724080-21-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm64: Add sanitisation to SCTLR_EL2

Sanitise SCTLR_EL2 the usual way. The most important aspect of
this is that we benefit from SCTLR_EL2.SPAN being RES1 when
HCR_EL2.E2H==0.

Reviewed-by: Fuad Tabba <tabba@google.com>
Tested-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260202184329.2724080-20-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm64: Remove all traces of HCR_EL2.MIOCNCE

MIOCNCE had the potential to eat your data, and also was never
implemented by anyone. It's been retrospectively removed from
the architecture, and we're happy to follow that lead.

Reviewed-by: Fuad Tabba <tabba@google.com>
Tested-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260202184329.2724080-19-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm64: Remove all traces of FEAT_TME

FEAT_TME has been dropped from the architecture. Retrospectively.
I'm sure someone is crying somewhere, but most of us won't.

Clean-up time.

Reviewed-by: Fuad Tabba <tabba@google.com>
Tested-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260202184329.2724080-18-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm64: Simplify handling of full register invalid constraint

Now that we embed the RESx bits in the register description, it becomes
easier to deal with registers that are simply not valid, as their
existence is not satisfied by the configuration (SCTLR2_ELx without
FEAT_SCTLR2, for example). Such registers essentially become RES0 for
any bit that wasn't already advertised as RESx.

Reviewed-by: Fuad Tabba <tabba@google.com>
Tested-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260202184329.2724080-17-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm64: Get rid of FIXED_VALUE altogether

We have now killed every occurrences of FIXED_VALUE, and we can therefore
drop the whole infrastructure. Good riddance.

Reviewed-by: Fuad Tabba <tabba@google.com>
Tested-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260202184329.2724080-16-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm64: Simplify handling of HCR_EL2.E2H RESx

Now that we can link the RESx behaviour with the value of HCR_EL2.E2H,
we can trivially express the tautological constraint that makes E2H
a reserved value at all times.

Fun, isn't it?

Reviewed-by: Fuad Tabba <tabba@google.com>
Tested-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260202184329.2724080-15-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm64: Move RESx into individual register descriptors

Instead of hacking the RES1 bits at runtime, move them into the
register descriptors. This makes it significantly nicer.

Reviewed-by: Fuad Tabba <tabba@google.com>
Tested-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260202184329.2724080-14-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm64: Add RES1_WHEN_E2Hx constraints as configuration flags

"Thanks" to VHE, SCTLR_EL2 radically changes shape depending on the
value of HCR_EL2.E2H, as a lot of the bits that didn't have much
meaning with E2H=0 start impacting EL0 with E2H=1.

This has a direct impact on the RESx behaviour of these bits, and
we need a way to express them.

For this purpose, introduce two new constaints that, when the
controlling feature is not present, force the field to RES1 depending
on the value of E2H. Note that RES0 is still implicit,

This allows diverging RESx values depending on the value of E2H,
something that is required by a bunch of SCTLR_EL2 bits.

Reviewed-by: Fuad Tabba <tabba@google.com>
Tested-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260202184329.2724080-13-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm64: Add REQUIRES_E2H1 constraint as configuration flags

A bunch of EL2 configuration are very similar to their EL1 counterpart,
with the added constraint that HCR_EL2.E2H being 1.

For us, this means HCR_EL2.E2H being RES1, which is something we can
statically evaluate.

Add a REQUIRES_E2H1 constraint, which allows us to express conditions
in a much simpler way (without extra code). Existing occurrences are
converted, before we add a lot more.

Reviewed-by: Fuad Tabba <tabba@google.com>
Tested-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260202184329.2724080-12-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm64: Simplify FIXED_VALUE handling

The FIXED_VALUE qualifier (mostly used for HCR_EL2) is pointlessly
complicated, as it tries to piggy-back on the previous RES0 handling
while being done in a different phase, on different data.

Instead, make it an integral part of the RESx computation, and allow
it to directly set RESx bits. This is much easier to understand.

It also paves the way for some additional changes to that will allow
the full removal of the FIXED_VALUE handling.

Reviewed-by: Fuad Tabba <tabba@google.com>
Tested-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260202184329.2724080-11-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm64: Convert HCR_EL2.RW to AS_RES1

Now that we have the AS_RES1 constraint, it becomes trivial to express
the HCR_EL2.RW behaviour.

Reviewed-by: Fuad Tabba <tabba@google.com>
Tested-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260202184329.2724080-10-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm64: Correctly handle SCTLR_EL1 RES1 bits for unsupported features

A bunch of SCTLR_EL1 bits must be set to RES1 when the controlling
feature is not present. Add the AS_RES1 qualifier where needed.

Reviewed-by: Fuad Tabba <tabba@google.com>
Tested-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260202184329.2724080-9-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm64: Allow RES1 bits to be inferred from configuration

So far, when a bit field is tied to an unsupported feature, we set
it as RES0. This is almost correct, but there are a few exceptions
where the bits become RES1.

Add a AS_RES1 qualifier that instruct the RESx computing code to
simply do that.

Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Reviewed-by: Fuad Tabba <tabba@google.com>
Tested-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260202184329.2724080-8-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm64: Inherit RESx bits from FGT register descriptors

The FGT registers have their computed RESx bits stashed in specific
descriptors, which we can easily use when computing the masks used
for the guest.

This removes a bit of boilerplate code.

Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Reviewed-by: Fuad Tabba <tabba@google.com>
Tested-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260202184329.2724080-7-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm64: Extend unified RESx handling to runtime sanitisation

Add a new helper to retrieve the RESx values for a given system
register, and use it for the runtime sanitisation.

This results in slightly better code generation for a fairly hot
path in the hypervisor, and additionally covers all sanitised
registers in all conditions, not just the VNCR-based ones.

Reviewed-by: Fuad Tabba <tabba@google.com>
Tested-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260202184329.2724080-6-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm64: Introduce data structure tracking both RES0 and RES1 bits

We have so far mostly tracked RES0 bits, but only made a few attempts
at being just as strict for RES1 bits (probably because they are both
rarer and harder to handle).

Start scratching the surface by introducing a data structure tracking
RES0 and RES1 bits at the same time.

Note that contrary to the usual idiom, this structure is mostly passed
around by value -- the ABI handles it nicely, and the resulting code is
much nicer.

Reviewed-by: Fuad Tabba <tabba@google.com>
Tested-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260202184329.2724080-5-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm64: Introduce standalone FGU computing primitive

Computing the FGU bits is made oddly complicated, as we use the RES0
helper instead of using a specific abstraction.

Introduce such an abstraction, which is going to make things significantly
simpler in the future.

Reviewed-by: Fuad Tabba <tabba@google.com>
Tested-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260202184329.2724080-4-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm64: Remove duplicate configuration for SCTLR_EL1.{EE,E0E}

We already have specific constraints for SCTLR_EL1.{EE,E0E}, and
making them depend on FEAT_AA64EL1 is just buggy.

Fixes: 6bd4a274b026e ("KVM: arm64: Convert SCTLR_EL1 to config-driven sanitisation")
Reviewed-by: Fuad Tabba <tabba@google.com>
Tested-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260202184329.2724080-3-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>

arm64: Convert SCTLR_EL2 to sysreg infrastructure

Convert SCTLR_EL2 to the sysreg infrastructure, as per the 2025-12_rel
revision of the Registers.json file.

Note that we slightly deviate from the above, as we stick to the ARM
ARM M.a definition of SCTLR_EL2[9], which is RES0, in order to avoid
dragging the POE2 definitions...

Reviewed-by: Fuad Tabba <tabba@google.com>
Tested-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260202184329.2724080-2-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm64: nv: Avoid NV stage-2 code when NV is not supported

The NV stage-2 manipulation functions kvm_nested_s2_unmap(),
kvm_nested_s2_wp(), and others, are being called for any stage-2
manipulation regardless of whether nested virtualization is supported or
enabled for the VM.

For protected KVM (pKVM), `struct kvm_pgtable` uses the
`pkvm_mappings` member of the union. This member aliases `ia_bits`,
which is used by the non-protected NV code paths. Attempting to
read `pgt->ia_bits` in these functions results in treating
protected mapping pointers or state values as bit-shift amounts.
This triggers a UBSAN shift-out-of-bounds error:

    UBSAN: shift-out-of-bounds in arch/arm64/kvm/nested.c:1127:34
    shift exponent 174565952 is too large for 64-bit type 'unsigned long'
    Call trace:
     __ubsan_handle_shift_out_of_bounds+0x28c/0x2c0
     kvm_nested_s2_unmap+0x228/0x248
     kvm_arch_flush_shadow_memslot+0x98/0xc0
     kvm_set_memslot+0x248/0xce0

Since pKVM and NV are mutually exclusive, prevent entry into these
NV handling functions if the VM has not allocated any nested MMUs
(i.e., `kvm->arch.nested_mmus_size` is 0).

Fixes: 7270cc9157f47 ("KVM: arm64: nv: Handle VNCR_EL2 invalidation from MMU notifiers")
Suggested-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260202152310.113467-1-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm64: Use standard seq_file iterator for vgic-debug debugfs

The current implementation uses `vgic_state_iter` in `struct
vgic_dist` to track the sequence position. This effectively makes the
iterator shared across all open file descriptors for the VM.

This approach has significant drawbacks:
- It enforces mutual exclusion, preventing concurrent reads of the
debugfs file (returning -EBUSY).
- It relies on storing transient iterator state in the long-lived
VM structure (`vgic_dist`).

Refactor the implementation to use the standard `seq_file` iterator.
Instead of storing state in `kvm_arch`, rely on the `pos` argument
passed to the `start` and `next` callbacks, which tracks the logical
index specific to the file descriptor.

This change enables concurrent access and eliminates the
`vgic_state_iter` field from `struct vgic_dist`.

Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260202085721.3954942-4-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm64: Reimplement vgic-debug XArray iteration

The vgic-debug interface implementation uses XArray marks
(`LPI_XA_MARK_DEBUG_ITER`) to "snapshot" LPIs at the start of iteration.
This modifies global state for a read-only operation and complicates
reference counting, leading to leaks if iteration is aborted or fails.

Reimplement the iterator to use dynamic iteration logic:

- Remove `lpi_idx` from `struct vgic_state_iter`.
- Replace the XArray marking mechanism with dynamic iteration using
  `xa_find_after(..., XA_PRESENT)`.
- Wrap XArray traversals in `rcu_read_lock()`/`rcu_read_unlock()` to
  ensure safety against concurrent modifications (e.g., LPI unmapping).
- Handle potential races where an LPI is removed during iteration by
  gracefully skipping it in `show()`, rather than warning.
- Remove the unused `LPI_XA_MARK_DEBUG_ITER` definition.

This simplifies the lifecycle management of the iterator and prevents
resource leaks associated with the marking mechanism, and paves the way
for using a standard seq_file iterator.

Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260202085721.3954942-3-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm64: Use standard seq_file iterator for idregs debugfs

The current implementation uses `idreg_debugfs_iter` in `struct
kvm_arch` to track the sequence position. This effectively makes the
iterator shared across all open file descriptors for the VM.

This approach has significant drawbacks:
- It enforces mutual exclusion, preventing concurrent reads of the
  debugfs file (returning -EBUSY).
- It relies on storing transient iterator state in the long-lived VM
  structure (`kvm_arch`).
- The use of `u8` for the iterator index imposes an implicit limit of
  255 registers. While not currently exceeded, this is fragile against
  future architectural growth. Switching to `loff_t` eliminates this
  overflow risk.

Refactor the implementation to use the standard `seq_file` iterator.
Instead of storing state in `kvm_arch`, rely on the `pos` argument
passed to the `start` and `next` callbacks, which tracks the logical
index specific to the file descriptor.

This change enables concurrent access and eliminates the
`idreg_debugfs_iter` field from `struct kvm_arch`.

Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260202085721.3954942-2-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>

irqchip/gic-v5: Check if impl is virt capable

Now that there is support for creating a GICv5-based guest with KVM,
check that the hardware itself supports virtualisation, skipping the
setting of struct gic_kvm_info if not.

Note: If native GICv5 virt is not supported, then nor is
FEAT_GCIE_LEGACY, so we are able to skip altogether.

Signed-off-by: Sascha Bischoff <sascha.bischoff@arm.com>
Reviewed-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/20260128175919.3828384-33-sascha.bischoff@arm.com
[maz: cosmetic changes]
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm64: gic: Set vgic_model before initing private IRQs

Different GIC types require the private IRQs to be initialised
differently. GICv5 is the culprit as it supports both a different
number of private IRQs, and all of these are PPIs (there are no
SGIs). Moreover, as GICv5 uses the top bits of the interrupt ID to
encode the type, the intid also needs to computed differently.

Up until now, the GIC model has been set after initialising the
private IRQs for a VCPU. Move this earlier to ensure that the GIC
model is available when configuring the private IRQs. While we're at
it, also move the setting of the in_kernel flag and implementation
revision to keep them grouped together as before.

Signed-off-by: Sascha Bischoff <sascha.bischoff@arm.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/20260128175919.3828384-7-sascha.bischoff@arm.com
Signed-off-by: Marc Zyngier <maz@kernel.org>

arm64/sysreg: Drop ICH_HFGRTR_EL2.ICC_HAPR_EL1 and make RES1

The GICv5 architecture is dropping the ICC_HAPR_EL1 and ICV_HAPR_EL1
system registers. These registers were never added to the sysregs, but
the traps for them were.

Drop the trap bit from the ICH_HFGRTR_EL2 and make it Res1 as per the
upcoming GICv5 spec change. Additionally, update the EL2 setup code to
not attempt to set that bit.

Signed-off-by: Sascha Bischoff <sascha.bischoff@arm.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/20260128175919.3828384-4-sascha.bischoff@arm.com
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm64: gic-v3: Switch vGIC-v3 to use generated ICH_VMCR_EL2

The VGIC-v3 code relied on hand-written definitions for the
ICH_VMCR_EL2 register. This register, and the associated fields, is
now generated as part of the sysreg framework. Move to using the
generated definitions instead of the hand-written ones.

There are no functional changes as part of this change.

Signed-off-by: Sascha Bischoff <sascha.bischoff@arm.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/20260128175919.3828384-3-sascha.bischoff@arm.com
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm64: Fix various comments

Use tab instead of whitespaces, as well as 2 minor typo fixes.

Signed-off-by: Zenghui Yu (Huawei) <zenghui.yu@linux.dev>
Link: https://patch.msgid.link/20260128075208.23024-1-zenghui.yu@linux.dev
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm64: nv: Add trap config for DBGWCR<15>_EL1

Seems that it was missed when MDCR_EL2 was first added to the trap
forwarding infrastructure. Add it back.

Fixes: cb31632c4452 ("KVM: arm64: nv: Add trap forwarding for MDCR_EL2")
Signed-off-by: Zenghui Yu (Huawei) <zenghui.yu@linux.dev>
Link: https://patch.msgid.link/20260130094435.39942-1-zenghui.yu@linux.dev
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm64: Correct test for ICH_HCR_EL2_TDIR cap for GICv5 hosts

The original order of checks in the ICH_HCR_EL2_TDIR test returned
with false early in the case where the native GICv3 CPUIF was not
present. The result was that on GICv5 hosts with legacy support -
which do not have the GICv3 CPUIF - the test always returned false.

Reshuffle the checks such that support for GICv5 legacy is checked
prior to checking for the native GICv3 CPUIF.

Signed-off-by: Sascha Bischoff <sascha.bischoff@arm.com>
Fixes: 2a28810cbb8b2 ("KVM: arm64: GICv3: Detect and work around the lack of ICV_DIR_EL1 trapping")
Link: https://patch.msgid.link/20251208152724.3637157-4-sascha.bischoff@arm.com
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm64: gic: Enable GICv3 CPUIF trapping on GICv5 hosts if required

Factor out the enable (and printing of) the GICv3 CPUIF traps from the
main GICv3 probe into a separate function. Call said function from the
GICv5 probe for legacy support, ensuring that any required GICv3 CPUIF
traps on GICv5 hosts will be correctly handled, rather than injecting
an undef into the guest.

Signed-off-by: Sascha Bischoff <sascha.bischoff@arm.com>
Link: https://patch.msgid.link/20251208152724.3637157-3-sascha.bischoff@arm.com
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm64: Simplify PAGE_S2_MEMATTR

Restore PAGE_S2_MEMATTR() to its former glory, keeping the use of
FWB as an implementation detail.

Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Reviewed-by: Fuad Tabba <tabba@google.com>
Tested-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260123191637.715429-6-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm64: Kill KVM_PGTABLE_S2_NOFWB

Nobody is using this flag anymore, so remove it. This allows
some cleanup by removing stage2_has_fwb(), which is can be replaced
by a direct check on the capability.

Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Reviewed-by: Fuad Tabba <tabba@google.com>
Tested-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260123191637.715429-5-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm64: Switch pKVM host S2 over to KVM_PGTABLE_S2_AS_S1

Since we have the basics to use the S1 memory attributes as the
final ones with FWB, flip the host over to that when FWB is present.

Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Tested-by: Fuad Tabba <tabba@google.com>
Reviewed-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260123191637.715429-4-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm64: Add KVM_PGTABLE_S2_AS_S1 flag

Plumb the MT_S2{,_FWB}_AS_S1 memory types into the KVM_S2_MEMATTR()
macro with a new KVM_PGTABLE_S2_AS_S1 flag.

Nobody selects it yet.

Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Reviewed-by: Fuad Tabba <tabba@google.com>
Tested-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260123191637.715429-3-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>

arm64: Add MT_S2{,_FWB}_AS_S1 encodings

pKVM usage of S2 translation on the host is purely for isolation
purposes, not translation. To that effect, the memory attributes
being used must be that of S1.

With FWB=0, this is easily achieved by using the Normal Cacheable
type (which is the weakest possible memory type) at S2, and let S1
pick something stronger as required.

With FWB=1, the attributes are combined in a different way, and we
cannot arbitrarily use Normal Cacheable. We can, however, use a
memattr encoding that indicates that the final attributes are that
of Stage-1.

Add these encoding and a few pointers to the relevant parts of the
specification. It might come handy some day.

Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Reviewed-by: Fuad Tabba <tabba@google.com>
Tested-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260123191637.715429-2-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm64: Use kvm_has_mte() in pKVM trap initialization

When initializing HCR traps in protected mode, use kvm_has_mte() to
check for MTE support rather than kvm_has_feat(kvm, ID_AA64PFR1_EL1,
MTE, IMP).

kvm_has_mte() provides a more comprehensive check:
- kvm_has_feat() only checks if MTE is in the guest's ID register view
(i.e., what we advertise to the guest)
- kvm_has_mte() checks both system_supports_mte() AND whether
KVM_ARCH_FLAG_MTE_ENABLED is set for this VM instance

Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260122112218.531948-5-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm64: Inject UNDEF when accessing MTE sysregs with MTE disabled

When MTE hardware is present but disabled via software (`arm64.nomte` or
`CONFIG_ARM64_MTE=n`), the kernel clears `HCR_EL2.ATA` and sets
`HCR_EL2.TID5`, to prevent the use of MTE instructions.

Additionally, accesses to certain MTE system registers trap to EL2 with
exception class ESR_ELx_EC_SYS64. To emulate hardware without MTE (where
such accesses would cause an Undefined Instruction exception), inject
UNDEF into the host.

Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260122112218.531948-4-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm64: Trap MTE access and discovery when MTE is disabled

If MTE is not supported by the hardware, or is disabled in the kernel
configuration (`CONFIG_ARM64_MTE=n`) or command line (`arm64.nomte`),
the kernel stops advertising MTE to userspace and avoids using MTE
instructions. However, this is a software-level disable only.

When MTE hardware is present and enabled by EL3 firmware, leaving
`HCR_EL2.ATA` set allows the host to execute MTE instructions (STG, LDG,
etc.) and access allocation tags in physical memory.

Prevent this by clearing `HCR_EL2.ATA` when MTE is disabled. Remove it
from the `HCR_HOST_NVHE_FLAGS` default, and conditionally set it in
`cpu_prepare_hyp_mode()` only when `system_supports_mte()` returns true.
This causes MTE instructions to trap to EL2 when `HCR_EL2.ATA` is
cleared.

Additionally, set `HCR_EL2.TID5` when MTE is disabled. This traps reads
of `GMID_EL1` (Multiple tag transfer ID register) to EL2, preventing the
discovery of MTE parameters (such as tag block size) when the feature is
suppressed.

Early boot code in `head.S` temporarily keeps `HCR_ATA` set to avoid
special-casing initialization paths. This is safe because this code
executes before untrusted code runs and will clear `HCR_ATA` if MTE is
disabled.

Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260122112218.531948-3-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm64: Remove dead code resetting HCR_EL2 for pKVM

The pKVM lifecycle does not support tearing down the hypervisor and
returning to the hyp stub once initialized. The transition to protected
mode is one-way.

Consequently, the code path in hyp-init.S responsible for resetting
EL2 registers (triggered by kexec or hibernation) is unreachable in
protected mode.

Remove the dead code handling HCR_EL2 reset for
ARM64_KVM_PROTECTED_MODE.

No functional change intended.

Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260122112218.531948-2-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>

Merge branch kvm-arm64/pkvm-features-6.20 into kvmarm-master/next

* kvm-arm64/pkvm-features-6.20:
  : .
  : pKVM guest feature trapping fixes, courtesy of Fuad Tabba.
  : .
  KVM: arm64: Prevent host from managing timer offsets for protected VMs
  KVM: arm64: Check whether a VM IOCTL is allowed in pKVM
  KVM: arm64: Track KVM IOCTLs and their associated KVM caps
  KVM: arm64: Do not allow KVM_CAP_ARM_MTE for any guest in pKVM
  KVM: arm64: Include VM type when checking VM capabilities in pKVM
  KVM: arm64: Introduce helper to calculate fault IPA offset
  KVM: arm64: Fix MTE flag initialization for protected VMs
  KVM: arm64: Fix Trace Buffer trap polarity for protected VMs
  KVM: arm64: Fix Trace Buffer trapping for protected VMs

Signed-off-by: Marc Zyngier <maz@kernel.org>

Merge branch kvm-arm64/feat_idst into kvmarm-master/next

* kvm-arm64/feat_idst:
  : .
  : Add support for FEAT_IDST, allowing ID registers that are not implemented
  : to be reported as a normal trap rather than as an UNDEF exception.
  : .
  KVM: arm64: selftests: Add a test for FEAT_IDST
  KVM: arm64: pkvm: Report optional ID register traps with a 0x18 syndrome
  KVM: arm64: pkvm: Add a generic synchronous exception injection primitive
  KVM: arm64: Force trap of GMID_EL1 when the guest doesn't have MTE
  KVM: arm64: Handle CSSIDR2_EL1 and SMIDR_EL1 in a generic way
  KVM: arm64: Handle FEAT_IDST for sysregs without specific handlers
  KVM: arm64: Add a generic synchronous exception injection primitive
  KVM: arm64: Add trap routing for GMID_EL1
  arm64: Repaint ID_AA64MMFR2_EL1.IDS description

Signed-off-by: Marc Zyngier <maz@kernel.org>

Merge branch kvm-arm64/selftests-6.20 into kvmarm-master/next

* kvm-arm64/selftests-6.20:
  : .
  : Some selftest fixes addressing page alignment issues as well as
  : a bad MMU setup bug, courtesy of Fuad Tabba.
  : .
  KVM: selftests: Fix typos and stale comments in kvm_util
  KVM: selftests: Move page_align() to shared header
  KVM: riscv: selftests: Fix incorrect rounding in page_align()
  KVM: arm64: selftests: Fix incorrect rounding in page_align()
  KVM: arm64: selftests: Disable unused TTBR1_EL1 translations

Signed-off-by: Marc Zyngier <maz@kernel.org>

Merge branch kvm-arm64/vtcr into kvmarm-master/next

* kvm-arm64/vtcr:
  : .
  : VTCR_EL2 conversion to the configuration-driven RESx framework,
  : fixing a couple of UXN/PXN/XN bugs in the process.
  : .
  KVM: arm64: nv: Return correct RES0 bits for FGT registers
  KVM: arm64: Always populate FGT masks at boot time
  KVM: arm64: Honor UX/PX attributes for EL2 S1 mappings
  KVM: arm64: Convert VTCR_EL2 to config-driven sanitisation
  KVM: arm64: Account for RES1 bits in DECLARE_FEAT_MAP() and co
  arm64: Convert VTCR_EL2 to sysreg infratructure
  arm64: Convert ID_AA64MMFR0_EL1.TGRAN{4,16,64}_2 to UnsignedEnum
  KVM: arm64: Invert KVM_PGTABLE_WALK_HANDLE_FAULT to fix pKVM walkers
  KVM: arm64: Don't blindly set set PSTATE.PAN on guest exit
  KVM: arm64: nv: Respect stage-2 write permssion when setting stage-1 AF
  KVM: arm64: Remove unused vcpu_{clear,set}_wfx_traps()
  KVM: arm64: Remove unused parameter in synchronize_vcpu_pstate()
  KVM: arm64: Remove extra argument for __pvkm_host_{share,unshare}_hyp()
  KVM: arm64: Inject UNDEF for a register trap without accessor
  KVM: arm64: Copy FGT traps to unprotected pKVM VCPU on VCPU load
  KVM: arm64: Fix EL2 S1 XN handling for hVHE setups
  KVM: arm64: gic: Check for vGICv3 when clearing TWI

Signed-off-by: Marc Zyngier <maz@kernel.org>

Merge branch arm64/for-next/cpufeature into kvmarm-master/next

Merge arm64/for-next/cpufeature in to resolve conflicts resulting from
the removal of CONFIG_PAN.

* arm64/for-next/cpufeature:
  arm64: Add support for FEAT_{LS64, LS64_V}
  KVM: arm64: Enable FEAT_{LS64, LS64_V} in the supported guest
  arm64: Provide basic EL2 setup for FEAT_{LS64, LS64_V} usage at EL0/1
  KVM: arm64: Handle DABT caused by LS64* instructions on unsupported memory
  KVM: arm64: Add documentation for KVM_EXIT_ARM_LDST64B
  KVM: arm64: Add exit to userspace on {LD,ST}64B* outside of memslots
  arm64: Unconditionally enable PAN support
  arm64: Unconditionally enable LSE support
  arm64: Add support for TSV110 Spectre-BHB mitigation

Signed-off-by: Marc Zyngier <maz@kernel.org>

arm64: Add support for FEAT_{LS64, LS64_V}

Armv8.7 introduces single-copy atomic 64-byte loads and stores
instructions and its variants named under FEAT_{LS64, LS64_V}.
These features are identified by ID_AA64ISAR1_EL1.LS64 and the
use of such instructions in userspace (EL0) can be trapped.

As st64bv (FEAT_LS64_V) and st64bv0 (FEAT_LS64_ACCDATA) can not be tell
apart, FEAT_LS64 and FEAT_LS64_ACCDATA which will be supported in later
patch will be exported to userspace, FEAT_LS64_V will be enabled only
in kernel.

In order to support the use of corresponding instructions in userspace:
- Make ID_AA64ISAR1_EL1.LS64 visbile to userspace
- Add identifying and enabling in the cpufeature list
- Expose these support of these features to userspace through HWCAP3
and cpuinfo

ld64b/st64b (FEAT_LS64) and st64bv (FEAT_LS64_V) is intended for
special memory (device memory) so requires support by the CPU, system
and target memory location (device that support these instructions).
The HWCAP3_LS64, implies the support of CPU and system (since no
identification method from system, so SoC vendors should advertise
support in the CPU if system also support them).

Otherwise for ld64b/st64b the atomicity may not be guaranteed or a
DABT will be generated, so users (probably userspace driver developer)
should make sure the target memory (device) also have the support.
For st64bv 0xffffffffffffffff will be returned as status result for
unsupported memory so user should check it.

Document the restrictions along with HWCAP3_LS64.

Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Oliver Upton <oupton@kernel.org>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com>
Signed-off-by: Will Deacon <will@kernel.org>

KVM: arm64: Enable FEAT_{LS64, LS64_V} in the supported guest

Using FEAT_{LS64, LS64_V} instructions in a guest is also controlled
by HCRX_EL2.{EnALS, EnASR}. Enable it if guest has related feature.

Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Oliver Upton <oupton@kernel.org>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com>
Signed-off-by: Will Deacon <will@kernel.org>

arm64: Provide basic EL2 setup for FEAT_{LS64, LS64_V} usage at EL0/1

Instructions introduced by FEAT_{LS64, LS64_V} is controlled by
HCRX_EL2.{EnALS, EnASR}. Configure all of these to allow usage
at EL0/1.

This doesn't mean these instructions are always available in
EL0/1 if provided. The hypervisor still have the control at
runtime.

Acked-by: Will Deacon <will@kernel.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Oliver Upton <oupton@kernel.org>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com>
Signed-off-by: Will Deacon <will@kernel.org>

KVM: arm64: Handle DABT caused by LS64* instructions on unsupported memory

If FEAT_LS64WB not supported, FEAT_LS64* instructions only support
to access Device/Uncacheable memory, otherwise a data abort for
unsupported Exclusive or atomic access (0x35, UAoEF) is generated
per spec. It's implementation defined whether the target exception
level is routed and is possible to implemented as route to EL2 on a
VHE VM according to DDI0487L.b Section C3.2.6 Single-copy atomic
64-byte load/store.

If it's implemented as generate the DABT to the final enabled stage
(stage-2), inject the UAoEF back to the guest after checking the
memslot is valid.

Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Oliver Upton <oupton@kernel.org>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com>
Signed-off-by: Will Deacon <will@kernel.org>

KVM: arm64: Add documentation for KVM_EXIT_ARM_LDST64B

Add a bit of documentation for KVM_EXIT_ARM_LDST64B so that userspace
knows what to expect.

Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Oliver Upton <oupton@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com>
Signed-off-by: Will Deacon <will@kernel.org>

KVM: arm64: Add exit to userspace on {LD,ST}64B* outside of memslots

The main use of {LD,ST}64B* is to talk to a device, which is hopefully
directly assigned to the guest and requires no additional handling.

However, this does not preclude a VMM from exposing a virtual device
to the guest, and to allow 64 byte accesses as part of the programming
interface. A direct consequence of this is that we need to be able
to forward such access to userspace.

Given that such a contraption is very unlikely to ever exist, we choose
to offer a limited service: userspace gets (as part of a new exit reason)
the ESR, the IPA, and that's it. It is fully expected to handle the full
semantics of the instructions, deal with ACCDATA, the return values and
increment PC. Much fun.

A canonical implementation can also simply inject an abort and be done
with it. Frankly, don't try to do anything else unless you have time
to waste.

Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Oliver Upton <oupton@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com>
Signed-off-by: Will Deacon <will@kernel.org>

arm64: Unconditionally enable PAN support

FEAT_PAN has been around since ARMv8.1 (over 11 years ago), has no compiler
dependency (we have our own accessors), and is a great security benefit.

Drop CONFIG_ARM64_PAN, and make the support unconditionnal.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>

arm64: Unconditionally enable LSE support

LSE atomics have been in the architecture since ARMv8.1 (released in
2014), and are hopefully supported by all modern toolchains.

Drop the optional nature of LSE support in the kernel, and always
compile the support in, as this really is very little code. LL/SC
still is the default, and the switch to LSE is done dynamically.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>

KVM: arm64: nv: Return correct RES0 bits for FGT registers

We had extended the sysreg masking infrastructure to more general
registers, instead of restricting it to VNCR-backed registers, since
commit a0162020095e ("KVM: arm64: Extend masking facility to arbitrary
registers"). Fix kvm_get_sysreg_res0() to reflect this fact.

Note that we're sure that we only deal with FGT registers in
kvm_get_sysreg_res0(), the

if (sr < __VNCR_START__)

is actually a never false, which should probably be removed later.

Fixes: 69c19e047dfe ("KVM: arm64: Add TCR2_EL2 to the sysreg arrays")
Signed-off-by: Zenghui Yu (Huawei) <zenghui.yu@linux.dev>
Link: https://patch.msgid.link/20260121101631.41037-1-zenghui.yu@linux.dev
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org

KVM: arm64: Always populate FGT masks at boot time

We currently only populate the FGT masks if the underlying HW does
support FEAT_FGT. However, with the addition of the RES1 support for
system registers, this results in a lot of noise at boot time, as
reported by Nathan.

That's because even if FGT isn't supported, we still check for the
attribution of the bits to particular features, and not keeping the
masks up-to-date leads to (fairly harmess) warnings.

Given that we want these checks to be enforced even if the HW doesn't
support FGT, enable the generation of FGT masks unconditionally (this
is rather cheap anyway). Only the storage of the FGT configuration is
avoided, which will save a tiny bit of memory on these machines.

Reported-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Fixes: c259d763e6b09 ("KVM: arm64: Account for RES1 bits in DECLARE_FEAT_MAP() and co")
Link: https://lore.kernel.org/r/20260120211558.GA834868@ax162
Link: https://patch.msgid.link/20260122085153.535538-1-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm64: Fix error checking for FFA_VERSION

According to section 13.2 of the DEN0077 FF-A specification, when
firmware does not support the requested version, it should reply with
FFA_RET_NOT_SUPPORTED(-1). Table 13.6 specifies the type of the error
code as int32.
Currently, the error checking logic compares the unsigned long return
value it got from the SMC layer, against a "-1" literal. This fails due
to a type mismatch: the literal is extended to 64 bits, whereas the
register contains only 32 bits of ones(0x00000000ffffffff).
Consequently, hyp_ffa_init misinterprets the "-1" return value as an
invalid FF-A version. This prevents pKVM initialization on devices where
FF-A is not supported in firmware.
Fix this by explicitly casting res.a0 to s32.

Signed-off-by: Kornel Dulęba <korneld@google.com>
Acked-by: Will Deacon <will@kernel.org>
Link: https://patch.msgid.link/20251114-pkvm_init_noffa-v1-1-87a82e87c345@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm64: Prevent host from managing timer offsets for protected VMs

For protected VMs, the guest's timer offset state should not be
controlled by the host and must always run with a virtual counter offset
of 0. The existing timer logic allowed the host to set and manage the
timer counter offsets for protected VMs in certain cases.

Disable all host-side management of timer offsets for protected VMs by
adding checks in the relevant code paths.

Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20251211104710.151771-10-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm64: Check whether a VM IOCTL is allowed in pKVM

Certain VM IOCTLs are tied to specific VM features. Since pKVM does not
support all features, restrict which IOCTLs are allowed depending on
whether the associated feature is supported.

Use the existing VM capability check as the source of truth to whether
an IOCTL is allowed for a particular VM by mapping the IOCTLs with their
associated capabilities.

Suggested-by: Oliver Upton <oupton@kernel.org>
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20251211104710.151771-9-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm64: Track KVM IOCTLs and their associated KVM caps

Track KVM IOCTLs (VM IOCTLs for now), and the associated KVM capability
that enables that IOCTL. Add a function that performs the lookup.

This will be used by CoCo VM Hypervisors (e.g., pKVM) to determine
whether a particular KVM IOCTL is allowed for its VMs.

Suggested-by: Oliver Upton <oupton@kernel.org>
Signed-off-by: Fuad Tabba <tabba@google.com>
[maz: don't expose KVM_CAP_BASIC to userspace, and rely on NR_VCPUS
as a proxy for this]
Link: https://patch.msgid.link/20251211104710.151771-8-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm64: Do not allow KVM_CAP_ARM_MTE for any guest in pKVM

Supporting MTE in pKVM introduces significant complexity to the
hypervisor at EL2, even for non-protected VMs, since it would require
EL2 to handle tag management.

For now, do not allow KVM_CAP_ARM_MTE for any VM type in protected mode.

Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20251211104710.151771-7-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm64: Include VM type when checking VM capabilities in pKVM

Certain features and capabilities are restricted in protected mode. Most
of these features are restricted only for protected VMs, but some
are restricted for ALL VMs in protected mode.

Extend the pKVM capability check to pass the VM (kvm), and use that when
determining supported features.

Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20251211104710.151771-6-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm64: Introduce helper to calculate fault IPA offset

This 12-bit FAR fault IPA offset mask is hard-coded as 'GENMASK(11, 0)'
in several places to reconstruct the full fault IPA.

Introduce FAR_TO_FIPA_OFFSET() to calculate this value in a shared
header and replace all open-coded instances to improve readability.

No functional change intended.

Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20251211104710.151771-5-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm64: Fix MTE flag initialization for protected VMs

The function pkvm_init_features_from_host() initializes guest
features, propagating them from the host. The logic to propagate
KVM_ARCH_FLAG_MTE_ENABLED (Memory Tagging Extension)
has a couple of issues.

First, the check was in the common path, before the divergence for
protected and non-protected VMs. For non-protected VMs, this was
unnecessary, as 'kvm->arch.flags' is completely overwritten by
host_arch_flags immediately after, which already contains the MTE flag.
For protected VMs, this was setting the flag even if the feature is not
allowed.

Second, the check was reading 'host_kvm->arch.flags' instead of using
the local 'host_arch_flags', which is read once from the host flags.

Fix these by moving the MTE flag check inside the protected-VM-only
path, checking if the feature is allowed, and changing it to use the
correct host_arch_flags local variable. This ensures non-protected VMs
get the flag via the bulk copy, and protected VMs get it via an explicit
check.

Fixes: b7f345fbc32a ("KVM: arm64: Fix FEAT_MTE in pKVM")
Reviewed-by: Ben Horgan <ben.horgan@arm.com>
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20251211104710.151771-4-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm64: Fix Trace Buffer trap polarity for protected VMs

The E2TB bits in MDCR_EL2 control trapping of Trace Buffer system
register accesses. These accesses are trapped to EL2 when the bits are
clear.

The trap initialization logic for protected VMs in pvm_init_traps_mdcr()
had the polarity inverted. When a guest did not support the Trace Buffer
feature, the code was setting E2TB. This incorrectly disabled the trap,
potentially allowing a protected guest to access registers for a feature
it was not given.

Fix this by inverting the operation.

Fixes: f50758260bff ("KVM: arm64: Group setting traps for protected VMs by control register")
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20251211104710.151771-3-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm64: Fix Trace Buffer trapping for protected VMs

For protected VMs in pKVM, the hypervisor should trap accesses to trace
buffer system registers if Trace Buffer isn't supported by the VM.
However, the current code only traps if Trace Buffer External Mode isn't
supported.

Fix this by checking for FEAT_TRBE (Trace Buffer) rather than
FEAT_TRBE_EXT.

Fixes: 9d5261269098 ("KVM: arm64: Trap external trace for protected VMs")
Reported-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20251211104710.151771-2-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: selftests: Fix typos and stale comments in kvm_util

Fix minor documentation errors in `kvm_util.h` and `kvm_util.c`.

- Correct the argument description for `vcpu_args_set` in `kvm_util.h`,
  which incorrectly listed `vm` instead of `vcpu`.
- Fix a typo in the comment for `kvm_selftest_arch_init` ("exeucting" ->
  "executing").
- Correct the return value description for `vm_vaddr_unused_gap` in
  `kvm_util.c` to match the implementation, which returns an address "at
  or above" `vaddr_min`, not "at or below".

No functional change intended.

Reviewed-by: Andrew Jones <andrew.jones@linux.dev>
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260109082218.3236580-6-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: selftests: Move page_align() to shared header

To avoid code duplication, move page_align() to the shared `kvm_util.h`
header file. Rename it to vm_page_align(), to make it clear that the
alignment is done with respect to the guest's base page size.

No functional change intended.

Reviewed-by: Andrew Jones <andrew.jones@linux.dev>
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260109082218.3236580-5-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: riscv: selftests: Fix incorrect rounding in page_align()

The implementation of `page_align()` in `processor.c` calculates
alignment incorrectly for values that are already aligned. Specifically,
`(v + vm->page_size) & ~(vm->page_size - 1)` aligns to the *next* page
boundary even if `v` is already page-aligned, potentially wasting a page
of memory.

Fix the calculation to use standard alignment logic: `(v + vm->page_size
- 1) & ~(vm->page_size - 1)`.

Fixes: 3e06cdf10520 ("KVM: selftests: Add initial support for RISC-V 64-bit")
Reviewed-by: Andrew Jones <andrew.jones@linux.dev>
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260109082218.3236580-4-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm64: selftests: Fix incorrect rounding in page_align()

The implementation of `page_align()` in `processor.c` calculates
alignment incorrectly for values that are already aligned. Specifically,
`(v + vm->page_size) & ~(vm->page_size - 1)` aligns to the *next* page
boundary even if `v` is already page-aligned, potentially wasting a page
of memory.

Fix the calculation to use standard alignment logic: `(v + vm->page_size
- 1) & ~(vm->page_size - 1)`.

Fixes: 7a6629ef746d ("kvm: selftests: add virt mem support for aarch64")
Reviewed-by: Andrew Jones <andrew.jones@linux.dev>
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260109082218.3236580-3-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm64: selftests: Disable unused TTBR1_EL1 translations

KVM selftests map all guest code and data into the lower virtual address
range (0x0000...) managed by TTBR0_EL1. The upper range (0xFFFF...)
managed by TTBR1_EL1 is unused and uninitialized.

If a guest accesses the upper range, the MMU attempts a translation
table walk using uninitialized registers, leading to unpredictable
behavior.

Set `TCR_EL1.EPD1` to disable translation table walks for TTBR1_EL1,
ensuring that any access to the upper range generates an immediate
Translation Fault. Additionally, set `TCR_EL1.TBI1` (Top Byte Ignore) to
ensure that tagged pointers in the upper range also deterministically
trigger a Translation Fault via EPD1.

Define `TCR_EPD1_MASK`, `TCR_EPD1_SHIFT`, and `TCR_TBI1` in
`processor.h` to support this configuration. These are based on their
definitions in `arch/arm64/include/asm/pgtable-hwdef.h`.

Suggested-by: Will Deacon <will@kernel.org>
Reviewed-by: Itaru Kitayama <itaru.kitayama@fujitsu.com>
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260109082218.3236580-2-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm64: selftests: Add a test for FEAT_IDST

Add a very basic test checking that FEAT_IDST actually works for
the {GMID,SMIDR,CSSIDR2}_EL1 registers.

Link: https://patch.msgid.link/20260108173233.2911955-10-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm64: pkvm: Report optional ID register traps with a 0x18 syndrome

With FEAT_IDST, unimplemented system registers in the feature ID space
must be reported using EC=0x18 at the closest handling EL, rather than
with an UNDEF.

Most of these system registers are always implemented thanks to their
dependency on FEAT_AA64, except for a set of (currently) three registers:
GMID_EL1 (depending on MTE2), CCSIDR2_EL1 (depending on FEAT_CCIDX),
and SMIDR_EL1 (depending on SME).

For these three registers, report their trap as EC=0x18 if they
end-up trapping into KVM and that FEAT_IDST is implemented in the guest.
Otherwise, just make them UNDEF.

Link: https://patch.msgid.link/20260108173233.2911955-9-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm64: pkvm: Add a generic synchronous exception injection primitive

Similarly to the "classic" KVM code, pKVM doesn't have an "anything
goes" synchronous exception injection primitive.

Carve one out of the UNDEF injection code.

Link: https://patch.msgid.link/20260108173233.2911955-8-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm64: Force trap of GMID_EL1 when the guest doesn't have MTE

If our host has MTE, but the guest doesn't, make sure we set HCR_EL2.TID5
to force GMID_EL1 being trapped. Such trap will be handled by the
FEAT_IDST handling.

Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Reviewed-by: Yuan Yao <yaoyuan@linux.alibaba.com>
Link: https://patch.msgid.link/20260108173233.2911955-7-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm64: Handle CSSIDR2_EL1 and SMIDR_EL1 in a generic way

Now that we can handle ID registers using the FEAT_IDST infrastrcuture,
get rid of the handling of CSSIDR2_EL1 and SMIDR_EL1.

Reviewed-by: Yuan Yao <yaoyuan@linux.alibaba.com>
Link: https://patch.msgid.link/20260108173233.2911955-6-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm64: Handle FEAT_IDST for sysregs without specific handlers

Add a bit of infrastrtcture to triage_sysreg_trap() to handle the
case of registers falling into the Feature ID space that do not
have a local handler.

For these, we can directly apply the FEAT_IDST semantics and inject
an EC=0x18 exception. Otherwise, an UNDEF will do.

Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Reviewed-by: Yuan Yao <yaoyuan@linux.alibaba.com>
Link: https://patch.msgid.link/20260108173233.2911955-5-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm64: Add a generic synchronous exception injection primitive

Maybe in a surprising way, we don't currently have a generic way
to inject a synchronous exception at the EL the vcpu is currently
running at.

Extract such primitive from the UNDEF injection code.

Reviewed-by: Ben Horgan <ben.horgan@arm.com>
Reviewed-by: Yuan Yao <yaoyuan@linux.alibaba.com>
Link: https://patch.msgid.link/20260108173233.2911955-4-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm64: Add trap routing for GMID_EL1

HCR_EL2.TID5 is currently ignored by the trap routing infrastructure.
Wire it in the routing table so that GMID_EL1, the sole register
trapped by this bit, is correctly handled in the NV case.

Link: https://patch.msgid.link/20260108173233.2911955-3-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>

arm64: Repaint ID_AA64MMFR2_EL1.IDS description

ID_AA64MMFR2_EL1.IDS, as described in the sysreg file, is pretty horrible
as it diesctly give the ESR value. Repaint it using the usual NI/IMP
identifiers to describe the absence/presence of FEAT_IDST.

Also add the new EL3 routing feature, even if we really don't care about it.

Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Link: https://patch.msgid.link/20260108173233.2911955-2-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm64: Honor UX/PX attributes for EL2 S1 mappings

Now that we potentially have two bits to deal with when setting
execution permissions, make sure we correctly handle them when both
when building the page tables and when reading back from them.

Reported-by: Alexandru Elisei <alexandru.elisei@arm.com>
Reviewed-by: Fuad Tabba <tabba@google.com>
Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Tested-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20251210173024.561160-7-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm64: Convert VTCR_EL2 to config-driven sanitisation

Describe all the VTCR_EL2 fields and their respective configurations,
making sure that we correctly ignore the bits that are not defined
for a given guest configuration.

Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Reviewed-by: Fuad Tabba <tabba@google.com>
Tested-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20251210173024.561160-6-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm64: Account for RES1 bits in DECLARE_FEAT_MAP() and co

None of the registers we manage in the feature dependency infrastructure
so far has any RES1 bit. This is about to change, as VTCR_EL2 has
its bit 31 being RES1.

In order to not fail the consistency checks by not describing a bit,
add RES1 bits to the set of immutable bits. This requires some extra
surgery for the FGT handling, as we now need to track RES1 bits there
as well.

There are no RES1 FGT bits *yet*. Watch this space.

Reviewed-by: Fuad Tabba <tabba@google.com>
Tested-by: Sascha Bischoff <sascha.bischoff@arm.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Tested-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20251210173024.561160-5-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>

arm64: Convert VTCR_EL2 to sysreg infratructure

Our definition of VTCR_EL2 is both partial (tons of fields are
missing) and totally inconsistent (some constants are shifted,
some are not). They are also expressed in terms of TCR, which is
rather inconvenient.

Replace the ad-hoc definitions with the the generated version.
This results in a bunch of additional changes to make the code
with the unshifted nature of generated enumerations.

The register data was extracted from the BSD licenced AARCHMRS
(AARCHMRS_OPENSOURCE_A_profile_FAT-2025-09_ASL0).

Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Reviewed-by: Fuad Tabba <tabba@google.com>
Tested-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20251210173024.561160-4-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>